The Itil Community Forum: Forums


Different IM process depending on urgency?

 
pk_
Itiler


Joined: Feb 03, 2010
Posts: 30

Posted: Wed Feb 03, 2010 10:55 pm    Post subject: Different IM process depending on urgency?

Hi all,

I'm currently rewriting our Incident Management process and I was wondering what people's views were on the following issue I've encountered.

A bit of background:

Our Incidents are primarily detected by 1st Line staff who constantly monitor alarm screens and check our CIs. We only very occasionally have Incidents raised by our customer or 3rd party suppliers on the phone or via email.

A number of Incidents related to Problems/KEs have immediate impact on our availability SLAs and so require immediate action by 1st line staff to resolve them and make the CI available again. (Every second counts in some of our availability SLAs.) These Incidents would not be escalated or assigned to anyone; merely dealt with by the person who detected them.

The issue:

Some of our Incidents will require immediate action before being logged (and then immediately closed), while others will need to be logged first, escalated, assigned etc.

I am wary of having a different process depending on an Incident's impact/category/urgency. Is this a valid concern or do you think I should go ahead and make this distinction?

Any comments much appreciated. Cheers!
thechosenone69
Senior Itiler


Joined: Jun 06, 2007
Posts: 268

Posted: Wed Feb 03, 2010 11:48 pm

Pk,

Let me ask you a few questions to be able to help you further.

Do you have a toolset in place? Is there a response time defined in your SLAs, or is it just availability? Do you have different SLAs tailored to different customers (customer SLAs), or are they service-based? Do you have OLAs in place to underpin the SLAs? Also, if you are going to proceed with writing a different process (which does no harm if done right), how would you go about controlling your staff to ensure that they log the incidents after they are resolved and do not bypass logging?

Regards,

TCO
_________________
Ali Makahleh
Configuration Management(Blue Badge),
ITILV2 Service Manager(Red Badge),
ITILV3 Expert(Lilac Badge) Certified.

“If you can't describe what you are doing as a process, you don't know what you're doing.” W. Edwards Deming.
pk_
Itiler


Joined: Feb 03, 2010
Posts: 30

Posted: Thu Feb 04, 2010 12:36 am

Hi TCO,

We are in the process of setting up ChangeGear (http://www.sunviewsoftware.com/products/overview.aspx) for our Incident, Problem, Change, Config & Knowledge management purposes. Currently this is all done with separate tools; however, I'm writing this process with ChangeGear in mind.

We have resolution times in our SLAs but not "response times" as such - the resolution times and availability requirements go hand in hand.

We have only one customer, with agreed Incident severities that dictate the SLAs. We do have OLAs in place.

Your last question is the main issue, I think. The problem we're facing is that because of the few Incidents that require resolution before logging, people are treating all Incidents in this way. This has led to Incidents not being logged, or logged so long after the event that the Incident records are missing key information.

I had thought about looking into automation between our alarm systems and ChangeGear, so that the Incidents that our guys need to resolve immediately are automatically logged - however I think this is a long way off, and I don't want to write the process with this in mind until I'm sure it's actually feasible.
Diarmid
Senior Itiler


Joined: Mar 04, 2008
Posts: 1884
Location: Newcastle-under-Lyme

Posted: Thu Feb 04, 2010 12:56 am

Pk,

What an interesting question.

Before I start: in any event, you need an umbrella document that describes your incident management in terms of policy, strategy and process structure.

You say that seconds count. In that case quality counts and your monitoring staff will have the capabilities and judgement to recognize and deal with these events that require instant fix. You also say that these events are associated with known errors. It will be best to stick to that and not allow "new" incidents to be treated in this way.

I am also assuming that these "emergency" incidents are not happening every two minutes, because then you would probably have to deal with it as an operational activity with dedicated staff and their own processes.

The true issue is getting it right. Normally logging first makes good sense because then you have a record come what may. But it sounds like you can achieve that through your monitoring and alerting tools. Perhaps the staff can press a few buttons while assimilating the event to get the record started; that way it will have a time stamp for when it was picked up as well as a glaring demand to be completed.

Obviously you still need to translate that into a proper incident record as soon as possible and so you might need some end of day/shift check to confirm that things were caught up with.
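
To make the "press a few buttons" idea concrete, here is a minimal sketch of a stub record that is time-stamped at pickup, plus an end-of-shift check for stubs that were never completed. It is only an illustration: the names `IncidentStub`, `start_stub` and `end_of_shift_check` and all the field names are assumptions, not features of any particular monitoring or service desk tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class IncidentStub:
    """Hypothetical minimal record, started the moment an alert is picked up."""
    alarm_id: str
    operator: str
    # Time stamp captured automatically when the stub is created.
    picked_up_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    details: Optional[str] = None  # filled in once service is restored

    @property
    def complete(self) -> bool:
        # A stub counts as a proper record only once details are added.
        return bool(self.details)


def start_stub(alarm_id: str, operator: str) -> IncidentStub:
    """One-click action: record who picked up which alarm, and when."""
    return IncidentStub(alarm_id=alarm_id, operator=operator)


def end_of_shift_check(stubs: List[IncidentStub]) -> List[IncidentStub]:
    """Return the stubs that still need turning into full incident records."""
    return [stub for stub in stubs if not stub.complete]
```

A supervisor could run `end_of_shift_check` over the shift's stubs and chase up whatever comes back; an empty list means everything was caught up with.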

You also need good and frequent audit/review to ensure things are working as you require.

How do you identify when it is correct to follow this accelerated process? Well, the event will link to a known error, and that means your staff have to be on top of that (if they have to search for it, then perhaps the incident logging process would speed them up rather than slow them down, if it is well designed); and it has to pose an immediate threat to a service (any service or some specific service(s)?) or it has to involve a failure in one of a set of specified CIs (again, if the logging process gets you there quickly, it might be quicker to log the incident than not).

You are still left with issues. People will err on the safe side and, if your process is not rigorous enough, too many incident logs will be deferred. There is a cost in terms of reliability of your incident records and in extra levels of audit and checking to compensate. That has to be weighed against the savings of a few(?) seconds from instant action.

These are just some thoughts on some of the issues you want to look at, hopefully helpful, but far from comprehensive. I wouldn't like to say whether you should or should not go ahead without a very detailed understanding of the practicalities, costs, risks and business imperatives at the very least.
_________________
"Method goes far to prevent trouble in business: for it makes the task easy, hinders confusion, saves abundance of time, and instructs those that have business depending, both what to do and what to hope."
William Penn 1644-1718
thechosenone69
Senior Itiler


Joined: Jun 06, 2007
Posts: 268

Posted: Thu Feb 04, 2010 1:22 am

Pk,

In addition to what Diarmid said...

Since response time is not a requirement for your client, I don't see how delaying the process is a problem. Logging a ticket shouldn't take more than 2 minutes.

Anyway, for urgent incidents in your scenario I would recommend an easy solution like this:

- Leave the incident unprocessed (since you're saying that 2 minutes of logging can hurt availability).
- Let the engineer who's working on this incident send an acknowledgement (preferably a saved draft, to buy time) to the rest of the support staff so they know that he's working on the incident.
- Now the engineer is focused on restoring the service as quickly as possible.
- Once service is restored, the engineer can update the incident record with the details.
- If he didn't, you can verify that by audits. Unlogged incidents will be very easy to trace since engineers are sending acknowledgements with their names. Verify the time they logged the ticket and why it took so long, then hit them with a stick and ensure they don't do that again.
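
The audit in the last step could be sketched roughly as below: compare the acknowledgements that were sent against the incident records that exist, and flag anything never logged or logged too late. This is a sketch under assumptions; the `Ack` structure, the `audit` function and the `max_delay` threshold are made up for illustration and do not come from any specific tool.

```python
from datetime import datetime, timedelta
from typing import Dict, List, NamedTuple, Tuple


class Ack(NamedTuple):
    """Hypothetical acknowledgement sent when an engineer picks up an incident."""
    alarm_id: str
    engineer: str
    sent_at: datetime


def audit(
    acks: List[Ack],
    logged_at: Dict[str, datetime],
    max_delay: timedelta,
) -> Tuple[List[Ack], List[Ack]]:
    """Split acknowledgements into (never logged, logged later than max_delay).

    `logged_at` maps an alarm id to the time its incident record was created.
    """
    missing: List[Ack] = []
    late: List[Ack] = []
    for ack in acks:
        logged = logged_at.get(ack.alarm_id)
        if logged is None:
            missing.append(ack)
        elif logged - ack.sent_at > max_delay:
            late.append(ack)
    return missing, late
```

Because each acknowledgement carries the engineer's name, both lists point straight at the person to follow up with.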


However, I don't recommend the steps I listed above to other organizations :) but they might fit yours. Remember, ITIL should be tailored.
UKVIKING
Senior Itiler


Joined: Sep 16, 2006
Posts: 3318
Location: London, UK

Posted: Thu Feb 04, 2010 3:18 am

PK

I need to drop a bomb here

serious emphasis

If you do not have a record of the incident and how it was dealt with, then how are you going to justify the staff?

Second: write your process without regard to the tool first.

A fool with a tool is still a fool.

You need to define the policy for Incident, Problem, Change, Config and Release first, then the process and then the procedures. The procedure document is where you talk about the tool.

A critical incident needs to be created as an incident record so that all the staff working on it have a central repository for it.

If the SD / NOC is the team to resolve the issue, use the two-person rule: one person deals with the incident resolution while the other gets the paperwork started. Once the paperwork has been created, that person can move on to the next issue, while the individual resolving the incident can finish the paperwork after the fix has restored service.

It does no good to the service that you are providing if you create the ticket days later and don't get updates from the engineer in a timely manner.

You basically have no proof of your team doing anything.

Also, and I shout this as loud as I can: the tool is NOT a replacement for staff, nor is it really a staff multiplier. It is a tool. That is all.

If you have multiple incidents happening at the same time that are equally critical, you are screwed in more ways than one if you don't have a properly staffed SD.
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)

Change Management is POWER & CONTROL. /....evil laughter
pk_
Itiler


Joined: Feb 03, 2010
Posts: 30

Posted: Fri Feb 05, 2010 1:04 am

Thank you all very much for your replies - I would have replied sooner but have been a bit snowed under!

Diarmid,

Thanks very much for your comments. These “emergency” Incidents are, essentially, immediately taking a CI out of service and impacting on our SLAs. They must be resolved immediately.

Operators create an Incident record for each one, and this is audited daily by the service centre’s Supervisors.

They are the most common Incidents we encounter – perhaps 30-40 between 7am and 6pm each day. They are associated with Known Errors but will, depending on a new Release, reduce to a negligible level within the next 2 months.

I was intending to very clearly specify a limited list of Known Errors that the accelerated Incident process could apply to. However, as you say, the difficulty is ensuring that this approach is not taken in response to all Incidents.

TCO,

Again, thanks for your comments. The Operators are both logging and resolving these Incidents – they are not released to engineers. However your approach is basically what I was thinking of specifying.

UKVIKING

I agree with you - I’m in no way suggesting that these Incidents are not logged; whatever approach we take, it's imperative that they're recorded. Creation of the Incident records is deferred for no longer than 5 minutes – certainly not days. In an ideal world I would adopt the two-person rule, but unfortunately resources don’t really allow for this.

I’m not sure what you mean regarding fool/tool, could you explain? I’m not suggesting that our tools are dictating our processes; TCO enquired whether we had a toolset, which is why I mentioned it.

In general:

What I envisaged was an Incident type that is similar to a Standard Change - they are pre-defined and can skip (or in this case swap around) some stages of the process. We can then check whether the "Emergency Incident" approach is being applied to "Normal Incidents" inappropriately and raise it with the staff member if required.
thechosenone69
Senior Itiler


Joined: Jun 06, 2007
Posts: 268

Posted: Fri Feb 05, 2010 1:55 am

Pk,

The reason I asked about the tool is that there are a lot of tools on the market that save you the logging time, and you can tailor them the way you like. For example, your 1st line will just have to pick up the alert, which will be automatically logged (staff might have to classify the incident); this literally takes seconds. Then the 1st line can concentrate on restoring the service, referring to the KEs or KB. Once that's done, your staff can update the solution and resolve the ticket.

You can tailor those tools the way you want, and they can make your life easier if you use them right; you can automate most of the steps that I mentioned in my previous approach.

Just make sure you define the process first, then tailor your tool to it. As UK mentioned earlier, a fool with a tool is still a fool :D

Also, if you think that investment in problem management is necessary, to build a rich, structured Known Error Database and to prevent incidents proactively, so as to lower the number of incidents you're getting and improve your availability, then go ahead. You should know better than us on that. Another suggestion: read about availability techniques and try the expanded incident lifecycle technique to see where most time is spent on each incident and where it needs improvement.
pk_
Itiler


Joined: Feb 03, 2010
Posts: 30

Posted: Fri Feb 05, 2010 2:01 am

TCO,

Yes, got you. We do have (potentially) the ability to have these Incidents automatically log records when they arise - however I don't know whether this is 100% possible yet, and so I don't want to mention the functionality in the process.

I'll have a read up on availability techniques. My ITIL books have arrived today (300 quid, oof!) so I intend to bury myself in them for the next few months anyway!

Cheers for the advice. I think I'm probably going to be on this site quite a bit, having read some of the good help you're all giving each other :D
UKVIKING
Senior Itiler


Joined: Sep 16, 2006
Posts: 3318
Location: London, UK

Posted: Fri Feb 05, 2010 2:07 am

PK_

In response to your posts:

If you don't have the resources to have a two-person team work on a ticket,

how are you going to deal with two or more incidents, where the limited resources fix the incident / restore service and then move on to the next ticket?

By then the person who has worked on two or three consecutive incidents and restored service will have forgotten what they did on each.

This can be alleviated by writing it down as it is done. Sort of like the old timestamp rule for financial transactions.

The phrase "a fool with a tool is still a fool" describes companies / organizations etc. that think that if they buy / use a tool to do X, all of their problems are over.

As you stated in the first post, you are looking at your processes. You need to go policy, process, procedure, work instruction, where the last two should be tool-centric.

If your availability SLAs are written that badly and your SD / NOC / monitoring centre is so resource deprived (low staff numbers), I can almost guarantee that your availability SLA will be breached every month.

Especially if there are multiple incidents requiring instant response happening in a short period.

Finally,
The incident process should be the incident process. The process flow should be the same for all incidents.

The only difference is the time scale between each stage, from high priority / high severity to low priority / low severity incidents.

This way all staff know exactly what the process is and don't require re-training / reinforcement when they move from covering one type to another.

The time scale between stages can be measured in seconds / minutes for high pri/sev, while for low pri/sev it can be hours / days.

But the process should be the process.
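
As a rough sketch of "same flow, different time scale": the stage list below is shared by every incident, and only the time allowed per stage changes with priority. The stage names and the target durations are placeholder assumptions for illustration, not values from any SLA discussed here.

```python
from datetime import datetime, timedelta
from typing import Dict

# One shared flow for every incident (stage names are illustrative).
STAGES = ("log", "categorise", "assign", "resolve", "close")

# Placeholder time allowed per stage; only this varies with priority.
STAGE_TARGET = {
    "high": timedelta(seconds=30),
    "low": timedelta(hours=4),
}


def stage_deadlines(priority: str, detected_at: datetime) -> Dict[str, datetime]:
    """Return a deadline for each stage; the stage order never changes."""
    step = STAGE_TARGET[priority]
    return {
        stage: detected_at + step * (position + 1)
        for position, stage in enumerate(STAGES)
    }
```

Calling `stage_deadlines("high", ...)` and `stage_deadlines("low", ...)` gives the same stages in the same order, so staff follow one flow; only the deadlines differ.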

And lastly - please don't equate the incident management process with the change management process. IM restores existing service to what it was. CM redefines / adjusts / remakes (changes) the existing service in some respect.
If a resolution to the incident involves CM, then the incident mgmt process should involve the CM process - as stated in both the CM and IM policy and process documentation - and this would include some sort of review from the CM p.o.v.
pk_
Itiler


Joined: Feb 03, 2010
Posts: 30

Posted: Fri Feb 05, 2010 2:41 am

Thanks UKV,

Believe me, the SLA and staffing issues are both things that drive me mad. However I don't have a say in them (no matter how much I drive the point home).

Quote:

Finally,
The incident process should be the incident process. The process flow should be the same for all incidents.

The only difference is the time scale between each stage, from high priority / high severity to low priority / low severity incidents.

This way all staff know exactly what the process is and don't require re-training / reinforcement when they move from covering one type to another.

The time scale between stages can be measured in seconds / minutes for high pri/sev, while for low pri/sev it can be hours / days.

And lastly - please don't equate the incident management process with the change management process. IM restores existing service to what it was. CM redefines / adjusts / remakes (changes) the existing service in some respect.
If a resolution to the incident involves CM, then the incident mgmt process should involve the CM process - as stated in both the CM and IM policy and process documentation - and this would include some sort of review from the CM p.o.v.


Just to clarify - I'm using a Standard Change as an example of a Change where the process flow differs, as I'm considering having an Incident type where the process flow differs. I'm not suggesting involving the CM process in this particular conundrum.
UKVIKING
Senior Itiler


Joined: Sep 16, 2006
Posts: 3318
Location: London, UK

Posted: Fri Feb 05, 2010 3:25 am

PK

The standard change should follow the same CM process as everything else. It just bypasses the approval process if it meets the criteria.

What you should do is have one IM process, with gates / logic paths for the severity / priority.

That way the process is the same from top to bottom, but the detail checks may bypass certain things or get to things faster.
pk_
Itiler


Joined: Feb 03, 2010
Posts: 30

Posted: Fri Feb 19, 2010 11:32 pm

Thanks for all the advice in this area. Just thought I'd mention the approach I took to get round this (it's nothing groundbreaking).

Our incident process now reads:

Detect Incident - Categorise Incident - Assign Severity to Incident - Does an Incident Model Apply? - (if no) Log Incident - etc.

If the answer is "yes", then one of several Incident Models is used instead of the 'general' Incident Management process. I've done one of these models for each of the types of incident that require immediate action, a slightly different approach, etc. For instance, one of the Incident Models is:

<Incident Type> Detected - Resolve Incident - Log Incident - Close Incident

This suits our operation perfectly and avoids any of the grey areas we had before. :)
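
For anyone wanting to picture the "Does an Incident Model apply?" decision, it could be sketched as a simple lookup like the one below. This is only an illustration: the incident type name, the steps in the general flow after "log", and the function `plan_steps` are assumptions made up for the example, not the actual process documents.

```python
from typing import Dict, List

# Steps every incident goes through before the model decision.
COMMON_STEPS = ["detect", "categorise", "assign_severity"]

# Assumed 'general' flow after the decision; only "log" comes from the post.
GENERAL_PROCESS = ["log", "escalate", "assign", "resolve", "close"]

# Pre-defined Incident Models keyed by a hypothetical incident type.
INCIDENT_MODELS: Dict[str, List[str]] = {
    "known_error_ci_down": ["resolve", "log", "close"],
}


def plan_steps(incident_type: str) -> List[str]:
    """Return the ordered steps, using an Incident Model if one applies."""
    remaining = INCIDENT_MODELS.get(incident_type, GENERAL_PROCESS)
    return COMMON_STEPS + remaining
```

Keeping the models in one explicit table means any incident type not listed falls through to the general process, and in both paths the incident is still logged.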
Diarmid
Senior Itiler


Joined: Mar 04, 2008
Posts: 1884
Location: Newcastle-under-Lyme

Posted: Fri Feb 19, 2010 11:52 pm

PK

Can I just warn you: someone is masquerading as you on another thread, claiming you are an idiot.

I know it's not true because you appreciated our advice :)
pk_
Itiler


Joined: Feb 03, 2010
Posts: 30

Posted: Sat Feb 20, 2010 12:07 am

Yeah, he's asking really stupid questions as well, what a prat! (rolling eyes)
All times are GMT + 10 Hours