The Itil Community Forum: Forums

ITIL :: View topic - Incident to Problem?

Incident to Problem?
Slicster
Newbie


Joined: Oct 25, 2006
Posts: 8

Posted: Wed Nov 01, 2006 3:43 am    Post subject: Incident to Problem?

I'd like to know if the following is correct. This is the way I see an incident turning into a problem, whether it is a single incident or multiple ones. I also think that incidents should never be passed to Level2 or 3 support specialists. Let me know what you think...


INCIDENT becomes a problem when: there is no workaround or solution to restore the service

PROBLEM: A root cause must be identified and a solution or workaround must be provided

1. Incidents should not be sent to Level2/Level3 support
2. Incidents should not have a Service Call associated with them
3. If an incident is received at IT Service Desk and they cannot restore the service or provide a workaround they will create a Problem Record and assign it to the appropriate “Problem Manager”.
4. “Problem Manager” will take a decision and redirect the problem to the appropriate L2/L3 person or persons
5. L2/L3 person will identify the root cause of the problem
6. If the root cause is identified, it will be classified as a known problem and added to the knowledge base for future reference, which will enable the IT Service Desk to resolve the incident or provide a workaround in the future

The above simplified procedure will…
• Help reduce the number of problems by providing ITSD with solutions or workarounds to incidents.
• Help identify recurring incidents that may require a Problem record
• Help to identify recurring problems that should be “root-cause” candidates.

“Problem Manager” is a role that a specific person or persons will have in each section of IT. The “Problem Manager” will decide if the problem record requires immediate attention or if it is in fact an actual problem.
UKVIKING
Senior Itiler


Joined: Sep 16, 2006
Posts: 3293
Location: London, UK

Posted: Wed Nov 01, 2006 4:36 am

Not quite

An incident does not become a problem.

An incident will still continue on until the service is restored.

A problem record is created when an incident is selected for problem mgmt review and problem mgmt agrees to work on providing a solution to the unknown underlying root cause of the incident.
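
A minimal sketch of that distinction, assuming hypothetical `Incident` and `Problem` record types that are not taken from any particular service desk tool. The incident keeps its own lifecycle and closes when service is restored; the problem record is a separate object that only references the incident.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    """Hypothetical incident record; it closes when service is restored."""
    incident_id: str
    summary: str
    status: str = "open"

    def restore_service(self) -> None:
        self.status = "closed"


@dataclass
class Problem:
    """Hypothetical problem record; it tracks the unknown root cause."""
    problem_id: str
    incident_ids: List[str] = field(default_factory=list)
    root_cause: Optional[str] = None
    status: str = "open"


def accept_for_problem_review(problem_id: str, incident: Incident) -> Problem:
    # The incident is referenced, not converted: it stays an incident.
    return Problem(problem_id=problem_id, incident_ids=[incident.incident_id])


incident = Incident("INC-1", "Desktop blue screens")
problem = accept_for_problem_review("PRB-1", incident)
incident.restore_service()  # e.g. a reboot restores service
```

In this sketch the incident ends up closed while the problem is still open with no root cause recorded, which is the situation described above.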
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)

Change Management is POWER & CONTROL. /....evil laughter
m_croon
Senior Itiler


Joined: Aug 11, 2006
Posts: 262
Location: Netherlands

Posted: Wed Nov 01, 2006 8:45 am

John is right. In addition, I'd like to add that the goal of incident management is to restore services as FAST as possible. The goal of problem management is to structurally solve the root cause of a disruption. Therefore, it is quite logical and very well possible that an incident and a problem on the same subject exist at the same time.

Cheers,

Michiel
Marcel
Senior Itiler


Joined: Sep 21, 2006
Posts: 63
Location: USA

Posted: Wed Nov 01, 2006 1:24 pm

To emphasize what Michiel said, if you say that incidents 'become' problems in case you require level2 or level3 support to work on their resolution, you are basically saying that restoring service is no longer 1st priority.

And as John said, incidents do not turn into problems. Problems are the unknown underlying causes of one or more incidents. However, the point that you make about a problem ticket being raised when incident management cannot restore service (by using a workaround or readily available solution) might be valid, depending on your model. Some organizations will indeed do that. Many others (and I personally lean towards this idea too) will keep the resolution within incident management. The main reason: incident management is time driven (which is key to restoring service as soon as possible) while problem management is more quality driven. The result of this may be that incident management ends up discovering a root cause, even though that is not the intention of that process. However, the fact that incident management is not looking for a root cause does not mean it doesn't find root causes sometimes.

I do not understand what you mean by saying that incidents should not have a service call associated with them.
Slicster
Newbie


Joined: Oct 25, 2006
Posts: 8

Posted: Thu Nov 02, 2006 3:08 am    Post subject: Rebuttal

Thank you for the replies.

When I say an Incident becomes a problem, I actually mean that a problem record is created with a reference to the incident and the incident remains open. This would mean that Level 2 or 3 will find a solution or workaround to the assigned problem, and when one is found, the related incident or incidents can be closed.

The "Service Call" is specific to the software we use and our Incident Management team currently creates a "Service Call" for each call received and relates it to an incident which in my opinion is an extra reference that is not required. The Service Desk agent should be able to know right away if the caller is reporting an incident or not. "Service Calls" should only be created for "CODE-18" situations where a "Configuration Item" is not affected and the user is asking a question or is not using something correctly.

I just see the whole thing as being much simpler if the Service Desk is the only group that deals with incidents. This would help other teams in providing the Service Desk with workarounds or known errors for future incidents, which would then in turn create a triangle that has many incidents with fewer problems and even fewer changes.

Let me know what you guys think. Thanks.
UKVIKING
Senior Itiler


Joined: Sep 16, 2006
Posts: 3293
Location: London, UK

Posted: Thu Nov 02, 2006 4:43 am

If there is an incident and it goes to 2nd, 3rd or Nth level for the restoration of service... then it is still Incident Management

If the incident is marked for recommendation for finding the root cause of the loss/impact of service w/no concerns on the need to restore the service... then it is Problem Management

Case in point

Every Wintel machine using Win2k is Blue Screening. The servers and the desktops are blue screening like a field of smurfs. In order to restore service, the damn machines are rebooted. (Incident Management)

In order to find out what is causing the BSD, the error code is grabbed by the system support team to give to the Nth level support team for diagnosis. This team is unconcerned with the BSD (although people will breathe down their respective necks); they find out the root cause is the non-implementation of a security patch or service pack.

The team would then create a change request - most likely an Emergency - and request a CAB/EC to be convened. The Change Manager will most likely authorize the change immediately and inform the team to update the tickets after the patching is done.

The implementation team would download the patch and start patching key machines to see if the BSDs stop. If the BSDs stop, then the root cause has been found, and the Emergency Change can now be applied to all the rest of the BSD machines.

If the patch did not work, the implementation team would either uninstall the patch if possible or keep trying to find the root cause.

Meanwhile, the current workaround and solution to restore service is REBOOT the machine after a BSD. Every new incident is recorded, hopefully with the error code from the BSD. The Service Desk may have a method to link all of the incident tickets together and also reference the problem & change record - regardless of whether it worked for all... It may have worked for some.

After some deep analysis and involving the network types, the root cause was actually not the Service Pack or the security patch not being applied but the fact that a UDP or TCP port was left open in the network, where an unscrupulous person was sending data to force machines to reboot, like using the NUKE program or other admin-level programs which can be used remotely.

Once this was found, a NEW Change Record was created and another CAB/EC was held for the NETWORK team to close or disable the port(s) in question.

Meanwhile, the Known Error database and the Knowledge Base database would still have the workaround for BSDs for Incident management as reboot the damn box

A Post Major Outage review would reveal that the security for network ports needs to be reviewed and updated. That means the poor network types would have to analyse how poor the network infrastructure is or is not and how susceptible to hacks it is.
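
As a rough illustration of the bookkeeping in this example, here is a sketch assuming a hypothetical in-memory known error table and link table; the record IDs and the `handle_call` helper are made up for illustration. Each new call is logged as an incident, optionally linked to the problem record, and answered with whatever workaround is on file.

```python
from typing import Dict, List, Optional

# Hypothetical known error database: symptom -> workaround.
known_errors: Dict[str, str] = {"blue screen": "Reboot the machine"}

# Hypothetical link table: problem ID -> related incident and change IDs.
# CHG-1 stands for the patch change, CHG-2 for the later port change.
links: Dict[str, Dict[str, List[str]]] = {
    "PRB-1": {"incidents": [], "changes": ["CHG-1", "CHG-2"]}
}


def handle_call(
    incident_id: str, symptom: str, problem_id: Optional[str] = None
) -> Optional[str]:
    """Link the incident to a problem (if given) and return a workaround, if any."""
    if problem_id is not None:
        links[problem_id]["incidents"].append(incident_id)
    return known_errors.get(symptom)


workaround = handle_call("INC-101", "blue screen", "PRB-1")
unknown = handle_call("INC-102", "wrong data shown")
```

The workaround lookup serves incident management, while the link table is what lets the problem and change records be traced back to the incidents that triggered them.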
Slicster
Newbie


Joined: Oct 25, 2006
Posts: 8

Posted: Thu Nov 02, 2006 5:36 am    Post subject: ?

Just to be sure we're on the same page, when I mention L2/L3 I'm talking about leaving the Service Desk and going to either Network Admins, Microsystems Technicians or Developers.

If a user is getting a BSD, an incident is logged and SD troubleshoots the issue. They will usually perform a reboot, which in your scenario would restore the service. This means this particular incident is resolved, although other users have started calling with the same issue. Each user that calls generates a new incident record, but because there are multiple recurring incidents, root-cause analysis is required and a problem record is created for L2/L3.

I just don't see why an incident would ever be elevated to L2/L3 if the only point is to restore service as quickly as possible by using a workaround or solution. If they do not have a workaround or solution in their knowledge base, then a root-cause problem record is always required so the knowledge base can provide SD with a future workaround or fix.

Let's say the SD was unable to find a workaround to the BSD incident; then they need to create a problem record so L2/L3 can find a solution. They would not send each incident to L2/L3 in this scenario because L2/L3 would be bombarded with incidents! Whereas with a problem record they would find the solution and put all the information/workaround/solution in that problem record, which would automatically provide the SD with the solution or workaround to each recorded incident.

Let's say a user's XXXX software crashes and the Service Desk is unable to restore the user's service by searching the knowledge base or past incidents; they will create a Problem Record for Microsystems Technicians. Microsystems Technicians would find the root cause and provide a solution or workaround. Next time SD receives the same incident they will have enough information to restore the service.

It just seems too sloppy if everyone is working with incidents/problems and change requests. L2/L3 usually have enough of a load to not have to deal with filling in incidents all day. Maybe I'm looking at it backwards, but it seems more logical this way in my head? I have a diagram on my desk called the "Pyramid of Change" and it shows the top level being incident, then going down to problem and change. In the beginning it will be almost a square instead of a triangle, but as time goes on and knowledge transfers from the different levels to SD there will be fewer and fewer problem records.

This is all so confusing to me because so many people have so many different ideas and I just want to know if I'm making any sense at all. Thanks.
Marcel
Senior Itiler


Joined: Sep 21, 2006
Posts: 63
Location: USA

Posted: Thu Nov 02, 2006 1:44 pm    Post subject: Re: ?

By what you are suggesting, you are actually saying that if the SD (L1) cannot resolve an incident by implementing a readily available solution/work-around, restoring service ASAP is no longer your top priority. Instead, finding the root cause has now become your top priority. That's why I - and likely many others - do not agree with what you suggest.

Consider something more complicated than the BSD from the example. Let's say your users are calling about their application showing wrong data. This has not happened before, so there is no workaround lying on the shelf, ready to be implemented. The SD does not have the expertise to resolve this. They suspect that it may have something to do with a new release of the application that was implemented the night before, but they are not sure as they don't have detailed knowledge of the application. Now what do you want them to do? Create a problem ticket and assign this to a developer (L2) so he can do root cause analysis? Or assign the incident ticket to that same developer, so that he can first of all determine how to restore service ASAP (e.g. by backing out the new release)? I would say the latter.

Of course, once the service has been restored, you may want to work on this issue from a problem management perspective, because you do want to correct the error and reimplement the release.

Get the picture?
JoePearson
Senior Itiler


Joined: Oct 13, 2006
Posts: 116
Location: South Africa

Posted: Thu Nov 02, 2006 6:14 pm    Post subject: Re: ?

Marcel wrote:
By what you are suggesting, you are actually saying that if the SD (L1) cannot resolve an incident by implementing a readily available solution/work-around, restoring service ASAP is no longer your top priority. Instead, finding the root cause has now become your top priority. That's why I - and likely many others - do not agree with what you suggest.

That's a good way of putting it.

Another factor, which in fact follows on from this, is that L1/L2/L3 distinctions don't necessarily relate to the incident/problem distinction. That is more like: incident staff are directed to resolve incidents as quickly as possible; problem staff are directed to devote their time to in-depth investigation and root cause resolution.

Slicster, you talk about L2/L3 staff being too busy to deal with incidents all day. That's why it's best to separate incident and problem staff responsibilities if at all possible (even if they have a rota or something). But to resolve major incidents you may still need more skilled staff, who don't handle all incoming incidents, but are available for escalation - they are therefore L2/L3 incident staff. By the above logic they should still be devoted to incident handling not problem investigation.

The escalation routes for an open incident can go all the way to your suppliers without the work being classified as problem management.

Makes sense?
UKVIKING
Senior Itiler


Joined: Sep 16, 2006
Posts: 3293
Location: London, UK

Posted: Thu Nov 02, 2006 9:07 pm

Slicster,

Incident Management is the set of processes/procedures that restore service to the user/customer

It does not matter if the incident is referred up to L2/L3, it is still Incident Management - as long as the goal is to restore service

Remember the graphical steps in the Incident mgmt work flow from the ITIL Blue Book

Investigate & Diagnose....
Slicster
Newbie


Joined: Oct 25, 2006
Posts: 8

Posted: Thu Nov 02, 2006 11:22 pm    Post subject: OK

I understand what you are all saying but what happens when there are many users calling? Only 1 incident should get escalated and the rest stay at the SD? SD takes care of all other incidents once the initial one has a workaround or solution? Can you relate other incidents to a "major incident" or something like that?

If many users are calling, isn't it a Problem?

PROBLEM definition according to ITIL: A condition identified from multiple incidents exhibiting common symptoms, or from a single significant incident, indicative of a single error, for which the cause is unknown.

I would see an incident being escalated in the case of an individual user having a low priority incident. Ex. A user calls about an issue updating a distribution list that they work with on an infrequent basis. SD does some troubleshooting but is unable to restore this service, so the incident would be escalated to the Microsystems Technicians or Network Admins to investigate.
Slicster
Newbie


Joined: Oct 25, 2006
Posts: 8

Posted: Thu Nov 02, 2006 11:43 pm    Post subject: Addition

It becomes confusing for everyone when SD is working on an incident for which they cannot restore service, so they escalate it to the next level. If other users start calling, 1 or 2 more incidents may slip to the next level before a problem record gets created. This means that next level support now has multiple incidents and a problem to fill in once a solution or workaround is found. I am working in an organization with 1000+ users and I would see this situation happening all too often.
Marcel
Senior Itiler


Joined: Sep 21, 2006
Posts: 63
Location: USA

Posted: Fri Nov 03, 2006 3:45 am

Keep in mind that even when an incident ticket gets assigned to L2 or L3, the SD (L1) still owns the incident and drives the incident resolution process.

Indeed, you may receive multiple calls with regards to the same event. Most SD tools will allow you to create parent/child relationships (that is not the same as a major incident) between incident tickets. It would be the responsibility of the SD, first of all, to detect that multiple calls are about the same thing and then create these relationships. Of course, some tickets may slip through the cracks, but also L2 or L3 will still have the chance to create the parent/child relationships (the SD, with lower level expertise, may not always be able to realize that incidents are similar or otherwise related). By assigning the parent ticket to L2 or L3, the child tickets are still attached. By implementing the resolution for the parent incident, the child incidents will also be resolved (in case of let's say a server or network incident), or the SD could be given instructions on how to implement the resolution for each individual incident (in case of let's say desktop computer incidents).
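
The parent/child idea can be sketched roughly as follows. This is an illustration only, assuming a hypothetical `Incident` class rather than the data model of any real SD tool, and it covers the server or network case where resolving the parent also resolves every child.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Incident:
    """Hypothetical incident ticket with optional child tickets."""
    incident_id: str
    status: str = "open"
    assigned_to: str = "L1"
    children: List["Incident"] = field(default_factory=list)

    def add_child(self, child: "Incident") -> None:
        self.children.append(child)

    def resolve(self) -> None:
        """Resolve this incident and cascade to every attached child."""
        self.status = "resolved"
        for child in self.children:
            child.resolve()


parent = Incident("INC-1")
for n in (2, 3):
    parent.add_child(Incident(f"INC-{n}"))

parent.assigned_to = "L2"  # only the parent is escalated
parent.resolve()           # children stay at L1 but are resolved with it
```

For the desktop case, the cascade would be replaced by handing the SD instructions to apply to each child ticket individually.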
Slicster
Newbie


Joined: Oct 25, 2006
Posts: 8

Posted: Fri Nov 03, 2006 4:05 am

Our software does not allow us to relate incidents to other incidents. An SD agent would receive a call and verify whether there are any other incidents related to the same "Configuration Item", and if so a problem should be created because it is a multiple-occurring incident. Here's where I see an issue. LX will be working on restoring the service for the incident he was assigned, and at the same time he will be getting all the other incidents assigned to him, and by resolving one he will have to provide the workaround/solution for all the others as well. LX will then also have to find the root cause for the problem record! Too much work, no? LX gets the incident and tries to restore service as fast as possible but then has a bundle of work just to fill out the incidents, unless the remaining incidents were to stay at the SD until the first is restored. Although, that would shift the incident load to the SD instead. I think I see things differently because of our software limitations. If we were able to use Parent/Child for incidents it would be very simple to escalate incidents to other levels.
Slicster
Newbie


Joined: Oct 25, 2006
Posts: 8

Posted: Fri Nov 03, 2006 4:52 am    Post subject: Makes Sense...

The only way we would be able to use Parent/Child with our software and have simplified Incident elevation would be to use our "Service Call" feature. When an SD agent logs a call they will create a "Service Call", see if there are any already existing incidents for the user's issue, and relate the "Service Call" to the incident. The incident would then only have one record with multiple "Service Calls" related. It's kind of the same Parent/Child I was using before, but we're going down one level, so a Problem would only be created when multiple Incident records were found and a "real" root cause needs to be found to eliminate those future incidents from occurring. What do you think?
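
A small sketch of that arrangement, assuming hypothetical in-memory stores and a made-up `log_service_call` helper (the actual software's "Service Call" feature is not described in enough detail to model). It keys incidents on the "Configuration Item", so repeat calls attach to the existing incident instead of opening a new one.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical stores: one open incident per configuration item (CI),
# with every incoming "Service Call" attached to that incident.
incident_by_ci: Dict[str, str] = {}
calls_by_incident: Dict[str, List[str]] = defaultdict(list)


def log_service_call(call_id: str, ci: str) -> str:
    """Attach the call to the CI's existing incident, or open a new one."""
    if ci not in incident_by_ci:
        incident_by_ci[ci] = f"INC-{len(incident_by_ci) + 1}"
    incident_id = incident_by_ci[ci]
    calls_by_incident[incident_id].append(call_id)
    return incident_id


first = log_service_call("SC-1", "mail-server")
second = log_service_call("SC-2", "mail-server")
other = log_service_call("SC-3", "print-server")
```

Matching on the CI alone is a simplification; two unrelated faults on the same CI would be merged here, so a real rule would likely also compare symptoms.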
All times are GMT + 10 Hours