Posted: Tue Sep 01, 2009 5:30 am Post subject: Child Incident tickets and SLA reporting - advice needed
New to Incident Management and need some advice on how others are handling this:
1. Do you report on SLA for Child Incident tickets or just the Parent?
2. If you report on Child Incidents, do you assign them to the support team or the Service Desk? I see a lot of comments where companies leave Child Incident tickets assigned to the Service Desk, but wouldn't that be saying the Service Desk is responsible for the SLA on all these tickets? It would also affect their first-level resolution rate stats.
3. Our feeling is that Child Incidents should never be resolved until service is restored via an acceptable workaround or fix. But we're getting pushback from our support groups, who say that once an issue is identified in a Parent Incident ticket we should be able to resolve the Child tickets and let the caller know they can reference the Parent ticket #. Again, is this falsely reporting that X number of tickets were resolved faster than they actually were?
Joined: Mar 04, 2008 Posts: 1894 Location: Helensburgh
Posted: Tue Sep 01, 2009 6:45 pm Post subject:
I'm not sure what your precise definition of "child incident" is, but there is another thread that discussed this subject at some length.
What you report in an SLA entirely depends on what you have agreed with the business.
When I see this question phrased in this way, I get concerned lest people are trying to design the content of an SLA and hand it to the business. This would be bad, even if the business seems to be happy with it or encouraging it, because you would not be matching service to business requirements.
If I make some assumptions about your question, I can say the following:
There is a difference between a service being up and running and everyone being able to access it. This is a normal source of "child incidents".
The trouble is that there may be special action required for some users beyond simply restoring the app on the server or the network node (whatever was behind the issue) and therefore you cannot assume that everyone has a restored service unless you check.
If your SLA is concerned with availability to individual users over and above the general availability of a service, then you will obviously report these against your SLA.
However, unless you mean something else by "child incidents", you will most likely be counting one incident as having occurred regardless of how many people have reported or experienced its effects.
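As a rough illustration of counting one incident however many people report it, here is a small Python sketch. The ticket IDs, the `(ticket_id, parent_id)` record layout and the `count_incidents` helper are all made up for the example and do not come from any particular service desk tool.

```python
from collections import defaultdict

# Hypothetical ticket records: (ticket_id, parent_id or None for a parent).
TICKETS = [
    ("INC100", None),      # parent: the underlying incident
    ("INC101", "INC100"),  # child: a user report linked to the parent
    ("INC102", "INC100"),  # child: another report of the same incident
    ("INC103", None),      # unrelated standalone incident
]

def count_incidents(tickets):
    """Group reports under their parent and return {incident_id: report count}."""
    reports = defaultdict(int)
    for ticket_id, parent_id in tickets:
        root = parent_id or ticket_id  # a ticket without a parent is its own root
        reports[root] += 1
    return dict(reports)

summary = count_incidents(TICKETS)
incident_count = len(summary)  # distinct incidents, not distinct tickets
```

With this sample data, four tickets reduce to two incidents, and the per-incident report count is still available if the SLA also cares about how many users were affected.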
Thought for the day:
When a service goes down, the fact will be reported by some, but not all, of the users affected. Those who do not raise a report are just as important as those who do, but I have never seen a discussion on, for example, whether a "child incident" should be raised on their behalf. Yet some of them may experience difficulty after the incident is believed to be fixed. Consideration of this situation could enlighten the thinking on the overall subject.
_________________
"Method goes far to prevent trouble in business: for it makes the task easy, hinders confusion, saves abundance of time, and instructs those that have business depending, both what to do and what to hope."
William Penn 1644-1718
I think I understand your question. My take on it is:
1. SLA covers all tickets regardless of whether they are parents or children
2. Whichever team is responsible for the parent is also responsible for the children, and the ticket details should be updated and resolved at the same time as the parent.
3. I agree. It can only be resolved if there is a fix / workaround.
Hope this helps.
The important thing to remember about Service Level/Availability Reporting is that you report on service downtime, not directly on Incident duration. Incident duration is just the input into your calculation; you then need a good CMS/CMDB to understand the impact of each Incident on services. A single network outage could impact dozens of services, and personally I would not raise child records for each of these.
_________________
Liz Gallacher,
Accredited ITIL and ISO/IEC20000 Trainer and Consultant - Freelance
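To illustrate the point above about reporting service downtime rather than incident duration, here is a minimal Python sketch. The `CMDB` mapping, the incident records and the function names are all hypothetical, and it assumes every outage falls inside the reporting period.

```python
from datetime import datetime

# Hypothetical CMDB extract: configuration item -> services that depend on it.
CMDB = {
    "core-switch": ["email", "intranet"],
    "mail-server": ["email"],
}

# Hypothetical incident records: (failed CI, outage start, outage end).
INCIDENTS = [
    ("core-switch", datetime(2009, 9, 1, 9, 0), datetime(2009, 9, 1, 9, 30)),
    ("mail-server", datetime(2009, 9, 1, 9, 20), datetime(2009, 9, 1, 10, 0)),
]

def merge_intervals(intervals):
    """Merge overlapping outage windows so shared downtime is counted once."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def service_availability(incidents, cmdb, period_start, period_end):
    """Return {service: availability %}; assumes outages lie inside the period."""
    outages = {}
    for ci, start, end in incidents:
        # Use the CMDB to translate a failed CI into the services it affects.
        for service in cmdb.get(ci, []):
            outages.setdefault(service, []).append((start, end))
    period = (period_end - period_start).total_seconds()
    result = {}
    for service, windows in outages.items():
        down = sum((e - s).total_seconds() for s, e in merge_intervals(windows))
        result[service] = round(100.0 * (period - down) / period, 2)
    return result

report = service_availability(
    INCIDENTS, CMDB, datetime(2009, 9, 1), datetime(2009, 9, 2)
)
```

In this sample the two incidents overlap by ten minutes, so email is charged 60 minutes of downtime rather than the 70 minutes you would get by adding the incident durations together.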
Joined: Sep 16, 2006 Posts: 3595 Location: London, UK
Posted: Mon Sep 07, 2009 10:24 pm Post subject:
I would go along with Liz on this.
If you have a server farm which supports 200+ applications, services and functions, and the farm crashes for a short time, say 10 minutes, due to a power spike or some such, you could raise 200+ child incident tickets at 1 - 3 minutes each. By that time the outage is over, and it may turn out to have been a loose network connection, with the services not impacted, just not available, or the NMS tool may have suffered a hit.
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)
Change Management is POWER & CONTROL. /....evil laughter