Posted: Fri Nov 16, 2007 8:48 pm Post subject: Metrics: correlation between unresolved incidents and age
Metrics: correlation between the number of unresolved incidents and their average age.
I’ve a question for metrics experts. This is about creating a performance indicator on a couple of metrics.
I’m refining the reporting I do on a weekly basis on the service desk activity and I’ve found an interesting metric: the average age of the stock
(here I call “the stock” the collection of unresolved incidents at an instant T, usually the end of the working day)
In other words, every day at 18:00 I record the number of unresolved incidents and their average age (instant_T - creation_time).
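As a rough illustration of that daily snapshot, here is a minimal Python sketch. The (created, resolved) tuple layout, the function name stock_snapshot and the sample dates are assumptions made for the example; they do not come from any particular service desk tool.

```python
from datetime import datetime

def stock_snapshot(incidents, instant_t):
    """Return (count, average_age_hours) for incidents unresolved at instant_t.

    Each incident is a (created, resolved) pair; resolved is None while open.
    """
    ages = [
        (instant_t - created).total_seconds() / 3600
        for created, resolved in incidents
        if created <= instant_t and (resolved is None or resolved > instant_t)
    ]
    count = len(ages)
    average = sum(ages) / count if count else 0.0
    return count, average

# Hypothetical sample data: two incidents still open, one resolved before T.
incidents = [
    (datetime(2007, 11, 12, 9, 0), None),
    (datetime(2007, 11, 16, 14, 0), None),
    (datetime(2007, 11, 15, 10, 0), datetime(2007, 11, 16, 11, 0)),
]
count, avg_age = stock_snapshot(incidents, datetime(2007, 11, 16, 18, 0))
print(count, avg_age)
```

Ages are measured in elapsed hours here; counting business hours only would need a calendar-aware age calculation.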
I’m interested in extracting a trend to understand whether, week after week and month after month, we are improving our service. That is, whether incidents sit unresolved for less and less time.
The problem is that plotting the average age of the stock day after day for a given period and drawing a trend line is not enough to produce a valid performance indicator.
If one day (day 1) I have a lot of unresolved incidents in the stock and most of them were opened recently, the average age is low.
If most of these “young” incidents are resolved the next day (day 2) and no new incidents are logged in the meantime, the stock holds fewer incidents but has a higher average age.
From my point of view the situation on day 2 is better than on day 1, but if you look only at the average age of the stock you could conclude the opposite.
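To make the day 1 / day 2 effect concrete, here is a small sketch with made-up numbers (ages in days). The figures are purely illustrative and are not taken from real service desk data.

```python
from statistics import mean

# Day 1: eight incidents opened yesterday plus two old ones (hypothetical ages, in days).
day1_ages = [1] * 8 + [20, 20]

# Day 2: the eight young incidents are resolved, nothing new is logged,
# and the two old incidents are one day older.
day2_ages = [21, 21]

day1 = (len(day1_ages), mean(day1_ages))
day2 = (len(day2_ages), mean(day2_ages))

print("day 1: count=%d, average age=%.1f days" % day1)
print("day 2: count=%d, average age=%.1f days" % day2)
```

The stock shrinks from 10 incidents to 2, yet the average age rises from 4.8 to 21 days, so the average alone points the wrong way.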
That’s why I’m looking for a way to correlate the average age of the stock with the number of incidents in it (the number of unresolved incidents) to produce a valid performance indicator, but so far I haven’t been able to find a good formula.
Any ideas? Any contribution on this topic is more than welcome!
In your report, calls that are logged late in the day still get into the report. But just because they were logged late in the day, should they be reported on?
Do you have an SLA for incident resolution? This can be as basic as: we will resolve P1s in 4 hours, P2s in 8 hours and P3s in 24 hours (all hours being business hours, if that is the requirement).
You then run a report to show the calls resolved in SLA and out of SLA. For those out of SLA you can then run the ageing report to show the trend for out-of-SLA calls. If you can reduce the number of calls that are out of SLA and the amount of time that they remain out of SLA, you can show improved performance.
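A minimal sketch of such an in/out of SLA report, using the example targets above (P1 in 4 hours, P2 in 8, P3 in 24). The (priority, hours_to_resolve) layout, the name sla_report and the sample calls are assumptions for illustration, and the hours are treated as already being business hours.

```python
# Example targets from the post; adjust to the real SLA.
SLA_HOURS = {"P1": 4, "P2": 8, "P3": 24}

def sla_report(calls):
    """calls: list of (priority, hours_to_resolve).

    Returns (in_sla, out_sla, avg_overrun_hours), where the overrun is the
    time a call spent beyond its SLA target.
    """
    overruns = [
        hours - SLA_HOURS[priority]
        for priority, hours in calls
        if hours > SLA_HOURS[priority]
    ]
    out_sla = len(overruns)
    in_sla = len(calls) - out_sla
    avg_overrun = sum(overruns) / out_sla if out_sla else 0.0
    return in_sla, out_sla, avg_overrun

# Hypothetical resolved calls for one reporting period.
calls = [("P1", 3.0), ("P1", 6.0), ("P2", 8.0), ("P3", 30.0)]
report = sla_report(calls)
print(report)
```

Tracking both the out-of-SLA count and the average overrun week by week gives the two trends mentioned above.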
I am not convinced of the value of the report you mentioned if it just shows the age of the calls. It has to be tied to a metric, and the main ones that show value are increasing First Contact Resolution (no ageing of calls) and reducing out-of-SLA calls (reducing the ageing).
That's not to say that what you are reporting may not be valuable to your org. _________________ Mark O'Loughlin
ITSM / ITIL Consultant
I’ve got your point and I agree with you. I already extract an indicator which answers your question: the average resolution time per priority class.
As I said in my previous post, I’m actually “refining” the report; this is more about adding some complementary information to the existing set of consolidated metrics than producing the big picture.
(Having said that, it is quite probable that I’ll conclude that this information should not be included in the report. There is no point in having data that is not useful for decision making.)
What I’d like to get from analysing metrics on the pile of unresolved incidents on a daily basis is information about the turnover of such incidents. Do we have a couple of long-lasting odd incidents, or do incidents turn over quickly?
So far I’ve found two ways to correlate the two metrics listed above (number of unresolved incidents “metric_1” and average age of such incidents “metric_2”) to get an indicator:
1) Simple way: produce the total age of the “stock” day by day (metric_1 multiplied by metric_2). This figure says nothing by itself but is interesting when analysed as a trend over a medium period. If the line goes down it’s a good sign; if it grows… you should look at what’s going on within the SD activity…
2) A bit more complicated way: consider the delta day by day of the two metrics listed above. Then compare the signs of these variations. In a normal situation metric_1 and metric_2 should vary oppositely.
IF (sign (delta metric_1) + sign (delta metric_2) ) = 2 -> warning: both values are increasing so something is out of control
IF (sign (delta metric_1) + sign (delta metric_2) ) in (-1,0,1) -> don’t care (normal behaviour)
IF (sign (delta metric_1) + sign (delta metric_2) ) = -2 -> very good: both values are decreasing, the Service Desk is improving
In other words, this second indicator is interesting as an alarm bell.
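Both indicators can be sketched in a few lines of Python. The snapshot values below are invented for illustration, and the helper names (sign, total_age, alarm) are hypothetical; each snapshot is a (metric_1, metric_2) pair, i.e. (number of unresolved incidents, average age in days).

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def total_age(snapshot):
    """Indicator 1: total age of the stock = count * average age."""
    count, average_age = snapshot
    return count * average_age

def alarm(prev, curr):
    """Indicator 2: compare the signs of the day-to-day deltas of both metrics."""
    score = sign(curr[0] - prev[0]) + sign(curr[1] - prev[1])
    if score == 2:
        return "warning"    # more incidents AND older on average
    if score == -2:
        return "very good"  # fewer incidents AND younger on average
    return "normal"         # the metrics moved in opposite directions, or one is flat

# Hypothetical end-of-day snapshots for four consecutive days.
snapshots = [(10, 4.8), (2, 21.0), (5, 22.0), (4, 20.0)]

totals = [total_age(s) for s in snapshots]
alarms = [alarm(a, b) for a, b in zip(snapshots, snapshots[1:])]
print(totals)
print(alarms)
```

With these numbers the total age drops from about 48 to 42 incident-days between the first two days even though the average age jumps, which matches the intuition that the second day is the better one.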
About your last point, the First Contact Resolution metric: in our context we don’t take this information into account.
Regarding First Contact Resolution: if you measure this and look to increase the % of First Contact Resolution, you are actually decreasing the number of tickets that get left over. FCR may be low to start with, but over time and given dedication the line will move upwards and positive results will show. Is this a better indicator to show management? It also shows that a proactive approach is being taken.
Ask yourself: do you monitor these "stock" tickets, or do you look to reduce them by measuring FCR and increasing the FCR rate? It could be chicken and egg in your situation as to which way you go.
Granted, to increase FCR you have to train staff and have a good knowledge base, but the overall benefit to the business is the justification.
Anyway, best of luck in your endeavours. _________________ Mark O'Loughlin
ITSM / ITIL Consultant