Joined: Feb 12, 2007 Posts: 27 Location: Minneapolis, MN, USA
Posted: Sat Jan 29, 2011 7:51 am Post subject: How to measure MTBF or MTBSI
Mean Time Between Failures (MTBF) or Mean Time Between System Interruptions (MTBSI): how do you measure it?
If you have failures on Jan 1, then Feb 1, then March 1, you can say you have a 1 month MTBF. But what happens if it is now August and you haven't had a failure since? Do you still report a 1 month mean, even though the last failure was 5 months ago? Also, what time period do you go back to for your mean calculation? 12 months? The beginning of the calendar year?
MTBF is the mean elapsed time from the time an IT service or component is fully restored until the next occurrence of a failure in the same service or component.
As for MTBSI (Mean Time Between System Incidents), it is a metric used for measuring and reporting reliability. MTBSI is the mean time from when a system or IT service fails until it next fails. MTBSI is equal to MTBF + MTRS, where MTRS is the Mean Time to Restore Service.
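To make those definitions concrete, here is a minimal Python sketch. The incident log, the function names and the dates are all made up for illustration; they are not from any ITIL tool. It assumes each incident is recorded as a (failed, restored) pair and computes MTRS over the same failure-to-failure cycles as MTBF, so that MTBSI = MTBF + MTRS holds exactly.

```python
from datetime import datetime

# Hypothetical incident log: (time of failure, time service was restored).
incidents = [
    (datetime(2010, 1, 1, 9, 0), datetime(2010, 1, 1, 13, 0)),
    (datetime(2010, 2, 1, 9, 0), datetime(2010, 2, 1, 11, 0)),
    (datetime(2010, 3, 1, 9, 0), datetime(2010, 3, 1, 12, 0)),
]


def mean_hours(deltas):
    """Average a list of timedeltas and return the result in hours."""
    return sum(d.total_seconds() for d in deltas) / len(deltas) / 3600


def reliability_metrics(incidents):
    """Return MTBF, MTRS and MTBSI in hours over complete failure cycles."""
    incidents = sorted(incidents)
    if len(incidents) < 2:
        raise ValueError("need at least two incidents to form a complete cycle")
    pairs = list(zip(incidents, incidents[1:]))
    # MTBF: from full restoration until the next failure.
    uptimes = [nxt[0] - cur[1] for cur, nxt in pairs]
    # MTRS: from failure until restoration, for the incident opening each cycle.
    downtimes = [cur[1] - cur[0] for cur, _ in pairs]
    # MTBSI: from one failure until the next failure.
    cycles = [nxt[0] - cur[0] for cur, nxt in pairs]
    return {
        "MTBF": mean_hours(uptimes),
        "MTRS": mean_hours(downtimes),
        "MTBSI": mean_hours(cycles),
    }


metrics = reliability_metrics(incidents)
print(metrics)
```

Note that the last incident's restore time is deliberately left out of MTRS here, because the cycle it opens is not yet complete. If you average restore time over every incident instead, the identity only holds approximately.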
Regards, _________________ Ali Makahleh
Configuration Management(Blue Badge),
ITILV2 Service Manager(Red Badge),
ITILV3 Expert(Lilac Badge) Certified.
"If you can't describe what you are doing as a process, you don't know what you're doing." W. Edwards Deming.
Joined: Feb 12, 2007 Posts: 27 Location: Minneapolis, MN, USA
Posted: Tue Feb 01, 2011 2:12 am Post subject:
Sure, I understand the definitions. I hope you can see where I have a problem. If there isn't a "next occurrence", how do you calculate a mean? Or if your previous mean was 1 month, but we have yet to have the next occurrence several months later, how does that play in the calculation? Please see my example.
Joined: Sep 16, 2006 Posts: 3110 Location: London, UK
Posted: Tue Feb 01, 2011 2:48 am Post subject:
DanA
When there is only one incident, I would do the following in Excel.
The date of the first (only) incident would be in B2.
In C2, I would use the function NOW() to get the current date / time and add .25 (1/4 of a day). I would then have two distinct columns:
1 - Number of days since last incident (Failure)
2 - MTBF
Since the two columns would show the same number of days, this visually states that there has not been another incident. _________________ John Hardesty
ITSM Manager's Certificate (Red Badge)
Change Management is POWER & CONTROL. /....evil laughter
Joined: Mar 04, 2008 Posts: 1883 Location: Newcastle-under-Lyme
Posted: Tue Feb 01, 2011 5:38 am Post subject:
DanA wrote:
If there isn't a "next occurrence", how do you calculate a mean?
It is always the case that the next occurrence has not happened. It has nothing to do with how long it has been since the last one. Any calculation using the time since the last incident is really saying: 'if we had an incident today, the mean time would be x'.
Since the incident has not occurred, this is pretty meaningless until x is as great as the mean time calculated up to the point of the last incident. From then on, x is relevant, as it indicates some notion of improvement; whether fortuitous or otherwise is another matter.
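As a rough illustration of that reasoning, the Python sketch below uses the dates from the original question (the year 2010 and the function name are assumptions for the example). It works from failure dates only, ignoring restore time, so strictly speaking it measures failure-to-failure intervals. It reports the mean up to the last failure, the 'if we had an incident today' mean x, and whether x has caught up with the historical figure.

```python
from datetime import date


def mtbf_with_open_interval(failures, today):
    """Compare the historical mean interval with the 'incident today' mean.

    Returns (historical_mean, days_since_last, hypothetical_mean, relevant),
    all in days. 'relevant' is True once the hypothetical mean is at least
    as great as the mean calculated up to the last failure.
    """
    failures = sorted(failures)
    if len(failures) < 2:
        raise ValueError("need at least two failures for a historical mean")
    gaps = [(b - a).days for a, b in zip(failures, failures[1:])]
    historical_mean = sum(gaps) / len(gaps)
    days_since_last = (today - failures[-1]).days
    # Pretend a failure happened today and fold that interval into the mean.
    hypothetical_mean = (sum(gaps) + days_since_last) / (len(gaps) + 1)
    relevant = hypothetical_mean >= historical_mean
    return historical_mean, days_since_last, hypothetical_mean, relevant


failures = [date(2010, 1, 1), date(2010, 2, 1), date(2010, 3, 1)]
print(mtbf_with_open_interval(failures, date(2010, 8, 1)))
print(mtbf_with_open_interval(failures, date(2010, 3, 15)))
```

With the August date the open interval is 153 days against a historical mean of 29.5 days, so x is well past the old figure and worth reporting alongside it. With the mid-March date x is still below the historical mean and tells you nothing new.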
As to how far back you go: well, you go back to the beginning of time. At least, you go back far enough to satisfy your customer, and far enough to give you valid information with which to analyse your service record and design improvement goals. Whether that is a few months or a few years probably has more to do with how much your services and service objectives change over time than with some artificial concept like a year. _________________ "Method goes far to prevent trouble in business: for it makes the task easy, hinders confusion, saves abundance of time, and instructs those that have business depending, both what to do and what to hope."
William Penn 1644-1718
Joined: Feb 12, 2007 Posts: 27 Location: Minneapolis, MN, USA
Posted: Thu Feb 03, 2011 1:51 am Post subject:
I like this answer and I'll follow up with another question: Is there a better measurement? Seems that Availability combined with MTBSI gives a decent view of your overall % uptime, plus adds measurement of stability. Is anyone doing anything different?
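A small Python sketch of why the two measures complement each other, using the common relationship availability = MTBF / (MTBF + MTRS), which is MTBF / MTBSI. The two services and their figures are invented for illustration.

```python
def availability(mtbf_hours, mtrs_hours):
    """Fraction of time the service is up: MTBF / (MTBF + MTRS)."""
    return mtbf_hours / (mtbf_hours + mtrs_hours)


# Two hypothetical services with the same availability but different stability.
services = {
    "Service A": {"mtbf": 705.0, "mtrs": 3.0},   # rare, longer outages
    "Service B": {"mtbf": 70.5, "mtrs": 0.3},    # frequent, short outages
}

for name, s in services.items():
    mtbsi = s["mtbf"] + s["mtrs"]
    pct = 100 * availability(s["mtbf"], s["mtrs"])
    print(f"{name}: availability {pct:.2f}%, MTBSI {mtbsi:.1f} hours")
```

Both services come out at roughly the same percentage uptime, but Service B is interrupted about ten times as often. Availability alone hides that difference; MTBSI shows it.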