Define an Outage

 
ChrisB
Newbie


Joined: Nov 20, 2007
Posts: 4

Posted: Wed Nov 21, 2007 3:45 am    Post subject: Define an Outage

During a cross-functional meeting today, an interesting observation was raised. What, in fact, is an outage?

Sometimes, the answer is black & white. A service is completely unavailable because somebody shut a server down or some such event.

Other times, however, we have a grey area. Suppose, for example, that a server is experiencing a "memory leak" due to a particular application. The server is up and running, the application is up and running, but response is so slow that the system is unusable. Although the switch is on, nobody can use the system to do their job - or maybe 1% of the people can - or 5% - or 25%?

Or suppose that an application is up and running, but for some reason a popular module of the application is non-responsive. The application (service) isn't down, but for all intents and purposes it's disrupted.

We expect our technicians to tell us if a service is or is not experiencing an outage. Left to their own judgement, we get inconsistency across the enterprise (this is a Fortune 500 company).

What black & white criteria can we present to our technicians to define what constitutes an outage?
Mark-OLoughlin
Senior Itiler


Joined: Oct 12, 2007
Posts: 306
Location: Ireland

Posted: Wed Nov 21, 2007 4:05 am

Hi,

I define an outage along these lines, to cover exactly what you have mentioned:

The system/service/CI is unavailable/down, or disrupted to such an extent that it may as well be inaccessible, and users of the service/system/CI are unable to work.

The key is to include "... or disrupted to such an extent ..."
_________________
Mark O'Loughlin
ITSM / ITIL Consultant
dboylan
Senior Itiler


Joined: Jan 03, 2007
Posts: 189
Location: Redmond, WA

Posted: Wed Nov 21, 2007 6:56 am    Post subject: Re: Define an Outage

I think you may be allowing the Technical side of the house to define an outage. According to ITIL, the Business is the group that defines an outage. This is done through the Service Level Management process and encompasses both interruptions of service and performance degradation.

Also, ITIL doesn't use the term Outage except when the Availability Management process is calculating availability. ITIL uses the term Incident. An Incident is defined as any event outside the normal delivery of a service that causes (or may cause) an interruption or degradation in the service.

Per this definition, a server in a cluster that fails would meet the criteria of an Incident. The increased risk of an outage and the increased risk of performance degradation, even though there was no perceivable effect on the delivery of the service, are grounds for it to be considered an Incident.

If you are measuring Availability and need to know the times when services are unavailable, then it is up to Service Level Management to define what levels of service delivery are acceptable to the business.
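
As a rough illustration of the point above, here is a minimal Python sketch of an availability calculation in which a slow-but-running service counts as unavailable. The `Sample` structure, the function names and the 5-second threshold are all assumptions made for the example; none of them comes from ITIL or from this thread. In practice the threshold would be whatever the business agreed to through Service Level Management.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One monitoring sample for a service (hypothetical structure)."""
    reachable: bool          # did the service answer at all?
    response_seconds: float  # measured response time for the probe


def is_available(sample: Sample, max_response_seconds: float) -> bool:
    """A sample counts as available only if the service answered
    within the response time agreed with the business (assumed SLA term)."""
    return sample.reachable and sample.response_seconds <= max_response_seconds


def availability_percent(samples: list[Sample], max_response_seconds: float) -> float:
    """Share of samples that met the agreed service level, as a percentage."""
    if not samples:
        raise ValueError("no samples to measure")
    good = sum(is_available(s, max_response_seconds) for s in samples)
    return 100.0 * good / len(samples)


samples = [
    Sample(True, 0.8),
    Sample(True, 1.2),
    Sample(True, 45.0),   # up, but too slow to use
    Sample(False, 0.0),   # hard down
]
print(availability_percent(samples, max_response_seconds=5.0))  # 50.0
```

With the same samples, relaxing the threshold to 60 seconds would report 75% instead of 50%, which is why the threshold has to be agreed rather than left to each technician.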

Don
ChrisB
Newbie


Joined: Nov 20, 2007
Posts: 4

Posted: Wed Nov 21, 2007 7:59 am

Considering that I wear both the Availability and SLM hats within the organization, are you saying that the definition of "Outage" is negotiable with the customer? I was hoping that there was some black & white definition within some ITIL book that I've just overlooked.
dboylan
Senior Itiler


Joined: Jan 03, 2007
Posts: 189
Location: Redmond, WA

Posted: Wed Nov 21, 2007 11:45 am

Per ITIL, every aspect in IT is decided through negotiations with the business.
UKVIKING
Senior Itiler


Joined: Sep 16, 2006
Posts: 3318
Location: London, UK

Posted: Wed Nov 21, 2007 7:45 pm

ChrisB

As stated already, an outage is defined based on context.

If there is a cluster of 10 web servers and 1 panic reboots, the service is NOT affected (directly) but the server is affected (directly).

There should be an incident record tracking the panic reboot, because a panic reboot is not part of normal service. (DIGRESS: Just because a Microsoft O/S reboots as part of its poor design does not mean it is part of the service. GRIN. END DIGRESS)

So from an Incident Mgmt POV, there was an incident with an outage / downtime while the server was rebooting.

From a Problem Mgmt POV, should this incident be used as a reason to initiate a Problem record? Does the same server panic reboot often? Do servers with the same O/S, service pack, patch level, architecture, make/model or applications always panic reboot? Look for trends.

From an Availability Management POV, where it tracks the Web Service, there was NO downtime, as 9 of the 10 servers still provided service.
If AM is tracking % of a cluster and performance metrics, and the service still hummed along... no outage. However, if AM tracks the service performance and the performance failed to meet the defined spec, then AM is impacted.

From an SLM POV, it depends on how the SLA is written. If the SLA states that the Web Service, based on clustered servers, is within SLA as long as #% of the cluster is active and the # performance target is met, then the SLA is either breached or not depending on those terms.

In other words.

IT DEPENDS...... which is the paradigm for ITIL
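
To make the "it depends" concrete, the SLM view of the clustered web service could be sketched roughly as below. This is only an illustration: the `ClusterSla` structure, the 80% capacity figure and the 2-second response target are hypothetical stand-ins for the "#" placeholders in the post, and a real SLA would supply its own terms.

```python
from dataclasses import dataclass


@dataclass
class ClusterSla:
    """Hypothetical SLA terms for a clustered web service."""
    min_active_fraction: float   # e.g. 0.8 means 80% of nodes must be up
    max_response_seconds: float  # agreed performance target


def sla_breached(active_nodes: int, total_nodes: int,
                 response_seconds: float, sla: ClusterSla) -> bool:
    """True if either the capacity term or the performance term is missed."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    active_fraction = active_nodes / total_nodes
    return (active_fraction < sla.min_active_fraction
            or response_seconds > sla.max_response_seconds)


sla = ClusterSla(min_active_fraction=0.8, max_response_seconds=2.0)

# One of ten servers panic-reboots; the service still responds quickly.
print(sla_breached(9, 10, 0.5, sla))   # False
# Same failure, but the remaining nodes cannot keep up.
print(sla_breached(9, 10, 3.5, sla))   # True
```

The same server failure yields two different answers depending only on the measured performance and the written terms, which gives technicians a repeatable rule instead of a judgement call.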
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)

Change Management is POWER & CONTROL. /....evil laughter
All times are GMT + 10 Hours