AndyW Senior Itiler

Joined: Feb 14, 2008 Posts: 77
|
Posted: Thu Feb 14, 2008 11:03 pm Post subject: How to handle Server Reboots? |
|
|
Hi,
I have a question regarding Server Reboots.
Which ITIL process handles server reboots? I was under the impression that we should raise either a standard or an emergency RfC for every server reboot, in order to assess the possible impact the reboot might have on other systems and to have documented evidence.
But now I am thinking of handling this within Incident Management.
What is best practice? |
|
dboylan Senior Itiler

Joined: Jan 03, 2007 Posts: 189 Location: Redmond, WA
|
Posted: Thu Feb 14, 2008 11:06 pm Post subject: |
|
|
| Server reboots should be handled through the Change Management process. When the RFC has had approval from the business, then the outage time is considered a Maintenance Window. Outages that occur during Maintenance Windows are not considered Incidents and do not count against Availability. |
|
AndyW Senior Itiler

Joined: Feb 14, 2008 Posts: 77
|
Posted: Thu Feb 14, 2008 11:27 pm Post subject: |
|
|
OK, but we have a very complex CHG process in place. Only the process for Standard or Emergency changes is less complex and pretty straightforward. That's why I have so far made people raise emergency tickets for reboots.
Unfortunately, many of our reboots are done outside a Maintenance Window for various reasons, for example because a process crashed on the application server.
The question is whether this could be covered by the INC process instead. For example, people raise a ticket for the reboot (documented evidence) and work out the possible impact offline. Then they simply follow a so-called "operational procedure", e.g. reboot the server and wait. |
|
Mark-OLoughlin Senior Itiler

Joined: Oct 12, 2007 Posts: 306 Location: Ireland
|
Posted: Thu Feb 14, 2008 11:37 pm Post subject: |
|
|
The reason server reboots are considered to be under change control relates to your last comment: "Then they simply follow so called "operational procedure" eg. reboot server and wait".
Who has looked at the possible knock-on effects of the reboot?
Who has signed off on the reboot from a server owner's point of view?
Who has told the service about this, etc.? Incident is concerned with getting users back ASAP. However, the conflict with reboots is that while a reboot may get some users back quickly, it may also affect others (loss of service) while it runs. Change is a form of control to minimise the risk and impact of changes.
My advice is to keep these under emergency change control, but define a way of getting them approved and executed quickly under the conditions of the emergency change process.
_________________
Mark O'Loughlin
ITSM / ITIL Consultant |
|
UKVIKING Senior Itiler

Joined: Sep 16, 2006 Posts: 3110 Location: London, UK
|
Posted: Fri Feb 15, 2008 12:00 am Post subject: |
|
|
My only addendum to this regarding change mgmt, incident mgmt and reboots:
If the reboot is part of an incident resolution process to restore service, then the incident should be used.
If the reboot is going to happen at a specific date/time for a reason, then change mgmt should control it.
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)
Change Management is POWER & CONTROL. /....evil laughter |
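The two-way rule in the post above can be sketched as a small routing function. This is only an illustration: the `RebootRequest` fields and the `controlling_process` name are hypothetical, and two choices are assumptions rather than anything stated in the thread: a planned date/time takes precedence when both fields are set, and a request that matches neither case is returned as `"unclassified"`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class RebootRequest:
    """Hypothetical record describing a requested server reboot."""
    server: str
    incident_id: Optional[str] = None          # set when the reboot resolves an open incident
    scheduled_for: Optional[datetime] = None   # set when the reboot is planned for a date/time


def controlling_process(req: RebootRequest) -> str:
    """Return which process should control the reboot."""
    # Assumption: a planned date/time wins if both fields are set.
    if req.scheduled_for is not None:
        return "change"
    # Reboot carried out as part of restoring service for an incident.
    if req.incident_id is not None:
        return "incident"
    # Assumption: the rule does not cover this case, so flag it for a human.
    return "unclassified"
```

For example, `controlling_process(RebootRequest("web01", incident_id="INC-1"))` returns `"incident"`, while the same server with a `scheduled_for` value returns `"change"`.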
|
AndyW Senior Itiler

Joined: Feb 14, 2008 Posts: 77
|
Posted: Fri Feb 15, 2008 12:26 am Post subject: |
|
|
OK, let's assume the following:
Out of the blue an application develops a problem for which a reboot would be necessary. Users call the Service Desk, which cannot fix the issue within the INC process and routes it to the 2nd-level resolver group. They analyse the issue and figure out that a reboot is necessary.
Do they then raise an emergency change for this? (Please also keep in mind that this would drastically increase our number of emergency changes.)
Or do they simply update the incident ticket? |
|
Mark-OLoughlin Senior Itiler

Joined: Oct 12, 2007 Posts: 306 Location: Ireland
|
Posted: Fri Feb 15, 2008 12:52 am Post subject: |
|
|
Hi,
"please also keep in mind this would increase our statistic of the number of emergency changes drastically" - so what? Whether the number increases or decreases is not the point. The numbers are there to tell us something. If your emergency changes increase, the story they point to is that you currently have an unstable environment, which in turn should tell you to investigate why: lack of upgrades, monitoring, proactive measures, etc.?
My preference is to require an Emergency Change to ensure the correct controls are in place and followed. The change can be logged retrospectively so that admin does not bog us down. Only minimal approvers are required, but someone in the business must approve the reboot. The reason is that other users may be affected by the server being rebooted and additional calls may be logged with the service desk. Just having L2 reboot the server will not address either of these.
I understand that the other view is to use incident records and process. I don't care, as long as the following takes place:
1) Business approves the reboot - so they are aware of any impact
2) Service Desk is made aware as they could get increased calls
3) Reboots are trended to see if a problem can be identified
_________________
Mark O'Loughlin
ITSM / ITIL Consultant |
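Point 3 above, trending reboots, can be sketched as a simple count per server. This is a minimal illustration under stated assumptions: the record layout (server name plus the ID of the ticket that logged the reboot), the function name `servers_to_investigate`, and the default threshold of 3 are all hypothetical and not taken from the thread or from ITIL.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def servers_to_investigate(
    reboots: Iterable[Tuple[str, str]],
    threshold: int = 3,  # assumed cut-off; tune to your own environment
) -> List[str]:
    """Return servers rebooted at least `threshold` times, sorted by name.

    Each record is (server_name, ticket_id); the ticket may be an incident
    or a change, whichever process logged the reboot.
    """
    counts = Counter(server for server, _ticket in reboots)
    return sorted(server for server, n in counts.items() if n >= threshold)
```

Servers returned by this function would be candidates for a Problem record, whichever of the two processes ends up logging the individual reboots.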
|
UKVIKING Senior Itiler

Joined: Sep 16, 2006 Posts: 3110 Location: London, UK
|
Posted: Fri Feb 15, 2008 1:02 am Post subject: |
|
|
Like Mark says.
While I would prefer the incident to be the control, there needs to be a control / audit trail of the problem.
That being said, if the service is NOT impacted by the incident, a reboot is uncalled for.
But even if the service is impacted so badly that a system reboot is needed, the resolution team needs to do due diligence and make sure the NOC etc. are aware of the plan to reboot the system.
But to deny an emergency change because it changes the stats means you are focusing on the stats rather than the actual work.
Pragmatism is needed, as well as a reality check.
Things go wrong. Sometimes a lot. Track it, count it, explain it. But don't fret over the fact that numbers go up and down. They will.
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)
Change Management is POWER & CONTROL. /....evil laughter |
|
AndyW Senior Itiler

Joined: Feb 14, 2008 Posts: 77
|
Posted: Tue Mar 04, 2008 1:42 am Post subject: |
|
|
Hi,
Is there something "official" in the ITIL books that answers this question? How does the ITIL framework address this issue?
What about the problem that the Change Manager is not available 24 by 7?
Say a reboot is necessary: an emergency change is created, but the Change Manager is not available.
Of course emergency changes can be raised afterwards, but only for documentation purposes. That does not serve the objective, since no impact assessment will be done.
Cheers, |
|
UKVIKING Senior Itiler

Joined: Sep 16, 2006 Posts: 3110 Location: London, UK
|
Posted: Tue Mar 04, 2008 1:54 am Post subject: |
|
|
AndyW
There is nothing official about whether reboots should fall under change management or not...
It just depends.
A reboot or restart of a single server that provides a service interrupts that service. Therefore, because of the interruption, a change request (if that is what the company wants to do) should be used to track a request out of the blue to reboot the server.
If the reboot is part of the incident resolution (web site hung... reboot/restart the d**n Windows server), then there is no real need to double-ticket the work.
If the networks or systems teams want to reboot a server / router on Wednesday for 1 to # weeks, then the best way to let people know about it is to push it through change control.
As a reboot merely switches the machine off and on, there is no change in the machine's configuration, just in its powered state. Therefore under strict ITIL, where changes are for configuration work on CIs, it doesn't really apply.
But you need to track these somewhere.
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)
Change Management is POWER & CONTROL. /....evil laughter |
|
asrilrm Senior Itiler

Joined: Oct 07, 2007 Posts: 441 Location: Jakarta, INA
|
Posted: Tue Mar 04, 2008 12:22 pm Post subject: |
|
|
I would place it under incident management, but my company has decided to put it under service request management.
We had some reasons not to put it under CM:
- As John said, it is merely an on-off switch, with no change to CIs.
- We could log them as emergency RFCs, but that would contradict the spirit of CM, which is "every change should be well planned". Imagine there are many such emergencies: it would affect the CM team's performance.
As to Mark's thought, of course an emergency change is required, but let's say that in one period the number of emergency changes exceeds the number of normal changes; that would give a bad impression of the CM team. |
|
AndyW Senior Itiler

Joined: Feb 14, 2008 Posts: 77
|
Posted: Tue Mar 04, 2008 5:41 pm Post subject: |
|
|
Hi Asrilrm,
thanks for sharing your ideas.
Some remarks from my side:
Regarding your concern about the high number of emergency RfCs: yes, this is a valid point, and I had thought about it too. But I think the disadvantages are acceptable, especially if you can highlight them in your reporting, e.g. "Emergency Changes due to Reboots".
I have decided to put reboots under change control rather than under the control of INC.
INC would only create a ticket and reboot the server without thinking about the impact this has on other applications. Since we have had a lot of trouble due to reboots, I clearly see this under CHG control.
Cheers, |
|
Ed Senior Itiler

Joined: Feb 28, 2006 Posts: 411 Location: Coventry, England
|
Posted: Tue Mar 04, 2008 6:15 pm Post subject: |
|
|
| UKVIKING wrote: |
There is nothing official about whether reboots should fall under change management or not...
As a reboot merely switches the machine off and on, there is no change in the machine's configuration, just in its powered state. Therefore under strict ITIL, where changes are for configuration work on CIs, it doesn't really apply.
But you need to track these somewhere. |
Sorry John, I just cannot let this pass.
In strict ITIL terms:
The service itself should be a CI.
In its pre-reboot state the server is either down or frozen, denying the user the service it provides. The CI is not available.
Rebooting the server restores the service to the user, thereby changing the state of a CI. This requires a Change.
I do allow a Standard Change for this, recognising that we know the risks and that they are well documented.
_________________
Regards
Ed |
|
UKVIKING Senior Itiler

Joined: Sep 16, 2006 Posts: 3110 Location: London, UK
|
Posted: Tue Mar 04, 2008 7:04 pm Post subject: |
|
|
Ed,
Then we will have to disagree.
A reboot of a server in itself, or a restart of a service like a web service on a server, is not a configuration change.
How can the configuration attributes be changed? It is either On or Off.
I agree that if the server is part of an identified service, then that's another kettle of fish.
Run the change process for the impact (or lack thereof) of the power cycle.
NOTE: I know Unix servers that reboot within 2-3 minutes, while Microsoft servers can take tens of minutes.
The Unix server rebooted so fast that, with the minimum threshold for an NMS event at 5 minutes, we never caught it except through log files.
THAT being said....
If someone wants the server to be rebooted / restarted, then there needs to be a trail/control/management of that request, and change is the most appropriate.
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)
Change Management is POWER & CONTROL. /....evil laughter |
|
Mark-OLoughlin Senior Itiler

Joined: Oct 12, 2007 Posts: 306 Location: Ireland
|
Posted: Tue Mar 04, 2008 7:50 pm Post subject: |
|
|
Hi,
Ref: "As to Mark's thought, of course emergency change is required but let's say in one period the number of emergency changes exceeds the number of normal changes, that would give a bad impression about the CM team."
The metrics are not there to make the team look good. They are there to report the reality, from which you can identify issues and remedial actions.
Whether or not a CI changes is one consideration. Bear in mind the other: the potential for impact to the business / service. Will Incident or Change control this better?
We can get too literal sometimes, but the bottom line is that whatever you choose to do, do it to ensure minimum disruption, in a controlled manner that has been approved by the business and communicated to all in advance (or as the reboot is happening), and make sure the Service Desk is informed throughout.
_________________
Mark O'Loughlin
ITSM / ITIL Consultant |
|