Problem to action relation

 
sstef
Newbie


Joined: Aug 30, 2007
Posts: 6

Posted: Fri Aug 31, 2007 7:15 am    Post subject: Problem to action relation

Hello,

I am new to this forum and would appreciate any comments.

My company has a home-grown problem management process and tool. Over the last couple of months we have made an effort to align them with ITIL best practices, but we are still struggling with one basic question.

What is a problem in ITIL terminology?

Since we run a highly redundant environment, most of our major incidents are caused by several misbehaviors. After a major incident we perform a postmortem investigation, which results in a postmortem report with several agreed actions. One example is:
1.) Stored procedure ABC, which caused the SQL outage, to be reviewed and fixed - SQL dev team
2.) Investigate why the protection mechanism in APPL1 did not work - APPL1 dev team
3.) Investigate improving recovery time by automating SQL cluster failover - automation team
4.) Reconfigure monitoring to detect misbehaving stored procedures before they consume all SQL resources - SQL admin team
...

We are currently discussing two ways to model this:

1.) Create four Problem records, relate them to the incident record, and follow up on them independently. All Problem records will be owned by the problem management team and assigned to technical groups for investigation.

2.) Create one Problem record for this incident, with four separate actions/tasks. The Problem record will stay with the problem management team and the actions will go to the technical teams.

Which way would be better and more in line with ITIL practices?

Thanks in Advance and regards,

Sstef
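
The two options described above can be compared with a small data-model sketch. This is only an illustration in Python, not something prescribed by ITIL or taken from any tool: the class names (`Incident`, `Problem`, `Task`), the field names, and the reference `INC-1` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    description: str
    assigned_team: str


@dataclass
class Problem:
    title: str
    owner: str                   # stays with the problem management team
    assigned_team: str = ""      # technical group doing the investigation
    tasks: list = field(default_factory=list)


@dataclass
class Incident:
    ref: str
    problems: list = field(default_factory=list)


# The four agreed actions from the postmortem report, with the team for each.
ACTIONS = [
    ("Review and fix stored procedure ABC", "SQL dev team"),
    ("Investigate why the APPL1 protection mechanism failed", "APPL1 dev team"),
    ("Investigate automating SQL cluster failover", "Automation team"),
    ("Reconfigure monitoring for misbehaving stored procedures", "SQL admin team"),
]


def model_option_1(incident_ref):
    """Option 1: one Problem record per action, each related to the incident."""
    incident = Incident(incident_ref)
    for description, team in ACTIONS:
        incident.problems.append(
            Problem(title=description, owner="Problem management", assigned_team=team)
        )
    return incident


def model_option_2(incident_ref):
    """Option 2: a single Problem record that carries four tasks."""
    incident = Incident(incident_ref)
    problem = Problem(title=f"Postmortem of {incident_ref}", owner="Problem management")
    problem.tasks = [Task(description, team) for description, team in ACTIONS]
    incident.problems.append(problem)
    return incident


option_1 = model_option_1("INC-1")
option_2 = model_option_2("INC-1")
print(len(option_1.problems), "problem records")
print(len(option_2.problems), "problem record with", len(option_2.problems[0].tasks), "tasks")
```

The information captured is the same in both cases; the difference is whether each action gets its own lifecycle and reporting as a Problem, or is tracked as a line item under one Problem.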
Guerino1
Senior Itiler


Joined: Jan 01, 2006
Posts: 500
Location: New Jersey

Posted: Fri Aug 31, 2007 2:18 pm    Post subject:

Hello Sstef,

I would recommend that you not confuse "Tasks" with "Problems". Problems are not a work/To-Do list. Think of a Problem as a "perceived" or "known" defect that requires deeper analysis and understanding, before work can be assigned to address it (for example, root cause analysis).

The four things you listed look more like Tasks to me.

However, all of this doesn't mean that the one original Problem you list and track won't eventually break out into more dependent and/or independent Problems. That can happen too.

Anyhow, I hope this helps.

My Best,

Frank Guerino, CEO
TraverseIT
On-Demand ITIL Platform
sstef
Newbie


Joined: Aug 30, 2007
Posts: 6

Posted: Fri Aug 31, 2007 4:22 pm    Post subject:

Thanks Guerino1,

That is exactly our difficulty: where to draw the border line between a problem and a task/action.

Somebody could argue that

1.) Stored procedure ABC causing SQL outage to be reviewed and fixed
2.) Investigate why protection mechanism in APPL1 did not work

are two different errors with two different root causes, occurring on two CIs (database and application servers). They will eventually be fixed with two changes, and their only link is that they contributed to the impact of the same incident.

On the other hand, we can also say these are only two tasks we identified while analyzing one incident.

To me it very much depends on how we formulate the task/problem, and I am still wondering what the right criteria would be.
jpgilles
Senior Itiler


Joined: Mar 29, 2007
Posts: 123
Location: France

Posted: Fri Aug 31, 2007 6:16 pm    Post subject:

Hi, I'll try to help and describe how I would address the issue....

For once, I'll take it from a theoretical perspective: your investigations show that you need to "do" A and B to solve an issue and prevent incidents from happening again. Are A and B different problems, or several tasks of a single problem? My answer to that would be:
if making the required changes to either A or B alone DOES resolve an issue and prevent certain types of incidents from recurring, then both A and B are problems (you have 2 different problems). Conversely, if you need to implement the required changes to A AND B to fix the situation, then they both relate to a single problem (which may not be clearly identified at that stage).

I am not familiar with V3, so I am speaking about V2 (hoping there are not too many differences on the subject.....): Problem Management is basically divided into 2 different sets of activities, Problem Control and Error Control. One way to look at it is that at the problem control stage you have not clearly identified the root cause and the problem may not be clearly defined, whereas at the error control stage you have clearly identified the root cause(s) and the problems. In reality, you may have opened a problem for which you find two causes, which you can basically turn into 2 errors that you will work on and may fix separately; however, the problem will be solved only when the two resolutions are in place. Each of the error control activities may be broken down into tasks ... Depending on the complexity, you may run your problem resolution as a project, as Frank already mentioned somewhere.

Hope it can help...

Best regards
JP
_________________
JP Gilles
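
The rule of thumb in this post can be written down as a tiny decision helper. This is a sketch only: the function name `classify` and its input format are hypothetical, and lumping every change that does not work on its own into one shared problem is a simplifying assumption, not something the post states.

```python
def classify(fixes_alone):
    """Group candidate changes into problems using the rule of thumb.

    fixes_alone maps each candidate change to True if implementing that
    change by itself resolves an issue and prevents a type of incident
    from recurring, or False if it only helps together with other changes.
    Returns a list of problems, each a list of the changes it needs.
    """
    standalone = [name for name, alone in fixes_alone.items() if alone]
    combined = [name for name, alone in fixes_alone.items() if not alone]

    # Every change that fixes something on its own is its own problem.
    problems = [[name] for name in standalone]
    if combined:
        # Changes that only work together relate to one single problem
        # (assumption: all of them belong to the same problem).
        problems.append(combined)
    return problems


print(classify({"A": True, "B": True}))    # two separate problems
print(classify({"A": False, "B": False}))  # one problem needing both changes
```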
Guerino1
Senior Itiler


Joined: Jan 01, 2006
Posts: 500
Location: New Jersey

Posted: Fri Aug 31, 2007 10:05 pm    Post subject:

Hello Sstef,

You're right. It always does depend on how you formulate your descriptions. In the following examples you give...

sstef wrote:

1.) Stored procedure ABC causing SQL outage to be reviewed and fixed
2.) Investigate why protection mechanism in APPL1 did not work


It's normal to log two Incidents:

1: Stored Procedure ABC did not work.
2: APPL1 did not work.

At this point, you do not know (based on the descriptions you've given) that there are any legitimate Problems at hand. At this point, you need to do research to see if there are. You would create two separate Tasks (items of work) that would:

1: Investigate why Stored Procedure ABC failed.
2: Investigate why APPL1 didn't work.

On the other hand, if you kept getting "repeat Incidents", where the stored procedure or application fails multiple times, you can log a perceived Problem that needs to be investigated further, because there might be a root cause that can be addressed to eliminate the repeat Incidents and the associated work and impact to the enterprise and its customers.

Again, there is no one way to do all of this. I can tell you that the above seems to be the most consistent methodology that we see (and teach).

I hope you find this useful.

My Best,

Frank Guerino, CEO
TraverseIT
On-Demand ITIL Platform
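
The "repeat Incidents" idea above can be sketched as a simple count per failing component. This is an illustration only: the function name, the record format, and the threshold of two incidents are assumptions made for the example, not values given in the post.

```python
from collections import Counter

REPEAT_THRESHOLD = 2  # assumed value; choose what suits your environment


def problem_candidates(incidents, threshold=REPEAT_THRESHOLD):
    """Return components whose incident count reaches the threshold.

    incidents is a list of (incident_ref, component) pairs.
    """
    counts = Counter(component for _, component in incidents)
    return sorted(c for c, n in counts.items() if n >= threshold)


incident_log = [
    ("INC-1", "Stored Procedure ABC"),
    ("INC-2", "APPL1"),
    ("INC-3", "Stored Procedure ABC"),
]
print(problem_candidates(incident_log))
```

Components that show up here are only candidates for a perceived Problem; someone still has to investigate whether there is a root cause worth addressing.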
sstef
Newbie


Joined: Aug 30, 2007
Posts: 6

Posted: Sun Sep 02, 2007 10:54 pm    Post subject:

Hello,

Thanks, guys, for the prompt and useful responses. I would like to ask one related question. During post-recovery incident analysis, several errors in the infrastructure are usually detected. They are not related to the root cause of the incident, but they may have contributed to its impact, and they certainly need to be addressed by either a workaround or a fix. Examples would be:

1.) Permission on the server XYZ misconfigured (preventing tool ABC, vital for incident recovery, from running)
2.) Configuration Database entry wrong, resulting in wrong escalation and extending incident recovery time
3.) Monitoring misconfiguration, resulting in the incident not being discovered for the first several hours


We are currently discussing two approaches to modeling this:
1.) Open a separate problem (or known error, if the root cause is identified) for each of these misbehaviors.
2.) Open a task in the main problem following that incident, and eventually trigger an RFC from that main problem to fix the misbehavior.

To me the first approach looks cleaner, but it also creates some overhead. What is your opinion on that?

Again thanks in advance and regards,
Slobodan Stefanovic
Guerino1
Senior Itiler


Joined: Jan 01, 2006
Posts: 500
Location: New Jersey

Posted: Mon Sep 03, 2007 12:23 am    Post subject:

Hello Slobodan,

sstef wrote:
Thanks, guys, for the prompt and useful responses. I would like to ask one related question. During post-recovery incident analysis, several errors in the infrastructure are usually detected. They are not related to the root cause of the incident, but they may have contributed to its impact, and they certainly need to be addressed by either a workaround or a fix. Examples would be....

We are currently discussing two approaches to modeling this:
1.) Open a separate problem (or known error, if the root cause is identified) for each of these misbehaviors.
2.) Open a task in the main problem following that incident, and eventually trigger an RFC from that main problem to fix the misbehavior.

To me the first approach looks cleaner, but it also creates some overhead. What is your opinion on that?


There are many ways to handle this and none are wrong. We allow our customers to create "related Problems", so that they have traceability back to the original Problem(s) and we also allow them to create related Tasks, RFCs, etc. so that they can maintain traceability between all of these item types, as well. Since we can allow all permutations of relationships/linkages, my personal preference on how to handle this is:

    1. Register a new "Related Problem" for each problem you find.
    2. Ensure that each new Problem is assigned to the appropriate Product and/or Service Owner. This will ensure that experts are assigned to the Problem(s). It will also ensure accountability for them. In assigning the Problems to these Owners/teams you will be able to set attributes such as your assessed Severity, Expected Dates for completion, notes you've learned that might help the team(s), Requested Work Priority, etc. Once you've assigned it, there should be a permanent linkage between you as the creator and them as the owners. They will eventually assign the Problems to workers to create more linkage and you should be able to see everything that goes on, across your enterprise to fix those Problems.
    3. Let them (the Product & Service teams) manage the Problem(s) any way they see fit, as different teams may have different processes for doing work. If the Problems are big enough to warrant Tasks that they want to create, assign, and Track, they will do so. And, because everything is centralized in our system, you, as the Problem Manager, would be able to track the work, stay on top of it, contribute to it, and coordinate it all, very easily. You'd be like the "Command, Control & Communications" tower in an airport, except you'd be managing Problem related work, not airplanes.
    4. The owners of the Problems will then appropriately create whatever RFCs are needed, as they fix them.

Remember, the job of Problem Management is not to try and solve all Problems. Its primary functions are things like:

    1: Facilitate identification of Problems
    2: Ensure Problems are assigned/routed to the right owners/teams
    3: Ensure that appropriate information is collected about the Problems to make working on them as efficient as possible.
    4: Coordinate and manage the "progress" of Problem correction.
    5: Help the teams that are responsible for fixing Problems by providing horizontal insight, about the environment(s), as most Product/Service teams have vertical views.
    6: Report metrics, statistics, etc. to appropriate stakeholders.
    7: Ensure "transparency" into the data (i.e. data within records is as accurate as possible) and "traceability" across the data (i.e. linkages "between" data elements are all properly identified and maintained, such as "relationships").

In most cases, the Problem Management teams cannot possibly have all the skills necessary to debug all the Problems and they certainly won't have the resources to perform the work to fix them all.

The best Problem Management teams I've seen are the ones that realize that they can't possibly know or do it all. They're usually a single, small group in enterprises that tend to be pretty large, comparatively. As a result, they realize that their effectiveness comes in routing and coordinating Problem related work.

Anyhow, I hope you find this useful.

My Best,

Frank Guerino, CEO
TraverseIT
On-Demand ITIL Platform
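
The preference described in this post (register related Problems, route each to an owner, keep traceability, let the owners raise RFCs) might be sketched as follows. Everything here is hypothetical and for illustration only: the class names, field names, record references such as `PRB-1` and `RFC-10`, and the owning team names are invented, and this is not a description of how any particular product works.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemRecord:
    ref: str
    summary: str
    owner: str                   # Product or Service Owner accountable for it
    created_by: str              # keeps the link back to the creator
    severity: str = "unassessed"
    status: str = "open"
    related_to: list = field(default_factory=list)  # refs of related Problems
    rfcs: list = field(default_factory=list)        # RFCs raised by the owner


class ProblemRegister:
    def __init__(self):
        self.records = {}

    def register_related(self, ref, summary, owner, created_by, parent_ref):
        """Register a related Problem and route it to its owner."""
        record = ProblemRecord(ref, summary, owner, created_by, related_to=[parent_ref])
        self.records[ref] = record
        return record

    def open_by_owner(self):
        """Horizontal view for the Problem Manager: open Problems per owner."""
        view = {}
        for record in self.records.values():
            if record.status == "open":
                view.setdefault(record.owner, []).append(record.ref)
        return view


register = ProblemRegister()
creator = "Problem management"
register.register_related("PRB-2", "Permission on server XYZ misconfigured", "Server team", creator, "PRB-1")
register.register_related("PRB-3", "Configuration Database entry wrong", "Configuration team", creator, "PRB-1")
register.register_related("PRB-4", "Monitoring misconfigured", "Monitoring team", creator, "PRB-1")

# The owning team, not the Problem Manager, raises the RFC for its fix.
register.records["PRB-3"].rfcs.append("RFC-10")

print(register.open_by_owner())
```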
sstef
Newbie


Joined: Aug 30, 2007
Posts: 6

Posted: Wed Sep 05, 2007 8:46 pm    Post subject:

Thanks a lot, Frank, for your detailed answer. We are trying to implement something in line with your recommendation. The only major difference I see is that, instead of relating problems to each other, we relate all problems to the initial incident, or to several incidents if the misbehavior is reflected in more than one of them.

In your opinion, what is the advantage of directly relating problems instead of linking all of them to incidents?

Regards,
Slobodan Stefanovic
Guerino1
Senior Itiler


Joined: Jan 01, 2006
Posts: 500
Location: New Jersey

Posted: Wed Sep 05, 2007 10:44 pm    Post subject:

Hello Slobodan,

sstef wrote:
In your opinion, what is the advantage of directly relating problems instead of linking all of them to incidents?


"Linking" and "Relating" are really the same thing. The only difference is that in my frame of mind, there is a descriptive relationship that describes how two things are related/linked together. So, for example, you can link one Problem to another Problem as a Root Cause Contributor or as a Sub-Problem. Or, you can relate an Incident to a Problem as the Problem Driver. Or you can relate an Incident to another Incident as a Repeat Incident. I can go on and on, as there are many possible relationship permutations. It's the relationships that help you "understand" what you're looking at. Sadly, most systems don't effectively allow for or handle relationships, properly, and this is why so many enterprises are misguided into buying and managing "separate" CMDB tools.

Anyhow, I hope this helps.

My Best,

Frank Guerino, CEO
TraverseIT
On-Demand ITIL Platform
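
The descriptive relationships mentioned in this post can be sketched as labeled links between records. The four relationship names are the ones given in the post; the rest (the `Relation` enum, the `relate` and `related_to` helpers, and the record references) is hypothetical and only meant as an illustration.

```python
from enum import Enum


class Relation(Enum):
    ROOT_CAUSE_CONTRIBUTOR = "Root Cause Contributor"  # Problem -> Problem
    SUB_PROBLEM = "Sub-Problem"                        # Problem -> Problem
    PROBLEM_DRIVER = "Problem Driver"                  # Incident -> Problem
    REPEAT_INCIDENT = "Repeat Incident"                # Incident -> Incident


links = []  # each link is a (source_ref, relation, target_ref) tuple


def relate(source, relation, target):
    """Record that source is linked to target, and how."""
    links.append((source, relation, target))


def related_to(target):
    """List everything linked to a record, together with the kind of link."""
    return [(source, relation.value) for source, relation, t in links if t == target]


relate("INC-1", Relation.PROBLEM_DRIVER, "PRB-1")
relate("INC-2", Relation.REPEAT_INCIDENT, "INC-1")
relate("PRB-2", Relation.SUB_PROBLEM, "PRB-1")

print(related_to("PRB-1"))
```

With the label stored on the link itself, a query for one record shows not just what it is connected to but why, which is the point being made about understanding what you are looking at.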
All times are GMT + 10 Hours