The Itil Community Forum: Forums

ITIL :: View topic - System Monitoring KPI's

swal87
Itiler


Joined: Sep 03, 2008
Posts: 21

Posted: Wed Mar 18, 2009 1:21 am    Post subject: System Monitoring KPI's

Good Afternoon,

I am trying to put together some KPIs for a System Monitoring team I have set up. Their main purpose is to do event management and investigate monitoring alerts on a 3-tier hosted architecture.

The problem I have is that I have writer's block and can't get out of my change management bubble.

Any ideas on KPIs for this?

So far I have:

Ensure monitoring dashboard is at least 90% in green (good) state.
UKVIKING
Senior Itiler


Joined: Sep 16, 2006
Posts: 3296
Location: London, UK

Posted: Wed Mar 18, 2009 3:31 am    Post subject:

Before you do KPIs:

What is the NMS threshold set as?
Is there a setting within the tool to un-alert if the alert goes up and down just outside of the threshold?

You should do the following before setting KPIs that are impossible to meet:

What is being monitored?
Is the monitored item single tier or through the tiers?
Are you monitoring the drill-through or each item?

For example:

An NMS check to log into the web server: create query, execute query, present response within timings.

What should be checked?

What are the actions after the NMS tool shows an alert?

The NMS tool I used had a threshold of 5 minutes. Because the ICMP protocol comes last and SNMP ranks higher, the alerts that were ICMP based (ping etc.) were set to a high threshold with a % successful.
The SNMP protocol or the TCP/IP or UDP traffic that is used would also have a different effect.

What do they do if the tool is isolated?

With the threshold set to 5 minutes, the NOC I worked in would still get hundreds if not thousands of ... link up /...link down or host not responding alerts when the traffic was high. Did we create tickets? Nope, unless there were trends and other indications.

Merely using event mgmt to generate and action tickets is Pavlovian:

Ping... open ticket
Ping... close ticket
Ping... open ticket
Ping... close ticket

all during the shift.
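
To illustrate the point about not ticketing every flap, here is a minimal Python sketch of a hold-down filter. It is not how any particular NMS tool works; the function name, the event format and the 5-minute default are assumptions made for the example. It only reports outages that stayed down for at least the hold-down period, so short down/up flaps never become tickets.

```python
from datetime import datetime, timedelta

# Assumed default, mirroring the 5-minute threshold mentioned above.
HOLD_DOWN = timedelta(minutes=5)


def sustained_outages(events, hold_down=HOLD_DOWN, now=None):
    """Return [(host, down_since)] for outages lasting at least hold_down.

    events: iterable of (timestamp, host, state), sorted by timestamp,
            where state is "down" or "up" (hypothetical event format).
    now:    optional current time; if given, outages that are still open
            and already older than hold_down are reported as well.
    """
    down_since = {}  # host -> time the current outage started
    tickets = []
    for ts, host, state in events:
        if state == "down":
            # Keep the first "down" time; repeated "down" events do not reset it.
            down_since.setdefault(host, ts)
        elif state == "up" and host in down_since:
            started = down_since.pop(host)
            if ts - started >= hold_down:
                tickets.append((host, started))
            # Otherwise it was a short flap: drop it without a ticket.
    if now is not None:
        for host, started in down_since.items():
            if now - started >= hold_down:
                tickets.append((host, started))
    return tickets
```

A host that goes down and recovers within a minute produces nothing, while a host that stays down for ten minutes produces one entry. Trend detection (many short flaps on the same link) would need a separate counter and is not covered here.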

What is the purpose of the team? The purpose should drive the KPIs.
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)

Change Management is POWER & CONTROL. /....evil laughter
swal87
Itiler


Joined: Sep 03, 2008
Posts: 21

Posted: Wed Mar 18, 2009 4:12 am    Post subject: System Monitoring KPI's

UKVIKING, you raise a good point about the drill-through etc.

The monitoring tool we use is called NimBus. Basically we have a DB, ALB and various App servers, dependent on the service we offer.

Therefore on a top-level monitoring screen we have icons that represent services (approx. 40); each of the 40 has a drill-down to show the DB, ALB and App status.

Typical metrics are CPU, disk space, user concurrency, and some SPs that generate perfmon user counters relating to core functions on SQL speed.

Each probe / metric has a threshold and will have a status colour dependent on status (green: good; red: urgent problem), i.e. we don't really want to be alerted if the DB space drops from 90% to 89%. Realistically we only want to know when we get to 10%.
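
As a rough sketch of that threshold idea in Python (not NimBus configuration; the function name and the amber band at 20% are assumptions added for illustration, only the 10% figure comes from the post):

```python
def space_status(free_pct, red_at=10.0, amber_at=20.0):
    """Map free DB/disk space (percent) to a dashboard colour.

    red_at reflects the 10% figure above; amber_at is an assumed
    early-warning band and can be dropped by setting it equal to red_at.
    """
    if free_pct <= red_at:
        return "red"
    if free_pct <= amber_at:
        return "amber"
    return "green"
```

With these values a drop from 90% to 89% free stays green and raises nothing, while reaching 10% turns the probe red.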

A KPI I wanted to introduce is that all space alerts are dealt with within x time of alerting. The trouble is managing it and knowing when a breach has occurred.
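
One way to know when a breach has occurred is to compare each alert's raised and handled times against the target. The Python sketch below assumes the alerts can be exported as (raised_at, handled_at) pairs, which is an assumption about the data rather than a known NimBus feature; the function name and the target value are hypothetical.

```python
from datetime import datetime, timedelta


def handling_kpi(alerts, target, now):
    """Classify alerts against a 'handled within target' KPI.

    alerts: list of (raised_at, handled_at), with handled_at set to None
            for alerts that are still open (hypothetical export format).
    Returns counts of met, breached and pending alerts, plus the share
    met among alerts whose outcome is already decided.
    """
    met = breached = pending = 0
    for raised_at, handled_at in alerts:
        if handled_at is not None:
            if handled_at - raised_at <= target:
                met += 1
            else:
                breached += 1
        elif now - raised_at > target:
            breached += 1   # still open and already past the target
        else:
            pending += 1    # still open, but there is time left
    decided = met + breached
    share_met = met / decided if decided else None
    return {"met": met, "breached": breached,
            "pending": pending, "share_met": share_met}
```

Open alerts that are already past the target count as breaches straight away, so a breach shows up while it can still be acted on and not only in the end-of-month report.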

I am also stuck on what other KPIs to introduce. I hope the above detail helps jog some ideas.
Diarmid
Senior Itiler


Joined: Mar 04, 2008
Posts: 1884
Location: Newcastle-under-Lyme

Posted: Wed Mar 18, 2009 9:39 am    Post subject:

Don't you just love KPIs? Easy to say; impossible to apply.

So, 90% in green: is that averaged over a minute, hour, day, week, ...? More importantly, what does 90% mean in terms of service levels and quality of service?
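
The averaging question can be made concrete with a small Python sketch. It assumes the dashboard state is sampled periodically as (timestamp, green icons, total icons); that sampling format and the function name are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def green_share_by_window(samples, window):
    """Average share of green icons per time window.

    samples: list of (timestamp, green_icons, total_icons), sorted by time
             (hypothetical sampling format).
    window:  timedelta giving the averaging period (hour, day, ...).
    Returns {window_start: mean green share in that window}.
    """
    if not samples:
        return {}
    origin = samples[0][0]  # windows are aligned to the first sample
    buckets = defaultdict(list)
    for ts, green, total in samples:
        index = (ts - origin) // window  # timedelta // timedelta gives an int
        buckets[origin + index * window].append(green / total)
    return {start: sum(v) / len(v) for start, v in buckets.items()}
```

Take ten hourly samples of 40 icons where one hour shows only 20 green. The hourly view has one window at 50%, while a one-day window averages 95%, so the same data fails or passes a 90% target depending on the period chosen.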

So, space incident resolved within x time of alert; but the crucial aspect is the rate at which space is disappearing. Some things creep up and will take another week to get from 85% to 86%; others will take twenty microseconds. You don't want to drive your staff to treat the two cases the same; so how do you express the KPI?
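
One possible way to express that difference is to estimate the time left before the space runs out and set the urgency from that, not from the raw percentage. The Python sketch below uses a simple linear extrapolation from two readings; the function name is hypothetical and a straight-line assumption will be wrong for bursty growth.

```python
def hours_until_full(used_then, used_now, hours_between, capacity=100.0):
    """Linear estimate of hours left until usage reaches capacity.

    used_then, used_now: usage readings in percent (or any common unit).
    hours_between:       time between the two readings, in hours.
    Returns None when usage is flat or shrinking.
    """
    rate = (used_now - used_then) / hours_between  # units per hour
    if rate <= 0:
        return None
    return (capacity - used_now) / rate
```

Going from 85% to 86% over a week (168 hours) gives roughly 2,352 hours of headroom, while the same one-point rise in 0.01 hours leaves about 0.14 hours, so the two alerts clearly deserve different response targets.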

Is the resolution process always the same? Do different resolution processes require different amounts of time to perform? Do different resolution processes have long or short term effectiveness? Do some resolution processes require additional work later while others fix the situation properly?

What do you want to achieve? Is it defensive (keep everything going and you are okay) or improvement based (quicker than before and using less resource)? It's probably both; so you need KPIs that measure outcomes and KPIs that measure processes. You might notice that your green example is outcomes and your space one is processes.

Sorry, this is rambling. Too late at night. I'll post it anyway, in case it helps.
_________________
"Method goes far to prevent trouble in business: for it makes the task easy, hinders confusion, saves abundance of time, and instructs those that have business depending, both what to do and what to hope."
William Penn 1644-1718
All times are GMT + 10 Hours