The Itil Community Forum: Forums

Capacity Mgmt Metrics

 
SwissTony
Senior Itiler


Joined: Feb 26, 2009
Posts: 118
Location: Geneva

Posted: Wed Aug 26, 2009 5:47 pm    Post subject: Capacity Mgmt Metrics

Been having a confrontation with a newly appointed associate director who has recently been involved in conducting an ISO20k internal audit.

Within the report he has stated that CPM is failing because no 'processing capacity' metrics are monitored, measured, and reported, which prevents correct decisions from being made.

Now my issue here, which I would appreciate your thoughts and comments on, is this: we monitor CPU & Memory usage with Ganglia (previously Orca). We have real-time access to the information, but choose not to measure the CPU or Memory levels of 800+ servers (a mixture of Windows, Linux, AIX, Unix, physical, virtual etc). My justification is that we receive alerts when thresholds are reached so that the users are not affected, and that when new applications / services are being added the guys have access to this information to judge whether or not to add them to box X. Finally, to measure CPU & Memory efficiently over so many servers, the data would have to be aggregated into some sort of average, overall figure, which then makes it completely ineffective.

Do you guys measure 'processing capacity'? & How do you do it?
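
The worry about a single fleet-wide average can be made concrete. The sketch below is only an illustration in Python, not anything taken from Ganglia: the server names, the sample values, the 80% threshold and the `fleet_summary` helper are all assumptions. It reports how many servers are over the threshold alongside the average, so the hot boxes stay visible.

```python
from statistics import mean

def fleet_summary(peak_cpu_by_server, threshold=80.0):
    """Summarise per-server peak CPU (%) without collapsing it into one number."""
    values = list(peak_cpu_by_server.values())
    # Names of servers whose peak is at or above the (assumed) threshold.
    hot = sorted(name for name, v in peak_cpu_by_server.items() if v >= threshold)
    return {
        "servers": len(values),
        "fleet_average": round(mean(values), 1),
        "over_threshold": len(hot),
        "hot_servers": hot,
    }

# Hypothetical data: 97 quiet servers and 3 busier ones.
sample = {f"srv{i:03d}": 20.0 for i in range(1, 98)}
sample.update({"srv098": 96.0, "srv099": 88.0, "srv100": 76.0})

summary = fleet_summary(sample)
print(summary)
```

With this made-up data the fleet average looks comfortable while two servers sit above the threshold, which is the loss of information described above.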
Diarmid
Senior Itiler


Joined: Mar 04, 2008
Posts: 1884
Location: Newcastle-under-Lyme

Posted: Wed Aug 26, 2009 7:44 pm

Tony,

if you do not know the present average and peak utilization of a machine, how do you determine its capacity to support a new application, a modified application with new functionality, or a change in utilization patterns? The absence of threshold alerts is no guarantor of available capacity for additional usage.
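
As a rough illustration of why a baseline matters even when no alert has fired, here is a minimal Python sketch. The sample readings, the 80% alert threshold and the helper names `headroom` and `can_host` are assumptions for the example, and adding peaks together is only a first-order check.

```python
def headroom(samples, capacity=100.0):
    """Return baseline average, peak, and headroom left at the peak (same units as samples)."""
    if not samples:
        raise ValueError("need at least one sample")
    average = sum(samples) / len(samples)
    peak = max(samples)
    return {"average": average, "peak": peak, "headroom_at_peak": capacity - peak}

def can_host(samples, extra_peak_demand, alert_threshold=80.0):
    """Naive check: measured peak plus the new workload's peak stays under the threshold."""
    return max(samples) + extra_peak_demand <= alert_threshold

cpu = [35.0, 40.0, 38.0, 72.0, 41.0]  # hypothetical hourly CPU % for one server
base = headroom(cpu)
print(base)
print(can_host(cpu, 15.0), can_host(cpu, 5.0))
```

In this made-up case the server never reached 80%, so no alert would have been raised, yet a workload adding 15 points at peak would not fit.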

Performance monitoring is not Capacity Management although it is a necessary component.

Now the practical aspect is, as always, cost and risk. What is the cost of maintaining current baselines for each machine? What are the consequences if a particular machine "overloads" (on the busiest day of the year, for example).

So it can be okay to make a judgement call, especially for small, stable machines running non-critical systems. It can amount to how tolerant your business activity is of small glitches in responsiveness occurring for a period; how readily you can either add capacity or reshuffle apps between boxes; and how acceptable it is to regress and move a newly added system when the capacity is found to be wanting.

But consider this scenario:

A project is proposed to revamp a cluster of applications that are spread over 20 small servers. This will be a big deal. Therefore, either:
- you set up life size tests to measure everything as it will be
- you measure the new apps and apply modelling techniques (or at least extrapolations although that is less reliable) to predict impact on capacity
- you make a judgement call (I don't think so)

If the first is too expensive (in people, machines and time) and possibly rather difficult, then you have to look at the second. This is fine. But if you do not have current baselines for these machines, you will have to extend the project to obtain them before you proceed (typically that involves working through peak demand periods, which may be monthly, for example).
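
The extrapolation option mentioned above could look something like the following Python sketch. It fits a straight line to baseline peaks and estimates when a limit would be reached. The monthly figures, the 80% limit and the function names are hypothetical, and a straight-line fit is the less reliable approach already flagged above, not a substitute for modelling.

```python
def linear_fit(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns (slope, intercept)."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def periods_until(ys, limit):
    """Periods after the last sample until the fitted line reaches `limit` (None if not rising)."""
    slope, intercept = linear_fit(ys)
    if slope <= 0:
        return None
    return (limit - intercept) / slope - (len(ys) - 1)

monthly_peaks = [50.0, 54.0, 58.0, 62.0]  # hypothetical monthly peak CPU %
slope, intercept = linear_fit(monthly_peaks)
remaining = periods_until(monthly_peaks, 80.0)
print(slope, intercept, remaining)
```

Note that without several months of recorded peaks there is nothing to fit, which is the baseline problem described in the paragraph above.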

So one of the risks is that it might become important in the future. But it is still cost and risk.

Your auditor is only correct if you cannot show the cost and risk analysis to support what you do.
_________________
"Method goes far to prevent trouble in business: for it makes the task easy, hinders confusion, saves abundance of time, and instructs those that have business depending, both what to do and what to hope."
William Penn 1644-1718
SwissTony
Senior Itiler


Joined: Feb 26, 2009
Posts: 118
Location: Geneva

Posted: Wed Aug 26, 2009 10:12 pm

Diarmid, thanks for your response.

Diarmid wrote:

if you do not know the present average and peak utilization of a machine, how do you determine its capacity to support a new application, a modified application with new functionality, or a change in utilization patterns? The absence of threshold alerts is no guarantor of available capacity for additional usage.


Ganglia provides the view of the CPU & Memory usage per server, and retains historical information to be able to see 'normal' performance. Therefore, in the example you cited we would review the performance of the servers to know whether they could handle the changes.

The key is that we do not make this a specific metric to report on due to the volume of the servers.

I know ITIL does not view thresholds as a pro-active mechanism. However, I disagree, as thresholds give IT the early warning needed to ensure that users are not impacted.

I'm sure there is something I am missing, just cannot see it yet.
Diarmid
Senior Itiler


Joined: Mar 04, 2008
Posts: 1884
Location: Newcastle-under-Lyme

Posted: Thu Aug 27, 2009 12:05 am

SwissTony wrote:
The key is that we do not make this a specific metric to report on due to the volume of the servers.


Report to whom? Who is interested in the capacity of every single server apart from the Capacity Manager and the Operations Manager? The business isn't, and neither the CIO nor the head of Service Management should be, although those two would expect to see evidence that you do have all that information reliably at your fingertips.

Reports to senior management on the performance of the Capacity Management function should be focussed on overall capacity and how ready it is to meet planned and unplanned variance in demand; activities in support of projects and future planning for business, application and technology/equipment changes; any interesting trends that are emerging; and issues resolved and under investigation.

I have to say that threshold monitoring is essentially reactive in terms of Capacity Management (although it can be considered pro-active in terms of Incident and Problem Management) as there is no element of prediction or anticipation involved; rather, reaching it is something you react to. It's a bit moot though, because the words slip about too much in this area. It's probably better just to be clear about what you require and how you achieve that.

My real point is that threshold monitoring is incapable of predicting the impact of additional workloads on a system because that impact is not additive. Bottlenecks can appear out of nowhere.

[PS. reference to bottleneck implies no particular preference for John Fahey.]
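
One way to picture the non-additive point above is the textbook single-queue (M/M/1) result, where mean response time is the service time divided by (1 - utilization). The Python sketch below assumes that simplified model and a hypothetical 0.1 s service time. Real servers only approximate it, so treat the numbers as an illustration rather than a prediction.

```python
def mm1_response_time(service_time, utilization):
    """Mean response time in an M/M/1 queue: R = S / (1 - U)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)

# Same 0.1 s of work per request, at increasing levels of busyness.
for u in (0.50, 0.70, 0.90, 0.95):
    print(f"{u:.0%} busy: {mm1_response_time(0.1, u):.2f} s mean response")
```

Under this model, going from 50% to 90% busy does not add a proportionate delay: it multiplies the mean response time by five, all while staying below a threshold set at 95%.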
Cking
Newbie


Joined: Feb 27, 2009
Posts: 16
Location: North Coast, USA

Posted: Thu Aug 27, 2009 12:37 am

Tony,
I've been bouncing around a few sites concerned with measurement and capacity...
One term that often shows up alongside Processing Capacity is Cost of Transaction. (Forgive me if it's in ITIL as well; I've been introduced to a lot of terms from many sources lately.)

So not only do you need to capture the box metrics, but also usage / throughput / transactional metrics to match up with performance metrics.
... we haven't gotten there yet either.

Then when you know of a change in demand coming down the line (seasonal, new customer, etc.), you can estimate what the increase in transactions will demand from your infrastructure.
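
A simple way to sketch that estimate is the utilization law from operational analysis, U = X * D, where X is throughput and D is the resource time each transaction consumes. The Python below is a minimal illustration only: the 45% and 150 tx/s measurements and the 240 tx/s forecast are hypothetical, and it assumes the cost per transaction stays constant as load grows.

```python
def service_demand(utilization, throughput):
    """Resource-seconds consumed per transaction, from U = X * D."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return utilization / throughput

def projected_utilization(utilization, throughput, new_throughput):
    """Utilization expected at a new throughput if cost per transaction is unchanged."""
    return service_demand(utilization, throughput) * new_throughput

# Hypothetical measurement: CPU 45% busy while handling 150 transactions per second.
demand = service_demand(0.45, 150.0)
# Hypothetical forecast: seasonal peak of 240 transactions per second.
forecast = projected_utilization(0.45, 150.0, 240.0)
print(f"{demand * 1000:.1f} ms CPU per transaction, forecast {forecast:.0%} busy")
```

This only works if transaction counts are captured alongside the box metrics, which is the pairing suggested above.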

For new apps or enhancements, ideally that relationship would be captured during the QA phase.

And yes, when you roll historical data into averages you lose visibility of reality. So maybe keeping peaks, or an average of peaks, alongside a variance metric like standard deviation or confidence interval (Six Sigma background ;) ) might give a better picture.
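
To illustrate what keeping peaks and a spread measure next to the average might look like, here is a small Python sketch using only the standard library. The three days of readings and the `summarise` helper are made up for the example.

```python
from statistics import mean, pstdev

def summarise(daily_samples):
    """daily_samples: one list of CPU % readings per day."""
    all_readings = [r for day in daily_samples for r in day]
    daily_peaks = [max(day) for day in daily_samples]
    return {
        "average": mean(all_readings),
        "max_peak": max(daily_peaks),
        "average_of_peaks": mean(daily_peaks),
        "std_dev": pstdev(all_readings),  # population standard deviation
    }

# Hypothetical readings for three days on one server.
days = [[20.0, 30.0, 90.0], [20.0, 30.0, 40.0], [20.0, 30.0, 80.0]]
stats = summarise(days)
print(stats)
```

Here the plain average suggests a lightly loaded machine, while the peak figures and the wide spread show that it regularly runs much hotter.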
All times are GMT + 10 Hours