Posted: Mon Sep 29, 2008 11:17 pm Post subject: Desktop Slowdowns - What approach to take?
Again I am a newbie to this forum. I have been a Problem manager now for 3 yrs.
I have a very specific query regarding an ongoing series of problems I have encountered: desktop slowdowns. I work in a medium-sized org with around 7,000 desktops delivering multiple apps to distinct business units. This is a recurring issue for users located centrally in a head office environment.
I find it very hard to make progress, with the distinct support teams all too happy to say their piece of the puzzle is clean. These teams include the Network, Client, Windows Infrastructure and in-house application support teams.
Has anyone had similar issues, and how did you progress them?
Joined: Oct 13, 2006 Posts: 116 Location: South Africa
Posted: Tue Sep 30, 2008 8:03 pm Post subject:
Tough one. You're working against the motivation of each team to make their area look good ... and perhaps to avoid difficult work.
I see what John says, and that's important, but that approach can still quickly run into denial: "we have proved that the stated root cause does not apply to our area", or "the experts say that is normal operation of a network and the desktops should cope with it", or any number of other excuses.
You need to appoint a problem coordinator (could be yourself) and register this as a cross-functional problem that requires end-to-end resolution. What I'm saying here is your process, and your policy and your top management commitment, need to be behind this so everyone feels responsible.
Then, ideally your problem coordinator would be cross-functionally skilled ... but those people are hard to find! At the very least, you need:
A very precise description of the problem - not the root cause, because you don't know it yet, but the symptoms. So that you have a definite goal: "if this goes away, we have finished". You must define when the incidents or events occur, who is affected, what diagnostics can be used, etc.
Otherwise you have a general description that each team can interpret differently, or only address a small part of.
Appointed contacts in each affected team, with regular minuted meetings. All functional areas need to allocate time to this outside of routine work - that's one of the classic definitions of "problem" - management deciding that focused effort is required.
The team has to understand the end-to-end working and interdependencies. If you don't have a cross-functionally skilled expert, you'll have to create this understanding during the problem determination work.
A list of possible causes - brainstormed, evolving - use your favourite problem determination methodology - but involve everyone.
Avoid a culture of apportioning blame. The root cause may not be due to someone's error or some badly functioning component - it's more likely to be due to some setting or configuration that would be fine in other circumstances but not yours. Everyone has to work together and contribute.
If you can't get commitment to the above, then you have to be hardnosed and say there is no commitment to fixing this "problem" and it should be taken off the list.
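On the point about a precise symptom description: a rough sketch of how the "when" and "who" could be tallied from incident tickets. The records, field layout and business unit names below are invented for illustration; your own ticket export would replace them.

```python
from collections import Counter
from datetime import datetime

# Hypothetical incident records: (reported time, business unit, location).
incidents = [
    ("2008-09-22 09:05", "Finance", "Head Office"),
    ("2008-09-22 09:20", "Finance", "Head Office"),
    ("2008-09-23 09:10", "Sales", "Head Office"),
    ("2008-09-23 14:40", "Finance", "Head Office"),
    ("2008-09-24 09:15", "Sales", "Head Office"),
]

def tally_symptoms(records):
    """Count slowdown reports by hour of day and by business unit."""
    by_hour = Counter()
    by_unit = Counter()
    for reported, unit, _location in records:
        hour = datetime.strptime(reported, "%Y-%m-%d %H:%M").hour
        by_hour[hour] += 1
        by_unit[unit] += 1
    return by_hour, by_unit

by_hour, by_unit = tally_symptoms(incidents)
print("Reports by hour:", by_hour.most_common())
print("Reports by unit:", by_unit.most_common())
```

A clustering of reports around one hour or one unit gives every team the same factual starting point, rather than a general description each can interpret differently.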
Joined: Mar 04, 2008 Posts: 1893 Location: Helensburgh
Posted: Tue Sep 30, 2008 8:28 pm Post subject:
This is really a performance management issue and should come under the remit and skill set of Capacity Management.
I have experienced such issues go on for two or three years until people stopped testing their best guess and started analysing the system end to end, relying on data to point to the underlying problem.
It is quite common for there to be more than one cause, or for the slow periods to be caused by a combination of factors. The best way to get to the root (possibly expensive, but an investment that can pay for itself) is by modelling the system. There are at least two companies in the UK that can provide software for this (and consultancy).
The underlying problem with the speculative (brainstorming) approach is that you are no further forward when you next get poor performance and you have no way of knowing whether it is for the same reasons or not.
Once you have done measurement you have a baseline for the future and if you have a model you can test where future bottlenecks might occur.
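As a minimal sketch of what a baseline can look like in practice, assuming you have response-time samples (in seconds) collected during normal operation. The sample values, the function names and the three-standard-deviation threshold are illustrative assumptions, not a recommendation from any particular tool.

```python
import statistics

def build_baseline(samples):
    """Summarise response-time samples (seconds) taken during normal operation."""
    return {"mean": statistics.mean(samples), "stdev": statistics.stdev(samples)}

def is_abnormal(value, baseline, threshold=3.0):
    """Flag a measurement more than `threshold` standard deviations above the mean."""
    return value > baseline["mean"] + threshold * baseline["stdev"]

# Hypothetical logon times measured on a normal day.
normal_logon_times = [2.0, 2.2, 1.9, 2.1, 2.3, 2.0, 1.8, 2.2]
baseline = build_baseline(normal_logon_times)

print(is_abnormal(6.5, baseline))  # a 6.5 s logon compared with the baseline
print(is_abnormal(2.3, baseline))  # a 2.3 s logon compared with the baseline
```

With something like this in place, the next slow period can be compared against recorded numbers instead of recollection.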
The various groups will find it harder to argue with data, and the cause(s) often turn out to be factors that no single group could reasonably have anticipated. Getting all the data together can therefore help to get the teams to work together.
_________________
"Method goes far to prevent trouble in business: for it makes the task easy, hinders confusion, saves abundance of time, and instructs those that have business depending, both what to do and what to hope."
William Penn 1644-1718
Joined: Sep 16, 2006 Posts: 3477 Location: London, UK
Posted: Tue Sep 30, 2008 8:53 pm Post subject:
Diarmid is correct. This is performance management, which is part of Capacity Management.
As the problem manager, these are the things you should query as part of developing the strategy for finding the solution to this issue.
You should break it down like the points below.
Each one will spawn a sub-question.
1 - What services (foreground/background) are being run on the desktops in question?
  - Which ones are memory hogs?
    - How much memory / swap space is on each machine?
      - Is there a standard for memory or swap space?
2 - What admin-level services are running, and when?
1 - What is the network architecture for these affected machines?
2 - What is the bandwidth and capacity of the VLANs, LAN and equipment (switches, routers, bridges)?
Doing this can help figure out what's what
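For the "memory hogs" question, a rough sketch of ranking processes by average memory use across a sample of desktops. The machine names, process names and figures are made up; how you collect them depends on whatever inventory or monitoring tooling you already have.

```python
# Hypothetical inventory samples: (machine, process, resident memory in MB).
samples = [
    ("PC-001", "av_scanner.exe", 310),
    ("PC-001", "mail_client.exe", 180),
    ("PC-002", "av_scanner.exe", 295),
    ("PC-002", "inhouse_app.exe", 420),
    ("PC-003", "inhouse_app.exe", 450),
]

def average_memory_by_process(records):
    """Return (process, average MB) pairs, largest consumer first."""
    totals = {}
    for _machine, process, mem_mb in records:
        total, count = totals.get(process, (0, 0))
        totals[process] = (total + mem_mb, count + 1)
    averages = {name: total / count for name, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

ranking = average_memory_by_process(samples)
for process, avg_mb in ranking:
    print(f"{process}: {avg_mb:.1f} MB on average")
```

Comparing the top entries against the installed memory on each machine is one way to answer the sub-question about a standard for memory or swap space.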
The usual suspects in this, from my experience, are:
1 - AV software and updates done on the desktop as well as on the network / mail gateway (design issue)
2 - age of kit involved / capabilities of kit
3 - network saturated / overworked (Capacity Management)
_________________
John Hardesty
ITSM Manager's Certificate (Red Badge)
Change Management is POWER & CONTROL. /....evil laughter
Posted: Tue Sep 30, 2008 10:43 pm Post subject: Oh Capacity is it!
As a capacity manager, I too have experienced this typical scenario many times and to be honest, I've sometimes seen a problem accepted rather than solved.
Granted, this is a performance / capacity issue and the threads so far have advised some good avenues and tips both from a business and more technical perspective...
However, on reading the Viking's comment about antivirus being a typical culprit, one thought occurred to me that has not yet been mentioned...
How effective is your change log? If it's effective and accurate, maybe get a rep from each area on a conference call and discuss the possibility that a recent change triggered this service issue. OK, maybe the desktops have slowly deteriorated, but it's still worth a try.
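If the change log can be exported, shortlisting the changes that went in shortly before the first slowdown report is simple to script. A minimal sketch, where the entries, team names, dates and the 14-day window are all assumptions for illustration:

```python
from datetime import date, timedelta

# Hypothetical change log entries: (implementation date, owning team, summary).
change_log = [
    (date(2008, 8, 1), "Network", "Switch firmware upgrade"),
    (date(2008, 9, 10), "Client", "New antivirus scan schedule"),
    (date(2008, 9, 12), "Windows Infrastructure", "Group policy update"),
    (date(2008, 9, 20), "In-house apps", "Application release"),
]

def changes_before(first_report, log, window_days=14):
    """Return changes implemented in the `window_days` before the first slowdown report."""
    window_start = first_report - timedelta(days=window_days)
    return [entry for entry in log if window_start <= entry[0] <= first_report]

suspects = changes_before(date(2008, 9, 15), change_log)
for when, team, summary in suspects:
    print(when, team, summary)
```

The shortlist is only a set of candidates to discuss on the call, not proof of cause, and it will miss a gradual deterioration.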
Alongside this, you need to follow the Viking's diagnostic approach, i.e. who, what, when and how, and hopefully the why will follow. But remember you're the problem manager and not a desktop engineer! You need the technical support guys pushing with you and not against you...