Category Archives: Service Management

Congratulations, your IT might be less sick today

I came across an article in Computerworld titled “The Help Desk is Hot Again” articulating the revived popularity of the Help Desk. It explains that the Help Desk “serves as a vital liaison between employees’ mobile technologies and the networks, servers and applications that support them.” Help Desks certainly serve an important purpose; however, this positioning feels slightly askew. For most IT organizations the Help Desk is where you go when you have a problem and need help. Help Desks do not understand how IT consumers are experiencing IT and are certainly not a liaison. I can see how there is a logical leap from issue management to evaluating the health of IT, but do you go to the doctor when you are well?

Until recently, visibility into the consumer side of IT was not considered essential when measuring IT service availability. The assumption was that maniacally monitoring data center health provided enough data to show how effectively IT supported the business. For most organizations, IT availability and ‘end-user’ satisfaction are evaluated with metrics provided by the help desk, showing what went wrong and when. From the perspective of issues this may be acceptable, but it hardly provides an accurate view of how the business is using IT. It would be like asking a doctor, “So, how healthy does the world look today?” where the answer would be “It looks pretty sick.”

This whole situation has been exacerbated by the use of mobile devices, the growth in non-corporate cloud-based application sources and the influx of people entering the industry who were born digital.  These new market entrants have learned to become more self-sufficient than any generation before and would rather have the flu than call the service desk. Many of today’s mobile issues are ‘fleeting’ with performance being a variable impacted by increasingly complex and congested network connectivity. For many, it’s easier just to wait it out.  Does the help desk capture this experience? No.

So, if the objective is to understand how IT is used and experienced, then you don’t start from the data center. The starting place is the IT consumer. This requires more than a set of tools giving visibility from ‘the edge’; it will require IT support to organize and focus teams on IT consumer activity.  Measuring experience means understanding how IT is used, when it is used and where it is used, not just when there is an issue. Capturing, monitoring and analyzing IT consumer activity allows IT organizations to assess the true IT business impact, regardless of where the user is, what they are using or where their applications are sourced.

This approach is not going to be easy for IT departments that have spent decades focusing on siloed data center elements and back-end application transactions. But IT consumer activity monitoring is not optional. Users do not use one device, do not remain in one place and do not use just one application. IT innovation, mobility and IT consumer creativity will continue to push the limits of IT operations management, and those able to adjust their IT management focus will benefit from better IT decision making and business alignment.

The service desk must evolve to be a true high-touch solution and this can only be done when it is also used to monitor how all IT consumers are experiencing IT.   IT organizations that do not plan to focus on their IT consumers will be left struggling, trying to manage increasingly diverse IT needs using tools providing a datacenter centric application performance snapshot, stumbling their way towards the edge by trying to see through increasingly complex third-party service black-holes.

proactive sounds cool, but being reactive is just easier.

Recently I’ve been involved in discussions about how new IT monitoring tools will make IT support teams smarter and far more proactive.  By smarter I mean having a greater understanding of IT health and by proactive I mean being aware of situations before or as they occur.

I’d argue that becoming smarter is a prerequisite to becoming proactive.  Monitoring for issues is so much easier when you know what you are looking for and understand the ramifications. The best way for IT support to become smarter is to hire the smartest, most experienced people. Becoming proactive is not so straightforward.

The idea that tools will turn a reactive, crisis-driven IT operations team into a proactive one is nonsense. For decades monitoring tools have been able to set policies that forewarn of events and give support staff a heads-up on potential issues. The reasons this capability has not delivered on the promise are numerous: events that are ‘potential issues’ or ‘warnings’ are rarely classified as high-priority items, support staff do not notice them (or ignore them), or the method of event delivery is the wrong one.  It has had little to do with the monitoring tools. The reality is that most IT organizations are not measured on outage avoidance but on fixing issues once outages occur.

It’s easier to be the hero who got the order processing application back up than the person who said they had helped avoid the problem occurring in the first place (“you did what?” “oh sure you did, well done – help yourself to a medal”).   If an organization wants to be proactive then it needs to have people goaled and measured on finding issues before they become problems. Security officers actively monitor and analyze data to proactively identify anomalies, irregular activity and behaviors, and they monitor events to stop hackers, cyber attacks, viruses, etc. Apparently, it is not acceptable to wait for security problems to occur before they get addressed.  For IT support to do the same will require a number of changes, including:

  1. an organization measured against outage avoidance.
  2. information delivered in ways that the support team will take notice of.
  3. information that means something and is actionable.

an organization measured against outage avoidance. An IT organization that prides itself on being proactive but measures itself against MTTR or MTBF is not fully proactive. The speed with which IT operations responds to and fixes an issue is not a good measure of proactive efficiency without factoring in how quickly the issue was detected in the first place. IT operations effectiveness would have greater relevance if it were tied to outage avoidance.  This type of metric is not easy to capture using monitoring tool reporting (too many sources, limited business impact assessment), so it requires a way to immediately consolidate, log and track the identification-to-remediation process. The easiest way to do this is using a service desk.  This information would demonstrate how IT operations provides value, while showing increases in IT operational efficiency.
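As a rough illustration of the kind of measurement described above, the sketch below separates detection time from repair time and adds an outage-avoidance rate. The record fields, function names and sample data are all hypothetical; they assume the service desk logs when a fault began, when it was detected, when it was resolved and whether the business felt an outage.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class IssueRecord:
    """One issue as a service desk might log it (hypothetical schema)."""
    started: datetime      # when the underlying fault began
    detected: datetime     # when IT operations became aware of it
    resolved: datetime     # when it was remediated
    caused_outage: bool    # True if the business experienced an outage


def mean_time_to_detect(records: List[IssueRecord]) -> timedelta:
    """Average gap between a fault starting and IT noticing it."""
    total = sum((r.detected - r.started for r in records), timedelta())
    return total / len(records)


def mean_time_to_repair(records: List[IssueRecord]) -> timedelta:
    """Average gap between detection and resolution."""
    total = sum((r.resolved - r.detected for r in records), timedelta())
    return total / len(records)


def outage_avoidance_rate(records: List[IssueRecord]) -> float:
    """Share of logged issues that were fixed before they became outages."""
    avoided = sum(1 for r in records if not r.caused_outage)
    return avoided / len(records)


# Illustrative data only: two issues with the same repair time,
# but very different detection times and business outcomes.
t0 = datetime(2024, 1, 1, 9, 0)
records = [
    IssueRecord(t0, t0 + timedelta(minutes=10), t0 + timedelta(minutes=40), False),
    IssueRecord(t0, t0 + timedelta(minutes=50), t0 + timedelta(minutes=80), True),
]

mttd = mean_time_to_detect(records)
mttr = mean_time_to_repair(records)
avoidance = outage_avoidance_rate(records)
```

In this made-up data both issues took 30 minutes to repair, so a repair-time metric alone cannot tell them apart; only the detection time and the avoidance rate show that one was caught before it reached the business.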

information delivered in ways that the support team will take notice of. IT organizations invest a lot of time and effort trying to detect and process events, but few put the same effort into ensuring events are immediately delivered to the right IT personnel. A proactive state dictates that event data is delivered and owned as soon as it is detected. This means the mechanism chosen to deliver the data is as important as the effort associated with collecting the information in the first place. Most IT organizations still rely on event management tool consoles; however, an unwatched console will result in missed events. Sending events to mobile devices (e.g. in the form of an IM) and/or the use of alert notification tools can reduce the time it takes to become event-aware. Alert notification tools help support a proactive objective by automating the delivery of alerts to the appropriate IT operations personnel through the most effective communications channel, in support of established escalation and outage procedures. They also provide the mechanism for an event to be delivered, acknowledged and owned.
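The deliver, acknowledge and own cycle can be sketched in a few lines. This is a simplified illustration, not how any particular alert notification product works: the `Contact` type, the channel labels and the `fake_send` stand-in are assumptions made for the example, and a real tool would wait for an acknowledgement timeout before escalating.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Contact:
    name: str
    channel: str  # illustrative labels such as "im", "sms", "phone"


def deliver_with_escalation(
    event: str,
    chain: List[Contact],
    send: Callable[[Contact, str], bool],
) -> Optional[Contact]:
    """Offer the event to each contact in order.

    `send` delivers the event over the contact's channel and returns True
    if the contact acknowledged it. The first contact to acknowledge owns
    the event; None means nobody in the chain acknowledged it.
    """
    for contact in chain:
        if send(contact, event):
            return contact
    return None


# Stand-in for a real delivery mechanism: records each attempt and
# pretends the first person in the chain never acknowledges.
attempts = []


def fake_send(contact: Contact, event: str) -> bool:
    attempts.append((contact.name, contact.channel))
    return contact.name != "on-call engineer"


chain = [
    Contact("on-call engineer", "im"),
    Contact("team lead", "sms"),
    Contact("operations manager", "phone"),
]

owner = deliver_with_escalation("warning: order-db disk 90% full", chain, fake_send)
```

The point of the design is that a warning always ends with a named owner or an explicit "nobody took it", rather than sitting unacknowledged on a console.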

information that means something and is actionable. If you are not actively looking for something, it’s unlikely you’ll find it. A blindingly obvious statement, but when monitors are used in IT operations they are typically used to aid root-cause analysis on known, reported issues, where support knows there’s an issue and understands the sort of thing they need to look for. However, when there is no obvious problem it takes skill and experience to scroll through long lists of technical event data to identify the most critical, business-impacting issues.  Knowing how things relate to the bigger picture requires the skill to assess the overall impact of multiple unassociated events, and that means taking the yellow ones as seriously as the red ones.  This approach is the new way IT support must work: looking for subtle changes and behaviours in the IT infrastructure, applications and IT consumers, analyzing potential impacts and executing a plan to remediate the issue before it affects the business. This approach demands dedicating support personnel to IT analysis, rather than having them check monitoring consoles only when they have time or are prompted by complaints from IT consumers.

If you ignore the price to the business, being reactive doesn’t cost a thing.

a path to improving end user experience

I don’t believe anyone can dispute the growing influence end users have on how IT services are chosen, sourced and evaluated.  This does not mean IT operations organizations are ready to fully embrace the end user as a specific focus.  Many assume application transaction monitoring and mobile device software update support is enough – at least for the time being.  The reality is it isn’t enough and treating the end user like peripheral hardware is not to their benefit. This is managing the situation – not enabling the end user.

Improving end user experience is not about keeping an eye on users or trying to support their mobile devices; it’s about removing IT barriers, reducing complexity and making users more self-sufficient and productive. This objective is best broken down into logical areas:

  1. Support
  2. Social Enablement
  3. Security & Resilience
  4. Productivity

Each area has a set of activities and objectives:

  • Support: Identify, address and report common/local issues, pre-emptive problem management and real-time end user IT status specific to their individual needs and priorities.
  • Social Enablement:  Social, communication and collaboration tools to foster and enable information flow between different users with common interests, goals and objectives.
  • Security & Resilience: End user and device authentication, content protection and data protection and recovery.
  • Productivity: BYOD enablement allows business to be conducted from any device and location. Users can download applications and are given access to local resources and information on company facilities, based on their specific needs and within company policy.

It is unrealistic to think the objectives for each activity can be accomplished all at once. They are only achievable if each activity has a path containing logical, measurable steps.  This is also needed as each activity can have ties to others (e.g. to deliver a level of support requires a level of security and resilience).

In the paper “Path to Improving the End-User Experience” the activities are explained and broken down into five levels (undefined, reactive, proactive, service and business), providing objectives to assess the current end user environment and improve upon it.

A barrier to success is IT operations’ need to enable the users from the datacenter perspective.  If the end user is the focus then the starting point is the end user (do IT users care about the datacenter?).  However, to show value a plan must have two perspectives, one IT operations and the other the end user.  In the paper each level describes the activity and value to both IT operations and the end user.  This allows IT operations to associate effort and investment directly with end user productivity.

Improving end-user experience, satisfaction and making them more productive increases a company’s effectiveness and makes it more competitive. It’s a no-brainer.

service intelligence transforms the service desk

For decades IT has struggled to understand how end-users use IT.  The only point of reference has been the service desk, which is the only place where the user community interfaces regularly with IT. However, it’s not easy to provide a view on end user IT value when all you have as reference are issues.

So, you call the service desk and you get the standard interrogation: a number of questions to help identify your issue and send it, with a degree of accuracy, to the right support team. Even though updates have been made to service desks for decades, the core capability, managing problems and incidents, remains the same (this situation is made clear by Chris Dancy and his example of  ‘Form Based Work Flow’).  Irrespective of how functionally rich your service desk is, tickets opened and problem resolution metrics are still used to show how effectively IT supports the business. This ‘suck less’ metric is not good. The problem with problems is problems are not the same. And that’s the problem.

Things that prevent people doing their job are typically reported (e.g. connectivity or passwords), but things that are not show-stoppers and are just annoying (e.g. sporadic performance problems, jammed printers, etc.) are not. For many it’s just way too much hassle. The reality is, end users suffer from poor application performance more often than any service desk log shows. The user will talk with their colleagues to make sure they are not the only victim and then possibly just wait, either because it’s easier to assume IT operations knows about it or in the hope that the problem will fix itself (e.g. less user traffic, moving to a different location or using a different device).

There is no place to go to understand the overall end user experience, leaving IT operations to make the assumption that if there are no major issues then the user must be fine. The problem with using the service desk as a way to determine user satisfaction is that it’s not a monitoring system. It simply logs incidents and manages them in line with established escalation and outage procedures. Infrastructure monitoring tools provide a view of the health of one datacenter (or one component type) and most APM tools provide a partial view of end user application performance. There have been attempts to provide end-user visibility to the service desk to create a more intelligent, business-aware solution.  The attempts include providing self-help options, end user keystroke logging and control over end-point devices (primarily Windows). However, even with some of these capabilities being offered, the service desk remains a reactive incident management solution focused on supporting issues already impacting the end-user.

As the end-user environment becomes more complex (agile application releases, cloud-based apps, BYOD, increased mobility, etc.) it will become harder for service managers to support the business, and internal datacenter performance metrics alone will not be relevant in a world where the IT user is using applications from disparate sources on a multitude of different devices. Service managers must be able to understand both what the business uses and how the business uses IT. The ability to understand end-user behavior will move the service desk from a passive incident reporting system into a solution that provides the IT support organization with visibility into how the business uses IT.  This visibility will enable service managers to manage incidents more effectively and identify business trends which will impact the IT services provided to the business. Understanding how the business uses IT should enable service managers to plan accordingly with regard to how the support organization is staffed to provide service quality.

If you are not looking you will not find it

IT operations remains a reactive practice, hoping that technology will make it more proactive. The truth is if IT operations is not focused on being proactive then it will remain in a reactive state no matter what tools are used.  The same can be said for the service desk. For it to become a service intelligence solution also requires a change in how service managers use it. Products that provide visibility into how IT is used also require service managers to take an active role in looking for trends that indicate something abnormal is occurring (e.g. people using an application on a specific device dealing with poor performance).

The path to intelligence

Service desks have yet to evolve into the intelligent solution I’ve talked about; however, forward-thinking IT organizations are already starting to think this way. It requires traditional organizational barriers to come down between the service desk and IT operations. A hybrid role is created that uses APM tools (primarily end user focused products, EUAM) to look for potential issues. The information is then passed to the service desk, automatically or manually, through the opening of a ticket and through a dashboard at the service desk showing specific performance trends as they pertain to applications and end users. Even though the service desk and APM tools remain separate today, using them together should provide benefits once collaboration has been established between service managers and IT operations.

 The value

  • End-user experience is tracked against service levels with tickets opened proactively when an end-user (or end-user group) experiences a degradation in service
  • The service desk understands the current end-user experience, the devices being used, their location, their normal activity and the applications being used providing greater visibility into how the business is using IT.
  • The service desk is made aware of the end-user experience no matter where the applications are sourced (locally, internally or externally). This enables accurate incident ticket assignment.
  • End-user activity is available for ‘play-back’ to help understand and identify what was being done at the time an issue occurred enabling effective root-cause analysis.
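
To make the first point in the list above concrete, here is a minimal sketch of opening tickets proactively when a group of end users experiences degraded service. Everything in it is an assumption for illustration: the sample format, the per-application service levels in seconds, the use of the median as the measure, and the ticket fields. A real integration would depend on the EUAM tool and service desk in use.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Tuple

# (application, user group, response time in seconds): a hypothetical sample format
Sample = Tuple[str, str, float]


def find_degradations(
    samples: List[Sample],
    service_levels: Dict[str, float],
) -> List[dict]:
    """Return one proposed ticket per (application, user group) whose
    median response time exceeds that application's service level."""
    grouped = defaultdict(list)
    for app, group, seconds in samples:
        grouped[(app, group)].append(seconds)

    tickets = []
    for (app, group), values in sorted(grouped.items()):
        limit = service_levels.get(app)
        if limit is None:
            continue  # no agreed service level for this application
        observed = median(values)
        if observed > limit:
            tickets.append({
                "application": app,
                "user_group": group,
                "median_seconds": observed,
                "service_level_seconds": limit,
            })
    return tickets


# Illustrative data only.
samples = [
    ("orders", "london", 1.2),
    ("orders", "london", 4.8),
    ("orders", "london", 5.1),
    ("orders", "paris", 0.9),
    ("orders", "paris", 1.1),
]
service_levels = {"orders": 2.0}

tickets = find_degradations(samples, service_levels)
```

With this made-up data only the London group breaches the two-second level, so a single ticket is proposed before anyone has had to call the service desk.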