do IT users care about the datacenter?

The datacenter – the IT business hub. Or it used to be.

End users could not care less about it. What they care about is application availability, response times and the ability to get IT access from whatever device they choose and from wherever they want. The business increasingly makes decisions on what applications are used, where the applications are sourced and who supports them. There’s no love affair between the business and the organization called IT operations because it’s not about technology; it’s about getting the job done. It’s not that datacenter availability isn’t important; it’s just not important to users. The business measures IT value against the quality of support and application availability and performance, not servers, storage and networks.

Some will struggle with the idea that monitoring the datacenter does not equate to understanding and measuring business availability. Not so long ago, companies providing datacenter outsourcing services would have a huge display in reception with topology maps showing a red, green, yellow status of the datacenter infrastructure. I can only assume it was designed to show control and understanding, because I’d argue the computer room could spontaneously combust and no-one would be any the wiser until the end users reported problems accessing their applications.

How many times have you thought “I wonder if the servers are performing well today?” or “I hope my files are backed up and secure”? What you probably think is “email is slow, IT needs to fix it now” and, if data is lost or corrupted, “IT had better get it back now”. My point is this: the datacenter will continue to be critical to the IT organization responsible for managing it – not to the businesses that use it. For the business it’s all about the application, no matter where it resides or who manages it, and the fact that an application requires hardware and software to live is, from the user perspective, irrelevant. It’s assumed.

In late December 2012 Netflix had issues. The fact it was over a holiday period made the problem even more annoying. It was a Netflix problem and Twitter lit up with customer feedback for Netflix. Netflix blamed the issue on Amazon Web Services servers and said Amazon was addressing it. So, that’s ok then? It’s not a Netflix application problem – it’s an Amazon server problem. It doesn’t matter if Amazon’s servers were the real problem; it is Netflix’s job to make sure its applications are not plagued by a weakness in server capacity, performance, architecture or design, no matter who it decided to source this critical task to. Subscribers to Netflix do not pay Amazon.

It’s the same for any IT organization delivering IT application services, whether they are internal or external to a business. Monitoring the datacenter to identify and solve issues is one thing – using the same element monitoring to try and demonstrate value to the business is another. Managing the datacenter is mandatory; however, using element-based availability metrics as proof of IT business value and application availability is no longer acceptable.

From a business perspective the value of IT is assessed through the lens of its business users – not the datacenter. This will increasingly result in IT value being assessed from the end user to the application source and measured against service levels, which means datacenter components can go up and down all they like as long as it doesn’t have a detrimental effect on business service levels. With the growing trend to use applications from cloud-based service providers, who can tell where all the parts of an application are? Netflix is hardly unique, architecturally, in the way it provides services. As more applications are made available in the cloud, the location of the supporting infrastructure is likely to be in the hands of one or more additional cloud service providers.

So, who cares about the datacenter? The people responsible for managing it, developers, testers and business unit personnel who pay for capacity. For users and the business  – it’s all about the application.

service intelligence transforms the service desk

For decades IT has struggled to understand how end users use IT. The only point of reference has been the service desk, which is the only place where the user community interfaces regularly with IT. However, it’s not easy to provide a view on end user IT value when all you have as reference are issues.

So, you call the service desk and you get the standard interrogation: a number of questions to help identify your issue and send it, with a degree of accuracy, to the right support team. Even though updates have been made to service desks for decades, the core capability, managing problems and incidents, remains the same (this situation is made clear by Chris Dancy and his example of ‘Form Based Work Flow’). Irrespective of how functionally rich your service desk is, tickets opened and problem resolution metrics are still used to show how effectively IT supports the business. This ‘suck less’ metric is not good. The problem with problems is problems are not the same. And that’s the problem.

Things that prevent people doing their job are typically reported (e.g. connectivity or passwords), but things that are not show-stoppers and are just annoying (e.g. sporadic performance problems, jammed printers etc.) are not. For many it’s just way too much hassle. The reality is, end users suffer from poor application performance more often than any service desk log shows. The user will talk with their colleagues to make sure they are not the only victim and possibly just wait, because it’s easier to assume IT operations knows about it or to wait for the problem to fix itself (e.g. less user traffic, moving to a different location or using a different device).

There is no place to go to understand the overall end user experience, leaving IT operations to make the assumption that if there are no major issues then the user must be fine. The problem with using the service desk as a way to determine user satisfaction is that it’s not a monitoring system. It simply logs incidents and manages them in line with established escalation and outage procedures. Infrastructure monitoring tools provide a view of the health of one datacenter (or one component type) and most APM tools provide a partial view of end user application performance. There have been attempts to provide end-user visibility to the service desk to create a more intelligent, business-aware solution. The attempts include providing self-help options, end user keystroke logging and control over end-point devices (primarily Windows). However, even with some of these capabilities being offered, the service desk remains a reactive incident management solution focused on supporting issues already impacting the end user.

As the end-user environment becomes more complex (agile application releases, cloud-based apps, BYOD, increased mobility etc.) it will become harder for service managers to support the business, and internal datacenter performance metrics alone will not be relevant in a world where the IT user is using applications from disparate sources on a multitude of different devices. Service managers must be able to understand both what the business uses and how the business uses IT. The ability to understand end-user behavior will move the service desk from a passive incident reporting system into a solution that provides the IT support organization with visibility into how the business uses IT. This visibility will enable service managers to manage incidents more effectively and identify business trends which will impact the IT services provided to the business. Understanding how the business uses IT should enable service managers to plan how the support organization is staffed to provide service quality.

If you are not looking you will not find it

IT operations remains a reactive practice, hoping that technology will make it more proactive. The truth is, if IT operations is not focused on being proactive then it will remain in a reactive state no matter what tools are used. The same can be said for the service desk. For it to become a service intelligence solution also requires a change in how service managers use it. Products that provide visibility into how IT is used also require service managers to take an active role in looking for trends that indicate something abnormal is occurring (e.g. people using an application on a specific device dealing with poor performance).

The path to intelligence

Service desks have yet to evolve into the intelligent solution I’ve talked about; however, forward-thinking IT organizations are already starting to think this way. It requires traditional organizational barriers between the service desk and IT operations to come down. A hybrid role is created that uses APM tools (primarily end-user-focused products, EUAM) to look for potential issues. The information is then passed to the service desk, automatically or manually, through the opening of a ticket and a dashboard at the service desk showing specific performance trends as they pertain to applications and end users. Even though the service desk and APM tools remain separate today, using them together should provide benefits once collaboration has been established between service managers and IT operations.

 The value

  • End-user experience is tracked against service levels with tickets opened proactively when an end-user (or end-user group) experiences a degradation in service
  • The service desk understands the current end-user experience, the devices being used, their location, their normal activity and the applications being used providing greater visibility into how the business is using IT.
  • The service desk is made aware of the end-user experience no matter where the applications are sourced (locally, internal or external). This enables accurate incident ticket assignment.
  • End-user activity is available for ‘play-back’ to help understand and identify what was being done at the time an issue occurred enabling effective root-cause analysis.
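The first point in the list can be pictured with a minimal sketch. It assumes response times measured on end-user devices are already being collected; the `Sample` record, the `SERVICE_LEVEL_MS` limits and the ticket fields are hypothetical names chosen for illustration and do not come from any particular service desk or EUAM product.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Sample:
    user_group: str      # e.g. a department or office (illustrative)
    application: str
    response_ms: float   # response time as seen on the end-user device


# Hypothetical service levels: maximum acceptable median response time.
SERVICE_LEVEL_MS = {"email": 2000.0, "crm": 3000.0}


def proactive_tickets(samples, service_levels=SERVICE_LEVEL_MS, min_samples=3):
    """Return one ticket per (user group, application) whose median
    response time breaches the agreed service level."""
    grouped = defaultdict(list)
    for s in samples:
        grouped[(s.user_group, s.application)].append(s.response_ms)

    tickets = []
    for (group, app), times in sorted(grouped.items()):
        limit = service_levels.get(app)
        # Skip applications without a service level and groups with too
        # little data to say anything meaningful.
        if limit is None or len(times) < min_samples:
            continue
        observed = median(times)
        if observed > limit:
            tickets.append({
                "user_group": group,
                "application": app,
                "observed_ms": observed,
                "limit_ms": limit,
            })
    return tickets
```

The design choice worth noting is that the ticket is opened per user group and application rather than per server or network element, which keeps the measurement on the end-user side of the service level.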

crowd sourcing and the self-sufficient digital native

IT savvy is no longer the exclusive domain of the IT organization. IT plays a pivotal role in many end users’ day-to-day activity and is as natural as breathing in and out. Digital natives entering the market have led to the creation of a new type of IT user, one where self-sufficiency has become a way of life. Social IT activity rarely includes a support organization ready to leap to your aid in times of trouble; instead the user relies on support found through search engines, blogs, on-line documentation and social collaboration. When IT-savvy digital natives enter the job market, their ability to deal with (or at least attempt to deal with) IT issues (e.g. connectivity, access or file sharing) is significantly greater than that of people entering the market only a decade ago.

Working with digital natives I find them to be more self-sufficient, and they believe IT problems can be solved faster if they are given the ability to do it themselves. This emerging environment ‘should’ create changes in how users are enabled and supported. For example, if the end user is more self-sufficient then service management tools should provide a lot more than a hotline support number. Service management tools could be enhanced with self-help, intelligent search, automated recovery and, importantly, crowd-sourced information. Crowd-sourced information could allow users to understand how IT is being experienced by their colleagues while also helping the IT support organization understand end user experience and aid root cause analysis. This capability is especially important with applications sourced from diverse locations and the prolific use of mobile devices. The reality is, a service desk has no clue where you are and what you are doing, so when problems occur, reporting them is just the beginning. The only view of IT service and what’s really being experienced can be attained from the end user and, increasingly, by the end user. All this information is collected, analyzed and delivered without a single communication with a datacenter.

Crowd-sourced application experience data provides a far greater understanding of overall end user status, and does so more easily and with far less complexity, cost and effort than any traditional datacenter-centric IT management tool. Of course, it does not give the deep-dive information many APM tools provide, but in this case it’s not just about application availability (e.g. downrightnow.com); it’s about helping the end user become more productive and self-sufficient.

This is not something found in IT operations management today; however, the concept has been used in other types of applications (e.g. Waze and GPS navigation) where crowd-sourced data provides work-arounds and options. For road navigation it could advise taking an alternative route due to an accident; for IT it could be to use an alternative printer or avoid using a mobile device in an area where performance is being impacted.

What I have described is a future state so for the time-being digital natives are going to continue to find ways to support themselves – no matter if it’s for their own personal needs or those provided by their employers.

end user activity monitoring (EUAM)

The focus on end user activity monitoring (EUAM) continues to grow in importance due to the end user’s influence on how IT is used and how applications are sourced. Forward-thinking IT organizations recognize the value of gaining insight into end user activity, as it can enable more effective application support, improve IT service and help end users be more productive.

However, EUAM is not something readily embraced by end users, who consider anything that tracks and monitors their activity a personal infringement. In some countries privacy hurdles will need to be overcome, requiring the tools that provide activity monitoring to establish levels of interaction.

Highway cameras are used to take pictures of cars breaking the speed limit, with all vehicles under the limit passing with no picture and no record of their presence. The same can be said for EUAM: the objective is to identify trends, abnormalities and performance degradations, not to track and record all activity. Today’s EUAM tools will understand what devices are used, the configuration, specified software loaded, application performance, activations and, where appropriate, the location. However, unlike the speed camera, the job of EUAM is focused on enablement, not policing.

EUAM augments and enhances existing application performance monitoring tools, by providing a ‘front-end’ end user understanding of how IT is being experienced.  It allows IT organizations to tie the end user experience with the ‘back-end’ data center applications infrastructure. This can be incredibly powerful as it allows a full end-to-end view of the entire application interaction from mouse click to database record retrieval.

So what capabilities should be expected from an end user activity monitoring solution? It’s certainly more than has been available for years, which is typically a combination of synthetic transaction monitoring, desktop management and end user issues opened at the service desk. EUAM provides real end user activity in one tool. Depending on how intrusive a company needs to be (or is allowed to be), the following EUAM capabilities should be considered when looking for the right EUAM product:

  • Real-Time Application Response Monitoring
    • Information in real-time revealing degradations in applications performance preferably before the end user sees the impact.
  • End User Behavioral Analysis
    • Information on how end users access applications, when the applications are used and even where the access is required.
  • Visibility through service providers, clouds and content delivery networks
    • End-to-end visibility of applications performance irrespective of where the applications are sourced and the cloud environments between the source and the end user.
  • Application Activations
    • Visibility into when and how long applications are used enabling IT support to schedule and plan IT operational activity more effectively.
  • Keystroke/Activity Logging
    • Increasing root-cause capabilities by allowing IT support to see what was happening on the end user device when an issue occurred.
  • Device Information (type, software revisions, configuration)
    • Ensuring that the end user has the required configuration to support effective application release processes and allowing more effective issue identification.
  • End user location (*if company policy and/or privacy laws allow)
    • Allows IT to track where end users access applications and on what devices. This helps with performance degradation issue analysis and root-cause analysis.
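To make the behavioral analysis and location capabilities more concrete, here is a small sketch of how activity records might be grouped to find where an application is being experienced badly. The record fields (`application`, `device_type`, `location`, `response_ms`) and the "twice the application's median" rule are assumptions made for illustration, not features of any specific EUAM product.

```python
from collections import defaultdict
from statistics import median


def abnormal_groups(records, factor=2.0, min_samples=3):
    """Flag (application, device_type, location) groups whose median
    response time is more than `factor` times the median for that
    application across all users."""
    by_app = defaultdict(list)
    by_group = defaultdict(list)
    for r in records:
        by_app[r["application"]].append(r["response_ms"])
        key = (r["application"], r["device_type"], r["location"])
        by_group[key].append(r["response_ms"])

    flagged = []
    for key, times in sorted(by_group.items()):
        if len(times) < min_samples:
            continue  # too few observations to call it a trend
        baseline = median(by_app[key[0]])
        if median(times) > factor * baseline:
            flagged.append(key)
    return flagged
```

A result such as `("email", "phone", "airport")` points the support team at a device and location combination rather than at a datacenter component, and it requires no record of what any individual user was doing beyond the response time sample itself.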

Recent coverage of EUAM:

http://apmdigest.com/end-user-experience-application-performance-management-bmc

http://www.bmc.com/products/euem/end-user-experience.html?intcmp=redirect_product-listing_end-user-experience

http://www.businesswire.com/news/home/20121029005618/en/Aternity®-User-Activity-Monitoring-Windows®-8

 

where do you go when IT gets in your way?

When IT gets in the way of doing your job, where do you go for help? The service desk? Someone in IT operations? Google? Phone a friend? Big problems such as an application outage or the most common password issues are typically covered. These are certainly an inconvenience, but they either get the right level of attention or are easy to fix.

But what about poor performance when getting mail on your phone at the airport, getting access to a printer, or an inability to connect to wi-fi in a company facility?  These situations can be temporary but have no obvious path of remediation and they can totally ruin your day. It’s the small stuff that is the hardest to deal with. Most corporate IT organizations are ill-prepared to deal with this level of end user interaction and the end user is hesitant to send an email into the help desk or spend 30 minutes queuing to talk to someone for what is considered a trivial low priority problem.

In the world of private IT use there is no central help desk or IT department; however, people have learned to deal with issues. The option to send a complaint or a problem description to someone believed to be to blame is always available, with mixed results: some solve the issue, some do not and some are ignored. Then there is search. Someone, somewhere must have had a similar problem. This approach works; even though it may not provide the exact answer, it will at least send you to interest groups with any number of smart people willing to provide guidance. For a large number of reasons (e.g. company regulations) this type of activity is not something a business would readily adopt internally.

Managing issues through crowd sourcing.

Applications have been available for years which allow people to comment on services, products, restaurants, etc. Recently this capability has taken on a real-time aspect where guidance can be provided through experience and observation. An example of this is the ‘human’ GPS, Waze (http://www.waze.com). For the few people on the planet who have not heard of this application, it allows drivers to share information on their mobile devices in real-time on traffic conditions (jams, police speed traps, accidents etc.). This provides road condition awareness and allows the application to find you alternative routes.

Now, imagine using a version of this in business for IT. Going back to an example I gave at the beginning: you are at the airport trying to get email on your phone and it’s not going well. This can make you feel a bit of a victim and make you think: Is this problem something temporary? Is it just me? Have I done something stupid? Has someone else? However, what if you had an application that showed you the status of your applications and let you see whether anyone else is having similar issues, so you immediately know if it’s a general email problem, a location problem or a device problem and whether there are any workarounds? It would also allow you to see if the problem has been reported, report it yourself, add yourself to the problem list and track the problem. For the user it provides awareness and possible fixes. For the IT support organization it provides the ability to understand who is being affected, where they are and what they are using. With so many applications being sourced outside the datacenter, the avalanche of mobile devices used for business and people constantly moving around while still trying to keep working, the only way to help end users help themselves is through crowd-sourcing applications that augment the business’s IT operations management tools.
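As a rough illustration of the "is it just me?" question, the sketch below classifies a problem as general, location-specific, device-specific or isolated from reports shared by other users. The `Report` record, the 50% threshold and the category names are all hypothetical; a real crowd-sourcing application would need a far richer model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    user: str
    application: str
    location: str
    device: str
    ok: bool  # True if the application is working for this user


def _mostly_failing(reports, threshold, min_reports):
    """True when enough reports exist and the failing share is high."""
    if len(reports) < min_reports:
        return False
    failing = sum(1 for r in reports if not r.ok)
    return failing / len(reports) >= threshold


def classify(others, application, location, device,
             threshold=0.5, min_reports=2):
    """Classify a user's problem using reports from *other* users."""
    app_reports = [r for r in others if r.application == application]
    if _mostly_failing(app_reports, threshold, min_reports):
        return "general"
    same_location = [r for r in app_reports if r.location == location]
    if _mostly_failing(same_location, threshold, min_reports):
        return "location"
    same_device = [r for r in app_reports if r.device == device]
    if _mostly_failing(same_device, threshold, min_reports):
        return "device"
    return "isolated"
```

For the airport example, a call such as `classify(reports, "email", "airport", "phone")` returning `"location"` would tell the user it is not just them, and would tell the support organization where to look first.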

Power to the people.

why IT cannot ignore the end-user

Until recently visibility into the end user world was not considered essential when measuring the availability of IT service. It was assumed that focusing on datacenter metrics provided enough information to show how effectively IT supported the business. For most IT organizations the lens on the business remains the metrics provided by the service desk. From the perspective of issues this may be acceptable but it hardly represents how the business is using IT. It would be like asking a doctor “so, how healthy does the world look today?”

This whole situation has been exacerbated by the use of mobile devices and the growth in non-corporate cloud-based application sources.  So how does an IT department understand how the business is experiencing IT when it no longer has the luxury of concentrating its attention on the corporate data center? As of yet, there isn’t a consensus of opinion on how to address this situation leaving most to continue to look to their legacy IT infrastructure monitoring tools (see Infrastructure monitoring. How relevant is it?) supplemented with network performance tools and/or APM tools.

If the objective is to understand how IT is used and experienced then you don’t start from the data center. The starting place is the end user. This requires more than a set of tools giving visibility from ‘the edge’; it will require IT support to organize and focus teams on end user activity. Measuring experience means understanding how IT is used, when it is used and where it is used, and not just when there is an issue. Capturing and analyzing this content allows IT organizations to assess the true business impact of IT irrespective of where the user is, what they are using or where their applications are sourced.

This approach is not going to be easy for IT departments that have spent decades focusing on silo’d datacenter elements and back-end application transactions. However, end user activity monitoring is not optional. Users do not use one device, do not remain in one place and do not use just one application. IT innovation, mobility and end user creativity will continue to push the limits of IT operations management, with those able to adjust their IT management focus benefitting from better IT decision making and business alignment.

Those that don’t will be left struggling to manage increasingly diverse IT needs using tools that provide only a datacenter-centric application performance snapshot, stumbling their way towards the edge while trying to see through increasingly complex third-party service black holes.