
Powering Digital Transformation with Intelligent Monitoring & Analytics


How Using the Right Data in Context Improves Speed, Stability, Incident Command and MTTR

During our recent webinar, Using the Right Data In Context to Prevent IT Outages, guest speaker and Forrester Principal Analyst Charles Betz shared his insights on the importance of leveraging the different types of machine data, collected through intelligent monitoring and analytics solutions, to enable observability and act on critical insights that allow you to speed IT and accelerate digital transformation. Here are some of our takeaways from this webinar.

1. Digital Transformation Means Need for Speed
The overriding theme for much of digital transformation is this need for speed. It is no longer about large, well-established firms that operate at their own pace and at their own leisure. This is an age where speed matters. Speed is essential: speed to market, speed to experiment. And this is driving so much of what we are seeing with IT and digital operations today. We know that the customer is king. Forrester calls this the age of the customer. Customers are driving vendors and suppliers with their insistence on fast, delightful experiences. All of this comes together into this overall drive for speed and how the industry has been responding to that. What we see first is that customers and business technology professionals agree that digital technology enhancements are driving the business. And when we ask these same people, “What do you mean by that? How do you want to improve the customer experience?” many answer that improving online customer experience is essential.

66% of professionals agree that digital technology enhancements are the driving force behind their business strategy

2. Experimentation and the Speed-Quality Paradox Cannot Be Ignored
Another way that speed plays out is in the ability to experiment more quickly, and this has been one of the great paradoxes. As we move into these more agile and DevOps-influenced forms of development, deployment and delivery, how do we maintain quality? You have an idea, and you need to test that idea. You need to bring it in contact with the market. You need to build what is known as a minimum viable product, or MVP. You then need to test this in various ways against reality (against customer demand, for example) and measure the response. That is what gives you the learning that then informs your decisions: building along the same lines with some further improvements, going back to the drawing board and generating some new ideas, or abandoning the whole endeavor. And this is how we ultimately understand our business outcomes.

3. For DevOps, Change Can Actually Improve Stability
DevOps is expanding across the industry. It has passed the experimenters, the innovators. It is now firmly in the early majority. And, in fact, I think we’re going to be seeing late majority in the next 18 months. We already have nearly half of the industry aware of it and experimenting with it. The success stories are just too compelling. But DevOps is still overcoming the big debate of stability versus change. One of the fundamental realizations that the DevOps pioneers started to work with is that making smaller, more frequent changes might actually lead to better system stability. Imagine that.

More frequent changes in and of themselves do not equal better results and stability. However, when you make changes more frequently, by nature, you will also be making smaller changes. And this has been shown in company after company to improve your time to market and retain stability.

74% of developers say their organizations are using DevOps to some degree, with 52% increasing

4. Traditional Incident Management is Becoming Incident Command
We are so dependent on digital systems as a society, and this is resulting in some interesting responses. Amazon realized some years back that traditional IT incident management was no longer sufficient for the level of dependency other companies were beginning to have on the Amazon cloud. And so, they started looking more broadly and found some very interesting material coming out of the U.S. emergency services, including a protocol and a method known as the National Incident Management System, which is currently maintained by the U.S. Department of Homeland Security. There are certain sets of defined protocols and expectations that people must follow when in these very critical situations. And increasingly, we are seeing more and more interest in this framing, in these perspectives, as incidents in digital systems become more socially impacting. In fact, this was a major theme at the DevOps Enterprise Summit in San Francisco, where a number of safety professionals, non-IT, were gathered on stage and challenged the DevOps Enterprise Summit attendees to increase the level of professionalism because digital systems were becoming so socially and economically critical.

5. Automation Drives Need for Monitoring and Observability
One of the themes that has presented itself recently is this problem of what happens when we automate all of the easy things. We have amazing resiliency in our digital systems. We have the ability to automatically heal containers. But the common sense concern is this: If I’ve automated all the easy stuff, then what is left? All the hard stuff.

There is some evidence about mean time to resolution (MTTR) going up in organizations that, on the surface, appear pretty mature. But it seems to make sense to me that as we, with DevOps, have faster release cycles, it means we are solving problems more quickly. If we solve problems more quickly, we lose a longstanding concept in IT service management called the “known error.” In the old days when someone found an issue, you created a known error record so when somebody ran into the issue again, the help desk could do x, y and z. Now we are in a day and age where things like that get fixed right away. They get fixed, they get patched, you roll the patch out, you have increasing release frequency.

[Figure: Mean-time-to-Repair (MTTR) shown with the labels Mean time to assemble, Mean-time-to-Identify (MTTI), Mean-time-to-Know (MTTK), Mean-time-to-Fix (MTTF) and Mean-time-to-Verify (MTTV)]

And so, your known error life cycles start to decrease, which means your understanding of the likely operational outage scenarios and issue scenarios, that stock of knowledge, is going down as you increase your release cycles. That stock of knowledge has been essential for the resolution of IT and digital incidents and issues. Broadly speaking this is all a form of knowledge work, and we know that knowledge workers spend up to 20 percent of their time looking for this kind of information. And all of this, in the context of digital systems, leads to a critical need for monitoring and observability.
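
To make the MTTR discussion in this section concrete, here is a minimal Python sketch of how the phases named in the figure (MTTI, MTTK, MTTF, MTTV) could be computed from incident timestamps. It assumes the four phases run back to back and add up to MTTR, which is one reading of the figure and not something the webinar states. It leaves out "mean time to assemble" because the figure does not make clear where that fits. The `Incident` record, its field names and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Incident:
    # Hypothetical timestamps, in minutes since the incident began.
    identified: float  # the problem was detected
    known: float       # the cause was understood
    fixed: float       # the fix was applied
    verified: float    # the fix was confirmed to work


def phase_means(incidents):
    """Average duration of each phase across all incidents."""
    return {
        "MTTI": mean(i.identified for i in incidents),
        "MTTK": mean(i.known - i.identified for i in incidents),
        "MTTF": mean(i.fixed - i.known for i in incidents),
        "MTTV": mean(i.verified - i.fixed for i in incidents),
    }


def mttr(incidents):
    """MTTR as the sum of the four phase means (an assumption of this sketch)."""
    return sum(phase_means(incidents).values())


# Made-up sample data for illustration only.
incidents = [
    Incident(identified=5.0, known=25.0, fixed=40.0, verified=45.0),
    Incident(identified=15.0, known=75.0, fixed=90.0, verified=95.0),
]

for name, minutes in phase_means(incidents).items():
    print(f"{name}: {minutes:.1f} min")
print(f"MTTR: {mttr(incidents):.1f} min")
```

Breaking MTTR down this way shows which phase dominates. In the sample data it is the time to know the cause, the phase most exposed to a shrinking stock of known-error knowledge.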
6. Using Intelligent Analytics to Understand Critical Data Types Enables Fast IT
Monitoring and observability are key to the success of digital transformation, experimentation, speed and quality, stability in DevOps, incident command, and improving MTTR. In short, they are key to the success of “fast IT.” Monitoring and observability, in turn, are tremendous producers and consumers of various kinds of information that enable efforts to modernize IT. It’s important to note that there are two sets of equally important capabilities: collecting this data and analyzing this data. Companies should seek out the platforms that have the most robust analytics capabilities for the broadest sets of data. And we assert that there are five critical data types for this endeavor.

Log Data
Computers and the software running on them all produce logs. They show, in great detail, the state changes, the events, the various functions, users logging in, attempting to log in, hackers trying to log in. We also then use various techniques to analyze them. And it’s, of course, the analytics that provide the greatest value when you are looking at log data. At the lowest level, you can see what is being logged as an event, and then you analyze the logs to determine what happened. When we see these 3 or 5 or 10 things in a log, we know that this may mean there is a situation, so we generate a higher-level notification, i.e., an event.

Metrics
Metrics can tell us about operational concerns in the infrastructure like performance degradation or capacity utilization. There’s a wide variety of infrastructure metrics that give us insight into operational concerns that might even require the dispatch of a crisis team or, at the very least, require proactive remediation. There are also longer-cycle business-as-usual concerns, like, maybe we need to invest in some more capacity. We can collect these at regular intervals. With proper analytics, metrics provide rich data for diagnosing issues that have occurred and for preventing issues that may occur. These analytics can produce higher-level alerts in the form of events.

Application Data
Application data is a form of specialized metrics. But your application metrics are very different, and that’s where you start seeing business metrics like, for example, shopping cart abandonment. Well, perhaps the shopping cart abandonment on your e-commerce site is due to the fact that your performance is degraded to the point where people are no longer going to do business with you that day. As with infrastructure metrics, with proper analytics, application metrics can provide higher-level alerts about application health and performance in the form of events.

Event Data
An event is a significant marker that helps us understand and contextualize the logs and metrics, turning them from mere data into actual information that is then actionable. For example, we may see an attempt to log in a dozen times on a server. That’s just 12 data points, but then we are also reasonably confident that this probably means that there is a brute force attack or other potential security exploit in progress. And that can then be understood to be an event, and that is actually much more useful information. We’ve taken a set of information (from the logs, from the infrastructure, from the applications), created higher-order events, then correlated that information to parse out actionable events.

Model Data
While events can be correlated to find patterns, event engines have no knowledge of systems, their interconnectedness, or their dependencies on each other. This is where model data comes into play. The overall model data may be the hardest form of data to collect and understand. It is data, the metadata, that actually gives you the overall context for how to understand all of the previous data: the log data, the event data, the infrastructure and application metrics. There is an immense level of complexity as we start to model these systems and we look at the dependencies between datastores and midtier processing and edge processing. We need to understand this model once it is actually operationalized and we start to look at the dependencies and the telemetry we are getting off of the runtime operational systems. But actually, compiling and maintaining this dependency data is still a nontrivial problem.
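
As a rough illustration of how these data types build on one another, the Python sketch below turns raw log records into a higher-level event (the dozen failed logins from the Event Data example) and then uses a small dependency map, standing in for model data, to add context about which services may be affected. It is a simplified sketch under stated assumptions, not a description of any particular product. The server names, service names, record format and threshold are hypothetical, and a real detector would also consider a time window.

```python
from collections import Counter

# Hypothetical parsed log records: (server, message).
LOG_RECORDS = [("web-01", "login failed")] * 12 + [("db-01", "login failed")] * 2

# Hypothetical model data: which services depend on which server.
DEPENDENTS = {
    "web-01": ["storefront", "checkout"],
    "db-01": ["checkout"],
}

# "A dozen" attempts, following the example in the text (an assumed threshold).
FAILED_LOGIN_THRESHOLD = 12


def detect_events(records, threshold=FAILED_LOGIN_THRESHOLD):
    """Turn raw failed-login log records into higher-level events."""
    failures = Counter(server for server, msg in records if msg == "login failed")
    return [
        {"type": "possible_brute_force", "server": server, "attempts": count}
        for server, count in failures.items()
        if count >= threshold
    ]


def add_context(event, dependents=DEPENDENTS):
    """Attach the services that depend on the affected server (model data)."""
    return {**event, "impacted_services": dependents.get(event["server"], [])}


events = [add_context(e) for e in detect_events(LOG_RECORDS)]
for event in events:
    print(event)
```

Twelve log lines on their own say little. The event names the likely situation, and the dependency lookup shows why it matters to the business, which is the context that model data is meant to supply.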
SUMMARY
As we venture into the brave new world that awaits us over the next couple of years, we will bring all the data together and
start using data warehousing techniques. Because understanding all of these data types, in context, is critical, there will
be significant focus on platforms that bridge data silos. These platforms will be key to enabling “fast IT” and will be a key
ingredient in businesses that survive digital transformation and remain relevant for years to come.

ZENOSS CLOUD
Zenoss works with the world’s largest organizations to ensure their IT services and applications are always on. As the leader
in software-defined IT operations, Zenoss uniquely collects all types of machine data to build real-time IT service models
that train machine learning algorithms to predict and eliminate outages in hybrid IT environments, dramatically reducing
downtime and IT spend.

Zenoss Cloud is the first SaaS-based intelligent IT operations management platform that streams and normalizes all machine
data, uniquely enabling the emergence of context for preventing service disruptions in complex, modern IT environments.
Zenoss Cloud builds the most granular and intelligent infrastructure relationship models possible at any scale and
proactively provides unparalleled holistic health and deep performance insights to optimize any IT environment.

Technology vendors have taken many different approaches over the years to help prevent IT service outages and improve
overall IT performance. These approaches include infrastructure monitoring, artificial intelligence operations, log analytics
and more. Some approaches collect performance data from systems directly, some rely on logs, some rely on events, while
others rely on data sent from agents. Zenoss Cloud is the unique platform that combines all of these approaches.

ZENOSS CLOUD HELPS CUSTOMERS:

Increase Operational Agility
Automate processes and streamline collaboration to enable faster service delivery
Support new business models at the speed of demand
Deliver management as a service for DevOps teams

Accelerate Technology Adoption
Simplify cloud migrations and adoption of software-defined and converged technologies
Eliminate risk associated with digital transformation
Apply consistent monitoring policies across all cloud and on-premises systems

Ensure Service Reliability
Identify issues, isolate root cause and accelerate resolution before disruptions impact users or business
Evolve from availability and performance to capacity and optimization
Transition IT to event-driven outcomes

Consolidate Monitoring Tools
Increase IT visibility and eliminate silos while reducing overhead and spend
Streamline across teams with collaboration workflows (ChatOps)
Drive new efficiencies with Smart View, the machine learning–powered dynamic user interface

For more information, or to request a Zenoss Cloud trial, please visit https://www.zenoss.com.

www.zenoss.com
1-512-687-6854 (direct)
1-888-936-6770 (toll free)
twitter.com/zenoss
www.linkedin.com/company/zenoss-inc-
