Operations
Manager 2005
Operations Guide
Optimize
Author: Dan Wesley
Program Manager: Gautam Bhatia
Published: December 2004
Applies To: Microsoft Operations Manager 2005
Document Version: Release 1.0
The information contained in this document represents the current view of Microsoft Corporation
on the issues discussed as of the date of publication. Because Microsoft must respond to
changing market conditions, it should not be interpreted to be a commitment on the part of
Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the
date of publication.
This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES,
EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting
the rights under copyright, no part of this document may be reproduced, stored in or introduced
into a retrieval system, or transmitted in any form or by any means (electronic, mechanical,
photocopying, recording, or otherwise), or for any purpose, without the express written
permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.
Unless otherwise noted, the example companies, organizations, products, domain names, e-mail
addresses, logos, people, places, and events depicted herein are fictitious, and no association
with any real company, organization, product, domain name, e-mail address, logo, person, place,
or event is intended or should be inferred.
© 2004 Microsoft Corporation. All rights reserved.
Microsoft, MS-DOS, Windows, Windows NT, Windows Server, Active Directory, ActiveSync, and
Windows Mobile are either registered trademarks or trademarks of Microsoft Corporation in the
United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their
respective owners.
Acknowledgments
Primary Reviewers: Adam Stone, Gautam Bhatia, James Hedrick
Managing Editor: Sandra Faucett
Did you find this information useful? Please send your suggestions and comments about
the documentation to momdocs@microsoft.com.
Looking for more MOM information? Experience the power of customer communities!
MOM Community
Optimize
CHAPTER 6
This chapter provides guidance for analyzing Microsoft® Operations Manager 2005 (MOM)
performance, and for identifying potential and existing performance issues.
In This Chapter
• Introduction
• Characteristics of an Optimized MOM System
• Sudden Increases in Resource Usage
• General Indicators of a Performance Issue
• Lower the Risk of Performance Issues
• Assess MOM Database and Management Server Activity
• Assess Agent Activity
• Assess Console Activity
Introduction
Consider optimizing and tuning MOM to:
• Reduce or eliminate performance bottlenecks that are reducing overall effectiveness or
causing a system failure.
• Improve performance to increase operational effectiveness or to scale up to manage more
computers.
When you are dealing with performance issues, or you want to tune your system, it is
recommended that you:
• Clearly define goals and objectives for optimizing MOM, and confirm that the results of
optimizations can be quantified.
• Review capacity planning and sizing documents to ensure that computers are appropriately
sized for each component in the MOM system (for example, the server hosting the MOM
Database).
• Review the existing MOM architecture to ensure that it is adequate for supporting the
number of computers that you are managing, and that the load on architectural components
falls within supported limits (for example, the number of agents reporting to a Management
Server).
• Confirm that the MOM support team is aware of known hardware and networking issues that
may have existed before MOM was installed.
• Confirm that the MOM support team is familiar with the pre-defined performance thresholds
provided in the Management Packs that are installed.
• Make sure that historical data is available in order to identify trends and develop the
appropriate performance benchmarks. See also: Benchmarks.
• Take a systematic approach to identify performance issues and implement changes.
    • Start with a system-wide view and identify key performance indicators.
    • Isolate the part of the system that you want to focus on, and identify the appropriate
      performance indicators.
    • Implement changes, and monitor the system, to verify the results of the changes by
      collecting historical data.
    • Have a back out plan for reversing configuration changes that may create new
      bottlenecks or further degrade performance.
Benchmarks
Although Management Packs provide rules to generate alerts when certain thresholds are
exceeded, it is necessary to determine if the general performance of a computer is constant, or
changing, so potential issues can be addressed proactively.
Performance benchmarks, based on real time and historical data, are essential for determining
whether or not performance on a MOM component has changed. For example, unless there is a
baseline, you cannot conclude that Management Server performance is “slower” or “faster”.
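As a minimal sketch of the idea above, the following Python function compares recent samples of a counter against a historical baseline. The function name, the sample values, and the tolerance of two standard deviations are illustrative assumptions and are not part of MOM.

```python
from statistics import mean, stdev


def compare_to_baseline(history, recent, tolerance=2.0):
    """Classify recent samples relative to a historical baseline.

    history   -- past measurements of one counter (for example, CPU %)
    recent    -- current measurements of the same counter
    tolerance -- allowed deviation in baseline standard deviations
                 (assumed value; tune it for your environment)
    """
    if len(history) < 2 or not recent:
        raise ValueError("need at least two historical samples and one recent sample")
    base_mean = mean(history)
    base_sd = stdev(history)
    delta = mean(recent) - base_mean
    if abs(delta) <= tolerance * base_sd:
        return "within baseline"
    return "higher than baseline" if delta > 0 else "lower than baseline"
```

Without the `history` argument there is nothing to compare against, which is the point of keeping historical data.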
Characteristics of an
Optimized MOM System
Overall MOM performance is affected by many variables, and the impact of each variable varies
from organization to organization. For example, networks are a common element in most
organizations, but the extent of networking, network capacity and network reliability differs in
every organization.
In general, an optimized MOM system has the following characteristics:
• The operational database size remains below 15 GB.
• Only the Management Packs that are needed are installed, and only the rules that are
required are enabled.
• The volume of events and alerts that are generated does not overload the Management
Server or the MOM database.
• The volume of operational data from managed computers is evenly distributed among
Management Servers (in management groups that have more than one MOM server).
• Communication between MOM components is efficient and consistent, and alert latency
remains below 2 minutes, during normal operations.
• Database re-indexing and grooming jobs complete successfully, and in a timely manner.
• The data transformation services (DTS) job that is run against the operational database
completes successfully.
• Ongoing operations management tasks, such as computer and attribute discovery, agent
installs/uninstalls, and configuration changes complete successfully and do not overload the
Management Servers for an extended period of time.
• Resource utilization on all the servers, including the managed computers, stays within
acceptable ranges during normal operations.
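The two numeric characteristics in the list above (operational database size below 15 GB, alert latency below 2 minutes) can be checked mechanically. The following Python sketch assumes that you have already obtained both measurements by some other means; the function and parameter names are hypothetical.

```python
MAX_DB_SIZE_GB = 15           # from the list above
MAX_ALERT_LATENCY_SECONDS = 120  # 2 minutes, from the list above


def check_optimized(db_size_gb, alert_latency_seconds):
    """Return a list of findings; an empty list means both checks passed."""
    findings = []
    if db_size_gb >= MAX_DB_SIZE_GB:
        findings.append("operational database is not below 15 GB")
    if alert_latency_seconds >= MAX_ALERT_LATENCY_SECONDS:
        findings.append("alert latency is not below 2 minutes")
    return findings
```

The other characteristics in the list (job completion, load distribution, resource utilization) depend on site-specific ranges and are not covered by this sketch.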
Sudden Increases in
Resource Usage
It is normal for sudden increases (spikes) in resource usage to occur on all the MOM components
under certain conditions. These spikes are caused by different activities that take place during
normal MOM operations, and they can create temporary bottlenecks that increase alert
latency. Any optimizing activity should factor in these increases in resource usage. It is
recommended that you:
• Distinguish spikes in resource usage from ongoing resource utilization issues.
• Use performance counters and MOM reports to identify the cause and frequency of the
spikes.
Typically, surges in resource usage are not considered to be an ongoing performance issue, and
you can implement processes to minimize the impact of performance spikes as part of your
optimizing activities.
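One possible way to distinguish a spike from an ongoing utilization issue is to measure how long a counter stays above a threshold. The Python sketch below is an illustration only: the 80% threshold and the run length of six consecutive samples are assumed values, and the function name is hypothetical.

```python
def classify_utilization(samples, threshold=80.0, sustained_run=6):
    """Classify equally spaced utilization samples (percentages).

    Returns "normal" if no sample exceeds the threshold, "sustained" if
    at least `sustained_run` consecutive samples exceed it, and "spike"
    otherwise.
    """
    longest = 0
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        longest = max(longest, run)
    if longest == 0:
        return "normal"
    return "sustained" if longest >= sustained_run else "spike"
```

For example, a single sample at 95% surrounded by low values is reported as a spike, while six consecutive samples above 80% are reported as sustained.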
The re-index job, which runs every Sunday at 3 A.M., causes the database server disk to be
heavily utilized. This can cause some alert latency for the duration that it runs, which is usually
20-30 minutes.
Note
Other jobs, including grooming, update database, as well as
the Data Transformation Services (DTS) job, do not contribute
significantly to alert latency.
General Indicators of a
Performance Issue
There are several key indicators that provide advance warning of potential performance issues.
Perform the following checks daily:
• The size of the MOM Database:
    • Is increasing rapidly or filling quickly.
    • Is increasing more than anticipated.
• MOM Database re-indexing and grooming jobs are taking longer.
• Performance on the database server or Management Server is consistently slow.
• There are alerts indicating that the Management Server queue is full.
• Resource utilization (for example, CPU or memory) is unusually high on any computer that
has a MOM component installed.
• Alert and event latency is higher than normal, or is increasing.
• Response time on the consoles is slow.
• As a best practice, do not load a Management Server to the maximum supported limit for
managed computers.
Note
The Alert Tuning Solution documented in Chapter 8, “Tools”,
provides extensive guidance for tuning Management Packs
that could be appropriate for this task.
• There are no permission issues between the Data Access Server (DAS) and the database.
• See if the MOMService process is restarting frequently. This process restarts if the private
bytes exceed 300 MB on the server. This can occur when the number of agents approaches
the supported limit, or if there are several Management Packs installed. You can change
this limit by editing the value of the following registry key:
HKEY_LOCAL_MACHINE\SOFTWARE\Mission Critical Software\OnePoint\MaxServerPrivateBytes
• See if the server queue is filling up frequently. If so, conduct the analysis in “Server queue
assessment”.
• See if other applications are consuming too many resources on the Management Server, and
the SQL instance for the OnePoint database on the database server.
Note
If discovery data is filling the queue, this situation should
resolve itself within 1-4 hours. The length of time depends on
the number of agents on which data was changed, and what
Management Pack the data changed for. For example, the IIS
service discovery packet is larger than the Windows Base
Operating System service discovery packet.
• If the discovery simple count is 0, but the performance counter \MOM Server(*)\DB Alert
Simple Count is greater than zero for more than a few minutes, the system may be
experiencing an alert storm. Use the Operator console to view the alerts that are coming in,
to see if the number of alerts is much higher than usual.
• Check to see if there are several Operator consoles in the management group that are
currently refreshing views. If this is the case, check to see if any queries are taking longer
than expected to return data to the consoles.
• If the Reporting database is installed on the same computer as the operational database,
check to see if the reporting DTS job is running. If it is, investigate to see if the job is taking
longer than usual to complete.
• Check to see if there are other SQL Server jobs running at the time that CPU usage is high.
• If you still haven’t identified the issue, use the SQL Profiler to see what queries are running
either with high CPU usage, or for a long time.
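The alert storm condition described in the list above (discovery simple count of 0 while the DB Alert Simple Count stays above zero) can be sketched as a check over a series of counter samples. This Python example assumes that the two counters have already been collected as pairs of numbers at a regular interval; the function name and the choice of five consecutive samples to stand for "more than a few minutes" are assumptions.

```python
def possible_alert_storm(samples, min_consecutive=5):
    """Return True if an alert storm pattern appears in the samples.

    samples         -- (discovery_count, db_alert_count) pairs in time order
    min_consecutive -- number of consecutive matching samples required
                       (assumed value; depends on your sampling interval)
    """
    run = 0
    for discovery_count, db_alert_count in samples:
        if discovery_count == 0 and db_alert_count > 0:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0
    return False
```

A positive result is only a prompt to look at the incoming alerts in the Operator console, as described above.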
Activity on the disk where the OnePoint database resides is high
If disk idle time is less than 20%:
• Check the size of the current SampledNumericData (SND) partition by using Enterprise
Manager. Find the table name where Current=1 in the PartitionTables table, and then check
the size of that SND table. If the table is greater than 5 million rows for every 1 GB of
memory that SQL can use, check to see if it recovers soon after the next partitioning job.
If you see this pattern every night, you may have to add more memory to the database server,
and ensure that SQL Server is using the extra memory. Another option is to reduce your
performance data load.
• Repeat the preceding process with the current Event partition. If this is the problem area,
then you may have to add more memory to the database server and ensure that SQL Server
is using the extra memory. Another option is to reduce your performance data load.
• Check to see if there are any SQL jobs, in particular the Re-index job, running at the time
that disk activity is high.
• Check to see if the reporting DTS job is running. If it is, see if it is running for longer than
usual.
• If you still haven’t identified the issue, use the SQL Profiler to see what queries are running
with high disk activity.
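The partition size guideline above is simple arithmetic: 5 million per 1 GB of memory available to SQL Server. The Python sketch below assumes that the 5 million figure refers to table rows and that the row count has already been read from the current partition table; the function name is hypothetical.

```python
ROWS_PER_GB = 5_000_000  # guideline from the text, interpreted as rows (assumption)


def partition_exceeds_guideline(row_count, sql_memory_gb):
    """Return True if the partition is larger than the guideline allows."""
    if sql_memory_gb <= 0:
        raise ValueError("sql_memory_gb must be positive")
    return row_count > ROWS_PER_GB * sql_memory_gb
```

For example, with 2 GB of memory available to SQL Server the guideline allows 10 million rows, so a partition of 12 million rows exceeds it.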
High CPU utilization by the MOMService process
If the MOMService process CPU usage is over 80% on the Management Server:
• Repeat the steps used to check the discovery simple count (“High CPU utilization on the
database server”).
Activity on the Management Server disk is high
If the idle time is less than 20% on the Management Server:
• Make sure MOM 2005 RTM is installed. MOM 2005 RC had an issue with disk utilization
on the Management Server.
Server queue filling up from time to time
If resource consumption is not high, but the server queue is filling up from time to time, check
the following patterns:
• If the server queue is filling up every 15 minutes, it could be the performance counter
collection. Check the database disk idle time to see if there is a spike that corresponds to
the queue filling up. If this is the case, there is a disk bottleneck, and either faster or
additional disks are required.
• If the server queue is filling up towards the end of the day, and it recovers at midnight, there
is probably a high volume of performance data or events. There should be corresponding
high disk activity on the database server.
• If the pattern for the server queue filling up is periodic, look for SQL jobs running at those
times.
• If the server queue is constantly at 100%, see if the server queue simple count is constant. If
so, make sure MOM 2005 RTM is installed. MOM 2005 RC had an issue with the server
queue getting deadlocked.
• If there is no pattern for the server queue filling up, enable tracing and check mc8 logs on the
server. Look for errors that correspond to the queue filling up.
If there are no resource bottlenecks on the database and Management Servers, and the \MOM
Server(*)\Queue Space Percent Used never exceeds 10% for more than a few minutes, but alert
latency is still high, then the cause of the latency is likely the agent.
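The queue-fill patterns listed above can be roughly separated by looking at the intervals between the times at which the queue filled up. The following Python sketch assumes that those times have already been recorded; the function name and the two-minute slack are assumptions, and the sketch covers only the 15-minute, periodic, and no-pattern cases.

```python
from datetime import datetime, timedelta


def classify_queue_pattern(fill_times, slack=timedelta(minutes=2)):
    """Suggest a next step from the spacing of queue-full times.

    fill_times -- datetimes, in ascending order, at which the queue filled
    slack      -- tolerated deviation between intervals (assumed value)
    """
    if len(fill_times) < 3:
        return "not enough data"
    gaps = [later - earlier for earlier, later in zip(fill_times, fill_times[1:])]
    if all(abs(gap - timedelta(minutes=15)) <= slack for gap in gaps):
        return "every 15 minutes: check performance counter collection and database disk idle time"
    if all(abs(gap - gaps[0]) <= slack for gap in gaps):
        return "periodic: look for SQL jobs running at those times"
    return "no pattern: enable tracing and check the mc8 logs"
```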
• Check communications from the agent to the Management Server by pinging the server and
by using telnet to connect to port 1270 on the Management Server.
• Check the network bytes/second and bandwidth on the server and agent computers. Verify
that all of the available bandwidth is not being used, especially in low-bandwidth scenarios.
• If none of the preceding cases are true, turn on tracing, and check the agent’s mc8 logs for
error events indicating that the agent cannot connect to the Management Server.
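The telnet check in the list above tests whether a TCP connection to port 1270 can be opened. As a rough equivalent, the following Python sketch attempts the same connection by using the standard socket module. The function name and the five-second timeout are assumptions, and the host name in the usage note is a placeholder.

```python
import socket


def can_reach_management_server(host, port=1270, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from the agent computer, a call such as `can_reach_management_server("mgmt-server.example.com")` returns False when the port is blocked or the server is unreachable. A successful connection shows only that the port is open, not that the agent is communicating correctly.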
The agent service is restarting
Check the MOMHost private bytes performance counter. The MOMService restarts if the private
bytes of any MOMHost process exceed 100 MB on the agent. This occurs when running
responses or scripts consume large amounts of memory. You can adjust the maximum private
bytes limit by changing the registry settings shown in Table 6.1.
Table 6.1 Private bytes registry keys
Setting                      Key
Default host private bytes   HKEY_LOCAL_MACHINE\SOFTWARE\Mission Critical Software\OnePoint\MaxDefaultHostPrivateBytes
Script host private bytes    HKEY_LOCAL_MACHINE\SOFTWARE\Mission Critical Software\OnePoint\MaxScriptHostPrivateBytes
Resource usage
Check to see if high resource usage (CPU, memory, disk) is causing a bottleneck.
• Verify that there are no other applications with heavy resource usage, which might be
depriving the MOM service of the resources that it requires.
• If the MOMHost process is consuming too many resources, check to see what responses or
scripts are running at the time of heavy resource utilization.
• If the agent queue is at 100%, check to see if the agent queue simple count is constant. If this
is the case, verify that MOM 2005 RTM is installed. There was a known issue with queue
deadlocking with the MOM 2005 Release Candidate (RC).
• If MOM RTM is installed, turn on tracing, and examine the agent’s mc8 logs to see if there
are errors that correspond to the times that the queue filled up.