Wednesday, December 28, 2005

Monitor and Alert Systems I - Concept of Instrumentation

We have been building monitoring and alerting solutions that are centered on the Service Level Management (SLM) architecture and built around the concept of instrumentation.
Instrumentation describes the technologies and processes for monitoring and measuring the performance and availability of system components. Through instrumentation, we are able to monitor system behavior and assess the impact of operational changes.

Instrumentation takes on two forms:
  1. Element Instrumentation, which tracks the status and behavior of individual components, such as network devices, servers and applications.
  2. Service Instrumentation, which tracks the behavior of services using active and passive collectors.

Difference Between Element and Service Instrumentation

Element Instrumentation is used to collect data for threshold monitoring, such as the CPU busy percentage and the percentage of received packets that contain transmission errors.
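
To make threshold monitoring concrete, here is a minimal sketch in Python. The metric names (cpu_busy_pct, rx_error_pct), the limits and the sample values are assumptions made up for illustration; they are not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str   # hypothetical metric name
    limit: float  # alert when the sampled percentage exceeds this value


def error_packet_percentage(received: int, errored: int) -> float:
    """Percentage of received packets that contained transmission errors."""
    if received == 0:
        return 0.0
    return 100.0 * errored / received


def breaches(samples: dict, thresholds: list) -> list:
    """Return the names of metrics whose sampled value exceeds its limit."""
    return [
        t.metric
        for t in thresholds
        if t.metric in samples and samples[t.metric] > t.limit
    ]


# Illustrative samples, as an element might report them.
samples = {
    "cpu_busy_pct": 93.0,
    "rx_error_pct": error_packet_percentage(received=20000, errored=50),
}
thresholds = [Threshold("cpu_busy_pct", 85.0), Threshold("rx_error_pct", 1.0)]

print(breaches(samples, thresholds))
```

With these sample values only the CPU metric is over its limit, since 50 errored packets out of 20000 is 0.25 percent.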

One of the most commonly used protocols is the Simple Network Management Protocol (SNMP), in which configuration and performance monitoring instrumentation is organized in a standardized naming directory called the Management Information Base (MIB). The MIB provides a universal directory of names for the configuration and performance data of standard system and network elements. An agent embedded in an element enables a remote Instrumentation Manager to access and manipulate MIB variables at the element via SNMP. SNMP provides mechanisms to

  1. read (GET) performance and status variables from an element MIB,
  2. change (SET) configuration parameters, and
  3. report events (TRAP).
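
The three mechanisms can be illustrated with a toy model. The sketch below is an in-memory stand-in written for this post, not a real SNMP implementation: nothing travels over the network, and the ToyAgent class, its method names and the variable values are all hypothetical. The variable and event names are borrowed from standard MIB naming purely as labels.

```python
from typing import Callable, Dict, Set


class ToyAgent:
    """In-memory stand-in for an agent embedded in an element (not real SNMP)."""

    def __init__(self, mib: Dict[str, object], writable: Set[str],
                 notify: Callable[[str, str], None]) -> None:
        self.mib = mib            # variable name -> current value
        self.writable = writable  # names the manager is allowed to change
        self.notify = notify      # callback standing in for the manager's trap receiver

    def get(self, name: str) -> object:
        """GET: read a performance or status variable."""
        return self.mib[name]

    def set(self, name: str, value: object) -> None:
        """SET: change a configuration parameter, if it is writable."""
        if name not in self.writable:
            raise PermissionError(f"{name} is read-only")
        self.mib[name] = value

    def trap(self, event: str, detail: str) -> None:
        """TRAP: report an event to the manager without being asked."""
        self.notify(event, detail)


received_traps = []
agent = ToyAgent(
    mib={"sysName": "router-1", "sysContact": "unset", "ifInErrors.1": 42},
    writable={"sysContact"},
    notify=lambda event, detail: received_traps.append((event, detail)),
)

print(agent.get("ifInErrors.1"))             # manager reads a counter
agent.set("sysContact", "noc@example.com")   # manager changes configuration
agent.trap("linkDown", "interface 1")        # element reports an event
print(agent.get("sysContact"), received_traps)
```

The point of the model is the direction of each operation: GET and SET are initiated by the manager, while TRAP is initiated by the element.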

Service Instrumentation, on the other hand, tracks the behavior of services using active and passive collectors, typically to measure the end-to-end response of an application transaction.

Active collectors add to the traffic of a system, essentially performing a small experiment to validate compliance with key parameters. For example, the “ping” tool sends a single packet to a remote system component, which immediately returns a copy. The tool measures the delay between the packet going out and the copy returning, and if multiple packets are sent, it also reports the percentage that return.
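
The measurement logic of an active collector can be sketched as follows. This is not the ping tool itself: the probe is passed in as a function (send_probe, a hypothetical name) that returns True when the copy comes back, and the demo uses a fake probe that loses every fourth packet so the example runs without network access.

```python
import itertools
import time
from typing import Callable


def run_probes(send_probe: Callable[[], bool], count: int = 4) -> dict:
    """Send `count` probes and report the round-trip delay and return rate."""
    delays = []
    for _ in range(count):
        start = time.perf_counter()
        came_back = send_probe()
        elapsed = time.perf_counter() - start
        if came_back:
            delays.append(elapsed)  # only returned probes have a delay
    return {
        "sent": count,
        "returned": len(delays),
        "returned_pct": 100.0 * len(delays) / count if count else 0.0,
        "avg_delay_s": sum(delays) / len(delays) if delays else None,
    }


# Fake probe for illustration: three replies, then one loss, repeating.
outcomes = itertools.cycle([True, True, True, False])
stats = run_probes(lambda: next(outcomes), count=4)
print(stats["returned"], stats["returned_pct"])
```

In a real collector, send_probe would wrap whatever small experiment is appropriate for the service, such as an echo request or a test transaction.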

Passive collectors rely on system traffic and facilities that are already in place to provide performance data. For example, existing log files can be used to measure workload and server response time.
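
As an illustration of a passive collector, the sketch below derives workload (request count) and average response time from log lines that already exist. The log format (timestamp, method, path, status, response time in milliseconds) and the sample lines are assumptions invented for this example; a real server log would need its own parsing.

```python
from collections import defaultdict

# Hypothetical log lines: timestamp method path status response_ms
LOG_LINES = [
    "2005-12-28T10:00:01 GET /orders 200 120",
    "2005-12-28T10:00:02 GET /login 200 40",
    "2005-12-28T10:00:05 GET /orders 200 180",
    "malformed line that the collector should skip",
]


def summarize(lines):
    """Return request count and average response time per path."""
    totals = defaultdict(lambda: [0, 0])  # path -> [requests, total_ms]
    for line in lines:
        parts = line.split()
        if len(parts) != 5 or not parts[4].isdigit():
            continue  # ignore lines that do not match the assumed format
        path, response_ms = parts[2], int(parts[4])
        totals[path][0] += 1
        totals[path][1] += response_ms
    return {
        path: {"requests": n, "avg_response_ms": total / n}
        for path, (n, total) in totals.items()
    }


summary = summarize(LOG_LINES)
print(summary["/orders"])
```

No extra traffic is generated here; the collector only reads what the server has already recorded.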

Instrumentation Manager

Many of our clients have already made substantial investments in both element and service instrumentation, but lack a centralized platform to monitor status, collect operational statistics and receive real-time alarms when immediate attention is required.

In the next part of this series, we will discuss the Instrumentation Manager in more detail.

2 Comments:

Blogger Sunshine said...

Happy New Year, my dear friend.

9:51 PM  
Blogger totoro said...

Hehehe, we shld ask Miko to come over nx time... long time hv such hearty drink already:)

2:14 PM  
