Performance Toolbox Version 1.2 and 2 for AIX: Guide and Reference

Chapter 12. Recording Performance Data on Remote Systems

This chapter provides information about recording performance data on remote systems.

Recording on Remote Systems Overview

Monitoring of performance data via the network is important and extremely useful if you know when and what to monitor. Unfortunately, that is not always or even normally the case. Quite commonly, performance problems arise and are felt by end users without the system administrator knowing about it until it's too late to start a monitoring session.

Therefore, the xmservd daemon permits any system with the Agent component installed to record the activity on the system at all or selected times and for any set of performance statistics. This allows a system administrator to use the activity recording for an after-the-fact analysis of the performance problems. This capability is called the xmservd recording facility and is controlled through the xmservd recording configuration file.

When xmservd is configured to record the activity of the system where it runs, the daemon does not die as described in the Life and Death of xmservd section. The daemon considers itself to be configured for recording if a recording configuration file is present.

All recording files created by xmservd are placed in the directory /etc/perf. Recording file names have the form azizo.yymmdd, where the part after the period is built from the day the first record was written to the file. A recording for February 26, 1994 would thus be called /etc/perf/azizo.940226. The recording activity for any one day always goes to the same file, even when xmservd is stopped and restarted during the day. If a recording file for the day exists when xmservd starts, it appends additional activity to that file; otherwise it creates the file. For further details about how xmservd uses recording files, see the Retain Line section.
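The naming rule can be illustrated with a short sketch. The function name is hypothetical and is not part of xmservd; only the directory and the azizo.yymmdd pattern come from the text above.

```python
from datetime import date

RECORDING_DIR = "/etc/perf"  # directory named in the text above


def recording_file_name(first_record_day: date) -> str:
    """Build the recording file path from the day the first record was written."""
    # %y%m%d gives the two-digit year, month, and day (yymmdd)
    return f"{RECORDING_DIR}/azizo.{first_record_day:%y%m%d}"


print(recording_file_name(date(1994, 2, 26)))  # /etc/perf/azizo.940226
```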

Recordings produced by xmservd have one or more statsets. One is created for each sampling interval defined in the recording configuration file. Each statset is assigned a number equal to the sampling interval divided by the minimum sampling interval of the xmservd daemon.

The Recording Configuration File

The recording configuration file must be supplied by the system administrator who configures a host. No recording configuration file is supplied as part of Performance Toolbox for AIX. The file is in ASCII format. When xmservd starts, it first tries to locate the recording configuration file as /etc/perf/xmservd.cf. If this file doesn't exist, xmservd looks for the recording configuration file as /usr/lpp/perfagent/xmservd.cf. If either file exists, xmservd considers itself configured for recording and parses the recording configuration file for instructions about when and what to record.
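The lookup order can be sketched as follows. This is an illustration only, not the daemon's code; the function name and the injectable exists parameter are assumptions made so that the logic can be tried on any system.

```python
import os
from typing import Callable, Optional

# Search order described in the text above
CANDIDATES = ("/etc/perf/xmservd.cf", "/usr/lpp/perfagent/xmservd.cf")


def find_recording_config(
    exists: Callable[[str], bool] = os.path.isfile,
) -> Optional[str]:
    """Return the first candidate that exists, or None if recording is not configured."""
    for path in CANDIDATES:
        if exists(path):
            return path
    return None


# With both files present, the copy in /etc/perf is used.
print(find_recording_config(lambda path: True))
```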

The recording configuration file must contain a retain line, a frequency line, metric lines, and start-stop lines.

The recording configuration file may also contain command lines and hot lines.

The lines are described in the following sections. They must appear in the sequence listed here, and the keywords or metric names must begin in column one of each line. White space must separate individual entries on the lines. In addition to the required line types, the recording configuration file may contain blank lines and comment lines that begin with the character # (number sign).

A program, xmscheck, that parses and analyzes a recording configuration file is supplied as part of the Agent component. This program allows you to check the validity of a recording configuration file before it is moved to the /etc/perf directory. The program is described in The xmscheck Preparser section. Run the xmscheck command after creating or editing the recording configuration file to check it for errors. If a metric is not valid on the local system, xmservd will terminate processing the recording file.

Retain Line

The primary purpose of the retain line is to specify how long recording files must be retained. It also defines how many days each recording file covers. The format of the retain line is:

retain days_to_keep [days_per_file]
retain This keyword identifies the line.
days_to_keep Must be a number larger than one. It specifies the minimum number of days a recording file must be kept before xmservd deletes it.
days_per_file Optional. If specified, gives the number of days a recording file shall contain. This number must be smaller than or equal to days_to_keep. If not specified, this value defaults to the value specified for days_to_keep.

Whenever the xmservd daemon is started, and whenever it is running and midnight is passed, it checks to see if any of the recording files in directory /etc/perf is old enough to be deleted. This is done by calculating a factor, rf as the integer value:

rf = (days_to_keep + days_per_file - 1) / days_per_file

If the number d1 is the day number corresponding to the yymmdd part of the recording file name, and the current day number is d2, then the recording file is retained when the following expression is true; otherwise it is erased:

d2 - d1 < rf x days_per_file

If days_per_file is larger than one, xmservd looks for a file with a name that indicates it is less than days_per_file days old. If such a file exists, recording continues to that file. If not, a new file with a name generated from today's date is created.
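The retention test can be worked through with a small sketch. It assumes that the comparison is a strict less-than and that the difference between day numbers can be computed as a difference between calendar dates; the function names are hypothetical.

```python
from datetime import date
from typing import Optional


def retention_factor(days_to_keep: int, days_per_file: Optional[int] = None) -> int:
    """Compute rf with integer division, as in the formula above."""
    if days_per_file is None:
        days_per_file = days_to_keep  # documented default
    return (days_to_keep + days_per_file - 1) // days_per_file


def is_retained(file_day: date, today: date,
                days_to_keep: int, days_per_file: Optional[int] = None) -> bool:
    """Return True if the file named for file_day is kept on the day 'today'."""
    if days_per_file is None:
        days_per_file = days_to_keep
    rf = retention_factor(days_to_keep, days_per_file)
    # Assumption: the file is retained while d2 - d1 < rf x days_per_file
    return (today - file_day).days < rf * days_per_file


# "retain 7 2" gives rf = (7 + 2 - 1) // 2 = 4, so the limit is 8 days.
print(retention_factor(7, 2))                                     # 4
print(is_retained(date(1994, 2, 26), date(1994, 3, 5), 7, 2))     # True  (7 days)
print(is_retained(date(1994, 2, 26), date(1994, 3, 6), 7, 2))     # False (8 days)
```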

When an existing recording file is opened by xmservd, the daemon checks the first (configuration) record in the file. This record carries the time and date of the last modification to the recording configuration file as of the time the recording file was created. If the recording configuration file has been modified since that time, xmservd begins the recording by appending a full set of control records to the file and adding the character "@" at the end of the file name. Most programs that process such a file only process the part of the file up to the second set of control records.

Frequency Line

The frequency line sets the default sampling interval for metrics. This interval is used for all metrics for which you do not specify a different sampling interval on their metric lines. The line looks like this:

frequency interval
frequency This keyword identifies the line.
interval Specifies the sampling interval in milliseconds. The value specified is rounded to the nearest multiple of the min_remote_interval value as specified with the -i command line argument to xmservd or its default value.

Recordings contain one set of statistics for each sampling interval you specify with this line type and on metric lines. It is recommended that no set of statistics ever has more than 256 metrics.
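The relation between a sampling interval, the min_remote_interval value, and the statset number can be sketched as follows. The handling of exact ties and of intervals below the minimum is an assumption, and the value 500 in the example is illustrative only; it is not taken from this chapter.

```python
def round_interval(interval_ms: int, min_interval_ms: int) -> int:
    """Round to the nearest multiple of min_interval_ms.

    Assumptions: an exact tie rounds up, and the result is never
    smaller than one multiple of min_interval_ms.
    """
    multiples = (interval_ms + min_interval_ms // 2) // min_interval_ms
    return max(1, multiples) * min_interval_ms


def statset_number(interval_ms: int, min_interval_ms: int) -> int:
    """Sampling interval divided by the daemon's minimum sampling interval."""
    return round_interval(interval_ms, min_interval_ms) // min_interval_ms


MIN_REMOTE_INTERVAL = 500  # illustrative value only
print(round_interval(20200, MIN_REMOTE_INTERVAL))   # 20000
print(statset_number(60000, MIN_REMOTE_INTERVAL))   # 120
print(statset_number(20000, MIN_REMOTE_INTERVAL))   # 40
```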

Metric Lines

One metric line must be supplied for each metric you want recorded. The metric lines have the following format:

metric_name [interval]
metric_name Must be the full path name of a statistic. Because xmservd can only access local statistics, the path name must not include the hosts part of the path name. The path name does not begin with a / (slash).

Process contexts have a name consisting of the process ID, a ~ (tilde), and the name of the executing program. To reach a statistic for a specific process, you can specify the process context name as either the process ID followed by the tilde, or the name of the executing program. The example below shows how to specify a statistic for the wait pseudo process, which, on AIX Version 3.2, always has a process ID of 514. Both lines point to the same statistic.

Proc/514~/usercpu
Proc/wait/usercpu

If you specify a name of a program currently executing in more than one process, only the first one encountered is used. Generally, recording of process statistics from xmservd is discouraged except for processes that are expected to never die. If a process dies, it is deleted from the statset and is not added back, should the process be restarted later.

interval Optional. If specified, defines the sampling interval in milliseconds to use for recording this metric. If omitted, the metric is recorded with the sampling interval specified on the frequency line.

Start-Stop Lines

The start-stop lines specify when recordings shall start and stop. Multiple lines may be used. The format of a start-stop line is:

start dd hh mm dd hh mm

The first set of dd hh mm values specifies the time to start recording; the second set specifies the time to stop recording.

start Identifies the line.
dd Specifies the number indicating the day of the week when you want a recording to start and stop. Sunday is day number 0, Saturday is day number 6. Can be specified as a single day number, as a range such as 1-5 (Monday through Friday), or as a series of day numbers separated by commas such as 1,3,5 (Monday, Wednesday, and Friday).
hh Specifies the hour on a 24-hour clock (midnight is 00) when you want a recording to start and stop. Can be specified as a single hour, as a range of hours such as 07-19 (7 am through 7 pm), or as a series of hours separated by commas such as 9,12,15 (9 am, 12 noon, 3 pm).
mm Specifies the minute when you want a recording to start and stop. Can be specified as a single minute value or as a series of minute values separated by commas such as 0,30 (every 30 minutes).

Exercise care when matching start and stop times, especially when using multiple start-stop lines. It can be difficult to do so without plotting the recording intervals on a time scale. Therefore, the program xmscheck is available to preparse a recording configuration file and help you evaluate the resulting recording intervals.

The following examples help you understand how recording intervals are defined. First, consider the following start-stop line, which causes recording to take place for 10 minutes every half hour between 9 am and 6 pm on all weekdays. Notice that the last time recording starts every day is at 17:30 (5:30 pm).

start 1-5 9-17 0,30 1-5 9-17 10,40

If another start-stop line was added, that line would augment the first one. This is done by laying the intervals out on a time scale where all start and stop points are marked. The time scale is then processed from the beginning, creating a final set of start and stop marks by eliminating all stop marks that fall at the same minute as a start mark. Assume we supply the following two start-stop lines:

start 1-5 9-17 0,30 1-5 9-17 10,40
start 5 18-19 0,30 5 18-19 10,40

This would cause recording to take place for 10 minutes every half hour between 9 am and 6 pm on the first four weekdays and between 9 am and 8 pm on Fridays. The same could have been specified with:

start 1-4 9-17 0,30 1-4 9-17 10,40
start 5 9-19 0,30 5 9-19 10,40

The time scale created by xmservd does not wrap to the next week. Therefore, if you want recording from 11.30 pm to 12.30 am every night of the week, you need two lines:

start 0-6 23 30 1-6 00 30
start 0 0 0 0 0 30

For continuous recording at all times, specify:

start 0 0 0 0 0 0
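The way start-stop lines combine can be explored with a sketch like the one below. It is not the parser used by xmservd or xmscheck. It assumes that every combination of the dd, hh, and mm values produces a mark, and that a recording still active at the end of the week is cut off there because the time scale does not wrap. All function names are hypothetical.

```python
MINUTES_PER_WEEK = 7 * 24 * 60


def expand(field):
    """Expand a field such as '1-5', '1,3,5', or '9' into a list of integers."""
    values = []
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            values.extend(range(int(low), int(high) + 1))
        else:
            values.append(int(part))
    return values


def marks(dd, hh, mm):
    """Minute-of-week offsets for every day/hour/minute combination."""
    return {(d * 24 + h) * 60 + m
            for d in expand(dd) for h in expand(hh) for m in expand(mm)}


def recording_intervals(lines):
    """Return (start, stop) minute-of-week pairs for a list of start-stop lines."""
    starts, stops = set(), set()
    for line in lines:
        fields = line.split()          # fields[0] is the keyword "start"
        starts |= marks(*fields[1:4])
        stops |= marks(*fields[4:7])
    stops -= starts                    # a stop at the same minute as a start is dropped
    intervals, active_since = [], None
    for minute in sorted(starts | stops):
        if minute in starts and active_since is None:
            active_since = minute
        elif minute in stops and active_since is not None:
            intervals.append((active_since, minute))
            active_since = None
    if active_since is not None:       # assumption: no wrap into the next week
        intervals.append((active_since, MINUTES_PER_WEEK))
    return intervals


two_lines = recording_intervals(["start 1-5 9-17 0,30 1-5 9-17 10,40",
                                 "start 5 18-19 0,30 5 18-19 10,40"])
print(len(two_lines))                                 # 94
print(recording_intervals(["start 0 0 0 0 0 0"]))     # [(0, 10080)]
```

Under these assumptions, both ways of writing the Friday extension shown above produce the same list of intervals.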

Command Lines

Command lines allow the xmservd recording facility to execute commands or scripts when an old recording file is deleted. These are specified in the recording configuration file with the following format:

command /bin/ptxmerge /var/perf/temp %s /var/perf/year_to_date
command /bin/mv -f /var/perf/temp /var/perf/year_to_date

The %s in the first line refers to the file that is about to be deleted. The first line uses the ptxmerge program to merge that recording file with the year_to_date file of accumulated recordings and writes the merged output to a temporary file. The last line moves the temporary file over the previous year_to_date file. Note that this series of commands is not safe; it is meant only to illustrate the facility. In practice, use a script that makes sure that a lack of disk space doesn't cause you to lose data.
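A safer variant of the two command lines could look like the following sketch. It assumes that the merge program exits with a nonzero status when it fails (for example, when the disk is full); this chapter does not state that for ptxmerge, so verify it before relying on the check. The function name and the merge_cmd parameter are hypothetical.

```python
import os
import subprocess


def merge_then_replace(merge_cmd, old_recording, accumulated, temp):
    """Merge old_recording into accumulated by way of temp.

    The accumulated file is replaced only if the merge command succeeded
    and left a non-empty temporary file. Returns True on success.
    """
    # Argument order follows the example above: output file first, then inputs.
    result = subprocess.run([*merge_cmd, temp, old_recording, accumulated])
    merged_ok = (result.returncode == 0
                 and os.path.exists(temp)
                 and os.path.getsize(temp) > 0)
    if not merged_ok:
        if os.path.exists(temp):
            os.remove(temp)            # discard the partial output
        return False
    os.replace(temp, accumulated)      # single rename; the old file stays until this succeeds
    return True


# Hypothetical call corresponding to the two command lines above:
# merge_then_replace(["/bin/ptxmerge"], "/etc/perf/azizo.940226",
#                    "/var/perf/year_to_date", "/var/perf/temp")
```

A script built around such a function would receive the %s file name as its argument from a single command line.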

Hot Lines

HotSets allow metrics to be monitored by activity rather than by name. HotSets are defined by the format shown below. The values correspond to the arguments of the SpmiAddSetHot subroutine call:

# format of line to define hotfeed followed by examples:

#key                      max   thres  freq                           seve  trap
#word metric              resp  hold   uency  feed_type  except_type  rity  no

hot  LAN/*/framesin       1       0    60000  Always
hot  Disk/*/busy          3      50    10000  Threshold  Trap          0     14
hot  FS/*/%totfree        3      95   300000  Threshold  Both          4     16
hot  FS/rootvg/*/%totfree 0      95   300000  Always
hot  RTime/LAN/*/above99  3      80   300000  Threshold  Exception     2
hot The key word indicating HotSet recording.
metric The metric with a wildcard in the specification.
maxresp The maximum number of responses to record. If the feed_type is Threshold, this value must be greater than 1. One exception/trap is sent for each metric that exceeds the threshold, up to the maximum value of this field.
threshold If feed_type is Threshold, this field is the value that must be exceeded for the exception/trap to be sent. If the value is specified as a negative number, the threshold is considered to be exceeded if the monitored metric is lower than the numeric threshold value.
frequency The frequency to monitor the metrics. This value is in milliseconds.
feed_type The valid types are: Always or Threshold.
exception_type The valid types are: Exception, Trap, or Both.
severity If sending an exception, the severity level for the exception.
trap_number If sending a trap, the trap number to send.

Each line defines a separate HotSet. No more than MAX_HOT_COUNT (40) Hot events will be processed at any given time. HotSets should be used to monitor thresholds that represent abnormal performance behavior, and not the norm.
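The field layout of a hot line can be illustrated with a small parser. This is a sketch based only on the field descriptions and examples above; the class and function names are hypothetical, and treating the fields after feed_type as optional is an assumption drawn from the examples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HotLine:
    metric: str
    maxresp: int
    threshold: int
    frequency_ms: int
    feed_type: str
    exception_type: Optional[str] = None
    severity: Optional[int] = None
    trap_number: Optional[int] = None


def parse_hot_line(line: str) -> HotLine:
    """Split a hot line on white space and convert the numeric fields."""
    fields = line.split()
    if len(fields) < 6 or fields[0] != "hot":
        raise ValueError(f"not a hot line: {line!r}")
    if fields[5] not in ("Always", "Threshold"):
        raise ValueError(f"unknown feed_type: {fields[5]!r}")
    hot = HotLine(fields[1], int(fields[2]), int(fields[3]),
                  int(fields[4]), fields[5])
    if len(fields) > 6:
        if fields[6] not in ("Exception", "Trap", "Both"):
            raise ValueError(f"unknown exception_type: {fields[6]!r}")
        hot.exception_type = fields[6]
    if len(fields) > 7:
        hot.severity = int(fields[7])
    if len(fields) > 8:
        hot.trap_number = int(fields[8])
    return hot


disk = parse_hot_line("hot  Disk/*/busy   3   50   10000  Threshold  Trap   0   14")
print(disk.metric, disk.threshold, disk.exception_type, disk.trap_number)
```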

The recording support programs ptxhottab and ptx2stat should be used to process xmservd recording files which contain HotSet values.

Example of Recording Configuration File

# SAMPLE RECORDING CONFIGURATION FILE
# Keep files at least 7 days and let each file contain
# two days' recordings
retain 7 2
# Set default sampling interval to one minute
frequency 60000
# Give five statistics to record with default frequency
CPU/cpu0/user
CPU/cpu0/kern 
Mem/Real/sysrepag
Mem/Virt/pagein
Mem/Virt/steal
# Two additional statistics are recorded every 20 seconds
IP/NetIF/tr0/ioctet 20000
IP/NetIF/tr0/ooctet 20000
# record every weekday from 8.30 am to 5 pm, except during
# the lunch hour from noon to 1 pm
start 1-5 8 30 1-5 17 0
start 1-5 13 0 1-5 12 0
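To see how the line types in such a file relate to each other, the following sketch reads a shortened version of the sample and applies the default sampling interval to metric lines that do not give their own. It is an illustration only: it does not check the required order of the lines or validate metric names, which is the job of xmscheck, and the function name is hypothetical.

```python
def parse_recording_config(text):
    """Collect the entries of a recording configuration file into a dictionary."""
    config = {"retain": None, "frequency": None, "metrics": {},
              "start": [], "command": [], "hot": []}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue                                # blank line or comment line
        if line[0].isspace():
            raise ValueError(f"line {number}: entry must begin in column one")
        fields = line.split()
        keyword = fields[0]
        if keyword == "retain":
            config["retain"] = [int(f) for f in fields[1:]]
        elif keyword == "frequency":
            config["frequency"] = int(fields[1])
        elif keyword in ("start", "command", "hot"):
            config[keyword].append(fields[1:])
        else:
            # Anything else is taken to be a metric line: metric_name [interval]
            interval = int(fields[1]) if len(fields) > 1 else config["frequency"]
            config["metrics"][keyword] = interval
    return config


SAMPLE = """\
# Keep files at least 7 days, two days per file
retain 7 2
frequency 60000
CPU/cpu0/user
IP/NetIF/tr0/ioctet 20000
start 1-5 8 30 1-5 17 0
start 1-5 13 0 1-5 12 0
"""

config = parse_recording_config(SAMPLE)
print(config["metrics"])
```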

The xmscheck Preparser

When xmservd is started with the command line argument -v, its recording configuration file parser writes the result of the parsing to the log file. The output includes a copy of all lines in the recording configuration file, any error messages, and a map of the time scale with indication of when recording starts and stops.

While this is useful to document what is read from the recording configuration file, it is not a very useful tool for debugging of a new or modified recording configuration file. Therefore, the program xmscheck is available to preparse a recording configuration file before you move it to the directory /etc/perf, where xmservd looks for the recording configuration file.

When xmscheck is started without any command line argument, it parses the file /etc/perf/xmservd.cf. This way, you can determine how the running daemon is configured for recording. If a file name is specified on the command line, that file is parsed.

Output from xmscheck goes to stdout. The parsing is done by the exact same module that does the parsing in xmservd. That module is linked in as part of both programs. The parsing checks that all statistics specified are valid and prints the time scale for starting and stopping recording in the form of a "time table."

In the time table, each minute has a numeric code. The meaning of codes is as follows:

0 Recording is inactive. Neither a start nor a stop request was given for the minute.
1 Recording is active. Neither a start nor a stop request was given for the minute.
2 Recording is inactive. A stop request was given for the minute.
3 Recording is active. A start request was given for the minute.

The following sample shows how xmscheck formats the time table. Only the part of the table that covers Tuesday (day 2) is shown. The file shown in the Example of Recording Configuration File section was used to produce this output.

Day 2, Hour 00:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 01:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 02:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 03:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 04:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 05:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 06:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 07:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 08:
000000000000000000000000000000311111111111111111111111111111
Day 2, Hour 09:
111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 10:
111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 11:
111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 12:
200000000000000000000000000000000000000000000000000000000000
Day 2, Hour 13:
311111111111111111111111111111111111111111111111111111111111
Day 2, Hour 14:
111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 15:
111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 16:
111111111111111111111111111111111111111111111111111111111111
Day 2, Hour 17:
200000000000000000000000000000000000000000000000000000000000
Day 2, Hour 18:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 19:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 20:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 21:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 22:
000000000000000000000000000000000000000000000000000000000000
Day 2, Hour 23:
000000000000000000000000000000000000000000000000000000000000
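The codes in the table can be reproduced with a short sketch. It is not the xmscheck implementation. The start and stop marks are entered by hand from the two start-stop lines of the sample file, and a start request is assumed to take precedence over a stop request in the same minute.

```python
def minute_of_week(day, hour, minute):
    """Offset in minutes from Sunday 00:00."""
    return (day * 24 + hour) * 60 + minute


def time_table(starts, stops, minutes=7 * 24 * 60):
    """Return one code per minute of the week, using the codes 0 through 3."""
    codes, active = [], False
    for minute in range(minutes):
        if minute in starts:           # start request: recording is active
            active = True
            codes.append("3")
        elif minute in stops:          # stop request: recording is inactive
            active = False
            codes.append("2")
        else:                          # no request: state is unchanged
            codes.append("1" if active else "0")
    return "".join(codes)


# Marks for "start 1-5 8 30 1-5 17 0" and "start 1-5 13 0 1-5 12 0"
starts = {minute_of_week(d, h, m) for d in range(1, 6) for h, m in ((8, 30), (13, 0))}
stops = {minute_of_week(d, h, m) for d in range(1, 6) for h, m in ((12, 0), (17, 0))}

table = time_table(starts, stops)
for hour in (8, 12):
    begin = minute_of_week(2, hour, 0)
    print(f"Day 2, Hour {hour:02d}:")
    print(table[begin:begin + 60])
```

The two hours printed correspond to the Hour 08 and Hour 12 rows of the table above.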
