AIX Versions 3.2 and 4 Performance Tuning Guide

Assessing Memory Requirements via the rmss Command

rmss is an acronym for Reduced-Memory System Simulator. rmss provides you with a means to simulate RS/6000s with real memory sizes that are smaller than that of your actual machine, without having to extract and replace memory boards. Moreover, rmss provides a facility to run an application over a range of memory sizes, displaying, for each memory size, performance statistics such as the response time of the application and the amount of paging. In short, rmss is designed to help you answer the question: "How many megabytes of real memory does an RS/6000 need to run AIX and a given application with an acceptable level of performance?" or, in the multiuser context, "How many users can run this application simultaneously in a machine with X megabytes of real memory?"

In AIX Version 4, the rmss command is packaged as part of the Performance Toolbox for AIX. To determine whether rmss is available, use:

lslpp -lI perfagent.tools

If this package has been installed, rmss is available.

It is important to keep in mind that the memory size simulated by rmss is the total size of the machine's real memory, including the memory used by AIX and any other programs that may be running. It is not the amount of memory used specifically by the application itself. Because of the performance degradation it can cause, rmss can be used only by root or a member of the system group.

The following sections describe rmss in further detail:

Two Styles of Using rmss

rmss can be invoked in two ways: (1) to change the memory size and exit; or (2) as a driver program, which executes a specified application multiple times over a range of memory sizes and displays important statistics that describe the application's performance at each memory size. The first invocation technique is useful when you want to get the look and feel of how your application performs at a given system memory size, when your application is too complex to be expressed as a single command, or when you want to run multiple instances of the application. The second invocation technique is appropriate when you have an application that can be invoked as an executable or shell script file.

Note: Before using rmss, it is a good idea to use the command schedtune -h 0 to turn off VMM memory-load control. Otherwise, VMM memory-load control may interfere with your measurements at small memory sizes. When your experiments are complete, reset the memory-load-control parameters to the values that are normally in effect on your system (if you normally use the default parameters, use schedtune -D).

Using rmss to Change the Memory Size and Exit

To change the memory size and exit, use the -c flag:

# rmss -c memsize

For example, to change the memory size to 12MB, use:

# rmss -c 12

memsize is the number of megabytes, expressed as an integer or a decimal fraction (for example, 12.25). Additionally, memsize must be between 4MB and the amount of physical real memory in your machine. Depending on the hardware and software configuration, rmss may not be able to change the memory size to less than 8MB, because of the size of inherent system structures such as the kernel. When rmss is unable to change to a given memory size, it displays an informative error message.

rmss reduces the effective memory size of an RS/6000 by stealing free page frames from the list of free frames that is maintained by the VMM. The stolen frames are kept in a pool of unusable frames and are returned to the free frame list when the effective memory size is to be increased. Also, rmss dynamically adjusts certain system variables and data structures that must be kept proportional to the effective size of memory.

It may take a short while (up to 15 to 20 seconds) to change the memory size. In general, the more you wish to reduce the memory size, the longer rmss takes to complete. When successful, rmss responds with the following message:

Simulated memory size changed to  12.00 Mb.

To display the current memory size, use the -p flag:

# rmss -p

To this, rmss responds:

Simulated memory size is  12.00 Mb.

Finally, if you wish to reset the memory size to the actual memory size of the machine, use the -r flag:

# rmss -r

No matter what the current simulated memory size, using the -r flag sets the memory size to be the physical real memory size of the machine. Since this example was run on a 16MB machine, rmss responded:

Simulated memory size changed to  16.00 Mb.

Note: The rmss command reports usable real memory. On machines that contain bad memory or memory that is in use, rmss reports the amount of real memory as the amount of physical real memory minus the memory that is bad or in use by the system. For example, the rmss -r command might report:

Simulated memory size changed to 79.9062 Mb.

This could be a result of some pages being marked bad or a result of a device that is reserving some pages for its own use (and thus not available to the user).

Using the -c, -p, and -r Flags

The -c, -p and -r flags of rmss have an advantage over the other options in that they allow you to experiment with complex applications that cannot be expressed as a single executable or shell script file. On the other hand, the -c, -p, and -r options have a disadvantage in that they force you to do your own performance measurements. Fortunately, there is an easy way to do this. You can use vmstat -s to measure the paging-space activity that occurred while your application ran.

By running vmstat -s, running your application, then running vmstat -s again, and subtracting the number of paging-space page ins before from the number of paging-space page ins after, you can determine the number of paging-space page ins that occurred while your program ran. Furthermore, by timing your program, and dividing the number of paging-space page ins by the program's elapsed run time, you can obtain the average paging-space page-in rate.
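The subtraction and division described above can be sketched with two small shell helpers. This is only an illustration: the function names pgsp_pageins and pagein_rate are hypothetical, and the sketch assumes that vmstat -s prints the counter on a line ending in "paging space page ins", with the count in the first field.

```shell
# Extract the cumulative "paging space page ins" counter from vmstat -s
# output supplied on standard input (assumed format: "<count> paging space page ins").
pgsp_pageins() {
    awk '/paging space page ins/ { print $1; exit }'
}

# Print the page ins that occurred during a run and the average page-in rate.
# Arguments: "before" count, "after" count, elapsed run time in seconds.
pagein_rate() {
    awk -v b="$1" -v a="$2" -v t="$3" 'BEGIN {
        if (t <= 0) { print "elapsed time must be positive"; exit 1 }
        printf "%d page ins, %.1f page ins/sec\n", a - b, (a - b) / t
    }'
}

# Usage on a live system (foo is the placeholder application name):
#   before=$(vmstat -s | pgsp_pageins)
#   time foo                      # note the elapsed ("real") seconds
#   after=$(vmstat -s | pgsp_pageins)
#   pagein_rate "$before" "$after" 132.0
```

Because the vmstat -s counters are system-wide, the difference includes paging caused by any other program that ran during the measurement.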

It is also important to run the application multiple times at each memory size. There are two good reasons for doing so. First, when changing memory size, rmss often clears out a lot of memory. Thus, the first time you run your application after changing memory sizes it is possible that a substantial part of the run time may be due to your application reading files into real memory. But, since the files may remain in memory after your application terminates, subsequent executions of your application may result in substantially shorter elapsed times. Another reason to run multiple executions at each memory size is to get a feel for the average performance of the application at that memory size. The RS/6000 and AIX are complex systems, and it is impossible to duplicate the system state each time your application runs. Because of this, the performance of your application may vary significantly from run to run.

To summarize, you might consider the following set of steps as a desirable way to use this style of rmss invocation:

while there are interesting memory sizes to investigate:
  {
  change to an interesting memory size using rmss -c;
  run the application once as a warm-up;
  for a couple of iterations:
    {
    use vmstat -s to get the "before" value of paging-space page ins;
    run the application, while timing it;
    use vmstat -s to get the "after" value of paging-space page ins;
    subtract the "before" value from the "after" value to get the
       number of page ins that occurred while the application ran;
    divide the number of paging-space page ins by the response time
       to get the paging-space page-in rate;
    }
  }
run rmss -r to restore the system to normal memory size (or reboot)

The calculation of the (after - before) paging I/O numbers can be automated by using the vmstat.sh script that is part of the PerfPMR package.
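As a rough illustration, the steps listed above might be scripted along the following lines. This is a sketch under stated assumptions, not the vmstat.sh script from PerfPMR: the function names pgsp_ins and measure_sizes are hypothetical, the shell is assumed to be ksh or bash (for the SECONDS timer, which has one-second resolution), and vmstat -s is assumed to print a "paging space page ins" line with the count in the first field.

```shell
# Read the cumulative paging-space page-in counter (assumed vmstat -s format).
pgsp_ins() {
    vmstat -s | awk '/paging space page ins/ { print $1; exit }'
}

# Usage: measure_sizes "command" runs size [size ...]
measure_sizes() {
    app=$1; runs=$2; shift 2
    for size in "$@"; do
        rmss -c "$size" || break        # stop if the size cannot be set
        $app > /dev/null 2>&1           # warm-up run, not measured
        i=1
        while [ "$i" -le "$runs" ]; do
            before=$(pgsp_ins)
            start=${SECONDS:-0}
            $app > /dev/null 2>&1       # measured run
            elapsed=$(( ${SECONDS:-0} - start ))
            after=$(pgsp_ins)
            awk -v s="$size" -v b="$before" -v a="$after" -v t="$elapsed" 'BEGIN {
                rate = (t > 0) ? (a - b) / t : 0
                printf "%s MB: %d page ins, %d sec, %.1f page ins/sec\n", s, a - b, t, rate
            }'
            i=$((i + 1))
        done
    done
    rmss -r                             # restore the real memory size
}

# Example (foo is the placeholder application name):
#   measure_sizes foo 2 16 12 8
```

The one-second timer is too coarse for very short programs; for those, a finer-grained timing method would be needed.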

Using rmss to Run a Command over a Range of Memory Sizes

The -s, -f, -d, -n, and -o flags are used in combination to invoke rmss as a driver program. As a driver program, rmss executes a specified application over a range of memory sizes and displays statistics describing the application's performance at each memory size. The syntax for this invocation style of rmss is given below:

rmss [ -s smemsize ] [ -f fmemsize ] [ -d memdelta ]
     [ -n numiterations ] [ -o outputfile ] command

The -n flag is used to specify the number of times to run and measure the command at each memory size. The -o flag is used to specify the file into which to write the rmss report, while command is the application that you wish to run and measure at each memory size. Each of these flags is discussed in detail below.

The -s, -f, and -d flags are used to specify the range of memory sizes. The -s flag specifies the starting size, the -f flag specifies the final size, and the -d flag specifies the difference between sizes. All values are in integer or decimal fractions of megabytes. For example, if you wanted to run and measure a command at sizes 24, 20, 16, 12 and 8MB, you would use the following combination:

-s 24 -f 8 -d 4

Likewise, if you wanted to run and measure a command at 16, 24, 32, 40, and 48MB, you would use the following combination:

-s 16 -f 48 -d 8

If the -s flag is omitted, rmss starts at the actual memory size of the machine. If the -f flag is omitted, rmss finishes at 8MB. If the -d flag is omitted, there is a default of 8MB between memory sizes.
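To check which sizes a given combination of -s, -f, and -d would visit before starting a long run, a small helper such as the following can be used. The function name rmss_sizes is hypothetical. Appending the final size when the step does not land on it exactly is an assumption inferred from the find report later in this section (48, 40, ... 8, 4.50), not documented behavior.

```shell
# Print the memory sizes visited for a given start (-s), final (-f) and
# delta (-d), one per line. Works for descending and ascending ranges.
rmss_sizes() {
    awk -v s="$1" -v f="$2" -v d="$3" 'BEGIN {
        if (d <= 0) exit 1
        if (s >= f) { for (m = s; m > f; m -= d) printf "%g\n", m }
        else        { for (m = s; m < f; m += d) printf "%g\n", m }
        printf "%g\n", f    # assumption: the final size is always measured
    }'
}

# Examples:
#   rmss_sizes 24 8 4     prints 24 20 16 12 8, one per line
#   rmss_sizes 16 48 8    prints 16 24 32 40 48, one per line
```

With fractional deltas, floating-point rounding in awk can make the last step differ slightly from what rmss itself would do, so treat the output as a planning aid only.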

What values should you choose for the -s, -f, and -d flags? A simple choice would be to cover the memory sizes of RS/6000s that are being considered to run the application you are measuring. However, increments of less than 8MB can be useful, because you can get an idea of how much "breathing room" you'll have when you settle on a given size. For instance, if a given application thrashes at 8MB but runs without page ins at 16MB, it would be useful to know where within the 8 to 16MB range the application starts thrashing. If it starts at 15MB, you may want to consider configuring the system with more than 16MB of memory, or you may want to try to modify the application so that there is more breathing room. On the other hand, if the thrashing starts at 9MB, you know that you have plenty of breathing room with a 16MB machine.

The -n flag is used to specify how many times to run and measure the command at each memory size. After running and measuring the command the specified number of times, rmss displays statistics describing the average performance of the application at that memory size. To run the command 3 times at each memory size, you would use the following:

-n 3

If the -n flag is omitted, rmss determines during initialization how many times your application must be run in order to accumulate a total run time of 10 seconds. rmss does this to ensure that the performance statistics for short-running programs will not be significantly skewed by transient outside influences, such as daemons.

Note: If you are measuring a very brief program, the number of iterations required to accumulate 10 seconds of CPU time can be very large. Since each execution of the program takes a minimum of about 2 elapsed seconds of rmss overhead, you should probably specify the -n parameter explicitly for short programs.
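A back-of-the-envelope estimate shows why this matters. The sketch below uses a hypothetical helper, rmss_estimate, and rests on several assumptions: the 10-second target is treated as elapsed run time, the roughly 2 seconds of overhead is charged to every execution including the warm-up, and the time rmss spends during initialization is ignored.

```shell
# Rough estimate of how long an rmss driver run might take.
# Arguments: run time of one execution (sec), number of memory sizes,
# and optionally an explicit -n value (0 or omitted = let rmss choose).
rmss_estimate() {
    awk -v t="$1" -v sizes="$2" -v n="${3:-0}" 'BEGIN {
        if (t <= 0 || sizes <= 0) exit 1
        if (n == 0) { n = int(10 / t); if (n * t < 10) n++ }   # ceil(10 / t)
        if (n < 1) n = 1
        total = sizes * (n + 1) * (t + 2)   # +1 warm-up run, +2 sec overhead each
        printf "%d iterations per size, about %d seconds in total\n", n, total
    }'
}

# Examples for a half-second program measured at 6 memory sizes:
#   rmss_estimate 0.5 6      without -n
#   rmss_estimate 0.5 6 2    with -n 2
```

Under these assumptions, the half-second program needs 20 measured iterations per size without -n, compared with 2 when -n 2 is given.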

What are good values to use for the -n flag? If you know that your application takes much more than 10 seconds to run, then you can specify -n 1 so that the command is run and measured only once at each memory size. The advantage of using the -n flag is that rmss will finish sooner because it will not have to spend time during initialization to determine how many times to run your program. This can be particularly valuable when the command being measured is long-running and interactive.

It is important to note that rmss always runs the command once at each memory size as a warm-up before running and measuring the command. The warm-up is needed to avoid the I/O that occurs when the application is not already in memory. Although such I/O does affect performance, it is not necessarily due to a lack of real memory. The warm-up run is not included in the number of iterations specified by the -n flag.

The -o flag is used to specify a file into which to write the rmss report. If the -o flag is omitted, the report is written into the file rmss.out.

Finally, command is used to specify the application to be measured. command can be an executable or shell script, with or without command-line arguments. There is one limitation on the form of the command, however: it cannot contain the redirection of input or output (for example, foo > output, foo < input). This is because rmss treats everything to the right of the command name as an argument to the command. If you wish to redirect, you must place the command in a shell script file.
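One way to create such a shell script file is sketched below. The helper name make_wrapper and the script name foo_redirect.sh are hypothetical; foo, input, and output are the placeholder names used in the text.

```shell
# Write a small wrapper script so that the redirection happens inside the
# script rather than on the rmss command line.
# Arguments: path of the script to create, command line to wrap.
make_wrapper() {
    printf '#!/bin/sh\n%s\n' "$2" > "$1" && chmod +x "$1"
}

# Example:
#   make_wrapper foo_redirect.sh 'foo < input > output'
#   rmss -s 24 -f 8 foo_redirect.sh
```

The wrapper adds the cost of starting one extra shell to each measured run, which is usually negligible next to the application itself.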

Normally, if you want to store the rmss output in a specific file, you would use the -o option. If you want to redirect the stdout output of rmss (for example, to concatenate it to the end of an existing file) then, with the Korn shell, you need to enclose the rmss invocation in parentheses, as follows:

# (rmss -s 24 -f 8 foo) >> output

Interpreting rmss Results

This section gives suggestions on how to interpret performance statistics produced by rmss. Let's start out with some typical results.

The Report Generated for the foo Program example was produced by running rmss on a real-life application program, although the name of the program has been changed to foo for anonymity. The specific command that would have been used to generate the report is:

# rmss -s 16 -f 8 -d 1 -n 1 -o rmss.out foo

Report Generated for the foo Program
Hostname:  widgeon.austin.ibm.com
Real memory size:   16.00 Mb
Time of day:  Thu Jan  8 19:04:04 1990
Command:  foo
   
Simulated memory size initialized to  16.00 Mb.
    
Number of iterations per memory size = 1 warm-up + 1 measured = 2.
   
Memory size  Avg. Pageins  Avg. Response Time    Avg. Pagein Rate
(megabytes)                     (sec.)           (pageins / sec.)
-----------------------------------------------------------------
16.00            115.0           123.9                 0.9
15.00            112.0           125.1                 0.9
14.00            179.0           126.2                 1.4
13.00             81.0           125.7                 0.6
12.00            403.0           132.0                 3.1
11.00            855.0           141.5                 6.0
10.00           1161.0           146.8                 7.9
9.00            1529.0           161.3                 9.5
8.00            2931.0           202.5                 14.5

The report consists of four columns. The leftmost column gives the memory size, while the Avg. Pageins column gives the average number of page ins that occurred when the application was run at that memory size. It is important to note that the Avg. Pageins column refers to all page in operations, including code, data, and file reads, from all programs, that completed while the application ran. The Avg. Response Time column gives the average amount of time it took the application to complete, while the Avg. Pagein Rate column gives the average rate of page ins.

First, concentrate on the Avg. Pagein Rate column. From 16MB to 13MB, the page-in rate is relatively small (< 1.5 page ins/sec). However, from 13MB to 8MB, the page-in rate grows gradually at first, and then rapidly as 8MB is reached. The Avg. Response Time column has a similar shape: relatively flat at first, then increasing gradually, and finally increasing rapidly as the memory size is decreased to 8MB.

Here, the page-in rate actually decreases when the memory size changes from 14MB (1.4 page ins/sec.) to 13MB (0.6 page ins/sec.). This should not be viewed with alarm. In a real-life system it is impossible to expect the results to be perfectly smooth. The important point is that the page-in rate is relatively low at both 14MB and 13MB.

Finally, there are a couple of deductions that we can make from the report. First of all, if the performance of the application is deemed unacceptable at 8MB (as it probably would be), then adding memory would improve performance significantly. Note that the response time rises from approximately 124 seconds at 16MB to 202 seconds at 8MB, an increase of 63%. On the other hand, if the performance is deemed unacceptable at 16MB, adding memory will not improve performance much, because page ins do not slow the program appreciably at 16MB.
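The percentage quoted above can be computed for every row of a report with a short awk filter. This is a sketch only: the function name rmss_slowdown is hypothetical, and it assumes the four-column report layout shown in this section, with the largest memory size on the first data row.

```shell
# Read an rmss report on standard input and print, for each memory size,
# the response time and its change relative to the first data row.
rmss_slowdown() {
    awk 'NF == 4 && $1 ~ /^[0-9.]+$/ && $2 ~ /^[0-9.]+$/ &&
         $3 ~ /^[0-9.]+$/ && $4 ~ /^[0-9.]+$/ {
        if (base == "") base = $3          # first data row is the baseline
        pct = (base > 0) ? ($3 - base) / base * 100 : 0
        printf "%s MB: %.1f sec (%+.0f%%)\n", $1, $3, pct
    }'
}

# Usage:
#   rmss_slowdown < rmss.out
```

Only lines consisting of exactly four numeric fields are treated as data, so the header and summary lines of the report are skipped.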

Examples of Using the -s, -f, -d, -n, and -o Flags

To investigate the performance of a shell script named ccfoo that contains the command cc -O -c foo.c in memory sizes 16, 14, 12, 10, 8 and 6MB; run and measure the command twice at each memory size; and write the report to the file cc.rmss.out, enter:

# rmss -s 16 -f 6 -d 2 -n 2 -o cc.rmss.out ccfoo

Report for cc

The output is:

Hostname:  terran
Real memory size:   32.00 Mb
Time of day:  Mon Apr 20 16:23:03 1992
Command:  ccfoo

Simulated memory size initialized to  16.00 Mb.
   
Number of iterations per memory size = 1 warm-up + 2 measured = 3.
   
Memory size   Avg. Pageins     Avg. Response Time   Avg. Pagein Rate
(megabytes)                         (sec.)          (pageins / sec.)
--------------------------------------------------------------------
16.00               0.0              0.4                    0.0  
14.00               0.0              0.4                    0.0  
12.00               0.0              0.4                    0.0  
10.00               0.0              0.4                    0.0  
8.00                0.5              0.4                    1.2  
6.00                786.0           13.5                   58.4 
  
Simulated final memory size.

This shows that we were too conservative. Clearly the performance degrades badly in a 6MB machine, but it is essentially unchanged for all of the larger sizes. We can redo the measurement with a narrower range of sizes and a smaller delta with:

# rmss -s 11 -f 5 -d 1 -n 2 ccfoo

This gives us a clearer picture of the response-time curve of the compiler for this program:

Hostname:  terran
Real memory size:   32.00 Mb
Time of day:  Mon Apr 20 16:11:38 1992
Command:  ccfoo 
   
Simulated memory size initialized to  11.00 Mb.
   
Number of iterations per memory size = 1 warm-up + 2 measured = 3.

Memory size   Avg. Pageins     Avg. Response Time   Avg. Pagein Rate
(megabytes)                       (sec.)            (pageins / sec.)
--------------------------------------------------------------------
11.00               0.0            0.4                    0.0  
10.00               0.0            0.4                    0.0  
9.00                0.5            0.4                    1.1  
8.00                0.0            0.4                    0.0  
7.00                207.0          3.7                    56.1 
6.00                898.0         16.1                    55.9 
5.00                1038.0        19.5                    53.1

Simulated final memory size.

Report for a 16MB Remote Copy

The following example illustrates a report that was generated (on a client machine) by running rmss on a command that copied a 16MB file from a remote (server) machine via NFS.

Hostname:  xray.austin.ibm.com
Real memory size:   48.00 Mb
Time of day:  Mon Aug 13 18:16:42 1990
Command:  cp /mnt/a16Mfile /dev/null
    
Simulated memory size initialized to  48.00 Mb.
    
Number of iterations per memory size = 1 warm-up + 4 measured = 5.
   
Memory size   Avg. Pageins   Avg. Response Time  Avg. Pagein Rate
(megabytes)                     (sec.)           (pageins / sec.)
-----------------------------------------------------------------
48.00              0.0            2.7                   0.0
40.00              0.0            2.7                   0.0
32.00              0.0            2.7                   0.0
24.00              1520.8        26.9                  56.6
16.00              4104.2        67.5                  60.8
8.00               4106.8        66.9                  61.4

Note that the response time and page-in rate in this report start relatively low, rapidly increase at a memory size of 24MB, and then reach a plateau at 16 and 8MB. This report shows the importance of choosing a wide range of memory sizes when you use rmss. If this user had only looked at memory sizes from 24MB to 8MB, he or she might have missed an opportunity to configure the system with enough memory to accommodate the application without page ins.

Report for find / -ls >/dev/null

The next example is a report that was generated by running rmss on the shell script file findbench.sh. The script contained the command find / -ls > /dev/null, which does an ls of every file in the system. The command that produced the report was:

# rmss -s 48 -d 8 -f 4.5 -n 1 -o find.out findbench.sh

A final memory size of 4.5MB was chosen because it happened to be the smallest memory size that was attainable by using rmss on this machine.

Hostname:  xray.austin.ibm.com
Real memory size:   48.00 Mb
Time of day:  Mon Aug 13 14:38:23 1990
Command:  findbench.sh
    
Simulated memory size initialized to  48.00 Mb.
    
Number of iterations per memory size = 1 warm-up + 1 measured = 2.
    
Memory size    Avg. Pageins    Avg. Response Time   Avg. Pagein Rate
(megabytes)                        (sec.)           (pageins / sec.)
--------------------------------------------------------------------
48.00               373.0            25.5                  14.6
40.00               377.0            27.3                  13.8
32.00               376.0            27.5                  13.7
24.00               370.0            27.6                  13.4
16.00               376.0            27.3                  13.8
8.00                370.0            27.1                  13.6
4.50                1329.0           57.6                  23.1

As in the first example, the average response times and page-in rate values remain fairly stable as the memory size decreases until we approach 4.5MB, where both the response time and page-in rate increase dramatically. However, the page-in rate is relatively high (approximately 14 page ins/sec.) from 48MB through 8MB. The lesson to be learned here is that with some applications, no practical amount of memory would be enough to eliminate page ins, because the programs themselves are naturally I/O-intensive. Common examples of I/O-intensive programs are programs that scan or randomly access many of the pages in very large files.

Hints for Using the -s, -f, -d, -n, and -o Flags

One helpful feature of rmss, when used in this way, is that it can be terminated (by the interrupt key, Ctrl-C by default) without destroying the report that has been written to the output file. In addition to writing the report to the output file, this causes rmss to reset the memory size to the physical memory size of the machine.

You can run rmss in the background, even after you have logged out, by using the nohup command. To do this, precede the rmss command by nohup, and follow the entire command with an & (ampersand):

# nohup rmss -s 48 -f 8 -o foo.out foo &

Important Rules to Consider When Running rmss

No matter which rmss invocation style you are using, it is important to recreate the end-user environment as closely as possible. For instance, are you using the same model of CPU, the same model of disks, and the same network? Will the users have application files mounted from a remote node via NFS or some other distributed file system? This last point is particularly important, as pages from remote files are treated differently by the VMM than pages from local files.

Likewise, it is best to eliminate any system activity that is not related to the desired system configuration or the application you are measuring. For instance, you don't want to have people working on the same machine as rmss unless they are running part of the workload you are measuring.

Note: You cannot run multiple invocations of rmss simultaneously.

When you have completed all runs of rmss, it is best to shut down and reboot the system. This will remove all changes that rmss has made to the system and will restore the VMM memory-load-control parameters to their normal settings.

