General Programming Concepts: Writing and Debugging Programs

Chapter 8. Large Program Support

This chapter provides information about using the large address-space model to accommodate programs requiring data areas that are larger than conventional segmentation can handle.

Note: The discussion in this chapter applies only to 32-bit processes. For information about the default 32-bit address space model and the 64-bit address space model, see the corresponding sections in this book.

The system hardware divides the currently active 32-bit virtual address space into 16 independent segments, each addressed by a separate segment register. The operating system refers to segment 2 (virtual address 0x20000000) as the process private segment. This segment contains most of the per-process information, including user data, the user stack, the kernel stack, and the user block (u-block).
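The segment arithmetic described above can be sketched in C. This is a minimal illustration, not an operating system interface: the names segment_of and segment_base are hypothetical, and the sketch assumes only that the 32-bit address space is split into 16 equal segments, so that each segment covers 256MB and the upper 4 bits of an address give its segment number.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32 bytes divided into 16 segments gives 2^28 bytes (256MB) per
 * segment, so the top 4 bits of an address select the segment. */
#define SEGMENT_SHIFT 28u

/* Segment number (0-15) that contains the given virtual address. */
unsigned segment_of(uint32_t addr)
{
    return (unsigned)(addr >> SEGMENT_SHIFT);
}

/* First virtual address of the given segment (0-15). */
uint32_t segment_base(unsigned seg)
{
    assert(seg < 16u);  /* only 16 segments exist in a 32-bit space */
    return (uint32_t)seg << SEGMENT_SHIFT;
}
```

For example, segment_of(0x20000000) yields 2, the process private segment, and every address up to 0x2fffffff falls in that same segment.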

Because the system places user data and the user stack within a single segment, the system limits the maximum amount of stack and data to slightly less than 256MB. This size is adequate for most applications. The kernel stack and u-block are relatively small and of fixed size. However, certain applications require large initialized or uninitialized data areas in the data section of a program. Other large data areas can be created dynamically with the malloc, brk or sbrk subroutine.

Some programs need larger data areas than allowed by the default address-space model. Programs that need the larger data areas can use the large address-space model to request the necessary amount of data space.

Understanding the Large Address-Space Model

The large address-space model enables large data applications while allowing programs that use a smaller space to follow the smaller model. To allow a program to use the large address-space model, you must set the o_maxdata field in the XCOFF header of the program to indicate the amount of data needed.

In the large address-space model, the data in the program is laid out beginning in segment 3 whenever the value of the o_maxdata field is nonzero, even if that value is smaller than the size of a segment. The program consumes as many segments as needed to hold the amount of data indicated by the o_maxdata field, up to a maximum of 8 segments. The program can therefore have up to 2 gigabytes of data (eight 256MB segments).
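The relationship between the o_maxdata value and the number of segments consumed can be sketched as follows. The function name data_segments is hypothetical, and the sketch assumes that "as many segments as needed" means rounding the value up to whole 256MB segments; the cap of 8 segments comes from the text above.

```c
#include <stdint.h>

#define SEGMENT_SIZE      0x10000000u  /* 256MB */
#define MAX_DATA_SEGMENTS 8u           /* upper bound stated in the text */

/* Number of segments reserved for a given o_maxdata value.
 * Zero means the default model (no segments from segment 3 upward);
 * any nonzero value is rounded up to whole segments and capped at 8. */
unsigned data_segments(uint32_t maxdata)
{
    if (maxdata == 0)
        return 0;

    /* Ceiling division written so that it cannot overflow 32 bits. */
    uint32_t n = (maxdata - 1) / SEGMENT_SIZE + 1;

    return n > MAX_DATA_SEGMENTS ? MAX_DATA_SEGMENTS : (unsigned)n;
}
```

Under these assumptions a value of 1 byte still reserves one full segment, and 0x80000000 reserves all 8.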

Other aspects of the program address space remain unchanged. The user stack, kernel stack, and u-block continue to reside in segment 2. Also, the data resulting from loading a private copy of a shared library is placed in segment 2. Only program data is placed in segment 3 or higher.

As a result of this organizational scheme, the user stack is still limited by the size of segment 2. (However, the user stack can be relocated into a shared memory segment.) In addition, fewer segments are available for mapped files.

While the size of initialized data in a program can be large, there is still a restriction on the size and placement of text. In the executable file associated with a program, the offset of the end of the text section plus the size of the loader section must be less than 256MB. This is required so that this read-only portion of the executable will fit into segment 1 (the TEXT segment). Because of these restrictions, a program cannot have a very large text section.

Enabling the Large Address-Space Model

Use the -bmaxdata flag with the ld command to enable the large address-space model. The large address-space model is used if any nonzero value is given for the maxdata keyword. Use the -bmaxdata option only if the program needs very large data areas.

For example, to link a program that will have the maximum of 8 segments reserved for it, the following command line could be used:

cc sample.o -bmaxdata:0x80000000

The number 0x80000000 is the number of bytes, in hexadecimal format, equal to eight 256MB segments. Although larger numbers can be used, they are ignored because a maximum of 8 segments can be reserved. The value following the -bmaxdata flag can also be specified in decimal or octal format.
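To check that a decimal or octal value denotes the same byte count as a hexadecimal one, a small C helper can be used. This sketch is not a description of how the ld command parses the option: it assumes C-style notation (a 0x prefix for hexadecimal, a leading 0 for octal), and parse_size is a hypothetical name.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a size written in C-style notation: "0x..." is hexadecimal,
 * a leading "0" is octal, anything else is decimal.
 * Returns 0 and stores the value in *out on success, -1 on error. */
int parse_size(const char *text, unsigned long *out)
{
    char *end;

    errno = 0;
    unsigned long value = strtoul(text, &end, 0);  /* base 0: detect prefix */

    /* Reject range errors, empty input, and trailing characters. */
    if (errno != 0 || end == text || *end != '\0')
        return -1;

    *out = value;
    return 0;
}
```

With this helper, the strings 0x80000000, 2147483648, and 020000000000 all parse to the same value, the size of eight 256MB segments.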

Using the following shell commands, you can patch large programs to use large data without relinking them:

/usr/bin/echo '\0200\0\0\0' | dd of=executable_file_name bs=4 count=1 seek=19 conv=notrunc

Note: Use the full name of the echo command (/usr/bin/echo) to avoid invoking any of the shell echo subcommands by mistake.

The echo string generates the binary value 0x80000000. This dd command seeks to the proper offset in the executable file and modifies the o_maxdata field. Do not use the dd command on nonexecutable object files, loadable modules, or shared libraries.
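The arithmetic behind the patch command can be sketched in C. The function names are hypothetical. The sketch relies on two things taken from the command itself: the dd command seeks in units of its block size, and the octal escape \0200 is the byte 0x80, so the four bytes written are the value 0x80000000 stored most significant byte first. The resulting offset of 76 bytes would correspond to a field 56 bytes into an auxiliary header that follows a 20-byte file header; treat that breakdown as an assumption and verify it against the XCOFF Object (a.out) File Format reference.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset reached by dd when it seeks seek_blocks blocks of
 * block_size bytes each; seek=19 with bs=4 lands on byte 76. */
size_t dd_byte_offset(size_t seek_blocks, size_t block_size)
{
    return seek_blocks * block_size;
}

/* Store a 32-bit value most significant byte first, which is the
 * byte sequence the echo string '\0200\0\0\0' produces for 0x80000000. */
void encode_be32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
}
```

To patch in a different value, such as 0x40000000 for four segments, the same encoding gives the bytes 0x40, 0, 0, 0, that is, the echo string '\0100\0\0\0'.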

Executing Programs with Large Data Areas

When a process attempts to execute a program with large data areas, the system recognizes the requirement for large data and attempts to modify the soft limit on data size to accommodate that requirement. However, if the process does not have permission to modify the soft limit, the program ends.

The data size specified in the o_maxdata field may also be too small to accommodate the amount of space required for initialized or uninitialized data. In this case, the process ends, and an error is reported.

The attempt is also unsuccessful if the new soft limit is above the hard limit for the process. For example, the login process usually sets the hard limit to infinity. However, if the calling process has lowered its hard limit using either the ulimit command in the Bourne shell or the limit command in the C shell, the new soft limit may be above the hard limit for the process. In this case, the process is killed during exec processing, and the only message you receive is "killed".
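Because a failure of this kind produces no diagnostic beyond the killed message, it can help to compare the data-size limits of the calling process with the amount of data a program requests before running it. The following sketch uses the standard getrlimit subroutine with the RLIMIT_DATA resource; the function names are hypothetical, and the check only approximates the comparison the system makes during exec processing.

```c
#include <sys/resource.h>

/* Returns 1 if a soft limit of `wanted` bytes would not exceed the
 * hard limit recorded in *rl, 0 otherwise. */
int soft_limit_allowed(const struct rlimit *rl, rlim_t wanted)
{
    return rl->rlim_max == RLIM_INFINITY || wanted <= rl->rlim_max;
}

/* Checks the data-size limits of the calling process.
 * Returns 1 if `wanted` fits under the hard limit, 0 if it does not,
 * and -1 if the limits cannot be read. */
int data_limit_allows(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return -1;

    return soft_limit_allowed(&rl, wanted);
}
```

A launcher could call data_limit_allows with the o_maxdata value of the target program and print a clearer message than the system provides when the result is 0.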

For more information on the ulimit command in the Bourne shell, see "Bourne Shell Special Commands" in AIX Version 4.3 System User's Guide: Operating System and Devices. For more information about the limit command in the C shell, see "Command Substitution in the C Shell" and "Filename Substitution in the C Shell" in AIX Version 4.3 System User's Guide: Operating System and Devices.

After placing the program's initialized and uninitialized data in segments 3 and beyond, the system computes the break value. The break value defines the end of the process's static data and the beginning of its dynamically allocatable data. Using the malloc, brk or sbrk subroutine, the process is free to move the break value toward the end of the segment identified by the o_maxdata field in the a.out header.

For example, if the value specified in the o_maxdata field in the a.out header is 0x80000000, the program data occupies segments 3 through 10, so the break value can reach the end of segment 10, or 0xafffffff. The brk subroutine extends the break across segment boundaries, but not beyond the point specified in the o_maxdata field.
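The highest address the break can reach follows from the segment layout. The sketch below is illustrative only: max_break_address is a hypothetical name, and the code assumes that data starts at the base of segment 3 and that the limit is the end of the last reserved segment, as in the 0x80000000 example. For values that are not a multiple of the segment size, the actual limit may be lower.

```c
#include <stdint.h>

#define SEGMENT_SIZE       0x10000000u  /* 256MB */
#define FIRST_DATA_SEGMENT 3u           /* program data starts in segment 3 */
#define MAX_DATA_SEGMENTS  8u           /* upper bound stated in the text */

/* Last address of the last segment reserved for program data.
 * Returns 0 when maxdata is 0, because the default address-space
 * model is outside the scope of this sketch. */
uint32_t max_break_address(uint32_t maxdata)
{
    if (maxdata == 0)
        return 0;

    /* Round up to whole segments, then apply the 8-segment cap. */
    uint32_t segments = (maxdata - 1) / SEGMENT_SIZE + 1;
    if (segments > MAX_DATA_SEGMENTS)
        segments = MAX_DATA_SEGMENTS;

    /* One byte below the base of the first segment after the data. */
    return (FIRST_DATA_SEGMENT + segments) * SEGMENT_SIZE - 1;
}
```

For 0x80000000 this gives (3 + 8) * 0x10000000 - 1 = 0xafffffff, which matches the value given above.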

The majority of subroutines are unaffected by large data programs. The semantics of the fork subroutine remain unchanged. Large data programs can run other large or small programs, as well as load and unload other modules.

The setrlimit subroutine allows the soft data limit to be set to any value that does not exceed the hard limit. However, because of the inherent limitation of the address space model used by the process, it may not be able to increase its size to the value that is set.

Special Considerations

Programs with large data spaces require a large amount of paging space. For example, if a program with a 2-gigabyte address space tries to access every page in its address space, the system must have 2 gigabytes of paging space. The operating system page-space monitor terminates processes when paging space runs low. Programs with large data spaces are terminated first because they typically consume a large amount of paging space.

Debugging programs with large data is similar to debugging other programs. The dbx command can debug these large programs actively or from a core dump. Avoid full core dumps, because programs with large data areas produce large core dumps, which consume large amounts of file-system space.

Some application programs may be written in such a way that they rely on characteristics of the address space model. Programs in which the large address space is enabled use a different address space model than programs without the large address space enabled. This could cause problems for applications that make assumptions about the address space model they are running in. In general, avoid writing application programs that make assumptions about the address space model.

Related Information

The cc command, dd command, ld command.

XCOFF Object (a.out) File Format.

The brk or sbrk subroutine, exec subroutine, fork subroutine, malloc subroutine, setrlimit subroutine.

Bourne Shell Special Commands in AIX Version 4.3 System User's Guide: Operating System and Devices.

Program Address Space Overview.

