Architecture exploitation
COBOL V6 continues to support the ARCH option (short for architecture) introduced in COBOL V5. This option exploits new hardware instructions and enables you to get the most out of your hardware investment.
The default setting for ARCH is 7, and other supported values are 8, 9, 10, 11, and 12.
For more information on the facilities available at each level, and the mapping of these ARCH levels to specific hardware models, see ARCH in the Enterprise COBOL for z/OS® Programming Guide.
Each successive ARCH level allows the compiler to exploit more facilities in your hardware, which creates the potential for increased performance. To illustrate the benefits from a COBOL application perspective, each ARCH level is examined in greater detail below.
ARCH(7)
Hardware Feature: Long displacement instructions
Why This Matters For COBOL Performance: COBOL programs often work with a large amount of WORKING-STORAGE, LOCAL-STORAGE, and LINKAGE SECTION data. The long displacement instructions reduce the need for initializing Base Locator cells and tying up registers for this purpose.
Instead, the compiler uses the much larger reach of the long displacement instructions to access 256 times as much data as the standard displacement instructions used exclusively in earlier releases of the compiler.
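The factor of 256 follows from the width of the displacement field. As a rough arithmetic illustration in Python (not compiler output), assuming the z/Architecture field widths of a 12-bit unsigned standard displacement and a 20-bit signed long displacement; the helper name `reach_in_bytes` is made up for this sketch:

```python
# Width of the displacement field in each instruction form (assumed values).
STANDARD_DISP_BITS = 12   # unsigned: offsets 0 .. 4095
LONG_DISP_BITS = 20       # signed: offsets -524288 .. 524287


def reach_in_bytes(bits):
    """Number of distinct byte offsets a displacement field of `bits` bits can encode."""
    return 2 ** bits


standard_reach = reach_in_bytes(STANDARD_DISP_BITS)   # 4 KiB from one base register
long_reach = reach_in_bytes(LONG_DISP_BITS)           # 1 MiB from one base register
ratio = long_reach // standard_reach

print(standard_reach, long_reach, ratio)
```

With a larger reach per base register, fewer base addresses need to be loaded and kept in registers to cover the same amount of data.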
Hardware Feature: 64-bit “G” format instructions
Why This Matters For COBOL Performance: When BINARY data (or its synonyms COMP and COMP-4) exceeds nine decimal digits, 32-bit registers can no longer hold the full range of values. In earlier releases of the compiler, this meant converting to another data type (such as packed decimal) or maintaining pairs of 32-bit registers. Both of these solutions add extra overhead and reduce performance.
Even if your BINARY data is declared with nine or fewer decimal digits, intermediate arithmetic results can exceed nine decimal digits and might require a conversion from a 32-bit to a 64-bit representation. The conversion is needed because some 10-digit values and all values with more than 10 digits cannot be encoded in the 32-bit two's complement representation that is used for BINARY data.
For example, multiplying two PIC 9(5) BINARY data items produces an intermediate result of 10 digits, because the intermediate precision of a multiplication is the sum of the operand digit counts (5 + 5).
When using addition, the intermediate precision is one greater than the highest of the operand precision values. This means that adding a PIC 9(9) value to a PIC 9(1) value results in an intermediate precision of 10 digits.
In V6, the 64-bit “G” format instructions operate on values with up to 19 decimal digits, so neither a type conversion nor a pair of registers is needed, and performance is dramatically increased. Above the 64-bit limit, type conversions are still required, but the performance of these cases has also been improved in V6. See BINARY (COMP or COMP-4) for a more in-depth discussion of BINARY data and interaction with TRUNC suboptions.
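The digit-count rules above can be checked with a little arithmetic. The following Python sketch is only an illustration of those rules, not of how the compiler is implemented; the function names are hypothetical:

```python
INT32_MAX = 2**31 - 1   # 2,147,483,647: a 10-digit value
INT64_MAX = 2**63 - 1   # 9,223,372,036,854,775,807: a 19-digit value


def multiply_digits(d1, d2):
    """Intermediate precision of a multiplication: sum of the operand digit counts."""
    return d1 + d2


def add_digits(d1, d2):
    """Intermediate precision of an addition: one more than the larger operand."""
    return max(d1, d2) + 1


def always_fits(digits, limit):
    """True if every unsigned value with `digits` decimal digits is <= limit."""
    return 10**digits - 1 <= limit


print(multiply_digits(5, 5), always_fits(10, INT32_MAX))   # PIC 9(5) * PIC 9(5)
print(add_digits(9, 1), always_fits(10, INT64_MAX))        # PIC 9(9) + PIC 9(1)
```

Under these rules, every value of up to 9 digits fits in 32 bits and every value of up to 18 digits fits in 64 bits; only some 10-digit and some 19-digit values fit in their respective register sizes.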
Hardware Feature: 32-bit immediate form instructions for a range of arithmetic, logical, and compare operations
Why This Matters For COBOL Performance: When your application contains binary data involved in arithmetic or compares, particularly if the data exceeds 5 digits, there is considerable opportunity for the compiler to take advantage of these new ARCH(7) 32-bit immediate form instructions.
V4 could embed at most 16 bits of immediate data in a single instruction. Any larger value had to be stored in the literal pool or constructed in a register by using several other instructions.
Both alternatives are less efficient in time and space.
With ARCH(7), immediate values up to 32 bits can be embedded directly in the instruction text with no need to reference the literal pool or generate other instructions that will increase path length. The result is generally smaller and faster code and less literal pool usage (saving the space and any delay in retrieving the data).
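To get a feel for which literal values benefit, the following Python sketch classifies a few constants by the smallest signed immediate field that can hold them. It is a simplified model: it considers only signed immediates, and the names `fits_signed` and `placement` are hypothetical, not compiler terminology.

```python
def fits_signed(value, bits):
    """True if `value` is representable as a signed two's complement field of `bits` bits."""
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1


def placement(value):
    """Smallest signed immediate field (16 or 32 bits) that can hold `value`."""
    if fits_signed(value, 16):
        return "16-bit immediate"
    if fits_signed(value, 32):
        return "32-bit immediate"
    return "wider than 32 bits"


for literal in (100, 32767, 32768, 99999, 2000000000, 3000000000):
    print(literal, placement(literal))
```

In this model, a literal such as 99999 no longer fits in 16 bits, which is why binary data with five or more digits is where the 32-bit immediate forms start to matter.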
ARCH(8)
Hardware Feature: Decimal Floating Point (DFP)
Why This Matters For COBOL Performance: Decimal Floating Point is a natural fit for the packed decimal (COMP-3) and external decimal (DISPLAY) types that are ubiquitous in most COBOL applications. Using ARCH(8) and some OPTIMIZE setting above 0 enables the compiler to convert larger multiply and divide operations on any type of decimal operands to DFP, in order to avoid an expensive callout to a library routine.
This is possible because the hardware precision limit for DFP is much greater than the limit allowed by the packed decimal multiply and divide instructions.
Because of the overhead of converting to and from DFP, DFP is not a good fit for every decimal operation that could otherwise be performed without a library call. However, the ARCH(10) option described later in this section enables much greater use of DFP to improve performance.
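As a loose illustration of why precision matters for large multiplications, the following Python sketch uses the standard `decimal` module as a stand-in for decimal floating point arithmetic. It assumes the 16-digit and 34-digit precisions of the 64-bit and 128-bit decimal formats; it says nothing about which format the compiler actually selects, and `exact_multiply` is a name invented for this example.

```python
from decimal import Context, Decimal, Inexact


def exact_multiply(a, b, digits):
    """Multiply two decimal strings at `digits` precision; raise Inexact if rounding occurs."""
    ctx = Context(prec=digits, traps=[Inexact])
    return ctx.multiply(Decimal(a), Decimal(b))


operand = "9" * 17                                # a 17-digit value
product = exact_multiply(operand, operand, 34)    # the 34-digit product is exact

try:
    exact_multiply(operand, operand, 16)          # 16 digits cannot hold it exactly
    fits_in_16_digits = True
except Inexact:
    fits_in_16_digits = False

print(product, fits_in_16_digits)
```

The product of two 17-digit operands needs up to 34 digits, which the wider format holds exactly while the narrower one would have to round.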
Hardware Feature: Larger Move Immediate Instructions
Why This Matters For COBOL Performance: MOVEs of literal data and VALUE clauses are common in many COBOL applications. Lower ARCH settings and all earlier compiler releases could move only a single byte of literal data in a single instruction, for example, by using the MVI (Move Immediate) instruction.
Any larger literal data required storing the constant value in the literal pool and using a memory move instruction to initialize the data item. This was less efficient in time and space than being able to embed larger immediate values directly in the instruction text.
With ARCH(8), several new move immediate instruction variants are available to move up to 16 bytes of sign-extended data by using one or two of these new instructions.
Also, these instructions are exploited regardless of the data type, so binary, internal/external decimal, alphanumeric, and even floating point literals take advantage of these more efficient instructions.
ARCH(9)
Hardware Feature: Distinct Operands Instructions
Why This Matters For COBOL Performance: Updating a data item or index to a new value while retaining the original value occurs frequently in many contexts in a typical COBOL application. One instance is table processing, where a base value for the table is repeatedly updated to access the various elements within the table. Under lower ARCH settings and in all earlier compiler releases, almost all available instructions that took two operands to produce a result also overwrote the first input operand with the result.
For example, a conceptual operation such as:
C = A + B
implemented with a pre-ARCH(9) instruction variant would conceptually have to be performed as:
A = A + B
C = A
This means that if the original value of A is required in another context, it must first be saved:
T = A
T = T + B
C = T
With ARCH(9), the distinct-operands facility is exploited to take advantage of the new variants of many arithmetic, shift, and logical instructions that will not destructively overwrite the first operand.
So the operation can be implemented in a more straightforward way:
C = A + B
That removes the need for extra instructions to save the original value, because it is naturally preserved with the distinct-operand instruction form. This feature reduces path length, leading to better performance.
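The difference in path length can be modeled with a toy register file. This Python sketch is a conceptual model only: the "registers" are dictionary entries, the step counts stand in for instruction counts, and all function names are hypothetical.

```python
def add_two_operand(regs, r1, r2):
    """Destructive form: r1 = r1 + r2, so the first operand is overwritten."""
    regs[r1] = regs[r1] + regs[r2]


def add_three_operand(regs, r1, r2, r3):
    """Distinct-operand form: r1 = r2 + r3, so both inputs are preserved."""
    regs[r1] = regs[r2] + regs[r3]


def sum_keeping_a_two_operand(a, b):
    regs = {"A": a, "B": b, "C": 0}
    regs["C"] = regs["A"]             # step 1: copy A so the original survives
    add_two_operand(regs, "C", "B")   # step 2: C = C + B
    return regs, 2


def sum_keeping_a_three_operand(a, b):
    regs = {"A": a, "B": b, "C": 0}
    add_three_operand(regs, "C", "A", "B")   # single step: C = A + B
    return regs, 1


print(sum_keeping_a_two_operand(7, 5))
print(sum_keeping_a_three_operand(7, 5))
```

Both variants leave A intact and produce the same sum, but the distinct-operand variant does so in one step instead of two.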
ARCH(10)
Hardware Feature: Improved Decimal Floating Point (DFP) Performance
Why This Matters For COBOL Performance: Using ARCH(8) and an OPTIMIZE setting greater than 0 already enables the compiler to make use of DFP to improve performance of packed and external decimal arithmetic in some particular instances. ARCH(10) goes further by adding efficient instructions to convert between DISPLAY (in particular unsigned and trailing signed overpunch zoned decimal) types and DFP.
These ARCH(10) instructions lower the overhead for using DFP for arithmetic on zoned decimal data items and enable the compiler to make much greater use of DFP to improve performance.
Instead of converting zoned decimal data items to packed decimal format to perform arithmetic, the compiler will convert zoned decimal data directly to DFP format and then back again to zoned decimal format after the computations are complete. This generally results in better performance, as the DFP instructions operate on in-register (compared to in-memory) data that is more efficiently handled by the hardware in many cases.
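For readers less familiar with the two storage formats involved, the following Python sketch encodes the same number as zoned decimal (trailing overpunch sign, EBCDIC zones) and as packed decimal. It illustrates the data layouts only and does not model the conversion instructions; the function names are hypothetical.

```python
def to_zoned(value, digits):
    """Signed zoned decimal: one byte per digit, sign carried in the zone of the last byte."""
    text = str(abs(value)).zfill(digits)
    if len(text) > digits:
        raise ValueError("value does not fit in the requested number of digits")
    out = bytearray(0xF0 | int(ch) for ch in text)            # zone F + digit
    out[-1] = (0xD0 if value < 0 else 0xC0) | int(text[-1])   # zone C = plus, D = minus
    return bytes(out)


def to_packed(value, digits):
    """Packed decimal: two digits per byte, sign in the final half-byte."""
    text = str(abs(value)).zfill(digits)
    if len(text) > digits:
        raise ValueError("value does not fit in the requested number of digits")
    nibbles = [int(ch) for ch in text] + [0xD if value < 0 else 0xC]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)                                  # pad to whole bytes
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


print(to_zoned(-123, 5).hex(), to_packed(-123, 5).hex())
```

A five-digit zoned item occupies five bytes, while the packed form of the same value occupies three.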
ARCH(11)
Hardware Feature: Improved conversion between packed decimal and Decimal Floating Point (DFP)
Why This Matters For COBOL Performance: At ARCH(10), the compiler is able to convert more efficiently between DISPLAY types and DFP, enabling the compiler to make significant use of DFP to improve performance of packed and external decimal arithmetic. While instructions to convert between packed decimal and DFP existed at ARCH(10), they were inefficient, and the benefit of performing packed arithmetic in DFP was outweighed by the cost of converting packed decimal values to and from DFP.
With ARCH(11), there are new instructions that convert between packed decimal and DFP more efficiently. They lower the overhead for using DFP arithmetic on packed decimal data items, enabling the compiler to make further use of DFP than at ARCH(10).
Instead of performing arithmetic on packed decimal items, the compiler will convert packed decimal data to DFP format and then back again to packed decimal format after the computations are complete. This generally results in better performance, as the DFP instructions operate on in-register (compared to in-memory) data that is more efficiently handled by the hardware in many cases. Due to the more efficient conversion instructions, the benefit of performing arithmetic in DFP outweighs the added cost of converting between packed decimal and DFP instead of performing packed arithmetic directly.
Hardware Feature: Vector Registers
Why This Matters For COBOL Performance: The new vector facility is able to operate on up to 16 byte-sized elements in parallel. With ARCH(11), COBOL V6 is able to take advantage of the new vector instructions to accelerate some forms of INSPECT statements by working with 16 bytes at a time. This can be much faster than operating on 1 byte at a time.
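The effect of handling 16 bytes per step can be sketched by counting loop iterations for a tallying operation similar in spirit to INSPECT ... TALLYING. This Python model is conceptual only: `bytes.count` on a 16-byte slice stands in for a single vector comparison, and the function names are hypothetical.

```python
def count_byte_scalar(data, target):
    """Return (matches, iterations) when one byte is examined per iteration."""
    matches = 0
    iterations = 0
    for b in data:
        iterations += 1
        if b == target:
            matches += 1
    return matches, iterations


def count_byte_chunked(data, target, width=16):
    """Return (matches, iterations) when `width` bytes are examined per iteration."""
    matches = 0
    iterations = 0
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        iterations += 1
        matches += chunk.count(target)   # stand-in for one vector compare on the chunk
    return matches, iterations


data = b"ABRACADABRA" * 10               # 110 bytes
print(count_byte_scalar(data, ord("A")))
print(count_byte_chunked(data, ord("A")))
```

Both functions find the same number of matches, but the chunked version needs 7 iterations for 110 bytes instead of 110.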

ARCH(12)
Hardware Feature: Vector packed decimal instructions
Why This Matters For COBOL Performance: In ARCH(11) and below, packed decimal arithmetic can only be performed using in-memory data, or by converting the data to Decimal Floating Point (DFP). In ARCH(12), the new vector packed decimal facility enables the compiler to perform native packed decimal arithmetic on data in registers. This provides the performance advantages of using registers instead of memory, while eliminating the overhead of converting data back and forth between packed decimal and DFP.
