5.3.1 Defaults Used in XFD Files

There are several elements of COBOL that require special handling when data dictionaries are built. These include multiple record definitions, REDEFINES, FILLER, and OCCURS. This section describes how ACUCOBOL-GT handles each of these situations.

Note that in many instances you can override the default behavior described below by placing special comment lines in the FDs of your COBOL code. These comments are called directives, and are described in section 5.3.2. For example, the WHEN directive allows you to use multiple definitions for a single set of data by specifying when each definition should be used.

Databases generally do not support the notion of multiple definitions for the same column. (In a similar sense, when you are editing a data file, it makes sense to see and change only one view of the data, rather than multiple views.) As the following paragraphs explain, whenever a COBOL program gives more than one definition for the same data, the compiler makes a choice about which definition to use in the data dictionary. Then it disregards the rest.

KEY IS phrase

Fields named in KEY IS phrases of SELECT statements are included as fields in the XFD. Other fields that occupy the same areas as the key fields (by either explicit or implicit redefinition) are not included by name, but are mapped to the key field column names by the data dictionary.

Remember, if the field named in the KEY IS phrase is a group item, it will not become a named field in the XFD unless a USE GROUP directive is used (see "Understanding how the XFD file is formed," above).

REDEFINES clause

Fields contained in a redefining item occupy the same positions as the fields being redefined. The compiler needs to select only one of the field definitions to use. The default rule that it follows is to use the fields in the item being redefined as column names; fields that appear subordinate to a REDEFINES clause are mapped to column names by the data dictionary.

Multiple record definitions

This same rule extends to multiple record definitions. In COBOL, multiple record definitions are essentially redefinitions of the entire record area. This leads to the same complication that is encountered with REDEFINES: multiple definitions for the same data. So the compiler needs to select one definition to use.

Because the multiple record types can be different sizes, the compiler needs to use the largest one, so that it can cover all of the fields adequately. Thus, the compiler's rule is to use the fields in the largest record defined for the file. If more than one record is of the "largest" size, the compiler uses the first one.
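The selection rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the compiler's actual code; the RecordDef type, the record names, and the sizes are hypothetical.

```python
from typing import NamedTuple, Sequence


class RecordDef(NamedTuple):
    name: str  # hypothetical 01-level record name
    size: int  # record length in bytes


def pick_dictionary_record(records: Sequence[RecordDef]) -> RecordDef:
    """Return the largest record; on a tie, the first one listed wins."""
    best = records[0]
    for rec in records[1:]:
        # Strict '>' keeps the earlier record when sizes are equal.
        if rec.size > best.size:
            best = rec
    return best


records = [
    RecordDef("header-rec", 80),
    RecordDef("detail-rec", 120),
    RecordDef("trailer-rec", 120),
]
chosen = pick_dictionary_record(records)
print(chosen.name)
```

Here detail-rec and trailer-rec are both the "largest" size, so the sketch selects detail-rec because it appears first.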

Group items

Note that group items are, by default, never included in a data dictionary for the same reason that REDEFINES are excluded: they result in multiple names for the same data items. You can, however, choose to combine grouped fields into one data item by specifying the USE GROUP directive, described in section 5.3.3.8.

FILLER data items

In a COBOL FD, FILLER data items are essentially placeholders. FILLER items are not uniquely named and thus cannot be uniquely referenced. For this reason, they are not placed into the Acucorp data dictionary. The dictionary maintains the correct mapping of the other fields, and no COBOL record positional information is lost.

Sometimes you need to include a FILLER data item, such as when it occurs as part of a key. In such a case, you could include it under a USE GROUP directive or give it a name of its own with the NAME directive, described in section 5.3.3.5.

OCCURS clauses

An OCCURS clause always requires special handling, because the Acu4GL runtime system must assign a unique name to each database column. The runtime accomplishes this by appending sequential index numbers to the name of each field covered by the OCCURS clause.

For example, if the following were part of a file's description:


03  employee-table occurs 20 times.
    05  employee-number pic 9(3).

then these column names would be created in the database table:

employee_number_1
employee_number_2
. . .
employee_number_10
employee_number_11
. . .
employee_number_20

Note that the hyphens in the COBOL code are translated to underscores in database field names, and the index number is preceded by an extra underscore.
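The naming pattern described above can be reproduced with a short Python sketch. It illustrates only the convention stated in the text (hyphens become underscores, and an underscore plus a 1-based index is appended); the function name occurs_column_names is hypothetical and is not part of any Acucorp tool.

```python
from typing import List


def occurs_column_names(field_name: str, occurs: int) -> List[str]:
    """Build one column name per occurrence of a COBOL field."""
    # Hyphens in the COBOL name are translated to underscores.
    base = field_name.replace("-", "_")
    # The index number is preceded by an extra underscore and starts at 1.
    return [f"{base}_{i}" for i in range(1, occurs + 1)]


columns = occurs_column_names("employee-number", 20)
print(columns[0], columns[-1])
```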

The alfred record editor shows only the names of the fields, without subscripts or indexes.

Summary of dictionary fields

Fields defined with an OCCURS clause are assigned unique sequential names. Fields without names are disregarded.

When multiple fields occupy the same area, the compiler chooses only one of them unless you supply a WHEN directive to distinguish them. It makes the choice as follows:

The compiler preserves fields mentioned in KEY IS phrases.

It discards group items unless USE GROUP is specified.

It discards REDEFINES.

It uses the largest record if there are multiple record definitions.

Identical field names

In COBOL you distinguish fields with identical names by qualification. For example, there are two fields named RATE in the following code, but they can be qualified by their group items. Thus, you would reference RATE OF TERMS-CODE and RATE OF AR-CODE in your program:


01  record-area.
    03  terms-code.
        05  rate pic s9v999.
        05  days pic 9(3).
        05  descript pic x(15).
    03  ar-code.
        05  rate pic s9v999.
        05  days pic 9(3).
        05  descript pic x(15).

However, database systems consider duplicate names an error. Thus, if more than one field in a particular file has the same name, the data dictionary will not be generated for that file.

The solution to this situation is to add a NAME directive (see section 5.3.3.5) that associates an alternate name with one or both of the conflicting fields.

Long field names

The compiler determines whether field names are unique within the first 18 characters. This is because some RDBMSs truncate longer field names to 18 characters. (For fields in an OCCURS clause, described above, the truncation applies to the original name, not to the appended index numbers.) If the field names are not unique within 18 characters, the XFD file is still generated, but the compiler issues a warning message.

You can, instead of allowing default truncation, use the NAME directive to give a shorter alias to a field with a long name. Note that within the COBOL application you will continue to use the original name. The NAME directive affects only the data dictionary.
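The 18-character uniqueness check can be illustrated with a small Python sketch. It is an assumption-laden illustration rather than the compiler's logic: the function name prefix_collisions and the sample field names are hypothetical, and the names are assumed to be written in a consistent case already.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

SIGNIFICANT_CHARS = 18  # width stated in the text above


def prefix_collisions(
    field_names: Iterable[str], width: int = SIGNIFICANT_CHARS
) -> Dict[str, List[str]]:
    """Group field names by their first `width` characters and
    return only the groups that contain more than one name."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in field_names:
        groups[name[:width]].append(name)
    return {prefix: names for prefix, names in groups.items() if len(names) > 1}


fields = [
    "customer-billing-address-line-1",
    "customer-billing-address-line-2",
    "customer-id",
]
clashes = prefix_collisions(fields)
for prefix, names in clashes.items():
    print("not unique within 18 characters:", prefix, names)
```

Any group reported by a check like this is a candidate for a NAME directive that supplies a shorter, distinct alias.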

Naming the XFD

The compiler needs to give a name to each XFD file (data dictionary) that is built. It attempts to build the name from your COBOL code, although there are some instances where the name in the code is nonspecific, and you must provide a name.

Each XFD name is built from a starting name that is derived (if possible) from the SELECT statement in your COBOL code. The following paragraphs explain how that occurs.

ASSIGN name is a variable

If the SELECT for the file has a variable ASSIGN name (ASSIGN TO filename), then you must specify a starting name for the XFD file via a FILE directive in your code. This process is described in section 5.3.3.4.

ASSIGN name is a constant

If the SELECT for the file has a constant ASSIGN name (such as ASSIGN TO "COMPFILE"), then that name is used as the starting name for the XFD name.

ASSIGN name is generic

If the ASSIGN phrase refers to a generic device (such as ASSIGN TO "DISK"), then the compiler uses the SELECT name as the starting name.
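The three cases above can be summarized in a short Python sketch. It is only an illustration: the function starting_name and its parameters are hypothetical, the set of generic devices contains only "DISK" because that is the only one named in this section, and the sketch assumes that a FILE directive, when present, takes precedence.

```python
from typing import Optional

# Assumption: only "DISK" is listed, because it is the only generic
# device named in the text; a real compiler may recognize others.
GENERIC_DEVICES = {"DISK"}


def starting_name(
    select_name: str,
    assign_constant: Optional[str],
    file_directive: Optional[str] = None,
) -> str:
    """Pick the starting name for an XFD.

    assign_constant is None when the ASSIGN name is a variable.
    """
    if file_directive is not None:
        # Assumption: an explicit FILE directive wins when supplied.
        return file_directive
    if assign_constant is None:
        # Variable ASSIGN name: the programmer must supply the name.
        raise ValueError("variable ASSIGN name requires a FILE directive")
    if assign_constant.upper() in GENERIC_DEVICES:
        # Generic device: fall back to the SELECT name.
        return select_name
    # Constant ASSIGN name: use it as the starting name.
    return assign_constant


print(starting_name("TESTFILE", "DISK"))
print(starting_name("CUSTFILE", "COMPFILE"))
print(starting_name("ANYFILE", None, file_directive="orders"))
```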

Forming the final XFD name

From the starting name, the final name is formed as follows:

1. The compiler removes any extensions from the starting name.

2. It constructs a "universal" base name by stripping out directory information that fits any of the formats used by the operating systems that run ACUCOBOL-GT.

3. It converts the base name to lower case.

4. It appends the letters ".xfd" to the base name.
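The four steps above can be approximated in Python as follows. This is a sketch under stated assumptions, not the compiler's implementation: the function name xfd_file_name is hypothetical, the directory separators recognized ("/", "\", ":" and "]") are a guess at the formats meant by the text, and the sketch isolates the last path component before removing extensions so that a dot in a directory name is not mistaken for an extension.

```python
import re


def xfd_file_name(starting_name: str) -> str:
    """Derive an XFD file name from a starting name."""
    name = starting_name.strip()
    # Step 2: keep only what follows the last directory separator.
    # Assumption: '/', '\\', ':' and ']' cover the path formats intended.
    base = re.split(r"[/\\:\]]", name)[-1]
    # Step 1: remove any extensions (everything from the first dot on).
    base = base.split(".", 1)[0]
    # Step 3: convert the base name to lower case.
    base = base.lower()
    # Step 4: append ".xfd".
    return base + ".xfd"


print(xfd_file_name("TESTFILE"))                  # testfile.xfd
print(xfd_file_name("-D SYS$LIB:HELP"))           # help.xfd
print(xfd_file_name(r"C:\data\AR\Invoices.DAT"))  # invoices.xfd
```

The last path is a made-up example; the first two inputs correspond to starting names used in the examples of this section.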

Examples of XFD names

COBOL code:                            File name:
ASSIGN TO "usr/ar/customers.dat"       customers.xfd
SELECT TESTFILE, ASSIGN TO DISK        testfile.xfd
ASSIGN TO "-D SYS$LIB:HELP"            help.xfd
ASSIGN TO FILENAME                     (you specify)