D.4.4 Data Type Mapping

Overview of ODBC data types

ODBC defines certain generic data types, which every data source driver maps to its own internal types. Each driver needs to be queried about which types it supports, since different drivers support different types.

The types that Acu4GL for ODBC uses, if they exist in the driver, are:

SQL_CHAR

SQL_VARCHAR

SQL_DECIMAL

SQL_NUMERIC

SQL_SMALLINT

SQL_INTEGER

SQL_REAL

SQL_FLOAT

SQL_DOUBLE

SQL_TINYINT

SQL_BIGINT

SQL_BINARY

SQL_VARBINARY

SQL_LONGVARBINARY

SQL_DATE

SQL_TIMESTAMP

For example, Informix maps these types like this:

   SQL_CHAR           CHAR
   SQL_VARCHAR        VARCHAR
   SQL_DECIMAL        DECIMAL or MONEY
   SQL_NUMERIC        no such type
   SQL_SMALLINT       SMALLINT
   SQL_INTEGER        INTEGER or SERIAL
   SQL_REAL           REAL
   SQL_FLOAT          no such type
   SQL_DOUBLE         FLOAT
   SQL_TINYINT        no such type
   SQL_BIGINT         no such type
   SQL_BINARY         no such type
   SQL_VARBINARY      no such type
   SQL_LONGVARBINARY  BYTE
   SQL_DATE           DATE
   SQL_TIMESTAMP      DATETIME YEAR TO
                      FRACTION(5)

Informix also defines a type that ODBC calls SQL_LONGVARCHAR, and Informix calls TEXT, which the Acu4GL for ODBC product doesn't use.

Notice that Informix has two types that match the ODBC SQL_DECIMAL type, and two that match the SQL_INTEGER type. The Acu4GL product will usually use the first type that it finds that matches an ODBC type, unless there are restrictions on that type. For example, the SERIAL type is limited in that a table can have only one such column, while a typical table may have more than one column of integer data.

So the Acu4GL for ODBC product will use the Informix INTEGER type, instead of the SERIAL type for integer data.

For another example, Access maps these types like this:

   SQL_CHAR               CHAR
   SQL_VARCHAR            TEXT
   SQL_DECIMAL            no such type
   SQL_NUMERIC            CURRENCY
   SQL_SMALLINT           SHORT
   SQL_INTEGER            LONG or COUNTER
   SQL_REAL               SINGLE
   SQL_FLOAT              no such type
   SQL_DOUBLE             DOUBLE
   SQL_TINYINT            BYTE
   SQL_BIGINT             no such type
   SQL_BINARY             BINARY
   SQL_VARBINARY          VARBINARY
   SQL_LONGVARBINARY      LONGBINARY
   SQL_DATE               no such type
   SQL_TIMESTAMP          DATETIME

Access also defines types, SQL_BIT = BIT and SQL_LONGVARCHAR = LONGTEXT which the Acu4GL for ODBC product doesn't use.

Mapping COBOL data types to ODBC data types

When the Acu4GL for ODBC product creates a table, it uses what it determines to be the best match of a data type for any particular column.

This means that the database column will be able to hold any data that the COBOL data type can hold, and is as close as possible to the type of data that the COBOL program is using. This determination is based in part on what data types the data source has available. Obviously, if a data source doesn't support some data type, the Acu4GL for ODBC product can't use it with that data source. The actual algorithm used is rather complicated, but the general rules are as follows:

User preferences take precedence. This means that the XFD directives specified are checked first. Therefore, when data should be of type DATE or BINARY, a DATE or BINARY type is located and used, if the data source supports it.

If the COBOL data type is usage float or usage double, a data source type of FLOAT, REAL, or DOUBLE is used, depending on what is available. If none of these is available, the Acu4GL product abides by the next rule.

If the COBOL data type is numeric, a numeric type is used in the data source. The numeric type chosen depends on how large the COBOL data type is, and how many digits to the right of the decimal point (if any) there are.

For example, if the COBOL data type is PIC 99, the data source types checked for are TINYINT, SMALLINT, INTEGER, BIGINT, DECIMAL, NUMERIC, CHAR. The first of these that exists is the type that will be used for that column.

For another example, if the COBOL data type is PIC 9(6)v99, the data source types checked for are DECIMAL, NUMERIC, DOUBLE, FLOAT, REAL, CHAR. Again, the first of these types that exists is the type that will be used for that column.

Anything else will use CHAR.

Occasionally, you may encounter a data source type that only supports type CHAR (which is the only data type that is guaranteed to exist, according to the ODBC specification). Under these data sources, all the COBOL data types will be mapped to CHAR types.

Mapping ODBC data types to COBOL data types

Sometimes developers are in a situation where they need to create a COBOL File Description based on an existing data source table. The most important thing to understand in this situation is that there is almost nothing that you can do wrong! When the Acu4GL product opens a data source table, the only thing it checks is that the column names match the COBOL data names.

When the Acu4GL product reads data from the data source, it essentially does a COBOL-style MOVE from the native data type to the COBOL data type, whatever it is. And since most types have a CHAR representation (in other words, you can actually display most data types, using a standard ODBC-capable tool), using PIC X(nn) for each column will work perfectly well.

A better general rule is to use a COBOL type that closely matches the data source data type, but don't worry about getting an exact fit. So you can use PIC 9(9) whenever the data source has an INTEGER type.

If you have more information about the data source type, you might be able to use a different COBOL representation. For example, if you know that a particular column in an ODBC data source has values only in the range 0-999, you could use PIC 999 for your COBOL data. The COMP-type you use is really determined by your own preferences, and should have little bearing on the COBOL data type you choose.

If you want to somehow choose your COBOL data types so that there is a best fit, you can use the following mapping:

   SQL_CHAR                   PIC X(nn)      nn =size of item
   SQL_VARCHAR                PIC X(nn)      nn = maximum size of item
   SQL_DECIMAL                PIC 9(n)v9(m)
   SQL_NUMERIC                PIC 9(n)v9(m)
   SQL_SMALLINT               PIC 9(5) COMP-5
   SQL_INTEGER                PIC 9(9) COMP-5
   SQL_REAL                   USAGE FLOAT
   SQL_FLOAT                  USAGE FLOAT
   SQL_DOUBLE                 USAGE DOUBLE
   SQL_TINYINT                PIC 9(3) COMP-5
   SQL_BIGINT                 PIC 9(9) COMP-5
   SQL_BINARY                 PIC X(nn)
   SQL_VARBINARY              PIC X(nn)
   SQL_LONGVARBINARY          PIC X(nn)
   SQL_DATE                   PIC 9(6) or PIC 9(8)
   SQL_TIMESTAMP              USAGE DISPLAY

Note: The BINARY data types are usually of a form that COBOL can't understand anyway. You will usually just read these columns, and rewrite them unchanged. If you have more information about the data in the columns, you might be able to do something else, but this requires more knowledge about the columns.

The DECIMAL, NUMERIC, DATE, and TIMESTAMP types usually have special representations in a data source, which really doesn't match any COBOL data type exactly. When the Acu4GL product binds the data (a technical term), it asks the data source to return it in character form, so the most efficient COBOL data type would probably be USAGE DISPLAY.