Datatypes -- An overview of datatype handling in MDB2
Introduction
All DBMS provide multiple choice of data types for the information that can be
stored in their database table fields. However, the set of data types made
available varies from DBMS to DBMS.
To simplify the interface with the DBMS supported by MDB2,
it was defined a base set of data types that applications may access
independently of the underlying DBMS.
The MDB2 applications programming interface takes care
of mapping data types when managing database options. It is also able to
convert that is sent to and received from the underlying DBMS using the
respective driver.
The following data type examples should be used with MDB2's createTable()
method. The example array at the end of the data types section may be used
with createTable() to create a portable table on the DBMS
of choice (please refer to the main MDB2 documentation to find out what DBMS
back ends are properly supported). It should also be noted that the following
examples do not cover the creation and maintenance of indices, this chapter
is only concerned with data types and the proper usage thereof.
"Global" table type modifiers
Within the MDB2 API there are a few modifiers that have been designed to aid
in optimal table design. These are:
The notnull modifiers
The length modifiers
The default modifiers
unsigned modifiers for some field definitions,
although not all DBMS's support this modifier for integer field types.
fixed length modifiers for some field definitions.
Building upon the above, we can say that the modifiers alter the field definition
to create more specific field types for specific usage scenarios.
The notnull modifier will be used in the following way
to set the default DBMS NOT NULL Flag on the field to true or false, depending
on the DBMS's definition of the field value:
In PostgreSQL the "NOT NULL" definition will be set to "NOT NULL", whilst in
MySQL (for example) the "NULL" option will be set to "NO".
In order to define a "NOT NULL" field type, we simply add an extra parameter
to our definition array (See the examples in the following section)
Using the above example, we can also explore the default
field operator. Default is set in the same way as the notnull
operator to set a default value for the field. This value may be set in any
character set that the DBMS supports for text fields, and any other valid
data for the field's data type. In the above example, we have specified a
valid time for the "Time" data type, '12:34:05'.
Remember that when setting default dates and times, as well as datetimes,
you should research and stay within the epoch of your chosen DBMS, otherwise
you will encounter difficult to diagnose errors!
The above example will create a character varying field of length 12 characters
in the database table. If the length definition is left out, MDB2 will create
a length of the maximum allowable length for the data type specified, which
may create a problem with some field types and indexing. Best practice is to
define lengths for all or most of your fields.
Text data type
The text data type is available with two options for the length: one that is
explicitly length limited and another of undefined length that should be as
large as the database allows.
The length limited option is the most recommended for efficiency reasons.
The undefined length option allows very large fields but may prevent the use
of indexes, nullability and may not allow sorting on fields of its type.
The fields of this type should be able to handle 8 bit characters. Drivers
take care of DBMS specific escaping of characters of special meaning with
the values of the strings to be converted to this type.
By default MDB2 will use variable length character types. If fixed length
types should be used can be controlled via the fixed modifier.
The boolean data type represents only two values that can be either 1 or 0.
Do not assume that these data types are stored as integers because some DBMS
drivers may implement this type with single character text fields for a
matter of efficiency. Ternary logic is possible by using null as the third
possible value that may be assigned to fields of this type.
The integer data type may store integer values as large as each DBMS may handle.
Fields of this type may be created optionally as unsigned integers but not
all DBMS support it. Therefore, such option may be ignored. Truly portable
applications should not rely on the availability of this option.
The decimal data type may store decimal numbers accurately with a fixed number
of decimal places. This data type is suitable for representing accurate values
like currency amounts.
Some DBMS drivers may emulate the decimal data type using integers.
Such drivers need to know in advance how many decimal places that should be
used to perform eventual scale conversion when storing and retrieving values
from a database. Despite this, applications may use arithmetic expressions
and functions with the values stored on decimal type fields as long as any
constant values that are used in the expressions are also converted with the
respective MDB2 conversion functions.
The number of places that are used to the left and the right of the decimal
point is pre-determined and fixed for all decimal values stored in the same
database. By default, MDB2 uses 2 places to the right of the decimal point,
but this may be changed when setting the database connection. The number of
places available to the right of the decimal point depend on the DBMS.
It is not recommended to change the number places used to represent decimal
values in database after it is installed. MDB2 does not keep track of changes
in the number of decimal places. The number of decimal places can be set
using the setOption() method.
The float data type may store floating point decimal numbers. This data type
is suitable for representing numbers within a large scale range that do not
require high accuracy. The scale and the precision limits of the values that
may be stored in a database depends on the DBMS that it is used.
The date data type may represent dates with year, month and day.
DBMS independent representation of dates is accomplished by using text
strings formatted according to the IS0-8601 standard.
The format defined by the ISO-8601 standard for dates is
YYYY-MM-DD where YYYY is the number of
the year (Gregorian calendar), MM is the number of the
month from 01 to 12 and DD is the number of the day from
01 to 31. Months or days numbered below 10 should be padded on the left with 0.
Some DBMS have native support for date formats, but for others the DBMS
driver may have to represent them as integers or text values. In any case,
it is always possible to make comparisons between date values as well sort
query results by fields of this type.
The time data type may represent the time of a given moment of the day.
DBMS independent representation of the time of the day is also accomplished
by using text strings formatted according to the ISO-8601
standard.
The format defined by the ISO-8601 standard for the time
of the day is HH:MI:SS where HH is the
number of hour the day from 00 to 23 and MI and
SS are respectively the number of the minute and of the
second from 00 to 59. Hours, minutes and seconds numbered below 10 should be
padded on the left with 0.
Some DBMS have native support for time of the day formats, but for others
the DBMS driver may have to represent them as integers or text values.
In any case, it is always possible to make comparisons between time values
as well sort query results by fields of this type.
The timestamp data type is a mere combination of the date and the time of
the day data types. The representation of values of the time stamp type is
accomplished by joining the date and time string values in a single string
joined by a space. Therefore, the format template is YYYY-MM-DD HH:MI:SS.
The represented values obey the same rules and ranges described for the
date and time data types
The large object data types are meant to store data of undefined length that
may be too large to store in text fields, like data that is usually stored
in files.
MDB2 supports two types of large object fields:
Character Large OBjects (CLOBs) and Binary Large OBjects
(BLOBs). CLOB fields are meant to
store only data made of printable ASCII characters. BLOB
fields are meant to store all types of data.
Large object fields are usually not meant to be used as parameters of query
search clause (WHERE) unless the underlying DBMS supports
a feature usually known as "full text search"