A database must be designed to store the correct
data in the correct way without that data becoming damaged or
corrupted. To ensure this, the DBA implements integrity rules
using features of the DBMS. Three aspects of integrity are relevant to
our discussion of databases: physical, semantic, and internal.
Physical integrity
issues can be handled using DBMS features such as domains and data
types. The DBA chooses the appropriate data type for each column of
each table. This action ensures that only data of that type is stored
in the database. That is, the DBMS enforces the integrity of the data
with respect to its type. A column defined as “integer” can only
contain integers. Attempts to store non-numeric values in a column
defined as integer will fail, and non-integer numeric values will be
rejected or converted to integers, depending on the DBMS. DBAs can also use
constraints to further delineate the type of data that can be stored in
database columns. Most relational DBMS products provide the following
types of constraints:
- Referential constraints are used to specify the columns that define any
  relationships between tables. Referential constraints are used to
  implement referential integrity, which ensures that all intended
  references from data in one column (or set of columns) of a table are
  valid with respect to data in another column of the same or a different
  table.
- Unique constraints ensure that the values for a column or a set of
  columns occur only once in a table.
- Check constraints are used to place more complex integrity rules on a
  column or set of columns in a table. Check constraints are typically
  defined using SQL and can be used to define the data values that are
  permissible for a column or set of columns.
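
The three constraint types can be sketched with a small, self-contained example. This is only an illustration, assuming SQLite driven through Python's standard sqlite3 module; the department and employee tables, their columns, and the try_insert helper are hypothetical names invented here, and other DBMS products use slightly different syntax and error reporting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE                 -- unique constraint
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    dept_id INTEGER NOT NULL
            REFERENCES department (dept_id),     -- referential constraint
    salary  INTEGER NOT NULL CHECK (salary > 0)  -- check constraint
);
INSERT INTO department (dept_id, name) VALUES (1, 'Sales');
""")


def try_insert(sql):
    """Run one statement; return 'ok' or the DBMS error message."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


results = {
    # References an existing department and has a positive salary.
    "valid": try_insert("INSERT INTO employee VALUES (1, 1, 50000)"),
    # Department 99 does not exist, so the reference is invalid.
    "bad_dept": try_insert("INSERT INTO employee VALUES (2, 99, 50000)"),
    # 'Sales' is already present in department.name.
    "dup_name": try_insert("INSERT INTO department VALUES (2, 'Sales')"),
    # A negative salary violates the check constraint.
    "bad_salary": try_insert("INSERT INTO employee VALUES (3, 1, -5)"),
}

for label, outcome in results.items():
    print(label, "->", outcome)
```

Only the first insert should succeed; the DBMS itself rejects the other three, so no application code has to repeat these rules.
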
Semantic integrity
is more difficult to control and less easily defined. An example of
semantic integrity is the quality of the data in the database. Simply
storing any data that meets the physical integrity definitions
specified to the database is not enough. Procedures and practices need
to be in place to ensure data quality. For example, a customer database
in which 25% of the customer records contain a wrong address or phone
number is a database with poor data quality. There is no
systematic, physical method of ensuring data accuracy. Data quality is
encouraged through proper application code, sound business practices,
and specific data policies. Redundancy is another semantic issue. If
data elements are stored redundantly throughout the database, the DBA
should document this fact and work to ensure that procedures are in
place to keep redundant data synchronized and accurate.
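
One procedure for keeping redundant data synchronized is a periodic audit query that reports copies that have drifted apart. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical schema in which orders.cust_phone redundantly stores customer.phone; all table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, phone TEXT);
-- cust_phone is a redundant copy of customer.phone (hypothetical design).
CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    cust_id    INTEGER,
    cust_phone TEXT
);
INSERT INTO customer VALUES (1, '555-0100'), (2, '555-0200');
INSERT INTO orders VALUES (10, 1, '555-0100'), (11, 2, '555-0999');
""")

# Report every order whose redundant phone no longer matches the customer
# row. In SQLite, IS NOT also treats a NULL on one side as a mismatch.
mismatches = conn.execute("""
    SELECT o.order_id, o.cust_phone, c.phone
    FROM orders AS o
    JOIN customer AS c ON c.cust_id = o.cust_id
    WHERE o.cust_phone IS NOT c.phone
    ORDER BY o.order_id
""").fetchall()

for order_id, copy, source in mismatches:
    print(f"order {order_id}: copy={copy!r} source={source!r}")
```

A query like this can show that redundant copies disagree, but not which copy is accurate; that judgment still depends on business practices and data policies.
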
The final aspect of integrity comprises internal DBMS issues. The DBMS
relies on internal structures and code to maintain links, pointers, and
identifiers. In most cases, the DBMS will do a good job of maintaining
these structures, but the DBA needs to be aware of their existence and
how to cope when the DBMS fails. Internal DBMS integrity is essential
in the following areas:
- Index consistency. An index is really nothing but an ordered list of
  pointers to data in database tables. If for some reason the index gets
  out of sync with the data, indexed access can fail to return the proper
  data. The DBA has tools available to check for and remedy these types
  of errors.
- Pointer consistency. Sometimes large multimedia objects are not stored
  in the same physical files as other data. Therefore, the DBMS requires
  pointer structures to keep the multimedia data synchronized to the base
  table data. Once again, these pointers may get out of sync if proper
  administration procedures are not followed.
- Backup consistency. Some DBMS products occasionally take improper
  backup copies that effectively cannot be used for recovery. It is
  essential to identify these scenarios and take corrective actions.
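
The checking tools differ from product to product. As one illustration of index and backup checks, the sketch below assumes SQLite through Python's sqlite3 module (Python 3.7 or later for Connection.backup); the customer table and index name are hypothetical. It runs SQLite's built-in integrity check, rebuilds an index, and then verifies a backup copy rather than trusting it blindly.

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
CREATE INDEX customer_name_ix ON customer (name);
INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
""")

# PRAGMA integrity_check examines tables and indexes; it returns the
# single row 'ok' when it finds no problems.
index_report = [row[0] for row in src.execute("PRAGMA integrity_check")]

# REINDEX rebuilds an index from the table data, one way to remedy an
# index that has gotten out of sync.
src.execute("REINDEX customer_name_ix")

# Take a backup copy, then check the copy itself.
backup = sqlite3.connect(":memory:")
src.backup(backup)
backup_report = [row[0] for row in backup.execute("PRAGMA integrity_check")]
backup_rows = backup.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

print("source check:", index_report)
print("backup check:", backup_report, "rows:", backup_rows)
```

A passing check on the copy is a reasonable first signal, but the stronger test of a backup is an actual trial recovery.
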