SQL:Data Integrity

Database integrity
A database is valuable only if the data in it is correct. Taylor (2006, pp. 104-) identifies three kinds of database integrity.
 * Entity integrity
 * Domain integrity
 * Referential integrity

And identifies some problems that can threaten database integrity, such as bad input data, operator error, mechanical failure.

A sample database
To have a look at Data Integrity issues we will look at a simple insurance database that keeps a record of People and Items that they want to keep serial numbers for. DemoInt.sql

Entity integrity
An entity corresponds to something in the real world. As such you must be able to connect the entry in the database with the correct one in the real world. This is commonly achieved by specifying a primary key in the table that allows each record to be unique. In the Person table an entity integrity error could occur if we try to add a person with an ID (PerID) which is already in the table.

For a table where the primary key is not numeric, you maintain entity integrity by setting the Primary key to NOT NULL and UNIQUE, as in the following example:

This causes a problem if you have two John Smiths, as with this table you will not be able to correctly connect them to the correct real world one. As the StuFullName is Unique you will not be able to add "John Smith" twice to this table - the second will fail entity integrity.

In the person table the Drivers licence is also set to unique. So adding the same drivers licence to two different people will also fail entity integrity.

Domain integrity
Many data items can only take specific values, for example, gender in tblPerson can only be Male (M) or Female (F). MySQL provides the ENUM data type.

So if you tried to add or update a Persons record containing "X" for gender this would fail domain integrity

Referential integrity
An example of Referential integrity is when an Item is added to the database it MUST belong to a Person. So before a Item can be entered the Person MUST be created. Referential Integrity will fail if the Item is added before the Person is created.

Here the PerID 999 must be created first in the Person table

Another example of referential integrity is if we try to delete a Person who has transaction.

This should fail as for person 1 (Elly Fant) there are 2 items a 'Diamond Ring',and an 'iPod'.

MySQL allows Cascading Deletes by using the ON DELETE CASCADE clause on the foreign key constraints, where deleting a Person will delete all associated items. This should be used with caution.

Other problems that can affect the integrity of your database

 * bad input data
 * operator error
 * mechanical failure
 * malice
 * Data redundancy
 * Exceeding the capacity of your DBMS