DBMS:Normalisation

Overview
Normalisation is a process of converting complex data structures into simple stable data structures to achieve data integrity.

It is an approach that examines data and groups data items in a way that seeks to minimise the impact of change by reducing the duplication of data, represents data in form of a two dimensional table (refer Relational Database)


 * Normalisation objectives, are to:
 * Prevent insert/update and delete anomolies
 * Minimise redundancy (produces controlled redundancy)

Once you have normalised your data you may have to go back and review your Entity Relationship diagrams
 * An entity is normalised when it is " Dependent on"
 * the key = first normal form
 * the whole key = second normal form
 * nothing but the key = third normal form

First Normal Form (1NF)
Removes duplicate groups of data by creating separate tables.


 * Applicable when:
 * Key field and related fields repeat.
 * Method:
 * Remove repeating groups
 * Identify entities by splitting into separate tables (entities).
 * Assign primary keys for each entity.



Example

 * Original (note [Course, Result] repeat)
 * STUDENTRESULT( StuId ,Name,Address,Course1,Result1,Course2,Result2,Course3,Result3)
 * Remove repeating data (either create a CourseResultID with StuId as a foreign key)
 * STUDENT( StuId ,Name,Address) and
 * COURSERESULT( ResultId ,StuId*,Course,Result)
 * Or, alternatively, you could also create a joint key field so the result would be
 * STUDENT( StuId ,Name,Address) and
 * COURSERESULT( StuId*, Course ,Result)

Exercise
Apply the First normal form to:


 * 1) TIMESHEET ( EmpId ,EmpName,EmpEmail,WeekNum1,HrsWorked1, WeekNum2, HrsWorked2, WeekNum3, HrsWorked3) when
 * 2) There can be only one timesheet per employee per week.
 * 3) There can be many timesheets per employee per week.

Second Normal Form (2NF)
All the data in a table belongs to the whole key.


 * Applicable when:
 * A table has a compound key (two or more key fields).
 * Method:
 * Remove partial dependencies. For records whose keys have more than one data item (compound keys), ensure that all other items are dependent on the whole key (functionally dependent). Split records if necessary to achieve this (second normal form).

Example


 * Original:
 * PART( PartId, SupplierId ,SuppName,SuppDetails,PartPrice)
 * Split into separate tables (projection)
 * PART( PartId, SupplierId* ,PartPrice) and
 * SUPPLIER( SupplierId ,SuppName,SuppDetails)

Third Second Form (3NF)
Separates attributes that do not relate to the key.


 * Applicable when:
 * Separates attributes that do not relate to the key.
 * Method:
 * Identify transitive dependencies. Transitive dependency occurs when a non-key item is the identifier for some other item.
 * Remove transitive dependencies and create new entities (tables).
 * Third normal form is when each attribute is dependent on the key, and nothing but the key.
 * Finally - remove calculated fields.

Example: 1
 * Original:
 * EMPLOYEE( EmpId ,EmpName,Salary,ProjectId,ProjDsn,CompletionDate)
 * Remove transitive dependencies
 * EMPLOYEE( EmpId ,ProjectId*,EmpName,Salary) and
 * PROJECT( ProjectId ,ProjDsn,CompletionDate)

Example: 2
 * Original:
 * BOOKLOAN( LoanId ,BookId*,StudentId,StuName,StuEmail,LoanDate)
 * Remove transitive dependencies
 * BOOKLOAN( LoanId ,BookId*,StudentId*,LoanDate) and
 * STUDENT( StudentId ,StuName,StuEmail)

Combined exercises
For the following table normalise to the third normal form (i.e. check for first, second and third).


 * STUDENTRESULT( StuId ,Name, Address,Program ID, Program Name, Program Duration Course1,Result1,CourseName1,Course2,Result2,CourseName2,Course3,Result3,CourseName3)
 * For this example students can only get one grade per course.

Example data
 * 1 Bilbo, 12 Tui St, DipICT, Diploma in Information and Communications Technology, 1yr, DB500,P,IN600,M,DA500,P
 * 2 Freddo, 18 Kea St, DipICT, Diploma in Information and Communications Technology, 1yr, IN600,M,OS500,M,DA500,P
 * 3 Greeno, 98 Kowhai St, BCS, Bachelor of Computing Systems, 3yr, ITWD5.320,89,ITDB6.200,55,ITMM5.320,86