LESSON-9 NORMALIZATION OF ENTITIES (Cont’d; Ashraf, Khan, and Mir)

LESSON-9 NORMALIZATION OF ENTITIES (Cont’d; Ashraf, Khan, and Mir)

Normalization was originally an invention of Dr. E. F. Codd, an IBM Research Fellow, and has been further refined and extended by a number of database scientists since its introduction in 1972.

9.1 Why Normalization?

The logical data model at this stage is normalized.

Normalization is a formal process that groups related facts into common entities and remove duplicate definitions from the model.

These models are a standard forms which is comparable with other problem domains. The other big gain is that accessing records is simpler in a relational database. Instead of using record pointers to navigate between data sets, one can reuse a portion of one record to link it to another.

That portion is usually called the primary key on its owning table, because of its identifying nature. It becomes a foreign key on the child table. The process of propagating the key to the child table is called migrating. One can use a foreign key to navigate back into the design, allowing to skip over tables.

Analytic Definition: the normalization is a set of characteristic of construing business fact, which consists of attributes.

Practitioner’s Normalization Definition: the normalization characterizes rules of placing the basic attributes (excluding derivative attributes) in appropriate entities. Some of the attributes of interest to the user community are identified as PK, and FK of entities of the data model.

At this stage and point the issue is to place the third variety of attributes, which are neither PK nor FK, and are of interest to the user community.

Normalization Rules: there are six rules in all. In transactional databases the required normalization level is up to the third level, namely;

• the first normal form, abbreviated by 1nf
• the second normal form, abbreviated by 2nf
• the third normal form, abbreviated by 3nf

Higher normal forms concerns issues of data warehousing, or complex nature of dependency characteristics of attributes in transactional databases.

• Boyce Codd normal form, abbreviated by BCNF.
• the fourth normal form, abbreviated by 4nf.
• the fifth normal form, abbreviated by 5nf.

Attribute Definition: an attribute is a characteristic or quality of an entity or of an association, of interest to the user community, about which the system is to maintain, correlate, and display information.

First Normal Form Definition: The relationship between the primary-key of an entity and each of the other attributes must be one-to-one. That is, the attribute is functionally dependent upon the primary key, and so the attribute value is dependent on the primary key value. The entities that conform to the rule are said to in ‘first normal form – 1nf’.

Second Normal Form Definition: the relationship between any portion of the primary-key of an entity and each of the other attributes must not be one-to-one. That is, the attribute is fully functionally dependent upon the complete primary key, and so the attribute value is dependent on the complete primary key value. The entities that conform to the rule are said to in ‘second normal form – 2nf’.

Third Normal Form Definition: the relationship between any two non primary-key attributes of an entity must not be one-to-one (in either direction). Such a non-primary attributes are said to be “non-transitively dependent” upon each other and the primary-key. That is, the attribute is fully functionally dependent upon the complete primary key, and so the attribute value is dependent on the complete primary key value. The entities that conform to the rule are said to in ‘third normal form – 3rd’.

Derivative Attribute Definition: an attribute value is determined or calculated from other attribute value(s) in the model.

Domain Definition: It applies for each attribute and consists of the domain name, its type, and size restrictions, and its default formats.

Domain Constraints Definition: includes the above and any other peculiar operational constraints custom oriented.

Entity Definition: It is a collection of table consists of columns (set of attributes), integrity rules, domains, its inter-relationships with other entities, and sample data values.


9.2 Advanced Normal Forms

Boyce-Codd Normal Form (BCNF): A Relation in which every determinant is a candidate key, it is known as Functional Dependency.

To eliminate Functional Dependency decompose the relation by eliminating the partial functional dependency.

Rule: 3NF ensures that feasible selection of robust Primary Key out of the candidate keys.

The BCNF addresses potential redundancies when there are multiple, composite, overlapping candidate keys which are not spotted by applying 1nf, 2nf, and 3nf normalizations.

Fourth Normal Form (4NF):

The BCNF eliminates functional dependencies, however, there are still anomalies that result from multi-valued dependencies as under;

Rule: Reduce BCNF entities to 4NF by removing any independent multi-valued components of the Primary Key to two new parent entities. Retain the original (now child) entity if it contains other, non-key attributes.

Multi-valued dependencies: can exist when there are at least three attributes (say, A, B, C) in a relation, and for each value of A there is a well defined set of values of B and a well defined set of value of C.

However, the set of values of B is independent of set C, and vice versa.

Fifth Normal Form (5NF):

It addresses cyclic dependencies that warrant an entity into three or more equivalent  entities (representing the same information) with fewer redundancies.

Anomaly: JOIN operation dependency is an assertion that a relation can be constructed by the Join operation of its projections.

In cases, the Join operation on the projections generates “spurious” tuples (rows), which is a superset or subset of the original relation, and constitutes loss decomposition.

There is a way to identify these spurious rows or short rows. These spurious and short makes the database status is doubtful due to this design fault.

RULE: reduces 4NF entities to 5NF by removing pair-wise cyclic dependencies (appearing with composite Primary Keys with three or more component attributes) to three or more new parent entities.

Joining any of the above two entities will produce spurious rows, let us call that Joining result SUPPLIER_PART_PROJECT_LOSS.

Taking another Join operation with the third component, say, Project-Supplier over (PRJT#, SUPP#) would eliminate spurious rows, and will produce the nonloss join as the original one Supplier-Part-Project-Nonloss.

NOTE: If the child entity has a composite Primary Key of Three components while their immediate parents are composite entities of combination of two parents.

The child entity can not enforce (as a design feature) overlapping components as Foreign Key of two different parents, thus one of them will be taken up as Foreign Key constraint while others could be enforced at database level by hard coded procedures.
Similarly, their SQL could be tricky as well.
*****

Click To MAIN PAGE: OBJECT- ORIENTED RELATIONAL MODELLING E-MONOGRAPH, By J. Ashraf, M. Khan, and H. Mir

 

*****

 MESSAGE: DEDICATED TO ANONYMOUS COMBATANTS OF KNOWLEDGE

مکتب علم الل مہد منل لحد   Learning continues from birth to death

FOR PROMOTION OF LEARNEDNESS SHARE WITH FRIENDS ABOUT;
   
  1. CAUSE AND EVENTS – SPATIAL And TEMPORAL RESPECTIVELY:                         https://be4gen.wordpress.com/2012/11/16/cause-and-events-spatial-temporal-respectively/

  1. WHETHER ‘BRAIN-POWER’ OR ‘HEART-IMPULSE’ PLINTHS ‘HUMAN GEN’? http://sunedu.wordpress.com/2012/09/17/whether-brain-power-or-heart-impulse-plinths-human-gen/

************