Problem of Data Redundancy:
The occurrence of data redundancy can lead to the following problems:
(a) Redundant storage: Some information is stored repeatedly.
(b) Update anomalies: If one copy of such repeated data is updated, an inconsistency is created unless all copies are similarly updated.
(c) Insertion anomalies: It may not be possible to store some information unless some other information is stored as well.
(d) Deletion anomalies: It may not be possible to delete some information without losing some other information as well.
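The anomalies above can be made concrete with a small sketch. The relation Enrolled(student, course, instructor), its sample rows, and the FD course -> instructor are all assumptions made up for illustration; the helper `violates_fd` is likewise hypothetical.

```python
# Hypothetical relation Enrolled(student, course, instructor),
# with the assumed FD course -> instructor.
rows = [
    {"student": "Ann", "course": "DB", "instructor": "Lee"},
    {"student": "Bob", "course": "DB", "instructor": "Lee"},  # "Lee" stored twice
    {"student": "Ann", "course": "OS", "instructor": "Kim"},
]

def violates_fd(rows, lhs, rhs):
    """Return True if two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return True
    return False

# Update anomaly: only one of the two copies of DB's instructor is changed.
updated = [dict(r) for r in rows]
updated[0]["instructor"] = "Park"
update_anomaly = violates_fd(updated, ["course"], ["instructor"])

# Deletion anomaly: dropping Ann's OS enrollment also loses who teaches OS.
after_delete = [r for r in rows
                if not (r["student"] == "Ann" and r["course"] == "OS")]
lost_instructor = all(r["course"] != "OS" for r in after_delete)
```

An insertion anomaly shows up in the same table: a new course and its instructor cannot be recorded until at least one student enrolls, unless a placeholder student value is stored.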
One use of functional dependencies (FDs) is to eliminate data redundancy, a process called normalization. The idea is to identify the FDs that hold in a particular relation and then decompose the relation so that the left-hand side of each FD becomes the key of a new relation.
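As a minimal sketch of this idea, the code below splits a relation on one FD so that the FD's left-hand side becomes the key of a new relation. The sample relation, the FD course -> instructor, and the helpers `project`, `decompose`, and `natural_join` are assumptions for illustration, not a general normalization algorithm.

```python
# Hypothetical relation with the assumed FD course -> instructor.
rows = [
    {"student": "Ann", "course": "DB", "instructor": "Lee"},
    {"student": "Bob", "course": "DB", "instructor": "Lee"},
    {"student": "Ann", "course": "OS", "instructor": "Kim"},
]

def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result

def decompose(rows, lhs, rhs):
    """Split on the FD lhs -> rhs: one relation holds lhs and rhs,
    the other keeps every attribute except rhs."""
    attrs = list(rows[0])
    r1 = project(rows, lhs + rhs)
    r2 = project(rows, [a for a in attrs if a not in rhs])
    return r1, r2

def natural_join(r1, r2):
    """Join two relations on their common attributes."""
    result = []
    for a in r1:
        for b in r2:
            if all(a[c] == b[c] for c in set(a) & set(b)):
                merged = {**a, **b}
                if merged not in result:
                    result.append(merged)
    return result

course_instructor, enrolled = decompose(rows, ["course"], ["instructor"])
rejoined = natural_join(course_instructor, enrolled)
```

Here `course_instructor` stores each course's instructor exactly once, with course as its key, and joining the two pieces gives back the original rows, so no information is lost in this example.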
Problem Solving:
The normalization process also defines a series of normal forms (NFs) for relations, along with the redundancy problems associated with each form. Basically, FDs help us solve the redundancy problem by transforming a relation into one of the following forms:
* 1NF: Relation should have no non-atomic attributes or nested relations.
* 2NF: No non-key attribute should be functionally dependent on only a part of the primary key. This is a concern only for relations whose primary key contains multiple attributes.
* 3NF: Relation should not have a non-key attribute functionally determined by another non-key attribute (or by a set of non-key attributes). That is, there should be no transitive dependency of a non-key attribute on the primary key.
* Boyce-Codd NF (BCNF): The only non-trivial FDs that hold over R are the key constraints.
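The BCNF condition can be tested mechanically with attribute closure. The sketch below is a minimal illustration: the relation schema, the FD course -> instructor, and the function names are assumptions. It checks only the FDs that are explicitly given, which is enough for testing a whole relation but not necessarily for the pieces of a decomposition.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD if its left-hand side is covered and it adds something.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, relation, fds):
    """attrs is a superkey if its closure covers the whole relation."""
    return closure(attrs, fds) >= set(relation)

def bcnf_violations(relation, fds):
    """List the given non-trivial FDs whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not is_superkey(lhs, relation, fds)
    ]

# Hypothetical schema and FD, for illustration only.
relation = ("student", "course", "instructor")
fds = [(("course",), ("instructor",))]

violations = bcnf_violations(relation, fds)
# After splitting off (course, instructor), course is a key of the new relation.
violations_after_split = bcnf_violations(("course", "instructor"), fds)
```

In this example the original relation reports one violation, because course determines instructor without being a key, while the smaller relation reports none.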
As we can see, the FD constraints become stricter and stricter from 1NF to BCNF. Every FD constraint of a lower NF also applies to the higher NFs (taking BCNF to be higher than 3NF). At BCNF, the left-hand side of every non-trivial FD must be a key, which means that each left-hand-side value can occur at most once in the relation, and this ensures that no redundancy can be detected using FDs alone. Relations in 1NF through 3NF can all still contain redundant data in some situations, but each higher NF places tighter constraints on the FDs, so the higher the NF, the less chance there is for data redundancy to occur.
The following figure is an excellent diagram for visualizing the normal form definition.
Figure: Visualizing Normal Form Definition