<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.2" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Database Design: 4th and 5th Normal Forms</title>
	<link>http://blog.todmeansfox.com/2007/12/04/database-design-4th-and-5th-normal-forms/</link>
	<description>Business Intelligence, Data Warehousing, SQL, Visual FoxPro.</description>
	<pubDate>Thu, 20 Nov 2008 21:36:39 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2</generator>

	<item>
		<title>By: Tod McKenna</title>
		<link>http://blog.todmeansfox.com/2007/12/04/database-design-4th-and-5th-normal-forms/#comment-1380</link>
		<author>Tod McKenna</author>
		<pubDate>Thu, 17 Apr 2008 06:41:14 +0000</pubDate>
		<guid>http://blog.todmeansfox.com/2007/12/04/database-design-4th-and-5th-normal-forms/#comment-1380</guid>
		<description>Hi Caroline, I like it. You've presented a very good example where 3NF is preferable over higher forms. 

One thing I always come back to is the original intent of normalization: to reduce the number of data anomalies that can occur during INSERT, UPDATE, and DELETE operations. Normally, if tables are in at least 3NF, INSERTs are rarely a problem because of the constraints set up by primary and foreign key relationships. But UPDATEs and DELETEs do create potential issues. So, if you are not going to update or delete rows in a table (or there is little likelihood of it), then it is generally safe to denormalize a model that would otherwise be in 4th or 5th normal form "down" to 3rd.

I'd love to hear more about 2-and-a-half normal form too. And I admit, I have never used CRUD analysis to spot opportunities to denormalize. That's a great tip!</description>
		<content:encoded><![CDATA[<p>Hi Caroline, I like it. You&#8217;ve presented a very good example where 3NF is preferable over higher forms. </p>
<p>One thing I always come back to is the original intent of normalization: to reduce the number of data anomalies that can occur during INSERT, UPDATE, and DELETE operations. Normally, if tables are in at least 3NF, INSERTs are rarely a problem because of the constraints set up by primary and foreign key relationships. But UPDATEs and DELETEs do create potential issues. So, if you are not going to update or delete rows in a table (or there is little likelihood of it), then it is generally safe to denormalize a model that would otherwise be in 4th or 5th normal form &#8220;down&#8221; to 3rd.</p>
<p>I&#8217;d love to hear more about 2-and-a-half normal form too. And I admit, I have never used CRUD analysis to spot opportunities to denormalize. That&#8217;s a great tip!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Caroline</title>
		<link>http://blog.todmeansfox.com/2007/12/04/database-design-4th-and-5th-normal-forms/#comment-1378</link>
		<author>Caroline</author>
		<pubDate>Wed, 16 Apr 2008 20:16:38 +0000</pubDate>
		<guid>http://blog.todmeansfox.com/2007/12/04/database-design-4th-and-5th-normal-forms/#comment-1378</guid>
		<description>I would say a case for 3NF would be data representing a ternary relationship, i.e. where 4th &#38; 5th become relevant, but where the entities are written like transactions, i.e. INSERTed but never UPDATEd or DELETEd. The danger of not splitting out entities where 4th or 5th is appropriate is not in creating the data; it's in updating it (where you can easily introduce update anomalies because you are carrying redundant data) or deleting it (where you can lose all the information held about a particular attribute). And on DB2 on the mainframe we used to talk about 2-and-a-half normal form, which was a denormalisation tactic to reduce frequent joins on very read-intensive tables. CRUD analysis will sometimes highlight this sort of situation up front, but more often than not the denormalisation is done after the fact to improve performance.</description>
		<content:encoded><![CDATA[<p>I would say a case for 3NF would be data representing a ternary relationship, i.e. where 4th &amp; 5th become relevant, but where the entities are written like transactions, i.e. INSERTed but never UPDATEd or DELETEd. The danger of not splitting out entities where 4th or 5th is appropriate is not in creating the data; it&#8217;s in updating it (where you can easily introduce update anomalies because you are carrying redundant data) or deleting it (where you can lose all the information held about a particular attribute). And on DB2 on the mainframe we used to talk about 2-and-a-half normal form, which was a denormalisation tactic to reduce frequent joins on very read-intensive tables. CRUD analysis will sometimes highlight this sort of situation up front, but more often than not the denormalisation is done after the fact to improve performance.</p>
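<p>A minimal sketch of the update and delete anomalies described above, using Python's built-in sqlite3 module. The table emp_skill_lang, its columns, and the sample rows are hypothetical and chosen only for illustration; they assume an employee's skills and languages are independent facts stored together in one unsplit table.</p>

```python
import sqlite3


def demo_anomalies():
    """Show an update anomaly and a delete anomaly in an unsplit ternary table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: skills and languages are independent facts about an
    # employee, yet they share one table, so every combination is stored.
    conn.execute(
        "CREATE TABLE emp_skill_lang ("
        " emp TEXT, skill TEXT, lang TEXT,"
        " PRIMARY KEY (emp, skill, lang))"
    )
    rows = [("ann", s, l) for s in ("SQL", "COBOL") for l in ("English", "French")]
    rows.append(("bob", "SQL", "German"))
    # INSERT-only use is harmless: the redundancy just sits there.
    conn.executemany("INSERT INTO emp_skill_lang VALUES (?, ?, ?)", rows)

    # Update anomaly: only one of the redundant COBOL rows is changed,
    # so ann's skill set now depends on which language you look at.
    conn.execute(
        "UPDATE emp_skill_lang SET skill = 'DB2'"
        " WHERE emp = 'ann' AND skill = 'COBOL' AND lang = 'English'"
    )
    skills_by_lang = {}
    for lang, skill in conn.execute(
        "SELECT lang, skill FROM emp_skill_lang WHERE emp = 'ann'"
    ):
        skills_by_lang.setdefault(lang, set()).add(skill)

    # Delete anomaly: removing bob's only language also removes
    # the only record of his skill.
    conn.execute("DELETE FROM emp_skill_lang WHERE lang = 'German'")
    bob_skills = {
        row[0]
        for row in conn.execute(
            "SELECT skill FROM emp_skill_lang WHERE emp = 'bob'"
        )
    }
    conn.close()
    return skills_by_lang, bob_skills
```

<p>Calling demo_anomalies() returns ann's skills grouped by language (which no longer agree with each other) and bob's remaining skills (an empty set). Neither problem can arise if the table only ever receives INSERTs, which is the case made above for stopping at 3NF.</p>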
]]></content:encoded>
	</item>
	<item>
		<title>By: Tod McKenna</title>
		<link>http://blog.todmeansfox.com/2007/12/04/database-design-4th-and-5th-normal-forms/#comment-1210</link>
		<author>Tod McKenna</author>
		<pubDate>Wed, 12 Mar 2008 12:30:09 +0000</pubDate>
		<guid>http://blog.todmeansfox.com/2007/12/04/database-design-4th-and-5th-normal-forms/#comment-1210</guid>
		<description>Hi amitabh, I apologize for being very late here on my reply.

Query performance and data model simplicity are two often-cited reasons to denormalize. The former is far more defensible than the latter. The latter, in fact, sounds like laziness!

OLTP systems can be complex, and data integrity and quality should be a primary focus. There are ways to address query and reporting performance and join complexity, ranging from views to operational data stores built for query speed.

Denormalized tables could potentially expose the database to anomalies especially if multiple systems interact with the database at different levels. 

There is a threshold to how much you can normalize, though. You could normalize every entity up to higher and higher forms but at some point -- the tipping point, if you will -- maintainability and understandability will erode, rendering the model too difficult to use. You’ll enter the land of academia and theoretical models which might be far from practical. 

The balance is really for the modeler to discover (based on many factors such as industry, business needs, application access, compliance concerns, data quality policies, etc.). I don't think that there is a magic formula. Some entities in the model may be normalized to 5th, while others to 3rd. 

Some scenarios where denormalization might be the answer for performance purposes: (a) to repeat attributes in a table row so that calculations can be performed without the need to join to other tables, (b) when repeating groups of rows exist, but must be processed as a group, rather than by the row, or (c) when certain attributes in a table are queried so often that it makes sense to include them in the table.

When denormalizing, consider these factors: (a) the possibility that data integrity will be jeopardized, (b) that performance gains will still not be significant enough to make the risk worthwhile, and (c) that other methods may be available to handle the performance issues associated with highly normalized data.

Personally, I think denormalization makes the most sense on tables that are read-only with a very controlled process for updates, deletes, and inserts. For example, I modeled a database 10 years ago that rests mostly in 4th normal form, with some entities in 5th, and others in 3rd, where necessary. I have a separate database in mostly 2nd form for reporting (I designed this data model before I was aware of dimensional modeling; although my design is similar, it is NOT a true dimensional model). This database is updated using triggers from the primary OLTP database. Not only does this work, but it works very well. Those denormalized tables are “read only” to everyone but the triggers. Yet, several applications use them for reporting and query purposes.  It is very fast, while my OLTP database is very secure.</description>
		<content:encoded><![CDATA[<p>Hi amitabh, I apologize for being very late here on my reply.</p>
<p>Query performance and data model simplicity are two often-cited reasons to denormalize. The former is far more defensible than the latter. The latter, in fact, sounds like laziness!</p>
<p>OLTP systems can be complex, and data integrity and quality should be a primary focus. There are ways to address query and reporting performance and join complexity, ranging from views to operational data stores built for query speed.</p>
<p>Denormalized tables could potentially expose the database to anomalies especially if multiple systems interact with the database at different levels. </p>
<p>There is a threshold to how much you can normalize, though. You could normalize every entity up to higher and higher forms but at some point &#8212; the tipping point, if you will &#8212; maintainability and understandability will erode, rendering the model too difficult to use. You’ll enter the land of academia and theoretical models which might be far from practical. </p>
<p>The balance is really for the modeler to discover (based on many factors such as industry, business needs, application access, compliance concerns, data quality policies, etc.). I don&#8217;t think that there is a magic formula. Some entities in the model may be normalized to 5th, while others to 3rd. </p>
<p>Some scenarios where denormalization might be the answer for performance purposes: (a) to repeat attributes in a table row so that calculations can be performed without the need to join to other tables, (b) when repeating groups of rows exist, but must be processed as a group, rather than by the row, or (c) when certain attributes in a table are queried so often that it makes sense to include them in the table.</p>
<p>When denormalizing, consider these factors: (a) the possibility that data integrity will be jeopardized, (b) that performance gains will still not be significant enough to make the risk worthwhile, and (c) that other methods may be available to handle the performance issues associated with highly normalized data.</p>
<p>Personally, I think denormalization makes the most sense on tables that are read-only with a very controlled process for updates, deletes, and inserts. For example, I modeled a database 10 years ago that rests mostly in 4th normal form, with some entities in 5th, and others in 3rd, where necessary. I have a separate database in mostly 2nd form for reporting (I designed this data model before I was aware of dimensional modeling; although my design is similar, it is NOT a true dimensional model). This database is updated using triggers from the primary OLTP database. Not only does this work, but it works very well. Those denormalized tables are “read only” to everyone but the triggers. Yet, several applications use them for reporting and query purposes.  It is very fast, while my OLTP database is very secure.</p>
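<p>A minimal sketch of the trigger-maintained reporting idea described above, using Python's built-in sqlite3 module. Everything here is an assumption for illustration: the tables customer, orders, and report_orders, the trigger names, and the use of a single SQLite database rather than a separate reporting database. It is not the actual schema discussed in this comment.</p>

```python
import sqlite3

# Hypothetical schema: two normalized OLTP tables plus one denormalized
# reporting table that is written only by the triggers below.
SCHEMA = """
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id           INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(id),
    amount_cents INTEGER NOT NULL
);
CREATE TABLE report_orders (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL,
    customer_name TEXT NOT NULL,   -- repeated here so reports need no join
    amount_cents  INTEGER NOT NULL
);
CREATE TRIGGER orders_ai AFTER INSERT ON orders
BEGIN
    INSERT INTO report_orders (order_id, customer_id, customer_name, amount_cents)
    SELECT NEW.id, c.id, c.name, NEW.amount_cents
    FROM customer c WHERE c.id = NEW.customer_id;
END;
CREATE TRIGGER customer_au AFTER UPDATE OF name ON customer
BEGIN
    UPDATE report_orders SET customer_name = NEW.name
    WHERE customer_id = NEW.id;
END;
CREATE TRIGGER orders_ad AFTER DELETE ON orders
BEGIN
    DELETE FROM report_orders WHERE order_id = OLD.id;
END;
"""


def report_rows():
    """Write only to the OLTP tables, then read back the reporting table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
    conn.execute("INSERT INTO orders (id, customer_id, amount_cents) VALUES (10, 1, 2500)")
    conn.execute("INSERT INTO orders (id, customer_id, amount_cents) VALUES (11, 1, 400)")
    # A rename in the OLTP table is propagated to every redundant copy.
    conn.execute("UPDATE customer SET name = 'Acme Ltd' WHERE id = 1")
    # Deleting an order removes its reporting row as well.
    conn.execute("DELETE FROM orders WHERE id = 11")
    rows = conn.execute(
        "SELECT order_id, customer_name, amount_cents"
        " FROM report_orders ORDER BY order_id"
    ).fetchall()
    conn.close()
    return rows
```

<p>Because the application code never writes to report_orders directly, the redundant customer_name column cannot drift out of step with customer. SQLite has no per-user permissions, so in this sketch the "read only to everyone but the triggers" rule is a convention; a server database would typically enforce it with grants.</p>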
]]></content:encoded>
	</item>
	<item>
		<title>By: amitabh mishra</title>
		<link>http://blog.todmeansfox.com/2007/12/04/database-design-4th-and-5th-normal-forms/#comment-281</link>
		<author>amitabh mishra</author>
		<pubDate>Tue, 11 Dec 2007 07:02:12 +0000</pubDate>
		<guid>http://blog.todmeansfox.com/2007/12/04/database-design-4th-and-5th-normal-forms/#comment-281</guid>
		<description>Hi,
I'm Amitabh Mishra. My question is: in what situations should we denormalize a table? For example, when we want faster searching, or for some other purpose?
And my second question is:
&#62;Do foreign keys take up extra space or slow down the table?</description>
		<content:encoded><![CDATA[<p>Hi,<br />
I&#8217;m Amitabh Mishra. My question is: in what situations should we denormalize a table? For example, when we want faster searching, or for some other purpose?<br />
And my second question is:<br />
&gt;Do foreign keys take up extra space or slow down the table?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
