Some of our
competitors recommend that you use 3rd Normal Form (3NF) for their data
structures, as they believe that is the optimal architecture for the ad hoc
queries that form the basis for decision support and analytical processing of
today. While 3NF can save storage space, it makes queries harder to write, and
slower to execute. A big down side of 3NF for data warehousing is that it
causes the database to join tables for most queries. “Joins” can be a
performance pitfall because they force large volumes of data to be moved around
the system. To speed up these queries, DBAs using these other databases create
and maintain aggregates and/or indexes across tables. In fact, some tables can
have 3, 4, 5 or even more aggregates/ indexes if they are joined to tables
using different columns. It is important to realize that these aggregates/indexes
require knowledge of the queries, reports and analytics that are going to be
run within the system, now and in the future.
Think of it
this way—if you are heading out for a Sunday drive, and have no destination in
mind, how can you use a navigation system to give you directions?
Another
issue that many of our competitors’ customers report is that they need to
create a duplicate copy of their data in a dimensional model in order to meet
their desired response times. In many cases this dimensional model will contain
many star schemas, as well as a number of copies of the fact tables containing
different levels of data. The issue with this approach is that the application
developer and the business users must be fully aware of what data these new tables
and aggregates really contain. If they aren’t aware, they can inadvertently
make a business decision based on an old snapshot of the data, or a small,
skewed sample of the data. In addition, if a user mistakenly picks the wrong
“table,” that query might take 10 or even 100 times longer to run that it would
if they completely understood the model.
Because IBM
Netezza appliances do not need indexes or pre-computed aggregates to perform
well, there is no need to tune the data model for IBM Netezza appliances. In
fact, many customers simply copy the data model directly from their old data
mart/warehouse or from their transactional system into the IBM Netezza
appliance and leave it as-is. They then simply load a single copy of the data
and start running their queries.
3 comments:
tulisan yang anda buat sangat menarik, saya juga punya tulisan yang menarik, kamu bisa kunjungi di http://repository.gunadarma.ac.id/bitstream/123456789/2979/1/78.pdf
Very interesting piece Dwaine. I've been analyzing different types of software for asset tracking to help my company improve our data management operations... This was very enlightening to read as I begin my search for best practices. Thanks for the great content.
I really appreciate information shared above. It’s of great help. If someone want to learn Online (Virtual) instructor lead live training in Big Data Hadoop and Spark Developer, kindly contact us http://www.maxmunus.com/contact
MaxMunus Offer World Class Virtual Instructor led training on Netezza Training. We have industry expert trainer. We provide Training Material and Software Support. MaxMunus has successfully conducted 100000+ trainings in India, USA, UK, Australlia, Switzerland, Qatar, Saudi Arabia, Bangladesh, Bahrain and UAE etc.
For Demo Contact us.
Sangita Mohanty
MaxMunus
E-mail: sangita@maxmunus.com
Skype id: training_maxmunus
Ph:(0) 9738075708 / 080 - 41103383
http://www.maxmunus.com/
Post a Comment