Monday, July 16, 2012

Adding a 4th V to BIG Data - Veracity


I talked a week or so ago about IBM’s 3 V’s of Big Data. Maybe it is time to add a 4th V, for Veracity.

Veracity deals with uncertain or imprecise data. In traditional data warehouses there was always the assumption that the data was certain, clean, and precise. That is why so much time was spent on ETL/ELT, Master Data Management, Data Lineage, Identity Insight/Assertion, etc.

However, when we start talking about social media data like Tweets, Facebook posts, etc., how much faith can or should we put in the data? Sure, this data can count toward a sentiment measure, but you would not count it toward your total sales and report on that.

Two of the now 4 V’s of Big Data are actually working against the Veracity of the data. Both Variety and Velocity hinder the ability to cleanse the data before analyzing it and making decisions.

Due to the sheer velocity of some data (like stock trades, or machine/sensor-generated events), you cannot spend the time to “cleanse” it and get rid of the uncertainty, so you must process it as is - understanding the uncertainty in the data. And as you bring multi-structured data together, determining the origin of the data and which fields correlate becomes nearly impossible.
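As a rough sketch of what "process it as is" could look like, here is a running average over sensor events that weights each reading by how much it is trusted instead of cleansing it first. The `Reading` and `WeightedMean` names and the `confidence` score are assumptions made up for illustration; they do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor_id: str
    value: float
    confidence: float  # hypothetical trust score between 0 and 1


class WeightedMean:
    """Running confidence-weighted mean, updated one event at a time."""

    def __init__(self):
        self.weighted_sum = 0.0
        self.total_weight = 0.0

    def update(self, reading):
        # No cleansing step: an uncertain reading simply counts for less.
        self.weighted_sum += reading.value * reading.confidence
        self.total_weight += reading.confidence

    @property
    def mean(self):
        if self.total_weight == 0:
            return None  # nothing trustworthy has been seen yet
        return self.weighted_sum / self.total_weight


agg = WeightedMean()
for event in [Reading("a", 20.0, 1.0), Reading("b", 40.0, 0.5)]:
    agg.update(event)
print(agg.mean)
```

The less trusted reading pulls the result less than it would in a plain average, so the uncertainty is carried into the answer rather than removed up front.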

When we talk about Big Data, I think we need to define trusted data differently than we have in the past. I believe that the definition of trusted data depends on the way you are using the data and applying it to your business. The “trust” you have in the data will also influence the value of the data, and the impact of the decisions you make based on that data.
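One way to picture trust that depends on use is a minimum trust level per use case. The sketch below is only an illustration: the use-case names, the threshold numbers, and the `usable_for` helper are all hypothetical.

```python
# Hypothetical minimum trust required for each use of the data.
# The numbers are illustrative, not recommendations.
MIN_TRUST = {
    "sentiment_trend": 0.3,   # rough signals are acceptable
    "sales_reporting": 0.99,  # figures must be close to certain
}


def usable_for(use_case, source_trust):
    """Return True if a source's trust score meets the bar for this use."""
    return source_trust >= MIN_TRUST[use_case]


# Assumed trust scores for two kinds of sources.
print(usable_for("sentiment_trend", 0.4))    # social media feed
print(usable_for("sales_reporting", 0.4))    # same feed, stricter use
print(usable_for("sales_reporting", 0.995))  # order system
```

The same source can be good enough for one purpose and not for another, which is the point: trust is judged against the decision being made.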

14 comments:

fajar uno said...

The article you wrote is very interesting. I have an interesting article too; you can visit it at http://repository.gunadarma.ac.id/bitstream/123456789/2979/1/78.pdf

Doug Laney said...

Great to see others finally coming to appreciate the "V"s of Big Data that Gartner first defined over 15 years ago, albeit without the professional courtesy of attributing them to us. Note however that only the original 3Vs I first identified back then are definitional qualities of Big Data. Other "V"s that people (cleverly?) add are not measures of magnitude. And value is an aspirational attribute at that. To see my original 2001 piece on the 3Vs: http://goo.gl/wH3qG. To see what Batman thinks of those being cute by adding other Vs: http://blogs.gartner.com/doug-laney/batman-on-big-data/. --Doug Laney, VP Research, Gartner, @doug_laney
