What is Big Data?
As the name suggests, big data is the huge volume of data generated by IoT devices, apps, and real-time application data collection. For example:
• Walmart handles more than 1 million customer transactions every hour.
• Facebook stores, accesses, and analyzes 30+ Petabytes of user generated data.
• More than 230 million tweets are created every day.
• More than 5 billion people are calling, texting, tweeting and browsing on mobile phones worldwide.
The three different formats of big data are:
1. Structured: Organised data format with a fixed schema. Ex: RDBMS
2. Semi-Structured: Partially organised data which does not have a fixed format. Ex: XML, JSON
3. Unstructured: Unorganised data with an unknown schema. Ex: Audio, video files etc.
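The three formats above can be illustrated with a small Python sketch. The sample values (order rows, user records, raw bytes) are made up for illustration and do not come from any real system; `describe` is a hypothetical helper.

```python
import csv
import io
import json

# Structured: every row follows the same fixed schema (columns known up front).
# Hypothetical sample data, shaped like a table exported from an RDBMS.
structured_csv = "order_id,customer,amount\n1,alice,19.99\n2,bob,5.00\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: records are self-describing (keys travel with the data),
# but two records do not have to share the same fields.
semi_structured = (
    '[{"user": "alice", "tags": ["a", "b"]},'
    ' {"user": "bob", "location": {"city": "Pune"}}]'
)
records = json.loads(semi_structured)

# Unstructured: raw bytes with no schema; meaning has to be extracted
# by a specialised decoder (audio, video, image, free text).
unstructured = bytes([0x52, 0x49, 0x46, 0x46, 0x00, 0x01])


def describe():
    """Summarise what can be known about each format without extra tooling."""
    return {
        "structured_columns": list(rows[0].keys()),
        "semi_structured_keys": [sorted(r.keys()) for r in records],
        "unstructured_size_bytes": len(unstructured),
    }


print(describe())
```

Note how the structured data exposes one column list for all rows, the semi-structured records each carry their own keys, and the unstructured blob offers nothing beyond its size.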
The core challenges that big data addresses are the 3 Vs:
- Volume
- Velocity
- Variety
It enables predictive and prescriptive analytics.
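As a rough illustration of the difference between the two kinds of analytics: a predictive step forecasts what is likely to happen, and a prescriptive step recommends what to do about it. The sketch below assumes a simple linear trend and invented weekly sales figures; the function names (`fit_line`, `predict_demand`, `recommend_order`) are hypothetical and not part of any big data library.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_demand(history, next_period):
    """Predictive: estimate demand for a future period from past periods."""
    xs = list(range(1, len(history) + 1))
    slope, intercept = fit_line(xs, history)
    return slope * next_period + intercept


def recommend_order(predicted_demand, stock_on_hand):
    """Prescriptive: turn the forecast into an action (units to reorder)."""
    return max(0, round(predicted_demand - stock_on_hand))


# Hypothetical weekly sales for weeks 1 to 4.
weekly_sales = [100, 110, 120, 130]
forecast = predict_demand(weekly_sales, 5)
order = recommend_order(forecast, stock_on_hand=60)
print(forecast, order)
```

Real workloads would use far larger datasets and richer models, but the split between forecasting and recommending stays the same.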
Hadoop implementations
- Amazon EMR
- MapR
- Cloudera (merged with Hortonworks), managed through Cloudera Manager (earlier CDH, now CDP)
- Azure HDInsight
Apache Oozie is a workflow scheduler used to orchestrate jobs across the Hadoop tech stack.
