Hadoop / Spark Ecosystem
The big data ecosystem is built from multiple tools and systems.
1) What is the difference between Sqoop and Flume?
Sqoop and Flume are both data ingestion tools, but they serve different purposes. Apache Flume works well for streaming data that is generated continuously, such as log files from multiple servers, and collects it into the Hadoop environment. Apache Sqoop works well with any RDBMS that has JDBC connectivity, transferring data in bulk between the database and Hadoop.
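To make the Sqoop side concrete, here is a minimal sketch that assembles a basic `sqoop import` command for one table. The flags (`--connect`, `--username`, `--password-file`, `--table`, `--target-dir`, `--num-mappers`) are standard Sqoop import options, but the helper name `build_sqoop_import` and the host, user, table, and paths are made up for illustration. The script only prints the command; it does not run Sqoop.

```python
from typing import List


def build_sqoop_import(jdbc_url: str, username: str, password_file: str,
                       table: str, target_dir: str,
                       num_mappers: int = 1) -> List[str]:
    """Return the argument list for a basic `sqoop import` of one table.

    Hypothetical helper: it only builds the command, it does not execute it.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC URL of the source RDBMS
        "--username", username,
        "--password-file", password_file,  # file that holds the DB password
        "--table", table,                  # source table to copy
        "--target-dir", target_dir,        # HDFS directory for the output
        "--num-mappers", str(num_mappers), # number of parallel map tasks
    ]


if __name__ == "__main__":
    # All values below are placeholders, not taken from a real cluster.
    cmd = build_sqoop_import(
        jdbc_url="jdbc:mysql://dbhost/sales",
        username="etl_user",
        password_file="/user/etl/.db_password",
        table="orders",
        target_dir="/data/raw/orders",
        num_mappers=4,
    )
    print(" ".join(cmd))
```

Flume has no equivalent one-shot command. A Flume agent is usually described in a properties file that wires sources, channels, and sinks together, and it keeps running as new events arrive.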
2) What is Hive?
- Hive is a data warehouse infrastructure tool that processes structured data in Hadoop. It was developed at Facebook to let data analysts and RDBMS users query a Hadoop cluster directly.
- Most data warehousing applications work with SQL-based query languages, and Hive offers a similar language, HiveQL.
- Features of Hive: it can accelerate queries through indexes, including bitmap indexes (index support was removed in Hive 3.0).
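As a rough illustration of the points above, the sketch below builds a few HiveQL statements as Python strings: a SQL-style aggregate query, and the pair of statements that older Hive releases (before 3.0) used to create and rebuild a bitmap index. The function names and the `page_views` table are hypothetical, and sending the statements to Hive would need a client that is not shown here.

```python
from typing import List


def count_by_column(table: str, column: str) -> str:
    """HiveQL aggregate query; it reads like ordinary SQL."""
    return f"SELECT {column}, COUNT(*) AS n FROM {table} GROUP BY {column}"


def bitmap_index_statements(table: str, column: str,
                            index_name: str) -> List[str]:
    """Statements that create and then build a bitmap index (Hive < 3.0).

    Identifiers are inserted as-is, so pass trusted names only.
    """
    create = (
        f"CREATE INDEX {index_name} ON TABLE {table} ({column}) "
        "AS 'BITMAP' WITH DEFERRED REBUILD"
    )
    # With DEFERRED REBUILD the index stays empty until it is rebuilt.
    rebuild = f"ALTER INDEX {index_name} ON {table} REBUILD"
    return [create, rebuild]


if __name__ == "__main__":
    # `page_views` and `country` are made-up names for the example.
    print(count_by_column("page_views", "country"))
    for stmt in bitmap_index_statements("page_views", "country",
                                        "idx_country"):
        print(stmt)
```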