12-13 Hadoop Subprojects
Slides: https://webeep.polimi.it/mod/resource/view.php?id=52306
HBase
HBase is a key-value row/column store modeled on Google’s Bigtable, providing Bigtable-like capab...
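To make the "key-value row/column store" idea concrete, here is a minimal in-memory sketch of the data model only. It is not HBase's API: the class name `TinyRowColumnStore`, its methods, and the `family:qualifier` column strings are all hypothetical, and real HBase features such as versioning, regions, and persistence are left out.

```python
from collections import defaultdict


class TinyRowColumnStore:
    """Toy model (assumption, not HBase code): row key -> column -> value."""

    def __init__(self):
        # Each row is a small dict of column name -> value.
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        """Store a single cell, overwriting any previous value."""
        self._rows[row_key][column] = value

    def get(self, row_key, column=None):
        """Return one cell, or a copy of the whole row if no column is given."""
        row = self._rows.get(row_key, {})
        return dict(row) if column is None else row.get(column)

    def scan(self, start_row="", stop_row=None):
        """Yield (row_key, row) pairs in sorted key order, stop_row exclusive."""
        for key in sorted(self._rows):
            if key < start_row:
                continue
            if stop_row is not None and key >= stop_row:
                break
            yield key, dict(self._rows[key])


store = TinyRowColumnStore()
store.put("user#2", "info:name", "Bob")
store.put("user#1", "info:name", "Alice")
store.put("user#1", "info:email", "alice@example.com")

print(store.get("user#1", "info:name"))
for row_key, row in store.scan():
    print(row_key, row)
```

The point of the sketch is that lookups go through the row key first and the column second, and that scans walk rows in key order, which is why row-key design matters in stores of this kind.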
Pig
Apache Pig is a high-level platform for creating programs that run on Apache Hadoop. The language...
Hive
Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data...
Impala
Apache Impala is an open-source massively parallel processing (MPP) SQL query engine for data sto...
Storm
Apache Storm is a distributed stream processing computation framework. Storm provides realtime co...
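As a rough illustration of what stream processing means, the sketch below handles records one at a time as they arrive and emits an updated result after each one, instead of waiting for a complete batch. It runs in a single process and does not use Storm's API; the function names `split_words` and `running_count` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def split_words(lines: Iterable[str]) -> Iterator[str]:
    """First stage: turn each incoming line into a stream of lowercase words."""
    for line in lines:
        for word in line.lower().split():
            yield word


def running_count(words: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Second stage: emit the updated count of each word as soon as it is seen."""
    counts = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


# The input could be an unbounded source; a short list stands in for it here.
incoming = ["storm hadoop", "storm"]
for word, count in running_count(split_words(incoming)):
    print(word, count)
```

Because both stages are generators, each word flows through the whole pipeline before the next line is read, which mirrors the record-at-a-time style of a stream processor in miniature.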
Flume
Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggre...
Sqoop
Sqoop is a command-line interface application for transferring data between relational databases ...