Hadoop Hive Introduction




Hadoop Hive Overview: Hadoop Hive is very similar to Apache Pig. It lets you create tables and load external files into them using SQL, and it then turns those statements into MapReduce jobs that would otherwise be written by hand in Java. Java is a very wordy language, so using Pig or Hive is simpler. Some have said that Hadoop Hive is a data warehouse tool (Bluntly put, […]


Steps to change hadoop hive default metastore Derby DB to MySQL DB




Step 1: Install and start MySQL. Step 2: Configure the MySQL service and connector. Download the mysql-connector-java-5.0.5.jar file and copy it to the $HIVE_HOME/lib directory. Step 3: Create the database and user. Create a metastore_db database in MySQL as the root user: $ mysql -u root -p Enter password: mysql> CREATE […]


HBase Integration with Hadoop Hive




In this post, we discuss the setup needed for HBase integration with Hive. We then test the integration by creating some test HBase tables from the Hive shell, populating them from another Hive table, and finally verifying the contents in the HBase table. Reasons to use Hadoop Hive […]


Sqoop Hive Use Case Example




This is another use case covering Sqoop and Hive concepts. Problem Statement: There are about 35,000 crime incidents that happened in the city of San Francisco in the last 3 months. Our task is to store this relational data in an RDBMS and then use Sqoop to import it into Hadoop. Can we […]


Custom simple eval UDFs in Pig and Hive




1.0. What’s in this blog? A demonstration of creating a custom simple eval UDF, in both Pig and Hive, that mimics the NVL2 function from the DBMS world. It includes sample data, Java code for creating the UDF, expected results, the commands to execute, and the output. About NVL2: NVL2 takes three parameters; we will […]


Hadoop Hive UDF Part 2: Custom GenericUDF in Hive (NVL2)




1.0. What’s in this blog? In my previous blog on creating custom UDFs in Hadoop Hive, I covered a basic sample UDF. This blog covers creating a generic UDF that mimics the same NVL2 functionality covered in the previous blog. It includes sample data, Java code for creating the UDF, expected results, […]
