Set Up Hive

Hive is a data warehouse infrastructure that provides data summarization and ad hoc querying. The steps below set up Hive on CentOS.

  1. Install Hadoop on Master Node

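  2. Download and set up Hive

    # assumes apache-hive-0.14.0-bin.tar.gz has already been downloaded to /opt
    su - root
    cd /opt
    tar -xzf apache-hive-0.14.0-bin.tar.gz
    mv apache-hive-0.14.0-bin hive
    # give the hadoop user ownership of the Hive installation
    chown -R hadoop /opt/hive
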
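  3. Create the /tmp and /user/hive/warehouse directories in HDFS for Hive tables

    # HADOOP_PREFIX must already point to the Hadoop installation (/opt/hadoop in step 4)
    $HADOOP_PREFIX/bin/hadoop fs -mkdir /tmp
    # on Hadoop 2.x, use "-mkdir -p" here if /user/hive does not exist yet
    $HADOOP_PREFIX/bin/hadoop fs -mkdir /user/hive/warehouse
    # make both directories group-writable
    $HADOOP_PREFIX/bin/hadoop fs -chmod g+w /tmp
    $HADOOP_PREFIX/bin/hadoop fs -chmod g+w /user/hive/warehouse
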
  4. Set environment variables for Hive

    vi $HOME/.bash_profile
    # add the following export commands after the line "export PATH"
    export HADOOP_PREFIX=/opt/hadoop
    export PATH=$PATH:$HADOOP_PREFIX/bin
    export HIVE_HOME=/opt/hive
    export PATH=$PATH:$HIVE_HOME/bin

  5. Reload .bash_profile

    source $HOME/.bash_profile
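
Once the profile is reloaded, it can help to confirm that the variables from step 4 took effect before starting Hive. The following is a minimal sketch, not part of Hive or Hadoop: `check_hive_env` is a hypothetical helper name, and it only assumes the two variables and `bin` directories used in the steps above.

```shell
# Hypothetical helper (not shipped with Hive or Hadoop): reports whether
# HADOOP_PREFIX and HIVE_HOME are set, point at real directories, and have
# their bin/ subdirectory on PATH.
# The body runs in a subshell "( ... )" so its variables do not leak out;
# "exit" therefore ends only the subshell and becomes the function's status.
check_hive_env() (
    status=0
    for name in HADOOP_PREFIX HIVE_HOME; do
        eval "dir=\${$name:-}"    # look up the variable whose name is in $name
        if [ -z "$dir" ]; then
            echo "MISSING: $name is not set"
            status=1
        elif [ ! -d "$dir" ]; then
            echo "MISSING: $name=$dir is not a directory"
            status=1
        else
            echo "OK: $name=$dir"
            case ":$PATH:" in
                *":$dir/bin:"*) echo "OK: $dir/bin is on PATH" ;;
                *) echo "MISSING: $dir/bin is not on PATH"; status=1 ;;
            esac
        fi
    done
    exit "$status"
)

# Run the check; the message is printed only if something is missing.
check_hive_env || echo "Fix the items marked MISSING, then reload .bash_profile."
```

If every line reports OK, running `hive` from the same shell should start the Hive command-line interface, provided Hadoop itself is running.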


