Step 4 - Installing and verifying Sqoop
Sqoop has changed rapidly in form and features since the release of the Hadoop platform. As mentioned before, the Sqoop framework currently comes in two generations: Sqoop and Sqoop 2. Sqoop is the older generation of the ETL framework in the Hadoop world, and it is feature complete. Sqoop 2 is the more recent generation, with a REST-based interface, but it is not yet complete. For this reason, we'll cover only the installation of Sqoop (version 1.4.6) here.
Sqoop binary packages can be downloaded and extracted as given below:
- Download the Sqoop binary package built for Hadoop 2 with the following command:
wget https://www-eu.apache.org/dist/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
- Extract this tarball in any user directory with the following command. We'll refer to the extracted directory as ${SQOOP_HOME}:
tar -zxvf <DOWNLOAD_LOCATION>/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
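- Set the SQOOP_HOME environment variable and extend PATH with the following commands, replacing <SQOOP_INSTALL_DIR> with the absolute path of the extracted directory, and add the same lines to ~/.bashrc:
export SQOOP_HOME=<SQOOP_INSTALL_DIR>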
export PATH=$PATH:$SQOOP_HOME/bin
- Verify the installation by invoking sqoop help (or sqoop version) in the CentOS shell.
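The environment step above can be made repeatable with a small helper. The following is a minimal sketch and is not part of the Sqoop distribution: the function name add_sqoop_env and the example path /opt/sqoop are hypothetical, chosen only for illustration. It appends the two export lines to a startup file and does nothing if a SQOOP_HOME line is already present.

```shell
# Append the Sqoop environment settings to a shell startup file, but only once.
# Arguments: $1 = Sqoop install directory (hypothetical example: /opt/sqoop)
#            $2 = startup file to edit (defaults to ~/.bashrc)
add_sqoop_env() {
  sqoop_dir="$1"
  rc_file="${2:-$HOME/.bashrc}"

  # Skip if an earlier run already added a SQOOP_HOME line.
  if grep -q '^export SQOOP_HOME=' "$rc_file" 2>/dev/null; then
    return 0
  fi

  # Single quotes keep $PATH and $SQOOP_HOME literal, so they are
  # expanded when the startup file is sourced, not when it is written.
  {
    printf 'export SQOOP_HOME="%s"\n' "$sqoop_dir"
    printf 'export PATH=$PATH:$SQOOP_HOME/bin\n'
  } >> "$rc_file"
}
```

Calling add_sqoop_env /opt/sqoop and then running source ~/.bashrc applies the settings to the current shell. Running the helper a second time leaves the file unchanged.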
Alternatively, to build and install Sqoop from source, follow these steps:
- Download the Sqoop source package with the following command:
wget http://www-eu.apache.org/dist/sqoop/1.4.6/sqoop-1.4.6.tar.gz
- Extract this tarball in any user directory with the following command. We'll refer to the extracted directory as ${SQOOP_HOME}:
tar -zxvf <DOWNLOAD_LOCATION>/sqoop-1.4.6.tar.gz
- Follow the instructions in COMPILING.txt to compile the source code. Alternatively, in a shell session on the machine, change into ${SQOOP_HOME} and run the following command:
ant release
- During the build, if any executables are reported missing, install them using the package installer (yum install <required package>). The ones we encountered on CentOS 7 were AsciiDoc, LSB, and xmlto.
- After the build, copy the bin folder of the Sqoop source into build/bin, and then copy the build folder to an independent folder that will serve as the installation directory.
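- Set the SQOOP_HOME environment variable and extend PATH with the following commands, replacing <SQOOP_INSTALL_DIR> with the absolute path of the independent folder that holds the copied build, and add the same lines to ~/.bashrc:
export SQOOP_HOME=<SQOOP_INSTALL_DIR>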
export PATH=$PATH:$SQOOP_HOME/bin
- Verify the installation by invoking sqoop help (or sqoop version) in the CentOS shell.
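Before invoking sqoop itself, it can help to confirm that the environment is wired up as the steps above describe. The following is a minimal sketch under stated assumptions: the function name check_sqoop_install is hypothetical, and the only things it checks are the ones this section sets up, namely that an install directory is given (or SQOOP_HOME is set), that it contains an executable bin/sqoop launcher, and that its bin folder is on PATH.

```shell
# Sanity-check a Sqoop install directory before running sqoop.
# Argument: $1 = install directory (defaults to $SQOOP_HOME)
check_sqoop_install() {
  dir="${1:-${SQOOP_HOME:-}}"

  if [ -z "$dir" ]; then
    echo "SQOOP_HOME is not set" >&2
    return 1
  fi

  # The launcher script is expected at bin/sqoop under the install directory.
  if [ ! -x "$dir/bin/sqoop" ]; then
    echo "no executable launcher at $dir/bin/sqoop" >&2
    return 1
  fi

  # Wrap PATH in colons so the match works for first, middle, and last entries.
  case ":$PATH:" in
    *":$dir/bin:"*) ;;
    *)
      echo "$dir/bin is not on PATH" >&2
      return 1
      ;;
  esac

  echo "ok: $dir"
}
```

Running check_sqoop_install with no arguments in a new shell shows whether the lines added to ~/.bashrc took effect. A failure message points at the step that needs to be repeated.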