Sqoop Introduction
What is Sqoop?
- Sqoop is an open-source tool from Apache. The name stands for "SQL to Hadoop".
- It is designed specifically to transfer data between Hadoop and relational database management systems (RDBMS) such as SQL Server, MySQL, and Oracle.
- Sqoop is a command-line tool: the import command transfers data from an RDBMS into Hadoop, and the export command transfers data from Hadoop back into an RDBMS.
- Within the Hadoop ecosystem, Sqoop provides the link between a relational database server and Hadoop’s HDFS.
- Sqoop (“SQL-to-Hadoop”) is a straightforward command-line tool with the following capabilities:
- Imports individual tables or entire databases to files in HDFS
- Generates Java classes to allow you to interact with your imported data
- Provides the ability to import from SQL databases straight into your Hive data warehouse
- After setting up an import job in Sqoop, you can get started working with SQL database-backed data from your Hadoop MapReduce cluster in minutes.
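The import capabilities listed above can be sketched as a small Python helper that assembles a `sqoop import` command line. This is a minimal sketch under stated assumptions, not a definitive invocation: the JDBC URL, user name, table, and HDFS path are hypothetical placeholders, and the helper name `build_import_command` is invented for illustration. The script only builds and prints the commands; nothing is executed.

```python
import shlex


def build_import_command(jdbc_url, username, table, target_dir=None, hive=False):
    """Return the argument list for a `sqoop import` of a single table."""
    argv = [
        "sqoop", "import",
        "--connect", jdbc_url,   # JDBC connection string of the source database
        "--username", username,
        "-P",                    # ask for the password interactively
        "--table", table,
    ]
    if hive:
        # Load the imported table into the Hive warehouse.
        argv.append("--hive-import")
    elif target_dir is not None:
        # HDFS directory that receives the imported files.
        argv += ["--target-dir", target_dir]
    return argv


# Hypothetical connection details; replace them with your own.
hdfs_cmd = build_import_command(
    "jdbc:mysql://db.example.com/shop", "etl_user", "orders",
    target_dir="/user/hadoop/orders",
)
hive_cmd = build_import_command(
    "jdbc:mysql://db.example.com/shop", "etl_user", "orders", hive=True,
)

print(shlex.join(hdfs_cmd))
print(shlex.join(hive_cmd))
```

On a machine with Sqoop installed, either list could be passed to `subprocess.run(argv, check=True)` to start the job.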
How Does Sqoop Work?
- The following image describes the workflow of Sqoop.
[Figure: Sqoop workflow]
Sqoop Import:
- The import tool imports individual tables from an RDBMS into HDFS. Each row of a table is treated as a record in HDFS.
- All records are stored either as text data in text files or as binary data in Avro files and SequenceFiles.
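The choice between text and binary storage described above is made with a flag on the import command. The following minimal sketch maps each storage format to the corresponding Sqoop flag. The helper name `build_import_with_format`, the JDBC URL, the table, and the HDFS paths are hypothetical, and credential options are left out for brevity. The script only prints the commands it builds.

```python
import shlex

# Sqoop import flags that select the on-disk format of the imported records.
FORMAT_FLAGS = {
    "text": "--as-textfile",          # delimited text, Sqoop's default
    "avro": "--as-avrodatafile",      # binary Avro data files
    "sequence": "--as-sequencefile",  # binary Hadoop SequenceFiles
}


def build_import_with_format(jdbc_url, table, target_dir, file_format="text"):
    """Return a `sqoop import` argument list that stores records in the given format."""
    if file_format not in FORMAT_FLAGS:
        raise ValueError(f"unsupported format: {file_format!r}")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        FORMAT_FLAGS[file_format],
    ]


# Hypothetical source database and HDFS locations.
for fmt in ("text", "avro", "sequence"):
    cmd = build_import_with_format(
        "jdbc:mysql://db.example.com/shop", "orders", f"/user/hadoop/orders_{fmt}", fmt
    )
    print(shlex.join(cmd))
```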
Sqoop Export:
- The export tool exports a set of files from HDFS back to an RDBMS. The input files contain records, each of which becomes a row in the target table.
- Sqoop reads the files and parses them into records, splitting the fields on a user-specified delimiter.
- In short, Sqoop transfers data between Hadoop and relational database servers: it imports data from relational databases such as MySQL and Oracle into HDFS, and exports data from HDFS back to relational databases.
- This brief tutorial explains how to use Sqoop in the Hadoop ecosystem.
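The export step described above can be sketched the same way. The example below assembles a `sqoop export` command line and, separately, illustrates how one delimited line corresponds to the column values of a row. Everything specific here is an assumption made for illustration: the helper names `build_export_command` and `parse_record`, the JDBC URL, the user name, the table, and the HDFS directory. `parse_record` is a deliberate simplification and does not reproduce Sqoop's own parser, which also handles enclosing and escape characters.

```python
import shlex


def build_export_command(jdbc_url, username, table, export_dir, delimiter=","):
    """Return the argument list for a `sqoop export` of HDFS files into one table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "-P",                                       # ask for the password interactively
        "--table", table,                           # target table; it must already exist
        "--export-dir", export_dir,                 # HDFS directory holding the input files
        "--input-fields-terminated-by", delimiter,  # field separator used in those files
    ]


def parse_record(line, delimiter=","):
    """Split one delimited line into the column values of a row (simplified)."""
    return line.rstrip("\n").split(delimiter)


# Hypothetical connection details and paths; replace them with your own.
export_cmd = build_export_command(
    "jdbc:mysql://db.example.com/shop", "etl_user", "orders_summary",
    "/user/hadoop/orders_summary",
)
print(shlex.join(export_cmd))
print(parse_record("1001,2024-01-15,49.90\n"))
```

Building the command as an argument list rather than a single string keeps the delimiter a separate argument, so a delimiter such as a tab or a pipe character needs no manual shell quoting if the list is later handed to `subprocess.run`.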
Prerequisites
- The following prerequisite knowledge is required for this tutorial:
- Basic computer technology and terminology
- Familiarity with command-line interfaces such as bash
- Relational database management systems
- Basic familiarity with the purpose and operation of Hadoop