How to import data into Hive warehouse from SQL Server 2014 (Unicode) for specific schema


I want to import data from SQL Server and query it from Hive.

I created a VirtualBox VM using the Cloudera template and also started reading its tutorial.

I am able to successfully import data from SQL Server as Avro files using sqoop, then create a table in Hive, load the data from the Avro files, and query it from Hive.

But sqoop's import-all-tables command only imports tables in the "dbo" schema. What if I also want to import tables in the dw schema? I tried using the import command to import a specific table that exists in the dw schema, but that doesn't work either.

Any idea how to import tables in a non-dbo schema from SQL Server as Avro using sqoop? Or how to import data from SQL Server for a schema other than dbo and load it directly into Hive?

1 Solution

#1



Download the JDBC driver and copy it to the sqoop directory

$ curl -L 'http://download.microsoft.com/download/0/2/A/02AAE597-3865-456C-AE7F-613F99F850A8/sqljdbc_4.0.2206.100_enu.tar.gz' | tar xz
$ sudo cp sqljdbc_4.0/enu/sqljdbc4.jar /var/lib/sqoop/
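
If you want to double-check that the driver is where sqoop will pick it up (this just re-lists the /var/lib/sqoop path used in the copy above):

$ ls -l /var/lib/sqoop/sqljdbc4.jar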

Import the table from SQL Server using sqoop

sqoop import --driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" \
  --connect="jdbc:sqlserver://sqlserver;database=databasename;username=username;password=passwordofuserprovidedinusername" \
  --username=username --password=passwordofuserprovidedinusername \
  --table="schemaname.tablename" --split-by=primarykeyoftable \
  --compression-codec=snappy --as-avrodatafile \
  --warehouse-dir=/user/hive/warehouse/tablename
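
The schema-qualified --table value is what handles the non-dbo case. As a sketch, assuming a hypothetical table dw.FactSales with primary key SalesKey (the server, database, and credentials below are placeholders), the call would look like:

sqoop import --driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" \
  --connect="jdbc:sqlserver://myserver;database=mydb;username=myuser;password=mypass" \
  --username=myuser --password=mypass \
  --table="dw.FactSales" --split-by=SalesKey \
  --compression-codec=snappy --as-avrodatafile \
  --warehouse-dir=/user/hive/warehouse/factsales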

Verify that the table was imported properly

hadoop fs -ls /user/hive/warehouse
ls -l *.avsc

Create a new directory and give it appropriate permissions

sudo -u hdfs hadoop fs -mkdir /user/examples
sudo -u hdfs hadoop fs -chmod +rw /user/examples
hadoop fs -copyFromLocal ~/*.avsc /user/examples
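
To confirm the schema files landed in HDFS before creating the Hive table, you can list the directory created above:

hadoop fs -ls /user/examples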

Start hive

hive

Import the table schema and data into the Hive warehouse

CREATE EXTERNAL TABLE tablename
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hive/warehouse/tablename'
TBLPROPERTIES ('avro.schema.url'='hdfs://quickstart.cloudera/user/examples/sqoop_import_schemaname_tablename.avsc');
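
Once the external table exists, a quick sanity check from the hive prompt (tablename being the same placeholder used in the CREATE TABLE above) is:

SELECT COUNT(*) FROM tablename;
SELECT * FROM tablename LIMIT 10;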

Note: if you copy and paste the command, the single quotes may be converted to curly quotes, so retype them if needed. There should not be any spaces in paths or filenames.
