Setting up a Hadoop standalone (single-machine) environment


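I. Install the JDK

Download the JDK version you need and install it to a directory of your choice; mine is /usr/lib/jdk1.8.0_40.

Once it is installed, set the environment variables in /etc/profile by appending the following to the end of the file: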

export JAVA_HOME=/usr/lib/jdk1.8.0_40
export JRE_HOME=/usr/lib/jdk1.8.0_40/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
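
After reloading the profile (for example with source /etc/profile or a new login shell), a quick check can confirm that the JDK path is right. The following is only a sketch: check_java_home is a hypothetical helper name, and the fallback path is simply the install directory used in this post.

```shell
# check_java_home is a hypothetical helper, not part of the JDK or Hadoop.
# It reports whether an executable bin/java exists under the given directory.
check_java_home() {
  dir="$1"
  if [ -x "$dir/bin/java" ]; then
    echo "ok: $dir/bin/java"
  else
    echo "missing: $dir/bin/java"
    return 1
  fi
}

# Fall back to the install directory used in this post when JAVA_HOME is unset.
check_java_home "${JAVA_HOME:-/usr/lib/jdk1.8.0_40}" || echo "fix JAVA_HOME in /etc/profile"
```

If the check reports a missing file, the most likely cause is a typo in the JAVA_HOME line or a JDK unpacked to a different directory.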

 

II. Install ssh: sudo apt-get install ssh

Hadoop mainly uses ssh to manage remote daemons. Since this is standalone mode, it is not essential here.

III. Download Hadoop

http://hadoop.apache.org/releases.html

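I recommend downloading a stable release. I downloaded Hadoop 2.6.4 and placed it under /usr/local/.

Hadoop is an Apache project written in Java, so it needs a Java runtime. After downloading it, point Hadoop at the JDK through JAVA_HOME and add Hadoop's own directories to the environment.

1 Edit ~/.bashrc to set the environment variables for the user that runs Hadoop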

export JAVA_HOME=/usr/lib/jdk1.8.0_40

export HADOOP_INSTALL=/usr/local/hadoop-2.6.4

export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"

After saving the file, run

source ~/.bashrc

so that the new environment variables take effect.
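
One side effect of plain PATH=$PATH:... lines is that every additional source ~/.bashrc appends the same directories again. A possible variant, shown only as a sketch, guards each append; path_append is a hypothetical helper name and the default HADOOP_INSTALL value is the directory used in this post.

```shell
# path_append is a hypothetical helper: append a directory to PATH only if absent.
path_append() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already on PATH, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
}

# Default mirrors the install directory used in this post.
HADOOP_INSTALL="${HADOOP_INSTALL:-/usr/local/hadoop-2.6.4}"
path_append "$HADOOP_INSTALL/bin"
path_append "$HADOOP_INSTALL/sbin"
export PATH
```

Wrapping PATH in colons before matching keeps a directory such as /usr/local/hadoop-2.6.4/bin from being mistaken for a longer entry that merely starts with the same text.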

2 Configure Hadoop

Open hadoop-env.sh under /usr/local/hadoop-2.6.4/etc/hadoop/ and set the following:

export JAVA_HOME=/usr/lib/jdk1.8.0_40
export JRE_HOME=/usr/lib/jdk1.8.0_40/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
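
To confirm that the edit was saved and is not commented out, the file can be inspected from the shell. This is a minimal sketch: java_home_line is a hypothetical helper name, and the file path assumes the install directory used in this post.

```shell
# java_home_line is a hypothetical helper: print the last uncommented
# "export JAVA_HOME=..." line found in the given file, if any.
java_home_line() {
  grep -E '^[[:space:]]*export[[:space:]]+JAVA_HOME=' "$1" | tail -n 1
}

conf="${HADOOP_INSTALL:-/usr/local/hadoop-2.6.4}/etc/hadoop/hadoop-env.sh"
if [ -f "$conf" ]; then
  java_home_line "$conf"
else
  echo "not found: $conf"
fi
```

Only the last matching line is printed because a later export in the same file overrides an earlier one.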

At this point Hadoop standalone mode is configured.

From the Hadoop directory, run

./bin/hadoop version

You should see output like the following:

Hadoop 2.6.4
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 5082c73637530b0b7e115f9625ed7fac69f937e6
Compiled by jenkins on 2016-02-12T09:45Z
Compiled with protoc 2.5.0
From source with checksum 8dee2286ecdbbbc930a6c87b65cbc010
This command was run using /usr/local/hadoop-2.6.4/share/hadoop/common/hadoop-common-2.6.4.jar

This shows that Hadoop is set up correctly.
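
If the check needs to be scripted, the version number can be pulled from the first line of that output. The sketch below assumes the "Hadoop X.Y.Z" first-line format shown above; hadoop_version_from is a hypothetical helper name.

```shell
# hadoop_version_from is a hypothetical helper: read `hadoop version` output
# on stdin and print the version number from the first line ("Hadoop X.Y.Z").
hadoop_version_from() {
  awk 'NR == 1 && $1 == "Hadoop" { print $2 }'
}

if command -v hadoop >/dev/null 2>&1; then
  hadoop version | hadoop_version_from
else
  echo "hadoop is not on PATH"
fi
```

The "not on PATH" branch usually means ~/.bashrc has not been sourced in the current shell.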

 

 

Next, run the grep example that ships with Hadoop as a check.

1 In the Hadoop directory, create an input folder and copy the configuration files from etc/hadoop into it as test data

mkdir input
cp etc/hadoop/* input/

2 Run the program and count the matches

In the Hadoop directory, run

./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.4.jar grep input output '[a-z.]+'

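This uses the examples jar to find every match of the regular expression [a-z.]+ (runs of lowercase letters and dots) in the files under input, count how often each distinct match occurs, and write the counts to output.

The job prints run information like the following: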

    File System Counters
        FILE: Number of bytes read=632564
        FILE: Number of bytes written=1415622
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Map-Reduce Framework
        Map input records=1151
        Map output records=1151
        Map output bytes=22396
        Map output materialized bytes=24704
        Input split bytes=126
        Combine input records=0
        Combine output records=0
        Reduce input groups=70
        Reduce shuffle bytes=24704
        Reduce input records=1151
        Reduce output records=1151
        Spilled Records=2302
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=0
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=667942912
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=32250
    File Output Format Counters
        Bytes Written=15798

This shows that the job ran successfully.
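
Besides reading the counters, success can be checked from a script. With default settings, MapReduce's file output committer should leave an empty _SUCCESS marker file in the output directory when a job completes; the sketch below relies on that assumption, and job_succeeded is a hypothetical helper name.

```shell
# job_succeeded is a hypothetical helper: true if the output directory
# contains the _SUCCESS marker that MapReduce writes by default.
job_succeeded() {
  [ -e "$1/_SUCCESS" ]
}

if job_succeeded output; then
  echo "job finished successfully"
else
  echo "no _SUCCESS marker in output/"
fi
```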

View the results:

cat output/*
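
To get a feel for what these counts mean, roughly the same computation can be done locally with standard tools. This is an illustration of the idea, not how the Hadoop example is implemented; count_matches is a hypothetical helper name, and the ordering of entries with equal counts may differ from Hadoop's output.

```shell
# count_matches is a hypothetical helper: read text on stdin, extract every
# match of [a-z.]+, and print "count<TAB>match" with the largest counts first.
count_matches() {
  LC_ALL=C grep -oE '[a-z.]+' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $1 "\t" $2 }' \
    | LC_ALL=C sort -k1,1nr -k2,2
}

# Small inline sample; "dfs.replication" occurs twice and "value" once.
printf 'dfs.replication value\ndfs.replication\n' | count_matches
```

Running cat input/* | count_matches in the Hadoop directory gives a list that can be compared against cat output/*.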

 

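To run the job again, first delete the output folder with rm -r output/; Hadoop will not run the job while the output directory already exists.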
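
When rerunning often, the cleanup can be wrapped so that it does nothing on the first run, when there is no output folder yet. A minimal sketch, with reset_output as a hypothetical helper name:

```shell
# reset_output is a hypothetical helper: delete a previous output directory
# if it exists, so the job can create it again.
reset_output() {
  if [ -d "$1" ]; then
    rm -r "$1"
  fi
}

# Run from the Hadoop directory, then rerun the ./bin/hadoop jar command.
reset_output output
```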

 

