Winse Blog

Stop and go, hustle and bustle, always busy, not knowing what to fear.

Hadoop Installation and Upgrade (3): HA Configuration

The official documentation [HDFSHighAvailabilityWithQJM.html] is very detailed, but it lacks an end-to-end worked example. This post records the steps I actually ran.

Configuration

Set up passwordless SSH login between hadoop-master1 and hadoop-master2 (failover needs it):

[hadoop@hadoop-master2 hadoop-2.2.0]$ ssh-keygen
[hadoop@hadoop-master2 hadoop-2.2.0]$ ssh-copy-id hadoop-master2
[hadoop@hadoop-master2 hadoop-2.2.0]$ ssh-copy-id hadoop-master1
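
Since sshfence relies on this login working without a prompt, it is worth checking before moving on. Below is a minimal sketch; the function name check_passwordless_ssh is my own (hypothetical), and it assumes the hadoop user and the default key location. BatchMode=yes makes ssh fail instead of asking for a password.

```shell
# Hypothetical helper: report whether each host accepts key-based login.
# BatchMode=yes disables password prompts, so a missing key shows up as FAIL.
check_passwordless_ssh() {
  local host
  local rc=0
  for host in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null 2>/dev/null; then
      echo "OK $host"
    else
      echo "FAIL $host"
      rc=1
    fi
  done
  return "$rc"
}
```

Run `check_passwordless_ssh hadoop-master1 hadoop-master2` once on each master, because either NameNode may need to fence the other.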

Edit the configuration files:

[hadoop@hadoop-master1 hadoop-2.2.0]$ vi etc/hadoop/core-site.xml 

<property>
<name>fs.defaultFS</name>
<value>hdfs://zfcluster</value>
</property>

<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-master1</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/data/tmp</value>
</property>

[hadoop@hadoop-master1 hadoop-2.2.0]$ vi etc/hadoop/hdfs-site.xml 

<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<name>dfs.namenode.secondary.http-address</name>
<value> </value>
</property>

<property>
<name>dfs.nameservices</name>
<value>zfcluster</value>
</property>

<property>
<name>dfs.ha.namenodes.zfcluster</name>
<value>nn1,nn2</value>
</property>

<property>
<name>dfs.namenode.rpc-address.zfcluster.nn1</name>
<value>hadoop-master1:8020</value>
</property>

<property>
<name>dfs.namenode.rpc-address.zfcluster.nn2</name>
<value>hadoop-master2:8020</value>
</property>

<property>
<name>dfs.namenode.http-address.zfcluster.nn1</name>
<value>hadoop-master1:50070</value>
</property>

<property>
<name>dfs.namenode.http-address.zfcluster.nn2</name>
<value>hadoop-master2:50070</value>
</property>

<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-master1:8485/zfcluster</value>
</property>

<property>
<name>dfs.client.failover.proxy.provider.zfcluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/journal</value>
</property>

<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>

<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>

<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
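
The nameservice ID (zfcluster) and the NameNode IDs (nn1, nn2) appear inside several property names, so a typo in one of them is easy to miss. Here is a rough sketch of a consistency check. The helper names get_prop and check_ha_names are hypothetical, and the sketch assumes a single nameservice, an fs.defaultFS of exactly hdfs://<nameservice>, and each <name> and <value> element written on one line as in the files above.

```shell
# Hypothetical helper: print the value of one property from a Hadoop *-site.xml.
# Assumes each <name> and each <value> element fits on a single line.
get_prop() {
  awk -v want="$2" '
    /<name>/ {
      name = $0
      sub(/.*<name>[ \t]*/, "", name)
      sub(/[ \t]*<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      val = $0
      sub(/.*<value>/, "", val)
      sub(/<\/value>.*/, "", val)
      print val
      exit
    }
  ' "$1"
}

# Hypothetical helper: check that fs.defaultFS, dfs.nameservices and the
# per-NameNode rpc-address keys all agree on the same IDs.
check_ha_names() {
  local core_site="$1"
  local hdfs_site="$2"
  local fs ns nn
  fs=$(get_prop "$core_site" fs.defaultFS)
  ns=$(get_prop "$hdfs_site" dfs.nameservices)
  if [ "$fs" != "hdfs://$ns" ]; then
    echo "fs.defaultFS ($fs) does not match dfs.nameservices ($ns)"
    return 1
  fi
  for nn in $(get_prop "$hdfs_site" "dfs.ha.namenodes.$ns" | tr ',' ' '); do
    if [ -z "$(get_prop "$hdfs_site" "dfs.namenode.rpc-address.$ns.$nn")" ]; then
      echo "missing dfs.namenode.rpc-address.$ns.$nn"
      return 1
    fi
  done
  echo "HA names consistent for nameservice $ns"
}
```

Usage: `check_ha_names etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml`. On a working installation, `bin/hdfs getconf -confKey fs.defaultFS` reads the same values through Hadoop itself.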

Startup

[hadoop@hadoop-master1 hadoop-2.2.0]$ cd ..
[hadoop@hadoop-master1 ~]$ for h in hadoop-master2 hadoop-slaver1 hadoop-slaver2 hadoop-slaver3 ; do rsync -vaz --delete --exclude=logs hadoop-2.2.0 $h:~/ ; done

[hadoop@hadoop-master1 ~]$ cd hadoop-2.2.0/

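[hadoop@hadoop-master1 hadoop-2.2.0]$ sbin/hadoop-daemon.sh start journalnode

# Per the official QJM guide, initialize the shared edits while the NameNode is still stopped
[hadoop@hadoop-master1 hadoop-2.2.0]$ bin/hdfs namenode -initializeSharedEdits

[hadoop@hadoop-master1 hadoop-2.2.0]$ sbin/hadoop-daemon.sh start namenode
[hadoop@hadoop-master2 hadoop-2.2.0]$ bin/hdfs namenode -bootstrapStandby
# Start the second NameNode once it has been bootstrapped
[hadoop@hadoop-master2 hadoop-2.2.0]$ sbin/hadoop-daemon.sh start namenode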

# At this point you can start the DataNodes and check the NameNode status on port 50070

# Automatic failover: zkfc and the NameNode can be started in either order
[hadoop@hadoop-master1 hadoop-2.2.0]$ bin/hdfs zkfc -formatZK
[hadoop@hadoop-master1 hadoop-2.2.0]$ sbin/hadoop-daemon.sh start zkfc
[hadoop@hadoop-master2 hadoop-2.2.0]$ sbin/hadoop-daemon.sh start zkfc

[hadoop@hadoop-master1 hadoop-2.2.0]$ bin/hdfs haadmin -failover nn1 nn2

# Test failover: kill the active NameNode outright and check that the other one becomes active

# Restart
[hadoop@hadoop-master1 hadoop-2.2.0]$ sbin/stop-dfs.sh
16/01/07 10:57:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [hadoop-master1 hadoop-master2]
hadoop-master1: stopping namenode
hadoop-master2: stopping namenode
hadoop-slaver1: stopping datanode
hadoop-slaver2: stopping datanode
hadoop-slaver3: stopping datanode
Stopping journal nodes [hadoop-master1]
hadoop-master1: stopping journalnode
16/01/07 10:58:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping ZK Failover Controllers on NN hosts [hadoop-master1 hadoop-master2]
hadoop-master2: no zkfc to stop
hadoop-master1: no zkfc to stop

[hadoop@hadoop-master1 hadoop-2.2.0]$ sbin/start-dfs.sh
16/01/07 10:59:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop-master1 hadoop-master2]
hadoop-master2: starting namenode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-namenode-hadoop-master2.out
hadoop-master1: starting namenode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-namenode-hadoop-master1.out
hadoop-slaver1: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-datanode-hadoop-slaver1.out
hadoop-slaver3: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-datanode-hadoop-slaver3.out
hadoop-slaver2: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-datanode-hadoop-slaver2.out
Starting journal nodes [hadoop-master1]
hadoop-master1: starting journalnode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-journalnode-hadoop-master1.out
16/01/07 10:59:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [hadoop-master1 hadoop-master2]
hadoop-master2: starting zkfc, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-zkfc-hadoop-master2.out
hadoop-master1: starting zkfc, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-zkfc-hadoop-master1.out

[hadoop@hadoop-master1 ~]$ jps
15241 DFSZKFailoverController
14882 NameNode
244 QuorumPeerMain
18715 Jps
15076 JournalNode
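
Instead of reading the state off the web UI on port 50070 after each failover test, the same information can be queried from the command line with `hdfs haadmin -getServiceState`. A small sketch follows; the function names and the HDFS_BIN variable are my own (hypothetical), and HDFS_BIN defaults to bin/hdfs on the assumption that you run it from the hadoop-2.2.0 directory.

```shell
# Path to the hdfs launcher; override HDFS_BIN if hdfs lives elsewhere.
HDFS_BIN=${HDFS_BIN:-bin/hdfs}

# Hypothetical helper: print "<id> <state>" for each NameNode ID given.
# A NameNode that cannot be queried (for example, just killed) is
# reported as "unreachable".
nn_states() {
  local nn state
  for nn in "$@"; do
    state=$("$HDFS_BIN" haadmin -getServiceState "$nn" 2>/dev/null) || state="unreachable"
    echo "$nn $state"
  done
}

# Hypothetical helper: succeed only when exactly one NameNode is active.
exactly_one_active() {
  [ "$(nn_states "$@" | grep -c ' active$')" -eq 1 ]
}
```

Usage: run `nn_states nn1 nn2` before and after killing the active NameNode. `exactly_one_active nn1 nn2 && echo healthy` is handy in a loop while waiting for the standby to take over.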

–END
