Introduction: Microservices have been taking off, and thanks to Docker's lightweight sharing of the host's resources, running a program together with its deployment environment inside a container is extremely quick. Unlike a virtual machine's heavyweight isolation, Docker needs only a single resident daemon to create well-isolated container environments (namespaces + cgroups). With the pandemic dragging on, access to the lab has been restricted and our Hadoop cluster is out of reach, so I used an SGX-capable server on IBM Cloud, combined with Docker, to build a Hadoop 2.7 cluster (eight nodes for now). The server has 2×16 GB of RAM, a 3.8 GHz Intel Xeon Kaby Lake (E3-1270 v6, quad-core), and 1.7 TB of disk. I hit quite a few pitfalls along the way, so here is a record of the steps:

I. Cluster environment

172.18.12.10 Master
172.18.12.11 slave1
172.18.12.12 slave2
172.18.12.13 slave3
172.18.12.14 slave4
172.18.12.15 slave5
172.18.12.16 slave6
172.18.12.17 slave7

II. Building the cluster

Create a Docker bridge network with its own IP range by typing the following in a Linux terminal (this assumes Docker is already installed, e.g. via curl -fsSL https://raw.githubusercontent.com/SconeDocs/SH/master/install_docker.sh | bash):

docker network create --driver bridge --subnet=172.18.0.0/16 --gateway=172.18.1.1 mynet
docker network inspect mynet
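To confirm the subnet and gateway took effect, docker network inspect also accepts a Go-template filter; a quick one-liner:

docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}} {{.Gateway}}{{end}}' mynet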

Create the containers (some of the flags, such as the volume mount and the port mappings on Master, are specific to my experiment):

sudo docker run --privileged --device=/dev/isgx --name Master --hostname Master --network=mynet --ip 172.18.12.10 -d -P -p 50070:50070 -p 8088:8088 -v /home/xidian/class:/home/ -it xiaoyuzhang321/scone_mr:SGXhadoop2.7new /etc/bootstrap.sh -bash

sudo docker run --privileged --device=/dev/isgx --name slave1 --hostname slave1 --network=mynet --ip 172.18.12.11 -d -P -it xiaoyuzhang321/scone_mr:SGXhadoop2.7new /etc/bootstrap.sh -bash

sudo docker run --privileged --device=/dev/isgx --name slave2 --hostname slave2 --network=mynet --ip 172.18.12.12 -d -P -it xiaoyuzhang321/scone_mr:SGXhadoop2.7new /etc/bootstrap.sh -bash

sudo docker run --privileged --device=/dev/isgx --name slave3 --hostname slave3 --network=mynet --ip 172.18.12.13 -d -P -it xiaoyuzhang321/scone_mr:SGXhadoop2.7new /etc/bootstrap.sh -bash

sudo docker run --privileged --device=/dev/isgx --name slave4 --hostname slave4 --network=mynet --ip 172.18.12.14 -d -P -it xiaoyuzhang321/scone_mr:SGXhadoop2.7new /etc/bootstrap.sh -bash

sudo docker run --privileged --device=/dev/isgx --name slave5 --hostname slave5 --network=mynet --ip 172.18.12.15 -d -P -it xiaoyuzhang321/scone_mr:SGXhadoop2.7new /etc/bootstrap.sh -bash

sudo docker run --privileged --device=/dev/isgx --name slave6 --hostname slave6 --network=mynet --ip 172.18.12.16 -d -P -it xiaoyuzhang321/scone_mr:SGXhadoop2.7new /etc/bootstrap.sh -bash

sudo docker run --privileged --device=/dev/isgx --name slave7 --hostname slave7 --network=mynet --ip 172.18.12.17 -d -P -it xiaoyuzhang321/scone_mr:SGXhadoop2.7new /etc/bootstrap.sh -bash
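The seven slave commands differ only in the container name and the last octet of the IP, so they can be collapsed into a loop (a sketch; Master is launched separately above because it carries the port mappings and the volume mount):

#!/bin/bash
# Launch slave1..slave7 at 172.18.12.11..172.18.12.17.
for i in $(seq 1 7); do
  sudo docker run --privileged --device=/dev/isgx \
    --name slave$i --hostname slave$i \
    --network=mynet --ip 172.18.12.1$i \
    -d -P -it xiaoyuzhang321/scone_mr:SGXhadoop2.7new /etc/bootstrap.sh -bash
done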

Command to enter a container (Master as the example):

docker exec -it Master /bin/bash

Create a hosts script (Docker regenerates /etc/hosts whenever a container restarts, wiping the IP-to-hostname entries, so the script has to be rerun each time):

vi runhosts.sh

#!/bin/bash
echo 172.18.12.10 Master >>/etc/hosts
echo 172.18.12.11 slave1 >>/etc/hosts
echo 172.18.12.12 slave2 >>/etc/hosts
echo 172.18.12.13 slave3 >>/etc/hosts
echo 172.18.12.14 slave4 >>/etc/hosts
echo 172.18.12.15 slave5 >>/etc/hosts
echo 172.18.12.16 slave6 >>/etc/hosts
echo 172.18.12.17 slave7 >>/etc/hosts

chmod +x runhosts.sh
./runhosts.sh
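Alternatively, the same entries can be pushed from the host with docker exec, which saves entering each container by hand; a sketch assuming all eight containers are already running (use either this or the in-container script, not both, or the entries will be duplicated):

#!/bin/bash
# Append the cluster's name resolution to /etc/hosts in every container.
HOSTS='172.18.12.10 Master
172.18.12.11 slave1
172.18.12.12 slave2
172.18.12.13 slave3
172.18.12.14 slave4
172.18.12.15 slave5
172.18.12.16 slave6
172.18.12.17 slave7'
for c in Master slave1 slave2 slave3 slave4 slave5 slave6 slave7; do
  docker exec "$c" bash -c "echo '$HOSTS' >> /etc/hosts"
done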

The crucial part comes next: SSH configuration.

1. In the Master container, run:

cd ~/.ssh
ssh-keygen -t rsa
cat id_rsa.pub >> authorized_keys
chmod 600 ~/.ssh/authorized_keys
chmod 700 ~/.ssh/
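ssh-keygen prompts for a key file and a passphrase; when scripting this across eight containers, the prompts can be suppressed (the empty passphrase is only acceptable here because the cluster lives on an isolated bridge network):

ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa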

2. In every slave container, run:

cd ~/.ssh
ssh-keygen -t rsa
cat id_rsa.pub >> authorized_keys
chmod 600 ~/.ssh/authorized_keys
chmod 700 ~/.ssh/
ssh-copy-id -i id_rsa.pub Master
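At this point passwordless login from each slave to Master should already work; a quick check:

ssh Master hostname   # should print "Master" with no password prompt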

3. Then, back in the Master container, run:

scp ~/.ssh/authorized_keys slave1:~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys slave2:~/.ssh/authorized_keys
.....
scp ~/.ssh/authorized_keys slave7:~/.ssh/authorized_keys
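As with the container creation, this is loop-friendly; a sketch over the same slave names:

# Push the combined authorized_keys from Master to every slave.
for i in $(seq 1 7); do
  scp ~/.ssh/authorized_keys slave$i:~/.ssh/authorized_keys
done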

III. Configuring the fully distributed Hadoop environment

1. Edit core-site.xml

Change the NameNode address from localhost to Master (the fs.defaultFS property in Hadoop 2.x), and add the following (adjust the path to suit your setup):
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/tmp</value>
</property>
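Concretely, the first change means the fs.defaultFS property should end up reading roughly as follows (the 9000 port is an assumption; keep whatever port the image's original localhost entry used):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://Master:9000</value>
</property>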

2. Edit the slaves file so it contains the following (since Master is listed too, it will also run as a worker node):

Master
slave1
slave2
slave3
slave4
slave5
slave6
slave7

3. Edit yarn-site.xml, adding:

<property>
  <description>The hostname of the RM.</description>
  <name>yarn.resourcemanager.hostname</name>
  <value>Master</value>
</property>
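Depending on how the image was prepared, MapReduce jobs on YARN also need the shuffle auxiliary service in yarn-site.xml; the image may well configure this already, so treat the following as something to verify rather than apply blindly:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>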

Finally, run the following in the Master container to push the configuration to all slaves:

scp -rq /usr/local/hadoop/etc/hadoop slave1:/usr/local/hadoop/etc
scp -rq /usr/local/hadoop/etc/hadoop slave2:/usr/local/hadoop/etc
.....
scp -rq /usr/local/hadoop/etc/hadoop slave7:/usr/local/hadoop/etc

Some of the steps above can be scripted to save effort: the hosts update, launching scripts into the containers from the host, and starting and stopping Hadoop on Master. The SSH setup also rewards a little technique: first have every container send its RSA public key to Master, then let Master distribute the combined authorized_keys back to everyone, which is more efficient than pairwise exchange. Corrections and suggestions are welcome ^^
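For reference, the usual start/stop sequence on Master (paths assume the /usr/local/hadoop install used above; format the NameNode only once, before the very first start, since reformatting wipes HDFS):

# one-time, before the first start
/usr/local/hadoop/bin/hdfs namenode -format
# start HDFS and YARN
/usr/local/hadoop/sbin/start-dfs.sh
/usr/local/hadoop/sbin/start-yarn.sh
# stop
/usr/local/hadoop/sbin/stop-yarn.sh
/usr/local/hadoop/sbin/stop-dfs.sh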