0.2. Building a highly available Kubernetes cluster with kubeadm
2021-03-05 12:28
kubeadm is a tool released by the official community for quickly deploying a Kubernetes cluster; the version deployed here is Kubernetes 1.16. kubeadm installs Kubernetes with the control-plane components running as static Pods (containers), using images from the gcr.io registry.
Before starting, the machines used to deploy the Kubernetes cluster need to meet the prerequisites covered in the system initialization steps below.
0. Deployment environment & requirements
IP             | Hostname     | Kernel                      | CPU & Memory | OS version | Role
192.168.121.81 | k8s-master01 | 4.4.206-1.el7.elrepo.x86_64 | 4C&8M        | 7.6.1810   | control plane
192.168.121.82 | k8s-master02 | 4.4.206-1.el7.elrepo.x86_64 | 4C&8M        | 7.6.1810   | control plane
192.168.121.83 | k8s-master03 | 4.4.206-1.el7.elrepo.x86_64 | 4C&8M        | 7.6.1810   | control plane
192.168.121.84 | k8s-node01   | 4.4.206-1.el7.elrepo.x86_64 | 4C&8M        | 7.6.1810   | worker node
192.168.121.85 | k8s-node02   | 4.4.206-1.el7.elrepo.x86_64 | 4C&8M        | 7.6.1810   | worker node
192.168.121.86 | k8s-node03   | 4.4.206-1.el7.elrepo.x86_64 | 4C&8M        | 7.6.1810   | worker node
192.168.121.88 | keepalived (v1.3.5)                        |              |            | VIP
1. System initialization
1.1. Hostname
Set the permanent hostname, then log in again:
]# hostnamectl set-hostname k8s-master01 # replace k8s-master01 with the current host's name
The hostname is stored in the /etc/hostname file. If DNS cannot resolve the hostnames, add hostname-to-IP mappings to the /etc/hosts file on every machine:
]# cat >> /etc/hosts << EOF
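# the mappings below are taken from the host table above; adjust to your own environment
192.168.121.81 k8s-master01
192.168.121.82 k8s-master02
192.168.121.83 k8s-master03
192.168.121.84 k8s-node01
192.168.121.85 k8s-node02
192.168.121.86 k8s-node03
EOF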
1.2. Add a docker account
Add a docker account on every machine:
]# useradd -m docker
1.3. Passwordless ssh login to the other nodes
Unless otherwise stated, all operations in this document are executed on the k8s-master01 node, which then distributes files and runs commands remotely, so that node needs an ssh trust relationship with the other nodes.
Allow the root account of the k8s-master to log in to all nodes without a password:
]# ssh-keygen -t rsa
]# ssh-copy-id root@k8s-master01
]# ssh-copy-id root@k8s-master02
]# ssh-copy-id root@k8s-master03
1.4. Update the PATH variable
Add the executables directory to the PATH environment variable:
]# echo 'PATH=/opt/k8s/bin:$PATH' >> /root/.bashrc
]# source /root/.bashrc
1.5. Install dependency packages
Install the dependency packages on every machine:
]# yum install -y epel-release
]# yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl sysstat libseccomp wget
1.6. Disable the firewall
On every machine, stop the firewall, clear its rules, and set the default forwarding policy:
]# systemctl stop firewalld
]# systemctl disable firewalld
]# iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat
]# iptables -P FORWARD ACCEPT
1.7. Disable the swap partition
If swap is enabled, kubelet fails to start (this can be ignored by setting --fail-swap-on to false), so the swap partition must be disabled on every machine. Also comment out the corresponding entry in /etc/fstab to prevent the swap partition from being mounted automatically at boot:
]# swapoff -a
]# sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
1.8. Disable SELinux
Disable SELinux, otherwise later Kubernetes volume mounts may fail with "Permission denied":
]# setenforce 0
]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
1.9. Stop dnsmasq (optional)
If dnsmasq is enabled on a Linux system (for example in a GUI environment), it sets the system DNS server to 127.0.0.1, which prevents Docker containers from resolving domain names, so stop it:
]# systemctl stop dnsmasq
]# systemctl disable dnsmasq
1.10. Load kernel modules
]# modprobe ip_vs_rr
]# modprobe br_netfilter
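The two modprobe commands above do not persist across reboots; a minimal sketch for loading the modules at boot (the file name k8s.conf under /etc/modules-load.d/ is an assumption, not from the original post):
]# cat > /etc/modules-load.d/k8s.conf << EOF
ip_vs_rr
br_netfilter
EOF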
1.11. Tune kernel parameters
]# cat > kubernetes.conf << EOF
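# the settings below are an assumed minimal sketch; the original post does not show the file contents
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
]# cp kubernetes.conf /etc/sysctl.d/kubernetes.conf   # persist across reboots
]# sysctl -p /etc/sysctl.d/kubernetes.conf            # apply now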
1.12. Set the system time zone
# adjust the system time zone
]# timedatectl set-timezone Asia/Shanghai
# write the current UTC time to the hardware clock
]# timedatectl set-local-rtc 0
# restart the services that depend on the system time
]# systemctl restart rsyslog
]# systemctl restart crond
1.13. Stop unrelated services
]# systemctl stop postfix && systemctl disable postfix
1.14. Upgrade the kernel
The 3.10.x kernel that ships with CentOS 7.x has bugs that make Docker and Kubernetes unstable, for example:
- newer Docker versions enable the kernel memory accounting feature that the 3.10 kernel only supports experimentally (and it cannot be turned off); under pressure, such as frequently starting and stopping containers, this leads to a cgroup memory leak;
- errors such as "kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1";
Possible solutions: upgrade the kernel, or rebuild it with the CONFIG_MEMCG_KMEM feature disabled, or use Docker 18.09.1 or later; but since kubelet also sets kmem (it vendors runc), kubelet would then need to be recompiled with GOFLAGS="-tags=nokmem".
Here the kernel-upgrade approach is used:
]# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
# after installation, check that the corresponding kernel menuentry in /boot/grub2/grub.cfg contains an initrd16 line; if it does not, install again!
]# yum --enablerepo=elrepo-kernel install -y kernel-lt
# boot from the new kernel by default
]# grub2-set-default 0
Install the kernel source files (optional, run after the kernel has been upgraded and the machine rebooted):
]# yum erase kernel-headers
]# yum --enablerepo=elrepo-kernel install kernel-lt-devel-$(uname -r) kernel-lt-headers-$(uname -r)
1.15. Disable NUMA
]# cp /etc/default/grub{,.bak}
]# vim /etc/default/grub # add the `numa=off` parameter to the GRUB_CMDLINE_LINUX line, as shown below:
]# diff /etc/default/grub.bak /etc/default/grub
6c6
< GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet"
---
> GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet numa=off"
Regenerate the grub2 configuration file:
]# cp /boot/grub2/grub.cfg{,.bak}
]# grub2-mkconfig -o /boot/grub2/grub.cfg
1.16. Maximum number of open files
]# cat > /etc/security/limits.d/kubernetes.conf << EOF
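# the values below are an assumed sketch; the original post does not show the file contents
*  soft  nofile  655360
*  hard  nofile  655360
*  soft  nproc   655350
*  hard  nproc   655350
*  soft  memlock unlimited
*  hard  memlock unlimited
EOF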
2. Deployment
2.1. Deploy Docker
Install Docker on every machine; version 18.09 is recommended.
]# wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
]# yum list docker-ce --showduplicates | sort -r
]# yum install docker-ce-18.09.9 docker-ce-cli-18.09.9 containerd.io -y
Speed up image downloads with a registry mirror:
]# curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io
Set the cgroup driver; systemd is recommended:
]# mkdir -p /etc/docker
]# cat > /etc/docker/daemon.json
Changing the cgroupdriver is done to get rid of the following warning:
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
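The daemon.json contents are not shown in the original; a minimal sketch that sets the systemd cgroup driver while keeping the registry mirror configured above:
]# cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["http://f1361db2.m.daocloud.io"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF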
Start & verify:
]# systemctl enable --now docker
]# docker --version
Docker version 18.09.9, build 039a7df9ba
2.2. Deploy keepalived
Perform this part on all k8s-master nodes.
Install keepalived:
]# yum -y install keepalived
keepalived configuration on master01:
[root@master01 ~]# more /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id master01
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 50
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.121.88
}
}
keepalived configuration on master02:
[root@master02 ~]# more /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id master02
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 50
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.121.88
}
}
keepalived configuration on master03:
[root@master03 ~]# more /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id master03
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 50
priority 80
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.121.88
}
}
Start keepalived
Start the keepalived service on all k8s-master nodes and enable it at boot:
]# service keepalived start
]# systemctl enable keepalived
Check the VIP
Make sure the VIP is on master01; otherwise the connection to the api-server will fail during cluster initialization:
]# ip a
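To confirm quickly that the VIP really landed on master01, grep for the address configured in keepalived above:
[root@master01 ~]# ip a | grep 192.168.121.88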
2.3. Deploy the management tool kubeadm
Perform this part on all nodes. Add the Kubernetes yum repository:
]# cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
EOF
Install on the master nodes:
]# yum install -y kubeadm-1.16.4 kubectl-1.16.4 kubelet-1.16.4
]# systemctl enable kubelet
Install on the worker nodes:
]# yum install -y kubeadm-1.16.4 kubelet-1.16.4
]# systemctl enable kubelet
Command completion
Install bash-completion:
]# yum -y install bash-completion
Load bash-completion:
]# source /etc/profile.d/bash_completion.sh
Start kubelet and enable it at boot:
]# systemctl enable kubelet && systemctl start kubelet
kubectl command completion:
]# echo "source <(kubectl completion bash)" >> ~/.bash_profile
]# source .bash_profile
2.4. Download the images
Almost all of the Kubernetes components and Docker images are hosted on Google's own registries, which may not be reachable directly. The workaround here is to pull the images from the Aliyun mirror registry and then re-tag them with the default names. The images are pulled by running the image.sh script, where url is the Aliyun registry address and version is the Kubernetes version being installed:
]# more image.sh
#!/bin/bash
url=registry.aliyuncs.com/google_containers
version=v1.16.4
images=(`kubeadm config images list --kubernetes-version=$version | awk -F '/' '{print $2}'`)
for imagename in ${images[@]} ; do
docker pull $url/$imagename
docker tag $url/$imagename k8s.gcr.io/$imagename
docker rmi -f $url/$imagename
done
Download the images by running the image.sh script, which pulls the specified version:
]# chmod +x image.sh
]# ./image.sh
]# docker images
3. Initialize the masters & nodes
3.1. Prepare the initialization config file
Perform this part on the master01 node. kubeadm_config.yaml is the configuration file used for initialization:
]# more kubeadm_config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.4
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
apiServer:
  extraArgs:
    authorization-mode: "Node,RBAC"
    runtime-config: api/all,settings.k8s.io/v1alpha1=true
    storage-backend: etcd3
    etcd-servers: https://192.168.121.81:2379,https://192.168.121.82:2379,https://192.168.121.83:2379
  certSANs:   # list the hostname, IP and VIP of every kube-apiserver node
  - k8s-master01
  - k8s-master02
  - k8s-master03
  - k8s-node01
  - k8s-node02
  - k8s-node03
  - 192.168.121.81
  - 192.168.121.82
  - 192.168.121.83
  - 192.168.121.84
  - 192.168.121.85
  - 192.168.121.86
  - 192.168.121.87
  - 192.168.121.88
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
etcd:
  local:
    dataDir: /var/lib/etcd
    serverCertSANs:
    - master
    - 192.168.121.81
    - 192.168.121.82
    - 192.168.121.83
    - k8s-master01
    - k8s-master02
    - k8s-master03
    peerCertSANs:
    - master
    - 192.168.121.81
    - 192.168.121.82
    - 192.168.121.83
    - k8s-master01
    - k8s-master02
    - k8s-master03
    extraArgs:
      auto-compaction-retention: "1h"
      max-request-bytes: "33554432"
      quota-backend-bytes: "8589934592"
      enable-v2: "false"
controlPlaneEndpoint: "192.168.121.88:6443"
networking:
  podSubnet: "10.244.0.0/16"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs   # use ipvs mode; the default is iptables mode
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr"
  strictARP: false
  syncPeriod: 15s
iptables:
  masqueradeAll: true
  masqueradeBit: 14
  minSyncPeriod: 0s
  syncPeriod: 30s
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
failSwapOn: true
3.2. Master initialization
Check the config file for mistakes; warnings can be ignored, while real problems are reported as errors:
]# kubeadm init --config /root/kubeadm_config.yaml --dry-run
Run the initialization:
]# kubeadm init --config=kubeadm_config.yaml
......
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join 192.168.121.88:6443 --token kkpntz.yc5ow033k2mnp0z4 --discovery-token-ca-cert-hash sha256:bbc3c89aa94f90c457fc6a61f66c39b8bfccb12e84cd8df31ce226e09f0ea839 --control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.121.88:6443 --token kkpntz.yc5ow033k2mnp0z4 --discovery-token-ca-cert-hash sha256:bbc3c89aa94f90c457fc6a61f66c39b8bfccb12e84cd8df31ce226e09f0ea839
Copy the kubeconfig for kubectl; kubectl's default kubeconfig path is ~/.kube/config:
]# mkdir -p $HOME/.kube
]# sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
The init YAML is actually stored in a ConfigMap in the cluster, so it can be inspected at any time; it is used again when the other nodes and masters join:
kubectl -n kube-system get cm kubeadm-config -o yaml
]# echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
]# source .bash_profile
for node in 192.168.33.102 192.168.33.103;do
  ssh $node 'mkdir -p /etc/kubernetes/pki/etcd'
  scp -r /root/kubeadm_config.yaml $node:/root/kubeadm_config.yaml
  scp -r /root/image.sh $node:/root/image.sh
  scp -r /etc/kubernetes/pki/ca.* $node:/etc/kubernetes/pki/
  scp -r /etc/kubernetes/pki/sa.* $node:/etc/kubernetes/pki/
  scp -r /etc/kubernetes/pki/front-proxy-ca.* $node:/etc/kubernetes/pki/
  scp -r /etc/kubernetes/pki/etcd/ca.* $node:/etc/kubernetes/pki/etcd/
done
Configure the remaining k8s-masters by running the control-plane join command on each of them:
]# kubeadm join 192.168.121.88:6443 --token kkpntz.yc5ow033k2mnp0z4 --discovery-token-ca-cert-hash sha256:bbc3c89aa94f90c457fc6a61f66c39b8bfccb12e84cd8df31ce226e09f0ea839 --control-plane
Configure kubectl on all masters by preparing its kubeconfig:
]# mkdir -p $HOME/.kube
]# sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
Set up the kubectl completion script:
]# kubectl completion bash > /etc/bash_completion.d/kubectl
]# source /etc/bash_completion.d/kubectl
3.3. Node initialization
Run this on the worker nodes. As with the master join, prepare the environment and Docker beforehand; the join command is the same except that it does not carry --control-plane:
]# kubeadm join 192.168.121.88:6443 --token kkpntz.yc5ow033k2mnp0z4 --discovery-token-ca-cert-hash sha256:bbc3c89aa94f90c457fc6a61f66c39b8bfccb12e84cd8df31ce226e09f0ea839
3.4. If initialization fails
If initialization fails, run kubeadm reset and then initialize again:
]# kubeadm reset
]# rm -rf $HOME/.kube/config
4. Configure etcdctl
4.1. Copy etcdctl out of the container
]# docker cp `docker ps -a | awk '/k8s_etcd/{print $1}'`:/usr/local/bin/etcdctl /usr/local/bin/etcdctl
]# cat > /etc/profile.d/etcd.sh << 'EOF'
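# the contents below are an assumed sketch (not shown in the original post): a TLS-aware etcdctl wrapper
export ETCDCTL_API=3
alias etcd_v3="etcdctl --cert /etc/kubernetes/pki/etcd/healthcheck-client.crt --key /etc/kubernetes/pki/etcd/healthcheck-client.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints https://127.0.0.1:2379"
EOF
]# source /etc/profile.d/etcd.sh
]# etcd_v3 endpoint health   # quick sanity check against the local etcd member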
4.2. Configure the etcd backup script
]# mkdir -p /opt/etcd
]# cat > /opt/etcd/etcd_cron.sh << 'EOF'
#!/bin/bash
# Defaults below are assumed; the head of the script is truncated in the original post. Adjust to your environment.
bak_prefix='etcd-'
cmd_suffix='date +%Y%m%d-%H%M%S'
bak_suffix='.db'
cert_dir='/etc/kubernetes/pki/etcd'
endpoints='https://127.0.0.1:2379'
bak_count=''
bak_dir='./'
# options: -c <number of backups to keep>  -d <backup directory>
while true; do
    case "$1" in
        -c)
            bak_count=$2
            echo $2 | grep -E '^[0-9]+$' &>/dev/null || { echo 'the value of the -c must be number';exit 1; }
            shift 2
            ;;
        -d)
            [ ! -d "$2" ] && mkdir -p $2
            bak_dir=$2
            shift 2
            ;;
        *)
            [[ -z "$1" || "$1" == '--' ]] && { shift;break; }
            echo "Internal error!"
            exit 1
            ;;
    esac
done
function etcd_v3(){
    ETCDCTL_API=3 etcdctl --cert $cert_dir/healthcheck-client.crt --key $cert_dir/healthcheck-client.key --cacert $cert_dir/ca.crt --endpoints $endpoints $@
}
etcd::cron::save(){
    cd $bak_dir/
    etcd_v3 snapshot save $bak_prefix$($cmd_suffix)$bak_suffix
    rm_files=`ls -t $bak_prefix*$bak_suffix | tail -n +$[bak_count+1]`
    if [ -n "$rm_files" ];then
        rm -f $rm_files
    fi
}
main(){
    [ -n "$bak_count" ] && etcd::cron::save || etcd_v3 snapshot save $@
}
main $@
EOF
Add the following entry with crontab -e to automatically keep four backup copies:
]# crontab -e
0 0 * * * bash /opt/etcd/etcd_cron.sh -c 4 -d /opt/etcd/ &>/dev/null
5. Deploy add-ons
5.1. Network plugin
5.1.1. Deploy the flannel network
Create the flannel network. Because of network issues the installation may fail; in that case switching to a reachable image source before re-running the apply is recommended, as sketched below:
]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
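If the quay.io flannel image cannot be pulled, one way to switch the image source (the mirror name below is a placeholder you must replace; it is not from the original post) is to download the manifest, point it at a reachable registry, and apply the local file:
]# wget https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
]# sed -i 's#quay.io/coreos/flannel#<your-mirror>/coreos/flannel#g' kube-flannel.yml
]# kubectl apply -f kube-flannel.yml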
5.1.2. Deploy the Calico network (optional)
Download the YAML, modify it, and deploy:
]# curl https://docs.projectcalico.org/v3.9/manifests/calico-etcd.yaml -o calico.yaml
]# sed -i -e "s@192.168.0.0/16@10.244.0.0/16@g" calico.yaml
]# kubectl apply -f calico.yaml
5.2. Web UI (Dashboard)
5.2.1. Configuration
Download the YAML; if the connection times out, retry a few times:
]# wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta8/aio/deploy/recommended.yaml
Configure the YAML. Change the image addresses: the default image registry is not reachable, so switch to the Aliyun mirror:
]# sed -i 's/kubernetesui/registry.aliyuncs.com\/google_containers/g' recommended.yaml
External access: expose the Dashboard through a NodePort so it can be reached from outside at https://NodeIp:NodePort, here on port 30001:
]# sed -i '/targetPort: 8443/a\ \ \ \ \ \ nodePort: 30001\n\ \ type: NodePort' recommended.yaml
Add an administrator account: create a super-administrator account used to log in to the Dashboard:
]# cat >> recommended.yaml << EOF
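# The manifest below is an assumed sketch (the original does not show it); the account name
# dashboard-admin matches the secret queried in the token-view step further down.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin
subjects:
- kind: ServiceAccount
  name: dashboard-admin
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
EOF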
5.2.2. Deploy and access
Deploy the Dashboard:
]# kubectl apply -f recommended.yaml
Check the status:
]# kubectl get all -n kubernetes-dashboard
View the login token (the service account created above is bound to the default cluster-admin cluster role):
]# kubectl describe secrets -n kubernetes-dashboard $(kubectl -n kubernetes-dashboard get secret | awk '/dashboard-admin/{print $1}')
Use the token from the output to log in to the Dashboard; use Firefox to browse it.
Original article: https://www.cnblogs.com/Gmiaomiao/p/12905795.html