0.2. Building a highly available Kubernetes cluster with kubeadm

2021-03-05 12:28

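Contents
  • Installing Kubernetes with kubeadm
    • 0. Deployment environment and requirements
    • 1. System initialization
    • 2. Deployment
      • 2.1. Deploy Docker
      • 2.2. Deploy keepalived
      • 2.3. Deploy the kubeadm management tool
      • 2.4. Download the images
    • 3. Initialize the masters and nodes
      • 3.1. Configure the initialization file
      • 3.2. Initialize the masters
      • 3.3. Initialize the nodes
      • 3.4. Initialization failure
    • 4. Configure etcdctl
      • 4.1. Copy etcdctl out of the container
      • 4.2. Configure the etcd backup script
    • 5. Deploy add-ons
      • 5.1. Network plugins
        • 5.1.1 Deploy the flannel network
        • 5.1.2 Deploy the Calico network (optional)
      • 5.2. Web UI (Dashboard setup)
        • 5.2.1 Configuration
        • 5.2.2 Deployment and access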

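Installing Kubernetes with kubeadm

kubeadm is the tool released by the official community for quickly deploying a Kubernetes cluster. It runs the control-plane components as static Pods (containers) and pulls their images from the gcr.io registry.

0. Deployment environment and requirements

Deployed version: Kubernetes 1.16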

IP              Hostname      Kernel                       CPU & Memory     OS version  Role
192.168.121.81  k8s-master01  4.4.206-1.el7.elrepo.x86_64  4 cores, 8 GB    7.6.1810    control plane
192.168.121.82  k8s-master02  4.4.206-1.el7.elrepo.x86_64  4 cores, 8 GB    7.6.1810    control plane
192.168.121.83  k8s-master03  4.4.206-1.el7.elrepo.x86_64  4 cores, 8 GB    7.6.1810    control plane
192.168.121.84  k8s-node01    4.4.206-1.el7.elrepo.x86_64  4 cores, 8 GB    7.6.1810    worker node
192.168.121.85  k8s-node02    4.4.206-1.el7.elrepo.x86_64  4 cores, 8 GB    7.6.1810    worker node
192.168.121.86  k8s-node03    4.4.206-1.el7.elrepo.x86_64  4 cores, 8 GB    7.6.1810    worker node
192.168.121.88  -             -                            -                -           VIP (keepalived v1.3.5)

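Before starting, the machines used for the Kubernetes cluster must meet the following requirements:

  • One or more machines running CentOS 7.x (x86_64)
  • Hardware: at least 2 GB of RAM, at least 2 CPUs, and at least 30 GB of disk
  • Full network connectivity between all machines in the cluster
  • Internet access, which is needed to pull images
  • Swap disabled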

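1. System initialization

1.1. Hostname

Set a permanent hostname, then log in again:

]# hostnamectl set-hostname k8s-master01 # replace k8s-master01 with the current host's name
  • The hostname is stored in /etc/hostname.

If DNS cannot resolve the hostnames, add the hostname-to-IP mappings to /etc/hosts on every machine: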

]# cat >> /etc/hosts 
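
The heredoc body does not appear in the text above. As a minimal sketch, assuming the mapping is exactly the six machines from the table in section 0 (the helper name print_hosts_entries is hypothetical), the entries could be generated like this:

```shell
#!/bin/bash
# Hypothetical helper: print one "IP hostname" line per cluster machine.
# Assumption: addresses and names are those from the environment table.
print_hosts_entries() {
    cat <<'EOF'
192.168.121.81 k8s-master01
192.168.121.82 k8s-master02
192.168.121.83 k8s-master03
192.168.121.84 k8s-node01
192.168.121.85 k8s-node02
192.168.121.86 k8s-node03
EOF
}

# Print the entries for review; to apply them, run:
#   print_hosts_entries >> /etc/hosts
print_hosts_entries
```

Printing first and appending second makes it easy to spot a wrong address before it lands in /etc/hosts on every machine.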

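1.2. Add a docker account

Add a docker account on every machine:

]# useradd -m docker

1.3. Passwordless SSH login to the other nodes

Unless stated otherwise, every operation in this document is run on k8s-master01, which then distributes files and runs commands remotely, so this node needs an SSH trust relationship with the other nodes.

Allow the root account on k8s-master01 to log in to all nodes without a password:

]# ssh-keygen -t rsa
]# ssh-copy-id root@k8s-master01
]# ssh-copy-id root@k8s-master02
]# ssh-copy-id root@k8s-master03

1.4. Update the PATH variable

Add the directory of executables to the PATH environment variable:

]# echo 'PATH=/opt/k8s/bin:$PATH' >>/root/.bashrc
]# source /root/.bashrc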

1.5. Install dependencies

Install the dependency packages on every machine:

]# yum install -y epel-release
]# yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl sysstat libseccomp wget

1.6. Disable the firewall

On every machine, stop the firewall, flush its rules, and set the default forwarding policy:

]# systemctl stop firewalld
]# systemctl disable firewalld
]# iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat
]# iptables -P FORWARD ACCEPT

1.7. Disable swap

If swap is enabled, kubelet fails to start (the check can be skipped by setting --fail-swap-on to false), so swap must be turned off on every machine. Also comment out the matching entries in /etc/fstab so that swap is not mounted again at boot:

]# swapoff -a
]# sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
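
Because the sed command edits /etc/fstab in place, it can help to preview its effect first. The following is a small dry run under stated assumptions: the two sample entries are illustrative and not taken from a real host, and the expression is applied without -i so nothing is modified.

```shell
#!/bin/bash
# Dry run: apply the same sed expression to a throwaway sample instead of /etc/fstab.
# Assumption: the two sample entries are illustrative, not taken from a real host.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
/dev/mapper/centos-root / xfs defaults 0 0
/dev/mapper/centos-swap swap swap defaults 0 0
EOF

# Without -i, sed prints the result and leaves the file untouched.
# Lines containing " swap " get a leading '#'; all other lines pass through.
result="$(sed '/ swap / s/^\(.*\)$/#\1/g' "$sample")"
printf '%s\n' "$result"
rm -f "$sample"
```

Only the swap entry should come back commented out; the root filesystem line is printed unchanged.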

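1.8. Disable SELinux

Disable SELinux; otherwise Kubernetes may later fail with "Permission denied" when mounting directories:

]# setenforce 0
]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

1.9. Disable dnsmasq (optional)

When dnsmasq is enabled on a Linux system (for example in a GUI environment), it sets the system DNS server to 127.0.0.1, which prevents Docker containers from resolving domain names, so it needs to be turned off:

]# systemctl stop dnsmasq
]# systemctl disable dnsmasq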

1.10. Load kernel modules

]# modprobe ip_vs_rr
]# modprobe br_netfilter

1.11. Tune kernel parameters

]# cat > kubernetes.conf 
  • tcp_tw_recycle must be disabled; otherwise it conflicts with NAT and makes services unreachable;
  • IPv6 is disabled to avoid triggering a Docker bug;
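
The contents of kubernetes.conf are not shown above. As a rough sketch, assuming a commonly used set of keys for kubeadm clusters: only the tcp_tw_recycle and IPv6 settings are confirmed by the two notes above, and the remaining keys as well as the helper name render_sysctl_conf are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: print a candidate sysctl file for review.
# Only tcp_tw_recycle=0 and the IPv6 switch come from the notes in the text;
# the other keys are assumptions commonly seen in kubeadm setups.
render_sysctl_conf() {
    cat <<'EOF'
# let bridged traffic pass through iptables (needs br_netfilter loaded)
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
# allow the node to forward packets between interfaces
net.ipv4.ip_forward=1
# must stay off, it conflicts with NAT
net.ipv4.tcp_tw_recycle=0
# avoid swapping
vm.swappiness=0
# disable IPv6
net.ipv6.conf.all.disable_ipv6=1
EOF
}

# Print for review; to apply:
#   render_sysctl_conf > /etc/sysctl.d/kubernetes.conf
#   sysctl -p /etc/sysctl.d/kubernetes.conf
render_sysctl_conf
```

The tcp_tw_recycle key was removed in Linux 4.12, so on kernels newer than the 4.4 series used here that line has to be dropped or sysctl will report an unknown key.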

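1.12. Set the system time zone

# Set the system time zone
]# timedatectl set-timezone Asia/Shanghai

# Keep the hardware clock in UTC (writes the current UTC time to it)
]# timedatectl set-local-rtc 0

# Restart the services that depend on the system time
]# systemctl restart rsyslog
]# systemctl restart crond

1.13. Stop unneeded services

]# systemctl stop postfix && systemctl disable postfix

1.14. Upgrade the kernel

The 3.10.x kernel shipped with CentOS 7.x has several bugs that make Docker and Kubernetes unstable, for example:

  1. Newer Docker versions (after 1.13) enable the kernel memory accounting feature, which the 3.10 kernel supports only experimentally and which cannot be turned off. Under load, such as frequently starting and stopping containers, this causes a cgroup memory leak;
  2. A network device reference count leak, which produces errors like: "kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1";

There are three possible fixes:

  1. Upgrade the kernel to 4.4.x or later;
  2. Or build the kernel manually with the CONFIG_MEMCG_KMEM feature disabled;
  3. Or install Docker 18.09.1 or later, which fixes the problem. However, kubelet also sets kmem (it vendors runc), so kubelet has to be rebuilt with GOFLAGS="-tags=nokmem".

This guide upgrades the kernel:

]# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
# After installation, check that the menuentry for the new kernel in /boot/grub2/grub.cfg contains an initrd16 line; if it does not, install again.
]# yum --enablerepo=elrepo-kernel install -y kernel-lt
# Boot from the new kernel by default
]# grub2-set-default 0

Install the kernel source files (optional; run after the kernel upgrade and a reboot):

]# yum erase kernel-headers
]# yum --enablerepo=elrepo-kernel install kernel-lt-devel-$(uname -r) kernel-lt-headers-$(uname -r)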

1.15. Disable NUMA

]# cp /etc/default/grub{,.bak}
]# vim /etc/default/grub # add the `numa=off` parameter to the GRUB_CMDLINE_LINUX line, as shown below:
]# diff /etc/default/grub.bak /etc/default/grub
6c6
< GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet"
---
> GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet numa=off"

Regenerate the grub2 configuration file:

]# cp /boot/grub2/grub.cfg{,.bak}
]# grub2-mkconfig -o /boot/grub2/grub.cfg

1.16. Maximum number of open files

]# cat>/etc/security/limits.d/kubernetes.conf
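
The file contents are missing above. A minimal sketch of what a limits.d entry for open files could look like; the value 65536 and the helper name render_limits_conf are assumptions, not the original settings.

```shell
#!/bin/bash
# Hypothetical helper: print a candidate limits.d file for review.
# Assumption: 65536 is an illustrative value, not the one used originally.
render_limits_conf() {
    cat <<'EOF'
# <domain> <type> <item> <value>
* soft nofile 65536
* hard nofile 65536
EOF
}

# Print for review; to apply:
#   render_limits_conf > /etc/security/limits.d/kubernetes.conf
render_limits_conf
```

New limits apply to sessions opened after the file is written; `ulimit -n` in a fresh login shell shows the effective value.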

2. Deployment

2.1. Deploy Docker

Install Docker on every machine; version 18.09 is recommended.

]# wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
]# yum list docker-ce --showduplicates | sort -r
]# yum install docker-ce-18.09.9 docker-ce-cli-18.09.9 containerd.io -y

Registry mirror for faster image downloads:

]# curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://f1361db2.m.daocloud.io

Set the cgroup driver; systemd is recommended:

]# mkdir -p /etc/docker
]# cat > /etc/docker/daemon.json 
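
The JSON body is missing above. The only setting the surrounding text confirms is the systemd cgroup driver, which Docker accepts through the "exec-opts" key. A minimal sketch under that assumption (the helper name render_daemon_json is hypothetical, and any further keys the original file had are unknown):

```shell
#!/bin/bash
# Hypothetical helper: print a minimal daemon.json that selects the systemd cgroup driver.
# Assumption: no other keys are needed; the original file may have contained more.
render_daemon_json() {
    cat <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Print for review; to apply:
#   render_daemon_json > /etc/docker/daemon.json
#   systemctl restart docker
render_daemon_json
```

Writing the file this way replaces whatever is already there, so if the mirror script above has added a registry mirror entry to daemon.json, merge the two by hand. After a restart, `docker info | grep -i cgroup` should report the systemd driver.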

Changing the cgroup driver removes this warning:
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/

Start and verify

]# systemctl enable --now docker
]# docker --version
Docker version 18.09.9, build 039a7df9ba

2.2. Deploy keepalived

Run the steps in this section on all k8s-master nodes.
Install keepalived

]# yum -y install keepalived

keepalived configuration

keepalived configuration on master01:

[root@master01 ~]# more /etc/keepalived/keepalived.conf 
! Configuration File for keepalived
global_defs {
   router_id master01
}
vrrp_instance VI_1 {
    state MASTER 
    interface eth0
    virtual_router_id 50
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.121.88
    }
}

keepalived configuration on master02:

[root@master02 ~]# more /etc/keepalived/keepalived.conf 
! Configuration File for keepalived
global_defs {
   router_id master02
}
vrrp_instance VI_1 {
    state BACKUP 
    interface eth0
    virtual_router_id 50
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.121.88
    }
}

keepalived configuration on master03:

[root@master03 ~]# more /etc/keepalived/keepalived.conf 
! Configuration File for keepalived
global_defs {
   router_id master03
}
vrrp_instance VI_1 {
    state BACKUP 
    interface eth0
    virtual_router_id 50
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.121.88
    }
}

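Start keepalived

Start the keepalived service on all k8s-master nodes and enable it at boot:

]# service keepalived start
]# systemctl enable keepalived

Check the VIP

]# ip a

Make sure the VIP is on master01; otherwise the initialization cannot connect to the api-server.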
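
Scanning the full `ip a` output by eye is error-prone. As a small sketch, the check can be scripted; the helper name has_vip is hypothetical, and the sample text only imitates the one-line format of `ip -o -4 addr show` so the example runs anywhere.

```shell
#!/bin/bash
# Hypothetical helper: succeed if address $1 appears in `ip -o -4 addr show`
# output read from stdin. The trailing "/" keeps 192.168.121.8 from matching .88.
has_vip() {
    grep -qF " $1/"
}

# Illustrative sample of what the node holding the VIP might print.
sample='2: eth0    inet 192.168.121.81/24 brd 192.168.121.255 scope global eth0
2: eth0    inet 192.168.121.88/32 scope global eth0'

if printf '%s\n' "$sample" | has_vip 192.168.121.88; then
    echo "VIP present"
else
    echo "VIP absent"
fi

# On a real node:
#   ip -o -4 addr show | has_vip 192.168.121.88 && echo "VIP present"
```

Running the real command on each master shows which one currently holds the address; with the priorities configured above it should be master01.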

2.3. Deploy the kubeadm management tool

Run on all nodes:

]# cat > /etc/yum.repos.d/kubernetes.repo <<EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
EOF

Install on the master nodes

]# yum install -y kubeadm-1.16.4 kubectl-1.16.4 kubelet-1.16.4
]# systemctl enable kubelet

Install on the worker nodes

]# yum install -y kubeadm-1.16.4 kubelet-1.16.4
]# systemctl enable kubelet

Command completion

Install bash-completion

]# yum -y install bash-completion

Load bash-completion

]# source /etc/profile.d/bash_completion.sh

Start kubelet and enable it at boot

]# systemctl enable kubelet && systemctl start kubelet

kubectl command completion

]# echo "source <(kubectl completion bash)" >> ~/.bash_profile
]# source ~/.bash_profile

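2.4. Download the images

Image download script
Almost all Kubernetes installation components and Docker images are hosted on Google's own sites, which may not be directly reachable. The workaround here is to pull the images from the Alibaba Cloud registry and then retag them locally with the default image names. This guide pulls the images by running the image.sh script.

]# more image.sh
#!/bin/bash
url=registry.aliyuncs.com/google_containers
version=v1.16.4
# List the images kubeadm needs and strip the k8s.gcr.io/ prefix
images=($(kubeadm config images list --kubernetes-version=$version | awk -F '/' '{print $2}'))
for imagename in "${images[@]}"; do
  docker pull $url/$imagename                       # pull from the mirror
  docker tag $url/$imagename k8s.gcr.io/$imagename  # retag with the default name
  docker rmi -f $url/$imagename                     # remove the mirror tag
done

url is the address of the Alibaba Cloud registry, and version is the Kubernetes version being installed.

Download the images

Run image.sh to download the images for the specified version:

]# chmod +x image.sh
]# ./image.sh
]# docker images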
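
To see what the awk step in image.sh does without calling kubeadm or docker, the following dry run feeds it two sample image names. The sample lines are assumptions that imitate the `kubeadm config images list` output format; the script only prints the pull and retag pairs.

```shell
#!/bin/bash
# Assumption: these two lines imitate `kubeadm config images list` output.
sample_images='k8s.gcr.io/kube-apiserver:v1.16.4
k8s.gcr.io/pause:3.1'
mirror=registry.aliyuncs.com/google_containers

# Same awk as in image.sh: split on "/" and keep the second field (name:tag).
names="$(printf '%s\n' "$sample_images" | awk -F '/' '{print $2}')"

# Show the source and target of each retag, without calling docker.
plan="$(for n in $names; do echo "$mirror/$n -> k8s.gcr.io/$n"; done)"
printf '%s\n' "$plan"
```

Because awk keeps only the second field, the approach relies on every image name having exactly one "/" after the registry host.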

3. Initialize the masters and nodes

Run the steps in this section on master01.

3.1. Configure the initialization file

]# more kubeadm_config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.4
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
apiServer:
  extraArgs:
    authorization-mode: "Node,RBAC"
    runtime-config: api/all,settings.k8s.io/v1alpha1=true
    storage-backend: etcd3
    etcd-servers: https://192.168.121.81:2379,https://192.168.121.82:2379,https://192.168.121.83:2379
  certSANs:    # list the hostname, IP and VIP of every kube-apiserver node
  - k8s-master01
  - k8s-master02
  - k8s-master03
  - k8s-node01
  - k8s-node02
  - k8s-node03
  - 192.168.121.81
  - 192.168.121.82
  - 192.168.121.83
  - 192.168.121.84
  - 192.168.121.85
  - 192.168.121.86
  - 192.168.121.87
  - 192.168.121.88
  extraVolumes:
  - hostPath: /etc/localtime
    mountPath: /etc/localtime
    name: localtime
    readOnly: true
etcd:
  local:
    dataDir: /var/lib/etcd
    serverCertSANs:
    - master
    - 192.168.121.81
    - 192.168.121.82
    - 192.168.121.83
    - k8s-master01
    - k8s-master02
    - k8s-master03
    peerCertSANs:
    - master
    - 192.168.121.81
    - 192.168.121.82
    - 192.168.121.83
    - k8s-master01
    - k8s-master02
    - k8s-master03
    extraArgs:
      auto-compaction-retention: "1h"
      max-request-bytes: "33554432"
      quota-backend-bytes: "8589934592"
      enable-v2: "false"
controlPlaneEndpoint: "192.168.121.88:6443"
networking:
  podSubnet: "10.244.0.0/16"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # enable ipvs mode; the default is iptables mode
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: "rr"
  strictARP: false
  syncPeriod: 15s
iptables:
  masqueradeAll: true
  masqueradeBit: 14
  minSyncPeriod: 0s
  syncPeriod: 30s
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
failSwapOn: true

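kubeadm_config.yaml is the configuration file used for initialization.

3.2. Initialize the masters

Check the file for mistakes first. Warnings can be ignored; real problems are reported as errors:

]# kubeadm init --config=kubeadm_config.yaml --dry-run

Run the initialization:

]# kubeadm init --config=kubeadm_config.yaml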
......
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities 
and service account keys on each node and then running the following as root:

  kubeadm join 192.168.121.88:6443 --token kkpntz.yc5ow033k2mnp0z4     --discovery-token-ca-cert-hash sha256:bbc3c89aa94f90c457fc6a61f66c39b8bfccb12e84cd8df31ce226e09f0ea839     --control-plane 	  

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.121.88:6443 --token kkpntz.yc5ow033k2mnp0z4     --discovery-token-ca-cert-hash sha256:bbc3c89aa94f90c457fc6a61f66c39b8bfccb12e84cd8df31ce226e09f0ea839

Copy the kubeconfig for kubectl; kubectl's default kubeconfig path is ~/.kube/config:

]# mkdir -p $HOME/.kube
]# sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
]# sudo chown $(id -u):$(id -g) $HOME/.kube/config

The YAML passed to init is stored in a ConfigMap in the cluster and can be viewed at any time; it is used when other nodes and masters join:

]# kubectl -n kube-system get cm kubeadm-config -o yaml

Load the environment variable

]# echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
]# source ~/.bash_profile

Copy the certificates and configuration files to the other masters

for node in 192.168.121.82 192.168.121.83;do
    ssh $node 'mkdir -p /etc/kubernetes/pki/etcd'
    scp -r /root/kubeadm_config.yaml $node:/root/kubeadm_config.yaml
    scp -r /root/image.sh $node:/root/image.sh
    scp -r /etc/kubernetes/pki/ca.* $node:/etc/kubernetes/pki/
    scp -r /etc/kubernetes/pki/sa.* $node:/etc/kubernetes/pki/
    scp -r /etc/kubernetes/pki/front-proxy-ca.* $node:/etc/kubernetes/pki/
    scp -r /etc/kubernetes/pki/etcd/ca.* $node:/etc/kubernetes/pki/etcd/
done

Join the remaining k8s-master nodes (run on master02 and master03)

]# kubeadm join 192.168.121.88:6443 --token kkpntz.yc5ow033k2mnp0z4     --discovery-token-ca-cert-hash sha256:bbc3c89aa94f90c457fc6a61f66c39b8bfccb12e84cd8df31ce226e09f0ea839     --control-plane

Configure kubectl on all masters by preparing its kubeconfig

]# mkdir -p $HOME/.kube
]# sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
]# sudo chown $(id -u):$(id -g) $HOME/.kube/config

Set up the kubectl completion script

]# kubectl completion bash > /etc/bash_completion.d/kubectl
]# source /etc/bash_completion.d/kubectl

3.3. Initialize the nodes

Run on the worker nodes.

As with joining a master, prepare the environment and Docker in advance; the only difference is that the join command is run without --control-plane:

]# kubeadm join 192.168.121.88:6443 --token kkpntz.yc5ow033k2mnp0z4     --discovery-token-ca-cert-hash sha256:bbc3c89aa94f90c457fc6a61f66c39b8bfccb12e84cd8df31ce226e09f0ea839
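
A truncated token or hash is an easy copy-and-paste mistake when the join command is reused on several machines. As a rough pre-flight sketch (the helper name check_join_args is hypothetical; the patterns assume kubeadm's bootstrap token format of 6 characters, a dot, and 16 characters, and a SHA-256 hash written as 64 hex digits):

```shell
#!/bin/bash
# Hypothetical helper: check the shape of a bootstrap token ($1) and a CA cert hash ($2).
# It validates format only; it cannot tell whether the token is still valid.
check_join_args() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' \
        || { echo "bad token"; return 1; }
    printf '%s\n' "$2" | grep -Eq '^sha256:[a-f0-9]{64}$' \
        || { echo "bad hash"; return 1; }
    echo "ok"
}

# Values taken from the join command above.
check_join_args kkpntz.yc5ow033k2mnp0z4 \
    sha256:bbc3c89aa94f90c457fc6a61f66c39b8bfccb12e84cd8df31ce226e09f0ea839
```

Bootstrap tokens expire after 24 hours by default. If the join fails because the token is no longer valid, running `kubeadm token create --print-join-command` on a master prints a fresh worker join command.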

3.4. Initialization failure

If initialization fails, run kubeadm reset and initialize again:

]# kubeadm reset
]# rm -rf $HOME/.kube/config

4. Configure etcdctl

4.1. Copy etcdctl out of the container

]# docker cp $(docker ps -a | awk '/k8s_etcd/{print $1}'):/usr/local/bin/etcdctl /usr/local/bin/etcdctl
]# cat >/etc/profile.d/etcd.sh

4.2. Configure the etcd backup script

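mkdir -p /opt/etcd
cat > /opt/etcd/etcd_cron.sh <<'EOF'
# ... (the script header and the start of the option-parsing loop are missing from the source) ...
            ... &>/dev/null || { echo 'the value of the -c must be number'; exit 1; }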
            shift 2
            ;;
        -d)
            [ ! -d "$2" ] && mkdir -p $2
            bak_dir=$2
            shift 2
            ;;
         *)
            [[ -z "$1" || "$1" == '--' ]] && { shift;break; }
            echo "Internal error!"
            exit 1
            ;;
    esac
done



function etcd_v3(){

    ETCDCTL_API=3 etcdctl          --cert $cert_dir/healthcheck-client.crt        --key  $cert_dir/healthcheck-client.key        --cacert $cert_dir/ca.crt        --endpoints $endpoints $@
}

etcd::cron::save(){
    cd $bak_dir/
    etcd_v3 snapshot save  $bak_prefix$($cmd_suffix)$bak_suffix
    rm_files=$(ls -t $bak_prefix*$bak_suffix | tail -n +$((bak_count+1)))
    if [ -n "$rm_files" ];then
        rm -f $rm_files
    fi
}

main(){
    [ -n "$bak_count" ] && etcd::cron::save || etcd_v3 snapshot save $@
}

main $@
EOF
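
The retention step in etcd::cron::save sorts the snapshots newest first and deletes everything past the first bak_count entries. The following standalone sketch reproduces that idea in a temporary directory; the helper name prune_backups and the etcd-*.db naming are assumptions, since the script's actual prefix and suffix values are not visible above.

```shell
#!/bin/bash
# Hypothetical helper mirroring the retention step: in directory $1, keep the
# newest $2 files matching etcd-*.db and delete the rest.
# Assumption: the etcd- prefix and .db suffix are illustrative.
prune_backups() {
    ( cd "$1" || exit 1
      # ls -t lists newest first; tail -n +N starts output at line N.
      ls -t etcd-*.db | tail -n +"$(( $2 + 1 ))" | while read -r f; do
          rm -f "$f"
      done )
}

# Demonstration with six empty files whose modification times are one day apart.
demo_dir="$(mktemp -d)"
for i in 1 2 3 4 5 6; do
    touch -t "2021010${i}0000" "$demo_dir/etcd-2021-01-0$i.db"
done

prune_backups "$demo_dir" 4
remaining="$(ls "$demo_dir")"
printf '%s\n' "$remaining"
rm -rf "$demo_dir"
```

With a count of 4, the two oldest files are removed and the four newest remain, which matches the `-c 4` setting used in the cron entry.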

Run crontab -e and add the following line to keep the four most recent backups automatically:

0 0 * * * bash /opt/etcd/etcd_cron.sh -c 4 -d /opt/etcd/ &>/dev/null

5. Deploy add-ons

5.1. Network plugins

5.1.1 Deploy the flannel network

Create the flannel network:

]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml

The installation may fail because of network problems; in that case, switch to a different image source and run apply again.

5.1.2 Deploy the Calico network (optional)

]# curl https://docs.projectcalico.org/v3.9/manifests/calico-etcd.yaml -o calico.yaml

Edit the YAML file and deploy it:

]# sed -i -e "s@192.168.0.0/16@10.244.0.0/16@g" calico.yaml
]# kubectl apply -f calico.yaml
5.2. Web UI (Dashboard setup)

5.2.1 Configuration

Download the YAML

]# wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta8/aio/deploy/recommended.yaml

If the connection times out, retry a few times.

Configure the YAML

Change the image address

]# sed -i 's/.../registry.aliyuncs.com\/google_containers/g' recommended.yaml   # the search pattern (the default registry prefix) is missing from the source

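The default image registry is not reachable from this network, so it is replaced with the Alibaba Cloud mirror.

External access

]# sed -i '/targetPort: 8443/a\ \ \ \ \ \ nodePort: 30001\n\ \ type: NodePort' recommended.yaml

This configures a NodePort so that the Dashboard can be reached from outside at https://NodeIp:NodePort; here the port is 30001.

Add an administrator account

]# cat >> recommended.yaml

Create a super-administrator account for logging in to the Dashboard.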
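
The manifest appended to recommended.yaml is not shown above. A minimal sketch of what such an account could look like: the service account name dashboard-admin is inferred from the pattern used later when reading the token, while the binding name and the helper name render_admin_account are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: print a ServiceAccount plus a ClusterRoleBinding that
# grants it the built-in cluster-admin role.
# Assumptions: the binding name is illustrative; the account name matches the
# dashboard-admin pattern used when the token is read.
render_admin_account() {
    cat <<'EOF'
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: dashboard-admin
  namespace: kubernetes-dashboard
EOF
}

# Print for review; to apply:
#   render_admin_account >> recommended.yaml
render_admin_account
```

Binding to cluster-admin gives the account full control of the cluster, so the resulting token should be handled like a root credential.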

5.2.2 Deployment and access

Deploy the Dashboard

]# kubectl apply -f recommended.yaml

Check the status

]# kubectl get all -n kubernetes-dashboard

View the token

The administrator service account is bound to the default cluster-admin cluster role. Read its token with:

]# kubectl describe secrets -n kubernetes-dashboard $(kubectl -n kubernetes-dashboard get secret | awk '/dashboard-admin/{print $1}')

Log in to the Dashboard with the token from the output, using Firefox.

Original article: https://www.cnblogs.com/Gmiaomiao/p/12905795.html

