CKA Exam Preparation Notes

Notes on my preparation process for the CKA exam.

I used three main resources to study K8s:

Among them, KodeKloud provides an online lab environment, so there is no need to set up your own k8s test cluster. I ended up scoring 94 and passed comfortably.

Exam Tips

  • The exam environment is a remote Ubuntu desktop. You must work entirely inside that remote desktop: you cannot use the browser on your own machine, but you can use the Firefox browser inside the remote desktop.

  • You can enter the exam 30 minutes early to show your ID and go through the environment check.

  • Shell shortcuts and command completion are already configured in the exam terminal, so no need to worry about them.

  • If a question asks you to write a shell command into a file, do not use the k alias; write the full kubectl command.

  • During the exam you may copy and paste content from the official k8s documentation.

  • Be sure to review past exam questions online and actually practice them; the real questions are almost identical (this is key).

  • Prefer the English version of the questions to avoid misunderstandings caused by translation.

  • It is worth taking the Killer Shell - Exam Simulators mock exams. They are much harder than the real exam, and working through the answers brings a clear improvement.

  • Must-know commands

    • vim
    • grep
    • YAML syntax
    • netstat
    • ip
    • systemctl
    • journalctl
    • ipcalc (optional)
    • apt-cache madison
    • apt-get
    • json-path

Notes

Core Concepts

Everything is a resource.

The resources supported by the API can be listed with:

kubectl api-resources
Short name Full name
csr certificatesigningrequests
cs componentstatuses
cm configmaps
ds daemonsets
deploy deployments
ep endpoints
ev events
hpa horizontalpodautoscalers
ing ingresses
limits limitranges
ns namespaces
no nodes
pvc persistentvolumeclaims
pv persistentvolumes
po pods
pdb poddisruptionbudgets
psp podsecuritypolicies
rs replicasets
rc replicationcontrollers
quota resourcequotas
sa serviceaccounts
svc services

Common Commands

# List the resources supported by the API (includes each resource's apiVersion and kind)
kubectl api-resources
# Explain a specific resource
kubectl explain <resource-name>
kubectl explain replicaset
kubectl explain rs | head -n3

# List all resources in the current namespace
kubectl get all

Pod

# List all pods
kubectl get pods
kubectl get pods -o wide

# Create an nginx pod
kubectl run nginx --image=nginx

# Show pod details
kubectl describe pod <podname> | less

# Delete a pod
kubectl delete pod <podname>

# Find pods with specific labels
k get pods --selector env=dev,bu=finance

Creating a pod declaratively

  • Use the dry-run mechanism to generate YAML
  • Use kubectl create -f to create the pod from the YAML file
controlplane ~ ➜  kubectl run redis --image=redis123 --dry-run=client -o yaml > redis-definition.yaml

controlplane ~ ➜  cat redis-definition.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: redis
  name: redis
spec:
  containers:
  - image: redis123
    name: redis
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

controlplane ~ ➜  kubectl create -f redis-definition.yaml 
pod/redis created

The image name redis123 in the pod we just created cannot be pulled; let's fix it.

Method 1:

Run kubectl edit pod redis, change redis123 to redis in the spec section, and save; the change takes effect immediately.

Method 2:

Edit the redis-definition.yaml file directly, fix the image name, then:

kubectl apply -f redis-definition.yaml

Method 3:

First dump the pod's YAML:

kubectl get pod redis -o yaml > redis-definition.yaml

Edit it, then replace the pod:

kubectl replace -f redis-definition.yaml --force

Some examples


# Create a pod with a label
kubectl run redis --image=redis:alpine --labels tier=db
# Create a service for a pod
k expose pod redis --port 6379 --name redis-service

# Create a pod and specify its container port
k run custom-nginx --image=nginx --port=8080

# Create a pod and a service with the same name in one command
k run httpd --image httpd:alpine --port=80 --expose

Viewing/creating static pods

Static pods are managed directly by the kubelet; they can be viewed through the API but not managed through it.

# Find static pods; their names end with the node name, e.g. -controlplane
k get pods --all-namespaces | grep -controlplane
# Find the path of the static pod manifests
# step1: check the kubelet config file (passed on the command line at startup)
ps -aux | grep /usr/bin/kubelet
# the kubelet config path is /var/lib/kubelet/config.yaml
grep -i staticpod /var/lib/kubelet/config.yaml

# Create a static pod
# Generate the pod YAML and place it under /etc/kubernetes/manifests
kubectl run --restart=Never --image=busybox static-busybox --dry-run=client -o yaml --command -- sleep 1000 > /etc/kubernetes/manifests/static-busybox.yaml

ReplicaSet

# List replicasets
kubectl get replicasets
kubectl get rs
# Show more replicaset details
kubectl get rs -o wide
kubectl describe replicaset <replicaset-name>

# Create a replicaset from a YAML file
# Imperative: errors if the resource already exists
kubectl create -f replicaset-definition.yaml
# Declarative: no error if the resource already exists; better suited for rolling out updates
kubectl apply -f replicaset-definition.yaml

# Delete a replicaset
kubectl delete rs <replicaset-name>

# Fix a broken replicaset
# step1: fix the config
kubectl edit rs <replicaset-name>
# step2: delete the pods that previously failed
k get po | grep <replicaset-name> | awk  '{print $1}' | xargs -n 1 k delete po

# Scale up or down
k scale rs <replicaset-name> --replicas=5
# or edit replicas directly with k edit
k edit rs <replicaset-name>

Some examples

k create deployment webapp --image kodekloud/webapp-color --replicas=<replica-count>

DaemonSet

# List daemonsets in all namespaces
k get daemonsets --all-namespaces

# Show daemonset details
kubectl describe ds <daemonset-name> --namespace=<namespace-name>

# Build a daemonset YAML
# step1: dry-run a deployment to generate the YAML
# step2: edit the YAML and remove the replicas, strategy and status sections
# step3: change kind from Deployment to DaemonSet

Example: creating a daemonset

k create deployment elasticsearch -n kube-system --image registry.k8s.io/fluentd-elasticsearch:1.20 --dry-run=client  -o yaml > daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  creationTimestamp: null
  labels:
    app: elasticsearch
  name: elasticsearch
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: elasticsearch
    spec:
      containers:
      - image: registry.k8s.io/fluentd-elasticsearch:1.20
        name: fluentd-elasticsearch
        resources: {}
k apply -f daemonset.yaml

Deployment

# List deployments
kubectl get deployments
kubectl get deploy


# Create a deployment
# Method 1: create directly from the command line
kubectl create deployment --image httpd:2.4-alpine --replicas 3 httpd-frontend
# Method 2: dry-run to generate YAML, edit it, then apply
kubectl create deployment --image httpd:2.4-alpine --replicas 3 --dry-run=client httpd-frontend -o yaml > httpd-deployment.yaml
kubectl apply -f httpd-deployment.yaml

# Delete a deployment
k delete deploy <deployment-name>

Some examples

# Create a deployment in a specific namespace
kubectl create deployment redis-deploy -n dev-ns --image redis --replicas 2

Namespace

# List all namespaces
k get namespaces
k get ns
# Count the namespaces in the environment
k get ns --no-headers | wc -l

# List pods in a specific namespace
k get pods -n <namespace-name>

# List pods in all namespaces
k get pods --all-namespaces

# Create a pod in a specific namespace
k run redis --image=redis  -n finance
k run redis --image=redis --dry-run=client -n finance -o yaml

# List services in a specific namespace
k get service -n <namespace-name>

# Create a new namespace
k create namespace <namespace-name>

Note:

  • DNS resolution rules for services (verified in the sketch below)
    • Within the same namespace, the service name alone is enough
    • Across namespaces, append the namespace suffix, e.g.
      • db-service.dev
      • db-service.dev.svc.cluster.local
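
A quick way to verify these rules from inside a pod. A minimal sketch (the pod name test-pod is hypothetical; db-service is assumed to live in the dev namespace):

# the short name resolves only from pods in the dev namespace
kubectl exec -it test-pod -n dev -- nslookup db-service
# the qualified names resolve from any namespace
kubectl exec -it test-pod -- nslookup db-service.dev
kubectl exec -it test-pod -- nslookup db-service.dev.svc.cluster.local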

Service

# List all services
k get services
k get svc

# Show details of a specific service
k describe svc <service-name>

# Create a service with a YAML like the one below
k apply -f service-definition.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: webapp-service 
  namespace: default
spec:
  ports: 
  - nodePort: 30080 
    port: 8080
    targetPort: 8080 
  selector:
    name:  simple-webapp
  type: NodePort

Node

# List all nodes in the cluster
kubectl get nodes

# Show node details
k describe node <node-name>

Scheduling

Pin a pod to a specific node via nodeName

---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  nodeName: node01
  containers:
  -  image: nginx
     name: nginx

Taints and Tolerations

# Taint node01 with the NoSchedule effect
k taint node node01 spray=mortein:NoSchedule
# Remove the taint with key spray from node01
k taint node node01 spray-

Create a pod that tolerates a specific taint

---
apiVersion: v1
kind: Pod
metadata:
  name: bee
spec:
  containers:
  - image: nginx
    name: bee
  tolerations:
  - key: spray
    value: mortein
    effect: NoSchedule
    operator: Equal

Label

# Add the label color=blue to a node
kubectl label node node01 color=blue

Use a label and node affinity to steer a deployment's pods

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blue
spec:
  replicas: 3
  selector:
    matchLabels:
      run: nginx
  template:
    metadata:
      labels:
        run: nginx
    spec:
      containers:
      - image: nginx
        imagePullPolicy: Always
        name: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: color
                operator: In
                values:
                - blue

When the resources defined for a pod are insufficient for it to run:

kubectl get pod elephant -o yaml > ele.yaml
# after editing the resource requests
kubectl replace -f ele.yaml --force

Select a custom scheduler for a pod via schedulerName

---
apiVersion: v1 
kind: Pod 
metadata:
  name: nginx 
spec:
  schedulerName: my-scheduler
  containers:
  - image: nginx
    name: nginx

Metrics & Logs

Metrics

# Metrics

git clone https://github.com/kodekloudhub/kubernetes-metrics-server.git
cd kubernetes-metrics-server/
kubectl create -f .
# wait a minute
kubectl top node

# Node with the highest CPU usage
kubectl top node --sort-by='cpu' --no-headers | head -1 
# Node with the highest memory usage
kubectl top node --sort-by='memory' --no-headers | head -1

# Pod with the highest memory usage
kubectl top pod --sort-by='memory' --no-headers | head -1
# Pod with the lowest memory usage
kubectl top pod --sort-by='memory' --no-headers | tail -1

Log

# View pod logs
kubectl logs <pod-name>

# View logs of a specific container in a pod (a pod can have multiple containers)
kubectl logs <pod-name> -c <container-name>

# Open a shell in a specific container of a pod
kubectl exec -it <pod-name> -c <container-name> -- /bin/bash

Application Lifecycle Management

Updating a deployment

# Check the deployment's update strategy
k describe deploy <deployment-name> | grep StrategyType
# RollingUpdate: only a portion of the pods are updated at a time
# Recreate: all pods are terminated and recreated at once

k describe deploy <deployment-name> | grep RollingUpdateStrategy
# '25% max unavailable, 25% max surge' means at most 25% of the pods may be unavailable at any time

# Update the deployment's image
k edit deploy <deployment-name>
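
Besides k edit, the image can be changed imperatively and the rollout watched or rolled back. A small sketch (all names are placeholders):

kubectl set image deploy/<deployment-name> <container-name>=<new-image>
kubectl rollout status deploy/<deployment-name>
kubectl rollout history deploy/<deployment-name>
kubectl rollout undo deploy/<deployment-name>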

Command Line

Specifying a container's command

apiVersion: v1 
kind: Pod 
metadata:
  name: ubuntu-sleeper-2 
spec:
  containers:
  - name: ubuntu
    image: ubuntu
    command:
      - "sleep"
      - "5000"
---
apiVersion: v1
kind: Pod 
metadata:
  name: webapp-green
  labels:
      name: webapp-green 
spec:
  containers:
  - name: simple-webapp
    image: kodekloud/webapp-color
    command: ["python", "app.py"]
    args: ["--color", "pink"]

Secret

# List secrets
k get secrets
# Show secret details
k describe secrets <secret-name>

# Create a secret named db-secret
kubectl create secret generic db-secret --from-literal=DB_Host=sql01 --from-literal=DB_User=root

Using a secret in a pod

---
apiVersion: v1 
kind: Pod 
metadata:
  labels:
    name: webapp-pod
  name: webapp-pod
  namespace: default 
spec:
  containers:
  - image: kodekloud/simple-webapp-mysql
    imagePullPolicy: Always
    name: webapp
    envFrom:
    - secretRef:
        name: db-secret

Multi Container

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: yellow
  name: yellow
spec:
  containers:
  - image: busybox
    name: lemon
    resources: {}
    command: ["sleep", "1000"]
  - image: redis
    name: gold
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

Sharing a filesystem between containers

---
apiVersion: v1
kind: Pod
metadata:
  name: app
  namespace: elastic-stack
  labels:
    name: app
spec:
  containers:
  - name: app
    image: kodekloud/event-simulator
    volumeMounts:
    - mountPath: /log
      name: log-volume

  - name: sidecar
    image: kodekloud/filebeat-configured
    volumeMounts:
    - mountPath: /var/log/event-simulator/
      name: log-volume

  volumes:
  - name: log-volume
    hostPath:
      # directory location on host
      path: /var/log/webapp
      # this field is optional
      type: DirectoryOrCreate

Init Container

---
apiVersion: v1
kind: Pod
metadata:
  name: red
  namespace: default
spec:
  containers:
  - command:
    - sh
    - -c
    - echo The app is running! && sleep 3600
    image: busybox:1.28
    name: red-container
  initContainers:
  - image: busybox
    name: red-initcontainer
    command: 
      - "sleep"
      - "20"

Cluster Maintenance

OS Upgrades

# To upgrade node01, first evict all pods from it
# effect: pods evicted, node drained
# drain also cordons the node by default, i.e. marks it unschedulable
kubectl drain node01 --ignore-daemonsets

# After maintenance, mark node01 schedulable again
kubectl uncordon node01

# NOTE: drain only handles pods managed by a controller such as a replicaset; standalone pods require --force
k drain node01 --ignore-daemonsets --force
# pods not managed by a replicaset will be lost permanently

# Mark node01 unschedulable
kubectl cordon node01

Upgrading K8s

# Check the control-plane version
k version --short
# Check node versions; currently 1.26
k get nodes

# Check which versions are available to upgrade to
kubeadm upgrade plan

# There are two nodes: controlplane and node01
# First drain the controlplane node
kubectl drain controlplane --ignore-daemonsets

# Begin upgrading controlplane
apt update

# Upgrade kubeadm first
apt-get install kubeadm=1.27.0-00

kubeadm upgrade apply v1.27.0

# Upgrade kubelet
apt-get install kubelet=1.27.0-00 

# Restart the service
systemctl daemon-reload
systemctl restart kubelet

# controlplane upgrade done; mark it schedulable again
kubectl uncordon controlplane

# Drain node01 next; evicted pods move to controlplane, so remove controlplane's taint first if one is set
k drain node01
# Then ssh into node01 and upgrade it
apt-get update
apt-get install kubeadm=1.27.0-00
kubeadm upgrade node
apt-get install kubelet=1.27.0-00 
systemctl daemon-reload
systemctl restart kubelet

# Finally, mark node01 schedulable again
kubectl uncordon node01

Backup and Restore

# Check the etcd version (from the logs or the image tag)
kubectl -n kube-system logs etcd-controlplane | grep -i 'etcd-version'
kubectl -n kube-system describe pod etcd-controlplane | grep Image:

# Find etcd's endpoint
kubectl -n kube-system describe pod etcd-controlplane | grep '\--listen-client-urls'

# Find etcd's cert file
kubectl -n kube-system describe pod etcd-controlplane | grep '\--cert-file'
# Find etcd's CA cert file
kubectl -n kube-system describe pod etcd-controlplane | grep '\--trusted-ca-file'

# Back up etcd before maintenance or any risky change to the environment
ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /opt/snapshot-pre-boot.db

# Some pods/deployments were lost during the maintenance window and need restoring
# First restore the etcd snapshot into a new directory
ETCDCTL_API=3 etcdctl  --data-dir /var/lib/etcd-from-backup \
snapshot restore /opt/snapshot-pre-boot.db

# Edit the etcd static pod YAML to point at the restored data-dir
vim /etc/kubernetes/manifests/etcd.yaml
# Change it as below: point the etcd-data volume at hostPath: /var/lib/etcd-from-backup
#  volumes:
#  - hostPath:
#      path: /var/lib/etcd-from-backup
#      type: DirectoryOrCreate
#    name: etcd-data

# After updating the YAML, etcd is recreated automatically; if the pod stays Pending, delete it manually to trigger a restart
kubectl delete pod -n kube-system etcd-controlplane

Multiple Clusters

# View cluster info
kubectl config view
# or
kubectl config get-clusters

# View the current cluster
kubectl config view | grep current-context

# Switch clusters
kubectl config use-context cluster2


# Type1: Stacked ETCD Topology
# The following finds the etcd pod inside the cluster
kubectl get pods -n kube-system | grep etcd

kubectl -n kube-system describe pod etcd-cluster1-controlplane

# Type2: External ETCD
# No etcd pod inside the cluster; kube-apiserver points directly at an external etcd endpoint
ps -aux | grep etcd


# List the members of the etcd cluster
ETCDCTL_API=3 etcdctl \
 --endpoints=https://127.0.0.1:2379 \
 --cacert=/etc/etcd/pki/ca.pem \
 --cert=/etc/etcd/pki/etcd.pem \
 --key=/etc/etcd/pki/etcd-key.pem \
  member list
  
# Back up a stacked etcd
kubectl config use-context cluster1
kubectl describe pods -n kube-system etcd-cluster1-controlplane  | grep advertise-client-urls
# --advertise-client-urls=https://10.1.218.16:2379

kubectl describe  pods -n kube-system etcd-cluster1-controlplane  | grep pki
#      --cert-file=/etc/kubernetes/pki/etcd/server.crt
#      --key-file=/etc/kubernetes/pki/etcd/server.key
#      --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
#      --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
#      --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
#      --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
#      /etc/kubernetes/pki/etcd from etcd-certs (rw)
#    Path:          /etc/kubernetes/pki/etcd

# Log in to the node hosting the etcd pod, then:
ETCDCTL_API=3 etcdctl --endpoints=https://10.1.220.8:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save /opt/cluster1.db

# Restore an external etcd
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/etcd/pki/ca.pem --cert=/etc/etcd/pki/etcd.pem --key=/etc/etcd/pki/etcd-key.pem snapshot restore /root/cluster2.db --data-dir /var/lib/etcd-data-new

systemctl status etcd.service
vim /etc/systemd/system/etcd.service
systemctl daemon-reload 
# The service runs as user etcd, so fix the ownership of the restored data
chown -R etcd:etcd /var/lib/etcd-data-new

systemctl restart etcd

# Recommended: restart the control-plane components (kube-scheduler, kube-controller-manager) and the kubelet

Security

TLS and Certificates

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 3478545236123834268 (0x304646764e49d79c)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN = kubernetes
        Validity
            Not Before: Dec  1 12:24:30 2023 GMT
            Not After : Nov 30 12:24:30 2024 GMT
        Subject: CN = kube-apiserver
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:ce:fc:11:8c:a0:2c:83:2d:20:b2:47:83:dc:38:
                    ec:3f:7f:b5:9a:09:c8:a5:7a:16:7a:c7:2d:1d:62:
                    ae:a6:02:7e:d0:be:6a:c6:fd:71:d3:1a:a8:fd:9b:
                    4d:11:45:f1:21:aa:20:a1:9d:4e:d7:be:f1:22:25:
                    a9:82:52:87:f8:e5:ce:d5:30:e6:1f:99:a5:13:56:
                    e1:38:e3:68:f5:54:de:67:e1:d1:7e:7a:30:12:6c:
                    48:fd:d9:89:95:07:2a:51:8e:d8:fa:0c:02:79:54:
                    c4:8f:16:42:b1:f4:a9:0e:ac:83:20:f7:d4:eb:c6:
                    8f:e2:74:2a:03:c7:2a:b6:d9:c4:ea:28:3c:b8:14:
                    3f:dd:f0:d9:d9:b2:1f:6c:89:93:0b:37:cd:1b:57:
                    1c:8e:53:fe:d1:40:f5:80:ee:2d:8d:c6:ce:c2:39:
                    03:d6:c7:aa:61:cb:b5:8d:5c:d6:73:99:ef:c8:6b:
                    87:ac:0e:3b:59:bb:ec:e2:c5:04:54:4b:ad:d5:da:
                    48:16:f5:15:0c:bb:29:fe:13:c7:ed:29:dc:bc:01:
                    b5:ac:dd:84:c9:01:e1:fc:40:1e:8f:c5:4a:82:c5:
                    69:3a:4a:54:b3:22:c6:4b:61:78:54:59:e9:21:ef:
                    a9:5d:cf:a1:b4:c8:f2:18:4e:6a:03:d3:44:3c:be:
                    9e:d9
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier: 
                keyid:27:89:3A:1C:08:4F:74:4E:60:20:1A:44:E0:47:AC:D6:F9:05:93:E5
            X509v3 Subject Alternative Name: 
                DNS:controlplane, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, IP Address:10.96.0.1, IP Address:192.21.43.6
    Signature Algorithm: sha256WithRSAEncryption
         8e:75:b2:8e:47:5c:8f:a1:6c:c8:49:da:ef:e0:09:09:6d:cf:
         dd:cd:35:f0:e2:df:b2:de:b0:f0:8a:0a:4b:4b:32:7e:46:45:
         6b:0b:52:7b:8d:ad:17:67:59:fb:7d:68:86:2b:d1:91:7d:99:
         c8:ff:d2:17:46:a0:92:ae:3c:55:9a:e4:f5:ee:59:48:a5:2a:
         93:4d:8d:02:ba:02:73:f6:07:36:2a:5a:99:4a:33:52:ce:36:
         ea:44:29:19:cb:d1:6f:4a:db:1f:d9:47:7e:8c:e7:2b:6a:7f:
         11:43:57:f1:f6:7a:19:c1:b6:ff:81:37:71:3a:f6:14:d5:63:
         ab:9d:31:f7:bc:4c:0a:19:fb:36:d7:84:f2:1c:fd:c5:fc:8d:
         83:b0:8f:ec:1b:9c:ae:57:4f:f1:96:f5:45:f5:c5:4e:8f:a0:
         ac:04:97:fa:87:2c:0d:5c:83:a9:d8:94:2f:d5:64:8d:ec:13:
         c6:b4:93:d2:29:f9:87:23:a0:90:7b:68:8c:d6:5c:fd:a3:97:
         96:68:a7:e0:2a:22:8b:a9:d4:01:fc:f5:06:39:7f:63:6b:33:
         8e:3f:6b:23:18:9c:c8:7c:0f:0c:c1:4c:73:3f:a3:c2:d7:ea:
         cb:2a:a8:5d:86:a1:0d:4a:5e:12:51:ba:6c:1e:1c:20:75:d6:
         5f:62:5f:d3

Viewing Certificates

# Find kube-apiserver's cert file
cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep tls-cert-file

# Inspect the cert file

openssl x509 -in ./apiserver.crt -text -noout

# kube-apiserver is down; troubleshoot
# List all containers
crictl ps -a
crictl ps -a | grep kube-api
crictl logs <container-id>

CertificateSigningRequest

# Given a CSR file akshay.csr, base64-encode it first
cat akshay.csr | base64 -w 0

# Create the CSR object with k apply using the YAML below

# List CSRs; a freshly created one is in the Pending state
k get certificatesigningrequests
k get csr

# Approve it; the state becomes Approved,Issued
k certificate approve <csr-name>

# Deny a CSR
kubectl certificate deny <csr-name>

# Delete a CSR
k delete csr <csr-name>

# Show CSR details
k get csr <csr-name> -o yaml
---
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: akshay
spec:
  groups:
  - system:authenticated
  request: <Paste the base64 encoded value of the CSR file>
  signerName: kubernetes.io/kube-apiserver-client
  usages:
  - client auth

Kube Config

The client-certificate and client-key are the two components used for client authentication in SSL/TLS communication.

A client certificate is a digital certificate that proves the client's identity. It contains the client's public key along with the associated identity information, and is usually issued by a certificate authority (CA). The server uses it to verify who the client is, ensuring both parties in the communication are legitimate.

The client key is the private key that pairs with the client certificate. The client signs data with this private key, and the server verifies the signature with the public key from the client certificate, thereby confirming the client's identity.

In SSL/TLS communication the two are used together: the certificate establishes the client's identity, while the key produces the digital signatures used for authentication and for cryptographic operations.

The three common file extensions mean:

  • .crt: a certificate file, containing the public key and certificate metadata.
  • .csr: a certificate signing request, containing the public key and some identity information, submitted to a CA for signing.
  • .key: a key file, containing the private key that corresponds to the certificate.

How the three relate:

  1. The .csr file is submitted to a CA for signing; it carries the public key and identity information.
  2. The .crt file is the certificate signed by the CA; it carries the public key and certificate metadata.
  3. The .key file holds the private key paired with the certificate, used for signing and decryption.

The three files are typically used together to establish secure connections and perform cryptographic operations.
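
A minimal openssl sketch of how the three files relate (file names follow the akshay example above; the subject is an assumption):

# generate a private key (.key)
openssl genrsa -out akshay.key 2048
# create a signing request (.csr) from the key
openssl req -new -key akshay.key -subj "/CN=akshay" -out akshay.csr
# after the CA signs the CSR you receive the certificate (.crt);
# verify that the certificate and the key actually pair up
openssl x509 -noout -modulus -in akshay.crt | openssl md5
openssl rsa -noout -modulus -in akshay.key | openssl md5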

# Default kubeconfig path
ls ~/.kube/config

# View all clusters
k config view

# Use a specific config file
kubectl config view --kubeconfig <config-file-path>

# View the current-context of a config
kubectl config current-context --kubeconfig <config-file-path>

# Switch context
kubectl config --kubeconfig=<config-file-path> use-context <new-context-name>

# Replace ~/.kube/config with your custom config file to make it the default

RBAC

# Check kube-apiserver's authorization-mode
kubectl describe pod kube-apiserver-controlplane -n kube-system | grep mode

# List roles
k get roles
# List roles in all namespaces
k get roles.rbac.authorization.k8s.io --all-namespaces --no-headers 

# Show role details
k describe roles <role-name>

# List rolebindings
k get rolebindings  --all-namespaces

# Show rolebinding details
k describe rolebindings <rolebinding-name> -n kube-system

# View the config
k config view
# Run a kubectl command as a user from the config
k get pods --as <user-name>

# The command above fails: the user has no permissions, so create a role and a rolebinding
# Create a role named developer
k create role developer --namespace=default --verb=list,create,delete --resource=pod --dry-run=client -o yaml
# Create a rolebinding that binds developer to the user dev-user
kubectl create rolebinding dev-user-binding --namespace=default --role=developer --user=dev-user

# Edit a role's permissions; no re-binding is needed afterwards
k edit role <role-name>

Cluster Role

# List clusterroles
k get clusterrole
# List clusterrolebindings
k get clusterrolebindings

# Count the clusterroles in the environment
k get clusterrole --no-headers | wc -l

# Count the clusterrolebindings
k get clusterrolebindings.rbac.authorization.k8s.io --no-headers | wc -l

# List all non-namespaced resources
k api-resources --namespaced=false

# Show clusterrole details
k describe clusterrole <clusterrole-name>

# Show clusterrolebinding details
k describe clusterrolebindings.rbac.authorization.k8s.io <name>

# Create a clusterrole for managing nodes
k create clusterrole nodeadmin --resource=nodes --verb=get,watch,list,create,delete --dry-run=client  -o yaml

# Create a clusterrolebinding
k create clusterrolebinding michella-binding --clusterrole nodeadmin --user=michelle  --dry-run=client -o yaml

# Test whether the user has a given permission
kubectl auth can-i list nodes --as michelle

# Grant user michelle more permissions: create a storage clusterrole and its binding
k create clusterrole storage-admin --resource=persistentvolumes,storageclasses --verb=get,watch,list,create,delete --dry-run=client -o yaml

k create clusterrolebinding michelle-storage-admin --clusterrole=storage-admin --user=michelle --dry-run=client -o yaml

Service Account

# List all service accounts
k get serviceaccounts
k get sa

# Show service account details
k describe sa <sa-name>

# Check which serviceaccount a pod uses
k get po web-dashboard-97c9c59f6-vmw8g -o yaml | grep -i account

# Create a serviceaccount
k create serviceaccount dashboard-sa

# Grant the serviceaccount permissions with the role and rolebinding YAML below

# Create a token for the dashboard-sa serviceaccount
kubectl create token dashboard-sa

# Edit the deployment's serviceaccount (the default is default)
k edit deployments.apps web-dashboard
# or
k set serviceaccount deploy/web-dashboard dashboard-sa
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups:
  - ''
  resources:
  - pods
  verbs:
  - get
  - watch
  - list


---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: ServiceAccount
  name: dashboard-sa # Name is case sensitive
  namespace: default
roleRef:
  kind: Role #this must be Role or ClusterRole
  name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io

Image Secret

# Using images from a private registry
k create secret --help

# Create a docker-registry type secret
k create secret docker-registry private-reg-cred --docker-username=dock_user --docker-password=dock_passwd --docker-server=myprivateregistry.com:5000 --docker-email=dock_user@myprivateregistry.com
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "3"
  creationTimestamp: "2023-12-02T13:18:12Z"
  generation: 3
  labels:
    app: web
  name: web
  namespace: default
  resourceVersion: "2598"
  uid: b375109f-b184-4cb1-9a80-425f1a940e3d
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: web
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: web
    spec:
      containers:
      - image: myprivateregistry.com:5000/nginx:alpine
        imagePullPolicy: IfNotPresent
        name: nginx
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: private-reg-cred
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30

Security Context

# Run whoami to see which user the container runs as
kubectl exec <pod-name> -- whoami
# Force-delete a pod
kubectl delete pod <pod-name> --force

# Updating a security context requires deleting and recreating the pod; k replace does this
# Edit the YAML below so the container runs the sleep command as user ID 1010
---
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-sleeper
  namespace: default
spec:
  securityContext:
    runAsUser: 1010
  containers:
  - command:
    - sleep
    - "4800"
    image: ubuntu
    name: ubuntu-sleeper
    securityContext:
      capabilities:
        add: ["SYS_TIME"]

Security contexts at different levels

apiVersion: v1
kind: Pod
metadata:
  name: multi-pod
spec:
  securityContext:
    runAsUser: 1001
  containers:
  -  image: ubuntu
     name: web
     command: ["sleep", "5000"]
     securityContext:
      runAsUser: 1002

  -  image: ubuntu
     name: sidecar
     command: ["sleep", "5000"]

Edit the YAML below to add the SYS_TIME capability to the container

---
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-sleeper
  namespace: default
spec:
  containers:
  - command:
    - sleep
    - "4800"
    image: ubuntu
    name: ubuntu-sleeper
    securityContext:
      capabilities:
        add: ["SYS_TIME"]

Network Policy

# List networkpolicies
k get networkpolicies.networking.k8s.io
k get netpol

# Show netpol details
k describe netpol <netpol-name> 

# Use the selector from the netpol to find the pods it applies to
k get pods --selector name=payroll
# or
k get po --show-labels

# Create a network policy so the internal pods can only reach mysql and payroll
# Note that port 53 is also opened, because DNS is reached over port 53
# Implemented with the YAML below
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: internal-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      name: internal
  policyTypes:
  - Egress
  - Ingress
  ingress:
    - {}
  egress:
  - to:
    - podSelector:
        matchLabels:
          name: mysql
    ports:
    - protocol: TCP
      port: 3306

  - to:
    - podSelector:
        matchLabels:
          name: payroll
    ports:
    - protocol: TCP
      port: 8080

  - ports:
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP

Storage

Persistent Volume Claim

# View a log file inside a pod
kubectl exec <pod-name> -- cat /log/app.log

# Add a persistent log volume to a pod
kubectl get po webapp -o yaml > webapp.yaml

# Configure a hostPath volume
apiVersion: v1
kind: Pod
metadata:
  name: webapp
spec:
  containers:
  - name: event-simulator
    image: kodekloud/event-simulator
    env:
    - name: LOG_HANDLERS
      value: file
    volumeMounts:
    - mountPath: /log
      name: log-volume

  volumes:
  - name: log-volume
    hostPath:
      # directory location on host
      path: /var/log/webapp
      # this field is optional
      type: Directory

Create a Persistent Volume with the YAML below:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-log
spec:
  persistentVolumeReclaimPolicy: Retain
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 100Mi
  hostPath:
    path: /pv/log

Create a Persistent Volume Claim:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-log-1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Mi
# List persistent volumes
k get persistentvolume
k get pv

# List persistent volume claims
k get pvc
k get persistentvolumeclaims

# List multiple resource types at once
k get pv,pvc

# Use the PVC in a pod
apiVersion: v1
kind: Pod
metadata:
  name: webapp
spec:
  containers:
  - name: event-simulator
    image: kodekloud/event-simulator
    env:
    - name: LOG_HANDLERS
      value: file
    volumeMounts:
    - mountPath: /log
      name: log-volume

  volumes:
  - name: log-volume
    persistentVolumeClaim:
      claimName: claim-log-1

A PVC that is in use by a pod cannot be deleted directly; the deletion only completes once the pod is gone (a quick sketch below).
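
A minimal sketch of this behavior, using the names from the example above:

k delete pvc claim-log-1 --wait=false
k get pvc claim-log-1    # STATUS shows Terminating while webapp still mounts it
k delete pod webapp
k get pvc claim-log-1    # the PVC is now actually gone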

Storage Class

# List storage classes
k get storageclasses.storage.k8s.io
k get sc

# Show sc details
k describe sc <sc-name>

# Create a storage class with the YAML below
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: delayed-volume-sc
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

Note that with a WaitForFirstConsumer storage class, a PVC does not provision storage as soon as it binds; it stays Pending until a pod that uses the PVC is scheduled, and only then is the storage actually provisioned (sketched below).
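
A minimal sketch of observing this (local-pvc is the claim used by the pod below; pod.yaml is a hypothetical file containing that pod):

k get pvc local-pvc    # STATUS: Pending, since no consumer is scheduled yet
k apply -f pod.yaml    # create the nginx pod that mounts local-pvc
k get pvc local-pvc    # STATUS: Bound once the pod is scheduled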

Below is a YAML for a pod that uses the PVC:

---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:alpine
    volumeMounts:
      - name: local-persistent-storage
        mountPath: /var/www/html
  volumes:
    - name: local-persistent-storage
      persistentVolumeClaim:
        claimName: local-pvc

Networking

Basic network commands

# Find a node's internal IP address
k get nodes -o wide
# Find which interface holds that IP
ip a | grep <ip>

# View an interface's MAC address / state
ip link show <interface-name>

# View the CNI (Container Network Interface) interfaces created by containerd
ip link | grep cni

# View the routing table
ip route list

# View the default route
# The default route is the gateway packets are sent to when the destination network is not in the routing table
ip route show default

# View the ports the kube services listen on
netstat -ntpl | grep kube

# View etcd's TCP listening ports
netstat -ntpl | grep etcd

# View all of etcd's connections (including clients)
netstat -anp | grep etcd

CNI

# Find kubelet's container runtime endpoint
ps aux | grep kubelet | grep runtime

# List the installed CNI plugins
ls /opt/cni/bin/

# See which CNI plugin the cluster is configured to use
ls /etc/cni/net.d/

Deploying Weave Net

# Deploy weave-net

k apply -f weave-daemonset-k8s.yaml
# serviceaccount/weave-net created
# clusterrole.rbac.authorization.k8s.io/weave-net created
# clusterrolebinding.rbac.authorization.k8s.io/weave-net created
# role.rbac.authorization.k8s.io/weave-net created
# rolebinding.rbac.authorization.k8s.io/weave-net created
# daemonset.apps/weave-net created
apiVersion: v1
kind: List
items:
  - apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: weave-net
      labels:
        name: weave-net
      namespace: kube-system
  - apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: weave-net
      labels:
        name: weave-net
    rules:
      - apiGroups:
          - ''
        resources:
          - pods
          - namespaces
          - nodes
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - extensions
        resources:
          - networkpolicies
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - 'networking.k8s.io'
        resources:
          - networkpolicies
        verbs:
          - get
          - list
          - watch
      - apiGroups:
        - ''
        resources:
        - nodes/status
        verbs:
        - patch
        - update
  - apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: weave-net
      labels:
        name: weave-net
    roleRef:
      kind: ClusterRole
      name: weave-net
      apiGroup: rbac.authorization.k8s.io
    subjects:
      - kind: ServiceAccount
        name: weave-net
        namespace: kube-system
  - apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: weave-net
      namespace: kube-system
      labels:
        name: weave-net
    rules:
      - apiGroups:
          - ''
        resources:
          - configmaps
        resourceNames:
          - weave-net
        verbs:
          - get
          - update
      - apiGroups:
          - ''
        resources:
          - configmaps
        verbs:
          - create
  - apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: weave-net
      namespace: kube-system
      labels:
        name: weave-net
    roleRef:
      kind: Role
      name: weave-net
      apiGroup: rbac.authorization.k8s.io
    subjects:
      - kind: ServiceAccount
        name: weave-net
        namespace: kube-system
  - apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: weave-net
      labels:
        name: weave-net
      namespace: kube-system
    spec:
      # Wait 5 seconds to let pod connect before rolling next pod
      selector:
        matchLabels:
          name: weave-net
      minReadySeconds: 5
      template:
        metadata:
          labels:
            name: weave-net
        spec:
          initContainers:
            - name: weave-init
              image: 'weaveworks/weave-kube:2.8.1'
              command:
                - /home/weave/init.sh
              env:
              securityContext:
                privileged: true
              volumeMounts:
                - name: cni-bin
                  mountPath: /host/opt
                - name: cni-bin2
                  mountPath: /host/home
                - name: cni-conf
                  mountPath: /host/etc
                - name: lib-modules
                  mountPath: /lib/modules
                - name: xtables-lock
                  mountPath: /run/xtables.lock
                  readOnly: false
          containers:
            - name: weave
              command:
                - /home/weave/launch.sh
              env:
                - name: IPALLOC_RANGE
                  value: 10.32.1.0/24
                - name: INIT_CONTAINER
                  value: "true"
                - name: HOSTNAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: spec.nodeName
              image: 'weaveworks/weave-kube:2.8.1'
              readinessProbe:
                httpGet:
                  host: 127.0.0.1
                  path: /status
                  port: 6784
              resources:
                requests:
                  cpu: 50m
              securityContext:
                privileged: true
              volumeMounts:
                - name: weavedb
                  mountPath: /weavedb
                - name: dbus
                  mountPath: /host/var/lib/dbus
                  readOnly: true
                - mountPath: /host/etc/machine-id
                  name: cni-machine-id
                  readOnly: true
                - name: xtables-lock
                  mountPath: /run/xtables.lock
                  readOnly: false
            - name: weave-npc
              env:
                - name: HOSTNAME
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: spec.nodeName
              image: 'weaveworks/weave-npc:2.8.1'
#npc-args
              resources:
                requests:
                  cpu: 50m
              securityContext:
                privileged: true
              volumeMounts:
                - name: xtables-lock
                  mountPath: /run/xtables.lock
                  readOnly: false
          hostNetwork: true
          dnsPolicy: ClusterFirstWithHostNet
          hostPID: false
          restartPolicy: Always
          securityContext:
            seLinuxOptions: {}
          serviceAccountName: weave-net
          tolerations:
            - effect: NoSchedule
              operator: Exists
            - effect: NoExecute
              operator: Exists
          volumes:
            - name: weavedb
              hostPath:
                path: /var/lib/weave
            - name: cni-bin
              hostPath:
                path: /opt
            - name: cni-bin2
              hostPath:
                path: /home
            - name: cni-conf
              hostPath:
                path: /etc
            - name: cni-machine-id
              hostPath:
                path: /etc/machine-id
            - name: dbus
              hostPath:
                path: /var/lib/dbus
            - name: lib-modules
              hostPath:
                path: /lib/modules
            - name: xtables-lock
              hostPath:
                path: /run/xtables.lock
                type: FileOrCreate
          priorityClassName: system-node-critical
      updateStrategy:
        type: RollingUpdate
# View weave agents/peers
k get pods --all-namespaces | grep weave
# one weave pod runs on every node (masters and workers alike)

# View the interface created by weave
ip link | grep weave

# View the IP range on the weave interface
ip addr show weave
ip route list

# Find weave-net's default gateway on a worker node; log in to the worker node first
ip route | grep weave

Service Networking

# Find the network range used by the cluster's nodes
ip addr | grep eth0
# 215: eth0@if216: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
#    inet 192.4.210.9/24 brd 192.4.210.255 scope global eth0

ipcalc -b 192.4.210.9
# the computed node network range is 192.4.210.0/24

# Find the pod network range of the cluster
# If pods use the weave network, query it like this
k logs -n kube-system weave-net-8v8hr  | grep ipalloc-range
# the pod network range is 10.244.0.0/16

# Find the service network range
cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep cluster-ip-range
# --service-cluster-ip-range=10.96.0.0/12

# Check the kube-proxy mode
k logs -n kube-system <kube-proxy-pod-name>

CoreDNS

# See which DNS solution the cluster runs
k get pods -n kube-system | grep dns

# View the DNS service
k get service --all-namespaces | grep dns

# View CoreDNS's config file
k -n kube-system describe deployments.apps coredns | grep -A2 Args

# See how the Corefile is injected
k get po -n kube-system coredns-5d78c9869d-5sxxz  -o yaml
k describe cm -n kube-system coredns


# A service's DNS name follows: service.namespace.svc.cluster.local
# web-service.payroll.svc.cluster.local

# Check a DNS lookup result
kubectl exec -it hr -- nslookup mysql.payroll > /root/CKA/nslookup.out

Ingress

# View ingresses
kubectl get all -A | grep -i ingress

k get ingress -A

# Show ingress details
kubectl describe ingress <ingress-name>

# Edit an ingress to change its rules
kubectl edit ingress

# Create an ingress. Note the commands below lack annotations; add them to the generated YAML for the ingress to work properly
# pathType: Exact
k create ingress -n critical-space test-ingress --rule=/pay=pay-service:8282  --dry-run=client -o yaml
# pathType: Prefix
k create ingress -n critical-space test-ingress --rule=/pay*=pay-service:8282   --dry-run=client -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
  name: ingress-wear-watch
  namespace: app-space
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: wear-service
            port: 
              number: 8080
        path: /wear
        pathType: Prefix
      - backend:
          service:
            name: video-service
            port: 
              number: 8080
        path: /stream
        pathType: Prefix

Setting up an ingress controller for the cluster:

# Create a dedicated namespace for the ingress controller (easier isolation)
k create namespace ingress-nginx

# Create the configmap
kubectl create configmap ingress-nginx-controller --namespace ingress-nginx

# Create the service accounts
k create sa ingress-nginx -n ingress-nginx
k create sa ingress-nginx-admission -n ingress-nginx

# Create the role, rolebinding, clusterrole and clusterrolebinding
# (omitted)

# The ingress controller YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.1.2
    helm.sh/chart: ingress-nginx-4.0.18
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  replicas: 1
  minReadySeconds: 0
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: controller
      app.kubernetes.io/instance: ingress-nginx
      app.kubernetes.io/name: ingress-nginx
  template:
    metadata:
      labels:
        app.kubernetes.io/component: controller
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/name: ingress-nginx
    spec:
      containers:
      - args:
        - /nginx-ingress-controller
        - --publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
        - --election-id=ingress-controller-leader
        - --watch-ingress-without-class=true
        - --default-backend-service=app-space/default-http-backend
        - --controller-class=k8s.io/ingress-nginx
        - --ingress-class=nginx
        - --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
        - --validating-webhook=:8443
        - --validating-webhook-certificate=/usr/local/certificates/cert
        - --validating-webhook-key=/usr/local/certificates/key
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: LD_PRELOAD
          value: /usr/local/lib/libmimalloc.so
        image: registry.k8s.io/ingress-nginx/controller:v1.1.2@sha256:28b11ce69e57843de44e3db6413e98d09de0f6688e33d4bd384002a44f78405c
        imagePullPolicy: IfNotPresent
        lifecycle:
          preStop:
            exec:
              command:
              - /wait-shutdown
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: controller
        ports:
        - name: http
          containerPort: 80
          protocol: TCP
        - containerPort: 443
          name: https
          protocol: TCP
        - containerPort: 8443
          name: webhook
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 100m
            memory: 90Mi
        securityContext:
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
          runAsUser: 101
        volumeMounts:
        - mountPath: /usr/local/certificates/
          name: webhook-cert
          readOnly: true
      dnsPolicy: ClusterFirst
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: ingress-nginx
      terminationGracePeriodSeconds: 300
      volumes:
      - name: webhook-cert
        secret:
          secretName: ingress-nginx-admission

---

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.1.2
    helm.sh/chart: ingress-nginx-4.0.18
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
    nodePort: 30080
  selector:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
  type: NodePort
# Then create the ingress; ingress resources are namespaced, so the ingress must live in the same namespace as its services
k create ingress -n app-space --rule /wear*=wear-service:8080 --rule /watch*=video-service:8080 --annotation nginx.ingress.kubernetes.io/rewrite-target=/ --annotation nginx.ingress.kubernetes.io/ssl-redirect=false test-ingress --dry-run=client -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
  creationTimestamp: null
  name: test-ingress
  namespace: app-space
spec:
  rules:
  - http:
      paths:
      - backend:
          service:
            name: wear-service
            port:
              number: 8080
        path: /wear
        pathType: Prefix
      - backend:
          service:
            name: video-service
            port:
              number: 8080
        path: /watch
        pathType: Prefix
status:
  loadBalancer: {}

Deploying K8s

# Run the following installation steps on every node
# step1: adjust the kernel network settings
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

sudo sysctl --system

# step2: install the packages
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl

sudo mkdir -m 755 /etc/apt/keyrings

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.27/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.27/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

sudo apt-get update

# To see the new version labels
sudo apt-cache madison kubeadm

sudo apt-get install -y kubelet=1.27.0-2.1 kubeadm=1.27.0-2.1 kubectl=1.27.0-2.1

sudo apt-mark hold kubelet kubeadm kubectl

(Reference: Installing kubeadm | Kubernetes)

# step3: check the kubelet version
kubelet --version

# step4: bootstrap the kubernetes cluster with kubeadm
# step4-1: initialize the controlplane
# find eth0's IP address
ifconfig eth0
# suppose eth0's address from the previous step is 192.17.132.9
kubeadm init --apiserver-cert-extra-sans=controlplane --apiserver-advertise-address 192.17.132.9 --pod-network-cidr=10.244.0.0/16
# or run the following
IP_ADDR=$(ip addr show eth0 | grep -oP '(?<=inet\s)\d+(\.\d+){3}')
kubeadm init --apiserver-cert-extra-sans=controlplane --apiserver-advertise-address $IP_ADDR --pod-network-cidr=10.244.0.0/16

# afterwards, set up the kubeconfig
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# kubeadm init prints instructions like the following:
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.17.132.9:6443 --token b2134k.u3alyaj4iwkjhm6w \
        --discovery-token-ca-cert-hash sha256:23471ecdb6d52790d13880d333fa0a667cc249297ed26faa7b75171b84157c46 
# Step 4-2: join node01 to the cluster
# generate a join token with kubeadm
kubeadm token generate
# or
kubeadm token create --print-join-command
# this prints a command like the one below; ssh into node01 and run it
kubeadm join 192.17.132.9:6443 --token 2uh5xs.qoh26441ba0l4du4 --discovery-token-ca-cert-hash sha256:23471ecdb6d52790d13880d333fa0a667cc249297ed26faa7b75171b84157c46

# Step 5: install the Flannel network
# download kube-flannel.yml
curl -LO https://raw.githubusercontent.com/flannel-io/flannel/v0.20.2/Documentation/kube-flannel.yml
# edit kube-flannel.yml
# locate
  args:
  - --ip-masq
  - --kube-subnet-mgr
# and add the argument
  - --iface=eth0
  
kubectl apply -f kube-flannel.yml

# Once the network is up, the nodes turn Ready
controlplane ~ ➜  kubectl get nodes
NAME           STATUS   ROLES           AGE   VERSION
controlplane   Ready    control-plane   15m   v1.27.0
node01         Ready    <none>          15m   v1.27.0

Troubleshooting

Application Failure:

  • Wrong service name used in DNS lookups
  • The service's selector does not match the working pods
  • The service's targetPort is misconfigured
  • Wrong DB username/password
  • Wrong node port exposed by the service (basic checks sketched below)
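
A minimal first-pass check for these causes (resource names are placeholders):

k describe svc <service-name>    # check Selector, TargetPort and NodePort
k get endpoints <service-name>   # empty endpoints means the selector matches no pods
k get pods --show-labels         # compare pod labels against the selector
k exec -it <pod-name> -- nslookup <service-name>    # verify the DNS name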

Control Plane Failure:

  • kube-scheduler failure: fix its YAML under /etc/kubernetes/manifests/
  • Scaling does not respond: check for a kube-controller-manager failure (see the sketch below)
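
A minimal sketch for locating control-plane failures (the pod name suffix assumes a node called controlplane):

k get pods -n kube-system    # look for crashing control-plane pods
k logs -n kube-system kube-scheduler-controlplane
k logs -n kube-system kube-controller-manager-controlplane
ls /etc/kubernetes/manifests/    # compare the manifests against a known-good cluster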

Worker Node Failure:

  • The containerd or kubelet service on the worker node has stopped; start it manually with systemctl
  • A problem in the /var/lib/kubelet/config.yaml config file
  • A wrong kube-apiserver port configured in /etc/kubernetes/kubelet.conf (see the sketch below)
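
A minimal sketch of the worker-node checks:

ssh node01
systemctl status kubelet containerd
journalctl -u kubelet    # look for config or connection errors
systemctl restart kubelet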

Network Failure:

  • Missing network plugin (see: Creating a cluster with kubeadm | Kubernetes)

    curl -L https://github.com/weaveworks/weave/releases/download/latest_release/weave-daemonset-k8s-1.11.yaml | kubectl apply -f -

  • Check that every kube-proxy pod is running and that its associated configMap is correct; fix the daemonset if needed (see the sketch below)
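
A minimal sketch of those kube-proxy checks:

k get pods -n kube-system | grep kube-proxy
k describe cm -n kube-system kube-proxy    # verify the referenced config file path
k edit ds -n kube-system kube-proxy        # fix the daemonset if it points at a wrong path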

Lightning Lab

# Define custom output columns with JSONPath
kubectl get deployments.apps -n admin2406 -o custom-columns=DEPLOYMENT:.metadata.name,CONTAINER_IMAGE:.spec.template.spec.containers[*].image,READY_REPLICAS:.status.readyReplicas,NAMESPACE:.metadata.namespace --sort-by=.metadata.name
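
JSONPath also works directly with -o jsonpath. A small sketch:

kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.osImage}'
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}'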

References

Kubectl Command Cheat Sheet - Learn Essential Kubernetes Commands

Kubectl Cheatsheet | Free Cheatsheet

LF 认证考试攻略|认证考试流程全介绍–购买、注册及预约考试篇(建议收藏)-Linux Foundation开源软件学园

CKA (Certified Kubernetes Administrator)

Killer Shell - Exam Simulators

Linux Foundation Certification Exam: Candidate Handbook (using PSI BRIDGE Proctoring platform) - T&C DOCS (Candidate Facing Resources)

Exam entry: Certified Kubernetes Administrator China Exam (CKA-CN) - Exam | The Linux Foundation

CKA考试心得分享 - 木二 - 博客园

CKA考试经验总结 - 简书

Introduction to YAML - KodeKloud

互联网最全cka真题解析-2022.9.9 - 知乎

2023年CKA考试真题及注意事项 - jiayou111 - 博客园

Kubernetes CKA真题解析-20200402真题_check to see how many nodes are ready (not includi-CSDN博客