[CKA] Troubleshooting

테런 2023. 5. 13. 14:31
  • Share of the CKA exam: 30%
  • Checking the application logs of a specific container in a given Pod
$ kubectl logs {Pod Name}
$ kubectl logs {Pod Name} -c {Container Name}

 

  • Exercise
Monitor the logs of {Pod name}, extract the log lines that contain the error 'file not found', and save them to the file custom-log.
$ kubectl get pod {Pod name}
$ kubectl logs {Pod name} | grep 'file not found' > custom-log
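The extraction step above can be wrapped in a small helper. This is a minimal sketch, not part of the original notes: `extract_matching` is a hypothetical function name. It uses `grep -F` so the search text is matched literally rather than as a regular expression, and it treats "no matching lines" as success.

```shell
# extract_matching PATTERN OUTFILE
# Copies every stdin line containing PATTERN (a literal string) to OUTFILE.
extract_matching() {
    # grep exits 1 when nothing matches; only exit codes above 1 are real errors.
    grep -F -- "$1" > "$2" || [ $? -eq 1 ]
}

# Usage against a live cluster (POD_NAME is a placeholder):
#   kubectl logs "$POD_NAME" | extract_matching 'file not found' custom-log
```

Because the helper only reads stdin, it can be tried on a saved log file before it is pointed at a real Pod.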

 

  • Monitoring Pod resource usage in the cluster
$ kubectl top pods
$ kubectl top pods --sort-by=cpu
$ kubectl top pods --sort-by=memory
$ kubectl top pods {Pod Name}
$ kubectl top pods --sort-by=memory > filename

 

  • What if a metrics error occurs?
$ kubectl top pods 
error: Metrics API not available
$ kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
$ kubectl top pods
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
$ kubectl edit deployments.apps -n kube-system metrics-server
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    - --kubelet-insecure-tls # added
    - --kubelet-preferred-address-types=InternalIP # added
    image: registry.k8s.io/metrics-server/metrics-server:v0.6.3
    imagePullPolicy: IfNotPresent
    livenessProbe:

# After waiting a short while (1-2 minutes)
$ kubectl top pods -n kube-system                            
NAME                               CPU(cores)   MEMORY(bytes)   
coredns-787d4945fb-wczw4           4m           50Mi            
etcd-minikube                      51m          71Mi            
kube-apiserver-minikube            87m          321Mi           
kube-controller-manager-minikube   47m          117Mi           
kube-proxy-ps6wh                   1m           52Mi            
kube-scheduler-minikube            5m           57Mi            
metrics-server-7457f65fb5-27jz4    7m           15Mi            
storage-provisioner                7m           10Mi
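The same fix can be applied without opening an editor. The sketch below is an assumption-laden alternative to `kubectl edit`, not the method from the original notes: it assumes the Deployment is named `metrics-server` in `kube-system` and that its container sits at index 0, as in the manifest above. The function names are hypothetical. Skipping kubelet TLS verification is only appropriate on lab clusters such as minikube.

```shell
# Append --kubelet-insecure-tls to the first container's args with a JSON patch.
# Assumption: container index 0 is the metrics-server container.
patch_metrics_server() {
    kubectl patch deployment metrics-server -n kube-system --type=json \
        -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
}

# Block until the patched Deployment has rolled out, then "kubectl top" can be retried.
wait_metrics_server() {
    kubectl rollout status deployment/metrics-server -n kube-system --timeout=120s
}

# Usage:
#   patch_metrics_server && wait_metrics_server && kubectl top pods -n kube-system
```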

 

  • Monitoring Node resource usage in the cluster
$ kubectl top nodes
$ kubectl top nodes --sort-by=memory

 

  • Sorting resources by a field of their JSON representation
$ kubectl get pods {pod name} -o json
$ kubectl get pods -A --sort-by=.metadata.name
$ kubectl get pv --sort-by=.spec.capacity.storage

 

  • Exercise
Sort all PVs configured in the cluster by capacity and save the result to the file my-pv-list.
Use only kubectl to sort the PV output; do not apply any other Linux commands.
$ kubectl get pv
$ kubectl get pv -o json
$ kubectl get pv --sort-by=.spec.capacity.storage > my-pv-list
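To double-check the ordering, it can help to print only the sorted field next to each name. This is a sketch for verification only and an addition to the notes: `list_pv_by_capacity` is a hypothetical name, and the exercise itself presumably expects the default output format shown above rather than custom columns.

```shell
# Print PV names with their capacity, sorted by capacity, using kubectl alone.
list_pv_by_capacity() {
    kubectl get pv --sort-by=.spec.capacity.storage \
        -o custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage
}

# Usage:
#   list_pv_by_capacity
```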

 

  • Exercise
Among the Pods that use the label 'name=overloaded-cpu', find the name of the Pod with the highest CPU consumption and write it to custom-app-log.
$ kubectl get pods --show-labels
$ kubectl get pods -l name=overloaded-cpu
$ kubectl top pods -l name=overloaded-cpu --sort-by=cpu
$ echo {Pod Name} > custom-app-log
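Copying the Pod name by hand can be avoided by parsing the `kubectl top` output. The helper below is a minimal sketch under stated assumptions: `highest_cpu_pod` is a hypothetical name, the input is header-less `kubectl top pods` output (name in column 1, CPU in column 2), and CPU values are either millicores such as `87m` or whole cores.

```shell
# Read "NAME CPU MEMORY" lines on stdin and print the name with the highest CPU.
highest_cpu_pod() {
    awk '
        {
            cpu = $2
            if (cpu ~ /m$/) { sub(/m$/, "", cpu); cpu = cpu + 0 }  # millicores
            else            { cpu = cpu * 1000 }                   # whole cores
            if (NR == 1 || cpu > max) { max = cpu; name = $1 }
        }
        END { if (name != "") print name }
    '
}

# Usage against a live cluster:
#   kubectl top pods -l name=overloaded-cpu --no-headers | highest_cpu_pod > custom-app-log
```

The helper compares the values itself, so it does not depend on the order in which the rows arrive.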

 

  • Cluster troubleshooting exercise (scheduler, metrics, etcd, CoreDNS, API server, etc.)
There is a problem with the replica count. Investigate it. (Replica counts are reconciled by kube-controller-manager, so check its static Pod manifest.)
$ minikube ssh -n minikube
docker@minikube:/etc/kubernetes/manifests$ sudo -i
root@minikube:~$ cd /etc/kubernetes/manifests/
root@minikube:/etc/kubernetes/manifests$ ls
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
root@minikube:/etc/kubernetes/manifests$ vi kube-controller-manager.yaml
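Before editing a manifest, it is useful to see which control-plane Pods are actually unhealthy. The filter below is a sketch and an addition to the notes: `unhealthy_pods` is a hypothetical name, and it assumes header-less `kubectl get pods` output where the third column is STATUS.

```shell
# Read "NAME READY STATUS ..." lines on stdin and print the Pods
# whose STATUS is neither Running nor Completed.
unhealthy_pods() {
    awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage against a live cluster:
#   kubectl get pods -n kube-system --no-headers | unhealthy_pods
```

A Pod such as kube-controller-manager showing up here points to the manifest that needs to be inspected.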

 

  • Fixing Worker Node problems (a frequent exam question): 4 things to check
# 1. Container engine (Docker)
$ systemctl status docker
$ systemctl enable --now docker

# 2. kubelet
$ systemctl status kubelet
$ systemctl enable --now kubelet

# 3. kube-proxy (check the proxy Pods)
$ kubectl get pod -n kube-system | grep kube

# 4. CNI (check the controller Pods)
$ kubectl get pod -n kube-system | grep kube
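The first two checks can be condensed into one line per service. This is a minimal sketch, not from the original notes: `check_unit` is a hypothetical helper built on `systemctl is-active` and `systemctl is-enabled`, and it is meant to be run on the worker node itself.

```shell
# check_unit UNIT
# Prints the active/enabled state of a systemd unit and returns 0
# only when it is both running now and enabled at boot.
check_unit() {
    cu_unit=$1
    # Both subcommands exit non-zero for inactive/disabled units, hence "|| true".
    cu_active=$(systemctl is-active "$cu_unit" 2>/dev/null) || true
    cu_enabled=$(systemctl is-enabled "$cu_unit" 2>/dev/null) || true
    echo "$cu_unit: active=${cu_active:-unknown} enabled=${cu_enabled:-unknown}"
    [ "$cu_active" = "active" ] && [ "$cu_enabled" = "enabled" ]
}

# Usage on the node:
#   check_unit docker || systemctl enable --now docker
#   check_unit kubelet || systemctl enable --now kubelet
```

Checking the enabled state matters as well: a unit that is running but disabled stops being available after the next reboot.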

 

  • Exercise
{Node name} is currently in the NotReady state. Investigate the cause, bring the node back to Ready, and make sure the fix persists across reboots.
$ minikube ssh -n minikube
Last login: Sat May 13 04:45:48 2023 from 192.168.49.1
docker@minikube:~$ sudo -i
root@minikube:~$ docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED          STATUS          PORTS     NAMES
root@minikube:~$ systemctl status docker
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2023-05-13 04:07:36 UTC; 47min ago
     
root@minikube:~$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; disabled; vendor preset: disabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: inactive (dead)
     
root@minikube:~$ systemctl enable --now kubelet
root@minikube:~$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; disabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Sat 2023-05-13 04:07:49 UTC; 49min ago
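Once kubelet is running again, the node normally needs a short time to report Ready. The sketch below is an addition to the notes: it wraps `kubectl wait` to confirm the transition from a machine with cluster access. `wait_node_ready` and the 120s default timeout are hypothetical choices.

```shell
# wait_node_ready NODE [TIMEOUT]
# Blocks until the node reports the Ready condition or the timeout expires.
wait_node_ready() {
    wn_node=$1
    wn_timeout=${2:-120s}
    kubectl wait --for=condition=Ready "node/$wn_node" --timeout="$wn_timeout"
}

# Usage (NODE_NAME is a placeholder):
#   wait_node_ready "$NODE_NAME" && kubectl get nodes
```

The `enable` part of `systemctl enable --now kubelet` is what makes the fix permanent, while `--now` starts the service immediately.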

 

Source: TTABAE-LEARN, instructor 이성미