Tuesday, May 7, 2019

On Premise Kubernetes With CentOS 7 And vSphere

There are a few guides to getting Kubernetes running within a vSphere environment; however, I had trouble putting everything together, so I thought I'd write a guide here in case others run into the same problems.
CentOS 7 is popular, at least within the enterprise in North America, so that was chosen as the base OS. Calico will be used as the CNI (Container Network Interface) provider, although you could substitute another provider if you feel it necessary. VMware is the underlying infrastructure for this setup, and I wanted to be able to handle automatic provisioning of persistent volumes, so setting up vSphere as a cloud provider is the final step.

Node Installation

I've got a total of 3 VMs, but you can add worker nodes as required. When you're creating the VMs, make sure the name in vCenter matches the guest hostname exactly. In my case, I'll have k8s-master, k8s-n1, and k8s-n2. To start, do a minimal installation of CentOS 7 on each.

Node Setup

The official documentation says to enable disk UUIDs, and I'll leave this step in for reference, although you may not need it anymore. The way to check is to see whether the guest has a /dev/disk/by-uuid directory; if it does, you can skip this. If it doesn't, power the VM down and enable disk UUIDs. VMware documentation is available at https://kb.vmware.com/s/article/52815, but the steps are pretty easy (a CLI alternative is shown after the list). As mentioned, make sure the VM is powered off or you won't be able to add a custom VM option. For those of you not on at least vSphere 6.7, you'll need to use the flash version of the GUI.

  • vSphere > VM > Edit Settings > VM Options (a tab) > Advanced > Edit Configuration...
  • Name: disk.EnableUUID
  • Value: TRUE
  • Click Add and then OK. Repeat this for all nodes.
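If you have more than a handful of nodes, clicking through the UI for each one gets old. Assuming you have VMware's govc CLI installed and pointed at your vCenter (GOVC_URL, GOVC_USERNAME, GOVC_PASSWORD, and GOVC_INSECURE exported), a rough sketch of the same change from the command line looks like this; the VM names are mine, adjust to match yours:
# Sketch only: power off each VM, set disk.EnableUUID, power it back on
for vm in k8s-master k8s-n1 k8s-n2; do
  govc vm.power -off "$vm"
  govc vm.change -vm "$vm" -e disk.EnableUUID=TRUE
  govc vm.power -on "$vm"
done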

Disable SELinux and FirewallD
[root@k8s-master ~]# sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=permissive/g' /etc/sysconfig/selinux
[root@k8s-master ~]# systemctl disable firewalld
Comment out swap space from fstab if you have it
[root@k8s-master ~]# sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
Having bridged traffic traverse iptables rules should be the default, but it's called out in the Kubernetes network plugin requirements, so I've set it explicitly anyway:
[root@k8s-master ~]# bash -c 'cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF'
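These settings will be picked up by the reboot at the end of this section, but if you'd like them active immediately you can load them now (the br_netfilter module may need to be loaded first for the bridge keys to exist):
[root@k8s-master ~]# modprobe br_netfilter
[root@k8s-master ~]# sysctl --system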
Add Kubernetes Repository
[root@k8s-master ~]# bash -c 'cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF'
Install and enable Kubernetes and Docker
[root@k8s-master ~]# yum install kubeadm kubelet kubectl docker -y
[root@k8s-master ~]# systemctl enable docker
[root@k8s-master ~]# systemctl enable kubelet
Reboot; this will leave swap disabled, SELinux permissive, and firewalld off, and will start the docker and kubelet services. Repeat this process for all nodes.

Setup Kubernetes

Getting Kubernetes up and running should be fairly straightforward. Start by initializing the cluster on the master node. If you're running a Windows cluster you need to use the default 10.244.0.0/16 pod network. For me, that range has the potential to conflict with addresses within the organization, so I'll be using 192.168.0.0/16 as shown below:
[root@k8s-master ~]# kubeadm init --pod-network-cidr=192.168.0.0/16
This is going to return a lot of information. You should see a few key pieces of information:
  1. How to setup kubectl
    • This is the primary mechanism to talk to your cluster. You can do this on your laptop, or, just on the master node for now
      [root@k8s-master ~]# mkdir ~/.kube
      [root@k8s-master ~]# cp /etc/kubernetes/admin.conf ~/.kube/config
      [root@k8s-master ~]# chown $(id -u):$(id -g) ~/.kube/config
      
  2. How to join worker nodes
    • Copy and paste this somewhere safe for now. If you lose it you can regenerate a join token (the default token only lasts 24 hours anyway) with the following command:
      [root@k8s-master ~]# kubeadm token create --print-join-command 
      
You should join any worker nodes now with the command provided.
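The exact token and CA hash come from your own kubeadm init output; the command you run on each worker has this general shape (placeholders, not real values):
[root@k8s-n1 ~]# kubeadm join <master_ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>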

If you check the status of your nodes now, you'll notice that they're not ready. And if you look at the pods, coredns will be in a pending state. This is because we haven't installed Calico yet.
[root@k8s-master ~]# kubectl get nodes
NAME         STATUS     ROLES    AGE     VERSION
k8s-master   NotReady   master   6m47s   v1.14.1
k8s-n1       NotReady   <none>   78s     v1.14.1
k8s-n2       NotReady   <none>   41s     v1.14.1
[root@k8s-master ~]# kubectl get pods -n kube-system
NAME                                 READY   STATUS    RESTARTS   AGE
coredns-fb8b8dccf-2fd9b              0/1     Pending   0          64m
coredns-fb8b8dccf-ddn42              0/1     Pending   0          64m
etcd-k8s-master                      1/1     Running   0          63m
kube-apiserver-k8s-master            1/1     Running   0          63m
kube-controller-manager-k8s-master   1/1     Running   0          64m
kube-proxy-lbptw                     1/1     Running   0          64m
kube-proxy-mx4gl                     1/1     Running   0          59m
kube-proxy-mzbmw                     1/1     Running   0          59m
kube-proxy-wxctq                     1/1     Running   0          59m
kube-scheduler-k8s-master            1/1     Running   0          63m
Calico is installed as a pod under the kube-system namespace with a simple command:
[root@k8s-master ~]# kubectl apply -f https://docs.projectcalico.org/v3.7/manifests/calico.yaml
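It takes a minute or two for the Calico pods to start and for coredns to get scheduled; you can watch it happen by re-running the earlier checks (Ctrl-C to stop watching):
[root@k8s-master ~]# kubectl get pods -n kube-system -w
[root@k8s-master ~]# kubectl get nodes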
All nodes should report ready and coredns should be in a running state. Congratulations!

vSphere Integration

vSphere is one of the built-in cloud providers, with documentation, however confusing and limited, available at the vSphere Cloud Provider site. It should be noted that all in-tree cloud providers have been deprecated within Kubernetes, but the cloud controller replacement seems to be in either an alpha or beta state. You can find out more and track component progress at the sites listed under Links at the end of this post.

vSphere Configuration

On the master node, you'll need a configuration file which tells kubernetes where to find your vCenter server. To do this you'll need some information from your specific installation.
  • The vCenter IP address or DNS name and the name of the data centre hosting your ESX cluster
    • For my setup this is 10.9.178.236
    • My data centre is named Vancouver; this is listed above your cluster in the vSphere Client
  • The name of a datastore to host your kubernetes volumes (which will host VMDK files)
  • The network name you're using for the kubernetes VMs
    • You might be using a default network which is usually called "VM Network" but I've got a specific one that we name after the VLAN number
  • A resource pool
    • If you don't have a resource pool or would like to use the defaults, that's OK (that's what I've used here); otherwise you'll need the resource pool location.
If you have a more complicated setup, like multiple vCenter servers, you can find additional options under "Creating the vSphere cloud config file" in the documentation.
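If you're not sure what the exact inventory names are, and you happen to have the govc CLI handy (same GOVC_* environment variables as earlier), you can browse the inventory paths from the command line; this is just a sketch of where each value lives:
govc ls /                      # data centre names
govc ls /Vancouver/datastore   # datastores
govc ls /Vancouver/network     # networks / port groups
govc ls /Vancouver/host        # clusters (resource pools live underneath)
With those values in hand, here's my /etc/kubernetes/vsphere.conf: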
[root@k8s-master ~]# cat /etc/kubernetes/vsphere.conf
[Global]
port = "443"
insecure-flag = "1"
datacenters = "Vancouver"
secret-name = "vsphere-credentials"
secret-namespace = "kube-system"

[VirtualCenter "10.9.178.236"]

[Workspace]
server = "10.9.178.236"
datacenter = "Vancouver"
default-datastore="esx_labcluster2_ds02"
resourcepool-path="<cluster_name>/Resources"
folder = "kubernetes"

[Disk]
scsicontrollertype = pvscsi

[Network]
public-network = "VNET1805"
A special note about the folder entry: this has nothing to do with storage, but rather the VM folder your Kubernetes nodes are in within vCenter. If you don't have one, now is a good time to create one. From your vSphere client, select the VMs and Templates icon or tab, right click on your data centre, and select New Folder > New VM and Templates Folder... In the file above I used a folder named kubernetes. Create it and drag your master and all nodes into this folder.

If you're trying to provision a persistent volume and keep getting an error like the following, the VM folder is a likely problem
I0718 21:39:08.761621       1 vsphere.go:1311] Datastore validation succeeded
E0718 21:39:09.104965       1 datacenter.go:231] Failed to get the folder reference for kubernetes. err: folder 'kubernetes' not found
E0718 21:39:09.104997       1 vsphere.go:1332] Failed to set VM options required to create a vsphere volume. err: folder 'kubernetes' not found
If you have datastore clusters or a storage folder, be sure to check the documentation referenced above for what that format should look like.

Secrets File

In order to provide authentication from kubernetes to vSphere you'll need a secrets file. Technically you can do this in your vsphere.conf file but that has a couple of problems:
  1. Your passwords are stored in plain text
  2. If you have special characters in the username or password, like a domain account with a backslash in it, you'll have problems
Creating a secrets file is pretty easy. First, we'll encode, not encrypt, our username and password, and second, we'll store that in a secrets file within our kubernetes cluster.
[root@k8s-master ~]# echo -n 'my_username' | base64
[root@k8s-master ~]# echo -n 'my_super_password' | base64
[root@k8s-master ~]# bash -c 'cat <<EOF > ~/vsphere-credentials.yaml
apiVersion: v1
kind: Secret
metadata:
  name: vsphere-credentials
type: Opaque
data:
  10.9.178.236.username: <encoded_user>
  10.9.178.236.password: <encoded_password>
EOF'
[root@k8s-master ~]# kubectl apply -f ~/vsphere-credentials.yaml --namespace=kube-system
[root@k8s-master ~]# kubectl get secrets -n kube-system
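If you want to be sure the right thing got stored, pull the secret back and decode it by hand; base64 -d should return exactly the username and password you started with:
[root@k8s-master ~]# kubectl get secret vsphere-credentials -n kube-system -o yaml
[root@k8s-master ~]# echo '<encoded_user>' | base64 -d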

Enabling vSphere

Remember that diagram at the top of the page that you glanced over? It shows the key components deployed on the master node, which we now need to edit in order to tell them about vSphere. We'll start with the controller manager manifest; since highlighting doesn't survive here, the additions are the --cloud-provider and --cloud-config flags plus the vsphere-config volumeMount and hostPath volume near the bottom:
[root@k8s-master ~]# cat /etc/kubernetes/manifests/kube-controller-manager.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=192.168.0.0/16
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --use-service-account-credentials=true
    - --cloud-provider=vsphere
    - --cloud-config=/etc/kubernetes/vsphere.conf
    image: k8s.gcr.io/kube-controller-manager:v1.14.1
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10252
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-controller-manager
    resources:
      requests:
        cpu: 200m
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
    - mountPath: /etc/kubernetes/controller-manager.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /etc/kubernetes/vsphere.conf
      name: vsphere-config
      readOnly: true
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /etc/kubernetes/controller-manager.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /etc/kubernetes/vsphere.conf
      type: FileOrCreate
    name: vsphere-config
status: {}
Next we need to modify the API server manifest; again, the additions are the --cloud-provider and --cloud-config flags and the vsphere-config volumeMount and volume:
[root@k8s-master ~]# cat /etc/kubernetes/manifests/kube-apiserver.yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=10.9.176.25
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --insecure-port=0
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-cluster-ip-range=10.96.0.0/12
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    - --cloud-provider=vsphere
    - --cloud-config=/etc/kubernetes/vsphere.conf
    image: k8s.gcr.io/kube-apiserver:v1.14.1
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 10.9.176.25
        path: /healthz
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-apiserver
    resources:
      requests:
        cpu: 250m
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
    - mountPath: /etc/kubernetes/vsphere.conf
      name: vsphere-config
      readOnly: true
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /etc/kubernetes/vsphere.conf
      type: FileOrCreate
    name: vsphere-config
status: {}
On the worker nodes you'll need to tell the kubelet to use vsphere as the cloud provider. There's no need for a vsphere.conf file here, as that's handled by the controller. This is one of the poorest and most conflicted pieces of documentation, but the following should work.
[root@k8s-n1 ~]# cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml --cloud-provider=vsphere"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
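As the comments in that dropin point out, an alternative that avoids editing a file owned by the kubeadm package is to leave the dropin alone and put the flag in /etc/sysconfig/kubelet instead, which the dropin already sources into KUBELET_EXTRA_ARGS; either way the kubelet ends up running with --cloud-provider=vsphere:
[root@k8s-n1 ~]# cat /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS=--cloud-provider=vsphere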

Reload Configuration

To get all of this configuration loaded you can either reboot each node, or restart the kubelet service, which on the master will also reload the API server and controller manager pods (on the workers it's just the kubelet):
[root@k8s-n1 ~]# systemctl daemon-reload
[root@k8s-n1 ~]# systemctl restart kubelet
[root@k8s-master ~]# reboot
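Once everything is back up, a quick way to check whether the cloud provider registered each node is to print the providerID from the node objects; an empty value for a node points at the UUID problem described next:
[root@k8s-master ~]# kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.providerID}{"\n"}{end}'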

UUID Problem

One of the issues I had was getting Kubernetes to locate the UUID of my master node. This prevented vSphere from finding the node in vCenter, which prevented anything from working. The error looks something like this:
[root@k8s-master ~]# kubectl -n kube-system logs kube-controller-manager-k8s-master | grep -B 2 "failed to add node"
E0507 16:55:20.983794       1 datacenter.go:78] Unable to find VM by UUID. VM UUID: 
E0507 16:55:20.983842       1 vsphere.go:1408] failed to add node &Node{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{Name:k8s-master,GenerateName:,Namespace:,SelfLink:/api/v1/nodes/k8s-master,UID:f02a92f5-70e3-11e9-bd37-005056bc31df,ResourceVersion:3895,Generation:0,CreationTimestamp:2019-05-07 16:19:50 +0000 UTC,DeletionTimestamp:,DeletionGracePeriodSeconds:nil,Labels:map[string]string{beta.kubernetes.io/arch: amd64,beta.kubernetes.io/os: linux,kubernetes.io/arch: amd64,kubernetes.io/hostname: k8s-master,kubernetes.io/os: linux,node-role.kubernetes.io/master: ,},Annotations:map[string]string{kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock,node.alpha.kubernetes.io/ttl: 0,projectcalico.org/IPv4Address: 10.9.176.25/23,projectcalico.org/IPv4IPIPTunnelAddr: 192.168.235.192,volumes.kubernetes.io/controller-managed-attach-detach: true,},OwnerReferences:[],Finalizers:[],ClusterName:,Initializers:nil,ManagedFields:[],},Spec:NodeSpec{PodCIDR:192.168.0.0/24,DoNotUse_ExternalID:,ProviderID:,Unschedulable:false,Taints:[{node-role.kubernetes.io/master  NoSchedule }],ConfigSource:nil,},Status:NodeStatus{Capacity:ResourceList{cpu: {{2 0} {} 2 DecimalSI},ephemeral-storage: {{38216605696 0} {}  BinarySI},hugepages-1Gi: {{0 0} {} 0 DecimalSI},hugepages-2Mi: {{0 0} {} 0 DecimalSI},memory: {{8201801728 0} {} 8009572Ki BinarySI},pods: {{110 0} {} 110 DecimalSI},},Allocatable:ResourceList{cpu: {{2 0} {} 2 DecimalSI},ephemeral-storage: {{34394945070 0} {} 34394945070 DecimalSI},hugepages-1Gi: {{0 0} {} 0 DecimalSI},hugepages-2Mi: {{0 0} {} 0 DecimalSI},memory: {{8096944128 0} {} 7907172Ki BinarySI},pods: {{110 0} {} 110 DecimalSI},},Phase:,Conditions:[{NetworkUnavailable False 2019-05-07 16:55:09 +0000 UTC 2019-05-07 16:55:09 +0000 UTC CalicoIsUp Calico is running on this node} {MemoryPressure False 2019-05-07 16:54:57 +0000 UTC 2019-05-07 16:19:44 +0000 UTC KubeletHasSufficientMemory kubelet has sufficient memory available} {DiskPressure False 2019-05-07 16:54:57 +0000 UTC 2019-05-07 16:19:44 +0000 UTC KubeletHasNoDiskPressure kubelet has no disk pressure} {PIDPressure False 2019-05-07 16:54:57 +0000 UTC 2019-05-07 16:19:44 +0000 UTC KubeletHasSufficientPID kubelet has sufficient PID available} {Ready True 2019-05-07 16:54:57 +0000 UTC 2019-05-07 16:52:05 +0000 UTC KubeletReady kubelet is posting ready status}],Addresses:[{InternalIP 10.9.176.25} {Hostname k8s-master}],DaemonEndpoints:NodeDaemonEndpoints{KubeletEndpoint:DaemonEndpoint{Port:10250,},},NodeInfo:NodeSystemInfo{MachineID:d772fbc266f3456387d01f1914e2a33c,SystemUUID:7B283C42-6E10-4736-8616-A6D505FB3D73,BootID:cd4f0ad4-bb18-4ca9-b3e9-d0f64465514c,KernelVersion:3.10.0-957.el7.x86_64,OSImage:CentOS Linux 7 (Core),ContainerRuntimeVersion:docker://1.13.1,KubeletVersion:v1.14.1,KubeProxyVersion:v1.14.1,OperatingSystem:linux,Architecture:amd64,},Images:[{[k8s.gcr.io/etcd@sha256:17da501f5d2a675be46040422a27b7cc21b8a43895ac998b171db1c346f361f7 k8s.gcr.io/etcd:3.3.10] 258116302} {[k8s.gcr.io/kube-apiserver@sha256:bb3e3264bf74cc6929ec05b494d95b7aed9ee1e5c1a5c8e0693b0f89e2e7288e k8s.gcr.io/kube-apiserver:v1.14.1] 209878057} {[k8s.gcr.io/kube-controller-manager@sha256:5279e0030094c0ef2ba183bd9627e91e74987477218396bd97a5e070df241df5 k8s.gcr.io/kube-controller-manager:v1.14.1] 157903081} {[docker.io/calico/node@sha256:04806ef1d6a72f527d7ade9ed1f2fb78f7cb0a66f749f0f2d88a840e679e0d4a docker.io/calico/node:v3.7.1] 154794504} {[docker.io/calico/cni@sha256:cc72dab4d80a599913bda0479cf01c69d37a7a45a31334dbc1c9c7c6cead1893 
docker.io/calico/cni:v3.7.1] 135366007} {[k8s.gcr.io/kube-proxy@sha256:44af2833c6cbd9a7fc2e9d2f5244a39dfd2e31ad91bf9d4b7d810678db738ee9 k8s.gcr.io/kube-proxy:v1.14.1] 82108455} {[k8s.gcr.io/kube-scheduler@sha256:11af0ae34bc63cdc78b8bd3256dff1ba96bf2eee4849912047dee3e420b52f8f k8s.gcr.io/kube-scheduler:v1.14.1] 81581961} {[k8s.gcr.io/coredns@sha256:02382353821b12c21b062c59184e227e001079bb13ebd01f9d3270ba0fcbf1e4 k8s.gcr.io/coredns:1.3.1] 40303560} {[k8s.gcr.io/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea k8s.gcr.io/pause:3.1] 742472}],VolumesInUse:[],VolumesAttached:[],Config:nil,},}: No VM found
To fix that I had to modify the node information, as it was missing the providerID. To get the UUID of a node and update it in Kubernetes you can run the following:
[root@k8s-master ~]# cat /sys/class/dmi/id/product_serial | sed -e 's/^VMware-//' -e 's/-/ /' | awk '{ print toupper($1$2$3$4 "-" $5$6 "-" $7$8 "-" $9$10 "-" $11$12$13$14$15$16) }'
[root@k8s-master ~]# kubectl patch node <Node name> -p '{"spec":{"providerID":"vsphere://<vm uuid>"}}'
Make sure you do this right the first time: setting it is easy, but changing it later can be problematic. It's likely you'll need to do this for all nodes. You can find out more from a bug report on GitHub: https://github.com/kubernetes/kubernetes/issues/65933
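Since this usually needs to be done for every node, here's a rough loop that does the same thing in one pass; it assumes the Kubernetes node names match the hostnames and that you have passwordless ssh from the master, and it echoes each UUID so you can sanity-check it against vCenter before trusting it:
for n in k8s-master k8s-n1 k8s-n2; do
  uuid=$(ssh $n cat /sys/class/dmi/id/product_serial | sed -e 's/^VMware-//' -e 's/-/ /' | awk '{ print toupper($1$2$3$4 "-" $5$6 "-" $7$8 "-" $9$10 "-" $11$12$13$14$15$16) }')
  echo "$n -> $uuid"
  kubectl patch node $n -p "{\"spec\":{\"providerID\":\"vsphere://$uuid\"}}"
done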

Logging

In the event you have problems, the logs are poor and hard to decipher, but here are a few hints that worked for me. To see these logs you'll need to find the Docker container ID for your controller manager; ignore the /pause container.
[root@k8s-master ~]# docker container ls | grep controller-manager
4796191d16d0        efb3887b411d           "kube-controller-m..."   5 hours ago         Up 5 hours                              k8s_kube-controller-manager_kube-controller-manager-k8s-master_kube-system_2a375eee0a4d25cea1a1eb0509d792c7_1
5a4de92a7ff2        k8s.gcr.io/pause:3.1   "/pause"                 5 hours ago         Up 5 hours                              k8s_POD_kube-controller-manager-k8s-master_kube-system_2a375eee0a4d25cea1a1eb0509d792c7_1
And to follow the log output, similar to tail -f, you can run this:
[root@k8s-master ~]# docker logs -f 4796191d16d0
You're looking for lines like the following:
I0506 18:31:39.573627       1 vsphere.go:392] Initializing vc server 10.9.178.236
I0506 18:31:39.574195       1 vsphere.go:276] Setting up node informers for vSphere Cloud Provider
I0506 18:31:39.574238       1 vsphere.go:282] Node informers in vSphere cloud provider initialized
I0506 18:31:39.581044       1 vsphere.go:1406] Node added: &Node{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{<A lot of node information>}
I'd also recommend increasing the log level for the controller manager by adding the following to /etc/kubernetes/manifests/kube-controller-manager.yaml and restarting the machine or the container. Without it, success goes silently unnoticed, which is fine once you're up and running but makes things tough while you're still setting up.
  - --v=4
Success looks like this with one entry for each node:
I0507 17:17:29.380235       1 nodemanager.go:394] Invalid credentials. Cannot connect to server "10.9.178.236". Fetching credentials from secrets.
I0507 17:17:30.393408       1 connection.go:138] SessionManager.Login with username "itlab\\mengland"
I0507 17:17:30.563076       1 connection.go:209] New session ID for 'ITLAB\mengland' = 523502d1-7796-7de1-5d03-99733a3390c1
I0507 17:17:30.570269       1 nodemanager.go:166] Finding node k8s-master in vc=10.9.178.236 and datacenter=Vancouver
I0507 17:17:30.575057       1 nodemanager.go:194] Found node k8s-master as vm=VirtualMachine:vm-364036 in vc=10.9.178.236 and datacenter=Vancouver
I0507 17:17:30.575150       1 vsphere.go:1406] Node added: &Node{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{Name:k8s-n1,GenerateName:,Namespace:,SelfLink:/api/v1/nodes/k8s-n1,UID:8adb895b-70e4-11e9-bd37-005056bc31df,ResourceVersion:5719,Generation:0,CreationTimestamp:2019-05-07 16:24:09 +0000 UTC,DeletionTimestamp:,DeletionGracePeriodSeconds:nil,Labels:map[string]string{beta.kubernetes.io/arch: amd64,beta.kubernetes.io/os: linux,kubernetes.io/arch: amd64,kubernetes.io/hostname: k8s-n1,kubernetes.io/os: linux,},Annotations:map[string]string{kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock,node.alpha.kubernetes.io/ttl: 0,projectcalico.org/IPv4Address: 10.9.176.32/23,projectcalico.org/IPv4IPIPTunnelAddr: 192.168.215.64,volumes.kubernetes.io/controller-managed-attach-detach: true,},OwnerReferences:[],Finalizers:[],ClusterName:,Initializers:nil,ManagedFields:[],},Spec:NodeSpec{PodCIDR:192.168.1.0/24,DoNotUse_ExternalID:,ProviderID:vsphere://423CDDFE-1970-4958-56B7-415BEE001167,Unschedulable:false,Taints:[],ConfigSource:nil,},Status:NodeStatus{Capacity:ResourceList{cpu: {{4 0} {} 4 DecimalSI},ephemeral-storage: {{38216605696 0} {}  BinarySI},hugepages-1Gi: {{0 0} {} 0 DecimalSI},hugepages-2Mi: {{0 0} {} 0 DecimalSI},memory: {{16657203200 0} {}  BinarySI},pods: {{110 0} {} 110 DecimalSI},},Allocatable:ResourceList{cpu: {{4 0} {} 4 DecimalSI},ephemeral-storage: {{34394945070 0} {} 34394945070 DecimalSI},hugepages-1Gi: {{0 0} {} 0 DecimalSI},hugepages-2Mi: {{0 0} {} 0 DecimalSI},memory: {{16552345600 0} {}  BinarySI},pods: {{110 0} {} 110 DecimalSI},},Phase:,Conditions:[{NetworkUnavailable False 2019-05-07 16:25:16 +0000 UTC 2019-05-07 16:25:16 +0000 UTC CalicoIsUp Calico is running on this node} {MemoryPressure False 2019-05-07 17:16:30 +0000 UTC 2019-05-07 16:24:09 +0000 UTC KubeletHasSufficientMemory kubelet has sufficient memory available} {DiskPressure False 2019-05-07 17:16:30 +0000 UTC 2019-05-07 16:24:09 +0000 UTC KubeletHasNoDiskPressure kubelet has no disk pressure} {PIDPressure False 2019-05-07 17:16:30 +0000 UTC 2019-05-07 16:24:09 +0000 UTC KubeletHasSufficientPID kubelet has sufficient PID available} {Ready True 2019-05-07 17:16:30 +0000 UTC 2019-05-07 16:48:45 +0000 UTC KubeletReady kubelet is posting ready status}],Addresses:[{ExternalIP 10.9.176.32} {InternalIP 10.9.176.32} {Hostname k8s-n1}],DaemonEndpoints:NodeDaemonEndpoints{KubeletEndpoint:DaemonEndpoint{Port:10250,},},NodeInfo:NodeSystemInfo{MachineID:ce61f472a8d74d9ba195df280e104030,SystemUUID:FEDD3C42-7019-5849-56B7-415BEE001167,BootID:54c28221-7d74-41c0-ac97-2f5e95fa3bf8,KernelVersion:3.10.0-957.el7.x86_64,OSImage:CentOS Linux 7 (Core),ContainerRuntimeVersion:docker://1.13.1,KubeletVersion:v1.14.1,KubeProxyVersion:v1.14.1,OperatingSystem:linux,Architecture:amd64,},Images:[{[docker.io/calico/node@sha256:04806ef1d6a72f527d7ade9ed1f2fb78f7cb0a66f749f0f2d88a840e679e0d4a docker.io/calico/node:v3.7.1] 154794504} {[docker.io/calico/cni@sha256:cc72dab4d80a599913bda0479cf01c69d37a7a45a31334dbc1c9c7c6cead1893 docker.io/calico/cni:v3.7.1] 135366007} {[k8s.gcr.io/kube-proxy@sha256:44af2833c6cbd9a7fc2e9d2f5244a39dfd2e31ad91bf9d4b7d810678db738ee9 k8s.gcr.io/kube-proxy:v1.14.1] 82108455} {[k8s.gcr.io/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea k8s.gcr.io/pause:3.1] 742472}],VolumesInUse:[],VolumesAttached:[],Config:nil,},}

Add Storage Class

You thought we were done, didn't you? Nope, we need to add a storage class so the storage can be used by pods. Some of our defaults were defined in the vsphere.conf file, like the destination datastore, but you can also specify that here. You can call the class anything you want, and you can have more than one; for example, if you've got one datastore on SSD and another on SATA, you could publish two different classes. Here's a sample class and how to deploy it:
[root@k8s-master ~]# cat vmware-storage.yaml 
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsphere-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
parameters:
  diskformat: thin
  datastore: esx_labcluster2_ds02
provisioner: kubernetes.io/vsphere-volume
[root@k8s-master ~]# kubectl apply -f vmware-storage.yaml
If you're using block based storage (e.g. fibre channel or iSCSI) you probably want a thick provisioned disk format, as initial write performance can be terrible on thin provisioned block storage, but that's a discussion for another day.
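Since you can publish more than one class, here's what a second, slower class might look like; the datastore name below is made up for illustration, and zeroedthick is just one of the diskformat values the vsphere-volume provisioner accepts (thin, zeroedthick, eagerzeroedthick):
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsphere-sata
parameters:
  diskformat: zeroedthick
  datastore: esx_labcluster2_sata01
provisioner: kubernetes.io/vsphere-volume
Pods then select a class by setting storageClassName on their claim; anything that doesn't ask gets the default class flagged earlier.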

All of your VMDK files will be placed in a kubevols folder at the root of your datastore.

Using Storage

Last step! Actually making use of our storage and thereby testing things out. This is just a quick persistent volume claim, which should dynamically allocate a persistent volume through vSphere.
[root@k8s-master ~]# cat pvc_test.yaml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
[root@k8s-master ~]# kubectl create -f pvc_test.yaml
It might take a few seconds, but you should end up with a bound persistent volume:
[root@k8s-master ~]# kubectl get pvc
NAME   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test   Bound    pvc-9cc548db-a9d3-11e9-a5fe-005056bc8d93   5Gi        RWO            vsphere-ssd    59m
[root@k8s-master ~]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM          STORAGECLASS   REASON   AGE
pvc-9cc548db-a9d3-11e9-a5fe-005056bc8d93   5Gi        RWO            Delete           Bound    default/test   vsphere-ssd             27m
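To prove a pod can actually attach and mount one of these volumes, a throwaway pod like the following (the busybox image and the names are mine, purely for illustration) should end up Running with the VMDK attached to whichever node it lands on:
[root@k8s-master ~]# cat pod_test.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-test
spec:
  containers:
  - name: shell
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: test
[root@k8s-master ~]# kubectl create -f pod_test.yaml
Delete this pod before deleting the PVC in the cleanup below; a claim that's still in use won't go away cleanly.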

Cleaning Up

If you make mistakes or have problems, here are some helpful commands to clean up:
[root@k8s-master ~]# kubectl delete pvc test
To reset kubernetes back to scratch you can run these commands:
[root@k8s-master ~]# kubectl delete node --all
[root@k8s-master ~]# kubeadm reset
[root@k8s-master ~]# rm -rf /var/lib/etcd/*
[root@k8s-master ~]# rm -rf /etc/kubernetes
[root@k8s-master ~]# rm -rf ~/.kube
[root@k8s-master ~]# docker image rm `docker images -q`
If you'd like to cleanup a node:
[root@k8s-master ~]# kubectl delete node <node_name>
[root@k8s-n1 ~]# rm -rf /var/lib/kubelet
[root@k8s-n1 ~]# rm -rf /etc/kubernetes
Either reboot the node or kill the kubelet process if it's still running.

Links

Kevin Tijssen's Blog
- One of the better written guides I've seen. It really, really helped with getting Kubernetes running quickly and easily
- http://blog.kevintijssen.eu/how-to-install-kubernetes-on-centos-7/
vSphere Cloud Provider
- Not my favourite but it's what's available
- https://vmware.github.io/vsphere-storage-for-kubernetes/documentation/overview.html
Kubernetes Cloud Controller (the future of cloud providers)
- https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/
vSphere Container Storage Interface
- https://github.com/kubernetes-sigs/vsphere-csi-driver
Cloud Controller Manager for vSphere (future of kubernetes and vSphere)
- https://github.com/kubernetes/cloud-provider-vsphere
