Friday, May 24, 2019

Kubernetes Logging With Fluent Bit

Centralized logging is a requirement in any Kubernetes installation. With the defaults in place, logs are kept on each node and don't persist as pods come and go. That's a big problem, because you really need your logs when things go wrong, not when they're working well, so I set out to establish a central log. All of the examples I could find referenced tailing logs from /var/log/containers/*.log, which is great, but not what's in use on a systemd system. Since pretty much every Linux distribution uses systemd these days, this is my attempt to provide a logging configuration that supports my base install.

There are several ways to collect logs, which can seem confusing; to me, most of these methods are way too complicated and error prone. Kubernetes.io has this to say:
"Because the logging agent must run on every node, it’s common to implement it as either a DaemonSet replica, a manifest pod, or a dedicated native process on the node. However the latter two approaches are deprecated and highly discouraged."
So, using a DaemonSet is the way to go. With what? Well, Fluent Bit is the easiest way I've found, once you have a working configuration file for your setup. Originally I wanted to use Graylog as my collection and presentation layer, but I found that the fluent pieces just aren't mature enough to deal with GELF correctly, so I eventually settled on an ELK stack; much better.

There are some excellent tutorials on the web covering how to install Elasticsearch, Logstash, and Kibana. If you're using the repositories it doesn't get much easier, so please follow one of those; https://computingforgeeks.com/how-to-install-elk-stack-on-centos-fedora/ is a good example.
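For reference, here's a rough sketch of the repository-based route on CentOS 7. The repo file and the 7.x version are assumptions based on Elastic's published yum repositories, so adjust them to whatever release the tutorial you follow uses:
$ sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
$ cat /etc/yum.repos.d/elasticsearch.repo
[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
$ sudo yum install -y elasticsearch logstash kibana
$ sudo systemctl enable elasticsearch kibana
$ sudo systemctl start elasticsearch kibana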

Namespace And Role Setup

I like namespaces, I like roles, so the first thing to do is set those up. Because we're going to run a DaemonSet that watches pods and node activity across every namespace, we need to set up a ClusterRole and a corresponding ClusterRoleBinding like this:
$ cat fluent-bit-role.yaml 
apiVersion: v1
kind: Namespace
metadata:
  name: logging
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: logging
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-read
rules:
- apiGroups: [""]
  resources:
  - namespaces
  - pods
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-read
subjects:
- kind: ServiceAccount
  name: fluent-bit
  namespace: logging
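
Once this is applied (see Deploying below), a quick sanity check is to ask the API server what the new service account may do. This uses kubectl's auth can-i subcommand and relies on your own user being allowed to impersonate service accounts; given the rules above you'd expect a yes and a no:
$ kubectl auth can-i list pods --all-namespaces --as=system:serviceaccount:logging:fluent-bit
yes
$ kubectl auth can-i delete pods --as=system:serviceaccount:logging:fluent-bit
no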

Fluent Bit Configuration

A ConfigMap seems to be the most popular way to manage Fluent Bit's configuration. This one is by no means exhaustive but should provide a pretty good template. A couple of general notes:
  • The Name of an input or filter determines its type. It looks like just a string, but it isn't; this field is what actually selects the plugin
  • Documentation is pretty good once you figure out which plugin you're dealing with, for example the Systemd input documentation
  • There's no magical way to get the logs: the DaemonSet needs access to the node's logs, including the systemd journal. On CentOS 7 this is kept under /run/log/journal/<UUID>/*.journal. We'll come back to this when building the DaemonSet itself
I wanted to "enrich" my node logs with Kubernetes information; without it you'll be missing key details like the namespace, container name, and other app-specific labels. To do this you need a kubernetes filter (with the Name kubernetes, and remember this is a type, not just a name) whose Match entry aligns with the tag of the input you'd like to pair it with. In this example there's only one input and one filter, so hopefully the pairing is obvious. The Kube_URL entry is whatever URL reaches the Kubernetes management API; it's queried to fill in the missing pieces, and that data is merged into the log entry. The service should be visible with 'kubectl get svc', as shown below.
$ cat fluent-bit-configmap.yaml 
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush              1
        Log_Level          info
        Daemon             off
        HTTP_Server        On
        HTTP_Listen        0.0.0.0
        HTTP_Port          2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-elasticsearch.conf

  input-kubernetes.conf: |
    [INPUT]
        Name                systemd
        Tag                 kube_systemd.*
        Path                /run/log/journal
        DB                  /var/log/flb_kube_systemd.db
        Systemd_Filter      _SYSTEMD_UNIT=docker.service
        Read_From_Tail      On
        Strip_Underscores   On

  filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube_systemd.*
        Kube_URL            https://kubernetes.default.svc:443
        Annotations         On
        Labels              On
        Merge_Log           On
        K8S-Logging.Parser  On
        Use_Journal         On
    
  output-elasticsearch.conf: |
    [OUTPUT]
        Name                es
        Match               *
        Host                elasticsearch.prod.int.com
        Port                9200
        Index               k8s-lab

Fluent Bit DaemonSet

At last we'll configure Fluent Bit as a DaemonSet. Some general notes here:
  • I found it's possible to debug quickly by running a -debug variant of the image, but anything newer than 1.0.4-debug lacked the ability to run /bin/sh, replacing it instead with busybox, which seemed more complicated for my needs (see the exec example after this list)
    • To run a debug version you'd use an image like this: image: fluent/fluent-bit:1.0.4-debug
  • As mentioned in the ConfigMap section, you need to mount the node's systemd location; in my case, /run/log
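With a debug image in place you can shell into one of the running pods; the pod name below is a placeholder, grab a real one from 'kubectl get pods -n logging':
$ kubectl exec -it fluent-bit-xxxxx -n logging -- /bin/sh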
$ cat fluent-bit-daemon-set.yaml 
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
  labels:
    k8s-app: fluent-bit-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    matchLabels:
      name: fluent-bit
  template:
    metadata:
      labels:
        name: fluent-bit
        k8s-app: fluent-bit-logging
        version: v1
        kubernetes.io/cluster-service: "true"
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "2020"
        prometheus.io/path: /api/v1/metrics/prometheus
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.1.1
        imagePullPolicy: Always
        ports:
          - containerPort: 2020
        volumeMounts:
        - name: systemdlog
          mountPath: /run/log
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
      terminationGracePeriodSeconds: 10
      volumes:
      - name: systemdlog
        hostPath:
          path: /run/log
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
      serviceAccountName: fluent-bit
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule

Deploying

Now comes the easy part, deploying all the yaml files:
$ kubectl create -f fluent-bit-role.yaml
$ kubectl create -f fluent-bit-configmap.yaml
$ kubectl create -f fluent-bit-daemon-set.yaml
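Before calling it done, it's worth checking that a Fluent Bit pod landed on every node and that the HTTP server enabled in the [SERVICE] section is answering (again, the pod name is a placeholder):
$ kubectl get pods -n logging -o wide
$ kubectl logs -n logging -l name=fluent-bit
$ kubectl port-forward -n logging fluent-bit-xxxxx 2020:2020 &
$ curl -s http://127.0.0.1:2020/api/v1/metrics/prometheus | head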
And that's it. You should start to see logs coming into your Elasticsearch instance and, consequently, Kibana, where you can start to visualize them and create dashboards.
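If nothing shows up in Kibana, querying Elasticsearch directly helps narrow down which side is at fault. The host and index below are the ones from the output section of the ConfigMap above:
$ curl -s 'http://elasticsearch.prod.int.com:9200/_cat/indices?v' | grep k8s-lab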

Clean Up

If you need to remove everything for testing or other purposes, namespaces make this really easy; the only pieces living outside the namespace are the cluster role and its binding. Everything can be removed like this:
$ kubectl delete namespace logging
$ kubectl delete clusterrolebinding fluent-bit-read
$ kubectl delete clusterrole fluent-bit-read
