Logging

It's important to collect and analyze logs from both the nodes and the applications.

Prerequisites

  • You have a log server already set up, for example Graylog.
  • You know how to set up inputs on your log server of choice.

Node logs

On each node, open the rsyslog config file -

sudo nano /etc/rsyslog.conf

Then add to the bottom

# Log to Graylog
*.* @@logserver:1516

Restart rsyslog -

sudo systemctl restart rsyslog

You should now see your logs start to arrive in your logging solution.
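
For reference, rsyslog's forwarding action uses a single @ for UDP and a double @@ for TCP, so the line above sends every facility and priority (*.*) to logserver on port 1516 over TCP. The sketch below builds that rule for either protocol. Note that forward_rule is a hypothetical helper name, not part of rsyslog, and logserver and 1516 are simply the values used on this page.

```shell
# Hypothetical helper: print an rsyslog forwarding rule.
# $1 = protocol (tcp or udp), $2 = log server host, $3 = port
forward_rule() {
  local prefix
  case "$1" in
    tcp) prefix='@@' ;;   # double @ forwards over TCP
    udp) prefix='@' ;;    # single @ forwards over UDP
    *) echo "unknown protocol: $1" >&2; return 1 ;;
  esac
  # *.* selects all facilities and all priorities
  printf '*.* %s%s:%s\n' "$prefix" "$2" "$3"
}

# Usage (appends the same rule shown above):
#   forward_rule tcp logserver 1516 | sudo tee -a /etc/rsyslog.conf
```

If you go this route, running sudo rsyslogd -N1 before the restart should flag any syntax problems in the config.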

Kubernetes Logs

Next, we need to get the logs from Kubernetes itself.
For this, we will be using Fluent Bit, a lightweight log processor and forwarder from the same project family as Fluentd.
As a side note, it also has a very helpful community in its Slack channel.

We will be creating three YAML files:

  1. ClusterRole, ClusterRoleBinding & ServiceAccount
  2. ConfigMap
  3. DaemonSet

fluent-bit_roles.yaml

No edits are needed on this one, but note the aforementioned --- separator, which combines several resources into one file.
We could technically combine all of these YAMLs into one file; however, for the purpose of this guide we keep them separate so it's easier to break down the function of each.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-read
subjects:
- kind: ServiceAccount
  name: fluent-bit
  namespace: logging-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-read
rules:
- apiGroups: [""]
  resources:
  - namespaces
  - pods
  verbs: ["get", "list", "watch"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: logging-system
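
The manifests on this page all target the logging-system namespace, and none of them creates it. If it does not already exist in your cluster, something like the sketch below could create it before you apply the files. Here ensure_namespace is a hypothetical helper name, and the sketch assumes kubectl is already configured for the cluster.

```shell
# Hypothetical helper: create a namespace only if it is missing.
# $1 = namespace name
ensure_namespace() {
  local ns="$1"
  if kubectl get namespace "$ns" >/dev/null 2>&1; then
    echo "namespace $ns already exists"
  else
    kubectl create namespace "$ns"
  fi
}

# Usage:
#   ensure_namespace logging-system
```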

fluent-bit_configmap.yaml

This is what gives Fluent Bit its configuration.
In this file, you will need to edit the Host and Port values under the output-graylog.conf section, setting them to the IP of your log server and the port set up for its GELF input.

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging-system
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush 1
        Log_Level info
        Daemon off
        Parsers_File parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port 2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-graylog.conf

  input-kubernetes.conf: |
    [INPUT]
        Name tail
        Tag kube.*
        Path /var/log/containers/*.log
        Parser docker
        DB /var/log/flb_graylog.db
        DB.Sync Normal
        Docker_Mode On
        Buffer_Chunk_Size 512KB
        Buffer_Max_Size 5M
        Rotate_Wait 30
        Mem_Buf_Limit 30MB
        Skip_Long_Lines On
        Refresh_Interval 10

  filter-kubernetes.conf: |
    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Merge_Log_Key log
        Keep_Log Off
        K8S-Logging.Parser On
        K8S-Logging.Exclude Off
        Annotations Off
        Labels On

  output-graylog.conf: |
    [OUTPUT]
        Name gelf
        Match *
        Host IP_OF_YOUR_SERVER
        Port 12204
        Mode tcp
        Gelf_Short_Message_Key log

  parsers.conf: |
    [PARSER]
        Name apache
        Format regex
        Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name apache2
        Format regex
        Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name apache_error
        Format regex
        Regex ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name docker
        Format json
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep On

    [PARSER]
        Name syslog
        Format regex
        Regex ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key time
        Time_Format %b %d %H:%M:%S

Make sure you have replaced IP_OF_YOUR_SERVER with the IP of your log server.
Right now, the output doesn't appear to work reliably with DNS names.
I am not sure if this is a syntax issue or a limitation/bug of the Graylog (GELF) output. Once I get some time to investigate, I will update this article accordingly.
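
Before deploying, it can help to confirm that the Host and Port you entered actually reach a GELF TCP input. The sketch below hand-builds a minimal GELF 1.1 message (version, host and short_message are the required fields) and sends it with a trailing null byte, which is how GELF frames messages over TCP. The names gelf_payload and send_gelf_tcp are hypothetical helpers, 192.0.2.10 is a placeholder address, and the sketch assumes nc (netcat) is installed on the machine you test from.

```shell
# Hypothetical helper: print a minimal GELF 1.1 JSON message.
# $1 = source host name, $2 = short message
# Note: no JSON escaping is done, so keep the message free of quotes.
gelf_payload() {
  printf '{"version":"1.1","host":"%s","short_message":"%s","level":6}' "$1" "$2"
}

# Hypothetical helper: send the payload read from stdin over TCP.
# $1 = log server IP, $2 = GELF TCP port
send_gelf_tcp() {
  # GELF over TCP separates messages with a null byte
  { cat; printf '\0'; } | nc -w 2 "$1" "$2"
}

# Usage (replace the IP with your log server; 12204 is the port used above):
#   gelf_payload "$(hostname)" "fluent-bit connectivity test" | send_gelf_tcp 192.0.2.10 12204
```

If the test message shows up in Graylog, the IP and port in output-graylog.conf should be usable as they are.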

fluent-bit_daemonset.yaml

This is the "application" itself, run as a DaemonSet. No changes are needed here.

Note

The image version here is deliberately set to 1.9.
The config below does not work on versions later than 1.9.x; it will be updated at a later date to allow 2.0+ to be used.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging-system
  labels:
    k8s-app: fluent-bit-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    matchLabels:
      k8s-app: fluent-bit-logging
  template:
    metadata:
      labels:
        k8s-app: fluent-bit-logging
        version: v1
        kubernetes.io/cluster-service: "true"
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "2020"
        prometheus.io/path: /api/v1/metrics/prometheus
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.9
        imagePullPolicy: Always
        ports:
        - containerPort: 2020
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
      terminationGracePeriodSeconds: 10
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
      serviceAccountName: fluent-bit
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      - operator: "Exists"
        effect: "NoExecute"
      - operator: "Exists"
        effect: "NoSchedule"

Apply all three YAML files -

kubectl apply -f fluent-bit_roles.yaml -f fluent-bit_configmap.yaml -f fluent-bit_daemonset.yaml  

After a minute or two, you should see the logs start to stream into Graylog.
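
If nothing arrives, a quick look at the DaemonSet usually narrows things down. The sketch below waits for the rollout, lists the pods, and tails their logs, using the namespace and the k8s-app=fluent-bit-logging label from the manifests above. The name check_fluent_bit is a hypothetical helper, and the 120s timeout is an arbitrary choice.

```shell
# Hypothetical helper: basic health check for the Fluent Bit DaemonSet.
# $1 = namespace (defaults to logging-system)
check_fluent_bit() {
  local ns="${1:-logging-system}"
  # Wait until a pod is ready on every scheduled node
  kubectl -n "$ns" rollout status daemonset/fluent-bit --timeout=120s &&
  # Show which node each pod landed on
  kubectl -n "$ns" get pods -l k8s-app=fluent-bit-logging -o wide &&
  # Tail recent Fluent Bit output; connection errors to the log server show up here
  kubectl -n "$ns" logs -l k8s-app=fluent-bit-logging --tail=20
}

# Usage:
#   check_fluent_bit
```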

Application Logs

Coming soon!

Next Step

Continue to the next step, Dashboard - Traefik, or go back to the index page.