Container Logs in Kubernetes
Note: This article only covers kubelet + CRI-based container runtimes. Pre-CRI implementations (e.g., Docker) are slightly different.
Logging & Kubernetes
Logs can provide helpful insights into an application. Especially when troubleshooting bugs, logs can help us understand the "why".
In Kubernetes, logs from containers are handled by the container runtime (e.g. containerd, cri-o) & the kubelet, as long as containers write their logs to stdout or stderr.
By default, logs can be fetched using kubectl -n <namespace> logs <pod> -c <container>. However, this is not always an option, nor is it especially convenient. There are several situations where Kubernetes moves pods across nodes, which causes the kubelet on the old node to delete the logs. To avoid losing logs and to make searching them easier, they can be shipped to a central location.
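For illustration, fetching the logs of a hypothetical counter pod in the default namespace looks like this (-f follows the log, similar to tail -f):

# One-shot fetch of the container's current log
kubectl -n default logs counter -c counter
# Stream new log lines as they are written
kubectl -n default logs counter -c counter -f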
Kubelet & container-runtime
The Kubelet communicates with the container runtime via the Container Runtime Interface. In short: It’s an API between the container runtime & Kubernetes to manage the lifecycle of a pod and its containers. Whenever you create a new pod in Kubernetes, and it gets scheduled, the kubelet will call the container runtime to start the container via the CRI.
When starting a new pod on a node, the kubelet creates a directory and a log file for every container of the pod (/var/log/pods/<namespace>_<pod>_<pod_uid>/<container>/<restart-count>.log). As the log file name includes the restart count, every time a container restarts, it gets a new log file, i.e.:
- 0 restart(s) -> /var/log/pods/<namespace>_<pod>_<pod_uid>/<container>/0.log
- 1 restart(s) -> /var/log/pods/<namespace>_<pod>_<pod_uid>/<container>/1.log
Only the current & previous log files are kept. For a container that had 2 restarts, only 1.log (previous) & 2.log (current) will exist.
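By default, kubectl logs reads the current log file; the --previous flag returns the logs of the previous container instance. Sticking with the hypothetical counter pod:

# Current instance (e.g. 2.log)
kubectl -n default logs counter -c counter
# Previous instance (e.g. 1.log)
kubectl -n default logs counter -c counter --previous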
The kubelet passes the log path to the container runtime in 2 API calls:
RunPodSandbox
- The absolute path to the directory for the pod's logs (/var/log/pods/<namespace>_<pod>_<pod_uid>/) gets passed via RunPodSandboxRequest.config.log_directory
CreateContainer
- The relative path (<container>/<restart-count>.log) for the container gets passed via CreateContainerRequest.config.log_path
The container runtime will then pipe all stdout & stderr logs of the container to the specified log file.
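To make these two calls more concrete, here is a minimal sketch built with the CRI Go types from k8s.io/cri-api (runtime/v1). The pod, container, and UID values are purely illustrative (they match the example output later in this article), and the remaining required fields, the actual gRPC calls, and error handling are omitted:

package main

import (
	"fmt"
	"path/filepath"

	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

func main() {
	// RunPodSandbox: the absolute log directory of the pod is passed in the sandbox config.
	sandboxReq := &runtimeapi.RunPodSandboxRequest{
		Config: &runtimeapi.PodSandboxConfig{
			Metadata: &runtimeapi.PodSandboxMetadata{
				Name:      "counter",
				Namespace: "default",
				Uid:       "f9d6dec4-6ea1-446f-b29d-7de7f292f944",
			},
			LogDirectory: "/var/log/pods/default_counter_f9d6dec4-6ea1-446f-b29d-7de7f292f944/",
		},
	}

	// CreateContainer: the log path is passed relative to the sandbox's log directory.
	containerReq := &runtimeapi.CreateContainerRequest{
		Config: &runtimeapi.ContainerConfig{
			Metadata: &runtimeapi.ContainerMetadata{
				Name:    "counter",
				Attempt: 0, // restart count
			},
			LogPath: "counter/0.log",
		},
		SandboxConfig: sandboxReq.Config,
	}

	// The runtime joins both values to get the absolute path of the container's log file.
	fmt.Println(filepath.Join(sandboxReq.Config.LogDirectory, containerReq.Config.LogPath))
	// /var/log/pods/default_counter_f9d6dec4-6ea1-446f-b29d-7de7f292f944/counter/0.log
}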
Log rotation
With CRI implementations, unlike with Docker, the kubelet is responsible for rotating log files. The log rotation routine is invoked every 10s and can be configured using 2 kubelet flags:
--container-log-max-files Set the maximum number of container log files that can be present for a container. The number must be >= 2. This flag can only be used with --container-runtime=remote.
--container-log-max-size Set the maximum size (e.g. 10Mi) of the container log file before it is rotated. This flag can only be used with --container-runtime=remote.
The corresponding fields in the kubelet config are .containerLogMaxSize & .containerLogMaxFiles.
Details on the structure of the kubelet config can be found in the upstream documentation.
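For example, a minimal KubeletConfiguration snippet setting both fields might look like this (the values shown are illustrative):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Rotate a container's log file once it reaches 10Mi
containerLogMaxSize: 10Mi
# Keep at most 5 log files per container (including the current one)
containerLogMaxFiles: 5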
The <restart-count>.log file will be rotated once it exceeds --container-log-max-size. For the rotation, the current log file is renamed to <restart-count>.log.<timestamp>.
After renaming the log file, the kubelet calls ReopenContainerLog on the container runtime, which makes the runtime create a new log file (the filename is taken from the initial CreateContainer request) to which new logs are forwarded.
The previously rotated files with the <timestamp> suffix will be gzip-compressed on the next rotation.
On the filesystem, this will look something like this:
[root@node-0 counter]# du -sh /var/log/pods/default_counter_f9d6dec4-6ea1-446f-b29d-7de7f292f944/counter/*
1.5M /var/log/pods/default_counter_f9d6dec4-6ea1-446f-b29d-7de7f292f944/counter/0.log
108K /var/log/pods/default_counter_f9d6dec4-6ea1-446f-b29d-7de7f292f944/counter/0.log.20220130-200217.gz
112K /var/log/pods/default_counter_f9d6dec4-6ea1-446f-b29d-7de7f292f944/counter/0.log.20220130-200327.gz
112K /var/log/pods/default_counter_f9d6dec4-6ea1-446f-b29d-7de7f292f944/counter/0.log.20220130-200437.gz
11M /var/log/pods/default_counter_f9d6dec4-6ea1-446f-b29d-7de7f292f944/counter/0.log.20220130-200548
--container-log-max-files includes the current (not yet rotated) file, thus it must be >= 2.
Once the <restart-count>.log file has been rotated, those logs can no longer be fetched using kubectl logs.
Log format
Container logs that are forwarded by the container runtime to /var/log/pods/<namespace>_<pod>_<pod_uid>/<container>/0.log are written in a special CRI format:
# Format: <timestamp> <stream> <tag> <container log message>
#
# timestamp: RFC3339 with nanoseconds, e.g. 2022-01-30T20:30:58.395515030+01:00
# stream : Originating stream; stdout or stderr
# tag : "F" for a full log message, "P" for a partial log message. Messages that exceed the maximal length or don't end with a newline are treated as partial.
# Example
2022-01-30T20:36:58.439215654+01:00 stdout F foo
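When a message exceeds the maximal length or lacks a trailing newline, it is written as one or more partial (P) entries followed by a full (F) entry, e.g. (content is illustrative):

2022-01-30T20:36:58.439215654+01:00 stdout P this is the first chunk of a very long log line,
2022-01-30T20:36:58.439215655+01:00 stdout F and this is the final chunk of the same line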
Metrics
The kubelet provides a gauge for the filesystem usage of container logs: kubelet_container_log_filesystem_used_bytes{uid,namespace,pod,container}.
The metric is exposed via a collector, which avoids keeping metrics around for removed containers.
kubelet_container_log_filesystem_used_bytes{container="counter",namespace="default",pod="counter",uid="eeef7958-380a-40a7-9cfd-3d137a2fa755"} 5.742592e+06
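If the kubelet metrics are scraped by Prometheus, a simple query (a sketch, assuming the default metric name) surfaces the containers whose logs use the most space:

topk(10, kubelet_container_log_filesystem_used_bytes)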
Shipping logs
Shipping logs to a central location solves two issues:
- Log messages are not lost when a container's log file is rotated
- Logs can be accessed in a central location, potentially with a system that offers a DSL for querying logs.
A common way to ship logs is using a log shipper. The log shipper is an agent running on every node, which “ships” logs to a defined destination. An additional feature most log shippers provide is adding metadata to logs. In Kubernetes, this is often the metadata of the pod that was running a container. Common solutions are Fluentd, Fluent Bit & Filebeat.
Example using filebeat
filebeat.autodiscover:
  providers:
    # Watch kubernetes pods
    - type: kubernetes
      # Filter by node the agent is running on
      node: ${NODE_NAME}
      # Allow hints from pod annotations
      hints.enabled: true
      hints.default_config:
        # Create a container input for every container/pod
        type: container
        paths:
          - "/var/log/pods/${data.kubernetes.namespace}_${data.kubernetes.pod.name}_${data.kubernetes.pod.uid}/${data.kubernetes.container.name}/*.log"
This config sets up filebeat to monitor the container logs of pods running on the same node as the filebeat agent. Kubernetes metadata (pod, node, namespace) will be attached to every log event.
The hints settings enable controlling filebeat via pod annotations. For example, you could annotate a pod with co.elastic.logs/enabled: "false", which disables log shipping for the annotated pod.
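A sketch of a pod manifest carrying that annotation (the pod itself is illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: counter
  annotations:
    # Disable log shipping for this pod
    co.elastic.logs/enabled: "false"
spec:
  containers:
    - name: counter
      image: busybox
      command: ["sh", "-c", "while true; do date; sleep 1; done"]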
The type: container config instructs filebeat to use the container input, which can parse logs in the CRI format.