Kubernetes pod shutdown lifecycle
I always forget the details of the Kubernetes pod shutdown lifecycle when I need them, so these are my notes on the subject, now made public.
When a pod is signaled to terminate (deleted, for example), what happens is:
1. The pod enters the Terminating phase;
2. Its preStop lifecycle hooks are executed;
3. SIGTERM is sent to the container.
This entire process has a timeout defined in the terminationGracePeriodSeconds (30 seconds by default). After that, Kubernetes sends a SIGKILL to the container.
During this entire process, the readiness and liveness probes are still checked, but their failure will not cause the container to be killed, as it is already being terminated.
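To make this concrete, here is a minimal sketch (my own outline, not the example app we will deploy below) of a Go server that handles this lifecycle: it catches the SIGTERM and gives in-flight requests a deadline. The 20-second value is an arbitrary assumption picked to stay under the default 30-second grace period.

// graceful-sketch.go — a minimal sketch, not the example app used below.
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}

	// Serve in the background so we can block on the signal below.
	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %s", err)
		}
	}()

	// Wait for the SIGTERM that Kubernetes sends after the preStop hook ran.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
	<-stop

	// Give in-flight requests up to 20s (an arbitrary value chosen to stay under
	// the default 30s grace period). If this outlives what is left of
	// terminationGracePeriodSeconds, the process gets SIGKILLed mid-shutdown.
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("shutdown did not finish cleanly: %v", err)
	}
}

The example app below follows the same pattern, just with a catch-all handler and a deliberately generous 90-second shutdown deadline.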
Let’s set up a playground so we can explore this.
The sources I used are below:
# deploy.yaml
apiVersion: apps/v1 # extensions/v1beta1 has been removed; apps/v1 requires an explicit selector
kind: Deployment
metadata:
  name: test
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      terminationGracePeriodSeconds: 18
      containers:
        - name: test
          image: caarlos0/sinkhole
          ports:
            - containerPort: 8080
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "8"]
          livenessProbe:
            httpGet:
              path: /live
              port: 8080
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
The test image can be generated with this source code:
// main.go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	var addr = ":8080"
	if p := os.Getenv("PORT"); p != "" {
		addr = ":" + p
	}

	// Catch-all handler: optionally sleeps for ?sleep=<duration> before answering,
	// which lets us simulate long-living requests.
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if sleep := r.URL.Query().Get("sleep"); sleep != "" {
			d, err := time.ParseDuration(sleep)
			if err != nil {
				http.Error(w, err.Error(), http.StatusBadRequest)
				return
			}
			time.Sleep(d)
		}
		defer r.Body.Close()
		log.Println(r.URL)
		fmt.Fprintln(w, "ok")
	})

	srv := &http.Server{
		Addr:    addr,
		Handler: mux,
	}

	// Run the server in the background and block until we get a SIGINT/SIGTERM.
	done := make(chan os.Signal, 1)
	signal.Notify(done, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)
	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatalf("listen: %s\n", err)
		}
	}()
	log.Println("server started on " + addr)
	<-done

	// Graceful shutdown: stop accepting new requests and wait up to 90s
	// for in-flight ones to finish.
	log.Println("server stop requested")
	ctx, cancel := context.WithTimeout(context.Background(), 90*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Fatalf("server shutdown failed:%+v", err)
	}
	log.Print("server exit properly")
}
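Note that the manifest’s probes hit /live and /ready while this app only registers the catch-all "/" pattern; Go’s ServeMux routes every path to it, so both probes simply get an ok back. If you’d rather have explicit probe handlers, a hypothetical sketch (not part of sinkhole) could look like this:

// probes-sketch.go — hypothetical explicit probe handlers, not part of sinkhole.
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()

	// Liveness: answer as long as the process is alive.
	mux.HandleFunc("/live", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})

	// Readiness: this is the handler you would flip to a 503 when you want
	// traffic drained away from the pod.
	mux.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})

	log.Fatal(http.ListenAndServe(":8080", mux))
}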
# Dockerfile
FROM golang:alpine as build
WORKDIR /src
COPY . .
RUN go build -o server main.go
FROM alpine
COPY --from=build /src/server server
ENTRYPOINT ["./server"]
And then just run:
docker build -t test .
We can deploy our testing image with:
kubectl apply -f deploy.yaml
And forward it to our local machine:
kubectl port-forward `kubectl get po -l app=test -oname` 8080:8080
From now on, we’ll need five terminals.
In Terminal #1, simulate normal requests:
while true; do
  curl localhost:8080/fake
  sleep 1
done
In Terminal #2, simulate a long-living request:
curl localhost:8080/slow\?sleep=30m
In Terminal #3, watch the pod’s events:
while true; do
  kubectl describe po -l app=test
  sleep 1
done
In Terminal #4, watch the pod’s logs:
kubectl logs -f -l app=test
In Terminal #5, delete the pod:
kubectl delete po -l app=test
As soon as we delete the pod in Terminal #5, our describe on Terminal #3 will show something like this:
Normal Killing 5s kubelet, node-12108603-9g16 Stopping container test
This will trigger our preStop hook, which is just an 8-second sleep.
After some time, we see more events:
Warning Unhealthy 8s kubelet, node-12108603-9g16 Readiness probe failed: Get http://172.16.6.7:8080/ready: dial tcp 172.16.6.7:8080: connect: connection refused
Warning Unhealthy 5s kubelet, node-12108603-9g16 Liveness probe failed: Get http://172.16.6.7:8080/live: dial tcp 172.16.6.7:8080: connect: connection refused
Also, in our pod’s logs, we see a server stop requested message, meaning we got a SIGTERM and the server is not accepting new incoming requests anymore. Our slow request on Terminal #2 is still there, though.
After another ~10s (the terminationGracePeriodSeconds of 18s minus the preStop hook’s 8s), the pod receives a SIGKILL because it hasn’t finished in time due to our slow request.
We can now run this scenario again, but this time with a shorter slow request (e.g. 15s). The pod then exits in time, and we’ll see in its logs that it exited with status 0:
2020/07/30 18:47:03 server stop requested
2020/07/30 18:47:03 server exit properly
That’s because all requests finished, so our process finished as well.
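More precisely, http.Server.Shutdown first stops accepting new connections and then blocks until every in-flight request has finished (or until the context you pass it expires). Here’s a small self-contained sketch of that behavior that you can run outside Kubernetes; the /slow handler and the durations are made up just for illustration:

// shutdown-waits.go — a sketch showing that Shutdown waits for in-flight requests;
// the durations here are arbitrary, just for illustration.
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"time"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(5 * time.Second) // simulate a long-living request
		fmt.Fprintln(w, "done")
	})

	srv := &http.Server{Addr: ":8081", Handler: mux}
	go srv.ListenAndServe()
	time.Sleep(100 * time.Millisecond) // give the listener a moment to start

	// Fire an in-flight request in the background.
	go func() {
		resp, err := http.Get("http://localhost:8081/slow")
		if err != nil {
			log.Println("request failed:", err)
			return
		}
		resp.Body.Close()
		log.Println("slow request finished")
	}()
	time.Sleep(100 * time.Millisecond) // make sure the request is in flight

	// Shutdown returns only after the slow handler above has completed,
	// because the 10s context outlives the 5s request.
	start := time.Now()
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Println("shutdown error:", err)
	}
	log.Printf("shutdown returned after %s", time.Since(start).Round(time.Second))
}

Running it, the last log line should report that Shutdown only returned after roughly the 5 seconds the slow handler took.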
Kubernetes is powerful, and because you can do a lot of things with it, it’s normal to be confused by it at times.
The shutdown behavior in particular has confused me more times than I can count (“does the preStop hook count against the grace period? I don’t remember”, et al.), which is why I wrote this down for quick reference and figured I might as well make it public.
The Go app used in the example is one I often use to test and debug things that need a dummy web server; you can just use caarlos0/sinkhole instead of building it locally if you prefer (also check the full source code).
That’s it, hope it helps. 🙂