Docker daemon dies on boot when cgroupsPath used #23
Comments
#14 says the change was based on https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources, but I don't see any reference to …
Looks like removing the …
I had thought this was correlated with … I added …
It's a bit hard to be sure, but using …
Putting …, I wonder if between them they are blowing some sort of cgroup memory limit when they all share the same space and triggering something like an OOM? I'm seeing no sign of that in …
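A quick way to check that hypothesis is to look at whether the shared group actually has a memory limit applied and which pids are sitting in it. A minimal sketch, assuming a cgroup v1 hierarchy mounted at /sys/fs/cgroup and a placeholder group name `podruntime` (the real cgroupsPath from #14 may differ):

```go
// Sketch: inspect a shared cgroup (v1, mounted at /sys/fs/cgroup) to see
// whether any memory limit is actually applied and which pids share it.
// "podruntime" is a placeholder for whatever cgroupsPath #14 used.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	cg := "podruntime" // placeholder group name

	limit, err := os.ReadFile(filepath.Join("/sys/fs/cgroup/memory", cg, "memory.limit_in_bytes"))
	if err != nil {
		fmt.Println("no memory controller entry:", err)
	} else {
		// An effectively unlimited group reports a very large value here.
		fmt.Println("memory.limit_in_bytes:", strings.TrimSpace(string(limit)))
	}

	procs, err := os.ReadFile(filepath.Join("/sys/fs/cgroup/memory", cg, "cgroup.procs"))
	if err != nil {
		fmt.Println("cannot read cgroup.procs:", err)
		return
	}
	fmt.Println("pids sharing the group:")
	fmt.Print(string(procs))
}
```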
That's really odd. The cgroups just create the groups, but no limits are applied. Hmm.
I wondered if maybe there was some unhelpful default (e.g. 50% of RAM) in the kernel (seems a little unlikely) or if something (like …
It can't be kubelet because this happens even when that service is still sat in our …
I'm playing with auditd trying to get a handle on the process doing the killing. I'm not 100% sure yet, but it looks like it is a …
The process which is killing …
which is: …
Its parent is the … So it does appear that tearing down the common cache service (which is the first to exit) is tearing down all the processes in that cgroup.
@justincormack is it notable that 662d3d4 sets …
So it seems to be expected that this will nuke the whole cgroup.
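For reference, tearing down a container's cgroup broadly amounts to freezing the group, SIGKILLing every pid listed in cgroup.procs, thawing, and removing the directory, so anything sharing the same cgroupsPath goes down with the first container to exit. A simplified sketch of that teardown, assuming cgroup v1 with a freezer controller (this is not the actual runc/containerd code):

```go
package main

import (
	"os"
	"path/filepath"
	"strconv"
	"strings"
	"syscall"
)

// killCgroup mimics, in simplified form, how a runtime tears down a cgroup:
// freeze it, SIGKILL every member, thaw, then remove the directory.
func killCgroup(name string) error {
	freezer := filepath.Join("/sys/fs/cgroup/freezer", name)

	// Freeze so members cannot fork while they are being killed.
	if err := os.WriteFile(filepath.Join(freezer, "freezer.state"), []byte("FROZEN"), 0644); err != nil {
		return err
	}

	data, err := os.ReadFile(filepath.Join(freezer, "cgroup.procs"))
	if err != nil {
		return err
	}
	for _, field := range strings.Fields(string(data)) {
		if pid, err := strconv.Atoi(field); err == nil {
			// Every process in the group is signalled, regardless of which
			// container originally started it.
			syscall.Kill(pid, syscall.SIGKILL)
		}
	}

	// Thaw so the pending SIGKILLs are delivered, then remove the empty group.
	if err := os.WriteFile(filepath.Join(freezer, "freezer.state"), []byte("THAWED"), 0644); err != nil {
		return err
	}
	return os.Remove(freezer)
}

func main() {
	// "podruntime" stands in for the shared cgroupsPath used by #14.
	_ = killCgroup("podruntime")
}
```

Freezing first is what makes the kill reliable: nothing in the group can fork while the signals are being delivered, which is also why every container sharing the group is caught.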
containerd/containerd#1932 (comment) says:
So I guess #14 needs to be redone using sub-cgroups under a namespace. @justincormack will you look into that?
This replaces the intent of linuxkit#14 (which was reverted in linuxkit#24). Compared with that:
- Use a separate (nested) cgroup for each component since having multiple containers in the same cgroup has caveats (see linuxkit#23).
- Tell kubelet about system and runtime cgroups which contain all the others.
Fixes linuxkit#23.
Signed-off-by: Ian Campbell <[email protected]>
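In outline, the reworked approach gives every component its own nested cgroup under a small number of parents and then points kubelet at those parents via its --runtime-cgroups and --system-cgroups flags. A sketch under assumed names (the parents `podruntime` and `systemreserved` and the child list are illustrative, not the actual LinuxKit configuration):

```go
// Sketch: one nested cgroup per component under common parents, so that
// tearing one down cannot take the others with it. The names "podruntime"
// and "systemreserved" are illustrative, not the actual LinuxKit layout.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	groups := map[string][]string{
		"podruntime":     {"containerd", "docker", "kubelet"},
		"systemreserved": {"sshd", "rngd"},
	}

	for parent, children := range groups {
		for _, child := range children {
			// cgroup v1 memory hierarchy shown; each component gets its own
			// sub-directory instead of sharing a single cgroupsPath.
			dir := filepath.Join("/sys/fs/cgroup/memory", parent, child)
			if err := os.MkdirAll(dir, 0755); err != nil {
				fmt.Println("mkdir:", err)
			}
		}
	}

	// kubelet is then pointed at the parents that contain everything else.
	fmt.Println("kubelet --runtime-cgroups=/podruntime --system-cgroups=/systemreserved ...")
}
```

With one sub-cgroup per service, the runtime's teardown of any single container only ever signals that container's own processes.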
After #14, Docker does not start. Shortly after boot, without taking any other action, the state becomes:
The logs are uninteresting:
It seems that reverting #14 fixes things. I'll double check and raise a PR to revert while we sort this out.
/cc @justincormack. This was also mentioned in #11 (comment).