Bug #5447

closed

docker "failed to get digest"

Added by laforge about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Low
Assignee:
Target version: -
Start date: 02/08/2022
Due date:
% Done: 100%
Spec Reference:

Description

Every so often we see spontaneous build failures on our Jenkins slave that look like this:

failed to get digest sha256:e8b3f56b281aa832fb0664d1553d17d2dc93217ece10b440bd9c2c492107c31a: open /var/lib/docker/image/overlay2/imagedb/content/sha256/e8b3f56b281aa832fb0664d1553d17d2dc93217ece10b440bd9c2c492107c31a: no such file or directory
../make/Makefile:87: recipe for target 'docker-build' failed
make: *** [docker-build] Error 1

from https://jenkins.osmocom.org/jenkins/view/All%20no%20Gerrit/job/nplab-m3ua-test/1677/console

To me this looks like something purged intermediate docker layers while the docker build was running. Perhaps one of our docker cleanup tasks?

The example job above ran at 3am (UTC?) on host2-deb9build-ansible.

There are more examples with similar problems in the recent build history:

https://jenkins.osmocom.org/jenkins/view/All%20no%20Gerrit/job/nplab-m3ua-test/1667/console

Actions #1

Updated by osmith about 2 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 90

Yes, this is caused by the docker-cleanup script. It will be resolved by the patches in https://gerrit.osmocom.org/q/topic:docker-clean. With them, only the images that have gone unused the longest are removed, and only until a size limit is reached. Images that are currently in use should no longer get removed.
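To illustrate the cleanup policy described in this comment, here is a minimal Python sketch. It is not the actual osmo-ci script: the `Image` record, its fields, and `pick_images_to_remove` are hypothetical names chosen for illustration, and it assumes the caller already knows each image's size, last-use time, and whether it is in use.

```python
from dataclasses import dataclass


@dataclass
class Image:
    # Hypothetical record; the real script's data model may differ.
    name: str
    size_bytes: int
    last_used: float  # Unix timestamp of the last use
    in_use: bool = False


def pick_images_to_remove(images, size_limit_bytes):
    """Pick least-recently-used images until the total size fits the limit."""
    total = sum(img.size_bytes for img in images)
    to_remove = []
    # Oldest last_used first, so long-unused images are removed first.
    for img in sorted(images, key=lambda i: i.last_used):
        if total <= size_limit_bytes:
            break
        if img.in_use:
            # Never remove an image that is currently in use.
            continue
        to_remove.append(img)
        total -= img.size_bytes
    return to_remove
```

Note that a policy like this only protects images it can recognize as in use; untagged intermediate images of a running build may not be covered.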

Actions #2

Updated by osmith about 2 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100

Actions #3

Updated by osmith about 2 years ago

  • Status changed from Resolved to In Progress
  • % Done changed from 100 to 90

This still happens, even with the new cleanup script. As I understand it: while an image is being built, the results of all intermediate steps are dangling images. Only at the very end, when the last image gets tagged, do the steps stop being dangling. So if the cleanup script runs after an image build has started but before its last step has finished, it removes the images of the steps completed so far, and the build then fails with the error message above.

I've adjusted the timer by 10 minutes; this should fix it.
https://gerrit.osmocom.org/c/osmo-ci/+/27349
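
For illustration only, here is a sketch of a different way to avoid the race described above: skip dangling images that are younger than a grace period, so that intermediate steps of a build still in progress are left alone. This is not what the linked patch does (it changes the timer). The function name, the dict keys, and the one-hour default are assumptions.

```python
def removable_dangling(images, now, min_age_seconds=3600):
    """Return ids of dangling images that are at least min_age_seconds old.

    images: list of dicts with the hypothetical keys 'id', 'created'
    (Unix timestamp) and 'dangling' (bool).
    """
    return [
        img["id"]
        for img in images
        # Young dangling images may belong to a running build, so keep them.
        if img["dangling"] and now - img["created"] >= min_age_seconds
    ]
```

The grace period would have to be longer than the longest expected image build. `docker image prune` offers a comparable age-based `until` filter, though whether it suits this setup has not been checked here.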

Actions #4

Updated by osmith about 2 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100