"update-osmo-ci-on-slaves" hangs: gtp0-deb10build32 is offline
The "update-osmo-ci-on-slaves" job is hanging forever, waiting for gtp0-deb10build32 to come online.
1. Is the node supposed to be offline?
2. Having the job "update-osmo-ci-on-slaves" hang forever if one of the nodes is offline isn't so great:
- it does not send a notification mail
- new changes from osmo-ci.git are only rolled out after I (or someone else) manually cancel the currently running job
Possible improvements:
- Use the build-timeout plugin, so we can abort the build after one hour or so (which should be enough time to build docker containers etc. if needed). Then we should actually get failure mails if a node is unexpectedly down.
  - Not sure whether this plugin is currently maintained though; the Jenkins page says "This plugin is up for adoption!"
- the matrix-project plugin we already use has an elastic-axis extension. If we install that, we can skip offline nodes.
- If we want a mail notification that nodes are not available, we can probably set something up too, as a separate job or elsewhere in the Jenkins config.
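
For the notification idea, a separate job could look at what Jenkins reports under `/computer/api/json` and pick out the nodes with `offline` set. A rough sketch in Python, assuming that JSON has already been fetched; the sample data and the second node name are made up for illustration, and fetching and mailing are left out:

```python
import json


def offline_nodes(computer_api_json):
    """Return the display names of all nodes Jenkins reports as offline.

    Expects the JSON text returned by <jenkins-url>/computer/api/json.
    """
    data = json.loads(computer_api_json)
    return [
        node["displayName"]
        for node in data.get("computer", [])
        if node.get("offline")
    ]


# Hypothetical sample response, reduced to the fields used above.
# "example-node" is an invented name, not a real Osmocom build node.
SAMPLE = json.dumps({
    "computer": [
        {"displayName": "example-node", "offline": False},
        {"displayName": "gtp0-deb10build32", "offline": True},
    ]
})

if __name__ == "__main__":
    for name in offline_nodes(SAMPLE):
        print("offline:", name)
```

A job running something like this could exit non-zero when the list is not empty, so the usual failure mail goes out.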
Updated by laforge about 1 month ago
On Thu, Sep 16, 2021 at 08:10:17AM +0000, osmith [REDMINE] wrote:
> 1. Is the node supposed to be offline?
It is as much supposed to be offline as everything in my basement is "supposed" to be offline,
as it has been physically removed due to water-damage-related reconstruction. Sorry for that.
- Status changed from New to Resolved
- % Done changed from 0 to 100
The nodes have been back online for at least the past week or so.
After the Debian 11 upgrade of the root OS, I had to add the "systemd.unified_cgroup_hierarchy=0" kernel argument in order to make docker-in-deb9-lxc and docker-in-deb10-lxc continue to work.
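
To check on a host whether that argument is actually in effect, something like the following could be used. It is only a sketch: it assumes a Linux host with `/proc/cmdline` and cgroups mounted at `/sys/fs/cgroup`, and it relies on the `cgroup.controllers` file being present at the top of that mount only when the unified (v2) hierarchy is in use. The function names are made up here.

```python
from pathlib import Path

ARG = "systemd.unified_cgroup_hierarchy=0"


def has_kernel_arg(cmdline, arg=ARG):
    """True if arg appears as its own word on the kernel command line."""
    return arg in cmdline.split()


def cgroup_layout(root="/sys/fs/cgroup"):
    """Guess the cgroup layout from the files under the cgroup mount point."""
    root = Path(root)
    if not root.is_dir():
        return "unknown"
    if (root / "cgroup.controllers").exists():
        return "unified (v2)"
    return "legacy or hybrid (v1)"


if __name__ == "__main__":
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.exists():
        print("argument set:", has_kernel_arg(cmdline_file.read_text()))
    print("cgroup layout:", cgroup_layout())
```

With the argument set and the host rebooted, the layout should no longer be reported as unified.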