Bug #5061

build2-deb9build-ansible is offline

Added by osmith 8 months ago. Updated 8 months ago.

Status: Resolved
Priority: High
Assignee: sysmocom
Category: -
Target version: -
Start date: 03/08/2021
Due date:
% Done: 100%
Spec Reference:

Description

The "update-osmo-ci-on-slaves" job is hanging forever, waiting for build2-deb9build-ansible to come online.

https://jenkins.osmocom.org/jenkins/computer/build2-deb9build-ansible/ says, "This node is being launched. See log for more details", but the log is empty. The "Relaunch agent" button also doesn't help.

I'm clicking on "Mark this node temporarily offline" now, expecting that this makes the job pass again.

EDIT: I've temporarily removed the node from the update-osmo-ci-on-slaves job to make it pass.


Related issues

Related to Osmocom.org Servers - Bug #5234: "update-osmo-ci-on-slaves" hangs: gtp0-deb10build32 is offline (Resolved, 09/16/2021)

History

#1 Updated by osmith 8 months ago

  • Description updated (diff)

#2 Updated by laforge 8 months ago

On Mon, Mar 08, 2021 at 12:40:33PM +0000, osmith [REDMINE] wrote:

The "update-osmo-ci-on-slaves" job is hanging forever, waiting for build2-deb9build-ansible to come online.

https://jenkins.osmocom.org/jenkins/computer/build2-deb9build-ansible/ says, "This node is being launched. See log for more details", but the log is empty. The "Relaunch agent" button also doesn't help.

Strange. I can SSH to both the physical machine and the deb9build-ansible container.

There's no 'out of disk space' or other obvious error condition.

It also seems to be marked active in Jenkins again now?

#3 Updated by osmith 8 months ago

laforge wrote:

On Mon, Mar 08, 2021 at 12:40:33PM +0000, osmith [REDMINE] wrote:

The "update-osmo-ci-on-slaves" job is hanging forever, waiting for build2-deb9build-ansible to come online.

https://jenkins.osmocom.org/jenkins/computer/build2-deb9build-ansible/ says, "This node is being launched. See log for more details", but the log is empty. The "Relaunch agent" button also doesn't help.

Strange. I can SSH to both the physical machine and the deb9build-ansible container.

There's no 'out of disk space' or other obvious error condition.

It also seems to be marked active in Jenkins again now?

Hm, I just clicked "Bring this node online again" and it is in the same state as before: "This node is being launched. See log for more details" with an empty log.

#4 Updated by laforge 8 months ago

On Tue, Mar 09, 2021 at 10:45:03AM +0000, osmith [REDMINE] wrote:

Hm, I just clicked "Bring this node online again" and it is in the same state as before: "This node is being launched. See log for more details" with an empty log.

Strangely enough:

jenkins@jenkins:/$ ssh root@2a01:4f8:10b:2ad9::1:6
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:8QcoRrcTAPfFt9ZRHWmi4A0C0ljnwHZOqsVBrRcmSt4.
Please contact your system administrator.
Add correct host key in /var/jenkins_home/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /var/jenkins_home/.ssh/known_hosts:10
remove with:
ssh-keygen -f "/var/jenkins_home/.ssh/known_hosts" -R 2a01:4f8:10b:2ad9::1:6
ECDSA host key for 2a01:4f8:10b:2ad9::1:6 has changed and you have requested strict checking.
Host key verification failed.

Did your ansible-rpi4-... work suddenly replace the SSH key on other build slaves?

[not a serious question, more joking]
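
For reference, the stale entry reported above could be cleared along the lines of the hint that ssh prints. This is only a sketch under assumptions: the thread does not record what was actually done to bring the node back, the helper name `drop_stale_host_key` is made up for illustration, and the new fingerprint should be verified through a trusted channel before the host is accepted again.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): drop a stale known_hosts entry.
# Usage: drop_stale_host_key HOST [KNOWN_HOSTS_FILE]
drop_stale_host_key() {
    host=$1
    known_hosts=${2:-"$HOME/.ssh/known_hosts"}

    # "ssh-keygen -F" exits non-zero when no entry for the host is found.
    if ! ssh-keygen -F "$host" -f "$known_hosts" >/dev/null; then
        echo "no entry for $host in $known_hosts"
        return 0
    fi

    # Remove the entry; ssh-keygen keeps the old file as "$known_hosts.old".
    ssh-keygen -R "$host" -f "$known_hosts" >/dev/null 2>&1 || return 1
    echo "removed entry for $host from $known_hosts"
}
```

On the Jenkins host this would be called as `drop_stale_host_key 2a01:4f8:10b:2ad9::1:6 /var/jenkins_home/.ssh/known_hosts`. It only removes the old key; it does not explain why the host key changed in the first place.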

#5 Updated by osmith 8 months ago

Not that I'm aware of. I also checked my shell history and didn't find anything related; I've only operated on rpi* hosts.

The node is back online, looks like you have resolved it now. Can we mark this issue as resolved, or should we investigate further?

#6 Updated by laforge 8 months ago

On Wed, Mar 10, 2021 at 09:47:33AM +0000, osmith [REDMINE] wrote:

The node is back online, looks like you have resolved it now. Can we mark this issue as resolved, or should we investigate further?

Yes, it can be resolved.

#7 Updated by osmith 8 months ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

#8 Updated by osmith about 1 month ago

  • Related to Bug #5234: "update-osmo-ci-on-slaves" hangs: gtp0-deb10build32 is offline added
