Bug #5654

closed

OBS errors for libosmo-pfcp, osmo-upf, osmo-hnbgw

Added by osmith 4 months ago. Updated 3 months ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
Start date: 08/22/2022
Due date:
% Done: 100%
Spec Reference:

Description

Meta-issue for the multiple errors we currently have on obs.osmocom.org, which have been occurring since https://gerrit.osmocom.org/c/osmo-ci/+/29047 was merged.

I voted +1 myself on that patch but now realize that we should really change the workflow from:
  • let jenkins push packages to OBS
  • see the failures on OBS over the next day(s)
  • fix it up over multiple iterations
to:
  • verify first that the packages will build on OBS
  • only then add them to the jenkins job

The problem is that otherwise we generate a lot of mails for the failed builds, and other build errors from projects that were not recently added get lost in the noise.
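The "verify first, then add" workflow could be sketched as a simple gate. This is only an illustration: the `packages_ready` helper, the shape of the `results` mapping, and the sample data are all assumptions rather than anything from the osmo-ci scripts. The status strings follow the build result codes OBS reports (`succeeded`, `failed`, `unresolvable`, ...).

```python
# Hypothetical gate: a package may be added to the jenkins job only if
# every target of its test build (e.g. in a home: project) passed.
# Result codes that do not count as a problem for a target:
OK_CODES = {"succeeded", "disabled", "excluded"}


def packages_ready(results):
    """results maps package -> {target: OBS result code}.

    Returns (ready, blocked): ready is a sorted list of package names,
    blocked maps each remaining package to its non-passing targets.
    """
    ready, blocked = [], {}
    for package, targets in sorted(results.items()):
        bad = {t: code for t, code in targets.items() if code not in OK_CODES}
        if bad or not targets:
            # No results at all also blocks: nothing has been verified yet.
            blocked[package] = bad
        else:
            ready.append(package)
    return ready, blocked


# Made-up sample data, not actual build results.
results = {
    "libosmo-pfcp": {"Debian_11": "succeeded", "openSUSE_Tumbleweed": "succeeded"},
    "osmo-upf": {"Debian_11": "failed", "openSUSE_Tumbleweed": "unresolvable"},
}
ready, blocked = packages_ready(results)
print("ready:", ready)
print("blocked:", blocked)
```

Only the packages in `ready` would then be added to the jenkins job config; everything in `blocked` stays out until its packaging is fixed and reviewed.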

I'll investigate which errors I can fix right now, and possibly revert the patch so we can properly fix the packaging, have it go through code review, and only then enable it again.

My impression is that we (or at least I) always spend quite some time on fixing packaging once a new project is added. I think having this in CI would be a big time saver in the future (#2385), so I'll also look into that again.


Checklist

  • libosmo-pfcp nightly: rpm spec lint errors for opensuse builds (OS#5653)
  • libosmo-pfcp latest: missing changelog (-> jenkins job can't generate the source package)
  • osmo-hnbgw nightly: rpm doesn't build
  • osmo-upf nightly: lots of errors for deb and rpm, for suse currently unresolvable
  • add note to obs scripts config to build-test first, before adding new projects to the config

Related issues

Related to libosmo-pfcp - Bug #5653: various rpmlint errors (Resolved, osmith, 08/21/2022)

Related to Cellular Network Infrastructure - Feature #2385: validate debian rules/control as part of jenkins build testing (Resolved, osmith, 07/21/2017)

Actions #1

Updated by osmith 4 months ago

  • Related to Bug #5653: various rpmlint errors added
Actions #2

Updated by osmith 4 months ago

  • Related to Feature #2385: validate debian rules/control as part of jenkins build testing added
Actions #3

Updated by laforge 4 months ago

On Mon, Aug 22, 2022 at 09:05:21AM +0000, osmith wrote:

I'll investigate which errors I can fix right now, and possibly revert the patch so we can properly fix the packaging, have it go through code review, and only then enable it again.

thanks. I already fixed some yesterday after getting spammed with tons of build failures every day of my
holidays without anyone seeming to care.

My impression is that we (or at least I) always spend quite some time on fixing packaging once a new project is added.

The same also applies to me. The expectation is that if somebody
creates a new project, that the packaging is in place and has been
tested locally or in a private OBS project first.

If a patch adding a project to jenkins nightly/latest builds is
submitted for review in gerrit, I am of course assuming that the
actual packages build fine before such a patch is submitted for review.
If it is not, it should remain WIP or not in gerrit at all.

Sure, there can always be some unexpected fall-out, but the issues we're
seeing currently appear 100% predictable. Like dependencies not being
available at all, or libosmo-foo not being listed as a package dependency
while configure.ac depends on it.
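A mismatch between configure.ac and the packaging metadata of the kind described above can be caught with a small script before anything is pushed to OBS. The following is a minimal sketch, not an existing osmo-ci tool: the function names are hypothetical, the sample inputs and version numbers are made up, and the default `<module>-dev` mapping from pkg-config module to Debian package is an assumption that does not hold in general (a single -dev package may ship several pkg-config modules), so a real check would need an explicit mapping.

```python
import re

# Comparison operators pkg-config accepts in a module spec.
OPERATORS = {"=", "!=", "<", ">", "<=", ">="}


def pkgconfig_modules(configure_ac):
    """Collect module names from PKG_CHECK_MODULES(VAR, name [op version] ...)."""
    names = []
    pattern = r"PKG_CHECK_MODULES\(\s*\[?\w+\]?\s*,\s*\[?([^\]),]+)"
    for spec in re.findall(pattern, configure_ac):
        skip_version = False
        for token in spec.split():
            if skip_version:
                skip_version = False      # this token is a version number
            elif token in OPERATORS:
                skip_version = True       # next token is a version number
            else:
                names.append(token)
    return names


def build_depends(control):
    """Return package names from the Build-Depends field, versions dropped.

    For alternatives ("a | b") only the first package is kept.
    """
    match = re.search(r"^Build-Depends:(.*(?:\n[ \t].*)*)", control, re.M)
    if not match:
        return set()
    entries = (e.strip() for e in match.group(1).split(","))
    return {re.split(r"[\s(]", e, maxsplit=1)[0] for e in entries if e}


def missing_build_depends(configure_ac, control, package_for=lambda m: m + "-dev"):
    """List (module, expected package) pairs absent from Build-Depends."""
    listed = build_depends(control)
    return [(m, package_for(m)) for m in pkgconfig_modules(configure_ac)
            if package_for(m) not in listed]


# Made-up inputs for illustration only.
CONFIGURE_AC = """
PKG_CHECK_MODULES(LIBOSMOCORE, libosmocore >= 1.7.0)
PKG_CHECK_MODULES(LIBOSMOFOO, libosmo-foo >= 0.1.0)
"""
CONTROL = """Source: osmo-example
Build-Depends: debhelper (>= 10),
               libosmocore-dev (>= 1.7.0)
Standards-Version: 3.9.8
"""
missing = missing_build_depends(CONFIGURE_AC, CONTROL)
print(missing)
```

The same idea would apply to the `BuildRequires:` lines of an rpm spec file, with a different parser and package naming scheme.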

Actions #4

Updated by osmith 4 months ago

  • Checklist item libosmo-pfcp nightly: rpm spec lint errors for opensuse builds (OS#5653) set to Done
  • Checklist item libosmo-pfcp latest: missing changelog (-> jenkins job can't generate the source package) set to Done
  • Checklist item osmo-hnbgw nightly: rpm doesn't build set to Done
  • Checklist item osmo-upf nightly: lots of errors for deb and rpm, for suse currently unresolvable set to Done
  • Checklist item add note to obs scripts config to build-test first, before adding new projects to the config set to Done
  • % Done changed from 0 to 90

laforge wrote:

without anyone seeming to care.

FWIW there was a patch in https://gerrit.osmocom.org/c/osmo-upf/+/29142 but it didn't fix all errors.

checklist wrote:

libosmo-pfcp nightly: rpm spec lint errors for opensuse builds (OS#5653)

libosmo-pfcp latest: missing changelog (-> jenkins job can't generate the source package)

https://gerrit.osmocom.org/q/topic:pfcp-packaging

osmo-hnbgw nightly: rpm doesn't build

https://gerrit.osmocom.org/c/osmo-hnbgw/+/29187

osmo-upf nightly: lots of errors for deb and rpm, for suse currently unresolvable

add note to obs scripts config to build-test first, before adding new projects to the config

https://gerrit.osmocom.org/c/osmo-ci/+/29190

Actions #6

Updated by osmith 3 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100
Actions #7

Updated by neels 3 months ago

TLDR:
- I would appreciate if we could avoid an atmosphere of blame.
- Please let me merge my own patches when non-trivial, WIP or not.
- Many thanks for fixing things, where it would have been my task.
- Build fallout happens, and a brief nudge on IM is the best way to deal with it.

My impression is that we (or at least I) always spend quite some time on fixing packaging once a new project is added.

The same also applies to me.

I feel like you're directing this at me. I have and am spending massive time on
the packaging, fixed dozens of errors already. It is complex and most of it is
new to me.

Many interdependent patches in 4 gits (2 brand new) affecting our CI infra,
which is also in flux. I am testing extensively on my OBS "home project",
errors pop up incrementally, on various OS targets. All needs CR, and needs
merge in a specific order. Naturally i will make a fair number of mistakes,
some more obvious than others, and it will take its time to settle.

This is a perfectly normal part of the process, and, obviously, errors are not
something i do on purpose. The tone matters, it is frustrating to be implicated
as being negligent while juggling six balls and an axe.

I do very much appreciate the help that others volunteer to eradicate also the
last mistakes that remain, even if it would technically have been my task.
It would be good to stay in contact on jabber about this.

The expectation is that if somebody creates a new project, that the packaging
is in place and has been tested locally or in a private OBS project first.

Idk whose expectation you are referring to, this sentence is utopian to me.
Would be good, but practically impossible to achieve: it obviously takes a lot
of time and effort to get working jenkins builds and binary packaging.

If a patch adding a project to jenkins nightly/latest builds is
submitted for review in gerrit, I am of course assuming that the
actual packages build fine before such a patch is submitted for review.
If it is not, it should remain WIP or not in gerrit at all.

It is important to get review early, especially when doing unfamiliar things.
WIP patches are usually being ignored completely.

If someone else merges my patch, it would be that person's responsibility to
watch out for build fallout, and to revert the patch if necessary. Or rather,
we should work together to fix things, whoever broke it.

But I prefer to merge my own patches, marked WIP or not, especially when there
is interdependence. It is better to ping me on jabber asking whether i forgot
to merge some patch. I'm pretty sure I have asked this at least once before.

I probably also did merge patches myself that caused build fallout.
This happens, and a brief nudge on IM is the best way to deal with it.

~N

Actions #9

Updated by laforge 3 months ago

Hi Neels,

On Tue, Aug 23, 2022 at 03:21:42PM +0000, neels wrote:

- I would appreciate if we could avoid an atmosphere of blame.

I'm sorry, this is not the intention, but with the many tasks I'm
juggling with, and particularly the extremely limited time I have to
spend on important work issues while on holidays I can sometimes get a
bit harsh.

Nobody is accusing you or anyone of intentionally causing fall-out.
It just looks like some if not the majority of it was preventable. The
majority of the problems should have become visible in a home:* build of
the package in question and/or its dependencies. Still right now, after
probably more than a week we don't even have the dpkg/spec/configure.ac
dependencies all in a working order.

- Build fallout happens, and a brief nudge on IM is the best way to deal with it.

I think my assumption is that anyone is aware of build errors the moment
they generate notification mails, and hence there's no need for any
additional redundant communication on that.

I feel like you're directing this at me.

In this specific instance it was your work that caused the fall-out. I
don't really keep mental per-developer histograms of whose work caused
which amount of build failures, so I cannot really say who might have
triggered it at earlier times.

Having something fail once is not the problem (IMHO). If a nightly job
fails one night, then everything should be done to either revert the
patch or fix all those problems in the next day, so that the next
nightly build doesn't generate 30+ error mails again.

In this case now I had the feeling that this has been going on for the
entire week of my holidays. So every time I looked at my e-mails every
1-2 days, I was buried in build errors that weren't fixed any of those
days. Sorry if that was just a subjective exaggeration.

I have and am spending massive time on
the packaging, fixed dozens of errors already. It is complex and most of it is
new to me.

That's probably the surprising part for me: I was under the assumption
that all of us had done [osmocom] packaging work in the past. Given the
number of projects/repos we have and the number of dependencies, etc. I
wouldn't have assumed that there are people involved in the project that
long but never went through it.

This is a perfectly normal part of the process, and, obviously, errors are not
something i do on purpose. The tone matters, it is frustrating to be implicated
as being negligent while juggling six balls and an axe.

I think the main problem from my point of view is the duration. As
stated, any build failures should normally be dealt with (or reverted)
before the next nightly build fails again.

If something was failing and fixed the next day, or 2 days later, I
wouldn't have mentioned it. But we're talking about a different scale
here.

It would be good to stay in contact on jabber about this.

It usually is a situation where one already has no time at all, but the
ongoing failures are reaching a point where it is hard or impossible to
ignore them any further. So I feel compelled to do some emergency fixes to
remediate the situation. Starting a discussion via whatever medium
(particularly a real-time one) doesn't feel like it's going to change
the situation right now, while a related fix does.

The expectation is that if somebody creates a new project, that the packaging
is in place and has been tested locally or in a private OBS project first.

Idk whose expectation you are referring to, this sentence is utopian to me.

I was referring to what I understood as general "common sense"
expectation, but it certainly was my personal expectation.

Would be good, but practically impossible to achieve: it obviously takes a lot
of time and effort to get working jenkins builds and binary packaging.

I'm quite certain that the kind of missing dpkg/rpm dependencies and missing
packages-depended-upon would have been visible immediately in a
"home:..." OBS project before we start to attempt any "master/nightly"
builds in the official OBS project.

OBS builds in home: should not differ by one bit from those in osmocom:*
if you use the same project config.

WIP patches are usually being ignored completely.

That's unfortunately true with the default setting "-is:wip" in gerrit.
I guess the only time I am reviewing WIP patches is when they are part
of a group of patches of which some others are non-WIP and I go through
them step by step.

One method I've seen several times as a work-around is to not flag a patch
as WIP but have the author set a CR-1 or -2 themselves (together with a
one-line note that this is not to be merged yet). This way it's clear that
the patch should under no circumstances be merged. Not ideal, but
certainly makes it very clear.

If someone else merges my patch, it would be that person's responsibility to
watch out for build fallout, and to revert the patch if necessary. Or rather,
we should work together to fix things, whoever broke it.

I think this is where we disagree to some extent. If your patch causes build
failures because it depends on some other changes in some other repositories, then
that patch should have a "Depends: foo.git I12341234change_id" in it. This way
it is explicit that e.g. a jenkins build job should only be merged after the respective
commit with the jenkins.sh in the target repo has been merged.
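To illustrate how such a "Depends:" line makes the merge order checkable by a tool, here is a minimal sketch. It assumes the line format quoted above (repository name ending in ".git", followed by a Gerrit Change-Id, i.e. "I" plus 40 hex digits); the helper names, the sample commit message, and the Change-Id are made up for the example, and the `merged` set stands in for whatever query against gerrit a real tool would do.

```python
import re

# "Depends: <repo>.git <Change-Id>", one per line in the commit message.
DEPENDS_RE = re.compile(r"^Depends:\s*(\S+?)\.git\s+(I[0-9a-f]{40})\s*$", re.M)


def parse_depends(commit_message):
    """Return a list of (repo, change_id) pairs declared in the message."""
    return DEPENDS_RE.findall(commit_message)


def unmerged_depends(commit_message, merged):
    """Return declared dependencies that are not in the merged set yet.

    merged is a set of (repo, change_id) pairs known to be merged.
    """
    return [dep for dep in parse_depends(commit_message) if dep not in merged]


# Made-up Change-Id and commit message for illustration only.
CHANGE_ID = "I" + "0123456789abcdef" * 2 + "01234567"
MESSAGE = f"""jobs: add build job for foo

Depends: foo.git {CHANGE_ID}
"""

print(parse_depends(MESSAGE))
# Dependency not merged yet: the patch should wait.
print(unmerged_depends(MESSAGE, merged=set()))
# Dependency merged: nothing blocks the patch any more.
print(unmerged_depends(MESSAGE, merged={("foo", CHANGE_ID)}))
```

An empty result from `unmerged_depends` would mean the patch can be merged by any reviewer without waiting for the author to confirm the order.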

The code review process already takes long as it is (several days at
least), so waiting any longer for the author to get back to merge it is
something I don't really think would scale very well.

But I prefer to merge my own patches, marked WIP or not, especially when there
is interdependence. It is better to ping me on jabber asking whether i forgot
to merge some patch. I'm pretty sure I have asked this at least once before.

My subjective feeling is that this adds a lot of additional delay. Especially
with fixes that are related to regressions or build failures, there is a very
strong incentive to get them merged ASAP.

Regards,
Harald

Actions #10

Updated by neels 3 months ago

I was buried in build errors that weren't fixed [many days]

I'm quite certain that [problems] would have been visible immediately in a
"home:..." OBS project

The situation was:

I'm not sure which patch merge triggered the build failures, i was under the
impression that the osmocom OBS was not affected yet. That accidental early
merge should have been reverted.

I was going to push the patches when sure that everything worked -- my OBS Home
builds from my git branches were mostly passing, the external netfilter deps
were still a problem. Important fixes were already available on my git branches.

I intentionally block most automatic mails, they are simply too many. So you
were having the impression of a huge impact impossible to be missed, while I
was not aware of any annoyance caused.

[when there is] no time at all, [...] I feel compelled to do some emergency
fixes

If you have time for emergency fixes, you have time to set off a PM one-liner.

...in summary, I should better have noticed the build failures,
and probably someone fixing my patches should have PMed me.
I'll try to get builds passing as soon as possible.

Thanks for your feedback,
I guess that this .. incident is resolved now?

Actions #11

Updated by osmith 3 months ago

Hi Neels,

I'm sorry if what I wrote in this issue came across as blaming you, this was not my intention.

neels wrote in #note-7:

The expectation is that if somebody creates a new project, that the packaging
is in place and has been tested locally or in a private OBS project first.

Idk whose expectation you are referring to, this sentence is utopian to me.
Would be good, but practically impossible to achieve: it obviously takes a lot
of time and effort to get working jenkins builds and binary packaging.

I think it's fine if deb/rpm packaging isn't building with the first commits of a new project.

But for the future I think we should make sure that packages do build before we let jenkins push the source packages to OBS. Because at that point, the build failures show up. I'm assigned to the OBS build maintainer role (SYS#4010) so I need to make sure we don't have errors there; and I've realized now that it just makes much more sense to test first if the packages build rather than waiting for it to fail in nightly and fixing up afterwards (as it was common practice before, again not blaming you here). In general also I don't mind taking care of fixing up packaging myself / completely doing the packaging of a new project, I understand that it's some work to wrap one's head around this.

Actions #12

Updated by neels 3 months ago

On Wed, Aug 24, 2022 at 09:27:27AM +0000, osmith wrote:

[...]

thanks for your feedback!

I'm still not sure what exactly triggered the build failures, i'll continue to
pay attention to that aspect.

I don't mind taking care of fixing up packaging myself

ack, thanks for that. It is also a good idea to spread important knowledge
across several of us, so i think it is overall good that i could make some
mistakes and re-familiarize myself with the packaging topic.
