Feature #5848
builders need ccache (Closed)
100% done
Description
This is not just a power-consumption issue: we are constantly rebuilding objects that rarely change. Caching would also cut the build time on our weakest devices (rpi4) from about 30 min to seconds, as far as osmo-trx is concerned.
Related issues
Updated by osmith over 1 year ago
I've thought about how to make this work with untrusted code from gerrit and to have limited fallout from cache poisoning.
It would work if we use several separate ccache dirs, to ensure that:
- ccache dirs are never shared between code from gerrit and code that has been merged
- osmo-trx in particular builds with multiple CPU-related flags (--with-neon, --with-neon-vfpv4 etc.), so we would need separate ccache dirs for those (edit: no, ccache hashes the flags used too, so this is not needed)
But then, as you mentioned, it should lead to significant speed up of builds. I like the idea.
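The separation described above could be sketched roughly like this in a job's setup step. This is only an illustration, not the actual Osmocom CI configuration: GERRIT_PATCHSET, JOB_NAME, and the cache paths are assumed Jenkins-style names. Note that per-flag dirs are unnecessary, since ccache includes the compiler flags in its hash.

```shell
# Hypothetical sketch: pick a ccache dir per trust level and job, so a
# patchset from gerrit can never poison the cache used for merged code.
if [ -n "${GERRIT_PATCHSET:-}" ]; then
  trust=gerrit        # untrusted code from code review
else
  trust=master        # code that has already been merged
fi
export CCACHE_DIR="$HOME/.cache/ccache-$trust-${JOB_NAME:-local}"
mkdir -p "$CCACHE_DIR"
# Cap each cache separately so one job cannot evict the others' objects.
command -v ccache >/dev/null 2>&1 && ccache --max-size=1G
```

With this layout, poisoning is limited to the cache of the untrusted job itself; the post-merge builds never read from it.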
Updated by osmith over 1 year ago
Related: due to a bug, only two of the four build verification jobs were actually running for osmo-trx:
https://gerrit.osmocom.org/c/osmo-ci/+/30985
Updated by Hoernchen over 1 year ago
Is it reasonable to assume that we will be fighting nation-state actors capable of forging blake2b hashes to poison our ccache with bad code that has to go through gerrit, where no one will notice it? I don't think that is much of a concern unless you use the ancient md4 ccache.
Updated by osmith over 1 year ago
- Status changed from New to In Progress
- % Done changed from 0 to 30
First related patch: https://gerrit.osmocom.org/c/docker-playground/+/31010
Updated by osmith over 1 year ago
- % Done changed from 30 to 50
I have a proof of concept ready. However, the problem is that we have lots of calls considered "uncacheable" by ccache; e.g. when building osmo-trx + deps for x86_64, it's about 500 of them. Looking into what's causing them.
Updated by osmith over 1 year ago
- % Done changed from 50 to 90
osmith wrote in #note-6:
I have a proof of concept ready. However, the problem is that we have lots of calls considered "uncacheable" by ccache; e.g. when building osmo-trx + deps for x86_64, it's about 500 of them. Looking into what's causing them.
These come from autotools' ltmain.sh. It runs $CC -V to figure out if it's a Sun C++ compiler. GCC doesn't support this flag:

gcc -V
gcc: error: unrecognized command-line option '-V'
gcc: fatal error: no input files
compilation terminated.
Not sure why this runs 500x during a build, but at least it's not a performance bottleneck as gcc exits instantly here.
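For reference, uncacheable calls like this can be tracked down with ccache's debug mode. This is a hedged sketch, not the exact commands used here; CCACHE_DEBUG exists since ccache 3.7, and the separate debug directory (CCACHE_DEBUGDIR) is, to my understanding, available in newer 4.x releases.

```shell
# Hypothetical debugging session: make ccache write a per-compilation
# log, run the build, then inspect why calls were uncacheable.
export CCACHE_DEBUG=1
debug_dir="$(mktemp -d)"
export CCACHE_DEBUGDIR="$debug_dir"   # assumed to be supported by this ccache
# ... run the build here, e.g.: make -j"$(nproc)" ...
# Afterwards each compilation leaves *.ccache-log files in $debug_dir;
# reading them shows e.g. ltmain.sh's "-V" probe being passed to gcc.
ls "$debug_dir"
# The summary counters tell the same story at a glance:
command -v ccache >/dev/null 2>&1 && ccache --show-stats
```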
Now with ccache enabled, the whole CI job for osmo-trx runs in 9 min, 15 seconds instead of 23 min. It still spends a lot of time in osmo-trx (not the dependencies) on the raspberry pis, apparently most of the time linking binaries.
For x86_64 --with-sse and without manuals, it's running in 1 min 55s instead of 4 min 52s.
(I guess if we changed the rpi OS to aarch64, we could get another speed improvement.)
So all in all, it's not done in a few seconds, but a significant improvement nonetheless.
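For context, one common way to route all compilations through ccache is compiler "masquerading": symlinks named after the compilers, placed first in PATH. Whether the jenkins jobs here do this or simply set CC="ccache gcc" is not shown in this ticket, so the directory below is purely illustrative.

```shell
# Hypothetical sketch: every gcc/g++ invocation transparently goes
# through ccache because the symlink dir shadows the real compilers.
links_dir="${CCACHE_LINKS_DIR:-$HOME/.local/ccache-links}"  # illustrative path
mkdir -p "$links_dir"
if command -v ccache >/dev/null 2>&1; then
  for tool in gcc g++ cc c++; do
    ln -sf "$(command -v ccache)" "$links_dir/$tool"
  done
fi
export PATH="$links_dir:$PATH"
```

The advantage of masquerading over setting CC is that it also catches compilers invoked indirectly, e.g. by configure scripts.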
Patches:

Updated by osmith over 1 year ago
- Related to Bug #5863: Building the jenkins image on rpi4 takes > 1h added
Updated by Hoernchen over 1 year ago
my pi4 with cpu freq fixed to max and 2 usable cores, building libusrp+osmo-trx with -j5 and all options, no ccache:
./buildall.sh 406.40s user 266.79s system 152% cpu 7:20.66 total
Why are our office pis so slow? Ultracheap slow sd card, maybe? Is my btrfs doing some magic?
$ uname -a
Linux raspberrypi 5.15.50-v8-osnoise-raspi #1 SMP PREEMPT Sun Oct 16 15:11:41 CEST 2022 aarch64 aarch64 aarch64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:        22.04
Codename:       jammy
same numbers for my host:
./buildall.sh 89,01s user 20,37s system 201% cpu 54,306 total
Updated by laforge over 1 year ago
On Mon, Jan 23, 2023 at 04:55:34PM +0000, Hoernchen wrote:
Why are our office pis so slow? Ultracheap slow sd card, maybe? Is my btrfs doing some magic?
possibly completely worn out by now in terms of flash writes? I want to mount them
differently mechanically (using bgt-rpi-mount) soon-ish, maybe we should swap SD cards
at that time.
Also, this is yet another reason why building on our LX2 should be considered: It
has an NVMe SSD (faster, likely more TBW permitted than any SD card), and it has 64 GB RAM,
so tmpfs-only builds are also an option.
Updated by osmith over 1 year ago
Hoernchen wrote in #note-9:
Why are our office pis so slow? Ultracheap slow sd card, maybe? Is my btrfs doing some magic?
Differences in your setup that probably make it faster:
- aarch64
- clang instead of gcc
- different SD card
laforge wrote in #note-10:
On Mon, Jan 23, 2023 at 04:55:34PM +0000, Hoernchen wrote:
Why are our office pis so slow? Ultracheap slow sd card, maybe? Is my btrfs doing some magic?
possibly completely worn out by now in terms of flash writes? I want to mount them
differently mechanically (using bgt-rpi-mount) soon-ish, maybe we should swap SD cards
at that time.
Also, this is yet another reason why building on our LX2 should be considered: It
has an NVMe SSD (faster, likely more TBW permitted than any SD card), and it has 64 GB RAM,
so tmpfs-only builds are also an option.
I didn't realize moving the arm jenkins builds over to lx2 was an option! That would make it much faster, of course. Do you want me to set up an lxc on the lx2 and set it up as a jenkins node?
Updated by laforge over 1 year ago
On Tue, Jan 24, 2023 at 08:19:30AM +0000, osmith wrote:
I didn't realize moving the arm jenkins builds over to lx2 was an option!
I believe I mentioned it several times during weekly review, when I added the grafana
setup showing how little utilization this unit has (check for yourself).
Do you want me to set up an lxc on the lx2 and set it up as a jenkins node?
It would at least make sense to test that and see if it solves our performance troubles.
Updated by osmith over 1 year ago
- Status changed from In Progress to Resolved
- % Done changed from 90 to 100
laforge wrote in #note-12:
On Tue, Jan 24, 2023 at 08:19:30AM +0000, osmith wrote:
I didn't realize moving the arm jenkins builds over to lx2 was an option!
I believe I mentioned it several times during weekly review, when I added the grafana
setup showing how little utilization this unit has (check for yourself).
It's still the case.
Do you want me to set up an lxc on the lx2 and set it up as a jenkins node?
It would at least make sense to test that and see if it solves our performance troubles.
Opened #5873 to follow up.
Ccache related patches are merged, marking this issue as resolved.
Also, I've changed the number of executors from 2 to 1 for each rpi, since running two builds at once made the individual builds slower and sometimes caused them to abort (probably something RAM-heavy like linking running twice at the same time).
Updated by osmith over 1 year ago
- Status changed from Resolved to In Progress
- % Done changed from 100 to 90
Fixups for not running ccache on e.g. simtester: https://gerrit.osmocom.org/q/topic:ccache-fixup
Updated by laforge about 1 year ago
- Status changed from In Progress to Resolved
- % Done changed from 90 to 100
osmith wrote in #note-14:
Fixups for not running ccache on e.g. simtester: https://gerrit.osmocom.org/q/topic:ccache-fixup
Those are also long merged; please close such issues by occasionally browsing through all of "your" issues and checking their status.