Bug #4555

host2 disk space running low

Added by laforge 6 months ago. Updated 3 months ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
Start date:
05/17/2020
Due date:
% Done:

100%

Spec Reference:

Description

/dev/md2        438G  377G   39G  91% /
  • 199 GB are jenkins (build history / artefacts)
  • 23GB is the deb8build jenkins slave lxc
  • 79GB is the deb9build jenkins slave lxc
  • 29GB are our docker image layers

We could add hard disks / SSDs to the server, but rented ones are relatively expensive at ~ 8-10 EUR per month, which we'd have to double for RAID-1. It would be much more economical to upgrade from the AX60 to an AX51-NVMe (2x 1TB storage, faster CPU), which costs only EUR 5 more per month. However, that would mean migrating all data to a new machine and then switching over once everything is migrated.
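The cost comparison above can be written out as a small calculation. This is only a sketch using the figures quoted in the description; it assumes the 8-10 EUR figure is per disk, and the constant and function names are made up for illustration.

```python
# Monthly extra cost in EUR, taken from the figures in the description.
DISK_EUR_LOW, DISK_EUR_HIGH = 8, 10  # one rented disk/SSD (assumed: per disk)
RAID1_DISKS = 2                      # RAID-1 needs a mirrored pair
UPGRADE_EUR = 5                      # AX60 -> AX51-NVMe price increase


def raid1_extra_cost(per_disk_eur, disks=RAID1_DISKS):
    """Extra monthly cost of adding one mirrored pair of rented disks."""
    return per_disk_eur * disks


low = raid1_extra_cost(DISK_EUR_LOW)
high = raid1_extra_cost(DISK_EUR_HIGH)
print(f"extra disks: {low}-{high} EUR/month, upgrade: {UPGRADE_EUR} EUR/month")
```

Under that assumption, the mirrored pair costs roughly three to four times as much per month as the upgrade, before counting the one-off migration effort.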

History

#1 Updated by laforge 6 months ago

  • Description updated (diff)

#2 Updated by laforge 6 months ago

I've reclaimed 5GB due to 'docker image prune'.

In the debian9 build slave lxc,
  • 33GB are docker images/layers
  • 24GB are in /home/osmocom-build/jenkins/workspace

The workspace looks fine, but whether the 33GB docker layers are all needed remains to be investigated.

#3 Updated by laforge 6 months ago

What's quite interesting is the difference in 'du' output within the debian9 lxc and outside of it:

root@deb9build-ansible:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md2        438G  371G   45G  90% /
root@host2 /var/lib/lxc # du --max-depth=1 -h
78G     ./deb9build-ansible
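To cross-check figures like these from either side of the container boundary, per-directory usage can also be summed with a short script. The following is a rough sketch, not the method used in this ticket: it adds up apparent file sizes (closer to `du --apparent-size` than to plain `du`), counts hard-linked files once per link, and the function names are hypothetical.

```python
import os


def tree_size(path):
    """Sum the apparent size in bytes of all files below path (symlinks not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                pass
    return total


def usage_by_child(root):
    """Return {subdirectory name: bytes} for each direct subdirectory of root."""
    usage = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                usage[entry.name] = tree_size(entry.path)
    return usage

# Example call (path taken from the output above):
#   usage_by_child("/var/lib/lxc")
```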

#4 Updated by laforge 5 months ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 10

The problem is becoming more critical. I've inquired with Hetzner about an upgrade to AX51-NVMe.

The biggest consumers of space are the BTS tests, at about 1.8 to 2 GB per build. We keep 35 builds at the moment, and we test master+latest for debian+centos, which already adds up to almost 100GB for the BTS tests alone.

As an interim measure, I am bzip'ing all those pcap files of the BTS tests.

#5 Updated by roh 4 months ago

  • Priority changed from Normal to Urgent

disk ran full tonight.

redmine for osmocom.org stalled out, so i tried to reload the docker compose unit which failed (looped)

i scraped off 2% reserved space of /dev/md2 via tune2fs to get it up again - this is urgent now.

/dev/md2        438G  415G  8.9G  98% /

#6 Updated by laforge 4 months ago

The "easy" approach to get more disk space without deleting anything is:

cd /external/jenkins/home/jobs/ttcn3-bts-test
find . -iname \*pcap -exec bzip2 \{\} \;

Migrating to the new server is too time-intensive a distraction for me at the
moment :/

#7 Updated by roh 4 months ago

disk was full again. i used the above script to gain 20gigs again.

note to self: use gzip next time, so wireshark can still directly open the files.
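A gzip variant of the compression step could look roughly like the sketch below. It is an illustration rather than the script that was actually run: the function name is hypothetical, it matches only names ending in `.pcap` (case-insensitive), and unlike the `gzip` command line tool it does not carry over timestamps or permissions.

```python
import gzip
import os
import shutil


def gzip_pcaps(root):
    """Compress every *.pcap below root to *.pcap.gz and return the number of files handled."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".pcap"):
                continue
            src = os.path.join(dirpath, name)
            dst = src + ".gz"
            # Stream the data so large captures are never held in memory at once.
            with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
                shutil.copyfileobj(fin, fout)
            # Remove the original only after the compressed copy was written.
            os.remove(src)
            count += 1
    return count

# Example call (directory from comment #6):
#   gzip_pcaps("/external/jenkins/home/jobs/ttcn3-bts-test")
```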

#8 Updated by laforge 3 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 10 to 100

All services have been migrated to the new host2. The new machine still has a different IP address (178.63.23.163) and the old machine is running TCP + UDP port forwarding.

On Tuesday or Wednesday morning at 8am, the IPs get migrated, and the old machine will disappear. I'm taking a last backup of the old 'host2' right now.

Resolving this ticket as the disk space problem is now absent.

I also re-compressed all pcap.bz2 files to pcap.gz for convenience.

/dev/md2        906G  397G  464G  47% /
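For reference, re-compressing pcap.bz2 files to pcap.gz can be sketched as below. This is a minimal illustration under assumptions, not the procedure that was actually used: the function name is hypothetical and file timestamps are not preserved.

```python
import bz2
import gzip
import os
import shutil


def recompress_bz2_to_gz(root):
    """Convert every *.pcap.bz2 below root to *.pcap.gz and return the number converted."""
    converted = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".pcap.bz2"):
                continue
            src = os.path.join(dirpath, name)
            dst = src[: -len(".bz2")] + ".gz"
            # Decompress and recompress as a stream, without a temporary plain file.
            with bz2.open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
                shutil.copyfileobj(fin, fout)
            # Drop the bzip2 file only after the gzip copy was written.
            os.remove(src)
            converted += 1
    return converted
```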
