Bug #5211

problems with let's encrypt certificate renewal

Added by laforge about 2 months ago. Updated about 2 months ago.

Status: In Progress
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 08/11/2021
Due date:
% Done: 50%
Spec Reference:

Description

Today at 3pm, our certificates for all services on host2 were due to expire.

Initially I thought the cron job might not be running, so I manually triggered a renewal.

The error message was:

Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/www.osmocom.org.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cert is due for renewal, auto-renewing...
Plugins selected: Authenticator webroot, Installer None
Renewing an existing certificate
Attempting to renew cert (www.osmocom.org) from /etc/letsencrypt/renewal/www.osmocom.org.conf
produced an unexpected error: urn:ietf:params:acme:error:rateLimited :: There were too many requests of a
given type :: Error creating new order :: too many certificates (5) already issued for this exact set of
domains in the last 168 hours:
3rdparty.downloads.openmoko.org,bb.osmocom.org,cgit.osmocom.org,gerrit.osmocom.org,git.osmocom.org,gmr.osmocom.org,lists.openmoko.org,
openbsc.osmocom.org,openmoko.org,osmocom.org,patchwork.osmocom.org,people.openmoko.org,planet.openmoko.org,
planet.osmocom.org,projects.osmocom.org,registry.osmocom.org,sdr.osmocom.org,security.osmocom.org,
tetra.osmocom.org,wiki.openmoko.org,www.openmoko.org,www.osmocom.org: see
https://letsencrypt.org/docs/rate-limits/. Skipping.

All renewal attempts failed. The following certs could not be renewed:
  /etc/letsencrypt/live/www.osmocom.org/fullchain.pem (failure)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

All renewal attempts failed. The following certs could not be renewed:
  /etc/letsencrypt/live/www.osmocom.org/fullchain.pem (failure)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

A check of the letsencrypt-related directories showed that none of those 5 allegedly successful renewals had been stored anywhere in the filesystem. The file timestamps alone were sufficient to see that no renewed cert had been stored for many months.
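
The timestamp check described above can be sketched as follows. This is a minimal illustration and not the command that was actually run on host2; the function name `newest_file_age_days` is hypothetical.

```python
import time
from pathlib import Path


def newest_file_age_days(directory, now=None):
    """Return the age in days of the most recently modified file below
    `directory`, or None if no regular file is found there."""
    now = time.time() if now is None else now
    # stat() follows symlinks, so the symlinks certbot keeps under live/
    # report the modification time of their targets.
    mtimes = [p.stat().st_mtime for p in Path(directory).rglob("*") if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 86400.0
```

Called on a directory such as /etc/letsencrypt/archive, a result of many months' worth of days would match the observation that no renewed cert had been written for a long time.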

As there is no way to reset or increase the rate limits, I decided to slightly modify the certificate [removing security.osmocom.org from the altnames], so that the request is counted in a separate rate limit bucket.
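
Why dropping one altname helps can be illustrated with a small model. The limit in the error message applies to the "exact set of domains", so the bucket can be thought of as keyed by the set of hostnames, independent of order and capitalization. This is only a sketch of that idea, not Let's Encrypt's implementation; `exact_set_key` is a hypothetical helper, and the hostname list is shortened for readability.

```python
def exact_set_key(hostnames):
    """Return an order- and case-insensitive key for a set of hostnames."""
    return frozenset(name.strip().lower() for name in hostnames)


original = ["www.osmocom.org", "osmocom.org", "security.osmocom.org"]
reduced = [h for h in original if h != "security.osmocom.org"]

# Reordering the names keeps the same key (same bucket) ...
same_bucket = exact_set_key(original) == exact_set_key(reversed(original))
# ... while removing a single name produces a different key (new bucket).
new_bucket = exact_set_key(original) != exact_set_key(reduced)
print(same_bucket, new_bucket)
```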

This produced the following output:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer None
Obtaining a new certificate
Performing the following challenges:
http-01 challenge for 3rdparty.downloads.openmoko.org
http-01 challenge for bb.osmocom.org
http-01 challenge for cgit.osmocom.org
http-01 challenge for gerrit.osmocom.org
http-01 challenge for git.osmocom.org
http-01 challenge for gmr.osmocom.org
http-01 challenge for lists.openmoko.org
http-01 challenge for openbsc.osmocom.org
http-01 challenge for openmoko.org
http-01 challenge for osmocom.org
http-01 challenge for patchwork.osmocom.org
http-01 challenge for people.openmoko.org
http-01 challenge for planet.openmoko.org
http-01 challenge for planet.osmocom.org
http-01 challenge for projects.osmocom.org
http-01 challenge for registry.osmocom.org
http-01 challenge for sdr.osmocom.org
http-01 challenge for tetra.osmocom.org
http-01 challenge for wiki.openmoko.org
http-01 challenge for www.openmoko.org
http-01 challenge for www.osmocom.org
Using the webroot path /data/letsencrypt for all unmatched domains.
Waiting for verification...
Cleaning up challenges
archive directory exists for www.osmocom.org-0001

At first glance this looks good: there is nothing that looks like an error message. However, after reloading nginx via SIGHUP, it was still serving the old certificate.
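
One way to confirm which certificate a server is actually presenting after a reload is to read the expiry date from a live TLS connection. The following is a minimal sketch and was not part of the work on this ticket; both helper names are hypothetical. Note that with the default verifying context an already expired certificate makes the handshake fail, which is itself an answer.

```python
import socket
import ssl
import time


def seconds_until_expiry(not_after, now=None):
    """Seconds from `now` until a certificate 'notAfter' string such as
    'Nov  9 12:00:00 2021 GMT'; negative if it is already in the past."""
    now = time.time() if now is None else now
    return ssl.cert_time_to_seconds(not_after) - now


def served_not_after(host, port=443, timeout=10.0):
    """Connect to host:port and return the 'notAfter' field of the
    certificate the server presents (requires network access)."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `seconds_until_expiry(served_not_after("www.osmocom.org"))` would show how much lifetime the served certificate has left, regardless of what is stored on disk.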

Searching for known certbot bugs turned up https://github.com/certbot/certbot/issues/5395

So "archive directory exists" is actually not a random status/indication message, but a fatal error. Despite complaints about this since 2016, this is still the case in 2021.

Next step:

root@host2-new /external/nginx/etc/letsencrypt # rm -rf live archive renewal

Now certificate generation actually succeeded, not just on the letsencrypt server side but also locally in certbot:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer None
Obtaining a new certificate

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/www.osmocom.org/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/www.osmocom.org/privkey.pem
   Your cert will expire on 2021-11-09. To obtain a new or tweaked
   version of this certificate in the future, simply run certbot
   again. To non-interactively renew *all* of your certificates, run
   "certbot renew" 
 - If you like Certbot, please consider supporting our work by:

   Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
   Donating to EFF:                    https://eff.org/donate-le

There is some blame to be shared on our side: local mail was not configured so cron could not send error messages about the unsuccessful exit code to the sysadmin.
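
A cron wrapper along the following lines would make such a failure hard to miss even when the last line of output looks harmless. It is a sketch under the assumption that the renewal command exits with a non-zero status on failure; `run_checked` is a hypothetical helper, and how the report is delivered (mail or otherwise) is left open.

```python
import subprocess
import sys


def run_checked(cmd):
    """Run `cmd` (a list of arguments) and return (ok, report).

    A non-zero exit status is always reported as FATAL, no matter how
    innocuous the command's own output looks."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    if proc.returncode == 0:
        return True, proc.stdout
    lines = proc.stdout.strip().splitlines()
    last = lines[-1] if lines else "(no output)"
    report = (
        f"FATAL: {cmd[0]} exited with status {proc.returncode}; "
        f"last output line: {last}\n{proc.stdout}"
    )
    return False, report
```

The wrapper would be called with the renewal command as the argument list, and the returned report forwarded over whatever notification channel is known to work.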

Still, I think it's extremely bad UI if a fatal error message does not identify itself as one.

Keeping this issue open, as I need to add security.osmocom.org back into the list of hostnames as soon as we get out of that rate limit window.


Checklist

  • re-add security.osmocom.org and re-gen certificate
  • verify e-mail notifications for cronjob failures work

History

#1 Updated by laforge about 2 months ago

Opened https://github.com/certbot/certbot/issues/8980 in case they want to improve their log output
