TLS Certificate Auto-Renewal

I’m seeing the same thing as backpackhasjetz on my domain: https://crt.sh/?q=mail.seanwatson.io

Since about the start of January, the box has been requesting new certs every day. They appear to be issued properly, which is why Let’s Encrypt is starting to rate-limit the requests.

Did you try this and reboot?

sudo pip3 install pyOpenSSL --upgrade

Or perhaps the steps outlined here:

Yes, and I get the same error message as backpackhasjetz. I’m going to disable SSL renewal for a week and then try it again.

Yes, I tried that comment’s instructions and didn’t see a change. Something really does seem to be bugged in MIAB with LetsEncrypt, though :-/

I’m not sure how to stop the auto-LetsEncrypt stuff from firing? There isn’t a way in the admin interface. What are you doing to disable it, @swatson?

What makes this worse is that I tried moving to a new box to see if that fixes it, and ran into issues with the guide: https://github.com/mail-in-a-box/mailinabox/issues/1071 - and I don’t know how to proceed from there. So I’m just stuck. Both of my boxes now have expired certs, so I may need to buy a real SSL cert or something.

I just commented out the command that renews certs in the cron script. Open /home/root/mailinabox/management/daily_tasks.sh (or something like that, I’m writing from memory) and comment out the line with ssl_certificates.py (or something like that). I don’t see any new certificates on https://crt.sh today, so I think it’s working.

FYI: the Let’s Encrypt limit is 5 certificates for the same set of hostnames over a 7-day period.
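To make the 7-day window concrete, here is a minimal sketch of how you could work out when the next request should go through, given the issue dates listed on crt.sh. The function names and the sliding-window model are my own assumptions for illustration; the real limit is enforced on the Let’s Encrypt side and may be counted differently.

```python
from datetime import datetime, timedelta

LIMIT = 5                    # certificates allowed per window, as stated above
WINDOW = timedelta(days=7)   # length of the sliding window


def issued_in_window(issue_times, now):
    """Count issuances that fall inside the trailing 7-day window."""
    return sum(1 for t in issue_times if timedelta(0) <= now - t < WINDOW)


def next_allowed(issue_times, now):
    """Return the earliest time at which another request should be accepted."""
    recent = sorted(t for t in issue_times if timedelta(0) <= now - t < WINDOW)
    if len(recent) < LIMIT:
        return now
    # Wait until enough old issuances have aged out of the window that
    # fewer than LIMIT remain.
    return recent[len(recent) - LIMIT] + WINDOW


if __name__ == "__main__":
    now = datetime(2017, 2, 1)
    # One certificate per day for the last six days (hypothetical data).
    history = [now - timedelta(days=d) for d in range(1, 7)]
    print(issued_in_window(history, now), next_allowed(history, now))
```

With one issuance per day, the box stays over the limit until it stops asking for a few days, which matches what people in this thread saw after disabling the nightly job.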

Have you been able to diagnose why the certificates that are issued aren’t being saved?

You could try diagnosing by switching to the Let’s Encrypt staging server. To do so, edit management/ssl_certificates.py and modify the two places that client.issue_certificate is called – in my checkout of 0.21b the functions are called on line 332 and on line 351.

They should look something like this:

    cert = client.issue_certificate(
        domain_list,
        account_path,
        agree_to_tos_url=agree_to_tos_url,
        private_key=private_key,
        logger=my_logger)

You’d want to add a new acme_server argument:

    cert = client.issue_certificate(
        domain_list,
        account_path,
        acme_server=client.LETSENCRYPT_STAGING_SERVER,
        agree_to_tos_url=agree_to_tos_url,
        private_key=private_key,
        logger=my_logger)

Then try running ./management/ssl_certificates.py again and see if you can determine why the updated certificates aren’t being saved.

Once you’re done, you can reset that file back to the clean state:

$ git checkout -- ./management/ssl_certificates.py
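If you end up flipping between staging and production more than once, one option is to build the extra argument in a single place instead of editing both call sites by hand. This is only a sketch: the helper name acme_kwargs and the MIAB_ACME_STAGING environment variable are hypothetical and not part of MIAB, and the only things taken from the code above are the acme_server argument and client.LETSENCRYPT_STAGING_SERVER.

```python
import os


def acme_kwargs(environ, staging_server, flag="MIAB_ACME_STAGING"):
    """Return extra keyword arguments for client.issue_certificate.

    `flag` is a hypothetical environment variable name, not something MIAB
    defines. When it is set to "1", the staging server is selected;
    otherwise nothing is added and the client's default server is used.
    """
    if environ.get(flag) == "1":
        return {"acme_server": staging_server}
    return {}


# Intended use inside ssl_certificates.py (not executed here):
#   cert = client.issue_certificate(
#       domain_list,
#       account_path,
#       agree_to_tos_url=agree_to_tos_url,
#       private_key=private_key,
#       logger=my_logger,
#       **acme_kwargs(os.environ, client.LETSENCRYPT_STAGING_SERVER))

if __name__ == "__main__":
    # Show which arguments would be added in the current environment.
    print(acme_kwargs(os.environ, "<staging server URL>"))
```

Printing the returned dict before the call is also a cheap way to confirm which server the script is really about to talk to.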

Thanks. Very strange. I updated the code and re-ran ssl_certificates.py, and I’m getting the same error! A real rate limit seems unlikely, since I haven’t done anything with their staging server, ever. Maybe this means there’s something amiss with the actual ACME client.

What’s the output of:
pip3 list
and
dpkg --list

FYI: the staging server doesn’t produce valid certificates, and its rate limits are far higher than production’s. That’s why @benschumacher suggested it. If the staging server produced the rate limit error, it’s almost certainly not Let’s Encrypt actually rate limiting you.

@cromulus I put the outputs here: https://gist.github.com/nicholashead/0a700cb43289093d7336d742d3935197

Very odd. Why would I be getting a rate limit error? Possible bug in the ACME client or something else? I definitely changed the source code of the ssl_certificates.py file; I even put a print("testing") statement in there to make sure the edit was being picked up.

I disabled the nightly ssl_certificates.py call two nights ago - and now just manually ran ssl_certificates (pointed at LetsEncrypt staging still supposedly) and it gave me… a totally valid certificate.

So I guess my problem is “fixed” for now, but that doesn’t answer:

  • How did MIAB get in this state to begin with?
  • What’s the long term fix (since I should be able to run the ssl_certificates.py nightly?)
  • Why isn’t the ACME client changing the server URL when passed into the method call?

I’m a C# programmer by trade, so I’m not really familiar with python enough to diagnose.

How did MIAB get in this state to begin with?

So, my hunch is that the pyOpenSSL version was outdated and incorrectly validating the ssl certificate downloaded from letsencrypt. Because validation failed, MIAB didn’t save it and thus when the cron job ran again the next day, MIAB requested another certificate. This caused MIAB to hit the rate limit.

Upgrading your pyOpenSSL package and then waiting for the rate limit to expire resulted in your ability to get a new, valid cert.

What’s the long term fix (since I should be able to run the ssl_certificates.py nightly?)

Everything should just work now. I think we might need to be more rigorous about the version of our python packages and the packages that are installed on a MIAB install.

Why isn’t the ACME client changing the server URL when passed into the method call?

Not sure what you are asking here. There is only one Let’s Encrypt production URL. The staging URL is really only meant for developing ACME clients or testing an implementation.

I’m saying that when I changed the URL in the python script to use the staging server (per the directions earlier in this thread from Ben), it still didn’t work. Isn’t that odd? Can you get it to talk to staging server yourself?

Also when you say “Because validation failed” - doesn’t it seem like a bug in the ACME client that not validating the cert would result in the “rate limited” error coming back? The source code for the client looks for specific language before it raises the rate limit error message, if I’m reading it correctly.

Did all my package versions seem ok? What did you specifically look for in those lists?

Thanks so much for helping by the way.

@backpackhasjetz
The staging server will never provide a valid ssl certificate.

MIAB only asks for a new certificate if the currently saved certificate is old or invalid. If you had either the Ubuntu package installed (which is old and buggy) OR an older version of pyOpenSSL installed (which is also buggy), MIAB could not validate the totally valid SSL cert provided by Let’s Encrypt. MIAB did not save that new cert because it was using old, buggy Python SSL libraries that couldn’t validate it. So, every day, when MIAB checked whether it should get a new cert, the installed cert still looked old and in need of renewal. MIAB tried to renew it, couldn’t validate the result, didn’t save it, and the cycle continued. From Let’s Encrypt’s perspective, though, MIAB had requested a certificate every day for the past month, thus exceeding Let’s Encrypt’s rate limit.
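Here is a toy simulation of that cycle, in case it helps to see it play out day by day. It is not MIAB’s code: the function, the outcome labels, and the 5-per-7-days limit are assumptions made for illustration, and validator_ok stands in for whether the local SSL libraries can validate the new cert.

```python
def simulate(days, validator_ok, weekly_limit=5):
    """Simulate nightly renewal attempts; return one outcome label per day."""
    saved_cert_valid = False   # the box starts out with an old/expired cert
    issued_days = []           # days on which the CA actually issued a cert
    log = []
    for day in range(days):
        if saved_cert_valid:
            log.append("skip")             # nothing to renew
            continue
        recent = [d for d in issued_days if day - d < 7]
        if len(recent) >= weekly_limit:
            log.append("rate-limited")     # the CA refuses the request
            continue
        issued_days.append(day)            # the CA issues a valid cert
        if validator_ok:
            saved_cert_valid = True
            log.append("issued+saved")
        else:
            log.append("issued+discarded") # local validation fails, cert dropped
    return log


if __name__ == "__main__":
    print(simulate(10, validator_ok=False))
    print(simulate(10, validator_ok=True))
```

With a broken validator the box keeps burning through issuances and then gets refused; with a working one it requests a single cert and stops.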

Even after you upgraded your Python SSL libraries, you were still hitting the rate limit on the Let’s Encrypt side, which is why it took some time before you were issued a new, valid cert.

Yes, your packages seemed OK. I was looking to see:

  1. if you have the Ubuntu Python OpenSSL package installed (which probably would have broken things)
  2. if your Python libraries installed by pip3 were up to date, specifically those related to SSL and crypto, particularly pyOpenSSL.
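If you would rather not read through the whole pip3 list output, a small script can report just the packages of interest. This is a sketch that assumes Python 3.8 or newer (for importlib.metadata), which may be newer than what your box runs; the default package names are only my guess at what is worth checking, and it does not cover the dpkg side.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("pyOpenSSL", "cryptography")):
    """Map each package name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not found'}")
```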

The staging server will never provide a valid ssl certificate.

Yes, I understand that, but when I tried to modify the code earlier to talk to the staging server, I was getting the same error as I got on production (the rate limited error). Doesn’t that seem odd/a bug?

Thanks so much. Your explanation makes sense to me. You’ve been very helpful!

I see now. Huh. No clue. My guess is a bug in the underlying pyOpenSSL library or in free_tls_certificates.

The issue described by @cromulus above is exactly right. I resolved this by commenting out the line in daily_tasks.sh that requests the new certificate, waiting until I was outside of the rate limit, and then provisioning a new certificate manually using the admin interface, which worked. I’ve uncommented the line and am now back to a functional machine.

Thanks everyone for troubleshooting and figuring out the issue!

Hi guys,

At first the pip upgrade fix was a success, but now it seems that nginx, Dovecot and Postfix still serve the old certs now and then. This results in HTTPS, IMAP and SMTPS failures…

Does anybody have a clue to fix this?

Cheers.

In addition, this is the error message that the email client gets:

Mail was unable to connect to server “mail.domain.com” using SSL on port 993. Verify that this server supports SSL and that your account settings are correct.

I already removed the old certs in /home/user-data/ssl/, since I saw that HTTPS requests were getting an old cert. Now HTTPS is behaving as it should.
What I don’t know is why it is behaving like this; the symlink seems to be pointing to the correct cert… perhaps a minor thing… Re-running the MIAB install script didn’t fix the issue…
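One way to narrow this down is to compare the certificate each service actually presents with the file the symlink points to. Below is a minimal sketch using only the Python standard library. The function names and the default port list are my own assumptions, and it only works for ports that speak TLS immediately (such as 443 and 993), not ones that need STARTTLS.

```python
import hashlib
import re
import ssl

# Matches one PEM certificate block; a file on disk may hold a whole chain.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL)


def first_cert_fingerprint(pem_text):
    """SHA-256 fingerprint of the first certificate in a PEM string."""
    match = PEM_BLOCK.search(pem_text)
    if match is None:
        raise ValueError("no certificate found in PEM text")
    der = ssl.PEM_cert_to_DER_cert(match.group(0))
    return hashlib.sha256(der).hexdigest()


def served_fingerprint(host, port):
    """Fingerprint of the cert a TLS service presents (fetched unvalidated)."""
    return first_cert_fingerprint(ssl.get_server_certificate((host, port)))


def compare(host, cert_path, ports=(443, 993)):
    """Return {port: True/False} for whether each port serves the file's cert."""
    with open(cert_path) as f:
        expected = first_cert_fingerprint(f.read())
    return {port: served_fingerprint(host, port) == expected for port in ports}
```

You would call something like compare("mail.domain.com", path_to_the_symlinked_pem) on the box. A False for 993 while 443 is True would suggest that Dovecot is still holding an older cert than nginx.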

Please see also this error from Dovecot:


I think it is related to the issue.

Cheers