DNS issue preventing certificate renewal

vele · May 22, 2024, 8:25pm

I think his intention is that, since he is based in the UK, the co.uk site is primary, so he redirects the dot com there, while MiaB is on the dot com.

UPDATE: OK, he got it working. It is valid until 20 Aug.

Check here: box.gideon-it.com (Webmail :: Welcome to box.gideon-it.com Webmail)

MarthinL · May 22, 2024, 8:47pm

OK, according to /etc/nginx/conf.d/local.conf there is a specific section that intercepts .well-known/acme-challenge requests on port 80.

    location /.well-known/acme-challenge/ {
        # This path must be served over HTTP for ACME domain validation.
        # We map this to a special path where our TLS cert provisioning
        # tool knows to store challenge response files.
        alias /home/user-data/ssl/lets_encrypt/webroot/.well-known/acme-challenge/;
    }

That means that the actual challenge file is being generated into the /home/user-data/ssl/lets_encrypt/webroot/.well-known/acme-challenge directory.

But I’ve checked, and it seems the .well-known directory in webroot is deleted again afterwards, so it won’t show up when you go looking for it with find. I created the directory myself, put a file in it, and I can curl it with no problem. If Philip still has a problem, we have a good number of things to check and test now, starting with the nginx config, then manually creating the alias directory with a test.txt file in it and seeing if

curl http://box.gideon-it.com/.well-known/acme-challenge/test.txt

comes back with whatever text was placed in that file.

We’ll be able to tell why certbot fails based on whether we can get that to work or not. Just remember to remove the directory afterwards, or it might cause another failure down the line.
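
For anyone who would rather script that check than do it by hand, here is a minimal Python sketch of the same test. It is only an illustration: the function names (`http_get`, `check_challenge_path`) and the probe text are made up, it assumes it is run as root on the box itself, and the webroot path and URL shape are simply the ones quoted above.

```python
import shutil
import urllib.request
from pathlib import Path

# Webroot path as quoted from local.conf in this thread; adjust if yours differs.
WEBROOT = Path("/home/user-data/ssl/lets_encrypt/webroot")


def http_get(url, timeout=10):
    """Fetch a URL over plain HTTP and return the body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def check_challenge_path(host, webroot=WEBROOT, fetch=http_get):
    """Write a probe file where the challenge files go and fetch it back.

    Returns True if the web server hands back exactly what was written.
    The .well-known directory is removed again if this function created it.
    """
    well_known = webroot / ".well-known"
    created_here = not well_known.exists()
    challenge_dir = well_known / "acme-challenge"
    challenge_dir.mkdir(parents=True, exist_ok=True)

    probe = challenge_dir / "test.txt"
    expected = "acme-path-probe"  # hypothetical probe text
    probe.write_text(expected)
    try:
        url = f"http://{host}/.well-known/acme-challenge/test.txt"
        return fetch(url).strip() == expected
    except OSError as exc:  # URLError and socket timeouts are OSError subclasses
        print(f"fetch failed: {exc}")
        return False
    finally:
        # Clean up so a leftover directory cannot trip up a later renewal.
        probe.unlink(missing_ok=True)
        if created_here:
            shutil.rmtree(well_known, ignore_errors=True)
```

Calling `check_challenge_path("box.gideon-it.com")` on the box does the same thing as the manual curl test. Keep in mind that running it from the box only proves nginx serves the file; the Let's Encrypt servers fetch it from outside, so a firewall or DNS problem can still make the real challenge fail.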

vele · May 22, 2024, 9:23pm

@MarthinL Great Job!
I was always wondering how this is done when HTTP is not permitted.
He still has the problem with the box.gideon-it.com and mta-sts.box.gideon-it.com.

I am waiting for him to paste the logs from:
/var/log/letsencrypt/letsencrypt.log

To see the messages.

The co.uk >> dot com redirect must be properly configured via an nginx block in the nginx config. That is the catch. The same applies to those who wish to redirect somedomain.com >> mail.somedomain.com, as per previous requests in other posts.

Gideon-IT-UK · May 22, 2024, 10:06pm

Holy cow! I go away for my tea and a bit of a break and two people are working their butts off trying to figure out my problem!! That’s amazing. I’m really grateful.

The latest LOG from clicking the provision button, on the TLS admin page, is as follows…

    Saving debug log to /var/log/letsencrypt/letsencrypt.log
    Requesting a certificate for box.gideon-it.com and mta-sts.box.gideon-it.com

    Certbot failed to authenticate some domains (authenticator: webroot). The Certificate Authority reported these problems:
      Domain: box.gideon-it.com
      Type:   connection
      Detail: During secondary validation: 81.174.152.174: Fetching http://box.gideon-it.com/.well-known/acme-challenge/EFcHCV33QGAQtvA8wPffEZtTzknCjeCEnSQ1JFyNaxk: Timeout during connect (likely firewall problem)

    Hint: The Certificate Authority failed to download the temporary challenge files created by Certbot. Ensure that the listed domains serve their content from the provided --webroot-path/-w and that files created there can be downloaded from the internet.

    Some challenges have failed.
    Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

I replaced the /home/user-data/www/default/index.html and /home/user-data/www/gideon-it.com/index.html files with pages that redirect to gideon-it.co.uk using JavaScript, simply because that’s the main website for my business.

As you may have guessed Linux is not my skill set, my work requires a LOT of Windows knowledge and some Mac OS/iOS knowledge but rarely any Linux.

It never occurred to me that simply redirecting the home page of the default and gideon-it.com domains to the gideon-it.co.uk index.html would cause so much grief when it came time for certificate renewal!

The only two certificates that are still to renew are…

mta-sts.box.gideon-it.com
and
box.gideon-it.com

I’ll see what happens overnight and check again tomorrow.

Once again. Thank you both @MarthinL & @vele

vele · May 22, 2024, 10:25pm

This is a case of poorly performing or misconfigured DNS, as the Let’s Encrypt staff say here:

Their main server reads the DNS records fine, but their secondary at another location in the world just doesn’t.

Just keep trying.

I was interested in this issue myself because I know how frustrating it can be.
And I learnt something from @MarthinL.
And @MarthinL, this is an additional justification for incorporating your External DNS idea. It seems that people who host their boxes at home over broadband should definitely use external DNS for propagation, or use secondary slaves.

Cheers

MarthinL · May 22, 2024, 10:53pm

That “Hint” confirms there’s something funky with your nginx config. Did you by any chance make any changes to nginx’s configuration in the process of figuring out the redirection you want, or otherwise? Either way, please paste the file /etc/nginx/conf.d/local.conf here as preformatted text (i.e. between two lines containing nothing but three backticks each).
I want to compare it with the file on my system. I suspect that somewhere along the way of trying to get the site to do what you wanted, you followed instructions from some how-to off the web without really knowing what the consequences were going to be, and ended up leaving that carefully constructed section that allows ACME challenges to work kind of broken. I’m hoping I can spot the problem if I can see the file and how it differs from the original.

It’s also possible that what messes up your system isn’t in the local.conf file at all. Josh wants config changes to sit in local files so he doesn’t overwrite them when the software gets upgraded, but the guys making how-to material for extra income couldn’t care less about what Josh wants, so they’d tell you to make changes directly in nginx.conf.

MarthinL · May 22, 2024, 11:43pm

That would be a cheap shot, I think. It’s true in a way, but not as straightforward as you suggest. Here’s the thing. When you’re properly using external DNS, the ns1.box.whatever.tld isn’t in play at all. It’s not listed in the SOA and no publicly visible zone declares its IP. That’s when it’s done right. When it’s done half-right, the ns1 and/or ns2 .box.whatever.tld references stick around.

What happens out there is that a client wanting to resolve a name to an IP, for whatever purpose, uses the protocols prescribed by the RFCs. That means they get a list of the NS records, and the order of the list is deliberately randomised for each client asking, to spread the load across the nameservers. If one or more of the nameservers are not behaving as they should, whether it’s because the network is too busy, because the DNSSEC signatures are giving problems, or because there is a badly configured firewall or switch in the mix, then clients who pick that nameserver to ask are going to wait and get a timeout, which is easily interpreted as a “temporary domain name lookup failure”.

The professional DNS companies out there use a whole array of techniques, including load balancers and high availability principles, to ensure that each of the nameserver IPs they provide is unlikely to get taken out by any single point of failure. Officially I have 4 nameserver IPs, but in real life those 4 are GeoIPs in the first place, meaning that everybody querying them gets served by the nearest of the 40 or 50 points of presence they’re up to at this point. Secondly, once the query arrives at that nearest location, there are multiple servers behind the scenes that are armed and ready to service the request. There is just no way a single VM at a fixed IP somewhere in the world, possibly even on the other side of a shared broadband link, with both ns1 and ns2 resolving to the same IP as well, can offer similar speed and reliability.

Now for a slow, asynchronous service like email, where the sender will patiently keep retrying the MX lookup for each email being sent, that isn’t such a big deal. Time is not so critical, and the typical number of emails sent through MiaB results in very small numbers of legitimate hits on the DNS servers anyway. But for real web traffic the situation is very different. Browsers have even less patience than their users, and that’s saying a lot. If a DNS server is slow to react or unreachable, even for a short while, the web world is very sensitive to that and things go wrong quite rapidly. Google skips the indexing of the site and users start reporting problems. Ultimately, the place where your typical MiaB user will get a glimpse of just how good their self-hosted DNS server really is, or isn’t, is when the ACME certbot HTTP challenge tries to access the site to verify ownership. That process takes place in web space and at web pace, not with email’s patience. That’s why it popped up during a certificate renewal conversation, making people suspect that the certificate is failing because of a misconfigured DNS record, when it’s really just a badly performing DNS service that causes the name lookup to fail when the validator tries to retrieve the file the challenge script placed on the website.
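
To put rough numbers on that, here is a small Python simulation. It is a toy model under my own assumptions, not how any particular resolver works: the function name `simulate_lookups` and the nameserver names are hypothetical, an unhealthy nameserver always costs one full timeout, a healthy one answers instantly, and the client gives up once its time budget is spent.

```python
import random


def simulate_lookups(nameservers, healthy, timeout_s, budget_s,
                     trials=10_000, seed=0):
    """Estimate the fraction of lookups that fail within a time budget.

    Each trial shuffles the nameserver list (the randomised order described
    above) and asks the servers in turn until one answers or time runs out.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        order = list(nameservers)
        rng.shuffle(order)
        waited = 0.0
        for ns in order:
            if ns in healthy:
                break  # got an answer
            waited += timeout_s  # this server never replied
            if waited >= budget_s:
                failures += 1  # client ran out of patience
                break
        else:
            failures += 1  # every listed server timed out
    return failures / trials


# Hypothetical set-ups for comparison.
external = ["ns-a", "ns-b", "ns-c", "ns-d"]
self_hosted = ["ns1.box", "ns2.box"]  # both names point at the same box

# One of four external servers misbehaving, impatient client.
print(simulate_lookups(external, {"ns-a", "ns-b", "ns-c"}, timeout_s=5, budget_s=5))
# Overloaded box: neither self-hosted name answers in time.
print(simulate_lookups(self_hosted, set(), timeout_s=5, budget_s=30))
```

With one of four servers down, only the clients that happen to ask the bad one first and have no time left for a second try are affected. When ns1 and ns2 are the same overloaded box, there is nothing else to fall back on, which is the point being made above.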

MiaB makes provision for serving static web pages, i.e. a set of files placed in a directory and accessed by giving the relative path to the files in the URL. Anything beyond that is formally out of scope for MiaB. In my opinion, that rules out 99% of today’s websites, which are all about interactive live content and SEO-friendly URLs. I don’t have stats on it, but I suspect that the vast majority of MiaB users do not even look at the web capability. If you really have a website, even something as blatantly simplistic as Gideon-IT’s, then you’re going to want a proper web server with a back-end that runs off a database for starters, not static HTML files.

Gideon-IT-UK · May 23, 2024, 5:20pm

@MarthinL - I’ve made no changes to the nginx config - wouldn’t have a clue how to or what it is!

Gideon-IT-UK · May 23, 2024, 5:21pm

Tried clicking the Provision button again to see if the last two items will get a certificate…no dice. Here’s the result…

Log:

    Saving debug log to /var/log/letsencrypt/letsencrypt.log
    Requesting a certificate for box.gideon-it.com and mta-sts.box.gideon-it.com

    Certbot failed to authenticate some domains (authenticator: webroot). The Certificate Authority reported these problems:
      Domain: box.gideon-it.com
      Type:   dns
      Detail: DNS problem: server failure at resolver looking up A for box.gideon-it.com; DNS problem: server failure at resolver looking up AAAA for box.gideon-it.com

    Hint: The Certificate Authority failed to download the temporary challenge files created by Certbot. Ensure that the listed domains serve their content from the provided --webroot-path/-w and that files created there can be downloaded from the internet.

    Some challenges have failed.
    Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.
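
Note that the Type changed from "connection" in the earlier attempt to "dns" in this one, even though the Hint text is identical. For anyone comparing several attempts, a small Python sketch like the following pulls the Domain / Type / Detail fields out of the pasted text. The regular expression is written against the two outputs in this thread only; it is an assumption that other certbot versions lay the fields out the same way, and `parse_problems` is just an illustrative name.

```python
import re

# Matches the "Domain: ... Type: ... Detail: ..." groups as they appear in
# the output pasted in this thread (whitespace between fields may vary).
PROBLEM_RE = re.compile(
    r"Domain:\s*(?P<domain>\S+)\s+"
    r"Type:\s*(?P<type>\S+)\s+"
    r"Detail:\s*(?P<detail>.*?)(?=\s+Domain:|\s+Hint:|\Z)",
    re.DOTALL,
)


def parse_problems(output):
    """Return one dict per problem reported by the Certificate Authority."""
    return [m.groupdict() for m in PROBLEM_RE.finditer(output)]


# Shortened sample taken from the output above.
sample = (
    "The Certificate Authority reported these problems: "
    "Domain: box.gideon-it.com Type: dns "
    "Detail: DNS problem: server failure at resolver looking up A for "
    "box.gideon-it.com Hint: The Certificate Authority failed to download "
    "the temporary challenge files created by Certbot."
)

for problem in parse_problems(sample):
    print(problem["domain"], problem["type"], problem["detail"], sep=" | ")
```

Tracking the Type across attempts makes it easier to see whether the failures are consistently about name resolution or about reaching port 80.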

Gideon-IT-UK · May 23, 2024, 5:28pm

Here’s a link to the LetsEncrypt.log

vele · May 23, 2024, 6:44pm

Paste these into Admin Page >> Custom DNS >> Using a Secondary Nameserver >> Hostname, then hit Update:

uz5x6wcwzfbjs8fkmkuchydn9339lf7xbxdmnp038cmyjlgg9sprr2.free.ns.buddyns.com uz56xw8h7fw656bpfv84pctjbl9rbzbqrw4rpzdhtvzyltpjdmx0zq.free.ns.buddyns.com uz588h0rhwuu3cc03gm9uckw0w42cqr459wn1nxrbzhym2wd81zydb.free.ns.buddyns.com uz5154v9zl2nswf05td8yzgtd0jl6mvvjp98ut07ln0ydp2bqh1skn.free.ns.buddyns.com uz5dkwpjfvfwb9rh1qj93mtup0gw65s6j7vqqumch0r9gzlu8qxx39.free.ns.buddyns.com uz5w6sb91zt99b73bznfkvtd0j1snxby06gg4hr0p8uum27n0hf6cd.free.ns.buddyns.com uz52u1wtmumlrx5fwu6nmv22ntcddxcjjw41z8sfd6ur9n7797lrv9.free.ns.buddyns.com

There is a single space between them.
I made a zone for you on BuddyNS. This might help propagation.
Let’s see how it looks tomorrow or in 24 hours.

MarthinL · May 23, 2024, 6:53pm

I’m thrown off balance by the evidence in the log file that sometimes certbot reports three challenge methods being used and other times only the http-01 method. I checked my own letsencrypt.log file and it’s there as well, from when I last renewed my certificates. But I haven’t seen any instance in either of our log files where using three challenge methods resulted in certificates being issued. The ones that succeed are those where http-01 is used on its own and the request is fully populated with the URL to the generated token file.

My best guess is that certbot falls back into a desperate mode when it fails all the time, and that’s where your server is stuck at the moment. The original reason why it started failing is unclear to me, but based on the errors certbot is reporting, it appears to be related to your box or your link being overwhelmed, to the point where the UDP packets asking your name server to turn names into IP addresses never get delivered. After enough retries the Let’s Encrypt server gives up and reports a failure (sometimes; other times it fails silently and the “payload” is just empty), so it refuses to issue certificates.

Are you running MiaB on bare metal or a VM, at home or hosted somewhere? What are the specs, how busy is it, and what is your internet connection like? Could it be that one or more of those components are simply too busy, causing UDP packets to drop? That’s the trouble with UDP: there’s no session or connection, so it does not guarantee delivery like TCP does. You don’t know if your datagram reached its destination unless you get a response back, which means it must have gotten there.
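
To illustrate that last point, here is a minimal Python sketch of what any UDP client has to do: send, wait, and retry, with silence as the only sign of trouble. The function name `udp_query` and the timeout and retry values are arbitrary choices for the example, and the payload is not a real DNS query.

```python
import socket


def udp_query(addr, payload, timeout_s=1.0, retries=3):
    """Send a datagram and wait for any reply.

    Returns the reply bytes, or None if nothing came back after all retries.
    UDP itself never says whether the datagram arrived.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        for attempt in range(1, retries + 1):
            sock.sendto(payload, addr)  # succeeds even if nobody is listening
            try:
                reply, _peer = sock.recvfrom(4096)
                return reply
            except socket.timeout:
                print(f"attempt {attempt}: no reply within {timeout_s}s")
    return None
```

A dropped request, a dropped reply, and a server too busy to answer all look exactly the same from the caller's side: `None` after the last retry.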

Would you have any possibility of doing a clean install of MiaB? Backing up and restoring isn’t guaranteed not to restore your problems as well, but it might be the fastest way to get you up and running again in a properly certified manner.

Perhaps @JoshData can chime in here with some wisdom about how to reset the certbot config and data. Something like deleting or renaming /home/user-data/ssl and running the whole installer or a specific script again to rebuild it?

Gideon-IT-UK · May 31, 2024, 10:59pm

Well, I fixed it by installing a new MiaB instance in a new VM on my server on the same IP and re-creating the domains. I’ve since moved it again from the server to a spare laptop; that way I get some battery backup resilience if the power goes off.

It’s working really well with 16 GB of RAM and a 256 GB SSD on a 2.3 GHz 4-core i3 gen 4 CPU.
