Error: Something went wrong, sorry. (System Status Checks and TLS Certificates Pages)

0eebPlGYK88eyn740993 · September 4, 2019, 3:32am

Question for all you MIAB experts.

I just updated to v0.43 and all was fine. Then today the apt updates noted below appeared, so I applied them and rebooted the box.

All services are running fine, but when I try to access the admin website, two pages (System Status Checks and TLS (SSL) Certificates) time out with the error “Error: Something went wrong, sorry.” All other pages load fine and work as expected. I have run “curl -s https://mailinabox.email/setup.sh | sudo bash” again with no errors noted; see the results at the end.

APT Upgrade: libsystemd0, udev, libudev1, systemd-sysv, libpam-systemd, systemd, libnss-systemd

Setup Results:
Mail-in-a-Box Version: v0.43

Updating system packages…
Installing system packages…
Initializing system random number generator…
Firewall is active and enabled on system startup
Installing nsd (DNS server)…
Installing Postfix (SMTP server)…
Installing Dovecot (IMAP server)…
Installing OpenDKIM/OpenDMARC…
Installing SpamAssassin…
Installing Nginx (web server)…
Installing Roundcube (webmail)…
Installing Nextcloud (contacts/calendar)…
Nextcloud is already latest version
Installing Z-Push (Exchange/ActiveSync server)…
Installing Mail-in-a-Box system management daemon…
Installing Munin (system monitoring)…
updated DNS: OpenDKIM configuration

Your Mail-in-a-Box is running

0eebPlGYK88eyn740993 · September 4, 2019, 4:29am

When I run sudo ./status_checks.py from the mailinabox/management folder, all checks appear correct with no issues, but the website still shows the error and nothing is returned to the page. Could there be a timeout that can be changed or extended, one that makes the Status Checks and TLS Certificates pages time out while the rest work fine? I have around 89 DNS entries that it checks for correctly set records. This runs fine via the command-line tool, which displays all of them correctly, but it might be causing a timeout on the website, especially since it also has to query and list the TLS certificates for all of them. This was not an issue before I added 7 more domains today, which results in around 28 more records, since the system creates the root, autoconfig, autodiscover and www entries for each new domain added for email.

System

Public DNS (nsd4) is not running (port 53).
✓ SSH disallows password-based login.
✓ System software is up to date.
✓ Mail-in-a-Box is up to date. You are running version v0.43.
✓ System administrator address exists as a mail alias. [administrator@ ]
✓ The disk has 21.08 GB space remaining.
✓ System memory is 32% free.

Network

✓ Firewall is active.
✓ Outbound mail (SMTP port 25) is not blocked.
✓ IP address is not blacklisted by zen.spamhaus.org.
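As a rough illustration of why the check count grows so quickly, here is a minimal sketch of the arithmetic described in the post above. It assumes only what the post states, namely that each added mail domain gets a root, autoconfig, autodiscover and www entry. The function names and the example domains are hypothetical and are not part of Mail-in-a-Box.

```python
# Hostname prefixes per mail domain, as described in the post above
# (root, autoconfig, autodiscover, www). This is an assumption taken
# from the thread, not read from the Mail-in-a-Box source.
PREFIXES = ("", "autoconfig.", "autodiscover.", "www.")


def hostnames_for(domain):
    """Return the hostnames assumed to be checked for one mail domain."""
    return [prefix + domain for prefix in PREFIXES]


def total_hostnames(domains):
    """Count the hostnames assumed to be checked across all domains."""
    return sum(len(hostnames_for(domain)) for domain in domains)


if __name__ == "__main__":
    # Seven hypothetical new domains, matching the count in the post.
    new_domains = [f"example{i}.com" for i in range(1, 8)]
    print(total_hostnames(new_domains), "additional hostnames to check")
```

Under that assumption, every new domain adds four hostnames to each status run, so the run time grows roughly in step with the number of domains.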

0eebPlGYK88eyn740993 · September 4, 2019, 5:07am

I doubled the VM’s system resources and the status checks and SSL certificates scripts still take the same amount of time from the command line, so it’s not a resource issue that I can see. It must be a timeout issue. Can that be tweaked as needed, and if so, where?

alento · September 4, 2019, 9:54am

Right there is the problem. Your system is timing out because nsd is not running.

The problem has something to do with IPv6 …

0eebPlGYK88eyn740993 · September 4, 2019, 1:22pm

No, nsd is running locally; it is just blocked at the firewall because I am running DNS externally. It has been working fine since the original install of v0.42, through the updates to v.a, v.b and now v0.43, including the IP setup, which is IPv4 only. The check always shows an X by nsd. That is not the problem, as a manual run returns the nsd status immediately. The delay happens when the check script and the TLS cert script query the domains I have set up; that takes a long time, and I can time it later this morning. As I said, the manual scripts run to completion with no errors or timeouts, while the web interface times out for both. I will also post the entire manual response later, after sanitizing it. Is there a way to make the TLS changes normally done through the web interface via the command line? If so, where are the instructions?

alento · September 4, 2019, 1:29pm

Ok, I missed this earlier … indeed the time it takes to do these checks is causing the time out.

You can run the /mailinabox/management/ssl_certificates.py manually for the certs.

I am not certain how you are going to be able to fix the time out error … most likely you’ll need to adjust the timeout for nginx.

0eebPlGYK88eyn740993 · September 4, 2019, 2:22pm

The status_checks.py script takes 100 seconds from the command line and runs just fine, with no errors. The System Status Checks page on the website times out after 60 seconds with the message “Error: Something went wrong, sorry.” I ran these timing checks with no VPN and no Tor, just a straight internet connection via AT&T fiber, so there are no unnecessary delays on any front.
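For anyone who wants to repeat this measurement, here is a minimal sketch that times a command and compares the result against the 60-second limit observed above. The helper names are hypothetical, and the 60-second figure is only what this thread reports, not a value read from any configuration file.

```python
import subprocess
import sys
import time

# Assumption: 60 seconds is the web-side timeout observed in this thread.
WEB_TIMEOUT_SECONDS = 60.0


def time_command(cmd):
    """Run cmd to completion and return its wall-clock duration in seconds."""
    start = time.monotonic()
    # Output is discarded; only the elapsed time matters here.
    subprocess.run(cmd, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.monotonic() - start


def exceeds_timeout(elapsed, limit=WEB_TIMEOUT_SECONDS):
    """Return True if a run of `elapsed` seconds would outlast the limit."""
    return elapsed > limit


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere. On the box itself,
    # replace it with e.g. ["sudo", "./status_checks.py"] run from the
    # mailinabox/management folder.
    cmd = [sys.executable, "-c", "pass"]
    elapsed = time_command(cmd)
    print(f"took {elapsed:.1f}s; exceeds {WEB_TIMEOUT_SECONDS:.0f}s limit: "
          f"{exceeds_timeout(elapsed)}")
```

With the 100-second run reported above, `exceeds_timeout(100.0)` is true, which is consistent with the page failing while the command-line run succeeds.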

The problem with changing the nginx timeout, or whichever timeout it is, is that it will most likely be reset by any MIAB update. Can this timeout be made a changeable option, or be raised by default to something higher than 60 seconds? Also, where would you change it: which file, which location and which settings exactly? And will it revert after an update?

@JoshData can you answer these questions and comment on future possibilities?

alento · September 4, 2019, 2:28pm

You are absolutely correct that any change would be reverted when MiaB is upgraded to a new version.

How many domains are you serving? I found this problem to begin to occur at around 20 domains myself.

I cannot remember where I made changes to the nginx timeout as I encountered this 2-3 years ago. My solution was to remove some unnecessary sub-domains which brought me to a manageable number of domains which did not cause the status page to time out. Obviously, this is not the correct solution for everyone.

0eebPlGYK88eyn740993 · September 4, 2019, 3:22pm

I added the following to /etc/nginx/nginx.conf and the System Status Checks and TLS Certificates web pages work fine again. Whether or not this is good practice, and whether or not it reverts after a MIAB update, at least I know where to fix it again should the need arise, or until this is resolved in the software at install time. SSC takes 150 seconds and TLS takes 240 seconds to run and return data to the webpage. @JoshData

http {
proxy_read_timeout 300s;