BIND9 config (as recursive local resolver)

Hi,

I need some insights from @JoshData or @KiekerJan here.

I got stumped this weekend as to why, of all the servers I run, upgrading to the latest version of pfSense broke only MiaB’s DNS resolution. I saw evidence that the DHCP server gives the correct upstream DNS server to MiaB, but for some reason the named configuration (BIND9) ends up not knowing which DNS server it should recursively resolve against.

It’s my understanding that MiaB uses NSD exclusively for the zones it hosts and configures it not to resolve recursively, and BIND9 exclusively for recursive resolution of everything else. I also found the piece of the setup script that overwrites /etc/resolv.conf, with comments about deleting symlinks whose significance I don’t fully understand.

I wish to determine exactly what it is about how the new version of pfSense handles DNS and DHCP that causes the problem: the moment MiaB connects via the new version, the local (recursive) resolver in MiaB stops being able to resolve anything. pfSense is going through many changes itself, supporting either ISC DHCP (to be deprecated soon) or Kea DHCP (not quite production ready yet), with a lot of noise in that community about features the Kea version has been lacking and the work done around that, so it’s a bit of a minefield at the moment. It would be really useful if I could somehow get insight into how MiaB engages with DNS servers to get its intended setup, so that I can better know where to look on the pfSense side for the magic bit to turn on or off to make it work. All the named.conf related stuff in MiaB looks fine to me, and works without a problem against older versions of pfSense, but somewhere there is a new default or assumption in play that isn’t compatible.

P.S. If I manually change resolv.conf to point the nameserver at my pfSense, which runs as a resolver, everything works as normal. It also works without any entry in resolv.conf or with the file absent, for then the nameserver provided by DHCP is used. It has the feel of a race condition triggered by the nameserver being set to 127.0.0.1, but I’ve not done tcpdumps to see what traffic actually flows, since I needed the servers alive again; they’re not testing grounds.
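To make the resolv.conf cases above easier to compare, here is a minimal Python sketch that lists the nameserver entries in a resolv.conf-style file and says whether each one points at a loopback address (the local bind9) or at something external such as pfSense. The function names and the `report` helper are purely illustrative and not part of MiaB.

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop comments, which start with '#' or ';'.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def is_loopback(address):
    """True if the address is 127.0.0.0/8 or ::1."""
    # Strip an IPv6 zone id such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%", 1)[0]).is_loopback


def report(path="/etc/resolv.conf"):
    """Print each configured nameserver and whether it is local."""
    with open(path) as handle:
        for server in nameservers(handle.read()):
            kind = "local resolver" if is_loopback(server) else "external resolver"
            print(server, "->", kind)
```

Calling `report()` on the box before and after the manual change would show which of the cases you are actually in.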

Any help would be appreciated.

I’m not 100% sure, but as far as I know, bind9 on Mail-in-a-Box is configured as a recursive DNS resolver for all local DNS requests on the box itself. That means it likely never used your pfSense for DNS resolution, unless you manually changed something like /etc/resolv.conf or the bind config to forward queries to pfSense.

The easiest solution is probably to just allow outbound port 53 (UDP and TCP) to the internet on your pfSense for the MiaB host. So basically, don’t block port 53, and don’t try to force MiaB to use the DNS resolver on your pfSense, as it’s designed to handle DNS lookups on its own.

Thank you for getting involved.

Whether or not it ever used the resolver on my pfSense is beside the point actually, as long as it works and doesn’t stop working because the pfSense version changes. Note, only my MiaB servers are affected. Everything else, the other servers and all the clients, all resolving DNS by using the pfSense as recursive resolver, keeps working without a problem.

Those ports already are open and have been open and working forever.

BTW, It was never my intention to force BIND9/MiaB to use my pfSense as resolver. Using it as a temporary workaround merely helped me deal with the effects of the problem.

I’m happy for BIND9 to recursively resolve under its own steam as much as it wants to. I just need to figure out what stops it from doing so the moment the new firewall version gets involved.

None of the firewall rules, routes, gateways, ACLs or parameters being handed out by DHCP have changed, nor is anything else I am aware of different. In fact, I’ve swapped between a router running 2.7.2 and one running 2.8.0 with the same config file applied to both, and back again, without making any changes on MiaB, not even rebooting. Still, the MiaB servers stop resolving names while it’s 2.8.0 between them and the internet and resume normal operation once it’s back to 2.7.2.

Meaning that I’m not trying to get MiaB to work or configure itself differently as far as DNS is concerned (only as far as duplicity updates are concerned :-), but DNS is fine). I’m just looking for insights as to how it is configured to work so I’m better equipped to figure out what goes wrong.

Confirmation that BIND9 is definitely and purposefully configured to exclusively resolve recursively helps a bit, thanks, as it rules out any notion of possible forwarding. It’ll be a bit of work to get a test setup going. Maybe if I do, I’ll find that the new version has different IPv6 defaults in play which make BIND9 use those servers and get nowhere, or something similar.
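A rough way to test the IPv6 suspicion without a full test setup might be to send the same plain UDP query to a root server over IPv4 and over IPv6 from the MiaB host and see which one gets a reply. The Python sketch below is only that, a sketch: the helper names are made up, it hand-builds a bare query without EDNS, and it does not reproduce everything BIND9 does (no TCP fallback, no DNSSEC). The two addresses are those of a.root-servers.net.

```python
import socket
import struct

# Illustrative targets: a.root-servers.net over both address families,
# plus the local resolver on the box.
TARGETS = {
    "root (IPv4)": "198.41.0.4",
    "root (IPv6)": "2001:503:ba3e::2:30",
    "local bind9": "127.0.0.1",
}


def build_query(name, qtype=2, query_id=0x1234, recursion_desired=False):
    """Build a minimal DNS query packet (qtype 2 = NS, class IN)."""
    flags = 0x0100 if recursion_desired else 0x0000
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b""
    for label in name.strip(".").split("."):
        if label:
            encoded = label.encode("ascii")
            qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"  # the root label terminates the name
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server, name=".", timeout=3.0):
    """Send one UDP query to server:53; True if a matching reply arrives."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    packet = build_query(name)
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, 53))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # covers timeouts and "network unreachable"
            return False
    return reply[:2] == packet[:2]  # same query id


def run_probes():
    """Print which targets answered."""
    for label, address in TARGETS.items():
        print(label, address, "answered" if probe(address) else "no reply")
```

If `run_probes()` shows IPv4 answering and IPv6 silent only behind 2.8.0, that would point at the IPv6 path; if both answer, the cause is probably elsewhere.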

If you think of anything else that might be involved, don’t hesitate to let me know.

P.S. It makes me wonder about the daily task job mysteriously timing out on checking the reverse DNS entries, so often that most users habitually ignore the warning already. Since mail delivery is a slow background process and DNS resolution has provision for temporary server failures built in, I’m wondering if timeouts could be happening far more often than we suspect. Is it a given that every DNS timeout would get logged in some log file, or are most of them dealt with silently?

MiaB does not use the DNS server provided by DHCP, it uses bind9 as a local DNS resolver. This is used by all local processes that require Domain Name resolution. In addition the box uses NSD to host its own zones. This is used by the rest of the internet to find out stuff about the domains the box is hosting. You’ve found this out already.

There have been plenty of reports about the installation breaking. Often it breaks in the first step after changing the DNS resolver, where it is needed, so this failure does not directly point to bind9. Also, many users report getting it to work by either enabling or disabling ipv6, suggesting something fishy in the ipv6 handling configuration of bind9.
I’ve never encountered it myself, but I would start by looking at the logs: sudo systemctl status named and sudo journalctl -u named. I assume bind9 gives enough of an error report to pin down what part of the configuration is wrong. (I assume the configuration can be improved to handle more situations than it covers now.)

I suspect the timeouts that are now and then reported are also caused by bind9, but I’ve never been able to confirm that. These timeouts are reported by the Mailinabox management daemon. Mail programs like postfix are using their own code to resolve DNS, although they will still refer to the local DNS resolver. I suppose the reality is that it’s hard to find out if DNS timeouts cause delays in mail delivery. The mail system is resilient in that it tries multiple times to deliver email, so one DNS timeout will not stop it.
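Since it’s unclear whether every timeout ends up in a log, one option is to measure lookups directly. The following Python sketch is only an illustration (the function names and the 2 second threshold are arbitrary assumptions): it times name lookups through the system resolver, which goes through whatever /etc/resolv.conf points at, and collects the names that failed or were slow.

```python
import socket
import time


def time_lookup(name, resolve=socket.getaddrinfo):
    """Resolve one name; return (succeeded, seconds_taken)."""
    start = time.monotonic()
    try:
        resolve(name, None)
        ok = True
    except socket.gaierror:
        # Raised by getaddrinfo for resolution failures, including timeouts.
        ok = False
    return ok, time.monotonic() - start


def summarize(names, resolve=socket.getaddrinfo, slow_after=2.0):
    """Return (failed, slow) lists of names for a batch of lookups."""
    failed, slow = [], []
    for name in names:
        ok, elapsed = time_lookup(name, resolve)
        if not ok:
            failed.append(name)
        elif elapsed > slow_after:
            slow.append(name)
    return failed, slow
```

Running `summarize([...])` from cron with a handful of domains the box regularly mails to, and logging the result, would give a rough idea of how often lookups fail or stall, independent of what bind9 itself reports.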

Side note: I’ve moved to unbound ages ago, and never looked back. There’s also some code to deal with timeouts, but I suspect it’s not Python enough :wink: