When you upgrade a Zimbra server to a reasonably recent version (8.7.3, for example), the installer may attempt to install its own DNS cache (zimbra-dnscache). This can clearly cause problems if you are already running another DNS caching service, or your own BIND, on the server, but such conflicts are predictable and not unique to Zimbra.
What is less obvious is that you may assume zimbra-dnscache is running, and doing what it is supposed to do, when in fact it is not.
My first hint that things weren’t as they appeared was extremely slow external SMTP sessions whenever clients such as Thunderbird and other “client mailers”, as well as some web-based helpdesk applications, attempted to send e-mail via Zimbra.
The upgrade to Zimbra 8.7.3 had gone quite well, so the upgrade itself wasn’t an obvious place to start looking.
That changed when I noticed that SSH logins to this server were also quite slow. They had never been slow before. Checking the SSH configuration on the server did not reveal much, other than the fact that it was indeed using reverse DNS lookups.
Checking /etc/resolv.conf made everything clear. Zimbra had, in an attempt to use its own zimbra-dnscache, added “nameserver 127.0.0.1” to /etc/resolv.conf. In a perfect world, that might have been what I wanted …
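If you want to spot this situation without reading the file by eye, something like the following rough Python sketch could list the nameserver entries and flag any that point at the local machine. The function names are my own for illustration, not anything Zimbra ships, and a loopback entry is only a warning sign: it is perfectly fine if a local resolver really is answering there.

```python
import ipaddress


def parse_nameservers(resolv_conf_text):
    """Return the addresses from 'nameserver' lines in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop trailing comments introduced by '#' or ';'.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def loopback_nameservers(servers):
    """Return only the entries that are loopback addresses (127.0.0.0/8 or ::1)."""
    found = []
    for server in servers:
        try:
            if ipaddress.ip_address(server).is_loopback:
                found.append(server)
        except ValueError:
            # Not a parseable IP address; ignore it in this sketch.
            pass
    return found


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        servers = parse_nameservers(handle.read())
    print("nameservers:", servers)
    for server in loopback_nameservers(servers):
        print("warning: local nameserver configured:", server)
```

A warning from this script only tells you where to look next; you would still need to confirm whether anything is actually answering DNS queries on that address.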
After removing 127.0.0.1 from /etc/resolv.conf, inbound SMTP sessions from “client mailers” and web applications went from 7-10 seconds down to 0.1-0.5 seconds. Case closed.
I think Zimbra should add a post-installation sanity check. Once all services are up and running, a DNS lookup of a known host (www.zimbra.com, for example) should return within a second or two. Anything slower is an indication that the system may not function as intended.
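To make the idea concrete, here is a minimal sketch of what such a check might look like in Python. This is my own illustration, not anything from the Zimbra installer: the function names and the two-second threshold are assumptions, and it goes through the system resolver (socket.getaddrinfo), so an answer served from /etc/hosts or a local name service cache would look healthy even if DNS itself is broken.

```python
import socket
import time


def timed_lookup(hostname, resolver=socket.getaddrinfo):
    """Resolve hostname and return (succeeded, elapsed_seconds)."""
    start = time.monotonic()
    try:
        resolver(hostname, None)
        succeeded = True
    except OSError:
        # socket.gaierror is a subclass of OSError.
        succeeded = False
    return succeeded, time.monotonic() - start


def dns_is_healthy(hostname, max_seconds=2.0, resolver=socket.getaddrinfo):
    """True if the lookup succeeds within max_seconds (an assumed threshold)."""
    succeeded, elapsed = timed_lookup(hostname, resolver)
    return succeeded and elapsed <= max_seconds


if __name__ == "__main__":
    ok, elapsed = timed_lookup("www.zimbra.com")
    print("lookup ok" if ok else "lookup failed", "in %.2f seconds" % elapsed)
```

The resolver parameter exists only so the check can be exercised with a stand-in function; in normal use you would leave it at its default.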
#zimbra-dnscache
This is really interesting. I need to take a look and try to reproduce it. Would you mind opening a Bugzilla report about it?
I have now opened #107676