Temporary failure in name resolution

#1 Mon, 01/27/2014 - 14:19
mossbot

Hello. I'm having an issue since upgrading Ubuntu. I had backed up all my virtual servers, did a clean install of the OS, reinstalled Virtualmin, then restored all the virtual servers. Everything seemed to go very smoothly, but I am now noticing an issue that popped up in WordPress, although I believe it relates to the server itself.

When I try to run an auto-update in WordPress, I get the following error:

Download failed.: 0: php_network_getaddresses: getaddrinfo failed: Temporary failure in name resolution

I also can no longer receive emails through contact forms. I assume this might have something to do with DNS? I can post whatever information is needed. I've tried to figure this out on my own for the past few hours and nothing seems to work. Any help is appreciated, thank you.
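
The error quoted above is raised by the C library's getaddrinfo call, so the lookup can also be checked outside WordPress. A minimal sketch, assuming Python is available on the server; `check_resolution` and its `resolver` parameter are hypothetical names used only for illustration:

```python
import socket


def check_resolution(host, resolver=socket.getaddrinfo):
    """Return (True, addresses) if host resolves, else (False, error text).

    `resolver` is injectable only so the function can be exercised
    without touching the network; the default, socket.getaddrinfo,
    wraps the same C library call that PHP reports in its error.
    """
    try:
        results = resolver(host, 80)
    except socket.gaierror as exc:
        return False, str(exc)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    addresses = sorted({entry[4][0] for entry in results})
    return True, addresses
```

Calling `check_resolution("wordpress.org")` on the affected machine would be expected to return `False` with a similar "Temporary failure in name resolution" message while the problem persists, and a list of addresses once lookups work again.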

Mon, 01/27/2014 - 14:39
mossbot

Wow, solved.

I swapped the order of my DNS servers in Webmin so that 127.0.0.1 came after the OpenDNS servers. That seemed to do the trick. I'm not sure if this is an ideal solution, though. Is 127.0.0.1 even meant to be listed?
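
Since the fix came down to the order of the DNS servers, it can help to look at that order directly. A minimal sketch, assuming the Webmin setting ends up as `nameserver` lines in /etc/resolv.conf (on some Ubuntu setups that file is generated by resolvconf, so this is an assumption); the `nameservers` helper and the sample contents are hypothetical:

```python
def nameservers(resolv_conf_text):
    """Return nameserver addresses in the order they are listed."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop comment text (resolv.conf uses '#' or ';') and whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical file contents after the swap described above;
# 208.67.222.222 and 208.67.220.220 are the public OpenDNS resolvers.
sample = """\
# generated file
nameserver 208.67.222.222
nameserver 208.67.220.220
nameserver 127.0.0.1
"""
print(nameservers(sample))
```

On the real machine the input would be `open("/etc/resolv.conf").read()`. The system resolver generally consults the entries in the order listed, which is why moving 127.0.0.1 down hides a problem with the local BIND rather than fixing it.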

Mon, 01/27/2014 - 15:45
andreychek

Howdy,

Yeah, 127.0.0.1 should be listed on a Virtualmin system -- BIND should be configured to answer DNS queries and to handle DNS zone entries for your various Virtual Servers.

OTOH, if you aren't using BIND, and stopped that service, then you wouldn't want 127.0.0.1 in there... also note that in that case, you'd want to go into System Settings -> Features and Plugins, and in there you can disable the BIND DNS Domain feature.

-Eric

Mon, 01/27/2014 - 17:16 (Reply to #3)
mossbot

Thanks for the reply. I do use BIND, but I have no clue what could be causing this issue. I did solve it by moving those external (OpenDNS) DNS entries up the list. Any clue how I could diagnose the issue with 127.0.0.1?

Mon, 01/27/2014 - 17:32
Locutus

Maybe your BIND is not configured properly, e.g. it does not allow recursive lookups for local processes. The syslog in /var/log should contain BIND error messages if any occur. BIND also has a debug logging mode if need be. The dig command can also help diagnose problems with the local nameserver.

Tue, 01/28/2014 - 08:08 (Reply to #5)
mossbot

Thanks for the reply. I have copied some of the errors in the syslog here. I'm not sure if BIND is in debugging mode. The error log was massive, so I only copied a slice which might be related, since trying to update WordPress was the thing that made me realize something was wrong. (hostname is substituted for my actual hostname)

Jan 27 04:39:27 hostname named[1057]: error (no valid RRSIG) resolving 'wordpress.org/DS/IN': 199.249.112.1#53
Jan 27 04:39:27 hostname named[1057]:   validating @0x7fb33c4f6f50: h9p7u7tr2u91d0v0ljs9l1gidnp90u3h.org NSEC3: no valid signature found
Jan 27 04:39:27 hostname named[1057]:   validating @0x7fb33c4f6f50: org SOA: no valid signature found
Jan 27 04:39:27 hostname named[1057]: error (no valid RRSIG) resolving 'wordpress.org/DS/IN': 199.19.54.1#53
Jan 27 04:39:27 hostname named[1057]: error (no valid DS) resolving 'api.wordpress.org/AAAA/IN': 66.155.9.169#53
Jan 27 04:39:27 hostname named[1057]: validating @0x7fb33c4b5ef0: api.wordpress.org AAAA: bad cache hit (wordpress.org/DS)
Jan 27 04:39:27 hostname named[1057]: error (broken trust chain) resolving 'api.wordpress.org/AAAA/IN': 66.155.40.24#53
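
With a massive log, grouping the named errors by reason and query makes the pattern easier to see. A minimal sketch, assuming the lines follow the format of the excerpt above; `summarize_named_errors` is a hypothetical helper. The reasons in parentheses in this excerpt (no valid RRSIG, no valid DS, broken trust chain) are all DNSSEC validation messages:

```python
import re
from collections import Counter

# Matches lines like:
#   "... named[1057]: error (no valid RRSIG) resolving 'wordpress.org/DS/IN': ..."
ERROR_RE = re.compile(r"named\[\d+\]: error \(([^)]+)\) resolving '([^']+)'")


def summarize_named_errors(lines):
    """Count (reason, query) pairs found in named error lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Lines copied from the excerpt above.
sample_log = [
    "Jan 27 04:39:27 hostname named[1057]: error (no valid RRSIG) resolving 'wordpress.org/DS/IN': 199.249.112.1#53",
    "Jan 27 04:39:27 hostname named[1057]:   validating @0x7fb33c4f6f50: org SOA: no valid signature found",
    "Jan 27 04:39:27 hostname named[1057]: error (no valid RRSIG) resolving 'wordpress.org/DS/IN': 199.19.54.1#53",
    "Jan 27 04:39:27 hostname named[1057]: error (no valid DS) resolving 'api.wordpress.org/AAAA/IN': 66.155.9.169#53",
    "Jan 27 04:39:27 hostname named[1057]: error (broken trust chain) resolving 'api.wordpress.org/AAAA/IN': 66.155.40.24#53",
]
for (reason, query), count in summarize_named_errors(sample_log).most_common():
    print(count, reason, query)
```

For the full log, the same function can be fed `open("/var/log/syslog")` directly, since it only iterates over lines.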

For the dig command, I put 127.0.0.1 back as my first DNS server and rebooted my machine just in case.

dig

; <<>> DiG 9.8.1-P1 <<>>
;; global options: +cmd
;; connection timed out; no servers could be reached

dig google.com

; <<>> DiG 9.8.1-P1 <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57642
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             300     IN      A       74.125.28.138
google.com.             300     IN      A       74.125.28.139
google.com.             300     IN      A       74.125.28.100
google.com.             300     IN      A       74.125.28.101
google.com.             300     IN      A       74.125.28.102
google.com.             300     IN      A       74.125.28.113

;; AUTHORITY SECTION:
google.com.             172800  IN      NS      ns2.google.com.
google.com.             172800  IN      NS      ns4.google.com.
google.com.             172800  IN      NS      ns1.google.com.
google.com.             172800  IN      NS      ns3.google.com.

;; Query time: 335 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jan 27 23:48:04 2014
;; MSG SIZE  rcvd: 196

Clearly something is up with the first result just using dig. Any advice on how to continue troubleshooting?

Tue, 01/28/2014 - 08:21 (Reply to #6)
mossbot

I'll also post the results of dig when I direct it at my own server's domain name. (hostname replaces my actual domain name, and the IPs are masked)

; <<>> DiG 9.8.1-P1 <<>> hostname.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23038
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;hostname.com.                   IN      A

;; ANSWER SECTION:
hostname.com.            38400   IN      A       000.000.000.107

;; AUTHORITY SECTION:
hostname.com.            38400   IN      NS      ns2.hostname.com.
hostname.com.            38400   IN      NS      ns1.hostname.com.

;; ADDITIONAL SECTION:
ns1.hostname.com.        38400   IN      A       000.000.000.107
ns2.hostname.com.        38400   IN      A       000.000.000.120

;; Query time: 13 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jan 27 23:59:40 2014
;; MSG SIZE  rcvd: 113

Tue, 01/28/2014 - 10:11
Locutus

What happens when you type dig @127.0.0.1 a few times? Does it time out each time?

What do you get for dig wordpress.org @127.0.0.1? The errors you've seen in your syslog might be harmless.

Tue, 01/28/2014 - 10:19 (Reply to #8)
mossbot

Here are the results, with no changes to the configuration. I ran the first command a few times and got the same result each time, as follows.

user@hostname:~# dig @127.0.0.1

; <<>> DiG 9.8.1-P1 <<>> @127.0.0.1
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57417
;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;.                              IN      NS

;; ANSWER SECTION:
.                       510681  IN      NS      a.root-servers.net.
.                       510681  IN      NS      i.root-servers.net.
.                       510681  IN      NS      c.root-servers.net.
.                       510681  IN      NS      e.root-servers.net.
.                       510681  IN      NS      g.root-servers.net.
.                       510681  IN      NS      l.root-servers.net.
.                       510681  IN      NS      j.root-servers.net.
.                       510681  IN      NS      f.root-servers.net.
.                       510681  IN      NS      b.root-servers.net.
.                       510681  IN      NS      m.root-servers.net.
.                       510681  IN      NS      h.root-servers.net.
.                       510681  IN      NS      d.root-servers.net.
.                       510681  IN      NS      k.root-servers.net.

;; Query time: 3 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jan 28 01:56:39 2014
;; MSG SIZE  rcvd: 228

user@hostname:~# dig wordpress.org @127.0.0.1

; <<>> DiG 9.8.1-P1 <<>> wordpress.org @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 39051
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;wordpress.org.                 IN      A

;; Query time: 819 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jan 28 01:58:17 2014
;; MSG SIZE  rcvd: 31

After seeing this, I tried plain dig again and got this.

user@hostname:~# dig

; <<>> DiG 9.8.1-P1 <<>>
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46846
;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;.                              IN      NS

;; ANSWER SECTION:
.                       510367  IN      NS      c.root-servers.net.
.                       510367  IN      NS      e.root-servers.net.
.                       510367  IN      NS      m.root-servers.net.
.                       510367  IN      NS      f.root-servers.net.
.                       510367  IN      NS      i.root-servers.net.
.                       510367  IN      NS      j.root-servers.net.
.                       510367  IN      NS      k.root-servers.net.
.                       510367  IN      NS      d.root-servers.net.
.                       510367  IN      NS      l.root-servers.net.
.                       510367  IN      NS      h.root-servers.net.
.                       510367  IN      NS      b.root-servers.net.
.                       510367  IN      NS      g.root-servers.net.
.                       510367  IN      NS      a.root-servers.net.

;; Query time: 2 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Jan 28 02:01:53 2014
;; MSG SIZE  rcvd: 228

Good news? I didn't change anything at all. I wonder what could cause it to time out before and give results now.
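
The runs above differ mainly in the status field of the header line (NOERROR versus SERVFAIL) or in having no header at all when the query timed out. A minimal sketch for pulling that out of saved dig output; `dig_summary` is a hypothetical helper and the samples are trimmed from the output in this thread:

```python
import re

HEADER_RE = re.compile(r"->>HEADER<<- opcode: \w+, status: (\w+), id: \d+")
ANSWER_RE = re.compile(r"ANSWER: (\d+)")


def dig_summary(output):
    """Return (status, answer_count) from dig output, or None if no header."""
    header = HEADER_RE.search(output)
    if header is None:
        # No reply at all, e.g. "connection timed out; no servers could be reached".
        return None
    answers = ANSWER_RE.search(output)
    answer_count = int(answers.group(1)) if answers else 0
    return header.group(1), answer_count


sample_servfail = """\
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 39051
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
"""
sample_timeout = ";; connection timed out; no servers could be reached"

print(dig_summary(sample_servfail))
print(dig_summary(sample_timeout))
```

A SERVFAIL means the local server did reply but could not produce an answer, which is a different situation from the timeout, where no server replied at all.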

Tue, 01/28/2014 - 11:38
Locutus

The fact that your local BIND replies with SERVFAIL for "wordpress.org" shows that something's wrong with it. Can you post the output of this: cat /etc/bind/named.conf.options

Tue, 01/28/2014 - 11:44 (Reply to #10)
mossbot

I should have noticed that. Thanks for your continued help. Here's the output you requested.

user@hostname:~# cat /etc/bind/named.conf.options
options {
        directory "/var/cache/bind";

        // If there is a firewall between you and nameservers you want
        // to talk to, you may need to fix the firewall to allow multiple
        // ports to talk.  See http://www.kb.cert.org/vuls/id/800113

        // If your ISP provided one or more IP addresses for stable
        // nameservers, you probably want to use them as forwarders.
        // Uncomment the following block, and insert the addresses replacing
        // the all-0's placeholder.

        // forwarders {
        //      0.0.0.0;
        // };

        //========================================================================
        // If BIND logs error messages about the root key being expired,
        // you will need to update your keys.  See https://www.isc.org/bind-keys
        //========================================================================
        dnssec-validation auto;
        recursion yes;
        allow-recursion { 127.0.0.1; };
        auth-nxdomain no;    # conform to RFC1035
        listen-on-v6 { any; };
};
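
To read a file like this at a glance, the simple `name value;` statements can be listed separately from the comments. A rough sketch, not a real BIND parser: `bind_options` is a hypothetical helper, it skips brace blocks such as allow-recursion, and it would mis-handle quoted strings containing `#` or `//`. In the output above, `dnssec-validation auto;` is the statement that asks BIND to validate DNSSEC signatures:

```python
import re


def bind_options(config_text):
    """Return {option: value} for simple `name value;` statements."""
    # Strip /* ... */ blocks, then // and # comments to end of line.
    text = re.sub(r"/\*.*?\*/", "", config_text, flags=re.S)
    text = re.sub(r"(//|#).*", "", text)
    options = {}
    pattern = r"^[ \t]*([\w-]+)[ \t]+([^;{}\n]+);"
    for name, value in re.findall(pattern, text, flags=re.M):
        options[name] = value.strip().strip('"')
    return options


# Trimmed copy of the file shown above.
sample_conf = """\
options {
        directory "/var/cache/bind";
        // forwarders {
        //      0.0.0.0;
        // };
        dnssec-validation auto;
        recursion yes;
        allow-recursion { 127.0.0.1; };
        auth-nxdomain no;    # conform to RFC1035
        listen-on-v6 { any; };
};
"""
for name, value in bind_options(sample_conf).items():
    print(name, "=", value)
```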

Wed, 01/29/2014 - 16:06 (Reply to #11)
mossbot

Any insight into what that output might indicate?

Fri, 01/31/2014 - 19:49
mossbot

Still trying to figure this out. Been trying several tutorials on the proper setup of BIND to no avail. If anyone can help me out with this or point me in a direction it would be greatly appreciated.

Sat, 02/01/2014 - 03:38
Locutus

It's a bit hard to debug this just via the forum, without being able to do tests immediately. I can offer to log on to your system myself / do a screensharing session with you to take a look. Please note though that I need to ask for a moderate fee if that takes longer than about 30 minutes, and I can't promise that I'm able to solve the problem. If you'd like that, send a Skype add request to "Loc2262".

Mon, 02/03/2014 - 08:29 (Reply to #14)
mossbot

Thanks for your offer to help. I'll definitely contact you if this problem persists. I seem to have solved it, but maybe it's not ideal?

I changed/added the following in my named.conf.options.

        dnssec-enable no;
        // dnssec-validation auto;

Now the results from dig wordpress.org @127.0.0.1:

; <<>> DiG 9.8.1-P1 <<>> wordpress.org @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25032
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;wordpress.org.                 IN      A

;; ANSWER SECTION:
wordpress.org.          300     IN      A       66.155.40.250
wordpress.org.          300     IN      A       66.155.40.249

;; AUTHORITY SECTION:
wordpress.org.          14400   IN      NS      ns1.mobiusltd.com.
wordpress.org.          14400   IN      NS      ns2.mobiusltd.com.

;; ADDITIONAL SECTION:
ns1.mobiusltd.com.      300     IN      A       66.155.40.24
ns2.mobiusltd.com.      300     IN      A       66.155.9.169

;; Query time: 468 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Feb  3 00:09:33 2014
;; MSG SIZE  rcvd: 144

Good news?

Mon, 02/03/2014 - 10:21
Locutus

Yes, that looks promising! You could do a cross-check by taking a look at the syslog with the dnssec-enable option set to "no" and "yes".

If digging works persistently with it set to "no", and you get the SERVFAIL together with a DNSSEC validation error in the syslog with it set to "yes", then you have tested scientifically that this is your solution (if you can live without DNSSEC validation). :)
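
The cross-check described above can be written down as a small decision function. A minimal sketch under stated assumptions: `dnssec_is_culprit` is a hypothetical helper, the two statuses are read by hand from the dig header in each run, and the marker strings are the ones that appeared in the syslog excerpt earlier in this thread:

```python
def dnssec_is_culprit(status_with_no, status_with_yes, syslog_lines_with_yes):
    """True if lookups work with DNSSEC off and fail with validation errors when on."""
    markers = (
        "no valid RRSIG",
        "no valid DS",
        "broken trust chain",
        "no valid signature found",
    )
    saw_validation_error = any(
        "named[" in line and any(marker in line for marker in markers)
        for line in syslog_lines_with_yes
    )
    return (
        status_with_no == "NOERROR"
        and status_with_yes == "SERVFAIL"
        and saw_validation_error
    )


# Log line copied from the earlier excerpt, logged while validation was on.
log_with_yes = [
    "Jan 27 04:39:27 hostname named[1057]: error (broken trust chain) "
    "resolving 'api.wordpress.org/AAAA/IN': 66.155.40.24#53",
]
print(dnssec_is_culprit("NOERROR", "SERVFAIL", log_with_yes))
```

If either half of the test does not hold, for example the query still fails with the option set to "no", the function returns False and the cause lies elsewhere.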