How can I do DNS lookups in Python, including referring to /etc/hosts?

dnspython will do my DNS lookups very nicely, but it entirely ignores the contents of /etc/hosts.

Is there a Python library call which will do the right thing, i.e. check /etc/hosts first and only fall back to DNS lookups otherwise?


I'm not really sure if you want to do DNS lookups yourself or if you just want a host's IP. In case you want the latter, see the socket example below.

Warning: the C function behind socket.gethostbyname is obsolete and handles IPv4 only; prefer socket.getaddrinfo.

From man gethostbyname:

The gethostbyname*(), gethostbyaddr*(), [...] functions are obsolete. Applications should use getaddrinfo(3), getnameinfo(3),

import socket
print(socket.gethostbyname('localhost')) # result from hosts file
print(socket.gethostbyname('google.com')) # your os sends out a dns query

The normal name resolution in Python works fine. Why do you need dnspython for that? Just use socket's getaddrinfo, which follows the rules configured for your operating system (on Debian, it follows /etc/nsswitch.conf):

>>> print(socket.getaddrinfo('google.com', 80))
[(10, 1, 6, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::63', 80, 0, 0)), (10, 1, 6, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::68', 80, 0, 0)), (10, 1, 6, '', ('2a00:1450:8006::93', 80, 0, 0)), (10, 2, 17, '', ('2a00:1450:8006::93', 80, 0, 0)), (10, 3, 0, '', ('2a00:1450:8006::93', 80, 0, 0)), (2, 1, 6, '', ('209.85.229.104', 80)), (2, 2, 17, '', ('209.85.229.104', 80)), (2, 3, 0, '', ('209.85.229.104', 80)), (2, 1, 6, '', ('209.85.229.99', 80)), (2, 2, 17, '', ('209.85.229.99', 80)), (2, 3, 0, '', ('209.85.229.99', 80)), (2, 1, 6, '', ('209.85.229.147', 80)), (2, 2, 17, '', ('209.85.229.147', 80)), (2, 3, 0, '', ('209.85.229.147', 80))]
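Each tuple in that output is (family, type, proto, canonname, sockaddr). On the Linux machine that produced it, family 10 is AF_INET6 and 2 is AF_INET, and socket types 1, 2 and 3 are SOCK_STREAM, SOCK_DGRAM and SOCK_RAW, which is why every address shows up three times. A minimal sketch of collapsing the result into unique addresses per family follows; the helper name addresses_by_family is made up for illustration.

```python
import socket


def addresses_by_family(host, port=0):
    """Return {family name: [unique addresses]} as reported by the OS resolver.

    Hypothetical helper, not part of the socket module.
    """
    grouped = {}
    for family, _type, _proto, _canonname, sockaddr in socket.getaddrinfo(host, port):
        addresses = grouped.setdefault(family.name, [])
        # the same address is repeated once per socket type, so keep the first only
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return grouped


if __name__ == "__main__":
    # 'localhost' is normally answered from /etc/hosts, without a DNS query
    print(addresses_by_family('localhost'))
```

Because this goes through getaddrinfo, entries in /etc/hosts are picked up in whatever order the operating system is configured to consult them.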

I found this way to expand a round-robin DNS hostname, i.e. one that resolves to a list of IPs, into the list of member hostnames:

#!/usr/bin/python
from socket import getaddrinfo

from dns import reversename, resolver  # third-party: dnspython


def expand_dnsname(dnsname):
    namelist = []
    # expand hostname into a set of unique ip addresses
    iplist = set()
    for answer in getaddrinfo(dnsname, 80):
        iplist.add(str(answer[4][0]))
    # run through the list of IP addresses to get hostnames
    for ipaddr in sorted(iplist):
        rev_name = reversename.from_address(ipaddr)
        # run through all the hostnames returned, ignoring the dnsname
        # (resolver.resolve needs dnspython 2.x; older releases call it resolver.query)
        for answer in resolver.resolve(rev_name, "PTR"):
            name = str(answer)
            if name != dnsname:
                # add it to the list of answers
                namelist.append(name)
                break
    # if no other choice, return the dnsname
    if len(namelist) == 0:
        namelist.append(dnsname)
    # return the sorted namelist
    return sorted(namelist)


namelist = expand_dnsname('google.com.')
for name in namelist:
    print(name)

When I run it, it lists a few 1e100.net hostnames.
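Note that the PTR queries above go straight to DNS through dnspython, so they skip /etc/hosts just like dnspython's forward lookups do. If the reverse mapping should honor the hosts file as well, socket.getnameinfo asks the OS resolver instead. A rough sketch, where the helper name reverse_lookup is hypothetical and falling back to the bare IP is an arbitrary choice:

```python
import socket


def reverse_lookup(ipaddr):
    """Return a hostname for ipaddr via the OS resolver, or ipaddr if none is known.

    Hypothetical helper; the fallback value is an assumption of this sketch.
    """
    try:
        # NI_NAMEREQD makes getnameinfo raise instead of echoing the numeric address
        hostname, _service = socket.getnameinfo((ipaddr, 0), socket.NI_NAMEREQD)
    except socket.gaierror:
        return ipaddr
    return hostname


if __name__ == "__main__":
    print(reverse_lookup('127.0.0.1'))
```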

import socket

list(map(lambda x: x[4][0],
         socket.getaddrinfo('www.example.com.', 22, type=socket.SOCK_STREAM)))

This gives you a list of the addresses for www.example.com (IPv4 and IPv6).

This code works well for returning all of the IP addresses that might belong to a particular hostname. Since many services are now hosted on cloud or CDN platforms (AWS, Akamai, etc.), a single name may resolve to several IP addresses. The lambda was "borrowed" from @Peter Silva.

import socket


def get_ips_by_dns_lookup(target, port=None):
    '''
    This function takes the passed target and optional port and does a DNS
    lookup. It returns the IPs that it finds to the caller.

    :param target:  the hostname that you'd like to get the IP address(es) for
    :type target:   string
    :param port:    which port do you want to do the lookup against?
    :type port:     integer
    :returns ips:   all of the discovered IPs for the target
    :rtype ips:     list of strings
    '''
    if not port:
        port = 443

    # the trailing dot marks the name as fully qualified, so no search domain is appended
    return list(map(lambda x: x[4][0],
                    socket.getaddrinfo('{}.'.format(target), port, type=socket.SOCK_STREAM)))


ips = get_ips_by_dns_lookup(target='google.com')
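One thing the function above does not handle is a name that fails to resolve: socket.getaddrinfo raises socket.gaierror in that case. Here is a hedged variant that returns an empty list instead and drops duplicate addresses. The name lookup_or_empty and the flags passthrough are illustrative additions, not part of the original answer.

```python
import socket


def lookup_or_empty(target, port=443, flags=0):
    """Resolve target through the OS resolver; return [] if resolution fails.

    Hypothetical helper; `flags` is passed straight to socket.getaddrinfo.
    """
    try:
        results = socket.getaddrinfo(target, port, type=socket.SOCK_STREAM, flags=flags)
    except socket.gaierror:
        return []
    # dict.fromkeys keeps first-seen order while removing duplicates
    return list(dict.fromkeys(info[4][0] for info in results))


if __name__ == "__main__":
    print(lookup_or_empty('localhost'))
```

Whether an empty list or an exception is the better failure signal depends on the caller; the empty list is simply convenient when looping over many names.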

It sounds like you don't want to resolve DNS yourself. dnspython is a standalone DNS client that will understandably ignore your operating system because it's bypassing the operating system's utilities.

We can look at a shell utility named getent to understand how the (Debian 11-like) operating system resolves names for programs. This is likely standard for all *nix-like systems that use a socket implementation.

See the "hosts" section of man getent, which mentions the use of getaddrinfo; that function is documented in man getaddrinfo.

To use it in Python, we have to extract some info from the data structures:

import socket


def get_ipv4_by_hostname(hostname):
    # see `man getent` `/ hosts `
    # see `man getaddrinfo`
    return list(
        i        # raw socket structure
            [4]  # internet protocol info
            [0]  # address
        for i in
        socket.getaddrinfo(
            hostname,
            0  # port, required
        )
        if i[0] is socket.AddressFamily.AF_INET  # ipv4

        # ignore duplicate addresses with other socket types
        and i[1] is socket.SocketKind.SOCK_RAW
    )


print(get_ipv4_by_hostname('localhost'))
print(get_ipv4_by_hostname('google.com'))
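As noted above, dnspython bypasses the operating system's resolver. If you really do need it (say, for record types that getaddrinfo cannot return) but still want /etc/hosts consulted first, one option is to read the hosts file yourself and only then query DNS. The sketch below rests on several assumptions: the helper names parse_hosts and lookup are made up, the parser handles only the plain "address name [alias...]" line format with # comments, it ignores the ordering in /etc/nsswitch.conf entirely, and the fallback assumes dnspython 2.x.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its list of addresses.

    Hypothetical helper; handles only 'address name [alias...]' lines.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split('#', 1)[0].split()  # strip comments, split on whitespace
        if len(fields) < 2:
            continue  # blank line, comment-only line, or address without names
        address, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name.lower(), []).append(address)
    return table


def lookup(name, hosts_path='/etc/hosts'):
    """Return addresses from the hosts file, else from an A query via dnspython."""
    try:
        with open(hosts_path) as handle:
            table = parse_hosts(handle.read())
    except OSError:
        table = {}  # no readable hosts file: go straight to DNS
    key = name.rstrip('.').lower()
    if key in table:
        return table[key]
    # imported lazily so the hosts-file path works without dnspython installed
    from dns import resolver  # third-party: dnspython 2.x assumed
    return [answer.address for answer in resolver.resolve(name, 'A')]
```

For ordinary "give me an IP" lookups, socket.getaddrinfo remains the simpler choice, since the operating system already does this ordering for you.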