Kudos should go to gawry for his answer. However, I didn't want to mutilate his answer with my additions, which seem to be somewhat longer than his full answer. So please see this answer as an addition to his answer.
Caveat emptor
This will only work on Python 2.x with urllib2. The structure of the classes has changed in Python 3.x, so even the common compatibility trick:
try:
    import urllib.request as urllib2
except ImportError:
    import urllib2
won't save you. I guess that's the reason why you shouldn't rely on internals of classes, especially when the attributes start with an underscore and are therefore by convention not part of the public interface, albeit being accessible.
Conclusion: the trick shown below doesn't work on Python 3.x.
Extracting IP:port from an HTTPResponse
Here's a condensed version of his answer:
import urllib2
r = urllib2.urlopen("http://google.com")
peer = r.fp._sock.fp._sock.getpeername()
print("%s connected\n\tIP and port: %s:%d\n\tpeer = %r" % (r.geturl(), peer[0], peer[1], peer))
Output will be something like this (trimmed ei parameter for privacy reasons):
http://www.google.co.jp/?gfe_rd=cr&ei=_... connected
	IP and port: 173.194.120.95:80
	peer = ('173.194.120.95', 80)
Assuming r above is an httplib.HTTPResponse instance, we make the following additional assumptions:
- its attribute fp (r.fp) is an instance of class socket._fileobject, created via sock.makefile() in the ctor of httplib.HTTPResponse
- attribute _sock (r.fp._sock) is the "socket" instance passed to the class socket._fileobject ctor, it will be of type
- attribute fp (r.fp._sock.fp) is another socket._fileobject which wraps the real socket
- attribute _sock (r.fp._sock.fp._sock) is the real socket object
Roughly r.fp is a socket._fileobject, while r.fp._sock.fp._sock is the actual socket instance (type _socket.socket) wrapped by a socket._fileobject wrapping another socket._fileobject (two levels deep). This is why we have this somewhat unusual .fp._sock.fp._sock. in the middle.
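Since every hop in that chain is a private attribute, one defensive option is to walk it one step at a time and give up as soon as a hop is missing. The following is only a sketch: `follow_chain` and the stand-in `Node` class are names made up for illustration and are not part of urllib2 or httplib. The stand-in objects merely mimic the nesting described above so the example runs without a network connection.

```python
def follow_chain(obj, names=("fp", "_sock", "fp", "_sock")):
    """Follow attribute names one hop at a time; return None if a hop is missing."""
    for name in names:
        obj = getattr(obj, name, None)
        if obj is None:
            return None
    return obj


class Node(object):
    """Hypothetical stand-in object; real code would pass the urlopen() result."""
    def __init__(self, **attrs):
        self.__dict__.update(attrs)


# Mimic the nesting: file object -> response -> file object -> real socket.
# The address is a documentation example, not a real peer.
real_socket = Node(getpeername=lambda: ("203.0.113.5", 80))
fake_response = Node(fp=Node(_sock=Node(fp=Node(_sock=real_socket))))

sock = follow_chain(fake_response)
print(sock.getpeername())      # the innermost object answers getpeername()
print(follow_chain(Node()))    # None, because the chain is not there
```

With a real Python 2 response you would call `follow_chain(r)` and check the result for None before calling getpeername() on it.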
The value returned by getpeername() above is a 2-tuple for IPv4. Element 0 is the IP in string form and element 1 is the port to which the connection was made on that IP. Note: The documentation states that this format depends on the actual socket type.
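Because the tuple layout depends on the address family, code that formats the peer should not assume exactly two elements. For AF_INET6 sockets, getpeername() returns a 4-tuple of (host, port, flowinfo, scope_id). Here is a small sketch of a formatter that copes with both; `format_peer` is a hypothetical helper name and the addresses are documentation examples.

```python
def format_peer(peer):
    """Format a getpeername() result as host:port for IPv4 and IPv6 tuples."""
    host, port = peer[0], peer[1]   # both families put host and port first
    if ":" in host:                 # IPv6 literal: bracket it so the port stays readable
        return "[%s]:%d" % (host, port)
    return "%s:%d" % (host, port)


print(format_peer(("203.0.113.5", 80)))          # IPv4 2-tuple
print(format_peer(("2001:db8::1", 443, 0, 0)))   # IPv6 4-tuple
```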
Extracting this information from HTTPError
On another note, since urllib2.HTTPError derives from URLError as well as addinfourl and stores the fp in an attribute of the same name, we can even extract that information from an HTTPError exception (not from URLError, though) by adding another fp to the mix like this:
import urllib2
try:
    r = urllib2.urlopen("https://stackoverflow.com/doesnotexist/url")
    peer = r.fp._sock.fp._sock.getpeername()
    print("%s connected\n\tIP and port: %s:%d\n\tpeer = %r" % (r.geturl(), peer[0], peer[1], peer))
except urllib2.HTTPError as e:
    # HTTPError keeps the response in e.fp, hence the extra .fp hop
    if e.fp is not None:
        peer = e.fp.fp._sock.fp._sock.getpeername()
        print("%s: %s\n\tIP and port: %s:%d\n\tpeer = %r" % (str(e), e.geturl(), peer[0], peer[1], peer))
    else:
        print("%s: %s\n\tIP and port: <could not be retrieved>" % (str(e), e.geturl()))
Output will be something like this (unless someone at StackOverflow adds that URL ;)):
HTTP Error 404: Not Found: https://stackoverflow.com/doesnotexist/url
	IP and port: 198.252.206.16:80
	peer = ('198.252.206.16', 80)
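To handle the response and the HTTPError case with one code path, a possible approach is to try the longer HTTPError chain first and fall back to the plain response chain. This is a sketch that assumes the attribute layout is as described above; `peer_socket`, `PATHS` and the `Node` stand-in are made-up names, and the stand-in objects only exist so the example runs offline.

```python
from operator import attrgetter

# Candidate paths: HTTPError first (one extra fp), then a plain response.
PATHS = ("fp.fp._sock.fp._sock", "fp._sock.fp._sock")


def peer_socket(obj):
    """Return the first object reachable via one of PATHS, or None."""
    for path in PATHS:
        try:
            return attrgetter(path)(obj)
        except AttributeError:
            continue   # a hop is missing (or was None); try the next path
    return None


class Node(object):
    """Hypothetical stand-in for the urllib2/httplib objects."""
    def __init__(self, **attrs):
        self.__dict__.update(attrs)


sock = Node(getpeername=lambda: ("203.0.113.5", 80))   # documentation address
response = Node(fp=Node(_sock=Node(fp=Node(_sock=sock))))
error = Node(fp=response)          # mimics HTTPError storing the response in fp

print(peer_socket(response).getpeername())
print(peer_socket(error).getpeername())
print(peer_socket(Node(fp=None)))  # None, as for an HTTPError without fp
```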