If you want to anonymize IP addresses, for example in HTTP logs, here’s the method I recommend:

#!/usr/bin/env python3

import hashlib
import ipaddress
import sys

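# Placeholder key; see the notes below on choosing and reusing a key.
KEY = b"example"

def hash_ip(ip, key=b""):
    """Return a hashed address of the same IP version as ip."""
    # Split off the last 8 bytes (IPv6 /64) or the last byte (IPv4 /24).
    last_len = -8 if ip.version == 6 else -1
    base_bytes = ip.packed[:last_len]
    last_bytes = ip.packed[last_len:]
    # The prefix hash depends only on the network part, so addresses in the
    # same network share a hashed prefix.
    base_bhash = hashlib.shake_256(key + base_bytes).digest(len(base_bytes))
    # The suffix hash depends on the whole address, so the same final bytes
    # hash differently in different networks.
    last_bhash = hashlib.shake_256(key + ip.packed).digest(len(last_bytes))
    return ipaddress.ip_address(base_bhash + last_bhash)

# Each input line must start with the IP address, followed by a space.
for line in sys.stdin:
    line_ip, line_rest = line.split(" ", 1)
    ip_hashed = hash_ip(ipaddress.ip_address(line_ip), KEY)
    sys.stdout.write("{} {}".format(ip_hashed, line_rest))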


  • IPv4 and IPv6 addresses are both supported.
  • Hashing an IPv4 address returns an IPv4 address, and hashing an IPv6 address returns an IPv6 address. However, that doesn’t mean the hashed addresses are globally routable (they could fall within multicast space, for example).
  • Addresses in the same IPv4 /24 or IPv6 /64 can still be correlated, allowing you to pick out relatively similar hashed addresses. For example, if two hashed IPv4 addresses share their first three octets, you know the real addresses are part of the same /24, even though you know nothing else about the addresses.
  • However, the remainder of the address (the last octet of an IPv4 address, or the last 64 bits of an IPv6 address) is hashed together with the full address, so it does not leak information about itself. If you happen to know which real address one hashed address corresponds to, another hashed address in the same hashed /24 that ends in .4 does not mean the real IP ends in .4.
  • A hash key is optional but recommended. Reuse the key if you want to be able to correlate hashed IPs over time, or to hash a known real IP and search for it in past logs. Use a random key and throw it away afterwards if you want the result to be truly anonymous.
  • SHAKE-256 is used because it’s modern (part of the SHA-3 family) and supports arbitrary-length output. If you can’t use SHA-3, use something like SHA-512, HMAC it with a key, and use a portion of the output. (SHA-3 digests can securely use key prepending; other digests require HMAC.)
  • Hashed IPv6 addresses get quite unwieldy. Get used to addresses like 20f1:b413:7f5d:7720:1fe2:1bf3:38fb:620. (Exactly one hex digit out of 32 was compressible in that random example I used. Most examples have none.)
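
If SHA-3 isn’t available, the HMAC fallback mentioned above could look roughly like the sketch below. This is not part of the script itself: the name hash_ip_hmac is made up for illustration, and the choices of HMAC-SHA-512 and a 32-byte random key are assumptions. It truncates the HMAC output to the needed length and then checks the properties described in the notes.

```python
import hashlib
import hmac
import ipaddress
import secrets


def hash_ip_hmac(ip, key):
    """Hypothetical variant of hash_ip using HMAC-SHA-512 (an assumption)."""
    # Same split as the script: last 8 bytes for IPv6, last byte for IPv4.
    last_len = 8 if ip.version == 6 else 1
    base_bytes = ip.packed[:-last_len]
    # The prefix depends only on the network part, so addresses in the same
    # /24 or /64 stay correlated after hashing.
    base_hash = hmac.new(key, base_bytes, hashlib.sha512).digest()[:len(base_bytes)]
    # The suffix depends on the full address, so it reveals nothing about the
    # real final bytes.
    last_hash = hmac.new(key, ip.packed, hashlib.sha512).digest()[:last_len]
    return ipaddress.ip_address(base_hash + last_hash)


# A throwaway random key: hashes cannot be reproduced once it is discarded.
key = secrets.token_bytes(32)

a = hash_ip_hmac(ipaddress.ip_address("192.0.2.10"), key)
b = hash_ip_hmac(ipaddress.ip_address("192.0.2.200"), key)
c = hash_ip_hmac(ipaddress.ip_address("2001:db8::1"), key)
d = hash_ip_hmac(ipaddress.ip_address("2001:db8::2"), key)

print(a, b)  # same first three octets, since both are in 192.0.2.0/24
print(c, d)  # same first 64 bits, since both are in 2001:db8::/64
```

The addresses used here come from the documentation ranges (192.0.2.0/24 and 2001:db8::/32). The output differs on every run because the key is random; keep the key instead if you need to correlate hashed addresses across runs.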