Ryan Finnie

The "perfect" IP hashing algorithm

If you want to anonymize IP addresses, for example in HTTP logs, here’s the method I recommend:

#!/usr/bin/env python3

import hashlib
import ipaddress
import sys


KEY = b"example"


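def hash_ip(ip, key=b""):
    # Host part: the last 8 bytes of an IPv6 address (/64) or the last byte of IPv4 (/24).
    last_len = -8 if ip.version == 6 else -1
    base_bytes = ip.packed[:last_len]
    last_bytes = ip.packed[last_len:]
    # The network part is hashed alone, so addresses in the same network share a hashed prefix.
    base_bhash = hashlib.shake_256(key + base_bytes).digest(len(base_bytes))
    # The host part is hashed over the full address, so it cannot be matched across networks.
    last_bhash = hashlib.shake_256(key + ip.packed).digest(len(last_bytes))
    return ipaddress.ip_address(base_bhash + last_bhash)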


for line in sys.stdin:
    line_ip, line_rest = line.split(" ", 1)
    ip_hashed = hash_ip(ipaddress.ip_address(line_ip), KEY)
    sys.stdout.write("{} {}".format(ip_hashed, line_rest))

Notes:

  • IPv4 and IPv6 addresses are both supported.
  • Hashed IPv4 addresses return IPv4, and hashed IPv6 addresses return IPv6. However, that doesn’t mean the hashed addresses are globally valid (they could be within multicast, for example).
  • The last IPv4 /24 or IPv6 /64 can still be correlated, allowing you to pick out relatively similar hashed addresses. For example, if you see both 230.10.134.107 and 230.10.134.226 hashed addresses, you know they’re part of the same /24, even though you know nothing else about the addresses.
  • However, the last IPv4 /24 or IPv6 /64 is not globally hashed and does not leak information about itself. If you happen to know hashed 230.10.134.107 corresponds to real 10.9.8.4, hashed 178.59.91.107 does not mean the real IP ends in .4.
  • A hash key is optional but recommended. Reuse the key if you want to be able to correlate hashed IPs over time, or to hash a known real IP and search for it in past logs. Use a random key and throw it away afterwards if you want the result to be truly anonymous.
  • SHAKE-256 is used because it’s modern (SHA-3) and supports arbitrary-length output. If you can’t use SHA-3, use something like SHA-512, HMAC it with a key, and use a portion of the output. (SHA-3 digests can securely use key prepending; other digests require HMAC.)
  • Hashed IPv6 addresses get quite unwieldy. Get used to addresses like 20f1:b413:7f5d:7720:1fe2:1bf3:38fb:620. (Exactly one hex digit out of 32 was compressible in that random example I used. Most examples have none.)
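
As a minimal sketch of the SHA-512 fallback mentioned in the notes, the same split can be done with HMAC-SHA-512, truncating the digest to the length needed. The function name hash_ip_hmac and the sample addresses are assumptions made for illustration; this is not a drop-in replacement tested against real logs.

```python
import hashlib
import hmac
import ipaddress


def hash_ip_hmac(ip, key):
    """HMAC-SHA-512 variant of hash_ip (hypothetical name), using a truncated digest."""
    last_len = -8 if ip.version == 6 else -1
    base_bytes = ip.packed[:last_len]
    last_bytes = ip.packed[last_len:]
    # Network part: hashed alone, so one /24 (or /64) maps to one hashed prefix.
    base_bhash = hmac.new(key, base_bytes, digestmod=hashlib.sha512).digest()
    # Host part: hashed over the full address, so it does not leak across networks.
    last_bhash = hmac.new(key, ip.packed, digestmod=hashlib.sha512).digest()
    # Keep only as many bytes of each digest as the original part had.
    return ipaddress.ip_address(
        base_bhash[: len(base_bytes)] + last_bhash[: len(last_bytes)]
    )


key = b"example"  # placeholder key, as in the script above
a = hash_ip_hmac(ipaddress.ip_address("10.9.8.4"), key)
b = hash_ip_hmac(ipaddress.ip_address("10.9.8.200"), key)
c = hash_ip_hmac(ipaddress.ip_address("2001:db8::1"), key)
print(a, b, c)
```

The two IPv4 results share their first three octets because the inputs are in the same /24, which is the correlation property described in the notes.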

That Time I Bought Classified Military Intelligence Off eBay (NOT CLICKBAIT)

Yesterday, “Alex” posted a hilarious account of getting former Australian prime minister Tony Abbott’s passport information. The lesson of the story boils down to: how do you report a vulnerability at a national scale? Do you just call 1-800-NATIONAL-SECURITY and let them know what happened? (Actually, Australia does have a literal hotline like that, but it turned out to be useless. Seriously, before you read further here, go read “Alex”’s story.)

This story reminds me of a similar problem I had 13 years ago. But be warned, this story is completely from memory and some things may be wrong. It will also not be nearly as entertaining as “Alex”’s story.

Cobalt RaQ 2, redacted

In 2007, I bought a Cobalt RaQ 2 off eBay. Introduced in 1999, the RaQ 2 was the rackmount server variant of the more well known Cobalt Qube 2. It featured a MIPS R5000 CPU and was technically a general purpose server platform, but was most often used as a file server or proxy server. I bought it used (and long after its useful life) because I like to collect non-x86 hardware which can run Linux.

When it arrived, I powered it on, intending to look around Cobalt’s bespoke Linux distribution before wiping the drive and installing Debian. I was expecting a server restored to factory settings, but was secretly hoping for some random company’s fileserver data. Likely an out-of-business company since these sorts of sales usually come from liquidators.

Through the front panel, I was able to reset the administrator password and retrieve the configured IP information, which was… a public IP address. Huh, weird. Was it being used as a web server? Doing a whois on it revealed it was in a block owned by the DoD Network Information Center. Yes, that DoD.

By attaching a crossover Ethernet cable, I was able to log into the RaQ’s administrative web interface. I could see it was configured as a caching LAN web proxy, and its hostname ended in smil.mil. Yes, that military.

Without looking at the contents, I could tell that the drive was full of cached content, with the usual extensions (.html, .jpg, etc), though if I remember correctly, the base filenames were randomly assigned so there was no indication of the subject of each file’s content. The file dates were mostly through 2003.

That’s as far as I dug. I turned off the RaQ and pulled the hard drive. As I would come to learn, smil.mil is a domain on SIPRNet, the US government’s classified equivalent to the Internet. SIPRNet is supposed to be air-gapped; there should be no physical way to reach a machine on SIPRNet from the Internet or vice versa, and it was definitely not accessible to civilians. The fact that I had a caching proxy server used on SIPRNet was inconceivable.

(This also explains why it had a “public” IP address. If you look at visual representations of IPv4 space (buy this poster, hint hint), you’ll see DoD spaces but they don’t appear to be in use. They are in use, just not on the Internet. RFC 1918? Never heard of it. Similarly, smil.mil and sgov.gov look like Internet domains, but only resolve on SIPRNet. I’ll give the government a pass here; the IANA has no organization-specific private namespace, like .internal or .lan, and it annoys me that domains like that are not explicitly reserved. What were we talking about? Oh right, national security.)

So what should I do with this information? I could go to the press, and it would be quite an embarrassment to the administration, but I didn’t want to draw political attention to myself. Ideally I just wanted to go to someone official and say “someone messed up, here’s the server’s hard drive and who I bought it from”. But again, there isn’t really a 1-800-CLASSIFIED-LEAK.

A co-worker suggested I contact my congressman’s office to see if they could help me navigate to someone proper in the federal government. I called, they took down my information, and never called me back. I sent an email, no response. I’m sure I sounded like a crackpot, but it’s not my fault it was all true.

In the end, I gave up. The hard drive went into my document safe, just in case I got a call later (perhaps the FBI would get a list of buyers from the eBay seller). But that never happened, and about 5 years later I destroyed the hard drive without loading any of its cached contents. Somehow I lost my email archives from earlier than 2010, and eBay sale records only go back about 2 years, so I can’t even find out who sold me the server back in 2007.

I still have the Cobalt RaQ 2 hardware today (minus the hard drive), and it currently runs Debian off an SD to IDE adapter. There was nothing otherwise special about the hardware, nor any asset stickers or similar, so now it’s just a weird server from an old generation.

Git branch-based contribution workflow management

Let’s face it, Git is not easy to use. Actually, git clone $URL and then the occasional git pull is simple, but the barrier to entry for making changes is quite high. Even if a novice makes it to the point of filing a pull request, a response like “please squash your changes” probably makes no sense.

This post is roughly based on a guide I wrote for an internal company wiki about a decade ago. It describes a workflow for making contributions to an existing Git-managed project, using purposeful branches. I say “purposeful” a lot; the idea is a branch houses the development of a single change, which has a single purpose. 99% of the time, that change will be a single commit by the time it’s reviewed and merged upstream.

Managing purposeful changes in their own branches lets you manage multiple contributions to a single project. There was a project last year where I had over 20 changes ready to be reviewed and merged, each in their own branch. Most of them were simple and could be reviewed in a minute, but since they were split up into dedicated changes, you didn’t have to coordinate with the reviewer about which requests to review in what order (most of the time).

As an example, let’s use the Git repository for 2ping, a somewhat popular utility I wrote. First, we want to clone the origin repository.

$ git clone https://github.com/rfinnie/2ping
Cloning into '2ping'...
$ cd 2ping/
2ping{main}$

2ping’s primary branch is “main”, as is the case for a growing number of repositories as of 2020, but keep in mind the default primary branch for Git repositories is “master”, so most repositories you will come across use that.

Now, if you want to make a change, the first thing you want to do is create a new, purposeful branch. In this example, we want to add a new file, so let’s call the branch “addfile”.

2ping{main}$ git checkout -b addfile
Switched to a new branch 'addfile'
2ping{addfile}$

The examples here show the current branch between curly brackets (you can do the same with your own PS1 prompt), but you can always check which branch you are currently on with git branch.

2ping{addfile}$ git branch
* addfile
  main

Now, let’s get editing! Commit early and commit often, and don’t worry about getting everything right with each commit. We will be squashing everything down to a single, purposeful commit at the end, so right now the development commits are more of a stream-of-consciousness log for your own benefit.

2ping{addfile}$ echo foo > bar
2ping{addfile}$ git add bar
2ping{addfile}$ git commit -m 'Add a new file, "bar"'
2ping{addfile}$ echo dive > bar
2ping{addfile}$ git commit -a -m 'Oops, this is what the new file should look like'

Other people may be working on the repository at the same time as you, and new commits may be added to the primary branch between the time you start the new branch and the time you’re ready for review. Occasionally, you want to pull in any new commits, and integrate them into your working branch.

2ping{addfile}$ git fetch origin main
2ping{addfile}$ git rebase origin/main

The first command fetches any new commits for the “main” branch from the “origin” remote (when you did a git clone originally, the URL you specified became the “origin” remote by default). The second command takes “origin/main” (your local copy of the remote’s “main” branch, as of that fetch) as a base, and re-applies the commits you made on the “addfile” branch on top of it.
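
To make “re-applies the commits” more concrete, here is a toy model in Python. It is only a sketch and not how Git is implemented: the assumption is that a commit’s ID is a hash of its parent’s ID and its message, which is enough to show that replaying the same changes on a new base produces new IDs. All function names here are made up for the illustration.

```python
import hashlib


def commit_id(parent, message):
    """Toy commit ID: a hash of the parent ID and the commit message."""
    data = "{}\n{}".format(parent, message).encode()
    return hashlib.sha1(data).hexdigest()[:12]


def build(base, messages):
    """Stack one commit per message on top of base; return (id, message) pairs."""
    commits = []
    parent = base
    for message in messages:
        parent = commit_id(parent, message)
        commits.append((parent, message))
    return commits


def rebase(commits, new_base):
    """Replay the same messages, in the same order, on top of new_base."""
    return build(new_base, [message for _, message in commits])


old_main = commit_id("", "Initial commit")
new_main = commit_id(old_main, "Someone else's change")
addfile = build(
    old_main,
    ['Add a new file, "bar"', "Oops, this is what the new file should look like"],
)
rebased = rebase(addfile, new_main)
for (old_id, message), (new_id, _) in zip(addfile, rebased):
    print(old_id, "->", new_id, message)
```

The messages survive unchanged, but every ID changes because each commit now has a different ancestor. That is the sense in which rebasing alters history.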

There are several commands available which give you an idea of the current status of your branch in relation to the primary branch.

2ping{addfile}$ git diff origin/main
2ping{addfile}$ git log origin/main..HEAD

Commit often, but remember to keep the actual branch’s scope limited to a single purposeful change. Say you’re in the middle of this addfile change and you notice a typo in README.md. It’s easy to commit your work in progress, then switch to yet another new branch off the primary branch and update README.md.

2ping{addfile}$ git commit -a -m 'Work in progress commit'
2ping{addfile}$ git checkout main
2ping{main}$ git pull
2ping{main}$ git checkout -b readme-typo
2ping{readme-typo}$ vi README.md
2ping{readme-typo}$ git commit -a -m 'Fix README.md typo'
2ping{readme-typo}$ git checkout addfile
2ping{addfile}$

(git stash is also available to do roughly the same thing, but as you’re already working with a dedicated branch which will be squashed, it’s actually easier to just commit your WIP.)

Once you’re ready, you’ll have a few commits for a single change. You’ll want to squash those down into a single commit.

2ping{addfile}$ git rebase -i origin/main

This will open an editor with all of your commits since the branch point, with the oldest commit at the top.

pick 9e669cc36 Add a new file, "bar"
pick ab0c50f93 Oops, this is what the new file should look like

Edit the commands so you squash all other commits into the top commit.

pick 9e669cc36 Add a new file, "bar"
squash ab0c50f93 Oops, this is what the new file should look like

Once you save and exit, another editor will open for the final commit message. It will contain the text of all your commits, so you can pare the text down into what you want the final commit message to be.

Now you’ve got a single, purposeful commit with the desired change.

2ping{addfile}$ git show
commit 67017eed63b41cc57a6c81fe9e0be9ded733be30 (HEAD -> addfile)
Author: Ryan Finnie <ryan@finnie.org>
Date:   Mon Jul 13 15:34:42 2020 -0700

    Add a new file, "bar"

    This file is vital to the operation of 2ping.

diff --git a/bar b/bar
new file mode 100644
index 0000000..1bc5169
--- /dev/null
+++ b/bar
@@ -0,0 +1 @@
+dive

Now you’ll want to push it to a personal repository. We’ll use GitHub as an example here, but unfortunately GitHub does not allow pushing to new namespaces, so you’ll need to fork the origin repository (GitHub’s term for making your own server-side copy) through the web site first. Once that’s done, you can add your fork as a new “remote”.

2ping{addfile}$ git remote add personal https://github.com/youruser/2ping

If you remember above, new data was being checked for via the “origin” remote. This new “personal” remote is simply adding a second, non-default remote to your local repository. You only need to add this remote to your local repository once.

Now you can push your new branch to your personal remote.

2ping{addfile}$ git push personal addfile

At this point, GitHub will notice that this is a new branch in a clone of a third-party repository, and will give you a URL which lets you create a pull request against the origin.

Say you create a pull request and the upstream requests alterations. Go ahead and make them, then rebase and squash against the primary branch again.

2ping{addfile}$ echo gold > bar
2ping{addfile}$ git commit -a -m 'Upstream does not like dive bars, and instead wants gold'
2ping{addfile}$ git rebase -i origin/main

Now push again, but since you are pushing a change which cannot be fast-forwarded on your remote “addfile” branch (remember, rebasing effectively alters history), you’ll need to force the push.

2ping{addfile}$ git push personal addfile --force

While this is the preferred workflow for working branches for the purpose of submitting changes, you should never rebase a primary branch, as it alters history and makes it very hard for others to work with your repository.
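
The reason a squashed branch needs a forced push can be sketched with a small Python model. This is an illustration under simplifying assumptions, not Git’s actual logic: commits are plain names mapped to their parents, and a push is accepted only when the remote’s current tip is an ancestor of what you are pushing (a fast-forward). The commit names are hypothetical.

```python
def is_ancestor(parents, ancestor, commit):
    """Follow parent links back from commit, looking for ancestor."""
    while commit is not None:
        if commit == ancestor:
            return True
        commit = parents.get(commit)
    return False


def push(parents, remote_tip, local_tip, force=False):
    """Return the new remote tip, or raise if the push is not a fast-forward."""
    if not force and not is_ancestor(parents, remote_tip, local_tip):
        raise ValueError("rejected: non-fast-forward")
    return local_tip


# Hypothetical commits; each name maps to its parent.
parents = {
    "base": None,
    "v1": "base",        # squashed commit originally pushed for review
    "fixup": "v1",       # follow-up commit made after review
    "v2": "base",        # v1 + fixup squashed again by the interactive rebase
}

print(push(parents, "v1", "fixup"))           # plain fast-forward is accepted
try:
    push(parents, "v1", "v2")                 # v1 is not an ancestor of v2
except ValueError as err:
    print(err)
print(push(parents, "v1", "v2", force=True))  # forcing replaces the remote tip
```

In this model, forcing simply discards whatever the remote branch pointed at. Doing the same to a primary branch that other people build on is what causes the trouble.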

Now, if your pull request is accepted and merged, you can go back to the primary branch and pull in the changes from the default “origin” remote.

2ping{addfile}$ git checkout main
2ping{main}$ git pull

Congratulations, you’ve navigated a change workflow! Now, you can apply this workflow to local development as well. Even if the repository is yours, you can use branches to manage your development. For a complex personal project, I’ll often have dozens of half-implemented branches in various states of work. When a change is ready, all you need to do is rebase against the primary branch, then switch to it and merge.

2ping{addfile}$ git checkout main
2ping{main}$ git merge addfile
2ping{main}$ git push

Safechain: safe, atomic and idempotent iptables firewall management

When I joined Canonical in 2012, we in IS had a number of choke firewalls which were literally just servers which ran iptables rules. These firewalls would filter gigabits of traffic without blinking an eye, so it was sound from a technical perspective, but config management was a problem.

The general layout was a firewall.sh file which was run early on boot, which did initial setup and created network-specific chains. The network-specific chain (which usually took the form net1_to_net2.sh) would be run from /etc/network/interfaces and would flush the chain and populate it. This approach had two rather annoying flaws.

The first flaw was that any sort of syntax error would leave the chain in a broken state. Because of this, any updates to a firewall in the config management system (Puppet at the time) required a +2 to commit, instead of the normal +1. Even then, chains would regularly break, leaving an SRE to scramble to cowboy in a fix while downtime occurred.

The second flaw is less obvious at first, but became a large problem later on. As chains grew larger, they took longer to apply. A chain with thousands of rules could take seconds; tens of thousands of rules could take over a minute. And since the first step was to flush the chain, this was time when a partially applied chain was in production. SREs would start announcing when chain updates were being applied, chain files were rearranged so the most important rules were near the top, etc.

In early 2013, I wrote Safechain, which ended up being one of the most proportionally simple-to-write-versus-headache-reducing scripts I had ever written. In a nutshell, it takes the basic concept of chain-specific iptables firewall scripts, and makes it safe, atomic and idempotent.

A converted chain called “host_ingress” might look like this:

#!/bin/sh

set -e

. /etc/safechain/safechain.sh

# host_ingress chain preprocessing
sc_preprocess host_ingress

# Allow ICMP
sc_add_rule host_ingress -p icmp -j ACCEPT

# Allow all inbound traffic from the LAN
sc_add_rule host_ingress -i eth1 -j ACCEPT

# Allow certain services
sc_add_rule host_ingress -p tcp --dport 80 -j ACCEPT
sc_add_rule host_ingress -p tcp --dport 443 -j ACCEPT

# Allow SSH from trusted host
sc_add_rule host_ingress -s 10.2.8.3 -p tcp --dport 22 -j ACCEPT

# Drop all other traffic
sc_add_rule host_ingress -j LOG --log-prefix "BAD-host-in: "
sc_add_rule host_ingress -j DROP

# host_ingress chain postprocessing
# Goes live here if all went well
sc_postprocess host_ingress

In most cases, converting from iptables to Safechain was a matter of adding sc_preprocess host_ingress, sc_postprocess host_ingress, and changing all iptables -A to sc_add_rule.

sc_preprocess creates a temporary chain, which sc_add_rule adds to. When sc_postprocess is run, a jump from the main chain to the temporary chain is added, the existing live chain is removed (including any references to it, at which point the new chain is now live), and the new chain is renamed to the live chain (including any references to it). If an error occurs at any point in this process, the old chain will remain active, and it’s impossible for a half-broken chain to be running.
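
The build-then-swap idea can be modeled in a few lines of Python. This is only a sketch of the concept: Safechain itself is a shell library driving iptables, and the dictionary of chains, the check_rule validator and the “_new” suffix below are all assumptions made for the illustration.

```python
def check_rule(rule):
    """Hypothetical stand-in for iptables rejecting a malformed rule."""
    if "-j" not in rule.split():
        raise ValueError("bad rule: " + rule)


def apply_chain(chains, name, rules):
    """Build the rules in a temporary chain, then swap it in only on success."""
    temp = name + "_new"  # hypothetical temporary chain name
    chains[temp] = []
    try:
        for rule in rules:
            check_rule(rule)
            chains[temp].append(rule)
    except ValueError:
        # Any failure discards the temporary chain; the live chain is untouched.
        del chains[temp]
        raise
    # Single replacement step: the live name now refers to the new rules.
    chains[name] = chains.pop(temp)


chains = {"host_ingress": ["-p icmp -j ACCEPT"]}
good = ["-p icmp -j ACCEPT", "-p tcp --dport 443 -j ACCEPT", "-j DROP"]
apply_chain(chains, "host_ingress", good)

try:
    apply_chain(chains, "host_ingress", ["-p tcp --dport 80 ACCEPT"])
except ValueError:
    print("update failed; live chain unchanged")
print(chains)
```

Running a good update twice leaves the same result, which is the idempotent part. The failed update never touches the live chain, which is the safe part.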

This served us well for about five years, until we implemented a replacement declarative-based firewall system which used ipset under the hood. But I still used Safechain at home, and got permission from Canonical to open source it. Normally something like this would have been open sourced from the beginning, usually under the Ubuntu banner, but since it was used completely internally, nobody really thought about putting an LGPL header on it and posting it publicly. Canonical doesn’t really have an interest in Safechain since they don’t use it for production anymore, so it’s effectively “mine” now, but I still wanted to go by the book.

PayPal is abusively incompetent

<update date=”2020-06-23”>It’s a month later, and I noticed the BBB complaint page kept pushing back the required date for PayPal to respond, currently at the end of July. I took that as them not actually holding one of the world’s largest financial institutions to task, because scary things are going on in the world. So, back to square one, having no hope that my situation will ever be resolved.

I posted this observation on Twitter, and an hour later, they fixed my account and I got a call from PayPal Corporate Escalations regarding my BBB complaint. They claimed they had no knowledge of the tweet I just sent, or of the previous 11 interactions. What an amazing coincidence! I verified I could log in and immediately transferred all the money out of my account.

I did not hold back with my assessment of the situation, and also asked about the two promised “specialist” escalations from @AskPayPal that never happened, the fact that there has been literally no way to deal with account security issues for months now (and for months in the future, see below), the lies from @AskPayPal claiming the opposite of that, etc. The person I talked to was generically contrite, but obviously didn’t have any answers.

I’m relieved this is fixed for me personally, but obviously not happy. They said the current estimate for phone support to return is OCTOBER. From one of the world’s largest financial institutions. PayPal online chat can’t access your account. Their brand protection accounts on Twitter/Facebook/etc can’t access your account. Literally the only way to deal with account security issues at the present time is to file a BBB complaint and be as loud as possible. Maybe file a small claims lawsuit if you have money held hostage. This is just to get the attention of someone who can make a difference.

If you have money in your PayPal account, withdraw it immediately. If you use it as your primary account, find a different bank immediately. And if you’re in the same situation as I was for over 3 months, I’m truly sorry.</update>

<update date=”original (2020-05-26)”>They immediately responded with almost exactly what I predicted in the “What shouldn’t I do?” section. I’ve updated the counts below, we’re now up to 11 different PayPal representatives.

Seeing this was going nowhere, I filed a Better Business Bureau complaint against PayPal. I let PayPal know, and turns out they have a form reply specific to when people tell them they’ve filed BBB complaints. I don’t know why I found that surprising; must happen a lot.

This form reply stated they will wait to receive the complaint until they do anything. To be clear, filing a BBB complaint does not preclude PayPal from fixing the problem they caused until they receive the complaint. They are perfectly capable of fixing their problem, but are now explicitly refusing to, instead of leading me on. At least this is closure, in the sense that they can stop lying to me regularly and repeatedly.

And in case this goes viral, I am open to media inquiries. Yes, I have a grudge.</update>

First of all, I want to apologize to the average reader reading this. I know most people treat these sorts of grudge posts as the digital equivalent of putting your hand at the side of your lowered head and walking by without making eye contact. I get it, and I’m sorry for making you uncomfortable. Instead, this post is intended to reach someone from one of the world’s largest financial institutions.

Short version: About two months ago, PayPal changed it so I can no longer access my account. Since then, I’ve had ~~NINE~~ ELEVEN PayPal representatives “try” and fail to help me, and have been lied to multiple times.

It used to be that when I logged in, it would ask security questions (mother’s maiden name, last 4 SSN, etc). Then they changed it so it tries to send a verification text to a cell number I haven’t had in years. This cell number I had explicitly removed from my profile, again, years ago.

When I click “Having trouble logging in?”, it says “Sorry, we couldn’t confirm it’s you. Need a hand? We can help.” The “we can help” link takes me to their knowledge base, which doesn’t address this scenario.

Any attempt to find a “contact us” link results in being asked to log in first. I explained the situation to @AskPayPal on Twitter, and they replied with:

Hi there! Thank you for reaching out to us via Twitter. We are sorry to hear that you are unable to access your account. Please send us a DM with your registered email address. We’d be happy to help.

Their “help” consisted of showing me exactly what sequence to click on the web site to be able to get to chat support without being logged in. It’s… not intuitive, and definitely designed to discourage people from finding it.

So, chat support. First two attempts involved them trying to send password resets. I explained no, my password isn’t the issue, it’s this old phone number that they’re trying to send a text to all of a sudden. I finally got someone who actually looked at my account, confirmed and understood the issue… and refused to help further. Login issues like that must be done over the phone, and their phone support is closed indefinitely. And then he immediately ended the chat session.

Now, here’s where the proper psychological abuse begins. I’ve since come to realize that @AskPayPal is not customer support. It’s brand reputation protection. They look for situations where their brand is being tarnished, and do their best not to remedy the situation, but to sweep it under the rug. They can’t look at accounts, or fix accounts, or do anything but quietly and discreetly direct the brand problem somewhere else where maybe it will be helped, but honestly who cares at that point. The brand reputation problem has been mitigated. It’s possible they’re not even PayPal employees, but instead staffed by a marketing subcontractor.

I posted on my Twitter timeline about the chat experience, and that I’ve basically given up. A few days later, they replied publicly:

Hi there, thanks for getting in touch with us. We have responded to the Direct message. Please check. Thanks for your patience. ^MJD

And then the cycle begins again. They’ll ask a few token questions, and either stop responding, or tell me to go to chat support (which we’ve established will not help me). Any time I mention PayPal publicly, I get a response eventually, but it never goes anywhere. This has happened with ~~SIX~~ EIGHT different PayPal representatives (at least judging by the initials at the end; add the 3 chat support attempts and we’re at ~~9~~ 11). Twice they ended with promising to escalate to a “specialist”, but will then disappear.

Every time I feel like hope is properly lost and I can move on (did I mention I foolishly had over $700 in the account?), they’ll reply and ask to repeat the situation, or ask what happens when I try to log in, or suggest that if I contact chat support they can reset my password. With each response, I know that @AskPayPal cannot actually fix the problem PayPal created, but there’s still that bit of hope which makes me respond. But it ends the same each time, and honestly, I feel terrible.

What can I do?

So, you’re a member of the aforementioned brand protection team and have discovered this post. Is there anything that can be done to remedy this blight on the PayPal brand? Maybe!

  1. Read this post. Like, all those paragraphs above. They contain valuable information.
  2. Find someone who can actually get stuff done. Like, a programmer or someone. I assume since the web site is still up and one of the world’s largest financial institutions is still processing transactions (for accounts which haven’t been locked out, at least), there may still be people working for PayPal. On the chance that you’re actually a third party contractor, pick up the phone and call your escalation contact for dealing with problem customers like me.
  3. Okay, have we got someone who can make a difference? Excellent. Now, find my account (it’s not difficult to guess my email; my name is Ryan Finnie and you’re reading a post on finnie.org), remove the 2FA option you somehow added for a cell phone which isn’t even part of my profile (it ends in 72), and set it back so I can answer security questions when I try to log in.
  4. Optionally, if you want to have a chat, feel free to call my home number. It’s the primary number on my profile. Ends in 69. It’s actually a pretty cool looking number, and initially makes you think “wait, is that a fake phone number?” But no, it’s real.

What shouldn’t I do?

Please don’t reply with “I’m sorry you’ve had a bad experience! Please send us a direct message with more information, and we would be happy to help you.” That would be abusive and would make me more angry.

But you’re going to do that anyway. That’s exactly what they did. Gotta protect the brand.

Coda

I may have been a little irreverent there, but make no mistake, this is not something I wanted to write. I do not feel good writing this, and I just want the problem resolved. I’m currently out over $700, I cannot pay friends (which is how this all started), or buy on eBay, or buy from small businesses which only accept PayPal. I don’t want a 10th encounter with PayPal support, or an 11th, or… I’ve also left some parts out, as detailing each of the 9 attempts at service would make this post even longer. Suffice it to say, #9 was so incredibly tone deaf that it prompted me to write this post.
