Ryan Finnie

Better-Assembled Access Tokens

A few years ago, GitHub changed the format of their access tokens from a plain hexadecimal string, indistinguishable from a SHA-1 hash, to a format with a human-identifiable prefix and built-in checksumming, so a program can identify a string as a GitHub token. This is useful for determining whether, for example, an access token was accidentally committed into a repository. I welcomed the change, but recently wanted an agnostic version which could be used in other systems.

Enter: Better-Assembled Access Tokens (BAAT). The token format looks like so:

bat_pfau4bdvkqwmwwur2bjo2q2squjeld5fafgyk5sd
bat_3udmmr57bglierumrjxjxrkiv3nydd5faebohhgn
bat_bbzz6q4rnbnu6tkujrb73vhfuk6pdd5fafme5kq5

“bat” is the prefix and can be any lowercase alphanumeric string, but should be between 2 and 5 characters.

The other part – the wrapped data – contains a payload of 144 bits (18 bytes), a magic number and version identifier, and a checksum. This payload size allows for a full 16-byte UUID, with 2 bytes left for additional control data if needed.

The checksum includes all of the data, including the prefix (which is not a feature of GitHub’s tokens), and the fact that it has a binary magic number means a BAAT can be identified programmatically, no matter the prefix chosen by the application. A BAAT is canonically all lowercase, but can handle being case-corrupted in transit.
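As an aside, a scanner could pre-filter arbitrary text with a loose regular expression and then confirm each candidate with a real parse. A rough sketch, using the is_baat() helper from the sample implementation below:

import re

# 2-5 character lowercase alphanumeric prefix, "_", then 40 base32 characters
# (the 25-byte wrapped data); IGNORECASE tolerates case-corrupted tokens.
BAAT_CANDIDATE = re.compile(r"\b[a-z0-9]{2,5}_[a-z2-7]{40}\b", re.IGNORECASE)

def find_baats(text):
    return [m.group(0) for m in BAAT_CANDIDATE.finditer(text) if is_baat(m.group(0))]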

A sample Python implementation is below, but the general specification for BAAT is:

  • If the binary payload is under 18 bytes, pad it to 18 bytes with trailing zero bytes
  • CRC32 the prefix + payload + magic number (\x8f\xa5) + version (\x01)
  • Assemble the wrapped data as the base32 encoding of payload + magic number + version + CRC (4 bytes, big-endian)
  • Assemble the final BAAT as the prefix + “_” + the wrapped data

And to verify a BAAT:

  • Split the string into prefix and wrapped data by “_”
  • Base32 decode the wrapped data and verify it’s at least 7 bytes
  • Verify that the 2 bytes starting at position -7 are \x8f\xa5
  • Verify the byte at position -5 is \x01 for version 1 (currently the only version, but doesn’t hurt to future-proof – the rest of the process assumes a version 1 BAAT)
  • Verify the wrapped data is 25 bytes
  • Extract the payload as the 18 bytes at position 0 (the beginning), and the checksum as the 4 bytes at position -4 (the end)
  • Verify the checksum as the CRC32 of prefix + payload + magic number + version

This specification is open; feel free to use it in your implementations!

If you’re wondering why the magic number and version are in the middle of the wrapped data instead of at the front as is normal for a data format (thus requiring some additional positional math), it’s because a front-placed magic number would show up as the same static run of characters at the start of every token’s wrapped data in a list of multiple BAATs. Placing the payload at the beginning and the checksum at the end allows a human to quickly pattern match: “oh, this is the ‘3ud’ token, not the ‘pfa’ token”.

If you’re wondering why the payload is 18 bytes, it’s because BAAT uses base32 for the encoding, which pads its output with trailing equal signs unless the input is a multiple of 5 bytes. 20 input bytes would have encoded without padding, allowing for a 16-byte payload and 4-byte checksum. But I wanted a 2-byte magic number, and the next multiple without padding is 25 bytes, so the extra 3 bytes went to a 1-byte version and 2 additional payload bytes.
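A quick demonstration of that padding behavior:

from base64 import b32encode

print(b32encode(bytes(16)))  # 16 bytes: 32 characters, ends in "======" padding
print(b32encode(bytes(20)))  # 20 bytes: 32 characters, no padding
print(b32encode(bytes(25)))  # 25 bytes: 40 characters, no padding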

# Better-Assembled Access Tokens
# SPDX-FileCopyrightText: Copyright (C) 2023 Ryan Finnie
# SPDX-License-Identifier: MIT

from base64 import b32decode, b32encode
from random import randint
from zlib import crc32


class BAATError(ValueError):
    pass


def make_baat(prefix="bat", payload=None):
    magic = b"\x8f\xa5"
    baat_ver = b"\x01"
    if payload is None:
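        # Note: random.randint is fine for demonstration purposes, but a real
        # token generator would likely want secrets.token_bytes(18) instead.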
        payload = bytes([randint(0, 255) for _ in range(18)])
    elif len(payload) > 18:
        raise BAATError("Payload too large")
    elif len(payload) < 18:
        payload = payload + bytes(18 - len(payload))
    prefix = prefix.lower()
    crc = crc32(prefix.encode("utf-8") + payload + magic + baat_ver) & 0xFFFFFFFF
    wrapped_data_b32 = b32encode(payload + magic + baat_ver + crc.to_bytes(4, "big"))
    return (prefix + "_" + wrapped_data_b32.decode("utf-8")).lower()


def parse_baat(baat):
    parts = baat.split("_")
    if len(parts) != 2:
        raise BAATError("Malformed")
    prefix = parts[0].lower()
    wrapped_data = b32decode(parts[1].upper())
    if len(wrapped_data) < 7:
        raise BAATError("Impossible length")
    magic = wrapped_data[-7:-5]
    baat_ver = wrapped_data[-5:-4]
    if magic != b"\x8f\xa5":
        raise BAATError("Invalid magic number")
    if baat_ver != b"\x01":
        raise BAATError("Invalid BAAT version")
    if len(wrapped_data) != 25:
        raise BAATError("Wrong length")
    payload = wrapped_data[0:18]
    crc = crc32(prefix.encode("utf-8") + payload + magic + baat_ver) & 0xFFFFFFFF
    if wrapped_data[-4:] != crc.to_bytes(4, "big"):
        raise BAATError("Invalid CRC")
    return payload


def is_baat(baat):
    try:
        parse_baat(baat)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    payload = bytes([randint(0, 255) for _ in range(18)])
    baat = make_baat("bat", payload)
    print(baat)
    parsed_payload = parse_baat(baat)
    assert is_baat(baat)
    assert parsed_payload == payload
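A couple of quick checks exercise the properties described earlier: case corruption is tolerated, while a swapped prefix no longer matches the checksum.

token = make_baat("bat")
assert is_baat(token.upper())          # case-corrupted in transit, still valid
assert not is_baat("tab" + token[3:])  # different prefix, CRC no longer matches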

ChatGPT unsettled me

Tom Scott recently put out a video where he had a “minor existential crisis” after giving ChatGPT a coding task. His conclusion was basically that this works better than it should, and that’s unsettling. After watching it, I had my own minor coding task which I decided to give to ChatGPT, and, spoiler alert, I am also unsettled.

The problem I needed to solve was that I have an old Twitter bot which had automatically followed a bunch of people over the years, and I wanted to clear out those follows. As of this writing, Twitter’s API service inexplicably seems to still exist, but the single-purpose OAuth “app” associated with that account was for API v1.1, not v2, so I needed to use API v1.1 calls.

I’d done a lot of Twitter API work over the years, and a lot of that was through Python, so I was ready to kitbash something together using existing code snippets. But let’s see what ChatGPT would do if given the opportunity:

Write a script in Python to use the Twitter API v1.1 to get a list of all friends and then unsubscribe from them

And yeah, it created a correctly formatted, roughly 25-line Python script to do exactly this. It even gave a warning that API access requires authentication, and, amusingly, that unsubscribing from all friends would affect the account’s “social reach”.

(I’m summarizing its responses here; a full chat log, including code at every step, is available at the end of this post.)

One drawback in this specific situation was that it wrote the script to use the tweepy library, which I had never heard of and wasn’t sure was using API v1.1 (though I suspected it was, from the library function destroy_friendship(); “friendships” are verbs in v1.1 but not v2). Nonetheless, I was more familiar with requests_oauthlib and the direct API endpoints, so I just asked ChatGPT to rewrite it to use that.

Can we use the requests_oauthlib library instead of tweepy?

Sure enough, it produced exactly what I wanted, and I ended up using it for my task.

Everything beyond this point was “what-ifs” to poke at ChatGPT. The first thing I noticed was that it was using a less efficient API endpoint. Thinking back to Tom’s video, where he realized he could simply ask ChatGPT why it did something a certain way, I realized I could just say:

That works, but the friends/ids.json endpoint allows for 5000 results per request, versus 200 on friends/list.json as you pointed out. Let’s use friends/ids.json instead.

ChatGPT’s response was basically “yep, I agree that’s more efficient; here’s an updated script!”, utilizing the new endpoint and specifying the new 5000 user limit.

This was a subtle test, since the endpoint I suggested is very similar to the old one but not a drop-in replacement; you need to make a few minor changes elsewhere in the script to use it. ChatGPT passed the test, updating both the endpoint name and the surrounding code that needed to change.

I’m using Python 3.5 and can’t use f-strings. Can you rewrite the code to use string format() instead?

I’m not actually using Python 3.5, but I wanted to ask this as f-strings and format() are very different looking ways to format strings. It rewrote the script correctly.

Can you package this up to run in a GitHub Actions workflow?

It gave me the Python script it had been working on up until this point, plus a workflow YAML file, along with instructions on where to put them. The workflow format was correct, and, impressively, it remembered my off-hand mention of Python 3.5 from earlier (even though the context of that was a specific feature, not Python 3.5 itself) and made sure to use 3.5 when setting up actions/setup-python@v2. It also knew that requests and requests_oauthlib were not stdlib, so it had the workflow install them through pip.

The workflow did correctly pull the secrets from the GitHub Actions secret store and pass them as environment variables, but up until this point, the script itself still had hardcoded placeholders. So I asked ChatGPT to actually use the environment variables.

The workflow file is correctly passing CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN and ACCESS_TOKEN_SECRET secrets as environment variables to twitter_unfollow.py, but twitter_unfollow.py still has the placeholder variables. Can you modify twitter_unfollow.py to use the environment variables?

Again, it did this without a problem.

In conclusion, I am impressed by ChatGPT’s performance, to the extent of being unsettled by the experience. There are many logical, legal and ethical problems with AI generation and I don’t want to minimize those concerns, but I just wanted to document my limited experience with using it for a coding exercise.

(The irony of using current-generation AI to fiddle with the account of X11R5, an old MegaHAL Markov bot which is decidedly more ancient AI, is also not lost on me.)


Complete chat log

Here is the entire session log between ChatGPT and myself:

Write a script in Python to use the Twitter API v1.1 to get a list of all friends and then unsubscribe from them

Before proceeding with the code, please note that the Twitter API v1.1 requires authentication, which means you need to have valid credentials to access the API. Also, please be aware that unsubscribing from all friends can have significant consequences on your account’s social reach, and it should not be done lightly.

To use the Twitter API v1.1 in Python, we need to first install the tweepy library. We can use the following code to install it:

pip install tweepy

Once tweepy is installed, we can use the following script to get a list of all friends and then unsubscribe from them:

import tweepy

# set up Twitter API credentials
consumer_key = "your_consumer_key"
consumer_secret = "your_consumer_secret"
access_token = "your_access_token"
access_token_secret = "your_access_token_secret"

# authenticate with the Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# create an API object
api = tweepy.API(auth)

# get a list of all friends
friends = api.friends_ids()

# loop through the list of friends and unsubscribe from them
for friend in friends:
    api.destroy_friendship(friend)

In this script, we first set up our Twitter API credentials, then authenticate with the Twitter API using tweepy. We then create an API object and use the friends_ids() method to get a list of all friends.

Finally, we loop through the list of friends and use the destroy_friendship() method to unsubscribe from each friend. Note that this is a permanent action, so use it with caution.

Can we use the requests_oauthlib library instead of tweepy?

Yes, we can use the requests_oauthlib library instead of tweepy to interact with the Twitter API v1.1 in Python. Assuming you have valid credentials and have installed the requests_oauthlib library, here’s an example script to get a list of all friends and then unsubscribe from them using requests_oauthlib:

import requests
from requests_oauthlib import OAuth1Session

# replace with your own credentials
consumer_key = 'your consumer key'
consumer_secret = 'your consumer secret'
access_token = 'your access token'
access_token_secret = 'your access token secret'

# set up OAuth1Session
twitter = OAuth1Session(consumer_key, client_secret=consumer_secret,
                        resource_owner_key=access_token, resource_owner_secret=access_token_secret)

# get a list of all friends
url = 'https://api.twitter.com/1.1/friends/list.json'
params = {'count': 200}
response = twitter.get(url, params=params)
friend_list = response.json()['users']

# unsubscribe from each friend
for friend in friend_list:
    url = f'https://api.twitter.com/1.1/friendships/destroy.json?user_id={friend["id"]}'
    response = twitter.post(url)
    if response.status_code == 200:
        print(f'Unsubscribed from {friend["screen_name"]}')
    else:
        print(f'Failed to unsubscribe from {friend["screen_name"]}')

In this script, we first set up an OAuth1Session object with our Twitter API credentials. We then use the get method to retrieve a list of all friends (limited to 200 per request, as per the API’s limitations). We extract the list of friends from the JSON response using the json method.

We then loop through each friend in the list and use the post method to unsubscribe from each friend by making a request to the friendships/destroy.json endpoint with the friend’s ID as a parameter. We print a message indicating whether the unsubscribe request was successful or not.

That works, but the friends/ids.json endpoint allows for 5000 results per request, versus 200 on friends/list.json as you pointed out. Let’s use friends/ids.json instead.

You’re right, using the friends/ids.json endpoint would allow us to retrieve up to 5000 friend IDs per request, which would be more efficient than using the friends/list.json endpoint. Here’s an updated script that uses the friends/ids.json endpoint:

import requests
from requests_oauthlib import OAuth1Session

# replace with your own credentials
consumer_key = 'your consumer key'
consumer_secret = 'your consumer secret'
access_token = 'your access token'
access_token_secret = 'your access token secret'

# set up OAuth1Session
twitter = OAuth1Session(consumer_key, client_secret=consumer_secret,
                        resource_owner_key=access_token, resource_owner_secret=access_token_secret)

# get a list of all friend IDs
url = 'https://api.twitter.com/1.1/friends/ids.json'
params = {'count': 5000}
response = twitter.get(url, params=params)
friend_ids = response.json()['ids']

# unsubscribe from each friend
for friend_id in friend_ids:
    url = f'https://api.twitter.com/1.1/friendships/destroy.json?user_id={friend_id}'
    response = twitter.post(url)
    if response.status_code == 200:
        print(f'Unsubscribed from friend ID {friend_id}')
    else:
        print(f'Failed to unsubscribe from friend ID {friend_id}')

In this updated script, we use the friends/ids.json endpoint to retrieve a list of all friend IDs, with a maximum count of 5000 per request. We extract the list of friend IDs from the JSON response using the json method.

We then loop through each friend ID in the list and use the post method to unsubscribe from each friend by making a request to the friendships/destroy.json endpoint with the friend’s ID as a parameter. We print a message indicating whether the unsubscribe request was successful or not.

I’m using Python 3.5 and can’t use f-strings. Can you rewrite the code to use string format() instead?

Sure! Here’s the same code using string format() instead of f-strings:

import requests
from requests_oauthlib import OAuth1Session

# replace with your own credentials
consumer_key = 'your consumer key'
consumer_secret = 'your consumer secret'
access_token = 'your access token'
access_token_secret = 'your access token secret'

# set up OAuth1Session
twitter = OAuth1Session(consumer_key, client_secret=consumer_secret,
                        resource_owner_key=access_token, resource_owner_secret=access_token_secret)

# get a list of all friend IDs
url = 'https://api.twitter.com/1.1/friends/ids.json'
params = {'count': 5000}
response = twitter.get(url, params=params)
friend_ids = response.json()['ids']

# unsubscribe from each friend
for friend_id in friend_ids:
    url = 'https://api.twitter.com/1.1/friendships/destroy.json?user_id={}'.format(friend_id)
    response = twitter.post(url)
    if response.status_code == 200:
        print('Unsubscribed from friend ID {}'.format(friend_id))
    else:
        print('Failed to unsubscribe from friend ID {}'.format(friend_id))

In this updated code, we use format() to insert the friend ID into the string url and the message strings.

Can you package this up to run in a GitHub Actions workflow?

Sure, here’s how you could package the code in a GitHub Actions workflow:

  1. Create a new Python file in your repository called twitter_unfollow.py with the following content:
import requests
from requests_oauthlib import OAuth1Session

# replace with your own credentials
consumer_key = 'your consumer key'
consumer_secret = 'your consumer secret'
access_token = 'your access token'
access_token_secret = 'your access token secret'

# set up OAuth1Session
twitter = OAuth1Session(consumer_key, client_secret=consumer_secret,
                        resource_owner_key=access_token, resource_owner_secret=access_token_secret)

# get a list of all friend IDs
url = 'https://api.twitter.com/1.1/friends/ids.json'
params = {'count': 5000}
response = twitter.get(url, params=params)
friend_ids = response.json()['ids']

# unsubscribe from each friend
for friend_id in friend_ids:
    url = 'https://api.twitter.com/1.1/friendships/destroy.json?user_id={}'.format(friend_id)
    response = twitter.post(url)
    if response.status_code == 200:
        print('Unsubscribed from friend ID {}'.format(friend_id))
    else:
        print('Failed to unsubscribe from friend ID {}'.format(friend_id))
  2. Create a new file in your repository called .github/workflows/twitter_unfollow.yml with the following content:
name: Twitter Unfollow

on:
  push:
    branches: [ main ]
  schedule:
    - cron: '0 0 * * *'  # run every day at midnight

jobs:
  unfollow:
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2

    - name: Set up Python 3.5
      uses: actions/setup-python@v2
      with:
        python-version: 3.5

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install requests requests_oauthlib

    - name: Unfollow Twitter friends
      run: python twitter_unfollow.py
      env:
        CONSUMER_KEY: ${{ secrets.CONSUMER_KEY }}
        CONSUMER_SECRET: ${{ secrets.CONSUMER_SECRET }}
        ACCESS_TOKEN: ${{ secrets.ACCESS_TOKEN }}
        ACCESS_TOKEN_SECRET: ${{ secrets.ACCESS_TOKEN_SECRET }}

This workflow runs the twitter_unfollow.py script every day at midnight, using Python 3.5. It installs the requests and requests_oauthlib packages using pip, and sets the CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, and ACCESS_TOKEN_SECRET environment variables using GitHub Secrets. Be sure to replace the placeholder strings in the workflow file with the actual values of your credentials.

The workflow file is correctly passing CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN and ACCESS_TOKEN_SECRET secrets as environment variables to twitter_unfollow.py, but twitter_unfollow.py still has the placeholder variables. Can you modify twitter_unfollow.py to use the environment variables?

Sure, you can modify twitter_unfollow.py to use the environment variables like this:

import os
import requests
from requests_oauthlib import OAuth1Session

# get credentials from environment variables
consumer_key = os.environ['CONSUMER_KEY']
consumer_secret = os.environ['CONSUMER_SECRET']
access_token = os.environ['ACCESS_TOKEN']
access_token_secret = os.environ['ACCESS_TOKEN_SECRET']

# set up OAuth1Session
twitter = OAuth1Session(consumer_key, client_secret=consumer_secret,
                        resource_owner_key=access_token, resource_owner_secret=access_token_secret)

# get a list of all friend IDs
url = 'https://api.twitter.com/1.1/friends/ids.json'
params = {'count': 5000}
response = twitter.get(url, params=params)
friend_ids = response.json()['ids']

# unsubscribe from each friend
for friend_id in friend_ids:
    url = 'https://api.twitter.com/1.1/friendships/destroy.json?user_id={}'.format(friend_id)
    response = twitter.post(url)
    if response.status_code == 200:
        print('Unsubscribed from friend ID {}'.format(friend_id))
    else:
        print('Failed to unsubscribe from friend ID {}'.format(friend_id))

In this updated code, we use the os.environ dictionary to get the values of the CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, and ACCESS_TOKEN_SECRET environment variables.

IPv6 Prefix Delegation on Spectrum with dhclient

A few years ago, Spectrum (a US cable company formed from the combination of Charter and Time Warner Cable) started offering IPv6 Prefix Delegation (PD) /64s. The device connected to the cable modem will normally get an individual global dynamic IPv6 address via Router Advertisement, but Prefix Delegation is essentially the ability to request that an entire network be routed to you.

I used to live in Reno in a formerly Charter network, but recently moved to Southern California in a formerly Time Warner network, so I’m confident this information applies to all Spectrum regions. The dhclient invocation should work for any provider which supports Prefix Delegation, but the lease behavior I describe is probably not universal.

Here’s the systemd dhclient6-pd.service file on my router, a Raspberry Pi 4 connected directly to the cable modem. Replace eext0 with your external interface name.

[Unit]
Description=IPv6 PD lease reservation
Wants=network-online.target
After=network-online.target
StartLimitIntervalSec=0

[Service]
Restart=always
RestartSec=30
ExecStart=/sbin/dhclient -d -6 -P -v -lf /var/lib/dhcp/dhclient6-pd.leases eext0

[Install]
WantedBy=multi-user.target

Once running, dhclient6-pd.leases should give you something like this:

default-duid "\000\001\000\001#\225\311g\000\006%\243\332{";
lease6 {
  interface "eext0";
  ia-pd 25:a3:da:7b {
    starts 1628288112;
    renew 1800;
    rebind 2880;
    iaprefix 2600:6c51:4d00:ff::/64 {
      starts 1628288112;
      preferred-life 3600;
      max-life 3600;
    }
  }
  option dhcp6.client-id 0:1:0:1:23:95:c9:67:0:6:25:a3:da:7b;
  option dhcp6.server-id 0:1:0:1:4b:73:43:3a:0:14:4f:c3:f6:90;
  option dhcp6.name-servers 2607:f428:ffff:ffff::1,2607:f428:ffff:ffff::2;
}

So now I can see that 2600:6c51:4d00:ff::/64 is routable to me, and can set up network addresses and services. dhclient could be set up to run scripts on trigger events, but in this current state it just keeps the PD reservation.

But… max-life 3600? Does that mean I’ll lose the PD if dhclient doesn’t check in within an hour? What if I have a power outage? Yes, you will lose the PD after an hour if dhclient isn’t running… for now. After a few renewals, the far end will trust that your initial PD request wasn’t a drive-by, and will up the period from 1 hour to 7 days, and dhclient6-pd.leases will look like this:

default-duid "\000\001\000\001#\225\311g\000\006%\243\332{";
lease6 {
  interface "eext0";
  ia-pd 25:a3:da:7b {
    starts 1628288112;
    renew 1800;
    rebind 2880;
    iaprefix 2600:6c51:4d00:ff::/64 {
      starts 1628288112;
      preferred-life 3600;
      max-life 3600;
    }
  }
  option dhcp6.client-id 0:1:0:1:23:95:c9:67:0:6:25:a3:da:7b;
  option dhcp6.server-id 0:1:0:1:4b:73:43:3a:0:14:4f:c3:f6:90;
  option dhcp6.name-servers 2607:f428:ffff:ffff::1,2607:f428:ffff:ffff::2;
}
lease6 {
  interface "eext0";
  ia-pd 25:a3:da:7b {
    starts 1628291743;
    renew 300568;
    rebind 482008;
    iaprefix 2600:6c51:4d00:ff::/64 {
      starts 1628291743;
      preferred-life 602968;
      max-life 602968;
    }
  }
  option dhcp6.client-id 0:1:0:1:23:95:c9:67:0:6:25:a3:da:7b;
  option dhcp6.server-id 0:1:0:1:4b:73:43:3a:0:14:4f:c3:f6:90;
  option dhcp6.name-servers 2607:f428:ffff:ffff::1,2607:f428:ffff:ffff::2;
}

(The last lease6 block is the most recent lease received; its max-life of 602968 seconds is just under 7 days.)

As far as I can tell, this 7-day PD can be renewed indefinitely; I was using the same network for nearly 2 years. But be warned: max-life is final. If you have a misconfiguration and dhclient doesn’t check in for a week, Spectrum will release your PD as soon as the 7 days are up, and your client will then receive a completely new /64.
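For some peace of mind, a rough sketch along these lines could be run from cron to warn before the lease gets close to its max-life (the lease file path matches the dhclient invocation above; the one-day threshold is an arbitrary choice):

#!/usr/bin/env python3
import re
import time

LEASE_FILE = "/var/lib/dhcp/dhclient6-pd.leases"
WARN_SECONDS = 86400  # complain with less than a day of max-life remaining

with open(LEASE_FILE) as f:
    leases = f.read()

# Each iaprefix block has a "starts" epoch and a "max-life" in seconds;
# the last block in the file belongs to the most recent lease.
blocks = re.findall(r"iaprefix (\S+) \{(.*?)\}", leases, re.DOTALL)
prefix, body = blocks[-1]
starts = int(re.search(r"starts (\d+);", body).group(1))
max_life = int(re.search(r"max-life (\d+);", body).group(1))

remaining = starts + max_life - time.time()
if remaining < WARN_SECONDS:
    print("WARNING: PD {} expires in {:.1f} hours".format(prefix, remaining / 3600))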


Since this is fresh in my mind from setting up my new home, here are a few things to set up on your core router, but this is not meant to be an exhaustive IPv6 Linux router guide.

The external interface automatically gets a global dynamic v6 address; as for the internal interface, while you technically don’t need a static address thanks to link-local routing, in practice you should give it one. Here’s my /etc/systemd/network/10-eint0.network:

[Match]
Name=eint0

[Network]
Address=10.9.8.1/21
Address=2600:6c51:4d00:ff::1/64
Address=fe80::1/128
IPv6AcceptRA=false
IPForward=true

You’ll also want an RA daemon for the internal network. My /etc/radvd.conf:

interface eint0 {
  IgnoreIfMissing on;
  MaxRtrAdvInterval 2;
  MinRtrAdvInterval 1.5;
  AdvDefaultLifetime 9000;
  AdvSendAdvert on;
  AdvManagedFlag on;
  AdvOtherConfigFlag on;
  AdvHomeAgentFlag off;
  AdvDefaultPreference high;
  prefix 2600:6c51:4d00:ff::/64 {
    AdvOnLink on;
    AdvAutonomous on;
    AdvRouterAddr on;
    AdvValidLifetime 2592000;
    AdvPreferredLifetime 604800;
  };
  RDNSS 2600:6c51:4d00:ff::1 {
  };
};

And a DHCPv6 server. My /etc/dhcp/dhcpd6.conf, providing information about DNS and DHCP-assigned addressing (in addition to the RA autoconfiguration):

default-lease-time 2592000;
preferred-lifetime 604800;
option dhcp-renewal-time 3600;
option dhcp-rebinding-time 7200;
allow leasequery;
option dhcp6.preference 255;
option dhcp6.rapid-commit;
option dhcp6.info-refresh-time 21600;
option dhcp6.name-servers 2600:6c51:4d00:ff::1;
option dhcp6.domain-search "snowman.lan";

subnet6 2600:6c51:4d00:ff::/64 {
  range6 2600:6c51:4d00:ff::c0c0:0 2600:6c51:4d00:ff::c0c0:ffff;
}

host workstation {
  host-identifier option dhcp6.client-id 00:01:00:01:21:37:85:10:01:23:45:ab:cd:ef;
  fixed-address6 2600:6c51:4d00:ff::2;
}

The Repository Run-Parts CI Directory (RRPCID) specification

Years ago, I wrote dsari (“Do Something And Record It”), a lightweight CI system. This was prompted by administering Jenkins installations for multiple development groups at the time, each environment having increasingly specialized (and often incompatible) plugins layered onto the core functionality.

This led me to take the opposite approach. I made a CI system based on one executable (usually a script) per job, and the assumption that you, the CI job developer, know exactly what functionality you want. Want custom notifications? Write it into the script. Sub-job triggers based on the result of the run? You can totally do that. Remote agents? Bah, just tell the script to ssh to a remote system based on the concurrency group the run is currently in. dsari’s acronym was a light-hearted take on this simplicity.

Fast forward to now, and GitHub’s CI has quickly become ubiquitous. But before that, Travis essentially pushed the idea of in-repository CI definitions, as opposed to a CI job being built around the repository as in Jenkins. As an example, finnix-live-build has a GitHub workflow which makes a test build, but I also have multiple dsari instances at home, for different architectures, doing the same thing on a schedule.

However, the dsari job script merely replicates the build process of the GitHub workflow. If I add new functionality to the GitHub workflow, I need to also update build scripts on 5 different machines. This would be great to move in-repository, but I quickly found there is no established general-purpose in-repository CI layout.

So let’s make one!

If the closest simplification to the Jenkins CI model is cron, the closest simplification to the GitHub CI model is run-parts. However, since run-parts has different functionality on different systems (with Debian’s implementation currently being the most versatile), the “Run-Parts” part of the RRPCID acronym is in spirit only (though you could use Debian’s run-parts --exit-on-error for the job processing part of the RRPCID logic).

Here’s the specification I came up with:

  • A workflow is a collection of jobs, and is a readable directory under .rrpcid/workflows/.
  • A job is a collection of actions, and is a readable directory under ${workflow_dir}/jobs/.
  • An action is an executable file under ${job_dir}/. In theory this can be anything, but is likely to be a shell script.
  • Actions are executed with the repository as the current working directory.
  • Actions are executed with the environment variable CI=true. Other environment variables may be passed in from the underlying CI manager.
  • Actions are executed in lexical sort order within the job directory.
  • Workflow, job and action names must only contain letters a through z and A through Z, numbers 0 through 9, and characters “-“ (dash) and “_” (underscore). Note specifically the lack of “.” (period).
  • A repository may have multiple workflows, a workflow may have multiple jobs, and a job may have multiple actions.
  • If an action exits with a status other than 0, further actions in a job are skipped.
  • All jobs in a workflow are run, regardless of whether other jobs’ actions have failed.
  • Actions within the job must not assume another job has previously run.
  • An implicit, unnamed workflow lives directly in .rrpcid/.
  • Whether all, some or none of the repository’s workflows are run is up to the CI manager; that logic is outside the scope of this specification.
  • All other files and directories are ignored. For example, a directory named .rrpcid/testdata/ is outside the scope of this specification, and would not be handled.
  • A recommended directory for generated artifacts is artifacts/ within the workflow directory, but a CI manager is not required to do anything with this.

A script layout utilizing multiple workflows (including the implicit unnamed workflow) and multiple jobs might look like this:

.rrpcid/jobs/ci/action_1
.rrpcid/jobs/ci/action_2
.rrpcid/jobs/lint/run-lint
.rrpcid/workflows/deploy/jobs/env1/deploy
.rrpcid/workflows/deploy/jobs/env2/deploy
.rrpcid/workflows/deploy/jobs/archive/01tar
.rrpcid/workflows/deploy/jobs/archive/02upload

The following shell code satisfies the above requirements, assuming it’s being run from dash or bash (both sort “*” glob matches, which is needed for action ordering within the job directory; other Bourne-style shells may not), but it is by no means the only way to implement an RRPCID processor.

export CI=true
run_workflow() {
    for job_dir in "${1}/jobs"/*; do
        [ -d "${job_dir}" ] || continue
        [ -x "${job_dir}" ] || continue
        [ -z "$(basename "${job_dir}" | sed 's/[a-zA-Z0-9_-]//g')" ] || continue
        for action in "${job_dir}"/*; do
            [ -x "${action}" ] || continue
            [ -z "$(basename "${action}" | sed 's/[a-zA-Z0-9_-]//g')" ] || continue
            "${action}" || break
        done
    done
}

run_workflow .rrpcid
for workflow_dir in .rrpcid/workflows/*; do
    [ -d "${workflow_dir}" ] || continue
    [ -x "${workflow_dir}" ] || continue
    [ -z "$(basename "${workflow_dir}" | sed 's/[a-zA-Z0-9_-]//g')" ] || continue
    run_workflow "${workflow_dir}"
done
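For comparison, a roughly equivalent processor in Python might look like this; again, it’s just a sketch of one possible implementation, and assumes it is run from the top of the repository:

import os
import re
import subprocess

NAME_RE = re.compile(r"^[A-Za-z0-9_-]+$")


def run_workflow(workflow_dir):
    jobs_dir = os.path.join(workflow_dir, "jobs")
    if not os.path.isdir(jobs_dir):
        return
    for job in sorted(os.listdir(jobs_dir)):
        job_dir = os.path.join(jobs_dir, job)
        if not (NAME_RE.match(job) and os.path.isdir(job_dir)):
            continue
        for action in sorted(os.listdir(job_dir)):
            action_path = os.path.join(job_dir, action)
            if not (
                NAME_RE.match(action)
                and os.path.isfile(action_path)
                and os.access(action_path, os.X_OK)
            ):
                continue
            # A failing action skips the rest of this job, but not other jobs
            if subprocess.call([action_path], env={**os.environ, "CI": "true"}):
                break


run_workflow(".rrpcid")
if os.path.isdir(".rrpcid/workflows"):
    for workflow in sorted(os.listdir(".rrpcid/workflows")):
        if NAME_RE.match(workflow):
            run_workflow(os.path.join(".rrpcid/workflows", workflow))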

finnix-live-build now has a simple RRPCID job, though as of this writing I have not yet switched over the home dsari jobs to utilize it.


Side note / rant: I went back and forth on whether to allow “.” as part of the names, specifically the action script names. Ignoring “.” has traditionally been the behavior of run-parts, cron.d, and friends, because it skips automatically-created files such as foo.bak, foo.swp, etc. However, I acknowledge that using extensions for executable scripts within a project (.sh, .py, etc.) is currently popular.

My answer to this? Stop Doing That. I can’t tell you how many times I’ve seen someone (including me) put do_cool_thing.sh into a repository, and over time it gets expanded to the point that it’s too complicated to be effectively managed as a shell script and is rewritten in, say, Python. The problem is that references to do_cool_thing.sh are now too entrenched, so you end up with do_cool_thing.sh which is actually a Python script (!!!), or do_cool_thing.sh which is just a wrapper call to do_cool_thing (if I learned my lesson) or do_cool_thing.py (if I didn’t).

Just drop the extension when creating a script. Shebangs exist for a reason. 😉

By the way, if you do find yourself in this migration situation, here’s a general-purpose redirect script for the old location:

#!/bin/sh

exec "$(dirname "$0")/$(basename "$0" .sh)" "$@"

Want to hire me? Let's talk careers.

Summary

Hello! I am a highly experienced Linux systems engineer, looking to work with the right team. If you are in need of a senior SRE with a focus on operational development, or a developer with a focus on design for infrastructure, here’s my resume; I’d love to talk with you!

Background

I left my previous employer last year, having planned to take a several month sabbatical. In a stroke of… interesting timing, my last day was the first week of March 2020. With COVID and lockdowns and the world in turmoil, I decided to extend my sabbatical and work on a bunch of personal projects.

A year has passed and I’m ready to join the career world again.

About me

My resume contains the essential details, but the nice thing about a blog post is it allows me to be more fluid. So let’s be fluid!

My immediate previous work experience was 8 years as a Site Reliability Engineer at Canonical, the company behind Ubuntu. Canonical was an 80% remote work company (and has been 100% remote since COVID), and I worked within a group of about 20 SREs, supporting the company’s operations as well as interacting directly with the open source community. While I have experience with many Linux distributions, suffice it to say I know Ubuntu inside and out.

I’ve been doing Linux systems engineering for over 20 years, and have worked with many people across the open source world. My claim to fame is Finnix, a bootable utility Linux distribution (LiveCD), geared toward system administration, rescue and recovery, etc. I am a Debian maintainer, an Ubuntu technical member, and additionally have packaging experience with Fedora and Homebrew.

While I can pick up nearly any programming language, I describe myself as a prolific Python programmer. For a portfolio example of my current Python ability, see rf-pymods, a collection of standalone helper modules. Docstrings on each function, 100% code coverage on unit tests, tox, GitHub workflows. For a more realistic example, see 2ping, a network investigation utility which was developed in 2010 and has been updated and maintained since. Tox, CI, test framework (but not (yet) 100% coverage), reasonable code and functional documentation.

Public cloud (I wrote the caching proxy software running the per-region Ubuntu mirrors on AWS, Azure and GCE). Private cloud (nearly a decade of OpenStack experience). Containers. Continuous integration (I even have my own lightweight CI system called dsari). The list goes on. And yes, I know Git.

About you

The top consideration I have with a potential employer is a healthy remote work lifestyle. I’ve been working from home for 10 years now, and recognize the strengths (and weaknesses) of a remote work setup. COVID caused many companies to shoehorn work-from-home into their existing business strategy on short notice, while I’ve been most pleased with companies that have remote collaboration as part of their DNA.

That being said, I’m looking for a senior SRE position with a focus on operational development. This can also take the form of a development role with a focus on design for infrastructure. In many ways, these are one and the same. Open source development and contribution are strongly preferred; I do most of my work in the open, and value companies which do the same.

I’ve had experience with startups and am not opposed to them, but would prefer a mid-sized established company or a late-stage startup. Industry is not as important as the people and the teams. I am based on the US west coast and have extensive experience working with geographically distributed teams.

Let’s talk

If you’re excited, here’s my resume, here’s my GitHub profile, go give Finnix a try, etc, then send me an email. I’d love to talk with you.
