Ryan Finnie

A brave new world of blogging

About a month ago, I converted my blogs over to Jekyll. (I’ve also made the leap and converted all of my web sites over to HTTPS, and upgraded my main colo box from Ubuntu 14.04 to 16.04, taking the opportunity to retire a ton of old PHP code during the upgrade from PHP 5 to PHP 7. It’s been a busy few weeks.)

Part of the process of converting the blogs to Jekyll was storing the site data and posts in Git repositories. This was mostly for my own benefit, and I certainly could have kept the repositories private, but instead chose to push them to GitHub (finnie.org, blog.finnix.org) to serve as examples for others who may wish to do the same thing.

At the bottom of each page, I do have a small “Git source” link to the corresponding page on GitHub. I didn’t expect anyone would really notice, and I certainly did not expect anyone to make a pull request against the repositories. So I was surprised when I saw this pull request, by someone who fixed the formatting on one of my imported posts.

Unexpected, but certainly welcome! So feel free to open Issues or PRs against those repositories, but I would discourage you from actually writing new posts and asking for them to be published via a PR.

As for writing posts, editing was an unexpected complication. While I spend most of my day in a terminal editor[0], which is fine for writing code and the occasional documentation, I found I wasn’t comfortable doing free-form writing there. I’ve simply been used to writing blog posts in a web browser for over 15 years (LiveJournal, then WordPress), with instant spell checking and a relatively quick preview available. While I’ve got a few things I want to write about, I found myself hesitating with the actual writing because of the editor situation.

I went looking for an editor I could use just for Markdown blog post writing, and eventually came across ReText. It’s primarily a Markdown editor which is good for this purpose, has built-in spell checking, Markdown highlighting, a dropdown list of common Markdown formatting, and most usefully, allows for side-by-side editing and live preview. This is the first post I’ve written completely with ReText, and while it’s not perfect, it’s made it easier for me to just write.

[0] Most of the time it’s nano, if you must know. Yes, get it out of your system. I’ve been using nano née pico since my first days on “the Internet”, a 1992 dialup BSD VAX account which had the full UW suite. So yes, I’ve been using nano since before some of you were born.

Sure, Let's Encrypt!

This weekend I converted all of my web sites to SSL (technically TLS), using Let’s Encrypt certificates.

For a few years, I had been using StartCom’s free SSL cert for a non-essential web site on my main server. StartCom was effectively the only viable free certificate authority at the time, with good browser support. CAcert wasn’t too useful since almost nothing ships with its CA certificates, while Comodo has free SSL certs which expire after 90 days, essentially a free demo.

I had been planning on tackling global SSL since earlier this year. The biggest hurdle is adoption of Server Name Indication, or SNI. Before SNI, you were limited to one SSL certificate per IP address, much like the time in the 90s before name-based virtual hosts. And like name-based virtual hosts, even after the technology was developed, it was slow to be adopted since name-based vhosts and SNI both require the web client to have support for them.

I had set a self-imposed delay until April 2017, with the 5-year EOL of Ubuntu 12.04 LTS (precise). While this was a good round number to stick to overall, the technical reason was that wget did not add SNI support until just after precise was released. After that, remaining incompatibilities would be at an acceptable level (basically just Windows XP).

This weekend I was playing around, and ordered a few new StartCom certs as a test for a few sites (finnie.org, vad.solutions), with the idea of setting up dual HTTP/HTTPS, and not switching over to HTTPS only until 2017. Things were working well, until I noticed when the iPhone browser tried to connect, it immediately dropped the connection. Likewise, when trying with Safari on a Mac, it came back with “This certificate has an invalid issuer”. Examining the cert path side-by-side with the old site (with a cert from February), both the old site and the new sites had identical issuer certificates, but Apple browsers were not trusting the new certs.

I spent hours trying to figure this out, thinking it must be an SNI-related server-side issue. Eventually I came across a post by Mozilla Security which explained the issue. Apparently StartCom Did a Bad (backdated cert issuances and lied about its organizational structure), and got the mark of death by the major browser vendors. However, the way the vendors did that was by continuing to trust certs issued before a certain date a few months ago, while anything issued after that is not trusted, which explains why my old site continued to work. Firefox and Chrome worked fine with the new certs, but that’s because their change has not yet hit released products, while Apple’s been quicker on the matter. (This also explains why StartCom has gone from free 1-year certs to free 3-year certs: they’re essentially worthless.)

So I started looking into Let’s Encrypt. I had known about the project since it was launched last year, but didn’t give it much thought until now. Primarily, I thought the project wouldn’t be too useful as a new CA would take years to get into the major browsers and OSes to the extent it would be viable. Also, it had sounded like I needed to run an agent which took over configuration of Apache for me, something I did not want to do.

Turns out I was wrong on both fronts. Their intermediate signing cert is actually cross-signed by an established CA (IdenTrust), so even if the browsers and OSes don’t have the Let’s Encrypt root CA, certs they sign will still be trusted. On the software side, you still do need to run an agent which handles the verification and ordering, but it doesn’t need to completely take over your web server. You can run it in “certonly” mode and point it at a web site’s root (it needs to be able to place a few challenge/response files there for verification), and it lets you handle the resulting certificate yourself. The certs are only valid for 90 days, but the idea is you idempotently run the renew command from cron daily, and it’ll seamlessly renew the certificates.
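To illustrate why running the renew step every day is harmless, here is a minimal Python sketch of the idea. It is not the agent's actual logic; the 30-day renewal window and the function name `needs_renewal` are assumptions made for the example. Only the 90-day lifetime comes from the text above.

```python
from datetime import datetime, timedelta, timezone

# Assumption for this sketch: renew once 30 days or fewer of validity remain.
RENEW_WINDOW = timedelta(days=30)


def needs_renewal(not_after, now=None, window=RENEW_WINDOW):
    """Return True if the certificate expires within the renewal window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return not_after - now <= window


# Hypothetical issue date; the certificate is valid for 90 days.
issued = datetime(2016, 11, 1, tzinfo=timezone.utc)
not_after = issued + timedelta(days=90)

# Day 10: 80 days of validity left, so the daily run does nothing.
print(needs_renewal(not_after, now=issued + timedelta(days=10)))  # False
# Day 65: 25 days left, so the daily run would renew.
print(needs_renewal(not_after, now=issued + timedelta(days=65)))  # True
```

Because the check depends only on the expiry date, running it once a day or ten times a day gives the same outcome, which is what makes the cron job idempotent.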

Let’s Encrypt has since become the world’s largest certificate provider, by raw number of certificates. And since their cry is “convert all your sites over to SSL”, the implication is SNI is here and we don’t need to worry too much about older clients. So hey, may as well get going now!

Most sites were trivial to convert. For example, 74d.com has nothing except “hey, maybe some day someone will pay me an insane amount of money for this domain”. Other sites required more planning. The biggest problem with starting to serve an existing site via HTTPS is all images, CSS and JS also need to be served via HTTPS, even if the domain is configured to completely redirect from HTTP to HTTPS. finnie.org is a deceptively massive site, with decades’ worth of junk layered on it (the domain itself is actually coming up on its 20-year anniversary in April), so it took a lot of grepping for places where I had hard-coded HTTP content. I’m sure I’ve missed places, but most of it has been fixed.
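The grepping can be approximated in a few lines of Python. This is a rough sketch, not the process actually used: the regular expression, the function name `find_insecure_refs` and the sample paths are all made up for illustration. It only looks at `src` and `href` attributes, so it also flags ordinary links to other pages, which are harmless and need a manual look.

```python
import re

# Matches src="http://..." or href="http://..." with single or double quotes.
INSECURE_REF = re.compile(
    r'''\b(?:src|href)\s*=\s*["'](http://[^"']+)["']''',
    re.IGNORECASE,
)


def find_insecure_refs(text):
    """Return every hard-coded http:// URL found in src/href attributes."""
    return INSECURE_REF.findall(text)


# Hypothetical page fragment; the paths are invented for this example.
sample = """<img src="http://finnie.org/images/logo.png">
<link rel="stylesheet" href="https://finnie.org/style.css">
<script src='http://finnie.org/js/site.js'></script>"""

for url in find_insecure_refs(sample):
    print(url)
```

Run over the sample, this reports the image and the script but not the stylesheet, which is already served via HTTPS.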

finnix.org is another example of requiring a lot of thought. The main site itself, a mostly self-referential MediaWiki site, was easy to do. But it also has about a dozen subdomains which required individual examination. For example, archive.finnix.org is a static site which just serves files; trivial to redirect to HTTPS, right? Problem is, they are mostly fetched by apt, which 1) does not follow HTTP redirects, and 2) does not support HTTPS unless an additional transport package is installed. So if I switched that to HTTPS only, it would break a number of Finnix operations. In the end, I decided on serving both HTTP and HTTPS, and setting it up so if you go to http://archive.finnix.org/ it’ll redirect to https://archive.finnix.org/, but individual files can be retrieved via either HTTP or HTTPS.
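The policy for archive.finnix.org boils down to one rule, sketched here in Python rather than as real web server configuration. The function name `https_redirect` and the file path in the usage lines are hypothetical.

```python
def https_redirect(scheme, host, path):
    """Return the URL to redirect to, or None to serve the request as-is.

    Only the site root is redirected from HTTP to HTTPS. Individual files
    stay reachable over plain HTTP so clients that cannot follow the
    redirect or speak HTTPS keep working.
    """
    if scheme == "http" and path == "/":
        return "https://" + host + "/"
    return None


# The front page is redirected to HTTPS.
print(https_redirect("http", "archive.finnix.org", "/"))
# A file fetch over HTTP is served directly (the path is made up).
print(https_redirect("http", "archive.finnix.org", "/example/file.deb"))
```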

In total, I’ve created 22 certificates, covering 41 hostnames. As Let’s Encrypt is now essentially the only free CA, I wish them well, and have even donated $100 to their non-profit. This is really putting all my eggs in one basket; once you go to HTTPS-only for a site, it’s very hard to go back.

Edit: It’s been pointed out that you can get around SNI issues by using multiple Subject Alternative Names (SANs) with Let’s Encrypt, as client support for SANs goes back much further than support for SNI does. I had been using SANs for multiple similar hostnames on a certificate (e.g. www.finnie.org had a SAN for finnie.org), but thought certbot required them all to have a single document root. Turns out you can define a “webroot map” for multiple hostnames to multiple document roots, and there are no defined limits to the number of SANs you can use (though it appears the accepted effective limit in the industry is about 100).
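Conceptually, a webroot map is just a lookup from hostname to document root. The Python sketch below builds one and serializes it as JSON, on the assumption that the agent accepts the map in that form; check the certbot documentation before relying on it. The document root paths are hypothetical.

```python
import json

# Hostnames are from the post; the document roots are made up.
webroot_map = {
    "www.finnie.org": "/srv/www/finnie.org",
    "finnie.org": "/srv/www/finnie.org",
    "blog.finnix.org": "/srv/www/blog.finnix.org",
}


def hostnames_by_root(mapping):
    """Group hostnames by the document root that serves them."""
    grouped = {}
    for hostname, root in sorted(mapping.items()):
        grouped.setdefault(root, []).append(hostname)
    return grouped


# Every hostname in the map would become one SAN on the shared certificate.
print(json.dumps(webroot_map, indent=2, sort_keys=True))
print(hostnames_by_root(webroot_map))
```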

The big downside of one cert with multiple SANs is you are now publicly advertising which sites you administer as a group, but in this case I’m fine with that. I’ve changed things so 37 of my hostnames are now covered under one certificate.

Also, the part about Ubuntu precise’s wget not supporting SNI is no longer correct. Thankfully SNI support for wget had been backported to precise in May 2016.

Blog sites updated

I’ve been on the ol’ blogosphere for over 15 years now, starting with a LiveJournal back in 2001. In 2009, I moved the contents of the LiveJournal to my personal site, a WordPress installation. In addition, I’ve got a blog for Finnix at blog.finnix.org, also a WordPress site.

I haven’t been blogging much (my last post was a little over a year ago), but I’ve been keeping the software up to date on them, as WordPress is not exactly known to be the most secure⁰. However, I’ve had a thought in the back of my mind for a while: Why not convert the posts to Markdown, write up a template and serve them statically? Turns out a decently generalized solution already exists for that, Jekyll.

Over the last week, I’ve been converting both sites over from WordPress to Jekyll. The layout is simple, mostly baseline Bootstrap, but it’s starting to reach the point where it takes a lot of effort to make it look like something which didn’t take a lot of effort. One of the nice things about Jekyll is it’s all just flat files, so it’s easy to track with Git. I’ve got repositories for both sites on GitHub (finnie.org, blog.finnix.org), if you want to take a look. The templates would mostly work as a generic base, but do have some personal customizations (such as the Google site search at the top of finnie.org).

For my personal site, I also did some pruning of old content. When I migrated to WordPress in 2009, I imported all of my LiveJournal posts. Personal LiveJournal posts tend to have a different “feel” than content-focused blog posts, and let’s be honest, there was a lot of angsty 20-something posting I did throughout the 2000s I’m a bit embarrassed by now. There were over 900 posts on my personal WordPress site. I saved all of my posts since the conversion to WordPress in 2009, plus a handful of pre-2009 LiveJournal posts I wanted to keep because they were interesting or important. Interestingly, nothing from the first 2½ years survived, but in total, I picked a little over 200 posts to keep.

⁰ One might blame that on being PHP, and while it’s true that PHP makes it very easy to shoot yourself in the foot, it doesn’t have to be that way. The classic example I give of a good PHP project is MediaWiki. It’s a large open source project with a decent track record for security, and it has a very good development contribution model. (It was one of the first large projects to embrace the open contribution model. Fork the repository, make a change, push it to a personal branch, open a merge request, it gets tested against CI, and core developers can then merge it directly. Back in the day it was an eye opener to go, “Wait I don’t have to be a project developer to make a commit directly?”)

2ping 3.0.0 released

2ping 3.0.0 has been released. It is a total rewrite, with the following features:

  • Total rewrite from Perl to Python.
  • Multiple hostnames/addresses may be specified in client mode, and will be pinged in parallel.
  • Improved IPv6 support:
    • In most cases, specifying -4 or -6 is unnecessary. You should be able to specify IPv4 and/or IPv6 addresses and it will “just work”.
    • IPv6 addresses may be specified without needing to add -6.
    • If a hostname is given in client mode and the hostname provides both AAAA and A records, the AAAA record will be chosen. This can be forced to one or another with -4 or -6.
    • If a hostname is given in listener mode with -I, it will be resolved to addresses to bind as. If the hostname provides both AAAA and A records, they will both be bound. Again, -4 or -6 can be used to restrict the bind.
    • IPv6 scope IDs (e.g. fe80::213:3bff:fe0e:8c08%eth0) may be used as bind addresses or destinations.
  • Better Windows compatibility.
  • ping(8)-compatible superuser restrictions (e.g. flood ping) have been removed, as 2ping is a scripted program using unprivileged sockets, and restrictions would be trivial to bypass. Also, the concept of a “superuser” is rather muddied these days.
  • Better timing support, preferring high-resolution monotonic clocks whenever possible instead of gettimeofday(). On Windows and OS X, monotonic clocks should always be available. On other Unix platforms, monotonic clocks should be available when using Python 2.7
  • Long option names for ping(8)-compatible options (e.g. adaptive mode can be called as --adaptive in addition to -A). See 2ping --help for a full option list.
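The client-mode address selection described in the IPv6 items above can be sketched roughly as follows. This is not 2ping's actual code; `pick_address` is a hypothetical helper, and the records are synthetic documentation addresses shaped like the tuples `socket.getaddrinfo()` returns, so the example runs without any DNS lookup.

```python
import socket


def pick_address(addrinfos, family=None):
    """Pick one socket address from getaddrinfo()-style tuples.

    With family=None, IPv6 (AAAA) results are preferred over IPv4 (A).
    Passing socket.AF_INET or socket.AF_INET6 mimics forcing -4 or -6.
    """
    if family is not None:
        candidates = [ai for ai in addrinfos if ai[0] == family]
    else:
        candidates = list(addrinfos)
    # Stable sort: IPv6 entries first, original order otherwise preserved.
    candidates.sort(key=lambda ai: 0 if ai[0] == socket.AF_INET6 else 1)
    return candidates[0][4] if candidates else None


# Synthetic results for a hostname with both A and AAAA records.
# The port number is only illustrative.
records = [
    (socket.AF_INET, socket.SOCK_DGRAM, 17, "", ("192.0.2.10", 15998)),
    (socket.AF_INET6, socket.SOCK_DGRAM, 17, "", ("2001:db8::10", 15998, 0, 0)),
]

print(pick_address(records))                  # IPv6 address is chosen
print(pick_address(records, socket.AF_INET))  # forced to IPv4
```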

Because of the IPv6 improvements, there is a small breaking functionality change. Previously, to listen on both IPv4 and IPv6 addresses, you needed to specify -6, e.g. 2ping --listen -6 -I -I ::1. Now that -6 restricts binds to IPv6 addresses, that invocation will just listen on ::1. Simply remove -6 to listen on both IPv4 and IPv6 addresses.

This is a total rewrite in Python, and the original Perl code was not used as a basis; instead, I wrote the new version from the 2ping protocol specification. (The original Perl version was a bit of a mess, and I didn’t want to pick up any of its habits.) As a result of rewriting from the specification, I discovered the Perl version’s implementation of the checksum algorithm was not even close to the specification (and when it comes to checksums, “almost” is the same as “not even close”). As the Perl version is the only known 2ping implementation in the wild which computes/verifies checksums, I made a decision to amend the specification with the “incorrect” algorithm described in pseudocode. The Python version’s checksum algorithm matches this in order to maintain backwards compatibility.

This release also marks the five year anniversary of 2ping 1.0, which was released on October 20, 2010.

dsari - Do Something and Record It

When Finnix first started transitioning to Project NEALE, the ability to produce Finnix builds from scratch in a normalized fashion, I began making NEALE builds on a nightly schedule using cron. This was fine in the beginning, but as things got more complex (there are currently 16 different variants of Finnix being built nightly), I started looking at alternatives.

Like many people throughout history, my ad-hoc CI system transitioned from a cron script to Jenkins. This worked decently, but there were drawbacks. Jenkins requires Java, and is very memory intensive. I had ARM builders being fed by Jenkins, and found the remote was taking up most of the memory. Occasionally remotes would just freeze up. And since the main instance was running inside my home network, I needed to proxy the web interface from my main colo box for reports to be visible to the world. Overall, Jenkins had the feel of a Very Big Project, complete with drawbacks.

That got me thinking of what I would need for something midway between cron and Jenkins. “Well, I basically need to do something and record it. Everything else can hang off the ‘do something’ part, or the ‘record it’ part.” And with that, dsari was born.

dsari is a lightweight continuous integration (CI) system. It provides scheduling, concurrency management and trigger capabilities, and is easy to configure. Job scheduling is handled via dsari-daemon, while dsari-render may be used to format job run information as HTML.

That’s basically it. All other functionality is based on the idea that you have a better idea of what you want to do than I do. The “do something” portion of the job run is literally running a single command - this is almost always a shell script. For example, all of the jobs used to do NEALE builds call the same shell script, which uses the JOB_NAME and RUN_ID environment variables to determine what variants to build. The shell script then performs the build and emails me if a run fails or returns to normal.
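As a rough illustration of a single command serving many jobs, here is a Python sketch of a job script that reads the two environment variables. The real NEALE script is a shell script and is not shown here; the `neale-<variant>` naming convention, the `builds/` output layout and the function name `build_plan` are all assumptions made for the example.

```python
import os


def build_plan(environ=os.environ):
    """Work out what to build from the variables dsari sets for the run."""
    job_name = environ["JOB_NAME"]
    run_id = environ["RUN_ID"]

    # Hypothetical convention: jobs are named "neale-<variant>".
    prefix = "neale-"
    if not job_name.startswith(prefix):
        raise ValueError("unexpected job name: " + job_name)
    variant = job_name[len(prefix):]

    # Hypothetical layout: one output directory per variant and run.
    return {
        "variant": variant,
        "output_dir": "builds/{}/{}".format(variant, run_id),
    }


# Example with made-up values in place of a real dsari environment.
print(build_plan({"JOB_NAME": "neale-amd64", "RUN_ID": "example-run"}))
```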

Want to produce an off-schedule run based on a trigger event, such as a VCS commit? dsari has a powerful trigger system, but it’s based on the idea that you figure out what the trigger event is, and you write the trigger configuration file which dsari picks up on.

dsari has a decent scheduler which is based off the cron format, with Jenkins-style hash expansion so you can easily spread runs out without having to hard-code separation. And dsari has an expansive concurrency system which lets you limit runs to one or more concurrency groups, which lets you do things like resource limiting and/or pooling.
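Hash expansion works by replacing a placeholder in the schedule with a value derived from the job name, so each job lands on a stable but different slot. The sketch below shows the idea in Python. It is not dsari's implementation: CRC32 is a stand-in hash, and `hash_slot` and `expand_minute` are hypothetical names. The `H` placeholder follows the Jenkins convention.

```python
import zlib


def hash_slot(job_name, low, high):
    """Deterministically map a job name to an integer in [low, high]."""
    digest = zlib.crc32(job_name.encode("utf-8")) & 0xFFFFFFFF
    return low + digest % (high - low + 1)


def expand_minute(field, job_name):
    """Replace an 'H' minute field with the job's hashed minute (0-59)."""
    if field == "H":
        return str(hash_slot(job_name, 0, 59))
    return field


# Two jobs with the same "H 3 * * *" style schedule are likely to get
# different minutes, and each job gets the same minute on every run.
for name in ("neale-amd64", "neale-armhf"):
    print(name, expand_minute("H", name))
```

The same approach extends to the other cron fields by changing the range passed to `hash_slot`.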

Run data (output and metadata) is stored in a standardized location, and dsari includes a utility which renders the data as simple HTML reports. You may then sync the HTML tree to the final destination, rather than relying on exposing a web daemon.

dsari fits my requirements: a simple CI system which slots somewhere between cron and Jenkins. Surely this will be insufficient for some people, while it will be overkill for others. Hopefully it will be useful to people.