Starbeamrainbowlabs

Stardust
Blog


Archive


Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blender blog bookmarklet booting bug hunting c sharp c++ challenge chrome os cluster code codepen coding conundrums coding conundrums evolved command line compilers compiling compression conference conferences containerisation css dailyprogrammer data analysis debugging defining ai demystification distributed computing dns docker documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions freeside future game github github gist gitlab graphics guide hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs latex learning library linux lora low level lua maintenance manjaro minetest network networking nibriboard node.js open source operating systems optimisation outreach own your code pepperminty wiki performance phd photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference release releases rendering research resource review rust searching secrets security series list server software sorting source code control statistics storage svg systemquery talks technical terminal textures thoughts three thing game three.js tool tutorial twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 worldeditadditions xmpp xslt

MIDI to Music Box score converter

Keeping with the theme of things I've forgotten to blog about (every time I think I've mentioned everything, I remember something else), in this post I'm going to talk about yet another project of mine that's been happily sitting around (and I've nearly blogged about but then forgot on at least 3 separate occasions): A MIDI file to music box conversion script. A while ago, I bought one of these customisable music boxes:

(Above: My music box (centre), and the associated hole punching device it came with (top left).)

The basic principle is that you bunch holes in a piece of card, and then you feed it into the music box to make it play the notes you've punched into the card. Different variants exist with different numbers of notes - mine has a staggering 30 different nodes it can play!

It has a few problems though:

  1. The note names on the card are wrong
  2. Figuring out which one is which is a problem
  3. Once you've punched a hole, it's permanent
  4. Punching the holes is fiddly

Solving problem #4 might take a bit more work (a future project out-of-scope of this post), but the other 3 are definitely solvable.

MIDI is a protocol (and a file format) for storing notes from a variety of different instruments (this is just the tip of the iceberg, but it's all that's needed for this post), and it just so happens that there's a C♯ library on NuGet called DryWetMIDI that can read MIDI files in and allow one to iterate over the notes inside.

By using this and an homebrew SVG writer class (another blog post topic for another time once I've tidied it up), I implemented a command-line program I've called MusicBoxConverter (the name needs work - comment below if you can think of a better one) that converts MIDI files to an SVG file, which can then be printed (carefully - full instructions on the project's page - see below).

Before I continue, it's perhaps best to give an example of my (command line) program's output:

Example output

(Above: A lovely little tune from the TV classic The Clangers, converted for a 30 note music box by my MusicBoxConverter program. I do not own the song!)

The output from the program can (once printed to the correct size) then be cut out and pinned onto a piece of card for hole punching. I find paper clips work for this, but in future I might build myself an Arduino-powered device to punch holes for me.

The SVG generated has a bunch of useful features that makes reading it easy:

  • Each note is marked by a blue circle with a red plus sign to help with accuracy (the hole puncher that came with mine has lines on it that make a plus shape over the viewing hole)
  • Each hole is numbered to aid with getting it the right way around
  • Big arrows in the background ensure you don't get it backwards
  • The orange bar indicates the beginning
  • The blue circle should always be at the bottom and the and the red circle at the top, so you don't get it upside down
  • The green lines should be lined up with the lines on the card

While it's still a bit tedious to punch the holes, it's much easier be able to adapt a score in Musescore, export to MIDI, push it through MusicBoxConverter than doing lots of trial-and-error punching it directly unaided.

I won't explain how to use the program in this blog post in any detail, since it might change. However, the project's README explains in detail how to install and use it:

https://git.starbeamrainbowlabs.com/sbrl/MusicBoxConverter

Something worth a particular mention here is the printing process for the generated SVGs. It's somewhat complicated, but this is unfortunately necessary in order to avoid rescaling the SVG during the conversion process - otherwise it doesn't come out the right size to be compatible with the music box.

A number of improvements stand out to me that I could make to this project. While it only supports 30 note music boxes at the moment, it can easily be extended to support multiple other different type of music box (I just don't have them to hand).

Implementing a GUI is another possible improvement but would take a lot of work, and might be best served as a different project that uses my MusicBoxConverter CLI under the hood.

Finally, as I mentioned above, in a future project (once I have a 3D printer) I want to investigate building an automated hole punching machine to make punching the holes much easier.

If you make something cool with this program, please comment below! I'd love to hear from you (and I may feature you in the README of the project too).

Sources and further reading

  • Musescore - free and open source music notation program with a MIDI export function
  • MusicBoxConverter (name suggestions wanted in the comments below)

NAS Backups, Part 1: Overview

After building my nice NAS, the next thing on my sysadmin todo list was to ensure it is backed up. In this miniseries (probably 2 posts), I'm going to talk about the backup NAS that I've built. As of the time of typing, it is sucessfully backing up my Btrfs subvolumes.

In this first post, I'm going to give an overview of the hardware I'm using and the backup system I've put together. In future posts, I'll go into detail as to how the system works, and which areas I still need to work on moving forwards.

Personally, I find that the 3-2-1 backup strategy is a good rule of thumb:

  • 3 copies of the data
  • 2 different media
  • 1 off-site

What this means is tha you should have 3 copies of your data, with 2 local copies and one remote copy in a different geographical location. To achieve this, I have solved first the problem of the local backup copy, since it's a lot easier and less complicated than the remote one. Although I've talked about backups before (see also), in this case my solution is slightly different - partly due to the amount of data involved, and partly due to additional security measures I'm trying to get into place.

Hardware

For hardware, I'm using a Raspberry Pi 4 with 4GB RAM (the same as the rest of my cluster), along with 2 x 4 TB USB 3 external hard drives. This is a fairly low-cost and low-performance solution. I don't particularly care how long it takes the backup to complete, and it's relatively cheap to replace if it fails without being unreliable (Raspberry Pis are tough little things).

Here's a diagram of how I've wired it up:

(Can't see the above? Try a direct link to the SVG. Created with drawio.)

I use USB Y-cables to power the hard rives directly from the USB power supply, as the Pi is unlikely to be able to supply enough power for mulitple external hard drives on it's own.

Important Note: As I've discovered with a different unrelated host on my network, if you do this you can back-power the Pi through the USB Y cable, and potentially corrupt the microSD card by doing so. Make sure you switch off the entire USB power supply at once, rather than unplug just the Pi's power cable!

For a power supply, I'm using an Anker 10 port device (though I bought through Amazon, since I wasn't aware that Anker had their own website) - the same one that powers my Pi cluster.

Strategy

To do the backup itself I'm using the fact that I store my data in Btrfs subvolumes and the btrfs send / btrfs receive commands to send my subvolumes to the remote backup host over SSH. This has a number of benefits:

  1. The host doing the backing up has no access to the resulting backups (so if it gets infected it can't also infect the backups)
  2. The backups are read-only Btrfs snapshots (so if the backup NAS gets infected my backups can't be altered without first creating a read-write snapshot)
  3. Backups are done incrementally to save time, but a full backup is done automatically on the first run (or if the local metadata is missing)

While my previous backup solution using Restic for the server that sent you this web page has point #3 on my list above, it doesn't have points 1 and 2.

Restic does encrypt backups at rest though, which the system I'm setting up doesn't do unless you use LUKS to encrypt the underlying disks that Btrfs stores it's data on. More on that in the future, as I have tentative plans to deal with my off-site backup problem using a similar technique to that which I've used here that also encrypts data at rest when a backup isn't taking place.

In the next post, I'll be diving into the implementation details for the backup system I've created and explaining it in more detail - including sharing the pair of scripts that I've developed that do the heavy lifting.

WorldEditAdditions: The story of the //convolve command

Sometimes, one finds a fascinating new use for an algorithm in a completely unrelated field to its original application. Such is the case with the algorithm behind the //convolve command in WorldEditAdditions which I recently blogged about. At the time, I had tried out the //smooth command provided by we_env, but The //smooth command smooths out rough edges in the terrain inside the defined WorldEdit region, but I was less than satisfied with it's flexibility, configurability, and performance.

To this end, I set about devising a solution. My breakthrough in solving the problem occurred when I realised that the problem of terrain smoothing is much easier when you consider it as a 2D heightmap, which is itself suspiciously like a greyscale image. Image editors such as GIMP already have support for smoothing with their various blur filters, the most interesting of which for my use case was the gaussian blur filter. By applying a gaussian blur to the terrain heightmap, it would have the effect of smoothing the terrain - and thus //convolve was born.

If you're looking for a detailed explanation on how to use this command, you're probably in the wrong place - this post is more of a developer commentary on how it is implemented. For an explanation on how to use the command, please refer to the chat command reference.

The gaussian blur effect is applied by performing an operation called a convolution over the input 2D array. This function is not specific to image editor (it can be found in AI too), and it calculates the new value for each pixel by taking a weighted sum of it's existing value and all of it's neighbours. It's perhaps best explained with a diagram:

All the weights in the kernel (sometimes called a filter) are floating-point numbers and should add up to 1. Since we look at each pixels neighbours, we can't operate on the pixels around the edge of the image. Different implementations have different approaches to solving this problem:

  1. Drop them in the output image (i.e. the output image is slightly smaller than the input) - a CNN in AI does this by default in most frameworks
  2. Ignore the pixel and pass it straight through
  3. Take only a portion of the kernel and remap the weights to that we can use that instead
  4. Wrap around to the other side of the image

For my implementation of the //convolve command, the heightmap in my case is essentially infinite in all directions (although I obviously request a small subset thereof), so I chose option 2 here. I calculate the boundary given the size of the kernel and request an area with an extra border around it so I can operate on all the pixels inside the defined region.

Let's look at the maths more specifically here:

Given an input image, for each pixel we take a multiply the kernel by each pixel and it's neighbours, and for each resulting matrix we sum all the values therein to get the new pixel value.

The specific weightings are of course configurable, and are represented by a rectangular (often square) 2D array. By manipulating these weightings, one can perform different kinds of operations. All of the following operations can be performed using this method:

Currently, WorldEditAdditions supports 3 different kernels in the //convolve command:

  • Box blur - frequently has artefacts
  • Gaussian Blur (additional parameter: sigma, which controls the 'blurriness') - the default, and also most complicated to implement
  • Pascal - A unique variant I derived from Pascal's Triangle. Works out as slightly sharper than Gaussian Blur without having artifacts like box blur.

After implementing the core convolution engine, writing functions to generate kernels of defined size for the above kernels was a relatively simple matter. If you know of another kernel that you think would be useful to have in WorldEditAdditions, I'm definitely open to implementing it.

For the curious, the core convolutional filter can be found in convolve.lua. It takes 4 arguments:

  1. The heightmap - a flat ZERO indexed array of values. Think of a flat array containing image data like the HTML5 Canvas getImageData() function.
  2. The size of the heightmap in the form { x = width, z = height }
  3. The kernel (internally called a matrix), in the same form as the heightmap
  4. The size of the kernel in the same form as the size of the heightmap.

Reflecting on the implementation, I'm pretty happy with it within the restrictions of the environment within which it runs. As you might have suspected based on my explanation above, convolutionally-based operations are highly parallelisable (there's a reason they are used in AI), and so there's a ton of scope for at using multiple threads, SIMD, and more - but unfortunately the Lua (5.1! It's so frustrating since 5.4 is out now) environment in Minetest doesn't allow for threading or other optimisations to be made.

The other thing I might do something about soon is the syntax of the command. Currently it looks like this:

//convolve <kernel> [<width>[,<height>]] [<sigma>]

Given the way I've used this command in-the-field, I've found myself specifying the kernel width and the sigma value more often than I have found myself changing the kernel. With this observation in hand, it would be nice to figure out a better syntax here that reduces the amount of typing for everyday use-cases.

In a future post about WorldEditAdditions, I want to talk about a number of topics, such as the snowballs algorithm in the //erode. I also potentially may post about the new //noise2d command I've working on, but that might take a while (I'm still working on implementing a decent noise function for that, since at the time of typing the Perlin noise implementation I've ported is less than perfect).

WorldEditAdditions: More WorldEdit commands for Minetest

Personally, I see enormous creative potential in games such as Minetest. For those who aren't in the know, Minetest is a voxel sandbox game rather like the popular Minecraft. Being open-source though, it has a solid Lua Modding API that allows players with a bit of Lua knowledge to script their own mods with little difficulty (though Lua doesn't come with batteries included, but that's a topic for another day). Given the ease of making mods, Minetest puts much more emphasis on installing and using mods - in fact most content is obtained this way.

Personally I find creative building in Minetest a relaxing activity, and one of the mods I have found most useful for this is WorldEdit. Those familiar with Minecraft may already be aware of WorldEdit for Minecraft - Minetest has it's own equivalent too. WorldEdit in both games provides an array of commands one can type into the chat window to perform various functions to manipulate the world - for example fill an area with blocks, create shapes such as spheres and pyramids, or replace 1 type of node with another.

Unfortunately though, WorldEdit for Minetest (henceforth simply WorldEdit) doesn't have quite the same feature set that WorldEdit for Minecraft does - so I decided to do something about it. Initially, I contributed a pull request to node weighting support to the //mix command, but I quickly realised that the scale of the plans I had in mind weren't exactly compatible with WorldEdit's codebase (as of the time of typing WorldEdit for Minetest's codebase consists of a relatively small number of very long files).

(Above: The WorldEditAdditions logo, kindly created by @VorTechnix with Blender 2.9)

To this end, I decided to create my own project in which I could build out the codebase in my own way, without waiting for lots of pull requests to be merged (which would probably put strain on the maintainers of WorldEdit).

The result of this is a new mod for Minetest which I've called WorldEditAdditions. Currently, it has over 35 additional commands (though by the time you read this that number has almost certainly grown!) and 2 handheld tools that extend the core feature set provided by WorldEdit, adding commands to do things like:

These are just a few of my favourites - I've implemented many more beyond those in this list. VorTechnix has also contributed a raft of improvements, from improvements to //torus to an entire suite of selection commands (//srect, //scol, //scube, and more), which has been an amazing experience to collaborate on an open-source project with someone on another continent (open-source is so cool like that).

A fundamental difference I decided on at the beginning when working of WorldEditAdditions was to split my new codebase into lots of loosely connected files, and have each file have a single purpose. If a files gets over 150 lines, then it's a candidate for being split up - for example every command has a backend (which sometimes itself spans multiple files, as in the case of //convolve and //erode) that actually manipulates the Minetest world and a front-end that handles the chat command parsing (this way we also get an API for free! Still working on documenting that properly though - if anyone knows of something like documentation for Lua please comment below). By doing this, I enable the codebase to scale in a way that the original WorldEdit codebase does not.

I've had a ton of fun so far implementing different commands and subsequently using them (it's so satisfying to see a new command finally working!), and they have already proved very useful in creative building. Evidently others think so too, as we've already had over 4800 downloads on ContentDB: ContentDB Shield listing the live number of downloads

Given the enormous amount of work I and others have put into WorldEditAdditions and the level of polish it is now achieving (on par with my other big project Pepperminty Wiki, which I've blogged about before), recently I also built a website for it:

The WorldEditAdditions website

You can visit it here: https://worldeditadditions.mooncarrot.space/

I built the website with Eleventy, as I did with the Pepperminty Wiki website (blog post). My experience with Eleventy this time around was much more positive than last time - more on this in future blog post. For the observant, I did lift the fancy button code on the new WorldEditAdditions website from the Pepperminty Wiki website :D

The website has the number of functions:

  • Explaining what WorldEditAdditions is
  • Instructing people on how to install it
  • Showing people how to use it
  • Providing a central reference of commands

I think I covered all of these bases fairly well, but only time will tell how actual users find it (please comment below! It gives me a huge amount of motivation to continue working on stuff like this).

Several components of the WorldEditAdditions codebase deserve their own blog posts that have not yet got one - especially //erode and //convolve, so do look out for those at some point in the future.

Moving forwards with WorldEditAdditions, I want to bring it up to feature parity with Worldedit for Minecraft. I also have a number of cool and unique commands in mind I'm working on such as //noise (apply an arbitrary 2d noise function to the height of the terrain), //mathapply (execute another command, but only apply changes to the nodes whose coordinates result in a number greater than 0.5 when pushed through a given arbitrary mathematical expression - I've already started implementing a mathematical expression parser in Lua with a recursive-descent parser, but have run into difficulties with operator precedence), and exporting Minetest regions to obj files that can then be imported into Blender (a collaboration with @VorTechnix).

If WorldEditAdditions has caught your interest, you can get started by visiting the WorldEditAdditions website: https://worldeditadditions.mooncarrot.space/.

If you find it useful, please consider starring it on GitHub and leaving a review on ContentDB! If you run into difficulties, please open an issue and I'll help you out.

Pure CSS Gallery

Galleries of pictures are nothing new. News sites, photography portfolios, screenshots of applications - you've no doubt seen one recently.

I found myself with the task of implementing such a gallery recently, and inspired by CSSBox I decided to implement my own - purely in CSS without any Javascript at all as a challenge (also because the website upon which I wanted to put it doesn't yet have any Javascript on it, and I don't want to add any unless I have to). I started with a basic HTML structure:

<section class="gallerybox">

    <section class="gallerybox-gallery">

        <figure class="gallerybox-item" id="picture1">
            <picture>
                <img src="https://placecats.com/800/500?a" alt="" decoding="async" />
            </picture>

            <figcaption>Caption A</figcaption>

            <a class="gallerybox-prev" href="#picture3">❰</a>
            <a class="gallerybox-next" href="#picture2">❱</a>
        </figure>
        <figure class="gallerybox-item" id="picture2">
            <picture>
                <img src="https://placecats.com/800/501" alt="" decoding="async" />
            </picture>

            <figcaption>Caption B</figcaption>

            <a class="gallerybox-prev" href="#picture1">❰</a>
            <a class="gallerybox-next" href="#picture3">❱</a>
        </figure>
        <figure class="gallerybox-item" id="picture3">
            <picture>
                <img src="https://placecats.com/800/502" alt="" decoding="async" />
            </picture>

            <figcaption>Caption C</figcaption>

            <a class="gallerybox-prev" href="#picture2">❰</a>
            <a class="gallerybox-next" href="#picture1">❱</a>
        </figure>
    </section>

</section>

I've omitted the <html><head></head><body><main></main></body></html> wrapper bit to keep this short, and I'm using {placecats} for nice placeholder images (my placeholder image service is cool, but doesn't generate images that work well in a gallery).

Each gallery item is a <figure> element and has it's own unique id.next / previous buttons are implement with <a> elements linking between ids. By using the :target element, we can apply CSS rules to the element whose id matches that of the anchor element that we've linked to. For example, if at the end of the URL in the address bar there was #picture2, then the <figure> element with the id picture2 would be targeted by the :target pseudo-selector.

The key challenge in a gallery is to display only the image that's currently being viewed, without showing any of the other images. This is trivial with Javascript, but slightly more challenging with CSS only.

To solve this problem, at first I jumped almost immediately to :target-within and did something like this:

.gallerybox-gallery:target-within > .gallerybox-item:target {
    display: block;
}
.gallerybox-gallery:target-within > .gallerybox-item:not(:target) {
    display: none;
}

Unfortunately though, as of the time of typing it's not supported by any browser:

(Click to see the latest Can I Use support table - by the time you're reading this support may have been implemented)

Annoyingly, :focus-within is already supported, but not :target-within. Not to be deterred, I came up with another plan. By using a CSS Grid with only 1 element, we can make all the images occupy the same space on the page, and then use position: relative and z-index to control which one is displayed at the top of the stack to the viewer:

.gallerybox-gallery {
    display: grid;
    grid-template-columns: auto;
    grid-template-rows: auto;
    grid-template-areas: "main";
}

.gallerybox-item {
    grid-area: main;
    z-index: 2;
}

.gallerybox-item:target {
    z-index: 100;
}

It took me a bit of fiddling around to come up with, but it's a wonderful hack that solves the problem. In the future, I'll revise the code behind this to use :target-within, because it's probably much more performant.

With this, we have the basics of a gallery! With a of styling of the captions and next / previous arrows, I got this:

See the Pen gallerybox: Pure CSS Gallery by Starbeamrainbowlabs (@sbrl) on CodePen.

(Can't see the above? Try this direct link.)

The other thing of note here are is the way I vertically aligned the next / previous arrows:

.gallerybox-item {
    position: relative;
}
:where(.gallerybox-prev, .gallerybox-next) {
    display: flex;
    align-items: center;
    height: 100%;
    position: absolute; top: 0;
}

I use a flexbox here along with the align-items directive (which aligns things perpendicular to the main axis of the flexbox) to vertically align the text inside the <a> elements. To make the <a> elements the right height, I use the fact that position: abolute is relative to any ancestor elements that are also positioned with a height: 100% (100% of the containing element's height) to make them the right height.

This post was mostly just to quickly show off the some of the techniques I've used here and this gallery script. It does have the flaw that if used in a long page it causes the user's browser to jump and put the top of the gallery at the top of the page (which wouldn't be an issue if I used Javascript), but that's an acceptable issue given the constraints of the challenge.

If anyone found this interesting, I'd be happy to talk more in depth about CSS in a future post (though CSS Tricks does this very well already).

Note: placekitten.com has been replaced with placecats.com because the former service appears to be down long-term. Thanks to emails from folk letting me know ref this!

--@sbrl, 2025-02-21

Automatically downloading emails and extracting their attachments

I have an all-in-one printer that's also a scanner - specifically the Epson Ecotank 4750 (though annoyingly the automated document feeder doesn't support duplex). While it's a great printer (very eco-friendly, and the inks last for ages!), my biggest frustration with it is that it doesn't scan directly to an SMB file share (i.e. a Windows file share). It does support SANE though, which allows you to use it through a computer.

This is ok, but the ability to scan directly from the device itself without needing to use a computer was very convenient, so I set out to remedy this. The printer does have a cloud feature they call "Epson Connect", which allows one to upload to various cloud services such as Google Drive and Box, but I don't want to upload potentially sensitive data to such services.

Fortunately, there's a solution at hand - email! The printer in question also supports scanning to a an email address. Once the scanning process is complete, then it sends an email to the preconfigured email address with the scanned page(s) attached. It's been far too long since my last post about email too, so let's do something about that.

Logging in to my email account just to pick up a scan is clunky and annoying though, so I decided to automate the process to resolve the issue. The plan is as follows:

  1. Obtain a fresh email address
  2. Use IMAP IDLE to instantly download emails
  3. Extract attachments and save them to the output directory
  4. Discard the email - both locally and remotely

As some readers may be aware, I run my own email server - hence the reason why I wrote this post about email previously, so I reconfigured it to add a new email address. Many other free providers exist out there too - just make sure you don't use an account you might want to use for anything else, since our script will eat any emails sent to it.

Steps 2, 3, and 4 there took some research and fiddling about, but in the end I cooked up a shell script solution that uses fetchmail, procmail (which is apparently unmaintained, so I should consider looking for alternatives), inotifywait, and munpack. I've also packaged it into a Docker container, which I'll talk about later in this post.

To illustrate how all of these fit together, let's use a diagram:

A diagram showing how the whole process fits together - explanation below.

fetchmail uses IMAP IDLE to hold a connection open to the email server. When it receives notification of a new email, it instantly downloads it and spawns a new instance of procmail to handle it.

procmail writes the email to a temporary directory structure, which a separate script is watching with inotifywait. As soon as procmail finishes writing the new email to disk, inotifywait triggers and the email is unpacked with munpack. Any attachments found are moved to the output directory, and the original email discarded.

With this in mind, let's start drafting up a script. The first order of the day is configuring fetchmail. This is done using a .fetchmailrc file - I came up with this:

poll bobsrockets.com protocol IMAP port 993
    user "user@bobsrockets.com" with pass "PASSWORD_HERE"
    idle
    ssl

...where user@bobsrockets.com is the email address you want to watch, bobsrockets.com is the domain part of said email address (everything after the @), and PASSWORD_HERE is the password required to login.

Save this somewhere safe with tight file permissions for later.

The other configuration file we'll need is one for procmail. let's do that one now:

CORRECTHOME=/tmp/maildir
MAILDIR=$CORRECTHOME/

:0
Mail/

Replace /tmp/maildir with the temporary directory you want to use to hold emails in. Save this as procmail.conf for later too.

Now we have the mail config files written, we need to install some software. I'm using apt on Debian (a minideb Docker container actually), so you'll need to adapt this for your own system if required.

sudo apt install ca-certificates fetchmail procmail inotify-tools mpack
# or, if you're using minideb:
install_packages ca-certificates fetchmail procmail inotify-tools mpack

fetchmail is for some strange reason extremely picky about the user account it runs under, so let's update the pre-created fetchmail user account to make it happy:

groupadd --gid 10000 fetchmail
usermod --uid 10000 --gid 10000 --home=/srv/fetchmail --uid=10000 --gi=10000 fetchmail
chown fetchmail:fetchmail /srv/fetchmail

fetchmail now needs that config file we created earlier. Let's update the permissions on that:

chmod 10000:10000 path/to/.fetchmailrc

If you're running on bare metal, move it to the /srv/fetchmail directory now. If you're using Docker, keep reading, as I recommend that this file is mounted using a Docker volume to make the resulting container image more reusable.

Now let's start drafting a shell script to pull everything together. Let's start with some initial setup:

#!/usr/bin/env bash

if [[ -z "${TARGET_UID}" ]]; then
    echo "Error: The TARGET_UID environment variable was not specified.";
    exit 1;
fi
if [[ -z "${TARGET_GID}" ]]; then
    echo "Error: The TARGET_GID environment variable was not specified.";
    exit 1;
fi
if [[ "${EUID}" -ne 0 ]]; then
    echo "Error: This Docker container must run as root because fetchmail is a pain, and to allow customisation of the target UID/GID (although all possible actions are run as non-root users)";
    exit 1;
fi

dir_mail_root="/tmp/maildir";
dir_newmail="${dir_mail_root}/Mail/new";
target_dir="/mnt/output";

fetchmail_uid="$(id -u "fetchmail")";
fetchmail_gid="$(id -g "fetchmail")";

temp_dir="$(mktemp --tmpdir -d "imap-download-XXXXXXX")";
on_exit() {
    rm -rf "${temp_dir}";
}
trap on_exit EXIT;

log_msg() {
    echo "$(date -u +"%Y-%m-%d %H:%M:%S") imap-download: $*";
}

This script will run as root, and fetchmail runs as UID 10000 and GID 10000, The reasons for this are complicated (and mostly have to do with my weird network setup). We look for the TARGET_UID and TARGET_GID environment variables, as these define the uid:gid we'll be setting files to before writing them to the output directory.

We also determine the fetchmail UID/GID dynamically here, and create a second temporary directory to work with too (the reasons for which will become apparent).

Before we continue, we need to create the directory procmail writes new emails to. Not because procmail won't create it on its own (because it will), but because we need it to exist up-front so we can watch it with inotifywait:

mkdir -p "${dir_newmail}";
chown -R "${fetchmail_uid}:${fetchmail_gid}" "${dir_mail_root}";

We're running as root, but we'll want to spawn fetchmail (and other things) as non-root users. Technically, I don't think you're supposed to use sudo in non-interactive scripts, and it's also not present in my Docker container image. The alternative is the setpriv command, but using it is rather complicated and annoying.

It's more powerful than sudo, as it allows you to specify not only the UID/GID a process runs as, but also the capabilities the process will have too (e.g. binding to low port numbers). There's a nasty bug one has to work around if one is using Docker too, so given all this I've written a wrapper function that abstracts all of this complexity away:

# Runs a process as another user.
# Ref https://github.com/SinusBot/docker/pull/40
# $1    The UID to run the process as.
# $2    The GID to run the process as.
# $3-*  The command (including arguments) to run
run_as_user() {
    run_as_uid="${1}"; shift;
    run_as_gid="${1}"; shift;
    if [[ -z "${run_as_uid}" ]]; then
        echo "run_as_user: No target UID specified.";
        return 1;
    fi
    if [[ -z "${run_as_gid}" ]]; then
        echo "run_as_user: No target GID specified.";
        return 2;
    fi

    # Ref https://github.com/SinusBot/docker/pull/40
    # WORKAROUND for `setpriv: libcap-ng is too old for "all" caps`, previously "-all" was used here
    # create a list to drop all capabilities supported by current kernel
    cap_prefix="-cap_";
    caps="$cap_prefix$(seq -s ",$cap_prefix" 0 "$(cat /proc/sys/kernel/cap_last_cap)")";

    setpriv --inh-caps="${caps}" --reuid "${run_as_uid}" --clear-groups --regid "${run_as_gid}" "$@";
    return "$?";
}

With this in hand, we can now wrap fetchmail and procmail in a function too:

do_fetchmail() {
    log_msg "Starting fetchmail";

    while :; do
        run_as_user "${fetchmail_uid}" "${fetchmail_gid}" fetchmail --mda "/usr/bin/procmail -m /srv/procmail.conf";

        exit_code="$?";
        if [[ "$exit_code" -eq 127 ]]; then
            log_msg "setpriv failed, exiting with code 127";
            exit 127;
        fi 

        log_msg "Fetchmail exited with code ${exit_code}, sleeping 60 seconds";
        sleep 60
    done
}

In short this spawns fetchmail as the fetchmail user we configured above, and also restarts it if it dies. If setpriv fails, it returns an exit code of 127 - so we catch that and don't bother trying again, as the issue likely needs manual intervention.

To finish the script, we now need to setup that inotifywait loop I mentioned earlier. Let's setup a shell function for that:


do_attachments() {
    while :; do # : = infinite loop
        # Wait for an update
        # inotifywait's non-0 exit code forces an exit for some reason :-/
        inotifywait -qr --event create --format '%:e %f' "${dir_newmail}";

        # Process new email here
    done
}

Processing new emails is not particularly difficult, but requires a sub loop because:

  • More than 1 email could be written at a time
  • Additional emails could slip through when we're processing the last one
while read -r filename; do

    # Process each email

done < <(find "${dir_newmail}" -type f);

Finally, we need to process each email we find in turn. Let's outline the steps we need to take:

  1. Move the email to that second temporary directory we created above (since the procmail directory might not be empty)
  2. Unpack the attachments
  3. chown the attach

Let's do this in chunks. First, let's move it to the temporary directory:

log_msg "Processing email ${filename}";

# Move the email to a temporary directory for processing
mv "${filename}" "${temp_dir}";

The filename environment variable there is the absolute path to the email in question, since we used find and passed it an absolute directory to list the contents of (as opposed to a relative path).

To find the filepath we moved it to, we need to do this:

filepath_temp="${temp_dir}/$(basename "${filename}")"

This is important for the next step, where we unpack it:

# Unpack the attachments
munpack -C "${temp_dir}" "${filepath_temp}";

Now that we've unpacked it, let's do a bit of cleaning up, by deleting the original email file and the .desc description files that munpack also generates:

# Delete the original email file and any description files
rm "${filepath_temp}";
find "${temp_dir}" -iname '*.desc' -delete;

Great! Now we have the attachments sorted, now all we need to do is chown them to the target UID/GID and move them to the right place.

chown -R "${TARGET_UID}:${TARGET_GID}" "${temp_dir}";
chmod -R a=rX,ug+w "${temp_dir}";

I also chmod the temporary directory too to make sure that the permissions are correct, because otherwise the mv command is unable to read the directory's contents.

Now to actually move all the attachments:

# Move the attachment files to the output directory
while read -r attachment; do
    log_msg "Extracted attachment ${attachment}";
    chmod 0775 "${attachment}";
    run_as_user "${TARGET_UID}" "${TARGET_GID}" mv "${attachment}" "${target_dir}";
done < <(find "${temp_dir}" -type f);

This is rather overcomplicated because of an older design, but it does the job just fine.

With that done, we've finished the script. I'll include the whole script at the bottom of this post.

Dockerification

If you're running on bare metal, then you can skip to the end of this post. Because I have a cluster, I want to be able to run this thereon. Since said cluster works with Docker containers, it's natural to Dockerise this process.

The Dockerfile for all this is surprisingly concise:

(Can't see the above? View it on my personal Git server instead)

To use this, you'll need the following files alongside it:

It exposes the following Docker volumes:

  • /mnt/fetchmailrc: The fetchmailrc file
  • /mnt/output: The target output directory

All these files can be found in this directory on my personal Git server.

Conclusion

We've strung together a bunch of different programs to automatically download emails and extract their attachments. This is very useful as for ingesting all sorts of different files. Things I haven't covered:

  • Restricting it to certain source email addresses to handle spam
  • Restricting the file types accepted (the file command is probably your friend)
  • Disallowing large files (most 3rd party email servers do this automatically, but in my case I don't have a limit that I know of other than my hard disk space)

As always, this blog post is both a reference for my own use and a starting point for you if you'd like to do this for yourself.

If you've found this useful, please comment below! I find it really inspiring / motivating to learn how people have found my posts useful and what for.

Sources and further reading

run.sh script

(Can't see the above? Try a this link, or alternatively this one (bash))

PhD, Update 8: Eggs in Baskets

I'm back again with another PhD update blog post! Before we begin, here's a list of all the parts in the series so far:

As in the previous post, progress since last time is split in 2: The Temporal CNN, and the social media side of things. I've started to split my time more evenly between the 2 sides, as it seems like the Temporal CNN is going to take lots more work than anticipated and I'd rather not put all my eggs in 1 basket.

Temporal CNN

As you might have guessed, the Temporal CNN still isn't learning anything, but at least now I think I know what the problem is. Since last time, I've done a bunch of debugging and tests to try and figure out what the problem is. During that process, I've managed to reach a record of ~20% accuracy, which at least gives me hope that it's going to work!

Specifically, I used the MNIST (alternative site) handwriting digit dataset with my "easy" task as explain in the previous post, but with a small difference: I pre-generated 2 random tensors to serve as the "below 5" and "5 and above" targets to predict instead of a pair of tensors filled with 0s or 1s respectively. The model didn't like this at all, so this is how I now know what the problem is.

For those interested, here's the laundry list of other things I've tried since last time:

  • Giving it more data (all of 2007, with the 2013 floods as validation; made things a bit worse)
  • Found and fixed a bug in data normalisation that managed to sneak through during the rewrite (reduced training times a touch)
  • Inverting the heightmap (helped a bit)
  • Making the model deeper (Gave me a full 5% accuracy increase from 15% to 20%!)

Knowing what the problem is though is 1 thing, but solving it is another matter entirely. Thankfully, my supervisor and I have a plan to look into using a modified version of the latter half of a variational autoencoder and squidge it onto the tail end of the Temporal CNN. If it works, then I'm imagining that we'll need a new name for the Temporal CNN (suggestions?), but I'll tackle that once I've finished revising the model.

For context, a variational autoencoder is a modified "vanilla" autoencoder, and is 1 of 2 different main classes of generative AI model architecture - the other being Generative Adversarial Networks (GAN). In contrast to a GAN, a variational autoencoder does image-to-image translation with a single model, and maps an input parameter space onto an output parameter space. It first encodes the input to the model into a smaller tensor of features, before upscaling that back into an image again. In this fashion, it can learn to translate between 2 different images - for example putting glasses on people's faces.

To do this, I'm going to implement a vanilla variational autoencoder using the MNIST dataset, and once I've done this I'll then lift part of the model structure and transpose it onto the top of my existing Temporal CNN - by doing it this way I'll ensure that I have a known-good model to work with that is definitely capable of image-to-image translation.

Social Media

In other news, I've started to make some real progress on the social media side of things. I've downloaded and anonymised some tweets (the code for which is open source on npm under the package name twitter-academic-downloader - I intend to write a separate blog post about it at some point soon-ish), and I've also put together an LSTM-based model to start looking at doing some text classification.

I decided to implement said model in Python instead of Javascript, because for what I can tell Tensorflow.js doesn't come with as many batteries included as Tensorflow for Python does for natural language processing-based tasks. This has caused some interesting adventures (and a number of frustrating crashes), but I think I'm starting to get the hang of it.

In particular it's interesting coming from Tensorflow.js (which is a later project), because it seems that Tensorflow for Python is much less cohesive and more disjointed as a library compared to Tensorflow.js, which has learnt and applied lessons from the Python implementation - resulting in a much more cohesive and well thought out API. A prime example of this is the tf.Dataset vs tf.keras.Sequence in the Python version, which isn't an issue in Tensorflow.js, as in the Javascript bindings we have a single tf.Dataset.

This aside, my next step here is to train a significantly sized model that's larger than the mini model with a single layer and 100 units I've been using for testing purposes (that's my task for this afternoon - which I've likely done by the time you're reading this post).

In terms of literature, I've read a bunch more papers on the subject since last time - but I still feel like I've got more to read. Recently I read a series of papers about word embeddings (converting words into numerical tensors), which was very interesting. The process has evolved over the years, starting from a simple dictionary mapping incrementing numbers to words, to training an AI to generate said representations in increasingly sophisticated ways (starting with word2vec, then moving on to in no particular order ELMo, GloVe, and finally BERT - transformers are pretty incredible models). It was a fascinating read - I can recommend it to anyone who's interested in natural language processing (along with this excellent post)

In the model I've implemented, I've ultimately decided to go with GloVe (Global Vectors for Word Representation), as the pre-trained model is simply a text file containing a lookup table one can read into a dictionary or hash table.

Conclusion

Things have been moving forwards - albeit slowly. I've got an idea as to how I can resolve the issues I've been facing with the Temporal CNN (pending a new name once I'm done with all the modifications and I know what the model architecture is going to be like), though it's going to take a lot of work.

Things are finally starting to move in social media land - hopefully the accuracy of the LSTM-based model will be higher than that of the mini model I trained, which was only 50% on a balanced dataset - no better than blind guessing!

See you again in 2 months or so, when hopefully I'll have some real results to show (though of course I'll be keeping up with weekly posts about other things in the meantime). If you have any comments or questions about any of this - please leave a comment below! I'd love to hear your thoughts.

Sources and further reading

A much easier way to install custom versions of Python

Recently, I wrote a rather extensive blog post about compiling Python from source: Installing Python, Keras, and Tensorflow from source.

Since then, I've learnt of multiple other different ways to do that which are much easier as it turns out to achieve that goal.

For context, the purpose of running a specific version of Python in the first place was because on my University's High-Performance Computer (HPC) Viper, it doesn't have a version of Python new enough to run the latest version of Tensorflow.

Using miniconda

After contacting the Viper team at the suggestion of my supervisor, I discovered that they already had a mechanism in place for specifying which version of Python to use. It seems obvious in hindsight - since they are sure to have been asked about this before, they already had a solution in the form of miniconda.

If you're lucky enough to have access to Viper, then you can load miniconda like so:

module load python/anaconda/4.6/miniconda/3.7

If you don't have access to Viper, then worry not. I've got other methods in store which might be better suited to your environment in later sections.

Once loaded, you can specify a version of Python like so:

conda create -n py python=3.8

The -n py specifies the name of the environment you'd like to create, and can be anything you like. Perhaps you could use the name of the project you're working on would be a good idea. The python=3.8 is the version of Python you want to use. You can list the versions of Python available like so:

conda search -f python

Then, to activate the new environment, do this:

conda init bash
conda activate py
exec bash

Replace py with the name of the environment you created above.

Now, you should have the specific version of Python you wanted installed and ready to use.

Edit 2022-03-30: Added conda install pip step, as some systems don't natively have pip by default which causes issues.

The last thing we need to do here is to install pip inside the virtual conda environment. Do that like so:

conda install pip

You can also install packages with pip, and it should all come out in the wash.

For Viper users, further information about miniconda can be found here: Applications/Miniconda Last

Gentoo Project Prefix

Another option I've been made aware of is Gentoo's Project Prefix. Essentially, it installs Gentoo (a distribution of Linux) inside a directory without root privileges. It doesn't work very well on Ubuntu, however due to this bug, but it should work on other systems.

They provide a bootstrap script that you can run that helps you bootstrap the system. It asks you a few questions, and then gets to work compiling everything required (since Gentoo is a distribution that compiles everything from source).

If you have multiple versions of gcc available, try telling it about a slightly older version of GCC if it fails to install.

If you can get it to install, a Gentoo Prefix install allows the installation whatever software you like!

pyenv

The last solution to the problem I'm aware of is pyenv. It automates the process of downloading and compiling specified versions of Python, and also updates you shell automatically. It does require some additional dependencies to be installed though, which could be somewhat awkward if you don't have sudo access to your system. I haven't actually tried it myself, but it may be worth looking into if the other 2 options don't work for you.

Conclusion

There's always more than 1 way to do something, and it's always worth asking if there's a better way if the way you're currently using seems hugely complicated.

Servers demystified

Something I see a lot of around the Internet are people who think that you need to purchase a big (often rack-mounted) "server" in order to host things like websites, email, game servers, and more (exhibit a). Quite often, they turn to ebay to purchase used enterprise rack mounted servers too.

I want to take a moment here to write up my thoughts here on why that is almost never the correct approach for a home user to take to host such applications at home, and what the (much better) alternatives are to serve as a reference post I can direct people to who need educating about this important issue.

What is a "server"?

A server can mean 2 things: a physical computer whose primary role is to act as a server, and server applications, which serve content to other users elsewhere - be it phones, laptops, desktops, etc.

A lot of people new to the field don't realise it, but any computer can take on the role of a server - you don't need any fancy hardware. The things that a computer does is defined by the software it runs - not the hardware that it is built from.

Does a server need a graphics card (GPU)?

No. It really doesn't. It's extremely unlikely that for a general purpose server you would need a GPU. Another related myth here is that you need a GPU in your server if you're running a game server. This is also false. Most of the time a server is going to be running headlessly (i.e. without a monitor) - so it really doesn't need a GPU in order to function effectively.

The following tasks however may require a GPU:

  • Serious Machine Learning / Artificial Intelligence workloads
  • 3D Rendering (e.g. Blender)
  • Live video streaming (video transcoding does not always utilise the GPU, as far as I can tell - make sure you check the documentation for your video editing software before buying any hardware)

Web servers, game servers, email servers, and other application servers do not use and cannot make use of a GPU. Programs need to be specially designed to support GPUs.

I need to purchase a license for Windows Server. Windows 10 isn't enough.

This is false. If you prefer Windows, then a regular old Windows 10 machine will be just fine for most home server use-cases. Windows Server provides additional features for enterprise that you are unlikely to need.

Personally, I recommend running a distribution of Linux though such as Ubuntu Server.

The problems with used hardware

Of particular frustration is the purchasing of old used (often rack mountable) servers from eBay and other auction sites. The low prices might be attractive, but such servers will nearly always have a number of issues:

  1. The CPU and other components will frequently be 10+ years old, and draw lots of electricity
  2. The fans will be very loud - sounding like a jet is taking off inside your house
  3. They often don't come with hard drives, and often have custom drive bays that require purchasing expensive drives to fill

Awkward issues to be sure! Particularly of note here is the electricity problem. Very old devices draw orders of magnitude more power than newer ones - leading to a big electricity bill. It will practically always be cheaper to purchase a newer more expensive machine - it'll pay for itself in dramatically lower electricity bills.

What are the alternatives?

Many far more suitable alternatives exist. They fall into 2 categories:

  1. Renting from a hosting company
  2. Buying a physical device

I'll be talking through both of these options below.

Renting from a hosting company

If you'd rather not have any hardware of your own locally, you can always rent a server from a hosting company. These come in 1 flavours:

  • Virtual Private Servers (VPS): A virtual machine running on the hosting company's infrastructure. Often easier to scale to multiple machines.
  • Dedicated servers: Bare-metal hardware running in a hosting company's datacentre somewhere. Useful if you've outgrown a VPS.

Example providers include OVH, Kimsufi (dedicated servers), Digital Ocean, and many more.

Things to watch out for when choosing one include:

  • How can you get support if you have an issue?
  • What network speeds are provided? Are there any data caps?
  • How much hard drive space do they come with? You often can't get any additional hard drive space once you've bought it without switching to a new host.
  • How many CPU cores does it have (or, if you want to run a game server, what's the clock speed)?
  • How much RAM does it have?
  • How much is it per month?

Buying a physical device

If you'd rather buy a physical device (beware that email servers cannot be effectively hosted on a residential Internet connection), then I can recommend either looking into one of these 2:

  1. An Intel NUC or other Mini PC in the same form factor
  2. A Raspberry Pi (or, for more advanced users, I've heard good things about a Rock Pi, but haven't tried it myself)

Both options are quiet, reasonably priced, and will draw orders of magnitude less power than a big rack mounted server.

A notable caveat here is that if you intend to run a game server, you'll want to check the CPU architecture it runs on, as it may not be compatible with the Raspberry Pi (which has an ARM chip built it - which can be either arm64 or armv7l - I use the official Debian CPU architecture codes here to avoid ambiguity).

Other alternatives here include old laptops and desktops you already have lying around at home. Make sure they aren't too old though, because otherwise you'll run afoul of point #1 in my list of problems there above.

Conclusion

In this post, I've busted some common myths about serves. I've also taken a quick look some appropriate hardware that you can buy or rent to use as a server.

If you're in the market for a server, don't be fooled by low prices for used physical servers. Rather, either rent one from a hosting company, or buy a Mini PC or Raspberry Pi instead. It'll run quieter and use less power too.

Other common questions I see are how to get started with running various different applications on a server. This is out of scope of this article, but there are plenty of tutorials out there on how to do this.

Often you'll need some basic Linux terminal skills to follow along though - I've written a blog post about how you can get started with the terminal already. I also on occasion post tutorials here on this blog on how to setup various applications - these are usually tagged with tutorial and server.

Other sites have excellent tutorials on to setup all manner of different applications - I'll leave a bunch of links at the end of this post.

If this this post has helped demystify servers for you, please consider sharing it with others to clear up their misconceptions too.

Sources and Further Reading

Installing Python, Keras, and Tensorflow from source

I found myself in the interesting position recently of needing to compile Python from source. The reasoning behind this is complicated, but it boils down to a need to use Python with Tensorflow / Keras for some natural language processing AI, as Tensorflow.js isn't going to cut it for the next stage of my PhD.

The target upon which I'm aiming to be running things currently is Viper, my University's high-performance computer (HPC). Unfortunately, the version of Python on said HPC is rather old, which necessitated obtaining a later version. Since I obviously don't have sudo permissions on Viper, I couldn't use the default system package manager. Incredibly, pre-compiled Python binaries are not distributed for Linux either, which meant that I ended up compiling from source.

I am going to be assuming that you have a directory at $HOME/software in which we will be working. In there, there should be a number of subdirectories:

  • bin: For binaries, already added to your PATH
  • lib: For library files - we'll be configuring this correctly in this guide
  • repos: For git repositories we clone

Make sure you have your snacks - this was a long ride to figure out and write - and it's an equally long ride to follow. I recommend reading this all the way through before actually executing anything to get an overall idea as to the process you'll be following and the assumptions I've made to keep this post a reasonable length.

Setting up

Before we begin, we need some dependencies:

  • gcc - The compiler
  • git - For checking out the cpython git repository
  • readline - An optional dependency of cpython (presumably for the REPL)

On Viper, we can load these like so:

module load utilities/multi
module load gcc/10.2.0
module load readline/7.0

Compiling openssl

We also need to clone the openssl git repo and build it from source:

cd ~/software/repos
git clone git://git.openssl.org/openssl.git;    # Clone the git repo
cd openssl;                                     # cd into it
git checkout OpenSSL_1_1_1-stable;              # Checkout the latest stable branch (do git branch -a to list all branches; Python will complain at you during build if you choose the wrong one and tell you what versions it supports)
./config;                                       # Configure openssl ready for compilation
make -j "$(nproc)"                              # Build openssl

With openssl compiled, we need to copy the resulting binaries to our ~/software/lib directory:

cp lib*.so* ~/software/lib;
# We're done, cd back to the parent directory
cd ..;

To finish up openssl, we need to update some environment variables to let the C++ compiler and linker know about it, but we'll talk about those after dealing with another dependency that Python requires.

Compiling libffi

libffi is another dependency of Python that's needed if you want to use Tensorflow. To start, go to the libgffi GitHub releases page in your web browser, and copy the URL for the latest release file. It should look something like this:

https://github.com/libffi/libffi/releases/download/v3.3/libffi-3.3.tar.gz

Then, download it to the target system:

cd ~/software/lib
curl -OL URL_HERE

Note that we do it this way, because otherwise we'd have to run the autogen.sh script which requires yet more dependencies that you're unlikely to have installed.

Then extract it and delete the tar.gz file:

tar -xzf libffi-3.3.tar.gz
rm libffi-3.3.tar.gz

Now, we can configure and compile it:

./configure --prefix=$HOME/software
make -j "$(nproc)"

Before we install it, we need to create a quick alias:

cd ~/software;
ln -s lib lib64;
cd -;

libffi for some reason likes to install to the lib64 directory, rather than our pre-existing lib directory, so creating an alias makes it so that it installs to the right place.

Updating the environment

Now that we've dealt with the dependencies, we now need to update our environment so that the compiler knows where to find them. Do that like so:

export LD_LIBRARY_PATH="$HOME/software/lib:${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}";
export LDFLAGS="-L$HOME/software/lib -L$HOME/software/include $LDFLAGS";
export CPPFLAGS="-I$HOME/software/include -I$HOME/software/repos/openssl/include -I$HOME/software/repos/openssl/include/openssl $CPPFLAGS"

It is also advisable to update your ~/.bashrc with these settings, as you may need to come back and recompile a different version of Python in the future.

Personally, I have a file at ~/software/setup.sh which I run with source $HOME/software/setuop.sh in my ~/.bashrc file to keep things neat and tidy.

Compiling Python

Now that we have openssl and libffi compiled, we can turn our attention to Python. First, clone the cpython git repo:

git clone https://github.com/python/cpython.git
cd cpython;

Then, checkout the latest tag. This essentially checks out the latest stable release:

git checkout "$(git tag | grep -ivP '[ab]|rc' | tail -n1)"

Important: If you're intention is to use tensorflow, check the Tensorflow Install page for supported Python versions. It's probable that it doesn't yet support the latest version of Python, so you might need to checkout a different tag here. For some reason, Python is really bad at propagating new versions out to the community quickly.

Before we can start the compilation process, we need to configure it. We're going for performance, so execute the configure script like so:

./configure --with-lto --enable-optimizations --with-openssl=/absolute/path/to/openssl_repo_dir

Replace /absolute/path/to/openssl_repo with the absolute path to the above openssl repo.

Now, we're ready to compile Python. Do that like so:

make -j "$(nproc)"

This will take a while, but once it's done it should have built Python successfully. For a sanity check, we can also test it like so:

make -j "$(nproc)" test

The Python binary compiled should be called simply python, and be located in the root of the git repository. Now that we've compiled it, we need to make a few tweaks to ensure that our shell uses our newly compiled version by default and not the older version from the host system. Personally, I keep my ~/bin folder under version control, so I install host-specific to ~/software, and put ~/software/bin in my PATH like so:

export PATH=$HOME/software/bin

With this in mind, we need to create some symbolic links in ~/software/bin that point to our new Python installation:

cd $HOME/software/bin;
ln -s relative/path/to/python_binary python
ln -s relative/path/to/python_binary python3
ln -s relative/path/to/python_binary python3.9

Replace relative/path/to/python_binary with the relative path tot he Python binary we compiled above.

To finish up the Python installation, we need to get pip up and running, the Python package manager. We can do this using the inbuilt ensurepip module, which can bootstrap a pip installation for us:

python -m ensurepip --user

This bootstraps pip into our local user directory. This is probably what you want, since if you try and install directly the shebang incorrectly points to the system's version of Python, which doesn't exist.

Then, update your ~/.bash_aliases and add the following:

export LD_LIBRARY_PATH=/absolute/path/to/openssl_repo_dir/lib:$LD_LIBRARY_PATH;
alias pip='python -m pip'
alias pip3='python -m pip'

...replacing /absolute/path/to/openssl_repo_dir with the path to the openssl git repo we cloned earlier.

The next stage is to use virtualenv to locally install our Python packages that we want to use for our project. This is good practice, because it keeps our dependencies locally installed to a single project, so they don't clash with different versions in other projects.

Before we can use virtualenv though, we have to install it:

pip install virtualenv

Unfortunately, Python / pip is not very clever at detecting the actual Python installation location, so in order to actually use virtualenv, we have to use a wrapper script - because the [shebang]() in the main ~/.local/bin/virtualenv entrypoint does not use /usr/bin/env to auto-detect the python binary location. Save the following to ~/software/bin (or any other location that's in your PATH ahead of ~/.local/bin):

#!/usr/bin/env bash

exec python ~/.local/bin/virtualenv "$@"

For example:

# Write the script to disk
nano ~/software/bin/virtualenv;
# chmod it to make it executable
chmod +x ~/software/bin/virtualenv

Installing Keras and tensorflow-gpu

With all that out of the way, we can finally use virtualenv to install Keras and tensorflow-gpu. Let's create a new directory and create a virtual environment to install our packages in:

mkdir tensorflow-test
cd tensorflow-test;
virtualenv "$PWD";
source bin/activate;

Now, we can install Tensorflow & Keras:

pip install tensorflow-gpu

It's worth noting here that Keras is a dependency of Tensorflow.

Tensorflow has a number of alternate package names you might want to install instead depending on your situation:

  • tensorflow: Stable tensorflow without GPU support - i.e. it runs on the CPU instead.
  • tf-nightly-gpu: Nightly tensorflow for the GPU. Useful if your version of Python is newer than the version of Python supported by Tensorflow

Once you're done in the virtual environment, exit it like this:

deactivate

Phew, that was a huge amount of work! Hopefully this sheds some light on the maddenly complicated process of compiling Python from source. If you run into issues, you're welcome to comment below and I'll try to help you out - but you might be better off asking the Python community instead, as they've likely got more experience with Python than I have.

Sources and further reading

Art by Mythdael