Starbeamrainbowlabs

Document your network infrastructure with a wiki

A screenshot of my infrastructure wiki

Recently, I've realised that I actually manage quite an extensive network of interconnected devices. Some of them depend on each other in a non-obvious fashion, so after reading a comment on Reddit I decided to begin the process of documenting it all on a wiki.

If you've followed this blog for a while, then you'll probably know that I'm the author of a wiki engine called Pepperminty Wiki. If you're wondering, I'm still working away at it on and off - and have no plans to abandon it any time soon. After all, I have multiple instances of it set up - so I've got more than one reason to keep developing it!

Anyway, I've recently set up a new instance of it to document the network infrastructure I maintain and manage. I'm currently going for 1 page per device, with sub-pages for the services they host. I tried putting the services under sub-headings on each device's page, but it quickly got cluttered and difficult to sort through.

Even if you just manage a Raspberry Pi acting as a server (I've got several of these myself actually), I do recommend looking into it. You'll save yourself so much time when asking questions like "what does this do?", "how does that work again?", and "where was it that I put the configuration file for that?".....

It's almost as if I'm speaking from experience :P

Don't forget to avoid putting passwords in a wiki though - as tempting as it sounds. As I recommend in my earlier post, Keepass 2 is my password manager of choice for that.

I might be a bit biased, but I can thoroughly recommend Pepperminty Wiki for this - and many other - purposes. If there's something missing, then don't hesitate to open an issue and I'll do my best to help you out :-)

Found this useful? Got a different way you document stuff? Comment below! It really helps motivate me to write and program more.

Cluster, Part 3: Laying groundwork with Unbound as a DNS server

Looking up at the canopy of some trees

(Above: A picture from my wallpapers folder. If you took this, comment below and I'll credit you)

Welcome to another blog post about my cluster! Although I had to replace the ATX PSU I was planning on using to power the thing with a USB power supply instead, I've got all the Pis powered up and networked together now - so it's time to finally start on the really interesting bit!

In this post, I'm going to show you how to install Unbound as a DNS server. This will underpin the entire stack we're going to be setting up - and over the next few posts in this series, I'll take you through the entire process.

Starting with DNS is a logical choice, which comes with several benefits:

  • We get a minor performance boost by caching DNS queries locally
  • We get to configure our DNS server to make encrypted DNS-over-TLS queries on behalf of the entire network (even if some of the connected devices - e.g. phones - don't support doing so)
  • If we later change our minds and want to shuffle around the IP addresses, it's not as big a deal as it would be if we'd used IP addresses in all our config files

With this in mind, I'm starting with DNS before moving on to Docker and the Hashicorp stack:

A picture of the homepages of the Hashicorp stack

Before we begin, let's set out our goals:

  • We want a caching resolver, to avoid repeated requests across the Internet for the same query
  • We want to encrypt queries that leave the network via DNS-over-TLS
  • We want to be able to add our own custom DNS records for a domain of our choosing, for internal resolution only.

The last point there is particularly important. We want to resolve something like server1.bobsrockets.com to 172.16.230.100 internally, but not from outside the network. This way we can include server1.bobsrockets.com in config files, and if the IP changes then we don't have to go back and edit all our config files - just reload or restart the relevant services.

Without further delay, let's start by installing unbound:

sudo apt install unbound

If you're on another system, translate this for your package manager. See this amazing wiki page for help on translating package manager commands :-)

Next, we need to write the config file. The default config file for me is located at /etc/unbound/unbound.conf:

# Unbound configuration file for Debian.
#
# See the unbound.conf(5) man page.
#
# See /usr/share/doc/unbound/examples/unbound.conf for a commented
# reference config file.
#
# The following line includes additional configuration files from the
# /etc/unbound/unbound.conf.d directory.
include: "/etc/unbound/unbound.conf.d/*.conf"

There's not a lot in the /etc/unbound/unbound.conf.d/ directory, so I'm going to be extending /etc/unbound/unbound.conf. First, we're going to define a section to get Unbound to forward requests via DNS-over-TLS:

forward-zone:
    name: "."
    # Cloudflare DNS
    forward-addr: 1.0.0.1@853
    # DNSlify - ref https://www.dnslify.com/services/resolver/
    forward-addr: 185.235.81.1@853
    forward-ssl-upstream: yes

The . name there simply means "everything". If you haven't seen it before, the fully-qualified domain name for seanssatellites.io for example is as follows:

seanssatellites.io.

Notice the extra trailing dot . there. That's really important, as it signifies the DNS root (not sure on its technical name. Comment if you know it, and I'll update this). The io bit is the top-level domain (commonly abbreviated as TLD). seanssatellites is the actual domain bit that you buy.

It's a hierarchical structure, and apparently used to be inverted here in the UK before the formal standard was defined by the IETF (Internet Engineering Task Force) - of which RFC 1034 was a part.
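
To make the hierarchy a bit more concrete, here's a minimal sketch that walks a fully-qualified domain name from the root downwards. The fqdn_labels function is a hypothetical helper written just for this illustration - it isn't part of Unbound or any DNS tool.

```bash
# Hypothetical helper: print the labels of a fully-qualified domain name,
# starting at the root and working down the hierarchy.
fqdn_labels() {
    rest="${1%.}"            # drop the trailing dot that stands for the root
    echo "(root)"
    while [ -n "$rest" ]; do
        case "$rest" in
            *.*) echo "${rest##*.}"; rest="${rest%.*}" ;;   # peel off the rightmost label
            *)   echo "$rest"; rest="" ;;                   # last remaining label
        esac
    done
}

fqdn_labels "seanssatellites.io."
# prints:
# (root)
# io
# seanssatellites
```

Reading the output top to bottom follows the order in which a resolver conceptually walks the tree: the root first, then the TLD, then the domain you registered.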

Anyway, now that we've told Unbound to forward queries, the next order of business is to define a bunch of server settings to get it to behave the way we want it to:

server:
    interface: 0.0.0.0
    interface: ::0

    ip-freebind: yes

    # Access control - default is to deny everything apparently

    # The local network
    access-control: 172.16.230.0/24 allow
    # The docker interface
    access-control: 172.17.0.0/16 allow

    username: "unbound"

    harden-algo-downgrade: yes
    unwanted-reply-threshold: 10000000


    prefetch: yes

There's a lot going on here, so let's break it down.

Property Meaning
interface Tells unbound what interfaces to listen on. In this case I tell it to listen on all interfaces on both IPv4 and IPv6.
ip-freebind Tells it to try and listen on interfaces that aren't yet up. You probably don't need this, so you can remove it. I'm including it here because I'm currently unsure whether unbound will start before docker, which I'm going to be using extensively. In the future I'll probably test this and remove this directive.
access-control unbound has an access control system, which functions rather like a firewall from what I can tell. I haven't had the time yet to experiment (you'll be seeing that a lot), but once I've got my core cluster up and running I intend to experiment and play with it extensively, so expect more in the future from this.
username The name of the user on the system that unbound should run as.
harden-algo-downgrade Protects against DNSSEC algorithm downgrade attacks: when a zone advertises several signing algorithms, Unbound won't let the weakest one alone validate the zone. The default is no, so I enable it here.
unwanted-reply-threshold Another security-hardening directive. If this many DNS replies are received that unbound didn't ask for, then take protective actions such as emptying the cache just in case of a DNS cache poisoning attack
prefetch Causes unbound to prefetch updated DNS records for cache entries that are about to expire. Should improve performance slightly.

If you have a flaky Internet connection, you can also get Unbound to return stale DNS cache entries if it can't reach the remote DNS server. Do that like this:

server:
    # Serve expired cached responses, but only after a failed
    # attempt to fetch from upstream, and for at most 10 seconds
    # after expiration. Retry every 10s to see if we can get a
    # response from upstream.
    serve-expired: yes
    serve-expired-ttl: 10
    serve-expired-ttl-reset: yes

With this, we should have a fully-functional DNS server. Enable it to start on boot and (re)start it now:

sudo systemctl enable unbound.service
sudo systemctl restart unbound.service

If it's not already started, the restart action will start it.
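
To check that the server is actually answering, something like the following sketch could be used. It assumes the dig tool is installed (on Debian-based systems it usually comes from the dnsutils package); the check_resolver function name is my own invention for this example.

```bash
# Hypothetical helper: ask one specific DNS server for a name and print the answer.
check_resolver() {
    server="$1"
    name="$2"
    if ! command -v dig >/dev/null 2>&1; then
        echo "dig is not installed" >&2
        return 127
    fi
    # +short prints just the answer; +time and +tries stop it hanging on a dead server
    dig +short +time=2 +tries=1 "@$server" "$name"
}
```

Running check_resolver 127.0.0.1 example.com on the machine hosting Unbound should print one or more IP addresses if forwarding is working. No output at all suggests the query didn't get an answer.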

Internal DNS records

If you're like me and want some custom DNS records, then continue reading. Unbound has a pretty nifty way of declaring custom internal DNS records. Let's enable that now. First, you'll need a domain name that you want to return custom internal DNS records for. I recommend buying one - don't use an unregistered one, just in case someone else comes along and registers it.

Gandi are a pretty cool registrar - I can recommend them. Cloudflare are also cool, but they don't allow you to register several years at once yet - so they are probably best set as the name servers for your domain (that's a free service), leaving your domain name with a registrar like Gandi.

To return custom DNS records for a domain name, we need to tell unbound that it may contain private DNS records. Let's do that now:

server:
    private-domain: "mooncarrot.space"

This of course goes in /etc/unbound/unbound.conf, as before. See the bottom of this post for the completed configuration file.

Next, we need to define some DNS records:

server:
    local-zone: "mooncarrot.space." transparent
    local-data: "controller.mooncarrot.space.   IN A 172.16.230.100"
    local-data: "piano.mooncarrot.space.        IN A 172.16.230.101"
    local-data: "harpsichord.mooncarrot.space.  IN A 172.16.230.102"
    local-data: "saxophone.mooncarrot.space.    IN A 172.16.230.103"
    local-data: "bag.mooncarrot.space.          IN A 172.16.230.104"

    local-data-ptr: "172.16.230.100 controller.mooncarrot.space."
    local-data-ptr: "172.16.230.101 piano.mooncarrot.space."
    local-data-ptr: "172.16.230.102 harpsichord.mooncarrot.space."
    local-data-ptr: "172.16.230.103 saxophone.mooncarrot.space."
    local-data-ptr: "172.16.230.104 bag.mooncarrot.space."

The local-zone directive tells it that we're defining custom DNS records for the given domain name. The transparent bit tells it that if it can't resolve using the custom records, to forward it out to the Internet instead. Other interesting values include:

Value Meaning
deny Serve local data (if any), otherwise drop the query.
refuse Serve local data (if any), otherwise reply with an error.
static Serve local data, otherwise reply with an nxdomain or nodata answer (similar to the responses you'd expect from a DNS server that's authoritative for the domain).
transparent Respond with local data, but resolve other queries normally if the answer isn't found locally.
redirect Serves the zone data for any subdomain in the zone.
inform The same as transparent, but logs client IP addresses
inform_deny Drops queries and logs client IP addresses

Adapted from /usr/share/doc/unbound/examples/unbound.conf, the example Unbound configuration file.

Others exist too if you need even more control, like always_refuse (which always responds with an error message).

The local-data directives define the custom DNS records we want Unbound to return, in DNS records syntax (again, if there's an official name for the syntax leave a comment below). The local-data-ptr directive is a shortcut for defining PTR, or reverse DNS records - which resolve IP addresses to their respective domain names (useful for various things, but also commonly used as a step to verify email servers - comment below and I'll blog on lots of other shady and not so shady techniques used here).
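
Writing both a local-data and a local-data-ptr line for every host by hand gets repetitive, so here's a rough sketch of generating them from a list of "name IP" pairs. The gen_records function is a hypothetical helper of my own, not something that ships with Unbound - the output is meant to be pasted into the server: section.

```bash
# Hypothetical helper: read "name ip" pairs on stdin and print the matching
# Unbound local-data (forward) and local-data-ptr (reverse) lines.
gen_records() {
    domain="$1"
    while read -r name ip; do
        [ -z "$name" ] && continue      # skip blank lines
        printf '    local-data: "%s.%s. IN A %s"\n' "$name" "$domain" "$ip"
        printf '    local-data-ptr: "%s %s.%s."\n' "$ip" "$name" "$domain"
    done
}

gen_records "mooncarrot.space" <<'EOF'
controller 172.16.230.100
piano 172.16.230.101
EOF
```

This interleaves the forward and reverse records rather than grouping them as in the config above, which as far as I know makes no difference to Unbound.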

With that, our configuration file is complete. Here it is in its entirety:

# Unbound configuration file for Debian.
#
# See the unbound.conf(5) man page.
#
# See /usr/share/doc/unbound/examples/unbound.conf for a commented
# reference config file.
#
# The following line includes additional configuration files from the
# /etc/unbound/unbound.conf.d directory.
include: "/etc/unbound/unbound.conf.d/*.conf"

server:
    interface: 0.0.0.0
    interface: ::0

    ip-freebind: yes

    # Access control - default is to deny everything apparently

    # The local network
    access-control: 172.16.230.0/24 allow
    # The docker interface
    access-control: 172.17.0.0/16 allow

    username: "unbound"

    harden-algo-downgrade: yes
    unwanted-reply-threshold: 10000000

    private-domain: "mooncarrot.space"

    prefetch: yes

    # ?????? https://www.internic.net/domain/named.cache

    # Serve expired cached responses, but only after a failed
    # attempt to fetch from upstream, and for at most 10 seconds
    # after expiration. Retry every 10s to see if we can get a
    # response from upstream.
    serve-expired: yes
    serve-expired-ttl: 10
    serve-expired-ttl-reset: yes

    local-zone: "mooncarrot.space." transparent
    local-data: "controller.mooncarrot.space.   IN A 172.16.230.100"
    local-data: "piano.mooncarrot.space.        IN A 172.16.230.101"
    local-data: "harpsichord.mooncarrot.space.  IN A 172.16.230.102"
    local-data: "saxophone.mooncarrot.space.    IN A 172.16.230.103"
    local-data: "bag.mooncarrot.space.          IN A 172.16.230.104"

    local-data-ptr: "172.16.230.100 controller.mooncarrot.space."
    local-data-ptr: "172.16.230.101 piano.mooncarrot.space."
    local-data-ptr: "172.16.230.102 harpsichord.mooncarrot.space."
    local-data-ptr: "172.16.230.103 saxophone.mooncarrot.space."
    local-data-ptr: "172.16.230.104 bag.mooncarrot.space."

    fast-server-permil: 500

forward-zone:
    name: "."
    # Cloudflare DNS
    forward-addr: 1.0.0.1@853
    # DNSlify - ref https://www.dnslify.com/services/resolver/
    forward-addr: 185.235.81.1@853
    forward-ssl-upstream: yes

Where do we go from here?

Remember that it's important to not just copy and paste a configuration file, but to understand what every single line of it does.

In a future post in this series, we'll be revising this to forward requests for *.service.mooncarrot.space to Consul, a clustered service discovery system that keeps track of what is running where and presents a DNS server as an interface (there are others).

In the next post, we'll (probably) be looking at setting up Consul - unless something else happens first :P Nomad should be next after that, followed closely by Vault.

Once I've got all that set up (which is a blog post series in and of itself!), I'll then look at encrypting all communications between all nodes in the cluster. After that, we'll (finally?) get to Docker and my plans there. Other things include apt and apk (the Alpine Linux package manager) caching servers - which will have to be tackled separately.

Oh yeah, and I want to explore Traefik, which is apparently like Nginx, but intended for containerised infrastructure.

All this is definitely going to keep me very busy!

Found this interesting? Got a suggestion? Comment below!

Measuring maximum RAM usage with Bash on Linux

While running a simulation on my University's Viper HPC, I found that I needed to measure the maximum RAM usage of a simulation that I was running. Since the solution wasn't particularly easy to find, I thought I'd quickly blog about it here.

Doing this isn't actually as easy as you might think. In the end, I used this:

/usr/bin/time --format 'Max RAM working set size: %Mk' command_here --foo bar

....replacing command_here --foo bar with the command you want to measure.

/usr/bin/time is a program that measures how long a command takes to execute, but it's evident that it measures a bunch of other different things as well. While the output format leaves something to be desired (hence the --format in the above), it does the job pretty well.

Note that /usr/bin/time is distinct from the time built-in you get in Bash. Depending on your shell, you may need to explicitly specify the full path to the time binary. In addition, if your system doesn't have the command (like Viper), you may need to copy it from another system that does have it and that has the same CPU architecture.
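
If you want just the number (to log it from a job script, say), a wrapper along these lines might help. It's a sketch that assumes the GNU version of time lives at /usr/bin/time; max_rss_kb is a hypothetical name I've picked for the example.

```bash
# Hypothetical wrapper: run a command and print only its peak resident set
# size in kilobytes, as reported by GNU time's %M format specifier.
max_rss_kb() {
    if [ ! -x /usr/bin/time ]; then
        echo "/usr/bin/time not found" >&2
        return 127
    fi
    report="$(mktemp)"
    # --output sends the measurement to a file, so it can't get mixed up with
    # the command's own output (which is redirected to stderr here).
    /usr/bin/time --format '%M' --output "$report" "$@" >&2
    status=$?
    tail -n 1 "$report"     # last line: GNU time may add a note above it on failure
    rm -f "$report"
    return "$status"
}

max_rss_kb sleep 0.1 || echo "measurement unavailable"
```

The wrapper passes on the exit code of the command being measured, so it can still be used in a script that checks for failures.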

I forget where it was that I found this solution, but if your post looks familiar then comment below and I'll add the credit to this post.

As a quick extra, you can limit the time a command can execute for like this:

timeout 60 command_here

That will limit command_here to executing for only 60 seconds.
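
In a script, you can tell whether the limit was hit by looking at the exit code: the GNU coreutils version of timeout exits with code 124 when it had to stop the command. A small sketch, using sleep as a stand-in for a long-running command:

```bash
# sleep 5 would take 5 seconds, but timeout stops it after 1
timeout 1 sleep 5
status=$?

if [ "$status" -eq 124 ]; then
    echo "command timed out"
else
    echo "command finished with exit code $status"
fi
# prints: command timed out
```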

Found this interesting? Comment below!

Happy Easter 2020!

Some lovely flowers at University last year.

Happy Easter 2020! I hope everyone has a peaceful day.

Since I'm currently unable to obtain a nice picture for this holiday blog post, here's one from the archives instead. I took it at University last spring. They have always had such amazing flowerbeds - especially in the spring. You can really smell the flowers, and it makes the experience of going into University that much more peaceful.

In terms of my plans on here, I've got a number of interesting posts coming up. I discovered a wonderfully easy way to measure the maximum RAM usage of a process in Bash which I've blogged about, and I've finally got all my ducks in a row and powered up my Raspberry Pi cluster for the first time - so expect regular blog posts about my progress configuring that!

Also, I compressed the above image with mozjpeg, a JPEG encoder for the web by Mozilla. Since I couldn't find a command-line tool for it and can't be bothered to implement one for just a single picture, I used this website to compress it.

How to hash and sign files with GPG and a bit of Bash

When making a release of your software or sending some important documents, it's pretty common practice (especially amongst larger projects) to distribute hashes and GPG signatures along with the release binaries themselves. Hashicorp's Nomad, which I use as the example below, is one project that does this.

This is great practice, since it allows downloaders to verify that their download has not been corrupted, and that it was you who released it and not some imposter.

In this post, I'm going to outline how you can do this too.

I've recently both verified a number of signatures and generated some of my own too, so I thought I'd post about it here to show others how to do it too.

Verification

Before we get into the generation of hashes and signatures, we should first talk about verifying them. I've mentioned that this is good practice already, so it makes sense to cover it first. To start, let's download Nomad version 0.10.5. You can grab the files here. Download the following files:

  • nomad_0.10.5_SHA256SUMS - The hashes themselves
  • nomad_0.10.5_SHA256SUMS.sig - The GPG signature of the above file
  • nomad_0.10.5_linux_arm.zip - Nomad itself, for the ARM architecture (feel free to pick whichever one you like and adapt these instructions accordingly)

Verifying the hashes is easier, so let's do that first. We can see from the filenames that we have SHA-256 hashes, so we'll want the sha256sum command. Windows users will need to use the Windows Subsystem for Linux, or set up an msys environment:

sha256sum --ignore-missing --check nomad_0.10.5_SHA256SUMS

It should output an OK message and return an exit code zero (to check this in a script, you can do echo "$?" directly after running it to check the exit code). 99% of the time this check will succeed, but you'll be glad that you checked the 1% of the time it fails.

Checking the hashes here is the bit that ensures that the files haven't been corrupted. Next, we'll verify the GPG signature. This is the bit that ensures that the files we've downloaded were actually originally released by who we think they were. Do that like this:

gpg --verify nomad_0.10.5_SHA256SUMS.sig

Doing this, you may get an error message telling you that it can't verify the signature because you haven't got the public key of the signer imported into your keyring. To remedy this, look for the bit that tells you the key id (e.g. using RSA key 51852D87348FFC4C). Copy it, and then do this:

gpg --recv-keys 51852D87348FFC4C

This will download the key and import it into your local GPG keyring. Then re-run the gpg --verify command above, and it should work.

Generation

Now that we know how to verify a signature, let's generate our own. First, put the files you want to hash into a directory and cd into it in your terminal. Then, let's generate the hash file:

# Hash files
find . -type f -not -name "*.SHA256*" -print0 | xargs -0 -I{} -P"$(nproc)" sha256sum -b "{}" >HASHES.SHA256

We use find here to locate all the files (other than the hash file itself), and then pass them to xargs, which calls sha256sum to hash the files in question. Finally, we write the hashes to HASHES.SHA256.
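
To convince yourself that a hash file generated this way really does catch changes, here's a self-contained sketch that runs the whole round trip in a temporary directory. The file names a.txt and b.txt are just made up for the demonstration.

```bash
# Work in a throwaway directory with two small test files
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
printf 'hello\n' > a.txt
printf 'world\n' > b.txt

# Same command as above: hash everything except the hash file itself
find . -type f -not -name "*.SHA256*" -print0 | xargs -0 -I{} -P"$(nproc)" sha256sum -b "{}" >HASHES.SHA256

# Every file should verify...
sha256sum --check --quiet HASHES.SHA256 && echo "all files match"

# ...until one of them changes
printf 'tampered\n' > b.txt
if ! sha256sum --check --status HASHES.SHA256; then
    echo "mismatch detected"
fi
```

The --quiet flag hides the per-file OK lines, while --status prints nothing at all and reports the result purely through the exit code, which is handy in scripts.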

Next, let's generate a GPG signature for the hash file. For this, you'll need a GPG key. That's out of scope of this post really, but this tutorial looks like it will show you how to do it. Note that in order for other people to verify the GPG signature you create, you'll probably need to upload your GPG public key to a keyserver (the article I link to shows you how to do this too).

Once done, generate the signature like this:

# Sign the hashes
gpg --sign --detach-sign --armor HASHES.SHA256

Specifically, we generate a detached signature here - meaning that it's in a separate file to the file that is being signed. The --armor there just means to wrap the signature in base64 encoding (I'm pretty sure) and some text so that it's not a raw binary file that might confuse the uninitiated.

Finally, let's verify the signature we just created - just in case:

# Verify the signature, and check we used the right key
gpg --verify HASHES.SHA256.asc

If all's good, it should tell you that the signature is ok. If you've got multiple keys, ensure that you signed it with the correct key here. GPG will sign things with the key you have marked as the default key.

Edit 27th October 2021: Fixed issues with HASHES.SHA256 generation with find

Found this interesting? Having trouble? Want to say hi? Comment below!

PhD Update 3: Simulating simulations with some success

Hey there! Welcome to another PhD update blog post. The last time I posted, I was still working away at getting the rainfall radar data downloader working as intended.

Thankfully, since then I've managed to get it to complete (wow, that took much longer than expected) - and I've now turned my attention to running the physics-based simulation, and beginning to implement the AI(s) that will (hopefully) implicitly learn the parameters of the model in question.

Physics-based simulation patching

Getting to this point, as you might imagine, wasn't quite as straightforward as I initially thought. The physics-based model I'm (currently) using is HAIL-CAESAR, a (supposedly) high-performance version of CAESAR-Lisflood (yes, it's SourceForge *shudder*). Unfortunately, the format for the rainfall data that it takes in is especially space-inefficient - after writing a converter, I found that my 4.5GiB compressed JSON stream files (1 per year) would have turned into about 66GB of uncompressed ASCII - theoretically speaking, by my calculations! I don't have that much disk space free - so clearly another approach is in order.

This approach I speak of is convincing HAIL-CAESAR to take the data in via the standard input. I initially tried using a FIFO (also known as a named pipe), but I ran into this bug in Node.js.

HAIL-CAESAR by default doesn't support taking the data in on the standard input though, so I had to patch HAIL-CAESAR to add support. I did this by getting it to interpret the filename - to mean "use the standard input instead", which from my previous experiences seems to be an unofficial convention that lots of other programs follow. Perhaps at some point soon I should consider contributing my patch back to HAIL-CAESAR for others to enjoy.
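
For anyone who hasn't met either idea before, here's a small shell sketch of both the - convention and a named pipe. This is purely an illustration and not the actual HAIL-CAESAR patch; read_rainfall and the rainfall.fifo file name are hypothetical.

```bash
# Hypothetical helper: treat the filename "-" as "read standard input".
read_rainfall() {
    if [ "$1" = "-" ]; then
        cat                 # no real file: copy standard input through
    else
        cat -- "$1"
    fi
}

# 1. The data arrives on standard input
printf '0.0 1.2\n0.4 0.9\n' | read_rainfall - | wc -l

# 2. The same data arrives through a named pipe (FIFO) instead
fifo_dir="$(mktemp -d)"
mkfifo "$fifo_dir/rainfall.fifo"
# The writer runs in the background, because it blocks until a reader opens the pipe
printf '0.0 1.2\n0.4 0.9\n' > "$fifo_dir/rainfall.fifo" &
read_rainfall "$fifo_dir/rainfall.fifo" | wc -l
wait
rm -r "$fifo_dir"
```

Both calls count 2 lines. In neither case is the data ever written to disk as a regular file, which is the whole point when the uncompressed form wouldn't fit.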

Heightmap tweaking

With that sorted, I also had to mess around with the heightmap I obtained (I got it through my University's "Digimap" service thingy) to get it to be precisely the same size as the rainfall radar data I have.

It turned out that the service I got the heightmap from isn't smart enough to give you precisely the bit you asked for - instead giving you just the tiles that overlap the area you specify. In the end I found myself with ~170 separate tiles (some of which I had to get after the fact because I found I was missing some) - so I ended up implementing a program to stitch them all back together again.

That program ended up turning out much more complete as a separate whole than I thought it would. I'm pretty sure that these heightmap files I've been dealing with are in a standard format, but I'm not aware of its name (if you know, I'd love to hear from you - post a comment below!). It's for these reasons that I ended up releasing it as a pair of packages on npm.

You can find them here:

  • terrain50 - OS Terrain 50 manipulation library
  • terrain50-cli - Command-line interface for the above to make it easy to manipulate heightmaps from the command line

I'll probably make a separate blog post about them at some point soon. For the curious, the API docs (there's a link in the README of the library package too) are automatically updated with my Laminar CI setup :D

Tensor trouble

It is with a considerable amount of anticipation that I'm finally reaching the really interesting part of this experiment. This week, I've started work on implementing a Temporal Convolutional Neural Network (see also this paper). A Temporal CNN is a network type I discovered recently that takes advantage of multiple 3-dimensional CNN layers to allow a CNN-based model to learn temporal-based relationships in a dataset.

I'm not sure how well it's going to work on my particular dataset, given that the existing papers I've found on it use it for classification-based tasks, but I'm pretty hopeful that, with some tweaking, it should perform pretty well. While I haven't yet finished writing up the dataset input logic, I have implemented the core model using the Tensorflow.js layers API:

asciicast

In the end I've decided to give Tensorflow.js another go (I don't think I mentioned it, but I attempted to use it for my Master's summer project, and it didn't work out so well), since I realised that I've implemented a good portion of the data processing code in Javascript (Node.js) already (as mentioned above). Interestingly, HAIL-CAESAR spits out files in the same format as the heightmap I've been working with, which makes processing even easier!

What's next

From here, I intend to finish up my Temporal CNN implementation and get it running on the data I have so far from the HAIL-CAESAR model (which unfortunately isn't a lot - so far I've only got ~8K 5-minute time-steps' worth of output which, if I'm calculating correctly, is just 29 days' worth of simulation). I'm probably going to have to swap HAIL-CAESAR out at some point though, because it's really slow. Or perhaps I just don't know how to use it properly (maybe I should find someone more experienced with it and ask them first).
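
As a rough sanity check on those numbers, the conversion between time-steps and simulated days can be sketched like this. The exact step count of 8000 is an assumption for the sake of the example, since the real figure is only given as "~8K".

```bash
steps=8000                       # "~8K" time-steps, assumed here to be exactly 8000
minutes_per_step=5
minutes_per_day=$(( 60 * 24 ))   # 1440

# Simulated days covered by that many steps (awk for the decimal place)
awk -v s="$steps" -v m="$minutes_per_step" -v d="$minutes_per_day" \
    'BEGIN { printf "%.1f days\n", s * m / d }'
# prints: 27.8 days

# Going the other way: the number of steps needed to cover 29 days
echo "$(( 29 * minutes_per_day / minutes_per_step )) steps"
# prints: 8352 steps
```

So a round 8000 steps comes to a little under 28 days, and 29 days corresponds to about 8.35K steps - both consistent with "~8K".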

Anyway, I'm also going to try implementing a model inspired by the Google rainfall radar nowcasting paper I mentioned in my last post in this series. With both of these implemented, I can start to compare them and see which one is better suited for the task of flood prediction. I might even implement the Grid LSTM model I saw too.

In addition, I have my PhD panel 1 review coming up soon too - so apparently I've got a list of things I need to do to prepare for that - including writing a ~5K word report. I'll probably do this pretty soon - I don't want to be rushing it at the last minute.

Found this interesting? Got a suggestion? Want to say hi? Comment below!

Cluster, Part 2: Grand Designs

In the last part of this series, I talked about my plans for building an ARM-based cluster, because I'm growing out of the Raspberry Pi 3B+ I currently have at home. Since then, I have decided to focus on the compute cluster first, as I have a reasonable amount of room left on the 1TB WD Pidrive I have attached to my existing Raspberry Pi 3B+.

Hardware

To this end, I have been busy ordering parts and organising things to get construction of the compute cluster side of things going. The most important part of the whole cluster is the compute boards themselves. I've decided to go with 4 x Raspberry Pi 4s with 4GB RAM each for the worker nodes, and 1 x Raspberry Pi 4 with 2GB of RAM as the controller (it would have been a 1GB RAM model, but a recent announcement changed my mind :D):

(Above: The Raspberry Pi 4s I'm going to be using. The colourful heatsink cases there are to dissipate heat passively if possible and reduce the need for the fan to run as often. The one with the smaller red heatsink is the controller node - I don't anticipate the load on that node being high enough to need a bigger more expensive heatsink)

My reasoning for Raspberry Pis is software support. They are hugely popular - and from experience I can tell that they are pretty well supported on the software side of things. Issues with hardware features not being supported by the operating system are minimal - and where issues do arise they are more often than not sorted out. Regular kernel security updates are also provided - something that isn't always a thing with Linux distributions for other boards I've noticed.

Although the nodes in the cluster are very important, they are far from the only component I'll need. I'll also need a way to power it - for which I've settled on using a desktop ATX power supply (generously donated by University).

(Above: The ATX power supply, with a few wires cut and other bits and bobs attached. As of this blog post I'm in the middle of wiring it up, so I haven't finished it yet)

This adds some additional complications though, because wiring an ATX power supply up to a fleet of Raspberry Pi 4s isn't as easy as it sounds. To do that, I've decided to wire the 5V and ground wires up to 5 USB type-a breakout boards, with a 3 amp self-resettable fuse on each live (red) wire. Then I can use 5 short type-a to type-c converter cables to power the Raspberry Pi 4s.

(Above: The extra bits and bobs laid out that I'll be using to wire the ATX power supply up to the USB type-a breakout boards. From left to right: 3A self-resettable fuses, 18 AWG wire, Wagos, header pins, and finally the USB type-a breakout boards themselves)

With power to the Raspberry Pis, the core compute hardware is in place. I still need a bunch of things around the edges though, such as a (very quiet) fan to keep it cool:

(Above: A Noctua NF-P14s redux-1200)

I found this particular fan on quietpc.com. While their prices and shipping are somewhat expensive (I didn't actually buy it from there - I got a better deal on Amazon instead), they are a great place to look into the different options available for really quiet fans. I'm pretty sensitive to noise, so having a quiet fan is an important part of my cluster design.

This one is the large 14cm model, so that it fits in front of all 5 Raspberry Pis if they are stood up on their sides and stacked horizontally. It takes 12 volts, so I'll be connecting it to the 12V rail from the ATX power supply. The fan speed is also controllable via PWM (pulse-width modulation), so I plan on using an Arduino (probably one of the Arduino Unos I've got lying around) to control it and present a serial interface or something to the Raspberry Pi that's acting as the controller node in the cluster.

Lastly, another extremely important part of any cluster is a solid switch. Without a great switch at the base of the network, you'll have all sorts of connection issues and the performance of the cluster will be degraded significantly. I'm anticipating that I'll want to transfer significant amounts of data around very quickly (e.g. Docker container images, and later large blocks of data during a storage cluster rebalance).

For this reason, I've bought myself a Netgear GS116v2. While it's unmanaged, I can't afford a more expensive managed switch at this time. It is however gigabit and also has an array of other features such as energy efficient ethernet (802.3az), full duplex gigabit (i.e. 32Gbps of bandwidth available across all ports, which is enough for all ports to be transmitting and receiving gigabit at the same time), and a silent fanless design.
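The bandwidth figure falls out of some simple arithmetic. Here's a small sketch of it, assuming a 16-port switch where every port runs at 1 gigabit per second in both directions at once (the function name is just for illustration):

```bash
#!/usr/bin/env bash
# Total switching capacity needed for every port to send and receive at
# line rate simultaneously (full duplex counts both directions).
switching_capacity_gbps() {
    local ports="$1" port_speed_gbps="$2"
    echo $(( ports * port_speed_gbps * 2 ))
}

echo "$(switching_capacity_gbps 16 1)Gbps"
```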

My Netgear GS116v2

(Above: The switch I'll be using. I watched eBay and got it used for much less than it's available new)

Networking

Hardware isn't the only thing I've been thinking about. While I've been waiting for packages to arrive, I've also been planning out the software I'm going to use and how I'm going to network all my Pis together.

My plans on the networking side of things are subject to significant change depending on how many responsibilities I can convince my home router to give up, but I have drawn up a network diagram showing what I'm currently aiming towards:

An ideal-case scenario network diagram. Explained below.

The cluster is represented on the left half of the diagram. This will probably entail some considerable persuasion of my router to pull off, but a quick look reveals that it's (probably) possible with some trial-and-error.

The idea is that the cluster gets its own subnet, separate from the rest of the home network. Then I can do strange stuff and fiddle with it (hopefully) without affecting everyone else on the network.

Software

Meanwhile, out of all the different aspects of building this cluster I've got the clearest picture as to the software I'm going to be using.

I've decided that I'm going to use a container-based system. I've looked at a number of different options (such as podman and Singularity) - but I'm currently of the opinion that Docker is the most suitable option for what I'm going for. It's not as enterprisey as Singularity, and it seems to be more mature than podman. It also has a huge library of prebuilt containers - but for learning purposes I'm going to be writing almost all my container scripts from scratch - probably using some sort of Alpine Linux container as a base. If I ever run into a situation where Docker isn't suitable and I need something closer to a VM, I'll probably use LXC, which I believe is built on the same underlying Linux kernel features (namespaces and cgroups) that Docker uses.

I'm anticipating that container-based tech is going to be great for managing the stuff that's running on my cluster - so you can expect more posts that go into some depth about how it all works and how I'm setting my system up in the future.

To complement my container-based tech, I'm also going to be using a workload orchestrator. The Viper High-Performance Computer I've recently gained access to has lots of nodes in it and uses Slurm for workload orchestration, but that seems more geared towards environments that have lots of jobs that each have a defined running time. Great for scientific simulations and other such things, but not so great for personal self-hosted applications and the like.

Instead, I'm probably going to use Nomad. It looks seriously cool, and an initial look at the documentation reveals that it's probably going to be much simpler and easier to understand than Kubernetes (see also), which seems to be the other competing software in the business. It also seems to integrate well with other programs done by the same company (HashiCorp) like Consul for service networking management (I'm hoping I can get DNS resolution for the services running on the cluster under control with it) and Vault for secret management (e.g. API keys, passwords, and other miscellaneous secrets) - all of which I'm going to install and experiment with (expect more on that soon).

All of those for now will be backed by an NFS share on all nodes in the cluster for the persistent volumes attached to running containers.

On the controller node I mentioned earlier I'm also going to be running a few extra items to aid in the management of the cluster:

  • A Docker registry, from which the worker nodes will be pulling containers for execution (worker nodes will not have access to the public Docker registry at hub.docker.com)
  • An apt caching proxy - probably apt-cacher-ng. Since all the nodes in the cluster are going to be using the same OS, have the same packages installed, and the same configuration settings etc, it doesn't make much sense for them to be downloading apt packages from the Internet every time - so I'll be caching them locally on the controller node
  • Potentially some sort of reverse proxy that sits in front of all the services running on the cluster, but I haven't decided on how this will fit into the larger puzzle just yet (more research is required). I'm already very familiar with Nginx, but I've seen Traefik recommended for dynamic container-based setups, so I'm going to investigate that too.
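To give a flavour of the apt caching proxy idea, here's a minimal sketch of what pointing a worker node at it might look like. It assumes apt-cacher-ng is listening on its default port (3142) on the controller; the hostname controller.cluster.example and the 01proxy filename are placeholders I've made up, not something I've settled on.

```bash
#!/usr/bin/env bash
# Print the apt configuration line that routes HTTP package downloads
# through a caching proxy. 3142 is apt-cacher-ng's default port.
apt_proxy_config() {
    local host="$1" port="${2:-3142}"
    printf 'Acquire::http::Proxy "http://%s:%s";\n' "$host" "$port"
}

# On each worker node (hostname and filename are placeholders):
#   apt_proxy_config controller.cluster.example | sudo tee /etc/apt/apt.conf.d/01proxy
apt_proxy_config controller.cluster.example
```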

That about covers my high-level design ideas. As of the time of typing, the next thing I need to do is organise a case for it all to go in, fix the loose connections in the screw terminals (not pictured; they arrived after I took the pictures), and then find a place to put it....

Testing storage devices with f3

Some microSD cards (Above: Some microSD cards. Thankfully none of these are fake, but you never know.....)

Always test storage devices after you buy them. I don't just mean check to see if they work (though that's a good idea too), but also that they can actually store the amount of stuff that they advertise they can.

Recently, I bought myself 5 64GB microSD cards for my cluster (more on this very soon in a future blog post!). The first thing I did when I got them was test them to make sure that they could actually store 64GB of stuff. My tool of choice was f3, which stands for Fight Flash Fraud or Fight Fake Flash. I'm glad I did - because 3 of them turned out to be faulty. 2 of them were actually 32GB cards in disguise, and 1 of them wouldn't mount at all.

While this might be my first experience with fake or faulty storage devices, it's hardly an uncommon occurrence. Everything from microSD cards to flash drives - and even regular hard drives! - may be faulty upon arrival - or worse, appear fine at first, and then a few months down the line start corrupting random data for no reason.

f3 is a suite of tools for testing storage devices to make sure they function properly. They work best as a destructive test - i.e. one that destroys existing data on the disk - so if you've got some data on the target disk you want to test, now is the time to back it up (hopefully this is something you've been doing already - more on that in another post if there's the demand).

f3 consists of 3 principal tools:

  • f3probe, which runs a fast test to check for issues (sadly I couldn't get this to work reliably)
  • f3write, which fills a disk with test files
  • f3read, which reads the test files back from disk and validates them

It's a real shame that I can't get f3probe to work reliably. Maybe at some point I'll implement my own version that writes data to every nth block of a device to test it more quickly than the f3write/f3read mechanism I'll explain below (if anyone knows of a better tool that works on Linux, please comment below!)

To test a device, you first need to write the test files to it. I've taken to reformatting the device as ext4 (the Linux filesystem) first:

sudo umount /dev/sdXY; # Unmount it if it's currently mounted
sudo mkfs.ext4 /dev/sdXY; # Format it to ext4

....where /dev/sdXY is the partition you want to format. This isn't mandatory, but it is a quick way of making sure a disk is empty.
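Since formatting the wrong partition is an easy and painful mistake to make, it may be worth double-checking the target before running the format command above. The following is just a sketch (confirm_target is a name I've made up): it refuses anything that isn't a block device, and then prints the size, filesystem, label, and mount point with lsblk so you can eyeball it.

```bash
#!/usr/bin/env bash
# Show details about a partition so it can be checked by eye before it is
# formatted. Fails if the given path isn't a block device at all.
confirm_target() {
    local partition="$1"
    if [[ ! -b "${partition}" ]]; then
        echo "Error: ${partition} is not a block device" >&2
        return 1
    fi
    lsblk -o NAME,SIZE,FSTYPE,LABEL,MOUNTPOINT "${partition}"
}

# Usage sketch (replace /dev/sdXY with your partition):
#   confirm_target /dev/sdXY && echo "Check the above carefully before formatting!"
```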

Next, we need to write the test files to the device. If it isn't already, you'll need to mount it first. This can be done like so:

# If it's not mounted automatically:
sudo mkdir /media/YOUR_USERNAME_HERE/SOME_NAME_HERE;
sudo mount /dev/sdXY /media/YOUR_USERNAME_HERE/SOME_NAME_HERE;
f3write /media/YOUR_USERNAME_HERE/SOME_NAME_HERE

This might take a while - don't forget to replace the paths there with those specific to your setup. With the test files written to the disk, we need to read them back again to make sure they are valid:

f3read /media/YOUR_USERNAME_HERE/SOME_NAME_HERE

This will read them all back again, and then print a summary report at the bottom to tell you what it found. Ideally, it should show a big number of blocks as succeeded, and no blocks in any of the other failure categories.
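If you'd rather not read the report by eye every time, something like the following could check it for you. Treat it as a sketch under an assumption: it expects the f3read summary to contain a line of the form "Data LOST: 0.00 Byte (0 sectors)", so check that against the output of your version of f3 first. The function name f3_report_clean is made up for illustration.

```bash
#!/usr/bin/env bash
# Reads an f3read report on stdin. Succeeds only if it finds a
# "Data LOST" line that reports exactly 0 sectors.
# ASSUMPTION: the summary has a line like "Data LOST: 0.00 Byte (0 sectors)".
f3_report_clean() {
    grep -Eq 'Data LOST:.*\(0 sectors\)'
}

# Usage sketch (the path is a placeholder):
#   f3read /media/YOUR_USERNAME_HERE/SOME_NAME_HERE | tee f3read.log
#   f3_report_clean < f3read.log && echo "Card looks fine" || echo "Card failed"
```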

Running multiple commands like this is effort though, so surely we can do better than this. With some simple shell scripting, we can run both commands at once:

location=/media/YOUR_USERNAME_HERE/SOME_NAME_HERE; f3write "${location}" && f3read "${location}"; alert

If you're on a machine with a graphical desktop, then the ; alert bit on the end should generate a desktop notification when it's done. For other users (e.g. over SSH), this should be removed. Just in case you have a graphical desktop (e.g. Ubuntu Desktop) and the alert bit doesn't work for you, append this to your ~/.bashrc file and restart your terminal:

# Add an "alert" alias for long running commands.  Use like so:
#   sleep 10; alert
alias alert='notify-send --urgency=low -i "$([ $? = 0 ] && echo terminal || echo error)" "$(history|tail -n1|sed -e '\''s/^\s*[0-9]\+\s*//;s/[;&|]\s*alert$//'\'')"'

....I forget where this is from exactly.

If you're not likely to be at your computer when it finishes, then there's still something you can do. Personally I use XMPP for personal messaging, so I thought it would be great if I could get a notification when it was done. Since I've already written xmppbridge for easily sending XMPP messages from the terminal, it was pretty trivial to write a shell script for my bin folder that would send me a message when the process was complete:

#!/usr/bin/env bash

# f3test: Runs f3 on the current directory.
# 
# Usage:
#     f3test "alerts@xmpp.example.com"
# 

destination="$1";

f3write .;
f3read .;

echo "Card testing complete in ${SECONDS}s" | xmppbridge --groupchat --destination "${destination}";

I called this script f3test, and put it in my ~/bin folder. To use it, first cd to the root of the device you want to test (`` in the above examples), and then set a pair of environment variables to let it know how to log in to an XMPP account to send a message:

export XMPP_JID="someone@bobsrockets.com"; # The JID to login with.
export XMPP_PASSWORD="weN33dM0reBoost3rs"; # The password to use when logging in

...remove the --groupchat in the script if it's not a groupchat you want it to send a message to (I have a personal group chat that's just between me and various bots that notify me about various aspects of the systems I manage). If you don't have an XMPP account yet, you can get one at any public server in the XMPP directory, or run your own (see also snikket, which is a distribution of Prosody that's designed to be extremely easy to set up & run)!

Of course, you could just as easily swap the xmppbridge call there with a different command to send a message via a different channel. For example mailx can send emails.
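As a rough idea of what the email variant might look like, here's a sketch. It assumes mailx is installed and that the machine already has a working mail transfer agent configured to deliver the message; the notify_email function name and the address are placeholders.

```bash
#!/usr/bin/env bash
# Send the completion message by email instead of XMPP.
# Assumes a working mailx and mail transfer agent are already set up.
notify_email() {
    local recipient="$1"
    echo "Card testing complete in ${SECONDS}s" \
        | mailx -s "f3test finished" "${recipient}"
}

# In f3test, the xmppbridge line could then be swapped for e.g.:
#   notify_email "you@example.com"
```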

Found this interesting? Got a better tool? Need some help? Comment below!

Installing libonig4 from source to fix php7.4-mbstring

I have several Raspberry Pis. The one I'd like to talk about today though is a 3B+, and for 1 reason or another it has PHP installed on it with the excellent deb.sury.org apt PPA for PHP. Recently, I've upgraded to PHP 7.4. This was fine initially, but soon enough I started to get a warning that php-mbstring couldn't be installed and that I have held broken packages.

This was not a good sign, but after doing some digging it transpired that the package libonig4 was missing - and couldn't be installed because it wasn't available in the Raspbian apt repositories. Awkward.

After doing some quick digging into the Ubuntu apt repositories, I discovered that while it does exist, it isn't built for armhf (the architecture of the Raspberry Pi).

Thankfully though, Ubuntu is open-source - so the source package was available. The Debian tooling makes it relatively easy to build source packages once downloaded too. Unfortunately I couldn't use the apt-get source command to download it as I didn't have an Ubuntu machine to hand, but their website makes it easy to download packages:

https://packages.ubuntu.com/bionic/libonig4

On here, you'll want to download the 3 source package files:

The source package download page

Download them to a new directory. Then, extract the source files like so:

cd path/to/directory;
dpkg-source -x *.dsc;

Next, cd into the created directory, and build the source files into a bunch of .deb files:

cd libonig-6.7.0/;
dpkg-buildpackage --no-sign;

The --no-sign there is necessary, because otherwise I encountered errors where it tried to automatically sign the resulting package with the original author's secret key, which we obviously don't have access to!
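If the build stops early complaining about unmet build dependencies, it can help to check for them up-front. Here's a small sketch using dpkg-checkbuilddeps (which comes with the dpkg-dev package): it reads debian/control in the source directory and lists anything that's missing. The wrapper function name is my own invention.

```bash
#!/usr/bin/env bash
# Check that the build dependencies listed in debian/control are installed
# before attempting the build. dpkg-checkbuilddeps ships in dpkg-dev.
check_build_deps() {
    local source_dir="$1"
    if [[ ! -f "${source_dir}/debian/control" ]]; then
        echo "Error: ${source_dir} doesn't look like an unpacked source package" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is left alone
    ( cd "${source_dir}" && dpkg-checkbuilddeps )
}

# Usage sketch:
#   check_build_deps libonig-6.7.0 && echo "Ready to build"
```

Anything it lists as missing can then be installed with apt before running the build again.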

Once done (it might take a moment), a bunch of .deb files will be generated in the parent directory:

Filename                           Description
libonig4_6.7.0-1_armhf.deb         The actual package itself
libonig4-dbgsym_6.7.0-1_armhf.deb  Debugging symbols generated in the build process
libonig-dev_6.7.0-1_armhf.deb      Development headers (in case you need to build another package against it)

Out of these 3, the top and bottom ones are probably the ones you want to install. This can be done like so:

sudo dpkg -i libonig4_6.7.0-1_armhf.deb;
sudo dpkg -i libonig-dev_6.7.0-1_armhf.deb;

This completes the process. Now, we can install php7.4-mbstring as normal:

sudo apt install php7.4-mbstring

Success! This should solve the problem. I figured this out in part by following a Unix Stackexchange answer that I have since lost, but I had to adapt the instructions significantly - so I decided to blog about it here.

Found this useful? Still encountering issues? Comment below!

Variable-length fuzzy hashes with Nilsimsa for did you mean correction

Or, why fuzzy hashing isn't helpful for improving a search engine. Welcome to another blog post about one of my special interests: search engines - specifically the implementation thereof :D

I've blogged about search engines before, in which I looked at taking my existing search engine implementation to the next level by switching to a SQLite-based key-value datastore backing and stress-testing it with ~5M words. Still not satisfied, I'm now turning my attention to query correction. Have you ever seen something like this when you make a typo when you do a search?

Surprisingly, this is actually quite challenging to achieve. The problem is that given a word with a typo in it, while it's easy to determine if a word contains a typo, it's hard to determine what the correct version of the word is. Consider a wordlist like this:

apple
orange
pear
grape
pineapple

If the user entered something like pinneapple, then it's obvious to us that the correct spelling would be pineapple - but in order to determine this algorithmically you need an algorithm capable of determining how close 2 different words are to 1 another.

The most popular algorithm for this is called Levenshtein distance. Given 2 words a and b, it calculates the minimum number of single-character edits (insertions, deletions, or substitutions) needed to turn a into b. For example, the edit distance between pinneapple and pineapple is 1.

This is useful, but it still doesn't help us very much. With this, we'd have to calculate the Levenshtein distance between the typo and all the words in the list. This could easily run into millions of words for large wikis, so this is obviously completely impractical.
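To make the brute-force approach concrete, here's a sketch of it in bash (purely for illustration - it's not code from my search engine, and the function names are made up). The first function computes the edit distance with the classic dynamic-programming approach, and the second compares a typo against every word in a list read from standard input, which is exactly the linear scan that doesn't scale:

```bash
#!/usr/bin/env bash
# Edit distance between 2 words, keeping only 2 rows of the
# dynamic-programming table in memory at a time.
levenshtein() {
    local a="$1" b="$2"
    local len_a=${#a} len_b=${#b}
    local -a prev curr
    local i j cost del ins sub min

    # Distance from the empty prefix of a to each prefix of b
    for (( j = 0; j <= len_b; j++ )); do prev[j]=$j; done

    for (( i = 1; i <= len_a; i++ )); do
        curr[0]=$i
        for (( j = 1; j <= len_b; j++ )); do
            cost=1
            [[ "${a:i-1:1}" == "${b:j-1:1}" ]] && cost=0
            del=$(( prev[j] + 1 ))       # Delete a character from a
            ins=$(( curr[j-1] + 1 ))     # Insert a character into a
            sub=$(( prev[j-1] + cost ))  # Substitute (free if they match)
            min=$del
            (( ins < min )) && min=$ins
            (( sub < min )) && min=$sub
            curr[j]=$min
        done
        prev=("${curr[@]}")
    done
    echo "${prev[len_b]}"
}

# Brute force: compare the typo against every word on stdin
closest_word() {
    local typo="$1" word distance best_word="" best_distance=-1
    while IFS= read -r word; do
        [[ -z "$word" ]] && continue
        distance="$(levenshtein "$typo" "$word")"
        if (( best_distance < 0 || distance < best_distance )); then
            best_distance=$distance
            best_word=$word
        fi
    done
    echo "${best_word} (distance ${best_distance})"
}

printf '%s\n' apple orange pear grape pineapple | closest_word pinneapple
# Prints: pineapple (distance 1)
```

With 5 words this is instant, but every query costs 1 distance calculation per word in the list.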

To this end, we need a better idea. In this post, I'm going to talk about my first attempt at solving this problem. I feel it's important to document failures as well as successes, so this is part 1 of a 2 part series.

The first order of business is to track down a Nilsimsa implementation in PHP - since it doesn't come built-in, and it's pretty complicated to implement. Thankfully, this isn't too hard - I found this one on GitHub.

Nilsimsa is a fuzzy hashing algorithm. This means that if you hash 2 similar words, then you'll get 2 similar hashes:

Word Hash
pinneapple 020c2312000800920004880000200002618200017c1021108200421018000404
pineapple 0204239242000042000428018000213364820000d02421100200400018080200256

If you look closely, you'll notice that the hashes are quite similar. My thinking is that if we vary the hash size, then words that are similar will have identical hashes, allowing the search space to be cut down significantly. The existing Nilsimsa implementation I've found doesn't support that though, so we'll need to alter it.

This didn't turn out to be too much of a problem. By removing some magic numbers and adding a class member variable, it seems to work like a charm:

(Can't view the above? Try this direct link.)

I removed the comparison functions since I'm not using them (yet?), and also added a static convenience method for generating hashes. If I end up using this for large quantities of hashes, I may come back to it to make it resettable, to avoid having to create a new object for every hash.

With this, we can get the variable-length hashes we wanted:

256       0a200240020004a180810950040a00d033828480cd16043246180e54444060a5
128       3ba286c0cf1604b3c6990f54444a60f5
64        02880ed0c40204b1
32        060a04f0
16        06d2
8         06

The number there is the number of bits in the hash, and the hex value is the hash itself. The algorithm defaults to 256-bit hashes. Next, we need to determine which sized hash is best. The easiest way to do this is to take a list of typos, hash the typo and the correction, and count the number of hashes that are identical.

Thankfully, there's a great dataset just for this purpose. Since it's formatted in CSV, we can download it and extract the typos and corrections in 1 go like this:

curl https://raw.githubusercontent.com/src-d/datasets/master/Typos/typos.csv | cut -d',' -f2-3 >typos.csv

There's also a much larger dataset too, but that one is formatted as JSON objects and would require a bunch of processing to get it into a format that would be useful here - and since this is just a relatively quick test to get a feel for how our idea works, I don't think it's too crucial that we use the larger dataset just yet.

With the dataset downloaded, we can run our test. First, we need to read the file in line-by-line for every hash length we want to test:

<?php
$handle = fopen("typos.csv", "r");

$sizes = [ 256, 128, 64, 32, 16, 8 ];
foreach($sizes as $size) {
    fseek($handle, 0); // Jump back to the beginning
    fgets($handle); // Skip the first line since it's the header

    while(($line = fgets($handle)) !== false) {
        // Do something with the next line here
    }
}

PHP has an inbuilt function fgets() which gets the next line of input from a file handle, which is convenient. Next, we need to actually do the hashes and compare them:

<?php

// .....

$parts = explode(",", trim($line), 2);
if(strlen($parts[1]) < 3) {
    $skipped++;
    continue;
}
$hash_a = Nilsimsa::hash($parts[0], $size);
$hash_b = Nilsimsa::hash($parts[1], $size);

$count++;
if($hash_a == $hash_b) {
    $count_same++;
    $same[] = $parts;
}
else {
    $not_same[] = $parts;
}
echo("$count_same / $count ($skipped skipped)\r");

// .....

Finally, a bit of extra logic around the edges and we're ready for our test:

<?php
$handle = fopen("typos.csv", "r");
$line_count = lines_count($handle);
echo("$line_count lines total\n");

$sizes = [ 256, 128, 64, 32, 16, 8 ];
foreach($sizes as $size) {
    fseek($handle, 0); fgets($handle); // Skip the first line since it's the header

    $count = 0; $count_same = 0; $skipped = 0;
    $same = []; $not_same = [];
    while(($line = fgets($handle)) !== false) {
        $parts = explode(",", trim($line), 2);
        if(strlen($parts[1]) < 3) {
            $skipped++;
            continue;
        }
        $hash_a = Nilsimsa::hash($parts[0], $size);
        $hash_b = Nilsimsa::hash($parts[1], $size);

        $count++;
        if($hash_a == $hash_b) {
            $count_same++;
            $same[] = $parts;
        }
        else $not_same[] = $parts;
        echo("$count_same / $count ($skipped skipped)\r");
    }

    file_put_contents("$size-same.csv", implode("\n", array_map(function ($el) {
        return implode(",", $el);
    }, $same)));
    file_put_contents("$size-not-same.csv", implode("\n", array_map(function ($el) {
        return implode(",", $el);
    }, $not_same)));

    echo(str_pad($size, 10)."→ $count_same / $count (".round(($count_same/$count)*100, 2)."%), $skipped skipped\n");
}

I'm writing the pairs that are the same and different to different files here for a visual inspection. I'm also skipping words that are less than 3 characters long, and that lines_count() function there is just a quick helper function for counting the number of lines in a file for the progress indicator (if you write a \r without a \n to the terminal, it'll reset to the beginning of the current line):

<?php
function lines_count($handle) : int {
    fseek($handle, 0);
    $count = 0;
    while(fgets($handle) !== false) $count++;
    return $count;
}

Unfortunately, the results of running the test aren't too promising. Even with the shortest hash the algorithm will generate without getting upset, only ~23% of typos generate the same hash as their correction:

7375 lines total
256       → 7 / 7322 (0.1%), 52 skipped
128       → 9 / 7322 (0.12%), 52 skipped
64        → 13 / 7322 (0.18%), 52 skipped
32        → 64 / 7322 (0.87%), 52 skipped
16        → 347 / 7322 (4.74%), 52 skipped
8         → 1689 / 7322 (23.07%), 52 skipped

Furthermore, digging deeper with an 8-bit hash you start to get large numbers of words that have the same hash, which isn't ideal at all.

A potential solution here would be to use Hamming distance (basically counting the number of bits that are different between 2 strings of binary) to determine which hashes are similar to each other, much like Levenshtein distance does for words, but that still doesn't help us as we then still have a problem that's almost identical to where we started.
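For the curious, here's roughly what a Hamming distance calculation over 2 hex-encoded hashes looks like. This is an illustrative bash sketch rather than anything from the Nilsimsa implementation above, and the example hashes are made up:

```bash
#!/usr/bin/env bash
# Count the number of bits that differ between 2 hex strings of equal length.
hamming_hex() {
    local a="$1" b="$2"
    if (( ${#a} != ${#b} )); then
        echo "hamming_hex: hashes must be the same length" >&2
        return 1
    fi
    local i x distance=0
    for (( i = 0; i < ${#a}; i++ )); do
        # XOR the 2 hex digits: any bit left set is a bit that differs
        x=$(( 16#${a:i:1} ^ 16#${b:i:1} ))
        while (( x > 0 )); do
            distance=$(( distance + (x & 1) ))
            x=$(( x >> 1 ))
        done
    done
    echo "$distance"
}

hamming_hex 06d2 06d3   # Prints 1: only the lowest bit differs
hamming_hex ff 00       # Prints 8: every bit differs
```

The catch is the same as before: to find the nearest hash, you'd still have to compare against every hash in the index.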

In the second part of this mini-series, I'm going to talk about how I ultimately solved this problem. While the algorithm I ultimately used (a BK-Tree, more on them next time) is certainly not the most efficient out there (it's O(log n) if I understand it correctly), it's very simple to implement and is much less complicated than Symspell, which seems to be the most efficient algorithm that exists at the moment.

Additionally, I have been able to optimise said algorithm to return results for a 172K wordlist in ~110ms, which is fine for my purposes.

Found this interesting? Got another algorithm I should check out? Got confused somewhere along the way? Comment below!

Art by Mythdael