Stardust | Starbeamrainbowlabs

Running multiple local versions of CUDA on Ubuntu without sudo privileges

I've been playing around with Tensorflow.js for my PhD (see my PhD Update blog post series), and I had some ideas that I wanted to test out on my own that aren't really related to my PhD. In particular, I've found this blog post to be rather inspiring - where the author sets up a character-based recurrent neural network to generate text.

The idea of transcoding all those characters to numerical values and back seems like too much work and too complicated just for a quick personal project though, so my plan is to try and develop a byte-based network instead, in the hopes that I can not only teach it to generate text as in the blog post, but valid Unicode as well.

Obviously, I can't really use the University's resources ethically for this (as it's got nothing to do with my University work) - so since I got a new laptop recently with an Nvidia GeForce RTX 2060, I thought I'd try and use it for some machine learning instead.

The problem here is that Tensorflow.js requires only CUDA 10.0, but since I'm running Ubuntu 20.10 with all the latest patches installed, I have CUDA 11.1. A quick search of the apt repositories on my system reveals nothing that suggests I can install older versions of CUDA alongside the newer one, so I had to devise another plan.

I discovered some months ago (while working with Viper - my University's HPC - for my PhD) that you can actually extract - without sudo privileges - the contents of the CUDA .run installers. By then fiddling with your PATH and LD_LIBRARY_PATH environment variables, you can get any program you run to look for the CUDA libraries elsewhere instead of loading the default system libraries.

Since this is the second time I've done this, I thought I'd document the process for future reference.

First, you need to download the appropriate .run installer for the CUDA libraries. In my case I need CUDA 10.0, so I've downloaded mine from here:

https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_type=runfilelocal

Next, we need to create a new subdirectory and extract the .run file into it. Do that like so:

cd path/to/runfile_directory;
mkdir cuda-10.0
./cuda_10.0.130_410.48_linux.run --extract=${PWD}/cuda-10.0/

Make sure that the current working directory contains no spaces, no preferably no other special characters either. Also, adjust the file and directory names to suit your situation.

Once done, this will have extract 3 subfiles - which also have the suffix .run. We're only interested in CUDA itself, so we only need to extract the the one that starts with cuda-linux. Do that like so (adjusting file/directory names as before):

cd cuda-10.0;
./cuda-linux.10.0.130-24817639.run -noprompt -prefix=$PWD/cuda;
rm *.run;
mv cuda/* .;
rmdir cuda;

If you run ./cuda-linux.10.0.130-24817639.run --help, it's actually somewhat deceptive - since there's a typo in the help text! I corrected it for this above though. Once done, this should leave the current working directory containing the CUDA libraries - that is a subdirectory next to the original .run file:

+ /path/to/some_directory/
    + ./cuda_10.0.130_410.48_linux.run
    + cuda-10.0/
        + version.txt
        + bin/
        + doc/
        + extras/
        + ......

Now, it's just a case of fiddling with some environment variables and launching your program of choice. You can set the appropriate environment variables like this:

export PATH="/absolute/path/to/cuda-10.0/bin:${PATH}";
if [[ ! -z "${LD_LIBRARY_PATH}" ]]; then
    export LD_LIBRARY_PATH="/absolute/path/to/cuda-10.0/lib64:${LD_LIBRARY_PATH}";
else
    export LD_LIBRARY_PATH="/absolute/path/to/cuda-10.0/lib64";
fi

You could save this to a shell script (putting #!/usr/bin/env bash before it as the first line, and then running chmod +x path/to/script.sh), and then execute it in the context of the current shell for example like so:

source path/to/activate-cuda-10.0.sh

Many deep learning applications that use CUDA also use CuDNN, a deep learning library provided by Nvidia for accelerating deep learning applications. The archived versions of CuDNN can be found here: https://developer.nvidia.com/rdp/cudnn-archive

When downloading (you need an Nvidia developer account, but thankfully this is free), pay attention to the version being requested in the error messages generated by your application. Also take care to download the version of CUDA you're using, and match the CuDNN version appropriately.

When you download, select the "cuDNN Library for Linux" option. This will give you a tarball, which contains a single directory cuda. Extract the contents of this directory over the top of your CUDA directory from following my instructions above, and it should work as intended. I used my graphical archive manager for this purpose.

PhD Update 6: The road ahead

Hey, I'm back early with another post in my PhD series! Turns out there was a bit of mix up last time, and I misplaced update 4 - so I've renamed the duplicate update 4 to update 5. Before we continue, here are all the posts in this series so far:

Earlier today, I had my PhD panel 2. For those reading who don't know, the primary assessments for a PhD (at least in my case) take the form of 6 panels - in which you write a big report, and then your supervisors get together with you and discuss it. The even numbered ones at the end of each year are the more important ones, I'm led to believe - so I was understandably nervous.

Thankfully, it went well, and I ended up writing a report that's a third longer than my undergraduate dissertation at ~9k words O.o Anyway, I wanted to make another post in this series, as the process of writing the report (and research plan) for my panel has made me look at the bigger picture of where my PhD is going, and how it's all going to tie together - and I wanted to share this here.

Temporal CNN

In the last few posts, I've given some insights into the process I've been working through to train a Temporal CNN to predict the output of HAIL-CAESAR. This is starting to reach its conclusion - though there are a number of tasks I have yet to complete to close this chapter of the story. In particular, I've (finally!) got the results from the hyperparameter optimisation I've been doing. Let's check out a heatmap:

The aforementioned heatmap - explained below.

I plotted the above with GNUPlot, but ran into a number of issues (of course) while generating it, which I'd like to briefly discuss here. If you're interested in a more detailed discussion of this, please get in touch (I'll also need to know your real name and where you're working / studying, just in case) via a private communication method - such as my email address on my website.

Firstly, those who are particularly particularly perceptive will notice the rather strange filename for the above image. As indicated, I ran into some instability in my early stopping algorithm. For each epoch, the algorithm I implemented looked at the error from 3 epochs ago - and if it's greater than the error value for the epoch that's just finished it allows training to continue. Unfortunately, due to the nature of the task at hand, this sometimes caused it to stop training too soon. Further analysis of the results revealed that it sometimes allowed it to continue training too long as well (which is reflected in the above chart), but I'm baffled on that one.

For this reason, all hyperparameter combinations that didn't train for quite long enough were omitted in the above graph.

Secondly, there's a large area of the chart that isn't filled in. This is because both increasing the number of filters and the temporal depth increases memory usage - and the area that's unfilled is the area in which it failed to train because it ran out of GPU memory (I used 4 x Nvidia GeForce V100 - thanks very much to my department for providing this!).

Finally, the most important thing to note is that after reviewing other similar models, I discovered that this really isn't the best way to evaluate the performance of this kind of model. Rather, it would be better to consider this as a classification-based task instead. This can be achieved by creating a number of different bins for the water depth values, and assign a probability to each pixel for each water depth bin. The technical name for this is cross-entropy loss as far as I'm aware, which in my case I've been tending to prefix with pixel-based.

In this fashion, I can use things like a confusion matrix (can't find a good explanation of this, so I'll talk about it in a future blog post) to evaluate the performance of the model - which allows me to see the models strengths and weaknesses. What are its strengths? What are its weaknesses? I hope to find out - though I'm a little nervous about it, as I haven't yet looked into how to generate one with Tensorflow.js or tested my initial cross-entropy loss implementation with my kind of dataset yet.

Having a probability-based system like this should also make the output more useful. By this, I mean that having probabilities assigned to different water depths should be easier to interpret than a single value which nobody knows how the model came to that conclusion, or how uncertain the model is about the result.

Let's take look at 1 more set of graphs before we move on:

4 graphs showing the training / validation root mean squared error - see below for explanation.

In this set of 4 graphs, I've taken the training and validation root mean squared error from 4 different combinations of hyperparameters. These rather effectively illustrate the early stopping stability here (top left), and they also demonstrate some instability in the training process (all graphs). I'm guessing the latter is probably caused by insufficient training data - this I can remedy quite easily as I have plenty more besides the 5K time steps I trained on (~1.5M time steps to be precise), but in the interests of time I limited the training dataset so that I could train a variety of different models.

It also shows the shortcomings of the root mean-squared-error approach I've been using so far - is an average error of 2 biased towards deeper or shallower water? What can it predict well, and what does it struggle with?

My supervisor also wants me to write a publication on the Temporal CNN stuff I've done when it's complete, so that's going to be an interesting new experience for me too (I'm both excited and terribly nervous at the same time).

Social Media Analysis

Moving forwards, I'm hoping to complement my Temporal CNN model with some social media analysis (subject to ethical approval of course - filling out the paperwork for this is part of my task list for the rest of this week). If I get the go-ahead, my intention is to bring some natural language processing AI model to the subject of flood mapping using social media - as I've noticed that there's.... limited existing material on this so far.

I have yet to read up properly on the subject, but most recently I've found this paper rather interesting. It uses an unsupervised model to identify the topic(s) a tweet is talking about, which they then follow up with some statistical analysis.

I'm a little fuzzy on how they identify where tweets are talking about (especially since this paper mentions that only a very small percentage of tweets have a geotag attached, and even of these a fraction will be wrong or misleading for 1 reason or another), but the researchers put together a rather nice map by chaining together several statistical techniques techniques I'm currently unfamiliar with. It shows hot spots and cold spots, which indicate where the damage for an earthquake has occurred.

If possible, it would be very nice indeed to plot a flood on a map (stretch goal: in real-time!). I anticipate this to be a key issue I'll need to pay attention to.

In terms of my most immediate starting point, I'm going to be doing some more reading on the subject, and then have a chat with my supervisor about the next steps (she's particularly knowledgable about natural language processing :D).

Conclusion

In conclusion, I'm starting to come to the end of the Temporal CNN chapter of my PhD, although I suspect it's going to keep coming back over and over again to say hello. Moving forwards, I'm hoping to complement work I've done so far with some social media analysis (subject to ethical approval) using AI-based Natural Language Processing - which will probably consist of improving an existing model or something more directly computer science related (my existing work is classified more as a contribution to environmental sciences, apparently).

Look out for more posts in this series in the future, as I'm sure I'll have plenty to talk about soon (I might do another one in a month's time, or it might end up being nearer 2 months depending on circumstances).

Proteus VIII Laptop from PC Specialist in Review

Recently I bought a new laptop from PC Specialist. Unfortunately I'm lost the original quote / specs that were sent to me, but it was a Proteus VIII. It has the following specs:

CPU: Intel i7-10875H
RAM: 32 GiB DDR4 2666MHz
Disk: 1 TiB SSD (M.2; nvme)
GPU: Nvidia GeForce RTX 2060

In this post, I want to give a review now that I've had the device for a short while. I'm still experiencing some teething issues (more on those later), but I've experienced enough of the device to form an opinion on it. This post will also serve as a sort-of review of the installation process of Ubuntu too.

It arrived in good time - thankfully I didn't have any issues with their choice of delivery service (DPD in my area have some problems). I did have to wait a week or 2 for them to build the system, but I wasn't in any rush so this was fine for me. The packaging it arrived it was ok. It came in a rather large cardboard box, inside which there was some plastic padding (sad face), inside which there was another smaller cardboard box. Work to be done in the eco-friendly department, but on the whole good here.

I ordered without an operating system, as my preferred operating system is Ubuntu (the latest version is currently 20.10 Groovy Gorilla). The first order of business was the OS installation here. This went went fine - but only after I could actually get the machine to boot! It turns out that despite it appearing to have support for booting from USB flash drives as advertised in the boot menu, this feature doesn't actually work. I tried the following:

The official Ubuntu ISO flashed to a USB 3 flash drive
A GRUB installation on a USB 3 flash drive
A GRUB installation on a USB 2 flash drive
Ubuntu 20.10 burned to a DVD in an external DVD drive (ordered with the laptop)

....and only the last one worked. I've worked with a diverse range of different devices, but never have I encountered one that completely refused to boot at all from a USB drive. Clearly some serious work is required on the BIOS. The number of different settings in the BIOS were also somewhat limited compared to other systems I've poked around on, but I can't give any specific examples here of things that were missing (other than a setting to toggle the virtualisation extensions, which was on by default) - so I guess it doesn't matter all that much. The biggest problem is the lack of USB flash drive boot support - that was really frustrating.

When installing Ubuntu this time around, I decided to try enabling LVM (Logical Volume Management, it's very cool I've discovered) and a LUKS encrypted hard drive. Although I've encountered these technologies before, this will be my first time using them regularly myself. Thankfully, the Ubuntu installer did a great job of setting this up automatically (except the swap partition, which was too small to hibernate, but I'll talk about that in a moment).

Once installed, I got to doing the initial setup. I'm particularly picky here - I use the Unity 7.5 Desktop (yes, I know Ubuntu now uses the GNOME shell, and no I haven't yet been able to get along with it). I'll skip over the details of the setup here, as it's not really relevant to the review. I will mention though that I'm also using X11, not Wayland at the moment - and that I have the propriety Nvidia driver installed (version 450 at the time of typing).

Although I've had a discrete graphics card before (most recently an AMD Radeon R7 M445, and an Nvidia 525M), this is the first time I've had one that's significantly more powerful than the integrated graphics that's built into the CPU. My experience with this so far is mostly positive (it's rather good at rendering in Blender, but I have yet to stress it significantly), and in some graphical tests it gives significantly higher frame rates than the integrated graphics. If you use the propriety graphics drivers, I recommend going into the Nvidia X server settings (accessed through the launcher) → PRIME Profiles, changing it to "On-Demand", and then rebooting. This will prolong your battery life and reduce the noise from the fans by using the integrated graphics by default, but allow you to run select applications on the GPU (see my recent post on how to do this).

It's not without its teething issues though. I think I'm just unlucky, but I have yet to setup a system with an Nvidia graphics card where I haven't had some kind of problem. In this case, it's screen flickering. To alleviate this somewhat, I found and followed the instructions in this Ask Ubuntu Answer. I also found I had to enable the Force synchronization between X and GLX workaround (and maybe another one as well, I can't remember). Even with these enabled, sometimes I still get flickering after it resumes from suspension / stand by.

Speaking of stand by mode, I've found that this laptop does not like hibernation at all. I'm unsure as to whether this is just because I'm using LVM + LUKS, or whether it's an issue with the device more generally, but if I try sudo pm-hibernate from the terminal, the screen flashes a bit, the mouse cursor disappears, and then the fan spins up - with the screen still on and all my windows apparently still open.

I haven't experimented with the quirks / workarounds provided yet, but I guess ties into the early issues with the BIOS, in that there are some clear issues with the BIOS that need to be resolved.

This hibernation issue also ties into the upower subsystem, in that even if you tell it (in both the Unity and GNOME desktop shells) to "do nothing" on low battery, it will forcefully turn the device off - even if you're in the middle of typing a sentence! I think this is because upower doesn't seem to have an option for suspend or "do nothing" in /etc/Upower/UPower.conf or something? I'm still investigating this issue (if you have any suggestions, please do get in touch!).

Despite these problems, the build quality seems good. It's certainly nice having a metal frame, as it feels a lot more solid than my previous laptop. The keyboard feels great too - the feedback from pressing the keys enhances the feeling of a solid frame. The keyboard is backlit too, which makes more a more pleasant experience in dimly lit rooms (though proper lighting is a must in any workspace).

The layout of the keyboard feels a little odd to me. It's a UK keyboard yes (I use a UK keyboard myself), but it doesn't have dedicated Home / End / Page Up / Page Down keys - these are built into the number pad at the right hand side of the keyboard. It's taken some getting used to toggling the number lock every time I want to use these keys, which increases cognitive load.

It does have a dedicated SysRq key though (which my last laptop didn't have), so now I can articles like this one and use the SysRq feature to talk to the Linux Kernel directly in case of a lock-up or crash (I have had the screen freeze on me once or twice - I later discovered this was because it had attempted to hibernate and failed, and I also ran into this problem, which I have yet to find a resolution to), or in case I accidentally set off a program that eats all of the available RAM.

The backlight of the keyboard goes from red at the left-hand side to green in the middle, and blue at the right-hand side. According to the PC Specialist forums, there's a driver that you can install to control this, but the installation seems messy - and would probably need recompiling every time you install a new kernel since DKMS (Dynamic Kernel Module System, I think) isn't used. I'm ok with the default for now, so I haven't bothered with this.

The touchpad does feel ok. It supports precision scrolling, has a nice feel to it, and isn't too small, so I can't complain about it.

The laptop doesn't have an inbuilt optical drive, which is another first for me. I don't use optical disks often, but it was nice having a built-in drive for this in previous laptops. An external one just feels clunky - but I guess I can't complain too much because of the extra components and power that are built-in to the system.

The airflow of the system - as far as I can tell so far, is very good. Air comes in through the bottom, and is then pushed out again through the back and the back of the sides by 2 different fans. These fans are, however, rather noisy at times - and have taken some getting used to as my previous Dell laptop's fans were near silent until I started to stress the system. The noise they make is also slightly higher pitched too, which makes it more noticeable - and sound like a jet engine (though I admit I've never heard a real one in person, and I'm also somewhat hypersensitive to sound) when at full blast. Curiously, there's a dedicated key on the keyboard that - as far as I can tell - toggles between the normal on-demand fan mode and locking the fans at full blast. Great to quickly cool down the system if the fans haven't kicked in yet, but not so great for your ears!

I haven't tested the speakers much, but from what I can tell they are appropriately placed in front of the keyboard just before the hinge for the screen - which is a much better placement than on the underside at the front in my last laptop! Definitely a positive improvement there.

I wasn't sure based on the details on the PC specialist website, but the thickness of the base is 17.5mm at the thickest point, and 6mm for the screen - making ~23.5mm in total (although my measurements may not be completely accurate).

To summarise, the hardware I received was great - overlooking a few pain points such as the BIOS and poor keyboard layout decisions. Some work is still needed on environmental issues and sustainability, but packaging was on the whole ok. Watch out for the delivery service, as my laptop was delivered by DPD who don't have a great track record in my area.

Overall, the hardware build quality is excellent. I'm not sure if I can recommend them yet, but if you want a new PC or laptop they are certainly not a bad place to look.

Found this helpful? Got a suggestion? Want to say hi? Comment below!

Run a Program on your dedicated Nvidia graphics card on Linux

I've got a new laptop, and in it I have an Nvidia graphics card. The initial operating system installation went ok (Ubuntu 20.10 Groovy Gorilla is my Linux distribution of choice here), but after I'd done a bunch of configuration and initial setup tasks it inevitably came around to the point in time that I wanted to run an application on my dedicated Nvidia graphics card.

Doing so is actually really quite easy - it's figuring out the how that was the hard bit! To that end, this is a quick post to document how to do this so I don't forget next time.

Before you continue, I'll assume here that you're using the Nvidia propriety drivers. If not, consult your distribution's documentation on how to install and enable this.

With Nvidia cards, it depends greatly on what you want to run as to how you go about doing this. For CUDA applications, you need to install the nvidia-cuda-toolkit package:

sudo apt install nvidia-cuda-toolkit

...and then there will be an application-specific menu you'll need to navigate to tell it which device to use.

For traditional graphical programs, the process is different.

In the NVIDIA X Server Settings, go to PRIME Profiles, and then ensure NVIDIA On-Demand mode is selected. If not, select it and then reboot.

The NVIDIA X Server Settings dialog showing the correct configuration as described above.

Then, to launch a program on your Nvidia dedicated graphics card, do this:

__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia command_name arguments

In short, the __NV_PRIME_RENDER_OFFLOAD environment variable must be set to 1 , and the __GLX_VENDOR_LIBRARY_NAME environment variable must be set to nvidia. If you forget the latter here or try using DRI_PRIME=1 instead of both of these (as I advise in my previous post about AMD dedicated graphics cards), you'll get an error like this:

libGL error: failed to create dri screen
libGL error: failed to load driver: nouveau

The solution is to set the above environment variables instead of DRI_PRIME=1. Thanks to this Ask Ubuntu post for the answer!

Given that I know I'll be wanting to use this regularly, I've created a script in my bin folder for it. It looks like this:

#!/usr/bin/env bash
export __NV_PRIME_RENDER_OFFLOAD=1;
export __GLX_VENDOR_LIBRARY_NAME=nvidia;

$@;

Save the above to somewhere in your PATH (e.g. your bin folder, if you have one). Don't forget to run chmod +x path/to/file to mark it executable.

Found this useful? Does this work on other systems and setups too? Comment below!

Adventurous Accounts: The Pepperminty Wiki Android App

I've found recently that there are a growing number of things I haven't blogged about on here yet because I either haven't finished them, or I have run into difficulties. I'd like to start blogging about these projects here and there, so in this post I'm going to talk about the Pepperminty Wiki Android App.

For those not in the know, Pepperminty Wiki is a wiki engine I've built myself from scratch that is all packed into a single file using a custom build system. It's pretty stable, and I'm committed to supporting it in the long term - as I have more than 1 rather important wikis hosted using the software myself.

Anyway, as a companion to this I have also implemented a simple Android app companion that reads information from the Pepperminty Wiki Rest API. It's read-only at the moment, but long-term I'd like to build an app that can write back changes as well.

3 screenshots of the Pepperminty Wiki Android App

In it's current form, it works quite well actually. It's functional (although there are a few bugs lurking around). You can even download it from the Google Play Store: https://play.google.com/store/apps/details?id=com.sbrl.peppermint

I haven't really worked on it for a while though, and I'd like to talk about why that is.

Trying to implement additional features is where things get complicated though. The app is written in Kotlin, which I found much easier to use than Java (example: spawning a new thread is 3 lines instead of 30!). For some crazy reason when I was reading the guidance on building such an app, one of the guidelines was to keep the class structure as flat as possible.

This was a bad idea.

For so many reasons as well. Without using lots of files and multiple classes, a codebase gets difficult to follow and build upon. Taking this advice has severely impacted my ability to continue to add new features to the project, so a large refactor is needed to bring this under control. Add to this the complexity and ambiguity of the Android APIs themselves (how about a OnListFragmentInteractionListener, or an android.support.design.widget.BottomNavigationView? the API to display a list of things is also particularly complicated), and you can see how it can spiral out of control.

In addition, I've discovered that I have some serious problems updating it for Android 10. When you build an Android app, you have to regularly update the target platform version (I forget the exact name) - otherwise you get nagging emails from Google Play about this (I've had several already). I'm pretty sure you can't use newer APIs either if you don't keep the platform version updated.

It's been a while since I've tried to update it so I can't remember the exact error messages, but I do remember doing some pretty extensive searching around on the Internet and having no luck.

To this end, the app works in its current form quite well, but at some point (eventually) I'm going to make another attempt at writing a replacement for it. To do this though, I'd like to find some kind of framework that eases the process just a little bit - building an Android app at the moment is unhelpfully complicated and a larger time investment than I can afford at the moment with the regular maintenance required - since regular updates have to be done to keep the app updated, which don't seem to serve much purpose at all to me other than being a bother and creating extra work - since often when you update dependencies and the build system behind an Android app everything breaks, which you then have to fix.

This is different from the maintenance for Pepperminty Wiki itself, which is still regular - but constitutes fixing bugs as they crop up and occasionally implementing new features - and it has settled into a delightfully sedate pace that allows me to follow my inspiration. I find this to be much more enjoyable.

If anyone has any suggestions of alternative approaches I could try, I'm definitely interested. Comment below!

Lua in Review 2

The Lua Logo Back in 2015, I reviewed the programming language Lua. A few months ago I rediscovered the maze generation implementation I ported as part of that post, and since then I've been writing quite a bit of Lua - so I thought I'd return to the thoughts in that original post and write another language review now that I've had some more experience with the language.

For those not in the know, Lua is a lightweight scripting language. You can find out more here: https://www.lua.org/

In the last post, I mentioned that Lua is very lightweight. I still feel this is true today - and it has significant advantages in that the language is relatively simple to understand and get started in - and feels very predictable in how it functions.

It is often said that Lua is designed to be embedded in other programs (such as to provide a modding interface to a game, for example) - and this certainly seems to hold true. Lua definitely seems to be well-suited for this kind of use-case.

The lightweightness comes at a cost though. The first of these is the standard library. Compared to other languages such as C♯ and even Javascript, the standard library sucks. At least half of the time you find yourself reimplementing some algorithm that should have been packaged with the language itself:

Testing if a string starts with a given substring
Rounding a number to the nearest integer
Making a shallow copy of a table

Do you want to do any of these? Too bad, you'll have to implement them yourself in Lua. While these really aren't a big deal, my point here is that with functions like these it can be all too easy to make a mistake when implementing them, and then your code has a bug in it. If you find and fix an obscure edge case for example, that fix will only apply to your code and not the hundreds of other ad-hoc implementations other developers have had to cook up to get things done, leading to duplicated and wasted effort.

A related issue I'm increasingly finding is that of the module system and the lack of reusable packages. In Lua, if you want to import code from another file as a self-contained module, you use the require function, like this:

local foo = require("foo")

The above will import code from a file named foo.lua. However, this module import here is done relative to the entrypoint of your program, and not the file that's requesting the import, leading to a number of issues:

If you want to move a self-contained subsection of a codebase around, suddenly you have to rewrite all the imports of not only the rest of the codebase (as normal), but also of all the files in the subdirectory you've just moved
You can't have a self-contained 'package' of code that, say, you have in a git submodule - because the code in the submodule can't predict the path to the original entrypoint of your program relative to itself

While LuaRocks attempts to alleviate this issue to some extent (and I admit I haven't yet investigated it in any great detail), as far as I can tell it installs packages globally, which doesn't help if you're writing some Lua that is going to be embedded inside another program, as the global package may or may not be available. Even if it is available, it's debatable as to whether you'd be allowed to import it anyway, since many embedded environments have restrictions in place here for security purposes.

Despite these problems, I've found Lua to be quite a nice language to use (if a little on the verbose side, due to syntactical structure and the lack of a switch statement). Although it's not great at getting out of your way and letting you get on with your day (Javascript is better at this I find), it does have some particularly nice features - such as returning multiple values from a single function (which mostly makes up for the lack of exceptions), and some cute table definition syntax.

It's not the kind of language you want to use for your next big project, but it's certainly worth experimenting with to broaden your horizons and learn a new language that forces you to program in a significantly different style than you would perhaps use normally.

New website for Pepperminty Wiki

By now, Pepperminty Wiki is quite probably my longest running project - and I'm absolutely committed to continuing to support and improve it over time (I use it to host quite a lot of very important information myself).

As part of this, one of the things I'm always looking to improve is the installation process and the first impression users get when they first visit Pepperminty Wiki. Currently, this has a GitHub repository. This is great (as it shows people that we're open-source), but it isn't particularly user-friendly for those who are less technically inclined.

To this end, I've built a shiny new website to introduce people to Pepperminty Wiki and the features it has to offer. I've been thinking about this for a while, and I realised that actually despite the fact that I haven't yet incremented the version number to v1.0 yet (as of the time of posting the latest stable release is v0.22), Pepperminty Wiki is actually pretty mature, easy to deploy and use, and stable.

The new website for Pepperminty Wiki (link below)

(Above: The new Pepperminty Wiki website. Check it out here!)

The stability is a new one for me, as it isn't something I've traditionally put much of a focus on - instead focusing on educational purposes. Development of Pepperminty Wiki has sort of fallen into a pattern of 2-3 releases per year - each of which is preceded by one or more beta releases. I always leave at least 1 week between releasing a beta and the subsequent stable release to give myself and beta testers (of which Pepperminty Wiki has some! If you're reading this, I really appreciate it) time to spot any last-minute issues.

Anyway, the website can be found here: https://peppermint.mooncarrot.space/

Share it with your friends! :D

The initial plan was to buy a domain name like pepperminty.wiki for it, but after looking into the prices (~£36.29 per year) I found it was waaay too expensive for a project that I'm not earning a penny from working on (of course, if you're feeling that way inclined I have a Liberapay setup if you'd like to contribute towards server costs, but it's certainly not required).

Instead, I used a subdomain of one of my existing domains, mooncarrot.space (I use this one mostly for personal web app instances on my new infrastructure I'm blogging about in my cluster series), which is a bit shorter and easier to spell/say than starbeamrainbowlabs.com if you're not used to it.

After a few false starts, I settled on using Eleventy as my static site generator of choice. I'm not making use of all it's features (not even close), but I've found it fairly easy to use and understand how it ticks - and also flexible enough such that it will work with me, rather attempting to force me into a particular way of working.

Honourable mentions here include Hugo (great project, but if I recall correctly I found it confusing and complicated to setup and use), documentation (an epic documentation generator for JS projects, but not suitable for this type of website - check out some of the docs I auto-generate via my Laminar CI setup: powahroot, applause-cli, terrain50).

The Pepperminty Wiki website light theme

(Above: The light theme for the website - which one you see depends on your system preference - I use prefers-colour-scheme here. Personally I prefer the dark theme myself, as it's easier on my eyes)

The experience of implementing the website was an interesting one. Never having built a website to 'sell' something before (even if this is for a thing that's free), I found the most challenging part of the experience determining what text to use to appropriately describe the features of Pepperminty Wiki.

From the beginning I sort of had a vision for how I wanted the website to look. I wanted an introductory bit at the top (with a screenshot at a cool angle!), followed by a bit that explained the features, the some screenshots with short descriptions, followed finally by a download section. I also wanted it to be completely mobile-friendly.

A screenshot of the website as viewed by a mobile device

(Above: A screenshot of the website as viewed by a mobile device. The Firefox Developer Tools were useful for simulating this)

For the most part, this panned out quite well. Keeping the design relatively simple enabled me to support mobile devices as I went along, with minimal tweaks needed at the end of the process (mobile support really needs to be part of the initial design process).

The cool screenshot at the top and the fancy orange buttons you'll see in various places across the site were especially fun to put together - the iterative process of adding CSS directives to bring the idea I had in my head as to how I wanted it to look to life was very satisfying. I think I'll use the same basic principle I used for the fancy buttons again elsewhere (try hovering over them and clicking them to see the animations).

The bottom of the website, showing the fancy orange buttons

(Above: The bottom of the website, showing the fancy orange buttons)

I did contemplate the idea of using a CSS framework for the website, but not having seriously used one before for a personal project combined with the advent of the CSS grid ended up in the decision to abandon the use of a framework once again (I'll learn one eventually, I'm sure ).

So far my experience with frameworks is that they just get in the way when you want to do something that wasn't considered when the framework was built, but I suppose that given their widespread use elsewhere that I really should make an effort to learn at least one framework to get that experience (any suggestions in the comments are welcome).

All in all the experience of building the Pepperminty Wiki website was an enjoyable one. It took a number of hours over a number of days to put together (putting the false starts aside), but I feel as though it was definitely worth it.

Find the website here: https://peppermint.mooncarrot.space/

If I end up moving it at a later date, I'll ensure there's a redirect in place so the above link won't break.

Found this useful? Got a comment about or a suggestion to improve the website? Comment below! I'd love to hear from you.

Stitching videos from frames with ffmpeg (and audio/video editing tricks)

ffmpeg is awesome. In case you haven't heard, ffmpeg is a command-line tool for editing and converting audio and video files. It's taken me a while to warm up to it (the command-line syntax is pretty complicated), but since I've found myself returning to it again and again I thought I'd blog about it so you can use it too (and also for my own reference :P).

I have 2 common tasks I perform with ffmpeg:

Converting audio/video files to a more efficient format
Stitching PNG images into a video file

I've found that most tasks will fall into 1 of the 2 boxes. Sometimes I want to do something more complicated, but I'll usually just look that up and combine the flags I find there with the ones I usually use for one of the 2 above tasks.

horizontal rule made up of an orange rotating maze cube

Stitching PNG images into a video

I don't often render an animation, but when I do I always render to individual images - that way if it crashes half way through I can more easily restart where I left off (and also divide the workload more easily across multiple machines if I need it done quickly).

Individual image files are all very well, but they have 2 problems:

They take up lots of space
You can't easily watch them like a video

Solving #1 is relatively easy with optipng (sudo apt install optipng), for example:

find . -iname "*.png" -print0 | xargs -0 -P4 -n1 optipng -preserve

I cooked that one up a while ago and I've had it saved in my favourites for ages. It finds all png images in the current directory (recursively) and optimises them with optipng.

To resolve #2, we can use ffmpeg as I mentioned earlier in this post. Here's one I use:

ffmpeg -r 24 -i path/to/frame_%04d.png -c:v libvpx-vp9 -b:v 2000k -crf 31 path/to/output.webm

A few points on this one:

-r 24: This sets the frames per second.
-i path/to/frame_%04d.png: The path to the png images to stitch. %04d refers to 4 successive digits in a row that count up from 1 - e.g. 0001, 0002, etc. %d would mean 1, 2, ...., and %02d for example would mean 01, 02, 03....
-c:v libvpx-vp9: The codec. We're exporting to webm, and vp9 is the latest available codec for webm video files.
-b:v 2000k: The bitrate. Higher values mean more bits will be used per second of video to encode detail. Raise to increase quality.
-crf 31: The encoding quality. Should be anumber betwee 0 and 63 - higher means lower quality and a lower filesize.

Read more about encoding VP9 webm videos here: Encode/VP9- ffmpeg

Converting audio / video files

I've talked about downmuxing audio before, so in this post I'm going to be focusing on video files instead. The above command for encoding to webm can be used with minimal adjustment to transcode videos from 1 format to another:

ffmpeg -hide_banner -i "path/to/input.avi" -c:v libvpx-vp9 -c:a libopus -crf 31 -b:v 0 "path/to/output/webm";

In this one, we handle the audio as well as the video all in 1 go, encoding the audio to use the opus codec, and the video as before. This is particularly useful if you've got a bunch of old video files generated by an old camera. I've found that often old cameras like to save videos as raw uncompressed AVI files, and that I can reclaim a sigificant amount of disk space if I transcode them to something more efficient.

Naturally, I cooked up a one-liner that finds all relevant video files recursively in a given directory and transcodes them to 1 standard efficient format:

find . -iname "*.AVI" -print0 | nice -n20 xargs --verbose -0 -n1 -I{} sh -c 'old="{}"; new="${old%.*}.webm"; ffmpeg -hide_banner -i "${old}" -c:v libvpx-vp9 -c:a libopus -crf 30 -b:v 0 "${new}" && rm "${old}"';

Let's break this down. First, let's look at the commands in play:

find: Locates the target files to transcode
nice: Runs the following command in the background
xargs: Runs the following command over and over again for every line of input
sh: A shell, to execute multiple command in sequence
ffmpeg: Does the transcoding
rm: Deletes the old file if transcoding completes successfully

The old="{}"; new="${old%.*}.webm"; bit is a bit of gymnastics to pull in the path to the target file (the -I {} tells xargs where to substitute it in) and determine the new filepath by replacing the file extension with .webm.

I've found that this does the job quite nicely, and I can set it running and go off to do something else while it happily sits there and transcodes all my old videos, reclaiming lots of disk space in the process.

Found this useful? Got a cool command for processing audio/video/image files in bulk? Comment below!

Monitoring latency / ping with Collectd and Bash

I use Collectd as the monitoring system for the devices I manage. As part of this, I use the Ping plugin to monitor latency to a number of different hosts, such as GitHub, the raspberry pi apt repo, 1.0.0.1, and this website.

I've noticed for a while that the ping plugin doesn't always work: Even when I check to ensure that a host is pingable before I add it to the monitoring list, Collectd doesn't always manage to ping it - showing it as NaN instead.

Yesterday I finally decided that enough was enough, and that I was going to do something about it. I've blogged about using the exec plugin with a bash script before in a previous post, in which I monitor HTTP response times with curl.

Using that script as a base, I adapted it to instead parse the output of the ping command and translate it into something that Collectd understands. If you haven't already, you'll want to go and read that post before continuing with this one.

The first order of business is to sort out the identifier we're going to use for the data in question. Collectd assigns an identifier to all the the data coming in, and it uses this to determine where it is stored on disk - and subsequently which graph it will appear in on screen in the front-end.

Such identifiers follow this pattern:

host/plugin-instance/type-instance

This can be broken down into the following parts:

Part	Meaning
`host`	The hostname of the machine from which the data was collected
`plugin`	The name of the plugin that collected the data (e.g. `memory`, `disk`, `thermal`, etc)
`instance`	The instance name of the plugin, if the plugin is enabled multiple times
`type`	The type of reading that was collected
`instance`	If multiple readings for a given type are collected, this instance differentiates between them.

Of note specifically here are the type, which must be one of a number of pre-defined values, which can be found in a text file located at /usr/share/collectd/types.db. In my case, my types.db file contains the following definitions for ping:

ping: The average latency
ping_droprate: The percentage of packets that were dropped
ping_stddev: The standard deviation of the latency (lower = better; a high value here indicates potential network instability and you may encounter issues in voice / video calls for example)

To this end, I've decided on the following identifier strings:

HOSTNAME_HERE/ping-exec/ping-TARGET_NAME
HOSTNAME_HERE/ping-exec/ping_droprate-TARGET_NAME
HOSTNAME_HERE/ping-exec/ping_stddev-TARGET_NAME

I'm using exec for the first instance here to cause it to store my ping results separately from the internal ping plugin. The 2nd instance is the name of the target that is being pinged, resulting in multiple lines on the same graph.

To parse the output of the ping command, I've found it easiest if I push the output of the ping command to disk first, and then read it back afterwards. To do that, a temporary directory is needed:

temp_dir="$(mktemp --tmpdir="/dev/shm" -d "collectd-exec-ping-XXXXXXX")";

on_exit() {
    rm -rf "${temp_dir}";
}
trap on_exit EXIT;

This creates a new temporary directory in /dev/shm (shared memory in RAM), and automatically deletes it when the script terminates by scheduling an exit trap.

Then, we can create a temporary file inside the new temporary directory and call the ping command:

tmpfile="$(mktemp --tmpdir="${temp_dir}" "ping-target-XXXXXXX")";
ping -O -c "${ping_count}" "${target}" >"${tmpfile}";

A number of variables in that second command. Let me break that down:

${ping_count}: The number of pings to send (e.g. 3)
${target}: The target to ping (e.g. starbeamrainbowlabs.com)
${tmpfile}: The temporary file to which to write the output

For reference, the output of the ping command looks something like this:

PING starbeamrainbowlabs.com (5.196.73.75) 56(84) bytes of data.
64 bytes from starbeamrainbowlabs.com (5.196.73.75): icmp_seq=1 ttl=55 time=28.6 ms
64 bytes from starbeamrainbowlabs.com (5.196.73.75): icmp_seq=2 ttl=55 time=15.1 ms
64 bytes from starbeamrainbowlabs.com (5.196.73.75): icmp_seq=3 ttl=55 time=18.9 ms

--- starbeamrainbowlabs.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 15.145/20.886/28.574/5.652 ms

We're only interested in the last 2 lines of output. Since this is a long-running script that is going to be executing every 5 minutes, to minimise load (it will be running on a rather overburdened Raspberry Pi 3B+ :P), we will be minimising the number of subprocesses we spawn. To read in the last 2 lines of the file into an array, we can do this:

mapfile -s "$((ping_count+3))" -t file_data <"${tmpfile}"

The -s here tells mapfile (a bash built-in) to skip a given number of lines before reading from the file. Since we know the number of ping requests we sent and that there are 3 additional lines that don't contain the ping request output before the last 2 lines that we're interested in, we can calculate the number of lines we need to skip here.

Next, we can now parse the last 2 lines of the file. The read command (which is also a bash built-in, so it doesn't spawn a subprocess) is great for this purpose. Let's take it 1 line at a time:

read -r _ _ _ _ _ loss _ < <(echo "${file_data[0]}")
loss="${loss/\%}";

Here the read command splits the input on whitespace into multiple different variables. We are only interested in the packet loss here. While the other values might be interesting, Collectd (at least by default) doesn't have a definition in types.db for them and I don't see any huge benefits from adding them anyway, so I use an underscore _ to indicate, by common convention, that I'm not interested in those fields.

We then strip the percent sign % from the end of the packet loss value here too.

Next, let's extract the statistics from the very last line:

read -r _ _ _ _ _ _ min avg max stdev _ < <(echo "${file_data[1]//\// }");

Here we replace all forward slashes in the input with a space to allow read to split it properly. Then, we extract the 4 interesting values (although we can't actually log min and max).

With the values extracted, we can output the statistics we've collected in a format that Collectd understands:

echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping_droprate-${target}\" interval=${COLLECTD_INTERVAL} N:${loss}";
echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping-${target}\" interval=${COLLECTD_INTERVAL} N:${avg}";
echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping_stddev-${target}\" interval=${COLLECTD_INTERVAL} N:${stdev}";

Finally, we mustn't forget to delete the temporary file:

rm "${tmpfile}";

Those are the major changes I made from the earlier HTTP response time monitor. The full script can be found at the bottom of this post. The settings that control the operation of the script are the top, which allow you to change the list of hosts to ping, and the number of ping requests to make.

Save it to something like /etc/collectd/collectd-exec-ping.sh (don't forget to sudo chmod +x /etc/collectd/collectd-exec-ping.sh it), and then append this to your /etc/collectd/collectd.conf:

<Plugin exec>
        Exec    "nobody:nogroup"        "/etc/collectd/collectd-exec-ping.sh"
</Plugin>

Final script

#!/usr/bin/env bash
set -o pipefail;

# Variables:
#   COLLECTD_INTERVAL   Interval at which to collect data
#   COLLECTD_HOSTNAME   The hostname of the local machine

declare targets=(
    "starbeamrainbowlabs.com"
    "github.com"
    "reddit.com"
    "raspbian.raspberrypi.org"
    "1.0.0.1"
)
ping_count="3";

###############################################################################

# Pure-bash alternative to sleep.
# Source: https://blog.dhampir.no/content/sleeping-without-a-subprocess-in-bash-and-how-to-sleep-forever
snore() {
    local IFS;
    [[ -n "${_snore_fd:-}" ]] || exec {_snore_fd}<> <(:);
    read ${1:+-t "$1"} -u $_snore_fd || :;
}

# Source: https://github.com/dylanaraps/pure-bash-bible#split-a-string-on-a-delimiter
split() {
    # Usage: split "string" "delimiter"
    IFS=$'\n' read -d "" -ra arr <<< "${1//$2/$'\n'}"
    printf '%s\n' "${arr[@]}"
}

# Source: https://github.com/dylanaraps/pure-bash-bible#use-regex-on-a-string
regex() {
    # Usage: regex "string" "regex"
    [[ $1 =~ $2 ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

# Source: https://github.com/dylanaraps/pure-bash-bible#get-the-number-of-lines-in-a-file
# Altered to operate on the standard input.
count_lines() {
    # Usage: count_lines <"file"
    mapfile -tn 0 lines
    printf '%s\n' "${#lines[@]}"
}

# Source https://github.com/dylanaraps/pure-bash-bible#get-the-last-n-lines-of-a-file
tail() {
    # Usage: tail "n" "file"
    mapfile -tn 0 line < "$2"
    printf '%s\n' "${line[@]: -$1}"
}

###############################################################################

temp_dir="$(mktemp --tmpdir="/dev/shm" -d "collectd-exec-ping-XXXXXXX")";

on_exit() {
    rm -rf "${temp_dir}";
}
trap on_exit EXIT;

# $1 - target name
# $2 - url
check_target() {
    local target="${1}";

    tmpfile="$(mktemp --tmpdir="${temp_dir}" "ping-target-XXXXXXX")";

    ping -O -c "${ping_count}" "${target}" >"${tmpfile}";

    # readarray -t result < <(curl -sS --user-agent "${user_agent}" -o /dev/null --max-time 5 -w "%{http_code}\n%{time_total}\n" "${url}"; echo "${PIPESTATUS[*]}");
    mapfile -s "$((ping_count+3))" -t file_data <"${tmpfile}"

    read -r _ _ _ _ _ loss _ < <(echo "${file_data[0]}")
    loss="${loss/\%}";
    read -r _ _ _ _ _ _ min avg max stdev _ < <(echo "${file_data[1]//\// }");


    echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping_droprate-${target}\" interval=${COLLECTD_INTERVAL} N:${loss}";
    echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping-${target}\" interval=${COLLECTD_INTERVAL} N:${avg}";
    echo "PUTVAL \"${COLLECTD_HOSTNAME}/ping-exec/ping_stddev-${target}\" interval=${COLLECTD_INTERVAL} N:${stdev}";

    rm "${tmpfile}";
}

while :; do
    for target in "${targets[@]}"; do
        # NOTE: We don't use concurrency here because that spawns additional subprocesses, which we want to try & avoid. Even though it looks slower, it's actually more efficient (and we don't potentially skew the results by measuring multiple things at once)
        check_target "${target}"
    done

    snore "${COLLECTD_INTERVAL}";
done

PhD Update 5: Hyper optimisation and frustration

Hello there again! It's been longer than I anticipated since the last proper post in this series. Before I continue, here's a list of all the (proper) posts in this series so far:

I've haven't managed to get as much done since last time as I was hoping (partly due to the fact that I'm currently having to work from home, which is more challenging than I expected), but I have finished my implementation of the Temporal CNN, and am now working on hyperparameter optimisation. I've also fixed a number of issues in my rainfall radar data downloader and processing programs - which I'll talk about in more detail below

HAIL-CAESAR and the iterative improvements

Someone at the University recently approached me (if you are reading this and have a blog, comment below and I'll update) to ask if they could use my rainfall radar data downloader program to download some rainfall radar data for their project. Naturally, I helped them out. This turned out to be a great thing for me as well, as with their help I managed to uncover a number of very nasty issues with the data pipeline I had been building up to that point:

The hydro index file that HAIL-CAESAR uses was completely scrambled
The date on the data downloaded was a month out
The data downloaded was (and still is) rotated by 90° on disk
The data was out by a factor of 32

While fixing each of these bugs was a (relatively) simple process, I can't help but wonder how they managed to escape my notice until (for all but 1 of them) someone else told me about them.

The other issue was that because of the amount of data I'm working with, it took forever to re-run the program to test to see if I had managed to fix the problem - and if I had, I'd encounter another problem. This long iteration process makes implementing a new feature or fixing a bug a very time-consuming process.

Despite fixing all these issues, I'm still experiencing issues with my latest refactoring of the rainfall radar data downloader (namely a hang in the event system when reading tar files). My current thinking is that I'm going to completely reimplement it (using snippets from the old programif I need to use it again in the future, as it is currently neither particularly efficient (it's single-threaded) nor easy to bugfix (it's pretty complicated).

I've got an idea for a parallel system that processes each tar file separately first, and then only after all the tar files have been converted separately are they strung together into the actual files the existing implementation spits out currently.

Temporal CNN delight

Last time, I had just started my implementation of a Temporal CNN. This is now pretty much complete, and I've also been able to run it and get some results! Check out this graph:

A graph showing the root mean squared error while training my Temporal CNN implementation - more details below.

This graph shows the root mean squared error when training on 1000 time steps of data (about 3 days 11 hours or so). Epochs are along the X axis, and the root mean squared error is on the Y axis.

A few things to note here. Firstly, the implementation I've come up with essentially does video-to-image translation. The original model in the paper I've linked to is demonstrates a classification task (specifically land use over time) - so what I'm doing is a little different.

Secondly, I've omitted the root mean squared error for the first epoch. It was so high that it made the rest of the graph impossible to see - hence the omission.

I'm pretty pleased with this result so far - as I have a nice downwards curve indicating that the model is (probably) learning something useful.

I am still rather nervous about the output though, as due to the way I've implemented the network I haven't actually been able to 'see' the output of the network at all as an image yet. Doing so would take a while to implement, so I haven't done so for now (although I really should do this soon). It would be really cool to see a short video (maybe at ~10fps) of the network output as the epochs move forwards to visualise the network training process.

Hyperparameter frustrations

Lastly, at the suggestion of my supervisor I've been working on hyperparameter optimisation. In short, this consists of training the model with random combinations of hyperparameters and seeing which ones work best.

A hyperparameter is a tunable parameter that controls an aspect of a model. In my case, I have 2 key hyperparameters I need to tune:

Filter count: CNN layers in Tensorflow.js have a filter count associated with them. I theorise that increasing this will increase the model's ability to learn spatial information.
Temporal depth: The number of time steps to push through the model at once. Increasing this will allow the model to make predictions based on events that occur further in the past.

My eventual aim here is to create a heatmap that has the above hyperparameters along the X and Y axes, and the colour showing the accuracy of the model that was trained - similar to the one I created previously.

To do this, I implemented a program that tries random combinations of hyperparameters - but never the same combination twice. It starts the model in a subprocess and passes the chosen filter count and temporal depth values in as CLI arguments, which the child process picks up, parses, and then trains a model based on. This CLI is the same one as the one I developed that generated the above graph in the previous section of this blog post.

This approach has the advantage that it isolates the model in a subprocess, so when the subprocess exits and a new one spawns for the next combination of hyperparameters, the environment is completely clean and there isn't anything that might interfere with it.

Unfortunately though, while I set off a run of this implementation before I took a 'holiday' - and even checked on it to ensure it was running as expected (multiple times) - it still managed to crash when I wasn't looking.

After some debugging, I discovered that problem was because the model ran out of memory while training. This was something I had expected - and used the --unhandled-rejections=strict option for Node.js, which tells Node.js to crash and exit when an UnhandledPromiseRejection is thrown - like this one:

2020-08-04 15:31:26.174395: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at conv_grad_ops_3d.cc:1783 : Resource exhausted: OOM when allocating tensor with shape[8,2,104,348,210] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
(node:62355) UnhandledPromiseRejectionWarning: Error: Invalid TF_Status: 8
Message: OOM when allocating tensor with shape[8,2,104,348,210] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
    at Object.<anonymous> (<anonymous>)
.....

Unfortunately though, I used this flag on the parent process (that drives the hyperparameter optimisation) and not the child process - leading to a situation whereby the child process crashed due to the aforementioned error and just hangs around doing nothing. Even more frustatingly, the solution si as simple as doing a quick export NODE_OPTIONS="--unhandled-rejections=strict" before running the hyperparameter optimisation program to ensure that the flag propagates to the child processes......

Very frustrating indeed - especially considering I calculate that it will take multiple weeks to gather enough data to create a meaningful heatmap.

Conclusion

Reading back over this post, I have got more done than I expected. I've started and finished my Temporal CNN implementation, and fixed lots of bugs in them existing code.

However, the long iteration times to test code I've written (despite using a small slice of the dataset to test with), the large datasets I'm working with, having work on a system remotely via SSH by pushing and pulling code with git (many times), working from home all the time, and the continued bugs I've been facing and will likely continue to face have caused and are causing significant unexpected slowdowns moving forwards.

At least the VPN is no longer dropping out every 5 minutes!

Sources and Further Reading

Paper: Temporal Convolutional Neural Network for the classification of satellite image time series

Stardust Blog

Tag Cloud

Running multiple local versions of CUDA on Ubuntu without sudo privileges

PhD Update 6: The road ahead

Temporal CNN

Social Media Analysis

Conclusion

Proteus VIII Laptop from PC Specialist in Review

Run a Program on your dedicated Nvidia graphics card on Linux

Adventurous Accounts: The Pepperminty Wiki Android App

Lua in Review 2

New website for Pepperminty Wiki

Stitching videos from frames with ffmpeg (and audio/video editing tricks)

Stitching PNG images into a video

Converting audio / video files

Monitoring latency / ping with Collectd and Bash

Final script

PhD Update 5: Hyper optimisation and frustration

HAIL-CAESAR and the iterative improvements

Temporal CNN delight

Hyperparameter frustrations

Conclusion

Sources and Further Reading

Stardust
Blog