
PhD Update 17: Light at the end of the tunnel

Wow..... it's been what, 5 months since I last wrote one of these? Oops. I'll do my best to write them at the proper frequency in the future! Things have been busy. Before I talk about what's been happening, here's the ever-lengthening list of posts in this series:

As I sit here at the very bitter end of the very last day of a long but fulfilling semester, I'm feeling quite reflective about the past year and how things have gone on my PhD. One of these posts is definitely long overdue.

Timescales

Naturally the first question here is about timescales. "What happened?" I hear you ask. "I thought you said you were aiming for intent to submit September 2023 for December 2023 finish?"

Well, about that.......

As it turns out, spending half of one's week working as an Experimental Officer throws off one's estimation of how much work one can get done. To this end, it's looking more likely that I will be submitting my thesis in early-to-mid semester 2 this year. In other words, around about March 2024 - give or take a month or two.

After submission, the next step will be my viva. Assuming I pass, that will likely be followed by corrections based on the feedback from the viva.

What is a viva though? From what I understand, it is an oral exam in which you, your primary supervisor, and 2 examiners go through your thesis with a fine-tooth comb and ask you lots of questions. I've heard it can take several hours to complete. While the standard is for 1 examiner to be chosen internally from your department / institute and one to be chosen externally (chosen by your primary supervisor), in my case both will be chosen from external sources, as I am now a (part-time) staff member in the Department of Computer Science at the University of Hull (my home institution).

While it's still a little ways out yet, I can't deny that the thought of my viva is making me rather nervous - having everything I've done over the past 4.5 years scrutinised by completely unknown people. In a sense, it feels like once it is time for my viva, there will be nothing more I can do. I will either know the answers to their questions.... or I will not.

Writing

As you might have guessed by now, writing has been the name - and, indeed, aim - of the game since the last post in this series. Everything is coming together rather nicely. It's looking like I'm going to end up with the following structure:

  1. Introduction (not written*)
  2. Background (almost there! currently working on this)
  3. Rainfall radar for 2d flood forecasting (needs expanding)
  4. Social media sentiment analysis (done!)
  5. Conclusion
  6. Acknowledgements, Appendices, etc
  7. Dictionary of terms; List of acronyms (grows organically as I write - I need to go through and make sure I \gls all the terms I've added later)
  8. Bibliography (currently 27 pages and counting O.o)
  * Technically I have written it, it's just outdated and very bad and needs throwing out the window of the tallest building I can find. Rewrite is pending - see below.

A sneak preview of my thesis as a PDF.

(Above: A sneak preview of my thesis PDF. I'm writing in LaTeX - check out my templates with the University of Hull reference style here! Evidently the pictured section needs some work.....)

I've finished the chapter on the social media work, barring some minor adjustments I need to apply to ensure consistency. My current focus is the background chapter. This is most of the way there, but I need more detail in several sections, so I'm working my way through them one at a time. This is resulting in a bunch more reading (especially for vision-based water detection via satellite data), so it's taking some time.

Once I've wrapped up the background section, it will be time to turn my attention to content chapter #2: Rainfall radar for 2d flood forecasting. Currently, it sits about halfway between a conference paper (check it out! You can read it now, though a DOI is pending and should be available after the conference) and a thesis chapter - so I need to push (pull? drag?) it the rest of the way to the finish line. This will primarily entail 2 things:

  • Filling out the chapter-specific related works, which are currently rather brief given space and time limitations in a conference paper
  • Elaborating on things like the data preprocessing, experiments, discussion, etc.

This will also take some time, which together with the background section explains the uncertainty I still have in my finish date. Once these are both complete, I will be submitting my intent to submit! This starts a 3 month timer, by the end of which I must have submitted my thesis. During this period, I will be working on the introduction and conclusion chapters, which I do not expect to take nearly as long as any of the other chapters.

Once I am done writing and have submitted my thesis, I will do everything I can to ensure it is available under an open source licence for everyone to read. I believe strongly in the power of open source (and open science) to benefit everyone, and want to share everything I've learned with all of you reading this.

At 102 pages of single-spaced A4 so far and counting (not including the aforementioned bibliography), it's a big time investment to read. To this end, I have various publications I've written and posted about here previously that cover most of the work (namely the rainfall radar conference paper and the social media journal article), and I also want to condense the content of my thesis down into a 'mini-thesis' of about 3-6 pages and post that alongside the main thesis here on my website. I hope this will provide the broad strokes and act as a navigation aid for the main document.

Predicting Persuasive Posts

All this writing is going to drive me crazy if I don't do something practical alongside it. Unfortunately I have long since run out of excuses to run more experiments on my PhD work, so a good friend of mine who is also doing a PhD (they've published this paper) came along at the perfect time the other day asking for some help with a challenge competition submission they want to do. Of course, I had to agree to help out in a support role, as the project sounds really interesting1.

The official title of the challenge is thus: Multilingual Detection of Persuasion Techniques in Memes

The challenge is part of SemEval-2024 and it's basically about classifying memes from some social media network (it's unclear which one they are from) as to which persuasion tactic they are employing to manipulate the reader's opinions / beliefs.

The full challenge page can be found here: https://propaganda.math.unipd.it/semeval2024task4/index.html

We had a meeting earlier this week to discuss, and one of the key problems we identified is that submissions will be scored using posts in multiple unseen languages. Given this, it strikes me as important to have multiple languages embedded into the same space for optimal results.

This is not what GloVe does (it embeds each language into a different 'space', so a model trained on data in one language won't necessarily work well with another) - as I discovered in my demo for the Hull Science Festival (I definitely want to write about this in the final post in that series). So, as my role in the team, I'm going to push a number of different word embeddings through the system I developed for the aforementioned science demo to identify which one is best for embedding multilingual text. Expect some additional entries to be added to the demo and an associated blog post on my findings very soon!

Currently, I have the following word embedding systems on my list:

  • Word2vec
  • FastText
  • CLIP
  • BERT/mBERT
  • XLM/XLM-RoBERTa

If you know of any other good word embedding models / algorithms, please do leave a comment below.

It also occurs to me while writing this that I'll have to make sure the multilingual dataset I used for the online demo has the same or similar words translated to every language to rule out any difference in embeddings there.
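To give a rough idea of the kind of check I have in mind (a sketch, not a settled plan - the embed() helper below is hypothetical and just stands in for whichever model is under test), comparing a word and its translation via cosine similarity should reveal whether a model really puts languages into a shared space:

function cosineSimilarity(a, b) {
    // Compare two embedding vectors by the angle between them
    let dot = 0, magA = 0, magB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// embed() is hypothetical - it wraps whichever multilingual model is being evaluated.
// If the model shares a single space across languages, a translation pair should
// score noticeably higher than an unrelated pair:
// cosineSimilarity(embed("water"), embed("Wasser"));  // expect: high
// cosineSimilarity(embed("water"), embed("bicycle")); // expect: lower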

A nice challenge for the Christmas holidays! My experience of collaborating with other researchers is rather limited at the moment, so I'm looking forward to working in a team to achieve a goal much faster than would otherwise be possible.

Beyond the edge

Something that has been a constant nagging presence in my mind - and steadily growing - is the question of what happens next after my thesis. While the details have not been confirmed yet, once everything PhD-related is wrapped up I will most likely be increasing my hours such that I work Monday - Friday, rather than just Monday - Wednesday lunchtime as I have been doing so far.

This extra time will consist of 2 main activities. To the best of my current understanding, this will include some additional teaching responsibilities - I will probably be teaching a module that lies squarely within 1 of my strong points. It will also, crucially, include some dedicated time for research.

I believe I will be able to spend this research time on activities such as collaborating with other researchers, reading papers, designing and running experiments, and writing up results for publication. Essentially what I've been doing on my PhD, just minus the thesis writing!

Of course, the things I talk about here are not set in stone, and me talking about them here is not a declaration of such.

Either way, I do feel that the technical side is a strong point of mine that I am rather passionate about, so I very much want to continue dedicating a significant portion of my energy towards practical research tasks.

I'm not sure how much I am allowed to talk about the teaching I will be doing, but do expect some updates on that here on my blog too - however high-level and broad strokesy they happen to be. What kind of teaching-related things would you be interested in being updated about here? Please do leave a comment below.

Talking more specifically, I do have a number of research ideas - one of which I have alluded to above - that I want to explore after my PhD. Most of these are based on what I have learnt from doing my PhD and the logical next steps to analyse complex real-time data sources with a view to extracting and processing information to increase situational awareness in natural disaster scenarios. When I get around to this, I will be blogging about my progress in detail here on my blog.

It should probably be mentioned that I am still quite a long way off actually putting any of these ideas into practice (I would definitely not recommend trusting any predictions my current rainfall radar → binarised water depth model makes in the real world yet!), but if you or someone you know works in the field of managing natural disasters, I would be fascinated to know what you would find most useful related to this - please leave a comment below.

Conclusion

This post has ended up being a lot longer than I expected! I've talked about my current writing progress, a rather interesting side-project (more details in a future blog post!), and initial conceptual future plans - both researchy and otherwise.

While my thesis is drawing close to completion (relatively speaking, at least), I hope you will join me here beyond the end of this long journey. As one book closes, another opens. A new journey is only just beginning - one I can't wait to share with everyone here in future blog posts.

If you've got any thoughts, it would be cool if you could share them below.


  1. It goes without saying, but I won't let it impact my writing progress. I divide my day up into multiple slices - one of which is dedicated to focused PhD work - and I'll be helping out with this project from a different slice than the one reserved for my PhD writing.

NLDL 2024: My rainfall radar paper is out!

A cool night sky and northern lights banner I made in Inkscape. It features mountains and AI-shaped constellations, with my logo and the text "@ NLDL 2024".

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task

......that's the title of the conference paper I've written about the rainfall radar research I've been doing as part of my PhD, and now that the review process is complete my supervisor tells me I can share it!

This paper is the culmination of one half of my PhD (the other half is multimodal social media sentiment analysis, which resulted in a journal article). Essentially, the idea behind the whole project was to ask the question: "Can we make flood predictions all at once in 2D?"

The answer, as it turns out, is yes*.... but with a few caveats and a lot more work required before it's anywhere near ready to be coming to a smartphone near you.

It all sort of spiralled from there - and resulted in the development of a DeepLabV3+-based image semantic segmentation model that learns to approximate a physics-based water simulation.

The abstract of the paper is as follows:

Traditional predictive simulations and remote sensing techniques for forecasting floods are based on fixed and spatially restricted physics-based models. These models are computationally expensive and can take many hours to run, resulting in predictions made based on outdated data. They are also spatially fixed, and unable to scale to unknown areas.

By modelling the task as an image segmentation problem, an alternative approach using artificial intelligence to approximate the parameters of a physics-based model in 2D is demonstrated, enabling rapid predictions to be made in real-time.

I'll let the paper explain the work I've done in detail (I've tried my best to make it understandable by a wide audience). You can read it here:

https://openreview.net/forum?id=TpOsdB4gwR

(Direct link to PDF)

Long-time readers of my blog here will know that I haven't had an easy time of getting the model to work. If you'd like to read about the struggles of developing this and other models over the course of my PhD so far, I've been blogging about the whole process semi-regularly. We're currently up to part 16:

PhD Update 16: Realising the possibilities of the past

Speaking of which, it's high time I wrote another PhD update blog post, isn't it? A lot has been going on, and I'd really like to document it all here on my blog. I've also found that writing these posts is really useful for getting me to take a step back and look at the big picture of my research - something that has helped in more ways than one. I'll discuss this and my progress in the next part of my PhD update blog post series, which I tag with PhD to make them easy to find.

Until then, I'll see you in the next post!

Website update: Share2Fediverse, and you can do it too!

Heya! Got another short post for you here. You might notice that on all posts now there's a new share button (those buttons that take you to different places with a link to this site so you can share it elsewhere) that looks like this:

The 5-pointed rainbow fediverse logo

If you haven't seen it before, this is the logo for the Fediverse, a decentralised network of servers and software that all interoperate (find out more here: https://fedi.tips/).

Since creating my Mastodon account, I've wanted some way to allow everyone here to share my posts on the Fediverse if they feel that way inclined. Unlike other centralised social media platforms like Reddit etc though, the Fediverse doesn't have a 'central' server that you can link to.

Instead, you need a landing page to act as a middleman. There are a few options out there already (e.g. share2fedi), but I wanted something specific and static, so I built my own solution. It looks like this:

A screenshot of Share2Fediverse. The background is rainbow like the fediverse logo, with translucent pentagons scattered across it. The landing page window is centred, with a title and a share form.

(Above: A screenshot of Share2Fediverse.)

It's basically a bit of HTML + CSS for styling, a splash of Javascript to make the interface function and remember the instance + software you select for next time via localStorage.

Check it out at this demo link:

https://starbeamrainbowlabs.com/share2fediverse/#text=The%20fediverse%20is%20cool!%20%E2%9C%A8

Currently, it supports sharing to Mastodon, GNU Social, and Diaspora. As it turns out, finding the share url (e.g. for Mastodon on fediscience.org it's https://fediscience.org/share?text=some%20text%20here) is more difficult than it sounds, as I haven't found it to be well advertised. I'd love to add e.g. Pixelfed, Kbin, GoToSocial, Pleroma, and more.... but I need the share URL! If you know the share URL for any piece of Fediverse software, please do leave a comment below.
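The sharing itself really is just building a URL like the one above. A simplified sketch of the idea (the query parameter name varies between different pieces of fediverse software, which is exactly why I need the share URL for each one):

// Sketch: build a Mastodon-style share URL for a chosen instance domain
function buildShareUrl(instance, text) {
    return `https://${instance}/share?text=${encodeURIComponent(text)}`;
}

// e.g. buildShareUrl("fediscience.org", "The fediverse is cool! ✨")
// -> https://fediscience.org/share?text=The%20fediverse%20is%20cool!%20%E2%9C%A8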

If you're interested in the source code, you can find it here:

https://github.com/sbrl/Share2Fediverse/

...if you'd really like to help out, you could even open a pull request! The file you want to edit is src/lib/software_db.mjs - though if you leave a comment here or open an issue I'll pick it up and add any requests.

See you on the Fediverse! o/

I'm going to NLDL 2024!

A cool night sky and northern lights banner I made in Inkscape. It features mountains and AI-shaped constellations, with my logo and the text "@ NLDL 2024".

Heya! Been a moment 'cause I've been doing a lot of writing and revising of writing for my PhD recently (the promised last part of the scifest demo series is coming eventually, promise!), but I'm here to announce that, as you might have guessed by the cool new banner, I have today (yesterday? time is weird when you stay up this late) had a paper accepted for the Northern Lights Deep Learning Conference 2024, which is to be held on 9th - 11th January 2024 in Tromsø, Norway!

I have a lot of paperwork to do between now and then (and many ducks to get in a row), but I have every intention of attending the conference in person to present my rainfall radar research I've been rambling on about in my PhD update series.

I am unsure whether I'm allowed to share the paper at this stage - if anyone knows, please do get in touch. In the meantime, I'm pretty sure I can share the title without breaking any rules:

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task

I also have a Cool Poster, which I'll share here after the event too in the new (work-in-progress) Research section of the main homepage that I need to throw some CSS at.

I do hope that this cool new banner gets some use bringing you more posts about (and, hopefully, from!) NLDL 2024 :D

--Starbeamrainbowlabs

Building the science festival demo: How to monkeypatch an npm package

A pink background dotted with bananas, with the patch-package logo front and centre, and the npm logo small in the top-left. Small brown package boxes are present in the bottom 2 corners.

In a previous post, I talked about the nuts and bolts of the demo on a technical level and how it's all put together. I alluded to the fact that I had to monkeypatch Babylon.js to disable the gamepad support because it was horribly broken, and I wanted to dedicate an entire post to the subject.

Partly because it's a clever hack I used, and partly because if I ever need to do something similar again, I want a dedicated tutorial-style post on how I did it so I can repeat the process.

Monkeypatching an npm package after installation in a reliable way is an inherently fragile task: it is not something you want to do if you can avoid it. In some cases though, it's unavoidable:

  1. If you're short on time, and need something to work
  2. If you are going to submit a pull request to fix something now, but need an interim workaround until your pull request is accepted upstream
  3. If upstream doesn't want to fix the problem, and you're forced to either maintain a patch or fork upstream into a new project (which is a lot more work).

We'll assume that one of these 3 cases is true.

In the game Factorio, there's a saying 'there's a mod for that' that is often repeated in response to questions in discourse about the game. The same is true of Javascript: If you need to do a non-trivial thing, there's usually an npm package that does it that you can lean on instead of reinventing the wheel.

In this case, that package is called patch-package. patch-package is a lovely little tool that enables you to do 2 related things:

a) Generate patch files simply by editing a given npm package in-situ
b) Automatically and transparently apply generated patch files on npm install, requiring no additional setup steps should you clone your project down from its repository and run npm install.

Assuming you have a working setup with the target npm package you want to patch already installed, first install patch-package:

npm install --save patch-package

Note: We don't --save-dev here, because patch-package needs to run any time the target package is installed... not just in your development environment - unless the target package to patch is also a development dependency.

Next, delve into node_modules/ and directly edit the files associated with the target package you want to edit.

Sometimes, projects will ship multiple npm packages, with one containing the pre-minified build distribution, and the other distributing the raw source - e.g. if you have your own build system like esbuild and want to tree-shake it.

This is certainly the case for Babylon.js, so I had to switch from the main babylonjs package to @babylonjs/core, which contains the source. Unfortunately, the official documentation for Babylon.js is rather inconsistent, which can lead to confusion when using the latter, but once I figured out how the imports worked it all came out in the wash.
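To illustrate the difference in import style (a simplified sketch, not my exact code):

// Before: the monolithic babylonjs package exposes everything under one namespace.
// import * as BABYLON from "babylonjs";
// const camera = new BABYLON.UniversalCamera("camera", new BABYLON.Vector3(0, 0, -10), scene);

// After: @babylonjs/core exposes individual named exports, which is what lets
// a bundler like esbuild tree-shake away the parts of the engine you don't use.
import { UniversalCamera, Vector3 } from "@babylonjs/core";

function makeCamera(scene) {
    return new UniversalCamera("camera", new Vector3(0, 0, -10), scene);
}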

Once done, generate the patch file for the target package like so:

npx patch-package your-package-name-here

This should create a patch file in the directory patches/ alongside your package.json file.

The final step is to enable automatic and transparent application of the new patch file on package installation. To do this, open up your package.json for editing, and add the following to the scripts object:

"scripts": {
    "postinstall": "patch-package"
}

...so a complete example might look a bit like this:

{
    "name": "research-smflooding-vis",
    "version": "1.0.0",
    "description": "Visualisations of the main smflooding research for outreach purposes",
    "main": "src/index.mjs",

    // ....

    "scripts": {
        "postinstall": "patch-package",
        "test": "echo \"No tests have been written yet.\"",
        "build": "node src/esbuild.mjs",
        "watch": "ESBUILD_WATCH=yes node src/esbuild.mjs"
    },

    // ......

    "dependencies": {
        // .....
    }
}

That's really all you need to do!

After you've applied the patch like this, don't forget to commit your changes to your git/mercurial/whatever repository.

I would also advise being a bit careful when installing updates to any packages you've patched in future, in case of changes - though of course installing dependency updates is vitally important to keep your code up to date and secure.

As a rule of thumb, I recommend actively working to minimise the number of patches you apply to packages, and only use this method as a last resort.

That's all for this post. In future posts, I want to look more at the AI theory behind the demo, its implications, and what it could mean for research in the field in the future (is there even a kind of paper one writes about things one learns from outreach activities that accidentally have a bearing on my actual research? And would it even be worth writing something formal? A question for my supervisor and for commenters on that blog post when it comes out, I think).

See you in the next post!

(Background to post banner: Unsplash)

Building the science festival demo: technical overview

Banner showing gently coloured point clouds of words against a dark background on the left, with the Humber Science Festival logo, fading into a screenshot of the attract screen of the actual demo on the right.

Hello and welcome to the technical overview of the Hull Science Festival demo I presented on the 9th September 2023. If you haven't already, I recommend reading the main release post for context, and checking out the live online demo here: https://starbeamrainbowlabs.com/labs/research-smflooding-vis/

I suspect that a significant percentage of the readers of my blog here love the technical nuts and bolts of things (though you're usually very quiet in the comments :P), so I'm writing a series of posts about various aspects of the demo, because it was a very interesting project.

In this post, we'll cover the technical nuts and bolts of how I put it together, the software and libraries I used, and the approach I took. I also have another post written that I'll publish after this one on monkeypatching npm packages after you install them, because I wanted that to be its own post. In a post after that, we'll look at the research and the theory behind the project and how it fits into my PhD and wider research.

To understand the demo, we'll work backwards and deconstruct it piece by piece - starting with what you see on screen.

Browsing for a solution

As longtime readers of my blog here will know, I'm very partial to cramming things into the browser that probably shouldn't run in one. This is also the case for this project, which uses WebGL and the HTML5 Canvas.

Of course, I didn't implement it using the WebGL API directly. That's far too much effort. Instead, I used a browser-based game engine called Babylon.js. Babylon abstracts the complicated bits away, so I can focus on implementing the demo itself rather than reinventing the wheel.

Writing code in Javascript is often an exercise in putting lego bricks together (which makes it very enjoyable, since you rarely have to deviate from your actual task thanks to the existence of npm). To this end, in the process of implementing the demo I collected together a bunch of other npm packages to which I could delegate various tasks.

Graphics are easy

After picking a game engine, implementing the graphics was perhaps unsurprisingly easy - even with 70K points to display. I achieved this with Babylon's PointsCloudSystem class, which made displaying the point cloud a trivial exercise.

After adapting and applying a clever plugin (thanks, @sebavan!), I had points that were further away displaying smaller and closer ones larger. Dropping in a perceptually uniform colour map (I wonder if anyone's designed a perceptually uniform mesh map for a 3D volume?) and some fog made the whole thing look pretty cool and intuitive to navigate.
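As a rough sketch of the sort of thing involved (simplified, not my exact code - words here is assumed to be an array of { word, x, y, z } objects from the dataset described later in this post):

import { PointsCloudSystem, Vector3, Color4 } from "@babylonjs/core";

async function buildWordCloud(scene, words) {
    // 2 here is the point size
    const pcs = new PointsCloudSystem("words", 2, scene);
    pcs.addPoints(words.length, (particle, i) => {
        const { x, y, z } = words[i];
        particle.position = new Vector3(x, y, z);
        particle.color = new Color4(0.5, 0.8, 1.0, 1.0); // stand-in for the real colour map
    });
    // Builds the underlying mesh that Babylon actually renders
    return await pcs.buildMeshAsync();
}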

Octopain

Now that I had the points displaying, the next step was to get the text above each point displaying properly. Clearly with 70K points (140K in the online demo!) I can't display text for all of them at once (and it would look very messy if I did), so I needed to index them somehow and efficiently determine which points were near the player in real time. This is actually quite a well studied problem, and from prior knowledge I remembered that octrees were reasonably efficient. If I had some time to sit down and read papers (a great pastime), this one (some kind of location recognition from point clouds; potentially indoor/outdoor tracking) and this one (AI semantic segmentation of point clouds) look very interesting.

Unfortunately, extracting a list of points within a given radius is not something commonly implemented in the octree packages on npm. Combined with a bit of a headache figuring out the logic and how to hook it up to the existing Babylon renderer, this step took some effort before I found octree-es and got it working the way I wanted.

In the end, I kept the octree as a completely separate point indexing data structure, and used the word itself as a key to link it to the PointsCloudSystem in Babylon.
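For illustration, this is the operation the octree accelerates - the naive version below checks every single point, which is exactly what becomes too slow at 70K+ points (I'm deliberately not reproducing octree-es' own API here):

function pointsNearPlayer(points, playerPos, radius) {
    const radiusSquared = radius * radius;
    return points.filter(p => {
        const dx = p.x - playerPos.x;
        const dy = p.y - playerPos.y;
        const dz = p.z - playerPos.z;
        // Compare squared distances to avoid a square root per point
        return dx * dx + dy * dy + dz * dz <= radiusSquared;
    });
}

An octree narrows the search down to nearby cells instead of scanning the whole list every frame.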

Gasp, is that a memory leak I see?!

Given I was in a bit of a hurry to get the whole demo thing working, it should come as no surprise that I ended up with a memory leak. I didn't actually have time to fix it before the big day either, so I had the demo on the big monitor while I kept an eye on the memory usage of my laptop on my laptop screen!

A photo of my demo up and running on a PC with a PS4 controller on a wooden desk. An Entroware laptop sits partially obscured by a desktop PC monitor, the latter of which has the demo full screen.

(Above: A photo of my demo in action.... I kept an eye on the memory graph in the taskbar on my laptop the whole time. It only crashed once!)

Anyone who has done anything with graphics and game engines probably suspects where the memory leak was already. When rendering the text above each point with a DynamicTexture, I didn't reuse the instance when the player moved, leading to a build-up of unused textures in memory that would eventually crash the machine. After the day was over, I had time to sit down and implement a pool to re-use these textures over and over again, which didn't take nearly as long as I thought it would.
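The pool itself is conceptually simple. A minimal sketch of the idea (makeTexture here is a hypothetical factory standing in for whatever builds a DynamicTexture with the word drawn onto it):

const texturePool = [];

function acquireLabelTexture(makeTexture) {
    // Reuse a previously released texture if one is available,
    // otherwise create a fresh one
    return texturePool.pop() ?? makeTexture();
}

function releaseLabelTexture(texture) {
    // Hand the texture back to the pool instead of letting unused
    // textures pile up in memory every time the player moves
    texturePool.push(texture);
}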

Gamepad support

You would think that, being a well known game engine, Babylon would have working gamepad support. The documentation even suggests as much, but sadly this is not the case. When I discovered that gamepad support was broken in Babylon (at least for my PS4 controller), I ended up monkeypatching Babylon to disable the inbuilt support (it caused a crash even when disabled O.o) and then hacking together a custom implementation.

This custom implementation is actually quite flexible, so if I ever have some time I'd like to refactor it into its own npm package. Believe it or not I tried multiple other npm packages for wrapping the Gamepad API, and none worked reliably (it's a polling API, which can make designing an efficient and stable wrapper an interesting challenge).
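The core of any such wrapper is just polling the browser's Gamepad API every frame. A stripped-down sketch of the pattern (not my actual implementation; the button/axis indices assume the 'standard' mapping, which your controller may not follow):

const previousButtons = [];

function pollGamepad() {
    // getGamepads() returns a sparse list, so filter out the empty slots
    const pad = Array.from(navigator.getGamepads()).find(p => p !== null);
    if (pad) {
        // Axes are floats in [-1, 1]; a deadzone stops the camera drifting
        const deadzone = value => Math.abs(value) < 0.15 ? 0 : value;
        const moveX = deadzone(pad.axes[0]);
        const moveY = deadzone(pad.axes[1]);
        // ...feed moveX / moveY into the camera here...

        // Detect "just pressed" transitions by comparing against last frame
        pad.buttons.forEach((button, i) => {
            if (button.pressed && !previousButtons[i]) {
                console.log(`Button ${i} pressed`);
            }
            previousButtons[i] = button.pressed;
        });
    }
    requestAnimationFrame(pollGamepad);
}

requestAnimationFrame(pollGamepad);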

To do that though I would need to have some other controllers to test with, as currently it's designed only for the PS4 dualshock controller I have on hand. Some time ago I initially purchased an Xbox 360 controller wanting something that worked out of the box with Linux, but it didn't work out so well so I ended up selling it on and buying a white PS4 dualshock controller instead (pictured below).

I'm really impressed with how well the PS4 dualshock works with Linux - it functions out of the box in the browser (useful test website) just fine, and even appears to have native Linux mainline kernel support, which is a big plus. The little touchpad on it is cute and helpful in some situations too, but most of the time you'd use a real pointing device.

A white PS4 dualshock controller.

(Above: A white PS4 dualshock controller.)

How does it fit in a browser anyway?!

Good question. The primary answer to this is the magic of esbuild: a magical build tool that packages your Javascript and CSS into a single file. It can also handle other associated files like images, it tree-shakes by default, and on top of that it's suuuper easy to use - just all-around a joy to work with.

Putting it to use resulted in my ~1.5K lines of code (wow, I thought it was more than that) along with ~300K lines in libraries being condensed into a single 4MiB .js and a 0.7KiB .css file, which I could serve to the browser along with the main index.html file. It's even really easy to implement subresource integrity, so I did that just for giggles.
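For reference, a minimal build script along these lines might look something like the following (a simplified sketch - my actual src/esbuild.mjs also handles watch mode and a few other bits):

import esbuild from "esbuild";

await esbuild.build({
    entryPoints: ["src/index.mjs"],
    bundle: true,       // pull every import into a single output file
    minify: true,
    sourcemap: true,
    outdir: "dist",
    loader: { ".png": "file" } // hand image assets off to esbuild too
});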

Datasets, an origin story

Using the Fetch API, I could fetch a pre-prepared dataset from the server, unpack it, and do cool things with it as described above. The dataset itself was prepared using a little Python script I wrote (source).

The script uses GloVe to vectorise words (I think I used 50 dimensions, since that's what fit inside my laptop at the time), and then UMAP (paper, good blog post on why UMAP is better than tSNE) to do dimensionality reduction down to 3 dimensions whilst still preserving global structure. Judging by the experiences we had on the day, I'd say it was pretty accurate, even if it wasn't always obvious why given words were related (more on why this is the case in a separate post).

My social media data, plotted in 2D with PCA (left), tSNE (centre), and UMAP (right). Points are blue against a white background, plotted with the Python datashader package.

(Above: My social media data, plotted in 2D with PCA (left), tSNE (centre), and UMAP (right). Points are blue against a white background, plotted with the Python datashader package.)

I like Javascript, but I had the code written in Python due to prior research, so I just used Python (looking now, there does seem to be a package implementing UMAP in JS, so I might look at that another time). The script is generic enough that I should be able to adapt it for other projects in the future to do similar kinds of analyses.

For example, if I were to look at a comparative analysis of e.g. language used by social media posts from different hashtags or something, I could use the same pipeline and just label each group with a different colour to see the difference between the 2 visually.
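On the JS front: if I ever do try porting the script, I suspect (I haven't actually tried it yet) the umap-js package is the sort of thing I'd reach for, with usage along these lines:

import { UMAP } from "umap-js";

// vectors: one row per word, each row being the (e.g. 50-dimensional) GloVe vector
function reduceTo3d(vectors) {
    const umap = new UMAP({ nComponents: 3, nNeighbors: 15, minDist: 0.1 });
    return umap.fit(vectors); // one [x, y, z] triple per input row
}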

The data itself comes from 2 different places, depending on where you see the demo. If you were lucky enough to see it in person, then it was directly extracted from my social media data. The online one comes from page abstracts from various Wikipedia language dumps, to preserve the privacy of the social media dataset just in case.

With the data converted, the last piece of the puzzle is that of how it ends up in the browser. My answer is a gzip-compressed headerless tab-separated-values file that looks something like this (uncompressed, of course):

cat    -10.147051      2.3838716       2.9629934
apple   -4.798643       3.1498482       -2.8428414
tree -2.1351748      1.7223179       5.5107193

With the data stored in this format, it was relatively trivial to load it into the browser, decompress it, and then display it with Babylon.js. There's also room to expand and add additional columns later if needed, to e.g. control the colour of each point, label each word with a group, or something else.
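A minimal sketch of that loading step might look like this (assuming the browser supports the DecompressionStream API - the actual demo may handle decompression differently):

async function loadPoints(url) {
    const response = await fetch(url);
    // Decompress the gzip stream as it arrives
    const stream = response.body.pipeThrough(new DecompressionStream("gzip"));
    const text = await new Response(stream).text();

    // One point per line: word, x, y, z separated by tabs
    return text.trim().split("\n").map(line => {
        const [word, x, y, z] = line.split("\t");
        return { word, x: parseFloat(x), y: parseFloat(y), z: parseFloat(z) };
    });
}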

Conclusion

We've pulled the demo apart piece by piece, and seen at a high level how it's put together and the decisions I made while implementing it. We've seen how I implemented the graphics, aided by Babylon.js and a clever hack. I've explained how I optimised the location polling to achieve real-time performance with an octree, and why reusing textures is so important. Finally, we took a brief look at the dataset and where it came from.

In the next post, we'll take a look at how to monkeypatch an npm package and when you'd want to do so. In a later post, we'll look at the research behind the demo, what makes it tick, what I learnt while building and showing it off, and how that fits in with the wider field from my perspective.

Until then, I'll see you in the next post!

Edit 2023-11-30: Oops! I forgot to link to the source code....! If you'd like to take a gander at the source code behind the demo, you can find it here: https://github.com/sbrl/research-smflooding-vis

My Hull Science Festival Demo: How do AIs understand text?

Banner showing gently coloured point clouds of words against a dark background on the left, with the Humber Science Festival logo, fading into a screenshot of the attract screen of the actual demo on the right.

Hello there! On Saturday 9th September 2023, I was on the supercomputing stand for the Hull Science Festival with a cool demo illustrating how artificial intelligences understand and process text. Since then, I've been hard at work tidying that demo up, and today I can announce that it's available to view online here on my website!

This post is a general high-level announcement post. A series of technical posts will follow on the nuts and bolts of both the theory behind the demo and the actual code itself and how it's put together, because it's quite interesting and I want to talk about it.

I've written this post to serve as a foreword / quick explanation of what you're looking at (similar to the explanation I gave in person), but if you're impatient you can just find it here.

All AIs currently developed are essentially complex parametrised mathematical models. We train these models by updating their parameters little by little until the output of the model is close to some ground truth label.

In other words, an AI is just a bunch of maths. So how does it understand text? The answer to this question lies in converting text to numbers - a process often called 'word embedding'.

This is done by splitting an input sentence into words, and then individually converting each word into a series of numbers - which is what you will see in the demo at the link below, just converted with some magic to 3 dimensions to make it look fancy.

Similar sorts of words will have similar sorts of numbers (or positions in 3D space in the demo). As an example here, at the science festival we found a group of footballers, a group of countries, and so on.
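To make that a little more concrete, here's a toy illustration with completely made-up numbers (real embeddings use tens or hundreds of numbers per word):

// Each word maps to a list of numbers; similar words get similar numbers
const embeddings = {
    cat:    [ 0.90, 0.80, -0.10],
    dog:    [ 0.85, 0.75,  0.00],  // close to "cat"
    france: [-0.70, 0.20,  0.90],
    spain:  [-0.65, 0.25,  0.85]   // close to "france", far from the animals
};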

In the demo below, you will see clouds of words processed from Wikipedia. I downloaded a bunch of page abstracts from Wikipedia in a number of different languages (source), extracted a list of words, converted them to numbers (GloVe → UMAP), and plotted them in 3D space. Can you identify every language displayed here?


Find the demo here: https://starbeamrainbowlabs.com/labs/research-smflooding-vis/

A screenshot of the initial attract screen of the demo. A central box allows one to choose a file to load, with a large load button directly beneath it. The background is a blurred + bloomed screenshot of a point cloud from the demo itself.

Find the demo here: https://starbeamrainbowlabs.com/labs/research-smflooding-vis/


If you were one of the lucky people to see my demo in person, you may notice that this online demo looks very different to the one I originally presented at the science festival. That's because the in-person demo uses data from social media, but this one uses data from Wikipedia to preserve privacy, just in case.

I hope you enjoy the demo! Time permitting, I will be back with some more posts soon to explain how I did this and the AI/NLP theory behind it at a more technical level. Some topics I want to talk about, in no particular order:

  • General technical outline of the nuts and bolts of how the demo works and what technologies I used to throw it together
  • How I monkeypatched Babylon.js's gamepad support
  • A detailed and technical explanation of the AI + NLP theory behind the demo, the things I've learnt about word embeddings while doing it, and what future research could look like to improve word embeddings based on what I've learnt
  • Word embeddings, the options available, how they differ, and which one to choose.

Until next time, I'll leave you with 2 pictures I took on the day. See you in the next post!

Edit 2023-11-30: Oops! I forgot to link to the source code....! If you'd like to take a gander at the source code behind the demo, you can find it here: https://github.com/sbrl/research-smflooding-vis

A photo of my demo up and running on a PC with a PS4 controller on a wooden desk. An Entroware laptop sits partially obscured by a desktop PC monitor, the latter of which has the demo full screen.

(Above: A photo of my demo in action!)

A photo of some piles of postcards arranged on a light wooden desk. My research is not shown, but visuals from other researchers' projects are printed, ranging from microbiology to disease research to jellyfish galaxies.

(Above: A photo of the postcards on the desk next to my demo. My research is not shown, but visuals from other researchers' projects are printed, with everything from microbiology to disease research to jellyfish galaxies.)

I've submitted a paper on my rainfall radar research to NLDL 2024!

A screenshot of the nldl.org conference website.

(Above: A screenshot of the NLDL website)

Hey there! I'm excited that last week I submitted a paper to what I hope will become my very first conference! I've attended the AAAI-22 doctoral consortium online, but I haven't had the opportunity to attend a conference until now. Of course, I had to post about it here.

First things first, which conference have I chosen? With the help of my supervisor, we chose the Northern Lights Deep Learning Conference. It's relatively close to the UK (where I live), it's relevant to my area and the paper I wanted to submit (I've been working on the paper since ~July/August 2023), and the deadline wasn't too tight. There were a few other conferences I was considering, but they either had really awkward deadlines (sorry, HADR! I've missed you twice now), or got moved to an unsafe country (IJCAI → China).

The timeline is roughly as follows:

  • ~early - ~mid November 2023: acceptance / rejection notification
  • somewhere in the middle: paper revision time
  • 9th - 11th January 2024: conference time!

Should I get accepted, I'll be attending in person! I hope to meet some cool new people in the field of AI/machine learning and have lots of fascinating discussions about the field.

As longtime readers of my blog here might have guessed, the paper I've submitted is on my research using rainfall radar data and abusing image segmentation to predict floods. The exact title is as follows:

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task

As the paper is unreviewed, I don't feel comfortable with releasing it publicly yet. However, feel free to contact me if you'd like to read it and I'm happy to hand out a copy of the unreviewed paper individually.

Most of the content has been covered quite casually in my PhD update blog post series (16 posts in the series so far! Easily my longest series by now), just explained here in more formal language.

This paper will also form the foundation of the second of two big meaty chapters of my thesis, the first being based on my social media journal article. I'm currently at 80 pages of thesis (including appendices, excluding bibliography, single-spaced A4), and I still have a little way to go before it's done.

I'll be back soon with another PhD update blog post with more details about the thesis writing process and everything else I've been up to over the last 2 months. I may also write a post on the Hull Science Festival, which I'll be attending on the supercomputing stand with a Cool Demo™, 'cause the demo is indeed very cool.

See you then!

How to read a paper

So you've got a paper. Maybe even a few papers. Okay, it's a whole stack of them and you don't have the time to read them all (they do have a habit of multiplying when you're not looking). What is one to do? I've had this question asked of me a few times, so I thought I'd write up a quick post to answer it, organise my thoughts, and explain my personal process for sorting through and reading scientific papers (I generally find regular 'news'papers to be of questionable reliability, lacking in depth, and just not worth the effort).

A bunch of papers

(A bunch of papers I've read.... and one that I've written.)

Finding papers

If you are in a position where you don't have any papers to begin with, then search engines are your best friend. Just like DuckDuckGo, Ecosia, and others provide an interface to search the web, there are special search engines designed to search for scientific papers. The two main ones I suggest are:

Personally, Semantic Scholar is my paper search engine of choice. Enter some general search terms for the field / thing you want to read about, and relevant papers will be displayed. It can be useful to change the sort order from relevance to citation count or most influential papers to get a look at what are likely to be the seminal papers in that field (i.e. the ones that first introduced a thing - e.g. the Attention is all you need paper, which first introduced the transformer) - though they may be less relevant.

The other nice feature these search engines have is copying out BibTeX to paste into your bibliography in LaTeX (see also the LaTeX templates I maintain for reports/papers/dissertations/theses).

A note on reliability: Papers on preprint servers like arXiv have not been peer reviewed. Avoid these unless there's no other option.

Sorting through them

So you know how to find papers now, but how do you actually read them? Personally, I use a tiered system for this.

Reading the abstract: Firstly, I'll read the abstract. Just as you read the title of a search result to decide whether to click on it, I read the abstract to decide whether a paper is worth my time to read.

Sometimes I'll stop there. Maybe the paper isn't what I thought it was, or I've simply got all the information I need from it. The latter is most common when I'm writing some paper or report: often I'll need a paper as a reliable source for something, and I won't need to read the whole paper to know that it has the information I need.

Okay, so suppose a paper passes a quick look at the title and abstract, and I want to go deeper. You'd think it's time to jump right in and read it from top to bottom, but you'd be wrong. Reading an entire paper in detail is significantly time consuming, and I want to be really sure it's worth the effort before I commit to it.

Skim reading: The next test is a quick skim read. If it's a journal article, there might be some key contributions listed at the top of the paper - these are a good place to start. If not, they can often be found at the end of the introduction - this goes for conference papers too. The introduction is usually my second stop (though remember I'm still not reading it word for word yet), followed by the end of the results/experimental discussion section to understand the key points of what they did and how it went for them.

AI summarisation: Another option if a paper is dense and/or long is to use an AI summarisation tool. These must always be taken with a grain of salt, but they can help direct my search when I'm having difficulty extracting a specific piece of information. AI summarisation can also be a good start if an abstract is bad or missing the information I want but the subject itself is interesting. I often find AI-generated summaries can be quite generic, so it's not a complete solution.

A note on ChatGPT: ChatGPT is a generic language model, and as such isn't ideal for generating summaries of documents. It's best to use a model specifically trained for this purpose, and to take any output you get with a grain of salt.

AI document discussion: Occasionally the abstract of a paper suggests that it contains a significantly interesting nugget of information I'm interested in acquiring (again, most often when writing a paper rather than initial research), but the paper is long, dense, I'm having difficulty finding it, or some combination of the three.

This is where AI-driven document discussion can be invaluable. As I noted earlier, AI-generated summaries tend to be quite generic, so it's not great if there's something highly specific I'm after. The only place I'm currently aware of that ships this feature in a useful form is Kagi, a paid-for search engine with AI features (document summarisation and document discussion) built-in. I'm sure others have shipped the feature, but I haven't seen them yet.

Essentially, AI-driven document discussion is where you ask a natural language question about the target paper, and it does the reading comprehension for you by answering your question with useful quotes from the paper. Then once you have the answer you can go and look at that specific part of the paper (use your browser's find tool) to get additional context.

I've found this to be a great time saver. It can also be useful if I'm unsure if a paper actually talks about the thing I'm interested in or not.

Kagi: Specifically, Kagi (my current main search engine) implements both of the aforementioned features. They can be accessed via the Discuss document option next to search results, or via dedicated !bangs (Kagi implements all of DuckDuckGo's !bangs too), which are significantly helpful as I touched on above.

  • AI summarisation: !sum <url_of_paper_or_webpage>
  • AI discuss document: !discuss <url_of_paper_or_webpage>

A disclaimer: I have received no money or other forms of compensation for mentioning Kagi here. Kagi have not asked me to mention them at all; I just think their product is helpful, useful, produces good search results, and saves me time. AI models can be computationally expensive, so I speculate it would be difficult to find a free version without strings attached.

A screenshot of a sample discuss document discussion about the paper Attention is all you need.

(Above: A screenshot of a sample discuss document discussion about the paper Attention is all you need)

How to read a paper effectively

So a paper has somehow made it through all of those steps unscathed, and yet I still haven't extracted everything I want to know from it. By this point it must be a significantly interesting paper that I likely want lots of details from.

The process of actually reading a paper from top to bottom is an inherently time consuming one: hence all the other steps above to filter papers out with minimal effort before I commit to spending what is typically an hour or more of my time to a single paper.

My general advice is to do a re-read of the abstract to confirm, and then start with the introduction and make your way down. Take it slow.

Making notes: When I do read a paper, I always make notes while doing so. Having 2 monitors is also helpful, as I can make notes on one and have the paper on the other. My current tool of choice here is Obsidian, a fabulous note-taking system that I'll wholeheartedly recommend to everyone. It's Markdown-based and has a tagging system (nested tags are supported too!) to keep papers organised. The directed graph and canvas features are also pretty cool. My general template at the moment for making notes on papers is as follows:

---
tags: some, tags/here
---

> - URL: <https://example.com/paper_url_here/doi_if_possible.pdf>
> - Year: YEAR_PAPER_WAS_PUBLISHED

- Bulleted notes go here
    - I nest bullet points based on the topic
        - To as many levels as needed
    - These notes are very casual
- [I contain my own thoughts in square brackets]
    - This keeps the things that the paper says separate from the things that I think about it
- Sometimes if I'm making a lot of notes I'll split them up into sections derived from the paper


## PDF
The last section contains the PDF of the paper itself. Obsidian supports dragging and dropping PDFs in, and it also has a dedicated PDF viewer.

Complete with an explanation of what each section is for!

You don't have to use Obsidian (it's the best one I've found), but I strongly recommend making notes while you read a paper. This way you have some distilled notes in your own words to refer back to later. It also helps to further your own understanding of the topic of a paper by putting it into your own words. Other tools I'm aware of include OneNote and QOwnnotes (I still use this for making notes in meetings and recording random stuff that's not necessarily related to research. I keep Obsidian quite focused atm).

Make sure these notes are digital. You'll thank me later. The number of times I've used Obsidian's search function to find the notes I made about a specific paper is absolutely unreal. Over time you'll get a good sense for what you need to make notes on - enough to avoid having to refer back to the paper again later, but not so much that searching your notes takes longer than hunting around in the source paper for the information you were after.

A screenshot of my obsidian workspace.

(Above: A screenshot of my Obsidian workspace.)

Sometimes your research project will change direction, and the notes you made are suddenly less relevant. Or you've learned something elsewhere and now come back with fresh and more experienced eyes. I often update the notes I took initially to add more information, or references to other related papers that go together.

Continual evaluation: As I read, I'm continually evaluating in the back of my mind whether it's worth continuing to read. I'm asking questions like "is this paper going on a tangent?", and "is the solution to their problem the researchers employed actually interesting to me?", and "is this paper getting too dense for me to understand?", and "is the explanation the paper gives actually intelligible?" (yes, papers do vary in explanatory quality). If the exercise of reading a paper becomes not worth the time, stop reading it and move on.

Sometimes it's worth jumping into skim-reading mode for a bit if something's irrelevant etc to see if it gets better.

But I don't understand something!

This is a normal part of reading a paper. This can be for a number of reasons:

  1. The paper is bad
  2. The paper is good, but is terrible at explaining things
  3. The paper contains more maths than explanation of the variables contained therein
  4. I'm lacking some prerequisite knowledge that the paper doesn't properly explain
  5. Some other issue

It is not always obvious which of these cases I find myself in when I encounter difficulty reading a paper. Nevertheless, I employ a number of strategies to deal with the situation:

  • Reading around: As in most things, reading around the area of the paper that is causing an issue may yield additional information. Sometimes returning to the related works / background / approach section can help.
  • Search for related papers: Many papers have been written, so it can be worth going looking for a related one. It might be a better paper, or worded differently in a way that makes it easier to understand.
  • Look through the paper's references: This can also be a good way to trace back to the source of an idea. Semantic Scholar's References tab below the abstract lists all the references too, and the related works section of a paper will tell you how each cited work is relevant to the problem, motivations, and the subsequent method and results.
  • Look for seminal papers: See above. Finding the original paper on a given idea can help a lot, as it's often explained in much more detail than later papers that assume you've read the so-called seminal work.
  • Web search: For specific terms or concepts. Sometimes just a quick definition is needed. Other times it's more substantial and requires reading an entire separate blog post - compare Attention is all you need with the blog post the illustrated transformer. Each provides a different perspective. In this case I actually read both at the same time to fully understand the topic. Make sure you properly assess anything you find for reliability as usual.

Supervision: It's very unlikely that after all of these steps I'll still be stumped on how to proceed, but it has happened. In these situations it can be extremely helpful to have someone more experienced in the field to discuss things with. For me, this is my PhD supervisor Nina.

Whoever they are, keeping in regular contact is best as you work through a project. Frequency varies, but for my PhD supervision this has fallen somewhere between 1 and 3 weeks between meetings, and each meeting is no less than an hour long. Their advice and insight can guide your efforts as you progress through a research project.

They will also likely be busy people, so make sure you properly prepare before meeting them. Summarise what you've read and how it relates to your project and what you want to do. Make a list of questions that you want to ask them. Gather your thoughts. This will help you make the most of your discussion with them.

Conclusion

I've outlined my personal process I employ when reading a paper (in perhaps more detail than was necessary). It's designed to save me time and allow me to cover ground relatively quickly (though quickly is still a relative term, as in a worst-case with a completely new broad field it can take weeks to cover it enough to gain a good understanding thereof).

This is my process: you need to find something that works for you. It's okay if this takes time. Maybe lots of time... but you'll get there in the end. The more you read, the more you'll get an instinctive sense of the stuff I ramble about here. My method isn't perfect either - I'm still learning, so my process will likely evolve over time.

If you've got any comments or questions, do leave them in the comments section below and I'll do my best to answer them.

Chromium nightly script

I don't really like Chrome (I could write an entire blog post about this), but sometimes circumstances demand that I use a Blink-based (Chromium's rendering engine) browser for some rare and limited but equally essential tasks.

Unfortunately, the default chromium package in Ubuntu is now a snap, which complicates matters as snaps generally cause issues I'd rather not deal with on my system. This left me out of options, until I did some digging and found that Chromium nightly is available to download as a zip. Fast-forward an hour and I now have a quick little script that automates the process of downloading and running Chromium nightly, so I thought I'd share it here.

I've talked about shell scripts being lego before (exhibits A, B, C, and D), and the same applies here - so I'll break it down and explain each part. Let's set out first what we want to do:

  1. If chromium has already been downloaded, skip to step 4
  2. Download .zip from https://download-chromium.appspot.com/dl/Linux_x64?type=snapshots
  3. Extract to somewhere in /tmp
  4. Run the chromium binary

Now, let's put this together into a shell script. First, let's define some variables:

#!/usr/bin/env bash

set -e;
download_url="https://download-chromium.appspot.com/dl/Linux_x64?type=snapshots";
temp_dir="/tmp/chromium-nightly";

#!/usr/bin/env bash tells Linux to run the script with Bash - this must be the first line of the file. set -e tells Bash to exit immediately if any errors are encountered instead of trying to continue - this is a shell flag, so you could get the same effect by executing a script like bash -e path/to/script.sh instead of doing it here, but in this case we always want the option to be set, hence the use of set here instead.

Now, let's create that temporary directory:

if [[ ! -d "${temp_dir}" ]]; then mkdir "${temp_dir}"; fi

Next on the list is to check whether we've already downloaded Chromium nightly. The laziest way I can think of to do this is to check whether the chrome binary exists and is executable. This can be done like so:

if [[ ! -x "${temp_dir}/chrome-linux/chrome" ]]; then
    echo "download chromium nightly here";
fi

If statements are a bit weird in Bash. -x checks to see if the file at the following path is executable or not, and ! inverts it.

Next, we need to download the archive. Let's do that inside the if statement:

    echo ">>> Downloading chromium" >&2;
    curl -SL --progress-bar "${download_url}" -o "${temp_dir}/chromium.zip";

>&2 sends the output to the standard error instead of the standard output. curl is a command for downloading things from the internet. We provide the URL to download (it supports almost every protocol imaginable, but here we're just using https) and the place to download it to (-o), and it does the rest.

Next up is extracting it:

    echo ">>> Extracting zip" >&2;
    unzip "${temp_dir}/chromium.zip" -d "${temp_dir}";

    echo ">>> Cleaning up" >&2;
    rm "${temp_dir}/chromium.zip";

unzip is the command to unzip .zip archives, and -d tells it the directory to extract everything to. Here I manually downloaded the file at the download URL and inspected it with the file command (file path/to/unknown_file) to see what format I was dealing with - then once I knew it was a .zip archive I chose chromium.zip as the filename to download it to.

In cases where I have a file with the correct file extension that I want to extract as a one-off, I also have an all-in-one script that can automatically determine the right extractor for it. Here though we use the direct command to simplify the script.

Finally, we delete the .zip after we're done extracting it, as it's no longer needed.

Now that chromium nightly is downloaded, we can start it like so:

echo ">>> Starting chromium" >&2;
exec "${temp_dir}/chrome-linux/chrome";

...and we're done! exec here is a builtin that replaces the current process with another, which reduces the number of running processes. Here's the full script:
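(Pieced together from the snippets above - the embedded version may differ slightly in its finer details.)

#!/usr/bin/env bash

set -e;
download_url="https://download-chromium.appspot.com/dl/Linux_x64?type=snapshots";
temp_dir="/tmp/chromium-nightly";

if [[ ! -d "${temp_dir}" ]]; then mkdir "${temp_dir}"; fi

if [[ ! -x "${temp_dir}/chrome-linux/chrome" ]]; then
    echo ">>> Downloading chromium" >&2;
    curl -SL --progress-bar "${download_url}" -o "${temp_dir}/chromium.zip";

    echo ">>> Extracting zip" >&2;
    unzip "${temp_dir}/chromium.zip" -d "${temp_dir}";

    echo ">>> Cleaning up" >&2;
    rm "${temp_dir}/chromium.zip";
fi

echo ">>> Starting chromium" >&2;
exec "${temp_dir}/chrome-linux/chrome";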

(Can't see the above? Try a direct link)

Shell scripts - like this one - can be really useful for automating repetitive tasks. Whether you use Linux, macOS, Windows, or something else I can absolutely recommend learning your system's default shell scripting language - it will save you a lot of time.

Let me know if you have any questions about this or shell scripting in the comments below, and I'll do my best to help.
