06 Aug 2025

Do I need a Lisp machine comeback?

One might say I’m too obsessed with Lisp. Or in general with “unconventional” things. Maybe I am. Or maybe dead technologies hold some buried gems that you cannot find in the modern world of computing.

This story is about WakeGP. A few years ago, I wanted to get started with Evolutionary Machine Learning, specifically Genetic Programming. So I started writing the WakeGP software in Rust. For a few months now, I have been running experiments with different parameters and algorithms to see which ones produce better results (e.g. better accuracy).

My program uses a TOML file for its configuration. It first loads the dataset specified in the configuration and, after some processing, produces train, test and validation datasets. Then the Genetic Programming run starts, and after each generation a line is printed to standard output to tell me about the generation that just finished. Something like this:

[000:00:16] Gen 600 avg/avg_ft/ft_div/avg_size/Z -/+ 62.5890/65.6710 0.641114 0.000420 694.9 0
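Pulling the numbers out of a line like that can be sketched in a few lines of Python. This is only a sketch under assumptions: I treat the bracketed prefix as hours:minutes:seconds, and I collect the numeric fields in order without giving them names, since which number belongs to which label is not spelled out here. The function name `parse_generation_line` is made up for the example, and the simple number pattern does not handle negative values or scientific notation.

```python
import re

# Assumed layout: "[HHH:MM:SS] Gen <n> <labels> <numbers...>"
LOG_LINE = re.compile(r"^\[(\d+):(\d+):(\d+)\]\s+Gen\s+(\d+)\s+(.*)$")
# Plain non-negative decimals only; the label part of the line contains no digits.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_generation_line(line):
    """Return elapsed time, generation and raw numeric fields, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    hours, minutes, seconds, generation, rest = match.groups()
    return {
        # Assumption: the prefix is hours:minutes:seconds since the run started.
        "elapsed_seconds": int(hours) * 3600 + int(minutes) * 60 + int(seconds),
        "generation": int(generation),
        # Numeric fields in the order they appear, left unnamed on purpose.
        "values": [float(token) for token in NUMBER.findall(rest)],
    }


sample = "[000:00:16] Gen 600 avg/avg_ft/ft_div/avg_size/Z -/+ 62.5890/65.6710 0.641114 0.000420 694.9 0"
record = parse_generation_line(sample)
print(record["generation"], record["values"])
```

Keeping the values as an ordered list means the extraction script does not have to change when I decide to look at a different column.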

Finally, when the end condition is met, it optionally finds the best and the smallest individuals, evaluates them on the 3 datasets, and then quits.

Assuming that my algorithm works as it should, which itself is very questionable, I need to find the hyperparameters which lead to the best and smallest individuals, or at least the best ones. To find the best combination of hyperparameters, I need to do a lot of runs with each combination. Then I should do a t-test to find out which ones actually perform better.

The current process is that I write a Python script which holds a list of combinations (see my July devlog) and does a number of runs per combination. I used to write shell scripts, but working with TOML files is not easy in shell, and they are also slower. That’s not the end of it: I also have to write a systemd service so that I can manage the process better and have it start on boot if, for whatever reason, a reboot happens.
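The core of such a runner can be sketched as below. This is not the actual script: the hyperparameter names and values in `GRID` are invented for illustration, and the sketch only builds the list of (combination, run) jobs. Writing the TOML file and launching WakeGP for each job is left out.

```python
import itertools

# Hypothetical grid: these keys and values are placeholders, not real WakeGP options.
GRID = {
    "population_size": [500, 1000],
    "mutation_rate": [0.05, 0.1, 0.2],
    "tournament_size": [4, 7],
}
RUNS_PER_COMBINATION = 3


def combinations(grid):
    """Yield every combination of the grid as a dict, in a stable key order."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


def schedule(grid, runs):
    """Return one (combination, run index) job per run of each combination."""
    return [(combo, run) for combo in combinations(grid) for run in range(runs)]


jobs = schedule(GRID, RUNS_PER_COMBINATION)
print(len(jobs), "jobs; first:", jobs[0])
```

Because `schedule` returns a plain list, a later wave could pass in a smaller grid or a filtered list of combinations instead of iterating over everything.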

When the experiments are done, I have to write another Python script to read all WakeGP logs, extract the numbers (fitness, accuracy and size), and do t-tests to find out which combinations perform best. First, a small number of runs per combination is done. Some combinations fall way behind. Then I schedule a “second wave” to do more runs on the fittest combinations. I call these “best buckets”. These are the combinations for which I don’t have enough data to show that one performs better than another, because the $p$-value is too large. But… a Python script which extracts all the data and does a large number of t-tests is too slow. I can either give myself more headaches with multiprocessing or use pypy, which is faster. But even pypy, while being tons faster, is not fast enough.
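For reference, the comparison between two combinations can be sketched with the standard library alone. I am assuming Welch's unequal-variance t-test here, which is one common choice and not necessarily the variant the real script uses. The sketch returns only the t statistic and the degrees of freedom; turning them into a $p$-value needs the CDF of the t distribution, which something like `scipy.stats.ttest_ind(a, b, equal_var=False)` provides. The sample numbers are made up.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    # Squared standard errors of the two sample means (sample variance / n).
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df


# Made-up fitness values for two combinations, five runs each.
runs_a = [1.0, 2.0, 3.0, 4.0, 5.0]
runs_b = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(runs_a, runs_b)
print(t_stat, dof)
```

With only five runs per side and this much spread, the statistic is small, which is exactly the "not enough data yet" situation that sends a pair of combinations into the next wave.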

Anyway, I do it with pypy, and now I have a list of “bests”. I have to do a second wave of experiments, so I have to change the runner script to read combinations from a text file instead of iterating over all combinations.

Again, I just get through it. I do the same for the next waves. Finally, I decide to do experiments for a different set of hyperparameters. I have to copy the runner script, change it to fit my needs, write a new systemd unit file and go through the same process again. Needless to say, I also have to write another Python script to extract the results, because this time I’m not looking at the fitness values but at something else.

I can try writing a more generalized Python script and even interact with systemd from it. But then I’ll have to spend 2-3 days designing and writing such a thing, plus probably a few more days debugging it whenever things go wrong, like in edge cases. And it won’t be long before I realize the script is not general enough and I need to change it again. It greatly slows me down…

All these issues exist even when I don’t need to change the WakeGP code. Since it is written in Rust, each time I change the algorithm I need to recompile it and wait tens of seconds for the build to finish. The story I’m narrating here is just about the hyperparameters.

There are other problems too. If I need my computer for something, I’ll have to either stop the runner service or change the number of threads WakeGP uses. Both routes create multiple issues.

For the first one, I have to remember to start it again. If I don’t and go home, I have lost a few hours. If I do that all week, I have to wait a week longer to get the experiment results. My mind has a limited random access memory!

For the latter, I have to go down the path of adding an option to the runner script so that it reads the thread count from some configuration file and reverts it to the original value after a time frame. Heck no! I wrote it, but it didn’t work well, and I didn’t have time to spend debugging it.

If I want to do some small experiment, like comparing just two combinations, I can’t use the runner script. I have to do it with POSIX shell and a for loop. That comes with tons of disadvantages, but it has the advantage that the initial investment is negligible.

To summarize, there are multiple problems here, and these are the things I need:

I need to interact with systemd so the experiments always run, even if the computer reboots. Two common reasons for reboots are kernel updates and power outages. It would be nice if the power state of my PC were somehow invisible to the programs.
I need a mechanism to reduce the resource usage of WakeGP when I need my CPU for something else, like compiling a program or running another ad hoc WakeGP experiment.
I need a shell/language combo which is fast and expressive enough to extract and process data, and at the same time convenient for ad hoc stuff. POSIX shell is certainly not a good choice for this. It offers some useful tools, but it lacks floating-point arithmetic, t-tests, and so on. Python is slow and very inconvenient as a shell.
I need a workspace whose changes persist across reboots and process termination. For example, if I define a function for something, it should be there in the workspace the next time I want to use it. If I close the workspace and open it again, my things are already there.

Solution? It seems I’m looking for a Lisp Machine, or a Lisp shell whose changes are persistent and to which reboots and program terminations are invisible. Does such a thing exist? I don’t know.

Edit 0

I talked about my needs in #lisp, #commonlisp and #sbcl on the Libera.Chat IRC network, and I got some keywords and information for further research. There is SBCL’s save-lisp-and-die, which saves the current state into a core image and kills the process. You can load the image later and continue from where you stopped. If you don’t want to kill the process, you can fork, call save-lisp-and-die in the fork, and keep using the original process.

Quoting one of the folks in #sbcl:

years ago I had sbcl rigged up to fork & save-lisp-and-die when I logged out, then when I logged in again that saved image would start up again and it was like (almost) nothing happened. it was fun, but occasionally it would break and I’d have to start from a fresh image. Then I’d find out that my on-disk source code had diverged from what was in the image and struggle to bring things up properly again

Another problem they mentioned was library upgrades: “They aren’t designed to be hot upgraded.” Upgrading SBCL itself is a problem too.

The thing I’m looking for seems to be “residential-style development”. Someone else added:

Interlisp tried to do so. But such these have fallen out of flavor for so long in favor of file systems and databases. Maybe the reason was too much complexity.

Finally, I got a link to Richard Stallman’s note comparing file-based and residential systems. Note that Richard Stallman is one of the geeks/hackers who worked on the Lisp Machine at the MIT AI lab. And interestingly, one of the only two who didn’t join any company to commercialize Lisp Machines!

Edit 1

Paolo Amoroso gave me a link to read more about “residential-style development” and Interlisp. And yeah, there is an Interlisp Mastodon account too.