karuiwm
Today, merge request 14 was accepted into karuiwm's master branch, and karuiwm can now finally be used without having to embed user preferences into the binary at compile-time via a configuration header.
But what even is karuiwm, you're asking?
Let me reach back a bit.
dwm
In the year 2013 (yes, I am aware that 2013 is 8 years ago and this seems excessive; bear with me, this is relevant), ayekat was a user of suckless' dwm, a dynamically tiling window manager that is, among other things, known for not loading any configuration at runtime. Instead, things like key bindings, window border colours and other user preferences are all declared in a C source header file, config.h, that is then included in dwm's source code at compile-time.
This requires the user to have some basic understanding of the source code itself, but that sounds scarier than it really is: The configuration file is mostly just a bunch of variable declarations whose syntax is relatively easy to grasp, and an example configuration file is provided to showcase how you can invoke specific functions within dwm.
For more advanced tweaking, though, you have to patch parts of dwm's source code itself. This is even actively encouraged (and people's patches are also gathered on dwm's website itself). Overall, dwm users quickly get to know its codebase and how to extend it the way they like. There's probably a reason why many of today's existing tiling window managers can be traced back to dwm in some form or another.
One notable property of dwm is that it doesn't group windows together into workspaces (or "desktops"). Instead, it provides tags that you can assign to its windows. So as a user, rather than switching around between workspaces, you instead decide which subset of tags to select (= which windows to show). This allows selecting windows a lot more flexibly, and overall, this is quite powerful if you grasp the concept.
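To make the tags idea a bit more concrete, here is a minimal sketch in Python (not dwm's actual C code). It relies on the fact that dwm stores tags as bits in an integer mask; the tag names, the window names and the helper `visible_windows` are made up for illustration.

```python
# Each tag is one bit; a window's tags and the current selection are bitmasks.
WEB, CODE, CHAT = 1 << 0, 1 << 1, 1 << 2

# Hypothetical windows, each carrying one or more tags.
windows = {
    "browser": WEB,
    "editor": CODE,
    "terminal": CODE | CHAT,  # one window can carry several tags
    "irc": CHAT,
}


def visible_windows(selected_tags):
    """Return the names of windows sharing at least one tag with the selection."""
    return [name for name, tags in windows.items() if tags & selected_tags]


print(visible_windows(CODE))        # a single tag behaves like one workspace
print(visible_windows(WEB | CHAT))  # several tags selected at once
```

Selecting a single tag gives the familiar workspace behaviour; selecting several tags at once shows the union of their windows, which is the flexibility that plain workspaces lack.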
Of course, ayekat didn't grasp the concept.
Instead, ayekat spent considerable time and energy on patching dwm and defining key bindings such that only one tag at a time could be active, and a window could be tagged with only one tag at a time, and one could switch between "adjacent" tags (left/right) and optionally grab a window and pull it from one "tag" to another, and … yeah, well, it was basically workspaces.
dwm statusbar showing "basically workspaces".
At some later point, I abused dwm's tags system further to get a special "scratchpad" tag, with one window assigned that I could easily toggle on and off with a key binding, overlaying the windows in the current workspace (note that a scratchpad patch did not exist back then). All in all, I lived a pretty happy life with "workspaces" in dwm.
That is, until one day I saw florv use an XMonad module called GridSelect, which looked something like this:
My brain somehow misinterpreted this as some sort of visualisation of a two-dimensional map of workspaces where one could move around and create workspaces dynamically, and I was intrigued (especially given my own limited tags, I mean workspaces). And I wanted that as well.
The two-dimensionally arranged, dynamically created workspaces, I mean. Not GridSelect.
Now it probably doesn't come as a surprise to anyone that dwm's code isn't exactly designed with workspaces in mind. And it definitely isn't designed with dynamically allocated workspaces arranged in two dimensions in mind.
To patch this into dwm, I would have needed to change its codebase to a point where one could not really consider it "dwm" anymore. Maintaining patches for such a thing would've been a nightmare; at this point, it would be easier and less painful to just write my own window manager. But who in their right mind would do that?
stwm
In 2013—the same year, but a little later—ayekat decided to write the Simple Tiling Window Manager. It didn't do much, but it did tile. And it did manage windows. Not any windows, mind you, but at least its own, empty windows.
The goal was pretty straightforward: Write a window manager in the vein of dwm (and largely inspired by its codebase, because I didn't know C and Xlib very well back then), with the features I wanted directly baked into it:
- Workspaces.
- Workspaces arranged in two dimensions!
- A pretty-looking map to switch around between those workspaces.
- A Scratchpad.
- Workspaces!!
And so I spent Winter 2013 on writing my second-ever major C program, a window manager. Around the end of the year, it was then feature-complete enough for me to switch from dwm to stwm. To be honest, I'm still quite impressed by my past self, because this took less than two months overall. There's probably a reason why I failed all my maths courses in that semester…
Here's a screenshot showcasing the scratchpad and the desktops arranged in two dimensions (inside a Xephyr window):
Yes, this is from today. While skimming through my old screenshots, I noticed that the looks of my setup haven't really changed at all since back then, so I might as well just take a screenshot right now. Please kindly ignore the 8-year gap.
At some point (actually barely a week after I had started using it), I decided that "simple tiling window manager" wasn't a particularly uniquely identifying name, so I went with "karuiwm" instead. The attentive reader may of course have noticed that "karui" (かるい, 軽い) means "lightweight" in Japanese, so this makes this the "lightweight window manager", which… meh
Never mind.
karuiwm and The Weight of Legacy dwm
karuiwm is not a fork of dwm; it was—and the Git commit history will confirm that—implemented from scratch. But it certainly wasn't a clean-room implementation in any sense of that word.
Much of the structure and many of the function and variable names even to this day will remind readers of dwm. There's monitors and clients; there's mfact and nmaster; one can step() through clients and shift() and zoom() them in the layout; and one can quit() karuiwm.
More importantly, though, karuiwm adopted the approach of not reading any configuration files at runtime, but instead requiring the user to write a config.h file very much like in dwm (with a very similar structure, to some extent probably even compatible).
While that worked fine in the beginning, at some point I decided that I would want to package all software installed on my system for better tracking of system files (that was an odyssey in its own right; see the Tale told here), and I realised that sometimes I want the configuration to differ between machines. But how do you do that if the configuration must be baked into the software when it is built?
In karuiwm's case, I built one package for each system.
That wasn't terribly elegant, of course. Here's for instance what the pkgbuild file looked like when I added a third system to my fleet. And while the target userbase was originally just myself, vanity probably still got the better of me, and I wanted it to be useful for others as well. Or maybe I simply got tired of having to reply to curious friends that, yes, this was a window manager, but, no, it currently wasn't meant for them to use. Or maybe my inner pedant woke up and stated loudly enough that this was, in fact, not terribly elegant.
Whatever the reason, I decided to work on adding "configuration file reading" as a feature. But given the lack of any abstraction layers and the abundance of spaghetti in my code at that point, this meant that it would first take a bit of untangling and refactoring.
Or maybe just rewrite the thing from scratch…?
life engineering lesson
That was around the end of 2014. You may have noticed that 2014 was seven years ago.
Here's the most important lesson I took from this project: If anyone ever suggests to "just rewrite it from scratch" because refactoring the code would be too tedious and unfunny, please
Slap them.
2014 ayekat? *Slap!*
2017 ayekat? *Slap!*
I wasted almost exactly 4 years of my life on this before it finally dawned on me that this was doomed to fail every time: With new code, it doesn't really matter which components you implement at what point. You can easily leave out (or even remove) some critical feature with the promise of adding it back sometime later when "it's all cleaned up a bit", because who cares? You aren't sitting on that branch that you just sawed off. Nobody is.
But with code that is actually in production, it counts. If you break some critical functionality, you better fix it quickly, because otherwise it'll hurt. It's ugly and painful, and it isn't as fun as spitting out hundreds and thousands of lines of new code, but it forces you to stick to the existing feature set, and it prevents you from "giving up". In return, because you usually only make small changes incrementally, it's easy to leave the codebase and then come back half a year later to continue working on it. There are no "fantasy features" where you don't remember anymore what exactly the goal was; every feature is actually in use.
And it worked! It took me another 3 years to refactor the code, but it worked. Yes, 3 years is still a long time, but in my defense, Life somehow continued to happen as well, and besides, you should have seen the initial spaghetti. (Actually, just see for yourself: The codebase in 2018 before I started refactoring.)
Also, it wouldn't be a true product of mine without at least some amount of overengineering:
- Everything a user can do is exposed as an action. An action has a uniquely identifying name and a list of arguments it accepts; for example:
  - focus_step_window changes the window focus to the previous or next window in the layout, and it takes one argument (an integer);
  - scratchpad_toggle toggles the visibility of the scratchpad, and takes zero arguments;
  - key_bind binds a key to an action, and it takes three arguments (a keymap name, a key combination, and a string describing the action invocation).
- To allow users to invoke actions, karuiwm exposes an RPC mechanism via Unix socket, and reads line-based "commands" that invoke an action.
- When karuiwm starts up, it invokes a script that connects back to the RPC socket and invokes actions like key_bind and window_set_border_colour. That script is of course user-defined.
Ultimately, karuiwm doesn't actually do any specific "configuration" loading at all. It simply sets up the RPC interface, and then hands off the configuration to an external script.
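The action and command mechanism described above can be sketched roughly as follows. This is an illustration in Python, not karuiwm's actual C code: the action names and argument counts are taken from the list above, but the shell-like command syntax, the key combination format (`Mod4+j`), the keymap name `default` and every helper (`action`, `handle_line`, `log`) are assumptions. The socket itself is left out; `handle_line` stands in for whatever processes one line read from it.

```python
import shlex

# Registry: action name -> (number of arguments, handler).
ACTIONS = {}
log = []  # records what the handlers did, standing in for real WM state changes


def action(name, nargs):
    """Register a handler under a unique action name with a fixed argument count."""
    def decorator(func):
        ACTIONS[name] = (nargs, func)
        return func
    return decorator


@action("focus_step_window", 1)
def focus_step_window(step):
    log.append(("focus_step_window", int(step)))


@action("scratchpad_toggle", 0)
def scratchpad_toggle():
    log.append(("scratchpad_toggle",))


@action("key_bind", 3)
def key_bind(keymap, keys, invocation):
    log.append(("key_bind", keymap, keys, invocation))


def handle_line(line):
    """Parse one command line and invoke the matching action.

    Returns "ok" or an error string, as a server might reply to its client.
    """
    words = shlex.split(line)
    if not words:
        return "error: empty command"
    name, args = words[0], words[1:]
    if name not in ACTIONS:
        return f"error: unknown action {name}"
    nargs, func = ACTIONS[name]
    if len(args) != nargs:
        return f"error: {name} takes {nargs} argument(s), got {len(args)}"
    try:
        func(*args)
    except ValueError as err:  # e.g. a non-integer step
        return f"error: {err}"
    return "ok"


# What a user-defined startup script might send, one command per line.
script = """\
key_bind default Mod4+j "focus_step_window 1"
key_bind default Mod4+s scratchpad_toggle
"""
for line in script.splitlines():
    print(handle_line(line))
```

The point of the design is that the configuration script and an interactive user go through exactly the same path: both just send lines that name an action and its arguments.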
outlook
With the runtime configuration in place, I now no longer need to feel (too) bad whenever someone requests to try out the window manager; it is now at an acceptable pain level, I think. (I mean, there's no documentation, there are glitchy bugs everywhere, and things still change all the time and will probably break your configuration before you've finished writing it. But I feel comfortable enough with inflicting that amount of pain on you.)
That being said, after 8 years of dogfooding, I've started to notice some limitations. They aren't major, but… let's say that I shouldn't have to be annoyed even by only minor things in a tool that I've written myself, for myself.
So here's what's planned for the near and far future:
mouse follows focus
When using key bindings to switch focus between windows, I often end up with the mouse pointer not being in the focused window. When using multiple monitors, the distance between the mouse pointer and the focused window can become significant; sometimes I have to move it by several thousand pixels to get it into the window I'm currently working in, which, as much as I love the ThinkPad TrackPoint (or whatever you call that thing), is a bit painful with it.
In those situations, I often wish I had a "mouse-follows-focus" feature, where the mouse would automatically be warped into the currently focused window upon focus change.
I still need to determine how exactly the coordinates should be set, of course. Remembering the last coordinates and restoring those might work in some cases, but what happens if the mouse was moved out of the window manually the last time the window had (lost) focus? Or what about windows that have changed their dimensions in the meantime (and the saved coordinates now lie outside the window)?
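One possible answer to these questions, sketched in Python: store the pointer position relative to the window, and only restore it if it still lies inside the window's current geometry; otherwise fall back to the window's centre. This is just one policy among several, and the `Window` type and `warp_target` function are hypothetical, not part of karuiwm.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Window:
    x: int       # top-left corner, in screen coordinates
    y: int
    width: int
    height: int
    # Pointer position relative to the window when it last lost focus, if known.
    saved_pointer: Optional[Tuple[int, int]] = None


def warp_target(win):
    """Pick the screen coordinates to warp the pointer to when `win` gains focus."""
    if win.saved_pointer is not None:
        px, py = win.saved_pointer
        # Only trust the saved position if it still lies inside the window.
        if 0 <= px < win.width and 0 <= py < win.height:
            return (win.x + px, win.y + py)
    # Fallback: the centre of the window.
    return (win.x + win.width // 2, win.y + win.height // 2)


print(warp_target(Window(100, 50, 800, 600, saved_pointer=(10, 20))))
print(warp_target(Window(0, 0, 200, 100, saved_pointer=(500, 50))))  # shrunk since
```

A pointer that was moved out of the window manually, or a window that has shrunk in the meantime, both end up in the fallback branch, so the two awkward cases are handled by the same check.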
modules
Many of the features may not be needed by all users, so it would be nice to just load the parts one really needs, but also allow users to easily extend karuiwm's functionality however they like, without having to get their code into karuiwm directly.
Here, I'm thinking about a modules system, by dynamically loading shared objects and then establishing some API that those modules could use to announce themselves to karuiwm and expose additional functionality (actions, layouts, …); a bit similar to—but hopefully a lot cleaner than—what I'm doing in karuibar.
The challenge probably lies in keeping the API simple enough to avoid getting tangled up in compatibility issues, while still being versatile enough to allow users to easily write useful modules.
In an initial step (also as a sort of dogfooding), I'm planning to split off much of the current functionality into modules, e.g. the scratchpad, the workspace map (including, ironically given the history of karuiwm, the fact that workspaces are arranged in 2 dimensions), or even just the RPC socket handling itself.
proper statusbar support
Right now, karuiwm displays a dummy statusbar with override_redirect that only shows some minimal information about the currently viewed workspace (a logo of the applied layout and the workspace's name). The rest of the statusbar is empty and unused; I currently just overlay karuibar in that empty area, though karuibar doesn't interact with the window manager at all.
The part to the right is hidden underneath karuibar.
The goal here is to stop shipping a statusbar with karuiwm altogether, and instead pass all necessary WM-related information to karuibar. For that, karuibar needs to be extended to at least:
- Be able to distinguish between a focused and an unfocused monitor; and
- Allow an external component to independently update one slot in the bar.
On the karuiwm end, I'm probably going to create a module that prepares
information to be forwarded and shown in karuibar. Potentially it could be
abstracted away in a way that one could also use other statusbar
implementations (most importantly, I should start handling the _NET_WM_STRUT
property, to support all kinds of statusbar implementations).
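For context, `_NET_WM_STRUT` is a window property holding four values (left, right, top, bottom): the number of pixels a window such as a statusbar reserves at each screen edge. A rough sketch in Python of what handling it boils down to, assuming a single monitor and taking the largest reservation per edge; the function name `work_area` is made up, and a real implementation would read the property from each client via X.

```python
def work_area(screen_w, screen_h, struts):
    """Return (x, y, width, height) of the area left over for tiled windows.

    `struts` is a list of (left, right, top, bottom) tuples, in the order used
    by _NET_WM_STRUT: the pixels reserved at each edge of the screen.
    """
    left = max((s[0] for s in struts), default=0)
    right = max((s[1] for s in struts), default=0)
    top = max((s[2] for s in struts), default=0)
    bottom = max((s[3] for s in struts), default=0)
    return (left, top, screen_w - left - right, screen_h - top - bottom)


# A 20 pixel bar at the top of a 1920x1080 screen.
print(work_area(1920, 1080, [(0, 0, 20, 0)]))
```

With this in place, the window manager no longer needs to know which statusbar is running; any bar that sets the property gets its space.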
dummy windows
karuiwm manages one single workspace set, no matter how many monitors are attached. So monitors are "sharing" a set of workspaces between them.
Consequently, it may happen that one monitor tries to view a workspace that is already visible on another monitor. But X cannot show the content of one window in multiple locations (at least not without a compositor involved, I think), so I need to handle this "conflict" case.
The current approach is fairly straightforward: Simply swap the views between the monitors, and we're done. But that occasionally causes some confusion. Imagine the following:
- On monitor A, show the dev workspace; on monitor B, show the chat workspace.
- On monitor A, switch to the chat workspace, then notice the mistake, and switch to the mails workspace. Later, switch back to the dev workspace.
- Now try to look at the chat (expected on monitor B), but instead see mails there.
- Confused ayekat.
Here, I would prefer monitor B to simply stay on the chat workspace. But now I need a solution for what happens if multiple monitors try to view the same workspace (and thus the same windows).
The new idea here is to let the unfocused monitor stay on its workspace, but because the windows are also shown on the other (focused) monitor, have the unfocused monitor simply show some "dummy" windows in their stead, i.e. empty windows that only contain the name of each client window that would be shown there. As far as I know, there's a project called clfswm that implements something like this.
As soon as the other monitor moves away, "reclaim" the client window content again, and everything is fine. No confused ayekat.
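The difference between the two policies can be replayed with a small simulation of the scenario above. This is a Python sketch with made-up function names; it only tracks which workspace each monitor views, not any actual windows.

```python
def view_with_swap(views, monitor, workspace):
    """Current behaviour: if another monitor shows `workspace`, swap the two views."""
    for other, shown in views.items():
        if other != monitor and shown == workspace:
            views[other] = views[monitor]
    views[monitor] = workspace


def view_with_dummies(views, monitor, workspace):
    """Proposed behaviour: other monitors keep their view; only `monitor` changes."""
    views[monitor] = workspace


def dummy_monitors(views, focused):
    """Monitors that must show dummy windows: those viewing the same workspace
    as the focused monitor, which holds the real client windows."""
    return [m for m, ws in views.items() if m != focused and ws == views[focused]]


def replay(view):
    """Replay the scenario: A shows dev, B shows chat, then A switches around."""
    views = {"A": "dev", "B": "chat"}
    for workspace in ("chat", "mails", "dev"):  # all switches happen on monitor A
        view(views, "A", workspace)
    return views


print(replay(view_with_swap))     # {'A': 'dev', 'B': 'mails'}
print(replay(view_with_dummies))  # {'A': 'dev', 'B': 'chat'}
```

With swapping, monitor B ends up on mails although nobody ever touched it. With dummies, B stays on chat throughout; it only shows dummy windows during the short period where A also views chat, which is what `dummy_monitors` reports.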
Whether I'll implement this by switching between dummy windows and client windows, or whether I'll start reparenting windows (such that there's always a dummy window, but sometimes it simply doesn't have the client window mapped) remains to be defined.
Wayland
I guess one of the major issues of working on something for almost a decade is that the technology landscape around you changes so much that by the time you're finally done, it has already become obsolete.
X still isn't quite dead, fortunately (I mean, mostly just fortunate for karuiwm), but I should still consider preparing for a transition to Wayland in some not-too-far future.
To do so, I will first need to split all the X-specific parts away from the generic window management part in karuiwm. Ideally, it will become its own module that provides some generic "compositor interface" to karuiwm, and then I could provide a second module exposing the same interface towards karuiwm, but implementing window management (or rather, compositing) for Wayland.
It probably isn't as simple as I make it sound here, but one can dream…
but, so, is it usable now?
For playing around? Sure!
For production use? Hell no.
As noted somewhere above (at this point, I wouldn't even be mad if you somehow missed it in this wall of text), documentation is still severely lacking, there are some awkward glitches and occasional crashes, and overall a lot of behavioural changes are still expected, especially in the context of the RPC.
I mainly wrote this article because I felt excited about the fact that I have finally completed something that had kept me busy for the past 7 years, and how a project that's been going for over 8 years at this point is finally starting to take some serious shape, and I somehow wanted to get my excitement out into the World.
That being said, a first release is now becoming more and more realistic, and I expect version 0.1 to drop sometime in 2022. Then again, there was a time in late 2019 where I thought I'd be done by Summer 2020, so…
Let's just not stress ourselves too much.