Not-a-bug fixing

As far as system maintenance is concerned, I guess I have some minor case of obsessive compulsive disorder. I very much dislike having random applications dropping files in my home directory, and I am equally displeased whenever I find software packages on my system that I don't need. While I consider it a waste of time and bandwidth to download and perform updates for such packages, what causes me more unease is the mere fact that the system does not appear clean to me. It is clutter that serves no purpose.

So whenever I see a package that I do not recognise, my very first reaction is typically

Why is this package installed?

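In theory, this is very easily determined: the package manager records whether a package was installed explicitly or as a dependency, and which installed packages require it.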
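As a minimal sketch of that lookup, assuming the `Install Reason` and `Required By` fields that `pacman -Qi <package>` prints: the helper names are hypothetical, and the sample text is made up for illustration rather than captured from a real system.

```python
def parse_pacman_info(text):
    """Parse the 'Key : Value' lines printed by `pacman -Qi` into a dict."""
    info = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and last_key is not None:
            # Continuation of a wrapped value from the previous field.
            info[last_key] += " " + line.strip()
            continue
        key, sep, value = line.partition(" : ")
        if sep:
            last_key = key.strip()
            info[last_key] = value.strip()
    return info


def why_installed(info):
    """Return the install reason and the list of packages requiring this one."""
    required_by = info.get("Required By", "None").split()
    if required_by == ["None"]:
        required_by = []
    return info.get("Install Reason", "unknown"), required_by


# Illustrative sample only; foo-tools and bar-meta are hypothetical packages.
SAMPLE = """\
Name            : sed
Required By     : foo-tools  bar-meta
Install Reason  : Installed as a dependency for another package
"""

reason, required_by = why_installed(parse_pacman_info(SAMPLE))
print(reason, required_by)
```

On a real system the text would come from running `pacman -Qi` itself, probably with `LC_ALL=C` set, since the field names are translated in other locales.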

In practice, however, there are a couple of obstacles.

First, I don't always remember why I installed a given package. Sometimes I need it for some one-time task and then forget to remove it afterwards. And a few months later, I've forgotten about it entirely.

To avoid that sort of situation, I've started writing meta-packages for my various tasks. A meta-package has a name, a description of the task, and a list of package dependencies required for that task. Rather than installing the required tools explicitly, I install the meta-package and let it pull in all the wanted packages as dependencies.
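To make this concrete, here is a rough sketch of what such a meta-package boils down to, written as a small generator for a PKGBUILD. The function name, the task name `task-photo-scan` and its dependency list are hypothetical examples, and the empty `package()` function is only included for good measure.

```python
import shlex


def render_meta_pkgbuild(name, description, depends, pkgver="1", pkgrel="1"):
    """Render a PKGBUILD that installs no files and only declares dependencies."""
    deps = " ".join(shlex.quote(d) for d in depends)
    lines = [
        f"pkgname={shlex.quote(name)}",
        f"pkgver={pkgver}",
        f"pkgrel={pkgrel}",
        f"pkgdesc={shlex.quote(description)}",
        "arch=('any')",
        f"depends=({deps})",
        "",
        "package() {",
        "  :  # nothing to install; the dependencies are the payload",
        "}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical task: everything needed for scanning and cleaning up photos.
print(render_meta_pkgbuild(
    "task-photo-scan",
    "Tools for scanning and post-processing photos",
    ["sane", "imagemagick"],
))
```

Once the task is done, removing the meta-package with `pacman -Rs` also removes the dependencies that nothing else needs.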

After shifting to using meta-packages for almost everything on my system, I felt rather clever and happy, as I had found a way to exploit the package manager's dependency system to encode all the information I wanted.

I didn't feel clever for very long, though. Because…

Second,

Why are the dependencies broken?

My distribution of choice is currently Arch Linux.

Arch Linux uses the package manager pacman. Pacman defines so-called package groups, which on a technical level behave like tags or topics, i.e. they have no influence on how packages depend on, conflict with, or in any other way relate to each other. However, rather oddly, pacman allows installing and removing groups with the same syntax as used for packages. This explains why some pacman users (and even developers) occasionally mistake groups for meta-packages.

Arch Linux itself doesn't really help resolve this confusion either, as it doesn't use groups very consistently. In many cases, they are like tags; packages are assigned to groups like vim-plugins or xorg, which aren't intended to be installed all at once. But there are also groups like lxqt or mate, which may very well be installed in their entirety; however, simply replacing them with meta-packages would cause issues for users who wish to remove some of the packages.

Furthermore, there are groups like base-devel, a set of packages that are assumed to be present on a build system, and thus never explicitly listed among the build-time dependencies of Arch Linux packages; it makes little sense to install packages from this group only partially (at least if one installs packages within the context of base-devel), so replacing it with a meta-package would be entirely feasible.

Given the confusing nature of groups, for a user it would be easiest to simply ignore their existence entirely (except for base-devel: turn that into a meta-package, and be done with it).

It would be easiest, if it weren't for the fact that Arch Linux has managed to add to the confusion about groups by adding a group whose own purpose is entirely undefined: base.

Depending on whom you ask, the reply you get will be different. Some developers argue that all packages in base are supposed to be installed on all Arch Linux systems, and will intentionally not declare some dependencies when they are in base. There are other developers who take a more moderate stance and argue that only some packages should be expected on an Arch Linux installation (glibc or filesystem come to mind), while others should be uninstallable without issues; essentially they agree with the other ones, but argue that the current base is too large—unfortunately, nobody is clear where exactly they draw the line.

And last, there are the pedantic people like me who argue that base should be considered a set of "recommended packages" at most, and packages in there should be removable in any way the user wishes, i.e. none of the packages should be implicitly assumed to be installed.

But all those opinions don't matter, because this is a case of a metaphorical chain breaking at its weakest link: even just one maintainer intentionally not declaring dependencies on base packages is enough to "compromise" the entire system: users now have to globally assume that things may break if base is not fully installed.

Essentially, the user is forced to have installed a bunch of packages that they will never need.

Arch Linux is a general-purpose distribution. Upon installation, only a command-line environment is provided: rather than tearing out unneeded and unwanted packages, the user is offered the ability to build a custom system […]

The lack of policy also means that users cannot submit any bug reports for such packages: if there is no document that properly describes the base group, any interpretation of it is a valid one, no matter how absurd. In fact, if I were to only submit packages to the Arch User Repository in which I state all the base packages as dependencies, whether required or not, my packages might get removed by an AUR overseer at some point. The reasoning would probably be something in the vein of "absurd" or "silly", yet nobody would be able to refer me to a document stating that my interpretation of the base group was technically wrong.

But there was a time when I didn't know that. In autumn 2017, when setting up a new system, I managed to crash the locale-gen command by not having sed installed. The glibc package (which provides the locale-gen command) doesn't mention sed at all, neither as a mandatory nor optional dependency. So I did what every optimistic, enthusiastic, naive Arch Linux user would do: open a bug report. This one. And this is what followed on the developers-only mailing list.

… and there went my enthusiasm.

persona non grata

Based on that reaction in particular and what I had otherwise seen of the Arch Linux bug tracker overseers' behaviour on the forums and mailing lists, I got the feeling that they would now be specifically careful not to resolve any of my present or future requests. As far as the bug tracker was concerned, I was now practically dead.

Not that this would change much; the majority of issues I had previously reported (or voted for) had also been either rejected or simply ignored, despite having patches or trivial solutions attached. It appeared to me that my perception of what is considered a "bug" differed quite a bit from that of the Arch Linux maintainers. I concluded that I couldn't really trust my own judgement on that matter anymore: if something appeared like a bug to me, is it really a bug? Should I bother reporting it?

I decided no.

It wasn't worth the hassle.

I can fix this shit myself

Building packages for pacman is not rocket science. The packaging tooling isn't exactly powerful, but it is also rather simple, and it gets the job done without getting in my way, which is actually quite fitting for the kind of distribution that Arch Linux claims to be. And having used this distribution for more than seven years and packaged quite a few things, I decided that I was capable enough to take over some of their packages as well.

While I didn't feel comfortable tackling a package as fundamental as glibc right away, I felt confident enough to at least adopt debootstrap (FS#48908), lighttpd (FS#45902, FS#51931) and pass (FS#55059, FS#55504).

I started by setting up a personal package repository on a publicly accessible server, archlinux.zuepfe.net, and configured my systems to add that repository to their pacman package sources. Then I wrote a convenience script, zr (for Züpfe Repositories), that would allow me to push packages to that repository more easily, and added my custom variants of the abovementioned packages to my custom repository. Up to this point, things were rather smooth.

But then I started realising that this was going to be a lot of work, at least for a regular user. I had no way of knowing when a rebuild was due because of a library soname bump, or when some bug was reported for the "real" version of the packages that would force me to react in some way. I would also need to subscribe to upstream channels to get all the relevant information required as a packager for software I was not necessarily very familiar with.

A lot of duplicated effort, ultimately. And all that for merely changing some metadata…?

That sounded a bit backwards.

Let's just repackage

Pacman packages are pretty simple. They are xz-compressed tarballs containing three metadata files at the top level: .PKGINFO, .BUILDINFO and .MTREE.

The tarball may also optionally contain a .CHANGELOG file and a .INSTALL file that defines actions to be triggered upon installing, upgrading or removing the package. Everything else in the tarball is the package's actual content, with a directory structure reflecting how the files will be installed to the system.

That's all, no magic. Theoretically, we can easily modify a package directly: just unpack the tarball, then modify the metadata or add/rename/patch/remove a file. Sounds crazy? Then let's do it. Rather than maintaining our own version of the package and building the software from scratch, we just use the work that's already been done by the package maintainer. This is about as low-effort as it can get.
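The idea can be sketched in a few lines. This is only an illustration and not how repkg is actually implemented: the function names are made up, the demo package is built in memory, and the sketch assumes that appending a `depend = …` line to .PKGINFO is the only change wanted. It updates neither .MTREE nor any package signature.

```python
import io
import tarfile


def add_dependency(src, dst, dependency):
    """Copy a package tarball, appending one 'depend' line to .PKGINFO."""
    with tarfile.open(fileobj=src, mode="r:*") as old, \
            tarfile.open(fileobj=dst, mode="w:xz") as new:
        for member in old:
            # Only regular files carry data; directories and links do not.
            data = old.extractfile(member) if member.isfile() else None
            if member.name == ".PKGINFO":
                text = data.read().decode("utf-8")
                if not text.endswith("\n"):
                    text += "\n"
                text += f"depend = {dependency}\n"
                payload = text.encode("utf-8")
                member.size = len(payload)  # header must match the new size
                data = io.BytesIO(payload)
            new.addfile(member, data)


def make_demo_package():
    """Build a tiny stand-in package in memory (hypothetical content)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name, body in [(".PKGINFO", b"pkgname = demo\npkgver = 1-1\n"),
                           ("usr/bin/demo", b"#!/bin/sh\n")]:
            info = tarfile.TarInfo(name)
            info.size = len(body)
            tar.addfile(info, io.BytesIO(body))
    buf.seek(0)
    return buf


repacked = io.BytesIO()
add_dependency(make_demo_package(), repacked, "sed")
repacked.seek(0)
with tarfile.open(fileobj=repacked, mode="r:xz") as tar:
    print(tar.extractfile(".PKGINFO").read().decode())
```

All other members are copied through unchanged, which is the whole point: the maintainer's build output is reused and only the metadata is touched.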

And thus, repkg was born.

And this is the list of packages that I currently modify with it. I can now get sane packages, and the Arch Linux maintainers can keep their bugs without having a pedant who complains. That's a win-win.

future

It's now been 1.5 years since that "bug tracker incident", and the base group still hasn't changed. I've seen three discussions among the developers come and go without any notable progress. In the most recent iteration, people still seemed to have different ideas of what the purpose of the base group should be, so the one concrete proposal of a new (slimmed down) list of packages in base seemed halfhearted at best. I don't expect this issue to be resolved anytime soon.

Then again, even if the Arch developers suddenly changed their minds and decided to fix all the things I consider issues, I'd still keep using repkg to modify packages to my needs, because it's just so damn convenient. Not only can I now easily fix all the cases of "that's not a packaging bug", but also fix upstream issues, and otherwise adapt packages to my personal needs—and I wouldn't want to miss that.

As far as repkg is concerned, I've written a wrapper script, remakepkg, that handles the downloading of packages and makes the command line invocation more convenient. But I've noticed that it becomes a bit tedious after a while:

First, it is currently hardwired to the current checkout of the package database, so it can't repackage a package from a newer, remote database "ahead of time", i.e. I first have to attempt a pacman -Syu, have diffrepo detect that the repositories are out of sync, then repackage and push the offending package to my repository, and try again.

Second, I'd like it all to happen more automatically. The repackaging rules are not bound to a specific package version; they can apply to any version of the package (as long as the relevant parts don't change), so it could also just run fully automated, rather than having to invoke it manually each time.

Third, it doesn't scale particularly well: it works fine for one or two dozen packages, but if I crank things up to the hundreds (e.g. because I want to fix something more fundamental that affects a great number of packages), running remakepkg for each one separately isn't terribly convenient.

But this is all assuming that I'll keep using this distribution. After 9 years, I've started to see quite a few cracks. And for a distribution that I would otherwise have considered as "not getting in my way", I must admit I've now spent considerable time working around its flaws.

It's still good enough, though, so…… meh
