Not-a-bug fixing
As far as system maintenance is concerned, I guess I have some minor case of obsessive compulsive disorder. I very much dislike having random applications dropping files in my home directory, and I am equally displeased whenever I find software packages on my system that I don't need. While I consider it a waste of time and bandwidth to download and perform updates for such packages, what causes me more unease is the mere fact that the system does not appear clean to me. It is clutter that serves no purpose.
So whenever I see a package that I do not recognise, my very first reaction is typically
Why is this package installed?
In theory, this is very easily determined:
- Was the package installed explicitly? Then I should know myself why I installed it.
- Was the package installed as a dependency? Then I should look at its reverse dependencies, recursively, until I hit the "topmost" package (i.e. the one that was installed explicitly); then see above.
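As a rough illustration, the two questions above amount to walking the reverse-dependency graph upwards until explicitly installed packages are reached. The following is only a minimal sketch: the package names and the `required_by` and `explicit` data are made up for the example, and on a real system this information would have to come from the package manager (for instance the "Required By" and "Install Reason" fields shown by `pacman -Qi`).

```python
from collections import deque


def explicit_roots(package, required_by, explicit):
    """Return the explicitly installed packages that pull in `package`.

    required_by: maps a package name to the packages depending on it
    explicit:    set of package names that were installed explicitly
    """
    if package in explicit:
        return {package}

    roots = set()
    seen = {package}
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for parent in required_by.get(current, ()):
            if parent in seen:
                continue
            seen.add(parent)
            if parent in explicit:
                # Reached a "topmost" package; no need to look further up.
                roots.add(parent)
            else:
                queue.append(parent)
    return roots


# Hypothetical data: libfoo is needed by foo-tool and libbar,
# and libbar in turn is needed by the meta-package task-webdev.
required_by = {
    "libfoo": ["foo-tool", "libbar"],
    "libbar": ["task-webdev"],
}
explicit = {"foo-tool", "task-webdev"}

roots = explicit_roots("libfoo", required_by, explicit)
print(sorted(roots))  # ['foo-tool', 'task-webdev']
```

An empty result would mean that nothing explicitly installed needs the package any more, i.e. it is a candidate for removal.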
In practice, however, there are a couple of obstacles.
First, I don't always remember why I installed a given package. Sometimes I need it for some one-time task and then forget to remove it afterwards. And a few months later, I've forgotten about it entirely.
To avoid that sort of situation, I've started writing meta-packages for the various tasks. A meta-package has a name, a description of the task, and a list of package dependencies required for that task. Rather than directly installing the required tools explicitly, I install the meta-package and let that pull in all the wanted packages as dependencies.
After shifting to using meta-packages for almost everything on my system, I felt rather clever and happy, as I had found a way to exploit the package manager's dependency system to encode all the information I wanted.
I didn't feel clever for very long, though. Because…
Second,
Why are the dependencies broken?
My distribution of choice is currently Arch Linux.
Arch Linux uses the package manager pacman. Pacman defines so-called package groups, which on a technical level behave like tags or topics, i.e. they have no influence on how packages depend on, conflict with, or in any other way relate to each other. However, rather oddly, pacman allows installing and removing groups with the same syntax as used for packages. This explains why some pacman users (and even developers) occasionally mistake groups for meta-packages.
Arch Linux itself doesn't really help resolve this confusion either, as it doesn't use groups very consistently. In many cases, they are like tags; packages are assigned to groups like vim-plugins or xorg, which aren't intended to be installed all at once. But there are also groups like lxqt or mate, which may very well be installed in their entirety; however, simply replacing them with meta-packages would cause issues for users who wish to remove some of the packages. Furthermore, there are groups like base-devel, a set of packages that are assumed to be present on a build system, and thus never explicitly listed among build-time dependencies for Arch Linux packages; it makes little sense to install packages from this group only partially (at least if one installs packages within the context of base-devel), so replacing it with a meta-package would very much be feasible.
Given the confusing nature of groups, it would be easiest for a user to simply ignore their existence entirely (except for base-devel: turn that into a meta-package, and be done with it).
It would be easiest.
Because Arch Linux has managed to add to the confusion about groups by adding a group whose own purpose is entirely undefined: base.
Depending on whom you ask, the reply you get will be different. Some developers argue that all packages in base are supposed to be installed on all Arch Linux systems, and will intentionally not declare some dependencies when they are in base. There are other developers who take a more moderate stance and argue that only some packages should be expected on an Arch Linux installation (glibc or filesystem come to mind), while others should be uninstallable without issues; essentially they agree with the other ones, but argue that the current base is too large. Unfortunately, nobody is clear about where exactly they draw the line.
And last, there are the pedantic people like me who argue that base should be considered a set of "recommended packages" at most, and packages in there should be removable in any way the user wishes, i.e. none of the packages should be implicitly assumed to be installed.
But all those opinions don't matter, because this is a case of a metaphorical chain breaking at its weakest link: even just one maintainer intentionally not declaring dependencies on base packages is enough to "compromise" the entire system: users now have to globally assume that things may break if base is not fully installed.
Essentially, the user is forced to have installed a bunch of packages that they will never need.
Arch Linux is a general-purpose distribution. Upon installation, only a command-line environment is provided: rather than tearing out unneeded and unwanted packages, the user is offered the ability to build a custom system […]
The lack of policy also means that users cannot submit any bug reports for such packages: if there is no document that properly describes the base group, any interpretation of it is a valid one, no matter how absurd. In fact, if I were to only submit packages to the Arch User Repository where I state all the base packages as dependencies, no matter if required or not, my packages might get removed by an AUR overseer at some point, and the reasoning would probably be something in the vein of "absurd" or "silly", but nobody would be able to refer me to a document stating that my interpretation of the base group was technically wrong.
But there was a time when I didn't know that. In autumn 2017, when setting up a new system, I managed to crash the locale-gen command by not having sed installed. The glibc package (which provides the locale-gen command) doesn't mention sed at all, neither as a mandatory nor as an optional dependency. So I did what every optimistic, enthusiastic, naive Arch Linux user would do: open a bug report. This one. And this is what followed on the developers-only mailing list.
… and there went my enthusiasm.
persona non grata
Based on that reaction in particular and what I had otherwise seen of the Arch Linux bug tracker overseers' behaviour on the forums and mailing lists, I got the feeling that they would now be specifically careful not to resolve any of my present or future requests. As far as the bug tracker was concerned, I was now practically dead.
Not that this would change much; the majority of issues I had previously reported (or voted for) had also been either rejected or simply ignored, despite having patches or trivial solutions attached. It appeared to me that my perception of what is considered a "bug" differed quite a bit from that of the Arch Linux maintainers. I concluded that I couldn't really trust my own judgement on that matter anymore: if something appeared like a bug to me, was it really a bug? Should I bother reporting it?
I decided no.
It wasn't worth the hassle.
I can fix this shit myself
Building packages for pacman is not rocket science. The packaging tooling isn't exactly powerful, but it is also rather simple, and it gets the job done without getting in my way, which is actually quite fitting for the kind of distribution that Arch Linux claims to be. And having used this distribution for more than seven years and packaged quite a few things, I decided that I was capable enough to take over some of their packages as well.
While I didn't feel comfortable tackling a package as fundamental as glibc right away, I felt confident enough to at least adopt debootstrap (FS#48908), lighttpd (FS#45902, FS#51931) and pass (FS#55059, FS#55504).
I started by setting up a personal package repository on a publicly accessible server, archlinux.zuepfe.net, and configured my systems to add that repository to their pacman package sources. Then I wrote a convenience script, zr (for Züpfe Repositories), that would allow me to more easily push packages to that repository, and added my custom variants of the above-mentioned packages to my custom repository. Up to this point, things were rather smooth.
But then I started realising that this was going to be a lot of work, at least for a regular user. I had no way of knowing when a rebuild related to a library soname bump was due, or when some bug was reported for the "real" version of the packages that would force me to react in some way. I would also need to subscribe to upstream channels to get all the relevant information required as a packager for software I was not necessarily very familiar with.
A lot of duplicated effort, ultimately. And all that for merely changing some metadata…?
That sounded a bit backwards.
Let's just repackage
Pacman packages are pretty simple. They are Xzipped tarballs containing the following three files at the top level:
- .BUILDINFO: a list of packages installed on the build system as this package was built;
- .MTREE: a Gzipped list of files contained in the package, following the mtree(5) format;
- .PKGINFO: all package metadata (except what is listed in .BUILDINFO and .MTREE), i.e. name, version, description, dependencies, etc.
The tarball may also optionally contain a .CHANGELOG file and a .INSTALL file that defines actions to be triggered upon installing, upgrading or removing the package. Everything else in the tarball consists of the files that are part of the package, with a directory structure reflecting how they will be installed to the system.
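To make the layout a bit more tangible, here is a minimal Python sketch that reads the metadata out of such a tarball. It assumes that .PKGINFO consists of plain `key = value` lines (with `#` comment lines), and it builds a tiny made-up package in memory instead of touching a real one; the `demo` package and its contents are purely hypothetical.

```python
import io
import tarfile


def parse_pkginfo(text):
    """Parse 'key = value' lines into a dict of lists (keys may repeat)."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" = ")
        info.setdefault(key, []).append(value)
    return info


def read_pkginfo(fileobj):
    """Open a (possibly compressed) package tarball and parse its .PKGINFO."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        member = tar.extractfile(".PKGINFO")
        return parse_pkginfo(member.read().decode("utf-8"))


def make_demo_package():
    """Build a made-up, xz-compressed package containing only a .PKGINFO."""
    pkginfo = b"pkgname = demo\npkgver = 1.0-1\ndepend = glibc\ndepend = sed\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        entry = tarfile.TarInfo(".PKGINFO")
        entry.size = len(pkginfo)
        tar.addfile(entry, io.BytesIO(pkginfo))
    buf.seek(0)
    return buf


info = read_pkginfo(make_demo_package())
print(info["pkgname"], info["depend"])  # ['demo'] ['glibc', 'sed']
```

Values are collected in lists because keys such as `depend` can appear more than once.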
That's all, no magic. Theoretically, we can easily modify a package directly: just unpack the tarball, then modify the metadata or add/rename/patch/remove a file. Sounds crazy? Then let's do it. Rather than maintaining our own version of the package and building the software from scratch, we just use the work that's already been done by the package maintainer. This is about as low-effort as it can get.
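The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not necessarily how the actual tool works: the helper `add_dependency` and the `demo` package are made up, .PKGINFO is assumed to consist of `key = value` lines, and the sketch leaves .MTREE untouched, which a real tool would probably want to regenerate.

```python
import io
import tarfile


def add_dependency(src, dst, dependency):
    """Copy the package tarball from src to dst, appending one
    'depend = ...' line to .PKGINFO and leaving everything else as-is."""
    with tarfile.open(fileobj=src, mode="r:*") as old, \
            tarfile.open(fileobj=dst, mode="w:xz") as new:
        for member in old:
            # Only regular files carry data; directories and links do not.
            data = old.extractfile(member) if member.isfile() else None
            if member.name == ".PKGINFO":
                text = data.read().decode("utf-8")
                if not text.endswith("\n"):
                    text += "\n"
                payload = (text + f"depend = {dependency}\n").encode("utf-8")
                member.size = len(payload)
                data = io.BytesIO(payload)
            new.addfile(member, data)


def make_demo_package():
    """Build a made-up, xz-compressed package in memory."""
    files = {
        ".PKGINFO": b"pkgname = demo\npkgver = 1.0-1\ndepend = glibc\n",
        "usr/bin/demo": b"#!/bin/sh\necho demo\n",
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name, content in files.items():
            entry = tarfile.TarInfo(name)
            entry.size = len(content)
            tar.addfile(entry, io.BytesIO(content))
    buf.seek(0)
    return buf


patched = io.BytesIO()
add_dependency(make_demo_package(), patched, "sed")

# Read the result back to check what changed.
patched.seek(0)
with tarfile.open(fileobj=patched, mode="r:*") as tar:
    names = tar.getnames()
    pkginfo = tar.extractfile(".PKGINFO").read().decode("utf-8")
print(pkginfo)
```

The payload files are copied over unchanged; only the metadata differs, which is exactly the kind of fix that a missing dependency calls for.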
And thus, repkg was born.
And this is the list of packages that I currently modify with it. I can now get sane packages, and the Arch Linux maintainers can keep their bugs without having a pedant who complains. That's a win-win.
future
It's now been 1.5 years since that "bug tracker incident", and the base group still hasn't changed. I've seen 3 discussions among the developers come and go again without any notable progress. In the most recent iteration, people still seemed to have different ideas of what the purpose of the base group should be, so the one concrete proposal of a new (slimmed-down) list of packages in base seemed half-hearted at best. I don't expect this issue to be resolved anytime soon.
Then again, even if the Arch developers suddenly changed their minds and decided to fix all the things I consider issues, I'd still keep using repkg to modify packages to my needs, because it's just so damn convenient. Not only can I now easily fix all the cases of "that's not a packaging bug", but also fix upstream issues, and otherwise adapt packages to my personal needs, and I wouldn't want to miss that.
As far as repkg is concerned, I've written a wrapper script, remakepkg, that handles the downloading of packages and makes the command line invocation more convenient. But I've noticed that it becomes a bit tedious after a while:
First, it is currently hardwired to the current checkout of the package database, so it can't repackage a package from a newer, remote database "ahead of time", i.e. I first have to attempt a pacman -Syu, have diffrepo detect that the repositories are out of sync, then repackage and push the offending package to my repository, and try again.
Second, I'd like it all to happen more automatically. The repackaging rules are not bound to a specific package version; they can apply to any version of the package (as long as the relevant parts don't change), so it could also just run fully automated, rather than having to invoke it manually each time.
Third, it doesn't scale particularly well: it works fine for one or two dozen packages, but if I crank things up to the hundreds (e.g. because I want to fix something more fundamental that affects a great number of packages), running remakepkg for each one separately isn't terribly convenient.
But this is all assuming that I'll keep using this distribution. After 9 years, I've started to see quite a few cracks. And for a distribution that I would have otherwise considered as "not getting in my way", I must admit I've now spent a considerable amount of time working around its flaws.
It's still good enough, though, so…… meh
2019