NOTE: This is a highly technical topic dealing with Haiku’s in-kernel “device manager” system. If you have never written or worked within device driver code on Haiku or BeOS, the discussions in this topic are probably not relevant to you, in which case: look, but don’t touch!
It has been my personal view ever since I started working on Haiku kernel code that the “new-style” device manager (which isn’t really new anymore as it’s over a decade old), while it does have a lot of advantages over the old one, is in many ways clunky and cumbersome to write code for, and has a number of annoying sharp edges. I believe I am not the only one to think so; I think I recall @axeld saying somewhere at one point that it deserved a redesign (and mentioned some specific developments that should have triggered such a redesign.)
Well, today I was working on some things in the USB disk driver, and realized the only way to do some of the things I wanted was to switch it over to use the “new” driver API. But said new driver API, for USB devices in particular, has some major disadvantages compared to the old one.
The basic disadvantages are:
- Confusing boilerplate. In the “old” driver API (which comes from BeOS), every driver has to manage its own set of names of what devices it has “published.” The new driver API thankfully manages this for you, but instead it requires that every driver have a “device” module which handles probing and the like, in addition to the “driver” module which actually handles the read/write/etc. hooks.

  In most drivers, this “module” winds up being a lot of duplicate and boilerplate code, which there isn’t much that can be done about: the `register_device`, `init_device`, etc. hooks.

  In addition, since the device manager is string-attribute-based, every `supports` hook has to fetch a bunch of string attributes one at a time and then compare them. This leads to even more boilerplate and can be very inefficient.

  - This last point is one of the major disadvantages of switching USB drivers from the “old” API to the “new” API. Under the “old” API, one constructs a pretty compact `usb_support_descriptor` array and passes it to the USB bus manager, which then invokes callbacks whenever devices matching the ones you specified are connected. Under the “new” API, one is left only with the `supports` hook, and winds up with a very boilerplate-y and not very pretty-looking probing function. You can readily compare the `usb_ecm` driver under the old and then new APIs to see what I mean here.
- Inflexible abstractions. Say I want to publish a device in `/dev`. Why shouldn’t it be as easy as invoking `device_manager->publish(parent, path, hooks);`? But it isn’t. You have to

  1. have a `device` module globally registered
  2. call `register_node` to register a parent “device” node, using the name of the globally-registered `device` module
  3. perform whatever initialization you need to in the `init_device` method of the “device” node when the `device_manager` calls it
  4. have a `driver` module globally registered
  5. call `publish_device` with your parent “device” node and the name of the globally-registered `driver` module
  6. instantiate the driver using data passed through structures allocated back in step 3

  Why is this so complicated? No wonder the `/dev/null` device (and basically all other drivers which are not attached to any other device) are still using the legacy driver API: because the new one would add tons of bloat to such simple drivers.

  The idea that “everything is a `device_node`” seems to be the root of this inflexibility. That a PCI device has a `device_node` makes sense, that (e.g.) an ethernet driver attached to it is a `device_node` which is a child of it makes much less sense, I think.
- Confusing APIs. Did you know that if `unregister_node` returns `B_BUSY`, this isn’t actually an error? I didn’t, until I saw this commit, and apparently @korli didn’t either as he is the one who originally wrote this code.

  But really, the whole system of how one publishes a device, as described in the previous item, is a very confusing API.
- Hard-coded search paths. I have been bitten by this when working on drivers, and so has @X512 and pretty much anyone else trying to bring up Haiku on anything that is not basic x86 (even QEMU with `virtio` posed problems that required multiple rounds of adjusting things here to get them to work.)

  The original documentation for the device_manager seems to indicate this was supposed to be a feature, and not a bug. I think it’s clear that, while the idea sounds good on paper, it doesn’t really work in reality; something more flexible is necessary. Exactly what that looks like is up for debate.
- Major missing features. Forget support for suspend/resume, did you know that Haiku does not even gracefully power off most devices? We basically just end all userland processes, synchronize mounted partitions, and then kill the power. Yes, really. The device manager has absolutely no idea how to ask drivers to shut down, and drivers expose no mechanism by which they would do so. We could at least try to `close` all devices, which would probably suffice for most things, but the poor device manager has no idea what order to close things in, because it usually relies on other things telling it to close and destroy device nodes rather than doing anything of its own volition.

- Probably a lot more. I just wrote these up based on what I am currently running into while working on the USB disk and related drivers. There are definitely more problems than just these. I really should have written down the ones I thought of while working on the `nvme_disk` driver years ago, to begin with.
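
To make the USB matching complaint from the first item a bit more concrete, here is a minimal, self-contained sketch of the two shapes of code. It deliberately does not use the real Haiku headers: `SupportDescriptor`, `UsbDevice`, `AttributeNode`, `get_attr`, the attribute names, and the return convention of `supports_device` are all stand-ins made up for illustration. Only the general contrast (a compact match table handed to the bus manager versus fetching string-keyed attributes one at a time) is taken from the description above.

```cpp
// Illustrative stand-ins only: none of these types are the real Haiku kernel API.
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// "Old" style: a compact table, loosely modelled on usb_support_descriptor.
// In this sketch a field value of 0 is treated as a wildcard.
struct SupportDescriptor {
	uint8_t		dev_class;
	uint8_t		dev_subclass;
	uint8_t		dev_protocol;
	uint16_t	vendor;
	uint16_t	product;
};

struct UsbDevice {
	uint8_t		dev_class;
	uint8_t		dev_subclass;
	uint8_t		dev_protocol;
	uint16_t	vendor;
	uint16_t	product;
};

// The whole "probing" code of the driver is this one table.
static const SupportDescriptor kSupported[] = {
	// mass storage class, SCSI transparent subclass, bulk-only protocol
	{ 0x08, 0x06, 0x50, 0, 0 },
};
static const std::size_t kSupportedCount
	= sizeof(kSupported) / sizeof(kSupported[0]);

static bool
field_matches(uint32_t wanted, uint32_t actual)
{
	return wanted == 0 || wanted == actual;
}

// Stand-in for the matching the bus manager does on the driver's behalf.
static bool
matches_any(const SupportDescriptor* table, std::size_t count,
	const UsbDevice& device)
{
	for (std::size_t i = 0; i < count; i++) {
		const SupportDescriptor& entry = table[i];
		if (field_matches(entry.dev_class, device.dev_class)
			&& field_matches(entry.dev_subclass, device.dev_subclass)
			&& field_matches(entry.dev_protocol, device.dev_protocol)
			&& field_matches(entry.vendor, device.vendor)
			&& field_matches(entry.product, device.product)) {
			return true;
		}
	}
	return false;
}

// "New" style: the node only exposes attributes keyed by string names
// (the names used below are hypothetical).
typedef std::map<std::string, uint32_t> AttributeNode;

static bool
get_attr(const AttributeNode& node, const char* name, uint32_t* value)
{
	AttributeNode::const_iterator it = node.find(name);
	if (it == node.end())
		return false;
	*value = it->second;
	return true;
}

// Every driver repeats this fetch-then-compare dance in its own hook.
// Assumed convention for this sketch: 0 means "not supported", anything
// above 0 means "supported". A real hook would also check the bus first.
static float
supports_device(const AttributeNode& node)
{
	uint32_t devClass = 0, subclass = 0, protocol = 0;
	if (!get_attr(node, "usb/class", &devClass))
		return 0.0f;
	if (!get_attr(node, "usb/subclass", &subclass))
		return 0.0f;
	if (!get_attr(node, "usb/protocol", &protocol))
		return 0.0f;

	if (devClass == 0x08 && subclass == 0x06 && protocol == 0x50)
		return 0.8f;
	return 0.0f;
}
```

In the first half, supporting another device is one more line in `kSupported`. In the second half, every additional match condition means more lookups and comparisons inside `supports_device`, and each lookup is a search by name rather than a plain field read, which is roughly where the boilerplate and the inefficiency come from.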
Now, on to the proposed changes…