Re-designing the kernel device_manager

I mean this: VirtioMmioDeviceDriver::Init(). It gets its registers and interrupts depending on the bus interface (FDT/ACPI).

Also, virtio_block (VirtioBlockDriver::Init(), current version) and virtio_mmio initialization become cleaner. There is no need to store two pointers (vtable + cookie) and no need to get the driver module.

The idea of a templated QueryBusInterface could be done as a C++ wrapper around a C method pretty easily. I do like the idea, even if the names or mechanism should maybe be a little different.
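
A minimal sketch of that idea, assuming a C-level lookup keyed by an interface name. Everything here (the device_node layout, query_bus_interface, kInterfaceName, the name strings, and QueryBusInterface being a free function) is made up for illustration and is not the actual Haiku or dm2 API.

```cpp
#include <cassert>
#include <cstring>

// --- C-style side (hypothetical, for illustration only) ---
struct device_node {
	const char* interfaceName;  // name of the bus interface this node exposes
	void* interface;            // untyped pointer to that interface
};

// Plain C-callable lookup: returns a null pointer if the name does not match.
inline void* query_bus_interface(device_node* node, const char* name)
{
	assert(node != nullptr && name != nullptr);
	if (node->interface != nullptr
		&& std::strcmp(node->interfaceName, name) == 0)
		return node->interface;
	return nullptr;
}

// --- C++ side: header-only typed wrapper ---
// Each interface type carries its own name, so callers never spell it out.
struct FdtDevice {
	static constexpr const char* kInterfaceName = "bus/fdt/device/v1";
	int interruptCount;
};

struct AcpiDevice {
	static constexpr const char* kInterfaceName = "bus/acpi/device/v1";
	int handle;
};

// The cast is only as safe as the name check in the C function above.
template<typename Interface>
Interface* QueryBusInterface(device_node* node)
{
	return static_cast<Interface*>(
		query_bus_interface(node, Interface::kInterfaceName));
}
```

A driver would then write QueryBusInterface<FdtDevice>(node) and check for a null result, while a C driver could keep calling query_bus_interface directly.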

Again, why so obsessed with C? C++-only pure virtual interfaces without any wrappers are perfectly sufficient for now. C/Rust/Pascal definitions can be written on demand if needed, in about a week of work. Because the basic device node tree concept is mostly unchanged, it should be easy to port all current device manager drivers to the new API.

Citation Details needed.

I plan to go with this new device manager design for new RISC-V drivers (clocks, resets, GPIO, FDT network adapters) because it solves serious problems with the current device manager (such as a node referencing another node that is declared later), allows implementing power management (FDT clock gating), and simplifies driver development.

I also plan to make a more advanced Devices GUI utility that can enable/disable/reset individual devices, live-update the device tree, and show more information. A more proper userland interface for the device manager is needed: the current one passes kernel pointers to userland directly and does not check them [!]. It is probably a good idea to use FDs for accessing device nodes from userland.

I would like to also hear opinions from @axeld, @PulkoMandy, @korli.

Which problem was this above?

With an FDT tree, one node may refer via phandle to another node declared later (this is known to happen with real hardware, e.g. MangoPi). The current device manager design attaches the driver directly in the register_node code, so there is a chance that the nodes that come later have not been created yet.

I plan to attach drivers lazily: first all device nodes are created, then the new nodes are probed for drivers, and QueryBusDriver may force a probe to allow referencing nodes that come later.
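
Roughly, the two phases could be sketched like this. It is a toy model, not code from the branch: DeviceTree, Node, Register and ProbeAll are hypothetical names, and "attaching" a driver is reduced to recording the node name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Node {
	std::string name;
	int phandleRef;  // phandle of a node this one depends on, or -1
	bool probed;
};

class DeviceTree {
public:
	// Phase 1: registration only records the node; no driver code runs here.
	void Register(int phandle, const std::string& name, int phandleRef = -1)
	{
		fNodes[phandle] = Node{name, phandleRef, false};
		fOrder.push_back(phandle);
	}

	// Phase 2: probe every node in declaration order.
	void ProbeAll()
	{
		for (int phandle : fOrder)
			Probe(phandle);
	}

	// Used by a driver that needs another node: forces that node's probe first.
	Node* QueryBusDriver(int phandle)
	{
		auto it = fNodes.find(phandle);
		if (it == fNodes.end())
			return nullptr;
		Probe(phandle);
		return &it->second;
	}

	std::vector<std::string> attachOrder;  // order in which drivers attached

private:
	void Probe(int phandle)
	{
		assert(fNodes.count(phandle) == 1);
		Node& node = fNodes.at(phandle);
		if (node.probed)
			return;
		// Marked before resolving the dependency so that a reference cycle
		// terminates; a real implementation would need to report such cycles.
		node.probed = true;
		if (node.phandleRef >= 0)
			QueryBusDriver(node.phandleRef);
		attachOrder.push_back(node.name);
	}

	std::map<int, Node> fNodes;
	std::vector<int> fOrder;
};
```

With a node registered first that refers to a node registered second, ProbeAll attaches the second one first, which is the forward-reference case described above.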

My opinion at this stage: the discussion on C++ or not C++ seems a bit fruitless. I don’t care. I will write drivers in C++, I am OK with a C or a C++ interface. I have no idea if one or the other helps other languages. Probably a C interface helps C, but who would need that in Haiku?

As for other things: I don’t really know how to start reviewing it. There is no documentation, the commit messages do not give a lot of details.

I think that’s one of the difficult things to get right and influences the design of everything else. It is indeed where there was a lot of discussion in this thread.

What is the model for probing devices? Do we walk the complete device tree on boot and load all drivers? Do we do it on-demand like BeOS did? In that case, how do we know which parts of the device tree to explore?

In the old device manager I can at least see how it was meant to work: something (for example userspace opening a directory in /dev) has a need for a category of drivers (e.g. mass storage), and this triggers exploration of all branches of the device tree that may provide mass storage drivers. The separation between device tree nodes (which can be walked, possibly while the driver is unloaded) and devices in devfs (which need the driver loaded) would allow for this.

Is all of this idea gone now? How does it work? Surely, some kind of “design document” would help with that, especially for the parts that are not yet written. This is where I think there are interesting things to do, not in the interface between the drivers and the device manager. The correct interface will emerge from defining what the device manager needs to know about each driver, when it is loaded, and when it isn’t loaded yet.

Yes, drivers for all compatible hardware found are loaded at boot. “Probe” means attempting to attach a driver from the candidates to a device node. In the planned design, candidates are found by matching device node attributes against driver add-on compatibility info (I am currently considering storing that info in BFS attributes in KMessage format).
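
To make the matching step concrete, here is a small sketch under simplifying assumptions: attributes are plain string pairs rather than a KMessage, a driver matches when all of its required attributes are present with equal values, and the attribute names and module paths are invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Attributes = std::map<std::string, std::string>;

struct DriverCompat {
	std::string module;   // hypothetical module path of the driver add-on
	Attributes required;  // every entry must match the node's attributes
};

// A driver is a candidate if the node carries all required attribute values.
inline bool Matches(const Attributes& node, const DriverCompat& driver)
{
	// An empty requirement set would match every node, so reject it here.
	assert(!driver.required.empty());
	for (const auto& entry : driver.required) {
		auto it = node.find(entry.first);
		if (it == node.end() || it->second != entry.second)
			return false;
	}
	return true;
}

// Collect the modules of all drivers whose compat info matches the node.
inline std::vector<std::string> FindCandidates(const Attributes& node,
	const std::vector<DriverCompat>& drivers)
{
	std::vector<std::string> result;
	for (const auto& driver : drivers) {
		if (Matches(node, driver))
			result.push_back(driver.module);
	}
	return result;
}
```

Probing would then try the returned candidates in turn until one of them attaches successfully.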

In addition to loading drivers at boot, the device manager is supposed to node-monitor driver add-ons and automatically load/unload drivers as they are installed/uninstalled.

I expect that it is difficult to get right because not all drivers expose devfs nodes; some drivers expose kernel interfaces and depend on other drivers. Other OSes seem to manage to boot fast without lazy driver loading.

At least for mass storage devices it is not worth it, because userland will enumerate all storage devices anyway.

This is something I’ve missed from Windows with its “Device Manager” application. Even on Linux, people still resort to finding the driver name through the terminal and then editing text files to disable devices.

Mass storage devices need to be iterated by the kernel already, in order to find the boot device anyway.

I think lazy loading of drivers is an important feature, though. Not for actual hardware devices, as I guess they should always be properly initialized, and probably must be to support power management. It is important for drivers like /dev/random that shouldn’t use any resources unless they are actually needed.

As for C vs. C++, I don’t have a strong opinion on either one. Our current module API is C based, and it should be the single basis of all drivers IMO (besides the legacy ones). If that can be made to work nicely with a C++ API, I certainly wouldn’t mind. I just value a stable kernel interface highly.

I think it is a bad idea to use devfs directory enumeration as the trigger for lazy loading. Various software may perform a recursive devfs traversal, which will load all lazy-load drivers and make lazy loading meaningless. A better implementation would be to load the driver when the devfs node is actually opened (this can be achieved by publishing a stub devfs node without the driver; the path can be specified in the DriverRoster compat info) or on some explicit request (ioctl etc.).
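
A toy model of the stub-node idea, to show the difference between enumerating and opening. StubDevFs, PublishStub and the loader callback are hypothetical and only stand in for devfs and for loading the driver add-on.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

class StubDevFs {
public:
	using Loader = std::function<bool()>;

	// Publish a path without loading anything; the loader runs on first open.
	void PublishStub(const std::string& path, Loader loader)
	{
		assert(static_cast<bool>(loader));
		fEntries[path] = Entry{std::move(loader), false};
	}

	// Enumeration only looks at names, so it never loads a driver.
	bool Exists(const std::string& path) const
	{
		return fEntries.count(path) != 0;
	}

	// Opening the node is the explicit trigger: load once, then reuse.
	// A failed load leaves the entry unloaded, so a later open retries.
	bool Open(const std::string& path)
	{
		auto it = fEntries.find(path);
		if (it == fEntries.end())
			return false;
		if (!it->second.loaded)
			it->second.loaded = it->second.loader();
		return it->second.loaded;
	}

private:
	struct Entry {
		Loader loader;
		bool loaded;
	};

	std::map<std::string, Entry> fEntries;
};
```

A recursive walk over such a tree touches only Exists-style lookups, so something like /dev/random would stay unloaded until a program actually opens it.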

My current public device manager and busses/devices API draft is located at haiku/headers/os/drivers/dm2 at device_manager2 · X547/haiku · GitHub. Using C++ should not be a problem because it uses an ABI similar to COM: all methods are virtual or inline. Unnecessary inheritance has been removed, and the ability to have multiple driver interfaces and/or versions has been added.

The current device manager maybe goes a bit too far, in that not only it doesn’t load the device drivers, but also it tries to not explore the busses until they are needed.

I think this is what creates the need for hardcoded rules in device_manager.cpp? Would a solution where we explore all busses (PCI, USB, FDT, ACPI) at boot, but load the “leaf” drivers only on demand, work better and be simpler?

There are some cases to be discussed, for example, HID arguably could be handled as a bus (we have discussed this in the past), with USB-HID, I2C-HID and Bluetooth-HID providing the bus, and then a generic HID input driver would attach to that bus, and do the HID report parsing and input_server interfacing.

And another case that is kind of the opposite situation: SDHC controllers. I tried to use the device manager interface properly, which means there is:

  • An SDHCI driver that attaches to the PCI device of the SD controller,
  • An SD/MMC bus that enumerates the cards on top of the SDHCI controller (and could attach to other controllers that are found for example in FDT)
  • And finally a driver for mass storage SD cards

In theory, there can be other types of cards using the same bus (SDIO), and there can be multiple cards attached to the same controller, requiring this interface. In practice, you will find that on some ARM devices, but on PC you will always find SDHCI, and a single SD card attached to it.

In both of these situations (HID and SDHCI), how do we know when we reached a “leaf”? You can imagine device trees as complicated as you want, such as SDHCI on PCI on some USB to PCI bridge on top of another PCI bus on top of device tree.

There are several steps to the lazy loading, the directory enumeration is just one of them.

The sequence is as follows:

  • The directory is enumerated (or some other event inside the kernel results in needing a driver)
  • Corresponding parts of the device tree are explored, and drivers are loaded
  • Drivers publish their devices in the directory
  • The application that was looking for a device opens it
  • The actual device initialization happens in the open() hook from the device

In a sense, the initialization that happens in the open() hook is always lazy: it is not done until the driver is needed by userspace. But, it turns out, a lot of the initialization is done before that for most devices.

Also, we have to think about driver reloading. The lazy loading also means it is easy to unload a driver (just stop the userspace things that are using it). This allows updating the driver without rebooting, which is sometimes quite useful for testing. I do this a lot with USB devices. Install the new driver, unplug and replug the device, and you can test immediately. But it requires an unplug-replug, which isn’t always possible (for internal devices). What are our options here?

There is a risk that C++ features will end up being used and break in surprising ways if one of the drivers is not actually written in C++.

You can provide a C structure that will mimic the C++ class (having the same memory layout) but will not behave well for dynamic_cast or similar operations. So, it makes the life of non-C++ developers a bit difficult to get this just right, whereas a C interface as we have now is a well understood object.

We also know that maintaining a stable C++ ABI is somewhat complicated: we had to deal with ABI changes, and set up various safeguards (fbc padding, the Perform method, …) in our userspace classes to allow for it to really work.

What we win is a possibly easier to use interface for people writing drivers in C++. But, to me, it does not seem like a huge win, the current structure based system seems good enough. Certainly not perfect, but a good compromise between ease of use in C++, ease of use in other languages including C, and ease of guaranteeing binary stability. You make a decision to shift to easier use in C++, I think at the cost of compromising the two others a little. It is not easy for me to see how much they are compromised and decide if it is worth the simplification on the C++ side.

Already done in my branch: haiku/src/add-ons/kernel/busses/hid/i2c/i2c_hid.cpp at device_manager2 · X547/haiku · GitHub, haiku/src/add-ons/kernel/drivers/input/hid_input/hid_input.cpp at device_manager2 · X547/haiku · GitHub. Kernel HID device API: haiku/headers/os/drivers/dm2/bus/HID.h at device_manager2 · X547/haiku · GitHub.

dynamic_cast is not available in kernel mode, as I understand it. At least many kernel add-ons use the -fno-rtti flag.

It should be documented that C++ RTTI (dynamic_cast, typeid) can’t be used with device manager interfaces. Things like fNode->QueryBusInterface<FdtDevice>() should be used instead.

C++ RTTI can’t function properly without dynamic symbol linking, which the Haiku kernel does not support. UPDATE: dynamic_cast is actually used in kernel code, but I suppose only within the boundaries of a single image.

With my design, a driver is supposed to be loaded automatically (if it matches some device node) when its add-on is installed, and unloaded automatically when the add-on is moved or deleted.

The driver roster maintains two lists: driver add-ons and device nodes. The driver add-on list is updated live by node monitoring. The driver roster matches add-on and device attributes live and updates the list of compatible driver modules for each device node.
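
As a rough illustration of the two lists, here is a simplified model. The method names are hypothetical, matching is reduced to comparing a single "compatible" string, and candidates are recomputed on each query instead of being cached per node as described above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

class DriverRoster {
public:
	// Called when a device node appears in the tree.
	void NodeAdded(const std::string& node, const std::string& compatible)
	{
		assert(!compatible.empty());
		fNodes[node] = compatible;
	}

	// Called from node monitoring when an add-on file is installed.
	void AddOnInstalled(const std::string& addOn,
		const std::string& compatible)
	{
		assert(!compatible.empty());
		fAddOns[addOn] = compatible;
	}

	// Called from node monitoring when an add-on file is moved or deleted.
	void AddOnRemoved(const std::string& addOn)
	{
		fAddOns.erase(addOn);
	}

	// Add-ons whose compat string equals the node's, in name order.
	std::vector<std::string> CandidatesFor(const std::string& node) const
	{
		std::vector<std::string> result;
		auto nodeIt = fNodes.find(node);
		if (nodeIt == fNodes.end())
			return result;
		for (const auto& addOn : fAddOns) {
			if (addOn.second == nodeIt->second)
				result.push_back(addOn.first);
		}
		return result;
	}

private:
	std::map<std::string, std::string> fNodes;   // node path -> compatible
	std::map<std::string, std::string> fAddOns;  // add-on name -> compatible
};
```

Installing an add-on makes it show up as a candidate for matching nodes right away, and removing it makes it disappear again, without a reboot in between.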

Bump @axeld @PulkoMandy @waddlesplash @jessicah @korli @mmu_man

@X512 has done a lot of work on this, and we need to discuss if it is the model we want.

My current thoughts:

  1. I think the concerns enumerated above about use of C++ stand; I don’t think the benefits outweigh the downsides. A lot of said benefits could be gained by simply having a “C++ zero-cost abstraction”; i.e. a header-file-only class wrapping the C API, which would sidestep all the API/ABI concerns and allow for “cast safety”.

  2. The current linked implementation still does not do dynamic device/bus registration (at least in most cases): it has a large hard-coded static array of device driver information inside the kernel itself (called CompatInfoData). So, it does not even implement the critical feature any device manager actually requires, the existing implementation of which is one of the reasons for discussing such a redesign in the first place.

  3. A number of the items mentioned in my initial post (specifically: “Major missing features”, “Confusing APIs”, etc. but also parts of “Inflexible abstractions”, even) aren’t solved by this system. Perhaps it refactors things in such a way to “pave the way” for them, but I think the existing device_manager was thought to do so, as well. So, as mentioned above, the real test of a new design is whether it actually supports those features or not.
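
To illustrate the header-only wrapper from point 1: a sketch assuming a C interface made of a function table plus a cookie. The example_bus_interface struct and the ExampleBus class are invented for this example and do not correspond to an existing Haiku header.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical C interface in the style of a module function table + cookie.
struct example_bus_interface {
	uint32_t (*read_register)(void* cookie, uint32_t offset);
	void (*write_register)(void* cookie, uint32_t offset, uint32_t value);
};

// Header-only wrapper: every method is inline and there are no virtuals,
// so the only thing crossing the ABI boundary is the C struct above.
class ExampleBus {
public:
	ExampleBus(const example_bus_interface* vtable, void* cookie)
		:
		fVTable(vtable),
		fCookie(cookie)
	{
		assert(vtable != nullptr);
	}

	uint32_t ReadRegister(uint32_t offset) const
	{
		return fVTable->read_register(fCookie, offset);
	}

	void WriteRegister(uint32_t offset, uint32_t value) const
	{
		fVTable->write_register(fCookie, offset, value);
	}

private:
	const example_bus_interface* fVTable;
	void* fCookie;
};
```

The driver still stores the two pointers, but only inside the wrapper object, and C drivers keep using the function table directly.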

I have not yet implemented a way to put this information into driver add-on files so that it can be read in early boot phases, e.g. for boot drivers loaded by the Haiku boot loader. Currently I am considering putting it into program headers, but that needs adjustments to the ELF linker code. It will be trivial to eliminate the static driver compatibility info table once reading that info from driver add-ons is implemented. The compat info format will remain the same (KMessage).

Reading and generating that information is one of the critical questions here, though. How will this data be generated at build time (or runtime)? How will the kernel find it and read it efficiently? Will it keep that in memory always, or re-read it when it needs to find new drivers? With the static array setup, all those questions are totally side-stepped, it’s way simpler. So there’s a bunch of key design considerations that are just not taken into account until a real scanning system is implemented. Sure, eliminating the static array is “trivial” once you read from drivers, but it’s exactly that “reading from drivers” that’s the whole question in the first place, and which I was outlining detailed solutions to earlier on in this thread.

(For that matter: putting the data inside the ELF, rather than in extended attributes, was one of the suggestions I originally made which you strongly disagreed with at the time. So I guess you have changed your mind about using extended attributes here?)

The problem with extended attributes is that they are currently not supported by the boot loader VFS. Reading resources is also not trivial: the existing code makes heavy use of exceptions, which are not usable in the boot loader. It also requires traversing the whole ELF file to find the end of the ELF data and the start of the resources.

I planned to make a new method of storing Haiku resources in the ELF file using program headers. Resources can be compiled from *.rc into *.o and linked in the same way as any other object file, which will also be useful for generic GUI application resources. It is also easier to integrate into common build systems such as Meson. Userland may support both the old and the new resource storage method, but kernelland will support only the new program-header-based one.

If we’re going to have data in the ELF, why use resource compilation at all? Why not just have specific named symbols with standard struct layouts (like the BSDs use)? That way, these tables can be managed entirely in source code, and there’s no need to edit separate formats.

We can have the same setup of declared string versions for structure formats that we use elsewhere for backwards/forwards compatibility, too.
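
Something along these lines, as a sketch only: the struct layout, the symbol name example_compat_info, the version string and the module path are all invented here, and the PCI IDs are just sample values. How the kernel would locate the symbol in an unloaded add-on is left out.

```cpp
#include <cassert>
#include <cstring>
#include <stddef.h>
#include <stdint.h>

struct example_pci_id {
	uint16_t vendor;
	uint16_t device;
};

// Hypothetical table layout, identified by a version string.
struct example_driver_compat {
	const char* version;  // layout version, checked by the reader
	const char* module;   // module the table belongs to
	const example_pci_id* ids;
	size_t id_count;
};

// --- Driver side: the table lives in the driver's own source file ---
static const example_pci_id kExampleIds[] = {
	{0x1af4, 0x1001},
	{0x1af4, 0x1042},
};

extern "C" {
// Declared with C linkage so the symbol name is not mangled.
extern const example_driver_compat example_compat_info;
}

const example_driver_compat example_compat_info = {
	"example_compat/v1",
	"drivers/disk/virtio_block",
	kExampleIds,
	sizeof(kExampleIds) / sizeof(kExampleIds[0]),
};

// --- Reader side: refuse unknown layouts, then scan the ID list ---
inline bool Supports(const example_driver_compat& info, uint16_t vendor,
	uint16_t device)
{
	assert(info.version != nullptr);
	if (std::strcmp(info.version, "example_compat/v1") != 0)
		return false;
	for (size_t i = 0; i < info.id_count; i++) {
		if (info.ids[i].vendor == vendor && info.ids[i].device == device)
			return true;
	}
	return false;
}
```

A newer layout would get a new version string, and a reader that does not know it simply skips the table.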

Keeping driver compatibility information separate makes clear which hardware is supported by which driver and allows implementing various tooling, such as an online driver compatibility database (you type, for example, PCI vendor and device IDs and check whether your device is supported by Haiku without installing anything) or driver attachment verification without running any driver code (similar things are done with Linux FDT to address boot problems caused by broken FDT declarations).

All those things can be done with the methods I have proposed above. (And the BSDs already do them with similar methods, proving it works in practice and not just in theory.)