Crazy concept ventures

Anyone here know how to completely deconstruct Haiku x86_64 (an old pre-pm version) to a primitive state where we can start reconstructing it into a totally different beast?

Primarily, I’d like the entirety of Haiku itself to run on one core. Secondarily, I’d want the entirety of Haiku’s core functionality to be directly untouchable from the outside (user/code). No need to ask why; I have my reasons. But security and performance are my primary goals.

Anyone interested in this crazy concept venture I’d like to try out? Name your price.

Not interested.

You don’t go for the easy ones, do you? Haiku has been multi-core since it was BeOS, when neither Microsoft nor Apple knew what a multi-processor machine looked like and Linux was just starting to look into it. You’d probably have to go back to something as early as BeOS DR5 to find a single-processor implementation and that was long before x86 came into the picture.

I would suggest you try to hack the Haiku API onto a suitable single-core kernel rather than try to change something so fundamental. Either way, you would have literally years of work ahead.

I would think that making Haiku single core would not be that difficult. Just make it think it has only 1 core to run on.

What do you mean by making the core untouchable from the outside? Do you mean making the micro-kernel not callable from user space? That would be a problem: it goes against the design of a micro-kernel. Most OS functionality is taken out of the kernel and put into user space; the micro-kernel only provides small, basic functions. Most Haiku code runs in the servers, which are in user space.

As far as I know, BeOS was never ‘single core’. The BeBox of course was twin-processor. JLG’s mantra was “One processor is not enough!”

Hilarious. You think I’m talking about making Haiku single core. That would be sheer foolishness. No. I’m talking about making Haiku function within one core, all to itself. Nothing else uses that core. It’s still the same multi-everything, but it allocates a single core to itself, exclusively. Is that the same as being single core, single-threaded? Not to my thinking. All threads that are spawned, for its own use, are spawned within that first core. Nothing else can use that core.

If anything, I should think it would be considered wasteful, devoting an entire core to just the OS. But maybe my thinking is wrong; I allow that as a possibility. I am just an end-user, after all. Maybe what I’m thinking is hampering rather than helpful. But I’m trying to “think different”, to coin a phrase. Can it be shown where my thinking is completely foolish? Please do so. I’m not afraid of being proven wrong. But I hate living with the status quo, when alternative thinking might “break the mold”.

However, there are several aspects of Haiku (namely, graphics rendering, I’m sure), that if stuffed into the same core, would probably easily overload it and slow it down. But I also have an interesting idea for that, as well.

But let’s focus on the first question. Can what I propose be done and will it improve performance or reduce it?

Anyone willing to give it a shot with my copy of Haiku x86_64? What would it cost to attempt this part? I’d rather be called a fool for having dreamed up something radical, that didn’t work, than to be considered wise for having followed the status quo and done nothing. Just me, I guess.

Yeah, I get it. You’re a lay-person without training in the field that is ‘just asking questions’…
I would be surprised if you couldn’t find anyone willing to take your money to investigate that outside-the-box thinking further. You probably need a wider audience than the Haiku webforum though.

Guess we’ll just have to wait and see…

I have no idea where you want to go. As it is, your idea makes little sense to me. Let’s look at this in more detail, just to see where it takes us.

I am writing an application; it runs two threads, as usual: one is the application thread, the other is the window thread. These belong to the application and do not use the “system” core. Then the application opens a file. This does not spawn a new thread or anything; however, opening the file ends up running code that definitely belongs to the OS, ultimately reading parts of the file through the file system and disk driver. This all happens in the same application thread. At which point would it switch to running on the “system” core? At which point would it switch back?

Second question: what would be gained by doing such a thing? I suspect you have an idea of something you want to achieve, because “let’s run the system in one core” makes little sense from an end-user point of view. Performance? Security? Something else? Software design starts with what you want to achieve, and only then goes on to the technical details of how to achieve it. It looks like you have omitted that part, so it is not clear what you are trying to do (and indeed people misunderstood).

As for paying someone to do it, that will not be me. I have a paid job I’m quite happy with and plenty of other things to do in Haiku already. You can search for the usual rates for a software engineer on any search engine and find some useful results, I guess, if you want an idea of how much such people are paid.

I think you’ve overlooked that there’s already a huge amount of thought and research that goes into problems like this. Typing “scheduling” into Google Scholar will provide you with lots of scientific papers on the topic. For a brief overview you could look at the Wikipedia page, or even the page for the related job-shop scheduling problem.

You’re unlikely to make an improvement without looking at a real world implementation and thinking very carefully about how it works, the trade-offs involved, and how it could be improved, or by reading some recent papers on scheduler design.

My thought was to try to isolate the OS in such a way that NOTHING could slow it down, no matter how intensive it was. The app might be crawling, because it was using up all the CPU resources of the core(s) it was running on, but the OS would never be negatively affected (operating at full speed), because the core that it operated in was never touched by other running apps.

However, given the explanation you’ve just given, I sense my concept simply isn’t possible, correct? Everything that runs HAS to be run through the OS core, in one way or another, no matter how isolated you made it? So, the OS (with a demanding enough app or enough apps making demands of the OS) would be slowed down, eventually, no matter what. Right?

Did you ever see Haiku slow down because of an application? I never did. Our current scheduler does a good job at this already.

It works quite differently from your idea, however. Haiku is (like any modern OS) doing preemptive multitasking, which means anything running on a CPU core can be interrupted to let other things run. In the case of Haiku, we do this rather aggressively to let user interface work run first, which is what makes the OS seem fast (it always reacts quickly when you do something). Another part of this strategy is splitting each application into a lot of different threads, which gives the scheduler finer-grained control over what runs and what doesn’t.

There is no need to lock things to a CPU core for that. We can always kick applications out of it if we need it for something else anyway.

The whole point of a micro kernel architecture like BeOS and Haiku is to make the application more responsive to the user than a monolithic kernel like Linux. And it works pretty well as advertised. I did some testing where I loaded a laptop with a very CPU intensive application. The Haiku user interface was still very responsive. It is designed to be that way. You really want your OS to have more threads and run on more cores. This way, things don’t “block” and stop or act sluggish. Like Munchausen said, the Scheduler takes new requests for CPU resources and puts them on CPU cores. With the new Scheduler and the new CPUs, you can spread your work over a lot of cores. I did some testing on 12 cores. It works.

Ok, so my idea isn’t just a different wording/thinking of what Haiku already does. It actually IS different, but the question is, would it work better than the way Haiku operates currently, if it were implemented? Or just a different way of doing the same thing (essentially), with no appreciable performance improvement?

Luposian, have you researched unikernels? That might work better for the project you have in mind. I’m not sure that Haiku can be easily implemented as a unikernel.

What you are asking for is essentially what a real-time OS provides. A real-time OS guarantees that a thread running at a given priority will always be serviced within a certain, fixed amount of time. Gar-run-teed. The assumption is that once serviced by the scheduler, a thread will only get a certain percentage of the CPU. Although BeOS and Haiku are not 100% real-time, for most practical purposes you will get the same results. QNX is a real-time OS, used by Blackberry for phones and popular in cars and robots.

Haiku and BeOS are not microkernels. They are usually described as “hybrid modular” kernels, whatever that means.

In a microkernel, the idea is to move everything outside of the kernel: drivers, memory allocation, filesystems, network stack, and so on. The kernel is a low-level thing, it cannot benefit from some hardware features because it needs to manage them (the MMU, for example), and as such writing kernel code is risky and any bug could have bad consequences. So, by having as little code as possible on the kernel side, you reduce this risk.

Haiku does not go this way, at all: drivers run inside the kernel, and so do filesystems, the network stack and a lot of other things. But there are exceptions: for example, userlandfs allows running file systems in userland, and app_server accelerants are essentially a large part of the graphics drivers moved to userland. In BeOS, I think parts of the network stack were also running in net_server (userland), but they changed their mind on that with BONE.

Anyway, this is all software architecture and completely unrelated to what we are discussing here. You could write a microkernel that freezes the whole UI every time you are waiting for the hard disk. What makes Haiku react well in high-load situations is a set of design principles built into the API:

  • Fine-grained threading: each window and each application you open has a thread. This forces developers to think about which thread they are using for each task they want to achieve, and allows the scheduler to decide which of these to run.
  • Mandatory thread priorities: unless you use the pthread APIs (pthread_create), Haiku forces you to give a priority to each thread you create. This gives the scheduler valuable information about which threads are important for user interaction (we have helpful constants like B_DISPLAY_PRIORITY, B_REALTIME_PRIORITY, etc.).

Given these, the scheduler is given all the needed information (thread priorities) and flexibility (because threads are specialized) to decide what to do next. And it is tweaked so it will first run everything that involves user interface interaction: handling mouse and keyboard input, refreshing the display, etc. It is just that simple thing which makes Haiku react immediately when you click on something.

You don’t even need a real-time OS for the UI part, as there is no hard defined constraint on the latency here. Is it ok if the mouse freezes for 1 second? (probably not) 100 milliseconds? (I don’t know) 20 milliseconds? (the user probably wouldn’t notice) 16 milliseconds? (less than the time it takes to draw it on screen, so probably ok). The border is blurry between “ok” and “not ok”.

Haiku does use real-time scheduling, but only for select threads, in particular everything that has to do with media. If your video skips frames, or your sound output has “holes” in it because the MP3 decoder doesn’t run fast enough, that is immediately noticeable and clearly not acceptable. Hence, for these tasks we use real-time scheduling.

In a way a lot of operating systems provide some pieces of this functionality through the concept of “affinity.” Essentially the idea is that you can pin the scheduler to always schedule a given thread/process on a given core. Some schedulers do some form of this automatically. Generally you want to avoid bouncing a thread between cores because of the different levels of cache on the CPU: you could end up having to hit a higher-level (slower) cache because the data you want is in the L1 cache of a core the scheduler just moved your thread off of. This becomes even more critical on NUMA machines, where not only the caches but system memory as well has levels. On today’s multi-socket systems there will be some memory local to each socket, and getting data from memory associated with a different socket adds latency. NUMA-aware schedulers frequently add a notion of socket affinity as a result.

I will admit my ignorance of how Haiku handles scheduling, and it is less of an issue for Haiku’s primary use case which I would think would be on single socket SMP systems.

The kernel does work on behalf of applications. When applications want to do some I/O or IPC they ask the kernel to do it for them and the kernel spends a certain amount of time servicing the request. This time is time on the CPU. Now the kernel may be able to do some other things at the same time depending on the situation, but it still consumes resources. If applications aren’t asking the kernel to do work for them then the kernel isn’t really impacting performance all that much. Calculating a billion digits of pi on all your CPU cores isn’t going to tax the kernel much, and isn’t likely to tax any other system tasks either.

If you really want to improve responsiveness under load, the areas you’d want to look at are A) avoid ping-ponging threads between CPU cores and B) make sure the scheduler is prioritizing threads that involve things like UI, networking etc. Most operating systems have some concept of thread priority and if not, you could add some sort of priority mechanism that will guarantee time to certain system tasks. If throughput is significantly less important than interactivity you could use a really brain dead scheduler that preempts threads at a fixed interval and runs the next waiting thread in a round-robin fashion. In general though, I would assume Haiku’s scheduler is already fairly good. A lot of work goes into tuning schedulers on large machines for a small, incremental benefit.

It does have a scheduler with affinity, which can be tweaked between two different modes (“low latency” and “power saving”, switchable from the ProcessController menu) for more or less aggressive waking up of different cores. This is largely the work of Pawel Dziepak, who was contracted to design the new scheduler.

Well, crud. Here I thought I had envisioned such an original (totally unthought-of) concept for security, and suddenly I see anti-malware software is doing basically the same thing: behavioral analysis. Except it’s still doing it from the outside, and it’s doing it in a more complex fashion. But it can’t do any better, because the OS is still the problem. Just as I said before, you CANNOT retrofit my concept into an existing OS. They’ve come close and it works better than the “old way” (still widely used), but it’s still not the same. Whew! Thought I’d lost my “edge” there, for a second. :smiley: