Haiku nightly lags considerably since 19/07/2024

Sadly I don’t have the hrev at hand, but I update almost daily, or at least whenever I’m working on SEN. Since the build from July 19th I’ve noticed considerable (up to 20s) but inconsistent lags, even at boot.
I have a gut feeling it’s I/O related, possibly USB, but haven’t changed anything and just use an external USB touchpad since my internal one is not supported (and now even misbehaves on Linux after some time of use).

The lag recurs roughly every 5 minutes; I couldn’t track down what triggers it yet.

Could this be related to / be a regression of this old issue here?

The best I can offer is 2 screenshots of idle and kernel load below…
[screenshot: kernel-load]

This may be related to the problems reported with Mednafen and video playback by @andianton. It turns out that a recent change of mine uncovered a pretty old bug in the system timer code. I have a fix that I’m working on at the moment.

Should be fixed by hrev57916, please let me know if it’s not.

Thanks a bunch, will try the next nightly then…

I just updated to hrev57924 and still experience those issues, unfortunately.
Even on boot, Tracker freezes for some 20s, only showing the mounted disks, before displaying the rest of the desktop (no network drives or such).

You notice the lag best when browsing menus - it sometimes takes around 10s for the submenu to open, and the menu item is not even highlighted.

Are you running in a VM or on bare metal? Some other users reported similar issues, but I haven’t reproduced them or gotten anyone to send me a trace with the necessary debug info. It sounds like that may be hard to collect, as the system doesn’t respond to KDL requests in this state.

Running on bare metal of course, nothing like the real thing 💪🤓

Can you drop to KDL via Alt+SysRq+D during a stall? Or does KDL refuse to respond?

If KDL does respond, the next thing to do is see what those other threads are doing / waiting for.

First try, a lot of waiting threads right after startup…
Also, when I try to exit KDL using the command cont[inue], KDL just freezes…


I don’t see anything unexpected in there. The “idle threads” are all “running”, so it sounds like there just aren’t any threads scheduled. If it just freezes after “co”, that probably means app_server didn’t refresh the display. So, what are app_server’s threads waiting on?

Ultimately, if the problems started July 19th, then I can tell you exactly what commit broke things (this one), but really, as far as I can tell, it shouldn’t have: it was just an optimization to not reset the hardware timer many times if it’s still set to what we want it to be. The previous bug was that it wasn’t set properly on some occasions, which I then fixed, but it seems there must be some other case where it’s just not set properly.
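
For readers following along, the kind of optimization described here can be pictured roughly as below. This is only an illustrative sketch and not Haiku's actual kernel code: `FakeHardwareTimer`, `TimerQueue`, and every member name are hypothetical. The queue remembers which deadline it last programmed and skips reprogramming when the earliest pending deadline has not changed. The skip is only safe as long as the remembered value matches what the hardware is really set to.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical one-shot hardware timer that counts how often it is programmed.
struct FakeHardwareTimer {
	int64_t deadline = -1;	// -1 means "not armed"
	int programCount = 0;

	void Arm(int64_t when) { deadline = when; programCount++; }
	void Fire() { deadline = -1; }	// a one-shot timer disarms itself on firing
};

// Hypothetical software timer queue sitting on top of the hardware timer.
class TimerQueue {
public:
	explicit TimerQueue(FakeHardwareTimer& hardware) : fHardware(hardware) {}

	void Add(int64_t when)
	{
		assert(when >= 0);
		fPending.insert(when);
		UpdateHardware();
	}

	// Simulates the timer interrupt arriving at time "now".
	void HandleInterrupt(int64_t now)
	{
		fHardware.Fire();
		fArmedFor = -1;	// keep the cached value in sync with the hardware
		// Drop every timer whose deadline is at or before "now".
		fPending.erase(fPending.begin(), fPending.upper_bound(now));
		UpdateHardware();
	}

private:
	void UpdateHardware()
	{
		if (fPending.empty())
			return;
		int64_t earliest = *fPending.begin();
		if (earliest == fArmedFor)
			return;	// the optimization: hardware is already set to this
		fHardware.Arm(earliest);
		fArmedFor = earliest;
	}

	FakeHardwareTimer& fHardware;
	std::multiset<int64_t> fPending;
	int64_t fArmedFor = -1;	// what we believe the hardware is set to
};
```

In this toy model, adding a second timer with a later (or identical) deadline does not touch the hardware at all. If `fArmedFor` ever disagreed with the real hardware state, the early return would leave the hardware unarmed and nothing would wake up on time, which is the general shape of a "timer not set properly" stall.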

But if I don’t have any more opportunity to debug, I may just revert the part of the commit that’s causing problems and add a TODO for later.

I had a KDL in app_server, but I don’t think it is related to this problem. Starting in safe mode and without the graphics driver solves the problem, but leaves me without wifi drivers. It even crashes with the KDL in app_server on the nightly hrev57291.

Do you need “safe mode” or just failsafe graphics drivers? You may have an unsupported card.

Both options: safe mode and the failsafe graphics driver.

@grexe, do you have a setup for building Haiku? If so I can send you some patches to add more assertions to try and catch this bug. Otherwise I will have to build some packages and upload them somewhere I suppose.

Sure, I’m totally willing to help you with this, and of course I’ve got a setup, as I need to fork Haiku for SEN integration 😏 So just send those patches my way…

Alright, here it is:

If it works, this should cause an ASSERT FAILED KDL (I didn’t reinstate the change that caused the hanging though, so it wouldn’t hang anyway.) When that happens, please run the KDL timers command to get the information on the current timers, which should allow us to see what the invalid/wrong timer is and where it came from. (The panic is continuable, so after it happens you can just continue to go back to the system.)
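
To give an idea of what such an assertion might check, here is a minimal sketch. It is not the patch from this thread: `TimerEntry`, `EarliestTimer`, `HardwareTimerIsConsistent`, and `VerifyTimers` are hypothetical names, and the invariant itself is an assumption (if any timer is pending, the hardware must be armed for a time no later than the earliest one, since firing early is harmless but firing late is a stall).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of one pending software timer.
struct TimerEntry {
	int64_t scheduleTime;	// when the timer is due
	const char* owner;	// who queued it, useful when inspecting a failure
};

// Returns the earliest pending timer, or nullptr if none are pending.
const TimerEntry* EarliestTimer(const std::vector<TimerEntry>& timers)
{
	auto it = std::min_element(timers.begin(), timers.end(),
		[](const TimerEntry& a, const TimerEntry& b) {
			return a.scheduleTime < b.scheduleTime;
		});
	return it == timers.end() ? nullptr : &*it;
}

// armedFor is the deadline the hardware is set to, or -1 if it is not armed.
bool HardwareTimerIsConsistent(const std::vector<TimerEntry>& timers,
	int64_t armedFor)
{
	const TimerEntry* earliest = EarliestTimer(timers);
	if (earliest == nullptr)
		return true;	// nothing pending, nothing to miss
	return armedFor >= 0 && armedFor <= earliest->scheduleTime;
}

// Where a kernel would panic into its debugger, this sketch uses assert().
void VerifyTimers(const std::vector<TimerEntry>& timers, int64_t armedFor)
{
	assert(HardwareTimerIsConsistent(timers, armedFor));
}
```

When a check like this fails, the list of pending timers and their owners is exactly the information needed to see which timer the hardware was not set for.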

This was another radeon_hd driver-related problem; it works when I enable the failsafe driver.

Ok that was fast, seems the problem has been solved by sleeping for me😅
Will try that option and if it works here too then we have found the culprit.
Thanks @zantak for the quick help!

This was my specific problem: a normal boot stops with a KDL related to app_server, but if you start with the failsafe video driver, voilà, it works.

I have no KDL though, just intermittent hangs.