Building Haiku_x64 from within Haiku_x64

Seems I’m having no issue with my dual-core Pentium Lenovo laptop (N580). Should I just consider getting an Intel processor-based desktop system and not worry about the stability issue? Or are Ryzen 3/5/7 systems solid when compiling Haiku?

I have no clue. If the answer were known, the AMD-related tickets would probably progress faster. Intel CPUs are clearly more stable than AMD ones under Haiku. You just have to test it (and report bugs). But even simpler than getting a new system is enabling/disabling CPUs/cores at runtime, e.g. by opening the ProcessController applet (first one under the leaf menu) and ticking/unticking each core.

Thank you. I had totally forgotten I could do that. I was thinking the only way to disable cores was via Safe Mode options (wouldn’t “Disable SMP” accomplish the same thing?).

Well, PRAISE GOD AGAIN! I have just gotten a working Anyboot .iso built on my AMD FX system, using “disable SMP” (so only one core functioning) in the Boot options. So, now, I need to re-enable SMP and see if I can build with two cores, then three, then four and so on. Fingers crossed.

Question is, is there (or could there be) a work-around in Haiku's code for this issue, or is limiting cores the only solution? I already have the latest BIOS ever released for this system (2014; F2), so nothing is forthcoming in that direction. Not sure what you meant by:

Unless you meant a BIOS firmware update, which has already been done.

Ok, I apparently cannot disable all the CPU cores, because when I click off the 4th core (after disabling 8, 7, 6, and 5 (in that order)), Haiku locks up. So, is Haiku running in the core I shut off and it messes everything up or…?

UPDATE: Using Pulse, I can safely disable all cores, without Haiku locking up. But if I use ProcessController, I can’t disable past the 5th core (in reverse order; 8, 7, 6, and 5) before it locks up.

UPDATE 2: Was able to build with two cores enabled, but tried 3 cores and it blipped (rebooted). So, now we know the limit. Question is… why? You can use this CPU all day long in Haiku/Windows 10, all 8 cores running like the wind… but anything past 2 cores in Jam (Haiku) and it junks this system like a trash heap! Very strange.

Your experimentation deserves a comment in one of the existing tickets, like #14082, #14887 or #15817. Or you can open a new ticket for this.

I think we have enough tickets for this particular problem.

It may or may not be solved with microcode updates.

In the BIOS, look for the Power Management settings: Cool N’ Quiet and the C1/C6 states. You can test with Cool N’ Quiet both disabled and enabled, and with C1/C6 disabled.

I have already tried disabling “Cool and Quiet” mode and it doesn’t change a thing. I had enabled one of the C1/C6 modes (one of them was already enabled), but no change was noted there. Do you think disabling or enabling would help things?

Some BIOS settings can reset the system if they are misconfigured or if certain conditions are met.

If your hardware is in perfect working order and fully updated with the latest BIOS/microcode… then it usually requires the proper kernel updates per CPU manufacturer specs.

I don’t think this is a hardware problem. I have an AMD computer with the same issue under Haiku. But besides Haiku I have Windows, Linux, MacOS, Solaris, FreeBSD and Android-x86 installed on it, and none of them has this problem.

We should at least do the microcode patching, and see if we still have problems after that.

IIRC, one of the users with these processors ported the AMD microcode patcher for this specific hardware, and it did not help. (The patches were on Gerrit at one point, but they were GPL so we can’t/won’t merge them.)

Yes, it’s archived on Gerrit: https://review.haiku-os.org/c/haiku/+/517
I see no reference there to it helping or not with the issue, however.

I thought microcode (patches and fixes and such) had to be done/implemented by AMD/Intel. But it can be done by 3rd parties/end users, too? What, exactly, does/did this microcode actually “fix”? And would it be a worthwhile effort to try it on my own system to see if it actually works? And, if so, could we find a non-GPL method to implement it?

Wait a second… there is a very easy way to figure out whether Haiku or the hardware is at fault: use a non-Haiku setup to build Haiku. If I can use all cores/threads to jam Haiku in, say, Ubuntu, then we know it’s something wrong with Haiku. If it still happens, then we know it’s a flaw in the hardware… but has this ever actually been tested, or mentioned as being a universal issue across build platforms?

Ok, reinstalled Windows 10 and ran CPU-Z. The CPU is, specifically, an FX-8320 (Vishera). If the motherboard specs/chipset would help, or the location of the RAM DIMMs (they’re in slots 3 and 4, not 1 and 2, as I would assume is proper), I can give that as well.

The microcode is a patch that fixes problems inside the CPU. Intel and AMD ship it as a file with instructions on how to apply it. The manufacturer of the motherboard can then include it in a BIOS update, or the OS can bundle the file and apply it early at boot (since we can’t trust motherboard manufacturers to ship BIOS updates in time).

As for the problem, it is clearly a problem in Haiku itself, or in hardware. If it was in hardware, you would see instabilities in other OS as well. If it was a normal crash (just an application failing or the like), it could be the build system having a problem. But you get a reboot. There is no way an application should be able to trigger an accidental reboot like this, except in what we call the “triple fault”. What does this mean?

  • First, the application crashes (fault 1), for example it accesses memory it is not allowed to
  • Secondly, the code in the kernel that handles this also crashes (the “double fault”) because it accesses another chunk of memory it is not allowed to (or that doesn’t exist)
  • And finally, the code in the kernel that handles this (because yes, we do try to handle it - by dropping to the kernel debugger) also crashes (“triple fault”). In this case there is not much we can do, but reboot the computer.
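To make the escalation above concrete, here is a toy model of it in Python. It is only a sketch of the three steps as described in this post, not Haiku kernel code: the function name `classify_fault` and its two boolean parameters are made up for illustration, and real CPUs decide this in hardware, not with two flags.

```python
def classify_fault(first_handler_ok: bool, debugger_entry_ok: bool) -> str:
    """Toy model of the fault escalation described above.

    first_handler_ok:   the kernel code handling the original fault runs fine
    debugger_entry_ok:  the code handling a double fault (dropping to the
                        kernel debugger) runs fine
    Both parameters are hypothetical; they only illustrate the three steps.
    """
    if first_handler_ok:
        # Fault 1 only: the normal case (e.g. virtual memory), nothing is lost.
        return "handled"
    if debugger_entry_ok:
        # Double fault: the first handler crashed too, but we still reach
        # the kernel debugger and can look around.
        return "kernel debugger"
    # Triple fault: even the double-fault handling crashed; the machine resets.
    return "reboot"


if __name__ == "__main__":
    for flags in [(True, True), (False, True), (False, False)]:
        print(flags, "->", classify_fault(*flags))
```

The last case is the one being reported in this thread: nothing is left running that could print a message, so all you see is the reboot.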

While simple “faults” actually happen all the time (despite their name, they are used, for example, to implement virtual memory, and it’s a perfectly fine thing to trigger a “fault” in that case), a double fault is something that should not happen. A triple fault, even less so. What happens is that the memory or some CPU register was somehow corrupted or set to an incorrect value, and the code is completely confused by it. Since it happens only on some machines, we first suspect a hardware problem: the memory could be losing bits, or the CPU overheating, leading to such problems. But the fact that the machine runs stably in other OSes, and the fact that multiple AMD Bulldozer machines all have the same problem, points rather to a compatibility issue between AMD CPUs or chipsets and Haiku.

It is hard to debug, because the machine just reboots. If at least we could get to a kernel debugger prompt, we could try to look around for a common pattern in all these crashes, see what’s wrong, and then try to understand why this hardware isn’t behaving as we expect and what we did miss in the code. But with the machine just rebooting, we’re left with very few clues. And that’s why the issue has been around and unsolved for so long.

Thank you so much for that insightful explanation. At least now we know that it’s not for a lack of trying, but something that “mysterious” is very difficult to nail down.

At least for me, it only happens during Jam and only when I have more than two cores active. I had all cores/threads deactivated (via Pulse) except 1 and 2, but when I activated core/thread #3, it crashed while it was Jam’ing.

Since Haiku doesn’t blip like this normally, what does Jam do that might be in conflict with another Haiku process? In other words, is there anything Haiku does (background tasks) that might intersect the same memory address or something that Jam is doing, that is so unexpected, the crash totally blows up and leaves nothing in its wake (spontaneous reboot)?

Is the design of the Vishera CPU such that Jam is trying to tell it to do something that it balks at with more than two cores/threads active? Is the Vishera CPU a true 8-core processor or a 4-core/8-thread processor? Seems I can’t get a straight story on that. Are only Bulldozer/Vishera processors affected, or are Athlon II/Phenom II processors affected too?

Does this issue rear its head if you build Haiku in Ubuntu or similar? If no one has tried, I will. But I won’t have easy access to deactivating cores/threads, like in Haiku, that I’m aware of.
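For what it’s worth, Linux does let you take cores offline at runtime by writing `0` to `/sys/devices/system/cpu/cpuN/online` as root (cpu0 often cannot be switched off). Below is a rough Python sketch of how that could be scripted for this kind of test. The helper names `cpu_online_file` and `offline_commands` are made up for illustration, the script only prints the commands rather than running them, and it has not been tried on the FX system discussed here.

```python
def cpu_online_file(n: int) -> str:
    """Path of the Linux sysfs switch for logical CPU n."""
    return f"/sys/devices/system/cpu/cpu{n}/online"


def offline_commands(total_cpus: int, keep: int) -> list:
    """Shell commands (to be run as root) that leave only the first `keep`
    logical CPUs online, switching off the highest-numbered ones first.
    Hypothetical helper: it builds the commands but does not execute them."""
    if not 1 <= keep <= total_cpus:
        raise ValueError("keep must be between 1 and total_cpus")
    return [
        f"echo 0 > {cpu_online_file(n)}"
        for n in range(total_cpus - 1, keep - 1, -1)
    ]


if __name__ == "__main__":
    # Example: an 8-thread CPU reduced to 2 threads, as in the Haiku test.
    for command in offline_commands(total_cpus=8, keep=2):
        print(command)
```

Writing `1` to the same file brings a core back online, so the same build could be repeated with 2, 3, 4… cores, as was done under Haiku.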

Lastly, are Ryzen 7 systems unaffected, while Jam’ing Haiku x86_64 within Haiku x86_64? Seems maybe a “hardware compatibility list” might be wise, for developers, too! :smiley:

How about adding a warning for Bulldozer/Vishera CPU-based systems, to use only 2 cores when Jam’ing Haiku within Haiku? Is this a universal fault in 32-bit Haiku as well as 64-bit Haiku?

If you have the hardware and enough time, you can easily get answers to some of your questions yourself.