TESTERS WANTED: Kernel Guarded Heap

Back in the October activity report, I detailed the rewritten kernel guarded heap, an alternative malloc implementation used to find memory bugs. Well, as of hrev59534 (yesterday), I’ve refactored the kernel to support being built with multiple heap implementations and choosing one at boot time, and so the guarded heap is now included in the nightly builds by default (though it’s not active by default, of course.)

So, now that it’s available for all users without needing to make a custom build, it’s time for anyone who can to test and see if we can uncover any bugs!

The procedure for testing is pretty simple. You can enable this either via the bootloader (tricky, but safe) or via the kernel settings file (easier, but less safe: if things break, you’ll have to undo this using the bootloader advanced options menu).

First, the bootloader mechanism:

  1. Get to the bootloader menu, as normal (spam the spacebar, or hold SHIFT if you’re on legacy BIOS)
  2. Choose “Select debug options”, and then “Add advanced debug option”
  3. Enter the following:
kernel_malloc guarded
  4. Boot as normal.

Alternatively, the kernel settings file mechanism:

  1. Open the kernel settings file (~/config/settings/kernel/drivers/kernel)
  2. Add the line from step 3 above.
  3. Reboot.
  • If your system won’t boot under this configuration, you’ll need to force the usage of the regular heap. Follow steps 1-2 from the bootloader mechanism, and then enter the option:
kernel_malloc slab

Once booted, you’ll need to edit the kernel settings file to remove the lines you’ve added.
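
For anyone who would rather script the settings-file edit than do it by hand, here is a minimal sketch. It assumes the settings file is plain text with one option per line, as the steps above suggest. The option name and values (`kernel_malloc`, `guarded`) come from this post; the helper names are hypothetical and not part of any Haiku tool.

```python
# Sketch only: toggles the heap-selection line in kernel settings text.
# Helper names are made up for illustration; the option name is from the post.

OPTION = "kernel_malloc"


def _is_option(line: str) -> bool:
    """True if the line is an active (uncommented) kernel_malloc entry."""
    parts = line.split()
    return bool(parts) and parts[0] == OPTION


def enable_guarded_heap(settings_text: str) -> str:
    """Return the text with exactly one active 'kernel_malloc guarded' line."""
    lines = [l for l in settings_text.splitlines() if not _is_option(l)]
    lines.append(f"{OPTION} guarded")
    return "\n".join(lines) + "\n"


def disable_guarded_heap(settings_text: str) -> str:
    """Return the text with every active kernel_malloc line removed."""
    lines = [l for l in settings_text.splitlines() if not _is_option(l)]
    return "\n".join(lines) + "\n" if lines else ""
```

Both helpers work on strings, so you would read ~/config/settings/kernel/drivers/kernel, pass its contents through one of them, and write the result back. Commented-out lines are left alone.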

You can confirm the kernel really is using the guarded heap by checking the syslog. It should contain the following lines (early in the boot):

kernel malloc: using guarded_heap
guarded heap settings: R
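
If you want to check this from a script rather than by eye, a rough sketch follows. It only looks for the "kernel malloc:" text quoted in this thread; the function names are hypothetical, and the syslog location (commonly /var/log/syslog) is an assumption you should verify on your own system. Since the syslog can cover more than one boot, the sketch reports the last matching line.

```python
import re

# Sketch only: finds which heap the most recent boot reported in syslog text.
# The "kernel malloc:" prefix is taken from the lines quoted in this thread.


def detect_kernel_heap(syslog_text: str):
    """Return the text after 'kernel malloc:' on the last matching line, or None."""
    found = None
    for line in syslog_text.splitlines():
        match = re.search(r"kernel malloc:\s*(.+)", line)
        if match:
            found = match.group(1).strip()
    return found


def is_guarded(syslog_text: str) -> bool:
    """True if the last 'kernel malloc:' line mentions the guarded heap."""
    heap = detect_kernel_heap(syslog_text)
    return heap is not None and "guarded" in heap
```

If `detect_kernel_heap` returns None, no such line was found at all, which may mean the build predates the change.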

After you’ve tested with just the basic guarded_heap, users on 64-bit can also test with the guarded heap in “memory reuse disabled” mode (which is even more effective at finding use-after-frees). To do that, you need two options, not just one:

kernel_malloc guarded
guarded_heap_options r

If you encounter any KDLs when running this way, please open ticket(s) with them as normal. The one exception is “out of virtual memory”, especially when running with memory reuse disabled for a long time; this is expected due to how much memory is wasted. (Or if you have existing KDL tickets that this changes the behavior of, upload the image to the existing ticket.) If you have any questions or comments, feel free to post or ask them here. Thanks!

Testing as we speak. So far so good. Is it slightly slower, or is that just my imagination?

Do you have to enter the kernel boot parameters every time you reboot?

It’s measurably much slower, and uses vastly more memory (dozens to hundreds of times as much in certain areas). That’s why this is an option to test with, and not enabled by default!

If you use the “kernel settings file mechanism”, then no. But if you use that method, don’t forget about it in there; after your testing is done, you’ll want to delete it from there.

Just tested with:

kernel_malloc guarded
guarded_heap_options r

current system x86_64 hrev59535

Off the bat, I could tell the UI was struggling, but not so badly that I couldn’t run apps. I tried Iceweasel, and the first thing I noticed was that it did not show the proper bookmark tabs as normal. Plus, it was definitely struggling to open sites. Eventually it would, but it took a while.

After running Iceweasel, weechat, and my music app for about 10 or 15 minutes, the UI really started struggling. Tracker was having a rough time rendering directories.

No KDL, but when I removed the kernel flags and went back to the default, the shutdown took a long time.

Here is a snippet of the last odd lines from the syslog:

KERN: hda: stream buffer not completed (id: 1, status 0x8)
KERN: hda: stream buffer not completed (id: 1, status 0x28)
Last message repeated 3 times
KERN: bfs: KERN: Could not find value in index “size”!
KERN: bfs: Could not find value in index “last_modified”!
KERN: bfs: Could not find value in index “size”!
KERN: bfs: Could not find value in index “last_modified”!
KERN: bfs: Could not find value in index “size”!
KERN: bfs: Could not find value in index “last_modified”!
KERN: bfs: Could not find value in index “size”!
KERN: bfs: Could not find value in index “last_modified”!

Happy to help!

The “hda” lines I think are somewhat ordinary. The BFS lines are possibly due to old filesystem corruption, or maybe if you deleted and re-created indexes they might not have certain files in them. I have seen those messages on various systems without the guarded heap, so I think that’s unrelated.

Thanks for testing!

Are there new debug options enabled by default since incorporating the guarded heap into the nightlies? Latest x64 nightly seems a little sluggish even when I didn’t enable the kernel tests at boot time.

Redrawing of the screen leaves trails when I move something around, like Terminal.

Looks like your Tracker crashed?

No, I was moving the Terminal window around while pressing the print screen button. It’s an action shot :wink:

There aren’t new debug options enabled by default, but I found a potential cause for this regression and fixed it in hrev59543, so see if that improves things.

FWIW, I have nothing to report. :slight_smile:
I’ve been using the kernel guarded heap for some hours, did some heavy compiling jobs on my 8-core CPU, but haven’t triggered a KDL. I’ve now reverted back to avoid the performance penalty in my daily use.

When do you recommend activating this setting? When some nightly has a regression that is a bit unstable and throws you in KDLand more often?

It’s mostly going to be useful in specific circumstances that are suspected to be memory-related. There are a number of criteria to diagnose that from a KDL alone, but unfortunately some are quite technical. The easiest one to look for is if you see 0xdeadbeef anywhere in the KDL output. However, sometimes that shows up on caches besides the kernel malloc, so activating the guarded heap alone won’t help in that case…
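
As a rough illustration of that one easy check, the sketch below scans saved KDL output for the marker value. It assumes you have the KDL text available as a string (for example, copied from the syslog); the function name is hypothetical. It matches "deadbeef" with or without the 0x prefix, which is slightly broader than the rule stated above. A hit is only a hint that the problem may be memory-related, not a diagnosis.

```python
# Sketch only: lists the lines of KDL output that contain the 0xdeadbeef marker.
# Matching is case-insensitive and does not require the "0x" prefix (an assumption).


def deadbeef_lines(kdl_text: str) -> list:
    """Return every line of the KDL output that mentions 'deadbeef'."""
    return [line for line in kdl_text.splitlines() if "deadbeef" in line.lower()]
```

An empty list means the marker was not found, in which case the other, more technical criteria would be needed.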

Overall, it’s probably best to try it in a specific case only after a developer suggests it specifically. It won’t hurt to try it at other times, but it’ll probably just be a waste of time.

Testing it on 32-bit. I added it in both the boot debug options and the kernel settings file, but I don’t see any evidence in the syslog that the new setting is being used (although I feel the overall performance is a bit degraded):

You should see kernel malloc: slab at least if the guarded heap isn’t active. This comes after the “Welcome to kernel debugger output!” line, so it shouldn’t be getting lost… What hrev are you on?

Shame on me: I was on hrev59498… forget my previous message.

@waddlesplash I tested on recent hrev (32 bit), but I consistently get the following crash:

I tried restarting three times using the new configuration, but none of the attempts were successful. There are three stack traces (the three look similar, but just in case I will copy all of them).

Which configuration was this: did you use guarded_heap_options r, or not?

No, I was using the standard configuration (I just added the “kernel_malloc slab” line to the kernel config file).

I guess you mean this configuration, right? Sorry, I thought it was only for the 64-bit version (I am testing on 32-bit):

You are definitely not using kernel_malloc slab, the panic messages indicate the guarded heap is in use.

Yes, it’s only for 64-bit. But I guess the guarded heap may waste too much virtual memory for you on 32-bit also, so it can’t really be used there either. Oh well.

Do you mean that maybe I did not apply the configuration entry in the kernel file correctly?

Let me try again. I will try to collect some evidence to confirm if the new parameter is in use.