Unlocking BLocker with sem 131860 from wrong thread 2828, current holder 2916 (see issue #6400).
So I’m seeing this in one of my apps. I’d like to break in the debugger when that event happens so I can see the stack, because at the moment I have no idea what code is triggering it.
It would be nice if the BLocker class had a function whose only job is to print that message. That would make breaking on the message very easy, even without debug symbols, because the debugger still shows function names. In fact, that would be a good pattern to apply across all similar error/warning messages.
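To make the suggestion concrete, here is a minimal sketch of that pattern. Both `debug_report_wrong_unlock` and `checked_unlock` are hypothetical names made up for illustration; nothing like them is claimed to exist in libbe.so. The `noinline` attribute is GCC/Clang syntax, used here so the function keeps its own symbol.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper, not part of the real BLocker API. Every "wrong
// thread" report goes through this one exported, non-inlined function,
// so a debugger can break on its name even without debug info.
extern "C" __attribute__((noinline))
int debug_report_wrong_unlock(long sem, long caller, long holder)
{
    return std::fprintf(stderr,
        "Unlocking BLocker with sem %ld from wrong thread %ld, "
        "current holder %ld\n", sem, caller, holder);
}

// Toy stand-in for the ownership check in an unlock path (assumption:
// the real check is more involved than a single comparison).
bool checked_unlock(long sem, long caller, long holder)
{
    if (caller != holder) {
        debug_report_wrong_unlock(sem, caller, holder);
        return false;
    }
    return true;
}
```

With something like this in place, a breakpoint on the function name `debug_report_wrong_unlock` would stop exactly when the message is printed, with the offending caller one frame up.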
Btw that semid doesn’t exist if I open up the process in something like SystemManager. So I don’t even know what it relates to.
I’ve successfully built an instrumented libbe.so, but when I load my application in the debugger it still uses the /system/boot/lib version rather than the libbe.so in the same folder as my executable. Is there some other step I missed?
Seems like they are the same type of binary:
~/code/lgi/trunk/lvc> file libbe.so
libbe.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
~/code/lgi/trunk/lvc> file lvc
lvc: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, with debug_info, not stripped
There isn’t an “ldd” or “otool” to check dynamic lib dependencies?
But that will not tell you where the lib will be searched for, just the library name.
For library search paths, check the LIBRARY_PATH environment variable, which you can adjust if needed to force loading a specific library (or just use one of the paths that are already there).
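As a small illustration of inspecting that variable from inside a program, here is a sketch that splits a colon-separated search path into its entries. The helper names `split_search_path` and `library_search_path` are made up for this example. It is an assumption on my part that the default value contains entries such as `%A/lib` (with `%A` standing for the application's directory); check the output of `echo $LIBRARY_PATH` on your own system rather than trusting that.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Split a colon-separated search path into its non-empty entries.
std::vector<std::string> split_search_path(const std::string& value)
{
    std::vector<std::string> entries;
    std::string::size_type start = 0;
    while (start <= value.size()) {
        std::string::size_type end = value.find(':', start);
        if (end == std::string::npos)
            end = value.size();
        if (end > start)
            entries.push_back(value.substr(start, end - start));
        start = end + 1;
    }
    return entries;
}

// Entries of LIBRARY_PATH as seen by this process (empty if unset).
std::vector<std::string> library_search_path()
{
    const char* value = std::getenv("LIBRARY_PATH");
    return split_search_path(value != nullptr ? value : "");
}
```

Printing the result of `library_search_path()` at startup shows which folders are candidates for a replacement libbe.so.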
Ok, now I have it in the right folder. But the program exits almost immediately without printing much. Which is not the same behavior as the system libbe.so. I suspect I need to backdate the haiku checkout to be compatible with my system. e.g.
That seems super recent. So there should be no mismatch with the current HEAD of the git repo… weird.
I wonder if I can get debug symbols for the build of Haiku I’m running? Is a debug build of Haiku a thing?
Also: I read through the assembly of BLocker::Unlock (of the system libbe.so), comparing it to the C++, and found a good point to put my breakpoint despite not having the symbols. So I now have an accurate stack trace of where the rogue unlock is called from. GG
My working theory is that the “system” BFont I created in the startup thread, which is shared across many threads, is being used incorrectly. In single-threaded GUIs like those on Linux and Windows it’s fine. Haiku… not so much.
My code is somehow triggering this error message on the sLock in src\kits\app\AppServerLink.cpp.
In the context of calling BFont::GetStringWidths (src\kits\interface\Font.cpp).
BPrivate::AppServerLink is constructed (and therefore locked) and destructed in the same function call, so also on the same thread. How could it possibly have a mismatched lock/unlock thread error in ~AppServerLink?
R1/beta4 is on a branch (called r1beta4.) You will note the current HEAD of the git repo is hrev57259, quite far ahead of where you are. So, either switch to a nightly build, or check out the r1beta4 branch.
It’s something to do with the subprocesses. This particular app spawns some subprocesses to read the versions of the various VCSs installed (hg, git, svn etc.) as part of its startup. It’s a front end for version control. Anyway, if I disable the subprocesses it all starts working normally.
The code that runs subprocesses is just your standard fork and execve that works great under linux and macosx. Although it does make pipes to talk to the stdout and stdin of the subprocess. Something about calling that messes up the parent process. Does the child inherit something it shouldn’t?
Is there a good example of running a subprocess under Haiku without trashing the parent process?
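For reference, this is roughly the shape of the fork/exec/pipe code being discussed, written as a minimal POSIX sketch. `run_and_capture` is a made-up name, not the app's actual code and not a Haiku API. One precaution worth checking in any such code: if exec fails, the child should leave via `_exit` rather than `exit` or a normal return, so the forked copy does not run the parent's atexit handlers and static destructors. Whether that has anything to do with the problem here is only a guess.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run argv[0] (looked up via PATH), capture its stdout into `output`,
// and return its exit status, or -1 on failure.
int run_and_capture(char* const argv[], std::string& output)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        // Child: stick to simple system calls until exec replaces us.
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127); // exec failed; skip atexit handlers and destructors
    }

    // Parent: close the write end so read() sees EOF when the child exits.
    close(fds[1]);
    char buffer[256];
    ssize_t count;
    while ((count = read(fds[0], buffer, sizeof(buffer))) > 0)
        output.append(buffer, static_cast<size_t>(count));
    close(fds[0]);

    // Reap the child so it does not linger in the process list.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call would look like passing `{"git", "--version", nullptr}` as a writable `argv` array and reading the version string out of `output`.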
It’s not always in the same place… but this is one example. It’s always in ~AppServerLink though. And it happens sometime after the subprocesses have all run and exited. Although I do see those subprocesses sitting in the list of processes in the debugger. So maybe they aren’t being cleaned up properly?
I don’t waitpid on the child process in this case… do I HAVE to do that to make it actually finish up what it was doing?
You shouldn’t need to, no. The list of processes in the Debugger sometimes doesn’t update correctly, I think. Check with ProcessController to see if they’re really still around.
-1 indicates there’s no thread actually holding the BLocker. So this usually means “unlock without having been locked before.” This is pretty strange, especially if it’s the AppServerLink lock; there shouldn’t be a way we wind up in that state.
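To illustrate what “holder is -1” means, here is a toy owner-tracking lock. It is only a model with made-up names (`OwnedLock`, `kNoHolder`); the real BLocker is semaphore based and differs in detail. The point is just that an unlock with no preceding lock finds the “nobody” value in the holder field.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>

// Sentinel meaning "no thread holds the lock" (assumption for this toy).
constexpr long kNoHolder = -1;

// Toy non-recursive try-lock that remembers which thread id holds it.
class OwnedLock {
public:
    // Succeeds only if nobody holds the lock.
    bool Lock(long thread)
    {
        long expected = kNoHolder;
        return fHolder.compare_exchange_strong(expected, thread);
    }

    // Fails (and reports) unless `thread` is the current holder.
    bool Unlock(long thread)
    {
        long holder = fHolder.load();
        if (holder != thread) {
            std::fprintf(stderr,
                "unlock from wrong thread %ld, current holder %ld\n",
                thread, holder);
            return false;
        }
        fHolder.store(kNoHolder);
        return true;
    }

    long Holder() const { return fHolder.load(); }

private:
    std::atomic<long> fHolder{kNoHolder};
};
```

Calling `Unlock(2828)` on a fresh `OwnedLock` fails and reports holder -1, which is the “unlock without having been locked before” case.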
Given that other processes are involved, I guess there’s some chance this is memory corruption? But if the memory was modified, this should have resulted in 0’ing the lock, not -1 in one specific field. Very strange.
If you have steps to reproduce and open a ticket, I may be able to look into it next week (or perhaps someone else will beat me to it.)
My working theory is that between the fork and the execve there are 2 copies of everything? And are they both running? And some counting of locking / unlocking gets messed up? It’s only a short period of time… but nonzero. I admit I don’t know enough about the fork implementation to really have a stab at understanding the issue.
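On the “2 copies” question, POSIX at least is clear: after fork both processes run, each with its own copy of the address space, and the child contains only a replica of the thread that called fork. How closely Haiku's fork matches this in every detail is something I have not verified. A minimal sketch of the copy semantics, with the made-up name `parent_value_after_child_write`:

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork, let the child overwrite a variable, and return the value the
// parent still sees afterwards (or -1 if fork/waitpid fails).
int parent_value_after_child_write()
{
    int counter = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        counter = 42; // changes the child's private copy only
        _exit(counter == 42 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return counter; // the parent's copy is untouched
}
```

The same applies to in-memory lock state: the child gets a snapshot of it as of the fork, so anything in the child that touches those locks before exec is working on a private copy, possibly one that records a holder thread that does not exist in the child.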