Clang with GCC2 ABI

Recently, I’ve been quite pissed off with the dumb errors related to the legacy GCC 2.95 codebase and the hacks required to get it building again.

And then I wondered: what if we could plug a GCC2 ABI backend into a modern compiler codebase? And, as a bonus, get a modern frontend with all the latest language features from C++11 and beyond?

After a weekend of pointing AI agents at llvm/llvm-project, I now have a copy of clang that can generate simple “Hello World” binaries that link with GCC 2.95 objects. There is primitive support for mangling, RTTI, vtables, DWARF2 exception handling, and runtime functions. I’m still looping that agent to fuzz for ABI mismatches. Every invocation seems to find something, but the problems are getting less and less serious.
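To give a feel for what "mangling support" means here, below is a toy sketch (not code from the branch) that produces both the GCC 2.x style and the Itanium style symbol names for a tiny subset of declarations: non-template functions taking a few builtin types. The helper names (type_code, mangle_gcc2, mangle_itanium) are made up for illustration, and the sketch ignores qualifiers, templates, operators, nested classes, and repeated-type compression.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builtin type codes that happen to be the same letter in both schemes.
// Anything not listed here is outside the scope of this sketch.
inline std::string type_code(const std::string& type) {
    if (type == "void")   return "v";
    if (type == "char")   return "c";
    if (type == "int")    return "i";
    if (type == "double") return "d";
    return "?";  // unsupported in this toy
}

// Concatenate the parameter codes; an empty list is encoded like (void).
inline std::string param_codes(const std::vector<std::string>& params) {
    if (params.empty()) return "v";
    std::string out;
    for (std::size_t i = 0; i < params.size(); ++i) out += type_code(params[i]);
    return out;
}

// Both schemes write identifiers as <length><name>, e.g. "3Foo".
inline std::string length_prefixed(const std::string& name) {
    return std::to_string(name.size()) + name;
}

// GCC 2.x style: function name first, then "__", then "F" for a free
// function or the length-prefixed class name, then the parameter codes.
inline std::string mangle_gcc2(const std::string& cls, const std::string& fn,
                               const std::vector<std::string>& params) {
    std::string scope = cls.empty() ? std::string("F") : length_prefixed(cls);
    return fn + "__" + scope + param_codes(params);
}

// Itanium style: "_Z" prefix, length-prefixed names, and member functions
// wrapped in N...E as a nested name.
inline std::string mangle_itanium(const std::string& cls, const std::string& fn,
                                  const std::vector<std::string>& params) {
    std::string name = cls.empty()
        ? length_prefixed(fn)
        : std::string("N") + length_prefixed(cls) + length_prefixed(fn) + "E";
    return "_Z" + name + param_codes(params);
}
```

For a member function Foo::bar(int), the first helper yields bar__3Fooi and the second yields _ZN3Foo3barEi, which is why objects built with the two ABIs cannot resolve each other's C++ symbols.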

I’m currently testing with i386-linux-gnu, using the custom flag -fc++-abi=gcc2. The ABI is not restricted to any particular target, so theoretically, if pointed at the correct includes, libraries, and start files, this may work for x86_gcc2 Haiku.

Below is the link to the branch if you want to try it. Be creative and throw in some of the weirdest C++98 code imaginable! Please let me know if you find any ABI bugs; it will help my agent a lot!
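As a starting point for that kind of testing, here is a small sketch of a C++98-style test case that touches several of the ABI areas mentioned above: vtables with virtual inheritance, RTTI via dynamic_cast, and exceptions caught by base reference. All class and function names are invented for this example, and it deliberately avoids standard library headers. Ideally half of such a test would be compiled by GCC 2.95 and the other half by the patched clang, then linked together.

```cpp
// Diamond with virtual inheritance: exercises vtable layout, virtual base
// offsets, and this-pointer adjustment.
struct Shape {
    int id;
    Shape() : id(7) {}
    virtual ~Shape() {}
    virtual int sides() const { return 0; }
};

struct Named : virtual Shape {
    virtual const char* name() const { return "named"; }
};

struct Polygon : virtual Shape {
    virtual int sides() const { return 4; }
};

struct Square : Named, Polygon {
    virtual const char* name() const { return "square"; }
};

// RTTI: cross-cast from one branch of the diamond to the other.
inline int cross_cast_sides(Named* n) {
    Polygon* p = dynamic_cast<Polygon*>(n);
    return p ? p->sides() : -1;
}

// Virtual dispatch through the virtual base.
inline int sides_via_base(const Shape& s) {
    return s.sides();
}

// Exceptions: throw a derived type, catch it by reference to its base.
struct BaseError {
    virtual ~BaseError() {}
    virtual int code() const { return 1; }
};

struct DerivedError : BaseError {
    virtual int code() const { return 2; }
};

inline int throw_and_catch() {
    try {
        throw DerivedError();
    } catch (const BaseError& e) {
        return e.code();  // virtual call on the caught exception object
    }
    return 0;  // not reached
}
```

With a correct ABI implementation, cross_cast_sides on a Square returns 4, sides_via_base returns 4, and throw_and_catch returns 2. A mismatch in any of these points at a specific area (RTTI, vtable layout, or unwinding) rather than a generic crash.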

CW: AI Slop.

The branch is currently 100% generated by AI. There may be egregious hacks that have not been cleaned up.

This branch is by no means ready for upstreaming, either to HaikuPorts or to llvm-project.

If you are anti-AI, you should also refrain from reading this code.

This discussion is not meant to be a debate on AI tools. Please keep the thread on-topic about the GCC2 ABI only.

CW: This code is not suitable for newbies or those who are easily disturbed.

If you cannot fix compilation errors without asking ChatGPT, this project is not for you.

If you are OK with all the above, click here for the link.

An RFC to the LLVM community: [RFC] Clang with GCC2 ABI - Clang Frontend - LLVM Discussion Forums

If we have their blessing, this thing may get serious.


If you look at the mailing list archive from about 20 years ago, you will find that idea mentioned already.

I’m not sure it’s worth the effort, but it might still be interesting to explore and play with it.

I would assume that the actual benefits to the codebase are rather small, though.


I got a somewhat working prototype, ready for testing. Any good C++ port suggestions (preferably apps) that have not been ported to x86_gcc2?


You could just look into the Haiku build system: for instance, the raw translator uses libraw.so using the newer ABI, but uses a very old manual dcraw port when using gcc 2.

I’m not sure if libraw.so uses C++ itself, though, but our translator does.

In any case, it would be nice to get rid of these build differences.


You could try Genio, but it has some dependencies that need to be compiled for x86_gcc2 too, such as scintilla and lsp_framework.


Thanks, this looks interesting.

If by libraw you mean this library:

LibRaw/LibRaw on GitHub (a library for reading RAW files from digital cameras),

then it does indeed use C++. It has a C API, but Haiku still uses the C++ API instead.

Fortunately, libraw also has CMake support, making cross-compilation a bit easier.

It should already be part of our build system for non-gcc 2.95 builds.

I’m a bit more interested in the ABI shim layer outlined on the LLVM forum. If we can actually do this (with a C compiler), that could open up more avenues: we would then be capable of using full modern C++, and not just old C++ with some GCC2 warts removed.

We don’t port anything to gcc2? The idea was to allow us to move away from the _gcc2 part, no?

It is kinda hard to do. You will need to write something that understands both the Itanium and the GCC2 ABIs. And something like that must live in the Haiku codebase (likely as a post-processing step in the build system after the system libs are compiled), where LLM tool usage is forbidden, so it requires much higher initial effort.

I do not think it is even possible to make some kind of bridge that would allow GCC 2 programs to directly call Itanium ABI Haiku libraries without the libraries being recompiled for the GCC 2 ABI. The memory layouts of the two ABIs are not compatible.

Such a thing is possible, for example, for Win16 ↔ Win32 thunking because Windows does not expose object internals: objects are referenced by handles and accessed through plain C functions. There is no direct access to object fields by memory offset, and no calling or overriding of vtable entries.
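The difference between the two API styles can be sketched as follows. This is a made-up example (CounterHandle, counter_create, and ExposedCounter are not real Windows or Haiku APIs): the handle-based half hides the object behind an integer and plain functions, so a thunk only has to translate calling conventions, while the class-based half bakes field offsets and vtable slots into every caller.

```cpp
#include <map>

// Handle-based style: callers only ever see an opaque integer.
typedef int CounterHandle;

namespace detail {
    struct Counter { int value; };  // layout is private to the library

    inline std::map<CounterHandle, Counter>& table() {
        static std::map<CounterHandle, Counter> t;
        return t;
    }
}

inline CounterHandle counter_create() {
    static CounterHandle next = 1;
    CounterHandle h = next++;
    detail::table()[h].value = 0;
    return h;
}

// Returns false for an unknown handle instead of touching memory.
inline bool counter_increment(CounterHandle h) {
    std::map<CounterHandle, detail::Counter>::iterator it = detail::table().find(h);
    if (it == detail::table().end()) return false;
    ++it->second.value;
    return true;
}

// Returns -1 for an unknown handle.
inline int counter_get(CounterHandle h) {
    std::map<CounterHandle, detail::Counter>::iterator it = detail::table().find(h);
    return it == detail::table().end() ? -1 : it->second.value;
}

// Class-based style, as used by C++ APIs such as the Be API: callers
// compile in the offset of `value`, the vtable slot of increment(), and the
// object size. A subclass in application code also emits its own vtable in
// the caller's ABI, which the library then calls into.
struct ExposedCounter {
    int value;
    ExposedCounter() : value(0) {}
    virtual ~ExposedCounter() {}
    virtual void increment() { ++value; }
};
```

Nothing in the first half depends on how the library's compiler lays out objects, which is the property the thunking approach relies on and the second half lacks.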


LLM usage didn’t make sense for this before either. I can agree that it feels a lot like boilerplate code, even if complex boilerplate. But that is exactly a use case where we would want deterministic code generation, and that just isn’t something an LLM is capable of.

Yes this would probably be more effort than your current trajectory, but the effort may pay off for better gains in the future :wink:


I agree with you on this. I do not want LLMs to generate the marshalling code in such a scenario - I’d use them to bootstrap the code generator itself.

What I have in mind for this is quite similar to the .NET bindings generator, except that unlike Itanium-to-C#, there is no Itanium-to-GCC2 generator out there. We would still have to write a GCC2 emitter.

Which brings us back to the point: such a code generator would need to live in the Haiku source tree as a post-processing step.


Nice idea, btw

Got the first ever Clang-built binary running on x86_gcc2 Haiku without setarch.

It was cross-compiled from Linux using this toolchain.

To get anywhere further, we will need to have a modern C++ library port with the GCC2 ABI. I will build libc++ using the custom clang soon.


Why’s that? Can’t we just build Clang for the x86 architecture and have it use the modern GCC ABI, but have its default target triple use the GCC2 ABI?

The problem is not Clang itself. In fact, I am very close to bootstrapping Clang on i386-linux with -fc++-abi=gcc2.

The problem is that Clang (or any modern C++98-compliant compiler) refuses to parse the GCC2-era libstdc++ headers. Right now, pure Be API programs work just fine, but most C++ codebases in the wild do use the standard library.

Therefore, to get any further than simple Be API demonstrations, there are two ways forward. Either:

  1. Update the C++ library headers for Haiku at headers/cpp to be C++98 compliant, or
  2. Port a modern C++ library to the GCC2 land.

I remember there being concerns about having multiple C++ libraries in the same process (and therefore the libc++ port was blocked from Haiku). However, as long as they link to the same ABI runtime functions (libgcc), these should be fine. The symbols are safeguarded from clashing using versioned namespaces (e.g. std::__cxx11 for modern libstdc++, std::__1 for libc++).
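The versioned-namespace trick can be sketched with a toy example. The names here (toy, abi_a, abi_b, text, which) are invented, and one translation unit stands in for what would really be two separate libraries. Because the namespace is part of each type's name, the two text types are distinct and their functions mangle to different symbols, so both can be loaded into one process without clashing.

```cpp
#include <type_traits>

namespace toy {
    // Library A makes its versioned namespace inline, so users can simply
    // write toy::text and get toy::abi_a::text.
    inline namespace abi_a {
        struct text { int length; };
    }

    // Library B would do the same in its own headers. It is left non-inline
    // here only so that both can coexist in this single example file.
    namespace abi_b {
        struct text { int length; int capacity; };
    }
}

// Same spelling, different types: these are two separate overloads with
// two different mangled names.
inline int which(const toy::abi_a::text&) { return 1; }
inline int which(const toy::abi_b::text&) { return 2; }

static_assert(std::is_same<toy::text, toy::abi_a::text>::value,
              "the inline namespace is transparent to users of library A");
static_assert(!std::is_same<toy::abi_a::text, toy::abi_b::text>::value,
              "the two libraries' types never collide");
```

This only keeps the libraries' own symbols apart. Passing a toy::abi_a::text to code expecting a toy::abi_b::text is still a compile error, which is the desired outcome.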

Unless it proves impossible, I think this is the route we should take. If the “GCC2 ABI” Clang can’t compile things that use the STL, it’s not very useful.

I don’t think it is impossible - we can basically just spam "template" and "this->" at every compile error we see.
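For readers unfamiliar with why those two keywords keep coming up: GCC 2.95 did not implement two-phase name lookup, so headers written for it often refer to members of a dependent base class without qualification. A standard-conforming compiler rejects that. The sketch below uses made-up class names and shows the typical fixes, with the pre-standard spelling in comments.

```cpp
template <typename T>
struct Base {
    typedef T value_type;
    T value;

    explicit Base(T v) : value(v) {}

    template <typename U>
    U convert() const { return static_cast<U>(value); }
};

template <typename T>
struct Derived : Base<T> {
    explicit Derived(T v) : Base<T>(v) {}

    // Old code: "value_type get() const". The base depends on T, so the
    // type must be qualified and marked with 'typename'.
    typename Base<T>::value_type get() const {
        // Old code: "return value;". Unqualified names are not looked up in
        // dependent bases; 'this->' defers the lookup to instantiation time.
        return this->value;
    }

    double as_double() const {
        // Old code: "return convert<double>();". Besides 'this->', the
        // 'template' keyword tells the parser that '<' starts a template
        // argument list rather than a less-than comparison.
        return this->template convert<double>();
    }
};
```

The fixes are mechanical and remain valid C++98, so patched headers of this kind should still be accepted by GCC 2.95 as well, though that would need to be verified against the actual headers.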

Whether Haiku is willing to review and accept such patches is another story.

There is no STL in the kernel or system libraries, is there?

For new ports, I see no reason why they should be trapped in the ancient libstdc++.r4.so.