[GSoC 2024] Fixing IPC in WebKit | Haiku Project

WebKit is split into several processes. One of these processes is the browser itself. In my case, this is MiniBrowser, but, hopefully, in the future, it will be WebPositive. Since browsers can have any name, WebKit refers to this process as the UIProcess. And, indeed, that process is mainly responsible for the UI. Our port will also be using two other processes: NetworkProcess and WebProcess. Unsurprisingly, NetworkProcess does the networking. WebProcess does all of the work associated with a single web page. For example, it is responsible for running JavaScript and does most of the work for rendering the web page. There is one WebProcess for each web page.


This is a companion discussion topic for the original entry at https://www.haiku-os.org/blog/zardshard/2024-06-28_gsoc_2024_fixing_ipc_in_webkit

Great progress! Keep up the good work! :+1:


And now I run into the problem PulkoMandy has been telling me I might encounter. Currently, WebKit’s connection creation code goes something like this:

  1. Give me a connection (in our case, a BMessenger)
  2. Thanks! I’ll send that off to the other process.
  3. Ok, now create the BLooper and BHandler to receive those messages.

No, we can’t make a BMessenger to a currently non-existent BLooper. WebKit has pretty much ordered the steps in the worst possible way for us! We have to create a BMessenger before knowing which BLooper is supposed to receive its messages.

Now I see why Rajagopalan had such a complex solution in the GSoC 2019 project. It seems I’ll have to do one of the following:

  1. Have a similarly complex solution
  2. Modify WebKit’s code
  3. Create a new way of messaging in Haiku

Is modifying WebKit code problematic? How often do the relevant parts of the WebKit code change?

Often. If we edit shared code, we need to maintain it very often, or we upstream it. If we do, there should be a good reason, since WebKit doesn’t maintain our port. :)

In this specific case, I’m not sure it changes all that often. And that would be the cleanest solution. Maybe we should ask WebKit developers about this as well; they may be willing to consider changes as long as it still allows the other platforms to work.

Well, the bad news is that the three-step process is repeated in multiple places throughout the code and involves several classes, so modifying WebKit may not be the best solution.

How viable is option 3? Not the quickest solution to develop probably, but maybe the ‘best’?

Actually, I was investigating whether we can use specifiers, that little-used scripting feature. So we might be able to use a fancy existing feature of messaging :slight_smile:

But if that doesn’t work out, option 3 would look like

  1. Give me a connection – we create a new port
  2. Thanks! I’ll send that off to the other process. – it sends the port number to the other process
  3. Ok, now create the BLooper and BHandler to receive those messages – we create a new BLooper that reads messages sent to that port.

This would require us to do two or so things:

  1. Make a private BLooper method public.
    We need BLooper(int32 priority, port_id port, const char* name), which creates a BLooper that reads from an existing port.
  2. Make a way to create a BMessenger that points directly to a port.

Maybe it makes sense to create blooperweb, bmessengerweb, etc. classes in the API: just copy-paste BLooper and extend the code there for the requirements of WebKit.

I may be saying something silly, but
what if you could create something like
a two-way BMessenger and call it something different, e.g. BChannelLine or BChannelQueue?
This way the connection could be established between processes in a general way.
What will actually happen over this two-way channel, that is, what kind of data will go back and forth, could be specified later. This way the connection can be established without knowing in advance what should be processed.
It’s just a live connection.

If that consumes too many computation cycles … then how about something like a telephone exchange?

Mr UI actually plays the telephone exchange itself.

Mr WebX is a client with a specific, known line number (port), but
there can be several of them, so their numbers (ports) will differ only if more than one is present at the same time.

[you can specify a range for him, with one port per page]

Mr Net is a special client: he has multiple numbers so that he is easier to reach, since he provides a service.

[you can reserve one or two lines (ports) to communicate with Mr UI]
and
[you can also specify a range for him, adding one port per page requested by Mr WebX, and releasing the port when Mr WebX hangs up]

This way it is always known what kind of communication (IPC) will happen on a line, between two ports that are known and reserved for a specific purpose.

I hope it is clear what I mean.

Yes, I was thinking about that option, it may be worth exploring.

Why? The goal of developing WebKit as a native browser (not Qt based or the like) is not only to have a web browser for Haiku, but also to improve Haiku APIs as needed to support more use cases and complex applications.

Also, the goal of the Haiku project is to be in control of the whole codebase, and if something needs to be changed in the core of the OS to support an upper layer part of the code, we can do it. It avoids the “lasagna code” architecture where each layer is very isolated from the other ones, and people re-implement things from other layers just to make slight changes, leading to more code replication.

Let’s implement this in Haiku if we need it, and then other apps may also benefit from it :slight_smile:


Professionally, I had to embed Chromium Embedded (CEF) in a non-Haiku Vulkan project. Its shared process model is one of the biggest WTFs I have ever encountered professionally. It fails often and resets so many times that it’s not amusing. Instead of fixing the underlying timing bugs, they just restart the process, hoping that attempt #n gets to the end. Most users don’t see this, but when cloning the framebuffer and tracing, you can see what happens under the hood. It is really sad that they sweep timing-related issues under the rug. With caching, it eventually loads. I can see that on a Haiku system with sensitive timings, this resetting design will cause major issues. This is CEF, and Apple decided to go their own way since the Google way is really bizarre.

Note that Haiku ports have various fundamental design flaws compared to UNIX sockets:

  1. Risk of use-after-free and of referencing the wrong port ID. When the process owning a port suddenly dies, the port ID is freed and can be allocated to a new port, but other processes may still hold the old port ID. This can have disastrous consequences, such as writing to a wrong, unrelated port. UNIX sockets use process-local FD numbers, so they have no such issue: each process has its own FD number and reference.

  2. Access permission issues. Port IDs are global, so anyone can write to any port, potentially causing misbehavior. Worse, it is possible to read from another process’s port. For example, you can write a simple program that reads from the app_server port that accepts new application connections, and all new applications will fail to start.

  3. Unreliable detection of client disconnects. UNIX sockets have two separate socket objects connected to each other, possibly owned by different processes. If the process owning one side of the socket terminates for any reason (including a crash), the other side can detect that gracefully and reliably.

  4. UNIX sockets have a mechanism for passing FDs, which is another important part of preventing use-after-free and access issues.


Step 1 means something wants us to listen, can’t the looper be created there?


The way it happens (very simplified) in UNIX sockets is something like this:

  1. Create a socketpair
  2. Start two processes (web process and network process)
  3. Give each side of the socketpair to each of the processes (either by command line arguments when running them, or by sending them messages after they are started)
  4. The processes can now talk to each other
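To make the ordering concrete, here is a minimal POSIX sketch of the same four steps. It is simplified even further than the list: there is only one child process, started with fork() instead of being launched as a separate executable, and the helper name exchangeOverSocketpair is made up for illustration. It is not WebKit's actual code.

```cpp
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: creates the socketpair first, then starts a child
// process that inherits one end. Returns what the parent received.
std::string exchangeOverSocketpair(const std::string& greeting)
{
    int fds[2];
    // Step 1: both endpoints exist before any peer process does.
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return {};

    // Step 2: start the other process.
    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return {};
    }

    if (child == 0) {
        // Step 3: the child keeps fds[1] and drops the parent's end.
        close(fds[0]);
        ssize_t written = write(fds[1], greeting.data(), greeting.size());
        (void)written;
        close(fds[1]);
        _exit(0);
    }

    // Step 3 (parent side): keep fds[0], drop the child's end.
    close(fds[1]);

    // Step 4: the processes can talk. Read until the child closes its end.
    std::string received;
    char buffer[64];
    ssize_t count;
    while ((count = read(fds[0], buffer, sizeof(buffer))) > 0)
        received.append(buffer, static_cast<size_t>(count));

    close(fds[0]);
    waitpid(child, nullptr, 0);
    return received;
}
```

The point to notice is that socketpair() succeeds before fork() is ever called: both endpoints exist without any receiver running, which is exactly what a BMessenger cannot do.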

You can’t replace the socketpair with BMessenger while keeping things in that same order. You have to swap steps 1 and 2: create the processes first, and then you can create a BMessenger targeting them (again, this is simplified; in reality, we have to first start the processes, then start a separate run loop/BLooper inside that process, and then get a BMessenger targeting that).

At that theoretical level, there is no problem with switching these two steps. But it is still a major change to the WebKit codebase which could be a bit of a problem for future maintenance.

What Rajagopalan had done is that the UI process (the one that is started first) would keep a hash map mapping connection IDs to BMessengers, and all other processes would ask it to find the BMessenger corresponding to a given connection ID, and also reply to the UI process’s queries of the form “you are the target of this connection ID, what is the corresponding BMessenger?”.

In the current branch, I have removed that map, and I made the first part of the connection (UIProcess <> WebProcess and UIProcess <> NetworkProcess) work by serializing a BMessenger into a string and passing it as a command line parameter to the WebProcess and NetworkProcess. From there, they can deserialize it and send messages to the UIProcess, and the UIProcess can then use these messages to obtain a BMessenger to message these processes back. But at the moment there is no way to make this work to establish the connection between the Web Process and the Network Process.

So, the solutions are (at an abstract level), if we don’t want to change WebKit code too much:

  • Either to make things happen as WebKit suggests (that is, creating a port in the UI Process, and then sending the port ID to the other processes and allowing them to build a BMessenger and a BLooper using that port).
  • Or, similar to what Rajagopalan had done in 2019, make it so a “connection identifier” is an abstract object, not immediately tied to a BHandler when it is constructed, and somehow fill in the details after the target process that will receive the message has been started.

I believe Rajagopalan had tried something similar before moving on. I believe this could work, but, fortunately, there are better, easier-to-implement ideas that don’t involve giving UIProcess so much work :slight_smile:

Waddlesplash also suggested this idea on IRC. Congratulations! This is the current front-runner.


So, here are the current three ideas that appear best so far, ordered from best to worst:

Create a BLooper when creating the BMessenger:

  1. Give me a connection – create a BLooper for the server and a BMessenger that points to that BLooper for the client.
  2. Thanks! I’ll send the BMessenger off to the other process.
  3. Ok, we’ll reuse the existing BLooper you created to receive the messages.

This method was suggested by Waddlesplash and madmax. It requires modifying a bit of WebKit’s code, but it shouldn’t be that bad. Other than that, it should be relatively easy to implement.
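The ordering of this first idea can be sketched with plain C++ stand-ins. FakeLooper, FakeMessenger, ConnectionPair and createConnectionPair are invented names that only mimic the roles of BLooper and BMessenger; this is not the Haiku API and not the code in the branch.

```cpp
#include <memory>
#include <queue>
#include <string>

// Stand-in for a BLooper: an inbox that exists before any peer knows about it.
struct FakeLooper {
    std::queue<std::string> inbox;
};

// Stand-in for a BMessenger: a handle that targets an already existing looper.
struct FakeMessenger {
    std::weak_ptr<FakeLooper> target;

    bool Send(const std::string& message) const
    {
        std::shared_ptr<FakeLooper> looper = target.lock();
        if (!looper)
            return false; // the receiving side has gone away
        looper->inbox.push(message);
        return true;
    }
};

struct ConnectionPair {
    std::shared_ptr<FakeLooper> server; // kept by the receiving side (step 3 reuses it)
    FakeMessenger client;               // handed to the other process (step 2)
};

// Step 1: create the receiver and the handle pointing at it in one go,
// so the handle never refers to a looper that does not exist yet.
ConnectionPair createConnectionPair()
{
    std::shared_ptr<FakeLooper> looper = std::make_shared<FakeLooper>();
    return ConnectionPair{looper, FakeMessenger{looper}};
}
```

Because the looper is created together with the messenger, step 3 only has to look up and reuse `server` instead of constructing anything new.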

Expand Haiku’s API

Explained in an earlier response. Everything but the first paragraph covers this idea.

This would take more time to implement but would yield the cleanest solution.

Use a connection identifier

This is somewhat similar to Rajagopalan’s approach.

  1. Give me a connection ā€“ Create a pair {this process's team_id, randomly generated connection_identifier}
  2. Thanks! I’ll send the pair over to the other process!
  3. Now create the BLooper and BHandler to receive those messages

This newly-created BLooper will be registered with the be_app for the current team to respond to all messages addressed to connection_identifier.
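A rough sketch of what such a registry might look like, using only standard C++. ConnectionIdentifier and ConnectionRegistry are hypothetical names, the handler is a plain callback standing in for a BLooper/BHandler, and the team ID is just an integer here; how messages actually reach the registry through be_app is left out.

```cpp
#include <cstdint>
#include <functional>
#include <random>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical identifier: {receiving team, randomly generated connection id}.
struct ConnectionIdentifier {
    int32_t team;
    uint64_t id;
};

class ConnectionRegistry {
public:
    using Handler = std::function<void(const std::string&)>;

    explicit ConnectionRegistry(int32_t team) : fTeam(team) {}

    // Step 1: hand out an identifier before any receiver exists.
    ConnectionIdentifier Create()
    {
        return ConnectionIdentifier{fTeam, fRandom()};
    }

    // Step 3: bind the identifier to the receiver once it has been created.
    void Register(const ConnectionIdentifier& connection, Handler handler)
    {
        fHandlers[connection.id] = std::move(handler);
    }

    // Deliver a message addressed to the identifier. Returns false if the
    // identifier belongs to another team or nothing is registered for it yet.
    bool Dispatch(const ConnectionIdentifier& connection, const std::string& message)
    {
        if (connection.team != fTeam)
            return false;
        auto it = fHandlers.find(connection.id);
        if (it == fHandlers.end())
            return false;
        it->second(message);
        return true;
    }

private:
    int32_t fTeam;
    std::mt19937_64 fRandom{std::random_device{}()};
    std::unordered_map<uint64_t, Handler> fHandlers;
};
```

This sketch simply rejects messages that arrive before step 3; a real implementation would have to decide whether to queue them instead.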

This will probably require the most code of the three solutions and be the ugliest.


This is very unlikely because port IDs would have to wrap all the way around (32-bit integer) for this to happen. Sure, it’s possible, but it would only happen if someone was trying to make it happen. And if they can allocate/free ports at will, they already have remote code execution and can invoke syscalls as many times as they want, so why would they be doing this? There are more important problems to worry about first.

Internally the port system does record what process/thread sent a port message. This information isn’t propagated up to BLooper yet, but, it could be.

Does anything use this behavior? If it’s rarely used, we should just turn this off by default (perhaps enabling it for BeOS applications), same as with cloning memory.

Enabling it for BeOS applications might lead to the Android problem of “if you don’t want to ask for permissions, just claim Android 4 as the lowest target”.

I would turn it off completely, if some BeOS applications break we can then see about adding a special privilege.

A 32-bit integer value can wrap around very quickly (in a few minutes) if someone is creating and deleting ports. It is a reliable attack vector that should be addressed. It is also quite easy to guess specific port IDs and make them match because port ID allocation is predictable.
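As a back-of-the-envelope check of the wrap-around time: the sketch below assumes port IDs are the positive values of a signed 32-bit integer (2^31 of them) handed out sequentially, and the creation rates are hypothetical, not measured on Haiku.

```cpp
// Number of distinct positive values of a signed 32-bit ID (2^31).
// Assumption: IDs are handed out sequentially and never go negative.
constexpr double kPositiveIds = 2147483648.0;

// Seconds until the ID counter wraps, given a hypothetical rate of
// create-and-delete cycles per second.
double secondsUntilWrap(double portsPerSecond)
{
    return kPositiveIds / portsPerSecond;
}
```

Under these assumptions, 10 million create/delete cycles per second would wrap in about 215 seconds, while 1 million per second would take roughly 36 minutes, so whether it really takes only a few minutes depends on the rate a process can actually reach.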

This behavior is useful for automatic client termination detection: you can make the client the port owner and the server the port reader, so if the client terminates for any reason, the server will detect that because its read operation unblocks and returns a B_BAD_PORT_ID error.

One possible solution to the problems mentioned is adding a port ID “access reference” that is stored in the client team and required for anyone to use the port. So before sending a port ID to another team, the sender creates an access reference record for the target team and specifies what the team owning that record is permitted to do with the port (read, write, close). If a team attempts to access a port ID without having an access reference, the call returns B_BAD_PORT_ID. The existence of an access reference also keeps the port ID from being allocated to new ports, so even if the port itself has already been deleted, as long as access references to that ID still exist, the ID will not be reallocated and accesses are guaranteed to return B_BAD_PORT_ID.
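To illustrate the idea, here is a small user-space model of such a table. Everything in it is hypothetical: PortTable, its methods, and the kOk/kBadPortId values are stand-ins (the real B_OK and B_BAD_PORT_ID come from Haiku's headers), and the allocator deliberately picks the lowest free ID so that reuse is easy to observe.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <utility>

// Stand-ins for B_OK and B_BAD_PORT_ID.
enum Status { kOk = 0, kBadPortId = -1 };
enum Permission : uint32_t { kRead = 1, kWrite = 2, kClose = 4 };

class PortTable {
public:
    int32_t CreatePort()
    {
        // Skip IDs that are live or still pinned by an access reference.
        int32_t id = 1;
        while (fLivePorts.count(id) != 0 || HasReferences(id))
            ++id;
        fLivePorts.insert(id);
        return id;
    }

    void DeletePort(int32_t port) { fLivePorts.erase(port); } // references stay

    // Created by the sender before it passes the port ID to `team`.
    void Grant(int32_t port, int32_t team, uint32_t permissions)
    {
        fReferences[Key{port, team}] = permissions;
    }

    void DropReference(int32_t port, int32_t team)
    {
        fReferences.erase(Key{port, team});
    }

    Status Write(int32_t port, int32_t team) const
    {
        auto it = fReferences.find(Key{port, team});
        if (it == fReferences.end() || (it->second & kWrite) == 0)
            return kBadPortId; // no reference, or no write permission
        if (fLivePorts.count(port) == 0)
            return kBadPortId; // port deleted, but the ID is still reserved
        return kOk;
    }

private:
    using Key = std::pair<int32_t, int32_t>; // {port ID, team ID}

    bool HasReferences(int32_t port) const
    {
        auto it = fReferences.lower_bound(Key{port, INT32_MIN});
        return it != fReferences.end() && it->first.first == port;
    }

    std::set<int32_t> fLivePorts;
    std::map<Key, uint32_t> fReferences;
};
```

In this model a stale ID held by a team can never silently point at a new, unrelated port: it either still refers to the original port or fails with the stand-in for B_BAD_PORT_ID until the reference is dropped.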

This is a breaking change, but I think it can be handled transparently by the Application Kit, and some ports needed by legacy software can be declared globally accessible by default.