About API improvements and Media2 Kit


#1

Wow, after years people are beginning to acknowledge that. However, the BMediaClient was exactly a temporary countermeasure for this problem. The media_kit has quite a lot of flaws. Seriously, the difficulty of developing with it isn't just a question of pedantic interfaces; it is a direct consequence of some design decisions. To name a few:

  1. Distinguish producers and consumers
  2. BTimeSource is a node (horrible)
  3. Every node has start/stop/threading etc. implicitly defined; that's a major flaw
  4. Absolutely no concept of remote and local objects
  5. BMediaRoster is a BLooper (WTF)
  6. Mixes encoding stuff with raw data stuff
  7. BFileInterface is a complete failure.

Here is how those flaws are being solved in media2:

  1. BMediaInput and BMediaOutput
  2. BTimeSource becomes just a provider of RealTime(); BMediaClock and BMediaSync are introduced.
  3. BMediaSource
  4. BMediaNode implements IMediaNode, with a reference counting system.
  5. The roster becomes just an accessor for a few static methods, e.g. getting the SystemGraph; all other features will be available using meta-interfaces à la OpenBinder.
  6. The Codec Kit is here for this reason.
  7. BMediaExtractor and BMediaWriter will inherit some of the above mentioned classes to become interfaceable with nodes.

So, for a few days now I've been going back and forth about whether to write an article or not. I think it would drain too much of my time for little interest, but there are still a few things that I'd like to discuss. Note also that this is the first time I describe what the new Media2 Kit will be, which means I have mostly completed my draft design.

However, let's continue with the whys. Let's take a BMediaNode (as in the old kit). The main point of having such a class is being able to route audio/video/MIDI in/out between processes, right?

To do something like that you need:

  1. A MediaNode object
  2. A way to communicate between processes
  3. A way for the media_server to observe such objects
  4. A way for the media_server to reference count those resources and free them
  5. A way for a remote process to control a node, instantiate it and so on.

How the Media Kit handles that:

  1. BMediaNode, BBufferProducer, BBufferConsumer, BControllable, BFileInterface
  2. Ports
  3. Fragile code that uses port messages and expects the nodes to notify it
  4. Again, it tries to keep track of resources using port communication
  5. The BMediaRoster and those opaque structs, media_source/destination, media_input/output, buffer_clone_info and so on.

What’s the problem?

Suppose an application crashes. Boom. You are left with the burden of trying to recover from that. Suppose the media_server crashes. Boom. Recovering the status is very hard and implies writing tons of error-prone code.
Suppose you are controlling a remote node and one of the above happens; guess what? Houston, we have a problem.

Does anyone remember some strange media_kit bugs like that?

Let's continue with the BMediaRoster example. Its API is really far too complex for the few things it is required to do. You need to start the whole chain of nodes, attach time sources, and when you need to handle connection statuses you have to use all those funny and fancy methods with plain old C structures. Even though I have fairly decent experience with the Media Kit, I find myself in doubt when I open the BeBook and analyze some methods. What's that? inOutput? outInput? inSource? ourNode, theirNode? theirInput? And so on. I'm pretty sure anyone who has developed something using the media_kit knows that. And if that's not enough, look at the media_connection implementation in the BMediaClient and see how complex it is to track a connection between two nodes.

Now we have identified at least two problems:

  1. Programming is way too hard and error-prone; the learning curve is too steep.
  2. Keeping track of remote objects and remote resources status is simply impossible.

And we can go directly to the point: what solution does OpenBinder give us, and why am I falling in love with this idea?

We implement an interface, IMediaNode, which is remotely reference-countable; that means we can know who uses it and why. BMediaNode implements IMediaNode and is its local version.

  • The media_server can subscribe to notifications from this node: how many buffers it owns, what buffer group it uses. The status of the object is kept by the kernel rather than the server, so that whatever happens we don't lose information.
  • The local process just implements BMediaNode; the remote processes use IMediaNode.
  • A third process that wants, for example, to Stop() our node just calls its own IMediaNode::Stop() command without having to use anything as horrible as the BMediaRoster.
  • A fourth process wants to connect with us; nice, it uses IMediaNode::Connect().
  • We are done processing something and all the BBuffers are released; the media_server simply receives a notification that the buffers' reference count has gone to zero and releases the memory, in just a few lines of code.
  • At some point the process crashes; the Binder connection becomes unavailable and the node status goes to an invalid state. All remote processes instantly know that, and know that their inputs/outputs are now invalid and free to be used by another remote node. The media_server knows that too; it can release the resources by fairly decreasing the reference count.

The whole thing is simplified here. But trust me or not, the media2 kit needs a way to model remote and local objects and reference count them. That’s what the Binder interface does. That’s why it is so useful.

It is entirely possible to implement something like that using ports, shared memory and signals, plus some support in the kernel to maintain the status of shared objects. I'm still considering how to implement it: whether to make it generic enough to be used by other kits, whether to grab some code from Android, or whether to have a local implementation for media2 kit use only.

I'm pretty sure other parts, like the app_server, would greatly benefit from something like that.

Feel free to comment.


#2

This sounds awesome.

I’m still not sure it should be its own .so+namespace and not in libmedia2.so, but the concept at least makes sense.

I see the need for the “remote reference counting” and the like as you describe it. However, I am still unconvinced that Binder’s solution is the best.

I have been thinking about this problem a lot and I really need a couple of hours to just sit down and write it out; but I don’t have that time this week. Hopefully next week.


#3

I'm beginning to think you have a very short memory, since you don't remember what I already said in my original ML reply: that I have already considered eventually moving it into the same library. There are also a few reasons for not doing so.

Should I say: arguments? Or are you just playing the contrarian? As I said in the past, I think you know very little about how the Binder works under the hood.

Beware that I've been thinking about this problem for years at this point. I know very well where the key technical point is. The solution was designed by the Be engineers, then continued to be developed at Palm, and in the end was used to create an operating system (Android) that runs on millions of devices. Not to mention that the Windows API has indeed similar concepts. I'm pretty sure you will either come up with a less efficient solution, or with something that does the same thing under another name. Don't count on me choosing your genius system over a system that has been well tested and proven functional.


#4

However, I want to move the discussion to the technical level only. Take some time to explain to me what you don't like in the Binder model. There are at least two levels of discussion here: the first is the OO model, how the concepts are modeled at a high level; the second is how they are actually implemented at the low level. It may turn out in the end that we dislike the same things, so please be more explicit and point out to me what you think would be a problem.


#5

I'm creating a video editor for Haiku, and of all the available media kit classes, I only rely on BMediaFile and BMediaTrack (encoding/decoding). I have no need for the rest of the kit, which seems to be designed for live streaming (a very narrow use case). The complexity seems to stem from Roland's Edirol 7 product and Be Inc.'s battle for media latency. I'm sure the design would be different if it were developed today.

Since the number of existing legacy apps which depend on streaming is essentially zero, I strongly suspect that most developers would welcome a redesign if it simplifies usage. If I ever add streaming to the video editor, I'd appreciate a simpler, more modern design. I haven't looked at Binder, so I cannot comment on it.

From my adventures with the video editor, my biggest obstacle was the lack of access to the output of the system audio mixer. I've had to roll my own.


#6

Not really for streaming, mostly for real-time processing.

In audio applications, latency is a very important thing to consider. But even there, there are numerous flaws that prevent the kit from being effective outside trivial chains.

The Binder is just an IPC model in itself, and is not really needed as a whole for the purposes of the new kit. My new kit will not only be simpler, but also much more powerful, scalable and extensible. However, I find it normal that you need little of those capabilities for video editing, as in general video programs have little need to communicate with other programs.

Do you mean recording from the system mixer output, while it's still connected to the sound card?


#7

Ideally I would have piped the output of the audio mixer to a user buffer (a pseudo audio-recording feature). The original BeBook doesn't allow this. Hence I've ended up implementing my own custom mixer (with all the headaches of resampling mismatched sample rates, channel counts, buffers, etc.). I need this for channel mixing and track effects. Now that I've done that, I don't rely on the system audio mixer at all (I just pipe data to a single BSoundPlayer for playback preview).


#8

You could have instantiated a custom instance of the mixer to do that, without reimplementing a mixer yourself. I understand it's not clear that this is doable, but in fact it is. That's why media2 will be much simpler.


#9

Video programs very much need to communicate with other programs. GStreamer, and Snowmix (built on top of GStreamer), recognize this and are built on an idea similar to the media kit's. No application can know how many simultaneous feeds an editor needs (whether from streams or files; whether audio, video or control), nor what an editor intends to do with them (whether sequencing, coloring or compositing; whether streaming over IP or to a video out; whether rendering constant or variable framerate). Monolithic video apps are numerous, take you through countless interfaces and require making intermediary files because none of them speak the same language. A modular approach means audio fed from a DAW can be synced to a video timeline (MIDI timecode is based on SMPTE timecode for exactly this reason); it means an arbitrary number of video feeds can run on a timeline, agnostic about whether the source is a live camera, a file or an internet stream; it means an editor can run the feed through actual dedicated color correction without sacrificing space to intermediary files; and it means that, once finished, some subset of these same modules can render the finished sequence. Workflows can be made on the fly, by end users, for audio sync, nonlinear editing, live editing for broadcast, VJing, color correction or anything else they might imagine, all using the modules they like.


#10

I think you over-read what I wrote. I'm not saying that video doesn't need IPC; I'm saying that audio workflows normally need this much more, especially non-linear paths, where the signal is taken at a random place in the path and sent backward.

The media2 kit will be centered around one class: BMediaGraph. This class will allow doing a lot of things that aren't possible in other APIs, especially for video processing. The idea is that the whole system is seen as a global graph; a BMediaNode will have its own (internal) instance of BMediaGraph as well. This way you can see the system globally as intersecting graphs.