Professional Sound API

[quote=DavieB]
Are you saying the media kit is not up to scratch for pro audio applications? Have I misunderstood?[/quote]

Well it’s always possible to stretch a point. After all, a ZX Spectrum is not what we’d generally consider a pro audio system, but a crowd of paying guests dancing for an hour to sounds coming (in part at least) from a ZX Spectrum is hard to argue with. I’d say if you care about low latency, the Media Kit is probably not a good choice.

But to be fair, the primary criterion for most users, whether they do pro audio or just blog about skateboarding, is always application availability. If you like Reason, then an OS without Reason is no good to you, even if it has really low latency.

You haven't really answered the question, and your comments seem vague and indirect. If you think the Media Kit is not up to scratch, then give us some reasons.

Let's say for now that the Media Kit is written for bog-standard audio use and that it sucks for pro audio. The better API is Jack, which is already in use by Ardour and Harrison's MixBus program. It has made its way onto Mac OS X and Windows. This seems to be the best API for "pro" audio.

I DO think that using the Media Kit for pro audio (extreme low latency) is not appropriate, because it will inevitably introduce jitter and also because it does not allow direct hardware access or acceleration, since the hardware is hidden behind the Media Kit API (please correct me if I am wrong).

I have some interest in audio myself and have tried to think about this before; after reading the Be Book I came to these conclusions:

  • Using the Media Kit is only for non-time-critical (non-pro-audio) software, such as media players, because of the overhead of context switching and non-optimised sound-routing calls (again, correct me if I am wrong)

  • Pro audio is similar in some ways to 3D graphics: it needs the fastest access to the device and the best hardware acceleration, so it needs some direct-access API for sound, something like DirectSound, ASIO, etc. So I found some answers not in the Media Kit section but in the Game Kit instead; let's see what it says here (a minimal usage sketch follows after this list):
    BPushGameSound / BStreamingGameSound

    These are used to let you fill buffers flowing to the speakers (or headphones, or whatever audio output device is being used). Their missions are the same, but their methods are different; BPushGameSound also provides a way to play sound in a cyclic loop, keeping ahead of the playback point in the buffer. This is the extreme in high-performance, low-latency audio, but does require some extra work on the programmer’s part.

  • My personal approach for a pro audio system would be an implementation of the Jack API on top of the Game Kit, which would provide a higher-level API instead of using BPushGameSound and BStreamingGameSound directly. That would be for the "careful and know-what-they're-doing" programmers; for less time-critical work, there's the Media Kit.
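To make the Game Kit idea concrete, here is a rough, untested sketch of how one might drive BPushGameSound's paged buffer, based on my reading of the Be Book. The render() routine and the stop flag are made up for illustration, and the exact format constants and field semantics may need checking:

```cpp
#include <MediaDefs.h>
#include <PushGameSound.h>
#include <string.h>

extern volatile bool keepRunning;              // hypothetical stop flag
void render(int16* dest, size_t sampleCount);  // hypothetical DSP/synth routine

void RunPushSound()
{
	gs_audio_format format;
	memset(&format, 0, sizeof(format));
	format.frame_rate = 44100.0;
	format.channel_count = 2;
	format.format = gs_audio_format::B_GS_S16;  // 16-bit integer samples
	format.byte_order = B_MEDIA_LITTLE_ENDIAN;
	format.buffer_size = 0;                     // 0 = let the system choose (assumption)

	// Small pages (1024 frames, double-buffered) to keep latency low.
	BPushGameSound sound(1024, &format, 2);
	if (sound.InitCheck() != B_OK)
		return;

	sound.StartPlaying();

	while (keepRunning) {
		void* page;
		size_t pageSize;
		// Lock the next free page of the cyclic buffer, fill it, hand it back.
		if (sound.LockNextPage(&page, &pageSize) == BPushGameSound::lock_failed)
			continue;  // nothing free yet, try again
		render((int16*)page, pageSize / sizeof(int16));
		sound.UnlockPage(page);
	}

	sound.StopPlaying();
}
```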

My 2 cents

I’m in favour of Jack, but I think it can be extended to be not just a way of routing but also the hub of the mix. Consider the fact that most programs have their own mixer built in: Pro Tools, Cubase, Sonar, Ardour, MixBus, Logic; they all have their own mixer. Why? If you want to Jack your audio between all your programs and hardware, why not mix it there as well? Just have one system-level, professionally implemented mixer.

So you open a DAW and create a track. This track shows up in the mixer as a channel strip, with everything a professional mixer has: inserts, sends, master fader, metering, etc. In the DAW you edit and move the audio around, but you control the mix from the system mixer. Then you open a drum machine instrument and this shows up in the mixer also. The mixer also has automation built into it, which can be edited. The set-up can be saved as a project. It has a master transport control and clock so everything can run in sync.

This mixer would come with a rich, high-level API not unlike Jack's. In fact you could consider this as "Jack with a Pro Mixer". Or think of it as Ardour with the mixer removed and stuck into Jack instead. The great thing about MixBus is that it was written by Harrison, so the implementation of the mixer is exemplary, but the problem is that most people just want the mixer part, not the DAW. If you had a Jack mixer implemented by Harrison, that would be the centrepiece of a pro audio system on a computer: everything would go through its API…

Another thought on the subject of the Media Kit etc. How do you explain iZ Corp's RADAR system and Tascam's SX-1 professional hardware DAWs being based on BeOS? Would they have made use of the various Media and MIDI Kits to create their software?

http://www.tascam.com/products/sx-1.html

Hi all. First time poster here. I was a developer back in the BeOS days (Harrison used BeOS extensively in our products, along with the iZ, Tascam, and LCS guys). In all of these cases, BeOS was the control/automation computer and the heavy-lifting was done by DSP hardware. I don’t believe any professional product ever used the Media Kit.

The Media Kit is very well designed (overdesigned?) for media playback and simple music production. It is not ideally suited for professional use because of the latency issues discussed before, the relative complexity of some simple tasks (such as plugin hosting and parameters) and the lack of features such as time sync, video sync pulse, netjack, etc. Of course all of these issues could be incorporated into the Media Kit, and probably very neatly.

On Linux it has become fairly common for systems to use PulseAudio for desktop sounds and JACK for professional audio. I’m a big fan of the “cleanliness” of BeOS/Haiku so I’d prefer that Haiku didn’t have this split. However, practically speaking, Haiku has the opportunity to gain a lot of developers and a lot of existing code if they adopt JACK. If they don’t do this, then there is going to be a long and hard road to re-implement everything that has been happening in Linux Audio for the last 5 years.

DavieB, regarding your comment about a separate mixer: there have been many examples of this over the years on all platforms. BeOS had a simple software mixer that allowed you to adjust the level of apps that were playing back on the system. There have been mixer-only apps for CoreAudio, ReWire, and JACK. Ardour/Mixbus has often been used as a mixer-only system with no track playback. I’m even aware of people using Ardour and Mixbus as live mixing consoles(!) However in most cases, the inconvenience of saving the mixer state in addition to the various other programs makes it impractical. There is an effort to incorporate “session management” into JACK to provide just this sort of inter-operability. (yet another thing that the Media Kit will want to re-implement in the future…)

I could imagine a cool software mixer that automatically adds an input strip(s) every time you launch a new app. But you’d want to separate the “pro” apps (whose settings could be saved/recalled and automated) from the desktop apps such as the web browser.

Best Regards,
Ben Loftis

edit by admin: removed the URLs.

It would be good to have a professional implementation on Haiku; Jack seems to be the current contender.

On the separate mixer idea, I know what you mean when you say you want to save the mixer state with the project audio and MIDI sequencing etc. But I would say that this is still broken if you are using instruments or FX feeding into the DAW from external Jack inputs. Saving the state of the entire project then means saving the DAW and mixer as one project, the instruments as another, the Jack routing as another, and so on.

The other reason why I think it is a good idea to have a separate mixer is that the user will never have to change the mixer settings if they decide to move from one DAW sequencer to another. It's a real pain to work in Ableton Live with a mixer set-up and then move to Cubase for recording audio and have to move over your mixer settings as well. Interchange of audio and MIDI is fine, but how can a user easily interchange their channel strips, plug-in inserts, and overall mixer set-up? I don't think you can, and copying one mixer to another is slow and painful. In the real world there is only ever one mixer at the heart of any recording studio.

I agree that a pro level software mixer would be for “pro audio” software only.

I was thinking about the issue a bit lately, and I think I came to the conclusion (which would anyway be better discussed on the development mailing list with the developers) that an easy and clean way to get something working for professional audio on Haiku could be to have JACK as a node/roster/whatever in the Media Server, so that simple applications may just use the Media Kit and professional apps could use JACK. It would also be preferable to be able to connect normal Media Server nodes/rosters/whatever to JACK's inputs/outputs (and maybe have some setting, perhaps even a dynamic one, to specify how many JACK I/O ports one wants to have).

This as far as audio data goes, and in the hope that direct connections in the Media Server to soundcard I/O do not introduce latency either (if that's not the case, it could probably be changed without breaking APIs, I hope) and that Haiku is capable of making JACK work decently.

When it comes to MIDI, I don’t really know; I’ve been told that the Media Kit has problems for professional MIDI usage too, but I actually have no idea.

As a professional composer and musician using Linux, I'd add a couple of things to this thread.

Ben Loftis makes some good points in his post, and hits the nub of any discussion about professional sound.

1.) A central mixer is essential. I've gone the route he described in the past, using an app's built-in mixer just to try and have a central mix core into which I can add any number of apps. Now I use a little app called non-mixer, as it's the only mixer I've found which does the job as a standalone for what I do. (As a writer of orchestral and film music, my port requirements are in the hundreds, not the tens, and many professional audio and MIDI apps are written for "headbangers", excluding my unique requirements and lacking the extensibility to use many more tracks, ports, etc.)

2.) Jack, for all its strengths and weaknesses, is ideal (more or less) for professional audio use. It seeks to fulfil the requirements of a multi-app working environment, and does so at low latency. Those that know will appreciate the value of this: recording not only from live sources, but building scores, tunes, etc. with software tools inside the box and hardware tools outside (synths, samplers, blah blah blah).

3.) Extending what is a domestic API may well seem like a good idea, but its basic flaw lies in user requirements: those of domestic users, where generous buffers and high latency are essential for great playback of movies and tunes, versus the time-critical, low-latency requirements of the professional user. PulseAudio, for all its strengths and weaknesses (and I assume this is akin to the Haiku API), is built for domestic users, and there are many trainwrecks in forums as the discussions rage on about the "worth" of PulseAudio versus Jack, or indeed of using ALSA directly. It's FUD to assume that a domestic API will satisfy the needs of a professional user, and this FUD is often spread by well-meaning devs who think they know what professional requirements might be, although they have no experience in that field.

4.) This also applies to video playback, and as someone who writes to image, all too often the sync between audio and video devices is less than optimal, because the sound API doesn't necessarily sync to the frame rate of the video. Jack has a common Transport API that can be implemented in both audio and video apps, and in the Linux world Xine is one of these Jack Transport capable apps, able to show video synced via Jack Transport with any other Jack Transport capable app. A domestic API lacks this sync, or a framework for syncing, and renders itself more or less useless for work that requires such sync.

5.) MIDI is the hairy red-headed step-sister of any professional working environment, and all too often this protocol is poorly done, not only at the app level, but at the server or framework level in APIs. Jack at least attempts to rectify this in some way with JackMidi, a sample-accurate MIDI framework that syncs by default with JackAudio, using timestamped events delivered within a specified period, as for audio. Any consideration of a professional-grade sound/video/MIDI API should, by default, include the components required to deliver sample-accurate MIDI, without jitter or uncertain timing. There are many users, professional and domestic alike, that use MIDI, yet all too often it's tacked on as an afterthought rather than considered as part of the whole professional API and integrated at initial development, by developers whose vision of "professional" is a studio with a desk/console.

That's just one part of a much wider picture, and you're more likely to see a film score or recording studio setup, in the 21st century, on a box with a widescreen. Large console studios are still viable, of course, but there are a lot of users working with a software mixer and plugins at a paid level. A colleague of mine has a large console and a Pro Tools rig, yet his editing work for the last 2 years has been on a computer, using another DAW, because it does the job. He's on Windows, poor soul, but he makes a valid point when he says it's cheaper and quicker to edit this way, to the same finished product standard, than to crank up the desk and PT rig. (Which makes yet another case for a central software mixer as part of any professional sound API.)

6.) With the rise of smaller devices, like phones, handhelds of some description, netbooks, laptops, and so on, there is a trend to build for these small devices as a default, and considerations in design and implementation seem to be heading in this direction. That's fine for the 6-to-24-track engineer (some may manage a little more using a powerful laptop, for example), but as designs change, there is also a trend for limitations to be imposed to cater for these devices, and by the nature of those limitations the larger-setup users are slowly being excluded. I urge anyone building apps or sound APIs not to forget that many users are still using desktop computers or server farms to render their material, and this doesn't look like changing soon. A Haiku professional sound API should be just as valid and easy to use for a multi-box user as it may be for a laptop or netbook user, imho.

7.) Finally, no limitations. If you think a user will never use more than 100 audio ports or 50 MIDI ports, you're wrong, and profoundly so. My regular template, and this is also true for my colleagues doing the same work, is at least 400 audio tracks and 128 MIDI ports. I have a friend in the UK who writes for bigger works than I do, using 600-900 MIDI tracks, over 250 MIDI ports, and a farm of boxes running samplers to feed his main DAW. If you think there's "enough", then double it, then triple it, then use no limitations at all. :slight_smile:

A few thoughts if you chaps are considering a professional API from a user who does this for a living.

Good luck.

Alex.

Buffers are not passed in and out of nodes by the BMediaRoster (there is no "X returns to the roster / the roster feeds Y"). Once nodes are connected and started, they know their output(s) perfectly well. They simply hand each buffer to their connected output(s), which triggers a BufferReceived() event on the corresponding node(s) in the graph, each running in a different thread.
The BMediaRoster lets you query, instantiate and control nodes, but it is not involved in buffer processing.
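To illustrate that data path (just a sketch, not tested, with the many other required BMediaNode/BBufferConsumer hooks omitted), a consumer node's buffer handling boils down to something like this; ProcessSamples() is a made-up name for whatever DSP the node actually does:

```cpp
#include <Buffer.h>
#include <BufferConsumer.h>
#include <MediaEventLooper.h>
#include <TimedEventQueue.h>

// Sketch of the consumer side of the Media Kit data path. The upstream
// producer calls BufferReceived() directly on this node (no roster
// involved); the node queues the buffer and performs it when due.
class SketchConsumer : public BMediaEventLooper, public BBufferConsumer {
	// ... constructor and the other mandatory hook functions omitted ...

	virtual void BufferReceived(BBuffer* buffer)
	{
		// Queue the buffer to be handled at its performance (start) time.
		media_timed_event event(buffer->Header()->start_time,
			BTimedEventQueue::B_HANDLE_BUFFER, buffer,
			BTimedEventQueue::B_RECYCLE_BUFFER);
		EventQueue()->AddEvent(event);
	}

	virtual void HandleEvent(const media_timed_event* event,
		bigtime_t lateness, bool realTimeEvent)
	{
		if (event->type == BTimedEventQueue::B_HANDLE_BUFFER) {
			BBuffer* buffer = (BBuffer*)event->pointer;
			ProcessSamples(buffer->Data(), buffer->SizeUsed());  // hypothetical DSP
			buffer->Recycle();  // hand the buffer back to its buffer group
		}
	}

	void ProcessSamples(void* data, size_t bytes);  // made-up helper
};
```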

There is no main/central/looping processing thread managing the pipeline(s). As with many BeOS/Haiku areas, it's an asynchronous, multithreaded model, where each node works in parallel (actually in parallel, on SMP hardware) and cooperatively, via messaging, to produce buffers in time to perform them when due.
It's very visible in ProcessController: the media_server doesn't do that much work; instead the load is spread over several threads, in the media_server_addon's threads for system media nodes like the mixer and the physical nodes, and in client apps (Media Player, CortexAddOnHost, Clockwerk…) for the others.

The timing design is kind of reversed compared to JACK: it's a performance-time-driven data flow, not a process-driven cycle. Also, the JACK design focuses mostly on live audio, while the Media Kit design doesn't/can't.

I don't know if the Media Kit design is up to pro audio, but when I consider that JACK is not an end-user audio/video framework (GStreamer is more on a par with the Media Kit here), that its design is optimized for live audio but had to evolve from a single RT processing thread model to a dual-thread model to actually benefit from SMP systems, and had to add memory locking and RT memory allocation support to avoid VM swapping, I find more similarities between the two frameworks than one might think at first.

The fact remains that nobody will know what Be's Media Kit design and the Haiku re-implementation are worth until somebody actually tries to stress them as a pro audio solution.

Any volunteer?
:slight_smile:

The Media Kit doesn't do MIDI.
The MIDI Kit does, but don't ask me whether it's up to professional usage or not…

You seem to be confusing two different elements. Mind you, real-time audio is a myth. If you want real-time audio, play acoustic instruments. Now back to being realistic.

Latency correction, or synchronization of latency across all audio tracks in a DAW; i.e. summed rendering error differences against time code are not tolerable.

That's fine, but to say you have to have real-time audio? That's bunk. You have DSP process cycle wait times and execution time regardless of how low the latency is, and they just can't be gotten around no matter how good the system is. OS aside, the hardware has to crunch the math, and this is impossible to avoid. Even Pro Tools DSP cards are "low latency", but they have tons of DSP chips to run each track on its own DSP channel, and it still has to have a back-end synchronization time stamp at assembly time for rendering at the master bus.

The only flaw the Media Kit might have is a lack (I haven't confirmed this) of a way to keep things synced to the master time clock. A problem that is very easily resolved at the track level. But as you add DSP effects, reamping, external busing, bus round trips, and outboard processing gear, you are going to incur latency; the more layers of processing you add, the worse this becomes.

What we need is a way to calibrate the latency and have the DAW stick everything to one timestamp. PreSonus Studio One has this feature, the Kristal Audio Engine has this feature, Cubase has this feature, and it's not in the audio driver.

Jack solves none of these issues unless it is generating synchronization time code stamps for each audio stream. Which it might. Jack is a fancy software mixer. Nothing more. Honestly, I wouldn't consider Jack pro audio level. Midas sound boards are pro gear. Jack is at best high-end hobby software for live performance/home recording (whatever that is).

There is no such thing as real-time audio. Get used to saying it. It's just what it is.

If you want to have a rational discussion about time-code-based stream synchronization and how best to implement those capabilities in the Media Kit, or make a Media Kit API for them, great, I welcome the suggestions. But stop with the latency argument. It's really pretty tired.

Ok, let’s see if I can make things more clear.

Ok.

[quote=DavieB]Now, we have three programs linked in series.

In your example is a “node” a program; A, B or C?[/quote]

Yes.

[quote=DavieB]The media server doesn't care about processing chains? I think I know what you are going on about now. On Linux you can string your programs together in series, such as Renoise → Ardour → soundcard. Jack treats this as one thing? How does the media server achieve this? Well, each program A, B and C uses the media roster object. This roster object owns individual "nodes" for FX, inserts, tracks, anything you like, etc. The roster object is aware of a master clock so that they can run in sync. This is OK when things are to be synced in parallel, but you want to know how the media roster objects run in series. This is an important point. I don't actually know.

Simple high-level view of the media kit:

The Media Kit is the studio.
A "node" is an effect, input, output, track, channel, etc.
A media roster is an application such as a sequencer, DAW, or beatbox program that can use nodes and can be synced to a studio clock, etc.

So can media rosters be chained together in series? One feeding the next, etc?[/quote]

Well, I said I have no experience with the media kit itself, apart from resampling; my source of information is: The Be Book - System Overview - The Media Kit

I don’t know if media rosters can be chained themselves (doesn’t Cortex do that?), but even inside one roster, connecting nodes in series creates latency.

Real-world example: let's say I have 3 ms of buffering latency and I'm using a guitar rack program which uses nodes for each effect. I want to apply 5 effects in this order: wah, distortion, phaser, flanger, FIR convolver. If each node introduces latency, I have a total of at least 12 ms of latency. If I used JACK to connect individual applications each doing one of these effects, I would have had only the 3 ms of latency.

Why? Because JACK is capable of executing the effects in the “correct” order, while the media kit does not take the order of effects into account. So the media kit has to use past buffers for each effect to avoid processing bad data, and each buffer is 3 ms long.

Result: 12 ms is audible; 3 ms is not.

I hope this is clear enough.

Dude, I have done research on this; latency doesn't even become perceptible until 150+ msec.

But believe what you want.

12msec is 

0.00012 seconds

[quote=thatguy]
12msec is

0.00012 seconds[/quote]

Isn’t that supposed to be 0.012 seconds?

Regards,

James

0 whole, .0 tenths, 0 hundredths, 0 thousandths, 0 ten-thousandths

Yeah, I misplaced the decimal, and it's still faster than you blink.

Where did you read that the Media Kit doesn't take the node chain order into account!?
Of course it does. Otherwise, why all these efforts to manage the node graph, downstream latencies, notifying late events, etc.?!
In fact, in live run mode, all nodes in the chain are driven by performance time: each node is called at performance_time - this_node_downstream_latency - this_node_processing_latency - scheduling_latency.

The minimal performance time being:

now
  + initial latency (in your example, the physical sampling latency, 3 ms)
  + sum of all nodes' processing latencies
  + sum of all nodes' scheduling latencies.

I.e. exactly like under JACK, except for the scheduling latencies (see below for how that's taken care of).
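In code terms, that per-node wake-up time is just simple arithmetic; a trivial illustration (the variable names are mine, values in microseconds as usual for bigtime_t):

```cpp
#include <OS.h>	// bigtime_t, system_time()

// Illustration only: when a node should start working on a buffer
// that must be performed at performanceTime.
bigtime_t
WhenToStart(bigtime_t performanceTime, bigtime_t downstreamLatency,
	bigtime_t processingLatency, bigtime_t schedulingLatency)
{
	return performanceTime - downstreamLatency - processingLatency
		- schedulingLatency;
}

// And the earliest performance time a live chain can promise:
bigtime_t
MinimalPerformanceTime(bigtime_t initialLatency,
	bigtime_t totalProcessingLatency, bigtime_t totalSchedulingLatency)
{
	return system_time() + initialLatency + totalProcessingLatency
		+ totalSchedulingLatency;
}
```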

Maybe one confusion is that in the Media Kit physical inputs are also nodes, and this initial latency is just the processing latency of that physical input node: for obvious reasons, 3 ms of live sampling will take 3 ms whatever the audio framework… But passing this 3 ms long buffer down from node to node doesn't take 3 ms, thank god! All it takes is passing the buffer pointer, just like the JACK cycle task passes sample pointers to its clients.

The Media Kit neither enforces nor adds arbitrary buffering (and thus latency) between nodes. When all the nodes work on the same data format and rate, the same input sample buffer (in shared memory) is reused by every node along the chain, and no extra buffering is needed or enforced.
What works under JACK because it enforces a 32-bit float sample format and a common rate works the same under the Media Kit.

JACK works synchronously on a fixed sample format at a shared (hence the synchronous) rate.
While the Media Kit works asynchronously in terms of its processing model, the above situation is still supported in terms of the data flow model. But it's only one of the many data flows it can support. If you've followed the jackdmp design changes, you'll see that sequential activation is not the best design for getting the most out of SMP machines, which is why jackdmp (now JACK2) had to introduce parallel cycle execution, working on the previous cycle's buffer(s).
Which is far more similar to the data-flow-graph asynchronous activation strategy used by the Media Kit than one would think at first…

One issue with the asynchronous processing model is the scheduling latency of each node in the chain, which can accumulate, unlike under JACK's synchronous one. That's why the Media Kit monitors scheduling latency and automatically compensates for drift by calling nodes early. JACK doesn't have to do that, because it's an RT task, which by definition has no scheduling latency drift.
On the other hand, the more SMP-capable your machine is, the less scheduling latency and drift you will see ;-).

The eternal synchronous real-time processing vs parallel soft-time processing debate still goes on…

[quote=phoudoin]
I.e. exactly like under JACK, except for the scheduling latencies (see below for how that's taken care of)[/quote]

Thanks for clearing this up.

I’m not sure if that’s a helpful way to think about it. AFAICT: The actual Haiku source and sink nodes are integer based. The actual system mixer does not adapt to use the same period size as the source. So “the same” does not seem to exist today on Haiku, unless I’ve missed something.

[quote]
If you've followed the jackdmp design changes, you'll see that sequential activation is not the best design for getting the most out of SMP machines, which is why jackdmp (now JACK2) had to introduce parallel cycle execution, working on the previous cycle's buffer(s).
Which is far more similar to the data-flow-graph asynchronous activation strategy used by the Media Kit than one would think at first…[/quote]

I think you’re mistaken about the effect of the “pipelining” in jackdmp, or at least your description is misleading. The pipeline divides up periods at the source. So setting a period of 1024 frames has the same effect as before - and the same latency, but by processing a fraction (say 256 frames) of the period at a time jackdmp achieves greater parallelism at a cost of increased overheads. Is the Media Kit’s scheduling equipped for this? Both the first 256 frames and the last 256 frames of a period must be delivered for the same deadline.

[quote]
That's why the Media Kit monitors scheduling latency and automatically compensates for drift by calling nodes early.[/quote]

References to running things “early” predominate in discussion of the Media Kit, even by Be. But this only makes sense for the nice easy cases like “user is listening to some music files”. I recommend that anyone working on Pro Audio solve the hard problem first, and that’s real time audio. In real time audio there is no “early”. We can see this in JACK’s “race to idle” approach. A sound card generates an interrupt indicating that the last period is recorded into the buffer, this is the source. JACK runs the source buffer through the graph and writes it to the sink, which may also be a hardware buffer on the same sound card. It is not possible to compensate for anything by doing something “early” here, until the interrupt fires there is no audio buffer available with which to do anything.
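For the avoidance of doubt, here is roughly what that looks like from a JACK client's point of view (an untested sketch; the client and port names are arbitrary, and connecting the ports to the hardware capture/playback ports is left out). The process callback only runs once the period has been captured, and it must finish before the cycle's deadline:

```cpp
#include <cstring>
#include <unistd.h>
#include <jack/jack.h>

jack_port_t* inPort;
jack_port_t* outPort;

// Called by JACK once per period, after the capture buffer exists.
// There is no "early": we get nframes of input and must hand back
// nframes of output before the next interrupt.
int process(jack_nframes_t nframes, void*)
{
	jack_default_audio_sample_t* in =
		(jack_default_audio_sample_t*)jack_port_get_buffer(inPort, nframes);
	jack_default_audio_sample_t* out =
		(jack_default_audio_sample_t*)jack_port_get_buffer(outPort, nframes);
	memcpy(out, in, nframes * sizeof(jack_default_audio_sample_t));  // trivial "graph"
	return 0;
}

int main()
{
	jack_client_t* client = jack_client_open("passthru", JackNullOption, NULL);
	if (client == NULL)
		return 1;

	inPort = jack_port_register(client, "in", JACK_DEFAULT_AUDIO_TYPE,
		JackPortIsInput, 0);
	outPort = jack_port_register(client, "out", JACK_DEFAULT_AUDIO_TYPE,
		JackPortIsOutput, 0);
	jack_set_process_callback(client, process, NULL);
	jack_activate(client);

	for (;;)
		sleep(1);  // all the audio work happens in JACK's realtime thread
}
```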

Well, I'm not a developer but "just" an audio producer: it would be great to have a method that manages third-party effects plugins the way Haiku manages the video codec backends… that's why I suggested adopting N.A.S.PRO!

Talking about DAWs, it would be great to start upgrading HyperStudio (or, why not, Clockwerk too) by "grabbing" features from other open source DAWs (such as Ardour, Traverso, Qtractor, Rosegarden, DarkWave Studio, Koblo, etc.).

BTW, I believe that the first correct step should be to let Haiku manage professional multichannel (8+) soundcards/mixers, in order to record multiple audio streams (hard-disk recording, at least).

[quote=phoudoin]
Not true. The physical audio input and output are handled by the MultiAudio node, which supports any audio format (8/16/32-bit integer or float) that the kernel hardware drivers below it support. If they don't support float in hardware, someone must convert it somewhere in the pipeline. That's true for JACK sources and sinks too. The difference is that the Media Kit doesn't enforce a format and the system audio nodes support all formats, to be able to accommodate any situation.[/quote]

OK, this is the Right Thing™ insofar as it means you can do the whole pipeline in float. I have not seen this done, in BeOS or Haiku, but I believe you (as the Media Kit expert) when you say that this works.

[quote]
First, it's the default system mixer, but nothing forbids a pro audio solution from bypassing it completely.
It's just a routing choice to make. Desktop multimedia apps route their audio toward the system mixer because it makes sense, but nothing forbids routing it toward a dedicated sink node instead. In fact, anyone wanting to mimic the JACK pipeline with the Media Kit would have to do exactly that, since JACK doesn't enforce a system-wide mixer client either, AFAICT.[/quote]

In BeOS at least, this simply doesn’t work. I was not able to remove the system mixer from graphs without serious problems. This was clearly not something Be’s engineers spent time on. Remember the B_SYSTEM_MIXER is not just another node on the graph as a mixer would be in JACK.

Perhaps Haiku is different. Perhaps you’ve actually tried it?

[quote]
In the end, keep in mind that the Media Kit has to be far more flexible than JACK, which knows it only handles 32-bit audio samples, while the Media Kit has to support variable audio and video formats. Being flexible doesn't mean it's not up to some very specialized task; it means it needs far more setup and tuning to be able to do it. I.e. to become specialized.

I'm not saying the Media Kit is able to do everything JACK manages to do for audio, well optimized for real-time audio processing as JACK is.
What I'm saying is that the Media Kit was not designed to support only the usual playback case, as you seem to assert. In fact, if that were the case, it would be an overly flexible layer for accomplishing just that. This flexibility is there for a reason: to adapt to more than one well-known case.[/quote]

As an aside, people have transported other things over JACK. Notably MIDI and video. But you are correct that JACK’s core purpose was from the outset and remains PCM audio.

On the subject of what is supported, I think there’s a tendency to believe that Be did all this design work for you. You should not underestimate how much of Be’s work was unfinished. There’s a thread somewhere on this forum asking about Monitoring. That ought to have been covered by Be, right? But I can’t even find mention of it as an “upcoming” feature like the video codec APIs.

You seem to have misconstrued what “real time” means. A real time system has a fixed deadline. Often it is acceptable for something to occur “as soon as possible” or even just “eventually”. But in real time systems, including real time audio, there is a hard deadline. The nature of the deadline varies by application.

Sound travels about 34 cm per millisecond through air. By programming our software to meet a deadline such that total system latency is below 10 ms, we can ensure that the effect is no different than if the listener were sitting 3.4 m further from the sound source (e.g. the speakers).

Acoustic instruments incur latency too. Consider a piano (a real one, say a Bösendorfer concert grand if it helps you to imagine it). After the pianist presses a key, there is a very real delay before sound comes out of the piano. The hammer strikes the strings, which then resonate, and the resonating vibration travels to the sound board and this vibrates the air causing a sound. Pianists learn to compensate.

[quote]I'm not sure if that's a helpful way to think about it. AFAICT: The actual Haiku source and sink nodes are integer based.[/quote]

Not true. The physical audio input and output are handled by the MultiAudio node, which supports any audio format (8/16/32-bit integer or float) that the kernel hardware drivers below it support. If they don't support float in hardware, someone must convert it somewhere in the pipeline. That's true for JACK sources and sinks too. The difference is that the Media Kit doesn't enforce a format and the system audio nodes support all formats, to be able to accommodate any situation. The actual format is negotiated between nodes at connection time. If a node only supports integer, but some other downstream node supports only float, I fail to see how JACK or the Media Kit could avoid the format conversion. Claiming that JACK works in float even with hardware working only in integer is wrong: the driver does the conversion, that's all.
The same goes for the Media Kit, except that the conversion can happen anywhere, not only at the input driver level.

With hardware that works in float format directly, a Media Kit pipeline can perfectly well work in float throughout, including the MultiAudio and Mixer system nodes.
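For instance (just a sketch), a node that wants float simply proposes it in the format it offers at connection time; anything left as a wildcard is filled in during negotiation:

```cpp
#include <MediaDefs.h>

// Sketch: a raw-audio format proposal requesting float samples.
// Wildcard fields (frame rate, channel count, buffer size) get
// settled when the two nodes negotiate the connection.
media_format
MakeFloatProposal()
{
	media_format format;
	format.type = B_MEDIA_RAW_AUDIO;
	format.u.raw_audio = media_multi_audio_format::wildcard;
	format.u.raw_audio.format = media_raw_audio_format::B_AUDIO_FLOAT;
	return format;
}
```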

First, it's the default system mixer, but nothing forbids a pro audio solution from bypassing it completely.
It's just a routing choice to make. Desktop multimedia apps route their audio toward the system mixer because it makes sense, but nothing forbids routing it toward a dedicated sink node instead. In fact, anyone wanting to mimic the JACK pipeline with the Media Kit would have to do exactly that, since JACK doesn't enforce a system-wide mixer client either, AFAICT.

Second, the default system mixer's time source is indeed set to its default output clock rate, simply because this rate can be totally disconnected from the input ones (multiple inputs, not all of them physical). But that doesn't mean it's impossible to connect a chain of source → effect nodes → sink as in a JACK pipeline and set the sink node's time source to be the source's one.
It just doesn't make sense for a desktop operating system's default system mixer to do that, because the source(s) to mix can be anything, not only physical ones, and each can have a distinct rate (and format, as seen above). In this regard, our Haiku system audio mixer must adapt to the same constraints as the PulseAudio one, for instance.
But the fact that the default Haiku media nodes are set up the way they are doesn't mean the Media Kit can only support such a "default" setup.
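To make that routing concrete, here is a rough, untested sketch of the wiring (error handling trimmed); mySinkNode stands for a node the application has created and registered itself:

```cpp
#include <MediaDefs.h>
#include <MediaRoster.h>

// Sketch: wire the physical audio input directly to a custom sink node,
// bypassing the default system mixer. mySinkNode is assumed to be a node
// the application has already instantiated and registered.
status_t
WireInputToSink(const media_node& mySinkNode)
{
	BMediaRoster* roster = BMediaRoster::Roster();

	media_node input;
	status_t err = roster->GetAudioInput(&input);
	if (err != B_OK)
		return err;

	media_output out;
	media_input in;
	int32 count;
	err = roster->GetFreeOutputsFor(input, &out, 1, &count, B_MEDIA_RAW_AUDIO);
	if (err != B_OK || count < 1)
		return B_ERROR;
	err = roster->GetFreeInputsFor(mySinkNode, &in, 1, &count, B_MEDIA_RAW_AUDIO);
	if (err != B_OK || count < 1)
		return B_ERROR;

	media_format format;
	format.type = B_MEDIA_RAW_AUDIO;
	format.u.raw_audio = media_multi_audio_format::wildcard;  // negotiate the rest

	return roster->Connect(out.source, in.destination, &format, &out, &in);
}
```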

I won't be surprised, as English is not my native language :slight_smile:

Well, what I saw in the JACK2 design documentation was about parallel client activation (aka process parallelism) when the graph has a client waiting on more than one client. I didn't see the data-subset parallelism (aka dataset pipelining).
Thanks to your comment, I've read a bit more and I see your point here.

Nothing forbids a Media Kit node from dividing its input buffers and producing sub-buffers at a higher rate to enable similar parallelism in its downstream nodes. The main differences here are:

  • the Media Kit doesn't enforce such a splitting mechanism, nor does it provide support to help a node coder do it automatically. It's up to each node to do it in whatever way is best for it (or its downstream nodes), and when it does, it usually happens during format negotiation at connection time: the buffer size, frame rate and sample count are all part of the negotiated format.

  • every Media Kit node is already forced to work in parallel: they're spawned as distinct threads, and they can't avoid it. The JACK sequential execution cycle model doesn't exist at all in Media Kit land; instead you'll have a pipeline of producer/consumer threads. If an upstream node decides to divide the input frames on output, all downstream nodes will automatically benefit from smaller latency by running in parallel on smaller frames.

True. In real time digital multimedia, strictly speaking, there is only “late”.
The issue is to keep it under control.

While algorithmic latencies indeed can't be compensated for without degrading the output in some way, scheduling latency can be on SMP machines, by being parallelized.
When every node can (theoretically) run in parallel, instead of accumulating the scheduling latency between each of them, you can preroll the nodes. You are then left with just the initial node's scheduling latency, the one which makes real-time audio always late whatever you do :-).

In the end, keep in mind that the Media Kit has to be far more flexible than JACK, which knows it only handles 32-bit audio samples, while the Media Kit has to support variable audio and video formats. Being flexible doesn't mean it's not up to some very specialized task; it means it needs far more setup and tuning to be able to do it. I.e. to become specialized.

I'm not saying the Media Kit is able to do everything JACK manages to do for audio, well optimized for real-time audio processing as JACK is.
What I'm saying is that the Media Kit was not designed to support only the usual playback case, as you seem to assert. In fact, if that were the case, it would be an overly flexible layer for accomplishing just that. This flexibility is there for a reason: to adapt to more than one well-known case.