In a real-time case, yes. As noted by korli, the measurable latency would be 6.5ms. There I was wrong. In other cases it could vary between 2.5ms and 6.5ms (e.g. when playing audio from disk, i.e. when the data is not coming in from a real-time source).
Although the initial reason for getting into this example was your assertion that the latency would always come out as integer multiples of the buffer duration, which, I suppose you agree, is not correct.
Yep, and that’s exactly what I said in different words: if a node has 5ms latency (and uses it!) in a chain using 4ms buffers, that means it needs 5ms to process 4ms of audio, so the workload is too high. It can’t do “one period of audio” (one buffer duration) in “one period of time” (the node’s used latency).
(There are cases where it can make sense to report to the MediaKit a node latency higher than buffer duration, which is fine as long as the node doesn’t actually use it entirely.)
It doesn’t make any sense to talk about “latency” for a system that doesn’t operate in real time. Note that in Haiku “playing audio from disk” is in ordinary practice a real time operation, because the software volume controls sit between the media player and the soundcard, and also the supplied media player has local controls too. When you increase the volume in the mixer GUI, that’s a real time change to the audio being performed.
Ah, the subtle stuff. I could really use a whiteboard here, but we’ll have to do without. We’ve seen why it can’t be less than one buffer length, but the reason it can’t be less than two is a bit trickier.
Haiku’s Media Kit, like ASIO or JACK, lets the workload vary over time. You can drop in another synth and it’ll just keep going. You don’t need to stop everything, black out the stage and spend a couple of minutes re-wiring; you just get on with it. But there’s a price: even if the framework in fact chooses to schedule your node to run close to the end of a period, you can’t rely on that. If you did, the result would be either drop-outs or nasty jitter when things change. So you always have to act as though you might run at the very start of the period, and the latency will be two buffer lengths (or three, four, etc.)
Today most of Haiku isn’t built to this exacting standard; there are plenty of other things that need fixing more urgently. But it would be a mistake to take away the lesson that Haiku is doing better here rather than worse.
[quote]
Yep, and that’s exactly what I said in different words: if a node has 5ms latency (and uses it!) in a chain using 4ms buffers, that means it needs 5ms to process 4ms of audio, so the workload is too high. It can’t do “one period of audio” (one buffer duration) in “one period of time” (the node’s used latency).
(There are cases where it can make sense to report to the MediaKit a node latency higher than buffer duration, which is fine as long as the node doesn’t actually use it entirely.)[/quote]
The entire workload, not just one node, exactly as korli and I have explained. Forget about the nodes here.
The difference I was pointing out is the following:
a) Capturing data from a live source, such as a soundcard’s audio input. You cannot fill 4ms of buffer within 1ms of real time, because 4ms of real-world time have to pass to acquire the data in the first place, unless your soundcard has a built-in time machine.
b) Playing data from a file on disk, where you can just read ahead and know the data of “future” points in time.
In case a) for our example, the latency would be fixed at 6.5ms. In case b), however, it depends on when the change request from the user comes in. When the user initially hits play, sound comes out 2.5ms later; that’s 2.5ms latency. Now let’s say that, while playing, the user seeks the playback to some other place in the file, and he does that at t=0.1ms (just after the producer started working on the first buffer, so it’s too late to change it). This change will be reflected with 6.4ms latency. If he instead did it at t=3.9ms (just before the second buffer is prepared), the latency would be only 2.6ms. So the range is 2.5-6.5ms. Of course it’s possible to write the node in a way which always fixes it to 6.5ms, in case the application does not want any jitter.
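To make that arithmetic concrete, here’s a tiny sketch in plain C++ (not Media Kit code; the 4ms buffer duration and the 2.5ms output path are just the numbers from this example, and only the seek case is handled):
[code]
#include <cstdio>

// Assumptions from this example: 4ms buffers, the producer starts a new
// buffer at t = 0, 4, 8, ... and there is a fixed 2.5ms path from "producer
// starts the buffer" until the sound reaches the speaker.
double SeekLatencyMs(double requestTimeMs)
{
	const double kBufferMs = 4.0;
	const double kOutputDelayMs = 2.5;
	// A change request can only take effect in the next buffer the
	// producer starts after the request arrives.
	double nextBufferStart
		= (static_cast<int>(requestTimeMs / kBufferMs) + 1) * kBufferMs;
	return nextBufferStart + kOutputDelayMs - requestTimeMs;
}

int main()
{
	std::printf("%.1f ms\n", SeekLatencyMs(0.1));	// 6.4 ms
	std::printf("%.1f ms\n", SeekLatencyMs(3.9));	// 2.6 ms
	return 0;
}
[/code]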
Well, first of all, your claim was that the latency must always be an integer multiple of the buffer duration. We are both agreeing now that in the example we used, the latency is 6.5ms. That is not an integer multiple of the 4ms buffer size, so the example falsifies the assertion.
Secondly, the scheduling for buffer processing is always “as late as possible”. Why would it cause problems on rewiring?
No. The chain length is irrelevant, only the nodes matter.
Again: what I said is that “any single” node’s latency must be below buffer duration. That means: every one of them in the chain, individually, must obey the constraint. The constraint can be rephrased as “a node must be able to process a buffer in less real-time than it will take to play back the buffer in the end”. As long as every node obeys this, no dropouts can happen.
Although, thinking about it, writing “any single” to mean “each single node individually” was easy to misunderstand; I should have phrased it as “every single” or “each single” in the earlier postings… hmh, English skill fail. Maybe that was the confusion?
However, if you still disagree, please provide an example, any example, of a node chain which a) has only nodes with latency smaller than the buffer duration used in the chain and b) gets its workload too high. Try it out, you will see it cannot happen.
Avoiding jitter is a basic requirement. I mentioned this quite a lot earlier, you can’t take seriously a system that will introduce jitter in this way. We are not talking about the more or less imaginary sub-frame jitter that audiophiles mumble about when reviewing cables, and which most likely vanishes during reconstruction - this is jitter of the order of milliseconds, which is all too audible.
That’s not how agreeing works. I’ve been trying to get you to understand all the way through how it has to work, we’re not trying to reach a compromise here, the question is only whether I get you to understand or not.
[quote]
Secondly, the scheduling for buffer processing is always “as late as possible”. Why would it cause problems on rewiring?[/quote]
Aside from more prosaic considerations, think about what can change during rewiring. If your scheduling moves earlier, you will be running too early to deliver your promised latency, and you have no control over that, so you have to always assume the worst case or you’ll introduce drop-outs or jitter. That worst case is two (or more, if you introduce more) buffers.
[quote]
However, if you still disagree, please provide an example, any example, of a node chain which a) has only nodes with latency smaller than the buffer duration used in the chain and b) gets its workload too high. Try it out, you will see it cannot happen.[/quote]
We already saw exactly this, using your own example numbers, earlier in this thread, and you simply snipped it out completely from all further conversation as if it did not exist.
Your insistence on trying to think only of the workload of individual nodes leads you astray. All of the nodes must share the same resources. Multitasking isn’t fairy magic, the computer doesn’t become more powerful when you run more stuff, it just parcels the same resources out more thinly.
[quote=NoHaikuForMe]Avoiding jitter is a basic requirement. I mentioned this quite a lot earlier, you can’t take seriously a system that will introduce jitter in this way.
[/quote]
Avoiding the jitter is easy, but for a simple media player type application it doesn’t matter at all.
Now you’re evading an answer. We had an example. Is the latency in it 6.5ms? It’s a simple yes or no question. I don’t know where you get the idea that this is about reaching a compromise or not.
Fine, let’s look at that again. The timeline is wrong. As I said very early on in this discussion, the node chain is a pipeline: every node runs in its own thread, and they work in parallel. Concurrency is what multitasking is all about.
The example was:
Nodes A → B → C.
Latencies: A=1ms B=3ms C=1ms, buffer duration is 4ms
The correct timeline for that one is:
t=0: A begins filling buffer 1
t=1: Buffer 1 sent to B
t=4: Buffer 1 sent to C // at the same time: A begins filling buffer 2
t=5: Buffer 1 is played to the user // at the same time: buffer 2 sent to B
t=8: Buffer 2 sent to C
t=9: Buffer 2 played to the user.
No dropout happened!
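The same timeline as a small sketch, if that helps; it’s pure arithmetic (no Media Kit calls), just adding the latencies up per buffer:
[code]
#include <cstdio>

// Node latencies from the example, in ms; buffer duration is 4ms.
// The producer A starts buffer n at t = n * 4, one buffer ahead of playback.
int main()
{
	const double latA = 1.0, latB = 3.0, latC = 1.0;
	const double bufferMs = 4.0;

	for (int n = 0; n < 3; n++) {
		double startA  = n * bufferMs;		// A begins filling this buffer
		double sendToB = startA + latA;		// buffer handed to B
		double sendToC = sendToB + latB;	// buffer handed to C
		double playAt  = sendToC + latC;	// buffer reaches the speaker
		std::printf("buffer %d: A@%.0f  B@%.0f  C@%.0f  play@%.0f\n",
			n + 1, startA, sendToB, sendToC, playAt);
	}
	// Playback times come out 4ms apart (5, 9, 13, ...), exactly one buffer
	// duration, so the stream is gapless: no dropout.
	return 0;
}
[/code]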
It certainly isn’t. And yet, many nodes can work concurrently, even on a single-processor system.
No, to do this correctly the latency will be 8ms. If you assume you’ll be scheduled later you risk drop-outs or jitter so that’s the wrong decision.
If you want to do twice as many things at once, you end up doing them twice as slowly; that’s how timeslicing works. I’ll show the number of simultaneous execution contexts as X to help with following along, and assume that the above rule of thumb is sufficient; in reality the penalty for having multiple execution contexts is a little higher because of context switches, caches, etc.
t=0: A begins filling buffer 1 (X=1)
t=1: Buffer 1 sent to B, B starts filtering (X=1)
t=4: Buffer 1 sent to C && A begins filling buffer 2 (X=2); with two things to do at once they take twice as long, two milliseconds to finish both
t=6: Buffer 1 starts being played to the user && buffer 2 sent to B for filtering (X=1); back to just B running for a while at full speed
t=9: Buffer 2 sent to C && A begins filling buffer 3 (X=2); slow mode again, two milliseconds until we’re done
t=10: a PCM buffer underrun occurs
Huh. That didn’t work after all. Well let’s try to shuffle things around.
t=0: A begins filling buffer 1 (X=1)
t=1: Buffer 1 sent to B, B starts filtering (X=1)
t=4: Buffer 1 sent to C && A begins filling buffer 2 (X=2); with two things to do at once they take twice as long, two milliseconds to finish both
t=6: Buffer 1 starts being played to the user && buffer 2 sent to B for filtering (X=1)
t=9: Buffer 2 sent to C (X=1) Yay, we cut some corners but now we’ll make it in time!
t=10: Buffer 2 starts being played to the user && A begins filling buffer 3 (X=1)
t=11: Buffer 3 sent to B (X=1)
t=14: a PCM buffer underrun occurs
Still didn’t work.
You can keep re-arranging things all day, but you’re trying to fit 5ms of work into 4ms of time so it won’t fit, no matter what you do.
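If you want to see the capacity argument as code, here’s a minimal sketch; it assumes (as this whole example does) that each node really consumes its stated latency as CPU time per buffer:
[code]
#include <cstdio>

// Assumption of this example: the three nodes really consume their stated
// latencies (1 + 3 + 1 = 5ms) as CPU time for every 4ms buffer.
int main()
{
	const double workPerPeriodMs = 1.0 + 3.0 + 1.0;	// A + B + C
	const double periodMs = 4.0;

	double backlogMs = 0.0;
	for (int period = 1; period <= 5; period++) {
		// Each period adds 5ms of work but only 4ms of time passes,
		// so 1ms of unfinished work piles up, however it is interleaved.
		backlogMs += workPerPeriodMs - periodMs;
		std::printf("after period %d: %.0f ms of work still pending\n",
			period, backlogMs);
	}
	// Sooner or later a buffer misses its playback deadline: an underrun.
	return 0;
}
[/code]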
No, only the illusion of concurrency is created, and only at a human timescale; the timeslices are not actually happening simultaneously. And on the scale we’ve been discussing these slices are quite thick, milliseconds, like the buffer period. If one node’s thread is pre-empted by another it most likely won’t get another chance until the next period. That doesn’t matter for your example, which is doomed anyway, but it’s important to keep in mind for e.g. graphs where parallel execution is possible within a single buffer period.
It should not take twice as long if the node was written correctly.
A node which told the Media Kit it has 1ms latency, and now suddenly uses 2ms, has run late; it did not keep its “promise” to finish in the designated time. That should be avoided by choosing latency values correctly.
When a node tells the Media Kit “my latency is 3ms”, that should not mean “I need 3ms of exclusive CPU time, all to myself, to process a buffer”. That would indeed be a problem; two such nodes could not work at the same time without running late. But in a multitasking system, you may not assume that you have the CPU all to yourself.
So the way latencies are estimated by nodes in the Media Kit is to incorporate the fact that other things are running as well. One way to do it is e.g. to make a preflight run to see how fast you can go on the machine, and then add headroom on top of that to account for varying machine load.
Thus, two (and more) nodes can work at the same time and both can still meet their timing constraints, because the latency time is higher than what the node really needs in exclusive CPU time. It must be this way: after all, the CPU is not only shared between all the media nodes, there are other things unrelated to media to do in real time as well. If you set latencies super-tight, music would stutter just because you multitask!
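A hypothetical sketch of such a preflight, just to illustrate the idea (ProcessBuffer() and kHeadroomFactor are made-up names, not Media Kit API; system_time() is the Haiku call):
[code]
#include <cstddef>
#include <OS.h>
	// system_time() and bigtime_t, on Haiku

// Hypothetical sketch: run the node's real processing once over a test buffer
// when the connection is made, measure how long it took, and add headroom
// because other threads will be competing for the CPU later.
static void
ProcessBuffer(float* samples, size_t frameCount)
{
	for (size_t i = 0; i < frameCount; i++)
		samples[i] *= 0.5f;	// stand-in for the node's actual DSP
}

bigtime_t
EstimateProcessingLatency(float* testBuffer, size_t frameCount)
{
	bigtime_t start = system_time();
	ProcessBuffer(testBuffer, frameCount);
	bigtime_t elapsed = system_time() - start;

	const double kHeadroomFactor = 3.0;	// room for varying machine load
	return (bigtime_t)(elapsed * kHeadroomFactor);
}
[/code]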
Now, what happens if the node estimated too small a latency anyway? It runs late. The node after it in the chain will notice that and send a “late notice” to its predecessor, which will (usually) increase the latency of the late node. The Media Kit handles all this and readjusts the downstream latencies for all nodes which need to know about it, so buffers flow in time again with the new overall chain latency. In general, the successor node will be able to cope with a little lateness, so no dropout happens even when the workload goes up and lateness occurs, although in really bad cases a dropout might be unavoidable. But that is no different from other systems I know.
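For illustration, this is roughly what the late-notice handling can look like in a producer node (a fragment only; it assumes a class derived from BBufferProducer and BMediaEventLooper with members fOutput, fInternalLatency and fDownstreamLatency, following the pattern of Be’s sample producers; the adjustment policy shown is just one simplistic possibility):
[code]
// Fragment only, not a complete node. The hook signature is the documented
// BBufferProducer one.
void
MyProducer::LateNoticeReceived(const media_source& what, bigtime_t howMuch,
	bigtime_t performanceTime)
{
	if (what != fOutput.source)
		return;

	// Assume we underestimated our own processing time by the reported amount.
	fInternalLatency += howMuch;

	// BMediaEventLooper will now schedule our buffer events earlier, and the
	// downstream latencies get readjusted as described above.
	SetEventLatency(fDownstreamLatency + fInternalLatency);
}
[/code]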
Why not? Other than because it would be inconvenient for you?
So, explain how any node can ever reliably keep the “promise” to do finite work without any guarantee in return that it will get some amount of resources to do the work in. If that’s not possible, why bother asking them to make such a “promise” at all?
OK, so again, how should they decide what to report? Multiply everything by two? By fifty? By a thousand?
This is hand-waving. A preflight for one node discovers nothing about its behaviour when competing with other nodes for resources. How shall a node correctly discover the latency you’ve been talking about which magically allows for an unlimited number of simultaneous execution contexts?
So, the way you make 1+1+3 less than 4 is by being less than truthful with all the numbers: the 1s are actually worth 0.5, the 3 is maybe worth 2, and so the total is only 3 after all. Pure theatre. A system that dispenses with this foolishness gets the same results; 3ms of real work can be executed synchronously in under 4ms too.
[quote]
Now, what happens if the node estimated too small a latency anyway? It runs late. The node after it in the chain will notice that and send a “late notice” to its predecessor, which will (usually) increase the latency of the late node. The Media Kit handles all this and readjusts the downstream latencies for all nodes which need to know about it, so buffers flow in time again with the new overall chain latency. In general, the successor node will be able to cope with a little lateness, so no dropout happens even when the workload goes up and lateness occurs, although in really bad cases a dropout might be unavoidable. But that is no different from other systems I know.[/quote]
There can be a drop-out or jitter every single time this adjustment is made, which is intolerable. Worse, these silly estimates and the pretence that it’s OK for the entire workload to take more than one period lead to the system spiralling out of control with “late” messages when workload exceeds capacity, rather than booting things from the graph.
It looks like you have zero understanding of what real time means; this is a nonsense question.
There’s no magic “how to decide”; it depends on what the node does. A lot of nodes do it by actually measuring the latency when they are loaded. So usually the declared latency isn’t a hardcoded value from the dreams of programmers, but a practical probe of the system.
The late-node notification is an extreme case, and it’s supposed to happen only when something steals a lot of CPU cycles and the nodes don’t receive their expected amount of them. It’s like opening too many programs on your PC: if it’s slow, is that the fault of the programmer, or are you going beyond the possibilities of your CPU?
But what’s your solution? Do you want the system to not care about latency at all? I can’t understand the point. Please, give us the philosopher’s stone.
An example of a system which solves the problem better would also be appreciated.
This probe can only measure the state when it is performed, not the state it’s being asked to predict. You haven’t offered (and indeed the available Haiku software makes no attempt to offer) a solution for that. The entire premise (if we are to believe jua) of the Media Kit is undermined by these estimates.
Check for example the equalizer node provided with Haiku: its latency estimate is calculated by simply running its calculations over a test buffer at the moment of connection. This is exactly the sort of probe you’re talking about, and jua would have us believe that it shouldn’t do this, because the results from running alone now will of course not reflect what happens if later it is competing with another node for timeslices. But what should it do instead?
[quote]But what’s your solution? Do you want the system to not care about latency at all? I can’t understand the point. Please, give us the philosopher’s stone.
An example of a system which solves the problem better would also be appreciated.[/quote]
For audio on a PC type system? Do all the work and race to idle. This was the approach taken by JACK for good reason. Note that if you lose this race then by definition you could not have avoided a buffer underrun by any means, whereas you can’t be sure with the Media Kit approach described above – so JACK is justified in deciding to kick a node from the graph for running too slowly if it misses the deadline. In the Media Kit design you can at best rely on heuristics for managing lateness.
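For comparison, a minimal sketch of what that looks like as a JACK client (standard JACK C API, with a trivial gain standing in for the period’s work):
[code]
#include <cstdio>
#include <jack/jack.h>

// JACK calls process() once per period; the client does everything for that
// period synchronously and returns. There is no latency estimate to get
// right; if the work doesn't fit in the period, the result is an xrun.
static jack_port_t* in_port;
static jack_port_t* out_port;

static int
process(jack_nframes_t nframes, void*)
{
	float* in  = (float*)jack_port_get_buffer(in_port, nframes);
	float* out = (float*)jack_port_get_buffer(out_port, nframes);
	for (jack_nframes_t i = 0; i < nframes; i++)
		out[i] = in[i] * 2.0f;	// the whole period's work, done now
	return 0;
}

int
main()
{
	jack_client_t* client = jack_client_open("gain2", JackNullOption, NULL);
	if (client == NULL)
		return 1;

	in_port = jack_port_register(client, "in", JACK_DEFAULT_AUDIO_TYPE,
		JackPortIsInput, 0);
	out_port = jack_port_register(client, "out", JACK_DEFAULT_AUDIO_TYPE,
		JackPortIsOutput, 0);
	jack_set_process_callback(client, process, NULL);
	jack_activate(client);

	std::getchar();	// run until Enter is pressed
	jack_client_close(client);
	return 0;
}
[/code]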
I had lost interest in replying due to the increasingly aggressive discussion style. So, if you want to know more about the Media Kit, I’d just refer you to the plethora of available documentation.
In the end, you keep arguing that the Media Kit is fundamentally flawed, as if it was just a theoretical concept which has never been tried out in practice. Fact is, two implementations exist, the one in BeOS and the one in Haiku. Believe it or not, they do work.
On the one hand, I really am mostly interested in explaining the theory. It’s an opportunity for people to learn stuff that’s useful outside of the present context, stuff that I wish I’d known twenty years ago before I wrote my first audio software.
But on the other hand, in practice yes the Media Kit design really is pretty terrible. Does it work? After a fashion sure, and so did the BeOS R5 netserver.
[quote=jua]I had lost interest in replying due to the increasingly aggressive discussion style. So, if you want to know more about the Media Kit, I’d just refer you to the plethora of available documentation.
In the end, you keep arguing that the Media Kit is fundamentally flawed, as if it was just a theoretical concept which has never been tried out in practice. Fact is, two implementations exist, the one in BeOS and the one in Haiku. Believe it or not, they do work.[/quote]
What exact theory are you discussing? Processing latency, transport latency, AD conversion latency? Which latencies, where do they exist in the audio chain, and what are you rambling on about? There is nothing fundamentally flawed with the Media Kit design; it works rather well in practice. It’s based on the BeOS design, which BTW inspired the jack2 implementation, the new Windows implementation and most of the Linux implementations.
That’s reality. Does it need some updates, features and enhancements? Sure, but it’s the basis for every other media backend design I can see, and it was the first design to actually conquer latency in a practical and sane way. You can spend 20 years doing something and still suck at it, and you can spend 20 years doing something and master it.
Haiku needs an intuitive interface for controlling audio streams, and the software should be very easy to learn. Cortex will be of little use to musicians :) We need at least a semblance of Logic Studio, Adobe Audition 3 (CoolEditPro) or FL Studio: simple, comfortable music editors. (translated)
The solution is simply to repeat this check programmatically from time to time and update the latencies accordingly. But for most nodes it’s not needed.
I’ll try to use different words to explain what you aren’t understanding (or refusing to). Before all it’s false that latencies aren’t not taken into consideration in jack. If you go through the documentation, at some point this question is explained very well. Basically, the latency callback is not declared by clients with only input or output ports, or when the input is completely unrelated to the output. In BeOS the situation is no different: for (pure) consumers the latency isn’t something that has meaning at all.
[quote]
For audio on a PC type system? Do all the work and race to idle. This was the approach taken by JACK for good reason. Note that if you lose this race then by definition you could not have avoided a buffer underrun by any means, whereas you can’t be sure with the Media Kit approach described above – so JACK is justified in deciding to kick a node from the graph for running too slowly if it misses the deadline. In the Media Kit design you can at best rely on heuristics for managing lateness.[/quote]
Extending what I said before, let’s dispel some myths out there. The approach taken by Haiku is performance-driven, that’s true. But by itself this means only that you have more control over what will happen in the node. This is probably not needed in an audio system, but the media_kit is not that type of beast in the strict sense. So, I’ll show you the current approach which a BMediaEventLooper usually takes:
1 - the new buffer event is scheduled at time x
2 - receive a new buffer event at time x
3 - create the buffer and send it
Theoretically there’s nothing preventing you from doing this:
1 - create the buffer
2 - the new buffer event is scheduled at time x
3 - receive a new buffer event at time x, retrieve the buffer and send it
From the latency perspective, I think JACK uses some method to establish how much time it will take to process the client (when the latency callback is not declared), and a similar technique could be used on top of the media_kit. I don’t really think those are the problems out there.
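For illustration, the three steps above roughly take this shape in a BMediaEventLooper-based producer (a fragment only, loosely modeled on the sample producers; FillNextBuffer(), fOutput and fBufferDuration are assumed members, and the SendBuffer() call is shown with the Haiku signature):
[code]
// Fragment only, not a complete node.
void
MyProducer::HandleEvent(const media_timed_event* event, bigtime_t lateness,
	bool realTimeEvent)
{
	switch (event->type) {
		case BTimedEventQueue::B_HANDLE_BUFFER:
		{
			// 2 - the previously scheduled "new buffer" event fires at time x
			BBuffer* buffer = FillNextBuffer(event->event_time);

			// 3 - create the buffer and send it downstream
			if (buffer != NULL && SendBuffer(buffer, fOutput.source,
					fOutput.destination) != B_OK)
				buffer->Recycle();

			// 1 - schedule the next buffer event, one buffer duration later
			media_timed_event nextEvent(event->event_time + fBufferDuration,
				BTimedEventQueue::B_HANDLE_BUFFER);
			EventQueue()->AddEvent(nextEvent);
			break;
		}
		default:
			break;
	}
}
[/code]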
[quote=NoHaikuForMe]
On the one hand, I really am mostly interested in explaining the theory. It’s an opportunity for people to learn stuff that’s useful outside of the present context, stuff that I wish I’d known twenty years ago before I wrote my first audio software.
But on the other hand, in practice yes the Media Kit design really is pretty terrible. Does it work? After a fashion sure, and so did the BeOS R5 netserver.[/quote]
Did you have to work with the Media Kit, or is this just from looking into the code? And which framework do you think is “well” designed? What is your opinion? Because there are always two sides: some frameworks are “well” designed in the backend but the API is horrible, and for others maybe the API is well designed but the backend is way too slow.
Please give a variety of examples of nodes for which no solution is needed. You should explain your working. If you don’t have any examples to give, you should reconsider your argument that “most nodes” don’t need to solve this problem. Note that for the overrun examples, “updating the latencies accordingly” just results in latencies climbing forever; you still can only do finite work in finite time, as before. This is why jua’s clever examples don’t work: they assume you can do infinite work in finite time just by slicing it up more. Maybe a nice argument for a theoretical mathematician, but quite obviously wrong for a computer programmer.
OK?
Please try to rewrite this so that it doesn’t contain a double negative with unclear valency. I recommend trying to write things in positive terms when you can, for example perhaps you meant here “It’s true that JACK takes latency into consideration” ?
As the documentation explains the latency callback is only to be used by clients which introduce an algorithmic delay. This occurs when the algorithm unavoidably needs to use “future” input sample frames to calculate the value of “past” output sample frames. A “lookahead limiter” is a popular and useful example.
You should not confuse it with delays related to the audio hardware (which JACK tracks separately), or with the “latency” values used by the Media Kit for scheduling decisions.
The situation is quite different in BeOS because the latency reported is used for scheduling decisions. That would be nonsensical in JACK.
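A minimal sketch of such a latency callback, for a hypothetical client with a fixed lookahead (standard JACK latency API; in_port, out_port and the client setup are assumed to exist elsewhere):
[code]
#include <jack/jack.h>

// Hypothetical client with a fixed algorithmic lookahead, e.g. a lookahead
// limiter.
extern jack_port_t* in_port;
extern jack_port_t* out_port;
static const jack_nframes_t kLookaheadFrames = 64;

static void
latency_cb(jack_latency_callback_mode_t mode, void*)
{
	jack_latency_range_t range;
	if (mode == JackCaptureLatency) {
		// Data on our output corresponds to input captured
		// kLookaheadFrames earlier than a plain pass-through would imply.
		jack_port_get_latency_range(in_port, JackCaptureLatency, &range);
		range.min += kLookaheadFrames;
		range.max += kLookaheadFrames;
		jack_port_set_latency_range(out_port, JackCaptureLatency, &range);
	} else {
		jack_port_get_latency_range(out_port, JackPlaybackLatency, &range);
		range.min += kLookaheadFrames;
		range.max += kLookaheadFrames;
		jack_port_set_latency_range(in_port, JackPlaybackLatency, &range);
	}
}

// Registered once after the client is opened:
// jack_set_latency_callback(client, latency_cb, NULL);
[/code]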
[quote]Extending what I said before, let’s dispel some myths out there. The approach taken by Haiku is performance-driven, that’s true. But by itself this means only that you have more control over what will happen in the node. This is probably not needed in an audio system, but the media_kit is not that type of beast in the strict sense. So, I’ll show you the current approach which a BMediaEventLooper usually takes:
1 - the new buffer event is scheduled at time x
2 - receive a new buffer event at time x
3 - create the buffer and send it
Theoretically there’s nothing preventing you from doing this:
1 - create the buffer
2 - the new buffer event is scheduled at time x
3 - receive a new buffer event at time x, retrieve the buffer and send it[/quote]
For a BMediaEventLooper implementing a trivial effect (say, doubling the linear amplitude of an audio stream), which of these steps is deriving the new values from the old? I suppose it is “create the buffer”? In that case you are wrong: you are prevented from moving this step before “receive a new buffer”, because you cannot proceed until you have the input data to be processed.
Of course you can sidestep this by incurring an entire period of additional latency for each such node. That’s actually more or less what Be’s engineers were recommending when the BMediaEventLooper was introduced. Their plan was to come up with a fix for the next release of BeOS. Alas, events overtook them and BeOS R5 missed quite a lot of things more obvious than this. The Media Kit you have today leaves the problem unsolved.
No. JACK doesn’t need to “establish how much time it will take to process the client”. It just runs the client, synchronously, as described previously. If the client takes too long to run (causing xruns) it is dropped from the graph.
Because JACK does not have this imaginary “method” you cannot incorporate it “on top of the media_kit”.
[quote=Paradoxon]
Did you have to work with the Media Kit, or is this just from looking into the code? And which framework do you think is “well” designed? What is your opinion? Because there are always two sides: some frameworks are “well” designed in the backend but the API is horrible, and for others maybe the API is well designed but the backend is way too slow.[/quote]
I have spent some small amount of time playing with the Media Kit (writing actual code, though nothing of value), and somewhat more time reading the documentation and source code. I had actually assumed, years ago when it was current, that it was somewhat more capable than in fact it is.
I have also written non-trivial programs for JACK and fooled around with GStreamer. I have read about, but never had cause to use, JMF and CoreAudio.
Problems like being “way too slow” are called Quality of Implementation issues. Haiku scores very badly here. For example over the years there have been several bug reports (e.g. #1351, #9438) about Haiku’s interpolating resamplers. From time to time it is pointed out that the entire approach of the resampler is wrong (e.g. it needs to know the actual sample rates in order to function correctly) and must be replaced, but as with most things in Haiku nobody finds the time to do it, although they’ve got time for bike-shedding about it.
The API is nothing special, it’s certainly not the worst part of the Media Kit. The programmer is (as several others have observed) left doing a lot of unnecessary housekeeping that could be taken care of by the OS / the kit. There are various things missing, and no evidence anyone is looking to add them (such as monitors and session management). It looks more or less how you’d expect given that it was all abandoned, unfinished, almost 15 years ago.
Any BBufferConsumer is an example of a node which doesn’t need complex latency handling, because its latency is usually fixed. The consumer latency (much like what JACK does with the driver latency) is used to establish how much time will pass before the buffer is heard at the speakers.
Talking more generally, if the node’s processing cost is not variable (in the sense that it’s always Ω(n) or Ω(n log n) and so on), its latency can be calculated perfectly well using a phantom cycle, just as I’ve seen Clockwerk do.
So at this point you may ask (as before): “What happens when the load becomes too high?” Well, the media_kit notifies the producer that it’s producing buffers too late. At that point the producer should update its latency accordingly. I would also like to point out that this way of doing things is more reasonable for video than for audio.
You’ve understood it perfectly. There’s no need to comment on a typo by a non-native speaker. This just shows us your mocking way of conducting discussions.
NO. It’s used to get the latency of a port when it is not trivial, for example if the internal signal path is not unique and JACK can’t trivially establish how to operate. The callback is needed, for example, in a client processing audio from two separate inputs to two separate outputs.
Absolutely not, scheduling is another story. BeOS uses the latency to determine when the buffer should be played out.
[quote]
The only difference with JACK in this case is that JACK processes everything in a single cycle, whereas BeOS instead uses different threads to do that in a cooperative fashion. But if a JACK client blocks, the result will be the same (an xrun).[/quote]
That’s something like an xrun.
[quote]
No. JACK doesn’t need to “establish how much time it will take to process the client”. It just runs the client, synchronously, as described previously. If the client takes too long to run (causing xruns) it is dropped from the graph.
Because JACK does not have this imaginary “method” you cannot incorporate it “on top of the media_kit”.[/quote]
Anyway, in sync mode JACK always waits for the clients to end. In async mode it doesn’t wait and goes on to the next cycle. If JACK were not taking latencies into account, how would it recognize xruns?
Well, I remember it doing all of this in the driver:
1 - the buffer is 10ms long
2 - process graph
3 - if processing has taken longer than 10ms we have an xrun
But this way of working is replicable in Haiku too. JACK is like a media_node itself and can be compared to one. Its clients aren’t comparable to media_nodes, but rather to a small subset of them.
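That driver-side check is easy enough to express as a sketch in plain C++ (process_graph() is a stand-in for running the whole graph for one period):
[code]
#include <chrono>
#include <cstdio>

// Stand-in for running every client for one period.
static void
process_graph()
{
	// ... do the whole period's work for all clients ...
}

int
main()
{
	using clock = std::chrono::steady_clock;
	const auto period = std::chrono::milliseconds(10);	// 1 - the buffer is 10ms long

	for (int cycle = 0; cycle < 100; cycle++) {
		auto start = clock::now();
		process_graph();			// 2 - process the graph
		if (clock::now() - start > period)	// 3 - took longer than 10ms?
			std::printf("xrun in cycle %d\n", cycle);
	}
	return 0;
}
[/code]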