On 9/21/2024 3:42 PM, Terje Mathisen wrote:
> Michael S wrote:
>> On Sat, 21 Sep 2024 15:39:41 +0200
>> Terje Mathisen <terje.mathisen@tmsw.no> wrote:
>>> MitchAlsup1 wrote:
>>>> On Fri, 20 Sep 2024 20:06:00 +0000, John Dallman wrote:
>>>>> In article <vcgpqt$gndp$1@dont-email.me>, david.brown@hesbynett.no
>>>>> (David Brown) wrote:
>>>>>> Even a complete amateur can notice time mismatches of 10 ms in a
>>>>>> musical context, so for a professional this does not surprise me.
>>>>>> I don't know of any human endeavour that requires lower latency
>>>>>> or more precise timing than music.
>>>>>
>>>>> A friend used to work on set-top boxes, with fairly slow hardware.
>>>>> They had demonstrations of two different ways of handling
>>>>> inability to keep up with the data stream:
>>>>>
>>>>> - Keeping the picture on schedule, and dropping a few milliseconds
>>>>>   of sound.
>>>>> - Dropping a frame of the picture, and keeping the sound on-track.
>>>>>
>>>>> Potential customers always thought they wanted the first approach,
>>>>> until they watched the demos. Human vision fakes a lot of what we
>>>>> "see" at the best of times, but hearing is more sensitive to
>>>>> glitches.
>>>>
>>>> Having the ears being able to hear millisecond differences in
>>>> sound arrival times is key to our ability to hunt and evade
>>>> predators.
>>>
>>> Not only that, but the slight non-cylindrical shape of the ear
>>> opening & canal causes _really_ minute phase shifts, but they are
>>> what makes it possible for us to differentiate between a sound
>>> coming from directly behind vs directly ahead.
>>>
>>>> While our eyes have a time constant closer to 0.1 seconds.
>>>>
>>>> That is, I blame natural selection on the above.
>>>
>>> Supposedly, we devote more of our brain to hearing than to vision?
>>>
>>> Terje
>>
>> Pretty much the whole occipital lobe is dedicated to visual
>> processing (along with various parts of the parietal lobe), only a
>> few small areas of the temporal lobe are for audio.
>> I think, it's not even close in favor of vision.
>
> That would have been my guess as well, but as I wrote above, a few
> years ago I was told it was otherwise. Now I have actually read a few
> papers about how you can actually measure this, and it did make
> sense, i.e. at least an order of magnitude more vision than hearing.
Granted, still, small audio glitches are more noticeable than temporal visual glitches.
Personally, I see most everything much over 16 Hz as full motion (so, much past this, I need a frame-counter to know how fast it is going).
But, I will notice user-input lag much over ~ 75 .. 100 ms or so.
I have noted that my ability to perceive motion starts to break down much above 200-250 ms per frame, at which point it perceptually transitions into a series of still images.
From what I gather, for many people this transition happens closer to 120-140 ms.
To me, 30 vs 60 fps video doesn't make much difference, as I can't really tell them apart.
> Having done both audio and video codec optimization I know that even
> the very highest levels of audio quality are near-DC compared to
> video signals. :-)
One annoyance when working with audio is avoiding the creation of "pops", as any rapid transition in DC level is likely to be fairly obvious (even if fairly small).
The usual strategy for stitching audio together is to generate a certain amount of overlapping output ("overmix") and then cross-fade across the transition.
Though, I suspect, strategies like the MDCT are a bit overkill here (with its 50% overlap, ie: generating twice the output, with half of it being overlap).
One can use computationally cheaper transforms and then extrapolate a little past the block edges (say, with around a 12% overmix), with a weighted blend over the overlapped part. Something similar can also be used for deblocking in JPEG-like codecs, but there it usually isn't worthwhile due to the added computational cost.
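Eg, a minimal sketch of the blend part (names and buffer layout made up here; 'a_tail' being the overmix generated past the end of block A, 'b_head' the start of block B):

  /* Sketch: linear crossfade of the overlapped region, ramping
     from all-A at i=0 to (nearly) all-B at i=olap-1. */
  void blend_overlap(const short *a_tail, const short *b_head,
                     short *dst, int olap)
  {
      int i, wa, wb;
      for (i = 0; i < olap; i++) {
          wa = olap - i;
          wb = i;
          dst[i] = (short)((a_tail[i] * wa + b_head[i] * wb) / olap);
      }
  }

A raised-cosine ramp instead of the linear one avoids a slope discontinuity at the ends of the ramp, but linear is often passable.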
I have noted that I don't really like MP3 all that much: except at higher bitrates (>= 128 kbps), it adds significant and obvious distortions at higher frequencies (particularly obvious below around 80-96 kbps).
At lower bitrates, I almost find ADPCM preferable, as while it doesn't sound good either at a low sample rate, at least it isn't a bunch of squealing and whistling mixed with what sounds like shaking a steel can full of broken glass.
In my past fiddling, the relative quality/bitrate of ADPCM could be improved slightly by using predictive filters and a 3-bit sample size. Though, going much below 3 bits/sample does not work effectively.
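As a rough sketch of what one step of this could look like (hypothetical: a 2-tap linear predictor with a 3-bit code; the taps and step adaptation here are made up, not from any standard ADPCM):

  /* Sketch: one ADPCM-style encoder step with a simple 2-tap
     predictor and a 3-bit code (1 sign bit + 2 magnitude bits).
     The predictor and step adaptation are illustrative only. */
  int adpcm3_encode_step(short s, short hist[2], int *step)
  {
      int pred, diff, sign, mag, q, rec;
      pred = (3 * hist[0] - hist[1]) / 2;   /* linear extrapolation */
      diff = s - pred;
      sign = (diff < 0) ? 4 : 0;
      mag  = (diff < 0) ? -diff : diff;
      q = mag / *step;
      if (q > 3) q = 3;                     /* clamp to 2 bits */
      rec = pred + (sign ? -(q * *step) : (q * *step));
      if (rec >  32767) rec =  32767;       /* saturate */
      if (rec < -32768) rec = -32768;
      hist[1] = hist[0];                    /* decoder does the same, */
      hist[0] = (short)rec;                 /* so both stay in sync   */
      if (q == 3 && *step < 8192)           /* crude step adaptation  */
          *step = (*step * 3) / 2 + 1;
      else if (q == 0 && *step > 1)
          *step = (*step * 3) / 4;
      return sign | q;                      /* the 3-bit code */
  }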
A usual strategy for stereo was to do a center/side transform and then encode the side channel at around 1/4 the sample rate of the center channel. Where, IME, stereo seems to be preserved with a 1/4 downsample (say, 16 kHz with a 4 kHz side channel, or 32 kHz with an 8 kHz side channel).
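A minimal sketch of that transform (made-up names; naive 4:1 box averaging for the side-channel downsample, where a real codec might filter first):

  /* Sketch: center/side transform with the side channel kept at
     1/4 the sample rate of the center channel. */
  void to_center_side(const short *l, const short *r,
                      short *c, short *s4, int n)
  {
      int i, j, acc;
      for (i = 0; i < n; i++)
          c[i] = (short)((l[i] + r[i]) >> 1);    /* center */
      for (i = 0; i < n / 4; i++) {              /* side at 1/4 rate */
          acc = 0;
          for (j = 0; j < 4; j++)
              acc += (l[i * 4 + j] - r[i * 4 + j]) >> 1;
          s4[i] = (short)(acc / 4);
      }
  }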
In the past, I had also noted that the Walsh-Hadamard transform works OK for audio, but seemingly no one uses the WHT for audio. An example of this would be something like a "poor man's MP3":
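Roughly: take the WHT of each block, quantize the coefficients, and spend fewer bits on the higher-sequency ones. As a sketch of the transform part (unnormalized 8-point WHT; the surrounding quantization details would be made up):

  /* Sketch: unnormalized 8-point Walsh-Hadamard transform
     (adds/subtracts only; the inverse is the same transform
     followed by >>3). A codec would quantize t[] and give the
     higher-sequency coefficients fewer bits. */
  void wht8(const short *x, int *t)
  {
      int a[8], b[8], i;
      for (i = 0; i < 4; i++) {            /* stage 1: stride 4 */
          a[i]     = x[i] + x[i + 4];
          a[i + 4] = x[i] - x[i + 4];
      }
      for (i = 0; i < 2; i++) {            /* stage 2: stride 2 */
          b[i]     = a[i] + a[i + 2];
          b[i + 2] = a[i] - a[i + 2];
          b[i + 4] = a[i + 4] + a[i + 6];
          b[i + 6] = a[i + 4] - a[i + 6];
      }
      for (i = 0; i < 4; i++) {            /* stage 3: stride 1 */
          t[i * 2]     = b[i * 2] + b[i * 2 + 1];
          t[i * 2 + 1] = b[i * 2] - b[i * 2 + 1];
      }
  }

The butterflies are all adds and subtracts, so it is much cheaper than an MDCT, just with worse frequency separation.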
I hadn't done much with it, as my priority had often been on "cheap to decode" designs with fixed block sizes, which generally favored either an ADPCM variant, or encoding points above/below a segmented line. One variation of the segmented line approach is, for a block of samples:
...
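Eg, one possible (made-up) concrete form: store the block's endpoint samples, linearly interpolate a line between them, and emit 1 bit per interior sample for above/below the line, applied at decode as +/- a per-block delta:

  /* Sketch: a made-up variation of the segmented-line idea.
     Stores the endpoint samples and an average deviation, plus
     1 bit per interior sample (above/below the interpolated
     line). Assumes block length n <= 32. */
  unsigned slb_encode(const short *s, int n, short ends[2], short *delta)
  {
      unsigned bits = 0;
      int i, acc = 0;
      ends[0] = s[0];
      ends[1] = s[n - 1];
      for (i = 1; i < n - 1; i++) {
          int line = s[0] + ((s[n - 1] - s[0]) * i) / (n - 1);
          int d = s[i] - line;
          acc += (d < 0) ? -d : d;
          if (d >= 0)
              bits |= 1u << i;        /* sample is above the line */
      }
      *delta = (short)(acc / ((n > 2) ? (n - 2) : 1));
      return bits;
  }

Decode is then just the line interpolation plus an add or subtract per sample, which keeps it cheap.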
> Terje