Vorbisfile documentation

vorbisfile version 1.2.0 - 20070723

What is Crosslapping?

Crosslapping blends two samples together using a window function, such that any sudden discontinuities between the samples that may cause clicks or thumps are eliminated or blended away. The technique is nearly identical to how Vorbis internally splices together frames of audio data during normal decode. API functions are provided to crosslap transitions between separate streams, or to crosslap when seeking within a single stream.

Why Crosslap?

The source of boundary clicks

Vorbis is a lossy compression format, so any compressed signal is at best a close approximation of the original. The approximation may be very good (i.e., indistinguishable to the human ear), but it is an approximation nonetheless. Even if a sample or set of samples is constructed carefully such that transitions from one to another match perfectly in the original, the compression process introduces minute amplitude and phase errors. It's an unavoidable result of such high compression rates.

If an application transitions instantly from one sample to another, any tiny discrepancy introduced in the lossy compression process becomes audible as a stairstep discontinuity. Even if the discrepancy in a normal lapped frame is only 0.1 dB (usually far below the threshold of perception), that's a sudden cliff of about 380 steps in a 16-bit sample when there's a boundary with no lapping.

I thought Vorbis was gapless

It is. Vorbis introduces no extra samples at the beginning or end of a stream, nor does it remove any samples. Gapless encoding eliminates 99% of the clicks, pops, or outright blown speakers that would occur if boundaries had gaps or if no effort were made to align transitions. However, gapless encoding is not enough to eliminate stairstep discontinuities entirely, all of the time, for exactly the reasons described above.

Frame lapping, as Vorbis performs internally during continuous playback, is necessary to eliminate that last epsilon of trouble.

Easiest Crosslap

The easiest way to perform crosslapping in Vorbis is to use the lapping functions with no other extra effort. These functions behave identically to their non-lapping counterparts, except that they also lap the transition, with results that are at least very good. Crosslapping will not introduce any samples into or remove any samples from the decoded audio; the only difference is that the transition is lapped. Lapping occurs from the current PCM position (either in the old stream, or at the position prior to calling a lapping seek) forward into the next half-short-block of audio data to be read from the new stream or position.

Ideally, vorbisfile internally reads an extra frame of audio from the old stream/position to perform lapping into the new stream/position. However, automagic crosslapping works properly even if the old stream/position is at EOF. In this case, the synthetic post-extrapolation generated by the encoder to pad out the last block with appropriate data (and avoid encoding a stairstep, which is inefficient) is used for crosslapping purposes. Although this is synthetic data, the result is still usually completely unnoticeable even in careful listening (and always preferable to a click or pop).

Vorbisfile will lap between streams of differing numbers of channels. Any extra channels from the old stream are ignored; playback of these channels simply ends. Extra channels in the new stream are lapped from silence. Vorbisfile will also lap between streams of differing sample rates. In this case, the sample rates are ignored (no implicit resampling is done to match playback). It is up to the application developer to decide if this behavior makes any sense in a given context; in practical use, these default behaviors perform sensibly.
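The channel-mismatch rules above can be illustrated with a small sketch. It is not vorbisfile code: lap_channels and its parameters are hypothetical, and the fade_in weights are assumed to be supplied by the caller (0 means all old audio, 1 means all new audio).

```c
#include <stddef.h>

/* Hypothetical helper, not vorbisfile code. old_pcm holds ch_old channels
 * and new_pcm holds ch_new channels, each n samples long. The result is
 * written into new_pcm in place. */
static void lap_channels(float **old_pcm, int ch_old,
                         float **new_pcm, int ch_new,
                         const float *fade_in, size_t n)
{
    int ch = 0;

    /* Channels present in both streams: blend old audio into new audio. */
    for (; ch < ch_old && ch < ch_new; ch++)
        for (size_t i = 0; i < n; i++)
            new_pcm[ch][i] = new_pcm[ch][i] * fade_in[i]
                           + old_pcm[ch][i] * (1.0f - fade_in[i]);

    /* Extra channels in the new stream: lap from silence. */
    for (; ch < ch_new; ch++)
        for (size_t i = 0; i < n; i++)
            new_pcm[ch][i] *= fade_in[i];

    /* Extra channels in the old stream are never read; they simply end. */
}
```

Going from a mono stream to a stereo one, for example, the first channel is blended with the old audio while the second channel fades in from silence.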

Best Crosslap

To achieve the best possible crosslapping results, avoid the case where synthetic extrapolation data is used for crosslapping. That is, design loops and samples such that a little bit of data is left over in sample A when seeking to sample B. Normally, the end of sample A and the beginning of B would overlap exactly; this allows crosslapping to perform exactly as it would within Vorbis when stitching audio frames together into continuous decoded audio.

The optimal amount of overlap is half a short-block, and this varies by compression mode. Each encoder will vary in exact block size selection; for Vorbis 1.0, for -q0 through -q10 and 44 kHz or greater, half a short-block is 64 samples.


copyright © 2007 Xiph.org

Ogg Vorbis
