source: downloads/libvorbis-1.2.0/doc/vorbisenc/overview.html @ 16

Last change on this file since 16 was 16, checked in by landauf, 17 years ago

added libvorbis

<html>

<head>
<title>libvorbisenc - API Overview</title>
<link rel=stylesheet href="style.css" type="text/css">
</head>

<body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff">
<table border=0 width=100%>
<tr>
<td><p class=tiny>libvorbisenc documentation</p></td>
<td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
</tr>
</table>

<h1>Libvorbisenc API Overview</h1>

<p>Libvorbisenc is an encoding convenience library intended to
encapsulate the elaborate setup that libvorbis requires for encoding.
Libvorbisenc gives easy access to all high-level adjustments an
application may require when encoding and also exposes some low-level
tuning parameters to allow applications to make detailed adjustments
to the encoding process. <p>

All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h".

<p><em>Note: libvorbis and libvorbisenc always
encode in a single pass. Thus, all possible encoding setups will work
properly with live input and produce streams that decode properly when
streamed.  See the subsection titled <a href="#BBR">"managed bitrate
modes"</a> for details on setting limits on bitrate usage when Vorbis
streams are used in a limited-bandwidth environment.</em>

<h2>workflow</h2>

<p>Libvorbisenc is used only during encoder setup; its function
is to automate initialization of a multitude of settings in a
<tt>vorbis_info</tt> structure which libvorbis then uses as a reference
during the encoding process.  Libvorbisenc plays no part in the
encoding process after setup.

<p>Encode setup using libvorbisenc consists of three steps:

<ol>
<li>high-level initialization of a <tt>vorbis_info</tt> structure by
calling one of <a
href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
with the basic input audio parameters (rate and channels) and the
basic desired encoded audio output parameters (VBR quality or ABR/CBR
bitrate)<p>

<li>optional adjustment of the basic setup defaults using <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p>

<li>calling <a
href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to
finalize the high-level setup into the detailed low-level reference
values needed by libvorbis to encode audio. The <tt>vorbis_info</tt>
structure is then ready to use for encoding by libvorbis.<p>

</ol>

These three steps can be collapsed into a single call by using <a
href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr</a> to set up a
quality-based VBR stream or <a
href="vorbis_encode_init.html">vorbis_encode_init</a> to set up a managed
bitrate (ABR or CBR) stream.<p>

<h2>adjustable encoding parameters</h2>

<h3>input audio parameters</h3>

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
        <td><b>parameter</b></td>
        <td><b>description</b></td>
</tr>
<tr valign=top>
<td>sampling rate</td>
<td>
The sampling rate (in samples per second) of the input audio.  Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT.  Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample.

</td>
</tr>
<tr valign=top>
<td>channels</td>
<td>

The number of channels encoded in each input sample.  By default,
stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such
that the stereo relationship between the samples is taken into account
when encoding.  Stereo coupling may be disabled by using <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>.

</td>
</tr>
</table>
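<p>Because one sample spans all channels, the number of samples in an
interleaved buffer is the number of stored values divided by the channel
count.  The following small sketch illustrates that arithmetic; the
function names are hypothetical and are not part of libvorbisenc.

```c
#include <stddef.h>

/* Number of samples in an interleaved buffer, where one sample holds
 * one value per channel. */
size_t sample_count(size_t interleaved_values, int channels)
{
    return interleaved_values / (size_t)channels;
}

/* Duration in seconds of that buffer at the given sampling rate. */
double duration_seconds(size_t interleaved_values, int channels, long rate)
{
    return (double)sample_count(interleaved_values, channels) / (double)rate;
}
```

For example, 88200 interleaved stereo values at 44100 samples per second
hold 44100 samples, or one second of audio.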

<h3>quality and VBR modes</h3>

Vorbis is natively a VBR codec; a user requests a given constant
<em>quality</em> and the encoder keeps the encoding quality constant
while allowing the bitrate to vary.  'Quality' modes (Variable BitRate)
will always produce the most consistent encoding results as well as
the highest quality for the number of bits used.

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
        <td><b>parameter</b></td>
        <td><b>description</b></td>
</tr>
<tr valign=top>
<td>quality</td>
<td>
A decimal float value requesting a desired quality.  Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency.  Quality settings 0.0 and above are intended to produce consistent results at all times.

</td>
</tr>
</table>
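<p>An application that accepts a quality value from a user may want to
bring it into the documented range before handing it to the encoder.  A
minimal sketch, assuming the -0.1 through +1.0 range given above; the
macro and function names are hypothetical.

```c
/* Documented quality range for libvorbisenc 1.1. */
#define QUALITY_MIN (-0.1f)
#define QUALITY_MAX (1.0f)

/* Clamp a requested quality into the documented range. */
float clamp_quality(float q)
{
    if (q < QUALITY_MIN) return QUALITY_MIN;
    if (q > QUALITY_MAX) return QUALITY_MAX;
    return q;
}

/* Nonzero when q lies below 0.0, the region the documentation describes
 * as favoring low bitrate over quality consistency. */
int is_ultra_low_quality(float q)
{
    return q < 0.0f;
}
```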

<a name="BBR"></a>
<h3>managed bitrate modes</h3>

Although the Vorbis codec is natively VBR, libvorbis includes
infrastructure for 'managing' the bitrate of streams by setting
minimum and maximum usage constraints, as well as functionality for
nudging a stream toward a desired average value.  These features
should <em>only</em> be used when there is a requirement to limit
bitrate in some way.  Although the difference is usually slight,
managed bitrate modes will always produce output inferior to VBR
(given equal bitrate usage). Setting overly or impossibly tight
bitrate management requirements can affect output quality dramatically
for the worse.<p>

Beginning in libvorbis 1.1, bitrate management is implemented using a
<em>bit-reservoir</em> algorithm. The encoder has a fixed-size
reservoir used as a 'savings account' in encoding.  When a frame is
smaller than the target rate, the unused bits go into the reservoir so
that they may be used by future frames.  When a frame is larger than
the target rate, it draws 'banked' bits out of the reservoir.  Encoding
is managed so that the reservoir never goes negative (when a maximum
bitrate is specified) or fills beyond a fixed limit (when a minimum
bitrate is specified).  An 'average bitrate' request is used as the
set-point in a long-range bitrate tracker which adjusts the encoder's
aggressiveness up or down depending on whether frames are coming
in larger or smaller than the requested average point.
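<p>The 'savings account' behavior can be modeled with a few lines of
arithmetic.  The sketch below is a toy model written for this overview,
not the libvorbis implementation: it uses a single per-frame target in
place of separate minimum and maximum rates, and it simply changes the
frame size where the real encoder would adjust its psychoacoustic model.
All names are hypothetical.

```c
/* Toy bit reservoir: 'fill' is the number of banked bits, which is kept
 * between 0 and 'capacity'. */
typedef struct {
    long capacity;  /* reservoir size in bits */
    long fill;      /* banked bits currently available */
} reservoir;

/* Account for one frame.  Returns the frame size, in bits, that the
 * constraints permit. */
long reservoir_account(reservoir *r, long target_bits, long frame_bits)
{
    /* Unused bits are banked; oversized frames draw banked bits out. */
    long fill = r->fill + (target_bits - frame_bits);

    if (fill < 0) {
        /* Would overdraw the reservoir: shrink the frame by the deficit. */
        frame_bits += fill;
        fill = 0;
    } else if (fill > r->capacity) {
        /* Would overflow the reservoir: grow the frame by the excess. */
        frame_bits += fill - r->capacity;
        fill = r->capacity;
    }

    r->fill = fill;
    return frame_bits;
}
```

With 100 banked bits and a 1000-bit target, a 1200-bit frame is cut to
1100 bits and the reservoir is left empty.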

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
        <td><b>parameter</b></td>
        <td><b>description</b></td>
</tr>
<tr valign=top>
<td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits
per second.  If the bitrate would otherwise rise such that oversized
frames would underflow the bit-reservoir by consuming banked bits,
bitrate management will force the encoder to use fewer bits per frame
by encoding with a more aggressive psychoacoustic model.<p> This
setting is a hard limit; the bitstream will never be allowed, under
any circumstances, to increase above the specified bitrate over the
averaging period set by the reservoir, although it may momentarily rise
above that rate if inspected at a granularity much finer than the
averaging period.  Normally, the encoder will conserve bits gracefully by
using more aggressive psychoacoustics to shrink a frame when forced
to.  However, if the encoder runs out of means of gracefully shrinking
a frame, it will simply take the smallest frame it can otherwise
generate and truncate it to the maximum allowed length.  Note that
this is not an error; although it will obviously adversely affect
audio quality, a Vorbis decoder will be able to decode a truncated
frame into audio.

</td>
</tr>

<tr valign=top>
<td>average bitrate</td>

<td>

The average desired bitrate of a stream, set
in bits per second.  Like the minimum and maximum bitrate, the average
bitrate is tracked via a reservoir; however, the averaging reservoir does
not impose a hard limit; it is used to nudge the bitrate toward the
desired average by slowly adjusting the psychoacoustic aggressiveness.
As such, the reservoir size does not affect the average bitrate
behavior.  Because this setting alone is not used to impose hard
bitrate limits, the bitrate of a stream produced using only the
<tt>average bitrate</tt> constraint will track the average over time
but not necessarily adhere strictly to that average for any given
period.  Should a strict localized average be required, <tt>average
bitrate</tt> should be used along with <tt>minimum bitrate</tt> and
<tt>maximum bitrate</tt>.
</td>

</tr>

<tr valign=top>
<td>minimum bitrate</td>
<td>
 The minimum allowed bitrate, set in bits per second.  If
the bitrate would otherwise fall such that undersized frames would
overflow the bit-reservoir with unused bits, bitrate management will
force the encoder to use more bits per frame by encoding with a less
aggressive psychoacoustic model.<p> This setting is a hard limit; the
bitstream will never be allowed, under any circumstances, to drop
below the specified bitrate over the averaging period set by the
reservoir, although it may momentarily fall below that rate if inspected
at a granularity much finer than the averaging period.  Normally,
the encoder will fill out undersized frames with additional useful
coding information by increasing the perceived quality of the stream.
If the encoder runs out of useful ways to consume more bits, it will
pad frames out with zeroes.
</td>
</tr>

<tr valign=top>
<td>reservoir size</td> <td> The size of the minimum/maximum bitrate
tracking reservoir, set in bits.  The reservoir is used as a 'bit
bank' to average out localized surges and dips in bitrate while
providing predictable, guaranteed buffering behavior for streams to be
used in situations with constrained transport bandwidth.  The default
setting is two seconds of average bitrate.<p>

When a single frame is larger than the maximum allowed overall
bitrate, the bits are 'borrowed' from the bitrate reservoir; if the
reservoir contains insufficient bits to cover the deficit, the encoder
must find some way to reduce the frame size. <p>

When a frame is under the minimum limit, the surplus bits are placed
into the reservoir, banking them for future use.  If the reservoir is
already full of banked bits, the encoder is forced to find some way to
make the frame larger.<p>

If the frame size is between the minimum and maximum rates (thus
implying the minimum and maximum allowed rates are different), the
reservoir gravitates toward a fill point configured by the
<tt>reservoir bias</tt> setting described next.  If the reservoir is
fuller than the fill point (a 'surplus of surplus'), the encoder will
consume a number of bits from the reservoir equal to the number of
bits by which the frame exceeds the minimum size.  If the reservoir is
emptier than the fill point (a 'surplus of deficit'), bits are returned
to the reservoir equaling the current frame's number of bits under the
maximum frame size.  The idea of the fill point is to buffer against
both underruns and overruns, by trying to hold the reservoir to a
middle course.
</td>
</tr>

<tr valign=top>
<td>reservoir bias</td>

<td>

Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate
management toward smoothing bitrate spikes (0.0) or bitrate drops
(1.0); the default setting is 0.1.<p>

Using settings toward 0.0 causes the bitrate manager to hoard bits in
the bit reservoir such that there is a large pool of banked surplus to
draw upon during short spikes in bitrate.  As a result, the encoder
will react less aggressively and less drastically to curtail frame size
during brief surges in bitrate.<p>

Using settings toward 1.0 causes the bitrate manager to empty the bit
reservoir such that there is a large buffer available to store surplus
bits during sudden drops in bitrate.  As a result, the encoder will
react less aggressively and less drastically to support minimum frame
sizes during drops in bitrate and will tend not to store any extra
bits in the reservoir for future bitrate spikes.<p>

</td>
</tr>

<tr valign=top>
<td>average track damping</td>
<td>

A decimal value, in seconds, that controls how quickly the average
bitrate tracker is allowed to slew from enforcing minimum frame sizes
to maximum frame sizes and vice versa.  The default value is 1.5
seconds.<p>

When the 'average bitrate' setting is in use, the average bitrate
tracker uses an unbounded reservoir to track overall bitrate-to-date
in the stream.  When the bitrate is too low, the tracker will try to
nudge it up, and when the bitrate is too high, nudge it down.
The damping value regulates the maximum strength of the nudge; it
describes, in seconds, how quickly the tracker may transition from an
extreme nudge in one direction to an extreme nudge in the other.<p>

</td>
</tr>

</table>
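<p>The defaults named in the table can be gathered into one structure to
see how they relate.  This is an illustrative sketch only: the type and
function names are hypothetical and do not correspond to libvorbisenc
declarations.  The linear mapping from bias to a preferred number of
banked bits is an assumption that follows the description above (0.0
hoards bits, 1.0 empties the reservoir); the encoder's actual internal
bookkeeping may differ.

```c
/* Hypothetical container for the managed bitrate parameters. */
typedef struct {
    long   max_bitrate;     /* bits per second, hard upper limit */
    long   avg_bitrate;     /* bits per second, tracker set-point */
    long   min_bitrate;     /* bits per second, hard lower limit */
    long   reservoir_bits;  /* reservoir size in bits */
    double reservoir_bias;  /* 0.0 through 1.0 */
    double avg_damping;     /* seconds */
} managed_settings;

/* Fill in the documented defaults for the given bitrates. */
managed_settings managed_defaults(long min_bps, long avg_bps, long max_bps)
{
    managed_settings s;

    s.max_bitrate    = max_bps;
    s.avg_bitrate    = avg_bps;
    s.min_bitrate    = min_bps;
    s.reservoir_bits = 2 * avg_bps;  /* two seconds of average bitrate */
    s.reservoir_bias = 0.1;
    s.avg_damping    = 1.5;
    return s;
}

/* Nonzero when the limits are ordered and the bias is within range. */
int managed_settings_valid(const managed_settings *s)
{
    return s->min_bitrate <= s->avg_bitrate
        && s->avg_bitrate <= s->max_bitrate
        && s->reservoir_bias >= 0.0
        && s->reservoir_bias <= 1.0;
}

/* Assumed linear model: bias 0.0 prefers a full bank, 1.0 an empty one. */
long preferred_banked_bits(const managed_settings *s)
{
    return (long)((1.0 - s->reservoir_bias) * (double)s->reservoir_bits);
}
```

For a 128000 bits-per-second average, the default reservoir holds 256000
bits.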

<h3>encoding model adjustments</h3>

The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides
a generalized interface for making encoding setup adjustments to the
basic high-level setup provided by <a
href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>.
In reality, these two calls use <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust
most of the parameters set by other calls.<p>

In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can
adjust the following additional parameters not described elsewhere:

<p>
<table border=1 color=black width=50% cellspacing=0 cellpadding=7>
<tr bgcolor=#cccccc>
        <td><b>parameter</b></td>
        <td><b>description</b></td>
</tr>
<tr valign=top>
<td>management mode</td> <td> Configures whether bitrate
management is in use.  Normally, this value is set implicitly
during encoding setup; however, the supported means of selecting a
quality mode by bitrate (that is, requesting a true VBR stream, but
doing so by asking for an approximate bitrate) is to use <a
href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
and then to explicitly turn off bitrate management by calling <a
href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>.
</td>
</tr>

<tr valign=top>
<td>coupling</td> <td> Stereo streams (and, in the future, surround
streams) are normally encoded on the assumption that the channels form a
stereo image and that lossy-stereo modelling is appropriate; this is called
'coupling'.  Stereo coupling may be explicitly enabled or disabled.
</td>
</tr>
<tr valign=top>
<td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode;
this may be used to conserve a few bits in high-rate audio that has
limited bandwidth, or in testing of the encoder's acoustic model.  The
encoder is generally already configured with ideal lowpasses (if any
at all) for given modes; use of this parameter is strongly discouraged
if the point is to try to 'improve' a given encoding mode for general
encoding.
</td>
</tr>

<tr valign=top>
<td>impulse coding aggressiveness</td> <td>By default, libvorbis
attempts to compromise between preventing wide bitrate swings and
high-resolution impulse coding (which is required for the crispest
possible attacks, but also requires a relatively large momentary
bitrate increase).  This parameter allows an application to tune the
compromise or eliminate it; a value of 0.0 indicates normal behavior
while a value of -15.0 requests maximum possible impulse
resolution.</td>
</tr>

</table>


<br><br>
<hr noshade>
<table border=0 width=100%>
<tr valign=top>
<td><p class=tiny>copyright &copy; 2004 Vorbis team</p></td>
<td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a><br><a href="mailto:team@vorbis.org">team@vorbis.org</a></p></td>
</tr><tr>
<td><p class=tiny>libvorbisenc documentation</p></td>
<td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
</tr>
</table>

</body>

</html>