1 | <html> |
---|
2 | |
---|
3 | <head> |
---|
4 | <title>libvorbisenc - API Overview</title> |
---|
5 | <link rel=stylesheet href="style.css" type="text/css"> |
---|
6 | </head> |
---|
7 | |
---|
8 | <body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff"> |
---|
9 | <table border=0 width=100%> |
---|
10 | <tr> |
---|
11 | <td><p class=tiny>libvorbisenc documentation</p></td> |
---|
12 | <td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td> |
---|
13 | </tr> |
---|
14 | </table> |
---|
15 | |
---|
16 | <h1>Libvorbisenc API Overview</h1> |
---|
17 | |
---|
18 | <p>Libvorbisenc is an encoding convenience library intended to |
---|
19 | encapsulate the elaborate setup that libvorbis requires for encoding. |
---|
20 | Libvorbisenc gives easy access to all high-level adjustments an |
---|
21 | application may require when encoding and also exposes some low-level |
---|
22 | tuning parameters to allow applications to make detailed adjustments |
---|
23 | to the encoding process. <p> |
---|
24 | |
---|
25 | All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h". |
---|
26 | |
---|
27 | <em>Note: libvorbis and libvorbisenc always |
---|
28 | encode in a single pass. Thus, all possible encoding setups will work |
---|
29 | properly with live input and produce streams that decode properly when |
---|
30 | streamed. See the subsection titled <a href="#BBR">"managed bitrate |
---|
31 | modes"</a> for details on setting limits on bitrate usage when Vorbis |
---|
32 | streams are used in a limited-bandwidth environment.</em> |
---|
33 | |
---|
34 | <h2>workflow</h2> |
---|
35 | |
---|
36 | <p>Libvorbisenc is used only during encoder setup; its function |
---|
37 | is to automate initialization of a multitude of settings in a |
---|
38 | <tt>vorbis_info</tt> structure which libvorbis then uses as a reference |
---|
39 | during the encoding process. Libvorbisenc plays no part in the |
---|
40 | encoding process after setup. |
---|
41 | |
---|
42 | <p>Encode setup using libvorbisenc consists of three steps: |
---|
43 | |
---|
44 | <ol> |
---|
45 | <li>high-level initialization of a <tt>vorbis_info</tt> structure by |
---|
46 | calling one of <a |
---|
47 | href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a |
---|
48 | href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a> |
---|
49 | with the basic input audio parameters (rate and channels) and the |
---|
50 | basic desired encoded audio output parameters (VBR quality or ABR/CBR |
---|
51 | bitrate)<p> |
---|
52 | |
---|
53 | <li>optional adjustment of the basic setup defaults using <a |
---|
54 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p> |
---|
55 | |
---|
56 | <li>calling <a |
---|
57 | href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to |
---|
58 | finalize the high-level setup into the detailed low-level reference |
---|
59 | values needed by libvorbis to encode audio. The <tt>vorbis_info</tt> |
---|
60 | structure is then ready to use for encoding by libvorbis.<p> |
---|
61 | |
---|
62 | </ol> |
---|
63 | |
---|
64 | These three steps can be collapsed into a single call by using <a |
---|
65 | href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr()</a> to set up a |
---|
66 | quality-based VBR stream or <a |
---|
67 | href="vorbis_encode_init.html">vorbis_encode_init()</a> to set up a managed |
---|
68 | bitrate (ABR or CBR) stream.<p> |
---|
69 | |
---|
70 | <h2>adjustable encoding parameters</h2> |
---|
71 | |
---|
72 | <h3>input audio parameters</h3> |
---|
73 | |
---|
74 | <p> |
---|
75 | <table border=1 color=black width=50% cellspacing=0 cellpadding=7> |
---|
76 | <tr bgcolor=#cccccc> |
---|
77 | <td><b>parameter</b></td> |
---|
78 | <td><b>description</b></td> |
---|
79 | </tr> |
---|
80 | <tr valign=top> |
---|
81 | <td>sampling rate</td> |
---|
82 | <td> |
---|
83 | The sampling rate (in samples per second) of the input audio. Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT. Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample. |
---|
84 | |
---|
85 | </td> |
---|
86 | </tr> |
---|
87 | <tr valign=top> |
---|
88 | <td>channels</td> |
---|
89 | <td> |
---|
90 | |
---|
91 | The number of channels encoded in each input sample. By default, |
---|
92 | stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such |
---|
93 | that the stereo relationship between the samples is taken into account |
---|
94 | when encoding. Stereo coupling may be disabled by using <a |
---|
95 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a |
---|
96 | href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>. |
---|
97 | |
---|
98 | </td> |
---|
99 | </tr> |
---|
100 | </table> |
---|
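<p>The definition of a sample above is easy to get wrong when sizing buffers, so here is a minimal sketch in C. The helper names <tt>pcm_sample_count</tt> and <tt>pcm_duration_seconds</tt> are hypothetical and are not part of libvorbisenc; the sketch assumes interleaved PCM input in which each sample holds one value per channel.</p>

```c
#include <stddef.h>

/* Hypothetical helper, not part of libvorbisenc.
 * Returns the number of samples in an interleaved PCM buffer, where one
 * sample is one value per channel at a single instant. Returns 0 when the
 * value count is not a whole multiple of the channel count. */
size_t pcm_sample_count(size_t total_values, unsigned channels)
{
    if (channels == 0 || total_values % channels != 0)
        return 0;
    return total_values / channels;
}

/* Hypothetical helper: duration in seconds of `samples` samples at a
 * sampling rate of `rate` samples per second. */
double pcm_duration_seconds(size_t samples, long rate)
{
    return rate > 0 ? (double)samples / (double)rate : 0.0;
}
```

<p>For example, 88200 interleaved stereo values are 44100 samples, which is one second of audio at a 44100 sampling rate, the same duration as 44100 mono values.</p>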
101 | |
---|
102 | <h3>quality and VBR modes</h3> |
---|
103 | |
---|
104 | Vorbis is natively a VBR codec; a user requests a given constant |
---|
105 | <em>quality</em> and the encoder keeps the encoding quality constant |
---|
106 | while allowing the bitrate to vary. 'Quality' modes (Variable BitRate) |
---|
107 | will always produce the most consistent encoding results as well as |
---|
108 | the highest quality for the number of bits used. |
---|
109 | |
---|
110 | <p> |
---|
111 | <table border=1 color=black width=50% cellspacing=0 cellpadding=7> |
---|
112 | <tr bgcolor=#cccccc> |
---|
113 | <td><b>parameter</b></td> |
---|
114 | <td><b>description</b></td> |
---|
115 | </tr> |
---|
116 | <tr valign=top> |
---|
117 | <td>quality</td> |
---|
118 | <td> |
---|
119 | A decimal float value requesting a desired quality. Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency. Quality settings 0.0 and above are intended to produce consistent results at all times. |
---|
120 | |
---|
121 | </td> |
---|
122 | </tr> |
---|
123 | </table> |
---|
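<p>An application that accepts a quality value from a user may want to keep it inside the documented range before handing it to the encoder setup. The following is a minimal sketch; both helper names are hypothetical, and the "-1 to 10" user scale is an assumed front-end convention (the float quality multiplied by ten), not something defined by this document.</p>

```c
/* Hypothetical helper, not part of libvorbisenc.
 * Clamp a quality request to the range documented for libvorbisenc 1.1:
 * -0.1 (lowest quality) through 1.0 (highest quality). */
float clamp_quality(float q)
{
    if (q < -0.1f)
        return -0.1f;
    if (q > 1.0f)
        return 1.0f;
    return q;
}

/* Hypothetical helper. ASSUMPTION: the caller uses a -1..10 "user" scale
 * that is simply the float quality times ten. */
float quality_from_user_scale(float level)
{
    return clamp_quality(level / 10.0f);
}
```

<p>With this sketch, a user level of 5 maps to quality 0.5, and out-of-range requests are pulled back to the nearest documented limit.</p>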
124 | |
---|
125 | <a name="BBR"></a> |
---|
126 | <h3>managed bitrate modes</h3> |
---|
127 | |
---|
128 | Although the Vorbis codec is natively VBR, libvorbis includes |
---|
129 | infrastructure for 'managing' the bitrate of streams by setting |
---|
130 | minimum and maximum usage constraints, as well as functionality for |
---|
131 | nudging a stream toward a desired average value. These features |
---|
132 | should <em>only</em> be used when there is a requirement to limit |
---|
133 | bitrate in some way. Although the difference is usually slight, |
---|
134 | managed bitrate modes will always produce output inferior to VBR |
---|
135 | (given equal bitrate usage). Setting overly or impossibly tight |
---|
136 | bitrate management requirements can affect output quality dramatically |
---|
137 | for the worse.<p> |
---|
138 | |
---|
139 | Beginning in libvorbis 1.1, bitrate management is implemented using a |
---|
140 | <em>bit-reservoir</em> algorithm. The encoder has a fixed-size |
---|
141 | reservoir used as a 'savings account' in encoding. When a frame is |
---|
142 | smaller than the target rate, the unused bits go into the reservoir so |
---|
143 | that they may be used by future frames. When a frame is larger than |
---|
144 | target bitrate, it draws 'banked' bits out of the reservoir. Encoding |
---|
145 | is managed so that the reservoir never goes negative (when a maximum |
---|
146 | bitrate is specified) or fills beyond a fixed limit (when a minimum |
---|
147 | bitrate is specified). An 'average bitrate' request is used as the |
---|
148 | set-point in a long-range bitrate tracker which adjusts the encoder's |
---|
149 | aggressiveness up or down depending on whether frames are coming |
---|
150 | in larger or smaller than the requested average. |
---|
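<p>The 'savings account' behavior described above can be sketched in a few lines of C. This is a simplified model, not the libvorbis implementation: it assumes the minimum and maximum rates coincide in a single per-frame target, and the type <tt>bit_reservoir</tt> and function <tt>reservoir_account</tt> are hypothetical names introduced only for illustration.</p>

```c
/* Hypothetical model of a bit reservoir; not the libvorbis data structure. */
typedef struct {
    long size;  /* capacity of the reservoir, in bits */
    long fill;  /* bits currently banked, always within 0..size */
} bit_reservoir;

/* Account for one frame.
 * target_bits: per-frame budget at the managed rate (ASSUMPTION: min and
 *              max rate are the same, so there is a single target).
 * wanted_bits: what the encoder would like to spend on this frame.
 * Returns the number of bits the frame is allowed to use. */
long reservoir_account(bit_reservoir *r, long target_bits, long wanted_bits)
{
    long used = wanted_bits;

    if (used > target_bits) {
        /* Oversized frame: draw banked bits, but never go negative. */
        long deficit = used - target_bits;
        if (deficit > r->fill) {
            deficit = r->fill;             /* the frame must shrink */
            used = target_bits + deficit;
        }
        r->fill -= deficit;
    } else {
        /* Undersized frame: bank the unused bits, but never overflow. */
        long surplus = target_bits - used;
        long room = r->size - r->fill;
        if (surplus > room) {
            surplus = room;                /* the frame must grow */
            used = target_bits - surplus;
        }
        r->fill += surplus;
    }
    return used;
}
```

<p>In this model, a frame that wants more bits than the target plus the banked amount is cut back to what the reservoir can cover, and a frame that would overfill the reservoir is enlarged instead. The long-range average tracker is deliberately left out of the sketch.</p>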
151 | |
---|
152 | <p> |
---|
153 | <table border=1 color=black width=50% cellspacing=0 cellpadding=7> |
---|
154 | <tr bgcolor=#cccccc> |
---|
155 | <td><b>parameter</b></td> |
---|
156 | <td><b>description</b></td> |
---|
157 | </tr> |
---|
158 | <tr valign=top> |
---|
159 | <td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits |
---|
160 | per second. If the bitrate would otherwise rise such that oversized |
---|
161 | frames would underflow the bit-reservoir by consuming banked bits, |
---|
162 | bitrate management will force the encoder to use fewer bits per frame |
---|
163 | by encoding with a more aggressive psychoacoustic model.<p> This |
---|
164 | setting is a hard limit; the bitstream will never be allowed, under |
---|
165 | any circumstances, to increase above the specified bitrate over the |
---|
166 | average period set by the reservoir; it may momentarily rise over if |
---|
167 | inspected on a granularity much finer than the average period across |
---|
168 | the reservoir. Normally, the encoder will conserve bits gracefully by |
---|
169 | using more aggressive psychoacoustics to shrink a frame when forced |
---|
170 | to. However, if the encoder runs out of means of gracefully shrinking |
---|
171 | a frame, it will simply take the smallest frame it can otherwise |
---|
172 | generate and truncate it to the maximum allowed length. Note that |
---|
173 | this is not an error and although it will obviously adversely affect |
---|
174 | audio quality, a Vorbis decoder will be able to decode a truncated |
---|
175 | frame into audio. |
---|
176 | |
---|
177 | </td> |
---|
178 | </tr> |
---|
179 | |
---|
180 | <tr valign=top> |
---|
181 | <td>average bitrate</td> |
---|
182 | |
---|
183 | <td> |
---|
184 | |
---|
185 | The average desired bitrate of a stream, set |
---|
186 | in bits per second. Like minimum and maximum bitrate, average bitrate |
---|
187 | is tracked via a reservoir; however, the averaging reservoir does not |
---|
188 | impose a hard limit; it is used to nudge the bitrate toward the |
---|
189 | desired average by slowly adjusting the psychoacoustic aggressiveness. |
---|
190 | As such, the reservoir size does not affect the average bitrate |
---|
191 | behavior. Because this setting alone is not used to impose hard |
---|
192 | bitrate limits, the bitrate of a stream produced using only the |
---|
193 | <tt>average bitrate</tt> constraint will track the average over time |
---|
194 | but not necessarily adhere strictly to that average for any given |
---|
195 | period. Should a strict localized average be required, <tt>average |
---|
196 | bitrate</tt> should be used along with <tt>minimum bitrate</tt> and |
---|
197 | <tt>maximum bitrate</tt>. |
---|
198 | </td> |
---|
199 | |
---|
200 | </tr> |
---|
201 | |
---|
202 | <tr valign=top> |
---|
203 | <td>minimum bitrate</td> |
---|
204 | <td> |
---|
205 | The minimum allowed bitrate, set in bits per second. If |
---|
206 | the bitrate would otherwise fall such that undersized frames would |
---|
207 | overflow the bit-reservoir with unused bits, bitrate management will |
---|
208 | force the encoder to use more bits per frame by encoding with a less |
---|
209 | aggressive psychoacoustic model.<p> This setting is a hard limit; the |
---|
210 | bitstream will never be allowed, under any circumstances, to drop |
---|
211 | below the specified bitrate over the average period set by the |
---|
212 | reservoir; it may momentarily fall under if inspected on a granularity |
---|
213 | much finer than the average period across the reservoir. Normally, |
---|
214 | the encoder will fill out undersized frames with additional useful |
---|
215 | coding information by increasing the perceived quality of the stream. |
---|
216 | If the encoder runs out of useful ways to consume more bits, it will |
---|
217 | pad frames out with zeroes. |
---|
218 | </td> |
---|
219 | </tr> |
---|
220 | |
---|
221 | <tr valign=top> |
---|
222 | <td>reservoir size</td> <td> The size of the minimum/maximum bitrate |
---|
223 | tracking reservoir, set in bits. The reservoir is used as a 'bit |
---|
224 | bank' to average out localized surges and dips in bitrate while |
---|
225 | providing predictable, guaranteed buffering behavior for streams to be |
---|
226 | used in situations with constrained transport bandwidth. The default |
---|
227 | setting is two seconds of average bitrate.<p> |
---|
228 | |
---|
229 | When a single frame is larger than the maximum allowed overall |
---|
230 | bitrate, the extra bits are 'borrowed' from the bitrate reservoir; if the |
---|
231 | reservoir contains insufficient bits to cover the deficit, the encoder |
---|
232 | must find some way to reduce the frame size. <p> |
---|
233 | |
---|
234 | When a frame is under the minimum limit, the surplus bits are placed |
---|
235 | into the reservoir, banking them for future use. If the reservoir is |
---|
236 | already full of banked bits, the encoder is forced to find some way to |
---|
237 | make the frame larger.<p> |
---|
238 | |
---|
239 | If the frame size is between the minimum and maximum rates (thus |
---|
240 | implying the minimum and maximum allowed rates are different), the |
---|
241 | reservoir gravitates toward a fill point configured by the |
---|
242 | <tt>reservoir bias</tt> setting described next. If the reservoir is |
---|
243 | fuller than the fill point (a 'surplus of surplus'), the encoder will |
---|
244 | consume a number of bits from the reservoir equal to the number of the |
---|
245 | bits by which the frame exceeds minimum size. If the reservoir is |
---|
246 | emptier than the fill point (a 'surplus of deficit'), bits are returned |
---|
247 | to the reservoir equaling the current frame's number of bits under the |
---|
248 | maximum frame size. The idea of the fill point is to buffer against |
---|
249 | both underruns and overruns, by trying to hold the reservoir to a |
---|
250 | middle course. |
---|
251 | </td> |
---|
252 | </tr> |
---|
253 | |
---|
254 | <tr valign=top> |
---|
255 | <td>reservoir bias</td> |
---|
256 | |
---|
257 | <td> |
---|
258 | |
---|
259 | Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate |
---|
260 | management toward smoothing bitrate spikes (0.0) or bitrate dips |
---|
261 | (1.0); the default setting is 0.1.<p> |
---|
262 | |
---|
263 | Using settings toward 0.0 causes the bitrate manager to hoard bits in |
---|
264 | the bit reservoir such that there is a large pool of banked surplus to |
---|
265 | draw upon during short spikes in bitrate. As a result, the encoder |
---|
266 | will react less aggressively and less drastically to curtail frame size |
---|
267 | during brief surges in bitrate.<p> |
---|
268 | |
---|
269 | Using settings toward 1.0 causes the bitrate manager to empty the bit |
---|
270 | reservoir such that there is a large buffer available to store surplus |
---|
271 | bits during sudden drops in bitrate. As a result, the encoder will |
---|
272 | react less aggressively and less drastically to support minimum frame |
---|
273 | sizes during drops in bitrate and will tend not to store any extra |
---|
274 | bits in the reservoir for future bitrate spikes.<p> |
---|
275 | |
---|
276 | </td> |
---|
277 | </tr> |
---|
278 | |
---|
279 | <tr valign=top> |
---|
280 | <td>average track damping</td> |
---|
281 | <td> |
---|
282 | |
---|
283 | A decimal value, in seconds, that controls how quickly the average |
---|
284 | bitrate tracker is allowed to slew from enforcing minimum frame sizes |
---|
285 | to maximum frame sizes and vice versa. The default value is 1.5 |
---|
286 | seconds.<p> |
---|
287 | |
---|
288 | When the 'average bitrate' setting is in use, the average bitrate |
---|
289 | tracker uses an unbounded reservoir to track overall bitrate-to-date |
---|
290 | in the stream. When bitrates are too low, the tracker will try to |
---|
291 | nudge bitrates up and when the bitrate is too high, nudge it down. |
---|
292 | The damping value regulates the maximum strength of the nudge; it |
---|
293 | describes, in seconds, how quickly the tracker may transition from an |
---|
294 | extreme nudge in one direction to an extreme nudge in the other.<p> |
---|
295 | |
---|
296 | </td> |
---|
297 | </tr> |
---|
298 | |
---|
299 | </table> |
---|
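<p>The relationship between the managed bitrate parameters above can be checked before any encoder setup is attempted. The sketch below is illustrative only: all three helper names are hypothetical, the ordering check (minimum no greater than average no greater than maximum) is an assumed sanity rule rather than documented library behavior, and treating a non-positive value as "unset" is likewise an assumption of this sketch.</p>

```c
/* Hypothetical helper, not part of libvorbisenc.
 * Default reservoir size as described above: two seconds of average
 * bitrate, expressed in bits. */
long default_reservoir_bits(long average_bps)
{
    return average_bps > 0 ? 2 * average_bps : 0;
}

/* Hypothetical helper: how many seconds of stream a reservoir of `bits`
 * spans at `bps` bits per second. */
double reservoir_seconds(long bits, long bps)
{
    return bps > 0 ? (double)bits / (double)bps : 0.0;
}

/* Hypothetical sanity check for a managed-bitrate request.
 * ASSUMPTION: a value <= 0 means "unset"; at least one value must be set,
 * and every pair of set values must satisfy min <= average <= max.
 * Returns 1 when the request looks consistent, 0 otherwise. */
int managed_request_is_sane(long max_bps, long avg_bps, long min_bps)
{
    if (max_bps <= 0 && avg_bps <= 0 && min_bps <= 0)
        return 0;  /* nothing was requested */
    if (min_bps > 0 && avg_bps > 0 && min_bps > avg_bps)
        return 0;
    if (avg_bps > 0 && max_bps > 0 && avg_bps > max_bps)
        return 0;
    if (min_bps > 0 && max_bps > 0 && min_bps > max_bps)
        return 0;
    return 1;
}
```

<p>For a 128000 bits-per-second average, the default reservoir in this sketch is 256000 bits. A request whose maximum is lower than its average is rejected early, which avoids the 'impossibly tight' constraints that the text warns can degrade quality dramatically.</p>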
300 | |
---|
301 | <h3>encoding model adjustments</h3> |
---|
302 | |
---|
303 | The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides |
---|
304 | a generalized interface for making encoding setup adjustments to the |
---|
305 | basic high-level setup provided by <a |
---|
306 | href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a |
---|
307 | href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>. |
---|
308 | In reality, these two calls use <a |
---|
309 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a |
---|
310 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust |
---|
311 | most of the parameters set by other calls.<p> |
---|
312 | |
---|
313 | In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can |
---|
314 | adjust the following additional parameters not described elsewhere: |
---|
315 | |
---|
316 | <p> |
---|
317 | <table border=1 color=black width=50% cellspacing=0 cellpadding=7> |
---|
318 | <tr bgcolor=#cccccc> |
---|
319 | <td><b>parameter</b></td> |
---|
320 | <td><b>description</b></td> |
---|
321 | </tr> |
---|
322 | <tr valign=top> |
---|
323 | <td>management mode</td> <td> Configures whether bitrate |
---|
324 | management is in use. Normally, this value is set implicitly |
---|
325 | during encoding setup; however, the supported means of selecting a |
---|
326 | quality mode by bitrate (that is, requesting a true VBR stream, but |
---|
327 | doing so by asking for an approximate bitrate) is to use <a |
---|
328 | href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a> |
---|
329 | and then to explicitly turn off bitrate management by calling <a |
---|
330 | href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a |
---|
331 | href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>. |
---|
332 | </td> |
---|
333 | </tr> |
---|
334 | |
---|
335 | <tr valign=top> |
---|
336 | <td>coupling</td> <td> Stereo streams (and in the future, surround |
---|
337 | streams) are normally encoded assuming the channels form a stereo |
---|
338 | image and that lossy-stereo modelling is appropriate; this is called |
---|
339 | 'coupling'. Stereo coupling may be explicitly enabled or disabled. |
---|
340 | </td> |
---|
341 | </tr> |
---|
342 | <tr valign=top> |
---|
343 | <td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode; |
---|
344 | this may be used to conserve a few bits in high-rate audio that has |
---|
345 | limited bandwidth, or in testing of the encoder's acoustic model. The |
---|
346 | encoder is generally already configured with ideal lowpasses (if any |
---|
347 | at all) for given modes; use of this parameter is strongly discouraged |
---|
348 | if the point is to try to 'improve' a given encoding mode for general |
---|
349 | encoding. |
---|
350 | </td> |
---|
351 | </tr> |
---|
352 | |
---|
353 | <tr valign=top> |
---|
354 | <td>impulse coding aggressiveness</td> <td>By default, libvorbis |
---|
355 | attempts to compromise between preventing wide bitrate swings and |
---|
356 | high-resolution impulse coding (which is required for the crispest |
---|
357 | possible attacks, but also requires a relatively large momentary |
---|
358 | bitrate increase). This parameter allows an application to tune the |
---|
359 | compromise or eliminate it; a value of 0.0 indicates normal behavior |
---|
360 | while a value of -15.0 requests maximum possible impulse |
---|
361 | resolution.</td> |
---|
362 | </tr> |
---|
363 | |
---|
364 | </table> |
---|
365 | |
---|
366 | |
---|
367 | <br><br> |
---|
368 | <hr noshade> |
---|
369 | <table border=0 width=100%> |
---|
370 | <tr valign=top> |
---|
371 | <td><p class=tiny>copyright © 2004 Vorbis team</p></td> |
---|
372 | <td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a><br><a href="mailto:team@vorbis.org">team@vorbis.org</a></p></td> |
---|
373 | </tr><tr> |
---|
374 | <td><p class=tiny>libvorbisenc documentation</p></td> |
---|
375 | <td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td> |
---|
376 | </tr> |
---|
377 | </table> |
---|
378 | |
---|
379 | </body> |
---|
380 | |
---|
381 | </html> |
---|
382 | |
---|