1 | |
---|
2 | |
---|
3 | |
---|
4 | |
---|
5 | |
---|
6 | |
---|
7 | Network Working Group S. Pfeiffer |
---|
8 | Request for Comments: 3533 CSIRO |
---|
9 | Category: Informational May 2003 |
---|
10 | |
---|
11 | |
---|
12 | The Ogg Encapsulation Format Version 0 |
---|
13 | |
---|
14 | Status of this Memo |
---|
15 | |
---|
16 | This memo provides information for the Internet community. It does |
---|
17 | not specify an Internet standard of any kind. Distribution of this |
---|
18 | memo is unlimited. |
---|
19 | |
---|
20 | Copyright Notice |
---|
21 | |
---|
22 | Copyright (C) The Internet Society (2003). All Rights Reserved. |
---|
23 | |
---|
24 | Abstract |
---|
25 | |
---|
26 | This document describes the Ogg bitstream format version 0, which is |
---|
27 | a general, freely-available encapsulation format for media streams. |
---|
28 | It is able to encapsulate any kind and number of video and audio |
---|
29 | encoding formats as well as other data streams in a single bitstream. |
---|
30 | |
---|
31 | Terminology |
---|
32 | |
---|
33 | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", |
---|
34 | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this |
---|
35 | document are to be interpreted as described in BCP 14, RFC 2119 [2]. |
---|
36 | |
---|
37 | Table of Contents |
---|
38 | |
---|
39 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 2 |
---|
40 | 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 2 |
---|
41 | 3. Requirements for a generic encapsulation format . . . . . . . 3 |
---|
42 | 4. The Ogg bitstream format . . . . . . . . . . . . . . . . . . . 3 |
---|
43 | 5. The encapsulation process . . . . . . . . . . . . . . . . . . 6 |
---|
44 | 6. The Ogg page format . . . . . . . . . . . . . . . . . . . . . 9 |
---|
45 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . 11 |
---|
46 | 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 12 |
---|
47 | A. Glossary of terms and abbreviations . . . . . . . . . . . . . 13 |
---|
48 | B. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 14 |
---|
49 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . 14 |
---|
50 | Full Copyright Statement . . . . . . . . . . . . . . . . . . . 15 |
---|
51 | |
---|
52 | |
---|
53 | |
---|
54 | |
---|
55 | |
---|
56 | |
---|
57 | |
---|
58 | Pfeiffer Informational [Page 1] |
---|
59 | |
---|
60 | RFC 3533 OGG May 2003 |
---|
61 | |
---|
62 | |
---|
63 | 1. Introduction |
---|
64 | |
---|
65 | The Ogg bitstream format has been developed as a part of a larger |
---|
66 | project aimed at creating a set of components for the coding and |
---|
67 | decoding of multimedia content (codecs) which are to be freely |
---|
68 | available and freely re-implementable, both in software and in |
---|
69 | hardware for the computing community at large, including the Internet |
---|
70 | community. It is the intention of the Ogg developers represented by |
---|
71 | Xiph.Org that it be usable without intellectual property concerns. |
---|
72 | |
---|
73 | This document describes the Ogg bitstream format and how to use it to |
---|
74 | encapsulate one or several media bitstreams created by one or several |
---|
75 | encoders. The Ogg transport bitstream is designed to provide |
---|
76 | framing, error protection and seeking structure for higher-level |
---|
77 | codec streams that consist of raw, unencapsulated data packets, such |
---|
78 | as the Vorbis audio codec or the upcoming Tarkin and Theora video |
---|
79 | codecs. It is capable of interleaving different binary media and |
---|
80 | other time-continuous data streams that are prepared by an encoder as |
---|
81 | a sequence of data packets. Ogg provides enough information to |
---|
82 | properly separate data back into such encoder created data packets at |
---|
83 | the original packet boundaries without relying on decoding to find |
---|
84 | packet boundaries. |
---|
85 | |
---|
86 | Please note that the MIME type application/ogg has been registered |
---|
87 | with the IANA [1]. |
---|
88 | |
---|
89 | 2. Definitions |
---|
90 | |
---|
91 | For describing the Ogg encapsulation process, a set of terms will be |
---|
92 | used whose meaning needs to be well understood. Therefore, some of |
---|
93 | the most fundamental terms are defined now before we start with the |
---|
94 | description of the requirements for a generic media stream |
---|
95 | encapsulation format, the process of encapsulation, and the concrete |
---|
96 | format of the Ogg bitstream. See the Appendix for a more complete |
---|
97 | glossary. |
---|
98 | |
---|
99 | The result of an Ogg encapsulation is called the "Physical (Ogg) |
---|
100 | Bitstream". It encapsulates one or several encoder-created |
---|
101 | bitstreams, which are called "Logical Bitstreams". A logical |
---|
102 | bitstream, provided to the Ogg encapsulation process, has a |
---|
103 | structure, i.e., it is split up into a sequence of so-called |
---|
104 | "Packets". The packets are created by the encoder of that logical |
---|
105 | bitstream and represent meaningful entities for that encoder only |
---|
106 | (e.g., an uncompressed stream may use video frames as packets). They |
---|
107 | do not contain boundary information - strung together they appear to |
---|
108 | be streams of random bytes with no landmarks. |
---|
109 | |
---|
119 | Please note that the term "packet" is not used in this document to |
---|
120 | signify entities for transport over a network. |
---|
121 | |
---|
122 | 3. Requirements for a generic encapsulation format |
---|
123 | |
---|
124 | The design idea behind Ogg was to provide a generic, linear media |
---|
125 | transport format to enable both file-based storage and stream-based |
---|
126 | transmission of one or several interleaved media streams independent |
---|
127 | of the encoding format of the media data. Such an encapsulation |
---|
128 | format needs to provide: |
---|
129 | |
---|
130 | o framing for logical bitstreams. |
---|
131 | |
---|
132 | o interleaving of different logical bitstreams. |
---|
133 | |
---|
134 | o detection of corruption. |
---|
135 | |
---|
136 | o recapture after a parsing error. |
---|
137 | |
---|
138 | o position landmarks for direct random access of arbitrary positions |
---|
139 | in the bitstream. |
---|
140 | |
---|
141 | o streaming capability (i.e., no seeking is needed to build a 100% |
---|
142 | complete bitstream). |
---|
143 | |
---|
144 | o small overhead (i.e., use no more than approximately 1-2% of |
---|
145 | bitstream bandwidth for packet boundary marking, high-level |
---|
146 | framing, sync and seeking). |
---|
147 | |
---|
148 | o simplicity to enable fast parsing. |
---|
149 | |
---|
150 | o simple concatenation mechanism of several physical bitstreams. |
---|
151 | |
---|
152 | All of these requirements have been taken into account in the design |
---|
153 | of Ogg. Ogg supports framing and interleaving of logical |
---|
154 | bitstreams, seeking landmarks, detection of corruption, and stream |
---|
155 | resynchronisation after a parsing error with no more than |
---|
156 | approximately 1-2% overhead. It is a generic framework to perform |
---|
157 | encapsulation of time-continuous bitstreams. It does not know any |
---|
158 | specifics about the codec data that it encapsulates and is thus |
---|
159 | independent of any media codec. |
---|
160 | |
---|
161 | 4. The Ogg bitstream format |
---|
162 | |
---|
163 | A physical Ogg bitstream consists of multiple logical bitstreams |
---|
164 | interleaved in so-called "Pages". Whole pages are taken in order |
---|
165 | from multiple logical bitstreams multiplexed at the page level. The |
---|
166 | logical bitstreams are identified by a unique serial number in the |
---|
175 | header of each page of the physical bitstream. This unique serial |
---|
176 | number is created randomly and does not have any connection to the |
---|
177 | content or encoder of the logical bitstream it represents. Pages of |
---|
178 | all logical bitstreams are concurrently interleaved, but they need |
---|
179 | not be in a regular order - they are only required to be consecutive |
---|
180 | within the logical bitstream. Ogg demultiplexing reconstructs the |
---|
181 | original logical bitstreams from the physical bitstream by taking the |
---|
182 | pages in order from the physical bitstream and redirecting them into |
---|
183 | the appropriate logical decoding entity. |
---|
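The demultiplexing step described above can be illustrated with a minimal Python sketch. It assumes that pages have already been parsed into (serial_number, payload) pairs; the function name `demultiplex` and that tuple layout are hypothetical and are not defined by this document.

```python
from collections import defaultdict


def demultiplex(pages):
    """Group pages by bitstream serial number, preserving physical order.

    `pages` is an iterable of (serial_number, payload) tuples; this
    representation is hypothetical and chosen only for the example.
    """
    streams = defaultdict(list)
    for serial_number, payload in pages:
        streams[serial_number].append(payload)
    return dict(streams)


# Pages of two logical bitstreams, interleaved in no regular order.
physical = [(0x1A2B, b"a0"), (0x3C4D, b"b0"), (0x1A2B, b"a1"),
            (0x1A2B, b"a2"), (0x3C4D, b"b1")]
logical = demultiplex(physical)
```

Because pages are taken strictly in physical order, each per-stream list keeps its pages consecutive within the logical bitstream, as required.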
184 | |
---|
185 | Each Ogg page contains only one type of data as it belongs to one |
---|
186 | logical bitstream only. Pages are of variable size and have a page |
---|
187 | header containing encapsulation and error recovery information. Each |
---|
188 | logical bitstream in a physical Ogg bitstream starts with a special |
---|
189 | start page (bos=beginning of stream) and ends with a special page |
---|
190 | (eos=end of stream). |
---|
191 | |
---|
192 | The bos page contains information to uniquely identify the codec type |
---|
193 | and MAY contain information to set up the decoding process. The bos |
---|
194 | page SHOULD also contain information about the encoded media - for |
---|
195 | example, for audio, it should contain the sample rate and number of |
---|
196 | channels. By convention, the first bytes of the bos page contain |
---|
197 | magic data that uniquely identifies the required codec. It is the |
---|
198 | responsibility of anyone fielding a new codec to make sure it is |
---|
199 | possible to reliably distinguish his/her codec from all other codecs |
---|
200 | in use. There is no fixed way to detect the end of the codec- |
---|
201 | identifying marker. The format of the bos page is dependent on the |
---|
202 | codec and therefore MUST be given in the encapsulation specification |
---|
203 | of that logical bitstream type. Ogg also allows but does not require |
---|
204 | secondary header packets after the bos page for logical bitstreams |
---|
205 | and these must also precede any data packets in any logical |
---|
206 | bitstream. These subsequent header packets are framed into an |
---|
207 | integral number of pages, which will not contain any data packets. |
---|
208 | So, a physical bitstream begins with the bos pages of all logical |
---|
209 | bitstreams containing one initial header packet per page, followed by |
---|
210 | the subsidiary header packets of all streams, followed by pages |
---|
211 | containing data packets. |
---|
212 | |
---|
213 | The encapsulation specification for one or more logical bitstreams is |
---|
214 | called a "media mapping". An example of a media mapping is "Ogg |
---|
215 | Vorbis", which uses the Ogg framework to encapsulate Vorbis-encoded |
---|
216 | audio data for stream-based storage (such as files) and transport |
---|
217 | (such as TCP streams or pipes). Ogg Vorbis provides the name and |
---|
218 | revision of the Vorbis codec, the audio rate and the audio quality on |
---|
219 | the Ogg Vorbis bos page. It also uses two additional header pages |
---|
220 | per logical bitstream. The Ogg Vorbis bos page starts with the byte |
---|
221 | 0x01, followed by "vorbis" (a total of 7 bytes of identifier). |
---|
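As a small illustration of codec identification by magic data, the following Python sketch checks for the 7-byte Ogg Vorbis identifier given above. It assumes the caller passes in the first packet carried by the bos page (the bytes following the page header); the names `VORBIS_BOS_MAGIC` and `looks_like_vorbis_bos` are hypothetical.

```python
VORBIS_BOS_MAGIC = b"\x01vorbis"  # byte 0x01 followed by "vorbis", 7 bytes


def looks_like_vorbis_bos(first_packet: bytes) -> bool:
    """Return True if the packet starts with the Ogg Vorbis identifier."""
    return first_packet[:len(VORBIS_BOS_MAGIC)] == VORBIS_BOS_MAGIC
```

Other codecs would need their own markers as given in their respective media mappings; none are assumed here.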
222 | |
---|
231 | Ogg supports two types of multiplexing: concurrent multiplexing |
---|
232 | (so-called "Grouping") and sequential multiplexing (so-called |
---|
233 | "Chaining"). Grouping defines how to interleave several logical |
---|
234 | bitstreams page-wise in the same physical bitstream. Grouping is |
---|
235 | needed, for example, to interleave a video stream with several |
---|
236 | synchronised audio tracks using different codecs in different logical |
---|
237 | bitstreams. Chaining, on the other hand, is defined to provide a |
---|
238 | simple mechanism to concatenate physical Ogg bitstreams, as is often |
---|
239 | needed for streaming applications. |
---|
240 | |
---|
241 | In grouping, all bos pages of all logical bitstreams MUST appear |
---|
242 | together at the beginning of the Ogg bitstream. The media mapping |
---|
243 | specifies the order of the initial pages. For example, the grouping |
---|
244 | of a specific Ogg video and Ogg audio bitstream may specify that the |
---|
245 | physical bitstream MUST begin with the bos page of the logical video |
---|
246 | bitstream, followed by the bos page of the audio bitstream. Unlike |
---|
247 | bos pages, eos pages for the logical bitstreams need not all occur |
---|
248 | contiguously. Eos pages may be 'nil' pages, that is, pages |
---|
249 | containing no content but simply a page header with position |
---|
250 | information and the eos flag set in the page header. Each grouped |
---|
251 | logical bitstream MUST have a unique serial number within the scope |
---|
252 | of the physical bitstream. |
---|
253 | |
---|
254 | In chaining, complete logical bitstreams are concatenated. The |
---|
255 | bitstreams do not overlap, i.e., the eos page of a given logical |
---|
256 | bitstream is immediately followed by the bos page of the next. Each |
---|
257 | chained logical bitstream MUST have a unique serial number within the |
---|
258 | scope of the physical bitstream. |
---|
259 | |
---|
260 | It is possible to consecutively chain groups of concurrently |
---|
261 | multiplexed bitstreams. The groups, when unchained, MUST stand on |
---|
262 | their own as a valid concurrently multiplexed bitstream. The |
---|
263 | following diagram shows a schematic example of such a physical |
---|
264 | bitstream that obeys all the rules of both grouped and chained |
---|
265 | multiplexed bitstreams. |
---|
266 | |
---|
267 |               physical bitstream with pages of |
---|
268 |        different logical bitstreams grouped and chained |
---|
269 | ------------------------------------------------------------- |
---|
270 | |*A*|*B*|*C*|A|A|C|B|A|B|#A#|C|...|B|C|#B#|#C#|*D*|D|...|#D#| |
---|
271 | ------------------------------------------------------------- |
---|
272 |  bos bos bos             eos           eos eos bos       eos  |
---|
273 | |
---|
274 | In this example, there are two chained physical bitstreams, the first |
---|
275 | of which is a grouped stream of three logical bitstreams A, B, and C. |
---|
276 | The second physical bitstream is chained after the end of the grouped |
---|
277 | bitstream, which ends after the last eos page of all its grouped |
---|
278 | logical bitstreams. As can be seen, grouped bitstreams begin |
---|
287 | together - all of the bos pages MUST appear before any data pages. |
---|
288 | It can also be seen that pages of concurrently multiplexed bitstreams |
---|
289 | need not conform to a regular order. And it can be seen that a |
---|
290 | grouped bitstream can end long before the other bitstreams in the |
---|
291 | group end. |
---|
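The ordering rules for grouping and chaining can be checked mechanically. The following Python sketch is one possible way to do so, under stated assumptions: pages are reduced to hypothetical (serial, kind) pairs with kind being "bos", "data" or "eos", the elided pages ("...") of the diagram are left out, and a page that is both bos and eos is not handled. The names `split_chain` and `expand` are illustrative only.

```python
def split_chain(pages):
    """Split a page sequence into chained groups, checking the ordering rules.

    `pages` is a list of (serial, kind) pairs with kind in
    {"bos", "data", "eos"}; this representation is hypothetical.
    """
    links, current = [], []
    seen_serials, open_streams = set(), set()
    past_bos_pages = False
    for serial, kind in pages:
        if kind == "bos":
            if past_bos_pages:
                raise ValueError("bos page after a non-bos page of the group")
            if serial in seen_serials:
                raise ValueError("serial number reused in physical bitstream")
            seen_serials.add(serial)
            open_streams.add(serial)
        else:
            if serial not in open_streams:
                raise ValueError("page belongs to no open logical bitstream")
            past_bos_pages = True
            if kind == "eos":
                open_streams.remove(serial)
        current.append((serial, kind))
        # A group is complete once every stream it opened has reached eos.
        if past_bos_pages and not open_streams:
            links.append(current)
            current, past_bos_pages = [], False
    if current:
        raise ValueError("physical bitstream ends before all eos pages")
    return links


def expand(diagram):
    """Turn tokens such as '*A*', 'A', '#A#' into (serial, kind) pairs."""
    kinds = {"*": "bos", "#": "eos"}
    return [(tok.strip("*#"), kinds.get(tok[0], "data"))
            for tok in diagram.split()]


example = expand("*A* *B* *C* A A C B A B #A# C B C #B# #C# *D* D #D#")
groups = split_chain(example)
```

For the page sequence of the diagram, `groups` holds two entries: the grouped bitstreams A, B and C, followed by the chained bitstream D.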
292 | |
---|
293 | Ogg does not know any specifics about the codec data except that each |
---|
294 | logical bitstream belongs to a different codec, the data from the |
---|
295 | codec comes in order and has position markers (so-called "Granule |
---|
296 | positions"). Ogg does not have a concept of 'time': it only knows |
---|
297 | about sequentially increasing, unitless position markers. An |
---|
298 | application can only get temporal information through higher layers |
---|
299 | which have access to the codec APIs to assign and convert granule |
---|
300 | positions or time. |
---|
301 | |
---|
302 | A specific definition of a media mapping using Ogg may put further |
---|
303 | constraints on its specific use of the Ogg bitstream format. For |
---|
304 | example, a specific media mapping may require that all the eos pages |
---|
305 | for all grouped bitstreams appear in direct sequence. An |
---|
306 | example of a media mapping is the specification of "Ogg Vorbis". |
---|
307 | Another example is the upcoming "Ogg Theora" specification which |
---|
308 | encapsulates Theora-encoded video data and usually comes multiplexed |
---|
309 | with a Vorbis stream for an Ogg containing synchronised audio and |
---|
310 | video. As Ogg does not specify temporal relationships between the |
---|
311 | encapsulated concurrently multiplexed bitstreams, the temporal |
---|
312 | synchronisation between the audio and video stream will be specified |
---|
313 | in this media mapping. To enable streaming, pages from various |
---|
314 | logical bitstreams will typically be interleaved in chronological |
---|
315 | order. |
---|
316 | |
---|
317 | 5. The encapsulation process |
---|
318 | |
---|
319 | The process of multiplexing different logical bitstreams happens at |
---|
320 | the level of pages as described above. The bitstreams provided by |
---|
321 | encoders are however handed over to Ogg as so-called "Packets" with |
---|
322 | packet boundaries dependent on the encoding format. The process of |
---|
323 | encapsulating packets into pages will be described now. |
---|
324 | |
---|
325 | From Ogg's perspective, packets can be of any arbitrary size. A |
---|
326 | specific media mapping will define how to group or break up packets |
---|
327 | from a specific media encoder. As Ogg pages have a maximum size of |
---|
328 | about 64 kBytes, sometimes a packet has to be distributed over |
---|
329 | several pages. To simplify that process, Ogg divides each packet |
---|
330 | into 255 byte long chunks plus a final shorter chunk. These chunks |
---|
331 | are called "Ogg Segments". They are only a logical construct and do |
---|
332 | not have a header for themselves. |
---|
333 | |
---|
343 | A group of contiguous segments is wrapped into a variable length page |
---|
344 | preceded by a header. A segment table in the page header lists |
---|
345 | the "Lacing values" (sizes) of the segments included in the page. A |
---|
346 | flag in the page header tells whether a page contains a packet |
---|
347 | continued from a previous page. Note that a lacing value of 255 |
---|
348 | implies that another lacing value follows for the same packet, and a value |
---|
349 | of less than 255 marks the end of the packet after that many |
---|
350 | additional bytes. A packet of 255 bytes (or a multiple of 255 bytes) |
---|
351 | is terminated by a lacing value of 0. Note also that a 'nil' (zero |
---|
352 | length) packet is not an error; it consists of nothing more than a |
---|
353 | lacing value of zero in the header. |
---|
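The lacing rules above can be sketched in a few lines of Python. The function names `lacing_values` and `packet_lengths` are hypothetical; the second function only handles packets that end within the given segment table and ignores a trailing packet that continues on the next page.

```python
def lacing_values(packet_length: int) -> list[int]:
    """Lacing values for one packet: 255 per full segment, then the rest.

    A length that is a multiple of 255 (including 0) ends with a
    terminating lacing value of 0.
    """
    full_segments, remainder = divmod(packet_length, 255)
    return [255] * full_segments + [remainder]


def packet_lengths(segment_table: list[int]) -> list[int]:
    """Recover the lengths of all packets that end in this segment table."""
    lengths, current = [], 0
    for value in segment_table:
        current += value
        if value < 255:  # a value below 255 marks the end of a packet
            lengths.append(current)
            current = 0
    return lengths
```

For example, a 300 byte packet yields the lacing values 255 and 45, a 255 byte packet yields 255 and 0, and a nil packet yields the single value 0.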
354 | |
---|
355 | The encoding is optimized for speed and the expected case of the |
---|
356 | majority of packets being between 50 and 200 bytes in size. This is a |
---|
357 | design justification rather than a recommendation. This encoding |
---|
358 | avoids imposing a maximum packet size while adding only minimal |
---|
359 | overhead to small packets. In contrast, e.g., simply using |
---|
360 | two bytes at the head of every packet and having a max packet size of |
---|
361 | 32 kBytes would always penalize small packets (< 255 bytes, the |
---|
362 | typical case) with twice the segmentation overhead. Using the lacing |
---|
363 | values as suggested, small packets see the minimum possible byte- |
---|
364 | aligned overhead (1 byte) and large packets (>512 bytes) see a fairly |
---|
365 | constant ~0.5% overhead on encoding space. |
---|
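The overhead figures above can be reproduced with a short calculation. This Python sketch counts only the lacing values of a single packet and deliberately leaves out the fixed part of the page header; the function name `lacing_overhead` is hypothetical.

```python
def lacing_overhead(packet_length: int) -> float:
    """Fraction of extra bytes spent on lacing values for one packet."""
    lacing_bytes = packet_length // 255 + 1  # one byte per segment
    return lacing_bytes / packet_length


small = lacing_overhead(100)           # 1 lacing byte for 100 bytes
large = lacing_overhead(5000)          # 20 lacing bytes for 5000 bytes
fixed_two_byte_prefix = 2 / 100        # the alternative scheme, for comparison
```

Under these assumptions, a 100 byte packet carries 1% overhead (against 2% with a fixed two-byte length prefix), and a 5000 byte packet carries 0.4%; for very large packets the ratio approaches 1/255.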
366 | |
---|
399 | The following diagram shows a schematic example of a media mapping |
---|
400 | using Ogg and grouped logical bitstreams: |
---|
401 | |
---|
402 |           logical bitstream with packet boundaries |
---|
403 |  ----------------------------------------------------------------- |
---|
404 |  > |       packet_1             | packet_2         | packet_3 |  < |
---|
405 |  ----------------------------------------------------------------- |
---|
406 | |
---|
407 |                      |segmentation (logically only) |
---|
408 |                      v |
---|
409 | |
---|
410 |       packet_1 (5 segments)          packet_2 (4 segs)    p_3 (2 segs) |
---|
411 |      ------------------------------ -------------------- ------------ |
---|
412 |  ..  |seg_1|seg_2|seg_3|seg_4|s_5 | |seg_1|seg_2|seg_3|| |seg_1|s_2 | .. |
---|
413 |      ------------------------------ -------------------- ------------ |
---|
414 | |
---|
415 |                      | page encapsulation |
---|
416 |                      v |
---|
417 | |
---|
418 |  page_1 (packet_1 data)   page_2 (pket_1 data)   page_3 (packet_2 data) |
---|
419 | ------------------------  ----------------  ------------------------ |
---|
420 | |H|------------------- |  |H|----------- |  |H|------------------- | |
---|
421 | |D||seg_1|seg_2|seg_3| |  |D|seg_4|s_5 | |  |D||seg_1|seg_2|seg_3| | ... |
---|
422 | |R|------------------- |  |R|----------- |  |R|------------------- | |
---|
423 | ------------------------  ----------------  ------------------------ |
---|
424 | |
---|
425 |                     | |
---|
426 | pages of            | |
---|
427 | other    --------|  | |
---|
428 | logical         ------- |
---|
429 | bitstreams      | MUX | |
---|
430 |                 ------- |
---|
431 |                    | |
---|
432 |                    v |
---|
433 | |
---|
434 |               page_1  page_2          page_3 |
---|
435 |       ------  ------  -------  -----  ------- |
---|
436 |  ...  ||   |  ||   |  ||    |  ||  |  ||    |  ... |
---|
437 |       ------  ------  -------  -----  ------- |
---|
438 |               physical Ogg bitstream |
---|
439 | |
---|
440 | In this example we take a snapshot of the encapsulation process of |
---|
441 | one logical bitstream. We can see part of that bitstream's |
---|
442 | subdivision into packets as provided by the codec. The Ogg |
---|
443 | encapsulation process chops up the packets into segments. The |
---|
444 | packets in this example are rather large such that packet_1 is split |
---|
445 | into 5 segments - 4 segments with 255 bytes and a final smaller one. |
---|
446 | Packet_2 is split into 4 segments - 3 segments with 255 bytes and a |
---|
455 | final very small one - and packet_3 is split into two segments. The |
---|
456 | encapsulation process then creates pages, which are quite small in |
---|
457 | this example. Page_1 consists of the first three segments of |
---|
458 | packet_1, page_2 contains the remaining 2 segments from packet_1, and |
---|
459 | page_3 contains the first three segments of packet_2. Finally, this |
---|
460 | logical bitstream is multiplexed into a physical Ogg bitstream with |
---|
461 | pages of other logical bitstreams. |
---|
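The reverse direction, recovering packets from the pages of one logical bitstream, can be sketched as follows in Python. Each page is represented by a hypothetical (continued, segment_table, body) triple, where `continued` corresponds to the continuation flag of the page header. Error handling is reduced to discarding a dangling fragment, which is a simplification for this example.

```python
def reassemble(pages):
    """Rebuild packets from the pages of ONE logical bitstream.

    `pages` is an iterable of (continued, segment_table, body) triples;
    this representation is hypothetical.
    """
    packets, partial = [], bytearray()
    for continued, segment_table, body in pages:
        if not continued:
            partial.clear()  # discard a dangling fragment, if any
        offset = 0
        for lacing in segment_table:
            partial += body[offset:offset + lacing]
            offset += lacing
            if lacing < 255:  # packet ends with this segment
                packets.append(bytes(partial))
                partial.clear()
    return packets


# packet_1 as in the example: four 255 byte segments and a smaller one.
packet_1 = bytes(range(256)) * 4 + b"tail"            # 1028 bytes
page_1 = (False, [255, 255, 255], packet_1[:765])     # first three segments
page_2 = (True, [255, 8], packet_1[765:])             # remaining two segments
recovered = reassemble([page_1, page_2])
```

Note that the packet boundary is found from the lacing values alone; the packet data itself is never inspected.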
462 | |
---|
463 | 6. The Ogg page format |
---|
464 | |
---|
465 | A physical Ogg bitstream consists of a sequence of concatenated |
---|
466 | pages. Pages are of variable size, usually 4-8 kB, maximum 65307 |
---|
467 | bytes. A page header contains all the information needed to |
---|
468 | demultiplex the logical bitstreams out of the physical bitstream and |
---|
469 | to perform basic error recovery and landmarks for seeking. Each page |
---|
470 | is a self-contained entity such that the page decode mechanism can |
---|
471 | recognize, verify, and handle single pages at a time without |
---|
472 | requiring the overall bitstream. |
---|
473 | |
---|
474 | The Ogg page header has the following format: |
---|
475 | |
---|
476 |  0                   1                   2                   3 |
---|
477 |  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte |
---|
478 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
---|
479 | | capture_pattern: Magic number for page start "OggS"           | 0-3 |
---|
480 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
---|
481 | | version       | header_type   | granule_position              | 4-7 |
---|
482 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
---|
483 | |                                                               | 8-11 |
---|
484 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
---|
485 | |                               | bitstream_serial_number       | 12-15 |
---|
486 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
---|
487 | |                               | page_sequence_number          | 16-19 |
---|
488 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
---|
489 | |                               | CRC_checksum                  | 20-23 |
---|
490 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
---|
491 | |                               |page_segments  | segment_table | 24-27 |
---|
492 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
---|
493 | | ...                                                           | 28- |
---|
494 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
---|
495 | |
---|
496 | The LSb (least significant bit) comes first in the Bytes. Fields |
---|
497 | with more than one byte length are encoded LSB (least significant |
---|
498 | byte) first. |
---|
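As an illustration of the layout and byte order, the following Python sketch unpacks the fixed 27 byte part of a page header and the segment table that follows it. The field widths are those given in this section (4, 1, 1, 8, 4, 4, 4 and 1 bytes), and the flag bits are those of the header_type field. The function name `parse_page_header`, the dictionary keys and the choice to read granule_position as a signed integer are assumptions made for this example.

```python
import struct

# capture_pattern, version, header_type, granule_position,
# bitstream_serial_number, page_sequence_number, CRC_checksum, page_segments
FIXED_HEADER = struct.Struct("<4sBBqIIIB")  # "<": least significant byte first


def parse_page_header(data: bytes) -> dict:
    """Parse the page header found at the start of `data`."""
    if len(data) < FIXED_HEADER.size:
        raise ValueError("need at least 27 bytes")
    (capture, version, header_type, granule, serial,
     sequence, crc, n_segments) = FIXED_HEADER.unpack_from(data)
    if capture != b"OggS":
        raise ValueError("capture pattern not found")
    table_end = FIXED_HEADER.size + n_segments
    if len(data) < table_end:
        raise ValueError("segment table truncated")
    return {
        "version": version,
        "continued": bool(header_type & 0x01),
        "bos": bool(header_type & 0x02),
        "eos": bool(header_type & 0x04),
        "granule_position": granule,  # signed, so the special value reads -1
        "serial_number": serial,
        "page_sequence_number": sequence,
        "crc_checksum": crc,
        "segment_table": list(data[FIXED_HEADER.size:table_end]),
    }


# A hand-built bos page header with one 30 byte segment and no finished packet.
sample = struct.pack("<4sBBqIIIB", b"OggS", 0, 0x02, -1,
                     0xDEADBEEF, 0, 0, 1) + bytes([30])
header = parse_page_header(sample)
```

The sketch does not verify the checksum; it only decodes the fields.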
499 | |
---|
511 | The fields in the page header have the following meaning: |
---|
512 | |
---|
513 | 1. capture_pattern: a 4 Byte field that signifies the beginning of a |
---|
514 | page. It contains the magic numbers: |
---|
515 | |
---|
516 | 0x4f 'O' |
---|
517 | |
---|
518 | 0x67 'g' |
---|
519 | |
---|
520 | 0x67 'g' |
---|
521 | |
---|
522 | 0x53 'S' |
---|
523 | |
---|
524 | It helps a decoder to find the page boundaries and regain |
---|
525 | synchronisation after parsing a corrupted stream. Once the |
---|
526 | capture pattern is found, the decoder verifies page sync and |
---|
527 | integrity by computing and comparing the checksum. |
---|
528 | |
---|
529 | 2. stream_structure_version: 1 Byte signifying the version number of |
---|
530 | the Ogg file format used in this stream (this document specifies |
---|
531 | version 0). |
---|
532 | |
---|
533 | 3. header_type_flag: the bits in this 1 Byte field identify the |
---|
534 | specific type of this page. |
---|
535 | |
---|
536 | * bit 0x01 |
---|
537 | |
---|
538 | set: page contains data of a packet continued from the previous |
---|
539 | page |
---|
540 | |
---|
541 | unset: page contains a fresh packet |
---|
542 | |
---|
543 | * bit 0x02 |
---|
544 | |
---|
545 | set: this is the first page of a logical bitstream (bos) |
---|
546 | |
---|
547 | unset: this page is not a first page |
---|
548 | |
---|
549 | * bit 0x04 |
---|
550 | |
---|
551 | set: this is the last page of a logical bitstream (eos) |
---|
552 | |
---|
553 | unset: this page is not a last page |
---|
554 | |
---|
555 | 4. granule_position: an 8 Byte field containing position information. |
---|
556 | For example, for an audio stream, it MAY contain the total number |
---|
557 | of PCM samples encoded after including all frames finished on this |
---|
558 | page. For a video stream it MAY contain the total number of video |
---|
567 | frames encoded after this page. This is a hint for the decoder |
---|
568 | and gives it some timing and position information. Its meaning is |
---|
569 | dependent on the codec for that logical bitstream and specified in |
---|
570 | a specific media mapping. A special value of -1 (in two's |
---|
571 | complement) indicates that no packets finish on this page. |
---|
572 | |
---|
573 | 5. bitstream_serial_number: a 4 Byte field containing the unique |
---|
574 | serial number by which the logical bitstream is identified. |
---|
575 | |
---|
576 | 6. page_sequence_number: a 4 Byte field containing the sequence |
---|
577 | number of the page so the decoder can identify page loss. This |
---|
578 | sequence number is increasing on each logical bitstream |
---|
579 | separately. |
---|
580 | |
---|
581 | 7. CRC_checksum: a 4 Byte field containing a 32 bit CRC checksum of |
---|
582 | the page (including header with zero CRC field and page content). |
---|
583 | The generator polynomial is 0x04c11db7. |
---|
584 | |
---|
585 | 8. number_page_segments: 1 Byte giving the number of segment entries |
---|
586 | encoded in the segment table. |
---|
587 | |
---|
588 | 9. segment_table: number_page_segments Bytes containing the lacing |
---|
589 | values of all segments in this page. Each Byte contains one |
---|
590 | lacing value. |
---|
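The checksum of field 7 can be sketched in Python as a bit-by-bit CRC with the generator polynomial 0x04c11db7. This document gives only the polynomial; the initial value of 0, the most-significant-bit-first processing and the absence of a final XOR are assumptions made here to match common Ogg implementations. The byte offsets 22 to 25 of the CRC field follow from the field widths given above, and the function names are hypothetical.

```python
POLYNOMIAL = 0x04C11DB7


def ogg_crc32(data: bytes) -> int:
    """Bitwise CRC-32 over `data` (assumed: init 0, MSB first, no final XOR)."""
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLYNOMIAL) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def page_checksum_ok(page: bytes) -> bool:
    """Recompute the checksum of a complete page with the CRC field zeroed."""
    stored = int.from_bytes(page[22:26], "little")
    zeroed = page[:22] + b"\x00\x00\x00\x00" + page[26:]
    return ogg_crc32(zeroed) == stored


def with_checksum(page: bytes) -> bytes:
    """Return `page` with its CRC field filled in."""
    zeroed = page[:22] + b"\x00\x00\x00\x00" + page[26:]
    return page[:22] + ogg_crc32(zeroed).to_bytes(4, "little") + page[26:]
```

A table-driven variant would be faster; the bitwise form is used here only because it shows the role of the polynomial directly.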
591 | |
---|
592 | The total header size in bytes is given by: |
---|
593 | header_size = number_page_segments + 27 [Byte] |
---|
594 | |
---|
595 | The total page size in Bytes is given by: |
---|
596 | page_size = header_size + sum(lacing_values: 1..number_page_segments) |
---|
597 | [Byte] |
---|
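The two formulas can be written out directly, which also confirms the maximum page size of 65307 bytes stated at the beginning of this section. The following Python sketch uses the hypothetical function names `header_size` and `page_size`.

```python
def header_size(number_page_segments: int) -> int:
    """Total header size in bytes: 27 fixed bytes plus the segment table."""
    return number_page_segments + 27


def page_size(segment_table: list[int]) -> int:
    """Total page size in bytes: header plus the sum of all lacing values."""
    return header_size(len(segment_table)) + sum(segment_table)


# Largest possible page: 255 segments of 255 bytes each.
largest = page_size([255] * 255)  # 27 + 255 + 255 * 255
```

With 255 segments of 255 bytes each, the page size is 27 + 255 + 65025 = 65307 bytes.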
598 | |
---|
599 | 7. Security Considerations |
---|
600 | |
---|
601 | The Ogg encapsulation format is a container format and only |
---|
602 | encapsulates content (such as Vorbis-encoded audio). It does not |
---|
603 | provide for any generic encryption or signing of itself or its |
---|
604 | contained content bitstreams. However, it encapsulates any kind of |
---|
605 | content bitstream as long as there is a codec for it, and is thus |
---|
606 | able to contain encrypted and signed content data. It is also |
---|
607 | possible to add an external security mechanism that encrypts or signs |
---|
608 | an Ogg physical bitstream and thus provides content confidentiality |
---|
609 | and authenticity. |
---|
610 | |
---|
611 | As Ogg encapsulates binary data, it is possible to include executable |
---|
612 | content in an Ogg bitstream. This can be an issue with applications |
---|
613 | that are implemented using the Ogg format, especially when Ogg is |
---|
614 | used for streaming or file transfer in a networking scenario. As |
---|
623 | such, Ogg does not pose a threat there. However, an application |
---|
624 | decoding Ogg and its encapsulated content bitstreams has to ensure |
---|
625 | correct handling of manipulated bitstreams, of buffer overflows and |
---|
626 | the like. |
---|
627 | |
---|
628 | 8. References |
---|
629 | |
---|
630 | [1] Walleij, L., "The application/ogg Media Type", RFC 3534, May |
---|
631 | 2003. |
---|
632 | |
---|
633 | [2] Bradner, S., "Key words for use in RFCs to Indicate Requirement |
---|
634 | Levels", BCP 14, RFC 2119, March 1997. |
---|
635 | |
---|
679 | Appendix A. Glossary of terms and abbreviations |
---|
680 | |
---|
681 | bos page: The initial page (beginning of stream) of a logical |
---|
682 | bitstream which contains information to identify the codec type |
---|
683 | and other decoding-relevant information. |
---|
684 | |
---|
685 | chaining (or sequential multiplexing): Concatenation of two or more |
---|
686 | complete physical Ogg bitstreams. |
---|
687 | |
---|
688 | eos page: The final page (end of stream) of a logical bitstream. |
---|
689 | |
---|
690 | granule position: An increasing position number, stored in the page |
---|
691 | header, for a specific logical bitstream. Its meaning depends on the |
---|
692 | codec for that logical bitstream and is specified in a specific |
---|
693 | media mapping. |
---|
694 | |
---|
695 | grouping (or concurrent multiplexing): Interleaving of pages of |
---|
696 | several logical bitstreams into one complete physical Ogg |
---|
697 | bitstream under the restriction that all bos pages of all grouped |
---|
698 | logical bitstreams MUST appear before any data pages. |
---|
699 | |
---|
700 | lacing value: An entry in the segment table of a page header |
---|
701 | representing the size of the related segment. |
---|
702 | |
---|
703 | logical bitstream: A sequence of bits that results from encoding a |
---|
704 | media stream. |
---|
705 | |
---|
706 | media mapping: A specific use of the Ogg encapsulation format |
---|
707 | together with a specific (set of) codec(s). |
---|
708 | |
---|
709 | (Ogg) packet: A subpart of a logical bitstream that is created by the |
---|
710 | encoder for that bitstream and represents a meaningful entity for |
---|
711 | the encoder, but only a sequence of bits to the Ogg encapsulation. |
---|
712 | |
---|
713 | (Ogg) page: A physical bitstream consists of a sequence of Ogg pages, |
---|
714 | each containing data of one logical bitstream only. A page usually |
---|
715 | contains a group of contiguous segments of one packet only, but |
---|
716 | sometimes packets are too large and need to be split over several |
---|
717 | pages. |
---|
718 | |
---|
719 | physical (Ogg) bitstream: The sequence of bits resulting from an Ogg |
---|
720 | encapsulation of one or several logical bitstreams. It consists |
---|
721 | of a sequence of pages from the logical bitstreams with the |
---|
722 | restriction that the pages of one logical bitstream MUST come in |
---|
723 | their correct temporal order. |
---|
724 | |
---|
735 | (Ogg) segment: The Ogg encapsulation process splits each packet into |
---|
736 | chunks of 255 bytes plus a last fractional chunk of less than 255 |
---|
737 | bytes. These chunks are called segments. |
---|
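
The relationship between packets, segments and lacing values defined
in this glossary can be sketched in Python as follows. This is a
minimal illustration, not a reference implementation; the function
names lacing_values and split_into_segments are assumptions made for
this example.

```python
from typing import List


def lacing_values(packet_length: int) -> List[int]:
    """Return the lacing values for a packet of the given byte length.

    Each full segment contributes a value of 255. The final value is
    always less than 255 and may be 0, which is how a decoder
    recognises the end of the packet.
    """
    if packet_length < 0:
        raise ValueError("packet length must not be negative")
    full_segments, remainder = divmod(packet_length, 255)
    return [255] * full_segments + [remainder]


def split_into_segments(packet: bytes) -> List[bytes]:
    """Split a packet into segments, one per lacing value."""
    segments = []
    position = 0
    for size in lacing_values(len(packet)):
        segments.append(packet[position:position + size])
        position += size
    return segments


print(lacing_values(600))  # [255, 255, 90]
print(lacing_values(510))  # [255, 255, 0]
print(lacing_values(0))    # [0]
```

A packet whose length is an exact multiple of 255 bytes therefore
ends with a lacing value of 0 and an empty final segment.
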
738 | |
---|
739 | Appendix B. Acknowledgements |
---|
740 | |
---|
741 | The author gratefully acknowledges the work that Christopher |
---|
742 | Montgomery and the Xiph.Org Foundation have done in defining the Ogg |
---|
743 | multimedia project and as part of it the open file format described |
---|
744 | in this document. The author hopes that providing this document to |
---|
745 | the Internet community will help in promoting the Ogg multimedia |
---|
746 | project at http://www.xiph.org/. Many thanks also for the numerous |
---|
747 | technical and typographical corrections that C. Montgomery and the |
---|
748 | Ogg community provided as feedback on this RFC. |
---|
749 | |
---|
750 | Author's Address |
---|
751 | |
---|
752 | Silvia Pfeiffer |
---|
753 | CSIRO, Australia |
---|
754 | Locked Bag 17 |
---|
755 | North Ryde, NSW 2113 |
---|
756 | Australia |
---|
757 | |
---|
758 | Phone: +61 2 9325 3141 |
---|
759 | EMail: Silvia.Pfeiffer@csiro.au |
---|
760 | URI: http://www.cmis.csiro.au/Silvia.Pfeiffer/ |
---|
761 | |
---|
791 | Full Copyright Statement |
---|
792 | |
---|
793 | Copyright (C) The Internet Society (2003). All Rights Reserved. |
---|
794 | |
---|
795 | This document and translations of it may be copied and furnished to |
---|
796 | others, and derivative works that comment on or otherwise explain it |
---|
797 | or assist in its implementation may be prepared, copied, published |
---|
798 | and distributed, in whole or in part, without restriction of any |
---|
799 | kind, provided that the above copyright notice and this paragraph are |
---|
800 | included on all such copies and derivative works. However, this |
---|
801 | document itself may not be modified in any way, such as by removing |
---|
802 | the copyright notice or references to the Internet Society or other |
---|
803 | Internet organizations, except as needed for the purpose of |
---|
804 | developing Internet standards in which case the procedures for |
---|
805 | copyrights defined in the Internet Standards process must be |
---|
806 | followed, or as required to translate it into languages other than |
---|
807 | English. |
---|
808 | |
---|
809 | The limited permissions granted above are perpetual and will not be |
---|
810 | revoked by the Internet Society or its successors or assigns. |
---|
811 | |
---|
812 | This document and the information contained herein is provided on an |
---|
813 | "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING |
---|
814 | TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING |
---|
815 | BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION |
---|
816 | HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF |
---|
817 | MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. |
---|
818 | |
---|
819 | Acknowledgement |
---|
820 | |
---|
821 | Funding for the RFC Editor function is currently provided by the |
---|
822 | Internet Society. |
---|