1 | <!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
---|
2 | <html> |
---|
3 | <!-- |
---|
4 | (C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com . |
---|
5 | Use, modification and distribution is subject to the Boost Software |
---|
6 | License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at |
---|
7 | http://www.boost.org/LICENSE_1_0.txt) |
---|
8 | --> |
---|
9 | <head> |
---|
10 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
---|
11 | <link rel="stylesheet" type="text/css" href="../../../boost.css"> |
---|
12 | <link rel="stylesheet" type="text/css" href="style.css"> |
---|
13 | <title>Serialization - Special Considerations</title> |
---|
14 | </head> |
---|
15 | <body link="#0000ff" vlink="#800080"> |
---|
16 | <table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header"> |
---|
17 | <tr> |
---|
18 | <td valign="top" width="300"> |
---|
19 | <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3> |
---|
20 | </td> |
---|
21 | <td valign="top"> |
---|
22 | <h1 align="center">Serialization</h1> |
---|
23 | <h2 align="center">Special Considerations</h2> |
---|
24 | </td> |
---|
25 | </tr> |
---|
26 | </table> |
---|
27 | <hr> |
---|
28 | <dl class="page-index"> |
---|
29 | <dt><a href="#objecttracking">Object Tracking</a> |
---|
30 | <dt><a href="#export">Exporting Class Serialization</a> |
---|
31 | <dt><a href="#classinfo">Class Information</a> |
---|
32 | <dt><a href="#portability">Archive Portability</a> |
---|
33 | <dl class="page-index"> |
---|
34 | <dt><a href="#numerics">Numerics</a> |
---|
35 | <dt><a href="#traits">Traits</a> |
---|
36 | </dl> |
---|
37 | <dt><a href="#binary_archives">Binary Archives</a> |
---|
38 | <dt><a href="#xml_archives">XML Archives</a> |
---|
39 | <dt><a href="exceptions.html">Archive Exceptions</a> |
---|
40 | <dt><a href="exception_safety.html">Exception Safety</a> |
---|
41 | </dl> |
---|
42 | |
---|
43 | <h3><a name="objecttracking">Object Tracking</a></h3> |
---|
44 | Depending on how the class is used and other factors, serialized objects |
---|
45 | may be tracked by memory address. This prevents the same object from being |
---|
46 | written to or read from an archive multiple times. These stored addresses |
---|
47 | can also be used to delete objects created during a loading process |
---|
48 | that has been interrupted by throwing of an exception. |
---|
49 | <p> |
---|
50 | This could cause problems in |
---|
51 | programs where copies of different objects are saved from the same address. |
---|
52 | <pre><code> |
---|
53 | template<class Archive> |
---|
54 | void save(Archive & ar, const unsigned int version) const |
---|
55 | { |
---|
56 | for(int i = 0; i < 10; ++i){ |
---|
57 | A x = a[i]; |
---|
58 | ar << x; |
---|
59 | } |
---|
60 | } |
---|
61 | </code></pre> |
---|
62 | In this case, the data to be saved exists on the stack. Each iteration |
---|
63 | of the loop updates the value on the stack. So although the data changes |
---|
64 | each iteration, the address of the data doesn't. If <code>a</code> is an array of |
---|
65 | objects being tracked by memory address, the library will skip storing |
---|
66 | objects after the first as it will be assumed that objects at the same address |
---|
67 | are really the same object. |
---|
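The effect can be reproduced without the library. The following is a minimal sketch, assuming a hypothetical `toy_oarchive` (not a Boost class) that records the address of each object it saves and skips any address it has already seen. The local `x` is declared outside the loop so that its address is guaranteed to stay the same; a loop-local variable typically reuses the same stack slot as well.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a tracking output archive; not part of Boost.
struct toy_oarchive {
    std::set<const void *> seen;  // addresses already written
    std::vector<int> written;     // values that actually reached the "archive"

    // Save a value unless an object at the same address was saved before.
    void save(const int & x) {
        if (seen.insert(&x).second)
            written.push_back(x);
    }
};

// Copies each element into the same local variable before saving it.
// Returns the number of values the archive actually stored.
std::size_t saved_via_copy(const std::vector<int> & a) {
    toy_oarchive ar;
    int x = 0;  // one address, reused for every element
    for (std::size_t i = 0; i < a.size(); ++i) {
        x = a[i];
        ar.save(x);  // same address each time: skipped after the first
    }
    return ar.written.size();
}

// Saves each element in place, so every save sees a distinct address.
std::size_t saved_in_place(const std::vector<int> & a) {
    toy_oarchive ar;
    for (std::size_t i = 0; i < a.size(); ++i)
        ar.save(a[i]);
    return ar.written.size();
}
```

For a ten-element vector, `saved_via_copy` reports a single stored value while `saved_in_place` reports ten, which mirrors the behaviour described above.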
68 | <p> |
---|
69 | To help detect such cases, output archive operators expect to be passed |
---|
70 | <code style="white-space: normal">const</code> reference arguments. |
---|
71 | <p> |
---|
72 | Given this, the above code will invoke a compile time assertion. |
---|
73 | The obvious fix in this example is to use |
---|
74 | <pre><code> |
---|
75 | template<class Archive> |
---|
76 | void save(Archive & ar, const unsigned int version) const |
---|
77 | { |
---|
78 | for(int i = 0; i < 10; ++i){ |
---|
79 | ar << a[i]; |
---|
80 | } |
---|
81 | } |
---|
82 | </code></pre> |
---|
83 | which will compile and run without problem. |
---|
84 | The usage of <code style="white-space: normal">const</code> by the output archive operators |
---|
85 | will ensure that the process of serialization doesn't |
---|
86 | change the state of the objects being serialized. An attempt to do this |
---|
87 | would constitute augmentation of the concept of saving of state with |
---|
88 | some sort of non-obvious side effect. This would almost surely be a mistake |
---|
89 | and a likely source of very subtle bugs. |
---|
90 | <p> |
---|
91 | Unfortunately, implementation issues currently prevent the detection of this kind of |
---|
92 | error when the data item is wrapped as a name-value pair. |
---|
93 | <p> |
---|
94 | A similar problem can occur when different objects are loaded to an address |
---|
95 | which is different from the final location: |
---|
96 | <pre><code> |
---|
97 | template<class Archive> |
---|
98 | void load(Archive & ar, const unsigned int version) |
---|
99 | { |
---|
100 | for(int i = 0; i < 10; ++i){ |
---|
101 | A x; |
---|
102 | ar >> x; |
---|
103 | m_set.insert(x); |
---|
104 | } |
---|
105 | } |
---|
106 | </code></pre> |
---|
107 | In this case, the address of <code>x</code> is the one that is tracked rather than |
---|
108 | the address of the new item added to the set. Left unaddressed, |
---|
109 | this will break features that depend on tracking, such as loading an object through a pointer, |
---|
110 | and introduce subtle bugs into the program. This can be |
---|
111 | addressed by altering the above code as follows: |
---|
112 | |
---|
113 | <pre><code> |
---|
114 | template<class Archive> |
---|
115 | void load(Archive & ar, const unsigned int version) |
---|
116 | { |
---|
117 | for(int i = 0; i < 10; ++i){ |
---|
118 | A x; |
---|
119 | ar >> x; |
---|
120 | std::pair<std::set<A>::const_iterator, bool> result; |
---|
121 | result = m_set.insert(x); |
---|
122 | ar.reset_object_address(& (*result.first), &x); |
---|
123 | } |
---|
124 | } |
---|
125 | </code></pre> |
---|
126 | This will adjust the tracking information to reflect the final resting place of |
---|
127 | the moved variable and thereby rectify the above problem. |
---|
128 | <p> |
---|
129 | If it is known a priori that no pointer |
---|
130 | values are duplicated, overhead associated with object tracking can |
---|
131 | be eliminated by setting the object tracking class serialization trait |
---|
132 | appropriately. |
---|
133 | <p> |
---|
134 | By default, data types designated primitive by |
---|
135 | <a target="detail" href="traits.html#level">Implementation Level</a> |
---|
136 | class serialization trait are never tracked. If it is desired to |
---|
137 | track a shared primitive object through a pointer (e.g. a |
---|
138 | <code style="white-space: normal">long</code> used as a reference count), it should be wrapped |
---|
139 | in a class/struct so that it is an identifiable type. |
---|
140 | The alternative of changing the implementation level of a <code style="white-space: normal">long</code> |
---|
141 | would affect all <code style="white-space: normal">long</code>s serialized in the whole |
---|
142 | program - probably not what one would intend. |
---|
143 | <p> |
---|
144 | It is possible that we may want to track addresses even though |
---|
145 | the object is never serialized through a pointer. For example, |
---|
146 | a virtual base class need be saved/loaded only once. By setting |
---|
147 | this serialization trait to <code style="white-space: normal">track_always</code>, we can suppress |
---|
148 | redundant save/load operations. |
---|
149 | <pre><code> |
---|
150 | BOOST_CLASS_TRACKING(my_virtual_base_class, boost::serialization::track_always) |
---|
151 | </code></pre> |
---|
152 | |
---|
153 | <h3><a name="export">Exporting Class Serialization</a></h3> |
---|
154 | <a target="detail" href="traits.html#export">Elsewhere</a> in this manual, we have described |
---|
155 | <code style="white-space: normal">BOOST_CLASS_EXPORT</code>. This is used to make the serialization library aware |
---|
156 | that code should be instantiated for serialization of a given class even though the |
---|
157 | class hasn't been otherwise referred to by the program. |
---|
158 | <p> |
---|
159 | There are several ways <code style="white-space: normal">BOOST_CLASS_EXPORT</code> could have been |
---|
160 | implemented. |
---|
161 | <p> |
---|
162 | One approach would be to instantiate serialization code for all archive classes included in the library. |
---|
163 | This would add to each executable a large amount of code that is most likely never called. |
---|
164 | Also it would needlessly slow down compilations of any program that uses the library. Finally, |
---|
165 | the list of archives would be "built-in" to the library which would complicate the addition of |
---|
166 | new or custom archive classes. |
---|
167 | <p> |
---|
168 | Another approach would be for the library user to specify explicitly, for each class to be serialized, |
---|
169 | the archive classes for which code should be instantiated. Users would have to include |
---|
170 | header files corresponding to the archive classes to be instantiated. |
---|
171 | The list of instantiated archive classes would have to be manually kept in sync with the |
---|
172 | archive class headers actually included. This was considered burdensome and error prone. |
---|
173 | <p> |
---|
174 | This implementation of <code style="white-space: normal">BOOST_CLASS_EXPORT</code> works in the |
---|
175 | following way: |
---|
176 | <ul> |
---|
177 | <li>All header modules of the form <boost/archive/*archive.hpp> are required to precede |
---|
178 | the header module <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>. |
---|
179 | <li>The header <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a> |
---|
180 | builds a list of archive classes whose header modules have been previously included. |
---|
181 | It does this by checking to see which inclusion guard constants have been defined. |
---|
182 | The header <a href="../../../boost/archive/detail/known_archive_types.hpp" target="known_archive_types_hpp">known_archive_types.hpp</a> |
---|
183 | lists the archive header files whose include guards will be checked. If you create your own |
---|
184 | archive class, you probably want to edit this file. |
---|
185 | <li><code style="white-space: normal">BOOST_CLASS_EXPORT(my_class)</code> explicitly instantiates |
---|
186 | serialization code for <code style="white-space: normal">my_class</code> for each archive in the list. |
---|
187 | </ul> |
---|
188 | Serialization code will be instantiated for a given archive class |
---|
189 | if and only if the module that defines that archive class has been included in the program. |
---|
190 | Given this, our program will contain all necessary code instantiations and no other. |
---|
191 | <p> |
---|
192 | For many styles of code organization this header sequencing requirement presents little problem. |
---|
193 | |
---|
194 | Serialization code organized by class headers that are designed to be independent of archive |
---|
195 | implementations will look something like the following: |
---|
196 | <pre><code> |
---|
197 | // A.hpp |
---|
198 | // Note: to preserve independence from any particular archive implementation, |
---|
199 | // no headers from <boost/archive/...> are included. |
---|
200 | // Headers can be included in any order. |
---|
201 | #include <boost/serialization/...> |
---|
202 | #include <boost/serialization/export.hpp> |
---|
203 | ... // include other headers that A depends upon |
---|
204 | |
---|
205 | class A { |
---|
206 | ... |
---|
207 | }; |
---|
208 | |
---|
209 | BOOST_CLASS_EXPORT(A) // note: the export name of this class |
---|
210 | </code></pre> |
---|
211 | This style: |
---|
212 | <ul> |
---|
213 | <li>permits the header to include all aspects of the serialization implementation. |
---|
214 | <li>permits the header to be included anywhere else as part of some other class declaration. |
---|
215 | <li>reflects the concept of headers as a "library of types" which |
---|
216 | can be used independently in other programs or other parts of the same program. |
---|
217 | <li>reflects a fundamental principle of the serialization library design in that the |
---|
218 | specification of serialization of any class is independent of any archive implementation. |
---|
219 | </ul> |
---|
220 | However, it might not always be possible or convenient to conform to the above style. Something |
---|
221 | like the following might be required or preferred: |
---|
222 | <pre><code> |
---|
223 | // A.hpp |
---|
224 | // headers can be included in any order |
---|
225 | #include <boost/archive/text_oarchive.hpp> |
---|
226 | #include <boost/archive/text_iarchive.hpp> |
---|
227 | ... |
---|
228 | #include <boost/serialization/...> |
---|
229 | ... |
---|
230 | // can't do the following because then A.hpp couldn't be included somewhere else |
---|
231 | // #include <boost/serialization/export.hpp> |
---|
232 | |
---|
233 | class A { |
---|
234 | ... |
---|
235 | }; |
---|
236 | // can't do the following because export.hpp is not included !! |
---|
237 | //BOOST_CLASS_EXPORT(A) // note: the export name of this class |
---|
238 | </code></pre> |
---|
239 | As noted in the comments, this would work. But |
---|
240 | <code style="white-space: normal">#include <.../export.hpp></code> can't be used |
---|
241 | without conflicting with other modules which use |
---|
242 | <code style="white-space: normal">#include <.../*archive.hpp></code>. In this |
---|
243 | case we can move the export to an implementation file: |
---|
244 | <pre><code> |
---|
245 | // A.cpp |
---|
246 | #include "A.hpp" |
---|
247 | ... |
---|
248 | // export.hpp header should be last |
---|
249 | #include <boost/serialization/export.hpp> |
---|
250 | ... |
---|
251 | BOOST_CLASS_EXPORT(A) |
---|
252 | ... |
---|
253 | </code></pre> |
---|
254 | |
---|
255 | <h3><a name="classinfo">Class Information</a></h3> |
---|
256 | By default, for each class serialized, class information is written to the archive. |
---|
257 | This information includes version number, implementation level and tracking |
---|
258 | behavior. This is necessary so that the archive can be correctly |
---|
259 | deserialized even if a subsequent version of the program changes |
---|
260 | some of the current trait values for a class. The space overhead for |
---|
261 | this data is minimal. There is a little bit of runtime overhead |
---|
262 | since each class has to be checked to see if it has already had its |
---|
263 | class information included in the archive. In some cases, even this |
---|
264 | might be considered too much. This extra overhead can be eliminated |
---|
265 | by setting the |
---|
266 | <a target="detail" href="traits.html#level">implementation level</a> |
---|
267 | class trait to: <code style="white-space: normal">boost::serialization::object_serializable</code>. |
---|
268 | <p> |
---|
269 | <i>Turning off tracking and class information serialization will result |
---|
270 | in pure template inline code that in principle could be optimised down |
---|
271 | to a simple stream write/read.</i> Elimination of all serialization overhead |
---|
272 | in this manner comes at a cost. Once archives are released to users, the |
---|
273 | class serialization traits cannot be changed without invalidating the old |
---|
274 | archives. Including the class information in the archive assures us |
---|
275 | that they will be readable in the future even if the class definition |
---|
276 | is revised. A lightweight structure such as a display pixel might be |
---|
277 | declared in a header like this: |
---|
278 | |
---|
279 | <pre><code> |
---|
280 | #include <boost/serialization/serialization.hpp> |
---|
281 | #include <boost/serialization/level.hpp> |
---|
282 | #include <boost/serialization/tracking.hpp> |
---|
283 | |
---|
284 | // a pixel is a light weight struct which is used in great numbers. |
---|
285 | struct pixel |
---|
286 | { |
---|
287 | unsigned char red, green, blue; |
---|
288 | template<class Archive> |
---|
289 | void serialize(Archive & ar, const unsigned int /* version */){ |
---|
290 | ar & red & green & blue; |
---|
291 | } |
---|
292 | }; |
---|
293 | |
---|
294 | // eliminate serialization overhead at the cost of |
---|
295 | // never being able to increase the version. |
---|
296 | BOOST_CLASS_IMPLEMENTATION(pixel, boost::serialization::object_serializable) |
---|
297 | |
---|
298 | // eliminate object tracking (even if serialized through a pointer) |
---|
299 | // at the risk of a programming error creating duplicate objects. |
---|
300 | BOOST_CLASS_TRACKING(pixel, boost::serialization::track_never) |
---|
301 | </code></pre> |
---|
302 | |
---|
303 | <h3><a name="portability">Archive Portability</a></h3> |
---|
304 | Several archive classes create their data in the form of text or a portable binary format. |
---|
305 | It should be possible to save an archive of such a class on one platform and load it on another. |
---|
306 | This is subject to a couple of conditions. |
---|
307 | <h4><a name="numerics">Numerics</a></h4> |
---|
308 | The architecture of the machine reading the archive must be able to hold the data |
---|
309 | saved. For example, the gcc compiler reserves 4 bytes to store a variable of type |
---|
310 | <code style="white-space: normal">wchar_t</code> while other compilers reserve only 2 bytes. |
---|
311 | So it's possible that a value could be written that couldn't be represented by the loading program. This is a |
---|
312 | fairly obvious situation and is easily handled by using the numeric types in |
---|
313 | <a target="cstding" href="../../../boost/cstdint.hpp"><boost/cstdint.hpp></a>. |
---|
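As a sketch of the idea, the record below uses fixed-width types so that its members have the same range on every platform. It assumes the standard `cstdint` header, which supplies the same typedef names as the Boost header; the struct `glyph_record` and the helper `fits_in_16_bits` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical record: every member has the same width on all platforms,
// unlike wchar_t, whose size varies between compilers.
struct glyph_record {
    std::uint32_t code_point;  // wide enough for a 4-byte wchar_t value
    std::uint16_t flags;
};

// The fixed-width typedefs carry their range in the type itself.
static_assert(std::numeric_limits<std::uint32_t>::digits == 32,
              "uint32_t has exactly 32 value bits");
static_assert(std::numeric_limits<std::uint16_t>::digits == 16,
              "uint16_t has exactly 16 value bits");

// True when a value saved from a 32-bit field could also be represented
// by a loading program that only has 16 bits available for it.
inline bool fits_in_16_bits(std::uint32_t code_point) {
    return code_point <= 0xFFFFu;
}
```

A saving program could use a check like `fits_in_16_bits` to reject values that a narrower loader would be unable to hold.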
314 | |
---|
315 | <h4><a name="traits">Traits</a></h4> |
---|
316 | Another potential problem is illustrated by the following example: |
---|
317 | <pre><code> |
---|
318 | template<class T> |
---|
319 | struct my_wrapper { |
---|
320 | template<class Archive> |
---|
321 | Archive & serialize ... |
---|
322 | }; |
---|
323 | |
---|
324 | ... |
---|
325 | |
---|
326 | class my_class { |
---|
327 | wchar_t a; |
---|
328 | short unsigned b; |
---|
329 | template<class Archive> |
---|
330 | Archive & serialize(Archive & ar, unsigned int version){ |
---|
331 | ar & my_wrapper(a); |
---|
332 | ar & my_wrapper(b); |
---|
333 | } |
---|
334 | }; |
---|
335 | </code></pre> |
---|
336 | If <code style="white-space: normal">my_wrapper</code> uses default serialization |
---|
337 | traits there could be a problem. With the default traits, each time a new type is |
---|
338 | added to the archive, bookkeeping information is added. So in this example, the |
---|
339 | archive would include such bookkeeping information for |
---|
340 | <code style="white-space: normal">my_wrapper<wchar_t></code> and for |
---|
341 | <code style="white-space: normal">my_wrapper<unsigned short></code>. |
---|
342 | Or would it? What about compilers that treat |
---|
343 | <code style="white-space: normal">wchar_t</code> as a |
---|
344 | synonym for <code style="white-space: normal">unsigned short</code>? |
---|
345 | In this case there is only one distinct type - not two. If archives are passed between |
---|
346 | programs with compilers that differ in their treatment |
---|
347 | of <code style="white-space: normal">wchar_t</code> the load operation will fail |
---|
348 | in a catastrophic way. |
---|
349 | <p> |
---|
350 | One remedy for this is to assign serialization traits to the template |
---|
351 | <code style="white-space: normal">my_wrapper</code> such that class |
---|
352 | information for instantiations of this template is never serialized. This |
---|
353 | process is described <a target="detail" href="traits.html#templates">above</a> and |
---|
354 | has been used for <a target="detail" href="wrappers.html#nvp"><strong>Name-Value Pairs</strong></a>. |
---|
355 | Wrappers would typically be assigned such traits. |
---|
356 | <p> |
---|
357 | Another way to avoid this problem is to assign serialization traits |
---|
358 | to all specializations of the template <code style="white-space: normal">my_wrapper</code> |
---|
359 | for all primitive types so that class information is never saved. This is what has |
---|
360 | been done for our implementation of serializations for STL collections. |
---|
361 | |
---|
362 | <h3><a name="binary_archives">Binary Archives</a></h3> |
---|
363 | Standard stream i/o on some systems will expand linefeed characters to carriage-return/linefeed |
---|
364 | on output. This creates a problem for binary archives. The easiest way to handle this is to |
---|
365 | open streams for binary archives in "binary mode" by using the flag |
---|
366 | <code style="white-space: normal">ios::binary</code>. If this is not done, the archive generated |
---|
367 | will be unreadable. |
---|
368 | <p> |
---|
369 | Unfortunately, no way has been found to detect this error before loading the archive. Debug builds |
---|
370 | will assert when this is detected so that may be helpful in catching this error. |
---|
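A minimal sketch of opening streams in binary mode, using only the standard library. The helper names `write_binary` and `read_binary` are hypothetical; a stream opened this way is the kind that would be passed to a binary archive.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Writes `bytes` to `path` with the stream opened in binary mode, so that
// no linefeed translation takes place on any platform.
inline void write_binary(const std::string & path, const std::string & bytes) {
    std::ofstream os(path.c_str(), std::ios::out | std::ios::binary);
    os.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Reads the whole file back, again in binary mode.
inline std::string read_binary(const std::string & path) {
    std::ifstream is(path.c_str(), std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(is),
                       std::istreambuf_iterator<char>());
}
```

Because both streams use `std::ios::binary`, a payload containing linefeed or carriage-return bytes is read back byte for byte.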
371 | |
---|
372 | <h3><a name="xml_archives">XML Archives</a></h3> |
---|
373 | XML archives present a somewhat special case. |
---|
374 | XML format has a nested structure that maps well to the "recursive class member visitor" pattern |
---|
375 | used by the serialization system. However, XML differs from other formats in that it |
---|
376 | requires a name for each data member. Our goal is to add this information to the |
---|
377 | class serialization specification while still permitting the serialization code to be |
---|
378 | used with any archive. This is achieved by requiring that all data serialized to an XML archive |
---|
379 | be serialized as a <a target="detail" href="wrappers.html#nvp">name-value pair</a>. |
---|
380 | The first member is the name to be used as the XML tag for the |
---|
381 | data item while the second is a reference to the data item itself. Any attempt to serialize data |
---|
382 | not wrapped in a <a target="detail" href="wrappers.html#nvp">name-value pair</a> will |
---|
383 | be trapped at compile time. The system is implemented in such a way that for other archive classes, |
---|
384 | just the value portion of the data is serialized. The name portion is discarded during compilation. |
---|
385 | So by always using <a target="detail" href="wrappers.html#nvp">name-value pairs</a>, it will |
---|
386 | be guaranteed that all data can be serialized to all archive classes with maximum efficiency. |
---|
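The idea can be illustrated with a toy model. Everything below (`toy_nvp`, `make_toy_nvp`, `toy_xml_oarchive`, `toy_text_oarchive`, `point`) is hypothetical and much simpler than the library's own wrapper and archives; it only sketches how one `serialize` function can serve an archive that uses the name and one that ignores it.

```cpp
#include <sstream>
#include <string>

// Hypothetical miniature of a name-value pair: a tag plus a reference.
template<class T>
struct toy_nvp {
    const char * name;
    const T & value;
};

template<class T>
toy_nvp<T> make_toy_nvp(const char * name, const T & value) {
    toy_nvp<T> p = { name, value };
    return p;
}

// Toy XML-style archive: uses the name as the element tag.
struct toy_xml_oarchive {
    std::ostringstream os;
    template<class T>
    toy_xml_oarchive & operator&(const toy_nvp<T> & p) {
        os << '<' << p.name << '>' << p.value << "</" << p.name << '>';
        return *this;
    }
};

// Toy text archive: never reads the name, writes only the value.
struct toy_text_oarchive {
    std::ostringstream os;
    template<class T>
    toy_text_oarchive & operator&(const toy_nvp<T> & p) {
        os << p.value << ' ';
        return *this;
    }
};

// One serialize function, written once, works with both archives.
struct point {
    int x, y;
    template<class Archive>
    void serialize(Archive & ar) {
        ar & make_toy_nvp("x", x) & make_toy_nvp("y", y);
    }
};
```

Neither toy archive accepts a bare value, only a `toy_nvp`, so forgetting the wrapper fails to compile, which loosely mirrors the compile-time trap described above for XML archives.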
387 | |
---|
388 | <h3><a href="exceptions.html">Archive Exceptions</a></h3> |
---|
389 | <h3><a href="exception_safety.html">Exception Safety</a></h3> |
---|
390 | |
---|
391 | <hr> |
---|
392 | <p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004. |
---|
393 | Distributed under the Boost Software License, Version 1.0. (See |
---|
394 | accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) |
---|
395 | </i></p> |
---|
396 | </body> |
---|
397 | </html> |
---|