
source: downloads/boost_1_34_1/libs/serialization/doc/special.html @ 29

Last change on this file since 29 was 29, checked in by landauf, 16 years ago

updated boost from 1_33_1 to 1_34_1

File size: 18.5 KB
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!--
(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
Use, modification and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)
-->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" type="text/css" href="../../../boost.css">
<link rel="stylesheet" type="text/css" href="style.css">
<title>Serialization - Special Considerations</title>
</head>
<body link="#0000ff" vlink="#800080">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
  <tr>
    <td valign="top" width="300">
      <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
    </td>
    <td valign="top">
      <h1 align="center">Serialization</h1>
      <h2 align="center">Special Considerations</h2>
    </td>
  </tr>
</table>
<hr>
<dl class="page-index">
  <dt><a href="#objecttracking">Object Tracking</a>
  <dt><a href="#export">Exporting Class Serialization</a>
  <dt><a href="#classinfo">Class Information</a>
  <dt><a href="#portability">Archive Portability</a>
  <dl class="page-index">
    <dt><a href="#numerics">Numerics</a>
    <dt><a href="#traits">Traits</a>
  </dl>
  <dt><a href="#binary_archives">Binary Archives</a>
  <dt><a href="#xml_archives">XML Archives</a>
  <dt><a href="exceptions.html">Archive Exceptions</a>
  <dt><a href="exception_safety.html">Exception Safety</a>
</dl>

<h3><a name="objecttracking">Object Tracking</a></h3>
Depending on how the class is used and other factors, serialized objects
may be tracked by memory address.  This prevents the same object from being
written to or read from an archive multiple times. These stored addresses
can also be used to delete objects created during a loading process
that has been interrupted by a thrown exception.
<p>
This can cause problems in
programs where copies of different objects are saved from the same address.
<pre><code>
template&lt;class Archive&gt;
void save(Archive &amp; ar, const unsigned int version) const
{
    for(int i = 0; i &lt; 10; ++i){
        A x = a[i];
        ar &lt;&lt; x;
    }
}
</code></pre>
In this case, the data to be saved exists on the stack.  Each iteration
of the loop updates the value on the stack.  So although the data changes
each iteration, the address of the data doesn't.  If <code>a</code> is an array of
objects being tracked by memory address, the library will skip storing
objects after the first, as it will assume that objects at the same address
are really the same object.
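<p>
The effect can be mimicked without the library.  The following is only a sketch:
<code>toy_tracker</code>, <code>save_copies</code> and <code>save_in_place</code> are
hypothetical names, and the tracker is a stand-in that merely remembers addresses.
It is not how the library implements tracking.  To keep the behavior deterministic,
the sketch reuses one local variable instead of declaring it inside the loop.
<pre><code>
#include &lt;cstddef&gt;
#include &lt;set&gt;
#include &lt;vector&gt;

// Toy stand-in for address-based tracking (an assumption for illustration,
// not the library's code): anything already seen at an address is skipped.
struct toy_tracker {
    std::set&lt;const void *&gt; seen;  // addresses already "written"
    std::vector&lt;int&gt; written;     // values that actually reached the "archive"

    void save(const int &amp; x){
        if(seen.insert(&amp;x).second)  // true only the first time this address appears
            written.push_back(x);
        // otherwise the object is assumed to be the same one and is not stored
    }
};

// Saves through a temporary copy: every value is presented at the same address.
inline std::size_t save_copies(const int (&amp;a)[3])
{
    toy_tracker t;
    int x;
    for(int i = 0; i &lt; 3; ++i){
        x = a[i];
        t.save(x);
    }
    return t.written.size();
}

// Saves the array elements themselves: every value has its own address.
inline std::size_t save_in_place(const int (&amp;a)[3])
{
    toy_tracker t;
    for(int i = 0; i &lt; 3; ++i)
        t.save(a[i]);
    return t.written.size();
}
</code></pre>
Under these assumptions <code>save_copies</code> stores only the first value, while
<code>save_in_place</code> stores all three.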
<p>
To help detect such cases, output archive operators expect to be passed
<code style="white-space: normal">const</code> reference arguments.
<p>
Given this, the above code will invoke a compile time assertion.
The obvious fix in this example is to use
<pre><code>
template&lt;class Archive&gt;
void save(Archive &amp; ar, const unsigned int version) const
{
    for(int i = 0; i &lt; 10; ++i){
        ar &lt;&lt; a[i];
    }
}
</code></pre>
which will compile and run without problem.
The use of <code style="white-space: normal">const</code> by the output archive operators
ensures that the process of serialization doesn't
change the state of the objects being serialized.  An attempt to do this
would augment the concept of saving state with
some sort of non-obvious side effect. This would almost surely be a mistake
and a likely source of very subtle bugs.
<p>
Unfortunately, implementation issues currently prevent the detection of this kind of
error when the data item is wrapped as a name-value pair.
<p>
A similar problem can occur when different objects are loaded to an address
which is different from the final location:
<pre><code>
template&lt;class Archive&gt;
void load(Archive &amp; ar, const unsigned int version)
{
    for(int i = 0; i &lt; 10; ++i){
        A x;
        ar &gt;&gt; x;
        m_set.insert(x);
    }
}
</code></pre>
In this case, the address of <code>x</code> is the one that is tracked rather than
the address of the new item added to the set.  Left unaddressed,
this will break the features that depend on tracking, such as loading objects through a pointer,
and subtle bugs will be introduced into the program.  This can be
addressed by altering the above code as follows:

<pre><code>
template&lt;class Archive&gt;
void load(Archive &amp; ar, const unsigned int version)
{
    for(int i = 0; i &lt; 10; ++i){
        A x;
        ar &gt;&gt; x;
        std::pair&lt;std::set&lt;A&gt;::const_iterator, bool&gt; result;
        result = m_set.insert(x);
        ar.reset_object_address(&amp; (*result.first), &amp; x);
    }
}
</code></pre>
This will adjust the tracking information to reflect the final resting place of
the moved variable and thereby rectify the above problem.
<p>
If it is known a priori that no pointer
values are duplicated, overhead associated with object tracking can
be eliminated by setting the object tracking class serialization trait
appropriately.
<p>
By default, data types designated primitive by the
<a target="detail" href="traits.html#level">Implementation Level</a>
class serialization trait are never tracked. If it is desired to
track a shared primitive object through a pointer (e.g. a
<code style="white-space: normal">long</code> used as a reference count), it should be wrapped
in a class/struct so that it is an identifiable type.
The alternative of changing the implementation level of a <code style="white-space: normal">long</code>
would affect all <code style="white-space: normal">long</code>s serialized in the whole
program - probably not what one would intend.
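<p>
A minimal sketch of such a wrapper follows.  The name <code>ref_count</code> and its
layout are assumptions made for illustration.  Because it is a class type, it keeps the
default tracking trait for classes (<code style="white-space: normal">track_selectively</code>),
so it is tracked when serialized through a pointer, while plain
<code style="white-space: normal">long</code>s elsewhere in the program are unaffected.
<pre><code>
#include &lt;boost/serialization/serialization.hpp&gt;
#include &lt;boost/serialization/tracking.hpp&gt;

// Hypothetical wrapper: gives a shared long its own identifiable type.
struct ref_count
{
    long value;
    ref_count() : value(0) {}

    template&lt;class Archive&gt;
    void serialize(Archive &amp; ar, const unsigned int /* version */){
        ar &amp; value;  // the wrapped long itself is still an untracked primitive
    }
};
</code></pre>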
<p>
It is possible that we may want to track addresses even though
the object is never serialized through a pointer.  For example,
a virtual base class need be saved/loaded only once.  By setting
this serialization trait to <code style="white-space: normal">track_always</code>, we can suppress
redundant save/load operations.
<pre><code>
BOOST_CLASS_TRACKING(my_virtual_base_class, boost::serialization::track_always)
</code></pre>

<h3><a name="export">Exporting Class Serialization</a></h3>
<a target="detail" href="traits.html#export">Elsewhere</a> in this manual, we have described
<code style="white-space: normal">BOOST_CLASS_EXPORT</code>. This is used to make the serialization library aware
that code should be instantiated for serialization of a given class even though the
class hasn't been otherwise referred to by the program.
<p>
There are several ways <code style="white-space: normal">BOOST_CLASS_EXPORT</code> could have been
implemented.
<p>
One approach would be to instantiate serialization code for all archive classes included in the library.
This would add to each executable a large amount of code that is most likely never called.
It would also needlessly slow down compilation of any program that uses the library.  Finally,
the list of archives would be "built-in" to the library, which would complicate the addition of
new or custom archive classes.
<p>
Another approach would be for the library user to explicitly specify the archive classes
for which code should be instantiated for each class to be serialized. Users would have to include
header files corresponding to the archive classes to be instantiated.
The list of instantiated archive classes would have to be manually kept in sync with the
archive class headers actually included.  This was considered burdensome and error prone.
<p>
This implementation of <code style="white-space: normal">BOOST_CLASS_EXPORT</code> works in the
following way:
<ul>
  <li>All header modules of the form &lt;boost/archive/*archive.hpp&gt; are required to precede
  the header module <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>.
  <li>The header <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>
  builds a list of archive classes whose header modules have been previously included.
  It does this by checking to see which inclusion guard constants have been defined.
  The header <a href="../../../boost/archive/detail/known_archive_types.hpp" target="known_archive_types_hpp">known_archive_types.hpp</a>
  lists the archive header files whose include guards will be checked.  If you create your own
  archive class, you probably want to edit this file.
  <li><code style="white-space: normal">BOOST_CLASS_EXPORT(my_class)</code> explicitly instantiates
  serialization code for <code style="white-space: normal">my_class</code> for each archive in the list.
</ul>
Serialization code will be instantiated for a given archive class
if and only if the module that defines that archive class has been included in the program.
Given this, our program will contain all necessary code instantiations and no others.
<p>
For many styles of code organization this header sequencing requirement presents little problem.

Serialization code organized by class headers that are designed to be independent of archive
implementations will look something like the following:
<pre><code>
// A.hpp
// Note: to preserve independence from any particular archive implementation,
// no headers from &lt;boost/archive/...&gt; are included.
// Headers can be included in any order.
#include &lt;boost/serialization/...&gt;
#include &lt;boost/serialization/export.hpp&gt;
... // include other headers that A depends upon

class A {
        ...
};

BOOST_CLASS_EXPORT(A) // note: the export name of this class
</code></pre>
This style:
<ul>
  <li>permits the header to include all aspects of the serialization implementation.
  <li>permits the header to be included anywhere else as part of some other class declaration.
  <li>reflects the concept of headers as a "library of types" which
  can be used independently in other programs or other parts of the same program.
  <li>reflects a fundamental principle of the serialization library design in that the
  specification of serialization of any class is independent of any archive implementation.
</ul>
However, it might not always be possible or convenient to conform to the above style. Something
like the following might be required or preferred:
<pre><code>
// A.hpp
// headers can be included in any order
#include &lt;boost/archive/text_oarchive.hpp&gt;
#include &lt;boost/archive/text_iarchive.hpp&gt;
...
#include &lt;boost/serialization/...&gt;
...
// can't do the following because then A.hpp couldn't be included somewhere else
// #include &lt;boost/serialization/export.hpp&gt;

class A {
        ...
};
// can't do the following because export.hpp is not included !!
//BOOST_CLASS_EXPORT(A) // note: the export name of this class
</code></pre>
The header as written will work.  But, as noted in the comments,
<code style="white-space: normal">#include &lt;.../export.hpp&gt;</code> can't be added
without conflicting with other modules which use
<code style="white-space: normal">#include &lt;.../*archive.hpp&gt;</code>.  In this
case we can move the export to an implementation file:
<pre><code>
// A.cpp
#include "A.hpp"
...
// export.hpp header should be last;
#include &lt;boost/serialization/export.hpp&gt;
...
BOOST_CLASS_EXPORT(A)
...
</code></pre>

<h3><a name="classinfo">Class Information</a></h3>
By default, for each class serialized, class information is written to the archive.
This information includes version number, implementation level and tracking
behavior.  This is necessary so that the archive can be correctly
deserialized even if a subsequent version of the program changes
some of the current trait values for a class.  The space overhead for
this data is minimal.  There is a little bit of runtime overhead
since each class has to be checked to see if it has already had its
class information included in the archive.  In some cases, even this
might be considered too much.  This extra overhead can be eliminated
by setting the
<a target="detail" href="traits.html#level">implementation level</a>
class trait to: <code style="white-space: normal">boost::serialization::object_serializable</code>.
<p>
<i>Turning off tracking and class information serialization will result
in pure template inline code that in principle could be optimised down
to a simple stream write/read.</i>  Elimination of all serialization overhead
in this manner comes at a cost.  Once archives are released to users, the
class serialization traits cannot be changed without invalidating the old
archives.  Including the class information in the archive assures us
that they will be readable in the future even if the class definition
is revised.  A lightweight structure such as a display pixel might be
declared in a header like this:

<pre><code>
#include &lt;boost/serialization/serialization.hpp&gt;
#include &lt;boost/serialization/level.hpp&gt;
#include &lt;boost/serialization/tracking.hpp&gt;

// a pixel is a lightweight struct which is used in great numbers.
struct pixel
{
    unsigned char red, green, blue;
    template&lt;class Archive&gt;
    void serialize(Archive &amp; ar, const unsigned int /* version */){
        // operator&amp; works for both saving and loading
        ar &amp; red &amp; green &amp; blue;
    }
};

// eliminate serialization overhead at the cost of
// never being able to increase the version.
BOOST_CLASS_IMPLEMENTATION(pixel, boost::serialization::object_serializable)

// eliminate object tracking (even if serialized through a pointer)
// at the risk of a programming error creating duplicate objects.
BOOST_CLASS_TRACKING(pixel, boost::serialization::track_never)
</code></pre>

<h3><a name="portability">Archive Portability</a></h3>
Several archive classes create their data in the form of text or a portable binary format.
It should be possible to save an archive of such a class on one platform and load it on another.
This is subject to a couple of conditions.
<h4><a name="numerics">Numerics</a></h4>
The architecture of the machine reading the archive must be able to hold the data
saved.  For example, the gcc compiler reserves 4 bytes to store a variable of type
<code style="white-space: normal">wchar_t</code> while other compilers reserve only 2 bytes.
So it's possible that a value could be written that couldn't be represented by the loading program.  This is a
fairly obvious situation and easily handled by using the numeric types in
<a target="cstding" href="../../../boost/cstdint.hpp">&lt;boost/cstdint.hpp&gt;</a>.
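<p>
As a rough illustration, the sketch below stores a character code in a fixed-width
integer and checks whether it would fit in the loading program's
<code style="white-space: normal">wchar_t</code>.  The names <code>glyph</code> and
<code>fits_in_wchar</code> are hypothetical, and the standard header
<code style="white-space: normal">&lt;cstdint&gt;</code> is used here on the assumption
that a C++11 compiler is available; <code style="white-space: normal">&lt;boost/cstdint.hpp&gt;</code>
offers the same type names in namespace <code style="white-space: normal">boost</code>.
<pre><code>
#include &lt;cstdint&gt;
#include &lt;limits&gt;

// Hypothetical record: the member has an explicit width instead of
// relying on sizeof(wchar_t), which differs between compilers.
struct glyph
{
    std::uint32_t code;  // always 32 bits wide
};

// True if a saved code can be represented by wchar_t on this platform.
inline bool fits_in_wchar(std::uint32_t code)
{
    return static_cast&lt;unsigned long long&gt;(code)
        &lt;= static_cast&lt;unsigned long long&gt;(std::numeric_limits&lt;wchar_t&gt;::max());
}
</code></pre>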

<h4><a name="traits">Traits</a></h4>
Another potential problem is illustrated by the following example:
<pre><code>
template&lt;class T&gt;
struct my_wrapper {
    template&lt;class Archive&gt;
    Archive &amp; serialize ...
};

...

class my_class {
    wchar_t a;
    short unsigned b;
    template&lt;class Archive&gt;
    Archive &amp; serialize(Archive &amp; ar, unsigned int version){
        ar &amp; my_wrapper(a);
        ar &amp; my_wrapper(b);
        return ar;
    }
};
</code></pre>
If <code style="white-space: normal">my_wrapper</code> uses default serialization
traits there could be a problem.  With the default traits, each time a new type is
added to the archive, bookkeeping information is added. So in this example, the
archive would include such bookkeeping information for
<code style="white-space: normal">my_wrapper&lt;wchar_t&gt;</code> and for
<code style="white-space: normal">my_wrapper&lt;unsigned short&gt;</code>.
Or would it?  What about compilers that treat
<code style="white-space: normal">wchar_t</code> as a
synonym for <code style="white-space: normal">unsigned short</code>?
In this case there is only one distinct type - not two.  If archives are passed between
programs with compilers that differ in their treatment
of <code style="white-space: normal">wchar_t</code>, the load operation will fail
in a catastrophic way.
<p>
One remedy for this is to assign serialization traits to the template
<code style="white-space: normal">my_wrapper</code> such that class
information for instantiations of this template is never serialized.  This
process is described <a target="detail" href="traits.html#templates">elsewhere</a> in this manual and
has been used for <a target="detail" href="wrappers.html#nvp"><strong>Name-Value Pairs</strong></a>.
Wrappers would typically be assigned such traits.
<p>
Another way to avoid this problem is to assign serialization traits
to all specializations of the template <code style="white-space: normal">my_wrapper</code>
for all primitive types so that class information is never saved.  This is what has
been done for our implementation of serialization for STL collections.
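<p>
The first remedy might look roughly like the sketch below.  The body of
<code style="white-space: normal">my_wrapper</code> is a hypothetical stand-in, since the example
above elides it, and the partial specialization is an assumption modelled on the pattern
used for name-value pairs, so the traits documentation should be consulted before
relying on it.
<pre><code>
#include &lt;boost/config.hpp&gt;
#include &lt;boost/mpl/int.hpp&gt;
#include &lt;boost/mpl/integral_c_tag.hpp&gt;
#include &lt;boost/serialization/level.hpp&gt;

// Hypothetical wrapper; only its name comes from the example above.
template&lt;class T&gt;
struct my_wrapper
{
    T &amp; value;
    explicit my_wrapper(T &amp; t) : value(t) {}

    template&lt;class Archive&gt;
    void serialize(Archive &amp; ar, const unsigned int /* version */){
        ar &amp; value;
    }
};

namespace boost {
namespace serialization {

// Applies to every instantiation of my_wrapper: no class information
// is written, so the archive no longer depends on how many distinct
// my_wrapper types the compiler sees.
template&lt;class T&gt;
struct implementation_level&lt;my_wrapper&lt;T&gt; &gt;
{
    typedef mpl::integral_c_tag tag;
    typedef mpl::int_&lt;object_serializable&gt; type;
    BOOST_STATIC_CONSTANT(int, value = implementation_level::type::value);
};

} // namespace serialization
} // namespace boost
</code></pre>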

<h3><a name="binary_archives">Binary Archives</a></h3>
Standard stream i/o on some systems will expand linefeed characters to carriage-return/linefeed
on output. This creates a problem for binary archives.  The easiest way to handle this is to
open streams for binary archives in "binary mode" by using the flag
<code style="white-space: normal">ios::binary</code>.  If this is not done, the archive generated
will be unreadable.
<p>
Unfortunately, no way has been found to detect this error before loading the archive.  Debug builds
will assert when this is detected so that may be helpful in catching this error.
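<p>
The flag itself belongs to the standard stream library, so its effect can be shown without
any archive class.  In the sketch below, <code>roundtrip_binary</code> is a hypothetical
helper that writes raw bytes to a file opened in binary mode and reads them back.
A stream opened the same way is what one would then pass to a binary archive.
<pre><code>
#include &lt;fstream&gt;
#include &lt;ios&gt;
#include &lt;iterator&gt;
#include &lt;string&gt;

// Writes bytes to path in binary mode, then reads the whole file back.
// With std::ios::binary no linefeed translation takes place in either direction.
inline std::string roundtrip_binary(const char * path, const std::string &amp; bytes)
{
    {
        std::ofstream os(path, std::ios::out | std::ios::binary);
        os.write(bytes.data(), static_cast&lt;std::streamsize&gt;(bytes.size()));
    }   // the stream is closed here, before the file is reopened for reading

    std::ifstream is(path, std::ios::in | std::ios::binary);
    return std::string(
        (std::istreambuf_iterator&lt;char&gt;(is)),
        std::istreambuf_iterator&lt;char&gt;()
    );
}
</code></pre>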

<h3><a name="xml_archives">XML Archives</a></h3>
XML archives present a somewhat special case.
XML format has a nested structure that maps well to the "recursive class member visitor" pattern
used by the serialization system. However, XML differs from other formats in that it
requires a name for each data member. Our goal is to add this information to the
class serialization specification while still permitting the serialization code to be
used with any archive. This is achieved by requiring that all data serialized to an XML archive
be serialized as a <a target="detail" href="wrappers.html#nvp">name-value pair</a>.
The first member is the name to be used as the XML tag for the
data item while the second is a reference to the data item itself. Any attempt to serialize data
not wrapped in a <a target="detail" href="wrappers.html#nvp">name-value pair</a> will
be trapped at compile time. The system is implemented in such a way that for other archive classes,
just the value portion of the data is serialized. The name portion is discarded during compilation.
So by always using <a target="detail" href="wrappers.html#nvp">name-value pairs</a>, it will
be guaranteed that all data can be serialized to all archive classes with maximum efficiency.
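<p>
A short sketch of a class written this way follows.  The class <code>position</code> and
the pseudo-archive <code>tag_lister</code> are hypothetical names introduced only for
illustration; <code>tag_lister</code> is not a real archive class and merely records the
names it is handed, to show what an archive receives from each name-value pair.
<pre><code>
#include &lt;string&gt;
#include &lt;vector&gt;
#include &lt;boost/serialization/nvp.hpp&gt;

// Hypothetical class whose members are always wrapped as name-value pairs.
struct position
{
    int degrees;
    int minutes;

    template&lt;class Archive&gt;
    void serialize(Archive &amp; ar, const unsigned int /* version */){
        ar &amp; BOOST_SERIALIZATION_NVP(degrees);                // name taken from the variable
        ar &amp; boost::serialization::make_nvp("min", minutes);  // name given explicitly
    }
};

// Hypothetical stand-in for an archive: it only collects the names.
struct tag_lister
{
    std::vector&lt;std::string&gt; names;

    template&lt;class T&gt;
    tag_lister &amp; operator&amp;(const boost::serialization::nvp&lt;T&gt; &amp; p){
        names.push_back(p.name());
        return *this;
    }
};
</code></pre>
Passing a <code>tag_lister</code> to <code>position::serialize</code> records the names
<code>degrees</code> and <code>min</code>, in that order.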

<h3><a href="exceptions.html">Archive Exceptions</a></h3>
<h3><a href="exception_safety.html">Exception Safety</a></h3>

<hr>
<p><i>&copy; Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
Distributed under the Boost Software License, Version 1.0. (See
accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
</i></p>
</body>
</html>