
source: downloads/boost_1_33_1/libs/serialization/doc/special.html @ 12

Last change on this file since 12 was 12, checked in by landauf, 17 years ago

added boost

File size: 18.3 KB
1<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
2<html>
3<!--
4(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
5Use, modification and distribution is subject to the Boost Software
6License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
7http://www.boost.org/LICENSE_1_0.txt)
8-->
9<head>
10<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
11<link rel="stylesheet" type="text/css" href="../../../boost.css">
12<link rel="stylesheet" type="text/css" href="style.css">
13<title>Serialization - Special Considerations</title>
14</head>
15<body link="#0000ff" vlink="#800080">
16<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
17  <tr> 
18    <td valign="top" width="300"> 
19      <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
20    </td>
21    <td valign="top"> 
22      <h1 align="center">Serialization</h1>
23      <h2 align="center">Special Considerations</h2>
24    </td>
25  </tr>
26</table>
27<hr>
28<dl class="page-index">
29  <dt><a href="#objecttracking">Object Tracking</a>
30  <dt><a href="#export">Exporting Class Serialization</a>
31  <dt><a href="#classinfo">Class Information</a>
32  <dt><a href="#portability">Archive Portability</a>
33  <dl class="page-index">
34    <dt><a href="#numerics">Numerics</a>
35    <dt><a href="#traits">Traits</a>
36  </dl>
37  <dt><a href="#binary_archives">Binary Archives</a>
38  <dt><a href="#xml_archives">XML Archives</a>
39  <dt><a href="exceptions.html">Archive Exceptions</a>
40  <dt><a href="exception_safety.html">Exception Safety</a>
41</dl>
42
43<h3><a name="objecttracking">Object Tracking</a></h3>
44Depending on how the class is used and other factors, serialized objects
45may be tracked by memory address.  This prevents the same object from being
46written to or read from an archive multiple times. These stored addresses
47can also be used to delete objects created during a loading process
48that has been interrupted by throwing of an exception. 
49<p>
This could cause problems in
programs where copies of different objects are saved from the same address.
<pre><code>
template&lt;class Archive&gt;
void save(Archive &amp; ar, const unsigned int version) const
{
    for(int i = 0; i &lt; 10; ++i){
        A x = a[i];
        ar &lt;&lt; x;
    }
}
</code></pre>
In this case, the data to be saved exists on the stack.  Each iteration
of the loop updates the value on the stack.  So although the data changes
each iteration, the address of the data doesn't.  If <code style="white-space: normal">a</code> is an array of
objects being tracked by memory address, the library will skip storing
objects after the first, as it will assume that objects at the same address
are really the same object.
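<p>
The effect can be reproduced without the library. The following is a minimal sketch
rather than the library's implementation: <code style="white-space: normal">toy_tracker</code>,
<code style="white-space: normal">save_through_copy</code> and
<code style="white-space: normal">save_in_place</code> are hypothetical names, and the tracker
is assumed to do nothing more than remember each address it is handed and skip any address it has already seen.
<pre><code>
#include &lt;cstddef&gt;
#include &lt;set&gt;

// Hypothetical stand-in for address-based tracking: an object is
// "written" only the first time its address is seen.
class toy_tracker {
    std::set&lt;const void *&gt; seen_;
    std::size_t written_;
public:
    toy_tracker() : written_(0) {}

    template&lt;class T&gt;
    void save(const T &amp; t) {
        const void * address = &amp;t;
        if (seen_.insert(address).second)
            ++written_;   // new address: store the object
        // known address: assumed to be the same object, so skipped
    }

    std::size_t written() const { return written_; }
};

// Copies each element into one local variable before saving it.
inline std::size_t save_through_copy(const int (&amp;a)[10]) {
    toy_tracker ar;
    int x;                      // a single object, hence a single address
    for (int i = 0; i &lt; 10; ++i) {
        x = a[i];
        ar.save(x);
    }
    return ar.written();
}

// Saves each element in place, so every element has its own address.
inline std::size_t save_in_place(const int (&amp;a)[10]) {
    toy_tracker ar;
    for (int i = 0; i &lt; 10; ++i)
        ar.save(a[i]);
    return ar.written();
}
</code></pre>
Under these assumptions <code style="white-space: normal">save_through_copy</code> stores only one
object out of ten, while <code style="white-space: normal">save_in_place</code> stores all ten.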
68<p>
69To help detect such cases, output archive operators expect to be passed
70<code style="white-space: normal">const</code> reference arguments.
71<p>
72Given this, the above code will invoke a compile time assertion.
73The obvious fix in this example is to use
<pre><code>
template&lt;class Archive&gt;
void save(Archive &amp; ar, const unsigned int version) const
{
    for(int i = 0; i &lt; 10; ++i){
        ar &lt;&lt; a[i];
    }
}
</code></pre>
83which will compile and run without problem. 
84The usage of <code style="white-space: normal">const</code> by the output archive operators
85will ensure that the process of serialization doesn't
86change the state of the objects being serialized.  An attempt to do this
87would constitute augmentation of the concept of saving of state with
88some sort of non-obvious side effect. This would almost surely be a mistake
89and a likely source of very subtle bugs.
90<p>
91Unfortunately, implementation issues currently prevent the detection of this kind of
92error when the data item is wrapped as a name-value pair.
93<p>
A similar problem can occur when different objects are loaded to an address
which is different from the final location:
<pre><code>
template&lt;class Archive&gt;
void load(Archive &amp; ar, const unsigned int version)
{
    for(int i = 0; i &lt; 10; ++i){
        A x;
        ar &gt;&gt; x;
        m_set.insert(x);
    }
}
</code></pre>
In this case, the address of <code>x</code> is the one that is tracked rather than
the address of the new item added to the set.  Left unaddressed,
this will break the features that depend on tracking, such as loading an object through a pointer.
Subtle bugs will be introduced into the program.  This can be
addressed by altering the above code as follows:
112
<pre><code>
template&lt;class Archive&gt;
void load(Archive &amp; ar, const unsigned int version)
{
    for(int i = 0; i &lt; 10; ++i){
        A x;
        ar &gt;&gt; x;
        // m_set is assumed to be a std::set&lt;A&gt; member
        std::pair&lt;std::set&lt;A&gt;::const_iterator, bool&gt; result;
        result = m_set.insert(x);
        ar.reset_object_address(&amp; (*result.first), &amp;x);
    }
}
</code></pre>
126This will adjust the tracking information to reflect the final resting place of
127the moved variable and thereby rectify the above problem.
128<p>
129If it is known a priori that no pointer
130values are duplicated, overhead associated with object tracking can
131be eliminated by setting the object tracking class serialization trait
132appropriately.
133<p>
By default, data types designated primitive by the
<a target="detail" href="traits.html#level">Implementation Level</a>
class serialization trait are never tracked. If it is desired to
track a shared primitive object through a pointer (e.g. a
<code style="white-space: normal">long</code> used as a reference count), it should be wrapped
in a class/struct so that it is an identifiable type.
140The alternative of changing the implementation level of a <code style="white-space: normal">long</code>
141would affect all <code style="white-space: normal">long</code>s serialized in the whole
142program - probably not what one would intend.
143<p>
144It is possible that we may want to track addresses even though
145the object is never serialized through a pointer.  For example,
146a virtual base class need be saved/loaded only once.  By setting
147this serialization trait to <code style="white-space: normal">track_always</code>, we can suppress
148redundant save/load operations.
149<pre><code>
150BOOST_CLASS_TRACKING(my_virtual_base_class, boost::serialization::track_always)
151</code></pre>
152
153<h3><a name="export">Exporting Class Serialization</a></h3>
154<a target="detail" href="traits.html#export">Elsewhere</a> in this manual, we have described
155<code style="white-space: normal">BOOST_CLASS_EXPORT</code>. This is used to make the serialization library aware
156that code should be instantiated for serialization of a given class even though the
157class hasn't been otherwise referred to by the program. 
158<p>
159There are several ways <code style="white-space: normal">BOOST_CLASS_EXPORT</code> could have been
160implemented.
161<p>
One approach would be to instantiate serialization code for all archive classes included in the library.
This would add to each executable a large amount of code that is most likely never called.
It would also needlessly slow down compilation of any program that uses the library.  Finally,
the list of archives would be "built-in" to the library, which would complicate the addition of
new or custom archive classes.
<p>
Another approach would be for the library user to explicitly specify the archive classes for which
code should be instantiated for each class to be serialized. Users would have to include
header files corresponding to the archive classes to be instantiated.
The list of instantiated archive classes would have to be manually kept in sync with the
archive class headers actually included.  This was considered burdensome and error prone.
173<p>
174This implementation of <code style="white-space: normal">BOOST_CLASS_EXPORT</code> works in the
175following way:
176<ul>
177  <li>All header modules of the form &lt;boost/archive/*archive.hpp&gt; are required to precede
178  the header module <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>.
179  <li>The header <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>
180  builds a list of archive classes whose header modules have been previously included. 
181  It does this by checking to see which inclusion guard constants have been defined.
182  <li><code style="white-space: normal">BOOST_CLASS_EXPORT(my_class)</code> explicitly instantiates
183  serialization code for <code style="white-space: normal">my_class</code> for each archive in the list.
184</ul>
185Serialization code will be instantiated for a given archive class
186if and only if the module that defines that archive class has been included in the program.
187Given this, our program will contain all necessary code instantiations and no other.
188<p>
189For many styles of code organization this header sequencing requirement presents little problem.
190 
191Serialization code organized by class headers that are designed to be independent of archive
192implementations will look something like the following:
<pre><code>
// A.hpp
// Note: to preserve independence from any particular archive implementation,
// no headers from &lt;boost/archive/...&gt; are included.
// Headers can be included in any order.
#include &lt;boost/serialization/...&gt;
#include &lt;boost/serialization/export.hpp&gt;
... // include other headers that A depends upon

class A {
        ...
};

BOOST_CLASS_EXPORT(A) // note: the export name of this class
</code></pre>
208This style:
209<ul>
210  <li>permits the header to include all aspects of the serialization implementation.
211  <li>permits the header to be included anywhere else as part of some other class declaration.
212  <li>reflects the concept of headers as a "library of types" which
213  can be used independently in other programs or other parts of the same program.
214  <li>reflects a fundamental principle of the serialization library design in that the
215  specification of serialization of any class is independent of any archive implementation.
216</ul>
217However, it might not always be possible or convenient to conform to the above style. Something
218like the following might be required or preferred:
<pre><code>
// A.hpp
// headers can be included in any order
#include &lt;boost/archive/text_oarchive.hpp&gt;
#include &lt;boost/archive/text_iarchive.hpp&gt;
...
#include &lt;boost/serialization/...&gt;
...
// can't do the following because then A.hpp couldn't be included somewhere else
// #include &lt;boost/serialization/export.hpp&gt;

class A {
        ...
};
// can't do the following because export.hpp is not included !!
//BOOST_CLASS_EXPORT(A) // note: the export name of this class
</code></pre>
As noted in the comments, this header works as written, but
<code style="white-space: normal">#include &lt;.../export.hpp&gt;</code> can't be added to it
without conflicting with other modules which use
<code style="white-space: normal">#include &lt;.../*archive.hpp&gt;</code>.  In this
case we can move the export to an implementation file:
<pre><code>
// A.cpp
#include "A.hpp"
...
// export.hpp header should be last
#include &lt;boost/serialization/export.hpp&gt;
...
BOOST_CLASS_EXPORT(A)
...
</code></pre>
251
252<h3><a name="classinfo">Class Information</a></h3>
253By default, for each class serialized, class information is written to the archive.
254This information includes version number, implementation level and tracking
255behavior.  This is necessary so that the archive can be correctly
256deserialized even if a subsequent version of the program changes
257some of the current trait values for a class.  The space overhead for
258this data is minimal.  There is a little bit of runtime overhead
259since each class has to be checked to see if it has already had its
260class information included in the archive.  In some cases, even this
261might be considered too much.  This extra overhead can be eliminated
262by setting the
263<a target="detail" href="traits.html#level">implementation level</a>
264class trait to: <code style="white-space: normal">boost::serialization::object_serializable</code>.
265<p>
266<i>Turning off tracking and class information serialization will result
267in pure template inline code that in principle could be optimised down
268to a simple stream write/read.</i>  Elimination of all serialization overhead
269in this manner comes at a cost.  Once archives are released to users, the
270class serialization traits cannot be changed without invalidating the old
271archives.  Including the class information in the archive assures us
272that they will be readable in the future even if the class definition
is revised.  A lightweight structure such as a display pixel might be
declared in a header like this:
275
<pre><code>
#include &lt;boost/serialization/serialization.hpp&gt;
#include &lt;boost/serialization/level.hpp&gt;
#include &lt;boost/serialization/tracking.hpp&gt;

// a pixel is a light weight struct which is used in great numbers.
struct pixel
{
    unsigned char red, green, blue;
    template&lt;class Archive&gt;
    void serialize(Archive &amp; ar, const unsigned int /* version */){
        // operator&amp; is used because serialize handles both saving and loading
        ar &amp; red &amp; green &amp; blue;
    }
};

// eliminate serialization overhead at the cost of
// never being able to increase the version.
BOOST_CLASS_IMPLEMENTATION(pixel, boost::serialization::object_serializable)

// eliminate object tracking (even if serialized through a pointer)
// at the risk of a programming error creating duplicate objects.
BOOST_CLASS_TRACKING(pixel, boost::serialization::track_never)
</code></pre>
299
300<h3><a name="portability">Archive Portability</a></h3>
Several archive classes create their data in the form of text or a portable binary format.
It should be possible to save an archive of such a class on one platform and load it on another.
This is subject to a couple of conditions.
304<h4><a name="numerics">Numerics</a></h4>
The architecture of the machine reading the archive must be able to hold the data
saved.  For example, the gcc compiler reserves 4 bytes to store a variable of type
<code style="white-space: normal">wchar_t</code> while other compilers reserve only 2 bytes.
So it's possible that a value could be written that couldn't be represented by the loading program.  This is a
fairly obvious situation and easily handled by using the numeric types in
<a target="cstding" href="../../../boost/cstdint.hpp">&lt;boost/cstdint.hpp&gt;</a>.
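<p>
As a sketch of that approach, the fragment below passes a wide character through a fixed-width
integer so that the stored size does not depend on the compiler's
<code style="white-space: normal">wchar_t</code>. It is illustrative only: it uses the standard
<code style="white-space: normal">&lt;cstdint&gt;</code> header as a stand-in for
<code style="white-space: normal">&lt;boost/cstdint.hpp&gt;</code>, and
<code style="white-space: normal">to_archived</code> and
<code style="white-space: normal">from_archived</code> are hypothetical helper names.
<pre><code>
#include &lt;cstdint&gt;
#include &lt;limits&gt;
#include &lt;stdexcept&gt;

// Widen to a fixed 32-bit representation before saving, so the stored
// size is the same on every platform.  Assumes a non-negative value.
inline std::uint32_t to_archived(wchar_t c) {
    return static_cast&lt;std::uint32_t&gt;(c);
}

// Narrow on loading, refusing values the local wchar_t cannot hold.
inline wchar_t from_archived(std::uint32_t v) {
    if (v &gt; static_cast&lt;std::uint32_t&gt;(std::numeric_limits&lt;wchar_t&gt;::max()))
        throw std::range_error("archived value does not fit in wchar_t");
    return static_cast&lt;wchar_t&gt;(v);
}
</code></pre>
A loading program with a 2 byte <code style="white-space: normal">wchar_t</code> would then report
an out-of-range value instead of silently truncating it.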
311
312<h4><a name="traits">Traits</a></h4>
313Another potential problem is illustrated by the following example:
<pre><code>
template&lt;class T&gt;
struct my_wrapper {
    template&lt;class Archive&gt;
    Archive &amp; serialize ...
};

...

class my_class {
    wchar_t a;
    short unsigned b;
    template&lt;class Archive&gt;
    Archive &amp; serialize(Archive &amp; ar, unsigned int version){
        ar &amp; my_wrapper&lt;wchar_t&gt;(a);
        ar &amp; my_wrapper&lt;short unsigned&gt;(b);
        return ar;
    }
};
</code></pre>
333If <code style="white-space: normal">my_wrapper</code> uses default serialization
334traits there could be a problem.  With the default traits, each time a new type is
335added to the archive, bookkeeping information is added. So in this example, the
archive would include such bookkeeping information for
<code style="white-space: normal">my_wrapper&lt;wchar_t&gt;</code> and for
<code style="white-space: normal">my_wrapper&lt;unsigned short&gt;</code>.
339Or would it?  What about compilers that treat
340<code style="white-space: normal">wchar_t</code> as a
341synonym for <code style="white-space: normal">unsigned short</code>?
342In this case there is only one distinct type - not two.  If archives are passed between
343programs with compilers that differ in their treatment
344of <code style="white-space: normal">wchar_t</code> the load operation will fail
345in a catastrophic way.
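<p>
Whether the two instantiations are distinct on a given compiler can be checked directly.
The following is a minimal sketch in which <code style="white-space: normal">my_wrapper</code>
is reduced to an empty stand-in and <code style="white-space: normal">count_wrapper_types</code>
is a hypothetical helper.
<pre><code>
#include &lt;typeinfo&gt;

template&lt;class T&gt;
struct my_wrapper {};   // stand-in; a real wrapper would refer to a T

// Returns how many distinct wrapper types, and hence how many sets of
// bookkeeping information, the two members of my_class would produce.
inline int count_wrapper_types() {
    return typeid(my_wrapper&lt;wchar_t&gt;) == typeid(my_wrapper&lt;unsigned short&gt;) ? 1 : 2;
}
</code></pre>
The function returns 2 where <code style="white-space: normal">wchar_t</code> is a distinct built-in type
and 1 where it is a synonym for <code style="white-space: normal">unsigned short</code>; archives written
under one result cannot be assumed readable under the other.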
346<p>
One remedy for this is to assign serialization traits to the template
<code style="white-space: normal">my_wrapper</code> such that class
information for instantiations of this template is never serialized.  This
process is described <a target="detail" href="traits.html#templates">above</a> and
has been used for <a target="detail" href="wrappers.html#nvp"><strong>Name-Value Pairs</strong></a>.
Wrappers would typically be assigned such traits.
353<p>
354Another way to avoid this problem is to assign serialization traits
355to all specializations of the template <code style="white-space: normal">my_wrapper</code>
356for all primitive types so that class information is never saved.  This is what has
357been done for our implementation of serializations for STL collections.
358
359<h3><a name="binary_archives">Binary Archives</a></h3>
360Standard stream i/o on some systems will expand linefeed characters to carriage-return/linefeed
361on output. This creates a problem for binary archives.  The easiest way to handle this is to
362open streams for binary archives in "binary mode" by using the flag
363<code style="white-space: normal">ios::binary</code>.  If this is not done, the archive generated
364will be unreadable.
365<p>
366Unfortunately, no way has been found to detect this error before loading the archive.  Debug builds
367will assert when this is detected so that may be helpful in catching this error.
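<p>
The stream flags involved can be illustrated with the standard library alone. This is a sketch, not
library code: <code style="white-space: normal">binary_round_trip</code> is a hypothetical helper that
writes raw bytes where a real program would construct a binary archive on the stream.
<pre><code>
#include &lt;fstream&gt;
#include &lt;string&gt;

// Writes the bytes to the named file and reads them back, with both
// streams opened in binary mode so that no newline translation occurs.
inline std::string binary_round_trip(const char * path, const std::string &amp; bytes) {
    {
        std::ofstream os(path, std::ios::out | std::ios::binary);
        os.write(bytes.data(), static_cast&lt;std::streamsize&gt;(bytes.size()));
    }   // os is flushed and closed here

    std::ifstream is(path, std::ios::in | std::ios::binary);
    std::string result;
    char c;
    while (is.get(c))
        result += c;
    return result;
}
</code></pre>
With <code style="white-space: normal">std::ios::binary</code> on both streams, linefeed and
carriage-return bytes come back exactly as written.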
368
369<h3><a name="xml_archives">XML Archives</a></h3>
370XML archives present a somewhat special case.
371XML format has a nested structure that maps well to the "recursive class member visitor" pattern
372used by the serialization system. However, XML differs from other formats in that it
373requires a name for each data member. Our goal is to add this information to the
class serialization specification while still permitting the serialization code to be
used with any archive. This is achieved by requiring that all data serialized to an XML archive
be serialized as a <a target="detail" href="wrappers.html#nvp">name-value pair</a>.
The first member is the name to be used as the XML tag for the
data item while the second is a reference to the data item itself. Any attempt to serialize data
not wrapped in a <a target="detail" href="wrappers.html#nvp">name-value pair</a> will
be trapped at compile time. The system is implemented in such a way that for other archive classes,
just the value portion of the data is serialized. The name portion is discarded during compilation.
So by always using <a target="detail" href="wrappers.html#nvp">name-value pairs</a>, it will
be guaranteed that all data can be serialized to all archive classes with maximum efficiency.
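<p>
The idea can be modelled in a few lines. The following sketch does not use the library's classes:
<code style="white-space: normal">toy_nvp</code>, <code style="white-space: normal">make_toy_nvp</code>,
<code style="white-space: normal">toy_xml_out</code>, <code style="white-space: normal">toy_text_out</code>
and <code style="white-space: normal">point</code> are all hypothetical names chosen for illustration.
<pre><code>
#include &lt;sstream&gt;
#include &lt;string&gt;

// Hypothetical name-value pair: a tag name plus a reference to the data.
template&lt;class T&gt;
struct toy_nvp {
    const char * name;
    const T &amp; value;
    toy_nvp(const char * n, const T &amp; v) : name(n), value(v) {}
};

template&lt;class T&gt;
toy_nvp&lt;T&gt; make_toy_nvp(const char * name, const T &amp; value) {
    return toy_nvp&lt;T&gt;(name, value);
}

// Toy XML-like archive: uses the name as the tag.  It accepts only
// toy_nvp arguments, so unwrapped data fails to compile.
struct toy_xml_out {
    std::ostringstream os;
    template&lt;class T&gt;
    toy_xml_out &amp; operator&lt;&lt;(const toy_nvp&lt;T&gt; &amp; p) {
        os &lt;&lt; '&lt;' &lt;&lt; p.name &lt;&lt; '&gt;' &lt;&lt; p.value &lt;&lt; "&lt;/" &lt;&lt; p.name &lt;&lt; '&gt;';
        return *this;
    }
};

// Toy text archive: ignores the name and writes only the value.
struct toy_text_out {
    std::ostringstream os;
    template&lt;class T&gt;
    toy_text_out &amp; operator&lt;&lt;(const toy_nvp&lt;T&gt; &amp; p) {
        os &lt;&lt; p.value &lt;&lt; ' ';
        return *this;
    }
};

// One save function serves both archives.
struct point {
    int x, y;
    template&lt;class Archive&gt;
    void save(Archive &amp; ar) const {
        ar &lt;&lt; make_toy_nvp("x", x) &lt;&lt; make_toy_nvp("y", y);
    }
};
</code></pre>
The same <code style="white-space: normal">save</code> function produces tagged output for the XML-like
archive and bare values for the text archive, which is the property the name-value pair requirement is
meant to provide.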
384
385<h3><a href="exceptions.html">Archive Exceptions</a></h3>
386<h3><a href="exception_safety.html">Exception Safety</a></h3>
387
388<hr>
389<p><i>&copy; Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
390Distributed under the Boost Software License, Version 1.0. (See
391accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
392</i></p>
393</body>
394</html>