source: downloads/boost_1_33_1/libs/serialization/doc/special.html @ 12

Last change on this file since 12 was 12, checked in by landauf, 17 years ago
added boost
File size: 18.3 KB

1	<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
2	<html>
3	<!--
4	(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
5	Use, modification and distribution is subject to the Boost Software
6	License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
7	http://www.boost.org/LICENSE_1_0.txt)
8	-->
9	<head>
10	<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
11	<link rel="stylesheet" type="text/css" href="../../../boost.css">
12	<link rel="stylesheet" type="text/css" href="style.css">
13	<title>Serialization - Special Considerations</title>
14	</head>
15	<body link="#0000ff" vlink="#800080">
16	<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
17	<tr>
18	<td valign="top" width="300">
19	<h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
20	</td>
21	<td valign="top">
22	<h1 align="center">Serialization</h1>
23	<h2 align="center">Special Considerations</h2>
24	</td>
25	</tr>
26	</table>
27	<hr>
28	<dl class="page-index">
29	<dt><a href="#objecttracking">Object Tracking</a>
30	<dt><a href="#export">Exporting Class Serialization</a>
31	<dt><a href="#classinfo">Class Information</a>
32	<dt><a href="#portability">Archive Portability</a>
33	<dl class="page-index">
34	<dt><a href="#numerics">Numerics</a>
35	<dt><a href="#traits">Traits</a>
36	</dl>
37	<dt><a href="#binary_archives">Binary Archives</a>
38	<dt><a href="#xml_archives">XML Archives</a>
39	<dt><a href="exceptions.html">Archive Exceptions</a>
40	<dt><a href="exception_safety.html">Exception Safety</a>
41	</dl>
42
43	<h3><a name="objecttracking">Object Tracking</a></h3>
44	Depending on how the class is used and other factors, serialized objects
45	may be tracked by memory address. This prevents the same object from being
46	written to or read from an archive multiple times. These stored addresses
47	can also be used to delete objects created during a loading process
48	that has been interrupted by throwing of an exception.
49	<p>
50	This could cause problems in
51	programs where copies of different objects are saved from the same address.
52	<pre><code>
53	template<class Archive>
54	void save(Archive & ar, const unsigned int version) const
55	{
56	    for(int i = 0; i < 10; ++i){
57	        A x = a[i];
58	        ar << x;
59	    }
60	}
61	</code></pre>
62	In this case, the data to be saved exists on the stack. Each iteration
63	of the loop updates the value on the stack. So although the data changes
64	each iteration, the address of the data doesn't. If <code>a</code> is an array of
65	objects being tracked by memory address, the library will skip storing
66	objects after the first as it will be assumed that objects at the same address
67	are really the same object.
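<p>
The effect can be reproduced without the library. The following is a minimal sketch,
assuming a hypothetical <code>toy_tracking_archive</code> that mimics only the
"skip an address already seen" rule; it is not the library's implementation. The local
copy is declared outside the loop so that reuse of its address is guaranteed rather than
merely likely.
<pre><code>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for an output archive that tracks objects by address.
struct toy_tracking_archive {
    std::set<const void *> seen;  // addresses already written
    std::vector<int> written;     // values that actually reached the "archive"

    void save(const int & x) {
        // an address seen before is assumed to be the same object, so it is skipped
        if (seen.insert(&x).second)
            written.push_back(x);
    }
};

// Copies each element into a single local variable before saving:
// every call to save() sees the same address.
inline std::size_t save_through_copy(const std::vector<int> & a) {
    toy_tracking_archive ar;
    int x = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        x = a[i];
        ar.save(x);
    }
    return ar.written.size();
}

// Saves the elements in place: each element has its own address.
inline std::size_t save_in_place(const std::vector<int> & a) {
    toy_tracking_archive ar;
    for (std::size_t i = 0; i < a.size(); ++i)
        ar.save(a[i]);
    return ar.written.size();
}
</code></pre>
For a vector of ten elements, <code>save_through_copy</code> records a single value
while <code>save_in_place</code> records all ten.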
68	<p>
69	To help detect such cases, output archive operators expect to be passed
70	<code style="white-space: normal">const</code> reference arguments.
71	<p>
72	Given this, the above code will invoke a compile time assertion.
73	The obvious fix in this example is to use
74	<pre><code>
75	template<class Archive>
76	void save(Archive & ar, const unsigned int version) const
77	{
78	    for(int i = 0; i < 10; ++i){
79	        ar << a[i];
80	    }
81	}
82	</code></pre>
83	which will compile and run without problem.
84	The usage of <code style="white-space: normal">const</code> by the output archive operators
85	will ensure that the process of serialization doesn't
86	change the state of the objects being serialized. An attempt to do this
87	would constitute augmentation of the concept of saving of state with
88	some sort of non-obvious side effect. This would almost surely be a mistake
89	and a likely source of very subtle bugs.
90	<p>
91	Unfortunately, implementation issues currently prevent the detection of this kind of
92	error when the data item is wrapped as a name-value pair.
93	<p>
94	A similar problem can occur when different objects are loaded to an address
95	that is different from the final location:
96	<pre><code>
97	template<class Archive>
98	void load(Archive & ar, const unsigned int version)
99	{
100	    for(int i = 0; i < 10; ++i){
101	        A x;
102	        ar >> x;
103	        m_set.insert(x);
104	    }
105	}
106	</code></pre>
107	In this case, the address of <code>x</code> is the one that is tracked rather than
108	the address of the new item added to the set. Left unaddressed,
109	this will break features that depend on tracking, such as loading an object through a pointer.
110	Subtle bugs will be introduced into the program. This can be
111	addressed by altering the above code as follows:
112
113	<pre><code>
114	template<class Archive>
115	void load(Archive & ar, const unsigned int version)
116	{
117	    for(int i = 0; i < 10; ++i){
118	        A x;
119	        ar >> x;
120	        std::pair<std::set<A>::const_iterator, bool> result;
121	        result = m_set.insert(x);
122	        ar.reset_object_address(& (*result.first), &x);
123	    }
124	}
125	</code></pre>
126	This will adjust the tracking information to reflect the final resting place of
127	the moved variable and thereby rectify the above problem.
128	<p>
129	If it is known a priori that no pointer
130	values are duplicated, overhead associated with object tracking can
131	be eliminated by setting the object tracking class serialization trait
132	appropriately.
133	<p>
134	By default, data types designated primitive by
135	<a target="detail" href="traits.html#level">Implementation Level</a>
136	class serialization trait are never tracked. If it is desired to
137	track a shared primitive object through a pointer (e.g. a
138	<code style="white-space: normal">long</code> used as a reference count), it should be wrapped
139	in a class/struct so that it is an identifiable type.
140	The alternative of changing the implementation level of a <code style="white-space: normal">long</code>
141	would affect all <code style="white-space: normal">long</code>s serialized in the whole
142	program - probably not what one would intend.
143	<p>
144	It is possible that we may want to track addresses even though
145	the object is never serialized through a pointer. For example,
146	a virtual base class need be saved/loaded only once. By setting
147	this serialization trait to <code style="white-space: normal">track_always</code>, we can suppress
148	redundant save/load operations.
149	<pre><code>
150	BOOST_CLASS_TRACKING(my_virtual_base_class, boost::serialization::track_always)
151	</code></pre>
152
153	<h3><a name="export">Exporting Class Serialization</a></h3>
154	<a target="detail" href="traits.html#export">Elsewhere</a> in this manual, we have described
155	<code style="white-space: normal">BOOST_CLASS_EXPORT</code>. This is used to make the serialization library aware
156	that code should be instantiated for serialization of a given class even though the
157	class hasn't been otherwise referred to by the program.
158	<p>
159	There are several ways <code style="white-space: normal">BOOST_CLASS_EXPORT</code> could have been
160	implemented.
161	<p>
162	One approach would be to instantiate serialization code for all archive classes included in the library.
163	This would add to each executable a large amount of code that is most likely never called.
164	Also it would needlessly slow down compilations of any program that uses the library. Finally,
165	the list of archives would be "built-in" to the library which would complicate the addition of
166	new or custom archive classes.
167	<p>
168	Another approach would be for the library user to explicitly specify the archive classes for which
169	code should be instantiated for each class to be serialized. Users would have to include
170	the header files corresponding to the archive classes to be instantiated.
171	The list of instantiated archive classes would have to be manually kept in sync with the
172	archive class headers actually included. This was considered burdensome and error prone.
173	<p>
174	This implementation of <code style="white-space: normal">BOOST_CLASS_EXPORT</code> works in the
175	following way:
176	<ul>
177	<li>All header modules of the form <boost/archive/*archive.hpp> are required to precede
178	the header module <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>.
179	<li>The header <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>
180	builds a list of archive classes whose header modules have been previously included.
181	It does this by checking to see which inclusion guard constants have been defined.
182	<li><code style="white-space: normal">BOOST_CLASS_EXPORT(my_class)</code> explicitly instantiates
183	serialization code for <code style="white-space: normal">my_class</code> for each archive in the list.
184	</ul>
185	Serialization code will be instantiated for a given archive class
186	if and only if the module that defines that archive class has been included in the program.
187	Given this, our program will contain all necessary code instantiations and no other.
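<p>
The guard-detection step can be pictured with a small preprocessor sketch. The macro names
<code>TOY_TEXT_OARCHIVE_HPP</code> and <code>TOY_XML_OARCHIVE_HPP</code> and the function
<code>known_archives</code> are hypothetical, chosen for illustration only. The sketch
collects names at run time, which is an assumption made to keep it short; it shows the idea
of testing inclusion guards, not how export.hpp is actually written.
<pre><code>
#include <string>
#include <vector>

// Hypothetical inclusion guard: pretend a header "toy_text_oarchive.hpp"
// was included earlier and defined its guard, while the XML one was not.
#define TOY_TEXT_OARCHIVE_HPP

// What a header playing the role of export.hpp could do:
// list every archive whose inclusion guard is visible at this point.
inline std::vector<std::string> known_archives() {
    std::vector<std::string> list;
#if defined(TOY_TEXT_OARCHIVE_HPP)
    list.push_back("toy_text_oarchive");
#endif
#if defined(TOY_XML_OARCHIVE_HPP)
    list.push_back("toy_xml_oarchive");
#endif
    return list;
}
</code></pre>
Because the test happens where the checking header is included, an archive header included
afterwards would go unnoticed, which is why the archive headers have to come first.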
188	<p>
189	For many styles of code organization this header sequencing requirement presents little problem.
190
191	Serialization code organized by class headers that are designed to be independent of archive
192	implementations will look something like the following:
193	<code><pre>
194	// A.hpp
195	// Note:to preserve independence from any particular archive implementation,
196	// no headers from <boost/archive/...> are included.
197	// Headers can be included in any order.
198	#include <boost/serialization/...>
199	#include <boost/serialization/export.hpp>
200	... // include other headers that A depends upon
201
202	class A {
203	...
204	};
205
206	BOOST_CLASS_EXPORT(A) // note: the export name of this class
207	</pre></code>
208	This style:
209	<ul>
210	<li>permits the header to include all aspects of the serialization implementation.
211	<li>permits the header to be included anywhere else as part of some other class declaration.
212	<li>reflects the concept of headers as a "library of types" which
213	can be used independently in other programs or other parts of the same program.
214	<li>reflects a fundamental principle of the serialization library design in that the
215	specification of serialization of any class is independent of any archive implementation.
216	</ul>
217	However, it might not always be possible or convenient to conform to the above style. Something
218	like the following might be required or preferred:
219	<code><pre>
220	// A.hpp
221	// headers can be included in any order
222	#include <boost/archive/text_oarchive.hpp>
223	#include <boost/archive/text_iarchive.hpp>
224	...
225	#include <boost/serialization/...>
226	...
227	// can't do the following because then A.hpp couldn't be included somewhere else
228	// #include <boost/serialization/export.hpp>
229
230	class A {
231	...
232	};
233	// can't do the following because export.hpp is not included !!
234	//BOOST_CLASS_EXPORT(A) // note: the export name of this class
235	</pre></code>
236	As noted in the comments, this would work. But
237	<code style="white-space: normal">#include <.../export.hpp></code> can't be used
238	without conflicting with other modules which use
239	<code style="white-space: normal">#include <.../*archive.hpp></code>. In this
240	case we can move the export to an implementation file:
241	<code><pre>
242	// A.cpp
243	#include "A.hpp"
244	...
245	// export.hpp header should be last;
246	#include <boost/serialization/export.hpp>
247	...
248	BOOST_CLASS_EXPORT(A)
249	...
250	</pre></code>
251
252	<h3><a name="classinfo">Class Information</a></h3>
253	By default, for each class serialized, class information is written to the archive.
254	This information includes version number, implementation level and tracking
255	behavior. This is necessary so that the archive can be correctly
256	deserialized even if a subsequent version of the program changes
257	some of the current trait values for a class. The space overhead for
258	this data is minimal. There is a little bit of runtime overhead
259	since each class has to be checked to see if it has already had its
260	class information included in the archive. In some cases, even this
261	might be considered too much. This extra overhead can be eliminated
262	by setting the
263	<a target="detail" href="traits.html#level">implementation level</a>
264	class trait to: <code style="white-space: normal">boost::serialization::object_serializable</code>.
265	<p>
266	<i>Turning off tracking and class information serialization will result
267	in pure template inline code that in principle could be optimised down
268	to a simple stream write/read.</i> Elimination of all serialization overhead
269	in this manner comes at a cost. Once archives are released to users, the
270	class serialization traits cannot be changed without invalidating the old
271	archives. Including the class information in the archive assures us
272	that they will be readable in the future even if the class definition
273	is revised. A lightweight structure such as a display pixel might be
274	declared in a header like this:
275
276	<pre><code>
277	#include <boost/serialization/serialization.hpp>
278	#include <boost/serialization/level.hpp>
279	#include <boost/serialization/tracking.hpp>
280
281	// a pixel is a lightweight struct which is used in great numbers.
282	struct pixel
283	{
284	    unsigned char red, green, blue;
285	    template<class Archive>
286	    void serialize(Archive & ar, const unsigned int /* version */){
287	        ar & red & green & blue;
288	    }
289	};
290
291	// eliminate serialization overhead at the cost of
292	// never being able to increase the version.
293	BOOST_CLASS_IMPLEMENTATION(pixel, boost::serialization::object_serializable)
294
295	// eliminate object tracking (even if serialized through a pointer)
296	// at the risk of a programming error creating duplicate objects.
297	BOOST_CLASS_TRACKING(pixel, boost::serialization::track_never)
298	</code></pre>
299
300	<h3><a name="portability">Archive Portability</a></h3>
301	Several archive classes create their data in the form of text or a portable binary format.
302	It should be possible to save an archive of such a class on one platform and load it on another.
303	This is subject to a couple of conditions.
304	<h4><a name="numerics">Numerics</a></h4>
305	The architecture of the machine reading the archive must be able to hold the data
306	saved. For example, the gcc compiler reserves 4 bytes to store a variable of type
307	<code style="white-space: normal">wchar_t</code> while other compilers reserve only 2 bytes.
308	So it's possible that a value could be written that couldn't be represented by the loading program. This is a
309	fairly obvious situation and easily handled by using the numeric types in
310	<a target="cstding" href="../../../boost/cstdint.hpp"><boost/cstdint.hpp></a>.
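<p>
As a rough illustration, the sketch below stores a character code in a fixed-width member
rather than in <code style="white-space: normal">wchar_t</code>. It assumes the standard
<code style="white-space: normal"><cstdint></code> header, which provides the same
typedefs as the Boost header in namespace <code style="white-space: normal">std</code>.
The names <code>portable_glyph</code> and <code>fits_in_16_bits</code> are hypothetical.
<pre><code>
#include <cstdint>

// Hypothetical record: the members have the same width on every platform,
// unlike wchar_t, which may be 2 or 4 bytes.
struct portable_glyph {
    std::uint32_t code_point;  // wide enough for a value from a 4-byte wchar_t
    std::int16_t  kerning;
};

// True if a stored value could also be held by a loader whose
// character type is only 16 bits wide.
inline bool fits_in_16_bits(std::uint32_t v) {
    return v <= 0xFFFFu;
}
</code></pre>
A loading program with a narrower character type could apply such a check after reading
the value and report an error instead of silently truncating.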
311
312	<h4><a name="traits">Traits</a></h4>
313	Another potential problem is illustrated by the following example:
314	<pre><code>
315	template<class T>
316	struct my_wrapper {
317	template<class Archive>
318	Archive & serialize ...
319	};
320
321	...
322
323	class my_class {
324	wchar_t a;
325	short unsigned b;
326	template<class Archive>
327	Archive & serialize(Archive & ar, unsigned int version){
328	ar & my_wrapper(a);
329	ar & my_wrapper(b);
330	}
331	};
332	</code></pre>
333	If <code style="white-space: normal">my_wrapper</code> uses default serialization
334	traits there could be a problem. With the default traits, each time a new type is
335	added to the archive, bookkeeping information is added. So in this example, the
336	archive would include such bookkeeping information for
337	<code style="white-space: normal">my_wrapper<wchar_t></code> and for
338	<code style="white-space: normal">my_wrapper<unsigned short></code>.
339	Or would it? What about compilers that treat
340	<code style="white-space: normal">wchar_t</code> as a
341	synonym for <code style="white-space: normal">unsigned short</code>?
342	In this case there is only one distinct type - not two. If archives are passed between
343	programs with compilers that differ in their treatment
344	of <code style="white-space: normal">wchar_t</code> the load operation will fail
345	in a catastrophic way.
346	<p>
347	One remedy for this is to assign serialization traits to the template
348	<code style="white-space: normal">my_wrapper</code> such that class
349	information for instantiations of this template is never serialized. This
350	process is described <a target="detail" href="traits.html#templates">above</a> and
351	has been used for <a target="detail" href="wrappers.html#nvp"><strong>Name-Value Pairs</strong></a>.
352	Wrappers would typically be assigned such traits.
353	<p>
354	Another way to avoid this problem is to assign serialization traits
355	to all specializations of the template <code style="white-space: normal">my_wrapper</code>
356	for all primitive types so that class information is never saved. This is what has
357	been done for our implementation of serialization for STL collections.
358
359	<h3><a name="binary_archives">Binary Archives</a></h3>
360	Standard stream i/o on some systems will expand linefeed characters to carriage-return/linefeed
361	on output. This creates a problem for binary archives. The easiest way to handle this is to
362	open streams for binary archives in "binary mode" by using the flag
363	<code style="white-space: normal">ios::binary</code>. If this is not done, the archive generated
364	will be unreadable.
365	<p>
366	Unfortunately, no way has been found to detect this error before loading the archive. Debug builds
367	will assert when this is detected so that may be helpful in catching this error.
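<p>
A minimal sketch of opening streams in binary mode, using only standard stream classes.
The helper names <code>write_bytes</code> and <code>read_bytes</code> are hypothetical;
with the library, streams opened this way would be handed to the binary archive classes.
<pre><code>
#include <fstream>
#include <iterator>
#include <string>

// Writes raw bytes; std::ios::binary prevents '\n' from being expanded
// to carriage-return/linefeed on systems that do so in text mode.
inline bool write_bytes(const std::string & path, const std::string & bytes) {
    std::ofstream os(path.c_str(), std::ios::out | std::ios::binary);
    os.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return os.good();
}

// Reads the whole file back, again in binary mode.
inline std::string read_bytes(const std::string & path) {
    std::ifstream is(path.c_str(), std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(is),
                       std::istreambuf_iterator<char>());
}
</code></pre>
With both streams in binary mode, the bytes read back are identical to those written,
including any linefeed characters.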
368
369	<h3><a name="xml_archives">XML Archives</a></h3>
370	XML archives present a somewhat special case.
371	XML format has a nested structure that maps well to the "recursive class member visitor" pattern
372	used by the serialization system. However, XML differs from other formats in that it
373	requires a name for each data member. Our goal is to add this information to the
374	class serialization specification while still permitting the serialization code to be
375	used with any archive. This is achieved by requiring that all data serialized to an XML archive
376	be serialized as a <a target="detail" href="wrappers.html#nvp">name-value pair</a>.
377	The first member is the name to be used as the XML tag for the
378	data item while the second is a reference to the data item itself. Any attempt to serialize data
379	not wrapped in a <a target="detail" href="wrappers.html#nvp">name-value pair</a> will
380	be trapped at compile time. The system is implemented in such a way that for other archive classes,
381	just the value portion of the data is serialized. The name portion is discarded during compilation.
382	So by always using <a target="detail" href="wrappers.html#nvp">name-value pairs</a>, it will
383	be guaranteed that all data can be serialized to all archive classes with maximum efficiency.
384
385	<h3><a href="exceptions.html">Archive Exceptions</a></h3>
386	<h3><a href="exception_safety.html">Exception Safety</a></h3>
387
388	<hr>
389	<p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
390	Distributed under the Boost Software License, Version 1.0. (See
391	accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
392	</i></p>
393	</body>
394	</html>
