source: downloads/boost_1_34_1/libs/serialization/doc/special.html @ 29

Last change on this file since 29 was 29, checked in by landauf, 16 years ago
updated boost from 1_33_1 to 1_34_1
File size: 18.5 KB

1	<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
2	<html>
3	<!--
4	(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
5	Use, modification and distribution is subject to the Boost Software
6	License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
7	http://www.boost.org/LICENSE_1_0.txt)
8	-->
9	<head>
10	<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
11	<link rel="stylesheet" type="text/css" href="../../../boost.css">
12	<link rel="stylesheet" type="text/css" href="style.css">
13	<title>Serialization - Special Considerations</title>
14	</head>
15	<body link="#0000ff" vlink="#800080">
16	<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
17	<tr>
18	<td valign="top" width="300">
19	<h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
20	</td>
21	<td valign="top">
22	<h1 align="center">Serialization</h1>
23	<h2 align="center">Special Considerations</h2>
24	</td>
25	</tr>
26	</table>
27	<hr>
28	<dl class="page-index">
29	<dt><a href="#objecttracking">Object Tracking</a>
30	<dt><a href="#export">Exporting Class Serialization</a>
31	<dt><a href="#classinfo">Class Information</a>
32	<dt><a href="#portability">Archive Portability</a>
33	<dl class="page-index">
34	<dt><a href="#numerics">Numerics</a>
35	<dt><a href="#traits">Traits</a>
36	</dl>
37	<dt><a href="#binary_archives">Binary Archives</a>
38	<dt><a href="#xml_archives">XML Archives</a>
39	<dt><a href="exceptions.html">Archive Exceptions</a>
40	<dt><a href="exception_safety.html">Exception Safety</a>
41	</dl>
42
43	<h3><a name="objecttracking">Object Tracking</a></h3>
44	Depending on how the class is used and other factors, serialized objects
45	may be tracked by memory address. This prevents the same object from being
46	written to or read from an archive multiple times. These stored addresses
47	can also be used to delete objects created during a loading process
48	that has been interrupted by throwing of an exception.
49	<p>
50	This could cause problems in
51	programs where copies of different objects are saved from the same address.
52	<pre><code>
53	template<class Archive>
54	void save(Archive & ar, const unsigned int version) const
55	{
56	    for(int i = 0; i < 10; ++i){
57	        A x = a[i]; // stack copy: same address on every iteration
58	        ar << x;
59	    }
60	}
61	</code></pre>
62	In this case, the data to be saved exists on the stack. Each iteration
63	of the loop updates the value on the stack. So although the data changes
64	each iteration, the address of the data doesn't. If <code>a</code> is an array of
65	objects being tracked by memory address, the library will skip storing
66	objects after the first as it will be assumed that objects at the same address
67	are really the same object.
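<p>
The effect can be reproduced with a minimal sketch that does not use the library at all. <code>toy_oarchive</code> below is a hypothetical stand-in that only mimics the "skip an address that has already been seen" rule; its names and behavior are assumptions made for illustration, not the library's implementation. The copy is declared outside the loop so that its address is guaranteed to stay the same.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for an output archive that tracks objects by address.
// This is NOT the Boost implementation, only an illustration of the rule.
struct toy_oarchive {
    std::set<const void*> seen;  // addresses already written
    std::vector<int> stored;     // values actually stored

    void save(const int& x) {
        // insert() reports whether this address is new
        if (seen.insert(&x).second)
            stored.push_back(x);
    }
};

// Saves through one local copy: the value changes, the address does not.
inline std::size_t save_via_copy(const int (&a)[10]) {
    toy_oarchive ar;
    int x = 0;
    for (int i = 0; i < 10; ++i) {
        x = a[i];
        ar.save(x);
    }
    return ar.stored.size();
}

// Saves each element in place: every element has its own address.
inline std::size_t save_directly(const int (&a)[10]) {
    toy_oarchive ar;
    for (int i = 0; i < 10; ++i)
        ar.save(a[i]);
    return ar.stored.size();
}

// Usage: compares the two strategies on the same array.
inline void check_tracking_sketch() {
    const int a[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    assert(save_via_copy(a) == 1);   // only the first value is stored
    assert(save_directly(a) == 10);  // all ten values are stored
}
```

In this sketch the copy-based loop stores a single value, while saving the elements in place stores all ten.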
68	<p>
69	To help detect such cases, output archive operators expect to be passed
70	<code style="white-space: normal">const</code> reference arguments.
71	<p>
72	Given this, the above code will invoke a compile time assertion.
73	The obvious fix in this example is to use
74	<pre><code>
75	template<class Archive>
76	void save(Archive & ar, const unsigned int version) const
77	{
78	    for(int i = 0; i < 10; ++i){
79	        ar << a[i];
80	    }
81	}
82	</code></pre>
83	which will compile and run without problem.
84	The usage of <code style="white-space: normal">const</code> by the output archive operators
85	will ensure that the process of serialization doesn't
86	change the state of the objects being serialized. An attempt to do this
87	would constitute augmentation of the concept of saving of state with
88	some sort of non-obvious side effect. This would almost surely be a mistake
89	and a likely source of very subtle bugs.
90	<p>
91	Unfortunately, implementation issues currently prevent the detection of this kind of
92	error when the data item is wrapped as a name-value pair.
93	<p>
94	A similar problem can occur when different objects are loaded to an address
95	which is different from the final location:
96	<pre><code>
97	template<class Archive>
98	void load(Archive & ar, const unsigned int version)
99	{
100	    for(int i = 0; i < 10; ++i){
101	        A x;
102	        ar >> x;
103	        m_set.insert(x); // the copy in the set has a different address than x
104	    }
105	}
106	</code></pre>
107	In this case, the address of <code>x</code> is the one that is tracked rather than
108	the address of the new item added to the set. Left unaddressed,
109	this will break features that depend on tracking, such as loading an object through a pointer,
110	and subtle bugs will be introduced into the program. This can be
111	addressed by altering the above code thusly:
112
113	<pre><code>
114	template<class Archive>
115	void load(Archive & ar, const unsigned int version)
116	{
117	    for(int i = 0; i < 10; ++i){
118	        A x;
119	        ar >> x;
120	        std::pair<std::set<A>::const_iterator, bool> result;
121	        result = m_set.insert(x);
122	        ar.reset_object_address(& (*result.first), &x); // new address, old address
123	    }
124	}
125	</code></pre>
126	This will adjust the tracking information to reflect the final resting place of
127	the moved variable and thereby rectify the above problem.
128	<p>
129	If it is known a priori that no pointer
130	values are duplicated, overhead associated with object tracking can
131	be eliminated by setting the object tracking class serialization trait
132	appropriately.
133	<p>
134	By default, data types designated primitive by the
135	<a target="detail" href="traits.html#level">Implementation Level</a>
136	class serialization trait are never tracked. If it is desired to
137	track a shared primitive object through a pointer (e.g. a
138	<code style="white-space: normal">long</code> used as a reference count), it should be wrapped
139	in a class/struct so that it is an identifiable type.
140	The alternative of changing the implementation level of a <code style="white-space: normal">long</code>
141	would affect all <code style="white-space: normal">long</code>s serialized in the whole
142	program - probably not what one would intend.
143	<p>
144	It is possible that we may want to track addresses even though
145	the object is never serialized through a pointer. For example,
146	a virtual base class need be saved/loaded only once. By setting
147	this serialization trait to <code style="white-space: normal">track_always</code>, we can suppress
148	redundant save/load operations.
149	<pre><code>
150	BOOST_CLASS_TRACKING(my_virtual_base_class, boost::serialization::track_always)
151	</code></pre>
152
153	<h3><a name="export">Exporting Class Serialization</a></h3>
154	<a target="detail" href="traits.html#export">Elsewhere</a> in this manual, we have described
155	<code style="white-space: normal">BOOST_CLASS_EXPORT</code>. This is used to make the serialization library aware
156	that code should be instantiated for serialization of a given class even though the
157	class hasn't been otherwise referred to by the program.
158	<p>
159	There are several ways <code style="white-space: normal">BOOST_CLASS_EXPORT</code> could have been
160	implemented.
161	<p>
162	One approach would be to instantiate serialization code for all archive classes included in the library.
163	This would add to each executable a large amount of code that is most likely never called.
164	Also it would needlessly slow down compilations of any program that uses the library. Finally,
165	the list of archives would be "built-in" to the library which would complicate the addition of
166	new or custom archive classes.
167	<p>
168	Another approach would be for the library user to explicitly specify the archive classes
169	for which code should be instantiated for each class to be serialized. Users would have to include
170	header files corresponding to the archive classes to be instantiated.
171	The list of instantiated archive classes would have to be manually kept in sync with the
172	archive class headers actually included. This was considered burdensome and error prone.
173	<p>
174	This implementation of <code style="white-space: normal">BOOST_CLASS_EXPORT</code> works in the
175	following way:
176	<ul>
177	<li>All header modules of the form <boost/archive/*archive.hpp> are required to precede
178	the header module <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>.
179	<li>The header <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>
180	builds a list of archive classes whose header modules have been previously included.
181	It does this by checking to see which inclusion guard constants have been defined.
182	The header <a href="../../../boost/archive/detail/known_archive_types.hpp" target="known_archive_types_hpp">known_archive_types.hpp</a>
183	lists the archive header files whose include guards will be checked. If you create your own
184	archive class, you probably want to edit this file.
185	<li><code style="white-space: normal">BOOST_CLASS_EXPORT(my_class)</code> explicitly instantiates
186	serialization code for <code style="white-space: normal">my_class</code> for each archive in the list.
187	</ul>
188	Serialization code will be instantiated for a given archive class
189	if and only if the module that defines that archive class has been included in the program.
190	Given this, our program will contain all necessary code instantiations and no other.
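<p>
The include-guard check described above can be pictured with a small preprocessor sketch. The guard names (<code>TOY_TEXT_OARCHIVE_HPP</code>, <code>TOY_XML_OARCHIVE_HPP</code>) and the function <code>included_archives</code> are hypothetical and chosen only for illustration; the real headers use their own guard constants and build a type list rather than a list of strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical include guards, standing in for those that archive headers
// would define when included. Here only the text archive is "included".
#define TOY_TEXT_OARCHIVE_HPP
// #define TOY_XML_OARCHIVE_HPP   // this archive header was not included

// Builds the list of known archives by testing which guards are defined.
inline std::vector<std::string> included_archives() {
    std::vector<std::string> names;
#if defined(TOY_TEXT_OARCHIVE_HPP)
    names.push_back("text_oarchive");
#endif
#if defined(TOY_XML_OARCHIVE_HPP)
    names.push_back("xml_oarchive");
#endif
    return names;
}

// Usage: only archives whose guard was visible end up in the list.
inline void check_guard_sketch() {
    const std::vector<std::string> names = included_archives();
    assert(names.size() == 1);
    assert(names[0] == "text_oarchive");
}
```

Because the test happens at preprocessing time, the result depends on what has been included before this point, which is why the archive headers have to come first.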
191	<p>
192	For many styles of code organization this header sequencing requirement presents little problem.
193
194	Serialization code organized by class headers that are designed to be independent of archive
195	implementations will look something like the following:
196	<code><pre>
197	// A.hpp
198	// Note: to preserve independence from any particular archive implementation,
199	// no headers from <boost/archive/...> are included.
200	// Headers can be included in any order.
201	#include <boost/serialization/...>
202	#include <boost/serialization/export.hpp>
203	... // include other headers that A depends upon
204
205	class A {
206	...
207	};
208
209	BOOST_CLASS_EXPORT(A) // note: the export name of this class
210	</pre></code>
211	This style:
212	<ul>
213	<li>permits the header to include all aspects of the serialization implementation.
214	<li>permits the header to be included anywhere else as part of some other class declaration.
215	<li>reflects the concept of headers as a "library of types" which
216	can be used independently in other programs or other parts of the same program.
217	<li>reflects a fundamental principle of the serialization library design in that the
218	specification of serialization of any class is independent of any archive implementation.
219	</ul>
220	However, it might not always be possible or convenient to conform to the above style. Something
221	like the following might be required or preferred:
222	<code><pre>
223	// A.hpp
224	// headers can be included in any order
225	#include <boost/archive/text_oarchive.hpp>
226	#include <boost/archive/text_iarchive.hpp>
227	...
228	#include <boost/serialization/...>
229	...
230	// can't do the following because then A.hpp couldn't be included somewhere else
231	// #include <boost/serialization/export.hpp>
232
233	class A {
234	...
235	};
236	// can't do the following because export.hpp is not included !!
237	//BOOST_CLASS_EXPORT(A) // note: the export name of this class
238	</pre></code>
239	As noted in the comments, this header works as it stands, but
240	<code style="white-space: normal">#include <.../export.hpp></code> can't be used in it
241	without conflicting with other modules which use
242	<code style="white-space: normal">#include <.../*archive.hpp></code>. In this
243	case we can move the export to an implementation file:
244	<code><pre>
245	// A.cpp
246	#include "A.hpp"
247	...
248	// export.hpp header should be last;
249	#include <boost/serialization/export.hpp>
250	...
251	BOOST_CLASS_EXPORT(A)
252	...
253	</pre></code>
254
255	<h3><a name="classinfo">Class Information</a></h3>
256	By default, for each class serialized, class information is written to the archive.
257	This information includes version number, implementation level and tracking
258	behavior. This is necessary so that the archive can be correctly
259	deserialized even if a subsequent version of the program changes
260	some of the current trait values for a class. The space overhead for
261	this data is minimal. There is a little bit of runtime overhead
262	since each class has to be checked to see if it has already had its
263	class information included in the archive. In some cases, even this
264	might be considered too much. This extra overhead can be eliminated
265	by setting the
266	<a target="detail" href="traits.html#level">implementation level</a>
267	class trait to: <code style="white-space: normal">boost::serialization::object_serializable</code>.
268	<p>
269	<i>Turning off tracking and class information serialization will result
270	in pure template inline code that in principle could be optimised down
271	to a simple stream write/read.</i> Elimination of all serialization overhead
272	in this manner comes at a cost. Once archives are released to users, the
273	class serialization traits cannot be changed without invalidating the old
274	archives. Including the class information in the archive assures us
275	that they will be readable in the future even if the class definition
276	is revised. A lightweight structure such as a display pixel might be
277	declared in a header like this:
278
279	<pre><code>
280	#include <boost/serialization/serialization.hpp>
281	#include <boost/serialization/level.hpp>
282	#include <boost/serialization/tracking.hpp>
283
284	// a pixel is a lightweight struct which is used in great numbers.
285	struct pixel
286	{
287	    unsigned char red, green, blue;
288	    template<class Archive>
289	    void serialize(Archive & ar, const unsigned int /* version */){
290	        ar & red & green & blue; // operator& serves both saving and loading
291	    }
292	};
293
294	// eliminate serialization overhead at the cost of
295	// never being able to increase the version.
296	BOOST_CLASS_IMPLEMENTATION(pixel, boost::serialization::object_serializable)
297
298	// eliminate object tracking (even if serialized through a pointer)
299	// at the risk of a programming error creating duplicate objects.
300	BOOST_CLASS_TRACKING(pixel, boost::serialization::track_never)
301	</code></pre>
302
303	<h3><a name="portability">Archive Portability</a></h3>
304	Several archive classes create their data in the form of text or a portable binary format.
305	It should be possible to save such an archive on one platform and load it on another.
306	This is subject to a couple of conditions.
307	<h4><a name="numerics">Numerics</a></h4>
308	The architecture of the machine reading the archive must be able to hold the data
309	saved. For example, the gcc compiler reserves 4 bytes to store a variable of type
310	<code style="white-space: normal">wchar_t</code> while other compilers reserve only 2 bytes.
311	So it's possible that a value could be written that couldn't be represented by the loading program. This is a
312	fairly obvious situation and easily handled by using the numeric types in
313	<a target="cstding" href="../../../boost/cstdint.hpp"><boost/cstdint.hpp></a>.
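<p>
As a rough illustration, a fixed-width type makes the stored size independent of the compiler. The sketch below uses the standard <code>&lt;cstdint&gt;</code> header (C++11) as an assumed equivalent of <code>&lt;boost/cstdint.hpp&gt;</code>; the names <code>portable_char</code> and <code>fits_in_16_bits</code> are hypothetical and not part of the library.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// A fixed-width type: the same size wherever it is available,
// unlike wchar_t, whose size depends on the compiler.
typedef std::uint32_t portable_char;

// Hypothetical helper: true if a saved value could also be represented
// by a reader that only has 16 bits available for it.
inline bool fits_in_16_bits(portable_char c) {
    return c <= std::numeric_limits<std::uint16_t>::max();
}

// Usage: a small value fits, a value above 0xFFFF does not.
inline void check_numerics_sketch() {
    assert(fits_in_16_bits(0x41));
    assert(!fits_in_16_bits(0x1F600));
}
```

Declaring members with such types, instead of <code>wchar_t</code>, means the writer and the reader agree on the range of values regardless of platform.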
314
315	<h4><a name="traits">Traits</a></h4>
316	Another potential problem is illustrated by the following example:
317	<pre><code>
318	template<class T>
319	struct my_wrapper {
320	template<class Archive>
321	Archive & serialize ...
322	};
323
324	...
325
326	class my_class {
327	wchar_t a;
328	short unsigned b;
329	template<class Archive>
330	Archive & serialize(Archive & ar, unsigned int version){
331	ar & my_wrapper(a);
332	ar & my_wrapper(b);
333	}
334	};
335	</code></pre>
336	If <code style="white-space: normal">my_wrapper</code> uses default serialization
337	traits there could be a problem. With the default traits, each time a new type is
338	added to the archive, bookkeeping information is added. So in this example, the
339	archive would include such bookkeeping information for
340	<code style="white-space: normal">my_wrapper<wchar_t></code> and for
341	<code style="white-space: normal">my_wrapper<short unsigned></code>.
342	Or would it? What about compilers that treat
343	<code style="white-space: normal">wchar_t</code> as a
344	synonym for <code style="white-space: normal">unsigned short</code>?
345	In this case there is only one distinct type - not two. If archives are passed between
346	programs with compilers that differ in their treatment
347	of <code style="white-space: normal">wchar_t</code> the load operation will fail
348	in a catastrophic way.
349	<p>
350	One remedy for this is to assign serialization traits to the template
351	<code style="white-space: normal">my_wrapper</code> such that class
352	information for instantiations of this template is never serialized. This
353	process is described <a target="detail" href="traits.html#templates">above</a> and
354	has been used for <a target="detail" href="wrappers.html#nvp"><strong>Name-Value Pairs</strong></a>.
355	Wrappers would typically be assigned such traits.
356	<p>
357	Another way to avoid this problem is to assign serialization traits
358	to all specializations of the template <code style="white-space: normal">my_wrapper</code>
359	for all primitive types so that class information is never saved. This is what has
360	been done for our implementation of serializations for STL collections.
361
362	<h3><a name="binary_archives">Binary Archives</a></h3>
363	Standard stream i/o on some systems will expand linefeed characters to carriage-return/linefeed
364	on output. This creates a problem for binary archives. The easiest way to handle this is to
365	open streams for binary archives in "binary mode" by using the flag
366	<code style="white-space: normal">ios::binary</code>. If this is not done, the archive generated
367	will be unreadable.
368	<p>
369	Unfortunately, no way has been found to detect this error before loading the archive. Debug builds
370	will assert when this is detected so that may be helpful in catching this error.
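<p>
A minimal sketch of opening streams in binary mode follows. It uses only standard stream calls and writes raw bytes rather than a real archive; the helper names <code>write_binary</code> and <code>read_binary</code> and the file name in the usage function are assumptions made for illustration.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Writes raw bytes with std::ios::binary so that '\n' is not translated.
inline void write_binary(const std::string& path, const std::string& bytes) {
    std::ofstream os(path.c_str(), std::ios::out | std::ios::binary);
    os.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Reads the file back byte for byte, again in binary mode.
inline std::string read_binary(const std::string& path) {
    std::ifstream is(path.c_str(), std::ios::in | std::ios::binary);
    std::string bytes;
    char c;
    while (is.get(c))
        bytes.push_back(c);
    return bytes;
}

// Usage: a payload containing linefeeds and a zero byte survives unchanged.
inline void check_binary_sketch() {
    const std::string payload("\x01\n\x02\r\n\x00\x03", 7);
    write_binary("toy_archive.bin", payload);
    assert(read_binary("toy_archive.bin") == payload);
}
```

The same flag would be passed when constructing the stream that is handed to a binary archive.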
371
372	<h3><a name="xml_archives">XML Archives</a></h3>
373	XML archives present a somewhat special case.
374	XML format has a nested structure that maps well to the "recursive class member visitor" pattern
375	used by the serialization system. However, XML differs from other formats in that it
376	requires a name for each data member. Our goal is to add this information to the
377	class serialization specification while still permitting the serialization code to be
378	used with any archive. This is achieved by requiring that all data serialized to an XML archive
379	be serialized as a <a target="detail" href="wrappers.html#nvp">name-value pair</a>.
380	The first member is the name to be used as the XML tag for the
381	data item while the second is a reference to the data item itself. Any attempt to serialize data
382	not wrapped in a <a target="detail" href="wrappers.html#nvp">name-value pair</a> will
383	be trapped at compile time. The system is implemented in such a way that for other archive classes,
384	just the value portion of the data is serialized. The name portion is discarded during compilation.
385	So by always using <a target="detail" href="wrappers.html#nvp">name-value pairs</a>, it will
386	be guaranteed that all data can be serialized to all archive classes with maximum efficiency.
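<p>
The idea can be sketched without the library. In the following, <code>toy_nvp</code>, <code>make_toy_nvp</code>, <code>toy_xml_oarchive</code> and <code>toy_text_oarchive</code> are hypothetical names invented for illustration; they only mimic the behavior described above, where one archive uses the name as a tag and another ignores it.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a name-value pair: a tag name plus a reference.
template<class T>
struct toy_nvp {
    const char* name;
    const T& value;
};

template<class T>
toy_nvp<T> make_toy_nvp(const char* name, const T& value) {
    toy_nvp<T> p = { name, value };
    return p;
}

// XML-like archive: the name becomes the tag around the value.
struct toy_xml_oarchive {
    std::ostringstream os;
    template<class T>
    toy_xml_oarchive& operator<<(const toy_nvp<T>& p) {
        os << '<' << p.name << '>' << p.value << "</" << p.name << '>';
        return *this;
    }
};

// Text-like archive: only the value is written, the name is ignored.
struct toy_text_oarchive {
    std::ostringstream os;
    template<class T>
    toy_text_oarchive& operator<<(const toy_nvp<T>& p) {
        os << p.value << ' ';
        return *this;
    }
};

// Usage: the same wrapped value is accepted by both toy archives.
inline void check_nvp_sketch() {
    const int width = 3;
    toy_xml_oarchive xml;
    xml << make_toy_nvp("width", width);
    assert(xml.os.str() == "<width>3</width>");

    toy_text_oarchive txt;
    txt << make_toy_nvp("width", width);
    assert(txt.os.str() == "3 ");
}
```

Since the toy XML archive only defines <code>operator&lt;&lt;</code> for wrapped values, passing a bare value to it does not compile, which loosely mirrors the compile-time trap mentioned above.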
387
388	<h3><a href="exceptions.html">Archive Exceptions</a></h3>
389	<h3><a href="exception_safety.html">Exception Safety</a></h3>
390
391	<hr>
392	<p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
393	Distributed under the Boost Software License, Version 1.0. (See
394	accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
395	</i></p>
396	</body>
397	</html>
