| 1 | <!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
---|
| 2 | <html> |
---|
| 3 | <!-- |
---|
| 4 | (C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com . |
---|
| 5 | Use, modification and distribution is subject to the Boost Software |
---|
| 6 | License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at |
---|
| 7 | http://www.boost.org/LICENSE_1_0.txt) |
---|
| 8 | --> |
---|
| 9 | <head> |
---|
| 10 | <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> |
---|
| 11 | <link rel="stylesheet" type="text/css" href="../../../boost.css"> |
---|
| 12 | <link rel="stylesheet" type="text/css" href="style.css"> |
---|
| 13 | <title>Serialization - Special Considerations</title> |
---|
| 14 | </head> |
---|
| 15 | <body link="#0000ff" vlink="#800080"> |
---|
| 16 | <table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header"> |
---|
| 17 | <tr> |
---|
| 18 | <td valign="top" width="300"> |
---|
| 19 | <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3> |
---|
| 20 | </td> |
---|
| 21 | <td valign="top"> |
---|
| 22 | <h1 align="center">Serialization</h1> |
---|
| 23 | <h2 align="center">Special Considerations</h2> |
---|
| 24 | </td> |
---|
| 25 | </tr> |
---|
| 26 | </table> |
---|
| 27 | <hr> |
---|
| 28 | <dl class="page-index"> |
---|
| 29 | <dt><a href="#objecttracking">Object Tracking</a> |
---|
| 30 | <dt><a href="#export">Exporting Class Serialization</a> |
---|
| 31 | <dt><a href="#classinfo">Class Information</a> |
---|
| 32 | <dt><a href="#portability">Archive Portability</a> |
---|
| 33 | <dl class="page-index"> |
---|
| 34 | <dt><a href="#numerics">Numerics</a> |
---|
| 35 | <dt><a href="#traits">Traits</a> |
---|
| 36 | </dl> |
---|
| 37 | <dt><a href="#binary_archives">Binary Archives</a> |
---|
| 38 | <dt><a href="#xml_archives">XML Archives</a> |
---|
| 39 | <dt><a href="exceptions.html">Archive Exceptions</a> |
---|
| 40 | <dt><a href="exception_safety.html">Exception Safety</a> |
---|
| 41 | </dl> |
---|
| 42 | |
---|
| 43 | <h3><a name="objecttracking">Object Tracking</a></h3> |
---|
| 44 | Depending on how the class is used and other factors, serialized objects |
---|
| 45 | may be tracked by memory address. This prevents the same object from being |
---|
| 46 | written to or read from an archive multiple times. These stored addresses |
---|
| 47 | can also be used to delete objects created during a loading process |
---|
| 48 | that has been interrupted by a thrown exception. |
---|
| 49 | <p> |
---|
| 50 | This could cause problems in |
---|
| 51 | programs where copies of different objects are saved from the same address. |
---|
| 52 | <pre><code> |
---|
| 53 | template<class Archive> |
---|
| 54 | void save(Archive & ar, const unsigned int version) const |
---|
| 55 | { |
---|
| 56 | for(int i = 0; i < 10; ++i){ |
---|
| 57 | A x = a[i]; |
---|
| 58 | ar << x; |
---|
| 59 | } |
---|
| 60 | } |
---|
| 61 | </code></pre> |
---|
| 62 | In this case, the data to be saved exists on the stack. Each iteration |
---|
| 63 | of the loop updates the value on the stack. So although the data changes |
---|
| 64 | each iteration, the address of the data doesn't. If a is an array of |
---|
| 65 | objects being tracked by memory address, the library will skip storing |
---|
| 66 | objects after the first as it will be assumed that objects at the same address |
---|
| 67 | are really the same object. |
---|
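
The effect can be reproduced with a toy model of address-based tracking. The sketch below is not the library's implementation: `toy_oarchive`, `save_through_copy` and `save_directly` are hypothetical names that only mimic the idea of skipping an object whose address has already been seen. The temporary is declared outside the loop so that its address is guaranteed to be the same on every iteration; a loop-local variable usually behaves the same way in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for an output archive that tracks objects by address.
// This is NOT Boost.Serialization code; it only illustrates the idea.
struct toy_oarchive {
    std::set<const void*> seen;   // addresses already written
    std::vector<int> written;     // values that actually reached the "archive"

    void save(const int& x) {
        // An address seen before is assumed to be the same object and is skipped.
        if (seen.insert(&x).second)
            written.push_back(x);
    }
};

// Copies each element into a temporary before saving: every save sees one address.
inline std::vector<int> save_through_copy(const std::vector<int>& a) {
    toy_oarchive ar;
    int x = 0;                    // single storage location reused for every element
    for (std::size_t i = 0; i < a.size(); ++i) {
        x = a[i];                 // the value changes, the address does not
        ar.save(x);
    }
    return ar.written;
}

// Saves the elements in place: each element has its own address.
inline std::vector<int> save_directly(const std::vector<int>& a) {
    toy_oarchive ar;
    for (std::size_t i = 0; i < a.size(); ++i)
        ar.save(a[i]);
    return ar.written;
}
```

In this model, saving three elements through the temporary records only the first one, while saving them in place records all three.
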
| 68 | <p> |
---|
| 69 | To help detect such cases, output archive operators expect to be passed |
---|
| 70 | <code style="white-space: normal">const</code> reference arguments. |
---|
| 71 | <p> |
---|
| 72 | Given this, the above code will invoke a compile time assertion. |
---|
| 73 | The obvious fix in this example is to use |
---|
| 74 | <pre><code> |
---|
| 75 | template<class Archive> |
---|
| 76 | void save(Archive & ar, const unsigned int version) const |
---|
| 77 | { |
---|
| 78 | for(int i = 0; i < 10; ++i){ |
---|
| 79 | ar << a[i]; |
---|
| 80 | } |
---|
| 81 | } |
---|
| 82 | </code></pre> |
---|
| 83 | which will compile and run without problem. |
---|
| 84 | The usage of <code style="white-space: normal">const</code> by the output archive operators |
---|
| 85 | will ensure that the process of serialization doesn't |
---|
| 86 | change the state of the objects being serialized. An attempt to do this |
---|
| 87 | would constitute augmentation of the concept of saving of state with |
---|
| 88 | some sort of non-obvious side effect. This would almost surely be a mistake |
---|
| 89 | and a likely source of very subtle bugs. |
---|
| 90 | <p> |
---|
| 91 | Unfortunately, implementation issues currently prevent the detection of this kind of |
---|
| 92 | error when the data item is wrapped as a name-value pair. |
---|
| 93 | <p> |
---|
| 94 | A similar problem can occur when different objects are loaded to an address |
---|
| 95 | which is different from the final location: |
---|
| 96 | <pre><code> |
---|
| 97 | template<class Archive> |
---|
| 98 | void load(Archive & ar, const unsigned int version) |
---|
| 99 | { |
---|
| 100 | for(int i = 0; i < 10; ++i){ |
---|
| 101 | A x; |
---|
| 102 | ar >> x; |
---|
| 103 | m_set.insert(x); |
---|
| 104 | } |
---|
| 105 | } |
---|
| 106 | </code></pre> |
---|
| 107 | In this case, the address of <code>x</code> is the one that is tracked rather than |
---|
| 108 | the address of the new item added to the set. Left unaddressed |
---|
| 109 | this will break the features that depend on tracking, such as loading objects through a pointer. |
---|
| 110 | Subtle bugs will be introduced into the program. This can be |
---|
| 111 | addressed by altering the above code as follows: |
---|
| 112 | |
---|
| 113 | <pre><code> |
---|
| 114 | template<class Archive> |
---|
| 115 | void load(Archive & ar, const unsigned int version) |
---|
| 116 | { |
---|
| 117 | for(int i = 0; i < 10; ++i){ |
---|
| 118 | A x; |
---|
| 119 | ar >> x; |
---|
| 120 | std::pair<std::set<A>::const_iterator, bool> result; |
---|
| 121 | result = m_set.insert(x); |
---|
| 122 | ar.reset_object_address(& (*result.first), &x); |
---|
| 123 | } |
---|
| 124 | } |
---|
| 125 | </code></pre> |
---|
| 126 | This will adjust the tracking information to reflect the final resting place of |
---|
| 127 | the moved variable and thereby rectify the above problem. |
---|
| 128 | <p> |
---|
| 129 | If it is known a priori that no pointer |
---|
| 130 | values are duplicated, overhead associated with object tracking can |
---|
| 131 | be eliminated by setting the object tracking class serialization trait |
---|
| 132 | appropriately. |
---|
| 133 | <p> |
---|
| 134 | By default, data types designated primitive by |
---|
| 135 | <a target="detail" href="traits.html#level">Implementation Level</a> |
---|
| 136 | class serialization trait are never tracked. If it is desired to |
---|
| 137 | track a shared primitive object through a pointer (e.g. a |
---|
| 138 | <code style="white-space: normal">long</code> used as a reference count), it should be wrapped |
---|
| 139 | in a class/struct so that it is an identifiable type. |
---|
| 140 | The alternative of changing the implementation level of a <code style="white-space: normal">long</code> |
---|
| 141 | would affect all <code style="white-space: normal">long</code>s serialized in the whole |
---|
| 142 | program - probably not what one would intend. |
---|
| 143 | <p> |
---|
| 144 | It is possible that we may want to track addresses even though |
---|
| 145 | the object is never serialized through a pointer. For example, |
---|
| 146 | a virtual base class need be saved/loaded only once. By setting |
---|
| 147 | this serialization trait to <code style="white-space: normal">track_always</code>, we can suppress |
---|
| 148 | redundant save/load operations. |
---|
| 149 | <pre><code> |
---|
| 150 | BOOST_CLASS_TRACKING(my_virtual_base_class, boost::serialization::track_always) |
---|
| 151 | </code></pre> |
---|
| 152 | |
---|
| 153 | <h3><a name="export">Exporting Class Serialization</a></h3> |
---|
| 154 | <a target="detail" href="traits.html#export">Elsewhere</a> in this manual, we have described |
---|
| 155 | <code style="white-space: normal">BOOST_CLASS_EXPORT</code>. This is used to make the serialization library aware |
---|
| 156 | that code should be instantiated for serialization of a given class even though the |
---|
| 157 | class hasn't been otherwise referred to by the program. |
---|
| 158 | <p> |
---|
| 159 | There are several ways <code style="white-space: normal">BOOST_CLASS_EXPORT</code> could have been |
---|
| 160 | implemented. |
---|
| 161 | <p> |
---|
| 162 | One approach would be to instantiate serialization code for all archive classes included in the library. |
---|
| 163 | This would add to each executable a large amount of code that is most likely never called. |
---|
| 164 | Also it would needlessly slow down compilations of any program that uses the library. Finally, |
---|
| 165 | the list of archives would be "built-in" to the library which would complicate the addition of |
---|
| 166 | new or custom archive classes. |
---|
| 167 | <p> |
---|
| 168 | Another approach would be for the library user to explicitly specify the archive classes for which |
---|
| 169 | code should be instantiated for each class to be serialized. Users would have to include |
---|
| 170 | header files corresponding to the archive classes to be instantiated. |
---|
| 171 | The list of instantiated archive classes would have to be manually kept in sync with the |
---|
| 172 | archive class headers actually included. This was considered burdensome and error prone. |
---|
| 173 | <p> |
---|
| 174 | This implementation of <code style="white-space: normal">BOOST_CLASS_EXPORT</code> works in the |
---|
| 175 | following way: |
---|
| 176 | <ul> |
---|
| 177 | <li>All header modules of the form <boost/archive/*archive.hpp> are required to precede |
---|
| 178 | the header module <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>. |
---|
| 179 | <li>The header <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a> |
---|
| 180 | builds a list of archive classes whose header modules have been previously included. |
---|
| 181 | It does this by checking to see which inclusion guard constants have been defined. |
---|
| 182 | <li><code style="white-space: normal">BOOST_CLASS_EXPORT(my_class)</code> explicitly instantiates |
---|
| 183 | serialization code for <code style="white-space: normal">my_class</code> for each archive in the list. |
---|
| 184 | </ul> |
---|
| 185 | Serialization code will be instantiated for a given archive class |
---|
| 186 | if and only if the module that defines that archive class has been included in the program. |
---|
| 187 | Given this, our program will contain all necessary code instantiations and no other. |
---|
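
The guard-checking idea described in the list above can be sketched with the preprocessor alone. This is a toy illustration, not the contents of export.hpp: every `TOY_...` macro name and the function `known_archive_count` are assumptions made up for the example.

```cpp
#include <cassert>

// Pretend these guards were defined by archive headers included earlier.
// The macro names are hypothetical, not the library's real guard names.
#define TOY_TEXT_OARCHIVE_HPP
// TOY_XML_OARCHIVE_HPP stays undefined: that header was never included.

// What an export-style header could do: test each guard it knows about.
#if defined(TOY_TEXT_OARCHIVE_HPP)
#  define TOY_HAS_TEXT_OARCHIVE 1
#else
#  define TOY_HAS_TEXT_OARCHIVE 0
#endif

#if defined(TOY_XML_OARCHIVE_HPP)
#  define TOY_HAS_XML_OARCHIVE 1
#else
#  define TOY_HAS_XML_OARCHIVE 0
#endif

// Number of archive classes this translation unit would instantiate code for.
inline int known_archive_count() {
    return TOY_HAS_TEXT_OARCHIVE + TOY_HAS_XML_OARCHIVE;
}
```

Because the check happens at the point where the export-style header is processed, a guard defined afterwards would go unnoticed, which is why the archive headers have to come first.
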
| 188 | <p> |
---|
| 189 | For many styles of code organization this header sequencing requirement presents little problem. |
---|
| 190 | |
---|
| 191 | Serialization code organized by class headers that are designed to be independent of archive |
---|
| 192 | implementations will look something like the following: |
---|
| 193 | <code><pre> |
---|
| 194 | // A.hpp |
---|
| 195 | // Note: to preserve independence from any particular archive implementation, |
---|
| 196 | // no headers from <boost/archive/...> are included. |
---|
| 197 | // Headers can be included in any order. |
---|
| 198 | #include <boost/serialization/...> |
---|
| 199 | #include <boost/serialization/export.hpp> |
---|
| 200 | ... // include other headers that A depends upon |
---|
| 201 | |
---|
| 202 | class A { |
---|
| 203 | ... |
---|
| 204 | }; |
---|
| 205 | |
---|
| 206 | BOOST_CLASS_EXPORT(A) // note: the export name of this class |
---|
| 207 | </pre></code> |
---|
| 208 | This style: |
---|
| 209 | <ul> |
---|
| 210 | <li>permits the header to include all aspects of the serialization implementation. |
---|
| 211 | <li>permits the header to be included anywhere else as part of some other class declaration. |
---|
| 212 | <li>reflects the concept of headers as a "library of types" which |
---|
| 213 | can be used independently in other programs or other parts of the same program. |
---|
| 214 | <li>reflects a fundamental principle of the serialization library design in that the |
---|
| 215 | specification of serialization of any class is independent of any archive implementation. |
---|
| 216 | </ul> |
---|
| 217 | However, it might not always be possible or convenient to conform to the above style. Something |
---|
| 218 | like the following might be required or preferred: |
---|
| 219 | <code><pre> |
---|
| 220 | // A.hpp |
---|
| 221 | // headers can be included in any order |
---|
| 222 | #include <boost/archive/text_oarchive.hpp> |
---|
| 223 | #include <boost/archive/text_iarchive.hpp> |
---|
| 224 | ... |
---|
| 225 | #include <boost/serialization/...> |
---|
| 226 | ... |
---|
| 227 | // can't do the following because then A.hpp couldn't be included somewhere else |
---|
| 228 | // #include <boost/serialization/export.hpp> |
---|
| 229 | |
---|
| 230 | class A { |
---|
| 231 | ... |
---|
| 232 | }; |
---|
| 233 | // can't do the following because export.hpp is not included !! |
---|
| 234 | //BOOST_CLASS_EXPORT(A) // note: the export name of this class |
---|
| 235 | </pre></code> |
---|
| 236 | As noted in the comments, this would work. But |
---|
| 237 | <code style="white-space: normal">#include <.../export.hpp></code> can't be used |
---|
| 238 | without conflicting with other modules which use |
---|
| 239 | <code style="white-space: normal">#include <.../*archive.hpp></code>. In this |
---|
| 240 | case we can move the export to an implementation file: |
---|
| 241 | <code><pre> |
---|
| 242 | // A.cpp |
---|
| 243 | #include "A.hpp" |
---|
| 244 | ... |
---|
| 245 | // export.hpp header should be last; |
---|
| 246 | #include <boost/serialization/export.hpp> |
---|
| 247 | ... |
---|
| 248 | BOOST_CLASS_EXPORT(A) |
---|
| 249 | ... |
---|
| 250 | </pre></code> |
---|
| 251 | |
---|
| 252 | <h3><a name="classinfo">Class Information</a></h3> |
---|
| 253 | By default, for each class serialized, class information is written to the archive. |
---|
| 254 | This information includes version number, implementation level and tracking |
---|
| 255 | behavior. This is necessary so that the archive can be correctly |
---|
| 256 | deserialized even if a subsequent version of the program changes |
---|
| 257 | some of the current trait values for a class. The space overhead for |
---|
| 258 | this data is minimal. There is a little bit of runtime overhead |
---|
| 259 | since each class has to be checked to see if it has already had its |
---|
| 260 | class information included in the archive. In some cases, even this |
---|
| 261 | might be considered too much. This extra overhead can be eliminated |
---|
| 262 | by setting the |
---|
| 263 | <a target="detail" href="traits.html#level">implementation level</a> |
---|
| 264 | class trait to: <code style="white-space: normal">boost::serialization::object_serializable</code>. |
---|
| 265 | <p> |
---|
| 266 | <i>Turning off tracking and class information serialization will result |
---|
| 267 | in pure template inline code that in principle could be optimised down |
---|
| 268 | to a simple stream write/read.</i> Elimination of all serialization overhead |
---|
| 269 | in this manner comes at a cost. Once archives are released to users, the |
---|
| 270 | class serialization traits cannot be changed without invalidating the old |
---|
| 271 | archives. Including the class information in the archive assures us |
---|
| 272 | that they will be readable in the future even if the class definition |
---|
| 273 | is revised. A lightweight structure such as a display pixel might be |
---|
| 274 | declared in a header like this: |
---|
| 275 | |
---|
| 276 | <pre><code> |
---|
| 277 | #include <boost/serialization/serialization.hpp> |
---|
| 278 | #include <boost/serialization/level.hpp> |
---|
| 279 | #include <boost/serialization/tracking.hpp> |
---|
| 280 | |
---|
| 281 | // a pixel is a light weight struct which is used in great numbers. |
---|
| 282 | struct pixel |
---|
| 283 | { |
---|
| 284 | unsigned char red, green, blue; |
---|
| 285 | template<class Archive> |
---|
| 286 | void serialize(Archive & ar, const unsigned int /* version */){ |
---|
| 287 | ar & red & green & blue; |
---|
| 288 | } |
---|
| 289 | }; |
---|
| 290 | |
---|
| 291 | // eliminate serialization overhead at the cost of |
---|
| 292 | // never being able to increase the version. |
---|
| 293 | BOOST_CLASS_IMPLEMENTATION(pixel, boost::serialization::object_serializable) |
---|
| 294 | |
---|
| 295 | // eliminate object tracking (even if serialized through a pointer) |
---|
| 296 | // at the risk of a programming error creating duplicate objects. |
---|
| 297 | BOOST_CLASS_TRACKING(pixel, boost::serialization::track_never) |
---|
| 298 | </code></pre> |
---|
| 299 | |
---|
| 300 | <h3><a name="portability">Archive Portability</a></h3> |
---|
| 301 | Several archive classes create their data in the form of text or a portable binary format. |
---|
| 302 | It should be possible to save an archive of such a class on one platform and load it on another. |
---|
| 303 | This is subject to a couple of conditions. |
---|
| 304 | <h4><a name="numerics">Numerics</a></h4> |
---|
| 305 | The architecture of the machine reading the archive must be able to hold the data |
---|
| 306 | saved. For example, the gcc compiler reserves 4 bytes to store a variable of type |
---|
| 307 | <code style="white-space: normal">wchar_t</code> while other compilers reserve only 2 bytes. |
---|
| 308 | So it's possible that a value could be written that couldn't be represented by the loading program. This is a |
---|
| 309 | fairly obvious situation and easily handled by using the numeric types in |
---|
| 310 | <a target="cstding" href="../../../boost/cstdint.hpp"><boost/cstdint.hpp></a>. |
---|
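
A minimal sketch of the point about fixed-width numeric types. It uses the standard `<cstdint>` header as a stand-in for `<boost/cstdint.hpp>`, assuming a C++11 compiler; `portable_record` and `fits_in_16_bits` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical record: the widths are fixed by the type names, so the saved
// representation does not depend on how wide wchar_t or int happens to be.
struct portable_record {
    std::uint32_t code_point;  // wide enough for a 4-byte wchar_t value
    std::int16_t  offset;
};

// True if `value` can be represented by a loading program whose character
// type is only 16 bits wide.
inline bool fits_in_16_bits(std::uint32_t value) {
    return value <= static_cast<std::uint32_t>(
                        std::numeric_limits<std::uint16_t>::max());
}
```

A saving program could run such a check before writing, so that an unrepresentable value is caught at the source rather than when the archive is loaded elsewhere.
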
| 311 | |
---|
| 312 | <h4><a name="traits">Traits</a></h4> |
---|
| 313 | Another potential problem is illustrated by the following example: |
---|
| 314 | <pre><code> |
---|
| 315 | template<class T> |
---|
| 316 | struct my_wrapper { |
---|
| 317 | template<class Archive> |
---|
| 318 | Archive & serialize ... |
---|
| 319 | }; |
---|
| 320 | |
---|
| 321 | ... |
---|
| 322 | |
---|
| 323 | class my_class { |
---|
| 324 | wchar_t a; |
---|
| 325 | short unsigned b; |
---|
| 326 | template<class Archive> |
---|
| 327 | Archive & serialize(Archive & ar, unsigned int version){ |
---|
| 328 | ar & my_wrapper(a); |
---|
| 329 | ar & my_wrapper(b); |
---|
| 330 | } |
---|
| 331 | }; |
---|
| 332 | </code></pre> |
---|
| 333 | If <code style="white-space: normal">my_wrapper</code> uses default serialization |
---|
| 334 | traits there could be a problem. With the default traits, each time a new type is |
---|
| 335 | added to the archive, bookkeeping information is added. So in this example, the |
---|
| 336 | archive would include such bookkeeping information for |
---|
| 337 | <code style="white-space: normal">my_wrapper<wchar_t></code> and for |
---|
| 338 | <code style="white-space: normal">my_wrapper<unsigned short></code>. |
---|
| 339 | Or would it? What about compilers that treat |
---|
| 340 | <code style="white-space: normal">wchar_t</code> as a |
---|
| 341 | synonym for <code style="white-space: normal">unsigned short</code>? |
---|
| 342 | In this case there is only one distinct type - not two. If archives are passed between |
---|
| 343 | programs with compilers that differ in their treatment |
---|
| 344 | of <code style="white-space: normal">wchar_t</code> the load operation will fail |
---|
| 345 | in a catastrophic way. |
---|
| 346 | <p> |
---|
| 347 | One remedy for this is to assign serialization traits to the template |
---|
| 348 | <code style="white-space: normal">my_wrapper</code> such that class |
---|
| 349 | information for instantiations of this template is never serialized. This |
---|
| 350 | process is described <a target="detail" href="traits.html#templates">above</a> and |
---|
| 351 | has been used for <a target="detail" href="wrappers.html#nvp"><strong>Name-Value Pairs</strong></a>. |
---|
| 352 | Wrappers would typically be assigned such traits. |
---|
| 353 | <p> |
---|
| 354 | Another way to avoid this problem is to assign serialization traits |
---|
| 355 | to all specializations of the template <code style="white-space: normal">my_wrapper</code> |
---|
| 356 | for all primitive types so that class information is never saved. This is what has |
---|
| 357 | been done for our implementation of serializations for STL collections. |
---|
| 358 | |
---|
| 359 | <h3><a name="binary_archives">Binary Archives</a></h3> |
---|
| 360 | Standard stream i/o on some systems will expand linefeed characters to carriage-return/linefeed |
---|
| 361 | on output. This creates a problem for binary archives. The easiest way to handle this is to |
---|
| 362 | open streams for binary archives in "binary mode" by using the flag |
---|
| 363 | <code style="white-space: normal">ios::binary</code>. If this is not done, the archive generated |
---|
| 364 | will be unreadable. |
---|
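
A minimal sketch of opening streams in binary mode, using only standard streams and no archive classes. The helper names `write_raw` and `read_raw` are assumptions for this example; the same open-mode flags would apply to a stream handed to a binary archive.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <iterator>
#include <string>

// Writes the bytes of `data` to `path`. std::ios::binary disables any
// newline translation the platform would otherwise apply on output.
inline void write_raw(const std::string& path, const std::string& data) {
    std::ofstream os(path.c_str(),
                     std::ios::out | std::ios::binary | std::ios::trunc);
    os.write(data.data(), static_cast<std::streamsize>(data.size()));
}

// Reads the whole file back, again in binary mode.
inline std::string read_raw(const std::string& path) {
    std::ifstream is(path.c_str(), std::ios::in | std::ios::binary);
    return std::string((std::istreambuf_iterator<char>(is)),
                       std::istreambuf_iterator<char>());
}
```

With both streams opened this way, a payload containing linefeed, carriage-return and zero bytes is read back byte for byte.
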
| 365 | <p> |
---|
| 366 | Unfortunately, no way has been found to detect this error before loading the archive. Debug builds |
---|
| 367 | will assert when this is detected so that may be helpful in catching this error. |
---|
| 368 | |
---|
| 369 | <h3><a name="xml_archives">XML Archives</a></h3> |
---|
| 370 | XML archives present a somewhat special case. |
---|
| 371 | XML format has a nested structure that maps well to the "recursive class member visitor" pattern |
---|
| 372 | used by the serialization system. However, XML differs from other formats in that it |
---|
| 373 | requires a name for each data member. Our goal is to add this information to the |
---|
| 374 | class serialization specification while still permitting the serialization code to be |
---|
| 375 | used with any archive. This is achieved by requiring that all data serialized to an XML archive |
---|
| 376 | be serialized as a <a target="detail" href="wrappers.html#nvp">name-value pair</a>. |
---|
| 377 | The first member is the name to be used as the XML tag for the |
---|
| 378 | data item while the second is a reference to the data item itself. Any attempt to serialize data |
---|
| 379 | not wrapped in a <a target="detail" href="wrappers.html#nvp">name-value pair</a> will |
---|
| 380 | be trapped at compile time. The system is implemented in such a way that for other archive classes, |
---|
| 381 | just the value portion of the data is serialized. The name portion is discarded during compilation. |
---|
| 382 | So by always using <a target="detail" href="wrappers.html#nvp">name-value pairs</a>, it will |
---|
| 383 | be guaranteed that all data can be serialized to all archive classes with maximum efficiency. |
---|
| 384 | |
---|
| 385 | <h3><a href="exceptions.html">Archive Exceptions</a></h3> |
---|
| 386 | <h3><a href="exception_safety.html">Exception Safety</a></h3> |
---|
| 387 | |
---|
| 388 | <hr> |
---|
| 389 | <p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004. |
---|
| 390 | Distributed under the Boost Software License, Version 1.0. (See |
---|
| 391 | accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) |
---|
| 392 | </i></p> |
---|
| 393 | </body> |
---|
| 394 | </html> |
---|