1 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" |
---|
2 | "http://www.w3.org/TR/REC-html40/strict.dtd"> |
---|
3 | |
---|
4 | <title>Boost.Python Pickle Support</title> |
---|
5 | |
---|
6 | <div> |
---|
7 | |
---|
8 | <img src="../../../../boost.png" |
---|
9 | alt="boost.png (6897 bytes)" |
---|
10 | align="center" |
---|
11 | width="277" height="86"> |
---|
12 | |
---|
13 | <hr> |
---|
14 | <h1>Boost.Python Pickle Support</h1> |
---|
15 | |
---|
16 | Pickle is a Python module for object serialization, also known |
---|
17 | as persistence, marshalling, or flattening. |
---|
18 | |
---|
19 | <p> |
---|
20 | It is often necessary to save and restore the contents of an object to |
---|
21 | a file. One approach to this problem is to write a pair of functions |
---|
22 | that write data to and read data from a file in a special format. A powerful |
---|
23 | alternative approach is to use Python's pickle module. Exploiting |
---|
24 | Python's ability for introspection, the pickle module recursively |
---|
25 | converts nearly arbitrary Python objects into a stream of bytes that |
---|
26 | can be written to a file. |
---|
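As a minimal sketch of the round trip described above, the following uses only the standard library and built-in objects. The data in <tt>record</tt> is made up for illustration, and an in-memory <tt>io.BytesIO</tt> buffer stands in for a real file.

```python
import io
import pickle

# Nested built-in objects: the pickler walks the structure recursively.
record = {"country": "Norway", "cities": ["Oslo", "Bergen"], "codes": (47, "NO")}

# io.BytesIO stands in for a file opened in binary mode.
buffer = io.BytesIO()
pickle.dump(record, buffer)

# Rewind and read the byte stream back into a new, equal object.
buffer.seek(0)
restored = pickle.load(buffer)
print(restored == record, restored is record)
```

The restored object compares equal to the original but is a separate copy.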
27 | |
---|
28 | <p> |
---|
29 | The Boost Python Library supports the pickle module |
---|
30 | through the interface as described in detail in the |
---|
31 | <a href="http://www.python.org/doc/current/lib/module-pickle.html" |
---|
32 | >Python Library Reference for pickle.</a> This interface |
---|
33 | involves the special methods <tt>__getinitargs__</tt>, |
---|
34 | <tt>__getstate__</tt> and <tt>__setstate__</tt> as described |
---|
35 | in the following. Note that Boost.Python is also fully compatible |
---|
36 | with Python's cPickle module. |
---|
37 | |
---|
38 | <hr> |
---|
39 | <h2>The Boost.Python Pickle Interface</h2> |
---|
40 | |
---|
41 | At the user level, the Boost.Python pickle interface involves three special |
---|
42 | methods: |
---|
43 | |
---|
44 | <dl> |
---|
45 | <dt> |
---|
46 | <strong><tt>__getinitargs__</tt></strong> |
---|
47 | <dd> |
---|
48 | When an instance of a Boost.Python extension class is pickled, the |
---|
49 | pickler tests if the instance has a <tt>__getinitargs__</tt> method. |
---|
50 | This method must return a Python tuple (it is most convenient to use |
---|
51 | a boost::python::tuple). When the instance is restored by the |
---|
52 | unpickler, the contents of this tuple are used as the arguments for |
---|
53 | the class constructor. |
---|
54 | |
---|
55 | <p> |
---|
56 | If <tt>__getinitargs__</tt> is not defined, <tt>pickle.load</tt> |
---|
57 | will call the constructor (<tt>__init__</tt>) without arguments; |
---|
58 | i.e., the object must be default-constructible. |
---|
59 | |
---|
60 | <p> |
---|
61 | <dt> |
---|
62 | <strong><tt>__getstate__</tt></strong> |
---|
63 | |
---|
64 | <dd> |
---|
65 | When an instance of a Boost.Python extension class is pickled, the |
---|
66 | pickler tests if the instance has a <tt>__getstate__</tt> method. |
---|
67 | This method should return a Python object representing the state of |
---|
68 | the instance. |
---|
69 | |
---|
70 | <p> |
---|
71 | <dt> |
---|
72 | <strong><tt>__setstate__</tt></strong> |
---|
73 | |
---|
74 | <dd> |
---|
75 | When an instance of a Boost.Python extension class is restored by the |
---|
76 | unpickler (<tt>pickle.load</tt>), it is first constructed using the |
---|
77 | result of <tt>__getinitargs__</tt> as arguments (see above). Subsequently, |
---|
78 | the unpickler tests if the new instance has a <tt>__setstate__</tt> |
---|
79 | method. If so, this method is called with the result of |
---|
80 | <tt>__getstate__</tt> (a Python object) as the argument. |
---|
81 | |
---|
82 | </dl> |
---|
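The interplay of the three methods can be sketched in pure Python. This is only an emulation under stated assumptions: the class <tt>World</tt> and its attributes are hypothetical stand-ins for a wrapped C++ class, and the <tt>__reduce__</tt> method is written here for illustration because Python 3's pickle does not look for <tt>__getinitargs__</tt> on ordinary classes by itself. It is not Boost.Python's actual code.

```python
import pickle


class World:
    """Hypothetical stand-in for a wrapped extension class."""

    def __init__(self, country):
        self.country = country
        self.secret_number = 0

    def __getinitargs__(self):
        # Tuple that is passed to the constructor on unpickling.
        return (self.country,)

    def __getstate__(self):
        # State that the constructor alone cannot restore.
        return (self.secret_number,)

    def __setstate__(self, state):
        (self.secret_number,) = state

    def __reduce__(self):
        # Emulation only: tie the three methods together for pickle.
        return (type(self), self.__getinitargs__(), self.__getstate__())


w = World("Norway")
w.secret_number = 42
restored = pickle.loads(pickle.dumps(w))
print(restored.country, restored.secret_number)
```

On loading, the constructor is called with the saved arguments first, and <tt>__setstate__</tt> then receives the saved state.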
83 | |
---|
84 | The three special methods described above may be <tt>.def()</tt>'ed |
---|
85 | individually by the user. However, Boost.Python provides an easy-to-use |
---|
86 | high-level interface via the |
---|
87 | <strong><tt>boost::python::pickle_suite</tt></strong> class that also |
---|
88 | enforces consistency: <tt>__getstate__</tt> and <tt>__setstate__</tt> |
---|
89 | must be defined as a pair. Use of this interface is demonstrated by the |
---|
90 | following examples. |
---|
91 | |
---|
92 | <hr> |
---|
93 | <h2>Examples</h2> |
---|
94 | |
---|
95 | There are three files in |
---|
96 | <tt>boost/libs/python/test</tt> that show how to |
---|
97 | provide pickle support. |
---|
98 | |
---|
99 | <hr> |
---|
100 | <h3><a href="../../test/pickle1.cpp"><tt>pickle1.cpp</tt></a></h3> |
---|
101 | |
---|
102 | The C++ class in this example can be fully restored by passing the |
---|
103 | appropriate argument to the constructor. Therefore it is sufficient |
---|
104 | to define the pickle interface method <tt>__getinitargs__</tt>. |
---|
105 | This is done in the following way: |
---|
106 | |
---|
107 | <ul> |
---|
108 | <li>1. Definition of the C++ pickle function: |
---|
109 | <pre> |
---|
110 | struct world_pickle_suite : boost::python::pickle_suite |
---|
111 | { |
---|
112 |   static |
---|
113 |   boost::python::tuple |
---|
114 |   getinitargs(world const& w) |
---|
115 |   { |
---|
116 |     return boost::python::make_tuple(w.get_country()); |
---|
117 |   } |
---|
118 | }; |
---|
119 | </pre> |
---|
120 | <li>2. Establishing the Python binding: |
---|
121 | <pre> |
---|
122 | class_<world>("world", init<const std::string&>()) |
---|
123 |   // ... |
---|
124 |   .def_pickle(world_pickle_suite()) |
---|
125 |   // ... |
---|
126 | </pre> |
---|
127 | </ul> |
---|
128 | |
---|
129 | <hr> |
---|
130 | <h3><a href="../../test/pickle2.cpp"><tt>pickle2.cpp</tt></a></h3> |
---|
131 | |
---|
132 | The C++ class in this example contains member data that cannot be |
---|
133 | restored by any of the constructors. Therefore it is necessary to |
---|
134 | provide the <tt>__getstate__</tt>/<tt>__setstate__</tt> pair of |
---|
135 | pickle interface methods: |
---|
136 | |
---|
137 | <ul> |
---|
138 | <li>1. Definition of the C++ pickle functions: |
---|
139 | <pre> |
---|
140 | struct world_pickle_suite : boost::python::pickle_suite |
---|
141 | { |
---|
142 |   static |
---|
143 |   boost::python::tuple |
---|
144 |   getinitargs(const world& w) |
---|
145 |   { |
---|
146 |     // ... |
---|
147 |   } |
---|
148 | |
---|
149 |   static |
---|
150 |   boost::python::tuple |
---|
151 |   getstate(const world& w) |
---|
152 |   { |
---|
153 |     // ... |
---|
154 |   } |
---|
155 | |
---|
156 |   static |
---|
157 |   void |
---|
158 |   setstate(world& w, boost::python::tuple state) |
---|
159 |   { |
---|
160 |     // ... |
---|
161 |   } |
---|
162 | }; |
---|
163 | </pre> |
---|
164 | <li>2. Establishing the Python bindings for the entire suite: |
---|
165 | <pre> |
---|
166 | class_<world>("world", init<const std::string&>()) |
---|
167 |   // ... |
---|
168 |   .def_pickle(world_pickle_suite()) |
---|
169 |   // ... |
---|
170 | </pre> |
---|
171 | </ul> |
---|
172 | |
---|
173 | <p> |
---|
174 | For simplicity, the <tt>__dict__</tt> is not included in the result |
---|
175 | of <tt>__getstate__</tt>. This is not generally recommended, but it is a |
---|
176 | valid approach if it is anticipated that the object's |
---|
177 | <tt>__dict__</tt> will always be empty. Note that the safety guard |
---|
178 | described below will catch the cases where this assumption is violated. |
---|
179 | |
---|
180 | <hr> |
---|
181 | <h3><a href="../../test/pickle3.cpp"><tt>pickle3.cpp</tt></a></h3> |
---|
182 | |
---|
183 | This example is similar to <a |
---|
184 | href="../../test/pickle2.cpp"><tt>pickle2.cpp</tt></a>. However, the |
---|
185 | object's <tt>__dict__</tt> is included in the result of |
---|
186 | <tt>__getstate__</tt>. This requires a little more code but is |
---|
187 | unavoidable if the object's <tt>__dict__</tt> is not always empty. |
---|
188 | |
---|
189 | <hr> |
---|
190 | <h2>Pitfall and Safety Guard</h2> |
---|
191 | |
---|
192 | The pickle protocol described above has an important pitfall that the |
---|
193 | end user of a Boost.Python extension module might not be aware of: |
---|
194 | <p> |
---|
195 | <strong> |
---|
196 | <tt>__getstate__</tt> is defined and the instance's <tt>__dict__</tt> |
---|
197 | is not empty. |
---|
198 | </strong> |
---|
199 | <p> |
---|
200 | |
---|
201 | The author of a Boost.Python extension class might provide a |
---|
202 | <tt>__getstate__</tt> method without considering the possibilities |
---|
203 | that: |
---|
204 | |
---|
205 | <p> |
---|
206 | <ul> |
---|
207 | <li> |
---|
208 | the class is used in Python as a base class. Most likely the |
---|
209 | <tt>__dict__</tt> of instances of the derived class needs to be |
---|
210 | pickled in order to restore the instances correctly. |
---|
211 | |
---|
212 | <p> |
---|
213 | <li> |
---|
214 | the user adds items to the instance's <tt>__dict__</tt> directly. |
---|
215 | Again, the <tt>__dict__</tt> of the instance then needs to be |
---|
216 | pickled. |
---|
217 | |
---|
218 | </ul> |
---|
219 | <p> |
---|
220 | |
---|
221 | To alert the user to this easily overlooked problem, a safety guard is |
---|
222 | provided. If <tt>__getstate__</tt> is defined and the instance's |
---|
223 | <tt>__dict__</tt> is not empty, Boost.Python tests if the class has |
---|
224 | an attribute <tt>__getstate_manages_dict__</tt>. An exception is |
---|
225 | raised if this attribute is not defined: |
---|
226 | |
---|
227 | <pre> |
---|
228 | RuntimeError: Incomplete pickle support (__getstate_manages_dict__ not set) |
---|
229 | </pre> |
---|
230 | |
---|
231 | To resolve this problem, it should first be established that the |
---|
232 | <tt>__getstate__</tt> and <tt>__setstate__</tt> methods manage the |
---|
233 | instance's <tt>__dict__</tt> correctly. Note that this can be done |
---|
234 | either at the C++ or the Python level. Finally, the safety guard |
---|
235 | should intentionally be overridden. For example, in C++ (from |
---|
236 | <a href="../../test/pickle3.cpp"><tt>pickle3.cpp</tt></a>): |
---|
237 | |
---|
238 | <pre> |
---|
239 | struct world_pickle_suite : boost::python::pickle_suite |
---|
240 | { |
---|
241 |   // ... |
---|
242 | |
---|
243 |   static bool getstate_manages_dict() { return true; } |
---|
244 | }; |
---|
245 | </pre> |
---|
246 | |
---|
247 | Alternatively in Python: |
---|
248 | |
---|
249 | <pre> |
---|
250 | import your_bpl_module |
---|
251 | class your_class(your_bpl_module.your_class): |
---|
252 |     __getstate_manages_dict__ = 1 |
---|
253 |     def __getstate__(self): |
---|
254 |         pass  # your code here |
---|
255 |     def __setstate__(self, state): |
---|
256 |         pass  # your code here |
---|
257 | </pre> |
---|
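To make the pitfall and its resolution concrete, here is a pure-Python sketch. Everything in it is an assumption made for illustration: <tt>Base</tt> is a hypothetical stand-in for a wrapped class (its slots play the role of C++ member data), and its <tt>__reduce__</tt> merely emulates the safety guard described above. <tt>Derived</tt> shows one way the "your code here" placeholders might be filled in so that the <tt>__dict__</tt> is managed.

```python
import pickle


class Base:
    """Hypothetical stand-in for a wrapped extension class."""

    # The slots mimic C++ member data; __dict__ holds Python-level attributes.
    __slots__ = ("_country", "_secret", "__dict__")

    def __init__(self, country):
        self._country = country
        self._secret = 0

    def __getinitargs__(self):
        return (self._country,)

    def __getstate__(self):
        return (self._secret,)  # ignores the instance __dict__

    def __setstate__(self, state):
        (self._secret,) = state

    def __reduce__(self):
        # Emulated safety guard (illustration only).
        manages = getattr(type(self), "__getstate_manages_dict__", False)
        if self.__dict__ and not manages:
            raise RuntimeError(
                "Incomplete pickle support (__getstate_manages_dict__ not set)")
        return (type(self), self.__getinitargs__(), self.__getstate__())


class Derived(Base):
    __getstate_manages_dict__ = 1

    def __getstate__(self):
        # Save the instance __dict__ alongside the base-class state.
        return (self.__dict__, Base.__getstate__(self))

    def __setstate__(self, state):
        instance_dict, base_state = state
        self.__dict__.update(instance_dict)
        Base.__setstate__(self, base_state)


plain = Base("Norway")
plain.note = "added from Python"
try:
    pickle.dumps(plain)
    guard_fired = False
except RuntimeError:
    guard_fired = True

d = Derived("Norway")
d._secret = 7
d.note = "added from Python"
restored = pickle.loads(pickle.dumps(d))
```

The <tt>Base</tt> instance trips the guard because its <tt>__dict__</tt> is not empty, while the <tt>Derived</tt> instance round-trips with both its slot data and its added attribute intact.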
258 | |
---|
259 | <hr> |
---|
260 | <h2>Practical Advice</h2> |
---|
261 | |
---|
262 | <ul> |
---|
263 | <li> |
---|
264 | In Boost.Python extension modules with many extension classes, |
---|
265 | providing complete pickle support for all classes would be a |
---|
266 | significant overhead. In general, complete pickle support should |
---|
267 | only be implemented for extension classes that will eventually |
---|
268 | be pickled. |
---|
269 | |
---|
270 | <p> |
---|
271 | <li> |
---|
272 | Avoid using <tt>__getstate__</tt> if the instance can also be |
---|
273 | reconstructed by way of <tt>__getinitargs__</tt>. This automatically |
---|
274 | avoids the pitfall described above. |
---|
275 | |
---|
276 | <p> |
---|
277 | <li> |
---|
278 | If <tt>__getstate__</tt> is required, include the instance's |
---|
279 | <tt>__dict__</tt> in the Python object that is returned. |
---|
280 | |
---|
281 | </ul> |
---|
282 | |
---|
283 | <hr> |
---|
284 | <h2>Light-weight alternative: pickle support implemented in Python</h2> |
---|
285 | |
---|
286 | <h3><a href="../../test/pickle4.cpp"><tt>pickle4.cpp</tt></a></h3> |
---|
287 | |
---|
288 | The <tt>pickle4.cpp</tt> example demonstrates an alternative technique |
---|
289 | for implementing pickle support. First we direct Boost.Python via |
---|
290 | the <tt>class_::enable_pickling()</tt> member function to define only |
---|
291 | the basic attributes required for pickling: |
---|
292 | |
---|
293 | <pre> |
---|
294 | class_<world>("world", init<const std::string&>()) |
---|
295 |   // ... |
---|
296 |   .enable_pickling() |
---|
297 |   // ... |
---|
298 | </pre> |
---|
299 | |
---|
300 | This enables the standard Python pickle interface as described |
---|
301 | in the Python documentation. By "injecting" a |
---|
302 | <tt>__getinitargs__</tt> method into the definition of the wrapped |
---|
303 | class we make all instances pickleable: |
---|
304 | |
---|
305 | <pre> |
---|
306 | # import the wrapped world class |
---|
307 | from pickle4_ext import world |
---|
308 | |
---|
309 | # definition of __getinitargs__ |
---|
310 | def world_getinitargs(self): |
---|
311 |     return (self.get_country(),) |
---|
312 | |
---|
313 | # now inject __getinitargs__ (Python is a dynamic language!) |
---|
314 | world.__getinitargs__ = world_getinitargs |
---|
315 | </pre> |
---|
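The injection technique can be tried without building an extension module. In the sketch below, the pure-Python class <tt>World</tt> is a hypothetical stand-in for the wrapped class, and its <tt>__reduce__</tt> only emulates, as an assumption for illustration, a class that looks for <tt>__getinitargs__</tt> at pickling time. The injection itself is the same assignment as in the example above.

```python
import pickle


class World:
    """Hypothetical stand-in for the wrapped world class."""

    def __init__(self, country):
        self._country = country

    def get_country(self):
        return self._country

    def __reduce__(self):
        # Emulation only: use __getinitargs__ if it has been provided.
        getinitargs = getattr(self, "__getinitargs__", None)
        args = getinitargs() if getinitargs is not None else ()
        return (type(self), args)


# definition of __getinitargs__, outside the class body
def world_getinitargs(self):
    return (self.get_country(),)


# inject the method into the existing class
World.__getinitargs__ = world_getinitargs

restored = pickle.loads(pickle.dumps(World("Norway")))
print(restored.get_country())
```

Because the lookup happens on each pickling call, instances created before the injection benefit from it as well.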
316 | |
---|
317 | See also the |
---|
318 | <a href="../tutorial/doc/html/python/techniques.html#python.extending_wrapped_objects_in_python" |
---|
319 | >tutorial section</a> on injecting additional methods from Python. |
---|
320 | |
---|
321 | <hr> |
---|
322 | |
---|
323 | © Copyright Ralf W. Grosse-Kunstleve 2001-2004. Distributed under |
---|
324 | the Boost Software License, Version 1.0. (See accompanying file |
---|
325 | LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) |
---|
326 | |
---|
327 | <p> |
---|
328 | Updated: Feb 2004. |
---|
329 | </div> |
---|