source: downloads/boost_1_33_1/libs/program_options/doc/overview.xml @ 12

Last change on this file since 12 was 12, checked in by landauf, 18 years ago
added boost
File size: 23.8 KB

1	<?xml version="1.0" standalone="yes"?>
2	<!DOCTYPE library PUBLIC "-//Boost//DTD BoostBook XML V1.0//EN"
3	"http://www.boost.org/tools/boostbook/dtd/boostbook.dtd"
4	[
5	<!ENTITY % entities SYSTEM "program_options.ent" >
6	%entities;
7	]>
8	<section id="program_options.overview">
9	<title>Library Overview</title>
10
11	<para>In the tutorial section, we saw several examples of library usage.
12	Here we will describe the overall library design including the primary
13	components and their function.
14	</para>
15
16	<para>The library has three main components:
17	<itemizedlist>
18	<listitem>
19	<para>The options description component, which describes the allowed options
20	and what to do with the values of the options.
21	</para>
22	</listitem>
23	<listitem>
24	<para>The parsers component, which uses this information to find option names
25	and values in the input sources and return them.
26	</para>
27	</listitem>
28	<listitem>
29	<para>The storage component, which provides the
30	interface to access the value of an option. It also converts the string
31	representation of values that parsers return into desired C++ types.
32	</para>
33	</listitem>
34	</itemizedlist>
35	</para>
36
37	<para>To be a little more concrete, the <code>options_description</code>
38	class is from the options description component, the
39	<code>parse_command_line</code> function is from the parsers component, and the
40	<code>variables_map</code> class is from the storage component. </para>
41
42	<para>In the tutorial we've learned how those components can be used by the
43	<code>main</code> function to parse the command line and config
44	file. Before going into the details of each component, a few notes about
45	the world outside of <code>main</code>.
46	</para>
47
48	<para>
49	For that outside world, the storage component is the most important. It
50	provides a class which stores all option values and that class can be
51	freely passed around your program to modules which need access to the
52	options. All the other components can be used only in the place where
53	the actual parsing is done. However, it might also make sense for the
54	individual program modules to describe their options and pass them to the
55	main module, which will merge all options. Of course, this is only
56	important when the number of options is large and declaring them in one
57	place becomes troublesome.
58	</para>
59
60	<!--
61	<para>The design looks very simple and straight-forward, but it is worth
62	noting some important points:
63	<itemizedlist>
64	<listitem>
65	<para>The options description is not tied to specific source. Once
66	options are described, all parsers can use that description.</para>
67	</listitem>
68	<listitem>
69	<para>The parsers are intended to be fairly dumb. They just
70	split the input into (name, value) pairs, using strings to represent
71	names and values. No meaningful processing of values is done.
72	</para>
73	</listitem>
74	<listitem>
75	<para>The storage component is focused on storing options values. It
76	</para>
77	</listitem>
78
79
80	</itemizedlist>
81
82	</para>
83	-->
84
85	<section>
86	<title>Options Description Component</title>
87
88	<para>The options description component has three main classes:
89	&option_description;, &value_semantic; and &options_description;. The
90	first two together describe a single option. The &option_description;
91	class contains the option's name, description and a pointer to &value_semantic;,
92	which, in turn, knows the type of the option's value and can parse the value,
93	apply the default value, and so on. The &options_description; class is a
94	container for instances of &option_description;.
95	</para>
96
97	<para>For almost every library, those classes could be created in a
98	conventional way: that is, you'd create new options using constructors and
99	then call the <code>add</code> method of &options_description;. However,
100	that's overly verbose for declaring 20 or 30 options. This concern led
101	to the creation of the syntax that you've already seen:
102	<programlisting>
103	options_description desc;
104	desc.add_options()
105	("help", "produce help")
106	("optimization", value<int>()->default_value(10), "optimization level")
107	;
108	</programlisting>
109	</para>
110
111	<para>The call to the <code>value</code> function creates an instance of
112	a class derived from the <code>value_semantic</code> class: <code>typed_value</code>.
113	That class contains the code to parse
114	values of a specific type, and contains a number of methods which can be
115	called by the user to specify additional information. (This
116	essentially emulates named parameters of the constructor.) Calls to
117	<code>operator()</code> on the object returned by <code>add_options</code>
118	forward arguments to the constructor of the <code>option_description</code>
119	class and add the new instance.
120	</para>
121
122	<para>
123	Note that in addition to the
124	<code>value</code> function, the library provides the <code>bool_switch</code>
125	function, and users can write their own functions which return
126	other subclasses of <code>value_semantic</code> with
127	different behaviour. For the remainder of this section, we'll talk only
128	about the <code>value</code> function.
129	</para>
130
131	<para>The information about an option is divided into syntactic and
132	semantic. Syntactic information includes the name of the option and the
133	number of tokens which can be used to specify the value. This
134	information is used by parsers to group tokens into (name, value) pairs,
135	where value is just a vector of strings
136	(<code>std::vector<std::string></code>). The semantic layer
137	is responsible for converting the value of the option into more usable C++
138	types.
139	</para>
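<para>The split between the two layers can be illustrated with a minimal
sketch in plain standard C++. It is not the library's implementation: the
helper name <code>convert_single_token</code> is hypothetical, and it only
shows the idea of turning the vector of strings produced by the syntactic
layer into a typed value.</para>

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): converts the tokens that a
// parser grouped under one option name into a value of type T.
template <typename T>
T convert_single_token(const std::vector<std::string>& tokens)
{
    // The syntactic layer decides how many tokens belong to the value;
    // this sketch handles only the single-token case.
    if (tokens.size() != 1)
        throw std::invalid_argument("expected exactly one token");

    std::istringstream in(tokens[0]);
    T result;
    char extra;
    // Fail if nothing could be read, or if characters remain after the value.
    if (!(in >> result) || (in >> extra))
        throw std::invalid_argument("cannot convert '" + tokens[0] + "'");
    return result;
}
```

<para>Under these assumptions, the token <code>"10"</code> converts to the
integer 10, while a token such as <code>"abc"</code> is rejected with an
exception.</para>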
140
141	<para>This separation is an important part of the library design. The parsers
142	use only the syntactic layer, which takes away some of the freedom to
143	use overly complex structures. For example, it's not easy to parse
144	syntax like: <screen>calc --expression=1 + 2/3</screen> because it's not
145	possible to parse <screen>1 + 2/3</screen> without knowing that it's a C
146	expression. With a little help from the user the task becomes trivial,
147	and the syntax clear: <screen>calc --expression="1 + 2/3"</screen>
148	</para>
149
150	<section>
151	<title>Syntactic Information</title>
152	<para>The syntactic information is provided by the
153	<classname>boost::program_options::options_description</classname> class
154	and some methods of the
155	<classname>boost::program_options::value_semantic</classname> class
156	and includes:
157	<itemizedlist>
158	<listitem>
159	<para>
160	name of the option, used to identify the option inside the
161	program,
162	</para>
163	</listitem>
164	<listitem>
165	<para>
166	description of the option, which can be presented to the user,
167	</para>
168	</listitem>
169	<listitem>
170	<para>
171	the allowed number of source tokens that comprise the option's
172	value, which is used during parsing.
173	</para>
174	</listitem>
175	</itemizedlist>
176	</para>
177
178	<para>Consider the following example:
179	<programlisting>
180	options_description desc;
181	desc.add_options()
182	("help", "produce help message")
183	("compression", value<string>(), "compression level")
184	("verbose", value<string>()->implicit(), "verbosity level")
185	("email", value<string>()->multitoken(), "email to send to")
186	;
187	</programlisting>
188	For the first option, we specify only the name and the
189	description. No value can be specified in the parsed source.
190	For the second option, the user must specify a value, using a single
191	token. For the third option, the user may either provide a single token
192	for the value, or no token at all. For the last option, the value can
193	span several tokens. For example, the following command line is OK:
194	<screen>
195	test --help --compression 10 --verbose --email beadle@mars beadle2@mars
196	</screen>
197	</para>
198
199	<section>
200	<title>Description formatting</title>
201
202	<para>
203	Sometimes the description can get rather long, for example, when
204	several of an option's values need separate documentation. Below we
205	describe some simple formatting mechanisms you can use.
206	</para>
207
208	<para>The description string has one or more paragraphs, separated by
209	the newline character ('\n'). When an option is output, the library
210	will compute the indentation for the option's description. Each
211	paragraph is output as a separate line with that indentation. If
212	a paragraph does not fit on one line, it is wrapped over multiple
213	lines (which will have the same indentation).
214	</para>
215
216	<para>You may specify an additional indent for the first line by
217	inserting spaces at the beginning of a paragraph. For example:
218	<programlisting>
219	options.add_options()
220	("help", "    A long help msg a long help msg a long help msg a long help "
221	"msg a long help msg a long help msg a long help msg a long help msg ")
222	;
223	</programlisting>
224	will specify a four-space indent for the first line. The output will
225	look like:
226	<screen>
227	  --help                 A long help msg a long
228	                     help msg a long help msg
229	                     a long help msg a long
230	                     help msg a long help msg
231	                     a long help msg a long
232	                     help msg
233
234	</screen>
235	</para>
236
237	<para>When a line is wrapped, you may want an additional
238	indent for the wrapped text. This can be done by
239	inserting a tab character ('\t') at the desired position. For
240	example:
241	<programlisting>
242	options.add_options()
243	("well_formated", "As you can see this is a very well formatted "
244	"option description.\n"
245	"You can do this for example:\n\n"
246	"Values:\n"
247	"  Value1: \tdoes this and that, bla bla bla bla "
248	"bla bla bla bla bla bla bla bla bla bla bla\n"
249	"  Value2: \tdoes something else, bla bla bla bla "
250	"bla bla bla bla bla bla bla bla bla bla bla\n\n"
251	"    This paragraph has a first line indent only, "
252	"bla bla bla bla bla bla bla bla bla bla bla bla bla bla bla");
253	</programlisting>
254	will produce:
255	<screen>
256	  --well_formated   As you can see this is a
257	                    very well formatted
258	                    option description.
259	                    You can do this for
260	                    example:
261
262	                    Values:
263	                      Value1: does this and
264	                              that, bla bla
265	                              bla bla bla bla
266	                              bla bla bla bla
267	                              bla bla bla bla
268	                              bla
269	                      Value2: does something
270	                              else, bla bla
271	                              bla bla bla bla
272	                              bla bla bla bla
273	                              bla bla bla bla
274	                              bla
275
276	                        This paragraph has a
277	                    first line indent only,
278	                    bla bla bla bla bla bla
279	                    bla bla bla bla bla bla
280	                    bla bla bla
281	</screen>
282	The tab character is removed before output. Only one tab per
283	paragraph is allowed; otherwise an exception of type
284	program_options::error is thrown. Finally, the tab is ignored if
285	it is not on the first line of the paragraph or is at the last
286	possible position of the first line.
287	</para>
288
289	</section>
290
291	</section>
292
293	<section>
294	<title>Semantic Information</title>
295
296	<para>The semantic information is completely provided by the
297	<classname>boost::program_options::value_semantic</classname> class. For
298	example:
299	<programlisting>
300	options_description desc;
301	desc.add_options()
302	("compression", value<int>()->default_value(10), "compression level")
303	("email", value< vector<string> >()
304	->composing()->notifier(&your_function), "email")
305	;
306	</programlisting>
307	These declarations specify that the default value of the first option is 10,
308	that the second option can appear several times and all instances should
309	be merged, and that after parsing is done, the library will call the
310	function <code>&your_function</code>, passing the value of the
311	"email" option as argument.
312	</para>
313	</section>
314
315	<section>
316	<title>Positional Options</title>
317
318	<para>Our definition of an option as a (name, value) pair is simple and
319	useful, but in one special case of the command line, there's a
320	problem. A command line can include a <firstterm>positional option</firstterm>,
321	which does not specify any name at all, for example:
322	<screen>
323	archiver --compression=9 /etc/passwd
324	</screen>
325	Here, the "/etc/passwd" element does not have any option name.
326	</para>
327
328	<para>One solution is to ask users to extract positional options
329	themselves and process them as they like. However, there's a nicer approach
330	-- provide a method to automatically assign the names for positional
331	options, so that the above command line can be interpreted the same way
332	as:
333	<screen>
334	archiver --compression=9 --input-file=/etc/passwd
335	</screen>
336	</para>
337
338	<para>The &positional_options_desc; class allows the command line
339	parser to assign the names. The class specifies how many positional options
340	are allowed, and for each allowed option, specifies the name. For example:
341	<programlisting>
342	positional_options_description pd; pd.add("input-file", 1);
343	</programlisting> specifies that exactly one positional option, the first,
344	will be given the name "input-file".
345	</para>
346
347	<para>It's possible to specify that a number, or even all positional options, be
348	given the same name.
349	<programlisting>
350	positional_options_description pd;
351	pd.add("output-file", 2).add("input-file", -1);
352	</programlisting>
353	In the above example, the first two positional options will be associated
354	with name "output-file", and any others with the name "input-file".
355	</para>
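<para>The position-to-name assignment described above can be sketched in
plain standard C++. This is an illustration only, not the library's code:
the type <code>positional_spec</code> and the function
<code>name_for_position_sketch</code> are hypothetical names, and a count of
-1 is taken to mean "all remaining positions", as in the example above.</para>

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a positional options description: each entry is
// (option name, maximum count), where -1 means "all remaining positions".
typedef std::vector<std::pair<std::string, int> > positional_spec;

// Returns the option name assigned to the zero-based positional index,
// or an empty string if no entry covers that position.
std::string name_for_position_sketch(const positional_spec& spec,
                                     unsigned position)
{
    unsigned first = 0;  // index of the first position covered by entry i
    for (positional_spec::size_type i = 0; i < spec.size(); ++i) {
        if (spec[i].second < 0)
            return spec[i].first;          // unlimited entry takes the rest
        unsigned count = static_cast<unsigned>(spec[i].second);
        if (position < first + count)
            return spec[i].first;
        first += count;
    }
    return std::string();
}
```

<para>With the entries ("output-file", 2) and ("input-file", -1), positions
0 and 1 map to "output-file" and every later position maps to
"input-file".</para>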
356
357	</section>
358
359	<!-- Note that the classes are not modified during parsing -->
360
361	</section>
362
363	<section>
364	<title>Parsers Component</title>
365
366	<para>The parsers component splits input sources into (name, value) pairs.
367	Each parser looks for possible options and consults the options
368	description component to determine if the option is known and how its value
369	is specified. In the simplest case, the name is explicitly specified,
370	which allows the library to decide if such an option is known. If it is known, the
371	&value_semantic; instance determines how the value is specified. (If
372	it is not known, an exception is thrown.) Common
373	cases are when the value is explicitly specified by the user, and when
374	the value cannot be specified by the user, but the presence of the
375	option implies some value (for example, <code>true</code>). So, the
376	parser checks that the value is specified when needed and not specified
377	when not needed, and returns a new (name, value) pair.
378	</para>
379
380	<para>
381	To invoke a parser you typically call a function, passing the options
382	description and command line or config file or something else.
383	The results of parsing are returned as an instance of the &parsed_options;
384	class. Typically, that object is passed directly to the storage
385	component. However, it also can be used directly, or undergo some additional
386	processing.
387	</para>
388
389	<para>
390	There are three exceptions to the above model -- all related to
391	traditional usage of the command line. While they require some support
392	from the options description component, the additional complexity is
393	tolerable.
394	<itemizedlist>
395	<listitem>
396	<para>The name specified on the command line may be
397	different from the option name -- it's common to provide a "short option
398	name" alias to a longer name. It's also common to allow an abbreviated name
399	to be specified on the command line.
400	</para>
401	</listitem>
402	<listitem>
403	<para>Sometimes it's desirable to specify a value as several
404	tokens. For example, an option "--email-recipient" may be followed
405	by several emails, each as a separate command line token. This
406	behaviour is supported, though it can lead to parsing ambiguities
407	and is not enabled by default.
408	</para>
409	</listitem>
410	<listitem>
411	<para>The command line may contain positional options -- elements
412	which don't have any name. The command line parser provides a
413	mechanism to guess names for such options, as we've seen in the
414	tutorial.
415	</para>
416	</listitem>
417	</itemizedlist>
418	</para>
419
420	</section>
421
422
423	<section>
424	<title>Storage Component</title>
425
426	<para>The storage component is responsible for:
427	<itemizedlist>
428	<listitem>
429	<para>Storing the final values of an option into a special class and in
430	regular variables</para>
431	</listitem>
432	<listitem>
433	<para>Handling priorities among different sources.</para>
434	</listitem>
435
436	<listitem>
437	<para>Calling user-specified <code>notify</code> functions with the final
438	values of options.</para>
439	</listitem>
440	</itemizedlist>
441	</para>
442
443	<para>Let's consider an example:
444	<programlisting>
445	variables_map vm;
446	store(parse_command_line(argc, argv, desc), vm);
447	store(parse_config_file("example.cfg", desc), vm);
448	notify(vm);
449	</programlisting>
450	The <code>variables_map</code> class is used to store the option
451	values. The two calls to the <code>store</code> function add values
452	found on the command line and in the config file. Finally the call to
453	the <code>notify</code> function runs the user-specified notify
454	functions and stores the values into regular variables, if needed.
455	</para>
456
457	<para>The priority is handled in a simple way: the <code>store</code>
458	function will not change the value of an option if it's already
459	assigned. In the example above, if the command line specifies the value for an
460	option, any value in the config file is ignored.
461	</para>
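<para>The "first stored value wins" rule can be modelled with a short sketch
in plain standard C++. It uses a <code>std::map</code> as a stand-in for the
storage class, and the names <code>option_map</code> and
<code>store_first_wins</code> are hypothetical; the sketch only mirrors the
priority behaviour described above, not the library's actual code.</para>

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the storage: option name -> string value.
typedef std::map<std::string, std::string> option_map;

// Copies the entries of `source` into `final_values`, leaving untouched any
// option that already has a value from an earlier source.
void store_first_wins(const option_map& source, option_map& final_values)
{
    for (option_map::const_iterator it = source.begin();
         it != source.end(); ++it) {
        // std::map::insert does nothing if the key is already present.
        final_values.insert(*it);
    }
}
```

<para>Storing the command-line values first and the config-file values second
therefore gives the command line priority, while options that appear only in
the config file are still picked up.</para>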
462
463	<warning>
464	<para>Don't forget to call the <code>notify</code> function after you've
465	stored all parsed values.</para>
466	</warning>
467
468	</section>
469
470	<section>
471	<title>Specific parsers</title>
472
473	<section>
474	<title>Environment variables</title>
475
476	<para><firstterm>Environment variables</firstterm> are string variables
477	which are available to all programs via the <code>getenv</code> function
478	of the C runtime library. The operating system lets you set initial values
479	for a given user, and the values can be further changed on the command
480	line. For example, on Windows one can use the
481	<filename>autoexec.bat</filename> file or (on recent versions) the
482	<filename>Control Panel/System/Advanced/Environment Variables</filename>
483	dialog, and on Unix, the <filename>/etc/profile</filename>,
484	<filename>~/.profile</filename> and <filename>~/.bash_profile</filename>
485	files. Because environment variables can be set for the entire system,
486	they are particularly suitable for options which apply to all programs.
487	</para>
488
489	<para>The environment variables can be parsed with the
490	&parse_environment; function. The function has several overloaded
491	versions. The first parameter is always an &options_description;
492	instance, and the second specifies which variables must be processed and
493	which option names correspond to them. To describe the second
494	parameter we need to consider naming conventions for environment
495	variables.</para>
496
497	<para>If you have an option that should be specified via environment
498	variable, you need to make up the variable's name. To avoid name clashes,
499	we suggest that you use a sufficiently unique prefix for environment
500	variables. Also, while option names are most likely in lower case,
501	environment variables conventionally use upper case. So, for an option
502	name <literal>proxy</literal> the environment variable might be called
503	<envar>BOOST_PROXY</envar>. During parsing, we need to perform the reverse
504	conversion of the names. This is accomplished by passing the chosen
505	prefix as the second parameter of the &parse_environment; function.
506	Say, if you pass <literal>BOOST_</literal> as the prefix, and there are
507	two variables, <envar>CVSROOT</envar> and <envar>BOOST_PROXY</envar>, the
508	first variable will be ignored, and the second one will be converted to
509	option <literal>proxy</literal>.
510	</para>
511
512	<para>The above logic is sufficient in many cases, but it is also
513	possible to pass, as the second parameter of the &parse_environment;
514	function, any function taking a <code>std::string</code> and returning
515	<code>std::string</code>. That function will be called for each
516	environment variable and should return either the name of the option, or
517	an empty string if the variable should be ignored.
518	</para>
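<para>A name-mapping function of the shape described above might look like
the following sketch. The prefix <code>MYAPP_</code> and the function name
<code>myapp_env_to_option</code> are assumptions made for illustration, as is
the choice to turn underscores into hyphens; only the signature (a
<code>std::string</code> in, a <code>std::string</code> out, empty meaning
"ignore") comes from the text.</para>

```cpp
#include <cctype>
#include <string>

// Hypothetical mapper: "MYAPP_" is an assumed prefix, not a library default.
// Returns the option name for an environment variable, or an empty string
// if the variable should be ignored.
std::string myapp_env_to_option(std::string env_name)
{
    const std::string prefix = "MYAPP_";
    // Variables without the prefix are not ours: ignore them.
    if (env_name.compare(0, prefix.size(), prefix) != 0)
        return std::string();

    // Strip the prefix, lower-case the rest, and map '_' to '-'
    // (the last step is an illustrative convention, not a requirement).
    std::string option = env_name.substr(prefix.size());
    for (std::string::size_type i = 0; i < option.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(option[i]);
        option[i] = (c == '_') ? '-' : static_cast<char>(std::tolower(c));
    }
    return option;
}
```

<para>With this mapper, <envar>MYAPP_PROXY</envar> would be reported as the
option <literal>proxy</literal> and <envar>CVSROOT</envar> would be ignored.
The function would then be passed as the second argument of the
&parse_environment; function.</para>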
519
520	</section>
521	</section>
522
523	<section>
524	<title>Annotated List of Symbols</title>
525
526	<para>The following table describes all the important symbols in the
527	library, for quick access.</para>
528
529	<informaltable pgwide="1">
530
531	<tgroup cols="2">
532	<colspec colname='c1'/>
533	<colspec colname='c2'/>
534	<thead>
535
536	<row>
537	<entry>Symbol</entry>
538	<entry>Description</entry>
539	</row>
540	</thead>
541
542	<tbody>
543
544	<row>
545	<entry namest='c1' nameend='c2'>Options description component</entry>
546	</row>
547
548	<row>
549	<entry>&options_description;</entry>
550	<entry>describes a number of options</entry>
551	</row>
552	<row>
553	<entry>&value;</entry>
554	<entry>defines the option's value</entry>
555	</row>
556
557	<row>
558	<entry namest='c1' nameend='c2'>Parsers component</entry>
559	</row>
560
561	<row>
562	<entry>&parse_command_line;</entry>
563	<entry>parses command line</entry>
564	</row>
565	<row>
566	<entry>&parse_config_file;</entry>
567	<entry>parses config file</entry>
568	</row>
569
570	<row>
571	<entry>&parse_environment;</entry>
572	<entry>parses environment</entry>
573	</row>
574
575	<row>
576	<entry namest='c1' nameend='c2'>Storage component</entry>
577	</row>
578
579	<row>
580	<entry>&variables_map;</entry>
581	<entry>storage for option values</entry>
582	</row>
583
584	</tbody>
585	</tgroup>
586
587	</informaltable>
588
589	</section>
590
591	</section>
592
593	<!--
594	Local Variables:
595	mode: nxml
596	sgml-indent-data: t
597	sgml-parent-document: ("program_options.xml" "section")
598	sgml-set-face: t
599	End:
600	-->
