source: downloads/tcl8.5.2/doc/encoding.n @ 42

Last change on this file since 42 was 25, checked in by landauf, 17 years ago

added tcl to libs

'\"
'\" Copyright (c) 1998 by Scriptics Corporation.
'\"
'\" See the file "license.terms" for information on usage and redistribution
'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
'\"
'\" RCS: @(#) $Id: encoding.n,v 1.15 2007/12/13 15:22:32 dgp Exp $
'\"
.so man.macros
.TH encoding n "8.1" Tcl "Tcl Built-In Commands"
.BS
.SH NAME
encoding \- Manipulate encodings
.SH SYNOPSIS
\fBencoding \fIoption\fR ?\fIarg arg ...\fR?
.BE

.SH INTRODUCTION
.PP
Strings in Tcl are encoded using 16-bit Unicode characters.  Different
operating system interfaces or applications may generate strings in
other encodings such as Shift-JIS.  The \fBencoding\fR command helps
to bridge the gap between Unicode and these other formats.
.SH DESCRIPTION
.PP
Performs one of several encoding-related operations, depending on
\fIoption\fR.  The legal \fIoption\fRs are:
.TP
\fBencoding convertfrom\fR ?\fIencoding\fR? \fIdata\fR
Convert \fIdata\fR to Unicode from the specified \fIencoding\fR.  The
characters in \fIdata\fR are treated as binary data where the low
8 bits of each character are taken as a single byte.  The resulting
sequence of bytes is treated as a string in the specified
\fIencoding\fR.  If \fIencoding\fR is not specified, the current
system encoding is used.
.TP
\fBencoding convertto\fR ?\fIencoding\fR? \fIstring\fR
Convert \fIstring\fR from Unicode to the specified \fIencoding\fR.
The result is a sequence of bytes that represents the converted
string.  Each byte is stored in the low 8 bits of a Unicode
character.  If \fIencoding\fR is not specified, the current
system encoding is used.
.TP
\fBencoding dirs\fR ?\fIdirectoryList\fR?
.VS 8.5
Tcl can load encoding data files from the file system that describe
additional encodings for it to work with. This command sets the search
path for \fB*.enc\fR encoding data files to the list of directories
\fIdirectoryList\fR. If \fIdirectoryList\fR is omitted then the
command returns the current list of directories that make up the
search path. It is an error if \fIdirectoryList\fR is not a valid
list. If, during a search for an encoding data file, an
element in \fIdirectoryList\fR does not refer to a readable,
searchable directory, that element is ignored.
.VE 8.5
.TP
\fBencoding names\fR
Returns a list containing the names of all of the encodings that are
currently available.
.TP
\fBencoding system\fR ?\fIencoding\fR?
Set the system encoding to \fIencoding\fR. If \fIencoding\fR is
omitted then the command returns the current system encoding.  The
system encoding is used whenever Tcl passes strings to system calls.
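.PP
The \fBconvertto\fR and \fBconvertfrom\fR options described above can be
combined into a round trip.  The following is an illustrative sketch
only: it assumes that \fBeuc-jp\fR is among the encodings reported by
\fBencoding names\fR, and the variable names are arbitrary.
.CS
# Build the one-character string U+306F without typing it literally.
set ha [format %c 0x306F]

# Only attempt the conversion if the encoding is available.
if {[lsearch -exact [encoding names] euc-jp] >= 0} {
    # Unicode to bytes; each byte sits in the low 8 bits of a character.
    set bytes [encoding convertto euc-jp $ha]
    binary scan $bytes H* hex

    # Bytes back to Unicode; this should reproduce the original string.
    set back [encoding convertfrom euc-jp $bytes]
    puts "$hex [string equal $ha $back]"
}
.CE
Checking \fBencoding names\fR first avoids an error when the requested
encoding data file cannot be found on the search path.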
.SH EXAMPLE
.PP
It is common practice to write script files using a text editor that
produces output in the euc-jp encoding, which represents the ASCII
characters as single bytes and Japanese characters as two bytes.  This
makes it easy to embed literal strings that correspond to non-ASCII
characters by simply typing the strings in place in the script.
However, because the \fBsource\fR command always reads files using the
current system encoding, Tcl will only source such files correctly
when the encoding used to write the file is the same.  This tends not
to be true in an internationalized setting.  For example, if such a
file were sourced in North America (where ISO8859-1 is normally
used), each byte in the file would be treated as a separate character
that maps to the 00 page in Unicode.  The resulting Tcl strings will
not contain the expected Japanese characters.  Instead, they will
contain a sequence of Latin-1 characters that correspond to the bytes
of the original string.  The \fBencoding\fR command can be used to
convert this string to the expected Japanese Unicode characters.  For
example,
.CS
set s [\fBencoding convertfrom\fR euc-jp "\exA4\exCF"]
.CE
would return the Unicode string
.QW "\eu306F" ,
which is the Hiragana letter HA.
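.PP
One way to avoid depending on the system encoding when loading such a
file is to read it through a channel whose encoding is set explicitly
and then evaluate the result.  The following is a minimal sketch rather
than a definitive recipe; the procedure name \fBsourceEncoded\fR and the
file name in the comment are hypothetical.
.CS
# Hypothetical helper: evaluate a script file stored in a known encoding.
proc sourceEncoded {fileName encodingName} {
    set chan [open $fileName r]
    # Decode the file's bytes using the named encoding, not the system one.
    fconfigure $chan -encoding $encodingName
    set script [read $chan]
    close $chan
    # Evaluate the decoded script in the caller's context.
    uplevel 1 $script
}

# Example call (the file name is hypothetical):
#   sourceEncoded japanese.tcl euc-jp
.CE
Because the channel performs the conversion while reading, the script
text is already Unicode by the time it is evaluated, so no separate
call to \fBencoding convertfrom\fR is needed.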

.SH "SEE ALSO"
Tcl_GetEncoding(3)

.SH KEYWORDS
encoding