summaryrefslogtreecommitdiff
path: root/usr.bin/tr/tr.1
diff options
context:
space:
mode:
Diffstat (limited to 'usr.bin/tr/tr.1')
-rw-r--r--usr.bin/tr/tr.1352
1 files changed, 352 insertions, 0 deletions
diff --git a/usr.bin/tr/tr.1 b/usr.bin/tr/tr.1
new file mode 100644
index 0000000..fa9d121
--- /dev/null
+++ b/usr.bin/tr/tr.1
@@ -0,0 +1,352 @@
+.\" $NetBSD: tr.1,v 1.22 2017/07/03 21:34:22 wiz Exp $
+.\"
+.\" Copyright (c) 1991, 1993
+.\" The Regents of the University of California. All rights reserved.
+.\"
+.\" This code is derived from software contributed to Berkeley by
+.\" the Institute of Electrical and Electronics Engineers, Inc.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\" notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\" notice, this list of conditions and the following disclaimer in the
+.\" documentation and/or other materials provided with the distribution.
+.\" 3. Neither the name of the University nor the names of its contributors
+.\" may be used to endorse or promote products derived from this software
+.\" without specific prior written permission.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" @(#)tr.1 8.1 (Berkeley) 6/6/93
+.\"
+.Dd May 29, 2013
+.Dt TR 1
+.Os
+.Sh NAME
+.Nm tr
+.Nd translate characters
+.Sh SYNOPSIS
+.Nm
+.Op Fl cs
+.Ar string1 string2
+.Nm
+.Op Fl c
+.Fl d
+.Ar string1
+.Nm
+.Op Fl c
+.Fl s
+.Ar string1
+.Nm
+.Op Fl c
+.Fl ds
+.Ar string1 string2
+.Sh DESCRIPTION
+The
+.Nm
+utility copies the standard input to the standard output with substitution
+or deletion of selected characters.
+.Pp
+The following options are available:
+.Bl -tag -width Ds
+.It Fl c
+Complements the set of characters in
+.Ar string1 ;
+that is,
+.Fl c Ar \&ab
+includes every character except for
+.Sq a
+and
+.Sq b .
+.It Fl d
+The
+.Fl d
+option causes characters to be deleted from the input.
+.It Fl s
+The
+.Fl s
+option squeezes multiple occurrences of the characters listed in the last
+operand (either
+.Ar string1
+or
+.Ar string2 )
+in the input into a single instance of the character.
+This occurs after all deletion and translation is completed.
+.El
+.Pp
+In the first synopsis form, the characters in
+.Ar string1
+are translated into the characters in
+.Ar string2
+where the first character in
+.Ar string1
+is translated into the first character in
+.Ar string2
+and so on.
+If
+.Ar string1
+is longer than
+.Ar string2 ,
+the last character found in
+.Ar string2
+is duplicated until
+.Ar string1
+is exhausted.
+.Pp
+In the second synopsis form, the characters in
+.Ar string1
+are deleted from the input.
+.Pp
+In the third synopsis form, the characters in
+.Ar string1
+are compressed as described for the
+.Fl s
+option.
+.Pp
+In the fourth synopsis form, the characters in
+.Ar string1
+are deleted from the input, and the characters in
+.Ar string2
+are compressed as described for the
+.Fl s
+option.
+.Pp
+The following conventions can be used in
+.Ar string1
+and
+.Ar string2
+to specify sets of characters:
+.Bl -tag -width [:equiv:]
+.It character
+Any character not described by one of the following conventions
+represents itself.
+.It \eoctal
+A backslash followed by 1, 2 or 3 octal digits represents a character
+with that encoded value.
+To follow an octal sequence with a digit as a character, left zero-pad
+the octal sequence to the full 3 octal digits.
+.It \echaracter
+A backslash followed by certain special characters maps to special
+values.
+.sp
+.Bl -column cc
+.It \ea <alert character>
+.It \eb <backspace>
+.It \ef <form-feed>
+.It \en <newline>
+.It \er <carriage return>
+.It \et <tab>
+.It \ev <vertical tab>
+.El
+.sp
+A backslash followed by any other character maps to that character.
+.It c-c
+Represents the range of characters between the range endpoints, inclusively.
+.It [:class:]
+Represents all characters belonging to the defined character class.
+Class names are:
+.sp
+.Bl -column xdigit
+.It alnum <alphanumeric characters>
+.It alpha <alphabetic characters>
+.It blank <blank characters>
+.It cntrl <control characters>
+.It digit <numeric characters>
+.It graph <graphic characters>
+.It lower <lower-case alphabetic characters>
+.It print <printable characters>
+.It punct <punctuation characters>
+.It space <space characters>
+.It upper <upper-case characters>
+.It xdigit <hexadecimal characters>
+.El
+.Pp
+.\" All classes may be used in
+.\" .Ar string1 ,
+.\" and in
+.\" .Ar string2
+.\" when both the
+.\" .Fl d
+.\" and
+.\" .Fl s
+.\" options are specified.
+.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
+.\" .Ar string2
+.\" and then only when the corresponding class (``upper'' for ``lower''
+.\" and vice-versa) is specified in the same relative position in
+.\" .Ar string1 .
+.\" .Pp
+With the exception of the
+.Dq upper
+and
+.Dq lower
+classes, characters in the classes are in unspecified order.
+In the
+.Dq upper
+and
+.Dq lower
+classes, characters are entered in ascending order.
+.Pp
+For specific information as to which ASCII characters are included
+in these classes, see
+.Xr ctype 3
+and related manual pages.
+.It [=equiv=]
+Represents all characters or collating (sorting) elements belonging to
+the same equivalence class as
+.Ar equiv .
+If there is a secondary ordering within the equivalence class, the
+characters are ordered in ascending sequence.
+Otherwise, they are ordered after their encoded values.
+An example of an equivalence class might be
+.Dq \&c
+and
+.Dq \&ch
+in Spanish;
+English has no equivalence classes.
+.It [#*n]
+Represents
+.Ar n
+repeated occurrences of the character represented by
+.Ar # .
+This
+expression is only valid when it occurs in
+.Ar string2 .
+If
+.Ar n
+is omitted or is zero, it is interpreted as large enough to extend the
+.Ar string2
+sequence to the length of
+.Ar string1 .
+If
+.Ar n
+has a leading zero, it is interpreted as an octal value;
+otherwise, it is interpreted as a decimal value.
+.El
+.Sh EXIT STATUS
+.Ex -std
+.Sh EXAMPLES
+The following examples are shown as given to the shell:
+.Pp
+Create a list of the words in
+.Ar file1 ,
+one per line, where a word is taken to be a maximal string of letters:
+.sp
+.D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
+.sp
+Translate the contents of
+.Ar file1
+to upper-case:
+.sp
+.D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
+.sp
+Strip out non-printable characters from
+.Ar file1 :
+.sp
+.D1 Li "tr -cd \*q[:print:]\*q < file1"
+.Sh COMPATIBILITY
+.At V
+has historically implemented character ranges using the syntax
+.Dq [c-c]
+instead of the
+.Dq c-c
+used by historic
+.Bx
+implementations and standardized by POSIX.
+.At V
+shell scripts should work under this implementation as long as
+the range is intended to map in another range, i.e. the command
+.Pp
+.Ic "tr [a-z] [A-Z]"
+.Pp
+will work as it will map the
+.Sq \&[
+character in
+.Ar string1
+to the
+.Sq \&[
+character in
+.Ar string2 .
+However, if the shell script is deleting or squeezing characters as in
+the command
+.Pp
+.Ic "tr -d [a-z]"
+.Pp
+the characters
+.Sq \&[
+and
+.Sq \&]
+will be included in the deletion or compression list which would
+not have happened under an historic
+.At V
+implementation.
+Additionally, any scripts that depended on the sequence
+.Dq a-z
+to represent the three characters
+.Sq \&a ,
+.Sq \&- ,
+and
+.Sq \&z
+will have to be rewritten as
+.Dq a\e-z .
+.Pp
+The
+.Nm
+utility has historically not permitted the manipulation of NUL bytes in
+its input and, additionally, stripped NUL's from its input stream.
+This implementation has removed this behavior as a bug.
+.Pp
+The
+.Nm
+utility has historically been extremely forgiving of syntax errors,
+for example, the
+.Fl c
+and
+.Fl s
+options were ignored unless two strings were specified.
+This implementation will not permit illegal syntax.
+.Sh SEE ALSO
+.Xr dd 1 ,
+.Xr sed 1
+.Sh STANDARDS
+The
+.Nm
+utility is expected to be
+.St -p1003.2
+compatible.
+It should be noted that the feature wherein the last character of
+.Ar string2
+is duplicated if
+.Ar string2
+has less characters than
+.Ar string1
+is permitted by POSIX but is not required.
+Shell scripts attempting to be portable to other POSIX systems should use
+the
+.Dq [#*n]
+convention instead of relying on this behavior.
+.Sh BUGS
+.Nm
+was originally designed to work with
+.Tn US-ASCII .
+Its use with character sets that do not share all the properties of
+.Tn US-ASCII ,
+e.g., a symmetric set of upper and lower case characters
+that can be algorithmically converted one to the other,
+may yield unpredictable results.
+.Pp
+.Nm
+should be internationalized.