Operations on strings and characters are an important part of many programs. The Oberon-2 language provides various built-in operations on characters and strings, but the OOC library goes on to extend the native facilities of Oberon-2 with a useful set of modules for character and string manipulation.
The Oberon-2 language report defines characters using ASCII (American Standard Code for Information Interchange) representation. Because of this, and for convenience, OOC provides module `Ascii', which defines useful constants corresponding to certain ASCII characters.
Note that OOC does support the full ISO-Latin-1 character set, which is a strict superset of ASCII, as well as Unicode (via LONGCHAR; see section Additional Data Types).
ASCII characters can be printable characters, such as letters and digits, and also non-printing characters such as tab and linefeed. ASCII only truly defines 128 characters; this means that the interpretation of the range from `80X' to `0FFX' may vary.
Constants for all of the standard ASCII names for non-printing characters are provided in module `Ascii':
  CONST
    nul = 00X;  soh = 01X;  stx = 02X;  etx = 03X;
    eot = 04X;  enq = 05X;  ack = 06X;  bel = 07X;
    bs  = 08X;  ht  = 09X;  lf  = 0AX;  vt  = 0BX;
    ff  = 0CX;  cr  = 0DX;  so  = 0EX;  si  = 0FX;
    dle = 10X;  dc1 = 11X;  dc2 = 12X;  dc3 = 13X;
    dc4 = 14X;  nak = 15X;  syn = 16X;  etb = 17X;
    can = 18X;  em  = 19X;  sub = 1AX;  esc = 1BX;
    fs  = 1CX;  gs  = 1DX;  rs  = 1EX;  us  = 1FX;
    del = 7FX;
The most commonly used ASCII names have the following meanings:
  bel -- bell
  bs  -- backspace
  ht  -- horizontal tabulator
  vt  -- vertical tabulator
  lf  -- line feed
  ff  -- form feed
  cr  -- carriage return
  esc -- escape
  del -- delete
Also, some often used synonyms are declared in module Ascii:
CONST sp = " "; xon = dc1; xoff = dc3;
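As a minimal sketch of how these constants might be used, the following module writes two tab-separated lines. The module name AsciiDemo is hypothetical, and the sketch assumes that the OOC module Out is available for output.

```oberon
MODULE AsciiDemo;
(* Hypothetical demo module: prints two tab-separated columns. *)
IMPORT Ascii, Out;

BEGIN
  (* Ascii.ht is the horizontal tabulator (09X). *)
  Out.String("name"); Out.Char(Ascii.ht); Out.String("value"); Out.Ln;
  Out.String("width"); Out.Char(Ascii.ht); Out.Int(80, 0); Out.Ln
END AsciiDemo.
```

Using the named constant rather than a literal such as 09X keeps the intent visible at the call site.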
Programs that deal with characters and strings often need to perform tests that "classify a character": Is the character a letter? A digit? A whitespace character?
Module CharClass provides a set of boolean function procedures that are used for such classification of values of the type CHAR. All procedures accept a single argument of type CHAR and return a BOOLEAN result.
Recall that Oberon-2 is defined so that characters are ordered in the same manner as defined by ASCII. Specifically, all the digits precede all the upper-case letters, and all the upper-case letters precede all the lower-case letters. This assumption is carried over into module CharClass. Also, note that CharClass uses constants defined in module Ascii within many of its procedures (see section Module Ascii).
For example, the function IsLetter is used to test whether a particular character is one of `A' through `Z' or `a' through `z':
  Out.String("The character '");
  IF CharClass.IsLetter(c) THEN
    Out.Char(c);
    Out.String("' is a letter.");
  ELSE
    Out.Char(c);
    Out.String("' isn't a letter.");
  END;
  Out.Ln
Please note: None of these predicates are affected by the current localization setting. For example, IsUpper will always test for "A" <= ch & ch <= "Z" regardless of whether the locale specifies that additional characters belong to this set or not. The same holds for the compare and capitalization procedures in module Strings.
systemEol may be more than one character in length, and is not necessarily equal to eol. Note that systemEol is a string; it is always terminated by 0X (i.e., systemEol cannot contain the character `0X').
CharClass.IsNumeric (ch: CHAR): BOOLEAN
Returns TRUE if, and only if, ch is classified as a numeric character (i.e., a decimal digit, `0' through `9').
CharClass.IsLetter (ch: CHAR): BOOLEAN
Returns TRUE if, and only if, ch is classified as a letter.
CharClass.IsUpper (ch: CHAR): BOOLEAN
Returns TRUE if, and only if, ch is classified as an upper-case letter.
CharClass.IsLower (ch: CHAR): BOOLEAN
Returns TRUE if, and only if, ch is classified as a lower-case letter.
CharClass.IsControl (ch: CHAR): BOOLEAN
Returns TRUE if, and only if, ch represents a control function (that is, an ASCII character that is not a printing character).
CharClass.IsWhiteSpace (ch: CHAR): BOOLEAN
Returns TRUE if, and only if, ch represents a space character or other "format effector". IsWhiteSpace returns TRUE for only these characters:
  ` '        -- space (i.e., `Ascii.sp')
  `Ascii.ff' -- formfeed
  `Ascii.cr' -- carriage return
  `Ascii.ht' -- horizontal tab
  `Ascii.vt' -- vertical tab
CharClass.IsEol (ch: CHAR): BOOLEAN
Returns TRUE if, and only if, ch is the implementation-defined character used to represent end of line internally.
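The classification functions combine naturally in a scanning loop. The following is a minimal sketch, not part of the library: the module name ClassifyDemo and the procedure Tally are hypothetical, IsNumeric is assumed to be the name of the decimal-digit predicate described above, and module Out is assumed for output.

```oberon
MODULE ClassifyDemo;
(* Hypothetical demo module: tallies character classes in a string. *)
IMPORT CharClass, Out;

PROCEDURE Tally (s: ARRAY OF CHAR);
  VAR i, letters, digits, blanks: INTEGER;
BEGIN
  letters := 0; digits := 0; blanks := 0;
  i := 0;
  (* Stop at the terminating 0X, or at the end of the array. *)
  WHILE (i < LEN(s)) & (s[i] # 0X) DO
    IF CharClass.IsLetter(s[i]) THEN INC(letters)
    ELSIF CharClass.IsNumeric(s[i]) THEN INC(digits)   (* assumed name *)
    ELSIF CharClass.IsWhiteSpace(s[i]) THEN INC(blanks)
    END;
    INC(i)
  END;
  Out.String("letters: "); Out.Int(letters, 0); Out.Ln;
  Out.String("digits: "); Out.Int(digits, 0); Out.Ln;
  Out.String("whitespace: "); Out.Int(blanks, 0); Out.Ln
END Tally;

BEGIN
  Tally("Oberon 2, OOC 1")
END ClassifyDemo.
```

Characters that fall into none of the three classes, such as the comma, are simply skipped.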
Because string manipulation is so common in programming problems, the OOC library provides string operations in addition to those built into Oberon-2. The Oberon-2 language defines a string as a character array containing 0X as an embedded terminator. This means that an ARRAY OF CHAR isn't necessarily a string. The module `Strings' provides string manipulation operations for use on terminated character arrays, whereas module `LongStrings' has operations for terminated arrays of long characters (LONGCHAR; see section Additional Data Types).
Recall that string literals are sequences of characters enclosed in single (') or double (") quote marks. The opening quote must be the same as the closing quote and must not occur within the string. Passing a string literal of length n as an argument to a procedure expecting an ARRAY OF CHAR delivers n+1 characters to the parameter.
The number of characters in a string (up to the terminating 0X) is called its length. A string literal of length 1 can be used wherever a character constant is allowed and vice versa.
Please note: All procedures reading and producing strings expect termination with 0X. The behavior of a procedure is undefined if one of its input parameters is an unterminated character array. Behavior is also undefined if a negative value is used as an input parameter that represents an array position or a string length.
This section describes procedures that construct a string value, and attempt to assign it to a variable parameter. All of these procedures have the property that if the length of the constructed string value exceeds the capacity of the variable parameter, a truncated value is assigned. The constructed string always ends with a string terminator 0X.
Also described are procedures that provide for pre-testing of the operation-completion conditions for the copying and concatenation procedures.
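One way the pre-testing procedures might be used is to refuse an operation that would otherwise truncate silently. The following is a minimal sketch under stated assumptions: the module name AppendDemo and the helper SafeAppend are hypothetical, and module Out is assumed for output. Only Strings.Length, Strings.CanAppendAll, and Strings.Append come from the library.

```oberon
MODULE AppendDemo;
(* Hypothetical demo module: pre-tests an append to detect truncation. *)
IMPORT Out, Strings;

VAR
  buffer: ARRAY 8 OF CHAR;

PROCEDURE SafeAppend (source: ARRAY OF CHAR; VAR destination: ARRAY OF CHAR);
BEGIN
  IF Strings.CanAppendAll(Strings.Length(source), destination) THEN
    Strings.Append(source, destination)
  ELSE
    (* Append would truncate; leave destination unchanged and report it. *)
    Out.String("no room to append: "); Out.String(source); Out.Ln
  END
END SafeAppend;

BEGIN
  buffer := "abc";
  SafeAppend("de", buffer);      (* fits: 5 characters plus 0X in 8 elements *)
  SafeAppend("fghij", buffer);   (* would need 11 elements; rejected *)
  Out.String(buffer); Out.Ln
END AppendDemo.
```

The same shape applies to the other pairs (CanAssignAll with Assign, CanInsertAll with Insert, and so on): test first with the source length, then perform the operation only when the test succeeds.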
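Strings.Assign (source: ARRAY OF CHAR; VAR destination: ARRAY OF CHAR)
LongStrings.Assign (source: ARRAY OF LONGCHAR; VAR destination: ARRAY OF LONGCHAR)
Copies source to destination. This is equivalent to the predefined procedure COPY. Unlike COPY, this procedure can be assigned to a procedure variable.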
Strings.CanAssignAll (sourceLength: INTEGER; VAR destination: ARRAY OF CHAR): BOOLEAN
LongStrings.CanAssignAll (sourceLength: INTEGER; VAR destination: ARRAY OF LONGCHAR): BOOLEAN
Pre-condition: sourceLength is not negative.
Example:
  VAR
    source:      ARRAY 6 OF CHAR;
    destination: ARRAY 4 OF CHAR;

  source := "";
  Strings.CanAssignAll (Strings.Length (source), destination);
     => TRUE
  Strings.Assign (source, destination);
     => destination = ""

  source := "abc";
  Strings.CanAssignAll (Strings.Length (source), destination);
     => TRUE
  Strings.Assign (source, destination);
     => destination = "abc"

  source := "abcd";
  Strings.CanAssignAll (Strings.Length (source), destination);
     => FALSE
  Strings.Assign (source, destination);
     => destination = "abc"
Strings.Extract (source: ARRAY OF CHAR; startPos, numberToExtract: INTEGER; VAR destination: ARRAY OF CHAR)
LongStrings.Extract (source: ARRAY OF LONGCHAR; startPos, numberToExtract: INTEGER; VAR destination: ARRAY OF LONGCHAR)
An empty string value is extracted if startPos is greater than or equal to Length(source).
Pre-condition: startPos and numberToExtract are not negative.
Strings.CanExtractAll (sourceLength, startPos, numberToExtract: INTEGER; VAR destination: ARRAY OF CHAR): BOOLEAN
LongStrings.CanExtractAll (sourceLength, startPos, numberToExtract: INTEGER; VAR destination: ARRAY OF LONGCHAR): BOOLEAN
Returns TRUE if there are numberToExtract characters starting at startPos and within the sourceLength of some string, and if the capacity of destination is sufficient to hold numberToExtract characters; otherwise returns FALSE.
Pre-condition: sourceLength, startPos, and numberToExtract are not negative.
Example:
  VAR
    source:      ARRAY 6 OF CHAR;
    destination: ARRAY 4 OF CHAR;

  source := "abcde";

  Strings.CanExtractAll (Strings.Length (source), 0, 3, destination);
     => TRUE
  Strings.Extract (source, 0, 3, destination);
     => destination = "abc"

  Strings.CanExtractAll (Strings.Length (source), 3, 2, destination);
     => TRUE
  Strings.Extract (source, 3, 2, destination);
     => destination = "de"

  Strings.CanExtractAll (Strings.Length (source), 0, 4, destination);
     => FALSE
  Strings.Extract (source, 0, 4, destination);
     => destination = "abc"

  Strings.CanExtractAll (Strings.Length (source), 2, 4, destination);
     => FALSE
  Strings.Extract (source, 2, 4, destination);
     => destination = "cde"

  Strings.CanExtractAll (Strings.Length (source), 5, 1, destination);
     => FALSE
  Strings.Extract (source, 5, 1, destination);
     => destination = ""

  Strings.CanExtractAll (Strings.Length (source), 4, 0, destination);
     => TRUE
  Strings.Extract (source, 4, 0, destination);
     => destination = ""
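Strings.Delete (VAR stringVar: ARRAY OF CHAR; startPos, numberToDelete: INTEGER)
LongStrings.Delete (VAR stringVar: ARRAY OF LONGCHAR; startPos, numberToDelete: INTEGER)
The string value in stringVar is not altered if startPos is greater than or equal to Length(stringVar).
Pre-condition: startPos and numberToDelete are not negative.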
Strings.CanDeleteAll (stringLength, startPos, numberToDelete: INTEGER): BOOLEAN
LongStrings.CanDeleteAll (stringLength, startPos, numberToDelete: INTEGER): BOOLEAN
Returns TRUE if there are numberToDelete characters starting at startPos and within the stringLength of some string; otherwise returns FALSE.
Pre-condition: stringLength, startPos and numberToDelete are not negative.
Example:
  VAR
    stringVar: ARRAY 6 OF CHAR;

  stringVar := "abcd";
  Strings.CanDeleteAll (Strings.Length (stringVar), 0, 4);
     => TRUE
  Strings.Delete (stringVar, 0, 4);
     => stringVar = ""

  stringVar := "abcd";
  Strings.CanDeleteAll (Strings.Length (stringVar), 1, 2);
     => TRUE
  Strings.Delete (stringVar, 1, 2);
     => stringVar = "ad"

  stringVar := "abcd";
  Strings.CanDeleteAll (Strings.Length (stringVar), 0, 5);
     => FALSE
  Strings.Delete (stringVar, 0, 5);
     => stringVar = ""
Strings.Insert (source: ARRAY OF CHAR; startPos: INTEGER; VAR destination: ARRAY OF CHAR)
LongStrings.Insert (source: ARRAY OF LONGCHAR; startPos: INTEGER; VAR destination: ARRAY OF LONGCHAR)
The string value in destination is not changed if startPos is greater than Length(destination). If startPos = Length(destination), then source is appended to destination.
Pre-condition: startPos is not negative.
Strings.CanInsertAll (sourceLength, startPos: INTEGER; VAR destination: ARRAY OF CHAR): BOOLEAN
LongStrings.CanInsertAll (sourceLength, startPos: INTEGER; VAR destination: ARRAY OF LONGCHAR): BOOLEAN
Returns TRUE if there is room for the insertion of sourceLength characters from some string into destination starting at startPos; otherwise returns FALSE.
Pre-condition: sourceLength and startPos are not negative.
Example:
  VAR
    source:      ARRAY 6 OF CHAR;
    destination: ARRAY 8 OF CHAR;

  source := "abc"; destination := "012";
  Strings.CanInsertAll (Strings.Length (source), 1, destination);
     => TRUE
  Strings.Insert (source, 1, destination);
     => destination = "0abc12"

  destination := "012";
  Strings.CanInsertAll (Strings.Length (source), 3, destination);
     => TRUE
  Strings.Insert (source, 3, destination);
     => destination = "012abc"

  destination := "012";
  Strings.CanInsertAll (Strings.Length (source), 4, destination);
     => FALSE
  Strings.Insert (source, 4, destination);
     => destination = "012"

  source := "abcde"; destination := "012356";
  Strings.CanInsertAll (Strings.Length (source), 0, destination);
     => FALSE
  Strings.Insert (source, 0, destination);
     => destination = "abcde01"

  destination := "012356";
  Strings.CanInsertAll (Strings.Length (source), 4, destination);
     => FALSE
  Strings.Insert (source, 4, destination);
     => destination = "0123abc"
Strings.Replace (source: ARRAY OF CHAR; startPos: INTEGER; VAR destination: ARRAY OF CHAR)
LongStrings.Replace (source: ARRAY OF LONGCHAR; startPos: INTEGER; VAR destination: ARRAY OF LONGCHAR)
The string value in destination is not changed if startPos is greater than or equal to Length(destination).
Notice that Replace does not continue past the string terminator 0X in destination. That is, Length(destination) will never be changed by Replace.
Pre-condition: startPos is not negative.
Strings.CanReplaceAll (sourceLength, startPos: INTEGER; VAR destination: ARRAY OF CHAR): BOOLEAN
LongStrings.CanReplaceAll (sourceLength, startPos: INTEGER; VAR destination: ARRAY OF LONGCHAR): BOOLEAN
Returns TRUE if there is room for the replacement of sourceLength characters in destination starting at startPos; otherwise returns FALSE.
Pre-condition: sourceLength and startPos are not negative.
Example:
  VAR
    source, destination: ARRAY 6 OF CHAR;

  source := "ab"; destination := "1234";
  Strings.CanReplaceAll (Strings.Length (source), 0, destination);
     => TRUE
  Strings.Replace (source, 0, destination);
     => destination = "ab34"

  source := "abc"; destination := "1234";
  Strings.CanReplaceAll (Strings.Length (source), 2, destination);
     => FALSE
  Strings.Replace (source, 2, destination);
     => destination = "12ab"

  source := ""; destination := "1234";
  Strings.CanReplaceAll (Strings.Length (source), 4, destination);
     => TRUE
  Strings.Replace (source, 4, destination);
     => destination = "1234"

  source := ""; destination := "1234";
  Strings.CanReplaceAll (Strings.Length (source), 5, destination);
     => FALSE
  Strings.Replace (source, 5, destination);
     => destination = "1234"
Strings.Append (source: ARRAY OF CHAR; VAR destination: ARRAY OF CHAR)
LongStrings.Append (source: ARRAY OF LONGCHAR; VAR destination: ARRAY OF LONGCHAR)
Strings.CanAppendAll (sourceLength: INTEGER; VAR destination: ARRAY OF CHAR): BOOLEAN
LongStrings.CanAppendAll (sourceLength: INTEGER; VAR destination: ARRAY OF LONGCHAR): BOOLEAN
Returns TRUE if there is sufficient room in destination to append a string of length sourceLength to the string in destination; otherwise returns FALSE.
Pre-condition: sourceLength is not negative.
Example:
  VAR
    source, destination: ARRAY 6 OF CHAR;

  source := "12"; destination := "abc";
  Strings.CanAppendAll (Strings.Length (source), destination);
     => TRUE
  Strings.Append (source, destination);
     => destination = "abc12"

  source := "123"; destination := "abc";
  Strings.CanAppendAll (Strings.Length (source), destination);
     => FALSE
  Strings.Append (source, destination);
     => destination = "abc12"

  source := "123"; destination := "abcde";
  Strings.CanAppendAll (Strings.Length (source), destination);
     => FALSE
  Strings.Append (source, destination);
     => destination = "abcde"
Strings.Concat (source1, source2: ARRAY OF CHAR; VAR destination: ARRAY OF CHAR)
LongStrings.Concat (source1, source2: ARRAY OF LONGCHAR; VAR destination: ARRAY OF LONGCHAR)
Note that any previous contents of destination are overwritten by Concat.
Strings.CanConcatAll (source1Length, source2Length: INTEGER; VAR destination: ARRAY OF CHAR): BOOLEAN
LongStrings.CanConcatAll (source1Length, source2Length: INTEGER; VAR destination: ARRAY OF LONGCHAR): BOOLEAN
Returns TRUE if there is sufficient room in destination for two strings of lengths source1Length and source2Length; otherwise returns FALSE.
Pre-condition: source1Length and source2Length are not negative.
Example:
  VAR
    source1, source2: ARRAY 5 OF CHAR;
    destination:      ARRAY 6 OF CHAR;

  source1 := "12"; source2 := "abc";
  Strings.CanConcatAll (Strings.Length (source1), Strings.Length (source2), destination);
     => TRUE
  Strings.Concat (source1, source2, destination);
     => destination = "12abc"

  source1 := "123"; source2 := "abc";
  Strings.CanConcatAll (Strings.Length (source1), Strings.Length (source2), destination);
     => FALSE
  Strings.Concat (source1, source2, destination);
     => destination = "123ab"

  source1 := ""; source2 := "abc";
  Strings.CanConcatAll (Strings.Length (source1), Strings.Length (source2), destination);
     => TRUE
  Strings.Concat (source1, source2, destination);
     => destination = "abc"
These procedures provide for the comparison of string values, and for the location of substrings within strings.
Strings.Compare (stringVal1, stringVal2: ARRAY OF CHAR): CompareResults
LongStrings.Compare (stringVal1, stringVal2: ARRAY OF LONGCHAR): CompareResults
Returns less, equal, or greater, according as stringVal1 is lexically less than, equal to, or greater than stringVal2.
Please note: Oberon-2 already contains predefined comparison operators on strings.
CompareResults and its related constants are used with procedure Compare. The following constants are defined for its value: less, equal, and greater.
Example:
  VAR
    stringVal1, stringVal2: ARRAY 4 OF CHAR;

  stringVal1 := "abc"; stringVal2 := "abc";
  Strings.Compare (stringVal1, stringVal2);
     => equal

  stringVal1 := "abc"; stringVal2 := "abd";
  Strings.Compare (stringVal1, stringVal2);
     => less

  stringVal1 := "ab"; stringVal2 := "abc";
  Strings.Compare (stringVal1, stringVal2);
     => less

  stringVal1 := "abd"; stringVal2 := "abc";
  Strings.Compare (stringVal1, stringVal2);
     => greater
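A three-way result is most useful when all three outcomes are handled at once. The following is a minimal sketch, assuming that the constants are exported from module Strings as Strings.less, Strings.equal, and Strings.greater; the module name CompareDemo and the procedure Report are hypothetical, and module Out is assumed for output.

```oberon
MODULE CompareDemo;
(* Hypothetical demo module: reports the ordering of two strings. *)
IMPORT Out, Strings;

PROCEDURE Report (a, b: ARRAY OF CHAR);
  VAR res: Strings.CompareResults;
BEGIN
  res := Strings.Compare(a, b);
  Out.String(a);
  (* Compare against the named constants rather than numeric values. *)
  IF res = Strings.less THEN
    Out.String(" sorts before ")
  ELSIF res = Strings.equal THEN
    Out.String(" equals ")
  ELSE
    Out.String(" sorts after ")
  END;
  Out.String(b); Out.Ln
END Report;

BEGIN
  Report("abc", "abd");
  Report("abc", "abc");
  Report("abd", "abc")
END CompareDemo.
```

An IF chain is used here instead of CASE so that the sketch does not depend on how CompareResults is declared.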
Strings.Equal (stringVal1, stringVal2: ARRAY OF CHAR): BOOLEAN
LongStrings.Equal (stringVal1, stringVal2: ARRAY OF LONGCHAR): BOOLEAN
Returns stringVal1 = stringVal2. That is, Equal returns TRUE if the string value of stringVal1 is the same as the string value of stringVal2; otherwise, it returns FALSE. Unlike the predefined operator =, this procedure can be assigned to a procedure variable.
Example:
  VAR
    stringVal1, stringVal2: ARRAY 4 OF CHAR;

  stringVal1 := "abc"; stringVal2 := "abc";
  Strings.Equal (stringVal1, stringVal2);
     => TRUE

  stringVal1 := "abc"; stringVal2 := "abd";
  Strings.Equal (stringVal1, stringVal2);
     => FALSE

  stringVal1 := "ab"; stringVal2 := "abc";
  Strings.Equal (stringVal1, stringVal2);
     => FALSE
Strings.FindNext (pattern, stringToSearch: ARRAY OF CHAR; startPos: INTEGER; VAR patternFound: BOOLEAN; VAR posOfPattern: INTEGER)
LongStrings.FindNext (pattern, stringToSearch: ARRAY OF LONGCHAR; startPos: INTEGER; VAR patternFound: BOOLEAN; VAR posOfPattern: INTEGER)
If startPos < Length(stringToSearch) and pattern is found, patternFound is returned as TRUE and posOfPattern contains the start position in stringToSearch of pattern (i.e., posOfPattern is in the range [startPos..Length(stringToSearch)-1]).
Otherwise, patternFound is returned as FALSE and posOfPattern is unchanged.
If startPos > Length(stringToSearch)-Length(pattern), then patternFound is returned as FALSE.
Pre-condition: startPos is not negative.
Example:
  VAR
    pattern:        ARRAY 4 OF CHAR;
    stringToSearch: ARRAY 9 OF CHAR;
    found:          BOOLEAN;
    posOfPattern:   INTEGER;

  pattern := "ab"; stringToSearch := "ababcaba";
  Strings.FindNext (pattern, stringToSearch, 0, found, posOfPattern);
     => TRUE, posOfPattern = 0
  Strings.FindNext (pattern, stringToSearch, 1, found, posOfPattern);
     => TRUE, posOfPattern = 2
  Strings.FindNext (pattern, stringToSearch, 2, found, posOfPattern);
     => TRUE, posOfPattern = 2
  Strings.FindNext (pattern, stringToSearch, 3, found, posOfPattern);
     => TRUE, posOfPattern = 5
  Strings.FindNext (pattern, stringToSearch, 4, found, posOfPattern);
     => TRUE, posOfPattern = 5
  Strings.FindNext (pattern, stringToSearch, 5, found, posOfPattern);
     => TRUE, posOfPattern = 5
  Strings.FindNext (pattern, stringToSearch, 6, found, posOfPattern);
     => FALSE, posOfPattern unchanged

  pattern := ""; stringToSearch := "abc";
  Strings.FindNext (pattern, stringToSearch, 2, found, posOfPattern);
     => TRUE, posOfPattern = 2
  Strings.FindNext (pattern, stringToSearch, 3, found, posOfPattern);
     => FALSE, posOfPattern unchanged
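Because FindNext takes a start position and reports where the match begins, it can be called repeatedly to visit every occurrence. The following is a minimal sketch; the module name FindAllDemo and the procedure PrintAll are hypothetical, and module Out is assumed for output.

```oberon
MODULE FindAllDemo;
(* Hypothetical demo module: prints the start of every occurrence. *)
IMPORT Out, Strings;

PROCEDURE PrintAll (pattern, text: ARRAY OF CHAR);
  VAR found: BOOLEAN; pos: INTEGER;
BEGIN
  Strings.FindNext(pattern, text, 0, found, pos);
  WHILE found DO
    Out.Int(pos, 0); Out.Char(" ");
    (* Resume one position past the last match, so overlapping
       occurrences are reported as well. *)
    Strings.FindNext(pattern, text, pos + 1, found, pos)
  END;
  Out.Ln
END PrintAll;

BEGIN
  PrintAll("ab", "ababcaba")
END FindAllDemo.
```

The loop ends because the start position grows on every iteration and FindNext reports FALSE once the start position reaches Length(text). To skip overlapping matches instead, the restart position could be advanced by the pattern length, with care taken for an empty pattern.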
Strings.FindPrev (pattern, stringToSearch: ARRAY OF CHAR; startPos: INTEGER; VAR patternFound: BOOLEAN; VAR posOfPattern: INTEGER)
LongStrings.FindPrev (pattern, stringToSearch: ARRAY OF LONGCHAR; startPos: INTEGER; VAR patternFound: BOOLEAN; VAR posOfPattern: INTEGER)
If pattern is found, patternFound is returned as TRUE and posOfPattern contains the start position in stringToSearch of pattern (i.e., posOfPattern is in the range [0..startPos]; a match that begins exactly at startPos counts).
Otherwise, patternFound is returned as FALSE and posOfPattern is unchanged.
The search will fail if startPos is negative.
If startPos > Length(stringToSearch)-Length(pattern), the whole string value is searched.
Example:
  VAR
    pattern:        ARRAY 4 OF CHAR;
    stringToSearch: ARRAY 9 OF CHAR;
    found:          BOOLEAN;
    posOfPattern:   INTEGER;

  pattern := "abc"; stringToSearch := "ababcaba";
  Strings.FindPrev (pattern, stringToSearch, 1, found, posOfPattern);
     => FALSE, posOfPattern unchanged
  Strings.FindPrev (pattern, stringToSearch, 2, found, posOfPattern);
     => TRUE, posOfPattern = 2
  Strings.FindPrev (pattern, stringToSearch, 3, found, posOfPattern);
     => TRUE, posOfPattern = 2

  pattern := "ab"; stringToSearch := "ababcaba";
  Strings.FindPrev (pattern, stringToSearch, 0, found, posOfPattern);
     => TRUE, posOfPattern = 0
  Strings.FindPrev (pattern, stringToSearch, 1, found, posOfPattern);
     => TRUE, posOfPattern = 0
  Strings.FindPrev (pattern, stringToSearch, 2, found, posOfPattern);
     => TRUE, posOfPattern = 2
  Strings.FindPrev (pattern, stringToSearch, 3, found, posOfPattern);
     => TRUE, posOfPattern = 2
  Strings.FindPrev (pattern, stringToSearch, 4, found, posOfPattern);
     => TRUE, posOfPattern = 2
  Strings.FindPrev (pattern, stringToSearch, 5, found, posOfPattern);
     => TRUE, posOfPattern = 5

  pattern := ""; stringToSearch := "abc";
  Strings.FindPrev (pattern, stringToSearch, -1, found, posOfPattern);
     => FALSE, posOfPattern unchanged
  Strings.FindPrev (pattern, stringToSearch, 0, found, posOfPattern);
     => TRUE, posOfPattern = 0
  Strings.FindPrev (pattern, stringToSearch, 4, found, posOfPattern);
     => TRUE, posOfPattern = 3
Strings.FindDiff (stringVal1, stringVal2: ARRAY OF CHAR; VAR differenceFound: BOOLEAN; VAR posOfDifference: INTEGER)
LongStrings.FindDiff (stringVal1, stringVal2: ARRAY OF LONGCHAR; VAR differenceFound: BOOLEAN; VAR posOfDifference: INTEGER)
If the string values of stringVal1 and stringVal2 are equal, differenceFound is returned as FALSE, and as TRUE otherwise.
If differenceFound is TRUE, posOfDifference is set to the position of the first difference; otherwise posOfDifference is unchanged.
Example:
  VAR
    stringVal1, stringVal2: ARRAY 4 OF CHAR;
    diffFound:              BOOLEAN;
    posOfDiff:              INTEGER;

  stringVal1 := "abc"; stringVal2 := "abc";
  Strings.FindDiff (stringVal1, stringVal2, diffFound, posOfDiff);
     => FALSE, posOfDiff unchanged

  stringVal1 := "ab"; stringVal2 := "ac";
  Strings.FindDiff (stringVal1, stringVal2, diffFound, posOfDiff);
     => TRUE, posOfDiff = 1

  stringVal1 := "ab"; stringVal2 := "a";
  Strings.FindDiff (stringVal1, stringVal2, diffFound, posOfDiff);
     => TRUE, posOfDiff = 1
Strings.Length (stringVal: ARRAY OF CHAR): INTEGER
LongStrings.Length (stringVal: ARRAY OF LONGCHAR): INTEGER
Returns the number of characters in stringVal, up to (but not including) the terminating 0X.
Example:
  Strings.Length ("Hello, world");
     => 12

  VAR
    stringVal: ARRAY 6 OF CHAR;

  stringVal := "";
  Strings.Length (stringVal);
     => 0

  stringVal := "12";
  Strings.Length (stringVal);
     => 2
Recall that if you instead need the total size of the character array, you should use the standard Oberon-2 function procedure LEN:
  VAR
    aString: ARRAY 32 OF CHAR;

  aString := "Hello, world";
  LEN (aString)
     => 32
Strings.Capitalize (VAR stringVar: ARRAY OF CHAR)
LongStrings.Capitalize (VAR stringVar: ARRAY OF LONGCHAR)
Applies the function CAP to each character of the string value in stringVar.
Example:
  VAR
    stringVar: ARRAY 6 OF CHAR;

  stringVar := "abc";
  Strings.Capitalize (stringVar);
     => stringVar = "ABC"

  stringVar := "0aB";
  Strings.Capitalize (stringVar);
     => stringVar = "0AB"
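Capitalize can be combined with Equal to compare strings without regard to case. The following is a minimal sketch, not a library procedure: the module name CapsDemo and the function EqualIgnoreCase are hypothetical, and module Out is assumed for output. As noted for module CharClass, capitalization is not affected by the locale, so only the letters `a' through `z' are folded.

```oberon
MODULE CapsDemo;
(* Hypothetical demo module: case-insensitive string equality. *)
IMPORT Out, Strings;

PROCEDURE EqualIgnoreCase (a, b: ARRAY OF CHAR): BOOLEAN;
BEGIN
  (* a and b are value parameters, so only the local copies change;
     the caller's strings are left untouched. *)
  Strings.Capitalize(a);
  Strings.Capitalize(b);
  RETURN Strings.Equal(a, b)
END EqualIgnoreCase;

BEGIN
  IF EqualIgnoreCase("Hello", "hELLO") THEN
    Out.String("same")
  ELSE
    Out.String("different")
  END;
  Out.Ln
END CapsDemo.
```

Passing the strings by value costs a copy of each array on every call, which is a reasonable trade for short strings but worth reconsidering for large buffers.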