Re: Revised LONGCHAR proposal
> From: "Eric W. Nikitin" <enikitin@apk.net>
> Date: Fri, 5 Mar 1999 13:55:32 -0500 (EST)
>
> Here is the revised proposal for LONGCHAR.
>
> If this is acceptable, I believe Michael van Acken will implement the
> compiler changes and probably module LongStrings as well,
I have. See companion mail.
> and I'll do the documentation. Can I get some volunteers to work on
> the rest of the library changes?
>
>
> Thanks,
> Eric
> ---
>
> In order to support the Unicode character set, OOC adds the type
> LONGCHAR and introduces the concept of long strings. The `character
> types' are now CHAR and LONGCHAR, and the `string types' are String
> and LongString.
This definition of "String"/"LongString" is a bit problematic, because
further down it is said, e.g., that "LONG(String)" is a legal
expression. It _is_ legal as long as the argument is a string
constant, but it is not legal if the argument is a variable of type
ARRAY OF CHAR.
Does anyone know if the O2 language report uses the term "string" with
a different meaning than "string constant"? I would prefer to reserve
"string" for constants, not for ARRAY values (even if they happen to
be terminated by a 0X). But we must find a way to distinguish the
different kinds of string constants, maybe string(CHAR) and
string(LONGCHAR), with simple "string" referring to either of them?
> I. Language
>
> The basic character types are as follows:
>
> * CHAR the characters of the ISO-Latin-1 (i.e., ISO-8859-1)
> character set (0X..0FFX)
>
> * LONGCHAR the characters of the Unicode character set
> (0X..0FFFFX)
>
> The character type LONGCHAR includes the values of type CHAR
> according to the following hierarchy:
>
> LONGCHAR >= CHAR
And
  LongString >= String
That is, a string constant composed of CHAR can also be used in place
of a string constant composed of LONGCHAR. The usual implicit type
conversion rules (as known from integer types) apply to character
values and string constants, too.
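A minimal sketch of how I read these rules, assuming the proposal is
implemented as written (the module and identifier names are made up
for illustration):

```oberon
MODULE ConvDemo;

CONST
  short = "abc";               (* string constant composed of CHAR *)
  long = 0C0ACX + 0C6A9X;      (* string constant composed of LONGCHAR *)

VAR
  c: CHAR;
  lc: LONGCHAR;
  a: ARRAY 8 OF CHAR;
  la: ARRAY 8 OF LONGCHAR;

BEGIN
  c := 41X;                    (* "A", fits into CHAR *)
  lc := c;                     (* CHAR value widened implicitly *)
  lc := 0C0ACX;                (* constant of type LONGCHAR *)
  la := short;                 (* CHAR string constant used as a long string *)
  la := long;
  a := short
  (* a := long;  rejected: there is no implicit narrowing *)
  (* c := lc;    rejected: needs an explicit SHORT(lc) *)
END ConvDemo.
```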
> Character constants are denoted by the ordinal number of the
> character in hexadecimal notation followed by the letter X. The type
> of a character constant is the minimal type to which the constant
> value belongs. (i.e., If the constant value is in the range
> `0X..0FFX', its type is CHAR; otherwise, it is LONGCHAR).
>
>
> Constant strings which consist solely of characters in the range
> `0X..0FFX' and strings stored in an ARRAY OF CHAR are of type String,
> all others are of type LongString.
>
> Constants strings can be represented using the string concatenation
> operator `+' and a combination of characters or string constants.
> For example, the following is of type LongString:
>
>   CONST
>     aLongString = 0C0ACX + 0C6A9X + " " + 0C2E4X + 0D328X;
Which is a string of Hangul syllables as far as I am able to determine
:-]
> The following predeclared function procedures support these
> additional operations:
>
>   Name        Argument type   Result type   Function
>   LONG(x)     CHAR            LONGCHAR      identity
>               String          LongString    identity
>
>   LONGCHR(x)  integer type    LONGCHAR      long character with
>                                             ordinal value x
>
>   ORD(x)      LONGCHAR        LONGINT       ordinal value of x
>
>   SHORT(x)    LONGCHAR        CHAR          projection
>               LongString      String        projection
>
> Please Note:
>
> SHORT(x), where x is of type LONGCHAR, can result in overflow, which
> triggers a compilation or run-time error. The result of an operation
> that causes an overflow, but is not detected as such, is undefined.
Also:
  CAP(x)      CHAR            CHAR
Maps lower case letters from ISO-Latin-1 to their capital counterparts
and is the identity for all other characters. Exceptions: U+00DF
(LATIN SMALL LETTER SHARP S), whose uppercase version is the two
letter sequence "SS", and U+00FF (LATIN SMALL LETTER Y WITH
DIAERESIS), whose capital version is outside the ISO-Latin-1 range (it
has the code U+0178). These two characters are also mapped onto
themselves.
[Gosh, I am being pedantic today! This should appear in the "Language
Specifications" section of the manual.]
  CAP(x)      LONGCHAR        LONGCHAR
Restricted to the range of CHAR, this function is equivalent to
CAP(CHAR). Outside this range it is equivalent to identity. [I am
adding this for symmetry reasons.]
MIN(LONGCHAR) and MAX(LONGCHAR) are also defined as expected.
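To illustrate, a short sketch under the definitions above (module and
variable names are made up; the values in the comments follow from the
ISO-Latin-1 code table and the ranges given in the proposal):

```oberon
MODULE CapDemo;

VAR
  c: CHAR;
  lc: LONGCHAR;

BEGIN
  c := CAP("a");               (* "A" *)
  c := CAP(0E9X);              (* 0C9X: e with acute -> E with acute *)
  c := CAP(0DFX);              (* 0DFX: sharp s is mapped onto itself *)
  c := CAP(0FFX);              (* 0FFX: y with diaeresis, mapped onto itself *)

  lc := "a";
  lc := CAP(lc);               (* "A", same result as CAP on CHAR *)
  lc := CAP(0C0ACX);           (* 0C0ACX: identity outside the CHAR range *)

  lc := MIN(LONGCHAR);         (* 0X *)
  lc := MAX(LONGCHAR)          (* 0FFFFX *)
END CapDemo.
```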
> The predeclared procedure COPY(x, v) also supports LongStrings.
>
>   Name        Argument type                 Function
>   COPY(x, v)  x: character array, string    v := x
>               v: character array
>
> Note that, COPY(x, v) is invalid if x is of type ARRAY OF CHAR, and v
> is of type LongString or ARRAY OF LONGCHAR.
>
>
> String types are assignment compatible as follows:
>
> An expression e of type Te is assignment compatible with a variable
> v of type Tv if one of the following conditions hold:
>
> 1. Tv is an array of LONGCHAR, Te is LongString or String, and
> LEN(e) < LEN(v);
> 2. Tv is an array of CHAR, Te is String, and LEN(e) < LEN(v);
>
>
> String types are array compatible as follows:
>
> An actual parameter a of type Ta is array compatible with a formal
> parameter f of type Tf if
>
> 1. Tf is an open array of LONGCHAR and Ta is LongString, or
> 2. Tf is an open array of CHAR and Ta is String.
Based on a (not up-to-date) copy of the language report, rule (3) of
"Array compatible" is probably better rephrased as

  An actual parameter a of type Ta is array compatible with a formal
  parameter f of type Tf if
  3a. f is a value parameter of type ARRAY OF CHAR and a is a String,
      or
  3b. f is a value parameter of type ARRAY OF LONGCHAR and a is a
      String or LongString.
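A sketch of what 3a/3b would permit, assuming they are adopted in this
form (Put and PutLong are hypothetical procedures, not part of any
library module):

```oberon
MODULE ParamDemo;

CONST
  short = "abc";               (* String *)
  long = 0C0ACX + 0C6A9X;      (* LongString *)

PROCEDURE Put (s: ARRAY OF CHAR);
BEGIN
END Put;

PROCEDURE PutLong (s: ARRAY OF LONGCHAR);
BEGIN
END PutLong;

BEGIN
  Put(short);                  (* 3a: String to ARRAY OF CHAR *)
  PutLong(short);              (* 3b: String to ARRAY OF LONGCHAR *)
  PutLong(long)                (* 3b: LongString to ARRAY OF LONGCHAR *)
  (* Put(long);  rejected: a LongString is not compatible with
     ARRAY OF CHAR *)
END ParamDemo.
```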
> Character and string types are expression compatible as follows:
>
>   Operator          First operand    Second operand   Result type
>   = # < <= > >=     character type   character type   BOOLEAN
>                     string type      string type      BOOLEAN
Just to make this clear: implicit type conversion rules apply to both
character values and string constants. They do _not_ apply to
character arrays.
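A sketch of the intended reading (names are made up; the lines marked
"rejected" show what this rule is meant to exclude, not what any
existing compiler currently reports):

```oberon
MODULE CompareDemo;

VAR
  c: CHAR;
  lc: LONGCHAR;
  a: ARRAY 8 OF CHAR;
  la: ARRAY 8 OF LONGCHAR;
  eq: BOOLEAN;

BEGIN
  c := "a"; lc := 0C0ACX;
  a := "abc"; la := "abc";

  eq := c < lc;                (* character values: c is widened *)
  eq := la = "abc"             (* string constant: converted to a long string *)
  (* eq := a = la;  rejected: character arrays are not converted *)
  (* la := a;       rejected for the same reason *)
END CompareDemo.
```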
> II. Library
>
> [...]
> CSTRING would also need to be changed to deal with
> LongStrings. (Or is it that we need to add CWIDESTRING?)
This is just a type flag governing the assignment rules of array
variables. So sticking to CSTRING is sufficient IMO.
-- mva