Re: LONGCHAR proposal
> > > > SHORT(x) LONGCHAR CHAR projection
> > >
> > > This will lose the most significant part of the 2-byte Unicode
> > > character.
> >
> > True. But are you saying that this simply needs to be noted in the
> > proposal? Or that SHORT shouldn't be defined to operate on LONGCHAR
> > and LongStrings?
>
> There are two ways to deal with this loss of information.
>
> 1) Truncation. Disadvantage: The character mapping to latin-1 is
> quite arbitrary.
>
> 2) Mapping of 0100X..FFFFX onto a single character, e.g. "?".
> Advantage: More deterministic, and the shorted string can be
> readable if the original Unicode text uses mostly latin-1
> characters, e.g. if it is an English text with a few special
> characters like quotes, hyphens, and the like. The effect would be
> pretty much like viewing an HTML document produced by a MS product
> with a non-Windows browser.
Silly me. SHORT(LONGCHAR) is integer arithmetic, and integer
arithmetic can overflow. If it overflows it will trigger a
compilation or run-time error. The result of an operation that causes
an overflow, but is not detected as such, is undefined. This, of
course, is The Right Way(tm) to deal with SHORT(LONGCHAR).
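
For comparison, here is a rough Python sketch of the three behaviours
discussed in this thread: truncation, mapping to "?", and a checked
conversion that treats an out-of-range value as an overflow. The
function names are made up for illustration and are not part of the
proposal or of any compiler. One-character Python strings stand in for
LONGCHAR values, and OverflowError stands in for the run-time error.

```python
LATIN1_MAX = 0xFF  # largest code point that fits in an 8-bit CHAR


def short_truncate(ch: str) -> str:
    """Option 1: keep only the low byte and drop the high byte."""
    return chr(ord(ch) & 0xFF)


def short_replace(ch: str, fallback: str = "?") -> str:
    """Option 2: map everything in 0100X..FFFFX onto one fallback character."""
    return ch if ord(ch) <= LATIN1_MAX else fallback


def short_checked(ch: str) -> str:
    """Overflow semantics: values that do not fit raise an error."""
    if ord(ch) > LATIN1_MAX:
        raise OverflowError(f"code point {ord(ch):04X}X does not fit in CHAR")
    return ch


if __name__ == "__main__":
    euro = "\u20ac"  # outside latin-1
    print(hex(ord(short_truncate(euro))))  # low byte of 20ACX
    print(short_replace(euro))
    try:
        short_checked(euro)
    except OverflowError as err:
        print("overflow:", err)
```

With truncation the euro sign turns into whatever latin-1 character
happens to share its low byte, which is the arbitrariness mentioned
above. The checked variant never produces a misleading character.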
-- mva