
Re: TextRider bug on last token



> The question still remains, how different end of line conventions
> can be selected for the TextRider module.  The current
> implementation hardwires a single end of line character (ASCII.lf,
> I believe) into the module.  I think we must make the eol
> convention an attribute of a rider instance instead.  Then, to

That would make sense.

> deal with MS-DOG's CR/LF, the mechanism to "unget" characters must
> be extended to deal with 2 or more characters.  I believe, MS-DOG

Yes, that makes things very messy.  Maybe the end-of-line "character"
should be defined as a string which is treated as an aggregate
character.  Both CR and LF are read, but only a CR is actually returned.
When ungetting, maybe it's enough to unget just the CR.  Would
that cause problems for anyone?  Obviously, the position would also
have to be advanced by 2 whenever a two-character eol is reached.
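
To make the aggregate idea concrete, here is a rough sketch.  It is in
Python rather than Oberon-2 purely for brevity, and it is not the
TextRider code: the class EolReader and its methods read_char and unget
are hypothetical names, and it reads from an in-memory string instead
of a channel.  It assumes a single level of unget, and it rewinds the
position by the full width of the eol string, which is one possible
reading of "unget just the CR".

```python
class EolReader:
    """Hypothetical reader that treats a multi-character eol as one character."""

    def __init__(self, data, eol="\r\n"):
        self.data = data          # whole input, held in memory for simplicity
        self.eol = eol            # eol convention is an attribute of the instance
        self.pos = 0              # position in the underlying data
        self._last_width = 0      # width of the most recently read character

    def read_char(self):
        """Return the next character, or "" at end of input."""
        if self.pos >= len(self.data):
            self._last_width = 0
            return ""
        if self.data.startswith(self.eol, self.pos):
            # Consume the whole eol string but hand back only its first
            # character (CR for CR/LF); the position moves by len(eol).
            self._last_width = len(self.eol)
            self.pos += self._last_width
            return self.eol[0]
        self._last_width = 1
        self.pos += 1
        return self.data[self.pos - 1]

    def unget(self):
        """Push back the last character read (one level only)."""
        self.pos -= self._last_width
        self._last_width = 0


reader = EolReader("a\r\nb")
first = reader.read_char()    # ordinary character, position moves by 1
second = reader.read_char()   # aggregate eol, position moves by 2
reader.unget()                # position goes back to the start of the eol
```

With eol="\n" the same reader handles the single-character convention,
so the caller never needs to know how wide the eol is.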

> also has an end of file character (^Z).  Do we need to tackle
> this, too?

I believe ^Z will cause problems, since it will be read as a valid
character -- assuming that control characters are enabled.  The current
method of signaling an end of file (at least within the scanner) is to
return a 0X character.  I suppose that could be altered to be a ^Z
character instead.
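
One way this might look, again as a Python sketch and not the actual
scanner: the function next_char and the flag ctrl_z_is_eof are made-up
names.  This variant keeps 0X as the end-of-file signal and maps ^Z
onto it when the flag is set, which is a slight variation on replacing
the signal with ^Z itself.

```python
CTRL_Z = "\x1a"     # MS-DOS end-of-file character (^Z)
EOF_MARK = "\x00"   # the 0X character used to signal end of file


def next_char(data, pos, ctrl_z_is_eof=True):
    """Return the character at pos, or EOF_MARK at end of file."""
    if pos >= len(data):
        return EOF_MARK           # physical end of the data
    ch = data[pos]
    if ctrl_z_is_eof and ch == CTRL_Z:
        return EOF_MARK           # logical end of file marked by ^Z
    return ch                     # otherwise ^Z is just another character


text = "ab" + CTRL_Z + "cd"
as_eof = next_char(text, 2)                          # ^Z treated as end of file
as_char = next_char(text, 2, ctrl_z_is_eof=False)    # ^Z read as a valid character
```

The flag would have to be a per-rider setting, like the eol convention,
since ^Z is legitimate data on systems other than MS-DOS.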

Michael G.