Re: TextRider bug on last token
> The question still remains, how different end of line conventions
> can be selected for the TextRider module. The current
> implementation hardwires a single end of line character (ASCII.lf,
> I believe) into the module. I think we must make the eol
> convention an attribute of a rider instance instead. Then, to
That would make sense.
> deal with MS-DOG's CR/LF, the mechanism to "unget" characters must
> be extended to deal with 2 or more characters. I believe, MS-DOG
Yes, that makes things very messy. Maybe the end-of-line "character"
should be defined as a string that is treated as a single aggregate
character: both CR and LF are read, but only the CR is actually returned.
When ungetting, maybe it's enough to unget just the CR. Would
that cause problems for anyone? Obviously, the position would also
have to be advanced by 2 whenever a CR/LF eol is reached.
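To make the aggregate-eol idea concrete, here is a rough sketch in Python rather than Oberon-2. It is not TextRider code: the class `EolReader` and its methods `read_char` and `unget` are made-up names, and the one-level unget is an assumption. The eol string is an attribute of each reader instance, a whole eol sequence comes back as its first character, and the position moves by the full width of the sequence.

```python
class EolReader:
    """Hypothetical reader; the names are illustrative, not TextRider's API."""

    def __init__(self, data, eol="\r\n"):
        self.data = data
        self.eol = eol          # end-of-line convention, per reader instance
        self.pos = 0            # position in the underlying data
        self._last_width = 0    # width of the last unit read, used by unget

    def read_char(self):
        """Return the next character; an eol sequence is returned as its first character."""
        if self.pos >= len(self.data):
            self._last_width = 0
            return "\x00"       # 0X signals end of input
        if self.data.startswith(self.eol, self.pos):
            # Consume the whole sequence but report only one character.
            self._last_width = len(self.eol)
            self.pos += self._last_width
            return self.eol[0]
        self._last_width = 1
        self.pos += 1
        return self.data[self.pos - 1]

    def unget(self):
        """Push back the last unit read (one level only)."""
        self.pos -= self._last_width
        self._last_width = 0


reader = EolReader("a\r\nb")
first = reader.read_char()      # "a"
eol = reader.read_char()        # "\r", position is now 3
reader.unget()                  # position is back at 1
```

Rewinding the position by the recorded width has the same effect as ungetting just the CR, since the next read recognizes the full sequence again. A lone CR that is not followed by LF would still be returned as a plain character here, so the caller cannot tell it apart from an eol; that case is left open.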
> also has an end of file character (^Z). Do we need to tackle
> this, too?
I believe that will cause problems, since the ^Z will be read as a valid
character -- assuming that control characters are enabled. The current
method of signaling an end of file (at least within the scanner) is to
return a 0X character. I suppose that could be altered to be a ^Z
character instead.
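A small Python sketch of the difference, again only as an illustration: `scan_chars` and its `eof_marker` parameter are hypothetical names, not anything in the scanner. Here the end-of-file marker is optional, and 0X is still what gets returned at the end in either case.

```python
CTRL_Z = "\x1a"   # MS-DOS end-of-file character
NUL = "\x00"      # 0X, the end-of-file signal


def scan_chars(data, eof_marker=None):
    """Yield the characters of data, then NUL; stop early at eof_marker if one is set."""
    for ch in data:
        if eof_marker is not None and ch == eof_marker:
            break             # treat the marker as end of file
        yield ch
    yield NUL


# Without a marker, ^Z passes through as an ordinary character.
plain = list(scan_chars("ab\x1acd"))
# With the marker set, everything from ^Z onward is ignored.
dos = list(scan_chars("ab\x1acd", eof_marker=CTRL_Z))
```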
Michael G.