[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: ooc-980713 & oo2c-1.3.8



> I've tested TextRider and unfortunately it's broken in a
> number of ways; the simple problems are described below.  The
> more complicate problem is with eol auto-detection.  I used 8
> test files. 4 one-line files ending in: nothing, cr, lf, or
> cr/lf. A second 4 files are two-line versions of the first
> four.

Thanks for giving the module a thorough bashing.  Unfortunately you
seem to have tested an old version.  Please use the module from 1.3.8
and run your checks again.

>  When the eol is lf or cr/lf things work fairly well.  When
> its nothing or cr I can't seem to make a fix for one of the
> test files that doesn't break another!  The test is to call
> r.ReadLine() and get the correct characters and then r.Res() #
> 0 on subsequent calls.  I expect that ReadLn, and possibly all
> the ReadBool, etc are also all affected but haven't tested
> them.
> 
> 
> So here are two proposals:
> 1. Fix TextRider.
> or 
> 2. Remove auto-detection of eol.  Default eolLen=1, eol=0AX,
>    and require all other implementations to use SetEol().
> 
> ---Problems fixed:---
> (I have to change TextRider.mod to port it to BBox, so I'll
> just describe the fixes individually.  Original code is marked
> "old" and the fix is marked "new")
> 
> 1. 2line cr/lf test file.  Eol(): once deferredEol was set, it
> never looked ahead 2 to clear it.
> old:
>       ELSE
>         IF Lookahead (r, 1) & (r. la[0] = Ascii.cr) THEN
>           (* unfinished auto detection *)
>           RETURN TRUE

This is correct (the idea, not the code -- it has changed in
1.3.8).  It is not supposed to look ahead another character,
or to clear anything.  If deferredEol=TRUE, then it is at a
character that denotes an end-of-line, and can produce the
correct answer.  Doing an additional lookahead may block.

> 2. Typo in Consume
> (*bug 2: old
>         r. laRes[i-1] := r. laRes[i-1]
> new:*)
>         r. laRes[i-1] := r. laRes[i]

Ouch.  This one looks awfully familiar.  I know that I put a
correction for this weeks ago on http://home.t-online.de/...
This is also in 1.3.8, of course.

> 3. r.Res() returns 0 even after reading past the eof.  The
> problem is with Lookahead and Consume.  In this pair of
> functions Lookahead always clears r.byteReader.res and Consume
> propogates any lookahead res values back into
> r.byteReader.res.  If Consume is called last everything is
> fine.  If Lookahead is ever the last function of the pair to
> be called res will always be 0.  

`Lookahead' returning `done' is fine.

Quoting the specs:

PROCEDURE Lookahead (r: Reader; len: INTEGER): BOOLEAN;
(*
Tries to read `len' characters past the current position
from the input stream.  Characters present in the lookahead
FIFO are taken into account.  After successful completion,
`len' characters or more are available in the lookahead FIFO
and result is TRUE.  Less than `len' characters may be
available if the operation is aborted due to a read error.
In this case result is FALSE.

pre: (len >= 1) & (r.Res() = done) & (len <= maxLookahead) 
post: (r.Res() = done)
*)

Notice that it requires r.Res()=done, and asserts that
r.Res()=done after completion.  The reason for this is that
it does not consume any characters as far as the reader is
concerned.  Therefore it cannot set any error code, but it
has to store them for `Consume'.

> The fix to ReadLine is as
> follows but it might have to be repeated in many
> functions...there's probably a better way.
> 
> (*bug#3: new:*)
> 	IF r.Available() <= 0 THEN
> 		r.byteReader.res = Channel.readAfterEnd;
> 		s[0] := 0X;
> 		RETURN
> 	END;

Note that r.Available()=0 does not imply `readAfterEnd'.

> 4. Available() returns > 0 after eof because it was
> considering anything in la valid without checking laRes.
> 
> (*bug #4: old:
>     IF (avail < 0) & (r. laLen > 0) THEN
>       RETURN r. laLen
>     ELSE
>       RETURN avail+r. laLen
>     END
> new:*)
>   numNonChars := 0; 
>   FOR i := 0 TO r.laLen - 1 DO
>     IF r.laRes[i] # done THEN INC(numNonChars); END;
>   END;
>   IF (avail < 0) & (r. laLen > 0) THEN
>      RETURN r.laLen - numNonChars;
>   ELSE
>       RETURN avail+r. laLen - numNonChars;
>   END

I agree with you here.  I will incorporate the fix into my
sources.  

I believe that only the last entry of `laRes',
`r.laRes[r.laLen-1]', can ever be something different than
`done', but I have never investigated this in detail.  In
other words, keeping an array of old result is probably not
necessary.  On the other hand, I am to lazy to change this
now :-)

-- mva