Re: TextRider bug on last token
- To: ooc-list@informatik.uni-kl.de
- Subject: Re: TextRider bug on last token
- From: Ian Rae <ianrae@istar.ca>
- Date: Fri, 26 Jun 1998 22:20:40 -0400
> From: Mark K. Gardner <mkgardne@rtsl3.cs.uiuc.edu>
> Stewart Smith <ssmith@murdoch.edu.au> wrote:
> > ..snip..
> ..snip..
> My favorite solution, based on experience gained by porting many
> programs between Unix, Macintosh, and MS-DOS / Windows, is to always
> read files in binary mode (using #ifdef's as appropriate). I then
> convert CR-LF or CR sequences to LF before I use the data as text.
> (Most usually, I have a compiler-like scanner anyway so accepting CR,
> LF, or CR-LF is not a problem in my software.)
> ..snip..
> 
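For reference, the conversion Mark describes could look roughly like this in C. This is only a sketch under my own assumptions: the name normalize_eol is hypothetical, and the buffer is assumed to have been read in binary mode already.

```c
#include <stddef.h>

/* Hypothetical helper (not part of OOC): rewrite CR-LF and lone CR
   as LF, in place. Returns the new length of the data in buf. */
size_t normalize_eol(char *buf, size_t len)
{
    size_t in = 0, out = 0;

    while (in < len) {
        char c = buf[in++];
        if (c == '\r') {
            /* CR-LF collapses to a single LF; a lone CR becomes LF */
            if (in < len && buf[in] == '\n') {
                in++;
            }
            c = '\n';
        }
        buf[out++] = c;
    }
    return out;
}
```

The output is never longer than the input, so converting in place is safe.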
Here's a proposal for eol support in TextRider.  Its goals are:
-eol handling can be set on a per TextRider object basis
-OOC would support each platform's native eol format (POSIX, DOS, Mac)
-Ott (my text i/o library) could adjust eol format on per-channel basis
-TextRider.Mod stays as Oberon-2 code (no C code)
----Text End Of Line Handling Proposal----
1. Add a new module that would be implemented in C so that it could
be platform-specific using #ifdef.
MODULE TextEOL;
(*
    TextEOL -- defines end-of-line handling for text data streams.
*)
IMPORT Ascii;
CONST
(* eolTypes -- add more if necessary *)
 isLF* = 0;    (* exported so that clients can pass them to Force *)
 isCRLF* = 1;
 isCR* = 2;
TYPE
  EndOfLine* = POINTER TO EndOfLineDesc;
  EndOfLineDesc* = RECORD
    type-: SHORTINT;   (* one of the isXXX constants *)
    prefix-: CHAR;     (* leading eol char, e.g. cr; undefined if length # 2 *)
    eol-: CHAR;        (* final or only eol char *)
    length-: SHORTINT; (* 1 or 2 *)
  END;
PROCEDURE Init*(VAR eol: EndOfLine);
(* init to platform's eol format *)
BEGIN
NEW(eol); (* eol is a pointer, so the record must be allocated first *)
(* the following is presented in C *)
eol.type = isLF;
#ifdef _MSDOS_
eol.type = isCRLF;
#endif
/* handle Mac here with ifdef too... */
Force(eol, eol.type);
END Init;
PROCEDURE Force*(VAR eol: EndOfLine; type: SHORTINT);
(* useful when channel-specific eol handling is required *)
BEGIN
(* the following is presented in C *)
eol.type = type;
switch(eol.type) {
	default:	/* unknown types are treated as LF */
	case isLF:	eol.prefix = 0X;
		eol.eol = Ascii.lf;
		eol.length = 1;
		break;
	case isCRLF:	eol.prefix = Ascii.cr;
		eol.eol = Ascii.lf;
		eol.length = 2;
		break;
	case isCR:	eol.prefix = 0X;
		eol.eol = Ascii.cr;
		eol.length = 1;
		break;
}
END Force;
END TextEOL;
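Since the module above mixes Oberon-2 and C notation, here is a rough sketch of how the C side alone might look. Everything here is an assumption made for illustration: the names EndOfLine, eol_force and eol_init are hypothetical, and the _WIN32 test is added alongside the _MSDOS_ macro used above as one plausible way to detect a CR-LF platform. Unlike the pseudocode, an unknown type is also recorded as LF.

```c
/* eol types, mirroring isLF / isCRLF / isCR above */
enum { EOL_LF = 0, EOL_CRLF = 1, EOL_CR = 2 };

typedef struct {
    int  type;    /* one of the EOL_xxx constants */
    char prefix;  /* leading eol char; meaningful only if length == 2 */
    char eol;     /* final or only eol char */
    int  length;  /* 1 or 2 */
} EndOfLine;

/* Set e to the given eol type; unknown types fall back to LF. */
void eol_force(EndOfLine *e, int type)
{
    switch (type) {
    case EOL_CRLF:
        e->type = EOL_CRLF; e->prefix = '\r'; e->eol = '\n'; e->length = 2;
        break;
    case EOL_CR:
        e->type = EOL_CR;   e->prefix = '\0'; e->eol = '\r'; e->length = 1;
        break;
    default:
        e->type = EOL_LF;   e->prefix = '\0'; e->eol = '\n'; e->length = 1;
        break;
    }
}

/* Set e to the platform's native eol format (macro choice is an assumption). */
void eol_init(EndOfLine *e)
{
#if defined(_MSDOS_) || defined(_WIN32)
    eol_force(e, EOL_CRLF);
#else
    eol_force(e, EOL_LF);
#endif
    /* a Mac branch selecting EOL_CR would go here */
}
```

Declaring eol_force before eol_init avoids needing a forward declaration.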
2.  Then change TextRider to use the new eol type
 IMPORT TextEOL, ...
  Reader* = POINTER TO ReaderDesc;
  ReaderDesc* = RECORD
	eol-: TextEOL.EndOfLine;	
	...
  END;
  Writer* = POINTER TO WriterDesc;
  WriterDesc* = RECORD
	eol-: TextEOL.EndOfLine;	
	...
  END;
 and ConnectReader would do
    r: Reader; t: Channel.Reader;
  BEGIN
    t:=ch.NewReader();
    IF t=NIL THEN RETURN NIL END;
    NEW(r);
    TextEOL.Init(r.eol);   (* <==== added *)
    ...
  ConnectWriter would do the same...
 Note: after calling Init, my portable text i/o library's version of
TextRider could call Force based on channel-specific information.
 Then TextRider would use the eol object in several places.
a) ReadLine would now end:
	IF (cnt > 0) & (r.eol.length = 2) & (s[cnt-1] = r.eol.prefix) THEN
		DEC(cnt);
	END;
    s[cnt]:=0X (* terminate string *)
  END ReadLine;
b) WriteLn becomes
PROCEDURE (w: Writer) WriteLn*, NEW;
(* Write a newline *)
  BEGIN
	IF w.eol.length = 2 THEN
	   w.WriteChar(w.eol.prefix);
	END;
    w.WriteChar(w.eol.eol)
  END WriteLn;
c) I haven't added the UngetChar() eol handling.  It'll be a bit ticklish,
but it is better done here in TextRider than in Channel, which is more
position sensitive.
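To make a) and b) concrete, here is the same logic as a small C sketch. The names Eol, strip_eol_prefix, eol_bytes and write_ln are hypothetical, and the stream is assumed to be opened in binary mode so the C library does not translate line endings itself.

```c
#include <stdio.h>
#include <stddef.h>

typedef struct {
    char prefix;  /* leading eol char, used only if length == 2 */
    char eol;     /* final or only eol char */
    int  length;  /* 1 or 2 */
} Eol;

/* a) After a line has been read up to (not including) the final eol
   char, drop a trailing prefix char such as CR. s must have room for
   the terminator at s[cnt]. Returns the new length. */
size_t strip_eol_prefix(char *s, size_t cnt, const Eol *e)
{
    if (cnt > 0 && e->length == 2 && s[cnt - 1] == e->prefix) {
        cnt--;
    }
    s[cnt] = '\0';  /* terminate string */
    return cnt;
}

/* b) Store the eol sequence in out and return its length (1 or 2). */
size_t eol_bytes(const Eol *e, char out[2])
{
    size_t n = 0;
    if (e->length == 2) {
        out[n++] = e->prefix;
    }
    out[n++] = e->eol;
    return n;
}

/* Write a newline to f; returns 0 on success, -1 on a write error. */
int write_ln(FILE *f, const Eol *e)
{
    char buf[2];
    size_t n = eol_bytes(e, buf);
    return fwrite(buf, 1, n, f) == n ? 0 : -1;
}
```

With a one-character eol, strip_eol_prefix leaves a trailing CR in place, which matches the ReadLine change in a).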
3.  Ctrl-Z handling for DOS is not IMO required.  It hasn't been commonly
used since about DOS 2 or DOS.  Certainly none of the Windows code I've
written in the last 10 years ever used it.
I hope this is not overkill.  It should cover the bases.
--Ian