
Re: 64-bit extensions



Michael van Acken wrote:

> [The quotes below are taken from Mike's, Hartmut's, and Tim's
> postings.] 
> 
> > It has never made much sense to not have any operations defined on
> > BYTEs since they are a basic data type all computers support.  I
> > don't see any advantage to defining an arbitrary SYSTEM entity the
> > same size as a BYTE.
> 
> There _are_ operations on SYSTEM.BYTE: you can assign SHORTINT and
> CHAR values to it.  And then there is the generic VAR parameter ARRAY
> OF SYSTEM.BYTE.  The _only_ purpose of this type is to circumvent the
> type system.  This is an unsafe operation, as the SYSTEM prefix nicely
> points out.  I'll never understand why the CP people chose BYTE as the
> name of a plain and simple integer type.

I guess they must think the way I do.  Why Wirth chose to allow a
way to break the type system is beyond me.
 
> > Of course, the argument might be made that this will break the type
> > safeness of Oberon-2.  But it's already broken.  My response would
> > be to close this loophole and not allow the concept of any type
> > being compatible with an ARRAY OF BYTE.  I don't know of any
> > functions which couldn't be performed in a type-safe way without
> > this loophole.
> 
> Yes, O2 has no true type safety as long as ARRAY OF BYTE is around.
> But you'll be hard pressed to implement Files.ReadBytes & friends
> without it.  Unless you turn the language into something that isn't O2
> anymore.

There _are_ other ways to do I/O without relying on ARRAY OF SYSTEM.BYTE.
Oberon/F, for example, has no such concept: lowest-level I/O is done
with the CHAR and ARRAY OF CHAR types.  Of course, you then need the
VAL() function to do the type casts.
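As a rough C analogy of that approach (C is not the language under
discussion, so take this only as a sketch): the I/O layer traffics in
plain character buffers, and typed values are reassembled from the
bytes explicitly rather than by passing the value's storage around as
an untyped ARRAY OF BYTE.

```c
#include <stdint.h>

/* Reassemble a 32-bit little-endian integer from a plain character
 * buffer, the way a CHAR-based I/O layer might do it.  The byte
 * order is an explicit choice here, not whatever the host happens
 * to use -- which is exactly what the untyped-array trick hides. */
int32_t read_int32_le(const unsigned char *buf)
{
    return (int32_t)((uint32_t)buf[0]
                   | (uint32_t)buf[1] << 8
                   | (uint32_t)buf[2] << 16
                   | (uint32_t)buf[3] << 24);
}
```

The point is that the conversion is localized and visible, instead of
being scattered across every call site that exploits the loophole.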

> A Java back-end isn't really an option.  You can't efficiently
> translate O2 to Java byte-code because there are lots of differences
> in the type and class concepts of the languages, and Java byte-code
> includes lots of Java type information.  It's (maybe) possible to
> conceive a back-end translating to Java, but it would be clumsy beyond
> measure and most likely slightly incompatible with standard O2.

Could you elaborate a little on this?  It seems that O2 and Java
are very close in architecture, so I'm surprised it would be that
difficult to produce byte-code output from an O2 compiler.  I know
Ada compilers exist which produce Java byte-code, so it is at
least possible.  I don't know how "clumsy" that back end is.

I've looked at the Java VM, and the types map very closely to those
in Oberon-2 (except we don't have a 64-bit integer and don't support
Unicode in the compiler).  They don't even have an unsigned integer
concept anymore.  As far as I can see, the class concepts are part
of the Java language but have nothing to do with the Java VM.

> [changing LONGINT to 64 bit]
> > > But wouldn't that break compatibility between the 32-bit and the
> > > 64-bit version?  Can files (ASCII, binary) written by the first be
> > > loaded by the second, and the other way round?
> > 
> > It was never intended to have the two binary compatible.  [...] If
> > an application takes care, it would be possible to have binary files
> > produced which are compatible on both systems but I don't see any
> > advantage to that.  Of course, the C-object/library files would be
> > useable on both systems.
> 
> I'll never accept a change to the type system in such a way that a
> module will read/write different data depending on the OOC compiler it
> was compiled under.  
> 
> As for usability of "C-object/library files": Assuming Mike refers to
> foreign modules implemented in C, like Files, SysClock, etc., changing
> the range of the types would break all of them.  Every module
> interfacing with C code (libc, X11, foreign implementations, etc)
> would have to be rewritten.  Keeping 2 versions of each would be a
> true maintenance nightmare.

I agree it would be a nightmare and once again I stress that the
changes I was proposing would be for a different language -- it would
not be Oberon-2 anymore, obviously.
 
> C compiler implementors take a fairly pragmatic approach when moving
> to a 64 bit platform: except for `long int' all types have the same
> size as on 32 bit architectures.
> 
> Type size for 32 bit compilers:
>   C type        size in bytes   OOC type
>   ===========   =============   =========
>   signed char   1               SHORTINT
>   short int     2               INTEGER
>   int           4               LONGINT
>   long int      4               ---
>   void*         4               LONGINT (aka SYSTEM.ADDRESS)
> 
> For a 64 bit compiler the last two lines change to
>   long int      8               HUGEINT
>   void*         8               HUGEINT (aka SYSTEM.ADDRESS)
> 
> This way proper code, i.e. all FOREIGN and INTERFACE modules and all
> modules using them, should work without changing a single byte both on
> 32 _and_ 64 bit systems.  Breaking this compatibility on a whim is _not_
> an option.

I never said it was -- at least not for OOC.  Given your constraint
about not "changing a single byte", there is no other solution for OOC.
But you can see that the base C types make more sense, since they
define the 16-bit type directly (short int) and then use a signed char
for 8 bits.  Of course, a "signed char" is conceptually nonsense (there
is no such thing as a signed character); "signed byte" would have been
a better name.  In the C integer hierarchy it is a natural progression
to make the 64-bit integer the LONGINT.  In a new compiler without all
the current biases, this would be a non-issue.  I believe that's why
the CP designers went the way they did.
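The table above can be checked mechanically.  The following C fragment
(a sketch, assuming a typical ILP32 or LP64 Unix compiler as in the
quoted text) records the sizes a compiler port would have to agree on;
note that only `long int' and pointers differ between the two models.

```c
#include <stddef.h>

/* The ILP32/LP64 convention from the table above: only `long int'
 * and `void*' change size between a 32-bit and a 64-bit build, so
 * FOREIGN/INTERFACE modules avoiding `long int' stay portable. */
size_t c_sizes[5] = {
    sizeof(signed char),   /* SHORTINT: 1 on both models            */
    sizeof(short int),     /* INTEGER : 2 on both models            */
    sizeof(int),           /* LONGINT : 4 on both models            */
    sizeof(long int),      /* 4 on ILP32, 8 on LP64 (HUGEINT)       */
    sizeof(void *)         /* 4 on ILP32, 8 on LP64 (ADDRESS)       */
};
```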

> > Almost everyone is also talking about using a Unicode character
> > set which are basically 16-bit characters.  I noticed the Component
> > Pascal has defined the following:
> > 
> >   SHORTCHAR = 0X - 0FFX    (perhaps in SYSTEM)
> >   CHAR      = 0X - 0FFFFX
> > 
> > What are people's thoughts about extending the CHAR type as
> > well?  Personally, this is not a big deal for me, but our
> > Asian colleagues might definitely be interested in something
> > like this.
> 
> This isn't an extension since you are changing something that already
> exists.  Adding a new character type (WCHAR, HUGECHAR, UNICHAR,
> LONGCHAR?) would be an option.

Very well, I'm not tied to the particular name.  I'm more concerned
with getting language support for an extended character range.  This
has more to do with interfacing naturally to other OSes which would
support extended character sets.  Again, in a language with no
historical biases, it is easy to see why the CP designers made the
choices they did.  In this particular mapping, they also seem to
be setting themselves up for a future addition of a LONGCHAR.  I
noticed that the LONGREAL designation had also disappeared.  Instead
they introduced a SHORTREAL for 32-bit IEEE and REAL for 64-bit IEEE
numbers.  Again, they seem to anticipate a LONGREAL 128-bit IEEE
number in the future.   
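The SHORTCHAR/CHAR split is cheap to support because the first 256
Unicode code points coincide with Latin-1.  A hypothetical conversion
routine (C sketch, not anything CP or OOC actually provides) is just a
zero-extension:

```c
#include <stdint.h>
#include <stddef.h>

/* Widen 8-bit Latin-1 characters (CP's SHORTCHAR) to 16-bit Unicode
 * code units (CP's CHAR).  Latin-1 values map to the Unicode code
 * points with the same numeric value, so no lookup table is needed. */
void widen_to_u16(const unsigned char *src, uint16_t *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = (uint16_t)src[i];
}
```

Going the other way is the lossy direction, which is why CP needs an
explicit SHORT() conversion there.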

> But: All support OOC could give is a new data type, conversion
> functions between standard CHAR and this type, and a set of library
> modules like Strings & friends.  But no Unicode IO.  I doubt that
> anyone would gain something with these changes.

International (e.g., Asian) users might disagree with you.  Java
compatibility would be another gain in my view.
 
> > Some advantages I see to these additions are that:
> > 
> > 1) Our data types would be much closer to the Java virtual
> >    machine; perhaps making an Oberon-2 to Java byte code
> >    back end simpler.
> 
> Why should we ever try to make O2 closer to the Java VM??? 

Because its designers clearly planned for future OSes.  We
should be ready for the same.  As I said, it would make a Java
byte-code back end simpler.

> > 2) The compiler would be ready for 64-bit machines.
> 
> There are better ways to do this.

For the existing OOC compiler we have no choices given your
self-imposed constraints.
 
> > 3) OOC would be compatible (sort of) with the Component
> >    Pascal compiler.
> 
> Why should we ever try to make O2 compatible with CP??? 

Because it is the most mainstream of the Oberon-2 systems.
(Ignoring the misnomer CP -- it should be called Oberon-3.)
If any one of these systems is going to prosper, it will be this
one, and being able to compile their code on a Unix platform (the
only major OS they don't support) could be advantageous to us by
giving us a broader customer base.  Of course, we can wait and
see how well they actually do.

> ------------------------------------------------------------------------
> 
[interesting stuff omitted]

> All of the above is implemented.  The only missing piece is that the
> compiler doesn't support 64 bit _constants_ yet.  Constants in O2 code
> are restricted to the range of LONGINT values.  Implementing this
> would mean to add a second HUGEINT version of all code pieces dealing
> with integer constants.  But we can't do this without a working 64 bit
> compiler.

We can simulate the 64-bit arithmetic without too much trouble.  The
designers of CP have obviously done just that.  The Integers module
actually already provides such support; it could be specialized for
64 bits if desired.
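To make the "simulate it" claim concrete, here is a hypothetical C
sketch of 64-bit addition on a 32-bit machine: the wide integer is
kept as two 32-bit halves and the carry out of the low word is
propagated by hand.  This is the kind of helper the constant folder
would need before a native 64-bit compiler exists.

```c
#include <stdint.h>

/* A 64-bit two's-complement value stored as two 32-bit halves. */
typedef struct { uint32_t lo; uint32_t hi; } huge_t;

/* Add two simulated 64-bit values.  The low words are added modulo
 * 2^32; if the result wrapped (r.lo < a.lo), a carry of 1 is added
 * into the high word. */
huge_t huge_add(huge_t a, huge_t b)
{
    huge_t r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo ? 1u : 0u);
    return r;
}
```

Multiplication and division are more work but follow the same
split-word pattern; an Integers-style bignum module already contains
all the required pieces.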

Michael G.