> Hans-Peter Diettrich wrote on Mon, 30 Jan 2012 17:40:27 +0100
> Existing source code frequently assumes ASCII encoding. The obvious are 
> upper/lowercase conversions, by and/or or add/sub constant values to the 
> characters. It will be hell to find and fix all such code in the 
> compiler and RTL, even if only the constants have to be modified for 
> EBCDIC. Even code with the assumed order of common characters (' ' < '0' 
> < 'A' < 'a') has to be found and fixed manually - how would you even 
> *find* code with such implicit assumptions?

It does indeed.  I am aware of the problems inherent in this.  But the RTL
has to be more or less rewritten anyway to support OS.  OS is a very different
animal to Windows or Linux.

But you would start with various searches, using grep or similar, for bits of
the code that use constants like '#7', and change them to fpc_Char_Bell or the
like, which would live in an fpcASCII or fpcEBCDIC unit.  You would search for
all the combinations you could think of: '['a'..'z']', '['A'..'Z']' etc.
Finally, having exhausted your ingenuity, you would be left with the old
stand-by of testing.
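
As a rough sketch of the named-constant idea (not a worked-out design): the
TARGET_EBCDIC define and the fpc_Char_Tab name are made up for illustration,
and in a real port the constants would sit in the fpcASCII or fpcEBCDIC unit
rather than in the program itself.

```pascal
program CharConstDemo;
{$mode objfpc}

{ Hypothetical: TARGET_EBCDIC is an assumed define, not an existing FPC one. }
const
{$ifdef TARGET_EBCDIC}
  fpc_Char_Bell = #$2F;   { BEL in EBCDIC }
  fpc_Char_Tab  = #$05;   { HT in EBCDIC }
{$else}
  fpc_Char_Bell = #7;     { BEL in ASCII }
  fpc_Char_Tab  = #9;     { HT in ASCII }
{$endif}

begin
  { Code refers to the control character by name, never by its number. }
  WriteLn('Bell code point: ', Ord(fpc_Char_Bell));
  WriteLn('Tab code point:  ', Ord(fpc_Char_Tab));
end.
```

Once every '#7' in the RTL has become fpc_Char_Bell, switching encodings is a
matter of which set of constants gets compiled in.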

A God-awful task, I know.  But what's the alternative?  A note in the
documentation for FreePascal/MVS that whenever you reference any external data
it is the user's responsibility to convert from ASCII to EBCDIC?  Really?
AssignFile(f,'SYS1.PARMLIB'): sorry, doesn't work, you forgot the ASCII
conversion.  WRITELN('Hello World') produces garbage on the user's terminal.
Who will they blame then?  JobSubmit(asciifile) will disappear from the face of
the planet because JES won't have a clue what to do with an ASCII file.

You can't convert automatically because you don't necessarily know whether the
user is writing ASCII, EBCDIC or binary.  What happens to

  MyRec = record
     Field1 : string;
     Field2 : char;
     Field3 : integer;
     end;

If we are using ASCII, should we be using little-endian numbers too?
 
> Next come character ranges, where letters are assumed contiguous in all 
> existing code and examples. Clearly this is true only for ASCII 
> ('a'..'z'), not for national characters like 'ä' or 'é', but the 
> compiler assumes ASCII source encoding all over. Fixing the set 
> constructor to make Set Of Char work with EBCDIC will be a challenge.
> 
> When a user e.g. picks up such example or library code from somewhere, 
> and finds that it doesn't work, he'll blame the compiler for malfunction.
> 
> An EBCDIC based compiler will disallow the use of any foreign libraries, 
> because a simple (syntactic) conversion from ASCII to EBCDIC encoding 
> doesn't cover beforementioned (semantic) issues :-(

A compiler is not just a tool for syntax analysis.  It has semantic routines
built in already.  It's up to us to use enough ingenuity to cater for as many
of these as possible.  Surely it should be possible to pick up stuff like
'a'..'z' at compile-time.
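
To show what the compiler would have to catch, here is a small sketch using
raw EBCDIC code points.  The lowercase letters sit in three separate runs, so
translating only the end points of 'a'..'z' sweeps up non-letters as well.  The
names (EbcdicLower, NaiveRange, CountMembers) are invented for the example;
this is not how the compiler represents sets internally.

```pascal
program EbcdicRangeDemo;
{$mode objfpc}

type
  TByteSet = set of Byte;

const
  { Lowercase letters in EBCDIC: a..i, j..r and s..z are separate runs. }
  EbcdicLower: TByteSet = [$81..$89, $91..$99, $A2..$A9];
  { What a..z becomes if only the two end points are translated. }
  NaiveRange: TByteSet = [$81..$A9];

{ Count how many byte values are members of the set. }
function CountMembers(const S: TByteSet): Integer;
var
  B: Byte;
begin
  Result := 0;
  for B := Low(Byte) to High(Byte) do
    if B in S then
      Inc(Result);
end;

begin
  WriteLn('Letters a..z:              ', CountMembers(EbcdicLower));
  WriteLn('Code points in naive range: ', CountMembers(NaiveRange));
  WriteLn('Non-letters swept up:       ', CountMembers(NaiveRange - EbcdicLower));
end.
```

A compiler that sees the literal range 'a'..'z' in a set constructor could, in
principle, expand it to the three runs instead of the naive range.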


Regards
Steve
> 
> Mark Morgan Lloyd wrote:

> I repeat: IBM is now happily using ASCII on zSeries. That includes the 
> CDSL system made available to developers 
> http://www-03.ibm.com/systems/z/os/linux/support/community.html

Yes. The Community Software Development for Linux on System/Z would use ASCII.
As we have already ascertained, Linux/390 is an ASCII system; using EBCDIC
would be slightly south of stupid.

CDSL doesn't run on OS.  Except possibly under USS.  Does anyone know if USS is
ASCII or EBCDIC?

(USS is Unix System Services, formerly known as Open/MVS.  It's a sort of
Unix-type look-alike-ish sort of thing that runs under versions of OS from
MVS/ESA SP 4.3 onwards.)

> I think the reason for producing an ASCII version first is very simple:

Converting the source from ASCII to EBCDIC isn't a huge problem.  There are
many much larger problems ahead :)

> 
> No - sending source code from a PC to a 370 performs an automatic translation 
> to EBCDIC (and vice versa).
> 
It depends on what you use to do the transfer and what options you specify.
These utilities are normally configurable.  FTP and IND$FILE are.  They're the
two I've used in the past.
 
> IBM 370 doesn't use ASCII, anywhere, but it has a hardware instruction (TRT _ 
> Translate and Test)
> which can convert between character sets in a single instruction using a 
> suitable table. 

Translate and Test wouldn't help.  Despite the name, it doesn't actually do any
translation as such.  The instruction you meant was TR (Translate).  To quote:
"Sorry, pedantry strong this one runs."
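
For anyone who hasn't met TR: it replaces each byte of a buffer with the entry
it indexes in a 256-byte table.  A minimal Pascal sketch of the same idea
follows.  The table here is deliberately partial (space, digits and uppercase
letters only) and the names are invented for the example; a real table would
map all 256 code points for a specific code page.

```pascal
program TranslateDemo;
{$mode objfpc}

type
  TXlatTable = array[Byte] of Byte;

var
  AsciiToEbcdic: TXlatTable;

{ Fill in only a few entries for illustration; a real table covers all 256. }
procedure InitTable;
var
  I: Integer;
begin
  for I := 0 to 255 do
    AsciiToEbcdic[I] := $6F;             { EBCDIC question mark for unmapped }
  AsciiToEbcdic[$20] := $40;             { space }
  for I := 0 to 9 do
    AsciiToEbcdic[$30 + I] := $F0 + I;   { digits 0..9 }
  for I := 0 to 8 do
  begin
    AsciiToEbcdic[$41 + I] := $C1 + I;   { A..I }
    AsciiToEbcdic[$4A + I] := $D1 + I;   { J..R }
  end;
  for I := 0 to 7 do
    AsciiToEbcdic[$53 + I] := $E2 + I;   { S..Z }
end;

{ Replace each byte with the table entry it indexes, as TR does in place. }
procedure Translate(var Buf: array of Byte; const Table: TXlatTable);
var
  I: Integer;
begin
  for I := Low(Buf) to High(Buf) do
    Buf[I] := Table[Buf[I]];
end;

var
  Data: array[0..4] of Byte = ($48, $45, $4C, $4C, $4F);  { HELLO in ASCII }
  I: Integer;
begin
  InitTable;
  Translate(Data, AsciiToEbcdic);
  for I := Low(Data) to High(Data) do
    Write(HexStr(Data[I], 2), ' ');
  WriteLn;
end.
```

Note the three separate loops for the letters: the same gaps in the EBCDIC
alphabet that break 'A'..'Z' ranges show up in the table.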

--
Regards
Steve
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
