It can be confusing when a file extension is used in some contexts to imply record structure and in others to imply the functional content of the file, i.e., what app or family of apps should process the record content in the file. Of course, now that the extension is no longer restricted to 3 characters, the "vb" could also be part of the file type, with the remaining characters used to convey the functional content where that is appropriate, e.g., "vbtxt", or maybe just "vtxt". There is already some precedent for such usage: file types sometimes have an added "z" or "x" to indicate a zipped or compressed structure imposed on the underlying file content.

    JC Ewing

On 12/31/24 5:31 AM, Rupert Reynolds wrote:
I've long been searching for a solution that might become a de facto
standard. So far we have a collection of proprietary file formats.

But we should remember that C was originally not a general purpose
language, but a tool for a specific need.

Few would have predicted that C and its silly strings would power a new
wave of computing (which ignored many of the lessons learned even before
S/360 was released; as Alan Kay said, computing 'is a pop culture').

Thompson and Ritchie needed something to build an OS for lower cost
hardware and the assembler wasn't sophisticated enough to make that easy,
or even doable, but C worked for them.

They only needed to move short strings, so having the CPU run REPNZ SCASB
followed by REP MOVSB was an acceptable overhead. Oh yes, why was there no
REPNZ MOVSB?

One of the reasons I use Rexx for testing everything is that it doesn't
barf on magic numbers hidden in the data.

Roops

On Tue, 31 Dec 2024, 08:01 Clement Clarke, <ad...@oscar-jol.com> wrote:

Over many years, I have brought up the C string problem: both its speed
and its safety.  I developed some C macros which improve the situation.
However, I think that by simply introducing a new file type, say .VB (like
.EXE), many speed and safety issues can be corrected relatively simply, as
well as reducing the electricity and cooling water needed to shunt 5 billion
emails around the world each day.

The VB format would follow IBM's VB format.  Thus the length of the record
would precede the data, and allow records to be moved and copied without
constantly searching for binary zeros, or carriage returns and line feeds.

My tests have consistently shown that copying null-terminated C strings
takes at least about 2.5 times longer than copying strings whose length is
known in advance. Often the difference is greater, depending on the compiler
and associated routines.

I believe implementing such code would be relatively trivial.  And I
would also suggest the record length and blocksize be 4 bytes rather than 2
bytes.
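
To make the proposed layout concrete, here is a minimal sketch in C of
records carrying a 4-byte length prefix. The names vb_put and vb_get and the
big-endian prefix are assumptions made for illustration only; IBM's actual VB
format uses a 4-byte record descriptor word whose 2-byte length counts the
descriptor itself, which this sketch does not reproduce.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 4-byte big-endian length, then that many data bytes. */

/* Append one record at offset off. Returns the new offset, or 0 if no room. */
size_t vb_put(unsigned char *buf, size_t cap, size_t off,
              const void *data, uint32_t len)
{
    assert(buf != NULL && data != NULL);
    if (off > cap || cap - off < 4 || cap - off - 4 < len)
        return 0;
    buf[off]     = (unsigned char)(len >> 24);
    buf[off + 1] = (unsigned char)(len >> 16);
    buf[off + 2] = (unsigned char)(len >> 8);
    buf[off + 3] = (unsigned char)len;
    memcpy(buf + off + 4, data, len);   /* one block move, no scanning */
    return off + 4 + len;
}

/* Locate the record at offset off. Sets *data and *len and returns the
   offset of the next record, or 0 if the buffer is exhausted or truncated. */
size_t vb_get(const unsigned char *buf, size_t used, size_t off,
              const unsigned char **data, uint32_t *len)
{
    uint32_t n;
    assert(buf != NULL && data != NULL && len != NULL);
    if (off > used || used - off < 4)
        return 0;
    n = ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
        ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
    if (used - off - 4 < n)
        return 0;
    *data = buf + off + 4;
    *len = n;
    return off + 4 + n;
}
```

Because the length travels with the record, the data may contain binary
zeros or line-end characters without confusing the reader.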

The open routines could look to see if the file or data set is .VB, and if
so set a flag.  When a GET or equivalent is issued, then it is a simple
matter to read the block if necessary, and load the length of the record
and issue an MVCL or equivalent on other machines.
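
The open-and-GET flow described above might look roughly like the following
in portable C. This is only a sketch under stated assumptions: the vb_file
handle, the function names, and the 4-byte big-endian length prefix are all
hypothetical, blocking is ignored, and memcpy/fread stand in for MVCL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical handle: a stdio stream plus the flag set at open time. */
typedef struct {
    FILE *fp;
    int is_vb;
} vb_file;

/* Returns 1 if the name ends in ".vb" or ".VB", else 0. */
int has_vb_extension(const char *name)
{
    const char *dot = strrchr(name, '.');
    return dot != NULL &&
           (strcmp(dot, ".vb") == 0 || strcmp(dot, ".VB") == 0);
}

/* Open: look at the file name and set the flag. Returns 1 on success. */
int vb_open(vb_file *f, const char *name, const char *mode)
{
    f->fp = fopen(name, mode);
    f->is_vb = has_vb_extension(name);
    return f->fp != NULL;
}

/* GET: read the length, then read exactly that many bytes.
   Returns the record length, or -1 on end of file, error, or overflow. */
long vb_get_record(vb_file *f, unsigned char *out, size_t cap)
{
    unsigned char h[4];
    uint32_t len;
    assert(f != NULL && out != NULL);
    if (!f->is_vb || f->fp == NULL)
        return -1;
    if (fread(h, 1, 4, f->fp) != 4)
        return -1;
    len = ((uint32_t)h[0] << 24) | ((uint32_t)h[1] << 16) |
          ((uint32_t)h[2] << 8) | (uint32_t)h[3];
    if (len > cap || len > 0x7fffffffUL)
        return -1;                       /* record will not fit */
    if (fread(out, 1, len, f->fp) != len)
        return -1;                       /* truncated record */
    return (long)len;
}
```

Note that the read never searches the data for a terminator; the only
per-record decision is whether the record fits in the caller's buffer.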

Gradually, C and C++ could be changed internally to follow the excellent
PL/I practice of having the length of variable length strings at the front
of the strings which means searching for string terminators is not
necessary.
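
As an illustration of the idea, a counted string can already be written in
plain C. The lstr type and its functions below are hypothetical names, only
loosely modelled on PL/I varying strings; unlike PL/I, this sketch refuses an
assignment that would overflow rather than truncating it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: the length is stored, not searched for. */
typedef struct {
    size_t len;   /* bytes currently in use */
    size_t cap;   /* bytes allocated */
    char *data;   /* not NUL-terminated; may contain binary zeros */
} lstr;

/* Allocate room for up to cap bytes. Returns 1 on success, 0 on failure. */
int lstr_init(lstr *s, size_t cap)
{
    assert(s != NULL);
    s->data = malloc(cap ? cap : 1);
    s->len = 0;
    s->cap = cap;
    return s->data != NULL;
}

/* Copy n bytes in. Returns 0 and leaves dst unchanged if n exceeds cap. */
int lstr_assign(lstr *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (n > dst->cap)
        return 0;              /* bounds check costs one comparison */
    memcpy(dst->data, src, n); /* single block move, no terminator scan */
    dst->len = n;
    return 1;
}

/* Copy one counted string to another using the stored length. */
int lstr_copy(lstr *dst, const lstr *src)
{
    assert(src != NULL);
    return lstr_assign(dst, src->data, src->len);
}

/* Release the storage. */
void lstr_free(lstr *s)
{
    free(s->data);
    s->data = NULL;
    s->len = s->cap = 0;
}
```

Both the speed point and the safety point show up here: the copy is one
memcpy of a known size, and the destination capacity is checked before any
byte is moved.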

Kind regards, and a Happy New Year to all,

Clem Clarke
PS: Some further information is at
https://start.oscar-jol.com/fast-safe-c-strings





Charles Mills kindly suggested that C++ was the answer.  However, I think

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


--
Joel C Ewing
