True, I typed in haste, but I think my point still stands: most string data
was smaller back then, and we didn't have thousands of buggy programs blindly
copying data from a socket into a static buffer until they hit a magic 0x00.

I was assuming that the PDP-7 would have a handy equivalent to REPNZ SCASB,
otherwise to my mind they wouldn't have used null-terminated strings so
readily.

I still want to know why Intel didn't implement REPNZ MOVSB, which would
have worked better with C strings :-)
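
To make the question concrete, here is a rough sketch in C of the two
approaches. It is only an illustration: the names copy_two_pass and
copy_one_pass are made up for this post, and nothing here is meant to
describe how any particular C library or CPU actually does it. The first
routine scans for the terminator and then block-copies, which is roughly the
REPNZ SCASB plus REP MOVSB sequence; the second copies and tests each byte in
a single pass, which is what I imagine a REPNZ MOVSB would have done.

```c
#include <stddef.h>
#include <string.h>

/* Two passes: find the terminator (the REPNZ SCASB step), then block-copy
 * (the REP MOVSB step). Returns the string length, or (size_t)-1 if the
 * string plus its terminator does not fit in dst. */
size_t copy_two_pass(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);           /* pass 1: scan for 0x00 */
    if (len >= dst_size)
        return (size_t)-1;              /* nothing is copied on failure */
    memcpy(dst, src, len + 1);          /* pass 2: copy, terminator included */
    return len;
}

/* One pass: copy each byte and stop once the terminator has been moved.
 * Returns the string length, or (size_t)-1 if dst filled up first; in that
 * case dst holds dst_size bytes and is NOT terminated. */
size_t copy_one_pass(char *dst, size_t dst_size, const char *src)
{
    for (size_t i = 0; i < dst_size; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            return i;
    }
    return (size_t)-1;
}
```

Both take the size of the destination, so neither can run off the end of a
static buffer.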

But explicit length is definitely the way to go in my book, along with the
ability to do record I/O as well as stream I/O.
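
As a rough sketch of what I mean by explicit length, assuming a simplified
layout of my own (a 4-byte big-endian length followed by the payload, which
is not IBM's actual VB record and block descriptor words), something like the
following would do. The names rec_put and rec_get are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append one record (4-byte big-endian length, then the payload) at
 * buf + off. Returns the offset just past the record, or 0 if it does not
 * fit within cap bytes. */
size_t rec_put(unsigned char *buf, size_t cap, size_t off,
               const void *data, uint32_t len)
{
    if (off > cap || cap - off < 4 || cap - off - 4 < len)
        return 0;
    buf[off]     = (unsigned char)((len >> 24) & 0xFF);
    buf[off + 1] = (unsigned char)((len >> 16) & 0xFF);
    buf[off + 2] = (unsigned char)((len >> 8) & 0xFF);
    buf[off + 3] = (unsigned char)(len & 0xFF);
    memcpy(buf + off + 4, data, len);   /* no scan for a terminator needed */
    return off + 4 + len;
}

/* Locate the record at buf + off without copying it. Sets *data and *len
 * and returns the offset of the next record, or 0 if fewer than 4 + length
 * bytes remain before offset 'used'. */
size_t rec_get(const unsigned char *buf, size_t used, size_t off,
               const unsigned char **data, uint32_t *len)
{
    if (off > used || used - off < 4)
        return 0;
    uint32_t n = ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16)
               | ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
    if (used - off - 4 < n)
        return 0;
    *data = buf + off + 4;
    *len = n;
    return off + 4 + n;
}
```

Because the length travels with the data, a payload can contain 0x00 bytes
without confusing the reader.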

Roops


On Tue, 31 Dec 2024, 11:46 Jay Maynard, <
000005997213d6c2-dmarc-requ...@listserv.ua.edu> wrote:

> Thompson and Ritchie were working on the PDP-7 and PDP-11 during the
> formative years of C and its predecessors B and BCPL, not the x86, so they
> didn't have REP MOVSB and friends available. From a Bell Labs article on
> the evolution of C:
>
> "None of BCPL, B, or C supports character data strongly in the language;
> each treats strings much like vectors of integers and supplements general
> rules by a few conventions. In both BCPL and B a string literal denotes the
> address of a static area initialized with the characters of the string,
> packed into cells. In BCPL, the first packed byte contains the number of
> characters in the string; in B, there is no count and strings are
> terminated by a special character, which B spelled `*e'. This change was
> made partially to avoid the limitation on the length of a string caused by
> holding the count in an 8- or 9-bit slot, and partly because maintaining
> the count seemed, in our experience, less convenient than using a
> terminator."
>
> https://www.bell-labs.com/usr/dmr/www/chist.html
>
> On Tue, Dec 31, 2024 at 5:31 AM Rupert Reynolds <rreyno...@cix.co.uk>
> wrote:
>
> > I've long been searching for a solution that might become a de facto
> > standard. So far we have a collection of proprietary file formats.
> >
> > But we should remember that C was originally not a general purpose
> > language, but a tool for a specific need.
> >
> > Few would have predicted that C and its silly strings would power a new
> > wave of computing (which ignored many of the lessons learned even before
> > S/360 was released--as Alan Kay said, computing 'is a pop culture')
> >
> > Thompson and Ritchie needed something to build an OS for lower cost
> > hardware and the assembler wasn't sophisticated enough to make that easy,
> > or even doable, but C worked for them.
> >
> > They only needed to move short strings, so having the CPU run REPNZ SCASB
> > followed by REP MOVSB was an acceptable overhead. Oh yes, why was there no
> > REPNZ MOVSB?
> >
> > One of the reasons I use Rexx for testing everything is that it doesn't
> > barf on magic numbers hidden in the data.
> >
> > Roops
> >
> > On Tue, 31 Dec 2024, 08:01 Clement Clarke, <ad...@oscar-jol.com> wrote:
> >
> > > Over many years, I have brought up the C string problem: both its speed
> > > and its safety.  I developed some C macros which improve the situation;
> > > however, I think that by simply introducing a new file type, say .VB
> > > like .EXE, many speed and safety issues can be corrected relatively
> > > simply, as well as reducing the electricity and cooling water needed to
> > > shunt 5 billion emails around the world each day.
> > >
> > > The VB format would follow IBM's VB format.  Thus the length of the
> > > record would precede the data, and allow records to be moved and copied
> > > without constantly searching for binary zeros, or carriage returns and
> > > line feeds.
> > >
> > > My tests have consistently shown that copying C terminated strings
> > > takes at least about 2.5 times as long as copying strings whose length
> > > is known.  Often the time is greater, depending on the compiler and
> > > associated routines.
> > >
> > > I believe implementing such code would be relatively trivial.  And I
> > > would also suggest the record length and blocksize be 4 bytes rather
> > > than 2 bytes.
> > >
> > > The open routines could look to see if the file or data set is .VB, and
> > > if so set a flag.  When a GET or equivalent is issued, then it is a
> > > simple matter to read the block if necessary, and load the length of the
> > > record and issue an MVCL or equivalent on other machines.
> > >
> > > Gradually, C and C++ could be changed internally to follow the excellent
> > > PL/I practice of having the length of variable length strings at the
> > > front of the strings, which means searching for string terminators is
> > > not necessary.
> > >
> > > Kind regards, and a Happy New Year to all,
> > >
> > > Clem Clarke
> > > PS: Some further information is at
> > > https://start.oscar-jol.com/fast-safe-c-strings
> > >
> > > Charles Mills kindly suggested that C++ was the answer.  However, I think
> > >
> > > ----------------------------------------------------------------------
> > > For IBM-MAIN subscribe / signoff / archive access instructions,
> > > send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
> > >
>
>
> --
> Jay Maynard
