Thompson and Ritchie were working on the PDP-7 and PDP-11 while B and C took shape, not the x86 (which came years later), so they didn't have REP MOVSB and friends available. From Dennis Ritchie's Bell Labs article on the development of C:

"None of BCPL, B, or C supports character data strongly in the language; each treats strings much like vectors of integers and supplements general rules by a few conventions. In both BCPL and B a string literal denotes the address of a static area initialized with the characters of the string, packed into cells. In BCPL, the first packed byte contains the number of characters in the string; in B, there is no count and strings are terminated by a special character, which B spelled `*e'. This change was made partially to avoid the limitation on the length of a string caused by holding the count in an 8- or 9-bit slot, and partly because maintaining the count seemed, in our experience, less convenient than using a terminator."

https://www.bell-labs.com/usr/dmr/www/chist.html

On Tue, Dec 31, 2024 at 5:31 AM Rupert Reynolds <rreyno...@cix.co.uk> wrote:

> I've long been searching for a solution that might become a de facto
> standard. So far we have a collection of proprietary file formats.
>
> But we should remember that C was originally not a general purpose
> language, but a tool for a specific need.
>
> Few would have predicted that C and its silly strings would power a new
> wave of computing (which ignored many of the lessons learned even before
> S/360 was released--as Alan Kay said, computing 'is a pop culture')
>
> Thompson and Ritchie needed something to build an OS for lower cost
> hardware and the assembler wasn't sophisticated enough to make that easy,
> or even doable, but C worked for them.
>
> They only needed to move short strings, so having the CPU run REPNZ SCASB
> followed by REP MOVSB was an acceptable overhead. Oh yes, why was there no
> REPNZ MOVSB?
>
> One of the reasons I use Rexx for testing everything is that it doesn't
> barf on magic numbers hidden in the data.
>
> Roops
>
> On Tue, 31 Dec 2024, 08:01 Clement Clarke, <ad...@oscar-jol.com> wrote:
>
> > Over many years, I have brought up the C string problem. Both its speed
> > and safety. I developed some C macros which improve the situation,
> > however I think that by simply introducing a new file type, say .VB
> > like .EXE, many speed and safety issues can be corrected relatively
> > simply, as well as reducing the electricity and cooling water needed to
> > shunt 5 billion emails around the world each day.
> >
> > The VB format would follow IBM's VB format. Thus the length of the
> > record would precede the data, and allow records to be moved and copied
> > without constantly searching for binary zeros, or carriage returns and
> > line feeds.
> >
> > My tests have consistently shown that copying strings with C takes a
> > minimum of about 2.5 times longer to copy C terminated strings. Often,
> > the time is greater depending on the compiler and associated routines.
> >
> > I believe implementing such code would be relatively trivial. And I
> > would also suggest the record length and blocksize be 4 bytes rather
> > than 2 bytes.
> >
> > The open routines could look to see if the file or data set is .VB, and
> > if so set a flag. When a GET or equivalent is issued, then it is a
> > simple matter to read the block if necessary, and load the length of
> > the record and issue an MVCL or equivalent on other machines.
> >
> > Gradually, C and C++ could be changed internally to follow the
> > excellent PL/I practice of having the length of variable length strings
> > at the front of the strings which means searching for string
> > terminators is not necessary.
> >
> > Kind regards, and a Happy New Year to all,
> >
> > Clem Clarke
> > PS: Some further information is at
> > https://start.oscar-jol.com/fast-safe-c-strings
> >
> > Charles Mills kindly suggested that C++ was the answer.
> > However, I think
> >
> > ----------------------------------------------------------------------
> > For IBM-MAIN subscribe / signoff / archive access instructions,
> > send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

--
Jay Maynard
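
For anyone who wants to see the two conventions from this thread side by side, here is a rough sketch in C. It is only an illustration: `counted_str`, `copy_terminated` and `copy_counted` are names made up for this example, the 4-byte count is an assumption borrowed from the suggestion quoted above, and none of this is the IBM VB record layout or the macros mentioned in the thread. It makes no claim about speed; it only shows that the terminated copy has to scan for the terminator first, while the counted copy already knows how much to move and does not stop at a zero byte inside the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical counted-string view (illustrative only): the length
   travels with the data, so no terminator is needed. */
typedef struct {
    uint32_t    len;   /* number of bytes in data */
    const char *data;  /* may contain embedded zero bytes */
} counted_str;

/* Terminated copy: scan for the NUL, then move the bytes.
   Returns the number of characters copied (excluding the NUL),
   or (size_t)-1 if dst is too small. */
size_t copy_terminated(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);        /* first pass: find the terminator */
    if (n + 1 > cap)
        return (size_t)-1;
    memcpy(dst, src, n + 1);       /* second pass: move data plus NUL */
    return n;
}

/* Counted copy: the length is already known, so there is no scan.
   Returns the number of bytes copied, or (size_t)-1 if dst is too small. */
size_t copy_counted(char *dst, size_t cap, counted_str src)
{
    assert(dst != NULL && src.data != NULL);
    if (src.len > cap)
        return (size_t)-1;
    memcpy(dst, src.data, src.len);  /* single move, embedded zeros kept */
    return src.len;
}
```

Given the five bytes `a b \0 c d`, `copy_terminated` stops at the embedded zero and copies two characters, while `copy_counted` with `len` set to 5 moves all five. That is the "magic numbers hidden in the data" point from the quoted message, seen from the C side.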