At 09:05 PM 11/28/00 +0000, Nicholas Clark wrote:
>On Tue, Nov 28, 2000 at 03:35:37PM -0500, Dan Sugalski wrote:
> > > > is treated as if it points to a stream of bytes, where the first four are
>                                                                        ????
>
>I spy magic number.

Nah. 32-bit length. If someone needs to pass us more than 4G of source 
code, I do *not* want to know about it. :)

> > > > the length of the source to be read followed by the source. If set to
> > >
> > >Since you have a fourth argument couldn't that be used for the length
> > >of the byte stream rather than embedding that length into the byte stream
> > >itself? Makes more sense to me to separate the bytes from the length.
> >
> > I'd rather the stream be self-contained, rather than needing an extra
> > argument for the length. Counted strings aren't uncommon outside of C, and
> > there's no reason a Fortran or COBOL (or Java, or...) program can't embed perl.
>
>
>Why four? Surely that's imposing an arbitrary binary structure. If it's a
>parameter then it's (probably) a machine register and certainly a "natural"
>quantity for whatever's running the code (and automatically the correct
>endian-ness just in case perl is running in some (oddball partial)
>binary emulation environment). Erm. Or something like that.

It's not necessarily in a register. In at least some of the languages I 
named (and you can add BASIC and Pascal to the list as well), a string 
consists of a length and data pointer pair, usually together. What's handy 
is a pointer to the data structure, not the length and a pointer to the buffer.
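
As a rough sketch of what "a pointer to the data structure" could look like from the C side: the names string_descriptor and count_newlines below are made up for illustration and are not part of any perl API, and the field order and widths are an assumption, since real descriptors differ between languages and platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: the length and the data pointer travel together. */
struct string_descriptor {
    size_t length;      /* number of bytes at data; no NUL terminator needed */
    const char *data;   /* pointer to the first byte of the source */
};

/* Hypothetical consumer: takes one pointer to the descriptor rather than
 * a separate length argument plus a buffer pointer. */
size_t count_newlines(const struct string_descriptor *src)
{
    size_t i, n = 0;

    assert(src != NULL);
    for (i = 0; i < src->length; i++) {
        if (src->data[i] == '\n')
            n++;
    }
    return n;
}
```

The caller hands over a single pointer, which is the shape a Pascal- or BASIC-style string already has, so nothing needs to be unpacked into two arguments first.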

Of course, for some of those languages the lengths are 16-bit quantities. Damn.

>I forget the source of the quote, but it was to the effect of
>C is the only language where not just the binaries but also the source is
>not portable.
>
>Say you'd said 2 not 4.
>
>struct counted_file {
>   short count;
>   struct  {
>     char  bytes[1];
>   } file;
>};
>
>
>erm. can't have bytes[0]; because that's not portable.

That'd probably be:

   struct counted_string {
     int length;
     char data[];
   };

which is legal as of C99 (a flexible array member), but not in C89. Not that it helps with the size of an int issue, though.
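
For what it's worth, here is a minimal sketch of how such a struct might be allocated and filled, assuming a C99 compiler (the empty-bracket array member is a C99 flexible array member). The helper name counted_string_new is hypothetical, and the length is still a plain int, so the size question is untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct counted_string {
    int length;
    char data[];    /* flexible array member: storage follows the struct */
};

/* Hypothetical helper: allocate header and bytes in one block and copy
 * the bytes in. Returns NULL if the allocation fails. */
struct counted_string *counted_string_new(const char *bytes, int length)
{
    struct counted_string *s;

    assert(length >= 0);
    assert(bytes != NULL || length == 0);

    s = malloc(sizeof *s + (size_t)length);
    if (s == NULL)
        return NULL;

    s->length = length;
    if (length > 0)
        memcpy(s->data, bytes, (size_t)length);
    return s;
}
```

The caller releases the whole thing with a single free(), since the header and the bytes live in one allocation.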

>Can't really be short because who said that that was 2 bytes?
>For that matter I know of one compiler which doesn't have any type with
>sizeof 2, and sizeof (struct counted_file) is 8 here on this ARM machine
>:-) Weirdo but ANSI compliant alignment constraints.
>[yes, I forced that one using the second struct inside the first]

Y'know, I really loathe C. Really, really, loathe it.

Anyway, regardless of the platform, there is *some* way to force this to 
work--if there weren't, then implementing things like a TCP stack would be 
pretty much impossible.

Counted strings should probably just have either a platform-native int in 
front, or a 32-bit int in network format, both of which should be doable on 
any platform that perl deals with.
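
A sketch of the network-format option, assuming a compiler that provides uint32_t from <stdint.h>. The function names are hypothetical and nothing here is an existing perl interface. The length is built byte by byte with shifts, so the code never depends on the host's endianness, struct padding, or the size of int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit length in network (big-endian)
 * order in the first four bytes of out. */
void put_length_be32(unsigned char *out, uint32_t length)
{
    assert(out != NULL);
    out[0] = (unsigned char)((length >> 24) & 0xFF);
    out[1] = (unsigned char)((length >> 16) & 0xFF);
    out[2] = (unsigned char)((length >> 8) & 0xFF);
    out[3] = (unsigned char)(length & 0xFF);
}

/* Hypothetical helper: read the 32-bit network-order length back. */
uint32_t get_length_be32(const unsigned char *in)
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}

/* Hypothetical helper: given a counted stream, report its length and
 * return a pointer to the source bytes that follow the 4-byte prefix. */
const unsigned char *counted_source(const unsigned char *stream,
                                    uint32_t *length)
{
    assert(stream != NULL && length != NULL);
    *length = get_length_be32(stream);
    return stream + 4;
}
```

An embedding program in any language only has to emit four bytes, most significant first, followed by the source. The platform-native-int variant would skip the shifting but tie the layout to the host.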

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
