The simplest runtime implementation would be a non-zero subtype on the
string, to mark that it is binary data and not unicode text.
(Although that might make the stringp operator a bit ambiguous.)
The main benefits are in static typechecking, making sure you don't
send unencoded text to I/O functions.
I'm not sure I follow. Which problem should this solve: a mark in the
string struct saying what type of data the string contains?
el @ Pike developers forum wrote:
>the additional function call overhead for the call to `[](). I think
>inlining does not work unless you are inlining something from within
>the same class (or parent maybe). It would be nice if the compiler
Can anyone confirm this?
When declaring a function inli
On Thu, Nov 24, 2016 at 12:40 AM, Marcus Comstedt (ACROSS) (Hail
Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se>
wrote:
>>In Python, it's done with a prefix - u"asdf" is a Unicode string, and
>>b"asdf" is a byte string.
>
> Since nominally strings are Unicode (with the extende
Pontus Östlund wrote:
>> On 23 Nov 2016 at 15:03, Stephen R. van den Berg wrote:
>> I'd like to suggest that we make available a generated Refdoc for the
>> development tree (8.1 currently) on the pike website somewhere too.
>> Regenerated regularly through xenofarm perhaps.
>Isn't it sufficient e
> On 23 Nov 2016 at 15:03, Stephen R. van den Berg wrote:
>
> I'd like to suggest that we make available a generated Refdoc for the
> development tree (8.1 currently) on the pike website somewhere too.
> Regenerated regularly through xenofarm perhaps.
> --
> Stephen.
Isn't it sufficient enough to
Any plans to add buffer mode support to SSL.File? I just discovered this in
Stdio.File and think it would be very nice if it were available here as well...
Yes, s will be Unicode. Of course, you need to declare the character
encoding of your source file using a #charset tag (or use a BOM to
indicate UTF encoding).
I think it would be a good idea as well, see 21907878.
The only thing that should have to care about the encoding should be
the endpoints.
How are string constants handled today? If I do
string s = "räksmörgås";
am I guaranteed a certain encoding of s?
Well, I'm not sure that's actually abusing it; Stdio.Buffer is a
sort of compromise for getting some of the benefits of a native buffer
type while not getting all of the problems (it does not affect
compatibility as it uses a separate set of APIs, and while that does
lead to inconsistency it's not
The main challenge with making the class version fast is that you pay the
additional function call overhead for the call to `[](). I think inlining does
not work unless you are inlining something from within the same class (or
parent maybe). It would be nice if the compiler would generate a fast
Monger supports fetching module code from git/hg/whatever, but released
versions should still be uploaded for those users who don't have those tools
and want to install a stable version of a module.
Either way, the module should be registered at modules.gotpike.org so others
can find it.
A
I'd like to suggest that we make available a generated Refdoc for the
development tree (8.1 currently) on the pike website somewhere too.
Regenerated regularly through xenofarm perhaps.
--
Stephen.
Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
>can also look at Java, which has byte[] as the type for byte strings,
>requiring literals like {'a','s','d','f'}, but I would like to see
In the EngineIO implementation I currently abuse Stdio.Buffer to fulfill this
bin
Yup, the thing we were discussing was how it would be nice to actually
be able to declare when they contain something else. :-) But it is a
valid point that binary encoded data is not necessarily 8-bit. You
should definitely be allowed to declare something as buffer(12bit) if
you want to store 1
>It's valid Pike. Pike supports the full ISO/IEC 10646 31-bit range,
>plus an equally large negative range.
Also note that Pike strings don't necessarily contain Unicode, even
if they usually do. They _could_ just as well contain RGB pixels or
random memory access data from a 12-bit-word syst
Weird, I don't know what page I was getting then.. hmmm
On Wed, Nov 23, 2016 at 8:30 AM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @
Pike (-) developers forum <10...@lyskom.lysator.liu.se> wrote:
> The index works for me. If you were unable to find Gnol and
>Right, and that's something that can't be done in the current
>standard. Hence this entire proposal has to wait until some major
>changes can be done.
Yup. And then those changes should not be a repurposing of an
existing mechanism (element ranges on the string type) but something
more appropria
Given the attached sample code.
It tries to profile the difference between accessing a native mapping and
accessing it through a class which is declared inline as much as possible.
When I run it, I find that the native mapping is about twice as fast as
the inlined amapping class.
What would it tak
The index works for me. If you were unable to find Gnol and Drow in
the list of methods that's probably because they are classes and not
methods. :-) Click "MODULE REFERENCE" -> "ADT" -> "Struct" and both
"Drow" and "Gnol" appear in the class list to the left.
On Thu, Nov 24, 2016 at 12:20 AM, Marcus Comstedt (ACROSS) (Hail
Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se>
wrote:
>>\U12345678 possibly should be an error, as it's not valid Unicode.
>
> It's valid Pike. Pike supports the full ISO/IEC 10646 31-bit range,
> plus an equal
Yeah, I found the ADT.Struct type with Gnol and Drow. However, the
documentation on the website gave me trouble. I couldn't find an index to the
left, just the modules themselves, so when I clicked on ADT, I couldn't see all
the methods and click on them individually, I had to click on links a
>By "binary data", I mean eight-bit strings of arbitrary bytes - like
>you'd read from a file or something. Currently, functions like
>Stdio.read_file simply return "string", but they'll effectively be
>returning string(8bit).
No, Stdio.read_file currently returns string(8bit). That simply means
On 23 Nov 2016 at 13:25, Peter Bortas @ Pike developers forum
<10...@lyskom.lysator.liu.se> wrote:
>
> Well, obviously you should put it in public git somewhere. :)
>
> But the point is that pike has built in support for fetching module
> packages from gotpike.org via pike -x monger.
Hence my re
On Wed, Nov 23, 2016 at 11:10 PM, Marcus Comstedt (ACROSS) (Hail
Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se>
wrote:
>>I agree, but using string(8bit) to mean "binary data" is something
>>that's 100% backward compatible.
>
> It would not be backwards compatible, since that
Strings with a known encoding that can be converted into other strings
with a known encoding easily and readably (and in some cases without any
interaction) would be useful.
For instance,
Stdio.FILE x = ...;
x->set_encoding("utf8");
string s = "räksmörgås";
String t = String.JP2022("\33(BHello, world!
Well, obviously you should put it in public git somewhere. :)
But the point is that pike has built in support for fetching module
packages from gotpike.org via pike -x monger. It should really be the
default place to put new stuff before we determine if it's good for
mainline, especially if it has
>I agree, but using string(8bit) to mean "binary data" is something
>that's 100% backward compatible.
It would not be backwards compatible, since that is not what
string(8bit) means today.
>Unicode text would always be referred
>to as string(21bit), even if it happens to contain nothing but Latin
On Wed, Nov 23, 2016 at 10:30 PM, Marcus Comstedt (ACROSS) (Hail
Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se>
wrote:
> I think you are conflating range with interpretation. Both a
> Latin1 string and a UTF-8 encoded one are 8-bit strings (with a 0-255
> range). What w
Even if it hadn't been, fixing that would have been the correct
course of action. ;-)
I think you are conflating range with interpretation. Both a
Latin1 string and a UTF-8 encoded one are 8-bit strings (with a 0-255
range). What would be useful is a datatype that declares that the
elements are not Unicode characters (as they are in the Latin1 string
case) but some raw binary
Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
>If there are no character values >127, then the encoding step is a
>no-op, so skipping it buys you nothing except making your code harder
>to read.
I see. I should have guessed that string_to_utf8() is already smart en
On Wed, Nov 23, 2016 at 10:00 PM, Marcus Comstedt (ACROSS) (Hail
Ilpalazzo!) @ Pike (-) developers forum <10...@lyskom.lysator.liu.se>
wrote:
> If there are no character values >127, then the encoding step is a
> no-op, so skipping it buys you nothing except making your code harder
> to read.
I en
If there are no character values >127, then the encoding step is a
no-op, so skipping it buys you nothing except making your code harder
to read.
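A minimal sketch of the point above, for illustration only. It is not taken from the code under review; the variable names and the test strings are assumptions. It shows that encoding a pure-ASCII string leaves it unchanged, while a string can have width 8 and still contain characters >127, so a check such as String.width(msg) > 8 would skip an encoding step that is actually needed.

```pike
int main()
{
  // Hypothetical test data: one pure-ASCII string, one Latin-1 range string.
  string ascii = "Hello, world!";
  string latin1 = "r\xe4ksm\xf6rg\xe5s";  // every character is < 256

  // Pure ASCII: string_to_utf8() returns the same content.
  write("ascii unchanged:  %d\n", string_to_utf8(ascii) == ascii);

  // Width is 8, yet the string holds characters > 127, so its
  // UTF-8 encoded form differs from the original.
  write("latin1 width:     %d\n", String.width(latin1));
  write("latin1 unchanged: %d\n", string_to_utf8(latin1) == latin1);

  // Encoding unconditionally keeps the round trip symmetric.
  write("round trip ok:    %d\n",
        utf8_to_string(string_to_utf8(latin1)) == latin1);
  return 0;
}
```

Encoding unconditionally on the way out and decoding unconditionally on the way in keeps both ends symmetric, which is what the review comment on the width check asks for.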
On 23 Nov 2016 at 11:10, Peter Bortas @ Pike developers forum
<10...@lyskom.lysator.liu.se> wrote:
>
> Sounds like something that belongs on http://modules.gotpike.org/
Good enough! But I'd rather put it on Github then. It is after all 2016 ;)
As a side note: I was thinking about having something
Sounds like something that belongs on http://modules.gotpike.org/
Martin Nilsson (Coppermist) @ Pike (-) developers forum wrote:
>>Please review, any comments are welcome.
>This looks wrong:
> if (String.width(msg) > 8)
>msg = string_to_utf8(msg);
>You are always utf8-decoding the string, so you should always
>utf8-encode them.
Well spott