Re: [PHP-DEV] Adding a more logical string slicing function to PHP

Hannes Landeholm Wed, 30 Mar 2011 06:06:24 -0700

PHP's substr() is awesome, and that comes from a person who codes in at least
5 different languages daily. Parsing comes up in many real-world problems,
and substr() currently works great for that purpose. You work with two
parameters: the offset and the length of the piece to parse. Since a negative
offset or length has no intuitive meaning when sub-stringing, PHP has reserved
these ranges for two common use cases: offsets from the end of the string and
truncation lengths (how many characters to leave off the end).
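
As a minimal sketch of those cases (the sample string is an arbitrary choice,
not taken from any real parser), the offset/length combinations behave like
this:

```php
<?php
// Illustrative only: the sample string "string" is an arbitrary choice.
$s = "string";

$fromStart = substr($s, 1, 2);   // offset 1 from the start, length 2
$fromEnd   = substr($s, -2);     // offset 2 from the end, through the end
$truncated = substr($s, 0, -1);  // from the start, leaving off the last character
$both      = substr($s, -2, -1); // start 2 from the end, leave off the last character

var_dump($fromStart, $fromEnd, $truncated, $both);
// string(2) "tr", string(2) "ng", string(5) "strin", string(1) "n"
```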

"I just think it is very unintuitive for the first parameter always to be a
position, and the 2nd parameter to be a length if the value is positive, and
a position if the value is negative."

The first parameter is either an offset from the start or an offset from the
end; the second parameter is either a length or a truncation length. I don't
see why this would be unintuitive. Perhaps you are confused by other
languages that work only with offsets.

In most real-world parsing scenarios I've worked with offsets and lengths,
so the current substr() definition without doubt gets my job done fastest.
Let's take a real-world example: "Parse through the data in chunks of 64
bytes at a time." In PHP this is simple: just take the current offset and a
length of 64. In Python you'd have to add 64 to the current offset and pass
the result as the second offset parameter, so you have to think more and
write more code instead of just working with the length directly.
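
A rough sketch of that chunking loop, assuming the data is already in memory
as a string ($data, $chunkSize and $chunks are placeholder names, and the
150-byte input is made up for illustration):

```php
<?php
// Illustrative only: $data and the chunk size stand in for real input.
$data = str_repeat("x", 150);
$chunkSize = 64;
$chunks = [];

for ($offset = 0; $offset < strlen($data); $offset += $chunkSize) {
    // The length argument stays constant; substr() simply returns a
    // shorter final chunk when fewer than $chunkSize bytes remain.
    $chunks[] = substr($data, $offset, $chunkSize);
}

var_dump(array_map('strlen', $chunks)); // chunk lengths: 64, 64, 22
```

Only the offset changes between iterations; no end position has to be
computed.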

"Returning FALSE when start + length parameters are invalid.  This is
annoying because when using this function you always have to deal with this
FALSE case if you need a string."

Guess what this code outputs?

var_dump((string) \substr("foo", 5, 6));

Now try this and you'll understand why it is basically never a problem that
substr() returns false, and why you don't even have to think about it:

var_dump(\substr("foo", 5, 6) == "", (string) false, false == "");
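
A sketch of what those two lines demonstrate, written so that it holds both
where substr() returns false for an out-of-range offset (the behaviour under
discussion here) and where it returns an empty string instead (as PHP 8.0
and later do). The variable names are placeholders:

```php
<?php
// Out-of-range offset: the string has 3 bytes, the offset is 5.
$result = substr("foo", 5, 6);

// Depending on the PHP version this is false or "", so accept both.
$isFalseOrEmpty = ($result === false || $result === "");

// Casting and loose comparison treat both cases the same way.
$asString  = (string) $result;  // "" either way
$looseSame = ($result == "");   // true either way

var_dump($isFalseOrEmpty, $asString, $looseSame);
// bool(true), string(0) "", bool(true)
```

Only a strict comparison (===) can tell the two return values apart.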

Welcome to PHP. To be honest, this criticism pretty much falls into the "from
a person who comes from another language X and is annoyed that every little
detail isn't exactly the same" category. Just make your own substr()
function with the behavior you expect if you don't like the native
version. Although that's bad practice; the best solution is to get used to
it. And if you have an urge to write about your experience with a new
language, I suggest you do it in a blog instead of posting it to the
internals mailing list...
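
For what it's worth, a userland version might look roughly like the sketch
below. The name my_str_slice() and its [start, end) semantics are
hypothetical, chosen to mirror the Python slicing quoted further down; this
is not the patch attached to the bug report.

```php
<?php
// Hypothetical helper: position-based slicing over [start, end),
// always returning a string. Not the str_slice() from the patch.
function my_str_slice($s, $start, $end = null)
{
    $len = strlen($s);
    if ($end === null) {
        $end = $len; // no end given: slice through the end of the string
    }
    // Negative positions count from the end of the string.
    if ($start < 0) {
        $start = max($len + $start, 0);
    }
    if ($end < 0) {
        $end = max($len + $end, 0);
    }
    // Clamp both positions to the string length.
    $start = min($start, $len);
    $end = min($end, $len);
    if ($start >= $end) {
        return ""; // impossible or empty slice
    }
    return substr($s, $start, $end - $start);
}

var_dump(my_str_slice("string", 1, 2), my_str_slice("string", -2, -1));
// string(1) "t", string(1) "n"
```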

~Hannes

On 30 March 2011 09:42, Dan Birken <bir...@gmail.com> wrote:

> I think when the values are positive everything is mostly great.  I think
> when the values are negative is where the main problems are.  Both the C
> function strncpy() and the C++ strings substr() function only support
> positive values for length AFAIK.
>
> I just think it is very unintuitive for the first parameter always to be a
> position, and the 2nd parameter to be a length if the value is positive,
> and a position if the value is negative.
>
> substr('string', 1, 2); // Goes from position 1 to position 3
> substr('string', -2, -1); // Goes from position -2 to position -1
>
> So here is the same kind of thing in Python, which uses [start, end):
> string[1:2] ==> 't'
> string[-2:-1] ==> 'n'
>
> And Ruby, which uses [start, end]:
> "string"[1..2] ==> 'tr'
> "string"[-2..-1] ==> 'ng'
>
> Both of these languages use positions for positive and negative values.  In
> addition, in both of these languages if you slice a string impossibly, both
> of them return an empty string as opposed to false, which just seems more
> intuitive to me.
>
> I don't think this function is particularly novel, I just think both
> returning an empty string on impossible slicing and slicing based on
> positions are improvements, and combined I think this function is
> noticeably more durable and readable than substr().
>
> -Dan
>
> On Tue, Mar 29, 2011 at 11:22 PM, Lars Schultz <lars.schu...@toolpark.com>
> wrote:
>
> > I just love substr() and I think all other languages got it wrong;)
> >
> > Seriously... it behaves the same as implementations in other languages as
> > long as values are positive, right? How is that counter-intuitive? How do
> > other languages handle negative values?
> >
> > On 30.03.2011 08:06, Dan Birken wrote:
> >
> >> My apologies if I am bringing up a topic that has been discussed before;
> >> this is my first time wading into the PHP developers lists and I couldn't
> >> find anything particularly relevant with the search.
> >>
> >> Here is a bug I submitted over the weekend
> >> (http://bugs.php.net/bug.php?id=54387) with an attached patch that adds a
> >> str_slice() function into PHP.  This function is just a very simple string
> >> slicing function, with the logical interface of str_slice(string, start,
> >> [end]).  It is of course meant to replace substr() as an interface for
> >> string slicing.
> >>
> >> I detailed the reasons I submitted the patch in the bug a little bit, but
> >> the main reason is that I think the substr() function is really overly
> >> confusing and just not an intuitive method of string slicing, which is
> >> exceedingly common functionality.  I realize we don't want to go around
> >> adding lots of random little functions into the language that don't offer
> >> much, but the problem with that is that if we have a function like
> >> substr() with an unusual and unintuitive interface, it becomes
> >> unchangeable due to legacy issues and then you can never improve.  I think
> >> this particular functionality is important enough to offer an updated
> >> interface.  In the bug I also pointed to two related bugs that would be
> >> essentially fixed with this patch.
> >>
> >> -Dan
