On Thu, May 12, 2005 at 06:29:49PM +0200, "TSa (Thomas Sandlaß)" wrote:
> perl -le 'print join ",", split /(..)/, 112233445566'
> ,11,,22,,33,,44,,55,,66
[snipped]
> perl -le 'print join ",", split /(..)/, 11223'
> ,11,,22,3
> 
> Am I the only one who finds that inconsistent?
Maybe, but it's because you're misunderstanding what split does (I can
heartily recommend TFM in this case).

Let's start with a simpler case (run inside the debugger, whose x command
dumps the resulting list):


x split /../, 112233445566, -1           [ -1 to preserve all found fields ]

0  ''
1  ''
2  ''
3  ''
4  ''
5  ''
6  ''

Split uses the regular expression to find "separators" in the text, and
then returns the contents of the fields between them. The above case looks
like this:

     sep    sep    sep    sep    sep    sep
     |      |      |      |      |      |
     11     22     33     44     55     66
  |      |      |      |      |      |      |
field  field  field  field  field  field  field
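
If it helps, here is roughly the same idea done by hand. This is only a
sketch: fields_between is a made-up helper, not anything Perl provides, and
it ignores split's special cases (captures, zero-length matches, LIMIT). It
just walks the separator matches and collects whatever lies between them.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: return the text found
# between successive matches of $re, keeping trailing empty fields
# (comparable to calling split with a LIMIT of -1).
sub fields_between {
    my ($re, $str) = @_;
    my @fields;
    my $pos = 0;
    while ($str =~ /$re/g) {
        # $-[0] and $+[0] are the start and end offsets of the last match
        push @fields, substr($str, $pos, $-[0] - $pos);
        $pos = $+[0];
    }
    push @fields, substr($str, $pos);   # whatever follows the last separator
    return @fields;
}

my @by_hand = fields_between(qr/../, '112233445566');
my @builtin = split /../, '112233445566', -1;

print scalar(@by_hand), " fields by hand, ", scalar(@builtin), " from split\n";
```

Six separators give seven fields, all of them empty, because the separators
consume the entire string.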



Ok, let's try that with your second example:

x split /../, 11223, -1

0  ''
1  ''
2  3

     sep    sep
     |      |
     11     22  3
  |      |      |
field  field  field


Now, if the regular expression contains parentheses, additional list
elements are created from each matching substring (quoted almost verbatim
from TFM). So:

x split /(..)/, 112233445566, -1

0  ''
1  11
2  ''
3  22
4  ''
5  33
6  ''
7  44
8  ''
9  55
10  ''
11  66
12  ''


x split /(..)/, 11223, -1

0  ''
1  11
2  ''
3  22
4  3
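
To make the alternation easier to see, here is a small sketch that labels
each element. The labelling by index parity is my own shortcut and only
holds when the pattern has exactly one capturing group, as it does here.

```perl
use strict;
use warnings;

# With one capturing group, split returns field, separator, field, ...
# so even indices are fields and odd indices are the captured separators.
my @parts = split /(..)/, '11223', -1;

for my $i (0 .. $#parts) {
    my $kind = $i % 2 ? 'separator' : 'field';
    printf "%d  %-9s '%s'\n", $i, $kind, $parts[$i];
}
```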



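And of course, if we remove the LIMIT from the equation, then any trailing
empty fields will be removed. In the first example that drops the final '';
in the second the last field is 3, which is not empty, so nothing is dropped.
Ergo the results quoted at the top of this email.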
Hope this helps you (and anyone else who might have been confused) understand
what is going on.
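
For anyone who wants to check this outside the debugger, a short script
along these lines should do. The variable names are just mine; the only
thing it relies on is split with and without a negative LIMIT.

```perl
use strict;
use warnings;

# Negative LIMIT: every field is kept, including trailing empty ones.
my @long_all  = split /(..)/, '112233445566', -1;
my @short_all = split /(..)/, '11223', -1;

# No LIMIT: trailing empty fields are stripped from the result.
my @long_trim  = split /(..)/, '112233445566';
my @short_trim = split /(..)/, '11223';

printf "long:  %d elements with -1, %d without\n",
    scalar(@long_all), scalar(@long_trim);
printf "short: %d elements with -1, %d without\n",
    scalar(@short_all), scalar(@short_trim);

print join(',', @long_trim),  "\n";
print join(',', @short_trim), "\n";
```

The two join lines reproduce the one-liners quoted at the top.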


J

-- 
Jody Belka
knew (at) pimb (dot) org
