Peter Otten wrote:
> import re
> _reLump = re.compile(r"\S+")
>
> def indices(text, chunks):
>     lumps = _reLump.finditer(text)
>     for chunk in chunks:
>         lump = [lumps.next() for _ in chunk.split()]
>         yield lump[0].start(), lump[-1].end()
Thanks, that's a really nice, clean solution.
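[For readers following along today: Peter's generator is Python 2 (`lumps.next()`). A sketch of the same approach ported to Python 3, with an example input, looks like this:]

```python
import re

_reLump = re.compile(r"\S+")

def indices(text, chunks):
    # Walk the non-whitespace "lumps" of text in order; each chunk
    # consumes as many lumps as it has whitespace-separated words,
    # and we yield the span from the first lump to the last.
    lumps = _reLump.finditer(text)
    for chunk in chunks:
        lump = [next(lumps) for _ in chunk.split()]
        yield lump[0].start(), lump[-1].end()

text = "foo  bar\nbaz qux"
chunks = ["foo bar", "baz qux"]
print(list(indices(text, chunks)))  # [(0, 8), (9, 16)]
```

Note that `text[0:8]` is `'foo  bar'` and `text[9:16]` is `'baz qux'`: the spans recover the original, un-normalized chunks.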
John Machin wrote:
> Steven Bethard wrote:
>
>> John Machin wrote:
>>
>>> For example, text = 'foo bar', chunks = ['foobar']
>>
>> This doesn't match the (admittedly vague) spec
>
> That is *exactly* my point -- it is not valid input, and you are not
> reporting all cases of invalid input; you h
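[John's objection can be made concrete with a sketch, reusing Peter's generator from upthread (ported to Python 3 here):]

```python
import re

_reLump = re.compile(r"\S+")

def indices(text, chunks):
    lumps = _reLump.finditer(text)
    for chunk in chunks:
        lump = [next(lumps) for _ in chunk.split()]
        yield lump[0].start(), lump[-1].end()

# 'foobar' could not have been derived from 'foo bar' by whitespace
# normalization, but the generator doesn't notice: the chunk has one
# word, so it consumes one lump ('foo') and yields its span anyway.
text = "foo bar"
spans = list(indices(text, ["foobar"]))
print(spans)                           # [(0, 3)]
print(text[spans[0][0]:spans[0][1]])   # 'foo', not 'foobar'
```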
Steven Bethard wrote:
> John Machin wrote:
>
>> If "work" is meant to detect *all* possibilities of 'chunks' not
>> having been derived from 'text' in the described manner, then it
>> doesn't work -- all information about the positions of the whitespace
>> is thrown away by your code.
>>
>> For example, text = 'foo bar', chunks = ['foobar']
Steven Bethard wrote:
> I have a string with a bunch of whitespace in it, and a series of chunks
> of that string whose indices I need to find. However, the chunks have
> been whitespace-normalized, so that multiple spaces and newlines have
> been converted to single spaces as if by ' '.join(chunk.split()).
John Machin wrote:
> If "work" is meant to detect *all* possibilities of 'chunks' not having
> been derived from 'text' in the described manner, then it doesn't work
> -- all information about the positions of the whitespace is thrown away
> by your code.
>
> For example, text = 'foo bar', chunks = ['foobar']
Steven Bethard wrote:
[snip]
> And it appears to work:
[snip]
> But it seems somewhat inelegant. Can anyone see an easier/cleaner/more
> Pythonic way[1] of writing this code?
>
> Thanks in advance,
>
> STeVe
>
> [1] Yes, I'm aware that these are subjective terms. I'm looking for
> subjective answers.