On 3 Dec 2005 03:28:19 -0800, [EMAIL PROTECTED] wrote:

>
>Bengt Richter wrote:
>> On 2 Dec 2005 18:34:12 -0800, [EMAIL PROTECTED] wrote:
>>
>> >
>> >Bengt Richter wrote:
>> >> It looks to me like itertools.groupby could get you close to what you
>> >> want, e.g., (untested)
>> >Ah, groupby. The generic string.split() equivalent. But the doc said
>> >the input needs to be sorted.
>> >
>>
>>  >>> seq = [3,1,4,'t',0,3,4,2,'t',3,1,4]
>>  >>> import itertools
>>  >>> def condition(item): return item=='t'
>>  ...
>>  >>> def dosomething(it): return 'doing something with %r'%list(it)
>>  ...
>>  >>> for condresult, acciter in itertools.groupby(seq, condition):
>>  ...     if not condresult:
>>  ...         dosomething(acciter)
>>  ...
>>  'doing something with [3, 1, 4]'
>>  'doing something with [0, 3, 4, 2]'
>>  'doing something with [3, 1, 4]'
>>
>> I think the input only needs to be sorted if you are trying to group
>> sorted subsequences of the input. I.e., you can't get them extracted
>> together unless the condition is satisfied for a contiguous group, which
>> only happens if the input is sorted. But AFAIK the grouping logic just
>> scans and applies key condition and returns iterators for the
>> subsequences that yield the same key function result, along with that
>> result. So it's a general subsequence extractor. You just have to supply
>> the key function to make the condition value change when a group ends
>> and a new one begins. And the value can be arbitrary, or just toggle
>> between two values, e.g.
>>
>>  >>> for condresult, acciter in itertools.groupby(range(20), lambda x:x%3==0 or x==5):
>>  ...     print '%6s: %r'%(condresult, list(acciter))
>>  ...
>>    True: [0]
>>   False: [1, 2]
>>    True: [3]
>>   False: [4]
>>    True: [5, 6]
>>   False: [7, 8]
>>    True: [9]
>>   False: [10, 11]
>>    True: [12]
>>   False: [13, 14]
>>    True: [15]
>>   False: [16, 17]
>>    True: [18]
>>   False: [19]
>>
>> or a condresult that stays the same in groups, but every group result is
>> different:
>>
>>  >>> for condresult, acciter in itertools.groupby(range(20), lambda x:x//3):
>>  ...     print '%6s: %r'%(condresult, list(acciter))
>>  ...
>>       0: [0, 1, 2]
>>       1: [3, 4, 5]
>>       2: [6, 7, 8]
>>       3: [9, 10, 11]
>>       4: [12, 13, 14]
>>       5: [15, 16, 17]
>>       6: [18, 19]
>>
>Thanks. So it basically has an internal state storing the last
>"condition" result and if it flips(different), a new group starts.
>
So it appears. But note that "flips(different)" seems to be based on ==,
and default key function is just passthrough like lambda x:x, so e.g.
integers and floats will group together if their values are equal.
E.g., to elucidate further,

Default key function:

 >>> from itertools import groupby
 >>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j]):
 ...     print k, list(g)
 ...
 0 [0, 0.0, 0j]
 [] [[]]
 () [()]
 None [None]
 1 [1, 1.0]
 1j [1j]

Group by bool value:

 >>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j], key=bool):
 ...     print k, list(g)
 ...
 False [0, 0.0, 0j, [], (), None]
 True [1, 1.0, 1j]

It's not trying to sort, so it doesn't trip on complex:

 >>> for k,g in groupby([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j]):
 ...     print k, list(g)
 ...
 0 [0, 0.0, 0j]
 [] [[]]
 () [()]
 None [None]
 1 [1, 1.0]
 1j [1j]
 2j [2j]

But you have to watch out if you try to pre-sort stuff that includes
complex numbers:

 >>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j])):
 ...     print k, list(g)
 ...
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 TypeError: cannot compare complex numbers using <, <=, >, >=

And if you do sort using a key function, it doesn't mean groupby inherits
that key function for grouping unless you specify it:

 >>> def keyfun(x):
 ...     if isinstance(x, (int, long, float)): return x
 ...     else: return type(x).__name__
 ...
 >>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun)):
 ...     print k, list(g)
 ...
 0 [0, 0.0]
 1 [1, 1.0]
 None [None]
 0j [0j]
 1j [1j]
 2j [2j]
 [] [[]]
 () [()]

Vs. giving groupby the same keyfun:

 >>> for k,g in groupby(sorted([0, 0.0, 0j, [], (), None, 1, 1.0, 1j, 2j], key=keyfun), keyfun):
 ...     print k, list(g)
 ...
 0 [0, 0.0]
 1 [1, 1.0]
 NoneType [None]
 complex [0j, 1j, 2j]
 list [[]]
 tuple [()]

Example of unsorted vs. sorted subgroup extraction:

 >>> for k,g in groupby('this that other thing note order'.split(), key=lambda s:s[0]):
 ...     print k, list(g)
 ...
 t ['this', 'that']
 o ['other']
 t ['thing']
 n ['note']
 o ['order']

vs.

 >>> for k,g in groupby(sorted('this that other thing note order'.split()), key=lambda s:s[0]):
 ...     print k, list(g)
 ...
 n ['note']
 o ['order', 'other']
 t ['that', 'thing', 'this']

Oops, that key would be less brittle as (untested) key=lambda s:s[:1],
e.g., in case a split with args was used.

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list
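
For anyone reading this on a newer interpreter: the sessions above are Python 2 (print statement, `long`). Below is a minimal sketch of the two ideas from this thread, splitting on a separator item and grouping by first letter, written for Python 3, which is an assumption about what the reader is running. The helper names `split_on` and `group_by_first_letter` are hypothetical and do not come from the thread or from itertools.

```python
from itertools import groupby


def split_on(seq, is_separator):
    """Yield lists of consecutive non-separator items (hypothetical helper)."""
    # groupby starts a new group whenever the key differs from the previous
    # item's key, so the input does not have to be sorted.
    for is_sep, group in groupby(seq, key=is_separator):
        if not is_sep:
            yield list(group)


def group_by_first_letter(words):
    """Group sorted words by their first letter (hypothetical helper)."""
    # s[:1] rather than s[0]: an empty string yields '' instead of IndexError.
    return [(k, list(g)) for k, g in groupby(sorted(words), key=lambda s: s[:1])]


seq = [3, 1, 4, 't', 0, 3, 4, 2, 't', 3, 1, 4]
print(list(split_on(seq, lambda item: item == 't')))

# split(' ') on a double space produces an empty string, which s[0] would
# choke on but s[:1] handles.
print(group_by_first_letter('this that  other'.split(' ')))
```

The second call is the case the s[:1] remark is about: splitting with an explicit separator can produce empty strings, and slicing groups them under '' where indexing would raise.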