Re: Better ways for implementing two situations

2019-04-23 Thread Paulo da Silva
On 21/04/19 at 22:21, Paul Rubin wrote:
> Paulo da Silva writes:
>> splitter={}
>> for f in Objs:
>>     splitter.setdefault(f.getId1,[]).append(f)
>> groups=[gs for gs in splitter.values() if len(gs)>1]
>
> It's easiest if you can sort the input list and then use
> itertools.groupby.
Yes, so
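For readers following the thread, the sort-then-groupby suggestion might look roughly like the sketch below. It is only an illustration: the `Obj` type, its `id1` field, and the helper name `group_duplicates` are hypothetical stand-ins, not the poster's actual classes, and sorting costs O(n log n) plus a sorted copy of the list, which may matter for a large amount of data.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

# Hypothetical stand-in for the thread's objects; `id1` plays the role of the id.
Obj = namedtuple("Obj", ["id1", "name"])


def group_duplicates(objs, key):
    """Return lists of objects that share a key, keeping only groups of 2 or more."""
    # groupby only merges *adjacent* items, so the input must be sorted by the same key.
    ordered = sorted(objs, key=key)
    groups = []
    for _, members in groupby(ordered, key=key):
        members = list(members)
        if len(members) > 1:
            groups.append(members)
    return groups


objs = [Obj(1, "a"), Obj(2, "b"), Obj(1, "c"), Obj(3, "d"), Obj(2, "e")]
groups = group_duplicates(objs, key=attrgetter("id1"))
```

Because `sorted` is stable, objects inside each group keep their original relative order.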

Re: Better ways for implementing two situations

2019-04-23 Thread Paulo da Silva
On 21/04/19 at 20:41, DL Neil wrote:
> Olá Paulo,
> ...
>
> Given that we're talking "big data", which Python Data Science tools are
> you employing? eg NumPy.
Sorry. I misused the term "big data". I should have said a big amount of data. It is all about objects built of text and some number

Re: Better ways for implementing two situations

2019-04-23 Thread Paulo da Silva
On 21/04/19 at 20:10, MRAB wrote:
> On 2019-04-21 19:23, Paulo da Silva wrote:
>> Hi all.
>> ...
> Have you compared the speed with an implementation that uses
> defaultdict? Your code always creates an empty list for each item, even
> though it might not be needed.
I never used defaultdict. I'
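As a rough illustration of the defaultdict alternative raised here, the sketch below builds the same groups without creating a throwaway empty list on every iteration, which is what `setdefault(key, [])` does. The helper name `group_duplicates_dd` and the choice of grouping sample strings by their first letter are assumptions made for the example only; no speed claim is implied, so it would need to be timed against the original on real data.

```python
from collections import defaultdict


def group_duplicates_dd(objs, key):
    """Group objects by key(obj) and keep only the groups with more than one member."""
    splitter = defaultdict(list)
    for obj in objs:
        # The list is created lazily, only the first time a key is seen.
        splitter[key(obj)].append(obj)
    return [gs for gs in splitter.values() if len(gs) > 1]


# Sample data; grouping by first character is an assumption for illustration.
words = ["abc", "ade", "bcd", "xyz"]
groups_dd = group_duplicates_dd(words, key=lambda w: w[0])
```

Unlike the sort-based approach, this needs no ordering on the keys, only that they are hashable.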

Re: Better ways for implementing two situations

2019-04-23 Thread Paulo da Silva
On 21/04/19 at 19:42, Stefan Ram wrote:
> Paulo da Silva writes:
>> I have a list of objects and want to split it in a list of groups.
>> "equal objects" is based on an id we can get from the object.
>
> main.py
>
> input = [ 'abc', 'ade', 'bcd' ]
>
> for group, list in \
> __import__( 'ite

Re: Better ways for implementing two situations

2019-04-21 Thread DL Neil
Olá Paulo, On 22/04/19 6:23 AM, Paulo da Silva wrote: Hi all. I am looking for improved solutions to these two problems. They are to be in a program that deals with big data. So, they need to be fast and save memory. ... Given that we're talking "big data", which Python Data Science tools a

Re: Better ways for implementing two situations

2019-04-21 Thread MRAB
On 2019-04-21 19:23, Paulo da Silva wrote: Hi all. I am looking for improved solutions to these two problems. They are to be in a program that deals with big data. So, they need to be fast and save memory. Problem 1. I have a list of objects and want to split it in a list of groups. Each group

Better ways for implementing two situations

2019-04-21 Thread Paulo da Silva
Hi all.
I am looking for improved solutions to these two problems. They are to be in a program that deals with big data. So, they need to be fast and save memory.
Problem 1.
I have a list of objects and want to split it in a list of groups. Each group must have all "equal objects" and have more