Tree structure consuming lot of memory

2009-07-06 Thread mayank gupta
Hi,

I am creating a tree data structure in Python, with the nodes of the tree
created by a simple class:

class Node:
    def __init__(self, other_attributes):
        # initialise the attributes here
        pass

But the problem is that I am working with a huge tree (millions of nodes),
and each node is consuming much more memory than it should. After a little
analysis, I found that it generally uses about 1.4 KB of memory per node.
I would be grateful if someone could help me optimize the memory usage.
Thanks.

Regards,
Mayank


-- 
I luv to walk in rain bcoz no one can see me crying
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread mayank gupta
Thanks for the other possibilities. I will consider options (2) and (3) to
improve my code.

But out of curiosity, I would still like to know why an object of a Python
class consumes so much memory (1.4 KB), and why this memory usage seems to
have nothing to do with its attributes.

Thanks

Regards.

On Mon, Jul 6, 2009 at 12:03 PM, Chris Rebert wrote:

> On Mon, Jul 6, 2009 at 2:55 AM, mayank gupta wrote:
> > Hi,
> >
> > I am creating a tree data-structure in python; with nodes of the tree
> > created by a simple class :
> >
> > class Node:
> >     def __init__(self, other_attributes):
> >         # initialise the attributes here
> >         pass
> >
> > But the problem is I am working with a huge tree (millions of nodes); and
> > each node is consuming much more memory than it should. After a little
> > analysis, I found out that in general it uses about 1.4 kb of memory for
> > each node!!
> > I will be grateful if someone could help me optimize the memory usage.
>
> (1) Use __slots__ (see
> http://docs.python.org/reference/datamodel.html#slots)
> (2) Use some data structure other than a tree
> (3) Rewrite your Node/Tree implementation in C
>
> Cheers,
> Chris
> --
> http://blog.rebertia.com
>
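For option (1), a minimal sketch of what a slotted node might look like. The
class name `SlottedNode`, the attribute names, and the constructor signature
are assumptions made for illustration, not code from the original tree.

```python
class SlottedNode(object):
    # __slots__ reserves fixed storage for exactly these attributes, so
    # instances are created without a per-instance __dict__.
    __slots__ = ("coordinates", "index", "children")

    def __init__(self, coordinates, index):
        self.coordinates = coordinates
        self.index = index
        self.children = []


node = SlottedNode([0.0, 1.0], [0, 1])
print(hasattr(node, "__dict__"))  # no per-instance dictionary exists

try:
    node.colour = "red"  # 'colour' is not listed in __slots__
except AttributeError as exc:
    print("rejected:", exc)
```

The trade-off is that attributes can no longer be added freely at runtime;
every attribute a node needs has to be named in `__slots__` up front.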





Re: Tree structure consuming lot of memory

2009-07-06 Thread mayank gupta
I wrote a small script that initializes about 1,000,000 nodes with some
attributes, and watched the memory usage on my Linux machine (using the
'top' command). I then averaged the memory usage per node. I know this is
not the most accurate method, but it is good enough for an estimate.
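As a rough alternative to reading 'top', the same averaging can be done from
inside the interpreter. The sketch below assumes Python 3.4 or later (for the
standard `tracemalloc` module); the `Node` class and the helper
`bytes_per_node` are hypothetical stand-ins, not the original code.

```python
import tracemalloc


class Node(object):
    def __init__(self, value):
        self.value = value
        self.children = []


def bytes_per_node(count):
    """Return the average number of traced bytes allocated per Node."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    nodes = [Node(i) for i in range(count)]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    # Referencing `nodes` here keeps the objects alive until both
    # readings have been taken.
    assert len(nodes) == count
    # The figure also includes the list that holds the nodes.
    return (after - before) / float(count)


print("approx. bytes per node: %.1f" % bytes_per_node(100000))
```

This counts only memory allocated by Python itself, so it will not match the
process-level number that 'top' reports.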

The kind of Node class I am working with in my original code looks like this:

class Node:
    def __init__(self, coordinates, index, sibNum, branchNum):
        self.coordinates = coordinates
        self.index = index
        self.sibNum = sibNum
        self.branchNum = branchNum

# Here 'coordinates' and 'index' are lists of length "dimension", where
# "dimension" is supplied by the user.

The most surprising result of the memory analysis was that the memory usage
did not depend on "dimension". It varied a little, but there was no
significant change in memory usage even when "dimension" was doubled.

Any clues?
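One way to probe the "dimension" question is to add up the sizes of a single
node and the lists it holds. The sketch below is only a per-object estimate
under assumed attribute values; the helper `rough_size` is hypothetical, and
it does not by itself explain the numbers seen in 'top'.

```python
import sys


class Node(object):
    def __init__(self, coordinates, index, sibNum, branchNum):
        self.coordinates = coordinates
        self.index = index
        self.sibNum = sibNum
        self.branchNum = branchNum


def rough_size(node):
    """Sum the node, its __dict__, and its two lists with their elements."""
    total = sys.getsizeof(node) + sys.getsizeof(node.__dict__)
    for seq in (node.coordinates, node.index):
        total += sys.getsizeof(seq)
        # Small integers are shared between objects in CPython, so
        # counting every element slightly overestimates the real cost.
        total += sum(sys.getsizeof(item) for item in seq)
    return total


for dimension in (3, 6, 12):
    coords = [0.5 * i for i in range(dimension)]
    node = Node(coords, list(range(dimension)), 0, 0)
    print(dimension, rough_size(node))
```

If the per-node estimate grows with "dimension" while the figure from 'top'
does not, the measurement method is the first thing to re-check.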

Thank you for all your suggestions so far.

Regards.




On Tue, Jul 7, 2009 at 1:28 AM, Antoine Pitrou wrote:

> mayank gupta writes:
> >
> > After a little analysis, I found out that in general it uses about
> > 1.4 kb of memory for each node!!
>
> How did you measure memory use? Python objects are not very compact, but
> 1.4 KB per object seems a bit too much (I would expect about 150-200
> bytes/object in 32-bit mode, or 300-400 bytes/object in 64-bit mode).
>
> One of the solutions is to use __slots__, as already suggested. Another,
> which will have similar benefits, is to use a namedtuple. Both suppress
> the instance dictionary (`instance`.__dict__), which is a major
> contributor to memory consumption. Illustration (64-bit mode, by the way):
>
> >>> import sys
> >>> from collections import namedtuple
>
> # First a normal class
> >>> class Node(object): pass
> ...
> >>> o = Node()
> >>> o.value = 1
> >>> o.children = ()
> >>>
> >>> sys.getsizeof(o)
> 64
> >>> sys.getsizeof(o.__dict__)
> 280
> # The object seems to take a mere 64 bytes, but the attribute dictionary
> # adds a whopping 280 bytes and bumps the actual size to 344 bytes!
>
> # Now a namedtuple (a tuple subclass with property accessors for the
> # various tuple items)
> >>> Node = namedtuple("Node", "value children")
> >>>
> >>> o = Node(value=1, children=())
> >>> sys.getsizeof(o)
> 72
> >>> sys.getsizeof(o.__dict__)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'Node' object has no attribute '__dict__'
>
> # The object doesn't have a __dict__, so 72 bytes is its real total size.
>
>
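The same kind of comparison can be made for __slots__. This is a sketch with
hypothetical class names (`DictNode`, `SlotNode`); the exact byte counts
depend on the Python version and platform, so none are quoted here.

```python
import sys


class DictNode(object):
    def __init__(self, value, children):
        self.value = value
        self.children = children


class SlotNode(object):
    __slots__ = ("value", "children")

    def __init__(self, value, children):
        self.value = value
        self.children = children


d = DictNode(1, ())
s = SlotNode(1, ())

# A regular instance pays for the object plus its attribute dictionary.
dict_total = sys.getsizeof(d) + sys.getsizeof(d.__dict__)
# A slotted instance has no __dict__, so its own size is the whole cost.
slot_total = sys.getsizeof(s)

print("with __dict__ :", dict_total)
print("with __slots__:", slot_total)
```

Neither total includes the objects the attributes refer to, which cost the
same in both layouts.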





'del' function for Dictionary

2009-07-17 Thread mayank gupta
Hi all,

I wanted to know whether there is a more efficient way to delete an entry
from a dictionary than the 'del' statement, because after analyzing the
time taken by my code, it seems that 'del' takes most of the time. I might
be wrong, though.
Kindly help me in this regard.
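Removing a key from a dict is normally a constant-time operation on average,
so it may be worth timing 'del' in isolation before looking for a
replacement. The sketch below uses the standard `timeit` module; the helper
`time_deletions` and the dictionary size are assumptions for illustration.

```python
import timeit


def time_deletions(n):
    """Time emptying an n-entry dict with `del` and with dict.pop()."""
    def with_del():
        d = dict.fromkeys(range(n))
        for key in range(n):
            del d[key]
        return d

    def with_pop():
        d = dict.fromkeys(range(n))
        for key in range(n):
            d.pop(key)
        return d

    # Both timings include the cost of building the dict each run.
    return (timeit.timeit(with_del, number=5),
            timeit.timeit(with_pop, number=5))


del_seconds, pop_seconds = time_deletions(100000)
print("del: %.4fs  pop: %.4fs" % (del_seconds, pop_seconds))
```

If deletion still dominates in the real program, the cause may be elsewhere,
for example an expensive `__hash__` or `__eq__` on the keys, or the number of
deletions performed.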

Cheers,
Mayank
