Re: [Twisted-Python] Sending jpeg data over TCP/IP

2010-02-22 Thread gary clark
After more research, I think the correct way is to use base64 and just one
connection. It was good to look into it.

Thanks,
Garyc

--- On Sun, 2/14/10, gary clark  wrote:

> From: gary clark 
> Subject: Re: [Twisted-Python] Sending jpeg data over TCP/IP
> To: "Twisted general discussion" 
> Date: Sunday, February 14, 2010, 12:18 PM
> 
> Hey Alexandre,
> 
> The only reason I suggested another server was to distribute the
> load on the system. The files to be sent will consume a lot of
> processing, since .jpeg files etc. are humongous.
> Yes, two connections would be required. You could use the MD5 sum,
> but I will not; essentially you're after a unique identifier that
> represents the file you're sending. What I intend to do is simply
> embed the filename and unique identifier as a header. Hence I do not
> need to recompute the MD5 at the receiving end, which to be honest
> seems overkill.
> 
> Anyway good luck. KISS for software is the best approach.
> 
> Thanks,
> Garyc
> 
> 
> --- On Sun, 2/14/10, Alexandre Quessy wrote:
> 
> > From: Alexandre Quessy 
> > Subject: Re: [Twisted-Python] Sending jpeg data over TCP/IP
> > To: "Twisted general discussion" 
> > Date: Sunday, February 14, 2010, 11:10 AM
> > Hello again everyone,
> > Maybe using two senders/receivers would help. The control protocol,
> > which can use XML or JSON, or whatever, would identify the files by
> > their md5 sum? The file transfer protocol would detect the header of
> > each file to separate them.
> > 
> > a
> > 
> > gary clark wrote:
> > > There are probably several ways to accomplish this; I just needed
> > > to think about it a wee bit longer. One way would be to prepend an
> > > identifier to the file, strip the header from the raw data on
> > > reception and then save the image. I don't think it's complicated.
> > > I may need a separate server to handle the files though.
> > > 
> > > Thanks,
> > > Garyc
> > > 
> > > --- On Sat, 2/13/10, Maarten ter Huurne wrote:
> > > 
> > >> From: Maarten ter Huurne 
> > >> Subject: Re: [Twisted-Python] Sending jpeg data over TCP/IP
> > >> To: "Twisted general discussion" 
> > >> Date: Saturday, February 13, 2010, 6:07 PM
> > >> On Sunday 14 February 2010, Alexandre Quessy wrote:
> > >>
> > >>> This said, sending them using a programmer's solution - not a
> > >>> sysadmin solution - would be closer to my own skills, so I am
> > >>> interested in knowing if this could be suitable. I think, though,
> > >>> that it would be faster to use a transfer protocol that would be
> > >>> implemented in C, not Python. Am I wrong?
> > >> I would suggest implementing it in Python first and then
> > >> benchmarking it. Maybe the simplest implementation is already fast
> > >> enough. Maybe the bottleneck is the network or the disk you're
> > >> writing to; in that case you would be better off upgrading your
> > >> switches or buying an SSD instead of writing C code.
> > >>
> > >> Bye,
> > >>         Maarten
> > >>
> 


___
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python


Re: [Twisted-Python] Sending jpeg data over TCP/IP

2010-02-22 Thread Christopher Armstrong
On Mon, Feb 22, 2010 at 2:03 AM, gary clark  wrote:
> After more research, I think the correct way is to use base64 and just one
> connection. It was good to look into it.
>
> Thanks,
> Garyc

No, base64ing file contents is a terrible thing to do if you're
already writing your own TCP-based protocol. Just length-prefix the
data.
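
To make the length-prefix suggestion concrete, here is a minimal sketch using
only the standard library. The 4-byte big-endian prefix and the names `frame`
and `Deframer` are assumptions made for illustration; nothing in this thread
fixes the prefix size. Twisted itself ships comparable framing in
`twisted.protocols.basic.Int32StringReceiver`. Where base64 inflates the
payload by about a third, the prefix costs four bytes per file.

```python
import struct

# Assumed wire format: 4-byte unsigned big-endian length, then the raw bytes.
PREFIX = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prepend the payload length so the receiver knows where it ends."""
    return PREFIX.pack(len(payload)) + payload


class Deframer:
    """Reassemble complete payloads from arbitrarily split TCP chunks."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk: bytes) -> list:
        """Add received bytes; return every payload completed by this chunk."""
        self._buffer += chunk
        payloads = []
        while len(self._buffer) >= PREFIX.size:
            (length,) = PREFIX.unpack_from(self._buffer)
            end = PREFIX.size + length
            if len(self._buffer) < end:
                break  # body not fully received yet
            payloads.append(self._buffer[PREFIX.size:end])
            self._buffer = self._buffer[end:]
        return payloads
```

The sender writes `frame(jpeg_bytes)` to the connection; the receiver calls
`feed()` from its data-received handler and saves each returned payload. A
filename or identifier header could be framed the same way ahead of the image.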



-- 
Christopher Armstrong
http://radix.twistedmatrix.com/
http://planet-if.com/



Re: [Twisted-Python] Try/catching yielded Exceptions and getting proper tracebacks

2010-02-22 Thread Terry Jones
Hi Paul

> "Paul" == Paul Goins  writes:

You're only printing the exception, not a full traceback, so you don't see
much. I tend to write what you're doing as follows:

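from twisted.internet import defer
from twisted.python import log

@defer.inlineCallbacks
def xmlrpc_dosomething(self):
    # Attach log.err so that a failure is logged with its full traceback.
    d = self._do_something_else()
    d.addErrback(log.err)
    result = yield d
    defer.returnValue(result)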

If you try that you'll see a full traceback. The above lets log.err handle
the failure that comes back via the errback on the deferred you get from
_do_something_else, and log.err knows how to get the full traceback.

I don't know if it's clear, but whenever you call an inlineCallbacks
decorated method/func, you get a deferred back (unless you happen to
mistakenly use inlineCallbacks to decorate something that's not a
generator). You can add errbacks to that deferred, just like any other, and
if you're making the call from inside another inlineCallbacks decorated
function, you can just do as above: add call/errbacks to the deferred, and
then yield it.
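
The difference between printing an exception and logging its traceback can be
seen without Twisted. The sketch below uses only the standard `traceback`
module; `do_something_else` is a hypothetical stand-in for the failing call,
and this is not a description of how `log.err` is implemented.

```python
import traceback


def do_something_else():
    """Hypothetical failing helper, standing in for the real call."""
    return {}["missing"]


def describe_failure():
    """Return (short message, full traceback text) for the same error."""
    try:
        do_something_else()
    except KeyError as exc:
        short = repr(exc)              # roughly what printing the exception shows
        full = traceback.format_exc()  # adds file, line number and call chain
        return short, full


short, full = describe_failure()
```

`short` names only the exception, while `full` also shows that the error was
raised inside `do_something_else`, which is the part that was missing.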

Terry



[Twisted-Python] Looking for companies doing Twisted

2010-02-22 Thread Nicolas Dietrich
Hi there,
I'm not sure if this is the right place to ask, so sorry for the spam in 
advance. Anyway:

I'm looking for companies which might be interested in developing a booking 
system for an emerging Germany-based long-distance railway undertaking during 
this year. This will be a distributed system, so Twisted might be a good 
solution for this.

Please approach me directly for any details.

Thankful for all hints,
Nicolas




[Twisted-Python] IMAP4 Client strange behavior

2010-02-22 Thread César García
Hello all, I have this IMAP4 client: http://www.pastebin.com/m4e387f1a
Most of the code is borrowed from an IRC friend, bob_f. It works fine,
but as soon as more than one email arrives in the inbox, it only prints
the first new message and not the others. I wonder what I am doing
wrong.

Thanks a lot







Here is the code:

#!/usr/bin/env python
#coding=utf-8


"""
IMAP4 client that downloads the contents of the INBOX of a specific account
in order to extract the phone number that should be in the e-mails sent
"""

import StringIO
import sys
from Config import retCredentials

from twisted.internet.task import LoopingCall
from twisted.internet import protocol
from twisted.internet import defer
from twisted.mail import imap4
from twisted.python import util
from twisted.python import log
from twisted.internet import reactor
debug = 1

class IMAP4Client(imap4.IMAP4Client):

    def serverGreeting(self, caps):
        """
        Method called when the server responds
        """
        if debug: print "On serverGreeting"

        self.serverCapabilities = caps
        if self.greetDeferred is not None:
            d, self.greetDeferred = self.greetDeferred, None
            d.addCallback(self.cbLogin)
            d.callback(self)


    def cbLogin(self, proto):
        """
        Callback to IMAP login
        """
        if debug: print "On Login"

        login = self.login(self.factory.username, self.factory.password)
        login.addCallback(self.startPolling)


    def startPolling(self, proto):
        """
        Callback to poll every x seconds
        """
        call = LoopingCall(self.selectMailbox, mailbox="INBOX")
        call.start(20, now=True)


    def selectMailbox(self, mailbox):
        """
        Select the mailbox to examine
        """
        if debug: print "On selectMailbox"

        mailbox = self.factory.mailbox
        return self.select(mailbox).addCallback(self.cbSelectSuccess)


    def cbSelectSuccess(self, selected):
        """
        Examine the INBOX mailbox for new mails
        """
        if debug: print "On cbSelectSuccess"

        self.messageCount = selected['EXISTS']
        print "Messages: ", self.messageCount

        unseen = selected['EXISTS'] - selected['RECENT']

        if selected['RECENT'] == 0:
            print "No new messages"
            return

        return self.fetchMessage("%s:*" % (unseen)
            ).addCallback(self.cbProcMessages)

    def cbProcMessages(self, messages):

        print messages


class IMAP4ClientFactory(protocol.ClientFactory):

    protocol = IMAP4Client

    def __init__(self, username, password, onConn):

        self.username = username
        self.password = password
        self.mailbox = 'INBOX'
        self.onConn = onConn

    def buildProtocol(self, addr):
        if debug: print "On buildProtocol"
        p = self.protocol()
        p.factory = self
        p.greetDeferred = self.onConn
        auth = imap4.CramMD5ClientAuthenticator(self.username)
        p.registerAuthenticator(auth)

        return p

    # Spelled clientConectionFailed in the original paste, so Twisted
    # never called it.
    def clientConnectionFailed(self, connector, reason):
        d, self.onConn = self.onConn, None
        d.errback(reason)



def ebConnection(reason):
    log.startLogging(sys.stdout)
    log.err(reason)
    reactor.stop()



PORT = 143
RESULT = "INBOX"


def main():
    credentials = retCredentials()
    hostname = credentials['server']
    username = credentials['mailbox']
    password = util.getPassword('IMAP4 Password: ')

    onConn = defer.Deferred()
    onConn.addErrback(ebConnection)

    factory = IMAP4ClientFactory(username, password, onConn)

    reactor.connectTCP(hostname, PORT, factory)
    reactor.run()

if __name__ == "__main__":
    main()
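
A note on the range passed to fetchMessage in cbSelectSuccess. If the recent
messages are the highest-numbered ones in the mailbox (typical, but an
assumption here), the first new message is number EXISTS - RECENT + 1, so
starting at EXISTS - RECENT also fetches one older message. The sketch below
shows only that arithmetic; `new_message_range` is a hypothetical helper name,
and this is not offered as the cause of the behaviour described above.

```python
def new_message_range(exists, recent):
    """Return an IMAP sequence set covering only the newest `recent` messages.

    Assumes the recent messages occupy the highest sequence numbers.
    Returns None when there is nothing new to fetch.
    """
    if recent <= 0:
        return None
    first_new = exists - recent + 1
    return "%d:*" % first_new
```

With 10 messages of which 2 are recent, this yields the set "9:*" rather than
"8:*".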

-- 
http://celord.blogspot.com/



Re: [Twisted-Python] [ANNOUNCE] Twisted 10.0.0pre1 is now released

2010-02-22 Thread Kevin Horn
On Sun, Feb 21, 2010 at 9:50 PM, Jonathan Lange  wrote:

> Live from PyCon Atlanta, I'm pleased to herald the approaching
> footsteps of the 10.0 release.
>
> Tarballs for the first Twisted 10.0.0 pre-release are now available at:
>  
> http://people.canonical.com/~jml/Twisted/
>
> This release is the first release ever with the new NEWS building
> system, which turns out to be utterly fantastic.
>
> We're also using this release to actually hammer out a release
> process. You can find the draft at:
>  http://twistedmatrix.com/trac/wiki/ReleaseProcess
>
> Please feel free to update it with questions, thoughts, corrections and
> advice.
>
>
Here's a few...perhaps these should be added to the "Open Questions"
section:

1. How/when in this process are the Windows installers and/or MacOS .dmg
files created?
   (I presume .deb and .rpm packages are left up to Linux distro packagers)
2. How/when in this process are the docs built?
3. When should the front page of the wiki be updated?


> Thanks,
> jml
>
>
No, thank _you_. :)

Kevin Horn


[Twisted-Python] twisted.protocols.sip

2010-02-22 Thread Lorenzo Mainardi
Hello,
I'm trying to extend the class twisted.protocols.sip.MessageParser to
create my own parser.
I read the documentation and found this: "Shouldn't
be connected to actual transport."
What does it mean? Can't I use it by binding it to a socket?

I also found that many other objects in twisted.protocols.sip are not
complete and/or not working (some of them always return
NotImplementedYet). Is that correct, or am I making a mistake?

-- 
LORENZO MAINARDI
Email: lorma...@gmail.com
Linux Registered User: 461615
Key Fingerprint: AC63 5C15 562F 71AF C853  4D4A C03F 75EB 52F4 A0D0



Re: [Twisted-Python] debugging a memory leak

2010-02-22 Thread Alec Matusis
Werner

I am using your code, and it shows essentially the same thing as Heapy:
the counts of all common objects more or less agree.
The 'Total size' shown in Heapy

When I start the process, both python object sizes and their counts rise
proportionally to the numbers of reconnected clients, and then they
stabilize after all clients have reconnected.
At that moment, the "external" RSS process size is about 260MB. The
"internal size" of all python objects reported by Heapy is about 150MB.
After two days, the internal sizes/counts stay the same, but the external
size grows to 1500MB.

Python object counts/total sizes are measured from the manhole.
Is this sufficient to conclude that this is a C memory leak in one of the
external modules or in the Python interpreter itself?
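
A rough way to log both figures from inside the process, so they can be
compared over time, is sketched below with the standard library only. It does
not use Heapy, and the function names are hypothetical. Note two limitations:
`ru_maxrss` is the peak rather than the current RSS (in kilobytes on Linux),
and the object total covers only what the garbage collector tracks, so memory
held by C extensions does not appear in it.

```python
import gc
import resource
import sys


def python_object_bytes():
    """Sum sys.getsizeof over every object the garbage collector tracks.

    This undercounts: untracked objects (ints, strings) and memory
    allocated inside C extensions are not included.
    """
    return sum(sys.getsizeof(obj, 0) for obj in gc.get_objects())


def peak_rss_kilobytes():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def memory_report():
    """Return both figures so they can be logged side by side."""
    return {
        "python_objects_kb": python_object_bytes() // 1024,
        "peak_rss_kb": peak_rss_kilobytes(),
    }
```

If the object total stays flat while the RSS figure keeps climbing, the growth
is happening outside the objects Python can see, which is consistent with, but
does not prove, a leak or fragmentation in C code.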

> -Original Message-
> From: twisted-python-boun...@twistedmatrix.com [mailto:twisted-python-
> boun...@twistedmatrix.com] On Behalf Of Werner Thie
> Sent: Friday, February 19, 2010 4:10 PM
> To: Twisted general discussion
> Subject: Re: [Twisted-Python] debugging a memory leak
> 
> Hi Alec
> 
> ...and they promised you that with a gc'ed language there will never be
> a memory problem again, you just plain forget about it.
> 
> I was stuck in the same position as you, and after lots of probing the
> following attempt helped a lot to correct what was later proved to be
> overly optimistic coding: holding on to objects in other objects for
> performance/practical reasons. Producing non-collectable cycles in
> Twisted is probably as easy as forgetting about memory when you have
> Alzheimer's.
> 
> Proving and working on the problem was only possible on the production
> machine under real load situations. I went ahead and created a manhole
> service on the production server, allowing me to peek at the Python
> object space without disturbing it too much. The tool I used was
> the code you find included below.
> 
> After cleaning out all the self-produced cycles, our server processes
> stabilized at roughly 280 to 320 MB per process and have now been running
> stably for months, with more than 20k logins per day and an average
> connection time per user of 25 minutes, playing games delivered by
> nevow/athena LivePages.
> 
> As I said before, all cycles I found in our SW were introduced by
> patterns like
> 
> def beforeRender(self, ctx):
>     self.session = inevow.ISession(ctx)
> 
> The included code helps to identify the number of objects around.
> Although it's a primitive tool, it shines the light where it's needed, and
> if you see certain object counts run away then you have at least
> identified the surroundings where the non-collectable cycles are built.
> 
> Why didn't I use heapy/guppy to find out? I wasn't able to
> find the evidence for what I was suspecting with all the tools I tried
> (and boy, I tried for WEEKS). Avid users of heapy will most probably
> disagree and tell me it would have been easy. But in a situation like this,
> everything that works to move you out of the pothole you're in is the
> right thing to do.
> 
> HTH, Werner
> 
> import gc
> 
> exc = [
>     "function",
>     "type",
>     "list",
>     "dict",
>     "tuple",
>     "wrapper_descriptor",
>     "module",
>     "method_descriptor",
>     "member_descriptor",
>     "instancemethod",
>     "builtin_function_or_method",
>     "frame",
>     "classmethod",
>     "classmethod_descriptor",
>     "_Environ",
>     "MemoryError",
>     "_Printer",
>     "_Helper",
>     "getset_descriptor",
>     "weakref"
> ]
> 
> inc = [
>     'myFirstSuspect',
>     'mySecondSuspect'
> ]
> 
> prev = {}
> 
> def dumpObjects(delta=True, limit=0, include=inc, exclude=[]):
>     global prev
>     if include != [] and exclude != []:
>         print 'cannot use include and exclude at the same time'
>         return
>     print 'working with:'
>     print '   delta: ', delta
>     print '   limit: ', limit
>     print ' include: ', include
>     print ' exclude: ', exclude
>     # count live objects per class name
>     objects = {}
>     gc.collect()
>     oo = gc.get_objects()
>     for o in oo:
>         if getattr(o, "__class__", None):
>             name = o.__class__.__name__
>             if ((exclude == [] and include == []) or
>                 (exclude != [] and name not in exclude) or
>                 (include != [] and name in include)):
>                 objects[name] = objects.get(name, 0) + 1
> ##              if more:
> ##                  print o
>     pk = prev.keys()
>     pk.sort()
>     names = objects.keys()
>     names.sort()
>     # print the change since the previous call, then the current count
>     for name in names:
>         if limit == 0 or objects[name] > limit:
>             if not prev.has_key(name):
>                 prev[name] = objects[name]
>             dt = objects[name] - prev[name]
>             if delta or dt != 0:
>                 print '%0.6d -- %0.6d -- ' % (dt, objects[name]), name
>             prev[name] = objects[name]
> 
> def getObjects(oname):
>     """
>     gets an object list with all the named objects out of the sea of
>     gc'ed objects
>     """
>     olist = []
>     objects = {}
>     gc.collect()
>     oo = gc.get_objects()
>     for o in 
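
The quoted code is cut off above and is written for Python 2. As a hedged
illustration of the same object-census idea, here is a short Python 3 sketch
that counts gc-tracked objects per class name and reports what changed between
two snapshots. `count_objects` and `census_delta` are hypothetical names, not
part of Werner's tool.

```python
import gc
from collections import Counter


def count_objects(include=None):
    """Count gc-tracked objects by class name, optionally only `include`."""
    gc.collect()
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    if include is not None:
        counts = Counter({name: counts[name] for name in include})
    return counts


def census_delta(previous, current):
    """Return {class name: change in count} for every name that changed."""
    names = set(previous) | set(current)
    return {
        name: current[name] - previous[name]
        for name in sorted(names)
        if current[name] != previous[name]
    }
```

Taking a snapshot from the manhole every few hours and printing
`census_delta(old, new)` shows which classes keep growing, which is the signal
the original tool prints in its first column.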

Re: [Twisted-Python] debugging a memory leak

2010-02-22 Thread Maarten ter Huurne
On Tuesday 23 February 2010, Alec Matusis wrote:

> When I start the process, both python object sizes and their counts rise
> proportionally to the numbers of reconnected clients, and then they
> stabilize after all clients have reconnected.
> At that moment, the "external" RSS process size is about 260MB. The
> "internal size" of all python objects reported by Heapy is about 150MB.
> After two days, the internal sizes/counts stay the same, but the external
> size grows to 1500MB.
> 
> Python object counts/total sizes are measured from the manhole.
> Is this sufficient to conclude that this is a C memory leak in one of the
> external modules or in the Python interpreter itself?

In general, there are other reasons why heap size and RSS size do not match:
1. pages are empty but not returned to the OS
2. pages cannot be returned to the OS because they are not completely empty

It seems Python has different allocators for small and large objects:
http://www.mail-archive.com/python-l...@python.org/msg256116.html
http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm

Assuming Python uses malloc for all its allocations (does it?), it is the 
malloc implementation that determines whether empty pages are returned to 
the OS. Under Linux with glibc (your system?), empty pages are returned, so 
there reason 1 does not apply.

Depending on the allocation behaviour of Python, the pages may not be empty 
though, so reason 2 is a likely suspect.

Python extensions written in C could also leak or fragment memory. Are you 
using any extensions that are not pure Python?

Bye,
Maarten



Re: [Twisted-Python] Try/catching yielded Exceptions and getting proper tracebacks

2010-02-22 Thread Paul Goins
Hi Terry,

> You're only printing the exception, not a full traceback, so you don't see
> much. I tend to write what you're doing as follows: [...]

Excellent.  I'll give it a try.  I had a feeling it was something like
that which I was missing.

I already understood that inlineCallbacks returns Deferreds, but the
clarification is appreciated nonetheless.  I think it was just my lack
of understanding of how Failures are handled, logged, etc.

- Paul




Re: [Twisted-Python] debugging a memory leak

2010-02-22 Thread Alec Matusis
Hi Maarten,

Your link
http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm
seems to suggest that even though the interpreter does not release memory
back to the OS, it can be re-used by the interpreter.
If this was our problem, I'd expect the memory to be set by the highest
usage, as opposed to it constantly leaking: in my case, the load is
virtually constant, but the memory still leaks over time.

The environment is Linux 2.6.24 x86-64, the extensions used are MySQLdb,
pyCrypto (latest stable releases for both).



Re: [Twisted-Python] debugging a memory leak

2010-02-22 Thread Werner Thie
Hi

If memory that is not released to the OS can be reused by the
interpreter, thanks to the suballocation system it uses, then overall
memory usage should eventually level out over time. That's what I
observe with our processes (sitting at several 100 MB per process). We
are using external C libraries which do lots of malloc/free, and one of
the bigger sources of pain is indeed bringing such a library to a point
where it's clean, not only by freeing all allocated memory in every
circumstance but also with respect to Python refcounting. I usually go
through all the motions to build up a complete debug chain for all
modules involved in a project and write a test bed to prove a clean and
proper implementation.

So if you're using C/C++ based modules in your project, I would mark them
as highly suspicious to be responsible for leaks until proven otherwise.

Not to bother you with numbers but I usually allocate about 30% of 
overall project time to bring a server into a production ready state, 
meaning uptimes of months/years, no fishy feelings, no performance 
oscillations, predictable caving and recuperating when overloaded, just 
all the things you have to tick to sign off a project as completed, 
meaning you don't have to do daily 'tire kicking' maintenance and 
periodic reboots.

Werner
