[C-API] malloc error in PyDict_New

2010-03-26 Thread Jonas H.

Hi there,

I'm currently diving into Python C programming and I have a problem with 
`PyDict_New`.


My application receives a SIGABRT from malloc every time I execute 
`PyDict_New`. malloc throws the following error:


malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) 
(((char *) &((av)->bins[((1) - 1) * 2])) [snip]' failed.


gdb gives me the following traceback:

Program received signal SIGABRT, Aborted.
0x0012d422 in __kernel_vsyscall ()
(gdb) bt full
#0  0x0012d422 in __kernel_vsyscall ()
#5  0x003fef8c in malloc () from /lib/tls/i686/cmov/libc.so.6
#6  0x001b129c in new_arena (nbytes=3221223842) at ../Objects/obmalloc.c:534
i = 
numarenas = 16
arenaobj = 0x0
excess = 16
#7  PyObject_Malloc (nbytes=3221223842) at ../Objects/obmalloc.c:794
bp = 
pool = 
next = 
size = 4983326
#8  0x001baef5 in PyString_FromString (str=0x2964bf "") at 
../Objects/stringobject.c:138

op = 0x0
#9  0x001a9d55 in PyDict_New () at ../Objects/dictobject.c:227
mp = 
#10 0x08048fc0 in Transaction_new () at bjoern.c:32
transaction = 0x80503a0
#11 0x08049309 in on_sock_accept (mainloop=0x13a120, 
accept_watcher=0xb770, revents=1) at bjoern.c:109

[snip]
#12 0x00130594 in ev_invoke_pending () from /usr/lib/libev.so.3
#13 0x00135774 in ev_loop () from /usr/lib/libev.so.3
#14 0x080496e0 in main (argcount=1, args=0xb864) at bjoern.c:207
[snip]


I have walked through millions of Google pages but I couldn't find any 
explanation of what causes the allocation error.  I tried moving the 
`PyDict_New` call somewhere else so that it is invoked earlier/later. The 
only effect I got is a "memory corruption" error reported by glibc.



Could anybody tell me what exactly I'm doing wrong? It is quite possible 
that I fscked up some pointers or memory ranges as this is my first C 
project.


You can find the whole source at github:
http://github.com/jonashaag/bjoern

The call to `PyDict_New` is here:
http://github.com/jonashaag/bjoern/blob/master/bjoern.c#L32


Thanks for your help!

Jonas
--
http://mail.python.org/mailman/listinfo/python-list


[C-API] Weird sys.exc_info reference segfault

2010-10-02 Thread Jonas H.

Hello list,

I have a really weird reference problem with `sys.exc_info`, and, if I'm 
right, function frames.


The software in question is bjoern, a WSGI web server written in C, 
which you can find at http://github.com/jonashaag/bjoern.


This WSGI application:

  def app(env, start_response):
      start_response('200 alright', [])
      try:
          a
      except:
          import sys
          sys.exc_info()
      return ['hello']

  import bjoern
  bjoern.run(app, '0.0.0.0', 8080)

works perfectly; however, if I make the 7th line an assignment:

  x = sys.exc_info()

I get a segmentation fault after a few requests, the stack trace being:

  #1  frame_clear ()
  #2  collect ()
  #3  _PyObject_GC_Malloc ()
  #4  PyType_GenericAlloc ()
  #5  BaseException_new ()
  #6  type_call ()
  #7  PyObject_Call ()
  #8  PyEval_CallObjectWithKeywords ()
  #9  PyErr_NormalizeException ()
  #10 PyEval_EvalFrameEx ()
  #11 PyEval_EvalCodeEx ()
  #12 function_call ()
  #14 PyObject_CallFunctionObjArgs ()
  #15 wsgi_call_application (request=...) at bjoern/wsgi.c:33

Now that is weird. The only difference between the two functions is that 
the second one (with the assignment) keeps a reference to the exc_info 
tuple in the function frame.
The `PyThreadState_GET()->exc_{type,value,traceback}` values, however, 
should be the same in both cases, because the `except:` cleanup resets 
those to NULL, shouldn't it?


Do you have any tips on how to debug this?

Thanks in advance,
Jonas


Re: [C-API] Weird sys.exc_info reference segfault

2010-10-03 Thread Jonas H.

On 10/03/2010 01:16 AM, Antoine Pitrou wrote:

You should check that you aren't doing anything wrong
with "env" and "start_response" (like deallocate them forcefully).


I commented out the `Py_DECREF(start_response)` after the `app` call and 
the crash was gone. `start_response` is created via `PyObject_NEW` at 
run time for every `app` call and `PyObject_FREE`d after that call.


I do not understand why I'm not supposed to DECREF the start_response 
callable after the call -- doesn't a function INCREF its arguments when 
called, so I'm free to DECREF them?  If not, how can I know at which 
point I can safely do the DECREF?


Jonas


Re: [C-API] Weird sys.exc_info reference segfault

2010-10-03 Thread Jonas H.

On 10/03/2010 03:47 PM, Antoine Pitrou wrote:

You shouldn't call PyObject_FREE yourself, but instead rely on
Py_DECREF to deallocate it if the reference count drops to zero.
So, instead of commenting out Py_DECREF and keeping PyObject_FREE, I'd
recommend doing the reverse. That way, if a reference is still living
in the frame, the object doesn't get deallocated too early.


Humm. Now the behaviour is as follows:

with assignment to local variable
--
* start_response = PyObject_NEW(...) -> start_response->ob_refcnt=1
* wsgiapp(environ, start_response)   -> ob_refcnt=2
* Py_DECREF(start_response)  -> ob_refcnt=1


without assignment
--
* start_response = PyObject_NEW(...) -> start_response->ob_refcnt=1
* wsgiapp(environ, start_response)   -> ob_refcnt=1
* Py_DECREF(start_response): CRASH

I think I'll compile Python with debug support to check out what's going 
wrong in the second case.
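
The effect of an extra owner on the count can be reproduced loosely from Python itself. This is a minimal sketch, assuming CPython (where `sys.getrefcount` is available); `StartResponse` is a hypothetical stand-in class, not bjoern's C type, and `frame_locals` merely imitates a frame that keeps a reference:

```python
import sys

class StartResponse(object):
    """Hypothetical stand-in for the C-level start_response object."""

start_response = StartResponse()
baseline = sys.getrefcount(start_response)

# A second owner, comparable to a frame local that keeps the object alive.
frame_locals = {'x': start_response}
with_extra_owner = sys.getrefcount(start_response)

# Dropping that owner is the Python-level analogue of a Py_DECREF.
del frame_locals
after_release = sys.getrefcount(start_response)

print(with_extra_owner - baseline)
print(after_release - baseline)
```

The absolute numbers reported by `sys.getrefcount` include temporary references, so only the differences are meaningful here.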


Jonas


Re: [C-API] Weird sys.exc_info reference segfault

2010-10-04 Thread Jonas H.

On 10/03/2010 11:52 PM, Antoine Pitrou wrote:

You probably have a problem in your tp_dealloc implementation.


`tp_dealloc` is NULL...


Re: [C-API] Weird sys.exc_info reference segfault

2010-10-04 Thread Jonas H.

On 10/04/2010 10:46 AM, Jonas H. wrote:

On 10/03/2010 11:52 PM, Antoine Pitrou wrote:

You probably have a problem in your tp_dealloc implementation.


`tp_dealloc` is NULL...


Alright, `tp_dealloc` must not be NULL because it's called by 
`_Py_Dealloc`.  The C-API tutorial is quite confusing here:



[...] here's a minimal, but __complete__, module that defines a new type:

[...]
static PyTypeObject noddy_NoddyType = {
[...]
0, /*tp_dealloc*/
[...]
};


So I thought "complete" meant "it works". Actually, DECREFing an object 
of *that* type does not work; it crashes because of the NULL 
`tp_dealloc` function pointer that is called in `_Py_Dealloc`.


Jonas


Re: [C-API] Weird sys.exc_info reference segfault

2010-10-05 Thread Jonas H.

On 10/04/2010 11:41 PM, Antoine Pitrou wrote:

Well, it should work, but you have to call PyType_Ready() to fill in
the NULL fields with default values (for those where it's necessary).
Does it solve it for you?


Yes, thank you! Although I do not understand which fields I have to 
provide. I want an object that behaves like a function: it should be 
callable and have a __dict__, but it should not be a subclass of object, 
so no new instances of the object's type should be allowed to be created, 
and no subclasses either.


Right now I have this minimal struct:

static PyTypeObject StartResponse_Type = {
    PyObject_HEAD_INIT(&PyType_Type)
    0,                          /* ob_size */
    "start_response",           /* tp_name */
    sizeof(StartResponse),      /* tp_basicsize */
    0,                          /* tp_itemsize */
    (destructor)PyObject_FREE,  /* tp_dealloc */
    0, 0, 0, 0, 0, 0, 0, 0, 0,  /* tp_print, tp_{get,set}attr, stuff */
    start_response              /* tp_call */
};

I'm not sure about the `PyObject_HEAD_INIT` argument, but passing NULL 
to it made `dir(obj)` crash.  So does setting `GenericGetAttr` as 
`tp_getattr`.



Re: How to save a binary file?

2010-10-05 Thread Jonas H.

On 10/05/2010 11:11 PM, hid...@gmail.com wrote:

Hello, how i can save a binary file, i read in the manual in the IO area
but doesn' t show how to save it.
Here is the code what i am using:
s = open('/home/hidura/test.jpeg', 'wb')
s.write(str.encode(formFields[5]))
s.close()


So where's the problem? That code should work. Anyway, you want to have 
a look at with-statements.
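
A minimal sketch of the same write using a with-statement, assuming the data is already a bytes object; `payload` and the temporary path are made up for illustration:

```python
import os
import tempfile

payload = b'\xff\xd8\xff\xe0 not a real JPEG'   # hypothetical binary data
path = os.path.join(tempfile.mkdtemp(), 'test.jpeg')

# 'wb' opens the file for binary writing; the file is closed when the block exits.
with open(path, 'wb') as fileobject:
    fileobject.write(payload)

# Read the bytes back to check that they were written unchanged.
with open(path, 'rb') as fileobject:
    written = fileobject.read()
```

The file is closed on leaving the `with` block even if `write` raises.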


Jonas


Re: [C-API] Weird sys.exc_info reference segfault

2010-10-06 Thread Jonas H.

On 10/06/2010 02:01 PM, Antoine Pitrou wrote:

It shouldn't. Are you sure you're calling PyType_Ready in the module
initialization routine?


Yeah. The problem was that the type struct was declared 'static' in 
another module so the changes `PyType_Ready` made to the struct weren't 
applied correctly.




By the way, it is recommended to use at least Py_TPFLAGS_DEFAULT for
tp_flags.


Thanks, but I chose not to use those flags. I don't need any.


tp_getattr has the same signature as PyObject_GetAttrString. You're
looking for tp_getattro, which takes the attribute name as a PyObject *
rather than as a char *.


Thanks again, my fault :-)

I think my problems are solved and my questions answered -- thank you so 
much for your patience!


Jonas


Re: frozendict (v0.1)

2010-10-08 Thread Jonas H.

On 10/08/2010 02:23 AM, kj wrote:

I imagine that frozenset is better than sorted(tuple(...)) here,
but it's not obvious to me why.


dicts are unsorted. That means their item-order is undefined. So are sets.

If you want a hash that is independent of the order of items, you 
could ensure the items are always in the same order when you do the 
hashing; or you could use a hashing algorithm that ignores item order.


As creating a `frozenset` is probably more efficient than sorting, that 
is the preferred solution.


Here's my implementation suggestion:

class frozendict(dict):
    def _immutable_error(self, *args, **kwargs):
        raise TypeError("%r object is immutable" % self.__class__.__name__)

    __setitem__ = __delitem__ = clear = pop \
        = popitem = setdefault = update = _immutable_error

    def __hash__(self):
        return hash(frozenset(self.iteritems()))

Only 9 lines :-)
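
On Python 3, `iteritems` no longer exists, so the hash would be computed from `items()` instead. A minimal sketch of the same idea under that assumption (the name `FrozenDict` is chosen here only to keep it apart from the class above, and it is not a complete immutable mapping):

```python
class FrozenDict(dict):
    def _immutable_error(self, *args, **kwargs):
        raise TypeError("%r object is immutable" % self.__class__.__name__)

    # Route every mutating method to the error helper.
    __setitem__ = __delitem__ = clear = pop \
        = popitem = setdefault = update = _immutable_error

    def __hash__(self):
        # frozenset makes the hash independent of item order.
        return hash(frozenset(self.items()))

a = FrozenDict([('x', 1), ('y', 2)])
b = FrozenDict([('y', 2), ('x', 1)])
lookup = {a: 'found'}
```

Because `a` and `b` compare equal and hash equally, `lookup[b]` finds the entry stored under `a`.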

Jonas


Re: frozendict (v0.1)

2010-10-08 Thread Jonas H.

On 10/08/2010 03:27 PM, kj wrote:

I tried to understand this by looking at the C source but I gave
up after 10 fruitless minutes.  (This has been invariably the
outcome of all my attempts at finding my way through the Python C
source.)


It's not you. CPython's code is ... [censored]

Anyway, you don't even need to read C code to understand how sets are 
implemented.


There is a (now deprecated) Python module, Lib/sets.py, that has 
implementations for `Set` and `ImmutableSet` (nowadays `set` and 
`frozenset`).


The implementation strategy you can see there is quite simple. The code 
uses dictionary keys to store the set items and "ignores" the dictionary 
values, so that `.add(value)` is implemented as `._dict[value] = 
some_value_nobody_cares_about`.


Here comes a very limited example set implementation using a dict:

class PrimitiveSet(object):
    def __init__(self):
        self._dict = {}

    def add(self, value):
        self._dict[value] = True

    def __contains__(self, value):
        return value in self._dict

    def __repr__(self):
        return 'PrimitiveSet(%r)' % self._dict.keys()

>>> s = PrimitiveSet()
>>> 'hello' in s
False
>>> s.add('hello')
>>> 'hello' in s
True
>>> s
PrimitiveSet(['hello'])
>>> s.add(tuple(xrange(10)))
>>> s
PrimitiveSet([(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), 'hello'])
>>> s.add(xrange(5))
>>> s
PrimitiveSet([(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), xrange(5), 'hello'])

This has a few implications for sets:
* dict keys are unordered/unsorted. So are sets.
* dict keys are unique. Same for set values.
* dict keys have to be hashable (immutable). Same for set values.

So far our implementation is not hashable, and we need a custom 
implementation for __hash__ (because dicts aren't hashable, so we cannot 
re-use dictionary methods).
There is one requirement for set hashes: they have to be independent of 
the item order (there *is* an order in memory of course, and it may vary 
depending on the order assignments to our dict are done).


Here is an extract from the Python set implementation, 
`BaseSet._compute_hash`:


def _compute_hash(self):
    # Calculate hash code for a set by xor'ing the hash codes of
    # the elements.  This ensures that the hash code does not depend
    # on the order in which elements are added to the set. [...]
    result = 0
    for elt in self:
        result ^= hash(elt)
    return result
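
The order independence of the XOR approach is easy to check in isolation. A minimal sketch, where `order_independent_hash` is a hypothetical helper name and not part of the sets module:

```python
from functools import reduce
from operator import xor

def order_independent_hash(items):
    # XOR is commutative and associative, so the order of items does not matter.
    return reduce(xor, (hash(item) for item in items), 0)

first = order_independent_hash(['spam', 'eggs', 42])
second = order_independent_hash([42, 'eggs', 'spam'])
print(first == second)
```

Both calls see the same elements in a different order and produce the same value.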

Hope this helps :-)

Jonas


Re: Help with pointers when calling from python to C

2010-10-08 Thread Jonas H.

On 10/08/2010 05:23 PM, Carolyn MacLeod wrote:

"How do I pass an integer by reference to a C function?"


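That's impossible with a plain Python integer, because ints are immutable. 
You need a mutable C integer instead: either write a wrapper in C, or use 
`ctypes.c_int` together with `ctypes.byref` from the standard library.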
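
The standard-library `ctypes` module can hand a C function a pointer to a mutable C integer. A minimal sketch, assuming no real C library is at hand: a Python callback wrapped with `ctypes.CFUNCTYPE` stands in for a C function of the form `void f(int *out)`, and `set_to_42` is a hypothetical name:

```python
import ctypes

# Prototype for `void f(int *out)`.
PROTOTYPE = ctypes.CFUNCTYPE(None, ctypes.POINTER(ctypes.c_int))

def _set_to_42(out):
    # `out` is a pointer to a C int; write through it like `*out = 42` in C.
    out[0] = 42

# Stand-in for a function loaded from a shared library.
set_to_42 = PROTOTYPE(_set_to_42)

value = ctypes.c_int(0)
set_to_42(ctypes.byref(value))   # pass the integer by reference
print(value.value)
```

With a real library, the same `ctypes.byref(value)` argument would be passed to the function object obtained from `ctypes.CDLL`.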



Re: My first Python program

2010-10-12 Thread Jonas H.

On 10/12/2010 09:14 PM, Seebs wrote:

http://github.com/wrpseudo/pseudo/blob/master/makewrappers


Just a few pointers, looks quite good to me for a newbie :)

* Less action in __init__.
* Use `open` instead of `file` to open a file
* Have a look at context managers for file handling (avoids doing 
error-prone stuff like __del__)
* Your `del` in line 464 is useless. A reference will be removed from 
the object bound to the local variable 'source' anyway because of the 
re-assignment.
* According to common Python style guides you should not use underscores 
in class names.

* No need for 'r' in `open` calls ('r' is the default mode)
* `line == ''` can be written more Pythonically as `not line`
* `str.{r,l,}strip` removes '\n\t\r ' by default, no need for an 
argument here (line 440 for example)

* you might want to pre-compile regular expressions (`re.compile`)
* raising `Exception` rather than a subclass of it is uncommon.
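
A few of these points put together in a minimal sketch; `WrapperSyntaxError` and `read_names` are hypothetical names and do not come from makewrappers:

```python
import io

class WrapperSyntaxError(ValueError):
    """Hypothetical exception, more specific than a bare Exception."""

def read_names(fileobject):
    names = []
    for line in fileobject:
        line = line.strip()      # strips spaces, tabs and newlines by default
        if not line:             # same test as `line == ''`
            continue
        if ' ' in line:
            raise WrapperSyntaxError("unexpected space in %r" % line)
        names.append(line)
    return names

names = read_names(io.StringIO(u"open\n\n  close\t\n"))
print(names)
```

Callers can then catch `WrapperSyntaxError` specifically instead of catching every `Exception`.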

Hope that helps :-)

Jonas


Re: Reading after a symbol..

2010-10-12 Thread Jonas H.

On 10/12/2010 10:48 PM, Pratik Khemka wrote:

Likewise I want to read the number after the '#' and store it in num. The 
problem is that the number can be a 1/2/3/4 digit number. So is there a way in 
which I can define num so that it contains the number after '#' irrespective of 
how many digits the number is. Because the problem is that the above code will 
not work for scenarios when the number is  not 2 digits..


Easy with regular expressions:

>>> import re
>>> re.search(r'(\d{2,4})', 'foo123bar').group(1)
'123'

That regular expression basically means "match any sequence of 2 to 4 
digits (0-9)". Note that this would also match against a sequence that 
is not surrounded by non-digits (hence, a sequence that is longer than 4 
digits). You could work around that with something like this:


[^\d](\d{2,4})[^\d]

That's the expression used above but enclosed in '[^\d]', which 
stands for "anything but a digit", so the complete expression now 
matches "any sequence of 2 to 4 digits that is surrounded by 
non-digits". Note that this expression wouldn't match a string 
that has no surrounding characters. So for example '1234' or '1234a' or 
'a1234' won't be matched, but 'a1234b' will be.
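
For the original question (a number of any length right after '#'), anchoring on the '#' itself may be simpler. A minimal sketch with made-up sample strings; the lookaround variant is an alternative to '[^\d]' that also works at the start and end of a string:

```python
import re

# Any run of digits directly after '#', whatever its length.
after_hash = re.search(r'#(\d+)', 'item #1234 shipped').group(1)

# 2 to 4 digits that are not adjacent to other digits.
bounded = re.compile(r'(?<!\d)(\d{2,4})(?!\d)')
edge_match = bounded.search('1234').group(1)
too_long = bounded.search('a12345b')   # five digits in a row: no match

print(after_hash)
print(edge_match)
print(too_long)
```

`int(after_hash)` then gives the number itself if it is needed as an integer.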


Hope this helps :-)

Jonas


Re: My first Python program

2010-10-13 Thread Jonas H.

On 10/13/2010 06:48 PM, Seebs wrote:

Is it safe for me to assume that all my files will have been flushed and
closed?  I'd normally assume this, but I seem to recall that not every
language makes those guarantees.


Not really. Files will be closed when the garbage collector collects the 
file object, but you can't be sure the GC will run within the next N 
seconds/instructions or something like that. So you should *always* make 
sure to close files after using them. That's what context managers were 
introduced for.


with open('foobar') as fileobject:
    do_something_with(fileobject)

basically is equivalent to (simplified!)

fileobject = open('foobar')
try:
    do_something_with(fileobject)
finally:
    fileobject.close()

So you can be sure `fileobject.close()` is called in *any* case.


* you might want to pre-compile regular expressions (`re.compile`)


Thought about it, but decided that it was probably more complexity than I
need -- this is not a performance-critical thing.  And even if it were, well,
I'm pretty sure it's I/O bound.  (And on my netbook, the time to run this
is under .2 seconds in Python, compared to 15 seconds in shell, so...)


Forget about my suggestion. As someone pointed out in another post, 
regular expressions are cached anyway.



I'm a bit unsure as to how to pick the right subclass, though.


There are a few pointers in the Python documentation on exceptions.

Jonas


Re: My first Python program

2010-10-13 Thread Jonas H.

On 10/13/2010 11:26 PM, Seebs wrote:

 stderr.write(
 "WARNING:"
 " Pants on fire\n")


Hmm.  So I just indent stuff inside the ()s or whatever?  I can work with
that.


I think common is

stderr.write("WARNING: "
             "Pants on fire")

or

stderr.write(
    "WARNING: "
    "Pants on fire"
)

If you haven't got parentheses (or brackets or braces) around an 
expression and you want it to be multi-line, you need a '\' at the end of 
each line, just like C macros:


msg = "WARNING: " \
  "Pants on fire"

Though that is not commonly used afaik.
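
Inside parentheses, adjacent string literals are joined into one string, so `write` still receives a single argument. A minimal sketch (the variable name `message` is only for illustration):

```python
import sys

# Adjacent string literals are concatenated; no comma and no '+' needed.
message = ("WARNING: "
           "Pants on fire\n")

sys.stderr.write(message)
```

A comma between the two literals would instead pass two arguments, which `write` does not accept.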


Re: Minimal D

2010-10-17 Thread Jonas H.

On 10/16/2010 06:04 PM, Kruptein wrote:

Hey, I've written a small "IDE".  It is written in python using the
python toolkit and
offers an advanced text-editor, file-manager, ftp-client, sql-
client(in development) and more towards the future.


You definitely want to have a look at PEP8.


Re: embarrassing class question

2010-10-21 Thread Jonas H.

On 10/21/2010 08:09 PM, Brendan wrote:

Two modules:
x.py:
class x(object):
 pass

y.py:
from x import x
class y(x):
 pass

Now from the python command line:

import y
dir(y)

['__builtins__', '__doc__', '__file__', '__name__', '__package__',
'x', 'y']

I do not understand why class 'x' shows up here.


Because that's how `import` behaves. `from x import x` binds the name `x` 
in the importing module's global namespace, so `x` becomes an attribute of 
module `y` just like the class `y` defined there.


With a star-import (`from y import *`), every name that does not start 
with an underscore is copied. You can specify the names that shall be 
imported by a star-import by setting __all__. In your case, you would add 
`__all__ = ['y']` to y.py. Note that `dir(y)` will still list `x`.
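
A minimal sketch of the effect of `__all__`, building two throwaway modules in memory instead of on disk; the module names `demo_x` and `demo_y` are assumptions made for this example:

```python
import sys
import types

# In-memory stand-in for x.py.
x_mod = types.ModuleType("demo_x")
exec("class x(object):\n    pass\n", x_mod.__dict__)
sys.modules["demo_x"] = x_mod

# In-memory stand-in for y.py, with __all__ added.
y_mod = types.ModuleType("demo_y")
exec("from demo_x import x\n"
     "__all__ = ['y']\n"
     "class y(x):\n"
     "    pass\n", y_mod.__dict__)
sys.modules["demo_y"] = y_mod

# Perform a star-import into a fresh namespace and see what arrived.
namespace = {}
exec("from demo_y import *", namespace)
star_names = sorted(name for name in namespace if not name.startswith("__"))

print(star_names)
print("x" in dir(y_mod))
```

Only the star-import is restricted; the imported class remains a regular attribute of the module.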


Jonas


Re: time difference interms of day

2010-10-24 Thread Jonas H.

On 10/24/2010 07:55 PM, mukkera harsha wrote:

On, doing now - startup I want the program to return in terms of days. How ?


>>> import datetime
>>> now = datetime.datetime.now()
>>> after_few_seconds = datetime.datetime.now()
>>> after_few_seconds - now
datetime.timedelta(0, 14, 256614)
>>> (after_few_seconds - now).seconds
14
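
For a result in days, the `timedelta` object also has a `days` attribute; `seconds` only holds the leftover seconds within the last day. A minimal sketch with made-up timestamps (`total_seconds` needs Python 2.7 or later):

```python
import datetime

startup = datetime.datetime(2010, 10, 20, 8, 30)   # hypothetical start time
now = datetime.datetime(2010, 10, 24, 20, 30)      # hypothetical current time

delta = now - startup
whole_days = delta.days                            # number of full days
fractional_days = delta.total_seconds() / 86400.0  # includes the partial day

print(whole_days)
print(fractional_days)
```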

Hope this helps :-)

Jonas