Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Paul Moore
On 10/31/06, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> In order to make sense of the data-format object that I'm proposing you
> have to see the need to share information about data-format through an
> extended buffer protocol (which I will be proposing soon).  I'm not
> going to try to argue that right now because there are a lot of people
> who can do that.
>
> So, I'm going to assume that you see the need for it.  If you don't,
> then just suspend concern about that for the moment.  There are a lot of
> us who really see the need for it.

[...]

> Again, my real purpose is the extended buffer protocol.  This
> data-format type is a means to that end.  If the consensus is that 
> nobody sees a greater use of the data-format type beyond the buffer
> protocol, then I will just write 1 PEP for the extended buffer protocol.

While I don't personally use NumPy, I can see where an extended buffer
protocol like you describe could be advantageous, and so I'm happy to
concede that benefit.

I can also vaguely see that a unified "block of memory description"
would be useful. My interest would be in the area of the struct module
(unpacking and packing data for dumping to byte streams - whether this
happens in place or not is not too important to this use case).
However, I cannot see how your proposal would help here in practice -
does it include the functionality of the struct module (or should it?)
If so, then I'd like to see examples of equivalent constructs. If not,
then isn't it yet another variation on the theme, adding to the
problem of multiple approaches rather than helping?

I can also see the parallels with ctypes. Here I feel a little less
sure that keeping the two approaches is wrong. I don't know why I feel
like that - maybe nothing more than familiarity with ctypes - but I
don't have the same reluctance to have both the ctypes data definition
stuff and the new datatype proposal.

Enough of the abstract. As a concrete example, suppose I have a (byte)
string in my program containing some binary data - an ID3 header, or a
TCP packet, or whatever. It doesn't really matter. Does your proposal
offer anything to me in how I might manipulate that data (assuming I'm
not using NumPy)? (I'm not insisting that it should, I'm just trying
to understand the scope of the PEP).
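
[For concreteness, here is roughly what the struct-module status quo looks
like for the kind of data Paul mentions.  This is an illustrative sketch
(the header bytes are made up), not something from the original mail:]

```python
import struct

# Unpacking a fixed-layout binary header from a byte string with the
# struct module.  The layout is the 10-byte ID3v2 tag header Paul mentions.
header = b"ID3\x03\x00\x00\x00\x00\x21\x76"

# ">3s3B4B": big-endian; 3-byte magic, then version/revision/flags bytes,
# then four 7-bit "syncsafe" size bytes.
magic, major, rev, flags, s0, s1, s2, s3 = struct.unpack(">3s3B4B", header)
size = (s0 << 21) | (s1 << 14) | (s2 << 7) | s3   # 0x21 * 128 + 0x76 = 4342
assert (magic, major, size) == (b"ID3", 3, 4342)
```

The open question in the thread is whether the proposed data-format
objects would subsume or merely duplicate this kind of format string.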

Paul.
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Gareth McCaughan
> > It might be better not to consider "bit" to be a
> > type at all, and come up with another way of indicating
> > that the size is in bits. Perhaps
> > 
> > 'i4'   # 4-byte signed int
> > 'i4b'  # 4-bit signed int
> > 'u4'   # 4-byte unsigned int
> > 'u4b'  # 4-bit unsigned int
> > 
> 
> I like this.  Very nice.  I think that's the right way to look at it.

I remark that 'ib4' and 'ub4' make for marginally easier
parsing and less danger of ambiguity.
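
[A hypothetical mini-parser (mine, not from any PEP draft) illustrating the
point: with the trailing-'b' spelling the size digits sit between two
letters, while the 'ib4'/'ub4' spelling keeps the number at the tail:]

```python
import re

# Two candidate grammars for the size-in-bits codes discussed above.
SUFFIX_STYLE = re.compile(r"^([iu])(\d+)(b?)$")   # 'i4', 'i4b', 'u4b'
PREFIX_STYLE = re.compile(r"^([iu])(b?)(\d+)$")   # 'i4', 'ib4', 'ub4'

def parse_prefix(code):
    """Parse 'i4' / 'ib4' style codes into (kind, size, unit)."""
    kind, bits, size = PREFIX_STYLE.match(code).groups()
    return kind, int(size), "bits" if bits else "bytes"

assert parse_prefix("i4") == ("i", 4, "bytes")
assert parse_prefix("ub4") == ("u", 4, "bits")
```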

-- 
g



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Michael Chermside
In this email I'm responding to a series of emails from Travis
pretty much in the order I read them:

Travis Oliphant writes:
> I'm saying we should introduce a single-object mechanism for  
> describing binary data so that the many-object approach of c-types  
> does not become some kind of de-facto standard.  C-types can  
> "translate" this object-instance to its internals if and when it  
> needs to.
>
> In the mean-time, how are other packages supposed to communicate  
> binary information about data with each other?

Here we disagree.

I haven't used C-types. I have no idea whether it is well-designed or
horribly unusable. So if someone wanted to argue that C-types is a
mistake and should be thrown out, I'd be willing to listen. Until
someone tries to make that argument, I'm presuming it's good enough to
be part of the standard library for Python.

Given that, I think that it *SHOULD* become a de-facto standard. I
think that the way different packages should communicate binary information
about data with each other is using C-types. Not because it's wonderful
(remember, I've never used it), but because it's STANDARD. There should
be one obvious way to do things! When there is, it makes interoperability
WAY easier, and interoperability is the main objective when dealing with
things like binary data formats.

Propose using C-types. Or propose *improving* C-types. But don't propose
ignoring it.

In a different message, he writes:
> It also bothers me that so many ways to describe binary data are  
> being used out there.  This is a problem that deserves being solved.  
>  And, no, ctypes hasn't solved it (we can't directly use the ctypes  
> solution).

Really? Why? Is this a failing in C-types? Can C-types be "fixed"?

Later he explains:
> Remember the buffer protocol is in compiled code.  So, as a result,
>
> 1) It's harder to construct a class to pass through the protocol  
> using the multiple-types approach of ctypes.
>
> 2) It's harder to interpret the object received through the buffer protocol.
>
> Sure, it would be *possible* to use ctypes, but I think it would be  
> very difficult.  Think about how you would write the get_data_format  
> C function in the extended buffer protocol for NumPy if you had to  
> import ctypes and then build a class just to describe your data.   
> How would you interpret what you get back?

Aha! So what you REALLY ought to be asking for is a C interface to the
ctypes module. That seems like a very sensible and reasonable request.

> I don't think we should just *use ctypes because it's there* when  
> the way it describes binary data was not constructed with the  
> extended buffer protocol in mind.

I just disagree. (1) I *DO* think we should "just use ctypes because it's
there". After all, the problem we're trying to solve is one of
COMPATIBILITY - you don't solve those by introducing competing standards.
(2) From what I understand of it, I think ctypes is quite capable of
describing data to be accessed via the buffer protocol.
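
[A sketch of the capability being pointed at here: in present-day ctypes a
Structure layout can be overlaid directly on a writable buffer.  The Point
layout is illustrative, not from the thread:]

```python
import ctypes
import struct

# A ctypes layout description applied to raw memory via the buffer API.
class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]

buf = bytearray(ctypes.sizeof(Point))
p = Point.from_buffer(buf)      # shares memory with buf; no copy is made
p.x, p.y = 3, 4

# The raw bytes now hold the two native-endian 32-bit ints.
assert struct.unpack("=ii", bytes(buf)) == (3, 4)
```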

In another email:
> In order to make sense of the data-format object that I'm proposing  
> you have to see the need to share information about data-format  
> through an extended buffer protocol (which I will be proposing  
> soon).  I'm not going to try to argue that right now because there  
> are a lot of people who can do that.

Actually, no need to convince me... I am already convinced of the
wisdom of this approach.

> My view is that it is un-necessary to use a different type object to  
> describe each different data-type.
  [...]
> So, the big difference is that I think data-formats should be  
> *instances* of a single type.

Why? Who cares? Seriously, if we were proposing to describe the layouts
with a collection of rubber bands and potato chips, I'd say it was a
crazy idea. But we're proposing using data structures in a computer
memory. Why does it matter whether those data structures are of the same
"python type" or different "python types"? I care whether the structure
can be created, passed around, and interrogated. I don't care what
Python type they are.

> I'm saying that I don't like the idea of forcing this approach on  
> everybody else who wants to describe arbitrary binary data just  
> because ctypes is included.

And I'm saying that I *do*. Hey, if someone proposed getting rid of
the current syntax for the array module (for Py3K) and replacing it with
use of ctypes, I'd give it serious consideration. There should be only
one way to describe binary structures. It should be powerful enough to
describe almost any structure, easy-to-use, and most of all it should be
used consistently everywhere.

> I need some encouragement in order to continue to invest energy in  
> pushing this forward.

Please keep up the good work! Some day I'd like to see NumPy built in
to the standard Python distribution. The incremental, PEP by PEP approach
you are taking is the best route to getting there. But there may be
some changes a

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Nick Coghlan
Travis E. Oliphant wrote:
> However, the existence of an alternative strategy using a single Python 
> type and multiple instances of that type to describe binary data (which 
> is the NumPy approach and essentially the array module approach) means 
> that we can't just a-priori assume that the way ctypes did it is the 
> only or best way.

As a hypothetical, what if there was a helper function that translated a 
description of a data structure using basic strings and sequences (along the 
lines of what you have in your PEP) into a ctypes data structure?

> The examples of "missing features" that Martin has exposed are not 
> show-stoppers.  They can all be easily handled within the context of 
> what is being proposed.   I can modify the PEP to show this.  But, I 
> don't have the time to spend if it's just all going to be rejected in 
> the end.  I need some encouragement in order to continue to invest 
> energy in pushing this forward.

I think the most important thing in your PEP is the formats for describing 
structures in a way that is easy to construct in both C and Python 
(specifically, by using strings and sequences), and it is worth pursuing for 
that aspect alone. Whether that datatype is then implemented as a class in its 
own right or as a factory function that returns a ctypes data type object is, 
to my mind, a relatively minor implementation issue (either way has questions 
to be addressed - I'm not sure how you tell ctypes that you have a 32-bit 
integer with a non-native endian format, for example).

In fact, it may make sense to just use the lists/strings directly as the data 
exchange format definitions, and let the various libraries do their own 
translation into their private format descriptions instead of creating a new 
one-type-to-describe-them-all.
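
[A sketch of the helper hypothesized above (the mini-format and the name
to_ctype are mine, not from any PEP): translate simple string codes like
'i4' or '>u2' into ctypes types.  It also touches the endianness question:
multi-byte ctypes integer types expose __ctype_be__/__ctype_le__
non-native variants.]

```python
import ctypes

# Mapping from (kind, itemsize-in-bytes) to a native ctypes integer type.
_BASE = {
    ("i", 1): ctypes.c_int8,  ("u", 1): ctypes.c_uint8,
    ("i", 2): ctypes.c_int16, ("u", 2): ctypes.c_uint16,
    ("i", 4): ctypes.c_int32, ("u", 4): ctypes.c_uint32,
    ("i", 8): ctypes.c_int64, ("u", 8): ctypes.c_uint64,
}

def to_ctype(code):
    """Translate a code like 'i4', '<u2', '>u2' into a ctypes type."""
    order = "="
    if code[0] in "<>=":
        order, code = code[0], code[1:]
    ctype = _BASE[(code[0], int(code[1:]))]
    if order == ">" and ctypes.sizeof(ctype) > 1:
        ctype = ctype.__ctype_be__      # explicit big-endian variant
    elif order == "<" and ctypes.sizeof(ctype) > 1:
        ctype = ctype.__ctype_le__      # explicit little-endian variant
    return ctype

assert to_ctype("i4") is ctypes.c_int32
assert bytes(to_ctype(">u2")(0x0102)) == b"\x01\x02"
```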

Cheers,
Nick.


-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis E. Oliphant
Michael Chermside wrote:
> In this email I'm responding to a series of emails from Travis
> pretty much in the order I read them:
> 
>>
>> In the mean-time, how are other packages supposed to communicate  
>> binary information about data with each other?
> 
> Here we disagree.
> 
> I haven't used C-types. I have no idea whether it is well-designed or
> horribly unusable. So if someone wanted to argue that C-types is a
> mistake and should be thrown out, I'd be willing to listen. 
> Until
> someone tries to make that argument, I'm presuming it's good enough to
> be part of the standard library for Python.

My problem with this argument is twofold:

1) I'm not sure you really know what you're talking about, since you 
apparently haven't used either ctypes or NumPy (I've used both and so 
forgive me if I claim to understand the strengths of the data-format 
representations that each uses a bit better).  Therefore, it's hard for 
me to take your opinion seriously.  I will try though. I understand you 
have a preference for not wildly expanding the ways to do similar 
things.  I share that preference with you.

2) You are assuming that because ctypes is good enough for the standard 
library, the way it describes data-formats (using a separate 
Python type for each one) is the *one true way*.  When was this 
discussed?  Frankly, it's a weak argument, because the struct module has 
been around for a lot longer.  Why didn't the ctypes module follow that 
standard?  Or the standard that's in the array module for describing 
data-types.  That's been there for a long time too.  Why wasn't ctypes 
forced to use that approach?

The reason it wasn't is because it made sense for ctypes to use a 
separate type for each data-format object so that you could call 
C-functions as if they were Python functions.  If this is your goal, 
then it seems like a good idea (though not strictly necessary) to use a 
separate Python type for each data-format.

But, there are distinct disadvantages to this approach compared to what 
I'm trying to allow.   Martin claims that the ctypes approach is 
*basically* equivalent but this is just not true.  It could be made more 
true if the ctypes objects inherited from a "meta-type" and if Python 
allowed meta-types to expand their C-structures.  But, last I checked 
this is not possible.

A Python type object is a very particular kind of Python-type.  As far 
as I can tell, it's not as flexible in terms of the kinds of things you 
can do with the "instances" of a type object (i.e. what ctypes types 
are) on the C-level.

The other disadvantage of what you are describing is: Who is going to 
write the code?

I'm happy to have the data-format object live separate from ctypes and 
leave it to the ctypes author(s) to support it if desired.  But, the 
claim that the extended buffer protocol must jump through all kinds of 
hoops to conform to the "ctypes standard", when that "standard" was 
designed with a different idea in mind, is not acceptable.

Ctypes has only been in Python since 2.5 and the array interface was 
around before that.   Numeric has been around longer than ctypes.  The 
array and struct modules in Python have also been around 
longer than ctypes.

Where is the discussion that crowned the ctypes way of doing things as 
"the one true way"?

> 
> In a different message, he writes:
>> It also bothers me that so many ways to describe binary data are  
>> being used out there.  This is a problem that deserves being solved.  
>>  And, no, ctypes hasn't solved it (we can't directly use the ctypes  
>> solution).
> 
> Really? Why? Is this a failing in C-types? Can C-types be "fixed"?

You can't grow C-function pointers onto an existing type object.  You 
are also carrying around a lot of weight in the Python type object that 
is unnecessary if all you are doing is describing data.

> 
> I just disagree. (1) I *DO* think we should "just use ctypes because it's
> there". After all, the problem we're trying to solve is one of
> COMPATIBILITY - you don't solve those by introducing competing standards.
> (2) From what I understand of it, I think ctypes is quite capable of
> describing data to be accessed via the buffer protocol.

Capable but not supporting all the things I'm talking about.  The ctypes 
objects don't have any of the methods or attributes (or C function 
pointers) that I've described.  Nor should they necessarily grow them.

> 
> Why? Who cares? Seriously, if we were proposing to describe the layouts
> with a collection of rubber bands and potato chips, I'd say it was a
> crazy idea. But we're proposing using data structures in a computer
> memory. Why does it matter whether those data structures are of the same
> "python type" or different "python types"? I care whether the structure
> can be created, passed around, and interrogated. I don't care what
> Python type they are.

Sure, but the flexibility you have with an instance of a Python type is 
different 

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant
Martin v. Löwis wrote:
> Travis Oliphant schrieb:
> 
>>So, the big difference is that I think data-formats should be 
>>*instances* of a single type.
> 
> 
> This is nearly the case for ctypes as well. All layout descriptions
> are instances of the type type. Nearly, because they are instances
> of subtypes of the type type:
> 
> py> type(ctypes.c_long)
> 
> py> type(ctypes.c_double)
> 
> py> type(ctypes.c_double).__bases__
> (,)
> py> type(ctypes.Structure)
> 
> py> type(ctypes.Array)
> 
> py> type(ctypes.Structure).__bases__
> (,)
> py> type(ctypes.Array).__bases__
> (,)
> 
> So if your requirement is "all layout descriptions ought to have
> the same type", then this is (nearly) the case: they are instances
> of type (rather than datatype, as in your PEP).
> 

The big difference, however, is that by going this route you are forced 
to use the "type object" as your data-format "instance".  This is 
fitting a square peg into a round hole, in my opinion.  To really be 
useful, you would need to add the attributes and (most importantly) 
C-function pointers and C-structure members to these type objects.  I 
don't even think that is possible in Python (even if you do create a 
meta-type that all the c-type type objects can use that carries the same 
information).

There are a few people claiming I should use the ctypes type-hierarchy 
but nobody has explained how that would be possible given the 
attributes, C-structure members and C-function pointers that I'm proposing.

In NumPy we also have a Python type for each basic data-format (we call 
them array scalars).  For a little while they carried the data-format 
information on the Python side.  This turned out to be not flexible 
enough.  So, we expanded the PyArray_Descr * structure which has always 
been a part of Numeric (and the array module array type) into an actual 
Python type and a lot of things became possible.
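
[The single-type, many-instances design being described can be sketched in
pure Python.  This is an illustrative toy of mine, not NumPy's actual
PyArray_Descr:]

```python
# One Python type whose *instances* are the data-format descriptions.
class DataType:
    def __init__(self, kind, itemsize, byteorder="="):
        self.kind = kind          # e.g. 'i', 'u', 'f'
        self.itemsize = itemsize  # size in bytes
        self.byteorder = byteorder

int32 = DataType("i", 4)
float64 = DataType("f", 8)

# Every format, however different, is an instance of the same type;
# contrast ctypes, where each format is itself a new Python type.
assert type(int32) is type(float64) is DataType
```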

It was clear to me that we were "on to something".  Now, the biggest 
claim against the gist of what I'm proposing (details we can argue 
about), seems from my perspective to be a desire to "go backwards" and 
carry data-type information around with a Python type.

The data-type object did not just appear out of thin-air one day.  It 
really can be seen as an evolution from the beginnings of Numeric (and 
the Python array module).

So, this is what we came up with in the NumPy world.  Ctypes came up 
with something a bit different.  It is not "trivial" to "just use 
ctypes."  I could say the same thing and tell ctypes to just use NumPy's 
data-type object.  It could be done that way, but of course it would 
take a bit of work on the part of ctypes to make that happen.

Having ctypes in the standard library does not mean that any other 
discussion of how data-format should be represented has been decided on. 
If I had known that was what it meant to put ctypes in the standard 
library, I would have been more vocal several months ago.


-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Stephan Tolksdorf
Martin v. Löwis wrote:
> Travis Oliphant schrieb:
>> Function pointers are "supported" with the void data-type and could be 
>> more specifically supported if it were important.   People typically 
>> don't use the buffer protocol to send function-pointers around in a way 
>> that the void description wouldn't be enough.
> 
> As I said before, I can't tell whether it's important, as I still don't
> know what the purpose of this PEP is. If it is to support a unification
of memory layout specifications, and if that unification is also to
> include ctypes, then yes, it is important. If it is to describe array
> elements in NumArray arrays, then it might not be important.
>
> For the usage of ctypes, the PEP void type is insufficient to describe
> function pointers: you also need a specification of the signature of
> the function pointer (parameter types and return type), or else you
> can't use the function pointer (i.e. you can't call the function).

The buffer protocol is primarily meant for describing the format of 
(large) contiguous pieces of binary data. In most cases that will be all 
kinds of numerical data for scientific applications, image and other 
media data, simple databases and similar kinds of data.

There is currently no adequate data format type which sufficiently 
supports these applications, otherwise Travis wouldn't make this proposal.

While Travis' proposal encompasses the data format functionality within 
the struct module and overlaps with what ctypes has to offer, it does 
not aim to replace ctypes.

I don't think that a basic data format type necessarily should be able 
to encode all the information a foreign function interface needs to call 
a code library. From my point of view, that kind of information is one 
abstraction layer above a basic data format and should be implemented as 
an extension of or complementary to the basic data format.

I also do not understand why the data format type should attempt to 
fully describe arbitrarily complex data formats, like fragmented 
(non-contiguous) data structures in memory.  You'd probably need a full 
programming language for that anyway.

Regards,
   Stephan


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Martin v. Löwis
Travis E. Oliphant schrieb:
> But, there are distinct disadvantages to this approach compared to what 
> I'm trying to allow.   Martin claims that the ctypes approach is 
> *basically* equivalent but this is just not true.

I may claim that, but primarily, my goal was to demonstrate that the
proposed PEP cannot be used to describe ctypes object layouts (without
checking, I can readily believe that the PEP covers everything in
the array and struct modules).

> It could be made more 
> true if the ctypes objects inherited from a "meta-type" and if Python 
> allowed meta-types to expand their C-structures.  But, last I checked 
> this is not possible.

That I don't understand. a) what do you think is not possible? b)
why is that an important difference between a datatype and a ctype?

If you are suggesting that, given two Python types A and B, and
B inheriting from A, that the memory layout of B cannot extend
the memory layout of A, then: that is certainly possible in Python,
and there are many examples for it.
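
[A concrete instance of this in ctypes itself: a Structure subclass
extends the memory layout of its base.  The names here are illustrative:]

```python
import ctypes

class A(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int32)]

class B(A):
    _fields_ = [("y", ctypes.c_double)]   # laid out after A's fields

b = B()
b.x, b.y = 1, 2.5
assert ctypes.sizeof(B) > ctypes.sizeof(A)
assert B.y.offset >= ctypes.sizeof(A)     # y sits past the inherited part
```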

> A Python type object is a very particular kind of Python-type.  As far 
> as I can tell, it's not as flexible in terms of the kinds of things you 
> can do with the "instances" of a type object (i.e. what ctypes types 
> are) on the C-level.

Ah, you are worried that NumArray objects would have to be *instances*
of ctypes types. That wouldn't be necessary at all. Instead, if each
NumArray object had a method get_ctype(), which returned a ctypes type,
then you would get the same descriptiveness that you get with the
PEP's datatype.

> I'm happy to have the data-format object live separate from ctypes and 
> leave it to the ctypes author(s) to support it if desired.  But, the 
> claim that the extended buffer protocol must jump through all kinds of hoops 
> to conform to the "ctypes standard" when that "standard" was designed 
> with a different idea in mind is not acceptable.

That, of course, is a reasoning I can understand. This is free software,
contributors can chose to contribute whatever they want; you can't force
anybody to do anything specific you want to get done. Acceptance of
any PEP (not just this PEP) should always be contingent on the
availability of a patch implementing it.

> Where is the discussion that crowned the ctypes way of doing things as 
> "the one true way"?

It hasn't been crowned this way. Me, personally, I just said two things
about this PEP and ctypes:
a) the PEP does not support all concepts that ctypes needs
b) ctypes can express all examples in the PEP
in response to your proposal that ctypes should adopt the PEP, and
to your claim that ctypes is not good enough to be the one true way.

Regards,
Martin



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant
Nick Coghlan wrote:
> Travis E. Oliphant wrote:
> 
>>However, the existence of an alternative strategy using a single Python 
>>type and multiple instances of that type to describe binary data (which 
>>is the NumPy approach and essentially the array module approach) means 
>>that we can't just a-priori assume that the way ctypes did it is the 
>>only or best way.
> 
> 
> As a hypothetical, what if there was a helper function that translated a 
> description of a data structure using basic strings and sequences (along the 
> lines of what you have in your PEP) into a ctypes data structure?
> 

That would be fine and useful, in fact.  But I don't see how it helps the 
problem of "what to pass through the buffer protocol".  I see passing 
ctypes type objects around on the C-level as an unnecessary and 
burdensome approach unless the ctypes objects were significantly enhanced.


> 
> In fact, it may make sense to just use the lists/strings directly as the data 
> exchange format definitions, and let the various libraries do their own 
> translation into their private format descriptions instead of creating a new 
> one-type-to-describe-them-all.

Yes, I'm open to this possibility.   I basically want two things in the 
object passed through the extended buffer protocol:

1) It's fast on the C-level
2) It covers all the use-cases.

If just a particular string or list structure were passed, then I would 
drop the data-format PEP and just have the dataformat argument of the 
extended buffer protocol be that thing.

Then, something that converts ctypes objects to that special format 
would be very nice indeed.

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Martin v. Löwis
Travis Oliphant schrieb:
> The big difference, however, is that by going this route you are forced 
> to use the "type object" as your data-format "instance".

Since everything is an object (an "instance") in Python, this is not
such a big difference.

> This is 
> fitting a square peg into a round hole, in my opinion.  To really be 
> useful, you would need to add the attributes and (most importantly) 
> C-function pointers and C-structure members to these type objects. 

Can you explain why that is? In the PEP, I see two C functions:
setitem and getitem. I think they can be implemented readily with
ctypes' GETFUNC and SETFUNC function pointers that it uses
all over the place.

I don't see a requirement to support C structure members or
function pointers in the datatype object.

> There are a few people claiming I should use the ctypes type-hierarchy 
> but nobody has explained how that would be possible given the 
> attributes, C-structure members and C-function pointers that I'm proposing.

Ok, here you go. Remember, I'm still not claiming that this should be
done: I'm just explaining how it could be done.

- byteorder/isnative: I think this could be derived from the
  presence of the _swappedbytes_ field
- itemsize: can be done with ctypes.sizeof
- kind: can be created through a mapping of the _type_ field
  (I think)
- fields: can be derived from the _fields_ member
- hasobject: compare, recursively, with py_object
- name: use __name__
- base: again, created from _type_ (if _length_ is present)
- shape: recursively look at _length_
- alignment: use ctypes.alignment
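
[The mapping above can be sketched as a small introspection helper.  This
is a hypothetical function of mine, not a real API; the _swappedbytes_
check follows the hint above and assumes that attribute is absent on
native layouts:]

```python
import ctypes

def describe(ctype):
    """Derive datatype-PEP-style attributes from a ctypes type."""
    info = {
        "itemsize": ctypes.sizeof(ctype),       # PEP 'itemsize'
        "alignment": ctypes.alignment(ctype),   # PEP 'alignment'
        "isnative": not hasattr(ctype, "_swappedbytes_"),
    }
    if hasattr(ctype, "_fields_"):              # structs -> PEP 'fields'
        info["fields"] = {n: describe(t) for n, t in ctype._fields_}
    elif hasattr(ctype, "_length_"):            # arrays -> 'shape'/'base'
        info["shape"] = (ctype._length_,)
        info["base"] = describe(ctype._type_)
    return info

class Pair(ctypes.Structure):
    _fields_ = [("a", ctypes.c_int16), ("b", ctypes.c_int16)]

assert describe(Pair)["itemsize"] == 4
assert describe(ctypes.c_uint8 * 3)["shape"] == (3,)
```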

> It was clear to me that we were "on to something".  Now, the biggest 
> claim against the gist of what I'm proposing (details we can argue 
> about), seems from my perspective to be a desire to "go backwards" and 
> carry data-type information around with a Python type.

I, at least, have no such desire. I just explained that the ctypes
model of memory layouts is just as expressive as the one in the
PEP. Which of these is "better" for what the PEP wants to achieve,
I can't say, because I still don't quite understand what the PEP
wants to achieve.

Regards,
Martin


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Martin v. Löwis
Stephan Tolksdorf schrieb:
> While Travis' proposal encompasses the data format functionality within 
> the struct module and overlaps with what ctypes has to offer, it does 
> not aim to replace ctypes.

This discussion could have been a lot shorter if he had said so.
Unfortunately (?) he stated that it was *precisely* a motivation
of the PEP to provide a standard data description machinery that
can then be adopted by the struct, array, and ctypes modules.

> I also do not understand why the data format type should attempt to 
> fully describe arbitrarily complex data formats, like fragmented 
> (non-contiguous) data structures in memory. You'd probably need a full 
> programming language for that anyway.

For an FFI application, you need to be able to describe arbitrary
in-memory formats, since that's what the foreign function will
expect. For type safety and reuse, you better separate the
description of the layout from the creation of the actual values.
Otherwise (i.e. if you have to define the layout on each invocation),
creating the parameters for a foreign function becomes very tedious
and error-prone, with errors often being catastrophic (i.e. interpreter
crashes).

Regards,
Martin


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Thomas Heller
Travis Oliphant schrieb:
> Greg Ewing wrote:
>> Travis Oliphant wrote:
>> 
>> 
>>>Part of the problem is that ctypes uses a lot of different Python types 
>>>(that's what I mean by "multi-object" to accomplish its goal).  What 
>>>I'm looking for is a single Python type that can be passed around and 
>>>explains binary data.
>> 
>> 
>> It's not clear that multi-object is a bad thing in and
>> of itself. It makes sense conceptually -- if you have
>> a datatype object representing a struct, and you ask
>> for a description of one of its fields, which could
>> be another struct or array, you would expect to get
>> another datatype object describing that.
>> 
>> Can you elaborate on what would be wrong with this?
>> 
>> Also, can you clarify whether your objection is to
>> multi-object or multi-type. They're not the same thing --
>> you could have a data structure built out of multiple
>> objects that are all of the same Python type, with
>> attributes distinguishing between struct, array, etc.
>> That would be single-type but multi-object.
> 
> I've tried to clarify this in another post.  Basically, what I don't 
> like about the ctypes approach is that it is multi-type (every new 
> data-format is a Python type).
> 
> In order to talk about all these Python types together, then they must 
> all share some attribute (or else be derived from a meta-type in C with 
> a specific function-pointer entry).

(I tried to read the whole thread again, but it is too large already.)

There is a (badly named, probably) api to access information
about ctypes types and instances of this type.  The functions are
PyObject_stgdict(obj) and PyType_stgdict(type).  Both return a
'StgDictObject' instance or NULL if the function fails.  This object
is the ctypes' type object's __dict__.

StgDictObject is a subclass of PyDictObject and has fields that
carry information about the C type (alignment requirements, size in bytes,
plus some other stuff).  Also it contains several pointers to functions
that implement (in C) struct-like functionality (packing/unpacking).

Of course several of these fields can only be used for ctypes-specific
purposes, for example a pointer to the ffi_type which is used when
calling foreign functions, or the restype, argtypes, and errcheck fields
which are only used when the type describes a function pointer.


This mechanism is probably a hack because it's not possible to add C-accessible
fields to type objects, on the other hand it is extensible (in principle, at 
least).
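Although the StgDictObject itself is a C-level detail, the information it carries (size in bytes, alignment requirements) is visible from Python through ctypes.sizeof() and ctypes.alignment(). A minimal sketch (the Point type here is just an illustration, not anything from the thread):

```python
import ctypes

# A small ctypes Structure; defining it populates the type's StgDictObject
# with layout information computed from the _fields_ list.
class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_double),
                ("y", ctypes.c_double)]

# The stored layout information surfaces through these module functions:
size = ctypes.sizeof(Point)        # total size, including any padding
align = ctypes.alignment(Point)    # alignment requirement of the struct
```

On common platforms this reports a 16-byte struct with 8-byte alignment, matching the two doubles.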


Just to describe the implementation.

Thomas

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant
Martin v. Löwis wrote:
> Travis Oliphant schrieb:
> 
>>The big difference, however, is that by going this route you are forced 
>>to use the "type object" as your data-format "instance".
> 
> 
> Since everything is an object (an "instance) in Python, this is not
> such a big difference.
> 

I think it actually is.  Perhaps I'm wrong, but a type-object is still a 
special kind of an instance of a meta-type.  I once tried to add 
function pointers to a type object by inheriting from it.  But, I was 
told that Python is not set up to handle that.  Maybe I misunderstood.

Let me be very clear.  The whole reason I make any statements about 
ctypes is because somebody else brought it up.  I'm not trying to 
replace ctypes and the way it uses type objects to represent data 
internally.   All I'm trying to do is come up with a way to describe 
data-types through a buffer protocol.  The way ctypes does it is "too" 
bulky by defining a new Python type for every data-format.

While semantically you may talk about the equivalence of types being 
instances of a "meta-type" and regular objects being instances of a 
type, my understanding is still that there are practical differences 
when it comes to implementation --- and certain things that "can't be done".

Here's what I mean by the difference.

This is akin to what I'm proposing

struct {
    PyObject_HEAD
    /* whatever you need to represent your instance;
       quite a bit of flexibility */
} PyDataFormatObject;

A Python type object (what every ctypes data-format "type" inherits 
from) has a C-structure

struct {
    PyObject_VAR_HEAD
    char *tp_name;
    int tp_basicsize, tp_itemsize;

    /* Methods to implement standard operations */

    destructor tp_dealloc;
    printfunc tp_print;
    getattrfunc tp_getattr;
    setattrfunc tp_setattr;
    cmpfunc tp_compare;
    reprfunc tp_repr;

    ...
    ...

    PyObject *tp_bases;
    PyObject *tp_mro; /* method resolution order */
    PyObject *tp_cache;
    PyObject *tp_subclasses;
    PyObject *tp_weaklist;
    destructor tp_del;

    ... /* + more under certain conditions */
} PyTypeObject;


Why in the world do we need to carry all this extra baggage around in 
each data-format instance in order to just describe data?  I can see why 
it's useful for ctypes to do it and that's fine.  But, the argument that 
every exchange of data-format information should use this type-object 
instance is hard to swallow.
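The type-versus-instance distinction being argued here can be seen from plain Python. The sketch below uses struct.Struct only as a stand-in for the kind of ordinary describing instance the PEP proposes (the PEP's object itself is not implemented):

```python
import ctypes
import struct

# In ctypes, every data format is itself a freshly created *type* object,
# carrying all of PyTypeObject's baggage:
ArrayFmt = ctypes.c_int * 4
assert isinstance(ArrayFmt, type)

# A format description can instead be an ordinary instance of one type,
# the way struct format descriptions are:
s = struct.Struct('4i')
assert not isinstance(s, type)

# Both describe the same memory layout: four C ints.
assert s.size == ctypes.sizeof(ArrayFmt)
```

The PEP's data-format object would be closer in spirit to the second form: one Python type whose instances describe layouts.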

So, I'm happy to let ctypes continue on doing what it's doing trusting 
its developers to have done something good.  I'd be happy to drop any 
reference to ctypes.  The only reason to have the data-type objects is 
something to pass as part of the extended buffer protocol.

> 
> 
> Can you explain why that is? In the PEP, I see two C functions:
> setitem and getitem. I think they can be implemented readily with
> ctypes' GETFUNC and SETFUNC function pointers that it uses
> all over the place.

Sure, but where do these function pointers live and where are they 
stored?  In ctypes it's in the CField_object.  Now, this is closer to 
what I'm talking about.  But why is it not the same thing?  Why yet 
another type object to talk about fields of a structure?

These are rhetorical questions.  I really don't expect or need an answer 
because I'm not questioning why ctypes did what it did for solving the 
problem it was solving.  I am questioning anyone who claims that we 
should use this mechanism for describing data-formats in the extended 
buffer protocol.

> 
> I don't see a requirement to support C structure members or
> function pointers in the datatype object.
> 
> 
>>There are a few people claiming I should use the ctypes type-hierarchy 
>>but nobody has explained how that would be possible given the 
>>attributes, C-structure members and C-function pointers that I'm proposing.
> 
> 
> Ok, here you go. Remember, I'm still not claiming that this should be
> done: I'm just explaining how it could be done.

O.K.  Thanks for putting in the effort.   It doesn't answer my real 
concerns, though.

>>It was clear to me that we were "on to something".  Now, the biggest 
>>claim against the gist of what I'm proposing (details we can argue 
>>about), seems from my perspective to be a desire to "go backwards" and 
>>carry data-type information around with a Python type.
> 
> 
> I, at least, have no such desire. I just explained that the ctypes
> model of memory layouts is just as expressive as the one in the
> PEP. 

I agree with this.  I'm very aware of what "can" be expressed.  I just 
think it's too awkward and bulky to use in the extended buffer protocol.

> Which of these is "better" for what the PEP wants to achieve,
> I can't say, because I still don't quite understand what the PEP
> wants to achieve.
>

Are you saying you still don't understand after having read the extended 
buffer protocol PEP, yet?

Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant
Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
> 
>>But, there are distinct disadvantages to this approach compared to what 
>>I'm trying to allow.   Martin claims that the ctypes approach is 
>>*basically* equivalent but this is just not true.
> 
> 
> I may claim that, but primarily, my goal was to demonstrate that the
> proposed PEP cannot be used to describe ctypes object layouts (without
> checking, I can readily believe that the PEP covers everything in
> the array and struct modules).
> 

That's a fine argument.  You are right in terms of the PEP as it stands. 
  However, I want to make clear that a single Python type object *could* 
be used to describe data including all the cases you laid out.  It would 
not be difficult to extend the PEP to cover all the cases you've 
described --- I'm not sure that's desirable.  I'm not trying to replace 
what ctypes does.  I'm just trying to get something that we can use to 
exchange data-format information through the extended buffer protocol.

It really comes down to using Python type-objects as the instances 
describing data-formats (which ctypes does) or "normal" Python objects 
as the instances describing data-formats (what the PEP proposes).

> 
>>It could be made more 
>>true if the ctypes objects inherited from a "meta-type" and if Python 
>>allowed meta-types to expand their C-structures.  But, last I checked 
>>this is not possible.
> 
> 
> That I don't understand. a) what do you think is not possible?

Extending the C-structure of PyTypeObject and having Python types use 
that as their "type-object".

  b)
> why is that an important difference between a datatype and a ctype?

Because with instances of C-types you are stuck with the PyTypeObject 
structure.  If you want to add anything you have to do it in the 
dictionary.

Instances of a datatype allow adding anything after the PyObject_HEAD 
structure.

> 
> If you are suggesting that, given two Python types A and B, and
> B inheriting from A, that the memory layout of B cannot extend
> the memory layout of A, then: that is certainly possible in Python,
> and there are many examples for it.
>

I know this.  I've done it for many different objects.  I'm saying it's 
not quite the same when what you are extending is the PyTypeObject and 
trying to use it as the type object for some other object.


> 
>>A Python type object is a very particular kind of Python-type.  As far 
>>as I can tell, it's not as flexible in terms of the kinds of things you 
>>can do with the "instances" of a type object (i.e. what ctypes types 
>>are) on the C-level.
> 
> 
> Ah, you are worried that NumArray objects would have to be *instances*
> of ctypes types. That wouldn't be necessary at all. Instead, if each
> NumArray object had a method get_ctype(), which returned a ctypes type,
> then you would get the same descriptiveness that you get with the
> PEP's datatype.
> 

No, I'm not worried about that (It's not NumArray by the way, it's 
NumPy.  NumPy replaces both NumArray and Numeric).

NumPy actually interfaces with ctypes quite well.  This is how I learned 
anything I might know about ctypes.  So, I'm well aware of this.

What I am concerned about is using Python type objects (i.e. Python 
objects that can be cast in C to PyTypeObject *) outside of ctypes to 
describe data-formats when you don't need it and it just complicates 
dealing with the data-format description.

> 
>>Where is the discussion that crowned the ctypes way of doing things as 
>>"the one true way"
> 
> 
> It hasn't been crowned this way. Me, personally, I just said two things
> about this PEP and ctypes:

Thanks for clarifying, but I know you didn't say this.  Others, however, 
basically did.

> a) the PEP does not support all concepts that ctypes needs

It could be extended, but I'm not sure it *needs* to be in its real 
context.  I'm very sorry for contributing to the distraction that ctypes 
should adopt the PEP.  My words were unclear.  But, I'm not pushing for 
that.  I really have no opinion how ctypes describes data.

> b) ctypes can express all examples in the PEP
> in response to your proposal that ctypes should adopt the PEP, and
> that ctypes is not good enough to be the one true way.
> 

I think it is "good enough" in the semantic sense.  But, I think using 
type objects in this fashion for general-purpose data-description is 
over-kill and will be much harder to extend and deal with.

-Travis




Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant
Thomas Heller wrote:
> 
> (I tried to read the whole thread again, but it is too large already.)
> 
> There is a (badly named, probably) api to access information
> about ctypes types and instances of this type.  The functions are
> PyObject_stgdict(obj) and PyType_stgdict(type).  Both return a
> 'StgDictObject' instance or NULL if the function fails.  This object
> is the ctypes' type object's __dict__.
> 
> StgDictObject is a subclass of PyDictObject and has fields that
> carry information about the C type (alignment requirements, size in bytes,
> plus some other stuff).  Also it contains several pointers to functions
> that implement (in C) struct-like functionality (packing/unpacking).
> 
> Of course several of these fields can only be used for ctypes-specific
> purposes, for example a pointer to the ffi_type which is used when
> calling foreign functions, or the restype, argtypes, and errcheck fields
> which are only used when the type describes a function pointer.
> 
> 
> This mechanism is probably a hack because it's not possible to add
> C-accessible fields to type objects, on the other hand it is extensible
> fields to type objects, on the other hand it is extensible (in principle, at 
> least).
> 

Thank you for the description.  While I've studied the ctypes code, I 
still don't understand the purposes behind all the data-structures.

Also, I really don't have an opinion about ctypes' implementation.   All 
my comparisons are simply being resistant to the "unexplained" idea that 
I'm supposed to use ctypes objects in a way they weren't really designed 
to be used.

For example, I'm pretty sure you were the one who made me aware that you 
can't just extend the PyTypeObject.  Instead you extended the tp_dict of 
the Python typeObject to store some of the extra information that is 
needed to describe a data-type like I'm proposing.

So, if I'm just describing data-format information, why do I need 
all this complexity (that makes ctypes implementation easier/more 
natural/etc)?  What if the StgDictObject is the Python data-format 
object I'm talking about?  It actually looks closer.

But, if all I want is the StgDictObject (or something like it), then why 
should I pass around the whole type object?

This is all I'm saying to those that want me to use ctypes to describe 
data-formats in the extended buffer protocol.  I'm not trying to change 
anything in ctypes.

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Thomas Heller
Travis Oliphant schrieb:
> For example, I'm pretty sure you were the one who made me aware that you 
> can't just extend the PyTypeObject.  Instead you extended the tp_dict of 
> the Python typeObject to store some of the extra information that is 
> needed to describe a data-type like I'm proposing.
> 
> So, if I'm just describing data-format information, why do I need 
> all this complexity (that makes ctypes implementation easier/more 
> natural/etc)?  What if the StgDictObject is the Python data-format 
> object I'm talking about?  It actually looks closer.
> 
> But, if all I want is the StgDictObject (or something like it), then why 
> should I pass around the whole type object?

Maybe you don't need it.  ctypes certainly needs the type object because
it is also used for constructing instances (while NumPy uses factory functions,
IIUC), or for converting 'native' Python objects into foreign function arguments.

I know that this doesn't interest you from the NumPy perspective (and I
don't want to offend you by saying this).

> This is all I'm saying to those that want me to use ctypes to describe 
> data-formats in the extended buffer protocol.  I'm not trying to change 
> anything in ctypes.

I don't want to change anything in NumPy, either, and was not the one who
suggested to use ctypes objects, although I had thought about whether it
would be possible or not.

What I like about ctypes, and dislike about Numeric/Numarray/NumPy is
the way C compatible types are defined in ctypes.  I find the ctypes
way more natural than the numxxx or array module way, but what else would
anyone expect from me as the ctypes author...

I hope that a useful interface is developed from your proposals, and
will be happy to adapt ctypes to use it or interface ctypes with it
if this makes sense.

Thomas



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant
Martin v. Löwis wrote:
> Stephan Tolksdorf schrieb:
> 
>>While Travis' proposal encompasses the data format functionality within 
>>the struct module and overlaps with what ctypes has to offer, it does 
>>not aim to replace ctypes.
> 
> 
> This discussion could have been a lot shorter if he had said so.
> Unfortunately (?) he stated that it was *precisely* a motivation
> of the PEP to provide a standard data description machinery that
> can then be adopted by the struct, array, and ctypes modules.

Struct and array I was sure about.  Ctypes less sure.  I'm very sorry 
for the distraction I caused by mis-stating my objective.   My objective 
is really the extended buffer protocol.  The data-type object is a means 
to that end.

I do think ctypes could make use of the data-type object and that there 
is a real difference between using Python type objects as data-format 
descriptions and using another Python type for those descriptions.  I 
thought to go the ctypes route (before I even knew what ctypes did) but 
decided against it for a number of reasons.

But, nonetheless those are side issues.  The purpose of the PEP is to 
provide an object that the extended buffer protocol can use to share 
data-format information.  It should be considered primarily in that context.

-Travis



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Martin v. Löwis
Travis Oliphant schrieb:
> I think it actually is.  Perhaps I'm wrong, but a type-object is still a 
> special kind of an instance of a meta-type.  I once tried to add 
> function pointers to a type object by inheriting from it.  But, I was 
> told that Python is not set up to handle that.  Maybe I misunderstood.

I'm not quite sure what the problems are: one "obvious" problem is
that the next Python version may also extend the size of type objects.
But, AFAICT, even that should "work", in the sense that this new version
should check for the presence of a flag to determine whether the
additional fields are there. The only tricky question is how you can
find out whether your own extension is there.

If that is a common problem, I think a framework could be added to
support extensible type objects (with some kind of registry for
additional fields, and a per-type-object indicator whether a certain
extension field is present).

> Let me be very clear.  The whole reason I make any statements about 
> ctypes is because somebody else brought it up.  I'm not trying to 
> replace ctypes and the way it uses type objects to represent data 
> internally.

Ok. I understood you differently earlier.

Regards,
Martin


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Paul Moore
On 10/31/06, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> Martin v. Löwis wrote:
> > [...] because I still don't quite understand what the PEP
> > wants to achieve.
> >
>
> Are you saying you still don't understand after having read the extended
> buffer protocol PEP, yet?

I can't speak for Martin, but I don't understand how I, as a Python
programmer, might use the data type objects specified in the PEP. I
have skimmed the extended buffer protocol PEP, but I'm conscious that
no objects I currently use support the extended buffer protocol (and
the PEP doesn't mention adding support to existing objects), so I
don't see that as too relevant to me.

I have also installed numpy, and looked at the help for numpy.dtype,
but that doesn't add much to the PEP. The freely available chapters of
the numpy book explain how dtypes describe data structures, but not
how to use them. The freely available Numeric documentation doesn't
refer to dtypes, as far as I can tell. Is there any documentation on
how to use dtypes, independently of other features of numpy? If not,
can you clarify where the benefit lies for a Python user of this
proposal? (I understand the benefits of a common language for
extensions to communicate datatype information, but why expose it to
Python? How do Python users use it?)

This is probably all self-evident to the numpy community, but I think
that as the PEP is aimed at a wider audience it needs a little more
background.

Paul.


Re: [Python-Dev] PEP: Extending the buffer protocol to share array information.

2006-10-31 Thread Martin v. Löwis
Travis E. Oliphant schrieb:
> Several extensions to Python utilize the buffer protocol to share
> the location of a data-buffer that is really an N-dimensional
> array.  However, there is no standard way to exchange the
> additional N-dimensional array information so that the data-buffer
> is interpreted correctly.  The NumPy project introduced an array
> interface (http://numpy.scipy.org/array_interface.shtml) through a
> set of attributes on the object itself.  While this approach
> works, it requires attribute lookups which can be expensive when
> sharing many small arrays.  

Can you please give examples for real-world applications of this
interface, preferably examples involving multiple
independently-developed libraries?
("this" being the current interface in NumPy - I understand that
 the PEP's interface isn't implemented, yet)

Paul Moore (IIRC) gave the example of equalising the green values
and maximizing the red values in a PIL image by passing it to NumPy:
Is that a realistic (even though not-yet real-world) example? If
so, what algorithms of NumPy would I use to perform this image
manipulation (and why would I use NumPy for it if I could just
write a for loop that does that in pure Python, given PIL's
getpixel/setdata)?

Regards,
Martin


Re: [Python-Dev] PEP: Extending the buffer protocol to share array information.

2006-10-31 Thread Brett Cannon
On 10/30/06, Travis E. Oliphant <[EMAIL PROTECTED]> wrote:
>
> Attached is my PEP for extending the buffer protocol to allow array data
> to be shared.

You might want to reference this thread
(http://mail.python.org/pipermail/python-3000/2006-August/003309.html), as
Guido mentions that extending the buffer protocol to tell more about the
data in the buffer "would offer the numarray folks their 'array
interface'".

-Brett


Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Josiah Carlson

"Paul Moore" <[EMAIL PROTECTED]> wrote:
> On 10/31/06, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> > Martin v. Löwis wrote:
> > > [...] because I still don't quite understand what the PEP
> > > wants to achieve.
> > >
> >
> > Are you saying you still don't understand after having read the extended
> > buffer protocol PEP, yet?
> 
> I can't speak for Martin, but I don't understand how I, as a Python
> programmer, might use the data type objects specified in the PEP. I
> have skimmed the extended buffer protocol PEP, but I'm conscious that
> no objects I currently use support the extended buffer protocol (and
> the PEP doesn't mention adding support to existing objects), so I
> don't see that as too relevant to me.

Presumably str in 2.x and bytes in 3.x could be extended to support the
'S' specifier, unicode in 2.x and text in 3.x could be extended to
support the 'U' specifier.  The various array.array variants could be
extended to support all relevant specifiers, etc.


> This is probably all self-evident to the numpy community, but I think
> that as the PEP is aimed at a wider audience it needs a little more
> background.

Someone correct me if I am wrong, but it allows things equivalent to the
following C constructs to be described in Python...

typedef struct {
    char R;
    char G;
    char B;
    char A;
} pixel_RGBA;

pixel_RGBA image[1024][768];

Or even...

typedef struct {
    long long numerator;
    unsigned long long denominator;
    double approximation;
} rational;

rational ratios[1024];

The real use is that after you have your array of (packed) objects, be
it one of the above samples, or otherwise, you don't need to explicitly
pass around specifiers (like in struct, or ctypes), numpy and others can
talk to each other, and pick up the specifier with the extended buffer
protocol, and it just works.
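To illustrate the explicit-specifier style being contrasted here, this is roughly what Josiah's `rational` record looks like when packed with today's struct module (the field names don't survive, only the layout; `'='` is one choice of size/alignment convention):

```python
import struct

# 'q' = long long, 'Q' = unsigned long long, 'd' = double.
# The format string is the specifier that, with struct, must be passed
# around explicitly alongside the bytes.
rational = struct.Struct('=qQd')

packed = rational.pack(1, 3, 1.0 / 3.0)           # one record -> raw bytes
num, den, approx = rational.unpack(packed)        # raw bytes -> fields
```

With the extended buffer protocol, the equivalent description would travel with the memory itself instead of out of band.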

 - Josiah



Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Travis Oliphant
Paul Moore wrote:
> On 10/31/06, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> 
>>Martin v. Löwis wrote:
>>
>>>[...] because I still don't quite understand what the PEP
>>>wants to achieve.
>>>
>>
>>Are you saying you still don't understand after having read the extended
>>buffer protocol PEP, yet?
> 
> 
> I can't speak for Martin, but I don't understand how I, as a Python
> programmer, might use the data type objects specified in the PEP. I
> have skimmed the extended buffer protocol PEP, but I'm conscious that
> no objects I currently use support the extended buffer protocol (and
> the PEP doesn't mention adding support to existing objects), so I
> don't see that as too relevant to me.

Do you use the PIL?  The PIL supports the array interface.

CVXOPT supports the array interface.

Numarray
Numeric
NumPy

all support the array interface.

> 
> I have also installed numpy, and looked at the help for numpy.dtype,
> but that doesn't add much to the PEP. 

The source-code is available.

> The freely available chapters of
> the numpy book explain how dtypes describe data structures, but not
> how to use them. 


> The freely available Numeric documentation doesn't
> refer to dtypes, as far as I can tell. 

It kind of does, they are PyArray_Descr * structures in Numeric.  They 
just aren't Python objects.


> Is there any documentation on
> how to use dtypes, independently of other features of numpy? 

There are examples and other help pages at http://www.scipy.org

> If not,
> can you clarify where the benefit lies for a Python user of this
> proposal? (I understand the benefits of a common language for
> extensions to communicate datatype information, but why expose it to
> Python? How do Python users use it?)
> 

The only benefit I imagine would be for an extension module library 
writer and for users of the struct and array modules.  But, other than 
that, I don't know.  It actually doesn't have to be exposed to Python. 
I used Python notation in the PEP to explain what is basically a 
C-structure.  I don't care if the object ever gets exposed to Python.

Maybe that's part of the communication problem.


> This is probably all self-evident to the numpy community, but I think
> that as the PEP is aimed at a wider audience it needs a little more
> background.

It's hard to write that background because most of what I understand is 
from the NumPy community.  I can't give you all the examples but my 
concern is that you have all these third party libraries out there 
describing what is essentially binary data and using either 
string-copies or the buffer protocol + extra information obtained by 
some method or attribute that varies across the implementations.  There 
should really be a standard for describing this data.
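The stdlib array module is a small, concrete case of this ad-hoc state of affairs: it will hand out its buffer address, but the data-format description travels out of band through a different attribute (the calls below are the module's real API; the point is only that every library invents its own variant):

```python
import array

a = array.array('d', [1.0, 2.0, 3.0])

# buffer_info() gives a raw address and an item count...
address, n_items = a.buffer_info()

# ...but the item format is communicated separately, via attributes
# specific to this one type:
itemcode = a.typecode    # 'd'
itembytes = a.itemsize   # size of one item in bytes
```

A consumer written against array.array learns nothing about how to get the same information from a PIL image or a NumPy array; a standard data-format object is what would make the description uniform.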

There are attempts at it in the struct and array module.  There is the 
approach of ctypes but I claim that using Python type objects is 
over-kill for the purposes of describing data-formats.

-Travis




[Python-Dev] patch 1462525 or similar solution?

2006-10-31 Thread Paul Jimenez

I submitted patch 1462525 awhile back to
solve the problem described even longer ago in
http://mail.python.org/pipermail/python-dev/2005-November/058301.html
and I'm wondering what my appropriate next steps are. Honestly, I don't
care if you take my patch or someone else's proposed solution, but I'd
like to see something go into the stdlib so that I can eventually stop
having to ship custom code for what is really a standard problem.

  --pj




Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Ron Adam
> The only benefit I imagine would be for an extension module library 
> writer and for users of the struct and array modules.  But, other than 
> that, I don't know.  It actually doesn't have to be exposed to Python. 
> I used Python notation in the PEP to explain what is basically a 
> C-structure.  I don't care if the object ever gets exposed to Python.
> 
> Maybe that's part of the communication problem.


I get the impression that where ctypes is good for accessing native C libraries 
from within Python, the data-type object is meant to add a more direct way to 
share native Python objects' *data* with C (or other languages) in a more 
efficient way.  For data that can be represented well in contiguous memory 
addresses, it lightens the load: instead of a list of Python objects you get an 
"array of data for n python_type objects" without duplicating the Python type 
for every element.

I think maybe some more complete examples demonstrating how it is to be used 
from both the Python and C would be good.

Cheers,
Ron



Re: [Python-Dev] PEP: Extending the buffer protocol to share array information.

2006-10-31 Thread Travis Oliphant
Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
> 
>>Several extensions to Python utilize the buffer protocol to share
>>the location of a data-buffer that is really an N-dimensional
>>array.  However, there is no standard way to exchange the
>>additional N-dimensional array information so that the data-buffer
>>is interpreted correctly.  The NumPy project introduced an array
>>interface (http://numpy.scipy.org/array_interface.shtml) through a
>>set of attributes on the object itself.  While this approach
>>works, it requires attribute lookups which can be expensive when
>>sharing many small arrays.  
> 
> 
> Can you please give examples for real-world applications of this
> interface, preferably examples involving multiple
> independently-developed libraries?
> ("this" being the current interface in NumPy - I understand that
>  the PEP's interface isn't implemented, yet)
> 

Examples of Need

 1) Suppose you have a image in *.jpg format that came from a
 camera and you want to apply Fourier-based image recovery to try
 and de-blur the image using modified Wiener filtering.  Then you
 want to save the result in *.png format.  The PIL provides an easy
 way to read *.jpg files into Python and write the result to *.png 

 and NumPy provides the FFT and the array math needed to implement
 the algorithm.  Rather than have to dig into the details of how
 NumPy and the PIL interpret chunks of memory in order to write a
 "converter" between NumPy arrays and PIL arrays, there should be
 support in the buffer protocol so that one could write
 something like:

 # Read the image
 a = numpy.frombuffer(Image.open('myimage.jpg'))

 # Process the image.
 A = numpy.fft.fft2(a)
 B = A*inv_filter
 b = numpy.fft.ifft2(B).real

 # Write it out
 Image.frombuffer(b).save('filtered.png')

 Currently, without this proposal you have to worry about the "mode"
 the image is in and get its shape using a specific method call
 (this method call is different for every object you might want to
 interface with).

 2) The same argument applies to any library that reads and writes
 audio or video formats.

 3) You want to blit images onto a GUI Image buffer for rapid
 updates but need to do math processing on the image values
 themselves or you want to read the images from files supported by
 the PIL.

 If the PIL supported the extended buffer protocol, then you would
 not need to worry about the "mode" and the "shape" of the Image.

 What's more, you would also be able to accept images from any
 object (like NumPy arrays or ctypes arrays) that supported the
 extended buffer protocol without having to learn how it shares
 information like shape and data-format.
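 As a toy illustration of the attribute-based sharing that NumPy's existing
 array interface already enables (the FakeImage class and its buffer
 contents are invented for this example):

```python
import ctypes
import numpy as np

class FakeImage:
    """A made-up producer that shares a 2x2 RGB byte buffer through
    NumPy's __array_interface__ attribute protocol."""
    def __init__(self):
        # 12 bytes: 2x2 pixels, 3 bytes (R, G, B) each.
        self._buf = (ctypes.c_ubyte * 12)(*range(12))

    @property
    def __array_interface__(self):
        return {
            "version": 3,
            "shape": (2, 2, 3),
            "typestr": "|u1",                          # unsigned byte
            "data": (ctypes.addressof(self._buf), False),
        }

img = FakeImage()
arr = np.asarray(img)   # NumPy reads shape and format from the attribute
```

 The consumer never needs to know FakeImage's "mode" or call a
 producer-specific method; everything is in the shared description.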


I could have also included examples from PyGame, OpenGL, etc.  I thought 
people were more aware of this argument as we've made it several times 
over the years.  It's just taken this long to get to a point to start 
asking for something to get into Python.


> Paul Moore (IIRC) gave the example of equalising the green values
> and maximizing the red values in a PIL image by passing it to NumPy:
> Is that a realistic (even though not-yet real-world) example? 

I think so, but I've never done something like that.

> If so, what algorithms of NumPy would I use to perform this image
> manipulation (and why would I use NumPy for it if I could just
> write a for loop that does that in pure Python, given PIL's
> getpixel/setdata)?

Basically you would use array math operations and reductions (ufuncs and 
their methods, which are included in NumPy).  You would do it this way for 
speed: doing those loops in Python is going to be a lot slower, while 
NumPy provides the ability to do them at close-to-C speeds.

-Travis

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP: Extending the buffer protocol to share array information.

2006-10-31 Thread Terry Reedy

""Martin v. Löwis"" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Paul Moore (IIRC) gave the example of equalising the green values
> and maximizing the red values in a PIL image by passing it to NumPy:
> Is that a realistic (even though not-yet real-world) example? If
> so, what algorithms of NumPy would I use to perform this image
> manipulation

The use of surfarrays manipulated by Numeric has been an optional but 
important part of PyGame for years.
http://www.pygame.org/docs/
says
Surfarray Introduction
Pygame uses the Numeric python module to allow efficient per pixel effects 
on images. Using the surface arrays is an advanced feature that allows 
custom effects and filters. This also examines some of the simple effects 
from the Pygame example, arraydemo.py.
The Examples section of the linked page
http://www.pygame.org/docs/tut/surfarray/SurfarrayIntro.html
has code snippets for generating, resizing, recoloring, filtering, and 
cross-fading images.

>(and why would I use NumPy for it if I could just
> write a for loop that does that in pure Python, given PIL's
> getpixel/setdata)?

Why does anyone use Numeric/NumArray/NumPy?  Faster, easier coding and much 
faster execution, which is especially important when straining for an 
acceptable framerate.


I believe that at present PyGame can only work with external images that it 
is programmed to know how to import.  My guess is that if image source 
program X (such as PIL) described its data layout in a way that NumPy could 
read and act on, the import/copy step could be eliminated.  But perhaps 
Travis can clarify this.

Terry Jan Reedy






Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Bill Baxter
One thing I'm curious about in the ctypes vs this PEP debate is the following. 
How do the approaches differ in practice if I'm developing a library that wants
to accept various image formats that all describe the same thing: rgb data. 
Let's say for now all I want to support is two different image formats whose
pixels are described in C structs by:

struct rgb565
{
  unsigned short r:5;
  unsigned short g:6;
  unsigned short b:5; 
};

struct rgb101210
{
  unsigned int r:10;
  unsigned int g:12;
  unsigned int b:10; 
};


Basically in my code I want to be able to take the binary data descriptor and
say "give me the 'r' field of this pixel as an integer".

Is either one (the PEP or c-types) clearly easier to use in this case?  What
would the code look like for handling both formats generically?
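For what it's worth, here is a hedged sketch of the ctypes side of this:
both pixel formats as bit-field structs plus a tiny generic accessor.
(The helper name `field_of` is invented here, and ctypes follows the
platform C compiler's bit-field layout, so exact bit positions are
platform-dependent.)

```python
import ctypes

class RGB565(ctypes.Structure):
    _fields_ = [("r", ctypes.c_uint16, 5),
                ("g", ctypes.c_uint16, 6),
                ("b", ctypes.c_uint16, 5)]

class RGB101210(ctypes.Structure):
    _fields_ = [("r", ctypes.c_uint32, 10),
                ("g", ctypes.c_uint32, 12),
                ("b", ctypes.c_uint32, 10)]

def field_of(raw, fmt, name):
    """Generic: read one named field from raw pixel bytes, given any
    ctypes struct describing the pixel format."""
    return getattr(fmt.from_buffer_copy(raw), name)

# On a typical little-endian platform the first bit-field occupies the
# low bits, so 0x001f has all five 'r' bits set:
r = field_of(b"\x1f\x00", RGB565, "r")
```

Handling a new format is then just one more struct definition; the
accessor code stays the same.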

--bb




[Python-Dev] Path object design

2006-10-31 Thread Mike Orr
I just saw the Path object thread ("PEP 355 status", Sept-Oct), saying
that the first object-oriented proposal was rejected.  I'm in favor of
the "directory tuple" approach which wasn't mentioned in the thread.
This was proposed by Noam Raphael several months ago: a Path object
that's a sequence of components (a la os.path.split) rather than a
string.  The beauty of this approach is that slicing and joining are
expressed naturally using the [] and + operators, eliminating several
methods.

Introduction:  http://wiki.python.org/moin/AlternativePathClass
Feature discussion:  http://wiki.python.org/moin/AlternativePathDiscussion
Reference implementation:  http://wiki.python.org/moin/AlternativePathModule

(There's a link to the introduction at the end of PEP 355.)  Right now
I'm working on a test suite, then I want to add the features marked
"Mike" in the discussion -- in a way that people can compare the
feature alternatives in real code -- and write a PEP.  But it's a big
job for one person, and there are unresolved issues on the discussion
page, not to mention things brought up in the "PEP 355 status" thread.
 We had three people working on the discussion page but development
seems to have ground to a halt.

One thing is sure -- we urgently need something better than os.path.
It functions well but it makes hard-to-read and unpythonic code.  For
instance, I have an application that has to add its libraries to the
Python path, relative to the executable's location.

/toplevel
    app1/
        bin/
            main_program.py
            utility1.py
            init_app.py
        lib/
            app_module.py
    shared/
        lib/
            shared_module.py

The solution I've found is an init_app module in every application
that sets up the paths.  Conceptually it needs "../lib" and
"../../shared/lib", but I want the absolute paths without hardcoding
them, in a platform-neutral way.  With os.path, "../lib" is:

os.path.join(os.path.dirname(os.path.dirname(__file__)), "lib")

YUK!  Compare to PEP 355:

Path(__file__).parent.parent.join("lib")

Much easier to read and debug.  Under Noam's proposal it would be:

Path(__file__)[:-2] + "lib"
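To make those semantics concrete, here is a minimal, hypothetical sketch
of the "directory tuple" idea -- the class below is only an illustration
of how slicing and + could work, not Noam's actual implementation:

```python
class Path(tuple):
    """Toy sketch of a path as a tuple of components."""
    SEP = "/"   # naive: real code would handle platform separators

    def __new__(cls, arg):
        if isinstance(arg, str):
            arg = [part for part in arg.split(cls.SEP) if part]
        return super().__new__(cls, arg)

    def __getitem__(self, index):
        item = super().__getitem__(index)
        # Slicing a path yields a path; indexing yields one component.
        return Path(item) if isinstance(index, slice) else item

    def __add__(self, other):
        if isinstance(other, str):
            other = (other,)
        return Path(tuple(self) + tuple(other))

    def __str__(self):
        return self.SEP.join(self)

p = Path("toplevel/app1/bin/init_app.py")
lib = p[:-2] + "lib"    # drop two components, append one
```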

I'd also like to see the methods more intelligent: don't raise an
error if an operation is already done (e.g., a directory exists or a
file is already removed).  There's no reason to clutter one's code
with extra if's when the methods can easily encapsulate this. This was
considered a too radical departure from os.path for some, but I have
in mind even more radical convenience methods which I'd put in a
third-party subclass if they're not accepted into the standard
library, the way 'datetime' has third-party subclasses.

In my application I started using Orendorff's path module, expecting
the standard path object would be close to it.  When PEP 355 started
getting more changes and the directory-based alternative took off, I
took path.py out and rewrote my code for os.path until an alternative
becomes more stable. Now it looks like it will be several months and
possibly several third-party packages until one makes it into the
standard library. This is unfortunate.  Not only does it mean ugly
code in applications, but it means packages can't accept or return
Path objects and expect them to be compatible with other packages.

The reasons PEP 355 was rejected also sound strange.  Nick Coghlan
wrote (Oct 1):

> Things the PEP 355 path object lumps together:
>   - string manipulation operations
>   - abstract path manipulation operations (work for non-existent filesystems)
>   - read-only traversal of a concrete filesystem (dir, stat, glob, etc)
>   - addition & removal of files/directories/links within a concrete filesystem

> Dumping all of these into a single class is certainly practical from a utility
> point of view, but it's about as far away from beautiful as you can get, which
> creates problems from a learnability point of view, and from a
> capability-based security point of view.

What about the convenience of the users and the beauty of users' code?
 That's what matters to me.  And I consider one class *easier* to
learn.  I'm tired of memorizing that 'split' is in os.path while
'remove' and 'stat' are in os.  This seems arbitrary: you're statting
a path, aren't you?  Also, if you have four classes (abstract path,
file, directory, symlink), *each* of those will have 3+
platform-specific versions.  Then if you want to make an enhancement
subclass you'll have to make 12 of them, one for each of the 3*4
combinations of superclasses.  Encapsulation can help with this, but
it strays from the two-line convenience for the user:

from path import Path
p = Path("ABC")  # Works the same for files/directories on any platform.

Nevertheless, I'm open to seeing a multi-class API, though hopefully
less verbose than Talin's preliminary one (Oct 26).  Is it necessary
to support path.parent(), pathobj.parent(), io.dir.listdir(), *and*
io.dir.Directory()?

Re: [Python-Dev] Path object design

2006-10-31 Thread Talin
I'm right in the middle of typing up a largish post to go on the 
Python-3000 mailing list about this issue. Maybe we should move it over 
there, since it's likely that any path reform will have to be targeted at 
Py3K...?

Mike Orr wrote:
> I just saw the Path object thread ("PEP 355 status", Sept-Oct), saying
> that the first object-oriented proposal was rejected.  I'm in favor of
> the "directory tuple" approach which wasn't mentioned in the thread.
> This was proposed by Noam Raphael several months ago: a Path object
> that's a sequence of components (a la os.path.split) rather than a
> string.  The beauty of this approach is that slicing and joining are
> expressed naturally using the [] and + operators, eliminating several
> methods.
> 
> Introduction:  http://wiki.python.org/moin/AlternativePathClass
> Feature discussion:  http://wiki.python.org/moin/AlternativePathDiscussion
> Reference implementation:  http://wiki.python.org/moin/AlternativePathModule
> 
> (There's a link to the introduction at the end of PEP 355.)  Right now
> I'm working on a test suite, then I want to add the features marked
> "Mike" in the discussion -- in a way that people can compare the
> feature alternatives in real code -- and write a PEP.  But it's a big
> job for one person, and there are unresolved issues on the discussion
> page, not to mention things brought up in the "PEP 355 status" thread.
>  We had three people working on the discussion page but development
> seems to have ground to a halt.
> 
> One thing is sure -- we urgently need something better than os.path.
> It functions well but it makes hard-to-read and unpythonic code.  For
> instance, I have an application that has to add its libraries to the
> Python path, relative to the executable's location.
> 
> /toplevel
>     app1/
>         bin/
>             main_program.py
>             utility1.py
>             init_app.py
>         lib/
>             app_module.py
>     shared/
>         lib/
>             shared_module.py
> 
> The solution I've found is an init_app module in every application
> that sets up the paths.  Conceptually it needs "../lib" and
> "../../shared/lib", but I want the absolute paths without hardcoding
> them, in a platform-neutral way.  With os.path, "../lib" is:
> 
> os.path.join(os.path.dirname(os.path.dirname(__file__)), "lib")
> 
> YUK!  Compare to PEP 355:
> 
> Path(__file__).parent.parent.join("lib")
> 
> Much easier to read and debug.  Under Noam's proposal it would be:
> 
> Path(__file__)[:-2] + "lib"
> 
> I'd also like to see the methods more intelligent: don't raise an
> error if an operation is already done (e.g., a directory exists or a
> file is already removed).  There's no reason to clutter one's code
> with extra if's when the methods can easily encapsulate this. This was
> considered a too radical departure from os.path for some, but I have
> in mind even more radical convenience methods which I'd put in a
> third-party subclass if they're not accepted into the standard
> library, the way 'datetime' has third-party subclasses.
> 
> In my application I started using Orendorff's path module, expecting
> the standard path object would be close to it.  When PEP 355 started
> getting more changes and the directory-based alternative took off, I
> took path.py out and rewrote my code for os.path until an alternative
> becomes more stable. Now it looks like it will be several months and
> possibly several third-party packages until one makes it into the
> standard library. This is unfortunate.  Not only does it mean ugly
> code in applications, but it means packages can't accept or return
> Path objects and expect them to be compatible with other packages.
> 
> The reasons PEP 355 was rejected also sound strange.  Nick Coghlan
> wrote (Oct 1):
> 
>> Things the PEP 355 path object lumps together:
>>   - string manipulation operations
>>   - abstract path manipulation operations (work for non-existent filesystems)
>>   - read-only traversal of a concrete filesystem (dir, stat, glob, etc)
>>   - addition & removal of files/directories/links within a concrete 
>> filesystem
> 
>> Dumping all of these into a single class is certainly practical from a 
>> utility
>> point of view, but it's about as far away from beautiful as you can get, 
>> which
>> creates problems from a learnability point of view, and from a
>> capability-based security point of view.
> 
> What about the convenience of the users and the beauty of users' code?
>  That's what matters to me.  And I consider one class *easier* to
> learn.  I'm tired of memorizing that 'split' is in os.path while
> 'remove' and 'stat' are in os.  This seems arbitrary: you're statting
> a path, aren't you?  Also, if you have four classes (abstract path,
> file, directory, symlink), *each* of those will have 3+
> platform-specific versions.  Then if you want to make an enhancement
> subclass you'll have to make 12 of them, one for each of the 3*4
> combinations of superclasses.

Re: [Python-Dev] PEP: Extending the buffer protocol to share array information.

2006-10-31 Thread Terry Reedy

"Travis Oliphant" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
>Examples of Need
[snip]
> I could have also included examples from PyGame, OpenGL, etc.  I thought
>people were more aware of this argument as we've made it several times
>over the years.  It's just taken this long to get to a point to start
>asking for something to get into Python.

The problem of data format definition and sharing of data between 
applications has been a bugaboo of computer science for decades.  But some 
have butted their heads against it more than others.

Something which made a noticeable dent in the problem, by making sharing 
'just work' more easily, would, to me, be a real plus for Python.

tjr






Re: [Python-Dev] PEP: Adding data-type objects to Python

2006-10-31 Thread Ronald Oussoren


On Oct 31, 2006, at 6:38 PM, Thomas Heller wrote:

> This mechanism is probably a hack because it's not possible to add
> C accessible fields to type objects, on the other hand it is
> extensible (in principle, at least).

I better start rewriting PyObjC then :-). PyObjC stores some additional 
information in the type objects that are used to describe Objective-C 
classes (such as a reference to the proxied class).


IIRC this has been possible since Python 2.3.

Ronald






Re: [Python-Dev] Path object design

2006-10-31 Thread Fredrik Lundh
Talin wrote:

> I'm right in the middle of typing up a largish post to go on the 
> Python-3000 mailing list about this issue. Maybe we should move it over 
> there, since it's likely that any path reform will have to be targeted at 
> Py3K...?

I'd say that any proposal that cannot be fit into the current 2.X design 
is simply too disruptive to go into 3.0.  So here's my proposal for 2.6 
(reposted from the 3K list).

This is fully backwards compatible, can go right into 2.6 without 
breaking anything, allows people to update their code as they go,
and can be incrementally improved in future releases:

 1) Add a pathname wrapper to "os.path", which lets you do basic
path "algebra".  This should probably be a subclass of unicode,
and should *only* contain operations on names.

 2) Make selected "shutil" operations available via the "os" name-
space; the old POSIX API vs. POSIX SHELL distinction is pretty
irrelevant.  Also make the os.path predicates available via the
"os" namespace.

This gives a very simple conceptual model for the user: to manipulate
path *names*, use "os.path.<op>(string)" functions or the "<path>"
wrapper.  To manipulate *objects* identified by a path, given either as
a string or a path wrapper, use "os.<op>(path)".  This can be taught in
less than a minute.
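A hedged sketch of what such a name-only wrapper might look like (the
class name PathName and its methods are invented here; the post doesn't
specify an API, and it would subclass unicode in 2.x terms):

```python
import posixpath

class PathName(str):
    """Name algebra only: no filesystem access whatsoever."""

    def joinpath(self, *parts):
        return PathName(posixpath.join(self, *parts))

    @property
    def parent(self):
        return PathName(posixpath.dirname(self))

    @property
    def name(self):
        return posixpath.basename(self)

p = PathName("/toplevel/app1/bin/init_app.py")
lib = p.parent.parent.joinpath("lib")
# Being a str subclass, it interoperates with plain strings and with
# every existing os.* function that expects a path string.
```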

With this in place in 2.6 and 2.7, all that needs to be done for 3.0 is 
to remove (some of) the old cruft.





Re: [Python-Dev] PEP: Extending the buffer protocol to share array information.

2006-10-31 Thread Fredrik Lundh
Terry Reedy wrote:

> I believe that at present PyGame can only work with external images that it 
> is programmed to know how to import.  My guess is that if image source 
> program X (such as PIL) described its data layout in a way that NumPy could 
> read and act on, the import/copy step could be eliminated.

I wish you would all stop using PIL as an example in this discussion;
for PIL 2, I'm moving towards an entirely opaque data model, with a 
"data view"-style client API.


