Re: [Python-Dev] XML codec?

2007-11-10 Thread Martin v. Löwis
>  > In case it isn't clear - this is exactly my view also.
> 
> But is there an API to do it?  As MAL points out that API would have
> to return not an encoding, but a pair of an encoding and the rewound
> stream.  

The API wouldn't operate on streams. Instead, you pass a string, and
it either returns the detected encoding or indicates that it needs
more data. No streams.

> For non-seekable, non-peekable streams (if any), what you'd
> need would be a stream that consisted of a concatenation of the
> buffered data used for detection and the continuation of the stream.

The application would read data out of the stream, and pass it to
the detection. It then can process it in whatever manner it meant to
process it in the first place.
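
A minimal sketch of the string-based protocol described above. The function
names, the convention of returning None for "need more data", and the 4-byte
threshold are illustrative assumptions, not the API under discussion; only
byte order marks are checked here and declaration parsing is left out.

```python
import codecs
import io

# UTF-32 BOMs come first: BOM_UTF32_LE begins with the same two bytes
# as BOM_UTF16_LE, so the longer signature has to be tested first.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def detect_encoding(data, final=False):
    """Return an encoding name, or None if more data is needed (hypothetical API)."""
    if len(data) < 4 and not final:
        return None
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return "utf-8"  # fallback when no BOM is present


def read_with_detection(stream, chunk_size=2):
    """The application does the reading; the detector only ever sees bytes."""
    buffered = b""
    while True:
        chunk = stream.read(chunk_size)
        buffered += chunk
        encoding = detect_encoding(buffered, final=not chunk)
        if encoding is not None:
            # The bytes consumed for detection stay with the caller.
            return encoding, buffered


stream = io.BytesIO(codecs.BOM_UTF16_LE + "<a/>".encode("utf-16-le"))
encoding, head = read_with_detection(stream)
text = (head + stream.read()).decode(encoding)
```

Because the caller keeps the buffered bytes, nothing has to be rewound or
pushed back, which is why a non-seekable stream is not a problem here.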

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] XML codec?

2007-11-10 Thread Martin v. Löwis
> A non-seekable stream is not all that uncommon in network processing.

Right. But what is the relationship to XML encoding autodetection?

Regards,
Martin



Re: [Python-Dev] XML codec?

2007-11-10 Thread Walter Dörwald
"Martin v. Löwis" wrote:

>>> So what if the unicode string doesn't start with an XML declaration?
>>> Will it add one?
>>
>> No.
>
> Ok. So the XML document would be ill-formed then unless the encoding is
> UTF-8, right?

I don't know. Is an XML document ill-formed if it doesn't contain an XML
declaration and is not in UTF-8 or UTF-16, but there's external encoding
info? If it is, then yes, the document would be ill-formed.

>> The point of this code is not just to return whether the string starts
>> with "<?xml" [...]
>
> Still, it's overly complex for that matter:
>
>>   * The string does start with "<?xml"
>
>   if s.startswith("<?xml"):
>     return Yes
>
>>   * The string starts with a prefix of "<?xml" [...]
>>     decide if it starts with "<?xml"
>
>   if "<?xml".startswith(s):
>     return Maybe
>
>>   * The string definitely doesn't start with "<?xml"
>
>   return No

This looks good. Now we would have to extend the code to detect and replace the 
encoding in the XML declaration too.
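
The yes/maybe/no decision can be sketched as follows, assuming the prefix
being tested for is "<?xml". The Match enum and the function name are
illustrative choices, not part of any proposed patch.

```python
import enum


class Match(enum.Enum):
    YES = "yes"
    NO = "no"
    MAYBE = "maybe"  # cannot decide without more input


XML_PREFIX = "<?xml"


def starts_with_xml_declaration(s):
    """Three-state check on a possibly incomplete string."""
    if s.startswith(XML_PREFIX):
        return Match.YES
    if XML_PREFIX.startswith(s):
        # s is a proper prefix of "<?xml" (the empty string included),
        # so further input could still complete it.
        return Match.MAYBE
    return Match.NO
```

The order of the two tests matters: a string that is exactly "<?xml" passes
both, and it should be reported as a definite match.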

>>> What bit fiddling are you referring to specifically that you think
>>> is better done in C than in Python?
>>
>> The code that checks the byte signature, i.e. the first part of
>> detect_xml_encoding_str().
>
> I can't see any *bit* fiddling there, except for the bit mask of
> candidates. For the candidate list, I cannot quite understand why
> you need a bit mask at all, since the candidates are rarely
> overlapping.

I tried many variants and that seemed to be the most straightforward one.

> I think there could be a much simpler routine to have the same
> effect.
> - if it's less than 4 bytes, answer "need more data".

Can there be an XML document that is less than 4 bytes? I guess not.

> - otherwise, implement annex F "literally". Make a dictionary
>   of all prefixes that are exactly 4 bytes, i.e.
>
>   prefixes4 = {"\x00\x00\xFE\xFF":"utf-32be", ...
>   ...,"\0\x3c\0\x3f":"utf-16be"}
>
>   try: return prefixes4[s[:4]]
>   except KeyError: pass
>   if s.startswith(codecs.BOM_UTF16_BE):return "utf-16be"
>   ...
>   if s.startswith("<?xml"):
>     return get_encoding_from_declaration(s)
>   return "utf-8"

get_encoding_from_declaration() would have to do the same yes/no/maybe decision.
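
A runnable sketch of the table-driven approach, under some assumptions: it
works on bytes, returns None for "need more data", covers only the common
signatures from Appendix F of the XML specification (the unusual UCS-4 byte
orders and EBCDIC are left out), and get_encoding_from_declaration() is a
hypothetical helper that handles ASCII-compatible encodings only.

```python
import codecs
import re

# Four-byte signatures, with and without a byte order mark.
PREFIXES4 = {
    b"\x00\x00\xfe\xff": "utf-32-be",  # BOM
    b"\xff\xfe\x00\x00": "utf-32-le",  # BOM
    b"\x00\x00\x00\x3c": "utf-32-be",  # "<" without BOM
    b"\x3c\x00\x00\x00": "utf-32-le",  # "<" without BOM
    b"\x00\x3c\x00\x3f": "utf-16-be",  # "<?" without BOM
    b"\x3c\x00\x3f\x00": "utf-16-le",  # "<?" without BOM
}

_ENCODING_RE = re.compile(rb"""encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']""")


def get_encoding_from_declaration(data):
    """Hypothetical helper: None means the declaration is still incomplete."""
    end = data.find(b"?>")
    if end < 0:
        return None
    match = _ENCODING_RE.search(data, 0, end)
    return match.group(1).decode("ascii") if match else "utf-8"


def detect_xml_encoding(data):
    """Return an encoding name, or None if more data is needed."""
    if len(data) < 4:
        return None
    try:
        return PREFIXES4[data[:4]]
    except KeyError:
        pass
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if b"<?xml".startswith(data[:5]):
        # Also taken when only "<?xm" has arrived so far; the helper
        # then reports that it needs more data.
        return get_encoding_from_declaration(data)
    return "utf-8"
```

The dictionary lookup has to come before the UTF-16 BOM tests, because the
UTF-32 little-endian BOM starts with the same two bytes as the UTF-16 one.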

But anyway: would a Python implementation of these two functions 
(detect_encoding()/fix_encoding()) be accepted?

Servus,
   Walter




Re: [Python-Dev] XML codec?

2007-11-10 Thread Walter Dörwald
"Martin v. Löwis" wrote:
>> And what do you do once you've detected the encoding? You decode the
>> input, so why not combine both into an XML decoder?
>
> Because it is the XML parser that does the decoding, not the
> application. Also, it is better to provide functionality in
> a modular manner (i.e. encoding detection separately from
> encodings),

It is separate. Detection is done by codecs.detect_xml_encoding(), decoding is 
done by the codec.

> and leaving integration of modules to the application,
> in particular if the integration is trivial.

Servus,
   Walter




[Python-Dev] Summary of Tracker Issues

2007-11-10 Thread Tracker

ACTIVITY SUMMARY (11/03/07 - 11/10/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 1322 open (+21) / 11579 closed (+18) / 12901 total (+39)

Open issues with patches:   419

Average duration of open issues: 687 days.
Median duration of open issues: 789 days.

Open Issues Breakdown
   open    1317 (+21)
pending       5 ( +0)

Issues Created Or Reopened (40)
___

test_import breaks on Linux                                      11/09/07
       http://bugs.python.org/issue1377    reopened gvanrossum
       py3k

fix for test_asynchat and test_asyncore on pep3137 branch        11/03/07
CLOSED http://bugs.python.org/issue1380    created  hupp
       py3k, patch

cmath is numerically unsound                                     11/03/07
       http://bugs.python.org/issue1381    created  inducer

py3k-pep3137: patch for test_ctypes                              11/04/07
CLOSED http://bugs.python.org/issue1382    created  amaury.forgeotdarc
       py3k, patch

Backport abcoll to 2.6                                           11/04/07
       http://bugs.python.org/issue1383    created  baranguren
       patch

Windows fix for inspect tests                                    11/04/07
CLOSED http://bugs.python.org/issue1384    created  tiran
       py3k, patch

hmac module violates RFC for some hash functions, e.g. sha512    11/04/07
CLOSED http://bugs.python.org/issue1385    created  jowagner
       py3k

py3k-pep3137: patch to ensure that all codecs return bytes       11/04/07
CLOSED http://bugs.python.org/issue1386    created  amaury.forgeotdarc
       py3k, patch

py3k-pep3137: patch for hashlib on Windows                       11/04/07
CLOSED http://bugs.python.org/issue1387    created  amaury.forgeotdarc
       py3k, patch

py3k-pep3137: possible ref leak in ctypes                        11/05/07
CLOSED http://bugs.python.org/issue1388    created  tiran
       py3k

py3k-pep3137: struct module is leaking references                11/05/07
CLOSED http://bugs.python.org/issue1389    created  tiran
       py3k

toxml generates output that is not well formed                   11/05/07
       http://bugs.python.org/issue1390    created  drtomc

Adds the .compact() method to bsddb db.DB objects                11/05/07
       http://bugs.python.org/issue1391    created  gregory.p.smith
       patch, rfe

py3k-pep3137: issue warnings / errors on str(bytes()) and simila 11/05/07
CLOSED http://bugs.python.org/issue1392    created  tiran
       py3k, patch

function comparing lacks NotImplemented error                    11/05/07
       http://bugs.python.org/issue1393    created  _doublep

simple patch, improving unreachable bytecode removing            11/05/07
       http://bugs.python.org/issue1394    created  _doublep
       patch

py3k: duplicated line endings when using read(1)                 11/06/07
       http://bugs.python.org/issue1395    created  amaury.forgeotdarc
       py3k

py3k-pep3137: patch for mailbox                                  11/06/07
CLOSED http://bugs.python.org/issue1396    created  tiran
       py3k, patch

py3k-pep3137: failing unit test test_bsddb                       11/06/07
       http://bugs.python.org/issue1397    created  tiran
       py3k

Can't pickle partial functions   11/07/07

Re: [Python-Dev] Declaring setters with getters

2007-11-10 Thread Guido van Rossum
Unless I get negative feedback really soon I plan to submit this later
today. I've tweaked the patch slightly to be smarter about replacing
the setter and the deleter together if they are the same object.

On Nov 9, 2007 10:03 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> D'oh. I forgot to point to the patch. It's here:
> http://bugs.python.org/issue1416
>
>
> On Nov 9, 2007 10:00 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > To follow up, I now have a patch. It's pretty straightforward.
> >
> > This implements the kind of syntax that I believe won over most folks
> > in the end:
> >
> >   @property
> >   def foo(self): ...
> >
> >   @foo.setter
> >   def foo(self, value=None): ...
> >
> > There are also .getter and .deleter descriptors.  This includes the hack
> > that if you specify a setter but no deleter, the setter is called
> > without a value argument when attempting to delete something.  If the
> > setter isn't ready for this, a TypeError will be raised, pretty much
> > just as if no deleter was provided (just with a somewhat worse error
> > message :-).
> >
> > I intend to check this into 2.6 and 3.0 unless there is a huge cry of
> > dismay.  Docs will be left to volunteers as always.
> >
> > --Guido
> >
> >
> > On Oct 31, 2007 9:08 AM, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > > I've come up with a relatively unobtrusive pattern for defining
> > > setters. Given the following definition:
> > >
> > >     def propset(prop):
> > >         assert isinstance(prop, property)
> > >         def helper(func):
> > >             return property(prop.__get__, func, func, prop.__doc__)
> > >         return helper
> > >
> > > we can declare getters and setters as follows:
> > >
> > >     class C(object):
> > >
> > >         _encoding = None
> > >
> > >         @property
> > >         def encoding(self):
> > >             return self._encoding
> > >
> > >         @propset(encoding)
> > >         def encoding(self, value=None):
> > >             if value is not None:
> > >                 unicode("0", value)  # Test it
> > >             self._encoding = value
> > >
> > >     c = C()
> > >     print(c.encoding)
> > >     c.encoding = "ascii"
> > >     print(c.encoding)
> > >     try:
> > >         c.encoding = "invalid"  # Fails
> > >     except:
> > >         pass
> > >     print(c.encoding)
> > >
> > > I'd like to make this a standard built-in, in the hope of ending the
> > > debate on how to declare settable properties.
> > >
> > > I'd also like to change property so that the doc string defaults to
> > > the doc string of the getter.
> > >
> > > --
> > > --Guido van Rossum (home page: http://www.python.org/~guido/)
> > >
> >
> >
> >
> > --
> > --Guido van Rossum (home page: http://www.python.org/~guido/)
> >
>
>
>
> --
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Declaring setters with getters

2007-11-10 Thread Steven Bethard
On Nov 10, 2007 11:31 AM, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> Unless I get negative feedback really soon I plan to submit this later
> today. I've tweaked the patch slightly to be smarter about replacing
> the setter and the deleter together if they are the same object.

Definitely +1 on the basic patch.

Could you explain briefly the advantage of the "hack" that merges the
set and del methods?  Looking at the patch, I get a little nervous
about this::

    @foo.setter
    def foo(self, value=None):
        if value is None:
            del self._foo
        else:
            self._foo = abs(value)

That means that ``c.foo = None`` is equivalent to ``del c.foo`` right?
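
To make the concern concrete, here is a small sketch that emulates the merged
behaviour with the plain property() constructor by passing one function as
both setter and deleter. It does not use the patch itself; the class and
helper names are made up for illustration.

```python
class C:
    def _get_foo(self):
        return self._foo

    def _set_or_del_foo(self, value=None):
        # Called with a value on assignment and with no value on deletion.
        if value is None:
            del self._foo
        else:
            self._foo = abs(value)

    # The same function is installed as both fset and fdel.
    foo = property(_get_foo, _set_or_del_foo, _set_or_del_foo)


c = C()
c.foo = -2      # stores 2
c.foo = None    # indistinguishable from "del c.foo": removes c._foo
```

Inside the function there is no way to tell an explicit assignment of None
from a deletion, so None can no longer be stored as a value.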

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
--- Bucky Katt, Get Fuzzy


Re: [Python-Dev] Declaring setters with getters

2007-11-10 Thread Guido van Rossum
On Nov 10, 2007 11:09 AM, Steven Bethard <[EMAIL PROTECTED]> wrote:
> On Nov 10, 2007 11:31 AM, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > Unless I get negative feedback really soon I plan to submit this later
> > today. I've tweaked the patch slightly to be smarter about replacing
> > the setter and the deleter together if they are the same object.
>
> Definitely +1 on the basic patch.
>
> Could you explain briefly the advantage of the "hack" that merges the
> set and del methods?  Looking at the patch, I get a little nervous
> about this::
>
>     @foo.setter
>     def foo(self, value=None):
>         if value is None:
>             del self._foo
>         else:
>             self._foo = abs(value)
>
> That means that ``c.foo = None`` is equivalent to ``del c.foo`` right?

Which is sometimes convenient. But thinking about this some more I
think that if I *wanted* to use the same method as setter and deleter,
I could just write

@foo.setter
@foo.deleter
def foo(self, value=None): ...

So I'm withdrawing the hacks, making the code and semantics much simpler.

See propset3.diff in http://bugs.python.org/issue1416 .
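
For comparison, a sketch with a separate setter and deleter, written against
the setter and deleter decorator methods that property objects provide in
Python 2.6 and later. The class and attribute names are illustrative, and
this is not taken from propset3.diff.

```python
class C:
    def __init__(self):
        self._foo = 0

    @property
    def foo(self):
        return self._foo

    @foo.setter
    def foo(self, value):
        # Assignment only; abs(None) raises TypeError, so None is rejected.
        self._foo = abs(value)

    @foo.deleter
    def foo(self):
        # Deletion has its own function and never sees a value.
        del self._foo


c = C()
c.foo = -3   # stores 3
del c.foo    # removes c._foo
```

With the two operations kept apart, assignment and deletion cannot be
confused with each other.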

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Declaring setters with getters

2007-11-10 Thread Christian Heimes
Guido van Rossum wrote:
> Which is sometimes convenient. But thinking about this some more I
> think that if I *wanted* to use the same method as setter and deleter,
> I could just write
> 
> @foo.setter
> @foo.deleter
> def foo(self, value=None): ...
> 
> So I'm withdrawing the hacks, making the code and semantics much simpler.

I like the new way better than the implicit magic of your former patch.
(*) I've reviewed your patch and I found a minor typo caused by copy and
paste.

Good work Guido!

Christian

(*) The buzz words 'implicit' and 'magic' are used in this posting to
make Guido's non-pythonic-code-sense tingle. *scnr* :]



Re: [Python-Dev] Declaring setters with getters

2007-11-10 Thread Guido van Rossum
On Nov 10, 2007 1:43 PM, Christian Heimes <[EMAIL PROTECTED]> wrote:
> Good work Guido!

With such a ringing endorsement, I've submitted this to the 2.6 trunk
and the py3k branch.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Bug tracker: meaning of resolution keywords

2007-11-10 Thread Brett Cannon
On Nov 9, 2007 9:05 AM, Christian Heimes <[EMAIL PROTECTED]> wrote:
> Hello!
>
> Guido has granted me committer privileges to svn.python.org and
> bugs.python.org about a week ago. So I'm new and new people tend to make
> mistakes until they've learned the specific rules of a project.
>
> Today I've learned that the resolution keyword "accepted" doesn't mean
> the bug report is accepted. It only means a patch for the bug is
> accepted. In the past I've used "accepted" in the meaning of "bug is
> confirmed" in my own projects. In my ignorance I've used it in the same
> way to mark bugs as confirmed when I was able to reproduce the bug myself.
>
> The tracker doc at http://wiki.python.org/moin/TrackerDocs/ doesn't have
> a formal definition of the various keywords. I'd like to add a definition
> to the wiki to prevent others from making the same mistake. But first I'd
> like to discuss my view of the keywords.
>
> Resolutions
> ***
>
> accepted - patch accepted
> confirmed (*) - the problem is confirmed
> duplicate - the bug is a duplicate of another bug
> fixed - the bug is fixed / patch is applied
> invalid - catch all for invalid reports
> later - the problem is going to be addressed later in the release cycle
> out of date - the bug was already fixed in svn
> postponed - the problem is going to be fixed in the next minor version
> rejected - the patch or feature request is rejected
> remind - remind me to finish the task (docs, unit tests)
> wont fix - it's not a bug, it's a feature
> works for me - unable to reproduce the problem

It doesn't really work for you if you can't reproduce it.  =)

An important thing to remember is that all of the states are there because
they are hold-overs from SourceForge's bug tracker, not by choice.
SOMEDAY, damn it, I am going to have the time to work on redesigning
our workflow so that it is how WE want it to be and makes sense for us.
Then we can have a doc like Django has
(http://www.djangoproject.com/documentation/contributing/#ticket-triage)
which would spell all of this out.

But as Christian knows first hand from me not getting to any of my
bugs quickly as of late, I don't have the time right now.  =(  But I
have stopped adding to my list of stuff to do for Python (it is
already long enough as it is) so that I will eventually get to this in
2008.

-Brett


Re: [Python-Dev] Bug tracker: meaning of resolution keywords

2007-11-10 Thread Facundo Batista
2007/11/9, Christian Heimes <[EMAIL PROTECTED]>:

> Guido has granted me committer privileges to svn.python.org and
> bugs.python.org about a week ago. So I'm new and new people tend to make
> mistakes until they've learned the specific rules of a project.

Yes, I saw the change in developers.txt. That reminds me that I was
going to ask you for an introduction: who are you, what do you do,
where are you from, what do you like, etc. And I hope to meet you in
Chicago!


> Today I've learned that the resolution keyword "accepted" doesn't mean
> the bug report is accepted. It only means a patch for the bug is
> accepted. In the past I've used "accepted" in the meaning of "bug is

If you accept a patch for a bug, doesn't it imply that the bug is real
and that you're accepting the bug?


> accepted - patch accepted
> confirmed (*) - the problem is confirmed
> duplicate - the bug is a duplicate of another bug
> fixed - the bug is fixed / patch is applied
> invalid - catch all for invalid reports
> later - the problem is going to be addressed later in the release cycle
> out of date - the bug was already fixed in svn
> postponed - the problem is going to be fixed in the next minor version
> rejected - the patch or feature request is rejected
> remind - remind me to finish the task (docs, unit tests)
> wont fix - it's not a bug, it's a feature
> works for me - unable to reproduce the problem

I think there are too many. You shouldn't have to think too hard about
which category to put a bug in, or argue with a coworker over a
category.

Some can clearly be combined (like "later" and "postponed"), and some
need more thought (like "invalid": doesn't it include "works for
me"?). But it would be great if there were only 5 or 6, and not so vague.

Thanks!

-- 
.Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/