date:20150610

Permission to Showcase the Python Program

2015-06-10 Thread Leslie Bush

I’m having trouble reaching an actual human at your organization as my emails 
get bounced back.

If someone could please see my email below and respond with an answer as soon 
as possible, I’d appreciate it.




To Whom It May Concern:

My name is Leslie Bush and I am the Intellectual Property Coordinator for The
Great Courses. We produce non-credit, college-level educational programs on DVD
and in electronic formats as lecture series. The lectures are recorded and then
sold to the general public for profit. As a didactic tool for enhancing the
programs, we include in the lectures visual elements illustrating works of art,
people, events, locations, etc.

We are currently producing a course with Dr. John Keyser, Professor of Computer
Science at the University of North Carolina Chapel Hill, entitled “Computer
Science for Everyone: Programming Concepts and Exercises.” The professor would
like to showcase your Python program in the course. I am writing to see
if we may have permission to do so. If permission is granted, we will have a
copy of our license agreement sent over to your company’s authorizer for review
and a signature. Here are the details of our program:

Title: Computer Science for Everyone: Programming Concepts and Exercises
Author and Publisher: The Teaching Company
Language: All
Format: all electronic formats
Distribution: Worldwide
Print Run: Life of the product
Lecturer: Dr. John Keyser
Release date: Summer 2016
Price: Unknown

If you need further details or have questions or concerns, please do not 
hesitate to contact me.  Our address is 4840 Westfields Blvd. Chantilly, VA 
20151. For more information about The Teaching Company you can look at our site 
at www.thegreatcourses.com

If you are interested in seeing any of the courses on our site please let me 
know and I’ll be happy to send you a copy. I look forward to working with you 
on this exciting course. Thank you for your assistance.

Sincerely,
Leslie Bush
Product Development Intellectual Property Coordinator
The Teaching Company/The Great Courses
4840 Westfields Blvd., Suite 500, Chantilly, VA 20151-2299
(703)774-1687 Direct
(703)502-4270 Fax
www.thegreatcourses.com

-- 
https://mail.python.org/mailman/listinfo/python-list

ANN: eGenix mxODBC Plone/Zope Database Adapter 2.2.2

2015-06-10 Thread eGenix Team: M.-A. Lemburg


ANNOUNCING

  mxODBC Plone/Zope Database Adapter

Version 2.2.2

  for the Plone CMS and Zope server platform

  Available for Plone 4.0-4.3 and Plone 5.0,
Zope 2.12 and 2.13, on
Windows, Linux, Mac OS X, FreeBSD and other platforms

This announcement is also available on our web-site for online reading:
http://www.egenix.com/company/news/eGenix-mxODBC-Zope-DA-2.2.2-GA.html


INTRODUCTION

The eGenix mxODBC Zope DA allows you to easily connect your Zope or
Plone CMS installation to just about any database backend on the
market today, giving you the reliability of the commercially supported
eGenix product mxODBC and the flexibility of the ODBC standard as
middle-tier architecture.

The mxODBC Zope Database Adapter is highly portable, just like Zope
itself and provides a high performance interface to all your ODBC data
sources, using a single well-supported interface on Windows, Linux,
Mac OS X, FreeBSD and other platforms.

This makes it ideal for deployment in ZEO Clusters and Zope hosting
environments where stability and high performance are a top priority,
establishing an excellent basis and scalable solution for your Plone
CMS.

Product page:

http://www.egenix.com/products/zope/mxODBCZopeDA/


NEWS

The 2.2.2 release of our mxODBC Zope/Plone Database Adapter product is
a patch level release of the popular ODBC database interface for Plone
and Zope. It includes these enhancements and fixes:

Driver Compatibility Enhancements
---------------------------------

 * Reenabled returning cursor.rowcount for FreeTDS >= 0.91. In
   previous versions, FreeTDS could return wrong data for .rowcount
   when using SELECTs.

Fixes
-----

 * Removed exists() built-in from mxODBC Zope DA's implicit addition
   of new built-ins via mxTools.

   This resolves a hard to track bug where the new built-in could
   potentially override the TAL python:exists function (in
   e.g. tal:condition="exists:something"). See this
   Products.CMFEditions fix for an example where the problem
   surfaced. This is a bug in TAL (it shouldn't give preference to
   built-ins over its own helpers), but we're providing the fix as
   easy work-around.

The complete list of changes is available on the mxODBC Zope DA
changelog page.

http://www.egenix.com/products/zope/mxODBCZopeDA/changelog.html

mxODBC Zope DA 2.2.0 was released on 2014-12-11. Please see the mxODBC
Zope DA 2.2.0 release announcement for all the new features we have
added.

http://www.egenix.com/company/news/eGenix-mxODBC-Zope-DA-2.2.0-GA.html

For the full list of features, please see the mxODBC Zope DA feature
list:

http://www.egenix.com/products/zope/mxODBCZopeDA/#Features




UPGRADING

Users are encouraged to upgrade to this latest mxODBC Plone/Zope
Database Adapter release to benefit from the new features and updated
ODBC driver support. We have taken special care not to introduce
backwards incompatible changes, making the upgrade experience as
smooth as possible.

For major and minor upgrade purchases, we will give out 20% discount
coupons going from mxODBC Zope DA 1.x to 2.2 and 50% coupons for
upgrades from mxODBC 2.x to 2.2. After upgrade, use of the original
license from which you upgraded is no longer permitted. Patch level
upgrades (e.g. 2.2.0 to 2.2.2) are always free of charge.

Please contact the eGenix.com Sales Team with your existing license
serials for details for an upgrade discount coupon.

If you want to try the new release before purchase, you can request
30-day evaluation licenses by visiting our web-site or writing to
sa...@egenix.com, stating your name (or the name of the company) and
the number of eval licenses that you need.

http://www.egenix.com/products/python/mxODBCZopeDA/#Evaluation


DOWNLOADS

Please visit the eGenix mxODBC Zope DA product page for downloads,
instructions on installation and documentation of the packages:

http://www.egenix.com/company/products/zope/mxODBCZopeDA/

If you want to try the package, please jump straight to the download
instructions:

http://www.egenix.com/products/zope/mxODBCZopeDA/#Download

Fully functional evaluation licenses for the mxODBC Zope DA are
available free of charge:

http://www.egenix.com/products/zope/mxODBCZopeDA/#Evaluation


SUPPORT

Commercial support for this product is available directly from
eGenix.com.

Please see the support section of our website for details:

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Steven D'Aprano

On Wednesday 10 June 2015 14:48, Devin Jeanpierre wrote:

[...]
> and literal_eval is not a great idea.
> 
> * the common serializer (repr) does not output a canonical form, and
>   can serialize things in a way that they can't be deserialized

For literals, the canonical form is that understood by Python. I'm pretty 
sure that these have been stable since the days of Python 1.0, and will 
remain so pretty much forever:

ints: 12345
floats: 1.2345
strings: "spam"
None
True
False
lists, tuples, dicts and sets containing the above

There may be a few differences between Python 2 and 3, e.g. no set literal 
in Python 2, but in general the Python syntax is well-known and understood 
by anyone programming in Python.

> * there is no schema
> * there is no well understood migration story for when the data you
>   load and store changes

literal_eval is not a serialisation format itself. It is a primitive 
operation usable when serialising. E.g. you might write out a simple Unix-
style rc file of key:value pairs:

length=23.45
width=10.95
landscape=False

split on "=" and call literal_eval on the value.

This is a perfectly reasonable light-weight solution for simple 
serialisation needs.
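
A minimal sketch of that rc-file idea, assuming one key=value pair per line. The function name parse_rc and the handling of blank lines and "#" comment lines are illustrative assumptions, not part of the post above:

```python
import ast

RC_TEXT = """\
length=23.45
width=10.95
landscape=False
"""

def parse_rc(text):
    """Parse key=value lines, evaluating each value as a Python literal."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comments are skipped.
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, raw_value = line.partition("=")
        settings[key.strip()] = ast.literal_eval(raw_value.strip())
    return settings

settings = parse_rc(RC_TEXT)
```

Each value comes back as a real Python object (two floats and a bool here) rather than as a string.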

> * it is not usable from other programming languages

That's okay, we're not writing in other programming languages :-)

> * it encourages the use of eval when literal_eval becomes inconvenient
>   or insufficient

I don't think so. I think that people who make the effort to import ast and 
call ast.literal_eval are fully aware of the dangers of eval and aren't 
silly enough to start using eval.

> * It is not particularly well specified or documented compared to the
>   alternatives.
> * The types you get back differ in python 2 vs 3

Doesn't matter. The types you *write* are different in Python 2 vs 3, so of
course the types you get back are too.

> For most apps, the alternatives are better. Irmen's serpent library is
> strictly better on every front, for example. (Except potentially
> security, who knows.)

Beyond simple needs, like rc files, literal_eval is not sufficient. You 
can't use it to deserialise arbitrary objects. That might be a feature, but 
if you need something more powerful than basic ints, floats, strings and a 
few others, literal_eval will not be powerful enough.
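
To illustrate the limit, here is a small sketch (Python 3 assumed; variable names are illustrative): ast.literal_eval accepts literals and containers of literals, and raises ValueError for anything else, such as a function call.

```python
import ast

# Plain literals and containers of literals are accepted.
value = ast.literal_eval("[12345, 1.2345, 'spam', None, True, {'key': (1, 2)}]")

# Anything that is not a literal, such as a function call, is rejected.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```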

I think we are in violent agreement :-)

-- 
Steve


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Neal Becker

Chris Warrick wrote:

> On Tue, Jun 9, 2015 at 8:08 PM, Neal Becker  wrote:
>> One of the most annoying problems with py2/3 interoperability is that the
>> pickle formats are not compatible.  There must be many who, like myself,
>> often use pickle format for data storage.
>>
>> It certainly would be a big help if py3 could read/write py2 pickle
>> format. You know, backward compatibility?
> 
> Don’t use pickle. It’s unsafe — it executes arbitrary code, which
> means someone can give you a pickle file that will delete all your
> files or eat your cat.
> 
> Instead, use a safe format that has no ability to execute code, like
> JSON. It will also work with other programming languages and
> environments if you ever need to talk to anyone else.
> 
> But, FYI: there is backwards compatibility if you ask for it, in the
> form of protocol versions. That’s all you should know — again, don’t
> use pickle.
> 

I believe a good native serialization system is essential for any modern 
programming language.  If pickle isn't it, we need something else that can 
serialize all language objects.  Or, are you saying, it's impossible to do 
this safely?


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Robert Kern


On 2015-06-10 12:04, Neal Becker wrote:
> Chris Warrick wrote:
>
>> On Tue, Jun 9, 2015 at 8:08 PM, Neal Becker wrote:
>>> One of the most annoying problems with py2/3 interoperability is that the
>>> pickle formats are not compatible.  There must be many who, like myself,
>>> often use pickle format for data storage.
>>>
>>> It certainly would be a big help if py3 could read/write py2 pickle
>>> format. You know, backward compatibility?
>>
>> Don’t use pickle. It’s unsafe — it executes arbitrary code, which
>> means someone can give you a pickle file that will delete all your
>> files or eat your cat.
>>
>> Instead, use a safe format that has no ability to execute code, like
>> JSON. It will also work with other programming languages and
>> environments if you ever need to talk to anyone else.
>>
>> But, FYI: there is backwards compatibility if you ask for it, in the
>> form of protocol versions. That’s all you should know — again, don’t
>> use pickle.
>
> I believe a good native serialization system is essential for any modern
> programming language.  If pickle isn't it, we need something else that can
> serialize all language objects.  Or, are you saying, it's impossible to do
> this safely?


By the very nature of the stated problem: serializing all language objects. 
Being able to construct any object, including instances of arbitrary classes, 
means that arbitrary code can be executed. All I have to do is make a pickle 
file for an object that claims that its constructor is shutil.rmtree().
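
A harmless sketch of that mechanism, assuming Python 3. The class name Innocent is hypothetical, and the built-in eval with a trivial expression stands in for a destructive callable such as shutil.rmtree:

```python
import pickle

class Innocent:
    """Looks like plain data, but controls how it is reconstructed."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair; loads() calls it.
        # eval("6 * 7") is a harmless stand-in for something destructive.
        return (eval, ("6 * 7",))

payload = pickle.dumps(Innocent())

# Unpickling runs the stored callable instead of rebuilding an Innocent.
result = pickle.loads(payload)
```

Here result is the value returned by the stored callable, not an Innocent instance, which is why loading a pickle from an untrusted source amounts to running untrusted code.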


This is fine in some use cases (e.g. wire format for otherwise-secured 
communication between two endpoints under your complete control), but it is 
worrying in others, like your use case of data storage (and presumably sharing).


Python 2/3 is also the least of your compatibility worries there. Refactor a 
class to a different module, or did one of your third-party dependencies do 
this? Poof! Your pickle files no longer work.


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Chris Angelico

On Wed, Jun 10, 2015 at 9:04 PM, Neal Becker wrote:
> I believe a good native serialization system is essential for any modern
> programming language.  If pickle isn't it, we need something else that can
> serialize all language objects.  Or, are you saying, it's impossible to do
> this safely?

It is indeed impossible to serialize _all_ objects safely. How do you,
for instance, serialize an open socket?
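
A small sketch of what happens in practice, assuming Python 3. An open temporary file is used here as a stand-in for the socket, since both are handles to operating-system resources; pickle refuses it with a TypeError:

```python
import pickle
import tempfile

with tempfile.TemporaryFile() as handle:
    try:
        pickle.dumps(handle)
        error = None
    except TypeError as exc:
        # pickle has no way to capture the OS-level state behind the handle.
        error = exc
```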

ChrisA

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Marko Rauhamaa

Robert Kern :

> By the very nature of the stated problem: serializing all language
> objects. Being able to construct any object, including instances of
> arbitrary classes, means that arbitrary code can be executed. All I
> have to do is make a pickle file for an object that claims that its
> constructor is shutil.rmtree().

You can't serialize/migrate arbitrary objects. Consider open TCP
connections, open files and other objects that extend outside the Python
VM. Also objects hold references to each other, leading to a huge
reference mesh.

For example:

   a.buddy = b
   b.buddy = a
   with open("a", "wb") as f: f.write(serialize(a))
   with open("b", "wb") as f: f.write(serialize(b))

   with open("a", "rb") as f: aa = deserialize(f.read())
   with open("b", "rb") as f: bb = deserialize(f.read())
   assert aa.buddy is bb


Marko

Python NBSP DWIM

2015-06-10 Thread Tim Chase

str.split() doesn't seem to respect non-breaking space:

  Python 3.4.2 (default, Oct  8 2014, 10:45:20) 
  [GCC 4.9.1] on linux
  Type "help", "copyright", "credits" or "license" for more information.
  >>> print(repr("hello\N{NO-BREAK SPACE}world".split()))
  ['hello', 'world']

What's the purpose of a non-breaking space if it's treated like a
space for breaking/splitting purposes? :-)

Is this a bug?

-tkc





Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread random832

On Wed, Jun 10, 2015, at 08:08, Marko Rauhamaa wrote:
> You can't serialize/migrate arbitrary objects. Consider open TCP
> connections, open files and other objects that extend outside the Python
> VM. Also objects hold references to each other, leading to a huge
> reference mesh.
> 
> For example:
> 
>    a.buddy = b
>    b.buddy = a
>    with open("a", "wb") as f: f.write(serialize(a))
>    with open("b", "wb") as f: f.write(serialize(b))
> 
>    with open("a", "rb") as f: aa = deserialize(f.read())
>    with open("b", "rb") as f: bb = deserialize(f.read())
>    assert aa.buddy is bb

Of course, if you serialize a single dict with e.g. {'a': a, 'b': b},
you can expect (with advanced serialization tools, anyway  - I suspect
JSON will just make a mess or exceed maximum recursion depth)
result['a'].buddy is result['b']
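
A sketch of that comparison, assuming Python 3 and using plain dicts in place of objects with a buddy attribute (an assumption made so the data is also a candidate for JSON). With the standard library's json module and its default settings, the cycle is reported as a ValueError rather than recursing until the stack overflows:

```python
import json
import pickle

a = {"name": "a"}
b = {"name": "b"}
a["buddy"] = b
b["buddy"] = a

# pickle tracks objects it has already seen within a single dump,
# so the mutual references survive the round trip.
result = pickle.loads(pickle.dumps({"a": a, "b": b}))
same_graph = (result["a"]["buddy"] is result["b"]
              and result["b"]["buddy"] is result["a"])

# json has no notion of shared or cyclic references.
try:
    json.dumps({"a": a, "b": b})
    json_error = None
except ValueError as exc:
    json_error = str(exc)
```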

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Robert Kern

On 2015-06-10 13:08, Marko Rauhamaa wrote:
> Robert Kern :
>
>> By the very nature of the stated problem: serializing all language
>> objects. Being able to construct any object, including instances of
>> arbitrary classes, means that arbitrary code can be executed. All I
>> have to do is make a pickle file for an object that claims that its
>> constructor is shutil.rmtree().
>
> You can't serialize/migrate arbitrary objects. Consider open TCP
> connections, open files and other objects that extend outside the Python
> VM.

Yes, yes, but that's really beside the point. Yes, there are some objects for 
which it doesn't even make sense to serialize. But my point is that even in this 
slightly smaller set of objects that *can* be serialized (and pickle currently 
does serialize), being able to serialize all of them entails arbitrary code 
execution to deserialize them. To allow people to write their own types that can 
be serialized, you have to let them specify arbitrary callables that will do the 
reconstruction. If you whitelist the possible reconstruction callables, you have 
greatly restricted the types that can participate in the serialization system.

> Also objects hold references to each other, leading to a huge
> reference mesh.
>
> For example:
>
>    a.buddy = b
>    b.buddy = a
>    with open("a", "wb") as f: f.write(serialize(a))
>    with open("b", "wb") as f: f.write(serialize(b))
>
>    with open("a", "rb") as f: aa = deserialize(f.read())
>    with open("b", "rb") as f: bb = deserialize(f.read())
>    assert aa.buddy is bb

Yeah, no one expects that to work. For example, if I deserialize the same string 
twice, you can't expect to get identical returned objects (as in, 
"deserialize(pickle) is deserialize(pickle)"). However, pickle does correctly 
handle fairly arbitrary reference graphs within the context of a single 
serialization, which is the most that can be asked of a serialization system. 
That isn't really a concern here.

>>> class A(object):
...     pass
...
>>> a = A()
>>> b = A()
>>> a.buddy = b
>>> b.buddy = a
>>> data = [a, b]
>>> data[0].buddy is data[1]
True
>>> data[1].buddy is data[0]
True
>>> import cPickle
>>> unpickled = cPickle.loads(cPickle.dumps(data))
>>> unpickled[0].buddy is unpickled[1]
True
>>> unpickled[1].buddy is unpickled[0]
True

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: Python NBSP DWIM

2015-06-10 Thread Mark Lawrence


On 10/06/2015 14:28, Tim Chase wrote:
> str.split() doesn't seem to respect non-breaking space:
>
>    Python 3.4.2 (default, Oct  8 2014, 10:45:20)
>    [GCC 4.9.1] on linux
>    Type "help", "copyright", "credits" or "license" for more information.
>    >>> print(repr("hello\N{NO-BREAK SPACE}world".split()))
>    ['hello', 'world']
>
> What's the purpose of a non-breaking space if it's treated like a
> space for breaking/splitting purposes? :-)
>
> Is this a bug?
>
> -tkc

IMNSHO yes.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


Did the 3.4.4 docs get published early?

2015-06-10 Thread Nicholas Chammas

For example, here is a "New in version 3.4.4" method:

https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future

However, the latest release appears to be 3.4.3:

https://www.python.org/downloads/

Is this normal, or did the 3.4.4 docs somehow get published early by
mistake?

Nick

Re: Python NBSP DWIM

2015-06-10 Thread Skip Montanaro

On Wed, Jun 10, 2015 at 8:28 AM, Tim Chase wrote:
> Is this a bug?

Looks like it's been reported a few times with slightly different context:

https://bugs.python.org/issue6537
https://bugs.python.org/issue16623
https://bugs.python.org/issue20491
https://bugs.python.org/issue1390608

The couple times it's come up in the context of str.split, it's been
rejected, since the purpose of that method is to split words.

Skip

Re: Did the 3.4.4 docs get published early?

2015-06-10 Thread Mark Lawrence


On 10/06/2015 15:11, Nicholas Chammas wrote:
> For example, here is a "New in version 3.4.4" method:
>
> https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future
>
> However, the latest release appears to be 3.4.3:
>
> https://www.python.org/downloads/
>
> Is this normal, or did the 3.4.4 docs somehow get published early by
> mistake?
>
> Nick

I suspect that this is due to a trainee pilot being let loose too early
with the time machine.  Failing that, finger trouble when doing a commit.
Thinking about it, more likely the former than the latter :)


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


Re: Python NBSP DWIM

2015-06-10 Thread Laura Creighton

In a message of Wed, 10 Jun 2015 09:28:24 -0500, Skip Montanaro writes:
>On Wed, Jun 10, 2015 at 8:28 AM, Tim Chase
> wrote:
>> Is this a bug?
>
>Looks like it's been reported a few times with slightly different context:
>
>https://bugs.python.org/issue6537
>https://bugs.python.org/issue16623
>https://bugs.python.org/issue20491
>https://bugs.python.org/issue1390608
>
>The couple times it's come up in the context of str.split, it's been
>rejected, since the purpose of that method is to split words.
>
>Skip

In these unicode days, this thinking may need to be revisited.  There
are many languages where whitespace does not separate words -- either
words aren't separated, or in Vietnamese, spaces separate syllables,
so entire words have spaces in them.

Laura


Re: Did the 3.4.4 docs get published early?

2015-06-10 Thread Zachary Ware

On Jun 10, 2015 9:41 AM, "Mark Lawrence"  wrote:
>
> On 10/06/2015 15:11, Nicholas Chammas wrote:
>>
>> For example, here is a "New in version 3.4.4" method:
>>
>> https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future
>>
>> However, the latest release appears to be 3.4.3:
>>
>> https://www.python.org/downloads/
>>
>> Is this normal, or did the 3.4.4 docs somehow get published early by
>> mistake?
>>
>> Nick
>>
>
> I suspect that this is due to a trainee pilot being let loose too early
with the time machine.  Failing that finger trouble when doing a commit.
Thinking about it more likely the former rather than the latter :)
>

Actually, it's just that the online docs reflect the latest documentation
from a particular branch of the source repository, since the docs are
continually improving and have no backwards compatibility constraints. This
does mean we sometimes have anomalies like this, though.

If you truly need the docs as they were at the time of release, they are
available, though I don't have a link handy on my phone.

--
Zach
(On a phone)

Re: Python NBSP DWIM

2015-06-10 Thread random832

On Wed, Jun 10, 2015, at 11:03, Laura Creighton wrote:
> In these unicode days, this thinking may need to be revisited.  There
> are many languages where whitespace does not separate words -- either
> words aren't separated, or in Vietnamese, spaces separate syllables,
> so entire words have spaces in them.

Text wrapping for CJK scripts is another topic that might be worth
addressing in textwrap - words aren't space-separated, but there are
still rules about where you can place a line break. Generally these are
centered around preventing punctuation marks from being orphaned rather
than any attempt to algorithmically find word boundaries.

For the process called "Oikomi", while messing with kerning is not
strictly possible for monospaced text, it might be worthwhile in general
to have "preferred" and "maximum" line widths as parameters for
textwrap.

http://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_languages

Re: Memory error while using pandas dataframe

2015-06-10 Thread Jason Swails

On Mon, Jun 8, 2015 at 3:32 AM, naren wrote:

> Memory Error while working with pandas dataframe.
>
> Description of Environment Windows 7 python 3.4.2 32-bit version pandas
> 0.16.0
>
> We are running into the error described below. Any help provided will be
> sincerely appreciated.
>
> We are able to read a 300MB Csv file into a dataframe using the read_csv
> function. While working with the dataframe we ran into memory error. We
> used the pd.Concat function to concatenate two dataframes. So we decided to
> use chunksize for lazy reading. Chunking returns an object of type
> TextFileReader.
>
>
> http://pandas.pydata.org/pandas-docs/stable/io.html#iterating-through-files-chunk-by-chunk
>
> We are able to iterate over this object once as a debugging measure. The
> iterator gets exhausted after iterating once. So we are not able to convert
> the TextFileReader object back into a dataframe, using the pd.concat
> function.
>
It looks like you already figured out what your problem is.  The
TextFileReader is exhausted (i.e., at EOF), so you end up getting None from
it.

What is your question?  You want to be able to iterate through
TextFileReader again?

If so, try rewinding the file object that you passed to pd.read_csv.  If you
saved a reference to the file object, just call "seek(0)" on that object.
If you didn't, access it as the "f" attribute on the TextFileReader object
and call "seek(0)" on that instead.

That might work.  Otherwise, you should be more specific with your question
and provide a full segment of code that is as small as possible to
reproduce the error you're seeing.
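
The exhaust-then-rewind behaviour can be sketched without pandas. In the example below, the standard library's csv.reader over an in-memory io.StringIO stands in for the TextFileReader and its underlying file; that substitution is an assumption made to keep the sketch dependency-free, and pandas' own objects may differ in detail:

```python
import csv
import io

buffer = io.StringIO("x,y\n1,2\n3,4\n")
reader = csv.reader(buffer)

first_pass = list(reader)    # consumes every row
second_pass = list(reader)   # the iterator is now exhausted

buffer.seek(0)               # rewind the underlying file object
third_pass = list(csv.reader(buffer))
```

After the rewind, a fresh reader over the same file object yields the rows again.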

HTH,
Jason

Re: Did the 3.4.4 docs get published early?

2015-06-10 Thread Jonas Wielicki

On 10.06.2015 17:05, Zachary Ware wrote:
> On Jun 10, 2015 9:41 AM, "Mark Lawrence"  wrote:
>>
>> On 10/06/2015 15:11, Nicholas Chammas wrote:
>>>
>>> For example, here is a "New in version 3.4.4" method:
>>>
>>> https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future
>>>
>>> However, the latest release appears to be 3.4.3:
>>>
>>> https://www.python.org/downloads/
>>>
>>> Is this normal, or did the 3.4.4 docs somehow get published early by
>>> mistake?
>>>
>>> Nick
>>>
>>
>> I suspect that this is due to a trainee pilot being let loose too early
> with the time machine.  Failing that finger trouble when doing a commit.
> Thinking about it more likely the former rather than the latter :)
>>
> 
> Actually, it's just that the online docs reflect the latest documentation
> from a particular branch of the source repository, since the docs are
> continually improving and have no backwards compatibility constraints. This
> does mean we sometimes have anomalies like this, though.
> 
> If you truly need the docs as they were at the time of release, they are
> available, though I don't have a link handy on my phone.

You can (with javascript enabled) select the version for the docs at the
top right of the page. Also, just replacing the version number in the
URL works for the python 3 series (use 3.X even for python 3.0), even
farther back than the drop down menu allows.

regards,
jwi




How to find number of whole weeks between dates?

2015-06-10 Thread Sebastian M Cheung via Python-list

Say from April to May 2014, the whole weeks would be those of the 7th, 14th,
21st and 28th of April and the 5th, 12th and 19th of May, so I am expecting
7 whole weeks in total.

Re: Testing random

2015-06-10 Thread Thomas 'PointedEars' Lahn

Jussi Piitulainen wrote:

> Thomas 'PointedEars' Lahn writes:
>> Jussi Piitulainen wrote:
>>> Thomas 'PointedEars' Lahn writes:
>>>> 8 3 6 3 1 2 6 8 2 1 6.
>>> 
>>> There are more than four hundred thousand ways to get those numbers
>>> in some order.
>>> 
>>> (11! / 2! / 2! / 2! / 3! / 2! = 415800)
>>
>> Fallacy.  Order is irrelevant here.
> 
> You need to consider every sequence that leads to the observed counts.

No, you need _not_, because – I repeat – the probability of getting a
sequence of length n from a set of 9 numbers, where the probability of
picking each number is evenly distributed, is (1∕9)ⁿ [(1/9)^n, or 1/9 to the
nth, for those who do not see it because of lack of Unicode support on their
system].  *Always.*  *No matter* which numbers are in it.  *No matter* in
which order they are.  AISB, order is *irrelevant* here.  *Completely.*

This is _not_ a lottery box; you put the ball with the number on it *back 
into the box* after you have drawn it and before you draw a new one.
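
For reference, both numbers in this exchange can be computed directly. The sketch below (variable names are illustrative) gives the probability of one specific sequence of 11 draws with replacement from 9 equally likely numbers, and the number of distinct orderings that share the same counts. Which of the two is relevant depends on whether the event of interest is the exact sequence or the observed counts.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

draws = [8, 3, 6, 3, 1, 2, 6, 8, 2, 1, 6]

# Probability of this exact sequence, drawing with replacement from 9 numbers.
p_sequence = Fraction(1, 9) ** len(draws)

# Number of distinct orderings with the same counts (multinomial coefficient):
# 11! / (2! * 2! * 3! * 2! * 2!)
orderings = factorial(len(draws))
for count in Counter(draws).values():
    orderings //= factorial(count)

# Probability of observing these counts in any order.
p_counts = orderings * p_sequence
```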

> One of those sequences occurred. You don't know which.

You do not have to.

> When tossing herrings […]

Herrings are the key word here, indeed, and they are deep dark red.

> Code follows. Incidentally, I'm not feeling smart here. 

Good.  Because you should not feel smart in any way after ignoring all my 
explanations.

> [nonsense]

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.

Re: Python NBSP DWIM

2015-06-10 Thread Steven D'Aprano

On Thu, 11 Jun 2015 12:28 am, Skip Montanaro wrote:

> On Wed, Jun 10, 2015 at 8:28 AM, Tim Chase
>  wrote:
>> Is this a bug?
> 
> Looks like it's been reported a few times with slightly different context:
> 
> https://bugs.python.org/issue6537
> https://bugs.python.org/issue16623
> https://bugs.python.org/issue20491
> https://bugs.python.org/issue1390608
> 
> The couple times it's come up in the context of str.split, it's been
> rejected, since the purpose of that method is to split words.

That reasoning is ... strange. The whole point of the NBSP is specifically
*not* to split on it. If you wanted it to split, you would use a regular
space.

(Oh, and for the record, there are at least two non-breaking spaces in
Unicode, U+00A0 "NO-BREAK SPACE" and U+202F "NARROW NO-BREAK SPACE".)

http://www.unicode.org/charts/PDF/U0080.pdf
http://www.unicode.org/charts/PDF/U2000.pdf


Non-breaking spaces should be used for when you want to prevent
word-wrapping, and also for "open form" compound words:

http://grammar.ccc.commnet.edu/grammar/compounds.htm

textwrap should also treat NBSPs as non-spaces for the purposes of wrapping.

As a work-around, I think this should work:

- split the string on NBSPs;

- for each substring returned, split normally;

- merge sub-substrings.


def split(s):
    """Split on whitespace, except NBSP.

    >>> split(u'hello world spam\\u00A0eggs cheese')
    [u'hello', u'world', u'spam\\xa0eggs', u'cheese']

    """
    words = []
    NBSP = u'\u00A0'
    substrings = s.split(NBSP)
    for i, sub in enumerate(substrings):
        parts = sub.split()
        if i == 0:
            words.extend(parts)
        else:
            # Glue the first part back onto the word before the NBSP.
            words[-1] += NBSP + parts[0]
            words.extend(parts[1:])
    return words
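
An alternative sketch, not from the post above: assuming Python 3's re module, where \s matches Unicode whitespace including both no-break spaces, the split can be done in one pass with a character class that excludes U+00A0 and U+202F. The name split_keep_nbsp is hypothetical.

```python
import re

# Whitespace characters, except NO-BREAK SPACE and NARROW NO-BREAK SPACE.
_BREAKING_WS = re.compile(r"[^\S\u00A0\u202F]+")

def split_keep_nbsp(text):
    """Split on whitespace, but never on a no-break space."""
    # Drop the empty strings produced by leading or trailing whitespace.
    return [word for word in _BREAKING_WS.split(text) if word]

words = split_keep_nbsp("hello world spam\u00A0eggs cheese")
```

This variant also copes with a no-break space that sits next to ordinary whitespace, or at the start or end of the string.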


-- 
Steven


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Irmen de Jong

On 10-6-2015 11:36, Steven D'Aprano wrote:
>> For most apps, the alternatives are better. Irmen's serpent library is
>> strictly better on every front, for example. (Except potentially
>> security, who knows.)
> 
> Beyond simple needs, like rc files, literal_eval is not sufficient. You 
> can't use it to deserialise arbitrary objects. That might be a feature, but 
> if you need something more powerful than basic ints, floats, strings and a 
> few others, literal_eval will not be powerful enough.

Just to have this off my chest:

I guess that "serialization format" is not the most correct term for what
serpent does (or in general, for the literal expressions that literal_eval
accepts). Serpent doesn't strive to (de)serialize everything perfectly. It is
meant as a pythonic data transfer format.

You can do this by explicitly mapping your application's object model to and
from the wire data format, or do it in a more pythonic way (IMO) and let python
take care of most of it automatically. Serpent is smart (I hope) about a number
of non-primitive types. If needed, use its hooks to teach it about types it
doesn't readily recognize.
Yes, it does force you to reduce the arbitrary types you want to process to the
set of types that are accepted in a python literal expression. Thankfully lists,
sets, tuples and dicts are also among them.

Raison d'être for serpent is that I was looking for a safe pythonic alternative
for pickle, and with fewer limitations than JSON. I chose to use
ast.literal_eval from the standard library to do the "deserialization" for me,
and so only had to build some code to "serialize" object trees into python
literal expressions :)

Regarding security: I simply trust the docstring of ast.literal_eval here;
"Safely evaluate an expression node or a string containing a Python expression. 
[...]"
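
To illustrate the general idea (this is not serpent's own code, just a minimal standard-library sketch), a round trip through repr and ast.literal_eval might look like the following. The sample `data` dict is invented for the example.

```python
import ast

# Only types that have a Python literal form: dict, tuple, list, set,
# str, int, float, bool, None.
data = {'name': 'serpent', 'tags': ('a', 'b'), 'sizes': [1, 2.5],
        'flags': {True, None}}

text = repr(data)                    # "serialize" to a literal expression
restored = ast.literal_eval(text)    # "deserialize" without calling eval()
print(restored == data)

# Anything that is not a plain literal is rejected instead of evaluated.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as exc:
    print('rejected:', exc)
```

The container types survive the trip (the tuple comes back as a tuple, the set as a set), which is the "fewer limitations than JSON" point above.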

Irmen


Re: How to find number of whole weeks between dates?

2015-06-10 Thread Marko Rauhamaa

Sebastian M Cheung :

> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and
> May would be 5th, 12th and 19th. So expecting 7 whole weeks in total

This program gives you the number of days between two dates given in the
YYYY-MM-DD format:


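#!/usr/bin/env python3

import sys

def gregorian_day_count(isodate):
    year, month, day = map(int, isodate.split('-'))
    # Count years and months from March, so that February comes last.
    a, b = divmod(12 * year + month - 3, 12)
    # Days in whole years with leap-day corrections (the shifts approximate
    # a // 100 and a // 400), plus days in the preceding months, plus the
    # day of the month.
    return (a * 365 + (a >> 2) - (a * 1311 >> 17) + (a * 1311 >> 19)
            + (31306 * b + 722 >> 10) + day)

def main():
    print(gregorian_day_count(sys.argv[2]) - gregorian_day_count(sys.argv[1]))

if __name__ == '__main__':
    main()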


Divide the number by 7 and you have your answer.


Marko

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Marko Rauhamaa

Marko Rauhamaa :

> This program gives you the number of days between two dates given in the
> YYYY-MM-DD format:

Sorry, couldn't resist.

It still does work, though.


Marko

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Ian Kelly

On Wed, Jun 10, 2015 at 11:05 AM, Sebastian M Cheung via Python-list
 wrote:
> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and  May 
> would be 5th, 12th and 19th. So expecting 7 whole weeks in total

>>> from datetime import date
>>> d1 = date(2014, 4, 7)
>>> d2 = date(2014, 5, 19)
>>> d2 - d1
datetime.timedelta(42)
>>> (d2 - d1).days
42
>>> (d2 - d1).days // 7
6

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Laura Creighton

In a message of Wed, 10 Jun 2015 20:38:59 +0300, Marko Rauhamaa writes:
>Divide the number by 7 and you have your answer.
>

I am not sure that is what he wants -- If he gives us a start of Tuesday the
9th of June 2015 (yesterday) and an end of Thursday the 25th of June, that's
16 days.  But there is only one Monday-Friday week in there, the 14th-19th.

So if the OP wants an answer of 1 for such data, he may be interested in
the python calendar module https://docs.python.org/2/library/calendar.html
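
To make the distinction concrete, here is a rough sketch that uses datetime rather than the calendar module. It counts only the weeks that fit entirely inside the range. `whole_weeks` is a hypothetical helper, and treating a "week" as Monday through Sunday is an assumption; the OP may mean something else.

```python
from datetime import date, timedelta

def whole_weeks(start, end):
    """Count Monday-to-Sunday weeks lying entirely within [start, end]."""
    # Move forward to the first Monday on or after the start date.
    first_monday = start + timedelta(days=(7 - start.weekday()) % 7)
    # Days available from that Monday through the end date, inclusive.
    span = (end - first_monday).days + 1
    return max(span // 7, 0)

print(whole_weeks(date(2015, 6, 9), date(2015, 6, 25)))   # 1
print(whole_weeks(date(2014, 4, 1), date(2014, 5, 31)))   # 7
```

The first call is the Tuesday-to-Thursday example above; the second is the April to May 2014 range from the original question.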

Laura


Re: Testing random

2015-06-10 Thread sohcahtoa82

On Wednesday, June 10, 2015 at 10:06:49 AM UTC-7, Thomas 'PointedEars' Lahn 
wrote:
> Jussi Piitulainen wrote:
> 
> > Thomas 'PointedEars' Lahn writes:
> >> Jussi Piitulainen wrote:
> >>> Thomas 'PointedEars' Lahn writes:
> >>>>   8 3 6 3 1 2 6 8 2 1 6.
> >>> 
> >>> There are more than four hundred thousand ways to get those numbers
> >>> in some order.
> >>> 
> >>> (11! / 2! / 2! / 2! / 3! / 2! = 415800)
> >>
> >> Fallacy.  Order is irrelevant here.
> > 
> > You need to consider every sequence that leads to the observed counts.
> 
> No, you need _not_, because – I repeat – the probability of getting a 
> sequence of length n from a set of 9 numbers whereas the probability of 
> picking a number is evenly distributed, is (1∕9)ⁿ [(1/9)^n, or 1/9 to the 
> nth, for those who do not see it because of lack of Unicode support at their 
> system].  *Always.*  *No matter* which numbers are in it.  *No matter* in 
> which order they are.  AISB, order is *irrelevant* here.  *Completely.*
> 
> This is _not_ a lottery box; you put the ball with the number on it *back 
> into the box* after you have drawn it and before you draw a new one.
> 
> > One of those sequences occurred. You don't know which.
> 
> You do not have to.
> 
> > When tossing herrings […]
> 
> Herrings are the key word here, indeed, and they are deep dark red.
> 
> > Code follows. Incidentally, I'm not feeling smart here. 
> 
> Good.  Because you should not feel smart in any way after ignoring all my 
> explanations.
> 
> > [nonsense]
> 
> -- 
> PointedEars
> 
> Twitter: @PointedEars2
> Please do not cc me. / Bitte keine Kopien per E-Mail.

To put it another way, let's simplify the problem.  You're rolling a pair of 
dice.  What are the chances that you'll see a pair of 3s?

Look at the list of possible roll combinations:

1 1   1 2   1 3   1 4   1 5   1 6
2 1   2 2   2 3   2 4   2 5   2 6
3 1   3 2   3 3   3 4   3 5   3 6
4 1   4 2   4 3   4 4   4 5   4 6
5 1   5 2   5 3   5 4   5 5   5 6
6 1   6 2   6 3   6 4   6 5   6 6

36 possible combinations.  Only one of them has a pair of 3s.  The answer is 
1/36.

What about the chances of seeing 2 1?

Here's where I think you two are having such a huge disagreement.  Does order 
matter?  It depends what you're pulling random numbers out for.

The odds of seeing 2 1 are also only 1/36.  But if order doesn't matter in your 
application, then 1 2 is equivalent.  The odds of getting 2 1 OR 1 2 is 2/36, 
or 1/18.

But whether order matters or not, the chances of getting a pair of threes in 
two rolls is ALWAYS 1/36.

If this gets expanded to grabbing 10 random numbers between 1 and 9, then the 
chances of getting a sequence of 10 ones is still only (1/9)^10, *regardless of 
whether or not order matters*.  There are 9^10 possible sequences, but only 
*one* of these is all ones.

If order matters, then 7385941745 also has a (1/9)^10 chance of occurring.  
Just because it isn't a memorable sequence doesn't give it a higher chance of 
happening.

If order DOESN'T matter, then 1344557789 would be equivalent, and the odds are 
higher.
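
The two-dice numbers above can be checked by brute force. This is only a small sketch; `probability` is a helper name invented here, and it simply counts how many of the 36 equally likely ordered rolls satisfy a condition.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # all 36 ordered outcomes

def probability(event):
    """Fraction of the equally likely ordered rolls that satisfy `event`."""
    hits = sum(1 for roll in rolls if event(roll))
    return Fraction(hits, len(rolls))

print(probability(lambda r: r == (3, 3)))           # 1/36
print(probability(lambda r: r == (2, 1)))           # 1/36
print(probability(lambda r: sorted(r) == [1, 2]))   # 1/18
```

The last line is the "order doesn't matter" case: two ordered rolls map to the same unordered result, so their probabilities add.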

Re: Testing random

2015-06-10 Thread Ian Kelly

On Wed, Jun 10, 2015 at 11:03 AM, Thomas 'PointedEars' Lahn
 wrote:
> Jussi Piitulainen wrote:
>
>> Thomas 'PointedEars' Lahn writes:
>>> Jussi Piitulainen wrote:
>>>> Thomas 'PointedEars' Lahn writes:
>>>>>   8 3 6 3 1 2 6 8 2 1 6.
>>>>
>>>> There are more than four hundred thousand ways to get those numbers
>>>> in some order.
>>>>
>>>> (11! / 2! / 2! / 2! / 3! / 2! = 415800)
>>>
>>> Fallacy.  Order is irrelevant here.
>>
>> You need to consider every sequence that leads to the observed counts.
>
> No, you need _not_, because – I repeat – the probability of getting a
> sequence of length n from a set of 9 numbers whereas the probability of
> picking a number is evenly distributed, is (1∕9)ⁿ [(1/9)^n, or 1/9 to the
> nth, for those who do not see it because of lack of Unicode support at their
> system].  *Always.*  *No matter* which numbers are in it.  *No matter* in
> which order they are.  AISB, order is *irrelevant* here.  *Completely.*

Order is relevant because, for instance, there are n differently
ordered sequences that contain n-1 1s and one 2, while there is only
one sequence that contains n 1s. While each of those individual
sequences are indeed equiprobable, the overall probability of getting
a sequence that contains n-1 1s and one 2 is n times the probability
of getting a sequence that contains n 1s.

The context of this whole thread is about the probability of getting a
sequence where every number occurs at least once. The order that they
occur in doesn't matter, but the number of possible permutations does,
because every one of those permutations is a distinct sequence
contributing an equal amount to the total overall probability.

The probabilities of 123456789 and 111111111 are equal. The probability
of a sequence containing all nine numbers and a sequence containing
only 1s are *not* equal.
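
For anyone who wants to check the counting, here is a small sketch using exact fractions. The helper names `arrangements` and `p_all_values` are invented for this example; the second one uses the standard inclusion-exclusion formula for "every one of k values appears at least once in n uniform draws", and needs Python 3.8+ for math.comb.

```python
from collections import Counter
from fractions import Fraction
from math import comb, factorial

def arrangements(sequence):
    """Number of distinct orderings of the items in `sequence`."""
    total = factorial(len(sequence))
    for count in Counter(sequence).values():
        total //= factorial(count)
    return total

def p_all_values(n, k=9):
    """Probability that n uniform draws from k values hit every value."""
    return sum(Fraction((-1) ** j * comb(k, j) * (k - j) ** n, k ** n)
               for j in range(k + 1))

print(arrangements([8, 3, 6, 3, 1, 2, 6, 8, 2, 1, 6]))   # 415800
print(arrangements([1] * 8 + [2]))                        # 9
print(arrangements([1] * 9))                              # 1
print(p_all_values(9))                                    # 4480/4782969
```

Each individual sequence of length 9 has probability (1/9)^9; the last figure is 9! such sequences added together, which is why it is so much larger than the probability of all 1s.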

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Joel Goldstick

On Wed, Jun 10, 2015 at 1:50 PM, Laura Creighton  wrote:
> In a message of Wed, 10 Jun 2015 20:38:59 +0300, Marko Rauhamaa writes:
>>Divide the number by 7 and you have your answer.
>>
>
> I am not sure that is what he wants -- If he gives us a start of Tuesday the
> 9th of June 2015 (yesterday) and an end of Thursday the 25th of June, that's
> 16 days.  But there is only one Monday-Friday week in there, the 14th-19th.
>
> So if the OP wants an answer of 1 for such data, he may be interested in
> the python calendar module https://docs.python.org/2/library/calendar.html
>
> Laura

Find the number of weeks with the above method, then

>>> import datetime
>>> end_date = datetime.datetime(2012, 3, 23)  # whatever your end date is
>>> if end_date.weekday() != 5:
...     number_of_complete_weeks -= 1

weekday() returns 0 for Monday, so 5 is Saturday.


-- 
Joel Goldstick
http://joelgoldstick.com

Re: Testing random

2015-06-10 Thread random832

On Wed, Jun 10, 2015, at 13:03, Thomas 'PointedEars' Lahn wrote:
> This is _not_ a lottery box; you put the ball with the number on it *back 
> into the box* after you have drawn it and before you draw a new one.

Yes, but getting a 2, putting it back, and getting a 1 is just as good
as getting a 1, putting it back, and getting a 2, so you have to add the
probability of those cases together to get the probability of getting at
least one 1 and at least one 2.

Re: Did the 3.4.4 docs get published early?

2015-06-10 Thread Nicholas Chammas

> Also, just replacing the version number in the URL works for the python 3
> series (use 3.X even for python 3.0), even farther back than the drop down
> menu allows.

This does not help in this case:

https://docs.python.org/3.4/library/asyncio-task.html#asyncio.ensure_future

Also, you cannot select the docs for a maintenance release, like 3.4.3.

Anyway, it’s not a big deal as long as significant changes are tagged
appropriately with notes like “New in version NNN”, which they are.

Ideally, the docs would only show the latest changes for released versions
of Python, but since some changes (like the one I linked to) are introduced
in maintenance versions, it’s probably hard to separate them out into
separate branches.

Nick

On Wed, Jun 10, 2015 at 10:11 AM Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:

> For example, here is a "New in version 3.4.4" method:
>
> https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future
>
> However, the latest release appears to be 3.4.3:
>
> https://www.python.org/downloads/
>
> Is this normal, or did the 3.4.4 docs somehow get published early by
> mistake?
>
> Nick
>
>

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Mark Lawrence


On 10/06/2015 18:50, Laura Creighton wrote:
> In a message of Wed, 10 Jun 2015 20:38:59 +0300, Marko Rauhamaa writes:
>> Divide the number by 7 and you have your answer.
>
> I am not sure that is what he wants -- If he gives us a start of Tuesday the
> 9th of June 2015 (yesterday) and an end of Thursday the 25th of June, that's
> 16 days.  But there is only one Monday-Friday week in there, the 14th-19th.
>
> So if the OP wants an answer of 1 for such data, he may be interested in
> the python calendar module https://docs.python.org/2/library/calendar.html
>
> Laura




For those who wish to move into the 21st century the link is 
https://docs.python.org/3/library/calendar.html


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


Re: Testing random

2015-06-10 Thread Jussi Piitulainen

sohcahto...@gmail.com writes:

[...]

> Here's where I think you two are having such a huge disagreement.
> Does order matter?  It depends what you're pulling random numbers out
> for.
>
> The odds of seeing 2 1 are also only 1/36.  But if order doesn't
> matter in your application, then 1 2 is equivalent.  The odds of
> getting 2 1 OR 1 2 is 2/36, or 1/18.

[...]

I'm not sure what Thomas 'PointedEars' Lahn is talking about. It seems
to be something else than what others have been discussing.

Others have been discussing a record of the number of times that each
possible outcome came up in a sequence of random numbers. There is no
other record of the sequence. The number of drawings is much larger than
the number of possible outcomes. The subject line refers to testing
whether the record of counts is compatible with the drawings being
random in the usual sense: independent, with uniform distribution.

Someone pointed out that some numbers may not have occurred at all - I
think a piece of code needed modification - and so people have commented
on the probability of this happening ... and whether it depends on the
number of drawings.

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Sebastian M Cheung via Python-list

On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote:
> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and  May 
> would be 5th, 12th and 19th. So expecting 7 whole weeks in total

What I mean is: given two dates, I want to find the WHOLE weeks between them.
So if, for the 2014 calendar, the function has two inputs (the 4th and 5th
months), then it counts the weeks of the 7th, 14th, 21st and 28th of April
(with the week of 28th April carrying into May) and the weeks of the 5th, 12th
and 19th of May, giving a total of 7 whole weeks. The week of 26th May is not a
whole week and will not be counted.

Hope that's clear.

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Mark Lawrence


On 10/06/2015 21:11, Sebastian M Cheung via Python-list wrote:
> On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote:
>> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and  May
>> would be 5th, 12th and 19th. So expecting 7 whole weeks in total
>
> What I mean is given two dates I want to find WHOLE weeks, so if given the 2014
> calendar and function has two inputs (4th and 5th month) then 7th, 14th, 21st
> and 28th from April with 28th April week carrying into May, and then 5th, 12th
> and 19th May to give total of 7 whole weeks, because 26th May is not a whole
> week and will not be counted.
>
> Hope thats clear.



If you'd be kind enough to show the code that you've written and the 
precise reasons(s) that it doesn't work then we'll be delighted to point 
you in the right direction.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


Re: How to find number of whole weeks between dates?

2015-06-10 Thread Ian Kelly

On Wed, Jun 10, 2015 at 2:11 PM, Sebastian M Cheung via Python-list
 wrote:
> On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote:
>> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and  May 
>> would be 5th, 12th and 19th. So expecting 7 whole weeks in total
>
> What I mean is given two dates I want to find WHOLE weeks, so if given the 
> 2014 calendar and function has two inputs (4th and 5th month) then 7th, 14th, 
> 21st and 28th from April with 28th April week carrying into May, and then 
> 5th, 12th and 19th May to give total of 7 whole weeks, because 26th May is 
> not a whole week and will not be counted.

So the two "dates" being passed are actually months? The calendar
module already suggested should be useful for this.
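
If the inputs really are months, one possible sketch with the calendar module is below. `whole_weeks_in_months` is a hypothetical helper, and it assumes a whole week runs Monday to Sunday and must fall entirely between the first day of the first month and the last day of the last month.

```python
import calendar
from datetime import date

def whole_weeks_in_months(year, first_month, last_month):
    """Return the Mondays of all whole weeks inside the given month range."""
    cal = calendar.Calendar(firstweekday=calendar.MONDAY)
    start = date(year, first_month, 1)
    end = date(year, last_month, calendar.monthrange(year, last_month)[1])
    mondays = set()
    for month in range(first_month, last_month + 1):
        # Each week is a list of seven dates and may spill into the
        # neighbouring months, so check both ends against the range.
        for week in cal.monthdatescalendar(year, month):
            if start <= week[0] and week[-1] <= end:
                mondays.add(week[0])
    return sorted(mondays)

weeks = whole_weeks_in_months(2014, 4, 5)
print(len(weeks))                     # 7
print(weeks[0], weeks[-1])            # 2014-04-07 2014-05-19
```

The week of 28th April is counted once even though it shows up in both months, and the week of 26th May is dropped because it runs into June.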

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre

FWIW most of the objections below also apply to JSON, so this doesn't
just have to be about repr/literal_eval. I'm definitely a huge
proponent of widespread use of something like protocol buffers, both
for production code and personal hacky projects.

On Wed, Jun 10, 2015 at 2:36 AM, Steven D'Aprano
 wrote:
> On Wednesday 10 June 2015 14:48, Devin Jeanpierre wrote:
>
> [...]
>> and literal_eval is not a great idea.
>>
>> * the common serializer (repr) does not output a canonical form, and
>>   can serialize things in a way that they can't be deserialized
>
> For literals, the canonical form is that understood by Python. I'm pretty
> sure that these have been stable since the days of Python 1.0, and will
> remain so pretty much forever:

The problem is that there are two different ways repr might write out
a dict equal to {'a': 1, 'b': 2}. This can make tests brittle -- e.g.
it's why doctest fails badly at examples involving dictionaries. Text
format protocol buffers output everything sorted, so that you can do
textual diffs for compatibility tests and such.

At work, one thing we do in places is mock out services using "golden"
expected protobuf responses, so that you can test that the server
returns exactly that, and test what the client does with that,
separately. These are checked into perforce in text format.

>> * there is no schema
>> * there is no well understood migration story for when the data you
>>   load and store changes
>
> literal_eval is not a serialisation format itself. It is a primitive
> operation usable when serialising. E.g. you might write out a simple Unix-
> style rc file of key:value pairs:
>
-snip-
>
> split on "=" and call literal_eval on the value.
>
> This is a perfectly reasonable light-weight solution for simple
> serialisation needs.

I could spend a bunch of time writing yet another config file format,
or I could use text format protocol buffers, YAML, or TOML and call it
a day.
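
For concreteness, the light-weight approach quoted above (split on "=" and call literal_eval on the value) might look roughly like this. It is a sketch only: `parse_rc`, the sample keys, and the comment and blank-line handling are all assumptions added for the example.

```python
import ast

RC_TEXT = """\
# comment lines and blank lines are skipped
name = 'example'
retries = 3
ratio = 0.75

paths = ['/tmp', '/var/tmp']
"""

def parse_rc(text):
    """Parse key = value lines, evaluating each value as a Python literal."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition('=')
        settings[key.strip()] = ast.literal_eval(value.strip())
    return settings

print(parse_rc(RC_TEXT))
```

It is short, but it also shows the cost being argued about here: the quoting rules, comment syntax and error handling are all things this format would have to define for itself.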

>> * it encourages the use of eval when literal_eval becomes inconvenient
>>   or insufficient
>
> I don't think so. I think that people who make the effort to import ast and
> call ast.literal_eval are fully aware of the dangers of eval and aren't
> silly enough to start using eval.

The problem is when you have your config file format using python
literals, and another programmer wants to deal with it and doesn't
look at your codebase, and things like that. When transferring data,
this can happen a lot, since you are often not the user of the data
you wrote, and you can't control how others consume it. They might use
eval even if you didn't mean for them to. For example, in JavaScript,
this was once a common problem for services exposing JSON, and it
still happens even now.

>> * It is not particularly well specified or documented compared to the
>>   alternatives.
>> * The types you get back differ in python 2 vs 3
>
> Doesn't matter. The type you *write* are different in Python 2 vs 3, so of
> course you do.

In a shared 2/3 codebase, if I write bytes I expect to get bytes, and
if I write unicode I expect to get unicode. (There is a third category
of thing, which should be bytes on 2.x and string on 3.x, but it's
probably best to handle that outside of the deserializer). If you
thread it through repr and literal_eval using different versions for
each, unicode in python 3 becomes bytes in python 2, and vice versa.
So it makes migrating to Python 3 even harder.

>> For most apps, the alternatives are better. Irmen's serpent library is
>> strictly better on every front, for example. (Except potentially
>> security, who knows.)
>
> Beyond simple needs, like rc files, literal_eval is not sufficient. You
> can't use it to deserialise arbitrary objects. That might be a feature, but
> if you need something more powerful than basic ints, floats, strings and a
> few others, literal_eval will not be powerful enough.

No, it is powerful enough. After all, JSON has the same limitations.
Protobuf only adds enums and structs to JSON's types, and it's
potentially the most-used serialization format in the world by
operations per second.

Serialization libraries/formats usually need handholding to serialize
complex Python objects into simple serializable types. [Except pickle,
and that's the very reason it's insecure (per previous discussion in
thread.)]

-- Devin

Re: Permission to Showcase the Python Program

2015-06-10 Thread Terry Reedy

On 6/9/2015 10:19 AM, Leslie Bush wrote:

> I’m having trouble reaching an actual human at your organization
> as my emails get bounced back.

If you sent email to
p...@python.org 
it should not have bounced, as according to
https://mail.python.org/mailman/listinfo/python-legal-sig
that is the 'legal email address'.

But I do not think it really matters for your purpose.

> My name is Leslie Bush and I am the Intellectual Property Coordinator
> for The Great Courses. We produce non-credit, college-level educational
> programs on DVD and electronic formats in a lecture series.
> The lectures are recorded and then sold to the general public for-profit.
> As a didactic tool for enhancing the programs, we include in the lectures
> visual elements illustrating works of art, people, events, locations, etc

I am a python core developer, a member of PSF, but otherwise have no
official position.  As a parent, I am familiar with The Great Courses,
having used 2 for home schooling.  They were very helpful.

> We are currently producing a course with Dr. John Keyser, Professor of
> Computer Science at the University of North Carolina Chapel Hill, entitled
> “Computer Science for Everyone: Programming Concepts and Exercises”
> The professor would like to use your Python program to showcase in the course.

Great.  Most of us would consider it silly for him not to.  Most of us 
would also urge that he use Python 3 rather than Python 2, if he is not 
already.

> I am writing to see if we may have permission to do so. If permission
> is granted, we will have a copy of our license agreement sent over to
> your company’s authorizer for review and a signature.

I am not a lawyer, but am 99.990% (and I am not exaggerating) sure that 
the Python license already gives you the permission you need. Our 
documentation page is at

https://docs.python.org/3/
Clicking the History and License of Python link takes you to
https://docs.python.org/3/license.html
Skip to Terms and conditions for accessing or otherwise using Python
and read PSF LICENSE AGREEMENT FOR PYTHON 3.4.3.  It is about as liberal 
as can be.  It was written by the PSF lawyer and is intended to be a 
blanket grant of permission, so that the PSF will not need to employ an 
'authorizer' to review and sign permissions. This is standard for 
open-source software.

Companies routinely use Python and write their own public and 
proprietary code.  People write and sell books about Python or that 
reference Python.  Universities teach courses that include Python. 
People post videos about Python.  We WANT people to do all of these 
things.  They all do it without further agreements and signatures beyond 
what is openly published.

I suspect that some companies paranoid about the possibility of a lawsuit
may have their own lawyer review the license agreement to make sure it
means what it seems to say and that their use falls within the license.
You are free to do the same if worried.

--
Terry Jan Reedy


Re: Did the 3.4.4 docs get published early?

2015-06-10 Thread Terry Reedy


On 6/10/2015 10:11 AM, Nicholas Chammas wrote:

> For example, here is a "New in version 3.4.4" method:
>
> https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future
>
> However, the latest release appears to be 3.4.3:
>
> https://www.python.org/downloads/
>
> Is this normal, or did the 3.4.4 docs somehow get published early by
> mistake?


The online x.y docs reflect the x.y branch in the repository.  New 
features are not normally added in an x.y.z maintenance release, which 
is normally bugfixes only.  However, asyncio is a new module in 3.4 and 
marked as 'provisional', which means subject to change during the 3.4 
series of releases. Idle is also exceptional in getting uncategorized 
changes in maintenance releases, so each x.y.z release needs a copy of 
the Idle doc chapter that is both up-to-date and frozen as of x.y.z. 
Making this happen is still a work-in-progress.


--
Terry Jan Reedy



Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Terry Reedy


On 6/10/2015 6:10 PM, Devin Jeanpierre wrote:

> The problem is that there are two different ways repr might write out
> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle

Not if one compares objects rather than string representations of
objects.  I am strongly of the view that code and tests should be
written to directly compare objects as much as possible.

> it's why doctest fails badly at examples involving dictionaries.


or sets or addresses or object ids or locale-dependent strings or random 
numbers or values dependent on random numbers.
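
A tiny illustration of the difference, assuming Python 3.7 or later (where a dict's repr follows insertion order); the two dicts are made up for the example.

```python
import ast

first = {'a': 1, 'b': 2}
second = {'b': 2, 'a': 1}

print(first == second)               # True: same keys, same values
print(repr(first) == repr(second))   # False: the text differs

# A test can parse the text back and compare objects instead of strings.
print(ast.literal_eval("{'b': 2, 'a': 1}") == first)   # True
```

The same applies to sets, where the printed order is not something a test should depend on.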


--
Terry Jan Reedy


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Gregory Ewing


Robert Kern wrote:
> To allow people to write their own types that can be serialized,
> you have to let them specify arbitrary callables that will do the
> reconstruction. If you whitelist the possible reconstruction callables,
> you have greatly restricted the types that can participate in the
> serialization system.


If whitelisting a type is the *only* thing you need to
do to make it serialisable, I think that comes close
enough to the stated goal of being able to "serialise
all [potentially serialisable] language objects".

Having to be explicit about which types are deserialisable
is probably a good thing anyway. It gives you an opportunity
to specify the mapping between the external format and
class names, so that your serialised data doesn't contain
assumptions about implementation details of your program.
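
As a rough sketch of what such an explicit whitelist could look like (everything here is hypothetical: the `Point` class, the `REGISTRY` mapping, the tag names and the JSON layout are invented for illustration, not taken from any existing library):

```python
import json

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

# External tag -> class. Only the types listed here can be reconstructed,
# and the tag is independent of the class's module or Python name.
REGISTRY = {'point': Point}
TAGS = {cls: tag for tag, cls in REGISTRY.items()}

def dumps(obj):
    """Write a whitelisted object as tagged JSON."""
    return json.dumps({'type': TAGS[type(obj)], 'fields': vars(obj)},
                      sort_keys=True)

def loads(text):
    """Rebuild an object, refusing any tag that is not in the registry."""
    record = json.loads(text)
    cls = REGISTRY[record['type']]    # KeyError for non-whitelisted tags
    return cls(**record['fields'])

wire = dumps(Point(1, 2))
print(wire)
print(loads(wire) == Point(1, 2))
```

Renaming or moving `Point` only means updating `REGISTRY`; data already written with the 'point' tag stays readable.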

--
Greg

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre

On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy  wrote:
> On 6/10/2015 6:10 PM, Devin Jeanpierre wrote:
>
>> The problem is that there are two different ways repr might write out
>> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle
>
>
> Not if one compares objects rather than string representations of objects.
> I am strongly of the view that code and tests should be written to directly
> compare objects as much as possible.

For serialization formats that always output the same string for the
same data (like text format protos), there is no practical difference
between the two, except that if you're comparing text, you can easily
supply a diff to update one to match the other.

-- Devin

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre

On Wed, Jun 10, 2015 at 4:39 PM, Devin Jeanpierre
 wrote:
> On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy  wrote:
>> On 6/10/2015 6:10 PM, Devin Jeanpierre wrote:
>>
>>> The problem is that there are two different ways repr might write out
>>> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle
>>
>>
>> Not if one compares objects rather than string representations of objects.
>> I am strongly of the view that code and tests should be written to directly
>> compare objects as much as possible.
>
> For serialization formats that always output the same string for the
> same data (like text format protos), there is no practical difference
> between the two, except that if you're comparing text, you can easily
> supply a diff to update one to match the other.

Ugh, there's also the fiddly difference between what goes in and what
you read. A serialized data structure might contain lots of data that
is ignored by the deserializer (in protobuf), or it might contain data
which can't be loaded by the deserializer or produces weird /
incorrect results. Being able to inspect and test the serialized data
separately from the deserialized data is useful in that regard, so
that you know where the failure lies, but it's sort of fuzzy.

Some examples of where this crops up: pickles after you've moved a
class, JSON encoders that try to be clever and output invalid JSON,
protocol buffers with unexpected fields.

Overall, though, the diff thing is probably the bigger reason everyone
wants to do this sort of thing with serialized data. If you do it
right and are principled about it, I don't see a problem with it.

-- Devin

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Terry Reedy


On 6/10/2015 7:39 PM, Devin Jeanpierre wrote:
> On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy  wrote:
>> On 6/10/2015 6:10 PM, Devin Jeanpierre wrote:
>>> The problem is that there are two different ways repr might write out
>>> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle

You commented about *tests*

>> Not if one compares objects rather than string representations of objects.
>> I am strongly of the view that code and tests should be written to directly
>> compare objects as much as possible.

I responded about *tests*

> For serialization formats that always output the same string for the
> same data (like text format protos), there is no practical difference
> between the two, except that if you're comparing text, you can easily
> supply a diff to update one to match the other.


Serialization is a different issue.

--
Terry Jan Reedy


Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Chris Angelico

On Thu, Jun 11, 2015 at 8:10 AM, Devin Jeanpierre
 wrote:
> The problem is that there are two different ways repr might write out
> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle -- e.g.
> it's why doctest fails badly at examples involving dictionaries. Text
> format protocol buffers output everything sorted, so that you can do
> textual diffs for compatibility tests and such.

With Python's JSON module [1], you can pass sort_keys=True to
stipulate that the keys be lexically ordered, which should make the
output "canonical". Pike's Standards.JSON.encode() [2] can take a flag
value to canonicalize the output, which currently has the same effect
(sort mappings by their indices). I did a quick check for Ruby and
didn't find anything in its standard library JSON module [3], but knowing
Ruby, it'll be available somewhere in a gem. A web search for 'perl
json' brought up a CPAN link [4] that has a canonicalize option for
sorting by keys. So that's three out of four definite, one uncertain,
where it's pretty easy to ensure that you get byte-for-byte identical
output from a JSON encoder.

Even though failing doctests are a separate problem, it's useful to
have canonical output. Your diffs get less noisy, for instance.
Coupled with a human-readability flag (eg "indent=4" in Python,
"Standards.JSON.HUMAN_READABLE" in Pike) that splits the result over
multiple lines, it can make a pretty easy to diff file. Definitely
worth doing... and definitely worth using a JSON encoder rather than
repr().
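
As a rough sketch of the Python case (the sample dicts and the helper
name canonical() are made up for illustration), sort_keys=True gives the
same text for equal dicts regardless of how they were built, and
indent=4 spreads it over several lines so it diffs cleanly:

```python
import json

a = {"a": 1, "b": 2}
b = {"b": 2, "a": 1}   # same data, built in a different order


def canonical(obj):
    # sort_keys=True fixes the key order; indent=4 puts each item on its
    # own line, which keeps textual diffs small.
    return json.dumps(obj, sort_keys=True, indent=4)


canonical_a = canonical(a)
canonical_b = canonical(b)
print(canonical_a)
```

Both calls should produce identical strings, which is what makes a
byte-for-byte comparison meaningful.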

ChrisA

[1] https://docs.python.org/3/library/json.html#json.dump
[2] http://pike.lysator.liu.se/generated/manual/modref/ex/predef_3A_3A/Standards/JSON.html
[3] http://ruby-doc.org/stdlib-2.0.0/libdoc/json/rdoc/JSON.html
[4] http://search.cpan.org/~makamaka/JSON-2.90/lib/JSON.pm#PERL_-%3E_JSON

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre

On Wed, Jun 10, 2015 at 4:46 PM, Terry Reedy  wrote:
> On 6/10/2015 7:39 PM, Devin Jeanpierre wrote:
>>
>> On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy  wrote:
>>>
>>> On 6/10/2015 6:10 PM, Devin Jeanpierre wrote:
>>>
>>>> The problem is that there are two different ways repr might write out
>>>> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle
>
>
> You commented about *tests*
>
>>> Not if one compares objects rather than string representations of
>>> objects.
>>> I am strongly of the view that code and tests should be written to
>>> directly
>>> compare objects as much as possible.
>
>
> I responded about *tests*
>
>> For serialization formats that always output the same string for the
>> same data (like text format protos), there is no practical difference
>> between the two, except that if you're comparing text, you can easily
>> supply a diff to update one to match the other.
>
>
> Serialization is a different issue.

Yes, tests of code that uses serialization (caching, RPCs, etc.).

I mentioned above a sort of test that divides tests of a client and
server along RPC boundaries by providing fake queries and responses,
and testing that those are the queries and responses given by the
client and server. This way you don't need to actually start the
client and server to test them both and their interactions. This is
one example, there are other uses, but they go along the same lines.
For example, one can also imagine testing that a serialized structure
is identical across version changes, so that it's guaranteed to be
forwards/backwards compatible. It is not enough to test that the
deserialized form is, because it might differ substantially, as long
as the communicated serialized structure is the same.
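
A minimal sketch of that kind of test, assuming JSON as the wire format
purely for illustration (build_query, serialize and the GOLDEN string
are hypothetical, not from any real RPC library): the client's query is
serialized canonically and compared with a recorded one, without
starting a server.

```python
import json

# Hypothetical recorded query, as it would appear on the wire.
GOLDEN = '{"id": 7, "method": "lookup", "params": ["spam"]}'


def build_query(name):
    # Stand-in for the client code under test.
    return {"method": "lookup", "params": [name], "id": 7}


def serialize(message):
    # Sorted keys make the output deterministic, so it can be compared
    # as text against the recorded query.
    return json.dumps(message, sort_keys=True)


serialized = serialize(build_query("spam"))
assert serialized == GOLDEN
```

The same recorded string can then be fed to the server side as a fake
query, so both halves are tested against one agreed serialized form.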

-- Devin

Re: Python NBSP DWIM

2015-06-10 Thread Chris Angelico

On Thu, Jun 11, 2015 at 3:11 AM, Steven D'Aprano 
wrote:
> (Oh, and for the record, there are at least two non-breaking spaces in
> Unicode, U+00A0 "NO-BREAK SPACE" and U+202F "NARROW NO-BREAK SPACE".)
>
> http://www.unicode.org/charts/PDF/U0080.pdf
> http://www.unicode.org/charts/PDF/U2000.pdf

And U+FEFF "ZERO WIDTH NO-BREAK SPACE", notable because it's also used as
the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've been
fighting with VLC Media Player over the font it uses for subtitles; for
some bizarre reason, that font represents U+FEFF not with zero pixels of
emptiness, but with a box containing the letters "ZWN" "BSP" on two lines.
Yeah, because that totally takes up zero width and looks like blank space.

ChrisA

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread random832

On Wed, Jun 10, 2015, at 19:30, Gregory Ewing wrote:
> If whitelisting a type is the *only* thing you need to
> do to make it serialisable, I think that comes close
> enough to the stated goal of being able to "serialise
> all [potentially serialisable] language objects".

IMO the serialization framework should handle this by providing your own
way to look them up (almost but not entirely unlike providing your own
globals table to eval) rather than by having a whitelist.
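
For what it's worth, pickle already has a hook along these lines:
pickle.Unpickler.find_class() can be overridden. The sketch below is
only one way to read the suggestion; LookupUnpickler, loads() and the
table are made-up names, and the table maps (module, name) pairs to the
objects the caller is willing to hand out.

```python
import io
import pickle
from collections import OrderedDict


class LookupUnpickler(pickle.Unpickler):
    """Resolve globals only through a caller-supplied table."""

    def __init__(self, file, table):
        super().__init__(file)
        self._table = table   # maps (module, name) -> object

    def find_class(self, module, name):
        # Called by the unpickler whenever the stream names a global.
        try:
            return self._table[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(
                "global %s.%s is not in the lookup table" % (module, name))


def loads(data, table):
    return LookupUnpickler(io.BytesIO(data), table).load()


table = {("collections", "OrderedDict"): OrderedDict}
restored = loads(pickle.dumps(OrderedDict(a=1)), table)
```

Anything not in the table, such as a pickled reference to a builtin
function, is refused with UnpicklingError instead of being imported.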

Re: Python NBSP DWIM

2015-06-10 Thread random832

On Wed, Jun 10, 2015, at 20:09, Chris Angelico wrote:
> And U+FEFF "ZERO WIDTH NO-BREAK SPACE", notable because it's also used as
> the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've
> been
> fighting with VLC Media Player over the font it uses for subtitles; for
> some bizarre reason, that font represents U+FEFF not with zero pixels of
> emptiness, but with a box containing the letters "ZWN" "BSP" on two
> lines.
> Yeah, because that totally takes up zero width and looks like blank
> space.

As I understand it, the proper behavior is that the ZWNBSP that is the
byte order mark shall never appear in an in-memory representation of the
first line of a BOM-encoded file, or any other line of the concatenation
of two BOM-encoded files, but should "vanish" when the file is opened
and first read from. So it shouldn't be showing up in your subtitles
regardless of its rendering behavior.
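
In Python terms, a small sketch of the difference (the sample text is
made up): the "utf-8-sig" codec drops one leading BOM on decode, while
plain "utf-8" keeps it as U+FEFF. Concatenating two BOM-prefixed byte
strings leaves the second BOM in the middle of the text either way.

```python
import codecs

raw = codecs.BOM_UTF8 + "subtitle line".encode("utf-8")

with_bom = raw.decode("utf-8")          # BOM survives as a U+FEFF character
without_bom = raw.decode("utf-8-sig")   # one leading BOM is dropped

# Two BOM-prefixed chunks glued together: only the first BOM is removed.
joined = (raw + raw).decode("utf-8-sig")

print(repr(with_bom[0]))
print(repr(joined))
```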

The real world, needless to say, isn't so nice.

IIRC there's also a font in MS windows that uses various glyphs which
are zero-width, but are not blank, to represent ZWJ, ZWNJ, RLM, and LRM.
Good for seeing what is happening, bad for actually rendering text
that's intended to contain these characters. Though there's another
argument that ideally a rendering engine should not render any such
glyph unless something like "visible controls" has been selected (the
real world, again, isn't so nice, which is why most symbols intended for
visible control style rendering have their own distinct code points
rather than using those of the control characters they represent).

Re: Python NBSP DWIM

2015-06-10 Thread Chris Angelico

On Thu, Jun 11, 2015 at 11:02 AM,  wrote:
>
> On Wed, Jun 10, 2015, at 20:09, Chris Angelico wrote:
> > And U+FEFF "ZERO WIDTH NO-BREAK SPACE", notable because it's also used as
> > the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've
> > been
> > fighting with VLC Media Player over the font it uses for subtitles; for
> > some bizarre reason, that font represents U+FEFF not with zero pixels of
> > emptiness, but with a box containing the letters "ZWN" "BSP" on two
> > lines.
> > Yeah, because that totally takes up zero width and looks like blank
> > space.
>
> As I understand it, the proper behavior is that the ZWNBSP that is the
> byte order mark shall never appear in an in-memory representation of the
> first line of a BOM-encoded file, or any other line of the concatenation
> of two BOM-encoded files, but should "vanish" when the file is opened
> and first read from. So it shouldn't be showing up in your subtitles
> regardless of its rendering behavior.

It's a perfectly valid character for other purposes; it's coming up in
the middle of pieces of text, which should be 100% legal. No, it's a
font problem.

ChrisA

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Sebastian M Cheung via Python-list

Yes, just whole weeks given any two months. I did look into the calendar module 
but couldn't find specifically what I need.

Re: Python NBSP DWIM

2015-06-10 Thread Steven D'Aprano

On Thu, 11 Jun 2015 10:09 am, Chris Angelico wrote:

> On Thu, Jun 11, 2015 at 3:11 AM, Steven D'Aprano 
> wrote:
>> (Oh, and for the record, there are at least two non-breaking spaces in
>> Unicode, U+00A0 "NO-BREAK SPACE" and U+202F "NARROW NO-BREAK SPACE".)
>>
>> http://www.unicode.org/charts/PDF/U0080.pdf
>> http://www.unicode.org/charts/PDF/U2000.pdf
> 
> And U+FEFF "ZERO WIDTH NO-BREAK SPACE", 

No, despite the name, that is not a space character, it is a formatting
character. Due to Unicode's stability policy, the name is stuck forever,
but it should not be treated as a space character:

py> unicodedata.category(' ')
'Zs'
py> unicodedata.category('\u00A0')  # NBSP
'Zs'
py> unicodedata.category('\uFEFF')  # ZWNBSP
'Cf'

Ideally, outside of the BOM, you should never come across a ZWNBSP. You
should use U+2060 WORD JOINER instead. But if you do come across one
outside of the BOM, it should be treated as a legitimate non-space
character:

http://www.unicode.org/faq/utf_bom.html#bom6
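
A rough sketch of that advice in code. The helper name modernise() is
made up, and treating a single leading U+FEFF as a BOM is an assumption
about the input rather than something Unicode mandates for every string:

```python
import unicodedata

ZWNBSP = "\ufeff"
WORD_JOINER = "\u2060"


def modernise(text):
    # Assumption: one leading U+FEFF is a byte-order mark and can go.
    if text.startswith(ZWNBSP):
        text = text[1:]
    # Any U+FEFF left in the body is replaced by U+2060 WORD JOINER.
    return text.replace(ZWNBSP, WORD_JOINER)


sample = "\ufeffabc\ufeffdef"
cleaned = modernise(sample)

print(unicodedata.name(WORD_JOINER), unicodedata.category(WORD_JOINER))
```

WORD JOINER is in category 'Cf' just like ZWNBSP, so the replacement
does not turn a formatting character into a space.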

Although ZWNBSP is a "default ignorable" code point, I believe that the font
is well within its rights to show it with a visible glyph:

"Fonts can contain glyphs intended for visible display of 
default ignorable code points that would otherwise be 
rendered invisibly when not supported."

http://www.unicode.org/faq/unsup_char.html

> notable because it's also used as 
> the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've 
> been fighting with VLC Media Player over the font it uses for subtitles;
> for some bizarre reason, that font represents U+FEFF not with zero pixels
> of emptiness, but with a box containing the letters "ZWN" "BSP" on two
> lines. Yeah, because that totally takes up zero width and looks like blank
> space.

Why do the subtitles contain ZWNBSP in the first place? Surely they're not
English subtitles?

-- 
Steven


Re: Python NBSP DWIM

2015-06-10 Thread Chris Angelico

On Thu, Jun 11, 2015 at 12:26 PM, Steven D'Aprano  wrote:
> No, despite the name, that is not a space character, it is a formatting
> character. Due to Unicode's stability policy, the name is stuck forever,
> but it should not be treated as a space character:
>
> py> unicodedata.category(' ')
> 'Zs'
> py> unicodedata.category('\u00A0')  # NBSP
> 'Zs'
> py> unicodedata.category('\uFEFF')  # ZWNBSP
> 'Cf'
>
>
> Ideally, outside of the BOM, you should never come across a ZWNBSP. You
> should use U+2060 WORD JOINER instead. But if you do come across one
> outside of the BOM, it should be treated as a legitimate non-space
> character:
>
> http://www.unicode.org/faq/utf_bom.html#bom6
>
> Although ZWNBSP is a "default ignorable" code point, I believe that the font
> is well within its rights to show it with a visible glyph:
>
> "Fonts can contain glyphs intended for visible display of
> default ignorable code points that would otherwise be
> rendered invisibly when not supported."
>
> http://www.unicode.org/faq/unsup_char.html

Huh. Okay, my bad. I was under the impression that it was supposed to
take up no width, as the name implies, but stability trumps logic
sometimes. Learn something new every day.

>> notable because it's also used as
>> the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've
>> been fighting with VLC Media Player over the font it uses for subtitles;
>> for some bizarre reason, that font represents U+FEFF not with zero pixels
>> of emptiness, but with a box containing the letters "ZWN" "BSP" on two
>> lines. Yeah, because that totally takes up zero width and looks like blank
>> space.
>
> Why do the subtitles contain ZWNBSP in the first place? Surely they're not
> English subtitles?

No, they're not :) The character comes up in the Cantonese and
Japanese subs for Once Upon A December.

http://youtu.be/CEpcUeWP0bg
http://youtu.be/WFZAaHrHens

Possibly some others in the series as well. It may well be a fault in
the subtitles, but most programs I've seen don't show U+FEFF as a big
fat box.

ChrisA

Re: Python NBSP DWIM

2015-06-10 Thread random832

On Wed, Jun 10, 2015, at 23:05, Chris Angelico wrote:
> http://youtu.be/CEpcUeWP0bg
> http://youtu.be/WFZAaHrHens

An example of the actual subtitle text would be more useful than a
youtube link to the video, since we're unlikely to be able to see what
context the character appears in if our client doesn't show it. (I don't
think the default youtube player does). And you haven't even included a
time code.

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Michael Torrie

On 06/10/2015 02:11 PM, Sebastian M Cheung via Python-list wrote:
> On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote:
>> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and  May 
>> would be 5th, 12th and 19th. So expecting 7 whole weeks in total
> 
> What I mean is given two dates I want to find WHOLE weeks, so if given the 
> 2014 calendar and function has two inputs (4th and 5th month) then 7th, 14th, 
> 21st and 28th from April with 28th April week carrying into May, and then 
> 5th, 12th and 19th May to give total of 7 whole weeks, because 26th May is 
> not a whole week and will not be counted.
> 
> Hope thats clear.

I think Joel had the right idea.  First calculate the rough number of
weeks by taking the number of days between the date and divide by seven.
Then check to see what the start date's day of week is, and adjust the
rough week count down by one if it's not the first day of the week.  I'm
not sure if you have to check the end date's day of week or not.  I kind
of think checking the first one only is sufficient, but I could be
wrong.  You'll have to code it up and test it, which I assume you've
been doing up to this point, even though you haven't shared any code.

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Steven D'Aprano

On Thu, 11 Jun 2015 08:10 am, Devin Jeanpierre wrote:

[...]
>> For literals, the canonical form is that understood by Python. I'm pretty
>> sure that these have been stable since the days of Python 1.0, and will
>> remain so pretty much forever:
> 
> The problem is that there are two different ways repr might write out
> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle -- e.g.
> it's why doctest fails badly at examples involving dictionaries. 

Only if they are badly written.

Yes, dicts are *less convenient* for doctests, but if they fail, the blame
is on the author of the tests themselves, not doctest.

Unordered output is not a problem for dicts, because dicts also have
unordered *input*. It doesn't matter whether you input {'a':1,'b':2} or
{'b':2,'a':1}, you will get the same dict either way.
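
To illustrate, here is a sketch of two ways to write a dict-returning
doctest so that it does not depend on repr's key order. The function
tally() is a made-up example, and the DocTestParser/DocTestRunner
plumbing at the end is just one way to run the docstring in place.

```python
import doctest


def tally(words):
    """Count occurrences of each word.

    Comparing against a dict does not depend on the order repr uses:

    >>> tally(["b", "a", "b"]) == {"a": 1, "b": 2}
    True

    Sorting the items is another order-independent way to show the result:

    >>> sorted(tally(["b", "a", "b"]).items())
    [('a', 1), ('b', 2)]
    """
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


# Run the examples from the docstring and collect (failed, attempted).
parser = doctest.DocTestParser()
test = parser.get_doctest(tally.__doc__, {"tally": tally}, "tally", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```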

[...]
> I could spend a bunch of time writing yet another config file format,
> or I could use text format protocol buffers, YAML, or TOML and call it
> a day.

Writing a rc parser is so trivial that it's almost easier to just write it
than it is to look up the APIs for YAML or JSON, to say nothing of the
rigmarole of defining a protocol buffer config file, compiling it,
importing the module, and using that.

import ast
import collections

def read(configfile):
    config = collections.OrderedDict()
    with open(configfile) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith('#'): continue
            key, value = line.split("=", 1)
            key = key.rstrip()
            value = value.lstrip()
            # Values are Python literals: 3, 'spam', [1, 2], ...
            config[key] = ast.literal_eval(value)
    return config

That's a basic, *but acceptable*, rc parser written in literally under a
minute. At the risk of ending up with egg on my face, I reckon that it's so
simple and so obviously correct that I can tell it works correctly without
even testing it. (Famous last words, huh?)
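
For a feel of the file format this implies, here is a sketch using a
string-based variant of the parser above. The name parse_rc() and the
sample settings are invented for illustration, and skipping blank lines
is an addition of this sketch.

```python
import ast
import collections


def parse_rc(text):
    """Parse 'key = <Python literal>' lines into an ordered mapping."""
    config = collections.OrderedDict()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue   # blank line or comment
        key, value = line.split("=", 1)
        config[key.rstrip()] = ast.literal_eval(value.lstrip())
    return config


sample = """\
# hypothetical settings
name = 'spam'
retries = 3
ratio = 0.5
hosts = ['a.example', 'b.example']
"""

config = parse_rc(sample)
print(config["hosts"])
```

Note that strings have to be quoted: a bare value such as
"name = spam" is not a literal, so ast.literal_eval raises ValueError.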

Unlike any of the richer, more powerful serialisation formats like YAML,
JSON, or protocol buffers, it's not only human readable but human writable
too. By which I mean, while it is *possible* for a sufficiently motivated
person to write correctly formatted JSON, YAML or even XML, it's not really
something you would choose to do willingly. But Unix sys admins hand-edit
rc files every day.

But of course this also means it's less powerful and can deal with fewer types
of data. Power comes at a cost of complexity, and simplicity itself can be
a virtue. I wouldn't use JSON etc. for config files until I was sure that a
simpler INI or RC file wasn't sufficient for my needs.

Somehow I have drifted away from serialisation in general to specifically
config files... never mind.

[...]
> The problem is when you have your config file format using python
> literals, and another programmer wants to deal with it and doesn't
> look at your codebase, and things like that. When transferring data,
> this can happen a lot, since you are often not the user of the data
> you wrote, and you can't control how others consume it. 

Not only can I not control how they consume it, but I don't care how they
consume it :-)

I hear what you are saying, and I don't disagree with it. I'm just standing
up for simplicity as a virtue when appropriate. If I'm writing a script to
save a bunch of values to pass to another script after some human editing,
it's faster for me to just write out the key:value pairs than it is to
learn how to use protocol buffers, deal with a separate compilation step,
etc. It's actually easier to write out, and read in, the key:values than to
use the configparser module. If you don't need multiple sections, default
values, or variable interpolation, even configparser is overkill.

But if I'm swapping data with others, or if I have to use a richer set of
types or functionality, then naturally I'm going to need something more
powerful, preferably something standard so I don't have to document the
internal format, just say "use XML with this schema" or whatever.

> They might use 
> eval even if you didn't mean for them to. For example, in JavaScript,
> this was once a common problem for services exposing JSON, and it
> still happens even now.

 If they choose to use eval, *that's not my fault*. You can't stop
them from deserialising your data and then passing any and all strings to
eval, so why should I be expected to stop them from something similar?

[...]
>> Beyond simple needs, like rc files, literal_eval is not sufficient. You
>> can't use it to deserialise arbitrary objects. That might be a feature,
>> but if you need something more powerful than basic ints, floats, strings
>> and a few others, literal_eval will not be powerful enough.
> 
> No, it is powerful enough. After all, JSON has the same limitations.

In the sense that you can build arbitrary objects from a combination of a
few basic types, yes, literal_eval is "powerful enough" if you are prepared
to re-invent JSON, YAML, or protocol buffer.

But I'm not talking about re-inventing what already exists. If I want

Re: Python NBSP DWIM

2015-06-10 Thread Chris Angelico

On Thu, Jun 11, 2015 at 1:18 PM,   wrote:
> On Wed, Jun 10, 2015, at 23:05, Chris Angelico wrote:
>> http://youtu.be/CEpcUeWP0bg
>> http://youtu.be/WFZAaHrHens
>
> An example of the actual subtitle text would be more useful than a
> youtube link to the video, since we're unlikely to be able to see what
> context the character appears in if our client doesn't show it. (I don't
> think the default youtube player does). And you haven't even included a
> time code.

Unfortunately I can't really offer anything better, as the text I saw
was after a lot of processing (youtube-dl, then some other
post-processing), and I don't actually remember which file it was that
bugged me about this, now. But the subs/annotations (visible in the
default player if you turn on "Subtitles" down the bottom) do include
U+FEFF; in each case, it's on the very last line of the song, although
that's not where I remember it occurring.

ChrisA

Re: Python NBSP DWIM

2015-06-10 Thread Steven D'Aprano

On Thu, 11 Jun 2015 01:05 pm, Chris Angelico wrote:
[...]
>> Why do the subtitles contain ZWNBSP in the first place? Surely they're
>> not English subtitles?
> 
> No, they're not :) The character comes up in the Cantonese and
> Japanese subs for Once Upon A December.
> 
> http://youtu.be/CEpcUeWP0bg
> http://youtu.be/WFZAaHrHens
> 
> Possibly some others in the series as well. It may well be a fault in
> the subtitles, but most programs I've seen don't show U+FEFF as a big
> fat box.

I think that for backwards compatibility, applications (or fonts) are
permitted to treat U+FEFF as a zero-width invisible character, so perhaps
you can raise a feature request with VLC.



-- 
Steven


Re: How to find number of whole weeks between dates?

2015-06-10 Thread Chris Angelico

On Thu, Jun 11, 2015 at 1:19 PM, Michael Torrie  wrote:
> I think Joel had the right idea.  First calculate the rough number of
> weeks by taking the number of days between the date and divide by seven.
> Then check to see what the start date's day of week is, and adjust the
> rough week count down by one if it's not the first day of the week.  I'm
> not sure if you have to check the end date's day of week or not.  I kind
> of think checking the first one only is sufficient, but I could be
> wrong.  You'll have to code it up and test it, which I assume you've
> been doing up to this point, even though you haven't shared any code.

Alternatively, you could start by rounding the start date up to the
next week boundary, then round the end date down to the previous week
boundary, and then calculate from there. Something like this:

>>> import datetime
>>> start = datetime.date(2015, 1, 4)
>>> end = datetime.date(2015, 4, 2)
>>> start += datetime.timedelta(7-start.isoweekday())
>>> end -= datetime.timedelta(end.isoweekday() % 7)

Now both dates represent Sundays. If either already did, it hasn't been changed.

>>> (end - start).days//7
12

There are twelve complete Sunday-to-Sunday weeks (plus any loose days
either end) between the original dates.

Depending on your definition of "complete week", you may need to
adjust this code some.
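
As one possible adjustment, here is a sketch for Monday-to-Sunday weeks,
which is what the original April/May 2014 example seems to want. The
function name whole_weeks() and the inclusive end date are assumptions
of this sketch.

```python
import datetime


def whole_weeks(start, end):
    """Count Monday-to-Sunday weeks lying entirely within [start, end]."""
    # Move the start forward to a Monday (unchanged if it already is one).
    first_monday = start + datetime.timedelta((7 - start.weekday()) % 7)
    # Take the day after `end` and move it back to a Monday; that Monday
    # is the first day NOT covered by a complete week.
    after_end = end + datetime.timedelta(1)
    last_monday = after_end - datetime.timedelta(after_end.weekday())
    # Guard against ranges too short to contain any whole week.
    return max((last_monday - first_monday).days // 7, 0)


april_may = whole_weeks(datetime.date(2014, 4, 1), datetime.date(2014, 5, 31))
june = whole_weeks(datetime.date(2014, 6, 1), datetime.date(2014, 6, 30))
print(april_may, june)
```

For April to May 2014 this counts the weeks starting 7, 14, 21 and 28
April and 5, 12 and 19 May; the week starting 26 May runs into June and
is left out.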

ChrisA

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Ian Kelly

On Wed, Jun 10, 2015 at 8:01 PM, Sebastian M Cheung via Python-list
 wrote:
> yes just whole weeks given any two months, I did looked into calendar module 
> but couldn't find specifically what i need.

>>> import calendar
>>> cal = calendar.Calendar()
>>> cal.monthdays2calendar(2014, 4) + cal.monthdays2calendar(2014, 5)
[[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)], [(7, 0),
(8, 1), (9, 2), (10, 3), (11, 4), (12, 5), (13, 6)], [(14, 0), (15,
1), (16, 2), (17, 3), (18, 4), (19, 5), (20, 6)], [(21, 0), (22, 1),
(23, 2), (24, 3), (25, 4), (26, 5), (27, 6)], [(28, 0), (29, 1), (30,
2), (0, 3), (0, 4), (0, 5), (0, 6)], [(0, 0), (0, 1), (0, 2), (1, 3),
(2, 4), (3, 5), (4, 6)], [(5, 0), (6, 1), (7, 2), (8, 3), (9, 4), (10,
5), (11, 6)], [(12, 0), (13, 1), (14, 2), (15, 3), (16, 4), (17, 5),
(18, 6)], [(19, 0), (20, 1), (21, 2), (22, 3), (23, 4), (24, 5), (25,
6)], [(26, 0), (27, 1), (28, 2), (29, 3), (30, 4), (31, 5), (0, 6)]]

You just need to:

1) Trim the first and last weeks off since they contain invalid dates.
2) Merge the overlapping last week of April and first week of May.
3) Count the resulting number of weeks in the list.
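
The three steps above can be sketched roughly as follows. The function
name is made up, and the sketch assumes two consecutive months in the
same year and the default Monday-first calendar.

```python
import calendar


def whole_weeks_in_months(year, first_month, second_month):
    cal = calendar.Calendar()   # weeks start on Monday by default
    weeks = []
    for month in (first_month, second_month):
        for week in cal.monthdays2calendar(year, month):
            days = [day for day, _weekday in week]
            if weeks and 0 in weeks[-1] and days[0] == 0:
                # Step 2: the previous month's last week and this month's
                # first week are two halves of the same week; merge them.
                weeks[-1] = [a or b for a, b in zip(weeks[-1], days)]
            else:
                weeks.append(days)
    # Step 1: after merging, only the very first and very last weeks can
    # still contain the 0 padding, so dropping padded weeks trims them.
    complete = [week for week in weeks if 0 not in week]
    # Step 3: count what is left.
    return len(complete)


count = whole_weeks_in_months(2014, 4, 5)
print(count)
```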

Alternatively, the dateutil.rrule module could probably be used to do
this fairly easily, but it's a third-party module and not part of the
standard library.

https://labix.org/python-dateutil

Re: Python NBSP DWIM

2015-06-10 Thread Chris Angelico

On Thu, Jun 11, 2015 at 1:27 PM, Steven D'Aprano  wrote:
> On Thu, 11 Jun 2015 01:05 pm, Chris Angelico wrote:
> [...]
>>> Why do the subtitles contain ZWNBSP in the first place? Surely they're
>>> not English subtitles?
>>
>> No, they're not :) The character comes up in the Cantonese and
>> Japanese subs for Once Upon A December.
>>
>> http://youtu.be/CEpcUeWP0bg
>> http://youtu.be/WFZAaHrHens
>>
>> Possibly some others in the series as well. It may well be a fault in
>> the subtitles, but most programs I've seen don't show U+FEFF as a big
>> fat box.
>
> I think that for backwards compatibility, applications (or fonts) are
> permitted to treat U+FEFF as a zero-width invisible character, so perhaps
> you can raise a feature request with VLC.

Yeah. Well, like I said - learn something new every day. I didn't know
it wasn't a bug. (Though it'd still be a font issue, not a VLC one.
With other fonts, it comes up looking different, in some cases
invisible. Unfortunately, the fonts that look good aren't the fonts
that have glyphs for all characters, so I need to figure out why font
substitution isn't working right. But that's a separate issue.)

ChrisA

Re: How to find number of whole weeks between dates?

2015-06-10 Thread Ian Kelly

On Wed, Jun 10, 2015 at 9:19 PM, Michael Torrie  wrote:
> On 06/10/2015 02:11 PM, Sebastian M Cheung via Python-list wrote:
>> On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote:
>>> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and  May 
>>> would be 5th, 12th and 19th. So expecting 7 whole weeks in total
>>
>> What I mean is given two dates I want to find WHOLE weeks, so if given the 
>> 2014 calendar and function has two inputs (4th and 5th month) then 7th, 
>> 14th, 21st and 28th from April with 28th April week carrying into May, and 
>> then 5th, 12th and 19th May to give total of 7 whole weeks, because 26th May 
>> is not a whole week and will not be counted.
>>
>> Hope thats clear.
>
> I think Joel had the right idea.  First calculate the rough number of
> weeks by taking the number of days between the date and divide by seven.
> Then check to see what the start date's day of week is, and adjust the
> rough week count down by one if it's not the first day of the week.  I'm
> not sure if you have to check the end date's day of week or not.  I kind
> of think checking the first one only is sufficient, but I could be
> wrong.  You'll have to code it up and test it, which I assume you've
> been doing up to this point, even though you haven't shared any code.

I don't think the logic is quite right. Consider:

>>> import calendar
>>> from datetime import date
>>> cal = calendar.TextCalendar()
>>> print(cal.formatmonth(2014, 6))
     June 2014
Mo Tu We Th Fr Sa Su
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30

>>> date(2014, 7, 1) - date(2014, 6, 1)
datetime.timedelta(30)
>>> _.days // 7 - 1
3

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Marko Rauhamaa

Devin Jeanpierre :

> For example, one can also imagine testing that a serialized structure
> is identical across version changes, so that it's guaranteed to be
> forwards/backwards compatible. It is not enough to test that the
> deserialized form is, because it might differ substantially, as long
> as the communicated serialized structure is the same.

There are merits to canonical serialization formats, but that approach
to testing is far too simplistic. A test case should accept all observed
behavior that is correct.


Marko

New Python student needs help with execution

2015-06-10 Thread c me

I installed 2.7.9 on a Win8.1 machine. The Coursera instructor did a simple 
install then executed Python from a file in which he'd put a simple hello world 
script.  My similar documents folder cannot see the python executable.  How do 
I make this work?

Re: enhancement request: make py3 read/write py2 pickle format

2015-06-10 Thread Devin Jeanpierre

Snipped aplenty.

On Wed, Jun 10, 2015 at 8:21 PM, Steven D'Aprano  wrote:
> On Thu, 11 Jun 2015 08:10 am, Devin Jeanpierre wrote:
> [...]
>> I could spend a bunch of time writing yet another config file format,
>> or I could use text format protocol buffers, YAML, or TOML and call it
>> a day.
>
> Writing a rc parser is so trivial that it's almost easier to just write it
> than it is to look up the APIs for YAML or JSON, to say nothing of the
> rigmarole of defining a protocol buffer config file, compiling it,
> importing the module, and using that.
>
-snip
>
> That's a basic, *but acceptable*, rc parser written in literally under a
> minute. At the risk of ending up with egg on my face, I reckon that it's so
> simple and so obviously correct that I can tell it works correctly without
> even testing it. (Famous last words, huh?)

I won't try to egg you. That said, you have to write tests. Also,
everyone who uses it has to learn the format and API, and it may have
corner cases you aren't aware of, it has to get ported to python 3 if
you wrote it for python 2, the parsing errors are obscure and might
need improvement, and so on. There's a place for this, but I suspect
it is small compared to the place where it seemed like a good idea at
the time.

>>> Beyond simple needs, like rc files, literal_eval is not sufficient. You
>>> can't use it to deserialise arbitrary objects. That might be a feature,
>>> but if you need something more powerful than basic ints, floats, strings
>>> and a few others, literal_eval will not be powerful enough.
>>
>> No, it is powerful enough. After all, JSON has the same limitations.
>
> In the sense that you can build arbitrary objects from a combination of a
> few basic types, yes, literal_eval is "powerful enough" if you are prepared
> to re-invent JSON, YAML, or protocol buffer.
>
> But I'm not talking about re-inventing what already exists. If I want JSON,
> I'll use JSON, not spend weeks or months re-writing it from scratch. I
> can't do this:
>
> class MyClass:
>     pass
>
> a = MyClass()
> serialised = repr(a)
> b = ast.literal_eval(serialised)
> assert a == b

I don't understand. You can't do that in JSON, YAML, XML, or protocol
buffers, either. They only provide a small set of types, comparable to
(but smaller than) the set of types you get from literal_eval/repr.
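
A quick sketch of that point, using the json module as the stand-in for
the other formats: neither literal_eval nor json can round-trip an
instance of an arbitrary class, but both handle plain data built from
the basic types. The sample data is made up.

```python
import ast
import json


class MyClass:
    pass


obj = MyClass()

# repr() of a plain instance looks like "<...MyClass object at 0x...>",
# which is not a Python literal, so literal_eval rejects it.
try:
    ast.literal_eval(repr(obj))
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False

# json.dumps likewise refuses objects it has no encoding for.
try:
    json.dumps(obj)
    json_ok = True
except TypeError:
    json_ok = False

# Plain data made of dicts, lists, strings and numbers survives both.
data = {"name": "spam", "sizes": [1, 2, 3]}
via_repr = ast.literal_eval(repr(data))
via_json = json.loads(json.dumps(data))
print(literal_eval_ok, json_ok, via_repr == via_json == data)
```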

-- Devin
