Permission to Showcase the Python Program
I’m having trouble reaching an actual human at your organization as my emails get bounced back. If someone could please see my email below and respond with an answer as soon as possible, I’d appreciate it. To Whom It May Concern: My name is Leslie Bush and I am the Intellectual Property Coordinator for The Great Courses. We produce non-credit, college-level educational programs on DVD and electronic formats in a lecture series. The lectures are recorded and then sold to the general public for-profit. As a didactic tool for enhancing the programs, we include in the lectures visual elements illustrating works of art, people, events, locations, etc We are currently producing a course with Dr. John Keyser, Professor of Computer Science at the University of North Carolina Chapel Hill, entitled “Computer Science for Everyone: Programming Concepts and Exercises” The professor would like to use your Python program to showcase in the course. I am writing to see if we may have permission to do so. If permission is granted, we will have a copy of our license agreement sent over to your company’s authorizer for review and a signature. Here are the details of our program: Title: Computer Science for Everyone: Programming Concepts and Exercises Author and Publisher: The Teaching Company Language: All Format: all electronic formats Distribution: Worldwide Print Run: Life of the product Lecturer: Dr. John Keyser Release date: Summer 2016 Price: Unknown If you need further details or have questions or concerns, please do not hesitate to contact me. Our address is 4840 Westfields Blvd. Chantilly, VA 20151. For more information about The Teaching Company you can look at our site at www.thegreatcourses.com If you are interested in seeing any of the courses on our site please let me know and I’ll be happy to send you a copy. I look forward to working with you on this exciting course. Thank you for your assistance. 
Sincerely, Leslie Bush Product Development Intellectual Property Coordinator The Teaching Company/The Great Courses 4840 Westfields Blvd., Suite 500, Chantilly, VA 20151-2299 (703)774-1687 Direct (703)502-4270 Fax www.thegreatcourses.com -- https://mail.python.org/mailman/listinfo/python-list
ANN: eGenix mxODBC Plone/Zope Database Adapter 2.2.2
ANNOUNCING mxODBC Plone/Zope Database Adapter Version 2.2.2 for the Plone CMS and Zope server platform Available for Plone 4.0-4.3 and Plone 5.0, Zope 2.12 and 2.13, on Windows, Linux, Mac OS X, FreeBSD and other platforms This announcement is also available on our web-site for online reading: http://www.egenix.com/company/news/eGenix-mxODBC-Zope-DA-2.2.2-GA.html INTRODUCTION The eGenix mxODBC Zope DA allows you to easily connect your Zope or Plone CMS installation to just about any database backend on the market today, giving you the reliability of the commercially supported eGenix product mxODBC and the flexibility of the ODBC standard as middle-tier architecture. The mxODBC Zope Database Adapter is highly portable, just like Zope itself and provides a high performance interface to all your ODBC data sources, using a single well-supported interface on Windows, Linux, Mac OS X, FreeBSD and other platforms. This makes it ideal for deployment in ZEO Clusters and Zope hosting environments where stability and high performance are a top priority, establishing an excellent basis and scalable solution for your Plone CMS. Product page: http://www.egenix.com/products/zope/mxODBCZopeDA/ NEWS The 2.2.2 release of our mxODBC Zope/Plone Database Adapter product is a patch level release of the popular ODBC database interface for Plone and Zope. It includes these enhancements and fixes: Driver Compatibility Enhancements - * Reenabled returning cursor.rowcount for FreeTDS >= 0.91. In previous versions, FreeTDS could return wrong data for .rowcount when using SELECTs. Fixes - * Removed exists() built-in from mxODBC Zope DA's implicit addition of new built-ins via mxTools. This resolves a hard to track bug where the new built-in could potentially override the TAL python:exists function (in e.g. tal:condition="exists:something"). See this Products.CMFEditions fix for an example where the problem surfaced. 
This is a bug in TAL (it shouldn't give preference to built-ins over its own helpers), but we're providing the fix as easy work-around. The complete list of changes is available on the mxODBC Zope DA changelog page. http://www.egenix.com/products/zope/mxODBCZopeDA/changelog.html mxODBC Zope DA 2.2.0 was released on 2014-12-11. Please see the mxODBC Zope DA 2.2.0 release announcement for all the new features we have added. http://www.egenix.com/company/news/eGenix-mxODBC-Zope-DA-2.2.0-GA.html For the full list of features, please see the mxODBC Zope DA feature list: http://www.egenix.com/products/zope/mxODBCZopeDA/#Features The complete list of changes is available on the mxODBC Zope DA changelog page. UPGRADING Users are encouraged to upgrade to this latest mxODBC Plone/Zope Database Adapter release to benefit from the new features and updated ODBC driver support. We have taken special care not to introduce backwards incompatible changes, making the upgrade experience as smooth as possible. For major and minor upgrade purchases, we will give out 20% discount coupons going from mxODBC Zope DA 1.x to 2.2 and 50% coupons for upgrades from mxODBC 2.x to 2.2. After upgrade, use of the original license from which you upgraded is no longer permitted. Patch level upgrades (e.g. 2.2.0 to 2.2.2) are always free of charge. Please contact the eGenix.com Sales Team with your existing license serials for details for an upgrade discount coupon. If you want to try the new release before purchase, you can request 30-day evaluation licenses by visiting our web-site or writing to sa...@egenix.com, stating your name (or the name of the company) and the number of eval licenses that you need. 
http://www.egenix.com/products/python/mxODBCZopeDA/#Evaluation DOWNLOADS Please visit the eGenix mxODBC Zope DA product page for downloads, instructions on installation and documentation of the packages: http://www.egenix.com/company/products/zope/mxODBCZopeDA/ If you want to try the package, please jump straight to the download instructions: http://www.egenix.com/products/zope/mxODBCZopeDA/#Download Fully functional evaluation licenses for the mxODBC Zope DA are available free of charge: http://www.egenix.com/products/zope/mxODBCZopeDA/#Evaluation SUPPORT Commercial support for this product is available directly from eGenix.com. Please see the support section of our website for details:
Re: enhancement request: make py3 read/write py2 pickle format
On Wednesday 10 June 2015 14:48, Devin Jeanpierre wrote: [...] > and literal_eval is not a great idea. > > * the common serializer (repr) does not output a canonical form, and > can serialize things in a way that they can't be deserialized For literals, the canonical form is that understood by Python. I'm pretty sure that these have been stable since the days of Python 1.0, and will remain so pretty much forever: ints: 12345 floats: 1.2345 strings: "spam" None True False lists, tuples, dicts and sets containing the above There may be a few differences between Python 2 and 3, e.g. no set literal in Python 2, but in general the Python syntax is well-known and understood by anyone programming in Python. > * there is no schema > * there is no well understood migration story for when the data you > load and store changes literal_eval is not a serialisation format itself. It is a primitive operation usable when serialising. E.g. you might write out a simple Unix- style rc file of key:value pairs: length=23.45 width=10.95 landscape=False split on "=" and call literal_eval on the value. This is a perfectly reasonable light-weight solution for simple serialisation needs. > * it is not usable from other programming languages That's okay, we're not writing in other programming languages :-) > * it encourages the use of eval when literal_eval becomes inconvenient > or insufficient I don't think so. I think that people who make the effort to import ast and call ast.literal_eval are fully aware of the dangers of eval and aren't silly enough to start using eval. > * It is not particularly well specified or documented compared to the > alternatives. > * The types you get back differ in python 2 vs 3 Doesn't matter. The type you *write* are different in Python 2 vs 3, so of course you do. > For most apps, the alternatives are better. Irmen's serpent library is > strictly better on every front, for example. (Except potentially > security, who knows.) 
Beyond simple needs, like rc files, literal_eval is not sufficient. You can't use it to deserialise arbitrary objects. That might be a feature, but if you need something more powerful than basic ints, floats, strings and a few others, literal_eval will not be powerful enough. I think we are in violent agreement :-) -- Steve -- https://mail.python.org/mailman/listinfo/python-list
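The rc-file idea described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `parse_rc` and the comment-skipping rule are made up for the example and are not part of the original post; only `ast.literal_eval` is the standard-library call being discussed.

```python
import ast


def parse_rc(text):
    """Parse key=value lines, evaluating each value as a Python literal.

    Hypothetical helper: the name and the '#' comment rule are assumptions.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        # literal_eval accepts only literals (numbers, strings, None,
        # True/False, and containers of those); anything else raises.
        settings[key.strip()] = ast.literal_eval(value.strip())
    return settings


example = "length=23.45\nwidth=10.95\nlandscape=False\n"
config = parse_rc(example)
print(config)
```

A value such as `__import__('os')` is rejected with a ValueError rather than executed, which is the property that makes this usable for simple configuration data.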
Re: enhancement request: make py3 read/write py2 pickle format
Chris Warrick wrote: > On Tue, Jun 9, 2015 at 8:08 PM, Neal Becker wrote: >> One of the most annoying problems with py2/3 interoperability is that the >> pickle formats are not compatible. There must be many who, like myself, >> often use pickle format for data storage. >> >> It certainly would be a big help if py3 could read/write py2 pickle >> format. You know, backward compatibility? > > Don’t use pickle. It’s unsafe — it executes arbitrary code, which > means someone can give you a pickle file that will delete all your > files or eat your cat. > > Instead, use a safe format that has no ability to execute code, like > JSON. It will also work with other programming languages and > environments if you ever need to talk to anyone else. > > But, FYI: there is backwards compatibility if you ask for it, in the > form of protocol versions. That’s all you should know — again, don’t > use pickle. > I believe a good native serialization system is essential for any modern programming language. If pickle isn't it, we need something else that can serialize all language objects. Or, are you saying, it's impossible to do this safely? -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
On 2015-06-10 12:04, Neal Becker wrote: Chris Warrick wrote: On Tue, Jun 9, 2015 at 8:08 PM, Neal Becker wrote: One of the most annoying problems with py2/3 interoperability is that the pickle formats are not compatible. There must be many who, like myself, often use pickle format for data storage. It certainly would be a big help if py3 could read/write py2 pickle format. You know, backward compatibility? Don’t use pickle. It’s unsafe — it executes arbitrary code, which means someone can give you a pickle file that will delete all your files or eat your cat. Instead, use a safe format that has no ability to execute code, like JSON. It will also work with other programming languages and environments if you ever need to talk to anyone else. But, FYI: there is backwards compatibility if you ask for it, in the form of protocol versions. That’s all you should know — again, don’t use pickle. I believe a good native serialization system is essential for any modern programming language. If pickle isn't it, we need something else that can serialize all language objects. Or, are you saying, it's impossible to do this safely? By the very nature of the stated problem: serializing all language objects. Being able to construct any object, including instances of arbitrary classes, means that arbitrary code can be executed. All I have to do is make a pickle file for an object that claims that its constructor is shutil.rmtree(). This is fine in some use cases (e.g. wire format for otherwise-secured communication between two endpoints under your complete control), but it is worrying in others, like your use case of data storage (and presumably sharing). Python 2/3 is also the least of your compatibility worries there. Refactor a class to a different module, or did one of your third-party dependencies do this? Poof! Your pickle files no longer work. 
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco -- https://mail.python.org/mailman/listinfo/python-list
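A harmless sketch of the point about constructors: an object's `__reduce__` method tells pickle which callable to invoke when loading, and pickle invokes it. The class name `Sneaky` is hypothetical, and `int` stands in for a destructive callable such as the `shutil.rmtree` mentioned above.

```python
import pickle


class Sneaky(object):
    """Hypothetical class whose pickle names an unrelated callable."""

    def __reduce__(self):
        # The pickle records "call int('42')" rather than anything about
        # Sneaky itself. A hostile pickle could name any importable
        # callable here instead of int.
        return (int, ("42",))


payload = pickle.dumps(Sneaky())
result = pickle.loads(payload)  # int("42") is called during loading
print(type(result).__name__, result)
```

The loaded value is a plain int, not a `Sneaky` instance: the unpickler simply called what the data told it to call. This is why loading pickles from untrusted sources is unsafe.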
Re: enhancement request: make py3 read/write py2 pickle format
On Wed, Jun 10, 2015 at 9:04 PM, Neal Becker wrote: > I believe a good native serialization system is essential for any modern > programming language. If pickle isn't it, we need something else that can > serialize all language objects. Or, are you saying, it's impossible to do > this safely? It is indeed impossible to serialize _all_ objects safely. How do you, for instance, serialize an open socket? ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
Robert Kern :
> By the very nature of the stated problem: serializing all language
> objects. Being able to construct any object, including instances of
> arbitrary classes, means that arbitrary code can be executed. All I
> have to do is make a pickle file for an object that claims that its
> constructor is shutil.rmtree().

You can't serialize/migrate arbitrary objects. Consider open TCP connections, open files and other objects that extend outside the Python VM. Also objects hold references to each other, leading to a huge reference mesh.

For example:

    a.buddy = b
    b.buddy = a
    with open("a", "wb") as f: f.write(serialize(a))
    with open("b", "wb") as f: f.write(serialize(b))

    with open("a", "rb") as f: aa = deserialize(f.read())
    with open("b", "rb") as f: bb = deserialize(f.read())
    assert aa.buddy is bb

Marko
-- https://mail.python.org/mailman/listinfo/python-list
Python NBSP DWIM
str.split() doesn't seem to respect non-breaking space:

    Python 3.4.2 (default, Oct 8 2014, 10:45:20)
    [GCC 4.9.1] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print(repr("hello\N{NO-BREAK SPACE}world".split()))
    ['hello', 'world']

What's the purpose of a non-breaking space if it's treated like a space for breaking/splitting purposes? :-) Is this a bug?

-tkc
-- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
On Wed, Jun 10, 2015, at 08:08, Marko Rauhamaa wrote: > You can't serialize/migrate arbitrary objects. Consider open TCP > connections, open files and other objects that extend outside the Python > VM. Also objects hold references to each other, leading to a huge > reference mesh. > > For example: > >a.buddy = b >b.buddy = a >with open("a", "wb") as f: f.write(serialize(a)) >with open("b", "wb") as f: f.write(serialize(b)) > >with open("a", "rb") as f: aa = deserialize(f.read()) >with open("b", "rb") as f: bb = deserialize(f.read()) >assert aa.buddy is bb Of course, if you serialize a single dict with e.g. {'a': a, 'b': b}, you can expect (with advanced serialization tools, anyway - I suspect JSON will just make a mess or exceed maximum recursion depth) result['a'].buddy is result['b'] -- https://mail.python.org/mailman/listinfo/python-list
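The difference between one dump and two separate dumps can be checked with the standard library. This is a small sketch using plain dicts in place of the `a` and `b` objects from the quoted example; the variable names are chosen for illustration.

```python
import json
import pickle

a = {"name": "a"}
b = {"name": "b"}
a["buddy"] = b
b["buddy"] = a

# One dump holding both objects: pickle's memo keeps shared references.
result = pickle.loads(pickle.dumps({"a": a, "b": b}))
same_dump_linked = result["a"]["buddy"] is result["b"]

# Two separate dumps: each load builds its own independent copy of the pair.
aa = pickle.loads(pickle.dumps(a))
bb = pickle.loads(pickle.dumps(b))
separate_dumps_linked = aa["buddy"] is bb

# With its default settings, json refuses cyclic data outright.
try:
    json.dumps({"a": a, "b": b})
    json_error = None
except ValueError as exc:
    json_error = exc

print(same_dump_linked, separate_dumps_linked, type(json_error).__name__)
```

So with default settings JSON neither makes a mess nor hits the recursion limit: `json.dumps` detects the cycle and raises ValueError.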
Re: enhancement request: make py3 read/write py2 pickle format
On 2015-06-10 13:08, Marko Rauhamaa wrote:
> Robert Kern :
>
>> By the very nature of the stated problem: serializing all language
>> objects. Being able to construct any object, including instances of
>> arbitrary classes, means that arbitrary code can be executed. All I
>> have to do is make a pickle file for an object that claims that its
>> constructor is shutil.rmtree().
>
> You can't serialize/migrate arbitrary objects. Consider open TCP
> connections, open files and other objects that extend outside the Python
> VM.

Yes, yes, but that's really beside the point. Yes, there are some objects for which it doesn't even make sense to serialize. But my point is that even in this slightly smaller set of objects that *can* be serialized (and pickle currently does serialize), being able to serialize all of them entails arbitrary code execution to deserialize them. To allow people to write their own types that can be serialized, you have to let them specify arbitrary callables that will do the reconstruction. If you whitelist the possible reconstruction callables, you have greatly restricted the types that can participate in the serialization system.

> Also objects hold references to each other, leading to a huge
> reference mesh.
>
> For example:
>
>    a.buddy = b
>    b.buddy = a
>    with open("a", "wb") as f: f.write(serialize(a))
>    with open("b", "wb") as f: f.write(serialize(b))
>
>    with open("a", "rb") as f: aa = deserialize(f.read())
>    with open("b", "rb") as f: bb = deserialize(f.read())
>    assert aa.buddy is bb

Yeah, no one expects that to work. For example, if I deserialize the same string twice, you can't expect to get identical returned objects (as in, "deserialize(pickle) is deserialize(pickle)"). However, pickle does correctly handle fairly arbitrary reference graphs within the context of a single serialization, which is the most that can be asked of a serialization system. That isn't really a concern here.

    >>> class A(object):
    ...     pass
    ...
    >>> a = A()
    >>> b = A()
    >>> a.buddy = b
    >>> b.buddy = a
    >>> data = [a, b]
    >>> data[0].buddy is data[1]
    True
    >>> data[1].buddy is data[0]
    True
    >>> import cPickle
    >>> unpickled = cPickle.loads(cPickle.dumps(data))
    >>> unpickled[0].buddy is unpickled[1]
    True
    >>> unpickled[1].buddy is unpickled[0]
    True

-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
-- https://mail.python.org/mailman/listinfo/python-list
Re: Python NBSP DWIM
On 10/06/2015 14:28, Tim Chase wrote: str.split() doesn't seem to respect non-breaking space: Python 3.4.2 (default, Oct 8 2014, 10:45:20) [GCC 4.9.1] on linux Type "help", "copyright", "credits" or "license" for more information. >>> print(repr("hello\N{NO-BREAK SPACE}world".split())) ['hello', 'world'] What's the purpose of a non-breaking space if it's treated like a space for breaking/splitting purposes? :-) Is this a bug? -tkc IMNSHO yes. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Did the 3.4.4 docs get published early?
For example, here is a "New in version 3.4.4" method: https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future However, the latest release appears to be 3.4.3: https://www.python.org/downloads/ Is this normal, or did the 3.4.4 docs somehow get published early by mistake? Nick -- https://mail.python.org/mailman/listinfo/python-list
Re: Python NBSP DWIM
On Wed, Jun 10, 2015 at 8:28 AM, Tim Chase wrote: > Is this a bug? Looks like it's been reported a few times with slightly different context: https://bugs.python.org/issue6537 https://bugs.python.org/issue16623 https://bugs.python.org/issue20491 https://bugs.python.org/issue1390608 The couple times it's come up in the context of str.split, it's been rejected, since the purpose of that method is to split words. Skip -- https://mail.python.org/mailman/listinfo/python-list
Re: Did the 3.4.4 docs get published early?
On 10/06/2015 15:11, Nicholas Chammas wrote: For example, here is a "New in version 3.4.4" method: https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future However, the latest release appears to be 3.4.3: https://www.python.org/downloads/ Is this normal, or did the 3.4.4 docs somehow get published early by mistake? Nick I suspect that this is due to a trainee pilot being let loose too early with the time machine. Failing that finger trouble when doing a commit. Thinking about it more likely the former rather than the latter :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Python NBSP DWIM
In a message of Wed, 10 Jun 2015 09:28:24 -0500, Skip Montanaro writes: >On Wed, Jun 10, 2015 at 8:28 AM, Tim Chase > wrote: >> Is this a bug? > >Looks like it's been reported a few times with slightly different context: > >https://bugs.python.org/issue6537 >https://bugs.python.org/issue16623 >https://bugs.python.org/issue20491 >https://bugs.python.org/issue1390608 > >The couple times it's come up in the context of str.split, it's been >rejected, since the purpose of that method is to split words. > >Skip In these unicode days, this thinking may need to be revisited. There are many languages where whitespace does not separate words -- either words aren't separated, or in Vietnamese, spaces separate syllables, so entire words have spaces in them. Laura -- https://mail.python.org/mailman/listinfo/python-list
Re: Did the 3.4.4 docs get published early?
On Jun 10, 2015 9:41 AM, "Mark Lawrence" wrote: > > On 10/06/2015 15:11, Nicholas Chammas wrote: >> >> For example, here is a "New in version 3.4.4" method: >> >> https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future >> >> However, the latest release appears to be 3.4.3: >> >> https://www.python.org/downloads/ >> >> Is this normal, or did the 3.4.4 docs somehow get published early by >> mistake? >> >> Nick >> > > I suspect that this is due to a trainee pilot being let loose too early with the time machine. Failing that finger trouble when doing a commit. Thinking about it more likely the former rather than the latter :) > Actually, it's just that the online docs reflect the latest documentation from a particular branch of the source repository, since the docs are continually improving and have no backwards compatibility constraints. This does mean we sometimes have anomalies like this, though. If you truly need the docs as they were at the time of release, they are available, though I don't have a link handy on my phone. -- Zach (On a phone) -- https://mail.python.org/mailman/listinfo/python-list
Re: Python NBSP DWIM
On Wed, Jun 10, 2015, at 11:03, Laura Creighton wrote: > In these unicode days, this thinking may need to be revisited. There > are many languages where whitespace does not separate words -- either > words aren't separated, or in Vietnamese, spaces separate syllables, > so entire words have spaces in them. Text wrapping for CJK scripts is another topic that might be worth addressing in textwrap - words aren't space-separated, but there are still rules about where you can place a line break. Generally these are centered around preventing punctuation marks from being orphaned rather than any attempt to algorithmically find word boundaries. For the process called "Oikomi", while messing with kerning is not strictly possible for monospaced text, it might be worthwhile in general to have "preferred" and "maximum" line widths as parameters for textwrap. http://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_languages -- https://mail.python.org/mailman/listinfo/python-list
Re: Memory error while using pandas dataframe
On Mon, Jun 8, 2015 at 3:32 AM, naren wrote: > Memory Error while working with pandas dataframe. > > Description of Environment Windows 7 python 3.4.2 32-bit version pandas > 0.16.0 > > We are running into the error described below. Any help provided will be > sincerely appreciated. > > We are able to read a 300MB Csv file into a dataframe using the read_csv > function. While working with the dataframe we ran into memory error. We > used the pd.Concat function to concatenate two dataframes. So we decided to > use chunksize for lazy reading. Chunking returns an object of type > TextFileReader. > > > http://pandas.pydata.org/pandas-docs/stable/io.html#iterating-through-files-chunk-by-chunk > > We are able to iterate over this object once as a debugging measure. The > iterator gets exhausted after iterating once. So we are not able to convert > the TextFileReader object back into a dataframe, using the pd.concat > function. > It looks like you already figured out what your problem is. The TextFileReader is exhausted (i.e., at EOF), so you end up getting None from it. What is your question? You want to be able to iterate through TextFileReader again? If so, try rewinding the file object that you passed to pd.concat. If you saved a reference to the file object, just call "seek(0)" on that object. If you didn't, access it as the "f" attribute on the TextFileReader object and call "seek(0)" on that instead. That might work. Otherwise, you should be more specific with your question and provide a full segment of code that is as small as possible to reproduce the error you're seeing. HTH, Jason -- https://mail.python.org/mailman/listinfo/python-list
Re: Did the 3.4.4 docs get published early?
On 10.06.2015 17:05, Zachary Ware wrote: > On Jun 10, 2015 9:41 AM, "Mark Lawrence" wrote: >> >> On 10/06/2015 15:11, Nicholas Chammas wrote: >>> >>> For example, here is a "New in version 3.4.4" method: >>> >>> https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future >>> >>> However, the latest release appears to be 3.4.3: >>> >>> https://www.python.org/downloads/ >>> >>> Is this normal, or did the 3.4.4 docs somehow get published early by >>> mistake? >>> >>> Nick >>> >> >> I suspect that this is due to a trainee pilot being let loose too early > with the time machine. Failing that finger trouble when doing a commit. > Thinking about it more likely the former rather than the latter :) >> > > Actually, it's just that the online docs reflect the latest documentation > from a particular branch of the source repository, since the docs are > continually improving and have no backwards compatibility constraints. This > does mean we sometimes have anomalies like this, though. > > If you truly need the docs as they were at the time of release, they are > available, though I don't have a link handy on my phone. You can (with javascript enabled) select the version for the docs at the top right of the page. Also, just replacing the version number in the URL works for the python 3 series (use 3.X even for python 3.0), even farther back than the drop down menu allows. regards, jwi signature.asc Description: OpenPGP digital signature -- https://mail.python.org/mailman/listinfo/python-list
How to find number of whole weeks between dates?
Say in 2014 April to May whole weeks would be 7th, 14th 28th April and May would be 5th, 12th and 19th. So expecting 7 whole weeks in total -- https://mail.python.org/mailman/listinfo/python-list
Re: Testing random
Jussi Piitulainen wrote:
> Thomas 'PointedEars' Lahn writes:
>> Jussi Piitulainen wrote:
>>> Thomas 'PointedEars' Lahn writes:
>>>> 8 3 6 3 1 2 6 8 2 1 6.
>>>
>>> There are more than four hundred thousand ways to get those numbers
>>> in some order.
>>>
>>> (11! / 2! / 2! / 2! / 3! / 2! = 415800)
>>
>> Fallacy. Order is irrelevant here.
>
> You need to consider every sequence that leads to the observed counts.

No, you need _not_, because – I repeat – the probability of getting a sequence of length n from a set of 9 numbers, where the probability of picking each number is evenly distributed, is (1∕9)ⁿ [(1/9)^n, or 1/9 to the nth, for those who do not see it because of lack of Unicode support on their system]. *Always.* *No matter* which numbers are in it. *No matter* in which order they are. AISB, order is *irrelevant* here. *Completely.*

This is _not_ a lottery box; you put the ball with the number on it *back into the box* after you have drawn it and before you draw a new one.

> One of those sequences occurred. You don't know which.

You do not have to.

> When tossing herrings […]

Herrings are the key word here, indeed, and they are deep dark red.

> Code follows. Incidentally, I'm not feeling smart here.

Good. Because you should not feel smart in any way after ignoring all my explanations.

> [nonsense]

-- PointedEars
Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
-- https://mail.python.org/mailman/listinfo/python-list
Re: Python NBSP DWIM
On Thu, 11 Jun 2015 12:28 am, Skip Montanaro wrote:
> On Wed, Jun 10, 2015 at 8:28 AM, Tim Chase wrote:
>> Is this a bug?
>
> Looks like it's been reported a few times with slightly different context:
>
> https://bugs.python.org/issue6537
> https://bugs.python.org/issue16623
> https://bugs.python.org/issue20491
> https://bugs.python.org/issue1390608
>
> The couple times it's come up in the context of str.split, it's been
> rejected, since the purpose of that method is to split words.

That reasoning is ... strange. The whole point of the NBSP is specifically *not* to split on it. If you wanted it to split, you would use a regular space.

(Oh, and for the record, there are at least two non-breaking spaces in Unicode, U+00A0 "NO-BREAK SPACE" and U+202F "NARROW NO-BREAK SPACE".)

http://www.unicode.org/charts/PDF/U0080.pdf
http://www.unicode.org/charts/PDF/U2000.pdf

Non-breaking spaces should be used when you want to prevent word-wrapping, and also for "open form" compound words:

http://grammar.ccc.commnet.edu/grammar/compounds.htm

textwrap should also treat NBSPs as non-spaces for the purposes of wrapping.

As a work-around, I think this should work:

- split the string on NBSPs;
- for each substring returned, split normally;
- merge sub-substrings.

    def split(s):
        """Split on whitespace, except NBSP.

        >>> split(u'hello world spam\\u00A0eggs cheese')
        [u'hello', u'world', u'spam\\xa0eggs', u'cheese']
        """
        words = []
        NBSP = u'\u00A0'
        substrings = s.split(NBSP)
        for i, sub in enumerate(substrings):
            parts = sub.split()
            if i == 0:
                words.extend(parts)
            else:
                # Glue the first word after the NBSP onto the last word
                # before it, then add the rest as separate words.
                words[-1] += NBSP + parts[0]
                words.extend(parts[1:])
        return words

-- Steven
-- https://mail.python.org/mailman/listinfo/python-list
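The work-around above assumes that every NBSP sits directly between two non-space characters. A possible alternative, sketched here for Python 3 only, is a regular expression that matches whitespace other than the two non-breaking spaces named in the post. The function name `split_keep_nbsp` is made up for this example.

```python
import re

# "Not a non-space and not U+00A0 or U+202F", that is: any whitespace
# character except the two non-breaking spaces.
_BREAKING_SPACE = re.compile(r"[^\S\u00a0\u202f]+")


def split_keep_nbsp(s):
    """Split on runs of breaking whitespace, leaving NBSPs inside words."""
    # Leading or trailing whitespace yields empty strings; drop them so
    # the result matches what str.split() would give.
    return [word for word in _BREAKING_SPACE.split(s) if word]


print(split_keep_nbsp("hello world spam\u00a0eggs cheese"))
```

Because nothing is merged after the fact, a leading or trailing NBSP stays attached to its neighbouring word and does not raise an error.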
Re: enhancement request: make py3 read/write py2 pickle format
On 10-6-2015 11:36, Steven D'Aprano wrote: >> For most apps, the alternatives are better. Irmen's serpent library is >> strictly better on every front, for example. (Except potentially >> security, who knows.) > > Beyond simple needs, like rc files, literal_eval is not sufficient. You > can't use it to deserialise arbitrary objects. That might be a feature, but > if you need something more powerful than basic ints, floats, strings and a > few others, literal_eval will not be powerful enough. Just to have this off my chest: I guess that "serialization format" is not the most correct term for what serpent does (or in general, for the literal expressions that literal_eval accepts). Serpent doesn't strive to (de)serialize everything perfectly. It is meant as a pythonic data transfer format. You can do this by explicitly mapping your application's object model to and from the wire data format, or do it in a more pythonic way (IMO) and let python take care of most of it automatically. Serpent is smart (I hope) about a number of non-primitive types. If needed, use its hooks to teach it about types it doesn't readily recognize. Yes, it does force you to reduce the arbitrary types you want to process to the set of types that are accepted in a python literal expression. Thankfully lists, sets, tuples and dicts are also among them. Raison d'être for serpent is that I was looking for a safe pythonic alternative for pickle, and with fewer limitations than Json. I chose to use ast.literal_eval from the standard library to do the "deserialization" for me, and so only had to build some code to "serialize" object trees into python literal expressions :) Regarding security: I simply trust the docstring of ast.literal_eval here; "Safely evaluate an expression node or a string containing a Python expression. [...]" Irmen -- https://mail.python.org/mailman/listinfo/python-list
Re: How to find number of whole weeks between dates?
Sebastian M Cheung :
> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and
> May would be 5th, 12th and 19th. So expecting 7 whole weeks in total

This program gives you the number of days between two dates given in the YYYY-MM-DD format:

    #!/usr/bin/env python3

    import sys

    def gregorian_day_count(isodate):
        year, month, day = map(int, isodate.split('-'))
        # Count months from March so that the leap day falls at the
        # end of the shifted year.
        a, b = divmod(12 * year + month - 3, 12)
        # Days in whole years, leap-year corrections, days in the whole
        # months since March, then the day of the month.
        return (a * 365 + (a >> 2) - (a * 1311 >> 17) + (a * 1311 >> 19)
                + (31306 * b + 722 >> 10) + day)

    def main():
        print(gregorian_day_count(sys.argv[2]) -
              gregorian_day_count(sys.argv[1]))

    if __name__ == '__main__':
        main()

Divide the number by 7 and you have your answer.

Marko
-- https://mail.python.org/mailman/listinfo/python-list
Re: How to find number of whole weeks between dates?
Marko Rauhamaa : > This program gives you the number of days between two dates given in the > -MM-DD format: Sorry, couldn't resist. It still does work, though. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: How to find number of whole weeks between dates?
On Wed, Jun 10, 2015 at 11:05 AM, Sebastian M Cheung via Python-list wrote:
> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and May
> would be 5th, 12th and 19th. So expecting 7 whole weeks in total

>>> from datetime import date
>>> d1 = date(2014, 4, 7)
>>> d2 = date(2014, 5, 19)
>>> d2 - d1
datetime.timedelta(42)
>>> (d2 - d1).days
42
>>> (d2 - d1).days // 7
6
Re: How to find number of whole weeks between dates?
In a message of Wed, 10 Jun 2015 20:38:59 +0300, Marko Rauhamaa writes: >Divide the number by 7 and you have your answer. > I am not sure that is what he wants -- If he gives us a start of Tuesday the 9th of June 2015 (yesterday) and an end of Thursday the 25th of June, that's 16 days. But there is only one Monday-Friday week in there, the 14th-19th. So if the OP wants an answer of 1 for such data, he may be interested in the python calendar module https://docs.python.org/2/library/calendar.html Laura -- https://mail.python.org/mailman/listinfo/python-list
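To make the difference between "days divided by 7" and "complete Monday-to-Friday weeks" concrete, here is a minimal sketch using only datetime. The function name and its span_days parameter are made up for this example, and it assumes a week counts only if it lies entirely between the two dates.

```python
from datetime import date, timedelta


def complete_weeks(start, end, span_days=5):
    """Count weeks beginning on a Monday that fit entirely in [start, end].

    span_days=5 means Monday to Friday, span_days=7 means Monday to Sunday.
    """
    # Move forward to the first Monday on or after the start date.
    monday = start + timedelta(days=(7 - start.weekday()) % 7)
    count = 0
    while monday + timedelta(days=span_days - 1) <= end:
        count += 1
        monday += timedelta(days=7)
    return count


start, end = date(2015, 6, 9), date(2015, 6, 25)
naive = (end - start).days // 7          # plain division of the day count
strict = complete_weeks(start, end)      # only fully contained work weeks
print(naive, strict)
```

For the dates in Laura's example the two numbers differ, which is the point she is making.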
Re: Testing random
On Wednesday, June 10, 2015 at 10:06:49 AM UTC-7, Thomas 'PointedEars' Lahn wrote: > Jussi Piitulainen wrote: > > > Thomas 'PointedEars' Lahn writes: > >> Jussi Piitulainen wrote: > >>> Thomas 'PointedEars' Lahn writes: > 8 3 6 3 1 2 6 8 2 1 6. > >>> > >>> There are more than four hundred thousand ways to get those numbers > >>> in some order. > >>> > >>> (11! / 2! / 2! / 2! / 3! / 2! = 415800) > >> > >> Fallacy. Order is irrelevant here. > > > > You need to consider every sequence that leads to the observed counts. > > No, you need _not_, because – I repeat – the probability of getting a > sequence of length n from a set of 9 numbers whereas the probability of > picking a number is evenly distributed, is (1∕9)ⁿ [(1/9)^n, or 1/9 to the > nth, for those who do to see it because of lack of Unicode support at their > system]. *Always.* *No matter* which numbers are in it. *No matter* in > which order they are. AISB, order is *irrelevant* here. *Completely.* > > This is _not_ a lottery box; you put the ball with the number on it *back > into the box* after you have drawn it and before you draw a new one. > > > One of those sequences occurred. You don't know which. > > You do not have to. > > > When tossing herrings […] > > Herrings are the key word here, indeed, and they are deep dark red. > > > Code follows. Incidentally, I'm not feeling smart here. > > Good. Because you should not feel smart in any way after ignoring all my > explanations. > > > [nonsense] > > -- > PointedEars > > Twitter: @PointedEars2 > Please do not cc me. / Bitte keine Kopien per E-Mail. To put it another way, let's simplify the problem. You're rolling a pair of dice. What are the chances that you'll see a pair of 3s? Look at the list of possible roll combinations: 1 1 1 2 1 3 1 4 1 5 1 6 2 1 2 2 2 3 2 4 2 5 2 6 3 1 3 2 3 3 3 4 3 5 3 6 4 1 4 2 4 3 4 4 4 5 4 6 5 1 5 2 5 3 5 4 5 5 5 6 6 1 6 2 6 3 6 4 6 5 6 6 36 possible combinations. Only one of them has a pair of 3s. The answer is 1/36. 
What about the chances of seeing 2 1? Here's where I think you two are having such a huge disagreement. Does order matter? It depends what you're pulling random numbers out for. The odds of seeing 2 1 are also only 1/36. But if order doesn't matter in your application, then 1 2 is equivalent. The odds of getting 2 1 OR 1 2 is 2/36, or 1/18. But whether order matters or not, the chances of getting a pair of threes in two rolls is ALWAYS 1/36. If this gets expanded to grabbing 10 random numbers between 1 and 9, then the chances of getting a sequence of 10 ones is still only (1/9)^10, *regardless of whether or not order matters*. There are 9^10 possible sequences, but only *one* of these is all ones. If order matters, then 7385941745 also has a (1/9)^10 chance of occurring. Just because it isn't a memorable sequence doesn't give it a higher chance of happening. If order DOESN'T matter, then 1344557789 would be equivalent, and the odds are higher. -- https://mail.python.org/mailman/listinfo/python-list
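The two-dice numbers above can be checked by brute force. This sketch simply enumerates all 36 ordered rolls with itertools.product; the variable names are my own.

```python
from fractions import Fraction
from itertools import product

# Every ordered outcome of rolling two six-sided dice.
rolls = list(product(range(1, 7), repeat=2))


def probability(predicate):
    # Share of the equally likely outcomes that satisfy the predicate.
    return Fraction(sum(1 for r in rolls if predicate(r)), len(rolls))


p_pair_of_threes = probability(lambda r: r == (3, 3))
p_two_then_one = probability(lambda r: r == (2, 1))
p_one_and_two_any_order = probability(lambda r: sorted(r) == [1, 2])

print(len(rolls), p_pair_of_threes, p_two_then_one, p_one_and_two_any_order)
```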
Re: Testing random
On Wed, Jun 10, 2015 at 11:03 AM, Thomas 'PointedEars' Lahn wrote:
> Jussi Piitulainen wrote:
>
>> Thomas 'PointedEars' Lahn writes:
>>> Jussi Piitulainen wrote:
>>>> Thomas 'PointedEars' Lahn writes:
>>>>> 8 3 6 3 1 2 6 8 2 1 6.
>>>>
>>>> There are more than four hundred thousand ways to get those numbers
>>>> in some order.
>>>>
>>>> (11! / 2! / 2! / 2! / 3! / 2! = 415800)
>>>
>>> Fallacy. Order is irrelevant here.
>>
>> You need to consider every sequence that leads to the observed counts.
>
> No, you need _not_, because – I repeat – the probability of getting a
> sequence of length n from a set of 9 numbers whereas the probability of
> picking a number is evenly distributed, is (1∕9)ⁿ [(1/9)^n, or 1/9 to the
> nth, for those who do to see it because of lack of Unicode support at their
> system]. *Always.* *No matter* which numbers are in it. *No matter* in
> which order they are. AISB, order is *irrelevant* here. *Completely.*

Order is relevant because, for instance, there are n differently ordered
sequences that contain n-1 1s and one 2, while there is only one sequence
that contains n 1s. While each of those individual sequences is indeed
equiprobable, the overall probability of getting a sequence that contains
n-1 1s and one 2 is n times the probability of getting a sequence that
contains n 1s.

The context of this whole thread is about the probability of getting a
sequence where every number occurs at least once. The order that they
occur in doesn't matter, but the number of possible permutations does,
because every one of those permutations is a distinct sequence
contributing an equal amount to the total overall probability.

The probability of 123456789 and 111111111 are equal. The probability of
a sequence containing all nine numbers and a sequence containing only 1s
are *not* equal.
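The count of 415800 and the "n times as likely" argument can both be checked with a short sketch. The orderings helper is a name invented here; it computes the multinomial coefficient for the observed counts.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def orderings(seq):
    """Number of distinct sequences with the same counts as seq."""
    total = factorial(len(seq))
    for count in Counter(seq).values():
        total //= factorial(count)
    return total


observed = [8, 3, 6, 3, 1, 2, 6, 8, 2, 1, 6]
ways = orderings(observed)

# Each individual sequence of 11 draws from 9 numbers is equally likely...
p_single = Fraction(1, 9 ** len(observed))
# ...but the observed *counts* can arise from many different sequences.
p_counts = ways * p_single

print(ways, float(p_single), float(p_counts))
print(orderings([1] * 9 + [2]), orderings([1] * 10))
```

The second line printed compares a sequence of nine 1s plus one 2 with a sequence of ten 1s.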
Re: How to find number of whole weeks between dates?
On Wed, Jun 10, 2015 at 1:50 PM, Laura Creighton wrote:
> In a message of Wed, 10 Jun 2015 20:38:59 +0300, Marko Rauhamaa writes:
>>Divide the number by 7 and you have your answer.
>>
>
> I am not sure that is what he wants -- If he gives us a start of Tuesday the
> 9th of June 2015 (yesterday) and an end of Thursday the 25th of June, that's
> 16 days. But there is only one Monday-Friday week in there, the 14th-19th.
>
> So if the OP wants an answer of 1 for such data, he may be interested in
> the python calendar module https://docs.python.org/2/library/calendar.html
>
> Laura

Find the number of weeks with the above method, then

    import datetime

    # number_of_complete_weeks is the day count divided by 7, from above
    end_date = datetime.datetime(2012, 3, 23)  # whatever your end date is
    if end_date.weekday() != 5:
        number_of_complete_weeks -= 1

weekday() returns 0 for Monday, so 5 is Saturday.

--
Joel Goldstick
http://joelgoldstick.com
Re: Testing random
On Wed, Jun 10, 2015, at 13:03, Thomas 'PointedEars' Lahn wrote: > This is _not_ a lottery box; you put the ball with the number on it *back > into the box* after you have drawn it and before you draw a new one. Yes, but getting a 2, putting it back, and getting a 1 is just as good as getting a 1, putting it back, and getting a 2, so you have to add the probability of those cases together to get the probability of getting at least one 1 and at least one 2. -- https://mail.python.org/mailman/listinfo/python-list
Re: Did the 3.4.4 docs get published early?
Also, just replacing the version number in the URL works for the python 3 series (use 3.X even for python 3.0), even farther back than the drop down menu allows. This does not help in this case: https://docs.python.org/3.4/library/asyncio-task.html#asyncio.ensure_future Also, you cannot select the docs for a maintenance release, like 3.4.3. Anyway, it’s not a big deal as long as significant changes are tagged appropriately with notes like “New in version NNN”, which they are. Ideally, the docs would only show the latest changes for released versions of Python, but since some changes (like the one I linked to) are introduced in maintenance versions, it’s probably hard to separate them out into separate branches. Nick On Wed, Jun 10, 2015 at 10:11 AM Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > For example, here is a "New in version 3.4.4" method: > > https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future > > However, the latest release appears to be 3.4.3: > > https://www.python.org/downloads/ > > Is this normal, or did the 3.4.4 docs somehow get published early by > mistake? > > Nick > > -- https://mail.python.org/mailman/listinfo/python-list
Re: How to find number of whole weeks between dates?
On 10/06/2015 18:50, Laura Creighton wrote: In a message of Wed, 10 Jun 2015 20:38:59 +0300, Marko Rauhamaa writes: Divide the number by 7 and you have your answer. I am not sure that is what he wants -- If he gives us a start of Tuesday the 9th of June 2015 (yesterday) and an end of Thursday the 25th of June, that's 16 days. But there is only one Monday-Friday week in there, the 14th-19th. So if the OP wants an answer of 1 for such data, he may be interested in the python calendar module https://docs.python.org/2/library/calendar.html Laura For those who wish to move into the 21st century the link is https://docs.python.org/3/library/calendar.html -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Testing random
sohcahto...@gmail.com writes: [...] > Here's where I think you two are having such a huge disagreement. > Does order matter? It depends what you're pulling random numbers out > for. > > The odds of seeing 2 1 are also only 1/36. But if order doesn't > matter in your application, then 1 2 is equivalent. The odds of > getting 2 1 OR 1 2 is 2/36, or 1/18. [...] I'm not sure what Thomas 'PointedEars' Lahn is talking about. It seems to be something else than what others have been discussing. Others have been discussing a record of the number of times that each possible outcome came up in a sequence of random numbers. There is no other record of the sequence. The number of drawings is much larger than the number of possible outcomes. The subject line refers to testing whether the record of counts is compatible with the drawings being random in the usual sense: independent, with uniform distribution. Someone pointed out that some numbers may not have occurred at all - I think a piece of code needed modification - and so people have commented on the probability of this happening ... and whether it depends on the number of drawings. -- https://mail.python.org/mailman/listinfo/python-list
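The question of how likely it is that some number never comes up, and how that depends on the number of drawings, has a closed form via inclusion-exclusion. The sketch below assumes independent, uniform draws from k possible values; the function name is my own.

```python
from fractions import Fraction
from math import comb


def p_all_seen(k, n):
    """Probability that n uniform draws from k values hit every value."""
    # Inclusion-exclusion over the set of values that are missed.
    return sum(
        Fraction((-1) ** j * comb(k, j) * (k - j) ** n, k ** n)
        for j in range(k + 1)
    )


# Chance that at least one of 9 numbers never occurs, for a few run lengths.
for draws in (9, 20, 50, 100):
    print(draws, float(1 - p_all_seen(9, draws)))
```

The probability of a missing value shrinks quickly as the number of drawings grows, but it is never exactly zero.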
Re: How to find number of whole weeks between dates?
On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote:
> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and May
> would be 5th, 12th and 19th. So expecting 7 whole weeks in total

What I mean is that, given two dates, I want to find the WHOLE weeks
between them. So if the function is given the 2014 calendar and two inputs
(the 4th and 5th months), the whole weeks start on the 7th, 14th, 21st and
28th of April (with the week of 28th April carrying into May), and then on
the 5th, 12th and 19th of May, giving a total of 7 whole weeks. The week of
26th May is not a whole week and will not be counted. Hope that's clear.
Re: How to find number of whole weeks between dates?
On 10/06/2015 21:11, Sebastian M Cheung via Python-list wrote: On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote: Say in 2014 April to May whole weeks would be 7th, 14th 28th April and May would be 5th, 12th and 19th. So expecting 7 whole weeks in total What I mean is given two dates I want to find WHOLE weeks, so if given the 2014 calendar and function has two inputs (4th and 5th month) then 7th, 14th, 21st and 28th from April with 28th April week carrying into May, and then 5th, 12th and 19th May to give total of 7 whole weeks, because 26th May is not a whole week and will not be counted. Hope thats clear. If you'd be kind enough to show the code that you've written and the precise reasons(s) that it doesn't work then we'll be delighted to point you in the right direction. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: How to find number of whole weeks between dates?
On Wed, Jun 10, 2015 at 2:11 PM, Sebastian M Cheung via Python-list wrote: > On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote: >> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and May >> would be 5th, 12th and 19th. So expecting 7 whole weeks in total > > What I mean is given two dates I want to find WHOLE weeks, so if given the > 2014 calendar and function has two inputs (4th and 5th month) then 7th, 14th, > 21st and 28th from April with 28th April week carrying into May, and then > 5th, 12th and 19th May to give total of 7 whole weeks, because 26th May is > not a whole week and will not be counted. So the two "dates" being passed are actually months? The calendar module already suggested should be useful for this. -- https://mail.python.org/mailman/listinfo/python-list
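If the inputs really are months, one possible way to use the calendar module is sketched below. It assumes a "whole week" runs Monday to Sunday and must lie entirely within the first day of the first month and the last day of the last month; whole_weeks is a made-up name.

```python
import calendar
from datetime import date


def whole_weeks(year, first_month, last_month):
    """Return the Mondays of all Mon-Sun weeks fully inside the months."""
    cal = calendar.Calendar(firstweekday=calendar.MONDAY)
    start = date(year, first_month, 1)
    end = date(year, last_month, calendar.monthrange(year, last_month)[1])
    mondays = set()
    for month in range(first_month, last_month + 1):
        # Each week is a list of seven date objects, possibly reaching
        # into the neighbouring months.
        for week in cal.monthdatescalendar(year, month):
            if start <= week[0] and week[-1] <= end:
                mondays.add(week[0])
    return sorted(mondays)


weeks = whole_weeks(2014, 4, 5)
print(len(weeks), [d.isoformat() for d in weeks])
```

Using a set removes the week that straddles April and May, which monthdatescalendar reports once for each month.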
Re: enhancement request: make py3 read/write py2 pickle format
FWIW most of the objections below also apply to JSON, so this doesn't just have to be about repr/literal_eval. I'm definitely a huge proponent of widespread use of something like protocol buffers, both for production code and personal hacky projects. On Wed, Jun 10, 2015 at 2:36 AM, Steven D'Aprano wrote: > On Wednesday 10 June 2015 14:48, Devin Jeanpierre wrote: > > [...] >> and literal_eval is not a great idea. >> >> * the common serializer (repr) does not output a canonical form, and >> can serialize things in a way that they can't be deserialized > > For literals, the canonical form is that understood by Python. I'm pretty > sure that these have been stable since the days of Python 1.0, and will > remain so pretty much forever: The problem is that there are two different ways repr might write out a dict equal to {'a': 1, 'b': 2}. This can make tests brittle -- e.g. it's why doctest fails badly at examples involving dictionaries. Text format protocol buffers output everything sorted, so that you can do textual diffs for compatibility tests and such. At work, one thing we do in places is mock out services using "golden" expected protobuf responses, so that you can test that the server returns exactly that, and test what the client does with that, separately. These are checked into perforce in text format. >> * there is no schema >> * there is no well understood migration story for when the data you >> load and store changes > > literal_eval is not a serialisation format itself. It is a primitive > operation usable when serialising. E.g. you might write out a simple Unix- > style rc file of key:value pairs: > -snip- > > split on "=" and call literal_eval on the value. > > This is a perfectly reasonable light-weight solution for simple > serialisation needs. I could spend a bunch of time writing yet another config file format, or I could use text format protocol buffers, YAML, or TOML and call it a day. 
>> * it encourages the use of eval when literal_eval becomes inconvenient >> or insufficient > > I don't think so. I think that people who make the effort to import ast and > call ast.literal_eval are fully aware of the dangers of eval and aren't > silly enough to start using eval. The problem is when you have your config file format using python literals, and another programmer wants to deal with it and doesn't look at your codebase, and things like that. When transferring data, this can happen a lot, since you are often not the user of the data you wrote, and you can't control how others consume it. They might use eval even if you didn't mean for them to. For example, in JavaScript, this was once a common problem for services exposing JSON, and it still happens even now. >> * It is not particularly well specified or documented compared to the >> alternatives. >> * The types you get back differ in python 2 vs 3 > > Doesn't matter. The type you *write* are different in Python 2 vs 3, so of > course you do. In a shared 2/3 codebase, if I write bytes I expect to get bytes, and if I write unicode I expect to get unicode. (There is a third category of thing, which should be bytes on 2.x and string on 3.x, but it's probably best to handle that outside of the deserializer). If you thread it through repr and literal_eval using different versions for each, unicode in python 3 becomes bytes in python 2, and vice versa. So it makes migrating to Python 3 even harder. >> For most apps, the alternatives are better. Irmen's serpent library is >> strictly better on every front, for example. (Except potentially >> security, who knows.) > > Beyond simple needs, like rc files, literal_eval is not sufficient. You > can't use it to deserialise arbitrary objects. That might be a feature, but > if you need something more powerful than basic ints, floats, strings and a > few others, literal_eval will not be powerful enough. No, it is powerful enough. 
After all, JSON has the same limitations. Protobuf only adds enums and structs to JSON's types, and it's potentially the most-used serialization format in the world by operations per second. Serialization libraries/formats usually need handholding to serialize complex Python objects into simple serializable types. [Except pickle, and that's the very reason it's insecure (per previous discussion in thread.)] -- Devin -- https://mail.python.org/mailman/listinfo/python-list
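For reference, the lightweight rc-file scheme quoted above (split each line on "=" and call literal_eval on the value) might look like the sketch below. The file layout, the comment syntax and the parse_rc name are assumptions, not an existing format.

```python
import ast


def parse_rc(text):
    """Parse 'key = <python literal>' lines into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = ast.literal_eval(value.strip())
    return settings


sample = """
# hypothetical settings file
name = 'demo'
retries = 3
hosts = ['a.example', 'b.example']
"""
config = parse_rc(sample)
print(config)
```

It shows both sides of the argument: the parser is tiny, but the file format has no schema and is only documented by this code.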
Re: Permission to Showcase the Python Program
On 6/9/2015 10:19 AM, Leslie Bush wrote: I’m having trouble reaching an actual human at your organization > as my emails get bounced back. If you sent email to p...@python.org it should not have bounced, as according to https://mail.python.org/mailman/listinfo/python-legal-sig that is the 'legal email address'. But I do not think it really matters for your purpose. My name is Leslie Bush and I am the Intellectual Property Coordinator > for The Great Courses. We produce non-credit, college-level educational > programs on DVD and electronic formats in a lecture series. > The lectures are recorded and then sold to the general public for-profit. > As a didactic tool for enhancing the programs, we include in the lectures > visual elements illustrating works of art, people, events, locations, etc I am a python core developer, a member of PSF, but otherwise have no official position. As a parent, I am familiar with The Great Courses, having used 2 for home schooling. They were very helpful. We are currently producing a course with Dr. John Keyser, Professor of > Computer Science at the University of North Carolina Chapel Hill, entitled > “Computer Science for Everyone: Programming Concepts and Exercises” > The professor would like to use your Python program to showcase in the course. Great. Most of us would consider it silly for him not to. Most of us would also urge that he use Python 3 rather than Python 2, if he is not already. > I am writing to see if we may have permission to do so. If permission > is granted, we will have a copy of our license agreement sent over to > your company’s authorizer for review and a signature. I am not a lawyer, but am 99.990% (and I am not exaggerating) sure that the Python license already gives you the permission you need. 
Our documentation page is at https://docs.python.org/3/ Clicking the "History and License of Python" link takes you to https://docs.python.org/3/license.html Skip to "Terms and conditions for accessing or otherwise using Python" and read "PSF LICENSE AGREEMENT FOR PYTHON 3.4.3". It is about as liberal as can be. It was written by the PSF lawyer and is intended to be a blanket grant of permission, so that the PSF will not need to employ an 'authorizer' to review and sign permissions. This is standard for open-source software.

Companies routinely use Python and write their own public and proprietary code. People write and sell books about Python or that reference Python. Universities teach courses that include Python. People post videos about Python. We WANT people to do all of these things. They all do it without further agreements and signatures beyond what is openly published.

I suspect that some companies paranoid about the possibility of a lawsuit may have their own lawyer review the license agreement to make sure it means what it seems to say and that their use falls within the license. You are free to do the same if worried.

--
Terry Jan Reedy
Re: Did the 3.4.4 docs get published early?
On 6/10/2015 10:11 AM, Nicholas Chammas wrote: For example, here is a "New in version 3.4.4" method: https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future However, the latest release appears to be 3.4.3: https://www.python.org/downloads/ Is this normal, or did the 3.4.4 docs somehow get published early by mistake? The online x.y docs reflect the x.y branch in the repository. New features are not normally added in an x.y.z maintenance release, which is normally bugfixes only. However, asyncio is a new module in 3.4 and marked as 'provisional', which means subject to change during the 3.4 series of releases. Idle is also exceptional in getting uncategorized changes in maintenance releases, so each x.y.z release needs a copy of the Idle doc chapter that is both up-to-date and frozen as of x.y.z. Making this happen is still a work-in-progress. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
On 6/10/2015 6:10 PM, Devin Jeanpierre wrote: The problem is that there are two different ways repr might write out a dict equal to {'a': 1, 'b': 2}. This can make tests brittle Not if one compares objects rather than string representations of objects. I am strongly of the view that code and tests should be written to directly compare objects as much as possible. it's why doctest fails badly at examples involving dictionaries. or sets or addresses or object ids or locale-dependent strings or random numbers or values dependent on random numbers. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
Robert Kern wrote: To allow people to write their own types that can be serialized, you have to let them specify arbitrary callables that will do the reconstruction. If you whitelist the possible reconstruction callables, you have greatly restricted the types that can participate in the serialization system. If whitelisting a type is the *only* thing you need to do to make it serialisable, I think that comes close enough to the stated goal of being able to "serialise all [potentially serialisable] language objects". Having to be explicit about which types are deserialisable is probably a good thing anyway. It gives you an opportunity to specify the mapping between the external format and class names, so that your serialised data doesn't contain assumptions about implementation details of your program. -- Greg -- https://mail.python.org/mailman/listinfo/python-list
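A whitelist that also maps external names to classes, as described here, could look roughly like this. It is only a sketch under my own assumptions: JSON as the wire format, a hypothetical Account class, and constructor arguments that match the attribute names.

```python
import json


class Account:
    """Hypothetical serialisable type."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance


# External tag -> class. Only types listed here can be reconstructed.
REGISTRY = {"account": Account}
TAGS = {cls: tag for tag, cls in REGISTRY.items()}


def serialise(obj):
    # The wire data carries the external tag, not the class or module name.
    return json.dumps({"type": TAGS[type(obj)], "fields": vars(obj)},
                      sort_keys=True)


def deserialise(text):
    payload = json.loads(text)
    try:
        cls = REGISTRY[payload["type"]]
    except KeyError:
        raise ValueError("type not whitelisted: %r" % payload.get("type"))
    return cls(**payload["fields"])


copy = deserialise(serialise(Account("greg", 10)))
```

Because the tag is independent of the class name, the class can be renamed or moved without invalidating stored data.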
Re: enhancement request: make py3 read/write py2 pickle format
On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy wrote: > On 6/10/2015 6:10 PM, Devin Jeanpierre wrote: > >> The problem is that there are two different ways repr might write out >> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle > > > Not if one compares objects rather than string representations of objects. > I am strongly of the view that code and tests should be written to directly > compare objects as much as possible. For serialization formats that always output the same string for the same data (like text format protos), there is no practical difference between the two, except that if you're comparing text, you can easily supply a diff to update one to match the other. -- Devin -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
On Wed, Jun 10, 2015 at 4:39 PM, Devin Jeanpierre wrote: > On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy wrote: >> On 6/10/2015 6:10 PM, Devin Jeanpierre wrote: >> >>> The problem is that there are two different ways repr might write out >>> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle >> >> >> Not if one compares objects rather than string representations of objects. >> I am strongly of the view that code and tests should be written to directly >> compare objects as much as possible. > > For serialization formats that always output the same string for the > same data (like text format protos), there is no practical difference > between the two, except that if you're comparing text, you can easily > supply a diff to update one to match the other. Ugh, there's also the fiddly difference between what goes in and what you read. A serialized data structure might contain lots of data that is ignored by the deserializer (in protobuf), or it might contain data which can't be loaded by the deserializer or produces weird / incorrect results. Being able to inspect and test the serialized data separately from the deserialized data is useful in that regard, so that you know where the failure lies, but it's sort of fuzzy. Some examples of where this crops up: pickles after you've moved a class, JSON encoders that try to be clever and output invalid JSON, protocol buffers with unexpected fields. Overall, though, the diff thing is probably the bigger reason everyone wants to do this sort of thing with serialized data. If you do it right and are principled about it, I don't see a problem with it. -- Devin -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
On 6/10/2015 7:39 PM, Devin Jeanpierre wrote: On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy wrote: On 6/10/2015 6:10 PM, Devin Jeanpierre wrote: The problem is that there are two different ways repr might write out a dict equal to {'a': 1, 'b': 2}. This can make tests brittle You commented about *tests* Not if one compares objects rather than string representations of objects. I am strongly of the view that code and tests should be written to directly compare objects as much as possible. I responded about *tests* For serialization formats that always output the same string for the same data (like text format protos), there is no practical difference between the two, except that if you're comparing text, you can easily supply a diff to update one to match the other. Serialization is a different issue. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
On Thu, Jun 11, 2015 at 8:10 AM, Devin Jeanpierre wrote:
> The problem is that there are two different ways repr might write out
> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle -- e.g.
> it's why doctest fails badly at examples involving dictionaries. Text
> format protocol buffers output everything sorted, so that you can do
> textual diffs for compatibility tests and such.

With Python's JSON module [1], you can pass sort_keys=True to stipulate
that the keys be lexically ordered, which should make the output
"canonical". Pike's Standards.JSON.encode() [2] can take a flag value to
canonicalize the output, which currently has the same effect (sort
mappings by their indices). I did a quick check for Ruby and didn't find
anything in its standard library JSON module [3], but knowing Ruby, it'll
be available somewhere in a gem. A web search for 'perl json' brought up a
CPAN link [4] that has a canonicalize option for sorting by keys. So
that's three out of four definite, one uncertain, where it's pretty easy
to ensure that you get byte-for-byte identical output from a JSON encoder.

Even though failing doctests are a separate problem, it's useful to have
canonical output. Your diffs get less noisy, for instance. Coupled with a
human-readability flag (eg "indent=4" in Python,
"Standards.JSON.HUMAN_READABLE" in Pike) that splits the result over
multiple lines, it can make a file that is pretty easy to diff. Definitely
worth doing... and definitely worth using a JSON encoder rather than
repr().

ChrisA

[1] https://docs.python.org/3/library/json.html#json.dump
[2] http://pike.lysator.liu.se/generated/manual/modref/ex/predef_3A_3A/Standards/JSON.html
[3] http://ruby-doc.org/stdlib-2.0.0/libdoc/json/rdoc/JSON.html
[4] http://search.cpan.org/~makamaka/JSON-2.90/lib/JSON.pm#PERL_-%3E_JSON
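A quick illustration of the sort_keys and indent options on the Python side, using only the standard json module; the sample dicts are arbitrary.

```python
import json

# Two equal dicts built in different insertion orders.
first = {"b": 2, "a": 1}
second = {"a": 1, "b": 2}

# With sort_keys=True the encoder output no longer depends on that order.
canonical_first = json.dumps(first, sort_keys=True)
canonical_second = json.dumps(second, sort_keys=True)

# indent=4 puts each key on its own line, which keeps textual diffs small.
pretty = json.dumps(first, sort_keys=True, indent=4)

print(canonical_first)
print(pretty)
```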
Re: enhancement request: make py3 read/write py2 pickle format
On Wed, Jun 10, 2015 at 4:46 PM, Terry Reedy wrote: > On 6/10/2015 7:39 PM, Devin Jeanpierre wrote: >> >> On Wed, Jun 10, 2015 at 4:25 PM, Terry Reedy wrote: >>> >>> On 6/10/2015 6:10 PM, Devin Jeanpierre wrote: >>> The problem is that there are two different ways repr might write out a dict equal to {'a': 1, 'b': 2}. This can make tests brittle > > > You commented about *tests* > >>> Not if one compares objects rather than string representations of >>> objects. >>> I am strongly of the view that code and tests should be written to >>> directly >>> compare objects as much as possible. > > > I responded about *tests* > >> For serialization formats that always output the same string for the >> same data (like text format protos), there is no practical difference >> between the two, except that if you're comparing text, you can easily >> supply a diff to update one to match the other. > > > Serialization is a different issue. Yes, tests of code that uses serialization (caching, RPCs, etc.). I mentioned above a sort of test that divides tests of a client and server along RPC boundaries by providing fake queries and responses, and testing that those are the queries and responses given by the client and server. This way you don't need to actually start the client and server to test them both and their interactions. This is one example, there are other uses, but they go along the same lines. For example, one can also imagine testing that a serialized structure is identical across version changes, so that it's guaranteed to be forwards/backwards compatible. It is not enough to test that the deserialized form is, because it might differ substantially, as long as the communicated serialized structure is the same. -- Devin -- https://mail.python.org/mailman/listinfo/python-list
Re: Python NBSP DWIM
On Thu, Jun 11, 2015 at 3:11 AM, Steven D'Aprano wrote: > (Oh, and for the record, there are at least two non-breaking spaces in > Unicode, U+00A0 "NO-BREAK SPACE" and U+202F "NARROW NO-BREAK SPACE".) > > http://www.unicode.org/charts/PDF/U0080.pdf > http://www.unicode.org/charts/PDF/U2000.pdf And U+FEFF "ZERO WIDTH NO-BREAK SPACE", notable because it's also used as the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've been fighting with VLC Media Player over the font it uses for subtitles; for some bizarre reason, that font represents U+FEFF not with zero pixels of emptiness, but with a box containing the letters "ZWN" "BSP" on two lines. Yeah, because that totally takes up zero width and looks like blank space. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
On Wed, Jun 10, 2015, at 19:30, Gregory Ewing wrote: > If whitelisting a type is the *only* thing you need to > do to make it serialisable, I think that comes close > enough to the stated goal of being able to "serialise > all [potentially serialisable] language objects". IMO the serialization framework should handle this by providing your own way to look them up (almost but not entirely unlike providing your own globals table to eval) rather than by having a whitelist. -- https://mail.python.org/mailman/listinfo/python-list
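One way to read "providing your own way to look them up" is the find_class hook on pickle.Unpickler, which pickle calls whenever a stream refers to a global by name. The sketch below is only an illustration under that assumption; LookupUnpickler, loads_with_lookup and the lookup mapping are hypothetical names, and Fraction is used purely as a convenient example type.

```python
import io
import pickle
from fractions import Fraction


class LookupUnpickler(pickle.Unpickler):
    """Unpickler that resolves globals through a caller-supplied mapping."""

    def __init__(self, file, lookup):
        super().__init__(file)
        self._lookup = lookup  # maps (module, name) -> object

    def find_class(self, module, name):
        # pickle calls this for every global reference in the stream.
        try:
            return self._lookup[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(
                "global %s.%s is not available" % (module, name))


def loads_with_lookup(data, lookup):
    """Unpickle data, resolving names only through the given lookup table."""
    return LookupUnpickler(io.BytesIO(data), lookup).load()


payload = pickle.dumps(Fraction(1, 3))
restored = loads_with_lookup(payload, {("fractions", "Fraction"): Fraction})
```

Passing an empty mapping makes the same payload fail with UnpicklingError, much as eval fails on a name missing from the globals table it was given.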
Re: Python NBSP DWIM
On Wed, Jun 10, 2015, at 20:09, Chris Angelico wrote: > And U+FEFF "ZERO WIDTH NO-BREAK SPACE", notable because it's also used as > the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've > been > fighting with VLC Media Player over the font it uses for subtitles; for > some bizarre reason, that font represents U+FEFF not with zero pixels of > emptiness, but with a box containing the letters "ZWN" "BSP" on two > lines. > Yeah, because that totally takes up zero width and looks like blank > space. As I understand it, the proper behavior is that the ZWNBSP that is the byte order mark shall never appear in an in-memory representation of the first line of a BOM-encoded file, or any other line of the concatenation of two BOM-encoded files, but should "vanish" when the file is opened and first read from. So it shouldn't be showing up in your subtitles regardless of its rendering behavior. The real world, needless to say, isn't so nice. IIRC there's also a font in MS windows that uses various glyphs which are zero-width, but are not blank, to represent ZWJ, ZWNJ, RLM, and LRM. Good for seeing what is happening, bad for actually rendering text that's intended to contain these characters. Though there's another argument that ideally a rendering engine should not render any such glyph unless something like "visible controls" has been selected (the real world, again, isn't so nice, which is why most symbols intended for visible control style rendering have their own distinct code points rather than using those of the control characters they represent). -- https://mail.python.org/mailman/listinfo/python-list
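For what it's worth, the "vanish on read" behaviour described above can be seen with Python's own codecs. This is a small sketch, not a statement about what VLC or youtube-dl do; the variable names are made up for the example.

```python
import codecs

# A UTF-8 file that starts with a byte-order mark.
raw = codecs.BOM_UTF8 + "first line\n".encode("utf-8")

# Plain UTF-8 decoding keeps the BOM as a U+FEFF character in the text.
kept = raw.decode("utf-8")

# The "utf-8-sig" codec drops one leading BOM while decoding.
stripped = raw.decode("utf-8-sig")

# A U+FEFF in the middle of the text is ordinary data and survives either way.
middle = "a\ufeffb".encode("utf-8").decode("utf-8-sig")
```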
Re: Python NBSP DWIM
On Thu, Jun 11, 2015 at 11:02 AM, wrote: > > On Wed, Jun 10, 2015, at 20:09, Chris Angelico wrote: > > And U+FEFF "ZERO WIDTH NO-BREAK SPACE", notable because it's also used as > > the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've > > been > > fighting with VLC Media Player over the font it uses for subtitles; for > > some bizarre reason, that font represents U+FEFF not with zero pixels of > > emptiness, but with a box containing the letters "ZWN" "BSP" on two > > lines. > > Yeah, because that totally takes up zero width and looks like blank > > space. > > As I understand it, the proper behavior is that the ZWNBSP that is the > byte order mark shall never appear in an in-memory representation of the > first line of a BOM-encoded file, or any other line of the concatenation > of two BOM-encoded files, but should "vanish" when the file is opened > and first read from. So it shouldn't be showing up in your subtitles > regardless of its rendering behavior. It's a perfectly valid character for other purposes; it's coming up in the middle of pieces of text, which should be 100% legal. No, it's a font problem. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: How to find number of whole weeks between dates?
Yes, just whole weeks given any two months. I did look into the calendar module but couldn't find specifically what I need.
Re: Python NBSP DWIM
On Thu, 11 Jun 2015 10:09 am, Chris Angelico wrote:

> On Thu, Jun 11, 2015 at 3:11 AM, Steven D'Aprano wrote:
>> (Oh, and for the record, there are at least two non-breaking spaces in
>> Unicode, U+00A0 "NO-BREAK SPACE" and U+202F "NARROW NO-BREAK SPACE".)
>>
>> http://www.unicode.org/charts/PDF/U0080.pdf
>> http://www.unicode.org/charts/PDF/U2000.pdf
>
> And U+FEFF "ZERO WIDTH NO-BREAK SPACE",

No, despite the name, that is not a space character, it is a formatting character. Due to Unicode's stability policy, the name is stuck forever, but it should not be treated as a space character:

py> import unicodedata
py> unicodedata.category(' ')
'Zs'
py> unicodedata.category('\u00A0')  # NBSP
'Zs'
py> unicodedata.category('\uFEFF')  # ZWNBSP
'Cf'

Ideally, outside of the BOM, you should never come across a ZWNBSP. You should use U+2060 WORD JOINER instead. But if you do come across one outside of the BOM, it should be treated as a legitimate non-space character:

http://www.unicode.org/faq/utf_bom.html#bom6

Although ZWNBSP is a "default ignorable" code point, I believe that the font is well within its rights to show it with a visible glyph:

"Fonts can contain glyphs intended for visible display of default ignorable code points that would otherwise be rendered invisibly when not supported."

http://www.unicode.org/faq/unsup_char.html

> notable because it's also used as
> the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've
> been fighting with VLC Media Player over the font it uses for subtitles;
> for some bizarre reason, that font represents U+FEFF not with zero pixels
> of emptiness, but with a box containing the letters "ZWN" "BSP" on two
> lines. Yeah, because that totally takes up zero width and looks like blank
> space.

Why do the subtitles contain ZWNBSP in the first place? Surely they're not English subtitles?

--
Steven
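As a rough sketch of the advice above (use U+2060 WORD JOINER rather than ZWNBSP outside of the BOM), something like the following could be used to clean up text before handing it to a renderer. The function names are hypothetical, and dropping a leading U+FEFF assumes it really was a BOM.

```python
import unicodedata

ZWNBSP = "\ufeff"        # ZERO WIDTH NO-BREAK SPACE, also the BOM
WORD_JOINER = "\u2060"   # the recommended replacement outside the BOM


def modernise_zwnbsp(text):
    """Drop a leading BOM and replace any other U+FEFF with U+2060."""
    if text.startswith(ZWNBSP):
        text = text[1:]
    return text.replace(ZWNBSP, WORD_JOINER)


def format_characters(text):
    """Return (index, name) for each character in category Cf."""
    return [(i, unicodedata.name(ch, "<unnamed>"))
            for i, ch in enumerate(text)
            if unicodedata.category(ch) == "Cf"]
```

format_characters is handy for finding where such characters sit in a subtitle line when the client rendering it does not show them.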
Re: Python NBSP DWIM
On Thu, Jun 11, 2015 at 12:26 PM, Steven D'Aprano wrote: > No, despite the name, that is not a space character, it is a formatting > character. Due to Unicode's stability policy, the name is stuck forever, > but it should not be treated as a space character: > > py> unicodedata.category(' ') > 'Zs' > py> unicodedata.category('\u00A0') # NBSP > 'Zs' > py> unicodedata.category('\uFEFF') # ZWNBSP > 'Cf' > > > Ideally, outside of the BOM, you should never come across a ZWNBSP. You > should use U+2060 WORD JOINER instead. But if you do come across one > outside of the BOM, it should be treated as a legitimate non-space > character: > > http://www.unicode.org/faq/utf_bom.html#bom6 > > Although ZWNBSP is a "default ignorable" code point, I believe that the font > is well within its rights to show it with a visible glyph: > > "Fonts can contain glyphs intended for visible display of > default ignorable code points that would otherwise be > rendered invisibly when not supported." > > http://www.unicode.org/faq/unsup_char.html Huh. Okay, my bad. I was under the impression that it was supposed to take up no width, as the name implies, but stability trumps logic sometimes. Learn something new every day. >> notable because it's also used as >> the byte-order mark (as its counterpart, U+FFFE, is unallocated). I've >> been fighting with VLC Media Player over the font it uses for subtitles; >> for some bizarre reason, that font represents U+FEFF not with zero pixels >> of emptiness, but with a box containing the letters "ZWN" "BSP" on two >> lines. Yeah, because that totally takes up zero width and looks like blank >> space. > > Why do the subtitles contain ZWNBSP in the first place? Surely they're not > English subtitles? No, they're not :) The character comes up in the Cantonese and Japanese subs for Once Upon A December. http://youtu.be/CEpcUeWP0bg http://youtu.be/WFZAaHrHens Possibly some others in the series as well. 
It may well be a fault in the subtitles, but most programs I've seen don't show U+FEFF as a big fat box. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Python NBSP DWIM
On Wed, Jun 10, 2015, at 23:05, Chris Angelico wrote: > http://youtu.be/CEpcUeWP0bg > http://youtu.be/WFZAaHrHens An example of the actual subtitle text would be more useful than a youtube link to the video, since we're unlikely to be able to see what context the character appears in if our client doesn't show it. (I don't think the default youtube player does). And you haven't even included a time code. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to find number of whole weeks between dates?
On 06/10/2015 02:11 PM, Sebastian M Cheung via Python-list wrote: > On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote: >> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and May >> would be 5th, 12th and 19th. So expecting 7 whole weeks in total > > What I mean is given two dates I want to find WHOLE weeks, so if given the > 2014 calendar and function has two inputs (4th and 5th month) then 7th, 14th, > 21st and 28th from April with 28th April week carrying into May, and then > 5th, 12th and 19th May to give total of 7 whole weeks, because 26th May is > not a whole week and will not be counted. > > Hope thats clear. I think Joel had the right idea. First calculate the rough number of weeks by taking the number of days between the date and divide by seven. Then check to see what the start date's day of week is, and adjust the rough week count down by one if it's not the first day of the week. I'm not sure if you have to check the end date's day of week or not. I kind of think checking the first one only is sufficient, but I could be wrong. You'll have to code it up and test it, which I assume you've been doing up to this point, even though you haven't shared any code. -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
On Thu, 11 Jun 2015 08:10 am, Devin Jeanpierre wrote:

[...]
>> For literals, the canonical form is that understood by Python. I'm pretty
>> sure that these have been stable since the days of Python 1.0, and will
>> remain so pretty much forever:
>
> The problem is that there are two different ways repr might write out
> a dict equal to {'a': 1, 'b': 2}. This can make tests brittle -- e.g.
> it's why doctest fails badly at examples involving dictionaries.

Only if they are badly written. Yes, dicts are *less convenient* for doctests, but if they fail, the blame is on the author of the tests themselves, not doctest.

Unordered output is not a problem for dicts, because dicts also have unordered *input*. It doesn't matter whether you input {'a':1,'b':2} or {'b':2,'a':1}, you will get the same dict either way.

[...]
> I could spend a bunch of time writing yet another config file format,
> or I could use text format protocol buffers, YAML, or TOML and call it
> a day.

Writing a rc parser is so trivial that it's almost easier to just write it than it is to look up the APIs for YAML or JSON, to say nothing of the rigmarole of defining a protocol buffer config file, compiling it, importing the module, and using that.

    import ast
    import collections

    def read(configfile):
        # Map each key to its parsed value, in the order found in the file.
        config = collections.OrderedDict()
        with open(configfile) as f:
            for line in f:
                line = line.strip()
                # Skip blank lines and comment lines.
                if not line or line.startswith('#'):
                    continue
                key, value = line.split("=", 1)
                key = key.rstrip()
                value = value.lstrip()
                # Values are Python literals (ints, strings, lists, ...).
                config[key] = ast.literal_eval(value)
        return config

That's a basic, *but acceptable*, rc parser written in literally under a minute. At the risk of ending up with egg on my face, I reckon that it's so simple and so obviously correct that I can tell it works correctly without even testing it. (Famous last words, huh?)

Unlike any of the richer, more powerful serialisation formats like YAML, JSON, or protocol buffer, it's not only human readable but human writable too.
By which I mean, while it is *possible* for a sufficiently motivated person to write correctly formatted JSON, YAML or even XML, it's not really something you would choose to do willingly. But Unix sys admins hand-edit rc files every day.

But of course this also means it's less powerful and can deal with fewer types of data. Power comes at a cost of complexity, and simplicity itself can be a virtue. I wouldn't use JSON etc. for config files until I was sure that a simpler INI or RC file wasn't sufficient for my needs.

Somehow I have drifted away from serialisation in general to specifically config files... never mind.

[...]
> The problem is when you have your config file format using python
> literals, and another programmer wants to deal with it and doesn't
> look at your codebase, and things like that. When transferring data,
> this can happen a lot, since you are often not the user of the data
> you wrote, and you can't control how others consume it.

Not only can I not control how they consume it, but I don't care how they consume it :-)

I hear what you are saying, and I don't disagree with it. I'm just standing up for simplicity as a virtue when appropriate. If I'm writing a script to save a bunch of values to pass to another script after some human editing, it's faster for me to just write out the key:value pairs than it is to learn how to use protocol buffer, deal with a separate compilation step, etc. It's actually easier to write out, and read in, the key:values than to use the configparser module. If you don't need multiple sections, default values, or variable interpolation, even configparser is overkill.

But if I'm swapping data with others, or if I have to use a richer set of types or functionality, then naturally I'm going to need something more powerful, preferably something standard so I don't have to document the internal format, just say "use XML with this schema" or whatever.

> They might use
> eval even if you didn't mean for them to. For example, in JavaScript,
> this was once a common problem for services exposing JSON, and it
> still happens even now.

If they choose to use eval, *that's not my fault*. You can't stop them from deserialising your data and then passing any and all strings to eval, so why should I be expected to stop them from something similar?

[...]
>> Beyond simple needs, like rc files, literal_eval is not sufficient. You
>> can't use it to deserialise arbitrary objects. That might be a feature,
>> but if you need something more powerful than basic ints, floats, strings
>> and a few others, literal_eval will not be powerful enough.
>
> No, it is powerful enough. After all, JSON has the same limitations.

In the sense that you can build arbitrary objects from a combination of a few basic types, yes, literal_eval is "powerful enough" if you are prepared to re-invent JSON, YAML, or protocol buffer.

But I'm not talking about re-inventing what already exists. If I want
Re: Python NBSP DWIM
On Thu, Jun 11, 2015 at 1:18 PM, wrote: > On Wed, Jun 10, 2015, at 23:05, Chris Angelico wrote: >> http://youtu.be/CEpcUeWP0bg >> http://youtu.be/WFZAaHrHens > > An example of the actual subtitle text would be more useful than a > youtube link to the video, since we're unlikely to be able to see what > context the character appears in if our client doesn't show it. (I don't > think the default youtube player does). And you haven't even included a > time code. Unfortunately I can't really offer anything better, as the text I saw was after a lot of processing (youtube-dl, then some other post-processing), and I don't actually remember which file it was that bugged me about this, now. But the subs/annotations (visible in the default player if you turn on "Subtitles" down the bottom) do include U+FEFF; in each case, it's on the very last line of the song, although that's not where I remember it occurring. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Python NBSP DWIM
On Thu, 11 Jun 2015 01:05 pm, Chris Angelico wrote: [...] >> Why do the subtitles contain ZWNBSP in the first place? Surely they're >> not English subtitles? > > No, they're not :) The character comes up in the Cantonese and > Japanese subs for Once Upon A December. > > http://youtu.be/CEpcUeWP0bg > http://youtu.be/WFZAaHrHens > > Possibly some others in the series as well. It may well be a fault in > the subtitles, but most programs I've seen don't show U+FEFF as a big > fat box. I think that for backwards compatibility, applications (or fonts) are permitted to treat U+FEFF as a zero-width invisible character, so perhaps you can raise a feature request with VLC. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: How to find number of whole weeks between dates?
On Thu, Jun 11, 2015 at 1:19 PM, Michael Torrie wrote:
> I think Joel had the right idea. First calculate the rough number of
> weeks by taking the number of days between the date and divide by seven.
> Then check to see what the start date's day of week is, and adjust the
> rough week count down by one if it's not the first day of the week. I'm
> not sure if you have to check the end date's day of week or not. I kind
> of think checking the first one only is sufficient, but I could be
> wrong. You'll have to code it up and test it, which I assume you've
> been doing up to this point, even though you haven't shared any code.

Alternatively, you could start by rounding the start date up to the next week boundary, then round the end date down to the previous week boundary, and then calculate from there. Something like this:

>>> import datetime
>>> start = datetime.date(2015, 1, 4)
>>> end = datetime.date(2015, 4, 2)
>>> start += datetime.timedelta(7-start.isoweekday())
>>> end -= datetime.timedelta(end.isoweekday() % 7)

Now both dates represent Sundays. If either already did, it hasn't been changed.

>>> (end - start).days//7
12

There are twelve complete Sunday-to-Sunday weeks (plus any loose days either end) between the original dates. Depending on your definition of "complete week", you may need to adjust this code some.

ChrisA
Re: How to find number of whole weeks between dates?
On Wed, Jun 10, 2015 at 8:01 PM, Sebastian M Cheung via Python-list wrote: > yes just whole weeks given any two months, I did looked into calendar module > but couldn't find specifically what i need. >>> cal.monthdays2calendar(2014, 4) + cal.monthdays2calendar(2014, 5) [[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)], [(7, 0), (8, 1), (9, 2), (10, 3), (11, 4), (12, 5), (13, 6)], [(14, 0), (15, 1), (16, 2), (17, 3), (18, 4), (19, 5), (20, 6)], [(21, 0), (22, 1), (23, 2), (24, 3), (25, 4), (26, 5), (27, 6)], [(28, 0), (29, 1), (30, 2), (0, 3), (0, 4), (0, 5), (0, 6)], [(0, 0), (0, 1), (0, 2), (1, 3), (2, 4), (3, 5), (4, 6)], [(5, 0), (6, 1), (7, 2), (8, 3), (9, 4), (10, 5), (11, 6)], [(12, 0), (13, 1), (14, 2), (15, 3), (16, 4), (17, 5), (18, 6)], [(19, 0), (20, 1), (21, 2), (22, 3), (23, 4), (24, 5), (25, 6)], [(26, 0), (27, 1), (28, 2), (29, 3), (30, 4), (31, 5), (0, 6)]] You just need to: 1) Trim the first and last weeks off since they contain invalid dates. 2) Merge the overlapping last week of April and first week of May. 3) Count the resulting number of weeks in the list. Alternatively, the dateutil.rrule module could probably be used to do this fairly easily, but it's a third-party module and not part of the standard library. https://labix.org/python-dateutil -- https://mail.python.org/mailman/listinfo/python-list
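The three steps above can be sketched roughly as follows. This is only one possible reading: it uses calendar.Calendar.monthdayscalendar (day numbers only, weeks starting on Monday), merges a month's trailing partial week with the next month's leading partial week, and then counts the weeks with no zero placeholders instead of literally trimming the first and last rows. The function name is hypothetical, and both months are assumed to be in the same year.

```python
import calendar


def whole_weeks_in_months(year, first_month, last_month):
    """Count Monday-to-Sunday weeks fully inside the given span of months."""
    cal = calendar.Calendar()  # weeks start on Monday by default
    merged = []
    for month in range(first_month, last_month + 1):
        for week in cal.monthdayscalendar(year, month):
            if merged and 0 in merged[-1] and week[0] == 0:
                # The previous month ended mid-week and this month starts
                # mid-week: both rows describe the same calendar week.
                merged[-1] = [a or b for a, b in zip(merged[-1], week)]
            else:
                merged.append(week)
    # A week containing a 0 runs outside the requested months.
    return sum(1 for week in merged if 0 not in week)
```

For April and May 2014 this counts the weeks starting 7, 14, 21 and 28 April and 5, 12 and 19 May, which is the total of 7 asked for earlier in the thread.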
Re: Python NBSP DWIM
On Thu, Jun 11, 2015 at 1:27 PM, Steven D'Aprano wrote: > On Thu, 11 Jun 2015 01:05 pm, Chris Angelico wrote: > [...] >>> Why do the subtitles contain ZWNBSP in the first place? Surely they're >>> not English subtitles? >> >> No, they're not :) The character comes up in the Cantonese and >> Japanese subs for Once Upon A December. >> >> http://youtu.be/CEpcUeWP0bg >> http://youtu.be/WFZAaHrHens >> >> Possibly some others in the series as well. It may well be a fault in >> the subtitles, but most programs I've seen don't show U+FEFF as a big >> fat box. > > I think that for backwards compatibility, applications (or fonts) are > permitted to treat U+FEFF as a zero-width invisible character, so perhaps > you can raise a feature request with VLC. Yeah. Well, like I said - learn something new every day. I didn't know it wasn't a bug. (Though it'd still be a font issue, not a VLC one. With other fonts, it comes up looking different, in some cases invisible. Unfortunately, the fonts that look good aren't the fonts that have glyphs for all characters, so I need to figure out why font substitution isn't working right. But that's a separate issue.) ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: How to find number of whole weeks between dates?
On Wed, Jun 10, 2015 at 9:19 PM, Michael Torrie wrote:
> On 06/10/2015 02:11 PM, Sebastian M Cheung via Python-list wrote:
>> On Wednesday, June 10, 2015 at 6:06:09 PM UTC+1, Sebastian M Cheung wrote:
>>> Say in 2014 April to May whole weeks would be 7th, 14th 28th April and May
>>> would be 5th, 12th and 19th. So expecting 7 whole weeks in total
>>
>> What I mean is given two dates I want to find WHOLE weeks, so if given the
>> 2014 calendar and function has two inputs (4th and 5th month) then 7th,
>> 14th, 21st and 28th from April with 28th April week carrying into May, and
>> then 5th, 12th and 19th May to give total of 7 whole weeks, because 26th May
>> is not a whole week and will not be counted.
>>
>> Hope thats clear.
>
> I think Joel had the right idea. First calculate the rough number of
> weeks by taking the number of days between the date and divide by seven.
> Then check to see what the start date's day of week is, and adjust the
> rough week count down by one if it's not the first day of the week. I'm
> not sure if you have to check the end date's day of week or not. I kind
> of think checking the first one only is sufficient, but I could be
> wrong. You'll have to code it up and test it, which I assume you've
> been doing up to this point, even though you haven't shared any code.

I don't think the logic is quite right. Consider:

>>> import calendar
>>> from datetime import date
>>> cal = calendar.TextCalendar()
>>> print(cal.formatmonth(2014, 6))
     June 2014
Mo Tu We Th Fr Sa Su
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30

>>> date(2014, 7, 1) - date(2014, 6, 1)
datetime.timedelta(30)
>>> _.days // 7 - 1
3
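The June 2014 calendar above has four Monday-to-Sunday weeks (2-8, 9-15, 16-22, 23-29), while the rough calculation gives 3. A small sketch that handles this case is below. It assumes weeks start on Monday and that both endpoints are inclusive; the function name is hypothetical.

```python
from datetime import date, timedelta


def whole_weeks_between(start, end):
    """Count Monday-to-Sunday weeks lying entirely within [start, end]."""
    # Move forward to the first Monday on or after start
    # (date.weekday() returns 0 for Monday).
    first_monday = start + timedelta(days=(7 - start.weekday()) % 7)
    # Number of days from that Monday through end, inclusive.
    span = (end - first_monday).days + 1
    # Each complete block of seven days is one whole week.
    return max(span, 0) // 7
```

Called with 1 June 2014 and 30 June 2014 it counts 4, and with 1 April 2014 and 31 May 2014 it counts the 7 weeks from the original question.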
Re: enhancement request: make py3 read/write py2 pickle format
Devin Jeanpierre : > For example, one can also imagine testing that a serialized structure > is identical across version changes, so that it's guaranteed to be > forwards/backwards compatible. It is not enough to test that the > deserialized form is, because it might differ substantially, as long > as the communicated serialized structure is the same. There are merits to canonical serialization formats, but that approach to testing is far too simplistic. A test case should accept all observed behavior that is correct. Marko -- https://mail.python.org/mailman/listinfo/python-list
New Python student needs help with execution
I installed 2.7.9 on a Win8.1 machine. The Coursera instructor did a simple install then executed Python from a file in which he'd put a simple hello world script. My similar documents folder cannot see the python executable. How do I make this work? -- https://mail.python.org/mailman/listinfo/python-list
Re: enhancement request: make py3 read/write py2 pickle format
Snipped aplenty. On Wed, Jun 10, 2015 at 8:21 PM, Steven D'Aprano wrote: > On Thu, 11 Jun 2015 08:10 am, Devin Jeanpierre wrote: > [...] >> I could spend a bunch of time writing yet another config file format, >> or I could use text format protocol buffers, YAML, or TOML and call it >> a day. > > Writing a rc parser is so trivial that it's almost easier to just write it > than it is to look up the APIs for YAML or JSON, to say nothing of the > rigmarole of defining a protocol buffer config file, compiling it, > importing the module, and using that. > -snip > > That's a basic, *but acceptable*, rc parser written in literally under a > minute. At the risk of ending up with egg on my face, I reckon that it's so > simple and so obviously correct that I can tell it works correctly without > even testing it. (Famous last words, huh?) I won't try to egg you. That said, you have to write tests. Also, everyone who uses it has to learn the format and API, and it may have corner cases you aren't aware of, it has to get ported to python 3 if you wrote it for python 2, the parsing errors are obscure and might need improvement, and so on. There's a place for this, but I suspect it is small compared to the place where it seemed like a good idea at the time. >>> Beyond simple needs, like rc files, literal_eval is not sufficient. You >>> can't use it to deserialise arbitrary objects. That might be a feature, >>> but if you need something more powerful than basic ints, floats, strings >>> and a few others, literal_eval will not be powerful enough. >> >> No, it is powerful enough. After all, JSON has the same limitations. > > In the sense that you can build arbitrary objects from a combination of a > few basic types, yes, literal_eval is "powerful enough" if you are prepared > to re-invent JSON, YAML, or protocol buffer. > > But I'm not talking about re-inventing what already exists. If I want JSON, > I'll use JSON, not spend weeks or months re-writing it from scratch. 
I > can't do this: > > class MyClass: > pass > > a = MyClass() > serialised = repr(a) > b = ast.literal_eval(serialised) > assert a == b I don't understand. You can't do that in JSON, YAML, XML, or protocol buffers, either. They only provide a small set of types, comparable to (but smaller) than the set of types you get from literal_eval/repr. -- Devin -- https://mail.python.org/mailman/listinfo/python-list
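To make the comparison concrete, here is a small sketch of what does and does not round-trip through repr/literal_eval and through JSON. The sample data and helper names are made up for the example.

```python
import ast
import json

basic = {"name": "x", "values": [1, 2.5, None, True], "pair": (1, 2)}

# repr/literal_eval round-trips the built-in literals, including the tuple.
via_literal_eval = ast.literal_eval(repr(basic))

# JSON has no tuple type, so the tuple comes back as a list.
via_json = json.loads(json.dumps(basic))


class MyClass:
    pass


def literal_eval_fails(obj):
    """True if repr(obj) cannot be read back by ast.literal_eval."""
    try:
        ast.literal_eval(repr(obj))
    except (ValueError, SyntaxError):
        return True
    return False


def json_fails(obj):
    """True if obj cannot be serialized by the json module as-is."""
    try:
        json.dumps(obj)
    except TypeError:
        return True
    return False
```

Neither approach handles an instance of MyClass without extra work, which is the point being made above about JSON and the other formats.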