Re: Looking up a dictionary _key_ by key?

2015-06-24 Thread Marko Rauhamaa
Ian Kelly :

> I don't think that it's fundamentally broken. A simple example would
> be the int 3, vs. the float 3, vs. the Decimal 3. All of them compare
> equal to one another, but they are distinct values, and sometimes it
> might be useful to be able to determine which one is actually a key in
> the dict.

One possibility is to enter the key on the value side as well:

d[key] = (key, value)

...

canonical_key, value = d[key]
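
A minimal sketch of this idea, assuming Python 3. The helper names put() and
get_with_key() are made up for illustration. put() keeps the first key object
it saw, which mirrors how a dict itself keeps the original key on update.

```python
def put(mapping, key, value):
    """Store value under key, remembering the first key object used."""
    if key in mapping:
        canonical_key, _ = mapping[key]
    else:
        canonical_key = key
    mapping[key] = (canonical_key, value)


def get_with_key(mapping, key):
    """Return (canonical_key, value) for any key that compares equal."""
    canonical_key, value = mapping[key]
    return canonical_key, value


d = {}
put(d, 3, "three")                            # stored under the int 3
canonical_key, value = get_with_key(d, 3.0)   # looked up with the float 3.0
```

Here canonical_key is the int that was originally stored, even though the
lookup used a float.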


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


rhythmbox plugin problem

2015-06-24 Thread L E
Hello,

I am trying to get some old plugins I wrote to work on a newer version of
rhythmbox.

When I try to load the plugin I see:

(rhythmbox:3092): libpeas-WARNING **: nowplaying-lcd:
/usr/lib/rhythmbox/plugins/nowplaying-lcd/libnowplaying-lcd.so: cannot open
shared object file: No such file or directory

(rhythmbox:3092): libpeas-WARNING **: Could not load plugin module:
'nowplaying-lcd'


Any ideas about what is going on here?

thanks.


Pure Python Data Mangling or Encrypting

2015-06-24 Thread Randall Smith
Chunks of data (about 2MB) are to be stored on machines using a 
peer-to-peer protocol.  The recipient of these chunks can't assume that 
the payload is benign.  While the data senders are supposed to encrypt 
data, that's not guaranteed, and I'd like to protect the recipient 
against exposure to nefarious data by mangling or encrypting the data 
before it is written to disk.


My original idea was for the recipient to encrypt using AES.  But I want 
to keep this software pure Python "batteries included" and not require 
installation of other platform-dependent software.  Pure Python AES and 
even DES are just way too slow.  I don't know that I really need
encryption here; some type of fast mangling algorithm whose output a bad
actor sending a payload can't guess ahead of time may be enough.


Any ideas are appreciated.  Thanks.

-Randall



Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Devin Jeanpierre
How about a random substitution cipher? This will be ultra-weak, but
fast (using bytes.translate/bytes.maketrans) and seems to be the kind
of thing you're asking for.
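
Something along these lines, as a rough sketch only (Python 3 assumed; the
function name make_tables() is made up for illustration, and this is
obfuscation, not real encryption):

```python
import random


def make_tables(rng=None):
    """Build a random byte-substitution table and its inverse."""
    rng = rng or random.SystemRandom()
    permuted = list(range(256))
    rng.shuffle(permuted)
    # A 256-byte table: byte b is replaced by encode[b].
    encode = bytes(permuted)
    # The inverse table maps encode[b] back to b.
    decode = bytes.maketrans(encode, bytes(range(256)))
    return encode, decode


encode, decode = make_tables()
payload = b"some untrusted chunk of data" * 4
mangled = payload.translate(encode)
restored = mangled.translate(decode)
```

The recipient would need to keep the table (or the seed material) private and
per-installation, otherwise a sender can predict the mangled output.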

-- Devin

On Tue, Jun 23, 2015 at 12:02 PM, Randall Smith  wrote:
> Chunks of data (about 2MB) are to be stored on machines using a peer-to-peer
> protocol.  The recipient of these chunks can't assume that the payload is
> benign.  While the data senders are supposed to encrypt data, that's not
> guaranteed, and I'd like to protect the recipient against exposure to
> nefarious data by mangling or encrypting the data before it is written to
> disk.
>
> My original idea was for the recipient to encrypt using AES.  But I want to
> keep this software pure Python "batteries included" and not require
> installation of other platform-dependent software.  Pure Python AES and even
> DES are just way too slow.  I don't know that I really need encryption here,
> but some type of fast mangling algorithm where a bad actor sending a payload
> can't guess the output ahead of time.
>
> Any ideas are appreciated.  Thanks.
>
> -Randall


Re: Looking up a dictionary _key_ by key?

2015-06-24 Thread Mark Lawrence

On 24/06/2015 01:47, Dan Stromberg wrote:
> Would I have to do an O(n) search to find my key?

Can you use something from here
https://pypi.python.org/pypi/sortedcontainers/0.9.6 with the bisect module?
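
As a rough illustration of the bisect part only: the sketch below uses a
plain sorted list from the standard library as a stand-in for a sorted
container, and assumes all keys can be ordered against each other. The name
find_stored_key() is hypothetical.

```python
import bisect


def find_stored_key(sorted_keys, probe):
    """Return the stored key object equal to probe, or None (O(log n))."""
    i = bisect.bisect_left(sorted_keys, probe)
    if i < len(sorted_keys) and sorted_keys[i] == probe:
        return sorted_keys[i]
    return None


keys = sorted([1, 2.5, 3, 10])
found = find_stored_key(keys, 3.0)   # probe with a float, get the stored int
missing = find_stored_key(keys, 4)
```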


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Next CodinGame online programming contest on June, 27 - Python available

2015-06-24 Thread Maria Martin
Hi everyone! 

On June 27th, "Code of the Rings", an online coding battle, will launch. It's
free and open to all. You will have 24 hours to code and optimize your solution
to a puzzle. 

What will be exciting and fun is that it will be VERY EASY to start and to get 
something that works, but complex to produce the most efficient code... Over 
the 24 hours, you will be able to submit your code as much as you like, 
whenever you like. No restrictions, no obligation :) 

Registration is open: http://www.codingame.com/challenge/code-of-the-rings

- Duration: 24 hours ; 1 game to solve 
- Participation is 100% online and free 
- Over 23 coding languages to choose from including Python
- Prizes to win, and 25+ t-shirts 
- Apply to sponsoring companies offering jobs and internships
- International Leaderboard + Leaderboard by University

Please feel free to share the information; your feedback is more than
welcome! :)

Hope to see you there and keep coding!


Re: To write headers once with different values in separate row in CSV

2015-06-24 Thread kbtyo
On Tuesday, June 23, 2015 at 3:12:40 PM UTC-4, John Gordon wrote:
> Sahlusar writes:
> 
> > However, when I extrapolate this same logic with a list like:
> 
> > ('Response.MemberO.PMembers.PMembers.Member.CurrentEmployer.EmployerAddress
> > .TimeAtPreviousAddress.', None), where the headers/columns are the first
> > item (only to be written out once) with different values. I receive an
> > output CSV with repeating headers and values all printed in one long string
> 
> First, I would try to determine if the problem is in the makerows()
> function, or if the problem is elsewhere.
> 
> Have you tried creating some dummy data by hand and seeing how makerows()
> handles it?
>


Yes I did do this.  


> (By the way, if your post had included some sample data that illustrates
> the problem, it would have been much easier to figure out a solution.
> Instead, we are left guessing at your XML format, and at the particular
> implementation of flatten_dict().)

Yes, unfortunately, due to NDA protocols I cannot share this. 
> 
> -- 
> John Gordon   A is for Amy, who fell down the stairs
> gor...@panix.com  B is for Basil, assaulted by bears
> -- Edward Gorey, "The Gashlycrumb Tinies"



Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Steven D'Aprano
On Wed, 24 Jun 2015 05:02 am, Randall Smith wrote:

> Chunks of data (about 2MB) are to be stored on machines using a
> peer-to-peer protocol.  The recipient of these chunks can't assume that
> the payload is benign.  While the data senders are supposed to encrypt
> data, that's not guaranteed, and I'd like to protect the recipient
> against exposure to nefarious data by mangling or encrypting the data
> before it is written to disk.

I don't understand how mangling the data is supposed to protect the
recipient. Don't they have the ability to unmangle the data, and thus expose
themselves to whatever nasties are in the files?

If not, you can save all that time and effort implementing the peer-to-peer
business and just dump 2MB chunks of random data on their disks.


> My original idea was for the recipient to encrypt using AES.  But I want
> to keep this software pure Python "batteries included" and not require
> installation of other platform-dependent software.  Pure Python AES and
> even DES are just way too slow.  I don't know that I really need
> encryption here, but some type of fast mangling algorithm where a bad
> actor sending a payload can't guess the output ahead of time.

Again, I don't understand your threat model here. Why does the bad actor
need to guess the mangling? Putting on my Black Hat and twirling my
moustache wickedly, I decide to send you a JPG of Goatse. (Don't google
it.) Or, a more serious threat, a zip bomb:

http://www.ghacks.net/2008/07/27/42-kilobytes-unzipped-make-45-petabytes/

or malware of some description. So I P2P you the file. How it gets encrypted
on your disk is irrelevant to me; eventually you're going to decrypt it
and try to access it.

We need to understand what threat you are defending against before we can
advise you.



-- 
Steven



Re: To write headers once with different values in separate row in CSV

2015-06-24 Thread kbtyo
On Tuesday, June 23, 2015 at 9:50:50 PM UTC-4, Steven D'Aprano wrote:
> On Wed, 24 Jun 2015 03:15 am, Sahlusar wrote:
> 
> > That is not the underlying issue. Any thoughts or suggestions would be
> > very helpful.
> 
> 
> Thank you for spending over 100 lines to tell us what is NOT the underlying
> issue. I will therefore tell you what is NOT the solution to your problem
> (whatever it is, since I can't tell). The solution is NOT to squeeze lemon
> juice into your keyboard.
> 
> If someday you feel like telling us what the issue actually IS, instead of
> what it IS NOT, then perhaps we will have a chance to help you find a
> solution.
> 
> 
> 
> -- 
> Steven

Curious - what should I have provided? Detailed and constructive feedback (like 
your reply to my post regarding importing functions) is more useful than to 
"squeeze lemon juice" into one's keyboard. 


Re: Organizing function calls once files have been moved to a directory

2015-06-24 Thread kbtyo
On Tuesday, June 23, 2015 at 10:18:43 PM UTC-4, Steven D'Aprano wrote:
> On Wed, 24 Jun 2015 06:16 am, kbtyo wrote:
> 
> > I am working on a workflow module that will allow one to recursively check
> > for file extensions and if there is a match move them to a folder for
> > processing (parsing, data wrangling etc).
> > 
> > I have a simple search process, and log for the files that are present
> > (see below). However, I am puzzled by what the most efficient
> > method/syntax is to call functions once the selected files have been
> > moved? 
> 
> The most efficient syntax is the regular syntax that you always use when
> calling a function:
> 
> function(arg, another_arg)
> 
> 
> What else would you use?
> 
> 
> > I have the functions and classes written in another file. Should I 
> > import them or should I include them in the same file as the following
> > mini-script?
> 
> That's entirely up to you. Some factors you might consider:
> 
> - Are these functions and classes reusable by other code? then you might
> want to keep them separate in another file, treated as a library, and
> import the library into your application.
> 
> - If you merge the two files together, will it be so big that it is
> difficult to work with? Then don't merge them together. My opinion is that
> the decimal module from the standard library is about as big as a single
> module should ever be, and it is almost 6,500 lines. So if your
> application is bigger than that, you might want to split it.
> 
> 
> 
> > Moreover, should I create another log file for processing? If so, what is
> > an idiomatically correct method to do so?
> 
> I don't know. Do you want a second log file? How will it be different from
> the first?
> 
> As for creating another log file, I guess the most correct way to do so
> would be the same way you created the first log file.
> 
> I'm not sure I actually understand your questions so far.
> 
> Some further comments on your code:
> 
> > if __name__ == '__main__':
> > 
> >     # The top argument for name in files
> >     topdir = '.'
> >     dest = 'C:\\Users\\wynsa2\\Desktop\\'
> 
> Rather than escaping backslashes, you can use regular forward slashes:
> 
> dest = 'C:/Users/wynsa2/Desktop/'
> 
> 
> Windows will accept either.
> 
> 
> >     extens = ['docs', 'docx', 'pdf'] # the extensions to search for
> >     found = {x: [] for x in extens} # lists of found files
> >
> >     # Directories to ignore
> >     ignore = ['docs', 'doc', 'py', 'pdf']
> >     logname = "file_search.log"
> >     print('Beginning search for files in %s' % os.path.realpath(topdir))
> >
> >     # Walk the tree
> >     for dirpath, dirnames, files in os.walk(topdir):
> >         # Remove directories in ignore
> >         # directory names must match exactly!
> >         for idir in ignore:
> >             if idir in dirnames:
> >                 dirnames.remove(idir)
> >
> >         # Loop through the file names for the current step
> >         for name in files:
> >             # Calling str.rsplit on name then
> >             # splits the string into a list (from the right)
> >             # with the first argument "." delimiting it,
> >             # and only making as many splits as the second argument (1).
> >             # The third part ([-1]) retrieves the last element of the list--we
> >             # use this instead of an index of 1 because if no splits are made
> >             # (if there is no "." in name), no IndexError will be raised
> >
> >             ext = name.lower().rsplit('.', 1)[-1]
> 
> The better way to split the extension from the file name is to use
> os.path.splitext(name):
> 
> 
> py> import os
> py> os.path.splitext("this/file.txt")
> ('this/file', '.txt')
> py> os.path.splitext("this/file")  # no extension
> ('this/file', '')
> py> os.path.splitext("this/file.tar.gz")
> ('this/file.tar', '.gz')
> 
> 
> -- 
> Steven



On Tuesday, June 23, 2015 at 10:18:43 PM UTC-4, Steven D'Aprano wrote:
> On Wed, 24 Jun 2015 06:16 am, kbtyo wrote:
> 
> > I am working on a workflow module that will allow one to recursively check
> > for file extensions and if there is a match move them to a folder for
> > processing (parsing, data wrangling etc).
> > 
> > I have a simple search process, and log for the files that are present
> > (see below). However, I am puzzled by what the most efficient
> > method/syntax is to call functions once the selected files have been
> > moved? 
> 
> The most efficient syntax is the regular syntax that you always use when
> calling a function:
> 
> function(arg, another_arg)
> 
> 
> What else would you use?
> 
> 
> > I have the functions and classes written in another file. Should I 
> > import them or should I include them in the same file as the following
> > mini-script?
> 
> That's entirely up to you. Some factors you might consider:
> 
> - Are these functions and classes reusable by other code? then you might
> want to keep them separate in another file, treated as a library, and
> import the library into your application.

I think I will do 

Re: To write headers once with different values in separate row in CSV

2015-06-24 Thread Steven D'Aprano
On Wed, 24 Jun 2015 09:37 pm, kbtyo wrote:

> On Tuesday, June 23, 2015 at 9:50:50 PM UTC-4, Steven D'Aprano wrote:
>> On Wed, 24 Jun 2015 03:15 am, Sahlusar wrote:
>> 
>> > That is not the underlying issue. Any thoughts or suggestions would be
>> > very helpful.
>> 
>> 
>> Thank you for spending over 100 lines to tell us what is NOT the
>> underlying issue. I will therefore tell you what is NOT the solution to
>> your problem (whatever it is, since I can't tell). The solution is NOT to
>> squeeze lemon juice into your keyboard.
>> 
>> If someday you feel like telling us what the issue actually IS, instead
>> of what it IS NOT, then perhaps we will have a chance to help you find a
>> solution.
>> 
>> 
>> 
>> --
>> Steven
> 
> Curious - what should I have provided? 

To start with, you should tell us what is the problem you are having. You
gave us some code, and then said "That is not the underlying issue". Okay,
so what is the underlying issue? What is the problem you want help solving?

In another post, you responded to John Gordon's question:

# John
Have you tried creating some dummy data by hand and seeing 
how makerows() handles it?


by answering:

Yes I did do this.


Okay. What was the result? Do you want us to guess what result you got?


John also suggested that you provide sample data, and an implementation of
flatten_dict, and your answer is:

Yes, unfortunately, due to NDA protocols I cannot share this.


You don't have to provide your *actual* data. You can provide *sample* data,
that does not contain any of your actual confidential values. If your XML
file looks like this:



   
  Gambardella, Matthew
  XML Developer's Guide
  Computer
  44.95
  2000-10-01
  An in-depth look at creating applications 
  with XML.
   



you can replace the data:



   
  Smith, John
  ABCDEF
  Widgets
  .99
  1900-01-01
  blah blah blah blah
   



You can even change the tags:




   
  Smith, John
  ABCDEF
  Widgets
  .99
  1900-01-01
  blah blah blah blah
   



If you're still worried that the sample XML has the same structure as your
real data, you can remove some fields and add new ones:



   
  ABCDEF
  .99
  1900-01-01
  fe fi fo fum
  blah blah blah blah
   



If you can't share the flatten_dict() function, either: 

(1) get permission to share it from your manager or project leader.
flatten_dict is not a trade secret or valuable in any way, and
any half-competent Python programmer can probably come up with two or three
different ways to flatten a dict in five minutes. They're all going to look
more or less the same, because there's only so many ways to flatten a dict.

(2) Or accept that we can't help you, and deal with it on your own.
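
For illustration only, here is one such way, as a sketch under assumptions:
it is NOT the original poster's flatten_dict(), which was never shown. It
assumes nested dicts should become dotted keys (like the
'Response.MemberO...' header quoted earlier in the thread), and the sample
data is invented.

```python
def flatten_dict(nested, parent_key="", sep="."):
    """Flatten nested dicts into one dict with dotted keys (hypothetical)."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dicts, carrying the dotted prefix along.
            flat.update(flatten_dict(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Invented sample data with no confidential values.
sample = {"Response": {"Member": {"Name": "Smith, John",
                                  "Address": {"City": None}}}}
flat = flatten_dict(sample)
```

Note that an empty nested dict simply disappears from the output in this
version; whether that is acceptable depends on the real data.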



> Detailed and constructive feedback 
> (like your reply to my post regarding importing functions) is more useful
> than to "squeeze lemon juice" into one's keyboard.

Of course. That is why I said it was NOT the solution. Don't waste your time
squeezing lemon juice over your keyboard, it won't solve your problem.

But you can't expect us to guess what your problem is, or debug code we
can't see, or read your mind and understand your data.

Before you ask any more questions, please read this:

http://sscce.org/



-- 
Steven



Re: To write headers once with different values in separate row in CSV

2015-06-24 Thread kbtyo
On Wednesday, June 24, 2015 at 8:38:24 AM UTC-4, Steven D'Aprano wrote:
> On Wed, 24 Jun 2015 09:37 pm, kbtyo wrote:
> 
> > On Tuesday, June 23, 2015 at 9:50:50 PM UTC-4, Steven D'Aprano wrote:
> >> On Wed, 24 Jun 2015 03:15 am, Sahlusar wrote:
> >> 
> >> > That is not the underlying issue. Any thoughts or suggestions would be
> >> > very helpful.
> >> 
> >> 
> >> Thank you for spending over 100 lines to tell us what is NOT the
> >> underlying issue. I will therefore tell you what is NOT the solution to
> >> your problem (whatever it is, since I can't tell). The solution is NOT to
> >> squeeze lemon juice into your keyboard.
> >> 
> >> If someday you feel like telling us what the issue actually IS, instead
> >> of what it IS NOT, then perhaps we will have a chance to help you find a
> >> solution.
> >> 
> >> 
> >> 
> >> --
> >> Steven
> > 
> > Curious - what should I have provided? 
> 
> To start with, you should tell us what is the problem you are having. You
> gave us some code, and then said "That is not the underlying issue". Okay,
> so what is the underlying issue? What is the problem you want help solving?
> 
> In another post, you responded to John Gordon's question:
> 
> # John
> Have you tried creating some dummy data by hand and seeing 
> how makerows() handles it?
> 
> 
> by answering:
> 
> Yes I did do this.
> 
> 
> Okay. What was the result? Do you want us to guess what result you got?
> 
> 
> John also suggested that you provide sample data, and an implementation of
> flatten_dict, and your answer is:
> 
> Yes, unfortunately, due to NDA protocols I cannot share this.
> 
> 
> You don't have to provide your *actual* data. You can provide *sample* data,
> that does not contain any of your actual confidential values. If your XML
> file looks like this:
> 
> 
> 
>
>   Gambardella, Matthew
>   XML Developer's Guide
>   Computer
>   44.95
>   2000-10-01
>   An in-depth look at creating applications 
>   with XML.
>
> 
> 
> 
> you can replace the data:
> 
> 
> 
>
>   Smith, John
>   ABCDEF
>   Widgets
>   .99
>   1900-01-01
>   blah blah blah blah
>
> 
> 
> 
> You can even change the tags:
> 
> 
> 
> 
>
>   Smith, John
>   ABCDEF
>   Widgets
>   .99
>   1900-01-01
>   blah blah blah blah
>
> 
> 
> 
> If you're still worried that the sample XML has the same structure as your
> real data, you can remove some fields and add new ones:
> 
> 
> 
>
>   ABCDEF
>   .99
>   1900-01-01
>   fe fi fo fum
>   blah blah blah blah
>
> 
> 
> 
> If you can't share the flatten_dict() function, either: 
> 
> (1) get permission to share it from your manager or project leader.
> flatten_dict is not a trade secret or valuable in any way, and
> any half-competent Python programmer can probably come up with two or three
> different ways to flatten a dict in five minutes. They're all going to look
> more or less the same, because there's only so many ways to flatten a dict.
> 
> (2) Or accept that we can't help you, and deal with it on your own.
> 
> 
> 
> > Detailed and constructive feedback 
> > (like your reply to my post regarding importing functions) is more useful
> > than to "squeeze lemon juice" into one's keyboard.
> 
> Of course. That is why I said it was NOT the solution. Don't waste your time
> squeezing lemon juice over your keyboard, it won't solve your problem.
> 
> But you can't expect us to guess what your problem is, or debug code we
> can't see, or read your mind and understand your data.
> 
> Before you ask any more questions, please read this:
> 
> http://sscce.org/
> 
> 
> 
> -- 
> Steven




Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Grant Edwards
On 2015-06-24, Steven D'Aprano  wrote:
> On Wed, 24 Jun 2015 05:02 am, Randall Smith wrote:
>
>> Chunks of data (about 2MB) are to be stored on machines using a
>> peer-to-peer protocol.  The recipient of these chunks can't assume that
>> the payload is benign.  While the data senders are supposed to encrypt
>> data, that's not guaranteed, and I'd like to protect the recipient
>> against exposure to nefarious data by mangling or encrypting the data
>> before it is written to disk.
>
> I don't understand how mangling the data is supposed to protect the
> recipient. Don't they have the ability to unmangle the data, and thus expose
> themselves to whatever nasties are in the files?

And how does writing unmangled data to disk expose anybody to
anything?  I've never heard of an exploit where writing an evilly
crafted bit-pattern to disk causes any sort of problem.

-- 
Grant Edwards               grant.b.edwards        Yow! My mind is making
                                  at               ashtrays in Dayton ...
                              gmail.com


EuroPython 2015: Beginner’s Day

2015-06-24 Thread M.-A. Lemburg
We’re pleased to announce a new venture at this year’s EuroPython...

*** The EuroPython Beginner’s Day ***

 https://ep2015.europython.eu/en/events/beginners-day/

If you’re thinking of coming to the conference but you’re new to
Python, this could be the session for you. Whether you’re totally new
to programming or you already know another language, this day is to
give you a crash-course in Python, and the ecosystem around it, to
give you the context you need to get the most out of EuroPython.

Bring your laptop, as a large part of the day will be devoted to
learning Python on your own PC. This session will take place on the
first day of the conference, the Monday. It will be presented in
English (although a few of the coaches do speak basic Spanish, French
and Italian).

Sessions will include:

 * A high-level introduction to Python and programming in general.
   Where did Python come from, what is programming all about, and what
   do I need to know to understand all these in-jokes about cheese
   shops?

 * A self-directed learning session, with specific tutorials for total
   beginners and more experienced programmers, accompanied by coaches
   who will be there to answer your questions and help you when you
   get stuck.  Learn at your own pace!

 * A session on the Python “ecosystem”: an introduction to some topics
   and bits of jargon that are bound to come up this week: open source,
   free software, github, packages, pip, pypi, scientific computing,
   scipy, numpy, pandas, ipython notebook, web frameworks, django,
   flask, asyncio, the BDFL, the Zen of Python, etc.  What are the
   tools, areas of interest, in-jokes, and people of note?

 * “How to get the best out of the conference” - recommended talks,
   what to do at lunchtimes or in the evenings, tips on when and how
   to ask questions (hint: as often as possible!), what an “open
   space” is, and more.

We really need to get an idea of numbers for this session, so if you
are interested in attending, please drop a quick email to Harry
Percival  from the program work group.

Also, be sure to get your tickets in time, since ticket sales have
picked up a lot since we announced the schedule.

PS: We’re also looking for volunteers to help with coaching students
during this session. If you enjoy teaching Python to beginners, and
you don’t mind sacrificing your EuroPython Monday to it, please do get
in touch with Harry Percival!

Enjoy,
--
EuroPython 2015 Team
http://ep2015.europython.eu/
http://www.europython-society.org/


Re: EuroPython 2015: Beginner’s Day

2015-06-24 Thread Chris Angelico
On Thu, Jun 25, 2015 at 12:06 AM, M.-A. Lemburg  wrote:
> * A high-level introduction to Python and programming in general.
>Where did Python come from, what is programming all about, and what
>do I need to know to understand all these in-jokes about cheese
>shops?
>
>  * A session on the Python “ecosystem”  An introduction to the Python
>ecosystem: some topics and bits of jargon that are bound to come up
>this week: open source, free software, github, packages, pip, pypi,
>scientific computing, scipy, numpy, pandas, ipython notebook, web
>frameworks, django, flask, asyncio, the BDFL, the Zen of Python,
>etc etc.  What are the tools, areas of interest, in-jokes, people
>of note.

Heh, I wish this had been available when I started digging into
Python. Took me ages to figure out some of the in-jokes

ChrisA


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Emile van Sebille

On 6/24/2015 7:02 AM, Grant Edwards wrote:

> And how does writing unmangled data to disk expose anybody to
> anything?  I've never heard of an exploit where writing an evilly
> crafted bit-pattern to disk causes any sort of problem.


Unless that code is executed at boot.  Mangling would at least prevent 
it from executing.


Emile






Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Chris Angelico
On Thu, Jun 25, 2015 at 1:52 AM, Emile van Sebille  wrote:
> On 6/24/2015 7:02 AM, Grant Edwards wrote:
>>
>> And how does writing unmangled data to disk expose anybody to
>> anything?  I've never heard of an exploit where writing an evilly
>> crafted bit-pattern to disk causes any sort of problem.
>
>
> Unless that code is executed at boot.  Mangling would at least prevent it
> from executing.

Or it's on Windows. It's pretty easy to trick Windows into running
some code somewhere. But you can often disrupt that by simply renaming
the file to have no extension.

ChrisA


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Emile van Sebille

On 6/24/2015 8:55 AM, Chris Angelico wrote:
> On Thu, Jun 25, 2015 at 1:52 AM, Emile van Sebille  wrote:
>> On 6/24/2015 7:02 AM, Grant Edwards wrote:
>>> And how does writing unmangled data to disk expose anybody to
>>> anything?  I've never heard of an exploit where writing an evilly
>>> crafted bit-pattern to disk causes any sort of problem.
>>
>> Unless that code is executed at boot.  Mangling would at least prevent it
>> from executing.
>
> Or it's on Windows. It's pretty easy to trick Windows into running
> some code somewhere. But you can often disrupt that by simply renaming
> the file to have no extension.


ISTR that Windows may look into the file to see if it can 'guess' the
appropriate application, so dropping the extension may not be
sufficient.  But maybe they've changed that, as my Windows experience
doesn't run much past XP.


Emile





Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Grant Edwards
On 2015-06-24, Emile van Sebille  wrote:
> On 6/24/2015 7:02 AM, Grant Edwards wrote:
>> And how does writing unmangled data to disk expose anybody to
>> anything?  I've never heard of an exploit where writing an evilly
>> crafted bit-pattern to disk causes any sort of problem.
>
> Unless that code is executed at boot.

Don't write it somewhere where that might happen.  [Of course you
don't let a remote user determine where the untrusted data gets
written -- that would be completely beyond the pale.] Or does Windows
pick files at random from the disk and execute them?

> Mangling would at least prevent it from executing.

If you don't want a file to be executed, then don't make it
executable.  Or doesn't Windows have any way to control whether a file
is executable or not?
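
On a POSIX system that might look something like the sketch below (Python 3
assumed; write_untrusted() and the file name are made up for illustration).
The file is created with mode 0o600, so no execute bit is ever set, and
O_EXCL refuses to overwrite an existing file.

```python
import os
import stat
import tempfile


def write_untrusted(path, payload):
    """Create path with owner read/write only and write payload to it."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    flags |= getattr(os, "O_BINARY", 0)   # only exists (and matters) on Windows
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(payload)


# The directory is chosen by the recipient, never by the remote sender.
target = os.path.join(tempfile.mkdtemp(), "chunk.bin")
write_untrusted(target, b"MZ\x90\x00 untrusted bytes")
mode = stat.S_IMODE(os.stat(target).st_mode)
```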

-- 
Grant Edwards               grant.b.edwards        Yow! You were s'posed
                                  at               to laugh!
                              gmail.com


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Chris Angelico
On Thu, Jun 25, 2015 at 2:16 AM, Grant Edwards  wrote:
> On 2015-06-24, Emile van Sebille  wrote:
>> Mangling would at least prevent it from executing.
>
> If you don't want a file to be executed, then don't make it
> executable.  Or doesn't Windows have any way to control whether a file
> is executable or not?

Windows doesn't have the Unix file system concept of execute
permission, no. If a file has the .exe extension and the first 512
bytes look like an appropriate header (MZ etc), Windows will happily
run it. With other extensions, similarly - just create a .bat file and
double-click it, it'll run the commands.
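The header check mentioned above can be illustrated in a few lines. This is only a sketch, assuming the two-byte "MZ" signature is the only thing tested; the helper name looks_like_dos_exe and the sample bytes are made up for this example, and the real Windows loader validates far more than this.

```python
def looks_like_dos_exe(data):
    """Return True if the bytes start with the 'MZ' signature.

    Hypothetical helper: it inspects only the first two bytes.
    """
    return data[:2] == b"MZ"


header = b"MZ\x90\x00" + b"\x00" * 60   # made-up header bytes for the demo
text = b"just some text"

header_matches = looks_like_dos_exe(header)
text_matches = looks_like_dos_exe(text)
```

Any byte-level mangling that changes those first two bytes would make a check like this one fail.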

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread fl
Hi,
I want to learn how to work with PDF files in Python. I downloaded and
installed pyPDF2, but it fails the unit test that ships with the package.

I put a screen shot link here to show the console message:

http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc

[IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]


This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.

I don't know whether it has conflicts or not.


Thanks, 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread Steven D'Aprano
On Thu, 25 Jun 2015 02:53 am, fl wrote:

> Hi,
> I want to learn some coding on PDF. After I download and install pyPDF2,
> it cannot pass unit test, which is coming from the package.
> 
> I put a screen shot link here to show the console message:

Please don't use screen shots:

(1) We cannot copy and paste text from a screen shot.

(2) Blind people and those with poor vision may not be able to see the
screen shot, and their screen readers do not work on images.

(3) People may be reading your post via email or news, and not be able to
access the website containing the screen shot.

Instead, copy and paste the text from the console and include it in the body
of your message. Do you know how to copy text from the Windows console?


> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
> 
> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]


Have you read the test failure? It is obvious why it failed. Your test
expects to get:

"TheCrazyOnesOctober14,1998Herestothecrazyones..."

but instead gets:

"TheCrazyOnes\nOctober14,1998\nHerestothecrazyones..."

The difference is obvious. Your expected output doesn't include newlines.




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Looking for people who are using Hypothesis and are willing to say so

2015-06-24 Thread David MacIver
Hi there,

Author of Hypothesis here. (If you don't know what Hypothesis is, you're
probably not the target audience for this email but you should totally
check it out: https://hypothesis.readthedocs.org/ Unless you like spending
ages writing tests and still shipping buggy code).

I keep finding out about new people using Hypothesis who I've never heard
of. e.g. turns out that depending on how you count there are between two
and four talks about Hypothesis happening at Europython this year, and many
of them are from people I don't know.

On the one hand, it's great that people are using and excited about it! No
complaints from me there. I was bowled over when I realised about the
EuroPython talks.

On the other hand, it's really quite useful to have more visibility of
usage - both for me to have it and also for other people to see - it's a
much easier sell that people should start using it if they can see that
lots of other people are too.

SO, the point. If you are one of those people using Hypothesis, I'd really
like it if you would say so publicly.

The #1 best way for you to do this for me is to add your name and usage to
https://github.com/DRMacIver/hypothesis/blob/master/docs/endorsements.rst
so it will show up on the endorsements page at
http://hypothesis.readthedocs.org/en/latest/endorsements.html

I'm also thrilled that people are giving talks about it, and would love more
talks and blog posts about it. Even tweeting enthusiastically about it is good
too.

Finally, if you are using Hypothesis but can't/don't want to speak about
doing so publicly because your company is doing super top secret stuff (or
any other reason), I'd really appreciate just a short email saying roughly
what sort of things you're using it for, maybe give me an idea of your
workflow. The other reason that I want to know who is using it is so I can
learn where it needs to improve and also help other people use it better
(I'm pretty sure some of the users of Hypothesis at this point have a
better idea how to deploy it than I do).

Regards and thanks,
David R. MacIver
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread fl
On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
> Hi,
> I want to learn some coding on PDF. After I download and install pyPDF2,
> it cannot pass unit test, which is coming from the package.
> 
> I put a screen shot link here to show the console message:
> 
> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
> 
> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
> 
> 
> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.
> 
> I don't know whether it has conflicts or not.
> 
> 
> Thanks,

Thanks, Steven. I don't know how to copy the contents of the command console
window into a forum post. I even tried redirecting the screen contents to a
text file, but that failed.
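One possible reason the redirection captured nothing: unittest's default text runner writes its report to stderr, which a plain ">" does not redirect. Below is a minimal sketch of sending the report to a file from Python instead. The test case Demo and the file name unittest_report.txt are placeholders invented for this example, not part of PyPDF2.

```python
import os
import tempfile
import unittest


class Demo(unittest.TestCase):
    """Stand-in test case; replace with the real test class."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)


# Write the runner's report to a file instead of the console.
report_path = os.path.join(tempfile.mkdtemp(), "unittest_report.txt")
suite = unittest.defaultTestLoader.loadTestsFromTestCase(Demo)

with open(report_path, "w") as stream:
    result = unittest.TextTestRunner(stream=stream, verbosity=2).run(suite)

# Read the report back so it can be pasted into a message.
with open(report_path) as stream:
    report = stream.read()
```

On the command line, redirecting stderr as well (for example by appending 2>&1 after the file redirection) should have a similar effect.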


Yes, there are extra '\n' characters in the extracted text, but I don't know
how to suppress them. Does anyone know how to make it match the expected text?

Thanks,
-- 
https://mail.python.org/mailman/listinfo/python-list


windows and file names > 256 bytes

2015-06-24 Thread Albert-Jan Roskam via Python-list
Hi,

Consider the following calls, where very_long_path is more than 256 bytes:
[1] os.mkdir(very_long_path)
[2] os.path.getsize(very_long_path)
[3] shutil.rmtree(very_long_path)

I am using Python 2.7. [1] and [2] fail under Windows XP; [3] fails
under Win7 (not sure about XP). This is even when I use the "special"
notations \\?\c:\dir\file or \\?\UNC\server\share\file, e.g.
os.path.getsize("\\\\?\\" + "c:\\dir\\file")
(Oddly, os.path.getsize(os.path.join("\\\\?\\", "c:\\dir\\file")) will
truncate the prefix)

My questions:
1. How can I get the file size of very long paths under XP?
2. Is this a bug in Python? I would prefer if Python dealt with the gory 
details of Windows' silly behavior.

Regards,
Albert-Jan
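A hedged sketch of building the extended-length form of a path. It assumes the path is already absolute, and the helper name extended_path is invented for this example. As far as I know, Python 2.7 only honours the \\?\ prefix when the path is a unicode string, because byte-string paths go through the ANSI file APIs; that is why the literals below are unicode.

```python
import ntpath

PREFIX = u"\\\\?\\"


def extended_path(path):
    """Return `path` with the Windows extended-length prefix added.

    Hypothetical helper; assumes `path` is absolute (drive letter or UNC).
    """
    if path.startswith(PREFIX):
        return path                    # already in extended form
    # Normalise first: prefixed paths are passed to the file system as-is.
    path = ntpath.normpath(path)
    if path.startswith(u"\\\\"):
        # \\server\share\file becomes \\?\UNC\server\share\file
        return PREFIX + u"UNC\\" + path[2:]
    return PREFIX + path


local = extended_path(u"c:/dir/file")
remote = extended_path(u"\\\\server\\share\\file")
```

On Windows the result could then be passed to os.path.getsize or os.mkdir in place of the plain path.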



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Randall Smith

On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
> I don't understand how mangling the data is supposed to protect the
> recipient. Don't they have the ability unmangle the data, and thus expose
> themselves to whatever nasties are in the files?

They never look at the data and wouldn't care to unmangle it.  The
purpose is primarily to prevent automated software (file indexers, virus
scanners) from doing bad things to the data.


-Randall


--
https://mail.python.org/mailman/listinfo/python-list


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Randall Smith

On 06/24/2015 02:44 AM, Devin Jeanpierre wrote:
> How about a random substitution cipher? This will be ultra-weak, but
> fast (using bytes.translate/bytes.maketrans) and seems to be the kind
> of thing you're asking for.
>
> -- Devin


I tried this out and it seems to be just what I need.  Thanks Devin!

It's pure Python, fast, and mangles the data sufficiently.
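For readers following along, a minimal sketch of what such a random substitution might look like in Python 3. The function name make_tables and the sample chunk are invented for this example; this is one way to use bytes.maketrans and bytes.translate, not necessarily the code that was actually tested, and as noted in the thread it is ultra-weak as a cipher.

```python
import random


def make_tables():
    """Build a random byte-substitution table and its inverse.

    Hypothetical helper. SystemRandom draws from the OS entropy source,
    so a sender cannot predict the permutation ahead of time.
    """
    shuffled = list(range(256))
    random.SystemRandom().shuffle(shuffled)
    identity = bytes(range(256))
    permuted = bytes(shuffled)
    encode = bytes.maketrans(identity, permuted)
    decode = bytes.maketrans(permuted, identity)
    return encode, decode


encode_table, decode_table = make_tables()

chunk = b"MZ example payload \x00\x01\x02"     # stand-in for a 2MB chunk
mangled = chunk.translate(encode_table)        # what would go to disk
restored = mangled.translate(decode_table)     # what is sent back on request
```

The decode table (or the permutation it was built from) has to be kept somewhere the storage process can read it, otherwise the chunk cannot be restored.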

--Randall
--
https://mail.python.org/mailman/listinfo/python-list


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Grant Edwards
On 2015-06-24, Chris Angelico  wrote:
> On Thu, Jun 25, 2015 at 2:16 AM, Grant Edwards  
> wrote:
>> On 2015-06-24, Emile van Sebille  wrote:
>>
>>> Mangling would at least prevent it from executing.
>>
>> If you don't want a file to be executed, then don't make it
>> executable.  Or doesn't Windows have any way to control whether a
>> file is executable or not?
>
> Windows doesn't have the Unix file system concept of execute
> permission, no. If a file has the .exe extension and the first 512
> bytes look like an appropriate header (MZ etc), Windows will happily
> run it. With other extensions, similarly - just create a .bat file
> and double-click it, it'll run the commands.

So you can prevent execution just by changing the filename?  Maybe 30
years using Unix has biased me, but that just seems so wrong...

-- 
Grant Edwards   grant.b.edwards at gmail.com
Yow! Here I am in the POSTERIOR OLFACTORY LOBULE but I don't see
CARL SAGAN anywhere!!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Grant Edwards
On 2015-06-24, Randall Smith  wrote:
> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>
>> I don't understand how mangling the data is supposed to protect the
>> recipient. Don't they have the ability unmangle the data, and thus
>> expose themselves to whatever nasties are in the files?
>
> They never look at the data and wouldn't care to unmangle it. 

I obviously don't "get it". If the recipient is never going to look at
the data or unmangle it, why not convert every received file to a
single null byte?  That way you save on disk space as well --
especially if you just create links for all files after the initial
one.  ;)

[I suppose next you're going to tell me that Windows filesystems
don't support links.]

> The purpose is primarily to prevent automated software (file
> indexers, virus scanners) from doing bad things to the data.

Life under windows must be more tiresome than I imagined (or could
imagine) if you have to jump through such hoops to keep "automated
software" from doing bad things to your data files.

-- 
Grant Edwards   grant.b.edwards at gmail.com
Yow! My mind is making ashtrays in Dayton ...
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Randall Smith

On 06/24/2015 07:19 AM, Dennis Lee Bieber wrote:
> Pardon, but that description has me confused. Perhaps I just don't
> understand the full use-case.
>
> Who exactly is supposed to be protected from what? You state "data
> senders are supposed to encrypt" which, if the recipient doesn't have the
> decryption key, implies the recipient -- isn't the real recipient but just
> a transport/storage place until the data is retrieved by the end-user.

You got it.  I didn't want to explain any more than necessary.  But yes,
the recipient just stores the data for the end-user.

> If "you" do the encryption on the storage machine, then you need to
> also do the decryption when returning the data to the end-user -- which
> means the key is available somewhere on the storage machine, and the local
> user might obtain access to it and the stored data.

Right again.  A legitimate data owner would encrypt the data.  The
storage machine is encrypting to protect itself against unwanted
exposure to unencrypted malware.  Not that they would go looking at the
files, but their virus scanner or file indexer might.

> Given the assumptions I'm making, my recommendation is likely to be
> something on the nature of: use an OS designed with security at the core of
> the file system; each sender has their own login UID, and the file system
> is configured to grant r/w access only to the login -- no execute
> permissions, no access by someone not logged in as that user, etc.

Yes.  This is done for "imaged" systems, but I don't have control over
the storage machines.


I'm leaning towards using a random substitution cipher suggested by 
Devin Jeanpierre.  If you see any weaknesses in that solution, I'd like 
to hear them.


Thanks for your response.


--Randall
--
https://mail.python.org/mailman/listinfo/python-list


Re: Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread MRAB

On 2015-06-24 18:52, fl wrote:
> On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
>> Hi,
>> I want to learn some coding on PDF. After I download and install pyPDF2,
>> it cannot pass unit test, which is coming from the package.
>>
>> I put a screen shot link here to show the console message:
>>
>> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
>>
>> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
>>
>>
>> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.
>>
>> I don't know whether it has conflicts or not.
>>
>>
>> Thanks,
>
> Thanks, Steven. I don't know how to copy command console window contents
> to the forum post. I even try redirection hoping to screen contents to a
> text file, but it fails.

You can make a rectangular selection by dragging the mouse pointer over
the console window.

> Yes, there are extra '\n' in the extracted, but I don't know how to suppress
> it. Does anyone know how to make it the same of the expected?
>
> Thanks,



--
https://mail.python.org/mailman/listinfo/python-list


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Randall Smith

On 06/24/2015 01:29 PM, Grant Edwards wrote:
> On 2015-06-24, Randall Smith  wrote:
>> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>>
>>> I don't understand how mangling the data is supposed to protect the
>>> recipient. Don't they have the ability unmangle the data, and thus
>>> expose themselves to whatever nasties are in the files?
>>
>> They never look at the data and wouldn't care to unmangle it.
>
> I obviously don't "get it". If the recipient is never going look at
> the data or unmangle it, why not convert every received file to a
> single null byte?  That way you save on disk space as well --
> especially if you just create links for all files after the initial
> one.  ;)
>
> [I supposed next you're going to tell me that Windows filesystems
> don't support links.]
>
>> The purpose is primarily to prevent automated software (file
>> indexers, virus scanners) from doing bad things to the data.
>
> Life under windows must be more tiresome than I imagined (or could
> imagine) if you have to jump through such hoops to keep "automated
> software" from doing bad things to your data files.



These are machines storing chunks of other people's data.  The data 
owner chunks a file, compresses and encrypts it, then sends it to 
several storage servers.  The storage server might be a Raspberry PI 
with a USB disk or a Windows XP machine - I can't know which.



I don't use Windows and don't recommend it for this software. 
Nevertheless, many people do use it.


-Randall
--
https://mail.python.org/mailman/listinfo/python-list


Re: Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread fl
On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
> Hi,
> I want to learn some coding on PDF. After I download and install pyPDF2,
> it cannot pass unit test, which is coming from the package.
> 
> I put a screen shot link here to show the console message:
> 
> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
> 
> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
> 
> 
> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.
> 
> I don't know whether it has conflicts or not.
> 
> 
> Thanks,

You can make a rectangular selection by dragging over the console 
 window the mouse pointer. 

Excuse me, I don't understand your suggestion. In the command window, nothing
is copied by a mouse click/drag (the screen doesn't even change).
Do you mean using the Snipping Tool? That would produce an image, which a
previous poster advised against.
Thanks,
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread John Gordon
In <16dfcef6-4740-45b9-b04f-0f5bc0899...@googlegroups.com> fl 
 writes:

> Excuse me. I don't understand your idea. On the command window, there is
> no content copied through a mouse click/drag (even no screen difference).

Right-click the command window title bar and select Edit -> Mark.
Then use the mouse to select a rectangular area of text.
Then right-click the title bar again and select Edit -> Copy.

-- 
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread Jon Ribbens
On 2015-06-24, fl  wrote:
> You can make a rectangular selection by dragging over the console 
>  window the mouse pointer. 
>
> Excuse me. I don't understand your idea. On the command window, there is
> no content copied through a mouse click/drag (even no screen difference).
> Do you mean using Snipping Tool? That will be an image, which is not
> advised as a previous poster.

Click on the icon at the top-left of the command window.
Choose "Edit->Mark". Drag to select the text you want.
Click the menu again, and choose "Edit->Copy".
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread fl
On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
> Hi,
> I want to learn some coding on PDF. After I download and install pyPDF2,
> it cannot pass unit test, which is coming from the package.
> 
> I put a screen shot link here to show the console message:
> 
> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
> 
> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
> 
> 
> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.
> 
> I don't know whether it has conflicts or not.
> 
> 
> Thanks,

Thanks for the trick! Now I know how new I am to Windows.

Below are the installation messages and the unittest output.

Suspecting that Linux and Windows differ in how they handle '\n', I also
installed pyPDF2 on Ubuntu. It has the same error. What is going on with
pyPDF2? I don't understand what its test script is for. This exercise is
also part of my learning Python. Does anyone have the same or a different
experience with pyPDF2?

Thanks again.



/






ImportError: No module named Tests

C:\Python27\Tools\PyPDF2-master\Tests>cd ..

C:\Python27\Tools\PyPDF2-master>C:\python27\python.exe setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\PyPDF2
copying PyPDF2\filters.py -> build\lib\PyPDF2
copying PyPDF2\generic.py -> build\lib\PyPDF2
copying PyPDF2\merger.py -> build\lib\PyPDF2
copying PyPDF2\pagerange.py -> build\lib\PyPDF2
copying PyPDF2\pdf.py -> build\lib\PyPDF2
copying PyPDF2\utils.py -> build\lib\PyPDF2
copying PyPDF2\xmp.py -> build\lib\PyPDF2
copying PyPDF2\_version.py -> build\lib\PyPDF2
copying PyPDF2\__init__.py -> build\lib\PyPDF2
running install_lib
creating C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\filters.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\generic.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\merger.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\pagerange.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\pdf.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\utils.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\xmp.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\_version.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\__init__.py -> C:\python27\Lib\site-packages\PyPDF2
byte-compiling C:\python27\Lib\site-packages\PyPDF2\filters.py to filters.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\generic.py to generic.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\merger.py to merger.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\pagerange.py to pagerange.py
c
byte-compiling C:\python27\Lib\site-packages\PyPDF2\pdf.py to pdf.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\utils.py to utils.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\xmp.py to xmp.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\_version.py to _version.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\__init__.py to __init__.pyc
running install_egg_info
Writing C:\python27\Lib\site-packages\PyPDF2-1.24-py2.7.egg-info











C:\Python27\Tools\PyPDF2-master>C:\python27\python.exe -m unittest Tests.tests >
> logt
F
==
FAIL: test_PdfReaderFileLoad (Tests.tests.PdfReaderTestCases)
Test loading and parsing of a file. Extract text of the file and compare to expe
cted
--
Traceback (most recent call last):
  File "Tests\tests.py", line 35, in test_PdfReaderFileLoad
% (pdftext, ipdf_p1_text.encode('utf-8', errors='ignore')))
AssertionError: PDF extracted text differs from expected value.

Expected:

'TheCrazyOnesOctober14,1998Herestothecrazyones.Themis\xcb\x9dts.Therebels.Thetro
ublemakers.Theroundpegsinthesquareholes.Theoneswhoseethingsdi\xcb\x99erently.The
yrenotfondofrules.Andtheyhavenorespectforthestatusquo.Youcanquotethem,disagreewi
ththem,glorifyorvilifythem.Abouttheonlythingyoucantdoisignorethem.Becausetheycha
ngethings.Theyinvent.Theyimagine.Theyheal.Theyexplore.Theycreate.Theyinspire.The
ypushthehumanraceforward.Maybetheyhavetobecrazy.Howelsecanyoustareatanemptycanva
sandseeaworkofart?Orsitinsilenceandhearasongthatsneverbeenwritten?Orgazeataredpl
anetandseealaboratoryonwheels?Wemaketoolsforthesekindsofpeople.Whilesomeseethema
sthecrazyones,weseegenius.Becausethepeoplewhoarecrazyenoughtothinktheycanchanget
heworld,aretheoneswhodo.'

Extracted:

'TheCrazyOnes\nOctober14,1998\nHerestothecrazyones.Themis\xcb\x9dts.Therebels.Th
etroublemakers.\nTheroundpegsinthesquareholes.\nTheoneswhoseethingsdi\xcb\x99ere
ntly.Theyrenotfondofrules.And\ntheyhavenorespectforthestatusquo.Youcanquotethem,
\ndisagreewiththem,glorifyorvilifythem.\nAbouttheonlythingyoucantdoisignorethem.
Becausetheychange\nthings.Theyinvent.Theyimagine.Theyheal.Theyexplore.They\

Re: Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread Mark Lawrence

On 24/06/2015 19:48, MRAB wrote:
> On 2015-06-24 18:52, fl wrote:
>> On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
>>> Hi,
>>> I want to learn some coding on PDF. After I download and install pyPDF2,
>>> it cannot pass unit test, which is coming from the package.
>>>
>>> I put a screen shot link here to show the console message:
>>>
>>> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
>>>
>>> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
>>>
>>>
>>> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?)
>>> installed.
>>>
>>> I don't know whether it has conflicts or not.
>>>
>>>
>>> Thanks,
>>
>> Thanks, Steven. I don't know how to copy command console window contents
>> to the forum post. I even try redirection hoping to screen contents to a
>> text file, but it fails.
>
> You can make a rectangular selection by dragging over the console
> window the mouse pointer.



An alternative is to install ConEmu and set up a startup.txt file.  My 
heavily encrypted :) version follows


cmd /F:ON /T:02 /K cd C:\Users\Mark\Documents\MyPython "-new_console:t:MyPython"
cmd /F:ON /T:02 /K cd c:\Users\Mark\pythonissues "-new_console:t:Python Issues"

cmd /F:ON /T:02 /K cd c:\cPython\PCBuild "-new_console:t:cPython"
cmd /F:ON /T:02 /K cd C:\Users\Mark\Documents\Cash\Python "-new_console:t:Cash Python"

C:\Python34\Scripts\ipython.exe --matplotlib "-new_console:t:iPython"

Once you've overcome the encryption life is far, far easier on Windows. 
 You can do really advanced concepts like cut and paste relatively 
easily, but please don't take my word for it, try it yourself.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Grant Edwards
On 2015-06-24, Randall Smith  wrote:
> On 06/24/2015 01:29 PM, Grant Edwards wrote:
>> On 2015-06-24, Randall Smith  wrote:
>>> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>>>
>>>> I don't understand how mangling the data is supposed to protect the
>>>> recipient. Don't they have the ability unmangle the data, and thus
>>>> expose themselves to whatever nasties are in the files?
>>>
>>> They never look at the data and wouldn't care to unmangle it.
>>
>> I obviously don't "get it". If the recipient is never going look at
>> the data or unmangle it, why not convert every received file to a
>> single null byte?  That way you save on disk space as well --
>> especially if you just create links for all files after the initial
>> one.  ;)
>
> These are machines storing chunks of other people's data.  The data 
> owner chunks a file, compresses and encrypts it, then sends it to 
> several storage servers.  The storage server might be a Raspberry PI 
> with a USB disk or a Windows XP machine - I can't know which.

OK.  But if the recipient (the server) mangles the data and then never
unmangles or reads the data, there doesn't seem to be any point in
storing it.  I must be misunderstanding your statement that the data
is never read/unmangled.

-- 
Grant Edwards   grant.b.edwards at gmail.com
Yow! A can of ASPARAGUS, 73 pigeons, some LIVE ammo, and a FROZEN DAQUIRI!!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Looking for people who are using Hypothesis and are willing to say so

2015-06-24 Thread Paul Rubin
David MacIver  writes:
> Author of Hypothesis here. (If you don't know what Hypothesis is, you're
> probably not the target audience for this email but you should totally
> check it out: https://hypothesis.readthedocs.org/

Oh very cool: a QuickCheck-like unit test library.  I heard of something
like that for Python recently, that might or might not have been
Hypothesis.  I certainly plan to try it out.  The original QuickCheck
(for Haskell) used the static type signatures on the functions under
test to know what test cases to generate, but Erlang QuickCheck has had
some good successes, including finding some subtle bugs during
development in the HAMT (Clojure-like hash array mapped trie)
implementation just released with Erlang/OTP 18.0 this week.

I see Hypothesis uses decorators that look sort of like Erlang Dialyzer,
so that can help with test cases.  Maybe later it can use Python 3 type
annotations, though I think those are still much less precise than
Dialyzer or Haskell types.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Billy Earney
Freenet seems to come to mind.. :)

On Wed, Jun 24, 2015 at 4:24 PM, Grant Edwards 
wrote:

> On 2015-06-24, Randall Smith  wrote:
> > On 06/24/2015 01:29 PM, Grant Edwards wrote:
> >> On 2015-06-24, Randall Smith  wrote:
> >>> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
> >>>
> >>>> I don't understand how mangling the data is supposed to protect the
> >>>> recipient. Don't they have the ability unmangle the data, and thus
> >>>> expose themselves to whatever nasties are in the files?
> >>>
> >>> They never look at the data and wouldn't care to unmangle it.
> >>
> >> I obviously don't "get it". If the recipient is never going look at
> >> the data or unmangle it, why not convert every received file to a
> >> single null byte?  That way you save on disk space as well --
> >> especially if you just create links for all files after the initial
> >> one.  ;)
> >
> > These are machines storing chunks of other people's data.  The data
> > owner chunks a file, compresses and encrypts it, then sends it to
> > several storage servers.  The storage server might be a Raspberry PI
> > with a USB disk or a Windows XP machine - I can't know which.
>
> OK.  But if the recipient (the server) mangles the data and then never
> unmangles or reads the data, there doesn't seem to be any point in
> storing it.  I must be misunderstanding your statement that the data
> is never read/unmangled.
>
> --
> Grant Edwards   grant.b.edwards at gmail.com
> Yow! A can of ASPARAGUS, 73 pigeons, some LIVE ammo, and a FROZEN DAQUIRI!!
> --
> https://mail.python.org/mailman/listinfo/python-list
>
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Randall Smith

On 06/24/2015 04:24 PM, Grant Edwards wrote:
> OK.  But if the recipient (the server) mangles the data and then never
> unmangles or reads the data, there doesn't seem to be any point in
> storing it.  I must be misunderstanding your statement that the data
> is never read/unmangled.

When the storage server sends the data (on request), it decodes the data
before sending.  I'm currently testing this on a Raspberry Pi using a
random substitution with bytearray.maketrans and bytearray.translate,
and it is working quite well.


Thanks.

-Randall

--
https://mail.python.org/mailman/listinfo/python-list


Could you explain this rebinding (or some other action) on "nums = nums"?

2015-06-24 Thread fl
Hi,

I read a blog post written by Ned and found it very interesting, but some
parts are still unclear to me. In the following example, I am almost lost
at the last line:

nums = nums


Could anyone explain it in more detail to me?

Thanks,





...
The reason is that list implements __iadd__ like this (except in C, not Python):

    class List:
        def __iadd__(self, other):
            self.extend(other)
            return self

When you execute "nums += more", you're getting the same effect as:

    nums = nums.__iadd__(more)

which, because of the implementation of __iadd__, acts like this:

    nums.extend(more)
    nums = nums

So there is a rebinding operation here, but first, there's a mutating
operation, and the rebinding operation is a no-op.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Could you explain this rebinding (or some other action) on "nums = nums"?

2015-06-24 Thread Chris Angelico
On Thu, Jun 25, 2015 at 9:52 AM, fl  wrote:
> The reason is that list implements __iadd__ like this (except in C, not
> Python):
>
>     class List:
>         def __iadd__(self, other):
>             self.extend(other)
>             return self
>
> When you execute "nums += more", you're getting the same effect as:
>
>     nums = nums.__iadd__(more)
>
> which, because of the implementation of __iadd__, acts like this:
>
>     nums.extend(more)
>     nums = nums
>
> So there is a rebinding operation here, but first, there's a mutating
> operation, and the rebinding operation is a no-op.

It's not a complete no-op, as can be demonstrated if you use something
other than a simple name:

>>> tup = ("spam", [1, 2, 3], "ham")
>>> tup[1]
[1, 2, 3]
>>> tup[1].extend([4,5])
>>> tup[1] = tup[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> tup
('spam', [1, 2, 3, 4, 5], 'ham')
>>> tup[1] += [6,7]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> tup
('spam', [1, 2, 3, 4, 5, 6, 7], 'ham')

The reason for the rebinding is that += can do two completely
different things: with mutable objects, like lists, it changes them in
place, but with immutables, it returns a new one:

>>> msg = "Hello"
>>> msg += ", world!"
>>> msg
'Hello, world!'

This didn't change the string "Hello", because you can't do that.
Instead, it rebound msg to "Hello, world!". For consistency, the +=
operator will *always* rebind, but in situations where that's not
necessary, it rebinds to the exact same object.
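The two behaviours can be checked with "is", which compares object identity. A small sketch; the names alias and original are only there so the object bound before the += can be compared with the one bound afterwards.

```python
# Mutable case: += extends the list in place and rebinds to the same object.
nums = [1, 2, 3]
alias = nums                  # second name for the same list object
nums += [4, 5]
list_is_same_object = nums is alias

# Immutable case: str cannot change in place, so += binds a new object.
msg = "Hello"
original = msg
msg += ", world!"
str_is_new_object = msg is not original
```

After this runs, alias also shows the two extra items, while original still holds the unchanged "Hello".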

Does that answer the question?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why does the unit test fail of the pyPDF2 package?

2015-06-24 Thread Steven D'Aprano
On Thu, 25 Jun 2015 03:52 am, fl wrote:

> Thanks, Steven. I don't know how to copy command console window contents
> to the forum post.

I don't know either, because I don't use Windows, but you can google for
instructions:

https://duckduckgo.com/html/?q=copy+text+windows+console

https://startpage.com/do/search?q=copy+text+windows+console

http://www.bing.com/search?q=copy+text+windows+console

http://au.search.yahoo.com/search?p=copy+text+windows+console

Even Google works:

https://www.google.com.au/search?q=copy+text+windows+console


> Yes, there are extra '\n' in the extracted, but I don't know how to
> suppress it. Does anyone know how to make it the same of the expected?

Where you enter the expected output, instead of entering:

"TheCrazyOnesOctober14,1998Herestothecrazyones..."

instead enter:

"TheCrazyOnes\nOctober14,1998\nHerestothecrazyones..."


The point is that your expected output should contain the text actually in
the PDF file. If your expected output is different from the actual
contents, then the expectations are wrong, not your code. The test itself
is buggy.
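
As a small sketch of the idea, with a hard-coded string standing in for whatever your extraction call actually returns (the variable names are hypothetical):

```python
# Stand-in for the text extracted from the PDF; in the real test this
# would come from the extraction call, not from a literal.
extracted = "TheCrazyOnes\nOctober14,1998\nHerestothecrazyones..."

# Expected value without the newlines: does not match what is in the file.
expected_without_newlines = "TheCrazyOnesOctober14,1998Herestothecrazyones..."

# Expected value with the newlines that are really in the file.
expected_with_newlines = "TheCrazyOnes\nOctober14,1998\nHerestothecrazyones..."

old_test_passes = extracted == expected_without_newlines
fixed_test_passes = extracted == expected_with_newlines
```

Only the second comparison succeeds, which is the point: fix the expectation, not the extraction.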



-- 
Steven



Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Steven D'Aprano
On Thu, 25 Jun 2015 04:36 am, Randall Smith wrote:

> On 06/24/2015 07:19 AM, Dennis Lee Bieber wrote:
> 
> 
>> Pardon, but that description has me confused. Perhaps I just don't
>> understand the full use-case.
>>
>> Who exactly is supposed to be protected from what? You state "data
>> senders are supposed to encrypt" which, if the recipient doesn't have the
>> decryption key, implies the recipient -- isn't the real recipient but
>> just a transport/storage place until the data is retrieved by the
>> end-user.
> 
> You got it.  I didn't want to explain any more than necessary.  But yes,
> the recipient just stores the data for the end-user.

Trust me. That's not all they are doing.


>> If "you" do the encryption on the storage machine, then you need to
>> also do the decryption when returning the data to the end-user -- which
>> means the key is available somewhere on the storage machine, and the
>> local user might obtain access to it and the stored data.
> 
> Right again.  A legitimate data owner would encrypt the data.  The
> storage machine is encrypting to protect itself against unwanted
> exposure to unencrypted malware.  Not that they would go looking at the
> files, but their virus scanner or file indexer might.

Okay, you're worrying me now. If this is legitimate business, then you
shouldn't be worried about the virus scanner or file indexer *scanning* the
content of the file.

But giving you the benefit of the doubt, that there's nothing underhanded
happening, I don't think you have a good model for the potential threats in
your software. I think there are at least three different threats:

Sender of the data versus the storage machine:

- the sender of the data may deliberately send malware, intending to attack
the people storing the file;

Storage machine versus the end recipient:

- the storage machine may be infected by malware which corrupts the file;

- the owner of the storage machine may deliberately corrupt the data (this
is a special case of the previous);

- the owner of the storage machine may want to spy on the files, that is,
read the contents without changing the files (attack on privacy).


There may be other threats as well, e.g. man-in-the-middle attacks. If this
is anything like Bittorrent, you have a whole range of threats.

But just sticking to the three above, the first one is partially mitigated
by allowing virus scanners to scan the data, but that implies that the
owner of the storage machine can spy on the files. So you have a conflict
here.

Honestly, the *only* real defence against the spying issue is to encrypt the
files. Not obfuscate them with a lousy random substitution cipher. The
storage machine can keep the files as long as they like, just by making a
copy, and spend hours bruteforcing them. They *will* crack the substitution
cipher. In pure Python, that may take a few days or weeks; in C, hours or
days. If they have the resources to throw at it, minutes. Substitution
ciphers have not been effective encryption since, oh, the 1950s, unless you
use a one-time pad. Which you won't be.

That's assuming they don't just look at the Python source code, grab the
cipher key, and decrypt in seconds.
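
To see why a substitution cipher gives so little protection, here is a rough sketch. It assumes the "mangling" is a random byte-for-byte substitution applied with bytes.translate, which is my guess at the simplest form of the scheme, not necessarily what the original poster has in mind. The sample text and names are made up:

```python
import random
from collections import Counter

# Assumed scheme: a random permutation of the 256 byte values.
rng = random.Random(0)            # fixed seed so the sketch is repeatable
permutation = list(range(256))
rng.shuffle(permutation)
table = bytes(permutation)

plaintext = b"the quick brown fox jumps over the lazy dog " * 50
ciphertext = plaintext.translate(table)

# The byte values change, but how often each one occurs does not.
plain_profile = sorted(Counter(plaintext).values(), reverse=True)
cipher_profile = sorted(Counter(ciphertext).values(), reverse=True)
profiles_match = plain_profile == cipher_profile

# So the most common ciphertext byte is simply the image of the most
# common plaintext byte (a space, in this sample text).
most_common_cipher_byte = Counter(ciphertext).most_common(1)[0][0]
space_recovered = most_common_cipher_byte == table[ord(" ")]
```

Both profiles_match and space_recovered come out True. An attacker who knows roughly what kind of data to expect can match up the frequency tables and peel the substitution off one byte value at a time.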

If you're serious about protecting your users' privacy and their data
integrity, you need to use modern strong encryption, and you need to solve
the issue of how to get the key from the trusted source to the untrusted
storage machine. I have no idea how to do that -- you need to talk to
actual security experts, not random Python programmers.

A pure Python solution for the encryption is likely to be too slow for more
than toy files. Bite the bullet and use a library written in C. Python uses
C code for all sorts of modules: math, decimal, bisect, pickle, io, etc.
all delegate to C code when available. There's no shame in it.

Not to put too fine a point on it, using a substitution cipher because it's
easy and fast in pure Python code is like making a boat out of styrofoam
because it's light and floats and using aluminium or fibreglass is too
expensive. Sure that will work for toy applications, like paddling around
the swimming pool in your back yard, but nobody in their right mind would
trust it on the deep ocean or a white-water river.



-- 
Steven



Re: Pure Python Data Mangling or Encrypting

2015-06-24 Thread Devin Jeanpierre
On Wed, Jun 24, 2015 at 9:07 PM, Steven D'Aprano wrote:
> But just sticking to the three above, the first one is partially mitigated
> by allowing virus scanners to scan the data, but that implies that the
> owner of the storage machine can spy on the files. So you have a conflict
> here.

If it's encrypted malware, and you can't decrypt it, there's no threat.

> Honestly, the *only* real defence against the spying issue is to encrypt the
> files. Not obfuscate them with a lousy random substitution cipher. The
> storage machine can keep the files as long as they like, just by making a
> copy, and spend hours bruteforcing them. They *will* crack the substitution
> cipher. In pure Python, that may take a few days or weeks; in C, hours or
> days. If they have the resources to throw at it, minutes. Substitution
> ciphers have not been effective encryption since, oh, the 1950s, unless you
> use a one-time pad. Which you won't be.

The original post said that the sender will usually send files they
encrypted, unless they are malicious. So if the sender wants them to
be encrypted, they already are.

"While the data senders are supposed to encrypt data, that's not
guaranteed, and I'd like to protect the recipient against exposure to
nefarious data by mangling or encrypting the data before it is written
to disk."

The cipher is just to keep the sender from being able to control what
is on disk.
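
For that narrow goal, something along these lines might be enough. This is an unvetted sketch, not a recommendation: the keystream construction (SHA-256 over a secret key plus a counter) and the function names are my own assumptions, chosen only because they need nothing outside the standard library. The key is generated by the storage machine and never shown to the sender, so the sender cannot predict the bytes that reach the disk.

```python
import hashlib
import os

def keystream(key: bytes, length: int) -> bytes:
    """Build `length` pseudo-random bytes from SHA-256(key + counter)."""
    blocks = []
    for counter in range((length + 31) // 32):      # 32 bytes per digest
        blocks.append(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
    return b"".join(blocks)[:length]

def mangle(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the data."""
    stream = keystream(key, len(data))
    # XOR via big integers, which is much faster than a per-byte Python loop.
    mixed = int.from_bytes(data, "big") ^ int.from_bytes(stream, "big")
    return mixed.to_bytes(len(data), "big")

# Hypothetical usage on the storage machine.
local_key = os.urandom(32)               # secret, never sent to the sender
chunk = b"payload chosen by the sender" * 100
on_disk = mangle(chunk, local_key)       # what actually gets written
restored = mangle(on_disk, local_key)    # what is returned to the end user
```

Note that a real deployment would want a fresh key (or a nonce mixed into the hash) per chunk, since reusing one keystream across chunks lets anyone holding two stored chunks XOR them together.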

I am usually very oppositional when it comes to rolling your own
crypto, but am I alone here in thinking the OP very clearly laid out
their case?

-- Devin