Re: Looking up a dictionary _key_ by key?
Ian Kelly : > I don't think that it's fundamentally broken. A simple example would > be the int 3, vs. the float 3, vs. the Decimal 3. All of them compare > equal to one another, but they are distinct values, and sometimes it > might be useful to be able to determine which one is actually a key in > the dict. One possibility is to enter the key on the value side as well: d[key] = (key, value) ... canonical_key, value = d[key] Marko -- https://mail.python.org/mailman/listinfo/python-list
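The suggestion above (store the key on the value side as well) can be sketched roughly as follows. This is only an illustration: the helper names store() and lookup() are made up for the example, and the choice to remember the first key object ever used is an assumption about what "canonical" should mean.

```python
from decimal import Decimal


def store(d, key, value):
    """Store value under key, remembering the first key object used.

    Hypothetical helper: if an equal key (e.g. 3 vs 3.0) is already
    present, its original key object is kept as the canonical one.
    """
    canonical_key = d[key][0] if key in d else key
    d[key] = (canonical_key, value)


def lookup(d, key):
    """Return (canonical_key, value) for any key that compares equal."""
    return d[key]


d = {}
store(d, Decimal(3), "three")

# 3.0 compares (and hashes) equal to Decimal(3), so the lookup succeeds
# and hands back the key object that was stored originally.
canonical_key, value = lookup(d, 3.0)
print(type(canonical_key).__name__, value)
```

The lookup stays a plain dictionary access, so no O(n) search is needed; the cost is one extra reference per entry.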
rhythmbox plugin problem
Hello, I am trying to get some old plugins I wrote to work on a newer version of rhythmbox. When I try to load the plugin I see: (rhythmbox:3092): libpeas-WARNING **: nowplaying-lcd: /usr/lib/rhythmbox/plugins/nowplaying-lcd/libnowplaying-lcd.so: cannot open shared object file: No such file or directory (rhythmbox:3092): libpeas-WARNING **: Could not load plugin module: 'nowplaying-lcd' Any ideas about what is going on here? Thanks. -- https://mail.python.org/mailman/listinfo/python-list
Pure Python Data Mangling or Encrypting
Chunks of data (about 2MB) are to be stored on machines using a peer-to-peer protocol. The recipient of these chunks can't assume that the payload is benign. While the data senders are supposed to encrypt data, that's not guaranteed, and I'd like to protect the recipient against exposure to nefarious data by mangling or encrypting the data before it is written to disk. My original idea was for the recipient to encrypt using AES. But I want to keep this software pure Python "batteries included" and not require installation of other platform-dependent software. Pure Python AES and even DES are just way too slow. I don't know that I really need encryption here, but some type of fast mangling algorithm where a bad actor sending a payload can't guess the output ahead of time. Any ideas are appreciated. Thanks. -Randall -- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
How about a random substitution cipher? This will be ultra-weak, but fast (using bytes.translate/bytes.maketrans) and seems to be the kind of thing you're asking for. -- Devin On Tue, Jun 23, 2015 at 12:02 PM, Randall Smith wrote: > Chunks of data (about 2MB) are to be stored on machines using a peer-to-peer > protocol. The recipient of these chunks can't assume that the payload is > benign. While the data senders are supposed to encrypt data, that's not > guaranteed, and I'd like to protect the recipient against exposure to > nefarious data by mangling or encrypting the data before it is written to > disk. > > My original idea was for the recipient to encrypt using AES. But I want to > keep this software pure Python "batteries included" and not require > installation of other platform-dependent software. Pure Python AES and even > DES are just way too slow. I don't know that I really need encryption here, > but some type of fast mangling algorithm where a bad actor sending a payload > can't guess the output ahead of time. > > Any ideas are appreciated. Thanks. > > -Randall > > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
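A minimal sketch of the random substitution idea, assuming Python 3 (where bytes.maketrans and bytes.translate are available). The function name make_tables() is hypothetical, and as the post itself says this is ultra-weak mangling, not encryption: it only makes the on-disk bytes unpredictable to a sender who does not know the table.

```python
import random


def make_tables():
    """Build a random byte-substitution table and its inverse.

    Hypothetical helper for illustration only. SystemRandom draws from
    the OS entropy source, so a remote sender cannot predict the table.
    """
    identity = bytes(range(256))
    permuted = list(range(256))
    random.SystemRandom().shuffle(permuted)
    permuted = bytes(permuted)
    encode_table = bytes.maketrans(identity, permuted)
    decode_table = bytes.maketrans(permuted, identity)
    return encode_table, decode_table


encode_table, decode_table = make_tables()

chunk = b"MZ\x90\x00 untrusted payload from a peer"
mangled = chunk.translate(encode_table)      # what would be written to disk
restored = mangled.translate(decode_table)   # what would be sent back later

assert restored == chunk
```

Since translate() runs in C, a 2MB chunk should be handled quickly; the recipient has to keep the table (or the means to regenerate it) in order to return the original bytes.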
Re: Looking up a dictionary _key_ by key?
On 24/06/2015 01:47, Dan Stromberg wrote: Would I have to do an O(n) search to find my key? can you use something from here https://pypi.python.org/pypi/sortedcontainers/0.9.6 with the bisect module? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Next CodinGame online programming contest on June 27 - Python available
Hi everyone! On June 27th, "Code of the Rings", an online coding battle will launch. It's free & open to all. You will have 24 hours to code and optimize your solution to a puzzle. What will be exciting and fun is that it will be VERY EASY to start and to get something that works, but complex to produce the most efficient code... Over the 24 hours, you will be able to submit your code as much as you like, whenever you like. No restrictions, no obligation :) Registration is open: http://www.codingame.com/challenge/code-of-the-rings - Duration: 24 hours ; 1 game to solve - Participation is 100% online and free - Over 23 coding languages to choose from including Python - Prizes to win, and 25+ t-shirts - Apply to sponsoring companies offering jobs and internships International Leaderboard + Leaderboard by University Please feel free to share the information; your feedback is more than welcome! :) Hope to see you there and keep coding! -- https://mail.python.org/mailman/listinfo/python-list
Re: To write headers once with different values in separate row in CSV
On Tuesday, June 23, 2015 at 3:12:40 PM UTC-4, John Gordon wrote: > In Sahlusar > writes: > > > However, when I extrapolate this same logic with a list like: > > > ('Response.MemberO.PMembers.PMembers.Member.CurrentEmployer.EmployerAddress > > .TimeAtPreviousAddress.', None), where the headers/columns are the first > > item (only to be written out once) with different values. I receive an > > output CSV with repeating headers and values all printed in one long string > > First, I would try to determine if the problem is in the makerows() > function, or if the problem is elsewhere. > > Have you tried creating some dummy data by hand and seeing how makerows() > handles it? > Yes I did do this. > (By the way, if your post had included some sample data that illustrates > the problem, it would have been much easier to figure out a solution. 
> Instead, we are left guessing at your XML format, and at the particular > implementation of flatten_dict().) Yes, unfortunately, due to NDA protocols I cannot share this. > > -- > John Gordon A is for Amy, who fell down the stairs > gor...@panix.com B is for Basil, assaulted by bears > -- Edward Gorey, "The Gashlycrumb Tinies" -- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On Wed, 24 Jun 2015 05:02 am, Randall Smith wrote: > Chunks of data (about 2MB) are to be stored on machines using a > peer-to-peer protocol. The recipient of these chunks can't assume that > the payload is benign. While the data senders are supposed to encrypt > data, that's not guaranteed, and I'd like to protect the recipient > against exposure to nefarious data by mangling or encrypting the data > before it is written to disk. I don't understand how mangling the data is supposed to protect the recipient. Don't they have the ability to unmangle the data, and thus expose themselves to whatever nasties are in the files? If not, you can save all that time and effort implementing the peer-to-peer business and just dump 2MB chunks of random data on their disks. > My original idea was for the recipient to encrypt using AES. But I want > to keep this software pure Python "batteries included" and not require > installation of other platform-dependent software. Pure Python AES and > even DES are just way too slow. I don't know that I really need > encryption here, but some type of fast mangling algorithm where a bad > actor sending a payload can't guess the output ahead of time. Again, I don't understand your threat model here. Why does the bad actor need to guess the mangling? Putting on my Black Hat and twirling my moustache wickedly, I decide to send you a JPG of Goatse. (Don't google it.) Or, a more serious threat, a zip bomb: http://www.ghacks.net/2008/07/27/42-kilobytes-unzipped-make-45-petabytes/ or malware of some description. So I P2P you the file. How it gets encrypted on your disk is irrelevant to me, eventually you're going to unencrypt it and try to access it. We need to understand what threat you are defending against before we can advise you. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: To write headers once with different values in separate row in CSV
On Tuesday, June 23, 2015 at 9:50:50 PM UTC-4, Steven D'Aprano wrote: > On Wed, 24 Jun 2015 03:15 am, Sahlusar wrote: > > > That is not the underlying issue. Any thoughts or suggestions would be > > very helpful. > > > Thank you for spending over 100 lines to tell us what is NOT the underlying > issue. I will therefore tell you what is NOT the solution to your problem > (whatever it is, since I can't tell). The solution is NOT to squeeze lemon > juice into your keyboard. > > If someday you feel like telling us what the issue actually IS, instead of > what it IS NOT, then perhaps we will have a chance to help you find a > solution. > > > > -- > Steven Curious - what should I have provided? Detailed and constructive feedback (like your reply to my post regarding importing functions) is more useful than to "squeeze lemon juice" into one's keyboard. -- https://mail.python.org/mailman/listinfo/python-list
Re: Organizing function calls once files have been moved to a directory
On Tuesday, June 23, 2015 at 10:18:43 PM UTC-4, Steven D'Aprano wrote: > On Wed, 24 Jun 2015 06:16 am, kbtyo wrote: > > > I am working on a workflow module that will allow one to recursively check > > for file extensions and if there is a match move them to a folder for > > processing (parsing, data wrangling etc). > > > > I have a simple search process, and log for the files that are present > > (see below). However, I am puzzled by what the most efficient > > method/syntax is to call functions once the selected files have been > > moved? > > The most efficient syntax is the regular syntax that you always use when > calling a function: > > function(arg, another_arg) > > > What else would you use? > > > > I have the functions and classes written in another file. Should I > > import them or should I include them in the same file as the following > > mini-script? > > That's entirely up to you. Some factors you might consider: > > - Are these functions and classes reusable by other code? then you might > want to keep them separate in another file, treated as a library, and > import the library into your application. > > - If you merge the two files together, will it be so big that it is > difficult to work with? Then don't merge them together. My opinion is that > the decimal module from the standard library is about as big as a single > module should ever be, and it is almost 6,500 lines. So if your > application is bigger than that, you might want to split it. > > > > > Moreover, should I create another log file for processing? If so, what is > > an idiomatically correct method to do so? > > I don't know. Do you want a second log file? How will it be different from > the first? > > As for creating another log file, I guess the most correct way to do so > would be the same way you created the first log file. > > I'm not sure I actually understand your questions so far. 
> > Some further comments on your code: > > > if __name__ == '__main__': > > > > # The top argument for name in files > > topdir = '.' > > dest = 'C:\\Users\\wynsa2\\Desktop\\' > > Rather than escaping backslashes, you can use regular forward slashes: > > dest = 'C:/Users/wynsa2/Desktop/' > > > Windows will accept either. > > > > extens = ['docs', 'docx', 'pdf'] # the extensions to search for > > found = {x: [] for x in extens} # lists of found files > > > > # Directories to ignore > > ignore = ['docs', 'doc', 'py', 'pdf'] > > logname = "file_search.log" > > print('Beginning search for files in %s' % os.path.realpath(topdir)) > > > > # Walk the tree > > for dirpath, dirnames, files in os.walk(topdir): > > # Remove directories in ignore > > # directory names must match exactly! > > for idir in ignore: > > if idir in dirnames: > > dirnames.remove(idir) > > > > # Loop through the file names for the current step > > for name in files: > > #Calling str.rsplit on name then > > #splits the string into a list (from the right) > > #with the first argument "."" delimiting it, > > #and only making as many splits as the second argument (1). 
> > #The third part ([-1]) retrieves the last element of the list--we > > #use this instead of an index of 1 because if no splits are made > > #(if there is no "."" in name), no IndexError will be raised > > > > ext = name.lower().rsplit('.', 1)[-1] > > The better way to split the extension from the file name is to use > os.path.splitext(name): > > > py> import os > py> os.path.splitext("this/file.txt") > ('this/file', '.txt') > py> os.path.splitext("this/file") # no extension > ('this/file', '') > py> os.path.splitext("this/file.tar.gz") > ('this/file.tar', '.gz') > > > -- > Steven On Tuesday, June 23, 2015 at 10:18:43 PM UTC-4, Steven D'Aprano wrote: > On Wed, 24 Jun 2015 06:16 am, kbtyo wrote: > > > I am working on a workflow module that will allow one to recursively check > > for file extensions and if there is a match move them to a folder for > > processing (parsing, data wrangling etc). > > > > I have a simple search process, and log for the files that are present > > (see below). However, I am puzzled by what the most efficient > > method/syntax is to call functions once the selected files have been > > moved? > > The most efficient syntax is the regular syntax that you always use when > calling a file: > > function(arg, another_arg) > > > What else would you use? > > > > I have the functions and classes written in another file. Should I > > import them or should I include them in the same file as the following > > mini-script? > > That's entirely up to you. Some factors you might consider: > > - Are these functions and classes reusable by other code? then you might > want to keep them separate in another file, treated as a library, and > import the library into your application. I think I will do
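Putting the quoted advice together (os.walk to traverse, pruning ignored directories in place, os.path.splitext to get the extension), a search step might look roughly like the sketch below. The function name find_by_extension() and its parameters are hypothetical and not taken from the original script; moving and processing the files is left out.

```python
import os


def find_by_extension(topdir, extensions, ignore_dirs=()):
    """Return {'.ext': [paths]} for files under topdir with a wanted extension.

    Hypothetical helper for illustration. Extensions are compared
    case-insensitively; directory names in ignore_dirs must match exactly.
    """
    wanted = {"." + ext.lower().lstrip(".") for ext in extensions}
    found = {ext: [] for ext in wanted}

    for dirpath, dirnames, files in os.walk(topdir):
        # Assigning to the slice prunes the walk in place, so os.walk
        # never descends into the ignored directories.
        dirnames[:] = [d for d in dirnames if d not in ignore_dirs]

        for name in files:
            # splitext returns ('name', '.ext'), or ('name', '') when
            # the file has no extension at all.
            ext = os.path.splitext(name)[1].lower()
            if ext in wanted:
                found[ext].append(os.path.join(dirpath, name))

    return found
```

A call such as find_by_extension('.', ['docx', 'pdf'], ignore_dirs=['docs']) would then return the matching paths grouped by extension, ready to be handed to whatever processing functions are imported from the other file.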
Re: To write headers once with different values in separate row in CSV
On Wed, 24 Jun 2015 09:37 pm, kbtyo wrote: > On Tuesday, June 23, 2015 at 9:50:50 PM UTC-4, Steven D'Aprano wrote: >> On Wed, 24 Jun 2015 03:15 am, Sahlusar wrote: >> >> > That is not the underlying issue. Any thoughts or suggestions would be >> > very helpful. >> >> >> Thank you for spending over 100 lines to tell us what is NOT the >> underlying issue. I will therefore tell you what is NOT the solution to >> your problem (whatever it is, since I can't tell). The solution is NOT to >> squeeze lemon juice into your keyboard. >> >> If someday you feel like telling us what the issue actually IS, instead >> of what it IS NOT, then perhaps we will have a chance to help you find a >> solution. >> >> >> >> -- >> Steven > > Curious - what should I have provided? To start with, you should tell us what is the problem you are having. You gave us some code, and then said "That is not the underlying issue". Okay, so what is the underlying issue? What is the problem you want help solving? In another post, you responded to John Gordon's question: # John Have you tried creating some dummy data by hand and seeing how makerows() handles it? by answering: Yes I did do this. Okay. What was the result? Do you want us to guess what result you got? John also suggested that you provide sample data, and an implementation of flatten_dict, and your answer is: Yes, unfortunately, due to NDA protocols I cannot share this. You don't have to provide your *actual* data. You can provide *sample* data, that does not contain any of your actual confidential values. If your XML file looks like this: Gambardella, Matthew XML Developer's Guide Computer 44.95 2000-10-01 An in-depth look at creating applications with XML. 
you can replace the data: Smith, John ABCDEF Widgets .99 1900-01-01 blah blah blah blah You can even change the tags: Smith, John ABCDEF Widgets .99 1900-01-01 blah blah blah blah If you're still worried that the sample XML has the same structure as your real data, you can remove some fields and add new ones: ABCDEF .99 1900-01-01 fe fi fo fum blah blah blah blah If you can't share the flatten_dict() function, either: (1) get permission to share it from your manager or project leader. flatten_dict is not a trade secret or valuable in any way, and any half-competent Python programmer can probably come up with two or three different ways to flatten a dict in five minutes. They're all going to look more or less the same, because there's only so many ways to flatten a dict. (2) Or accept that we can't help you, and deal with it on your own. > Detailed and constructive feedback > (like your reply to my post regarding importing functions) is more useful > than to "squeeze lemon juice" into one's keyboard. Of course. That is why I said it was NOT the solution. Don't waste your time squeezing lemon juice over your keyboard, it won't solve your problem. But you can't expect us to guess what your problem is, or debug code we can't see, or read your mind and understand your data. Before you ask any more questions, please read this: http://sscce.org/ -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: To write headers once with different values in separate row in CSV
On Wednesday, June 24, 2015 at 8:38:24 AM UTC-4, Steven D'Aprano wrote: > On Wed, 24 Jun 2015 09:37 pm, kbtyo wrote: > > > On Tuesday, June 23, 2015 at 9:50:50 PM UTC-4, Steven D'Aprano wrote: > >> On Wed, 24 Jun 2015 03:15 am, Sahlusar wrote: > >> > >> > That is not the underlying issue. Any thoughts or suggestions would be > >> > very helpful. > >> > >> > >> Thank you for spending over 100 lines to tell us what is NOT the > >> underlying issue. I will therefore tell you what is NOT the solution to > >> your problem (whatever it is, since I can't tell). The solution is NOT to > >> squeeze lemon juice into your keyboard. > >> > >> If someday you feel like telling us what the issue actually IS, instead > >> of what it IS NOT, then perhaps we will have a chance to help you find a > >> solution. > >> > >> > >> > >> -- > >> Steven > > > > Curious - what should I have provided? > > To start with, you should tell us what is the problem you are having. You > gave us some code, and then said "That is not the underlying issue". Okay, > so what is the underlying issue? What is the problem you want help solving? > > In another post, you responded to John Gordon's question: > > # John > Have you tried creating some dummy data by hand and seeing > how makerows() handles it? > > > by answering: > > Yes I did do this. > > > Okay. What was the result? Do you want us to guess what result you got? > > > John also suggested that you provide sample data, and an implementation of > flatten_dict, and your answer is: > > Yes, unfortunately, due to NDA protocols I cannot share this. > > > You don't have to provide your *actual* data. You can provide *sample* data, > that does not contain any of your actual confidential values. If your XML > file looks like this: > > > > > Gambardella, Matthew > XML Developer's Guide > Computer > 44.95 > 2000-10-01 > An in-depth look at creating applications > with XML. 
> > > > > you can replace the data: > > > > > Smith, John > ABCDEF > Widgets > .99 > 1900-01-01 > blah blah blah blah > > > > > You can even change the tags: > > > > > > Smith, John > ABCDEF > Widgets > .99 > 1900-01-01 > blah blah blah blah > > > > > If you're still worried that the sample XML has the same structure as your > real data, you can remove some fields and add new ones: > > > > > ABCDEF > .99 > 1900-01-01 > fe fi fo fum > blah blah blah blah > > > > > If you can't share the flatten_dict() function, either: > > (1) get permission to share it from your manager or project leader. > flatten_dict is not a trade secret or valuable in any way, and > half-competent Python programmer can probably come up with two or three > different ways to flatten a dict in five minutes. They're all going to look > more or less the same, because there's only so many ways to flatten a dict. > > (2) Or accept that we can't help you, and deal with it on your own. > > > > > Detailed and constructive feedback > > (like your reply to my post regarding importing functions) is more useful > > than to "squeeze lemon juice" into one's keyboard. > > Of course. That is why I said it was NOT the solution. Don't waste your time > squeezing lemon juice over your keyboard, it won't solve your problem. > > But you can't expect us to guess what your problem is, or debug code we > can't see, or read your mind and understand your data. > > Before you ask any more questions, please read this: > > http://sscce.org/ > > > > -- > Steven
Re: Pure Python Data Mangling or Encrypting
On 2015-06-24, Steven D'Aprano wrote: > On Wed, 24 Jun 2015 05:02 am, Randall Smith wrote: > >> Chunks of data (about 2MB) are to be stored on machines using a >> peer-to-peer protocol. The recipient of these chunks can't assume that >> the payload is benign. While the data senders are supposed to encrypt >> data, that's not guaranteed, and I'd like to protect the recipient >> against exposure to nefarious data by mangling or encrypting the data >> before it is written to disk. > > I don't understand how mangling the data is supposed to protect the > recipient. Don't they have the ability to unmangle the data, and thus expose > themselves to whatever nasties are in the files? And how does writing unmangled data to disk expose anybody to anything? I've never heard of an exploit where writing an evilly crafted bit-pattern to disk causes any sort of problem. -- Grant Edwards grant.b.edwards at gmail.com Yow! My mind is making ashtrays in Dayton ... -- https://mail.python.org/mailman/listinfo/python-list
EuroPython 2015: Beginner’s Day
We’re pleased to announce a new venture at this year’s EuroPython... *** The EuroPython Beginner’s Day *** https://ep2015.europython.eu/en/events/beginners-day/ If you’re thinking of coming to the conference but you’re new to Python, this could be the session for you. Whether you’re totally new to programming or you already know another language, this day is to give you a crash-course in Python, and the ecosystem around it, to give you the context you need to get the most out of EuroPython. Bring your laptop, as a large part of the day will be devoted to learning Python on your own PC. This session will take place on the first day of the conference, the Monday. It will be presented in English (although a few of the coaches do speak basic Spanish, French and Italian). Sessions will include: * A high-level introduction to Python and programming in general. Where did Python come from, what is programming all about, and what do I need to know to understand all these in-jokes about cheese shops? * A self-directed learning session, with specific tutorials for total beginners and more experienced programmers, accompanied by coaches who will be there to answer your questions and help you when you get stuck. Learn at your own pace! * A session on the Python “ecosystem” An introduction to the Python ecosystem: some topics and bits of jargon that are bound to come up this week: open source, free software, github, packages, pip, pypi, scientific computing, scipy, numpy, pandas, ipython notebook, web frameworks, django, flask, asyncio, the BDFL, the Zen of Python, etc etc. What are the tools, areas of interest, in-jokes, people of note. * “How to get the best out of the conference” - recommended talks, what to do at lunchtimes or in the evenings, tips on when and how to ask questions (hint: as often as possible!), what an “open space” is, and more. 
We really need to get an idea of numbers for this session, so if you are interested in attending, please drop a quick email to Harry Percival from the program work group. Also, be sure to get your tickets in time, since ticket sales have picked up a lot since we announced the schedule. PS: We’re also looking for volunteers to help with coaching students during this session. If you enjoy teaching Python to beginners, and you don’t mind sacrificing your EuroPython Monday to it, please do get in touch with Harry Percival ! Enjoy, -- EuroPython 2015 Team http://ep2015.europython.eu/ http://www.europython-society.org/ -- https://mail.python.org/mailman/listinfo/python-list
Re: EuroPython 2015: Beginner’s Day
On Thu, Jun 25, 2015 at 12:06 AM, M.-A. Lemburg wrote: > * A high-level introduction to Python and programming in general. >Where did Python come from, what is programming all about, and what >do I need to know to understand all these in-jokes about cheese >shops? > > * A session on the Python “ecosystem” An introduction to the Python >ecosystem: some topics and bits of jargon that are bound to come up >this week: open source, free software, github, packages, pip, pypi, >scientific computing, scipy, numpy, pandas, ipython notebook, web >frameworks, django, flask, asyncio, the BDFL, the Zen of Python, >etc etc. What are the tools, areas of interest, in-jokes, people >of note. Heh, I wish this had been available when I started digging into Python. Took me ages to figure out some of the in-jokes ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On 6/24/2015 7:02 AM, Grant Edwards wrote: And how does writing unmangled data to disk expose anybody to anything? I've never heard of an exploit where writing an evilly crafted bit-pattern to disk causes any sort of problem. Unless that code is executed at boot. Mangling would at least prevent it from executing. Emile -- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On Thu, Jun 25, 2015 at 1:52 AM, Emile van Sebille wrote: > On 6/24/2015 7:02 AM, Grant Edwards wrote: >> >> And how does writing unmangled data to disk expose anybody to >> anything? I've never heard of an exploit where writing an evilly >> crafted bit-pattern to disk causes any sort of problem. > > > Unless that code is executed at boot. Mangling would at least prevent it > from executing. Or it's on Windows. It's pretty easy to trick Windows into running some code somewhere. But you can often disrupt that by simply renaming the file to have no extension. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On 6/24/2015 8:55 AM, Chris Angelico wrote: On Thu, Jun 25, 2015 at 1:52 AM, Emile van Sebille wrote: On 6/24/2015 7:02 AM, Grant Edwards wrote: And how does writing unmangled data to disk expose anybody to anything? I've never heard of an exploit where writing an evilly crafted bit-pattern to disk causes any sort of problem. Unless that code is executed at boot. Mangling would at least prevent it from executing. Or it's on Windows. It's pretty easy to trick Windows into running some code somewhere. But you can often disrupt that by simply renaming the file to have no extension. ISTR that Windows may look into the file to see if it can 'guess' the appropriate application, so dropping the extension may not be sufficient. But maybe they've changed that as my Windows experience doesn't run much past XP. Emile -- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On 2015-06-24, Emile van Sebille wrote: > On 6/24/2015 7:02 AM, Grant Edwards wrote: >> And how does writing unmangled data to disk expose anybody to >> anything? I've never heard of an exploit where writing an evilly >> crafted bit-pattern to disk causes any sort of problem. > > Unless that code is executed at boot. Don't write it somewhere where that might happen. [Of course you don't let a remote user determine where the untrusted data gets written -- that would be completely beyond the pale.] Or does Windows pick files at random from the disk and execute them? > Mangling would at least prevent it from executing. If you don't want a file to be executed, then don't make it executable. Or doesn't Windows have any way to control whether a file is executable or not? -- Grant Edwards grant.b.edwards at gmail.com Yow! You were s'posed to laugh! -- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On Thu, Jun 25, 2015 at 2:16 AM, Grant Edwards wrote: > On 2015-06-24, Emile van Sebille wrote: >> Mangling would at least prevent it from executing. > > If you don't want a file to be executed, then don't make it > executable. Or doesn't Windows have any way to control whether a file > is executable or not? Windows doesn't have the Unix file system concept of execute permission, no. If a file has the .exe extension and the first 512 bytes look like an appropriate header (MZ etc), Windows will happily run it. With other extensions, similarly - just create a .bat file and double-click it, it'll run the commands. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
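As a rough illustration of the header check mentioned above: DOS/Windows executables begin with the two bytes "MZ", so a recipient could at least notice when an incoming chunk starts that way. The function name is hypothetical, and this is only a sketch of the idea; it looks at nothing beyond those two bytes and is not a substitute for real malware scanning.

```python
def looks_like_dos_executable(chunk):
    """Return True if the byte string starts with the 'MZ' signature.

    Hypothetical helper for illustration: it inspects only the first
    two bytes and knows nothing about the rest of the file format.
    """
    return chunk[:2] == b"MZ"


print(looks_like_dos_executable(b"MZ\x90\x00\x03\x00"))   # executable-style header
print(looks_like_dos_executable(b"PK\x03\x04rest-of-zip"))  # zip-style header
```

Any mangling that changes the leading bytes would make such a chunk fail this check, which is the sense in which mangling "prevents it from executing" in the discussion above.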
Why does the unit test fail of the pyPDF2 package?
Hi, I want to learn how to work with PDF files in code. After I downloaded and installed pyPDF2, the unit test that ships with the package failed. Here is a link to a screenshot of the console output: http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG] This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed. I don't know whether the two conflict. Thanks, -- https://mail.python.org/mailman/listinfo/python-list
Re: Why does the unit test fail of the pyPDF2 package?
On Thu, 25 Jun 2015 02:53 am, fl wrote: > Hi, > I want to learn some coding on PDF. After I download and install pyPDF2, > it cannot pass unit test, which is coming from the package. > > I put a screen shot link here to show the console message: Please don't use screen shots: (1) We cannot copy and paste text from a screen shot. (2) Blind people and those with poor vision may not be able to see the screen shot, and their screen readers do not work on images. (3) People may be reading your post via email or news, and not be able to access the website containing the screen shot. Instead, copy and paste the text from the console and include it in the body of your message. Do you know how to copy text from the Windows console? > http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc > > [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG] Have you read the test failure? It is obvious why it failed. Your test expects to get: "TheCrazyOnesOctober14,1998Herestothecrazyones..." but instead gets: "TheCrazyOnes\nOctober14,1998\nHerestothecrazyones..." The difference is obvious. Your expected output doesn't include newlines. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
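The mismatch described above can be reproduced with the two strings from the reply. The snippet below is only an illustration of why the comparison fails and of one possible way to compare while ignoring line breaks; the helper strip_newlines() is made up for the example and is not part of pyPDF2 or its test suite.

```python
expected = "TheCrazyOnesOctober14,1998Herestothecrazyones..."
actual = "TheCrazyOnes\nOctober14,1998\nHerestothecrazyones..."


def strip_newlines(text):
    """Remove line breaks so only the remaining characters are compared."""
    return text.replace("\r", "").replace("\n", "")


# The raw strings differ only by the embedded newlines.
print(actual == expected)                   # False
print(strip_newlines(actual) == expected)   # True
```

Whether the test or the extraction code should change depends on which behaviour is intended; the sketch only shows where the difference comes from.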
Looking for people who are using Hypothesis and are willing to say so
Hi there, Author of Hypothesis here. (If you don't know what Hypothesis is, you're probably not the target audience for this email but you should totally check it out: https://hypothesis.readthedocs.org/ Unless you like spending ages writing tests and still shipping buggy code). I keep finding out about new people using Hypothesis who I've never heard of. e.g. turns out that depending on how you count there are between two and four talks about Hypothesis happening at EuroPython this year, and many of them are from people I don't know. On the one hand, it's great that people are using and excited about it! No complaints from me there. I was bowled over when I found out about the EuroPython talks. On the other hand, it's really quite useful to have more visibility of usage - both for me to have it and also for other people to see - it's a much easier sell that people should start using it if they can see that lots of other people are too. SO, the point. If you are one of those people using Hypothesis, I'd really like it if you would say so publicly. The #1 best way for you to do this for me is to add your name and usage to https://github.com/DRMacIver/hypothesis/blob/master/docs/endorsements.rst so it will show up on the endorsements page at http://hypothesis.readthedocs.org/en/latest/endorsements.html I'm also thrilled that people are speaking about it, and would love more talks and blog posts about it. Even tweeting enthusiastically about it is good too. Finally, if you are using Hypothesis but can't/don't want to speak about doing so publicly because your company is doing super top secret stuff (or any other reason), I'd really appreciate just a short email saying roughly what sort of things you're using it for, maybe give me an idea of your workflow. 
The other reason that I want to know who is using it is so I can learn where it needs to improve and also help other people use it better (I'm pretty sure some of the users of Hypothesis at this point have a better idea how to deploy it than I do). Regards and thanks, David R. MacIver -- https://mail.python.org/mailman/listinfo/python-list
Re: Why does the unit test fail of the pyPDF2 package?
On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
> Hi,
> I want to learn some coding on PDF. After I download and install pyPDF2,
> it cannot pass unit test, which is coming from the package.
>
> I put a screen shot link here to show the console message:
>
> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
>
> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
>
> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.
>
> I don't know whether it has conflicts or not.
>
> Thanks,

Thanks, Steven. I don't know how to copy the contents of the command console window into a forum post. I even tried redirecting the screen output to a text file, but that failed.

Yes, there are extra '\n' characters in the extracted text, but I don't know how to suppress them. Does anyone know how to make the extracted text match the expected text?

Thanks,
-- https://mail.python.org/mailman/listinfo/python-list
windows and file names > 256 bytes
Hi,

Consider the following calls, where very_long_path is more than 256 bytes:

[1] os.mkdir(very_long_path)
[2] os.path.getsize(very_long_path)
[3] shutil.rmtree(very_long_path)

I am using Python 2.7. [1] and [2] fail under Windows XP; [3] fails under Win7 (not sure about XP). This happens even when I use the "special" notations \\?\c:\dir\file or \\?\UNC\server\share\file, e.g. os.path.getsize("\\\\?\\" + "c:\\dir\\file"). (Oddly, os.path.getsize(os.path.join("\\\\?\\", "c:\\dir\\file")) will truncate the prefix.)

My questions:
1. How can I get the file size of very long paths under XP?
2. Is this a bug in Python? I would prefer it if Python dealt with the gory details of Windows' silly behavior.

Regards,
Albert-Jan
-- https://mail.python.org/mailman/listinfo/python-list
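For reference, a minimal sketch of how the extended-length form of a path could be built. The helper name to_extended_path is hypothetical (it is not part of the standard library), and the sketch assumes the input is already an absolute drive path or UNC path, because Windows does no normalization after the \\?\ prefix. It only manipulates strings via ntpath, so it does not touch the filesystem. On Python 2.7 the result probably also has to be a unicode string for the prefix to be honored; that is an assumption worth checking on the target system.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"            # the literal characters \\?\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"   # the literal characters \\?\UNC\


def to_extended_path(path):
    """Return the extended-length form of an absolute Windows path.

    Hypothetical helper: assumes an ordinary drive path (c:\\dir\\file)
    or UNC path (\\\\server\\share\\file), not a device path.
    """
    if path.startswith(EXTENDED_PREFIX):
        return path                      # already in extended form
    # Normalize first: "/" and ".." are not interpreted after the prefix.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # \\server\share\file  ->  \\?\UNC\server\share\file
        return EXTENDED_UNC_PREFIX + path[2:]
    # c:\dir\file  ->  \\?\c:\dir\file
    return EXTENDED_PREFIX + path
```

A call such as os.path.getsize(to_extended_path(very_long_path)) would then be the thing to try. Note that the prefix is joined by plain concatenation: os.path.join discards everything before an absolute second argument, which would explain the truncated prefix mentioned above.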
Re: Pure Python Data Mangling or Encrypting
On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
> I don't understand how mangling the data is supposed to protect the recipient. Don't they have the ability to unmangle the data, and thus expose themselves to whatever nasties are in the files?

They never look at the data and wouldn't care to unmangle it. The purpose is primarily to prevent automated software (file indexers, virus scanners) from doing bad things to the data.

-Randall
-- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On 06/24/2015 02:44 AM, Devin Jeanpierre wrote:
> How about a random substitution cipher? This will be ultra-weak, but fast (using bytes.translate/bytes.maketrans) and seems to be the kind of thing you're asking for.
>
> -- Devin

I tried this out and it seems to be just what I need. Thanks Devin! It's pure Python, fast, and mangles the data sufficiently.

--Randall
-- https://mail.python.org/mailman/listinfo/python-list
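The suggestion above can be sketched roughly as follows. This is an illustration under assumptions, not the code actually used in the thread: it targets Python 3, the names make_tables and mangle are hypothetical, and the table is drawn from random.SystemRandom so that a sender cannot predict it. As the thread itself says, this is obfuscation and not encryption.

```python
import random

IDENTITY = bytes(range(256))  # the 256 byte values in order


def make_tables(rng=None):
    """Return (encode_table, decode_table) for a random byte substitution."""
    if rng is None:
        rng = random.SystemRandom()   # OS entropy, not a seeded generator
    permutation = list(range(256))
    rng.shuffle(permutation)
    encode_table = bytes(permutation)
    # maketrans(frm, to) maps frm[i] -> to[i], so this inverts encode_table.
    decode_table = bytes.maketrans(encode_table, IDENTITY)
    return encode_table, decode_table


def mangle(data, table):
    """Apply a 256-byte translation table to a bytes object."""
    return data.translate(table)
```

The storage node would keep both tables, apply mangle(chunk, encode_table) before writing to disk, and apply mangle(stored, decode_table) before returning the chunk. The translation runs in C inside bytes.translate, which is why it stays fast in pure Python.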
Re: Pure Python Data Mangling or Encrypting
On 2015-06-24, Chris Angelico wrote:
> On Thu, Jun 25, 2015 at 2:16 AM, Grant Edwards wrote:
>> On 2015-06-24, Emile van Sebille wrote:
>>
>>> Mangling would at least prevent it from executing.
>>
>> If you don't want a file to be executed, then don't make it executable. Or doesn't Windows have any way to control whether a file is executable or not?
>
> Windows doesn't have the Unix file system concept of execute permission, no. If a file has the .exe extension and the first 512 bytes look like an appropriate header (MZ etc), Windows will happily run it. With other extensions, similarly - just create a .bat file and double-click it, it'll run the commands.

So you can prevent execution just by changing the filename? Maybe 30 years using Unix has biased me, but that just seems so wrong...

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! Here I am in the POSTERIOR OLFACTORY LOBULE but I don't see CARL SAGAN anywhere!!
-- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On 2015-06-24, Randall Smith wrote:
> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>
>> I don't understand how mangling the data is supposed to protect the recipient. Don't they have the ability to unmangle the data, and thus expose themselves to whatever nasties are in the files?
>
> They never look at the data and wouldn't care to unmangle it.

I obviously don't "get it". If the recipient is never going to look at the data or unmangle it, why not convert every received file to a single null byte? That way you save on disk space as well -- especially if you just create links for all files after the initial one. ;)

[I suppose next you're going to tell me that Windows filesystems don't support links.]

> The purpose is primarily to prevent automated software (file indexers, virus scanners) from doing bad things to the data.

Life under Windows must be more tiresome than I imagined (or could imagine) if you have to jump through such hoops to keep "automated software" from doing bad things to your data files.

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! My mind is making ashtrays in Dayton ...
-- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On 06/24/2015 07:19 AM, Dennis Lee Bieber wrote:
> Pardon, but that description has me confused. Perhaps I just don't understand the full use-case.
>
> Who exactly is supposed to be protected from what? You state "data senders are supposed to encrypt" which, if the recipient doesn't have the decryption key, implies the recipient -- isn't the real recipient but just a transport/storage place until the data is retrieved by the end-user.

You got it. I didn't want to explain any more than necessary. But yes, the recipient just stores the data for the end-user.

> If "you" do the encryption on the storage machine, then you need to also do the decryption when returning the data to the end-user -- which means the key is available somewhere on the storage machine, and the local user might obtain access to it and the stored data.

Right again. A legitimate data owner would encrypt the data. The storage machine is encrypting to protect itself against unwanted exposure to unencrypted malware. Not that they would go looking at the files, but their virus scanner or file indexer might.

> Given the assumptions I'm making, my recommendation is likely to be something on the nature of: use an OS designed with security at the core of the file system; each sender has their own login UID, and the file system is configured to grant r/w access only to the login -- no execute permissions, no access by someone not logged in as that user, etc.

Yes. This is done for "imaged" systems, but I don't have control over the storage machines.

I'm leaning towards using a random substitution cipher suggested by Devin Jeanpierre. If you see any weaknesses in that solution, I'd like to hear them.

Thanks for your response.

--Randall
-- https://mail.python.org/mailman/listinfo/python-list
Re: Why does the unit test fail of the pyPDF2 package?
On 2015-06-24 18:52, fl wrote:
> On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
>> Hi,
>> I want to learn some coding on PDF. After I download and install pyPDF2, it cannot pass unit test, which is coming from the package.
>>
>> I put a screen shot link here to show the console message:
>>
>> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
>>
>> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
>>
>> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.
>>
>> I don't know whether it has conflicts or not.
>>
>> Thanks,
>
> Thanks, Steven. I don't know how to copy command console window contents to the forum post. I even try redirection hoping to screen contents to a text file, but it fails.

You can make a rectangular selection by dragging the mouse pointer over the console window.

> Yes, there are extra '\n' in the extracted, but I don't know how to suppress it. Does anyone know how to make it the same of the expected?
>
> Thanks,
-- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On 06/24/2015 01:29 PM, Grant Edwards wrote:
> On 2015-06-24, Randall Smith wrote:
>> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>>> I don't understand how mangling the data is supposed to protect the recipient. Don't they have the ability to unmangle the data, and thus expose themselves to whatever nasties are in the files?
>>
>> They never look at the data and wouldn't care to unmangle it.
>
> I obviously don't "get it". If the recipient is never going to look at the data or unmangle it, why not convert every received file to a single null byte? That way you save on disk space as well -- especially if you just create links for all files after the initial one. ;)
>
> [I suppose next you're going to tell me that Windows filesystems don't support links.]
>
>> The purpose is primarily to prevent automated software (file indexers, virus scanners) from doing bad things to the data.
>
> Life under windows must be more tiresome than I imagined (or could imagine) if you have to jump through such hoops to keep "automated software" from doing bad things to your data files.

These are machines storing chunks of other people's data. The data owner chunks a file, compresses and encrypts it, then sends it to several storage servers. The storage server might be a Raspberry PI with a USB disk or a Windows XP machine - I can't know which.

I don't use Windows and don't recommend it for this software. Nevertheless, many people do use it.

-Randall
-- https://mail.python.org/mailman/listinfo/python-list
Re: Why does the unit test fail of the pyPDF2 package?
On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
> Hi,
> I want to learn some coding on PDF. After I download and install pyPDF2,
> it cannot pass unit test, which is coming from the package.
>
> I put a screen shot link here to show the console message:
>
> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
>
> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
>
> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.
>
> I don't know whether it has conflicts or not.
>
> Thanks,

> You can make a rectangular selection by dragging over the console window the mouse pointer.

Excuse me, I don't understand your suggestion. In the command window, nothing is copied when I click or drag with the mouse (the screen doesn't even change). Do you mean using the Snipping Tool? That would produce an image, which a previous poster advised against.

Thanks,
-- https://mail.python.org/mailman/listinfo/python-list
Re: Why does the unit test fail of the pyPDF2 package?
In <16dfcef6-4740-45b9-b04f-0f5bc0899...@googlegroups.com> fl writes: > Excuse me. I don't understand your idea. On the command window, there is > no content copied through a mouse click/drag (even no screen difference). Right-click the command window title bar and select Edit -> Mark. Then use the mouse to select a rectangular area of text. Then right-click the title bar again and select Edit -> Copy. -- John Gordon A is for Amy, who fell down the stairs gor...@panix.com B is for Basil, assaulted by bears -- Edward Gorey, "The Gashlycrumb Tinies" -- https://mail.python.org/mailman/listinfo/python-list
Re: Why does the unit test fail of the pyPDF2 package?
On 2015-06-24, fl wrote: > You can make a rectangular selection by dragging over the console > window the mouse pointer. > > Excuse me. I don't understand your idea. On the command window, there is > no content copied through a mouse click/drag (even no screen difference). > Do you mean using Snipping Tool? That will be an image, which is not > advised as a previous poster. Click on the icon at the top-left of the command window. Choose "Edit->Mark". Drag to select the text you want. Click the menu again, and choose "Edit->Copy". -- https://mail.python.org/mailman/listinfo/python-list
Re: Why does the unit test fail of the pyPDF2 package?
On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
> Hi,
> I want to learn some coding on PDF. After I download and install pyPDF2,
> it cannot pass unit test, which is coming from the package.
>
> I put a screen shot link here to show the console message:
>
> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
>
> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
>
> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.
>
> I don't know whether it has conflicts or not.
>
> Thanks,

Thanks for the trick! Now I know how new I am to Windows. Below are the installation messages and the unittest output. Suspecting that Linux and Windows differ in how they handle '\n', I also installed pyPDF2 on Ubuntu. It gives the same error. What is going on with pyPDF2? I don't understand what its test script is for. This exercise is also part of my learning Python. Does anyone have the same or a different experience with pyPDF2? Thanks again.

/ ImportError: No module named Tests

C:\Python27\Tools\PyPDF2-master\Tests>cd ..

C:\Python27\Tools\PyPDF2-master>C:\python27\python.exe setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\PyPDF2
copying PyPDF2\filters.py -> build\lib\PyPDF2
copying PyPDF2\generic.py -> build\lib\PyPDF2
copying PyPDF2\merger.py -> build\lib\PyPDF2
copying PyPDF2\pagerange.py -> build\lib\PyPDF2
copying PyPDF2\pdf.py -> build\lib\PyPDF2
copying PyPDF2\utils.py -> build\lib\PyPDF2
copying PyPDF2\xmp.py -> build\lib\PyPDF2
copying PyPDF2\_version.py -> build\lib\PyPDF2
copying PyPDF2\__init__.py -> build\lib\PyPDF2
running install_lib
creating C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\filters.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\generic.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\merger.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\pagerange.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\pdf.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\utils.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\xmp.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\_version.py -> C:\python27\Lib\site-packages\PyPDF2
copying build\lib\PyPDF2\__init__.py -> C:\python27\Lib\site-packages\PyPDF2
byte-compiling C:\python27\Lib\site-packages\PyPDF2\filters.py to filters.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\generic.py to generic.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\merger.py to merger.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\pagerange.py to pagerange.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\pdf.py to pdf.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\utils.py to utils.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\xmp.py to xmp.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\_version.py to _version.pyc
byte-compiling C:\python27\Lib\site-packages\PyPDF2\__init__.py to __init__.pyc
running install_egg_info
Writing C:\python27\Lib\site-packages\PyPDF2-1.24-py2.7.egg-info

C:\Python27\Tools\PyPDF2-master>C:\python27\python.exe -m unittest Tests.tests > > logt
F
======================================================================
FAIL: test_PdfReaderFileLoad (Tests.tests.PdfReaderTestCases)
Test loading and parsing of a file. Extract text of the file and compare to expected
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Tests\tests.py", line 35, in test_PdfReaderFileLoad
    % (pdftext, ipdf_p1_text.encode('utf-8', errors='ignore')))
AssertionError: PDF extracted text differs from expected value.

Expected:

'TheCrazyOnesOctober14,1998Herestothecrazyones.Themis\xcb\x9dts.Therebels.Thetro
ublemakers.Theroundpegsinthesquareholes.Theoneswhoseethingsdi\xcb\x99erently.The
yrenotfondofrules.Andtheyhavenorespectforthestatusquo.Youcanquotethem,disagreewi
ththem,glorifyorvilifythem.Abouttheonlythingyoucantdoisignorethem.Becausetheycha
ngethings.Theyinvent.Theyimagine.Theyheal.Theyexplore.Theycreate.Theyinspire.The
ypushthehumanraceforward.Maybetheyhavetobecrazy.Howelsecanyoustareatanemptycanva
sandseeaworkofart?Orsitinsilenceandhearasongthatsneverbeenwritten?Orgazeataredpl
anetandseealaboratoryonwheels?Wemaketoolsforthesekindsofpeople.Whilesomeseethema
sthecrazyones,weseegenius.Becausethepeoplewhoarecrazyenoughtothinktheycanchanget
heworld,aretheoneswhodo.'

Extracted:

'TheCrazyOnes\nOctober14,1998\nHerestothecrazyones.Themis\xcb\x9dts.Therebels.Th
etroublemakers.\nTheroundpegsinthesquareholes.\nTheoneswhoseethingsdi\xcb\x99ere
ntly.Theyrenotfondofrules.And\ntheyhavenorespectforthestatusquo.Youcanquotethem,
\ndisagreewiththem,glorifyorvilifythem.\nAbouttheonlythingyoucantdoisignorethem.
Becausetheychange\nthings.Theyinvent.Theyimagine.Theyheal.Theyexplore.They\
Re: Why does the unit test fail of the pyPDF2 package?
On 24/06/2015 19:48, MRAB wrote:
> On 2015-06-24 18:52, fl wrote:
>> On Wednesday, June 24, 2015 at 9:54:12 AM UTC-7, fl wrote:
>>> Hi,
>>> I want to learn some coding on PDF. After I download and install pyPDF2, it cannot pass unit test, which is coming from the package.
>>>
>>> I put a screen shot link here to show the console message:
>>>
>>> http://tinypic.com/view.php?pic=fbdpg0&s=8#.VYre8_lVhBc
>>>
>>> [IMG]http://i57.tinypic.com/fbdpg0.png[/IMG]
>>>
>>> This Windows 7 PC has both Python 2.7 and Enthought Canopy (3.4?) installed.
>>>
>>> I don't know whether it has conflicts or not.
>>>
>>> Thanks,
>>
>> Thanks, Steven. I don't know how to copy command console window contents to the forum post. I even try redirection hoping to screen contents to a text file, but it fails.
>
> You can make a rectangular selection by dragging over the console window the mouse pointer.

An alternative is to install ConEmu and set up a startup.txt file. My heavily encrypted :) version follows

cmd /F:ON /T:02 /K cd C:\Users\Mark\Documents\MyPython "-new_console:t:MyPython"
cmd /F:ON /T:02 /K cd c:\Users\Mark\pythonissues "-new_console:t:Python Issues"
cmd /F:ON /T:02 /K cd c:\cPython\PCBuild "-new_console:t:cPython"
cmd /F:ON /T:02 /K cd C:\Users\Mark\Documents\Cash\Python "-new_console:t:Cash Python"
C:\Python34\Scripts\ipython.exe --matplotlib "-new_console:t:iPython"

Once you've overcome the encryption life is far, far easier on Windows. You can do really advanced concepts like cut and paste relatively easily, but please don't take my word for it, try it yourself.

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
-- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On 2015-06-24, Randall Smith wrote:
> On 06/24/2015 01:29 PM, Grant Edwards wrote:
>> On 2015-06-24, Randall Smith wrote:
>>> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>>>> I don't understand how mangling the data is supposed to protect the recipient. Don't they have the ability to unmangle the data, and thus expose themselves to whatever nasties are in the files?
>>>
>>> They never look at the data and wouldn't care to unmangle it.
>>
>> I obviously don't "get it". If the recipient is never going to look at the data or unmangle it, why not convert every received file to a single null byte? That way you save on disk space as well -- especially if you just create links for all files after the initial one. ;)
>
> These are machines storing chunks of other people's data. The data owner chunks a file, compresses and encrypts it, then sends it to several storage servers. The storage server might be a Raspberry PI with a USB disk or a Windows XP machine - I can't know which.

OK. But if the recipient (the server) mangles the data and then never unmangles or reads the data, there doesn't seem to be any point in storing it. I must be misunderstanding your statement that the data is never read/unmangled.

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! A can of ASPARAGUS, 73 pigeons, some LIVE ammo, and a FROZEN DAQUIRI!!
-- https://mail.python.org/mailman/listinfo/python-list
Re: Looking for people who are using Hypothesis and are willing to say so
David MacIver writes:
> Author of Hypothesis here. (If you don't know what Hypothesis is, you're probably not the target audience for this email but you should totally check it out: https://hypothesis.readthedocs.org/

Oh very cool: a QuickCheck-like unit test library. I heard of something like that for Python recently, which might or might not have been Hypothesis. I certainly plan to try it out. The original QuickCheck (for Haskell) used the static type signatures on the functions under test to know what test cases to generate, but Erlang QuickCheck has had some good successes, including finding some subtle bugs during development in the HAMT (Clojure-like hash array mapped trie) implementation just released with Erlang/OTP 18.0 this week. I see Hypothesis uses decorators that look sort of like Erlang Dialyzer, so that can help with test cases. Maybe later it could use Python 3 type annotations, though I think those are still much less precise than Dialyzer or Haskell types.
-- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
Freenet seems to come to mind.. :)

On Wed, Jun 24, 2015 at 4:24 PM, Grant Edwards wrote:
> On 2015-06-24, Randall Smith wrote:
>> On 06/24/2015 01:29 PM, Grant Edwards wrote:
>>> On 2015-06-24, Randall Smith wrote:
>>>> On 06/24/2015 06:36 AM, Steven D'Aprano wrote:
>>>>> I don't understand how mangling the data is supposed to protect the recipient. Don't they have the ability to unmangle the data, and thus expose themselves to whatever nasties are in the files?
>>>>
>>>> They never look at the data and wouldn't care to unmangle it.
>>>
>>> I obviously don't "get it". If the recipient is never going to look at the data or unmangle it, why not convert every received file to a single null byte? That way you save on disk space as well -- especially if you just create links for all files after the initial one. ;)
>>
>> These are machines storing chunks of other people's data. The data owner chunks a file, compresses and encrypts it, then sends it to several storage servers. The storage server might be a Raspberry PI with a USB disk or a Windows XP machine - I can't know which.
>
> OK. But if the recipient (the server) mangles the data and then never unmangles or reads the data, there doesn't seem to be any point in storing it. I must be misunderstanding your statement that the data is never read/unmangled.
>
> --
> Grant Edwards               grant.b.edwards at gmail.com
> Yow! A can of ASPARAGUS, 73 pigeons, some LIVE ammo, and a FROZEN DAQUIRI!!
> --
> https://mail.python.org/mailman/listinfo/python-list
-- https://mail.python.org/mailman/listinfo/python-list
Re: Pure Python Data Mangling or Encrypting
On 06/24/2015 04:24 PM, Grant Edwards wrote:
> OK. But if the recipient (the server) mangles the data and then never unmangles or reads the data, there doesn't seem to be any point in storing it. I must be misunderstanding your statement that the data is never read/unmangled.

When the storage server sends the data (on request), it decodes the data before sending. I'm currently testing this on a Raspberry PI, using a random substitution with bytearray.maketrans and bytearray.translate, and it is working quite well. Thanks.

-Randall
-- https://mail.python.org/mailman/listinfo/python-list
Could you explain this rebinding (or some other action) on "nums = nums"?
Hi,

I read a blog post written by Ned and found it very interesting, but some parts of it are still unclear to me. In the following example, I am lost at the last line:

    nums = nums

Could anyone explain it to me in more detail?

Thanks,

...

The reason is that list implements __iadd__ like this (except in C, not Python):

    class List:
        def __iadd__(self, other):
            self.extend(other)
            return self

When you execute "nums += more", you're getting the same effect as:

    nums = nums.__iadd__(more)

which, because of the implementation of __iadd__, acts like this:

    nums.extend(more)
    nums = nums

So there is a rebinding operation here, but first, there's a mutating operation, and the rebinding operation is a no-op.
-- https://mail.python.org/mailman/listinfo/python-list
Re: Could you explain this rebinding (or some other action) on "nums = nums"?
On Thu, Jun 25, 2015 at 9:52 AM, fl wrote:
> The reason is that list implements __iadd__ like this (except in C, not Python):
>
> class List:
>     def __iadd__(self, other):
>         self.extend(other)
>         return self
>
> When you execute "nums += more", you're getting the same effect as:
>
> nums = nums.__iadd__(more)
>
> which, because of the implementation of __iadd__, acts like this:
>
> nums.extend(more)
> nums = nums
>
> So there is a rebinding operation here, but first, there's a mutating operation, and the rebinding operation is a no-op.

It's not a complete no-op, as can be demonstrated if you use something other than a simple name:

>>> tup = ("spam", [1, 2, 3], "ham")
>>> tup[1]
[1, 2, 3]
>>> tup[1].extend([4,5])
>>> tup[1] = tup[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> tup
('spam', [1, 2, 3, 4, 5], 'ham')
>>> tup[1] += [6,7]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> tup
('spam', [1, 2, 3, 4, 5, 6, 7], 'ham')

The reason for the rebinding is that += can do two completely different things: with mutable objects, like lists, it changes them in place, but with immutables, it returns a new one:

>>> msg = "Hello"
>>> msg += ", world!"
>>> msg
'Hello, world!'

This didn't change the string "Hello", because you can't do that. Instead, it rebound msg to "Hello, world!". For consistency, the += operator will *always* rebind, but in situations where that's not necessary, it rebinds to the exact same object.

Does that answer the question?

ChrisA
-- https://mail.python.org/mailman/listinfo/python-list
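The two behaviours described above can also be checked with object identity. The following is a small sketch for Python 3; the Recorder class is a hypothetical helper introduced only for this illustration, to make the "always rebinds" step visible by recording item assignments.

```python
# Mutable case: list.__iadd__ extends in place and returns self,
# so the name is rebound to the very same object.
nums = [1, 2, 3]
alias = nums
nums += [4, 5]
same_list = nums is alias

# Immutable case: str has no __iadd__, so += falls back to __add__
# and the name is rebound to a brand-new object.
msg = "Hello"
original = msg
msg += ", world!"
same_str = msg is original


class Recorder(list):
    """Hypothetical list subclass that records every item assignment."""

    def __init__(self, *args):
        super().__init__(*args)
        self.assignments = []

    def __setitem__(self, index, value):
        self.assignments.append(index)
        super().__setitem__(index, value)


outer = Recorder([[1, 2, 3]])
inner = outer[0]
outer[0] += [4, 5]   # mutates inner, then assigns it back to outer[0]
```

After this runs, same_list is True, same_str is False, and outer.assignments contains the index 0: the assignment back into the container happened even though the object stored there did not change. With a tuple as the container, that assignment is the step that raises TypeError.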
Re: Why does the unit test fail of the pyPDF2 package?
On Thu, 25 Jun 2015 03:52 am, fl wrote: > Thanks, Steven. I don't know how to copy command console window contents > to the forum post. I don't know either, because I don't use Windows, but you can google for instructions: https://duckduckgo.com/html/?q=copy+text+windows+console https://startpage.com/do/search?q=copy+text+windows+console http://www.bing.com/search?q=copy+text+windows+console http://au.search.yahoo.com/search?p=copy+text+windows+console Even Google works: https://www.google.com.au/search?q=copy+text+windows+console > Yes, there are extra '\n' in the extracted, but I don't know how to > suppress it. Does anyone know how to make it the same of the expected? Where you enter the expected output, instead of entering: "TheCrazyOnesOctober14,1998Herestothecrazyones..." instead enter: "TheCrazyOnes\nOctober14,1998\nHerestothecrazyones..." The point is that your expected output should contain the text actually in the PDF file. If your expected output is different from the actual contents, then the expectations are wrong, not your code. The test itself is buggy. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
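To make the advice above concrete, here is a small sketch using only the visible prefix of the strings from the test output. Option 1 is what the post recommends. Option 2 (stripping newlines before comparing, via a hypothetical helper strip_newlines) is an alternative that is not suggested anywhere in the thread, and it makes the test less strict because differences in line structure no longer count.

```python
# Prefix of the text the test actually extracted (newlines included).
extracted = "TheCrazyOnes\nOctober14,1998\nHerestothecrazyones"

# Prefix of the expectation currently in the test (no newlines).
old_expected = "TheCrazyOnesOctober14,1998Herestothecrazyones"

# Option 1: correct the expectation so it matches the real contents.
new_expected = "TheCrazyOnes\nOctober14,1998\nHerestothecrazyones"
option1_passes = extracted == new_expected


# Option 2 (assumption, not from the thread): ignore newlines on both sides.
def strip_newlines(text):
    """Hypothetical helper: drop newline characters before comparing."""
    return text.replace("\n", "")


option2_passes = strip_newlines(extracted) == strip_newlines(old_expected)
```

Both comparisons succeed for this prefix, while the original comparison of extracted against old_expected does not.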
Re: Pure Python Data Mangling or Encrypting
On Thu, 25 Jun 2015 04:36 am, Randall Smith wrote: > On 06/24/2015 07:19 AM, Dennis Lee Bieber wrote: > > >> Pardon, but that description has me confused. Perhaps I just don't >> understand the full use-case. >> >> Who exactly is supposed to be protected from what? You state "data >> senders are supposed to encrypt" which, if the recipient doesn't have the >> decryption key, implies the recipient -- isn't the real recipient but >> just a transport/storage place until the data is retrieved by the >> end-user. > > You got it. I didn't want to explain any more than necessary. But yes, > the recipient just stores the data for the end-user. Trust me. That's not all they are doing. >> If "you" do the encryption on the storage machine, then you need to >> also do the decryption when returning the data to the end-user -- which >> means the key is available somewhere on the storage machine, and the >> local user might obtain access to it and the stored data. > > Right again. A legitimate data owner would encrypt the data. The > storage machine is encrypting to protect itself against unwanted > exposure to unencrypted malware. Not that they would go looking at the > files, but their virus scanner or file indexer might. Okay, you're worrying me now. If this is legitimate business, then you shouldn't be worried about the virus scanner or file indexer *scanning* the content of the file. But giving you the benefit of the doubt, that there's nothing underhanded happening, I don't think you have a good model for the potential threats in your software. 
I think there are at least three different threats:

Sender of the data versus the storage machine:

- the sender of the data may deliberately send malware, intending to attack the people storing the file;

Storage machine versus the end recipient:

- the storage machine may be infected by malware which corrupts the file;
- the owner of the storage machine may deliberately corrupt the data (this is a special case of the previous);
- the owner of the storage machine may want to spy on the files, that is, read the contents without changing the files (attack on privacy).

There may be other threats as well, e.g. man-in-the-middle attacks. If this is anything like Bittorrent, you have a whole range of threats.

But just sticking to the three above, the first one is partially mitigated by allowing virus scanners to scan the data, but that implies that the owner of the storage machine can spy on the files. So you have a conflict here.

Honestly, the *only* real defence against the spying issue is to encrypt the files. Not obfuscate them with a lousy random substitution cipher. The storage machine can keep the files as long as they like, just by making a copy, and spend hours bruteforcing them. They *will* crack the substitution cipher. In pure Python, that may take a few days or weeks; in C, hours or days. If they have the resources to throw at it, minutes. Substitution ciphers have not been effective encryption since, oh, the 1950s, unless you use a one-time pad. Which you won't be.

That's assuming they don't just look at the Python source code, grab the cipher key, and decrypt in seconds.

If you're serious about protecting your users' privacy and their data integrity, you need to use modern strong encryption, and you need to solve the issue of how to get the key from the trusted source to the untrusted storage machine. I have no idea how to do that -- you need to talk to actual security experts, not random Python programmers.
A pure Python solution for the encryption is likely to be too slow for
more than toy files. Bite the bullet and use a library written in C.
Python uses C code for all sorts of modules: math, decimal, bisect, pickle,
io, etc. all delegate to C code when available. There's no shame in it.

Not to put too fine a point on it, using a substitution cipher because it's
easy and fast in pure Python code is like making a boat out of styrofoam
because it's light and floats, and using aluminium or fibreglass is too
expensive. Sure, that will work for toy applications, like paddling around
the swimming pool in your back yard, but nobody in their right mind would
trust it on the deep ocean or a white-water river.

--
Steven
--
https://mail.python.org/mailman/listinfo/python-list
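As an illustration of the point above that a fixed substitution cipher gives little protection, here is a minimal sketch. It assumes the mangling is a 256-entry byte table applied with bytes.translate (the thread does not confirm the exact scheme) and that the attacker has one plaintext/ciphertext pair they can predict. The helper names make_random_table, recover_mapping and decrypt_with_mapping are hypothetical.

```python
import random


def make_random_table():
    """Build a random byte substitution table (hypothetical mangling scheme)."""
    table = list(range(256))
    random.SystemRandom().shuffle(table)
    return bytes(table)


def recover_mapping(known_plain, known_cipher):
    """Recover plaintext-byte -> ciphertext-byte pairs from one known sample."""
    return dict(zip(known_plain, known_cipher))


def decrypt_with_mapping(cipher, mapping):
    """Undo the substitution for every byte value seen in the known sample.

    Raises KeyError for ciphertext bytes whose plaintext was never observed.
    """
    inverse = {c: p for p, c in mapping.items()}
    return bytes(inverse[c] for c in cipher)


# The table is secret, but one predictable plaintext leaks part of it.
secret = make_random_table()
known_plain = b"a header the attacker can predict, e.g. a file format preamble"
mapping = recover_mapping(known_plain, known_plain.translate(secret))

# A different message mangled with the same table is now readable.
other_cipher = b"a predictable preamble".translate(secret)
print(decrypt_with_mapping(other_cipher, mapping))
```

No brute force is needed in this setting: every byte value that appears in the known sample is recovered directly, and a sample covering all 256 values recovers the whole table.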
Re: Pure Python Data Mangling or Encrypting
On Wed, Jun 24, 2015 at 9:07 PM, Steven D'Aprano wrote:

> But just sticking to the three above, the first one is partially mitigated
> by allowing virus scanners to scan the data, but that implies that the
> owner of the storage machine can spy on the files. So you have a conflict
> here.

If it's encrypted malware, and you can't decrypt it, there's no threat.

> Honestly, the *only* real defence against the spying issue is to encrypt the
> files. Not obfuscate them with a lousy random substitution cipher. The
> storage machine can keep the files as long as they like, just by making a
> copy, and spend hours bruteforcing them. They *will* crack the substitution
> cipher. In pure Python, that may take a few days or weeks; in C, hours or
> days. If they have the resources to throw at it, minutes. Substitution
> ciphers have not been effective encryption since, oh, the 1950s, unless you
> use a one-time pad. Which you won't be.

The original post said that the sender will usually send files they
encrypted, unless they are malicious. So if the sender wants them to be
encrypted, they already are.

"While the data senders are supposed to encrypt data, that's not
guaranteed, and I'd like to protect the recipient against exposure to
nefarious data by mangling or encrypting the data before it is written to
disk."

The cipher is just to keep the sender from being able to control what is
on disk.

I am usually very oppositional when it comes to rolling your own crypto,
but am I alone here in thinking the OP very clearly laid out their case?

--
Devin
--
https://mail.python.org/mailman/listinfo/python-list
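For the narrower goal described above (the sender must not be able to predict the bytes written to disk), one stdlib-only approach might look like the sketch below. It is an assumption, not something proposed in the thread: it XORs each chunk with a keystream from hashlib.shake_256, which runs in C but needs Python 3.6 or later (newer than this thread). The function name mangle and the per-chunk key handling are hypothetical. It is not vetted cryptography, is not authenticated, and gives no privacy against the storage machine's owner, who holds the key.

```python
import hashlib
import os


def mangle(data: bytes, key: bytes) -> bytes:
    """XOR data with a SHAKE-256 keystream derived from key.

    Applying the function twice with the same key returns the original
    data, so it serves for both mangling and un-mangling.
    """
    # Assumption: one keystream as long as the chunk (about 2 MB here).
    stream = hashlib.shake_256(key).digest(len(data))
    # XOR via big integers keeps the per-byte work out of the Python loop.
    mixed = int.from_bytes(data, "big") ^ int.from_bytes(stream, "big")
    return mixed.to_bytes(len(data), "big")


# A fresh random key per chunk, chosen by the storage machine after the
# sender has committed to the payload, and stored alongside the chunk.
chunk = b"payload received from an untrusted sender" * 1000
chunk_key = os.urandom(32)

on_disk = mangle(chunk, chunk_key)          # what gets written to disk
restored = mangle(on_disk, chunk_key)       # what gets returned later
assert restored == chunk
```

Because the key is drawn from os.urandom only after the payload arrives, the sender cannot choose input that produces a particular byte pattern on disk, which is the property asked for in the original post.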