Re: Python and Math
wrote in message news:281f5806-8793-4fd2-877c-214927dda...@googlegroups.com... > > pip looked and saw that you already had it, so did nothing -- what did it > report? In this case: > > 'pip install -U ipython[notebook]' > > might have worked: -U means upgrade even if I already have it. > Indeed it did - thanks for the tip. I used pip to uninstall jinja2. Afterwards, running 'ipython notebook' predictably failed. Then I ran the above command to upgrade ipython notebook. It figured out that jinja2 was missing and re-installed it. Now it works again. Very smooth. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: tkinter errors out without clear message
21.05.14 20:19, Terry Reedy wrote: There is also the issue that TkVersion == 8.5 is underspecified -- there are multiple bugfix releases. root.call('info', 'patchlevel') returns more detailed info.
Fwd: hi, How much time can transition to python3
hi, i learn python is 0.5 year, i'm so much love python, i come from non English speaking countries, Python2 coding problem has been troubling me, I started to learn the python3 now, But many libraries do not support python3, I know python3 publishing for many years. Why do so many libraries or does not support python3, Perhaps it is because of your home page, still in the striking position put python2 download link, You can speed up the elimination of python2 ? please thank you who2are2...@gmail.com -- https://mail.python.org/mailman/listinfo/python-list
Re: Fwd: hi, How much time can transition to python3
"who2are2...@gmail.com" writes: > i learn python is 0.5 year, > i'm so much love python, Welcome, you have found a very good programming language. I'm glad you like it. > i come from non English speaking countries, > Python2 coding problem has been troubling me, > I started to learn the python3 now, This is good. Python 3 makes it much easier to do the right thing with writing systems worldwide. > But many libraries do not support python3, > I know python3 publishing for many years. > Why do so many libraries or does not support python3, Because Python 2 has a lot of inertia. There is a great amount of existing Python 2 code, and many other systems built on that code. Change takes time. Be glad that you are learning Python 3 now! There has been great improvement in the Python 3 landscape in recent years. > Perhaps it is because of your home page, still in the striking > position put python2 download link, You have that backward; the website reflects the current needs of the community. While the Python 3 transition is still going through rapid change, the safest choice is still Python 2 for *existing* uses. But for newcomers like yourself, Python 3 is now the right choice and has been for some years. Congratulations! > You can speed up the elimination of python2 ? Yes, much has already been done, and much is still being done now. But there is still more work to do, as you observed. You can help by contacting the specific projects you rely on which still do not have Python 3 support, and ask those people kindly how you can help. We all get there faster by helping each other! -- “Not to be absolutely certain is, I think, one of the essential things in rationality.” (Bertrand Russell) Ben Finney
Can Python do this? First steps, links to resources or complete software referrals appreciated.
Hi, I'm an academic and I want to find/adapt/create a script that will grab abstracts (150-250 words of text) from Google Scholar search results and sort them by relevance (e.g. keywords, keyword combinations, or any other way you can think of). Do any of you know of a script that does this already, preferably open source? If not, are there any resources you could bring to my attention? I'm a complete newb! Thanks for your help. Ed
Python is horribly slow compared to bash!!
Figure some of you folks might enjoy this. Look how horrible Python performance is! http://thedailywtf.com/Articles/Best-of-Email-Brains,-Security,-Robots,-and-a-Risky-Click.aspx Actually, probably a lot of you folks already read TDWTF, but maybe some don't (yet). ChrisA -- https://mail.python.org/mailman/listinfo/python-list
hashing strings to integers for sqlite3 keys
I'm using Python 3.3 and the sqlite3 module in the standard library. I'm processing a lot of strings from input files (among other things, values of headers in e-mail & news messages) and suppressing duplicates using a table of seen strings in the database. It seems to me --- from past experience with other things, where testing integers for equality is faster than testing strings, as well as from reading the SQLite3 documentation about INTEGER PRIMARY KEY --- that the SELECT tests should be faster if I am looking up an INTEGER PRIMARY KEY value rather than TEXT PRIMARY KEY. Is that right? If so, what sort of hashing function should I use? The "maxint" for SQLite3 is a lot smaller than the size of even MD5 hashes. The only thing I've thought of so far is to use MD5 or SHA-something modulo the maxint value. (Security isn't an issue --- i.e., I'm not worried about someone trying to create a hash collision.) Thanks, Adam -- "It is the role of librarians to keep government running in difficult times," replied Dramoren. "Librarians are the last line of defence against chaos." (McMullen 2001) -- https://mail.python.org/mailman/listinfo/python-list
Re: Fwd: hi, How much time can transition to python3
On Thursday, May 22, 2014 at 5:38:57 PM UTC+8, Ben Finney wrote: > [...] thank you so much
Re: hashing strings to integers for sqlite3 keys
Adam Funk wrote: > I'm using Python 3.3 and the sqlite3 module in the standard library. > I'm processing a lot of strings from input files (among other things, > values of headers in e-mail & news messages) and suppressing > duplicates using a table of seen strings in the database. > > It seems to me --- from past experience with other things, where > testing integers for equality is faster than testing strings, as well > as from reading the SQLite3 documentation about INTEGER PRIMARY KEY > --- that the SELECT tests should be faster if I am looking up an > INTEGER PRIMARY KEY value rather than TEXT PRIMARY KEY. Is that > right? My gut feeling tells me that this would matter more for join operations than lookup of a value. If you plan to do joins you could use an autoinc integer as the primary key and an additional string key for lookup. > If so, what sort of hashing function should I use? The "maxint" for > SQLite3 is a lot smaller than the size of even MD5 hashes. The only > thing I've thought of so far is to use MD5 or SHA-something modulo the > maxint value. (Security isn't an issue --- i.e., I'm not worried > about someone trying to create a hash collision.) Start with the cheapest operation you can think of, md5(s) % MAXINT or even hash(s) % MAXINT # don't forget to set PYTHONHASHSEED then compare performance with just s and only if you can demonstrate a significant speedup keep the complication in your code. If you find such a speedup I'd like to see the numbers because this cries PREMATURE OPTIMIZATION... -- https://mail.python.org/mailman/listinfo/python-list
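The "compare performance with just s" suggestion above could be tried with something like the following. This is only a rough sketch: the table names seen_text and seen_int, the helper names, and the sample message IDs are all made up for illustration, and MAXINT is assumed to be 2**63 - 1 because SQLite stores INTEGER values in at most 8 signed bytes.

```python
import hashlib
import sqlite3
import timeit

MAXINT = 2 ** 63 - 1  # assumption: upper bound of SQLite's 8-byte signed INTEGER


def md5_key(s):
    """Map a string to an integer in [0, MAXINT) via MD5 modulo MAXINT."""
    digest = hashlib.md5(s.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % MAXINT


def build_tables(strings):
    """Create one table keyed by the text and one keyed by the hashed integer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seen_text (s TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE seen_int (h INTEGER PRIMARY KEY)")
    conn.executemany("INSERT OR IGNORE INTO seen_text VALUES (?)",
                     ((s,) for s in strings))
    conn.executemany("INSERT OR IGNORE INTO seen_int VALUES (?)",
                     ((md5_key(s),) for s in strings))
    return conn


def seen_text(conn, s):
    """Look the string up directly in the TEXT-keyed table."""
    row = conn.execute("SELECT 1 FROM seen_text WHERE s = ?", (s,)).fetchone()
    return row is not None


def seen_int(conn, s):
    """Hash the string first, then look up the INTEGER key."""
    row = conn.execute("SELECT 1 FROM seen_int WHERE h = ?",
                       (md5_key(s),)).fetchone()
    return row is not None


if __name__ == "__main__":
    # Hypothetical sample data; real header values will behave differently.
    strings = ["<message-%d@example.invalid>" % i for i in range(10000)]
    conn = build_tables(strings)
    probe = strings[5000]
    print("text:", timeit.timeit(lambda: seen_text(conn, probe), number=10000))
    print("int: ", timeit.timeit(lambda: seen_int(conn, probe), number=10000))
```

Note that the integer variant pays for the hash computation on every lookup, so the numbers only mean something when measured on data of realistic size.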
Re: hashing strings to integers for sqlite3 keys
On Thu, May 22, 2014 at 9:47 PM, Adam Funk wrote: > I'm using Python 3.3 and the sqlite3 module in the standard library. > I'm processing a lot of strings from input files (among other things, > values of headers in e-mail & news messages) and suppressing > duplicates using a table of seen strings in the database. > > It seems to me --- from past experience with other things, where > testing integers for equality is faster than testing strings, as well > as from reading the SQLite3 documentation about INTEGER PRIMARY KEY > --- that the SELECT tests should be faster if I am looking up an > INTEGER PRIMARY KEY value rather than TEXT PRIMARY KEY. Is that > right? It might be faster to use an integer primary key, but the possibility of even a single collision means you can't guarantee uniqueness without a separate check. I don't know sqlite3 well enough to say, but based on what I know of PostgreSQL, it's usually best to make your schema mimic your logical structure, rather than warping it for the sake of performance. With a good indexing function, the performance of a textual PK won't be all that much worse than an integral one, and everything you do will read correctly in the code - no fiddling around with hashes and collision checks. Stick with the TEXT PRIMARY KEY and let the database do the database's job. If you're processing a really large number of strings, you might want to consider moving from sqlite3 to PostgreSQL anyway (I've used psycopg2 quite happily), as you'll get better concurrency; and that might solve your performance problem as well, as Pg plays very nicely with caches. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: hashing strings to integers for sqlite3 keys
On 2014-05-22 12:47, Adam Funk wrote: > I'm using Python 3.3 and the sqlite3 module in the standard library. > I'm processing a lot of strings from input files (among other > things, values of headers in e-mail & news messages) and suppressing > duplicates using a table of seen strings in the database. > > It seems to me --- from past experience with other things, where > testing integers for equality is faster than testing strings, as > well as from reading the SQLite3 documentation about INTEGER > PRIMARY KEY --- that the SELECT tests should be faster if I am > looking up an INTEGER PRIMARY KEY value rather than TEXT PRIMARY > KEY. Is that right? If sqlite can handle the absurd length of a Python long, you *can* do it as ints: >>> from hashlib import sha1 >>> s = "Hello world" >>> h = sha1(s) >>> h.hexdigest() '7b502c3a1f48c8609ae212cdfb639dee39673f5e' >>> int(h.hexdigest(), 16) 703993777145756967576188115661016000849227759454L That's a pretty honkin' huge int for a DB key, but you can use it. And it's pretty capped on length regardless of the underlying string's length. > If so, what sort of hashing function should I use? The "maxint" for > SQLite3 is a lot smaller than the size of even MD5 hashes. The only > thing I've thought of so far is to use MD5 or SHA-something modulo > the maxint value. (Security isn't an issue --- i.e., I'm not > worried about someone trying to create a hash collision.) You could truncate that to something like >>> int(h.hexdigest()[-8:], 16) which should give you something that would result in a 32-bit number that should fit in sqlite's int. -tkc -- https://mail.python.org/mailman/listinfo/python-list
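The session above is from Python 2 (note the trailing L). Under Python 3, which the original poster is using, sha1() needs bytes rather than str. A minimal sketch of the truncation idea, assuming UTF-8 encoding and a hypothetical helper name short_key:

```python
from hashlib import sha1


def short_key(s, hex_digits=8):
    """Return the last `hex_digits` hex digits of SHA-1(s) as an int."""
    h = sha1(s.encode("utf-8"))  # Python 3: hash bytes, not str
    return int(h.hexdigest()[-hex_digits:], 16)


# 8 hex digits give a value below 2**32; for "Hello world" these are the
# final digits (39673f5e) of the digest quoted in the message above.
key = short_key("Hello world")
```

Up to 15 hex digits (60 bits) would still fit in a signed 64-bit column; fewer digits mean more collisions.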
Re: Python is horribly slow compared to bash!!
Le jeudi 22 mai 2014 12:54:22 UTC+2, Chris Angelico a écrit : > Figure some of you folks might enjoy this. Look how horrible Python > > performance is! > > > > http://thedailywtf.com/Articles/Best-of-Email-Brains,-Security,-Robots,-and-a-Risky-Click.aspx > > > > Actually, probably a lot of you folks already read TDWTF, but maybe > > some don't (yet). > > > > ChrisA = = >>> timeit.repeat("(x*1000 + y)[:-1]", setup="x = 'abc'; y = 'z'") [1.4027834829454946, 1.38714224331963, 1.3822586635296261] >>> timeit.repeat("(x*1000 + y)[:-1]", setup="x = 'abc'; y = '\u0fce'") [5.462776291480395, 5.4479432055423445, 5.447874284053398] Na, na, na, I win. But that's peanuts. I can make an application running 100 times slower just by replacing 'z' with Dutch characters. [*]. I win again. I can take the same application and replace 'z' by ..., and ... No, I do not win :-( . Python fails. [*] Unicode is fascinating, working with it is a little bit travelling. jmf -- https://mail.python.org/mailman/listinfo/python-list
Re: hashing strings to integers for sqlite3 keys
On 2014-05-22, Peter Otten wrote: > Adam Funk wrote: > >> I'm using Python 3.3 and the sqlite3 module in the standard library. >> I'm processing a lot of strings from input files (among other things, >> values of headers in e-mail & news messages) and suppressing >> duplicates using a table of seen strings in the database. >> >> It seems to me --- from past experience with other things, where >> testing integers for equality is faster than testing strings, as well >> as from reading the SQLite3 documentation about INTEGER PRIMARY KEY >> --- that the SELECT tests should be faster if I am looking up an >> INTEGER PRIMARY KEY value rather than TEXT PRIMARY KEY. Is that >> right? > > My gut feeling tells me that this would matter more for join operations than > lookup of a value. If you plan to do joins you could use an autoinc integer > as the primary key and an additional string key for lookup. I'm not doing any join operations. I'm using sqlite3 for storing big piles of data & persistence between runs --- not really "proper relational database use". In this particular case, I'm getting header values out of messages & doing this: for this_string in these_strings: if not already_seen(this_string): process(this_string) # ignore if already seen ... > and only if you can demonstrate a significant speedup keep the complication > in your code. > > If you find such a speedup I'd like to see the numbers because this cries > PREMATURE OPTIMIZATION... On further reflection, I think I asked for that. In fact, the table I'm using only has one column for the hashes --- I wasn't going to store the strings at all in order to save disk space (maybe my mind is stuck in the 1980s). -- But the government always tries to coax well-known writers into the Establishment; it makes them feel educated. [Robert Graves] -- https://mail.python.org/mailman/listinfo/python-list
Re: Can Python do this? First steps, links to resources or complete software referrals appreciated.
On May 22, 2014, at 6:03 AM, ed.cot...@gmail.com wrote: > Hi, I'm an academic and I want to find/adapt/create a script that will grab > abstracts (150-250 words of text) from Google Scholar search results and sort > them by relevance (e.g. keywords, keyword combinations, anything other way > you can think of). > > Any of you guys know of a script that does this already? Preferably open > source? If not, any resources you could bring to my attention? I' a complete > Newb! > > Thanks for your help. > > Ed > -- > https://mail.python.org/mailman/listinfo/python-list Well, you might take a look at scholar.py, located here: http://www.icir.org/christian/scholar.html Also, there is this at stackoverflow: http://stackoverflow.com/questions/13200709/extract-google-scholar-results-using-python-or-r One of these may provide what you want, or serve as a jumping off point. -Bill -- https://mail.python.org/mailman/listinfo/python-list
Re: hashing strings to integers for sqlite3 keys
On 2014-05-22, Chris Angelico wrote: > On Thu, May 22, 2014 at 9:47 PM, Adam Funk wrote: >> I'm using Python 3.3 and the sqlite3 module in the standard library. >> I'm processing a lot of strings from input files (among other things, >> values of headers in e-mail & news messages) and suppressing >> duplicates using a table of seen strings in the database. >> >> It seems to me --- from past experience with other things, where >> testing integers for equality is faster than testing strings, as well >> as from reading the SQLite3 documentation about INTEGER PRIMARY KEY >> --- that the SELECT tests should be faster if I am looking up an >> INTEGER PRIMARY KEY value rather than TEXT PRIMARY KEY. Is that >> right? > > It might be faster to use an integer primary key, but the possibility > of even a single collision means you can't guarantee uniqueness > without a separate check. I don't know sqlite3 well enough to say, but > based on what I know of PostgreSQL, it's usually best to make your > schema mimic your logical structure, rather than warping it for the > sake of performance. With a good indexing function, the performance of > a textual PK won't be all that much worse than an integral one, and > everything you do will read correctly in the code - no fiddling around > with hashes and collision checks. > > Stick with the TEXT PRIMARY KEY and let the database do the database's > job. If you're processing a really large number of strings, you might > want to consider moving from sqlite3 to PostgreSQL anyway (I've used > psycopg2 quite happily), as you'll get better concurrency; and that > might solve your performance problem as well, as Pg plays very nicely > with caches. Well, actually I'm thinking about doing away with checking for duplicates at this stage, since the substrings that I pick out of the deduplicated header values go into another table as the TEXT PRIMARY KEY anyway, with deduplication there. So I think this stage reeks of premature optimization. 
-- The history of the world is the history of a privileged few. --- Henry Miller -- https://mail.python.org/mailman/listinfo/python-list
Re: hashing strings to integers for sqlite3 keys
On 2014-05-22, Tim Chase wrote: > On 2014-05-22 12:47, Adam Funk wrote: >> I'm using Python 3.3 and the sqlite3 module in the standard library. >> I'm processing a lot of strings from input files (among other >> things, values of headers in e-mail & news messages) and suppressing >> duplicates using a table of seen strings in the database. >> >> It seems to me --- from past experience with other things, where >> testing integers for equality is faster than testing strings, as >> well as from reading the SQLite3 documentation about INTEGER >> PRIMARY KEY --- that the SELECT tests should be faster if I am >> looking up an INTEGER PRIMARY KEY value rather than TEXT PRIMARY >> KEY. Is that right? > > If sqlite can handle the absurd length of a Python long, you *can* do > it as ints: It can't. SQLite3 INTEGER is an 8-byte signed one. https://www.sqlite.org/datatype3.html But after reading the other replies to my question, I've concluded that what I was trying to do is pointless. > >>> from hashlib import sha1 > >>> s = "Hello world" > >>> h = sha1(s) > >>> h.hexdigest() > '7b502c3a1f48c8609ae212cdfb639dee39673f5e' > >>> int(h.hexdigest(), 16) > 703993777145756967576188115661016000849227759454L That ties in with a related question I've been wondering about lately (using MD5s & SHAs for other things) --- getting a hash value (which is internally numeric, rather than string, right?) out as a hex string & then converting that to an int looks inefficient to me --- is there any better way to get an int? (I haven't seen any other way in the API.) -- A firm rule must be imposed upon our nation before it destroys itself. The United States needs some theology and geometry, some taste and decency. I suspect that we are teetering on the edge of the abyss. --- Ignatius J Reilly -- https://mail.python.org/mailman/listinfo/python-list
Re: hashing strings to integers for sqlite3 keys
On Thu, May 22, 2014 at 11:41 PM, Adam Funk wrote: > On further reflection, I think I asked for that. In fact, the table > I'm using only has one column for the hashes --- I wasn't going to > store the strings at all in order to save disk space (maybe my mind is > stuck in the 1980s). That's a problem, then, because you will see hash collisions. Maybe not often, but they definitely will occur if you have enough strings (look up the birthday paradox - with a 32-bit arbitrarily selected integer (such as a good crypto hash that you then truncate to 32 bits), you have a 50% chance of a collision at just 77,000 strings). Do you have enough RAM to hold all the strings directly? Just load 'em all up into a Python set. Set operations are fast, clean, and easy. Your already_seen function becomes a simple 'in' check. These days you can get 16GB or 32GB of RAM in a PC inexpensively enough; with an average string size of 80 characters, and assuming Python 3.3+, that's about 128 bytes each - close enough, and a nice figure. 16GB divided by 128 gives 128M strings - obviously you won't get all of that, but that's your ball-park. Anything less than, say, a hundred million strings, and you can dump the lot into memory. Easy! ChrisA -- https://mail.python.org/mailman/listinfo/python-list
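The 77,000 figure can be checked with the usual birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)), where N is the number of possible hash values. A small sketch (the function name is made up for illustration, and the formula assumes the hash values are uniformly distributed):

```python
import math


def collision_probability(n, bits):
    """Approximate chance that n uniformly random `bits`-bit values collide."""
    space = 2 ** bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
    return -math.expm1(-n * (n - 1) / (2.0 * space))


p32 = collision_probability(77000, 32)  # close to one half
p64 = collision_probability(77000, 64)  # tiny by comparison
```

Widening the key to 64 bits pushes the same 50% point out to several billion strings.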
Re: hashing strings to integers for sqlite3 keys
On Thu, May 22, 2014 at 11:54 PM, Adam Funk wrote: >> >>> from hashlib import sha1 >> >>> s = "Hello world" >> >>> h = sha1(s) >> >>> h.hexdigest() >> '7b502c3a1f48c8609ae212cdfb639dee39673f5e' >> >>> int(h.hexdigest(), 16) >> 703993777145756967576188115661016000849227759454L > > That ties in with a related question I've been wondering about lately > (using MD5s & SHAs for other things) --- getting a hash value (which > is internally numeric, rather than string, right?) out as a hex string > & then converting that to an int looks inefficient to me --- is there > any better way to get an int? (I haven't seen any other way in the > API.) I don't know that there is, at least not with hashlib. You might be able to use digest() followed by the struct module, but it's no less convoluted. It's the same in several other languages' hashing functions; the result is a string, not an integer. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
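For what it's worth, the digest()-plus-struct route mentioned above might look like this. It is a sketch only: it keeps just the first 8 bytes of the digest (an arbitrary choice) so that the result fits a signed 64-bit integer, and digest_to_int64 is a hypothetical name.

```python
import hashlib
import struct


def digest_to_int64(data):
    """Interpret the first 8 bytes of SHA-1(data) as a signed 64-bit int."""
    digest = hashlib.sha1(data).digest()        # 20 raw bytes
    (value,) = struct.unpack(">q", digest[:8])  # ">q": big-endian signed 64-bit
    return value


value = digest_to_int64(b"Hello world")
```

The format string ">q" always consumes exactly 8 bytes, which is why the digest has to be sliced first.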
Re: daemon.DaemonContext
I know it's 4 years later, but I'm currently battling this myself. I do exactly this and yet it doesn't appear to be keeping the filehandler open. Nothing ever gets written to logs after I daemonize! -- https://mail.python.org/mailman/listinfo/python-list
Re: hashing strings to integers for sqlite3 keys
On 2014-05-22, Chris Angelico wrote: > On Thu, May 22, 2014 at 11:41 PM, Adam Funk wrote: >> On further reflection, I think I asked for that. In fact, the table >> I'm using only has one column for the hashes --- I wasn't going to >> store the strings at all in order to save disk space (maybe my mind is >> stuck in the 1980s). > > That's a problem, then, because you will see hash collisions. Maybe > not often, but they definitely will occur if you have enough strings > (look up the birthday paradox - with a 32-bit arbitrarily selected > integer (such as a good crypto hash that you then truncate to 32 > bits), you have a 50% chance of a collision at just 77,000 strings). Ah yes, there's a handy table for that: https://en.wikipedia.org/wiki/Birthday_attack#Mathematics > Do you have enough RAM to hold all the strings directly? Just load 'em > all up into a Python set. Set operations are fast, clean, and easy. > Your already_seen function becomes a simple 'in' check. These days you > can get 16GB or 32GB of RAM in a PC inexpensively enough; with an > average string size of 80 characters, and assuming Python 3.3+, that's > about 128 bytes each - close enough, and a nice figure. 16GB divided > by 128 gives 128M strings - obviously you won't get all of that, but > that's your ball-park. Anything less than, say, a hundred million > strings, and you can dump the lot into memory. Easy! Good point, & since (as I explained in my other post) the substrings are being deduplicated in their own table anyway it's probably not worth bothering with persistence between runs for this bit. -- Some say the world will end in fire; some say in segfaults. [XKCD 312] -- https://mail.python.org/mailman/listinfo/python-list
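The in-memory approach discussed above is short enough to sketch. The names process_unseen and process are placeholders standing in for the already_seen()/process() loop from earlier in the thread:

```python
def process_unseen(strings, process, seen=None):
    """Call process(s) once for each distinct string, in first-seen order."""
    seen = set() if seen is None else seen
    for s in strings:
        if s not in seen:  # average O(1) membership test on a set
            seen.add(s)
            process(s)
    return seen


handled = []
seen = process_unseen(["a", "b", "a", "c", "b"], handled.append)
```

Passing the returned set back in as `seen` carries the state across several input files within one run.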
Re: hashing strings to integers for sqlite3 keys
On Thu, 22 May 2014 12:47:31 +0100, Adam Funk wrote: > I'm using Python 3.3 and the sqlite3 module in the standard library. I'm > processing a lot of strings from input files (among other things, values > of headers in e-mail & news messages) and suppressing duplicates using a > table of seen strings in the database. > > It seems to me --- from past experience with other things, where testing > integers for equality is faster than testing strings, as well as from > reading the SQLite3 documentation about INTEGER PRIMARY KEY --- that the > SELECT tests should be faster if I am looking up an INTEGER PRIMARY KEY > value rather than TEXT PRIMARY KEY. Is that right? > > If so, what sort of hashing function should I use? The "maxint" for > SQLite3 is a lot smaller than the size of even MD5 hashes. The only > thing I've thought of so far is to use MD5 or SHA-something modulo the > maxint value. (Security isn't an issue --- i.e., I'm not worried about > someone trying to create a hash collision.) > > Thanks, > Adam Why not just set the field in the DB to be unique and then catch the error when you try to write a duplicate? Let the DB engine handle the task. -- Your step will soil many countries.
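Letting the database reject duplicates could look roughly like this with the standard sqlite3 module. The table name seen and the helper names are invented for the example; a PRIMARY KEY (or UNIQUE) column raises sqlite3.IntegrityError on a repeated value.

```python
import sqlite3


def make_store():
    """Open an in-memory database with a single unique text column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seen (s TEXT PRIMARY KEY)")
    return conn


def record_if_new(conn, s):
    """Insert s; return True if it was new, False if it was already stored."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO seen (s) VALUES (?)", (s,))
    except sqlite3.IntegrityError:
        return False
    return True


conn = make_store()
first = record_if_new(conn, "<abc@example.invalid>")
second = record_if_new(conn, "<abc@example.invalid>")
```

This replaces the separate SELECT-then-INSERT pair with a single statement per string.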
Re: hashing strings to integers for sqlite3 keys
On 2014-05-22, Chris Angelico wrote: > On Thu, May 22, 2014 at 11:54 PM, Adam Funk wrote: >> That ties in with a related question I've been wondering about lately >> (using MD5s & SHAs for other things) --- getting a hash value (which >> is internally numeric, rather than string, right?) out as a hex string >> & then converting that to an int looks inefficient to me --- is there >> any better way to get an int? (I haven't seen any other way in the >> API.) > > I don't know that there is, at least not with hashlib. You might be > able to use digest() followed by the struct module, but it's no less > convoluted. It's the same in several other languages' hashing > functions; the result is a string, not an integer. Well, J*v* returns a byte array, so I used to do this: digester = MessageDigest.getInstance("MD5"); ... digester.reset(); byte[] digest = digester.digest(bytes); return new BigInteger(+1, digest); I dunno why language designers don't make it easy to get a single big number directly out of these things. I just had a look at the struct module's fearsome documentation & think it would present a good shoot(self, foot) opportunity. -- A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? -- https://mail.python.org/mailman/listinfo/python-list
Re: hashing strings to integers for sqlite3 keys
On Fri, May 23, 2014 at 12:47 AM, Adam Funk wrote: >> I don't know that there is, at least not with hashlib. You might be >> able to use digest() followed by the struct module, but it's no less >> convoluted. It's the same in several other languages' hashing >> functions; the result is a string, not an integer. > > Well, J*v* returns a byte array... I counted byte arrays along with strings. Whether it's notionally a string of bytes or characters makes no difference - it's not an integer. > I dunno why language designers don't make it easy to get a single big > number directly out of these things. It's probably because these sorts of hashes are usually done on large puddles of memory, to create a smaller puddle of memory. How you interpret the resulting puddle is up to you; maybe you want to think of it as a number, maybe as a string, but really it's just a sequence of bytes. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: daemon.DaemonContext and logging
On Saturday, April 10, 2010 11:52:41 PM UTC-4, Ben Finney wrote: > pid = daemon.pidlockfile.TimeoutPIDLockFile( > "/tmp/dizazzo-daemontest.pid", 10) Has pidlockfile been removed? (1.6) -brian -- https://mail.python.org/mailman/listinfo/python-list
Re: daemon.DaemonContext
On Thursday, May 22, 2014 10:31:11 AM UTC-4, wo...@4amlunch.net wrote:
> I know it's 4 years later, but I'm currently battling this myself. I do
> exactly this and yet it doesn't appear to be keeping the filehandler open.
> Nothing ever gets written to logs after I daemonize!

OK, I made it work, although I think this goes against the documentation as well as what's posted here. I changed:

    context = daemon.DaemonContext(
        # Stuff here
    )
    context.files_preserve = [fh.stream]

to:

    context = daemon.DaemonContext(
        # Stuff here
        files_preserve=[fh.stream],
    )

And now it works.
Re: hashing strings to integers for sqlite3 keys
Adam Funk wrote: > On 2014-05-22, Chris Angelico wrote: > >> On Thu, May 22, 2014 at 11:54 PM, Adam Funk wrote: > >>> That ties in with a related question I've been wondering about lately >>> (using MD5s & SHAs for other things) --- getting a hash value (which >>> is internally numeric, rather than string, right?) out as a hex string >>> & then converting that to an int looks inefficient to me --- is there >>> any better way to get an int? (I haven't seen any other way in the >>> API.) >> >> I don't know that there is, at least not with hashlib. You might be >> able to use digest() followed by the struct module, but it's no less >> convoluted. It's the same in several other languages' hashing >> functions; the result is a string, not an integer. > > Well, J*v* returns a byte array, so I used to do this: > > digester = MessageDigest.getInstance("MD5"); > ... > digester.reset(); > byte[] digest = digester.digest(bytes); > return new BigInteger(+1, digest); In Python 3 there's int.from_bytes() >>> h = hashlib.sha1(b"Hello world") >>> int.from_bytes(h.digest(), "little") 538059071683667711846616050503420899184350089339 > I dunno why language designers don't make it easy to get a single big > number directly out of these things. You hardly ever need to manipulate the numerical value of the digest. And on its way into the database it will be re-serialized anyway. > I just had a look at the struct module's fearsome documentation & > think it would present a good shoot(self, foot) opportunity. -- https://mail.python.org/mailman/listinfo/python-list
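One thing worth noting about int.from_bytes(): the byte order matters. With "big" the result equals int(hexdigest, 16); with "little", as used above, the bytes are read in the opposite order, which is why the number printed here differs from the one earlier in the thread. A short sketch (the 8-byte slice for a SQLite-sized key is an assumption, not something from the thread):

```python
import hashlib

h = hashlib.sha1(b"Hello world")
digest = h.digest()

as_big = int.from_bytes(digest, "big")        # same as int(h.hexdigest(), 16)
as_little = int.from_bytes(digest, "little")  # bytes read in reverse order

# First 8 bytes as a signed value, small enough for an 8-byte INTEGER column
key = int.from_bytes(digest[:8], "big", signed=True)
```

Either byte order works as a key as long as it is used consistently.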
Re: daemon.DaemonContext
On 05/22/2014 07:31 AM, wo...@4amlunch.net wrote:
> I know it's 4 years later, but I'm currently battling this myself. I do
> exactly this and yet it doesn't appear to be keeping the filehandler open.
> Nothing ever gets written to logs after I daemonize!

You didn't include any context (important after four years!) so what are you talking about? And did you target the correct list?

--
~Ethan~
shebang & windows: call an extensionless git hook
Hi,

I wrote the git pre-commit hook below. It is supposed to reject commits that contain large files (e.g. accidental commits by inexperienced users, think of "git add ."). Anyway, I tried this under Linux, but the target platform is Windows. As per Git design the hook name *must* be "pre-commit" (no .py extension). How will Windows know that Python should be run? And (should it be relevant): how does Windows know which Python version to invoke? I read about custom shebangs with Pylauncher. Is that my only option? (see: https://bitbucket.org/vinay.sajip/pylauncher, http://legacy.python.org/dev/peps/pep-0397/)

In addition, I would really appreciate general feedback on the hook script below.

Thanks!

Albert-Jan

albertjan@debian ~/Desktop/test_repo $ git config --global init.templatedir ~/Desktop/git_template_dir
albertjan@debian ~/Desktop/test_repo $ cd ~/Desktop/git_template_dir
albertjan@debian ~/Desktop/git_template_dir $ cat hooks/pre-commit
#!/usr/bin/python
#-*- mode: python -*-

"""Git pre-commit hook: reject large files"""

import sys
import os
import re
from subprocess import Popen, PIPE

def git_filesize_hook(megabytes_cutoff=5, verbose=False):
    """Git pre-commit hook: Return error if the maximum file size in the HEAD
    revision exceeds the cutoff, success (0) otherwise.
    You can bypass this hook by specifying '--no-verify' as an option in
    'git commit'."""
    if verbose: print os.getcwd()
    cmd = "git ls-tree --full-tree -r -l HEAD"
    git = Popen(cmd, shell=True, stdout=PIPE, cwd=os.getcwd())
    get_size = lambda item: int(re.split(" +", item)[3].split("\t")[0])
    sizes = map(get_size, git.stdout.readlines())
    cut_off_bytes = megabytes_cutoff * 2 ** 20
    if max(sizes) > cut_off_bytes:
        return ("ERROR: your commit contains at least one file "
                "that is larger than %d bytes" % cut_off_bytes)
    return 0

if __name__ == "__main__":
    sys.exit(git_filesize_hook(0.01, True))

albertjan@debian ~/Desktop/git_template_dir $ cd -
/home/antonia/Desktop/test_repo
albertjan@debian ~/Desktop/test_repo $ git init  ## this also fetches my own pre-commit hook from template_dir
Initialized empty Git repository in /home/antonia/Desktop/test_repo/.git/
albertjan@debian ~/Desktop/test_repo $ touch foo.txt
albertjan@debian ~/Desktop/test_repo $ git add foo.txt
albertjan@debian ~/Desktop/test_repo $ ls -l .git/hooks
total 4
-rw-r--r-- 1 albertjan albertjan 1468 May 22 14:49 pre-commit
albertjan@debian ~/Desktop/test_repo $ git commit -a -m "commit"  # hook does not yet work
[master (root-commit) dc82f3d] commit
 0 files changed
 create mode 100644 foo.txt
albertjan@debian ~/Desktop/test_repo $ chmod +x .git/hooks/pre-commit  ## can I avoid this in Linux? What should I do in Windows?
albertjan@debian ~/Desktop/test_repo $ echo "blaah\n" >> foo.txt
albertjan@debian ~/Desktop/test_repo $ git commit -a -m "commit"  # now the hook does its job
/home/antonia/Desktop/test_repo
ERROR: your commit contains at least one file that is larger than 1 bytes

Regards,
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~
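As feedback on the size check itself, here is one possible sketch of the parsing step, written for Python 3 and kept separate from the git call so it can be tried on canned text. The function name oversized_files and the sample object IDs and paths are made up for illustration. It relies on the "git ls-tree -l" line layout of mode, type, object and size, followed by a tab and the path, and it skips non-blob entries (a submodule reports "-" as its size, which would break int()).

```python
def oversized_files(ls_tree_output, limit_bytes):
    """Return (path, size) pairs for blobs larger than limit_bytes.

    ls_tree_output is the text printed by `git ls-tree -r -l <tree>`.
    """
    offenders = []
    for line in ls_tree_output.splitlines():
        if not line.strip():
            continue
        # Metadata and path are separated by a single tab.
        meta, _, path = line.partition("\t")
        mode, obj_type, sha, size = meta.split()
        if obj_type != "blob":
            continue  # e.g. submodule entries have "-" instead of a size
        if int(size) > limit_bytes:
            offenders.append((path, int(size)))
    return offenders


# Made-up sample output; the object IDs are placeholders.
sample = (
    "100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391       0\tfoo.txt\n"
    "100644 blob 3b18e512dba79e4c8300dd08aeb37f8e728b8dad   20000\tbig.bin\n"
    "160000 commit 1111111111111111111111111111111111111111       -\tvendor/lib\n"
)

for path, size in oversized_files(sample, 10 * 1024):
    print("too large: %s (%d bytes)" % (path, size))
```

Returning the offending paths instead of only the maximum size lets the hook tell the user which file to remove. Whether the listing should come from HEAD or from the staged tree is a separate decision that this sketch leaves to the caller.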
Re: All-numeric script names and import
On 05/21/2014 03:46 PM, Chris Angelico wrote:
> If I have a file called 1.py, is there a way to import it? Obviously I
> can't import it as itself, but in theory, it should be possible to
> import something from it. I can manage it with __import__ (this is
> Python 2.7 I'm working on, at least for the moment), but not with the
> statement form.
>
> # from 1 import app as application # Doesn't work with a numeric name
> application = __import__("1").app
>
> Is there a way to tell Python that, syntactically, this thing that
> looks like a number is really a name? Or am I just being dumb?
>
> (Don't hold back on that last question. "Yes" is a perfectly
> acceptable answer. But please explain which of the several
> possibilities is the way I'm being dumb. Thanks!)
>
> ChrisA

To import 1.py as module_1 on Python 2.7 (module_1 is not inserted in sys.modules):

>>> import imp
>>> module_1 = imp.new_module('module_1')
>>> execfile('1.py', module_1.__dict__)
>>> del module_1.__dict__['__builtins__']

Xavier
Re: All-numeric script names and import
On 05/22/2014 12:32 PM, Xavier de Gaye wrote:
> import 1.py as module_1 on Python 2.7 (module_1 is not inserted in sys.modules):
>
> >>> import imp
> >>> module_1 = imp.new_module('module_1')
> >>> execfile('1.py', module_1.__dict__)
> >>> del module_1.__dict__['__builtins__']

Oops, I should not remove the builtins, and should add __file__. With corrections:

>>> import imp
>>> module_1 = imp.new_module('module_1')
>>> execfile('1.py', module_1.__dict__)
>>> module_1.__file__ = '1.py'

Xavier
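For readers on Python 3, where execfile() no longer exists and the imp module has since been removed, roughly the same effect can be sketched with importlib.util. The helper name load_module_from_path and the throwaway 1.py written to a temporary directory are assumptions made for this illustration only.

```python
import importlib.util
import os
import sys
import tempfile


def load_module_from_path(module_name, path):
    """Execute the file at path as a module object named module_name."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Create a throwaway 1.py so the sketch is self-contained.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "1.py")
with open(demo_path, "w") as f:
    f.write("app = 'hello from 1.py'\n")

module_1 = load_module_from_path("module_1", demo_path)
print(module_1.app)
print(module_1.__file__ == demo_path)
print("module_1" in sys.modules)
```

As with the Python 2.7 recipe, the module is not inserted in sys.modules, and __file__ is filled in from the spec without a manual assignment.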
Re: All-numeric script names and import
On Thu, May 22, 2014 at 8:32 PM, Xavier de Gaye wrote:
> import 1.py as module_1 on Python 2.7 (module_1 is not inserted in
> sys.modules):
>
>     import imp
>     module_1 = imp.new_module('module_1')
>     execfile('1.py', module_1.__dict__)
>     del module_1.__dict__['__builtins__']

Heh, I think __import__() is simpler than that :)

ChrisA
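For comparison, a sketch of the simpler route using importlib.import_module(), which takes the module name as a string and so does not care that "1" is not a valid identifier. The temporary directory and the placeholder app value are invented for the example; in the original question 1.py would already be on the import path.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway 1.py and make its directory importable.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "1.py"), "w") as f:
    f.write("app = 'wsgi app placeholder'\n")
sys.path.insert(0, demo_dir)
importlib.invalidate_caches()

# The name is passed as a string, so the numeric file name is no obstacle.
application = importlib.import_module("1").app
print(application)
print("1" in sys.modules)
```

Unlike the manual module construction, this goes through the normal import machinery, so the module does end up in sys.modules under the key "1".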
Advice for choosing correct architecture/tech for a hobby project
I am working on a hobby project, a Bookmarker: https://github.com/anshbansal/Bookmarker. Basically, it stores bookmarks, like the ones in a web browser, in an app. The twist is that bookmarks are stored by category.

I have spent some time on choosing the correct tech for this project, but after going through this discussion on the Django forums, https://groups.google.com/forum/#!topic/django-users/rSqSftkl5mg, it seems better to ask for some advice.

I want to be able to add bookmarks to the app through the browser. I want a front-end from which I can browse the bookmarks. The browsing front-end should have a search option (search by category) for filtering the bookmarks.

Given the requirements I have framed so far, I thought that a web framework would be a good choice, and so I chose Django. The reason is that adding bookmarks through the browser can be done easily with JavaScript.

But I hit a snag today: web browsers won't allow a client to open hyperlinks with the file protocol. I have both offline and online bookmarks, so that is a problem for me.

Now I am at the end of my experience. I have spent 15-20 days' spare time trying to decide on the technology, and now this snag. Can someone advise on this? Am I using the correct technology?
Re: daemon.DaemonContext and logging
On 5/22/14 10:28 AM, wo...@4amlunch.net wrote:
> On Saturday, April 10, 2010 11:52:41 PM UTC-4, Ben Finney wrote:
>>     pid = daemon.pidlockfile.TimeoutPIDLockFile(
>>         "/tmp/dizazzo-daemontest.pid", 10)
>
> Has pidlockfile been removed? (1.6)
>
> -brian

"Have you released the inertial dampener?"   :)
Re: Python is horribly slow compared to bash!!
On 5/22/14 5:54 AM, Chris Angelico wrote:
> Figure some of you folks might enjoy this. Look how horrible Python
> performance is!
>
> http://thedailywtf.com/Articles/Best-of-Email-Brains,-Security,-Robots,-and-a-Risky-Click.aspx

From TDWTF:

> Most of the interesting physics analysis code here is based on a
> framework using Python scripts for setup and configuration which then
> calls native analysis code, that usually is implemented in C++.

This goes back to a previous discussion about Julia (a couple of weeks back) and IPython. What these guys at CERN need is the dynamic duo of IPython and Julia. (It's gonna be fabulous, seriously.)

Or, Julia by itself.

The whole point of the Julia project was to bring the whole dynamic scripting, glue, lightning fast FORTRAN or C++ specialty code, into one screaming fast package that "does it all". Of course that's a pipe dream, but they are getting very close. And, if they pull off the IPython | Julia match-up thing, man, it's going to change the way technical computation is handled for decades to come.

Back to the TDWTF post, what a hoot. OK, you heard it there first, people: Python is dead, everyone learn BASH.   :-p   heh

marcus
Re: Advice for choosing correct architecture/tech for a hobby project
In <6a3c5b20-bce5-4c95-b27f-3840e9cc7...@googlegroups.com> Aseem Bansal writes:

> But I hit a snag today that webbrowser's won't allow client to open
> hyperlinks with file protocol. I have both offline and online bookmarks
> so that was a problem for me.

What do you mean by saying "webbrowser's won't allow client to open hyperlinks with file protocol"? Of course they do.

My web browser works just fine with links such as this:

    foo.html

--
John Gordon         Imagine what it must be like for a real medical doctor to
gor...@panix.com    watch 'House', or a real serial killer to watch 'Dexter'.
Re: Advice for choosing correct architecture/tech for a hobby project
On 5/22/14 1:54 PM, Aseem Bansal wrote:
> I am working on a hobby project - a Bookmarker {snip}

Hi, no, Django is not really the correct tool-set. Django is for server-side content management, but who knows, you might come up with a great hack (I don't want to discourage you). But a straight, trimmed-down Python app would probably be better... what led you to Django?

It seems from your descriptions, which don't make sense by the way, that you are attempting to create your own 'browser' within your app (web api) and you want to use a standard browser (like Firefox or Chrome) to 'front-end' the app's bookmarks. So, your app needs to be able to read your browser's bookmarks file.

Browsers most certainly can read http://, https://, file://, etc. (and many more). Your api may not be able to read local file:// urls, but I'm skeptical about that (most web api(s) have no trouble with file:// either).

Provide some more info, somebody will help.

marcus
Re: Advice for choosing correct architecture/tech for a hobby project
On Thu, May 22, 2014 at 1:28 PM, John Gordon wrote:
> In <6a3c5b20-bce5-4c95-b27f-3840e9cc7...@googlegroups.com> Aseem Bansal
> writes:
>
>> But I hit a snag today that webbrowser's won't allow client to open
>> hyperlinks with file protocol. I have both offline and online bookmarks
>> so that was a problem for me.
>
> What do you mean by saying "webbrowser's won't allow client to open
> hyperlinks with file protocol"? Of course they do.
>
> My web browser works just fine with links such as this:
>
>     foo.html

It works if the document that contains the link is also opened from the local filesystem, but browsers will refuse to follow the link if it was served over http.
Re: Advice for choosing correct architecture/tech for a hobby project
Ian Kelly writes:

> > My web browser works just fine with links such as this:
> >
> >     foo.html
>
> It works if the document that contains the link is also opened from
> the local filesystem, but browsers will refuse to follow the link if
> it was served over http.

Aha! I didn't know that. Now that I think about it, I suppose it makes sense.

Perhaps the OP could write a separate application for handling local files, something like:

--
John Gordon         Imagine what it must be like for a real medical doctor to
gor...@panix.com    watch 'House', or a real serial killer to watch 'Dexter'.
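One way a helper for local bookmarks could start is by telling file:// bookmarks apart from online ones and turning the former into a filesystem path. This is only a sketch under my own assumptions, not anything posted in the thread; the function name local_path_from_bookmark and the example URLs are hypothetical.

```python
from urllib.parse import urlparse
from urllib.request import url2pathname


def local_path_from_bookmark(url):
    """Return a filesystem path for a file:// bookmark, else None."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        return None  # online bookmark: leave it to the browser
    return url2pathname(parsed.path)


# Hypothetical bookmarks, one online and one offline.
for bookmark in ("http://example.com/page.html", "file:///tmp/foo.html"):
    print(bookmark, "->", local_path_from_bookmark(bookmark))
```

A desktop-side helper could then hand the resulting path to whatever opens local documents, while http and https bookmarks keep going through ordinary links in the web front-end.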
Re: Advice for choosing correct architecture/tech for a hobby project
On 05/22/2014 11:54 AM, Aseem Bansal wrote:
> I am working on a hobby project - a Bookmarker
> https://github.com/anshbansal/Bookmarker.

Take a look at delicio.us -- it seems to be a similar type of experience.

--
~Ethan~
Re: All-numeric script names and import
On Wednesday, May 21, 2014 7:16:46 PM UTC+5:30, Chris Angelico wrote:
> If I have a file called 1.py, is there a way to import it? Obviously I
> can't import it as itself, but in theory, it should be possible to
> import something from it. I can manage it with __import__ (this is
> Python 2.7 I'm working on, at least for the moment), but not with the
> statement form.

$ cat ا.py
x = 1
def foo(x): print("Hi %s!!" % x)

$ python3
Python 3.3.5 (default, Mar 22 2014, 13:24:53)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ا
>>> ا.foo('Chris')
Hi Chris!!
>>> ا.x
1
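Why this works in Python 3 while "import 1" cannot: the import statement needs an identifier, and the Arabic letter alef counts as one whereas the digit 1 does not. A quick check with str.isidentifier(), using the escape \u0627 for the alef so the snippet is unambiguous to read:

```python
alef = "\u0627"  # ARABIC LETTER ALEF, the one-character module name above

# A name usable in an import statement must be a valid identifier.
print(alef.isidentifier())
print("1".isidentifier())
```

So the trick only makes the file name look like a number to a human reader; to the parser it is an ordinary letter.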
Re: All-numeric script names and import
On Fri, May 23, 2014 at 12:08 PM, Rustom Mody wrote:
> $ cat ا.py
> x = 1
> def foo(x): print("Hi %s!!" % x)

Yeah, no thanks. I am not naming my scripts in Arabic. :)

ChrisA