date:20160501

Re: What should Python apps do when asked to show help?

2016-05-01 Thread cs


On 01May2016 16:44, Chris Angelico  wrote:
>On Sun, May 1, 2016 at 3:24 PM,   wrote:
>> Yes, PAGER=cat would make "man" also not page, and likely almost everything.
>> And yet I am unwilling to do so. Why?
>>
>> On reflection, my personal problems with this approach are twofold:
>>
>> - I want $PAGER to specify my preferred pager when I do want a pager, so
>> setting it to "cat" does not inform apps about my wishes
>
>So you expect the environment variable to say which of multiple pagers
>you might want, but only when you already want a pager. Okay. How is
>an app supposed to know whether or not to use a pager? How do you
>expect them to mindread?


I think for several of us, we do not expect the app to mindread. Don't page for 
short output!


As the rest of my article remarks, I at least think "man" should page on the 
premise that manual pages will be long enough to benefit, as they should be.


Aside: especially if one uses "less" and includes the -d and -F options in the 
$LESS envvar, which suppresses the warning about "dumb" terminals and autoquits 
if the file fits on the screen - these two provide most of the painfree 
behaviour for short outputs and embedded ttys at least.
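
A minimal sketch of how a program that does decide to page could apply the
aside above: it hands "less" the -d and -F options through $LESS, but only when
the user has not set $LESS already. The helper name pager_environment and the
default value "-dF" are assumptions made for illustration, not something
proposed in the thread.

```python
import os


def pager_environment(environ=None, default_less="-dF"):
    """Return a copy of the environment with $LESS defaulted.

    -d suppresses the "dumb terminal" warning and -F makes less exit
    at once when the text fits on one screen.  A value the user has
    already chosen is left untouched.
    """
    env = dict(os.environ if environ is None else environ)
    env.setdefault("LESS", default_less)
    return env
```

The returned mapping could then be passed as the env argument of
subprocess.Popen when the pager is started.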


We could fork a separate discussion on making pagers more seamless, and 
terminal emulators with nice modes to reduce the need for pagers.


Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list

Re: What should Python apps do when asked to show help?

2016-05-01 Thread alister

On Sun, 01 May 2016 17:28:53 +1000, cs wrote:

> On 01May2016 16:44, Chris Angelico  wrote:
>>On Sun, May 1, 2016 at 3:24 PM,   wrote:
>>> Yes, PAGER=cat would make "man" also not page, and likely almost
>>> everything.
>>> And yet I am unwilling to do so. Why?
>>>
>>> On reflection, my personal problems with this approach are twofold:
>>>
>>> - I want $PAGER to specify my preferred pager when I do want a pager,
>>> so setting it to "cat" does not inform apps about my wishes
>>
>>So you expect the environment variable to say which of multiple pagers
>>you might want, but only when you already want a pager. Okay. How is an
>>app supposed to know whether or not to use a pager? How do you expect
>>them to mindread?
> 
> I think for several of us, we do not expect the app to mindread. Don't
> page for short output!
> 
> As the rest of my article remarks, I at least think "man" should page on
> the premise that manual pages will be long enough to benefit, as they
> should be.
> 
> Aside: especially if one uses "less" and includes the -d and -F options
> in the $LESS envvar, which suppresses the warning about "dumb" terminals
> and autoquits if the file fits on the screen - these two provide most of
> the painfree behaviour for short outputs and embedded ttys at least.
> 
> We could fork a separate discussion on making pagers more seamless, and
> terminal emulators with nice modes to reduce the need for pagers.
> 
> Cheers,
> Cameron Simpson 

All the discussion on the pager variable is interesting, but it overlooks 
what I consider to be a very important lesson on program output.

You have no way of knowing what the user's output device is, and you have no 
right to dictate what it should be.

Alternative outputs for the command line could be:

a teletype printer
a text-to-speech reader
a Braille terminal
or a computer-to-(dead)-parrot interface

Which is why most of us here agree: just output the data and let the end 
user's environment decide how to present it. That is the user's choice, not 
the programmer's.




-- 
Get in touch with your feelings of hostility against the dying light.
-- Dylan Thomas [paraphrased periphrastically]

Re: What should Python apps do when asked to show help?

2016-05-01 Thread Steven D'Aprano

On Sun, 1 May 2016 04:44 pm, Chris Angelico wrote:

> On Sun, May 1, 2016 at 3:24 PM,   wrote:
>> Yes, PAGER=cat would make "man" also not page, and likely almost
>> everything. And yet I am unwilling to do so. Why?
>>
>> On reflection, my personal problems with this approach are twofold:
>>
>> - I want $PAGER to specify my preferred pager when I do want a pager, so
>> setting it to "cat" does not inform apps about my wishes
> 
> So you expect the environment variable to say which of multiple pagers
> you might want, but only when you already want a pager. Okay. How is
> an app supposed to know whether or not to use a pager? How do you
> expect them to mindread?

Easy: if the READPAGERENVIRONVAR is set, then the application should read
the PAGER environment variable, otherwise it should ignore it.




-- 
Steven


Re: What should Python apps do when asked to show help?

2016-05-01 Thread Steven D'Aprano

On Sun, 1 May 2016 05:28 pm, c...@zip.com.au wrote:

> On 01May2016 16:44, Chris Angelico  wrote:

>>So you expect the environment variable to say which of multiple pagers
>>you might want, but only when you already want a pager. Okay. How is
>>an app supposed to know whether or not to use a pager? How do you
>>expect them to mindread?
> 
> I think for several of us, we do not expect the app to mindread. Don't
> page for short output!

Is there an environment variable to tell the application what you
consider "short", or should it read your mind?

Personally, I'd rather use a pager for 3 lines than print 30 lines of help
text directly to the console, but others may feel differently.

-- 
Steven


Re: What should Python apps do when asked to show help?

2016-05-01 Thread Chris Angelico

On Sun, May 1, 2016 at 8:55 PM, Steven D'Aprano  wrote:
> On Sun, 1 May 2016 05:28 pm, c...@zip.com.au wrote:
>
>> On 01May2016 16:44, Chris Angelico  wrote:
>
>>>So you expect the environment variable to say which of multiple pagers
>>>you might want, but only when you already want a pager. Okay. How is
>>>an app supposed to know whether or not to use a pager? How do you
>>>expect them to mindread?
>>
>> I think for several of us, we do not expect the app to mindread. Don't
>> page for short output!
>
> Is there an environment variable to tell the application what you
> consider "short", or should it read your mind?

How about $LINES? If it's less than that, it'll fit on one screen. Of
course, that still won't be perfect, but it's a definite improvement
over guessing.

ChrisA

Re: What should Python apps do when asked to show help?

2016-05-01 Thread Grant Edwards

On 2016-05-01, Chris Angelico  wrote:
> On Sun, May 1, 2016 at 3:24 PM,   wrote:
>> Yes, PAGER=cat would make "man" also not page, and likely almost everything.
>> And yet I am unwilling to do so. Why?
>>
>> On reflection, my personal problems with this approach are twofold:
>>
>> - I want $PAGER to specify my preferred pager when I do want a pager, so
>> setting it to "cat" does not inform apps about my wishes
>
> So you expect the environment variable to say which of multiple pagers
> you might want, but only when you already want a pager.

Yes!

Just like EDITOR specifies which editor to use _when_ _you_ _want_
_to_ _use_ _an_ _editor_.  It doesn't tell programs to invoke an
editor all the time.

> Okay. How is an app supposed to know whether or not to use a pager?

Command line option.

> How do you expect them to mindread?

Nope, just recognize '-p' or somesuch.
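
A sketch of the split described above, where the command line option decides
whether to page and $PAGER only decides which pager to use. The program name
"example", the long option name --pager and the fallback to "more" are
assumptions made for illustration.

```python
import argparse
import shlex


def build_parser():
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("-p", "--pager", action="store_true",
                        help="send output through $PAGER")
    return parser


def pager_command(args, environ):
    """Return the pager argv to run, or None to write straight to stdout."""
    if not args.pager:
        return None
    # $PAGER says *which* pager; the -p option says *whether* to use one.
    return shlex.split(environ.get("PAGER") or "more")
```

With this arrangement, "example -p" pages through whatever $PAGER names and a
plain "example" never pages, whatever the environment contains.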

-- 
Grant



Python3 html scraper that supports javascript

2016-05-01 Thread zljubisic

Hi,

Can you please recommend a Python 3 library that I can use for scraping 
JS that works on Windows as well as Linux?

Regards.

Re: What should Python apps do when asked to show help?

2016-05-01 Thread Marko Rauhamaa

Grant Edwards :

> On 2016-05-01, Chris Angelico  wrote:
>> Okay. How is an app supposed to know whether or not to use a pager?
> Command line option.
>
>> How do you expect them to mindread?
> Nope, just recognize '-p' or somesuch.

In discussions like these, it would be important to draw from
precedents. Are there commands that have such an option?

I could only find:

   mysql --pager CMD

which seems sensible but nothing like an industry standard.

Personally, I wouldn't bother with builtin paging.

Marko

Re: What should Python apps do when asked to show help?

2016-05-01 Thread Grant Edwards

On 2016-05-01, Marko Rauhamaa  wrote:
> Grant Edwards :
>
>> On 2016-05-01, Chris Angelico  wrote:
>>> Okay. How is an app supposed to know whether or not to use a pager?
>> Command line option.
>>
>>> How do you expect them to mindread?
>> Nope, just recognize '-p' or somesuch.
>
> In discussions like these, it would be important to draw from
> precedents. Are there commands that have such an option?

It's pretty rare.  It is assumed that Unix users can type " | less" if
they want to view the output of a program with a pager.  That's
simpler and faster than spending time to try to figure out if and how
you tell some particular application to invoke a pager for you.

> I could only find:
>
>mysql --pager CMD
>
> which seems sensible but nothing like an industry standard.
>
> Personally, I wouldn't bother with builtin paging.

I agree completely.  Builtin paging is pretty much pointless -- but if
you _are_ going to do it, make it something that you invoke with a
command line option.

-- 
Grant


Re: What should Python apps do when asked to show help?

2016-05-01 Thread Steven D'Aprano

On Mon, 2 May 2016 02:30 am, Grant Edwards wrote:

>> In discussions like these, it would be important to draw from
>> precedents. Are there commands that have such an option?
> 
> It's pretty rare.  It is assumed that Unix users can type " | less"

Is nobody except me questioning the assumption that we're only talking about
Unix users?

-- 
Steven


Re: Python3 html scraper that supports javascript

2016-05-01 Thread Bob Gailer

On May 1, 2016 10:20 AM,  wrote:
>
> Hi,
>
> can you please recommend to me a python3 library that I can use for
> scraping JS that works on windows as well as linux?

I'm not sure what you mean by that. The tool I use is Splinter. Install it
using pip.

Re: What should Python apps do when asked to show help?

2016-05-01 Thread Grant Edwards

On 2016-05-01, Steven D'Aprano  wrote:
> On Mon, 2 May 2016 02:30 am, Grant Edwards wrote:
>
>>> In discussions like these, it would be important to draw from
>>> precedents. Are there commands that have such an option?
>> 
>> It's pretty rare.  It is assumed that Unix users can type " | less"
>
> Is nobody except me questioning the assumption that we're only
> talking about Unix users?

Didn't the OP specify that he was writing a command-line utility for
Linux/Unix?

Discussing command line operation for Windows or OS-X seems rather
pointless.

-- 
Grant





Re: What should Python apps do when asked to show help?

2016-05-01 Thread Gene Heskett

On Sunday 01 May 2016 12:36:48 Steven D'Aprano wrote:

> On Mon, 2 May 2016 02:30 am, Grant Edwards wrote:
> >> In discussions like these, it would be important to draw from
> >> precedents. Are there commands that have such an option?
> >
> > It's pretty rare.  It is assumed that Unix users can type " | less"
>
> Is nobody except me questioning the assumption that we're only talking
> about Unix users?

linux, unix, mauche nichs.  Are there others?

> --
> Steven


Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 

Re: What should Python apps do when asked to show help?

2016-05-01 Thread Random832

On Sun, May 1, 2016, at 13:04, Grant Edwards wrote:
> On 2016-05-01, Steven D'Aprano  wrote:
> > Is nobody except me questioning the assumption that we're only
> > talking about Unix users?
> 
> Didn't the OP specify that he was writing a command-line utility for
> Linux/Unix?

We've been talking about pydoc instead of the OP's program for a while
now.

Re: What should Python apps do when asked to show help?

2016-05-01 Thread Ethan Furman


On 05/01/2016 09:36 AM, Steven D'Aprano wrote:
> On Mon, 2 May 2016 02:30 am, Grant Edwards wrote:
>> It's pretty rare.  It is assumed that Unix users can type " | less"
>
> Is nobody except me questioning the assumption that we're only talking about
> Unix users?


Even Windows has "more".

--
~Ethan~


How to fill in abbreviation in one column based on state name in another column?

2016-05-01 Thread David Shi via Python-list

Hello, I am back.  Thank you very much for your positive response.
I am trying to use Pandas apply to execute a lookup function, so that we can 
put an abbreviation in a new column, according to the state name in another 
column.
Does anyone know how to make this work?
Regards.
David

Look up function

state_to_code = {"VERMONT": "VT", "GEORGIA": "GA", "IOWA": "IA"}
#table['moa_state_name'] = map(lambda x: x.upper(), table['moa_state_name'])
def convert_state(row):
    abbrev1 = state_to_code(table['moa_state_name']) #'aatest'
    if abbrev1:
        return abbrev1 ##state_to_code[abbrev[0]]
    return np.nan
#print convert_state(table['moa_state_name'])
table.insert(0, "abbrev", np.nan)
table['abbrev'] = table.apply(convert_state, axis=1)
print state_to_code['ARKANSAS']

Re: do_POST not working on http.server with python

2016-05-01 Thread Pierre Quentel

Le jeudi 28 avril 2016 10:36:27 UTC+2, Rahul Raghunath a écrit :
> 
> I'm trying to create a simple http server with basic GET and POST 
> functionality. The program is supposed to GET requests by printing out a 
> simple webpage that greets a user and asks how he would rather be greeted. 
> When the user enters a greeting of his choice, the webpage should now greet 
> him as he had chosen.
> 
> While GET seems to be working fine, POST is not. I tried debugging by 
> printing at every code execution and it seems to be getting stuck here:
> 
> ctype, pdict = cgi.parse_header(self.headers.getheader('content-type'))
> 
> I'll paste the full code below, along with my terminal output.
> 
> Code:
> 
> from http.server import BaseHTTPRequestHandler, HTTPServer
> import cgi
> 
> 
> class webServerHandler(BaseHTTPRequestHandler):
> 
>     def do_GET(self):
>         try:
>             if self.path.endswith("/hello"):
>                 self.send_response(200)
>                 self.send_header('Content-type', 'text/html')
>                 self.end_headers()
>                 output = ""
>                 output += ""
>                 output += "Hello!"
>                 output += ''' enctype='multipart/form-data' action='/hello'>What would you like me to say? value="Submit"> '''
>                 output += ""
>                 self.wfile.write(output.encode(encoding = 'utf_8'))
>                 print (output)
>                 return
> 
>             if self.path.endswith("/hola"):
>                 self.send_response(200)
>                 self.send_header('Content-type', 'text/html')
>                 self.end_headers()
>                 output = ""
>                 output += ""
>                 output += "¡ Hola !"
>                 output += ''' enctype='multipart/form-data' action='/hello'>What would you like me to say? value="Submit"> '''
>                 output += ""
>                 self.wfile.write(output.encode(encoding = 'utf_8'))
>                 print (output)
>                 return
> 
>         except IOError:
>             self.send_error(404, 'File Not Found: %s' % self.path)
> 
>     def do_POST(self):
>         try:
>             self.send_response(201)
>             print("Sent response")
>             self.send_header('Content-type', 'text/html')
>             print("Sent headers")
>             self.end_headers()
>             print("Ended header")
>             ctype, pdict = cgi.parse_header(self.headers.getheader('content-type'))
>             print("Parsed headers")
>             if ctype == 'multipart/form-data':
>                 fields = cgi.parse_multipart(self.rfile, pdict)
>                 messagecontent = fields.get('message')
>             print("Receiver message content")
>             output = ""
>             output += ""
>             output += "  Okay, how about this: "
>             output += " %s " % messagecontent[0]
>             output += ''' enctype='multipart/form-data' action='/hello'>What would you like me to say? value="Submit"> '''
>             output += ""
>             print(output)
>             self.wfile.write(output.encode(encoding = 'utf_8'))
>             print ("Wrote through CGI")
>         except:
>             pass
> 
> 
> def main():
>     try:
>         port = 8080
>         server = HTTPServer(('', port), webServerHandler)
>         print ("Web Server running on port", port)
>         server.serve_forever()
>     except KeyboardInterrupt:
>         print (" ^C entered, stopping web server")
>         server.socket.close()
> 
> if __name__ == '__main__':
>     main()
> 
> Terminal Output:
> 
> Web Server running on port 8080
> 127.0.0.1 - - [28/Apr/2016 13:28:59] "GET /hello HTTP/1.1" 200 -
> Hello! enctype='multipart/form-data' action='/hello'>What would you like me to 
> say? value="Submit"> 
> 127.0.0.1 - - [28/Apr/2016 13:29:09] "POST /hello HTTP/1.1" 201 -
> Sent response
> Sent headers
> Ended header
> 
> As you can see, the POST function does not seem to go beyond the parse_header 
> command. I cannot figure this out, and any help would be useful!

Hi,

It's generally not considered good practise to silently ignore exceptions in a 
try/except where the except clause is just :

except:
   pass

If you remove the try/except in do_POST you will see this interesting error 
message :

AttributeError: 'HTTPMessage' object has no attribute 'getheader'
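
A sketch of one way around that error, assuming Python 3: the headers object is
an http.client.HTTPMessage, which has get() where Python 2 had getheader(). The
helper name content_type_of and the sample header value are made up for
illustration. Because HTTPMessage is a subclass of email.message.Message, the
media type and boundary can be read without the cgi module, which was
deprecated in Python 3.11 and removed in 3.13.

```python
import email
import http.client


def content_type_of(headers):
    """Return (media type, boundary) from an http.client.HTTPMessage."""
    # Python 3 spelling of Python 2's headers.getheader('content-type').
    raw = headers.get("content-type")
    if raw is None:
        return None, None
    return headers.get_content_type(), headers.get_param("boundary")


# A stand-in for self.headers, built by hand for demonstration only.
sample = email.message_from_string(
    "Content-Type: multipart/form-data; boundary=abc123\n\n",
    _class=http.client.HTTPMessage,
)
print(content_type_of(sample))
```

Inside do_POST the same call would be content_type_of(self.headers).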

Re: What should Python apps do when asked to show help?

2016-05-01 Thread cs


On 01May2016 20:55, Steven D'Aprano  wrote:
>On Sun, 1 May 2016 05:28 pm, c...@zip.com.au wrote:
>> On 01May2016 16:44, Chris Angelico  wrote:
>>>So you expect the environment variable to say which of multiple pagers
>>>you might want, but only when you already want a pager. Okay. How is
>>>an app supposed to know whether or not to use a pager? How do you
>>>expect them to mindread?
>>
>> I think for several of us, we do not expect the app to mindread. Don't
>> page for short output!
>
>Is there an environment variable to tell the application what you
>consider "short", or should it read your mind?


We're getting into matters of taste here. It shouldn't read my mind, but of 
course when it differs it shows bad taste!


I am taking the line that usage and help messages should fall into the "short" 
category, both simply by their nature and also as a design/style criterion for 
program authors. Manuals, be they man pages or info or whatever, should be 
"long", with specification and ideally explanations for rationale and some 
examples.



>Personally, I'd rather use a pager for 3 lines than print 30 lines of help
>text directly to the console, but others may feel differently.


And I am very much the opposite. ["foo --help"; "types next command; huh?  I'm 
in a pager, not back at my prompt?"]


However, with "less" configured to quit if the text fits on the screen (which 
it can usually determine by querying the terminal directly, no magic required), 
I get the best of both worlds, possibly to the point that I have rarely noticed 
that Python's help() pages.


And I've got mixed feelings about git. It seems that "git help" and "git 
--help" produce sensible unpaged short help (42 lines of it, but that is ok to 
me). It is "git help <subcommand>" which runs "man git-subcommand", and that is 
somewhat defensible because most of the git subcommands have an outrageous 
number of options.  (And it really does just invoke "man" (by default - that is 
also tunable I see); I can tell because it invokes my personal "man" command 
instead of the system one or some internally sourced text.)


By contrast, Mercurial (hg) always produces unpaged output for "help" and 
"help <subcommand>", and the "help <subcommand>" is succinct and fitting for a 
help text. There is a single large "man hg" for when you want the detailed 
manual. This approach is more to my liking.


Cheers,
Cameron Simpson 

Re: What should Python apps do when asked to show help?

2016-05-01 Thread cs


On 01May2016 21:23, Chris Angelico  wrote:
>On Sun, May 1, 2016 at 8:55 PM, Steven D'Aprano  wrote:
>> Is there an environment variable to tell the application what you
>> consider "short", or should it read your mind?
>
>How about $LINES? If it's less than that, it'll fit on one screen. Of
>course, that still won't be perfect, but it's a definite improvement
>over guessing.


On terminal emulators you can normally query the terminal directly for its 
current size (look at the output of "stty -a" for example), and pagers do. This 
is better than $LINES, which is really a convenience thing presented by some 
shells like bash and which won't magically change if you resize your terminal 
until bash gets another look.


If your pager can be told to autoquit if the output fits then this pain point 
(whatever your preference) can be largely obviated.
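
A sketch of the "does it fit on one screen" test discussed above, written in
Python rather than left to the pager. The function name fits_on_screen and the
80x24 fallback are assumptions for illustration. Note that
shutil.get_terminal_size() looks at $COLUMNS and $LINES first and only then
queries the terminal, while os.get_terminal_size() skips the environment and
asks the terminal directly.

```python
import shutil


def fits_on_screen(text, height=None):
    """Return True if `text` has fewer lines than the terminal is tall."""
    if height is None:
        # Uses $LINES if it is set, otherwise asks the terminal, and
        # falls back to 24 lines when neither source is available.
        height = shutil.get_terminal_size(fallback=(80, 24)).lines
    # Strictly fewer, so one line is left for the prompt that follows.
    return len(text.splitlines()) < height
```

A program could print directly when this returns True and start a pager only
otherwise.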


Cheers,
Cameron Simpson 

Re: What should Python apps do when asked to show help?

2016-05-01 Thread cs


On 01May2016 17:04, Grant Edwards  wrote:
>On 2016-05-01, Steven D'Aprano  wrote:
>> On Mon, 2 May 2016 02:30 am, Grant Edwards wrote:
>>>> In discussions like these, it would be important to draw from
>>>> precedents. Are there commands that have such an option?
>>>
>>> It's pretty rare.  It is assumed that Unix users can type " | less"
>>
>> Is nobody except me questioning the assumption that we're only
>> talking about Unix users?
>
>Didn't the OP specify that he was writing a command-line utility for
>Linux/Unix?
>
>Discussing command line operation for Windows or OS-X seems rather
>pointless.


OS-X _is_ UNIX. I spend almost all my time on this Mac in terminals. It is a 
very nice UNIX to use in many regards.


Cheers,
Cameron Simpson 

Mac OS X. Because making Unix user-friendly is easier than debugging Windows.
- Mike Dawson, Macintosh Systems Administrator and Consultation.
 mdaw...@mac.com http://herowars.onestop.net

Re: What should Python apps do when asked to show help?

2016-05-01 Thread Steven D'Aprano

On Mon, 2 May 2016 03:04 am, Grant Edwards wrote:

> On 2016-05-01, Steven D'Aprano  wrote:
>> On Mon, 2 May 2016 02:30 am, Grant Edwards wrote:
>>
>>>> In discussions like these, it would be important to draw from
>>>> precedents. Are there commands that have such an option?
>>> 
>>> It's pretty rare.  It is assumed that Unix users can type " | less"
>>
>> Is nobody except me questioning the assumption that we're only
>> talking about Unix users?
> 
> Didn't the OP specify that he was writing a command-line utility for
> Linux/Unix?

*cough* I'm the OP, and no I didn't.

Obviously I'm a Linux user myself, but I'm presumptuous enough to hope that
when I release the utility publicly[1], others may find it of some small
use. Including Windows users.

[1] Real Soon Now.

-- 
Steven


Code Opinion - Enumerate

2016-05-01 Thread Sayth Renshaw

Looking at various Python implementations of Conway's game of life.

I came across one on rosetta using defaultdict.

http://rosettacode.org/wiki/Conway%27s_Game_of_Life#Python

I'm just looking for your opinion on style: would you write it like this, 
continually calling range, or would you use enumerate instead, or neither 
(something far better)?

import random
from collections import defaultdict
 
printdead, printlive = '-#'
maxgenerations = 3
cellcount = 3,3
celltable = defaultdict(int, {
 (1, 2): 1,
 (1, 3): 1,
 (0, 3): 1,
 } ) # Only need to populate with the keys leading to life
 
##
## Start States
##
# blinker
u = universe = defaultdict(int)
u[(1,0)], u[(1,1)], u[(1,2)] = 1,1,1
 
for i in range(maxgenerations):
    print "\nGeneration %3i:" % ( i, )
    for row in range(cellcount[1]):
        print "  ", ''.join(str(universe[(row,col)])
                            for col in range(cellcount[0])).replace(
                                '0', printdead).replace('1', printlive)
    nextgeneration = defaultdict(int)
    for row in range(cellcount[1]):
        for col in range(cellcount[0]):
            nextgeneration[(row,col)] = celltable[
                ( universe[(row,col)],
                  -universe[(row,col)] + sum(universe[(r,c)]
                                             for r in range(row-1,row+2)
                                             for c in range(col-1, col+2) )
                ) ]
    universe = nextgeneration

I just finished watching Ned Batchelder's talk and am wondering how far I 
should take his advice.

http://nedbatchelder.com/text/iter.html

Thanks

Sayth
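
One possible answer to the style question, offered as a sketch rather than the
Rosetta Code author's method. enumerate pays off when looping over an existing
sequence while also needing the index. Here the loops generate coordinates
instead, so range is reasonable, and itertools.product can flatten the nesting.
The function names step and render are made up for illustration, and the rules
are written out directly instead of using the celltable lookup.

```python
from collections import defaultdict
from itertools import product


def step(universe, rows, cols):
    """Return the next generation of a rows x cols grid of 0/1 cells."""
    nxt = defaultdict(int)
    for row, col in product(range(rows), range(cols)):
        # Count live neighbours, excluding the cell itself.
        neighbours = sum(
            universe[(row + dr, col + dc)]
            for dr, dc in product((-1, 0, 1), repeat=2)
            if (dr, dc) != (0, 0)
        )
        alive = universe[(row, col)]
        if neighbours == 3 or (alive and neighbours == 2):
            nxt[(row, col)] = 1
    return nxt


def render(universe, rows, cols, dead="-", live="#"):
    """Draw the grid as text, one line per row."""
    return "\n".join(
        "".join(live if universe[(row, col)] else dead for col in range(cols))
        for row in range(rows)
    )


# The same blinker start state as in the post.
blinker = defaultdict(int, {(1, 0): 1, (1, 1): 1, (1, 2): 1})
print(render(step(blinker, 3, 3), 3, 3))
```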

Re: web facing static text db

2016-05-01 Thread sum abiut

Django is an excellent framework. You can use it with SQLite.

cheers

On Sat, Apr 30, 2016 at 7:17 PM, Gordon Levi  wrote:

> "Fetchinson ."  wrote:
>
> >Hi folks,
> >
> >I have a very specific set of requirements for a task and was
> >wondering if anyone had good suggestions for the best set of tools:
> >
> >* store text documents (about 10 pages)
> >* the data set is static (i.e. only lookups are performed, no delete,
> >no edit, no addition)
> >* only one operation required: lookup of pages by matching words in them
> >* very simple web frontend for querying the words to be matched
> >* no authentication or authorization, frontend completely public
> >* deployment at webfaction
> >* deadline: yesterday :)
> >
> >Which web framework and db engine would you recommend?
> >
> >So far I'm familiar with turbogears but would be willing to learn
> >anything if sufficiently basic since my needs are pretty basic (I
> >think).
> >
>
> What do you need that storing the documents in HTML format and Google
> Custom Search does not provide?

Re: How to fill in abbreviation in one column based on state name in another column?

2016-05-01 Thread Rustom Mody

Your code (below) is too garbled to be able to read

On Monday, May 2, 2016 at 12:00:59 AM UTC+5:30, David Shi wrote:
> Hello, I am back.  Thank you very much for your positive response.
> I am trying to use Pandas apply to execute a lookup function, so that we can 
> put abbreviation in a new column, in accordance to a state name in another 
> column.
> Does anyone knows how to make this to work?
> Regards.DavidLook up functionstate_to_code = {"VERMONT": "VT", "GEORGIA": 
> "GA", "IOWA": "IA"}#table['moa_state_name'] = map(lambda x: x.upper(), 
> table['moa_state_name'])def convert_state(row):    abbrev1 =  
> state_to_code(table['moa_state_name']) #'aatest'    if abbrev1:         
> return abbrev1 ##state_to_code[abbrev[0]]    return np.nan#print 
> convert_state(table['moa_state_name'])
> table.insert(0, "abbrev", np.nan)
> table['abbrev'] = table.apply(convert_state, axis=1)print 
> state_to_code['ARKANSAS']
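
As far as the quoted code can be read, one problem is that state_to_code is a
dict but is called like a function. A minimal sketch of the lookup on its own,
assuming the column holds state names as strings; math.nan stands in for np.nan
so the sketch runs without NumPy.

```python
import math

state_to_code = {"VERMONT": "VT", "GEORGIA": "GA", "IOWA": "IA"}


def convert_state(name):
    """Look up one state name; return NaN when it is not in the table."""
    # A dict is indexed or queried with .get(); calling it, as in
    # state_to_code(...), raises TypeError.
    return state_to_code.get(str(name).strip().upper(), math.nan)
```

Assuming table is a pandas DataFrame as in the post,
table['abbrev'] = table['moa_state_name'].map(convert_state) would apply this
to each value in the column. With table.apply(..., axis=1) the function
receives a whole row instead, so it would need to read row['moa_state_name'].
The final print of state_to_code['ARKANSAS'] raises KeyError because that key
is not in the dict.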


Re: Code Opinion - Enumerate

2016-05-01 Thread Sayth Renshaw

Here is another version that does not use enumerate but also avoids the ugly 
"for i in range" loops.

This one, from Code Review, uses a generator over live cells only.

http://codereview.stackexchange.com/a/108121/104381


def neighbors(cell):
    x, y = cell
    yield x - 1, y - 1
    yield x, y - 1
    yield x + 1, y - 1
    yield x - 1, y
    yield x + 1, y
    yield x - 1, y + 1
    yield x, y + 1
    yield x + 1, y + 1

def apply_iteration(board):
    new_board = set([])
    candidates = board.union(set(n for cell in board for n in neighbors(cell)))
    for cell in candidates:
        count = sum((n in board) for n in neighbors(cell))
        if count == 3 or (count == 2 and cell in board):
            new_board.add(cell)
    return new_board

if __name__ == "__main__":
    board = {(0,1), (1,2), (2,0), (2,1), (2,2)}
    number_of_iterations = 10
    for _ in xrange(number_of_iterations):
        board = apply_iteration(board)
    print board


Sayth

You gotta love a 2-line python solution

2016-05-01 Thread DFS


To save a webpage to a file:
-
1. import urllib
2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
-

That's it!

Coming from VB/A background, some of the stuff you can do with python - 
with ease - is amazing.



VBScript version
--
1. Option Explicit
2. Dim xmlHTTP, fso, fOut
3. Set xmlHTTP = CreateObject("MSXML2.serverXMLHTTP")
4. xmlHTTP.Open "GET", "http://econpy.pythonanywhere.com/ex/001.html"
5. xmlHTTP.Send
6. Set fso = CreateObject("Scripting.FileSystemObject")
7. Set fOut = fso.CreateTextFile("D:\file.html", True)
8.  fOut.WriteLine xmlHTTP.ResponseText
9. fOut.Close
10. Set fOut = Nothing
11. Set fso  = Nothing
12. Set xmlHTTP = Nothing
--

Technically, that VBS will run with just lines 3-9, but that's still 6 
lines of code vs 2 for python.





Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS


I posted a little while ago about how short the python code was:

-
1. import urllib
2. urllib.urlretrieve(webpage, filename)
-

Which is very sweet compared to the VBScript version:

--
1. Option Explicit
2. Dim xmlHTTP, fso, fOut
3. Set xmlHTTP = CreateObject("MSXML2.serverXMLHTTP")
4. xmlHTTP.Open "GET", webpage
5. xmlHTTP.Send
6. Set fso = CreateObject("Scripting.FileSystemObject")
7. Set fOut = fso.CreateTextFile(filename, True)
8.  fOut.WriteLine xmlHTTP.ResponseText
9. fOut.Close
10. Set fOut = Nothing
11. Set fso  = Nothing
12. Set xmlHTTP = Nothing
--

Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10 
iterations, vs 0.88 for python.


webpage = 'http://econpy.pythonanywhere.com/ex/001.html'


So I tried:
---
import urllib2
r = urllib2.urlopen(webpage)
f = open(filename,"w")
f.write(r.read())
f.close()
---
and
---
import requests
r = requests.get(webpage)
f = open(filename,"w")
f.write(r.text)
f.close()
---
and
-
import pycurl
with open(filename, 'wb') as f:
    c = pycurl.Curl()
    c.setopt(c.URL, webpage)
    c.setopt(c.WRITEDATA, f)
    c.perform()
    c.close()
-

urllib2 and requests were about the same speed as urllib.urlretrieve, 
while pycurl was significantly slower (1.2 seconds).
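
For reference, a Python 3 form of the urllib2 variant might look like the
sketch below (urllib2 became urllib.request in Python 3). It makes no claim
about speed. The helper name save_url is made up for illustration, and the
"with" statement is used so that both the response and the file are closed.

```python
import shutil
import urllib.request


def save_url(url, filename):
    """Copy the body at `url` into `filename` without decoding it."""
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        # Both objects are closed on leaving the block, even on error.
        shutil.copyfileobj(response, out)
```

Usage would be save_url(webpage, filename) with the same two variables as in
the snippets above.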


I'm running Win 8.1.  python 2.7.11 32-bit.

I know it's asking a lot, but is there a really fast AND really short 
python solution for this simple thing?



Thanks!



Re: You gotta love a 2-line python solution

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 08:39 PM, DFS wrote:
> To save a webpage to a file:
> -
> 1. import urllib
> 2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
> -

Note, for paths on windows you really want to use a rawstring. Ie,
r"D:\file.html".
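
A short illustration of why the raw string matters for this particular path: in
a normal string literal, "\f" is the escape for a form feed character, so the
file name silently changes. The variable names are only for demonstration.

```python
plain = "D:\file.html"    # "\f" is read as a single form feed character
raw = r"D:\file.html"     # the raw string keeps the backslash and the "f"

print(repr(plain))        # 'D:\x0cile.html'
print(repr(raw))          # 'D:\\file.html'
```

Writing the path as "D:\\file.html" or "D:/file.html" avoids the problem as
well.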

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 09:06 PM, DFS wrote:
> Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10 
> iterations, vs 0.88 for python.
...
> I know it's asking a lot, but is there a really fast AND really short 
> python solution for this simple thing?

0.88 is not fast enough for you? That's less than a second.

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Chris Angelico

On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen  wrote:
> On Sun, May 1, 2016, at 09:06 PM, DFS wrote:
>> Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10
>> iterations, vs 0.88 for python.
> ...
>> I know it's asking a lot, but is there a really fast AND really short
>> python solution for this simple thing?
>
> 0.88 is not fast enough for you? That's less than a second.

Also, this is timings of network and disk operations. Unless something
pathological is happening, the language used won't make any
difference.

ChrisA

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Ben Finney

DFS  writes:

> Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10
> iterations, vs 0.88 for python.
>
> […]
>
> urllib2 and requests were about the same speed as urllib.urlretrieve,
> while pycurl was significantly slower (1.2 seconds).

Network access is notoriously erratic in its timing. The program, and
the machine on which it runs, is subject to a great many external
effects once the request is sent — effects which will significantly
alter the delay before a response is completed.

How have you controlled for the wide variability in the duration, for
even a given request by the *same code on the same machine*, at
different points in time?

One simple way to do that: Run the exact same test many times (say,
10 000 or so) on the same machine, and then compute the average of all
the durations.

Do the same for each different program, and then you may have more
meaningfully comparable measurements.

-- 
 \ “We are no more free to believe whatever we want about God than |
  `\ we are free to adopt unjustified beliefs about science or |
_o__)  history […].” —Sam Harris, _The End of Faith_, 2004 |
Ben Finney


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Chris Angelico

On Mon, May 2, 2016 at 2:49 PM, Ben Finney  wrote:
> One simple way to do that: Run the exact same test many times (say,
> 10 000 or so) on the same machine, and then compute the average of all
> the durations.
>
> Do the same for each different program, and then you may have more
> meaningfully comparable measurements.

And also find the minimum and maximum durations, too. Averages don't
always tell the whole story.
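
A minimal sketch of that advice, written for Python 3 (the thread itself uses 2.7). The helper names `time_runs` and `summarize` are made up for illustration, and the callable passed in would be whatever download routine is being tested:

```python
import statistics
import time


def time_runs(func, runs=100):
    """Call func() `runs` times and return each call's duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report minimum, mean and maximum, since the mean alone can mislead."""
    return {
        "min": min(durations),
        "mean": statistics.mean(durations),
        "max": max(durations),
    }


# Stand-in workload; a real test would pass the fetch-and-save function here.
print(summarize(time_runs(lambda: sum(range(1000)), runs=50)))
```

Running the same helper over each candidate program's routine gives figures that can be compared side by side.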

ChrisA

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS


On 5/2/2016 12:40 AM, Chris Angelico wrote:
> On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen  wrote:
>> On Sun, May 1, 2016, at 09:06 PM, DFS wrote:
>>> Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10
>>> iterations, vs 0.88 for python.
>> ...
>>> I know it's asking a lot, but is there a really fast AND really short
>>> python solution for this simple thing?
>>
>> 0.88 is not fast enough for you? That's less than a second.
>
> Also, this is timings of network and disk operations. Unless something
> pathological is happening, the language used won't make any
> difference.
>
> ChrisA

Unfortunately, the VBScript is twice as fast as any python method.





Re: You gotta love a 2-line python solution

2016-05-01 Thread DFS


On 5/2/2016 12:31 AM, Stephen Hansen wrote:
> On Sun, May 1, 2016, at 08:39 PM, DFS wrote:
>> To save a webpage to a file:
>> -
>> 1. import urllib
>> 2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
>> -
>
> Note, for paths on windows you really want to use a rawstring. Ie,
> r"D:\file.html".

Thanks.

I actually use "D:\\file.html" in my code.



Re: You gotta love a 2-line python solution

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 09:51 PM, DFS wrote:
> On 5/2/2016 12:31 AM, Stephen Hansen wrote:
> > On Sun, May 1, 2016, at 08:39 PM, DFS wrote:
> >> To save a webpage to a file:
> >> -
> >> 1. import urllib
> >> 2. urllib.urlretrieve("http://econpy.pythonanywhere.com
> >>  /ex/001.html","D:\file.html")
> >> -
> >
> > Note, for paths on windows you really want to use a rawstring. Ie,
> > r"D:\file.html".
> > 
> Thanks.
> 
> I actually use "D:\\file.html" in my code.

Or you can do that. But the whole point of raw strings is not having to
escape slashes :) 

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 09:50 PM, DFS wrote:
> On 5/2/2016 12:40 AM, Chris Angelico wrote:
> > On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen  wrote:
> >> On Sun, May 1, 2016, at 09:06 PM, DFS wrote:
> >>> Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10
> >>> iterations, vs 0.88 for python.
> >> ...
> >>> I know it's asking a lot, but is there a really fast AND really short
> >>> python solution for this simple thing?
> >>
> >> 0.88 is not fast enough for you? That's less than a second.
> >
> > Also, this is timings of network and disk operations. Unless something
> > pathological is happening, the language used won't make any
> > difference.
> >
> > ChrisA
> 
> 
> Unfortunately, the VBScript is twice as fast as any python method.

And 0.2 is twice as fast as 0.1. When you have two small numbers, 'twice
as fast' isn't particularly meaningful as a metric. 

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS


On 5/2/2016 12:49 AM, Ben Finney wrote:
> DFS writes:
>
>> Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10
>> iterations, vs 0.88 for python.
>>
>> […]
>>
>> urllib2 and requests were about the same speed as urllib.urlretrieve,
>> while pycurl was significantly slower (1.2 seconds).
>
> Network access is notoriously erratic in its timing. The program, and
> the machine on which it runs, is subject to a great many external
> effects once the request is sent — effects which will significantly
> alter the delay before a response is completed.
>
> How have you controlled for the wide variability in the duration, for
> even a given request by the *same code on the same machine*, at
> different points in time?
>
> One simple way to do that: Run the exact same test many times (say,
> 10 000 or so) on the same machine, and then compute the average of all
> the durations.
>
> Do the same for each different program, and then you may have more
> meaningfully comparable measurements.

I tried the 10-loop test several times with all versions.

The results were 100% consistent: VBScript xmlHTTP was always 2x faster 
than any python method.





Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS


On 5/2/2016 1:00 AM, Stephen Hansen wrote:
> On Sun, May 1, 2016, at 09:50 PM, DFS wrote:
>> On 5/2/2016 12:40 AM, Chris Angelico wrote:
>>> On Mon, May 2, 2016 at 2:34 PM, Stephen Hansen  wrote:
>>>> On Sun, May 1, 2016, at 09:06 PM, DFS wrote:
>>>>> Then I tested them in loops - the VBScript is MUCH faster: 0.44 for 10
>>>>> iterations, vs 0.88 for python.
>>>> ...
>>>>> I know it's asking a lot, but is there a really fast AND really short
>>>>> python solution for this simple thing?
>>>>
>>>> 0.88 is not fast enough for you? That's less than a second.
>>>
>>> Also, this is timings of network and disk operations. Unless something
>>> pathological is happening, the language used won't make any
>>> difference.
>>>
>>> ChrisA
>>
>> Unfortunately, the VBScript is twice as fast as any python method.
>
> And 0.2 is twice as fast as 0.1. When you have two small numbers, 'twice
> as fast' isn't particularly meaningful as a metric.

0.2 is half as fast as 0.1, here.

And two small numbers turn into bigger numbers when the webpage is big, 
and soon the download time differences are measured in minutes, not half 
a second.

So, any ideas?

Re: You gotta love a 2-line python solution

2016-05-01 Thread DFS


On 5/2/2016 1:02 AM, Stephen Hansen wrote:
> On Sun, May 1, 2016, at 09:51 PM, DFS wrote:
>> On 5/2/2016 12:31 AM, Stephen Hansen wrote:
>>> On Sun, May 1, 2016, at 08:39 PM, DFS wrote:
>>>> To save a webpage to a file:
>>>> -
>>>> 1. import urllib
>>>> 2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
>>>> -
>>>
>>> Note, for paths on windows you really want to use a rawstring. Ie,
>>> r"D:\file.html".
>>
>> Thanks.
>>
>> I actually use "D:\\file.html" in my code.
>
> Or you can do that. But the whole point of raw strings is not having to
> escape slashes :)

Nice.  Where/how else is 'r' used?

I'm new to python, but I learned that one the hard way.

I was using "D:\testfile.txt" for something, and my code kept failing. 
Took me a while to figure it out.  I tried various letters after the 
slash.  I finally stumbled across the escape slashes in the docs somewhere.




Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Chris Angelico

On Mon, May 2, 2016 at 3:04 PM, DFS  wrote:
> And two small numbers turn into bigger numbers when the webpage is big, and
> soon the download time differences are measured in minutes, not half a
> second.
>
> So, any ideas?

So, measure with bigger web pages, and find out whether it's really a
2:1 ratio or a half-second difference. When download times are
measured in minutes, a half second difference is insignificant.

Extrapolating is dangerous.
https://xkcd.com/605/

ChrisA

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 10:00 PM, DFS wrote:
> I tried the 10-loop test several times with all versions.

Also how, _exactly_, are you testing this?

C:\Python27>python -m timeit "filename='C:\\test.txt';
webpage='http://econpy.pythonanywhere.com/ex/001.html'; import urllib2;
r = urllib2.urlopen(webpage); f = open(filename, 'w');
f.write(r.read()); f.close();"
10 loops, best of 3: 175 msec per loop

That's a whole lot less than 0.88 secs.
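
The same "best of N" measurement can also be taken from inside a script. This is only a sketch in Python 3; `best_per_loop` is a hypothetical helper name, and the callable passed to it would be the download routine under test:

```python
import timeit


def best_per_loop(func, number=10, repeat=3):
    """Run `number` calls per trial, `repeat` trials, and return the best per-call time."""
    totals = timeit.repeat(func, number=number, repeat=repeat)
    # Each total covers `number` calls; the smallest total is the least disturbed trial.
    return min(totals) / number


# Stand-in workload so the sketch runs without network access.
print("%.6f sec per loop" % best_per_loop(lambda: sum(range(1000))))
```

Taking the minimum rather than the mean mirrors what the timeit command line reports as "best of".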

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 10:04 PM, DFS wrote:
> And two small numbers turn into bigger numbers when the webpage is big, 
> and soon the download time differences are measured in minutes, not half 
> a second.

Are you sure of that? Have you determined whether the time is a
constant overhead or is directly proportional to the size of the
page? If so, how have you determined that?

You aren't showing how you're testing. A 0.4s difference is meaningless
to me if it's a constant overhead. If it's twice as slow for a 1 meg
file, then you might have an issue. Maybe. You haven't shown that.
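
To make the constant-overhead argument concrete, here is a small arithmetic sketch in Python 3. It assumes, purely for illustration, that the per-request figures implied by the thread (0.88 s and 0.44 s per 10 requests) are entirely fixed overhead and that both clients spend the same time on the actual transfer; nothing in the thread confirms that split.

```python
def slowdown_ratio(overhead_slow, overhead_fast, transfer_seconds):
    """Ratio of total times when both clients share the same transfer time."""
    return (overhead_slow + transfer_seconds) / (overhead_fast + transfer_seconds)


# 0.088 s and 0.044 s per request; the transfer times are hypothetical.
for transfer in (0.0, 1.0, 60.0):
    print(transfer, slowdown_ratio(0.088, 0.044, transfer))
```

Under that assumption the ratio starts at 2 for a tiny page and drifts towards 1 as the transfer time grows, which is why the size of the page matters to the comparison.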

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: You gotta love a 2-line python solution

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 10:08 PM, DFS wrote:
> On 5/2/2016 1:02 AM, Stephen Hansen wrote:
> >> I actually use "D:\\file.html" in my code.
> >
> > Or you can do that. But the whole point of raw strings is not having to
> > escape slashes :)
> 
> 
> Nice.  Where/how else is 'r' used?

Raw strings are primarily used A) for Windows paths, and more
universally, B) for regular expressions. 

But in theory they're useful anywhere you have static/literal data that
might include backslashes where you don't actually intend to use any
escape characters.
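
A short Python 3 sketch of both uses. The path is the one from the thread; the sample sentence fed to the regular expression is made up for illustration:

```python
import re

plain = "D:\file.html"      # "\f" is read as a single form feed character
escaped = "D:\\file.html"   # doubled backslash gives one literal backslash
raw = r"D:\file.html"       # raw string keeps the backslash as written

print(repr(plain))          # shows the form feed in place of "\f"
print(escaped == raw)       # the two spellings name the same path

# Regular expressions: the backslashes reach the regex engine unchanged.
numbers = re.findall(r"\d+\.\d+", "0.44 for VBScript vs 0.88 for python")
print(numbers)
```

Only the first spelling silently changes the file name, which is the failure DFS describes running into.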

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: Code Opinion - Enumerate

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 08:17 PM, Sayth Renshaw wrote:
> Just looking for your opinion on style would you write it like this
> continually calling range or would you use enumerate instead, or neither
> (something far better) ?

I can't comment on your specific code because there's too much noise to
it, but in general:

Using enumerate increases readability, and I use it whenever the idiom:

    for index, item in enumerate(thing):
        ...

is used.

Enumerate is your friend. Hug it.
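
For comparison, a small Python 3 sketch of the two styles side by side. The `cells` list is invented for illustration and is not the Rosetta code under discussion:

```python
cells = ["alive", "dead", "alive", "alive"]

# Index-based loop: every use of the item needs a second lookup.
by_range = []
for index in range(len(cells)):
    if cells[index] == "alive":
        by_range.append(index)

# enumerate hands over the index and the item together.
by_enumerate = []
for index, state in enumerate(cells):
    if state == "alive":
        by_enumerate.append(index)

print(by_range == by_enumerate)
```

Both loops collect the same indexes; the second names the item directly instead of indexing back into the list.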

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: You gotta love a 2-line python solution

2016-05-01 Thread DFS


On 5/2/2016 1:02 AM, Stephen Hansen wrote:
> On Sun, May 1, 2016, at 09:51 PM, DFS wrote:
>> On 5/2/2016 12:31 AM, Stephen Hansen wrote:
>>> On Sun, May 1, 2016, at 08:39 PM, DFS wrote:
>>>> To save a webpage to a file:
>>>> -
>>>> 1. import urllib
>>>> 2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
>>>> -
>>>
>>> Note, for paths on windows you really want to use a rawstring. Ie,
>>> r"D:\file.html".
>>
>> Thanks.
>>
>> I actually use "D:\\file.html" in my code.
>
> Or you can do that. But the whole point of raw strings is not having to
> escape slashes :)



Trying the rawstring thing (say it fast 3x):

webpage = "http://econpy.pythonanywhere.com/ex/001.html"


webfile = "D:\\econpy001.html"
urllib.urlretrieve(webpage,webfile) WORKS

webfile = "rD:\econpy001.html"
urllib.urlretrieve(webpage,webfile) FAILS

webfile = "D:\econpy001.html"
urllib.urlretrieve(webpage,"r" + webfile) FAILS

webfile  = "D:\econpy001.html"
urllib.urlretrieve(webpage,"r" + "" + webfile + "")  FAILS


The FAILs throw:

Traceback (most recent call last):
  File "webscraper.py", line 54, in <module>
    urllib.urlretrieve(webpage,webfile)
  File "D:\development\python\python_2.7.11\lib\urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "D:\development\python\python_2.7.11\lib\urllib.py", line 249, in retrieve
    tfp = open(filename, 'wb')
IOError: [Errno 22] invalid mode ('wb') or filename: 'rD:\\econpy001.html'


What am I doing wrong?

Re: You gotta love a 2-line python solution

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 10:23 PM, DFS wrote:
> Trying the rawstring thing (say it fast 3x):
> 
> webpage = "http://econpy.pythonanywhere.com/ex/001.html"
> 
> 
> webfile = "D:\\econpy001.html"
> urllib.urlretrieve(webpage,webfile) WORKS
> 
> webfile = "rD:\econpy001.html"

The r is *outside* the string.

It's: r"D:\econpy001.html"

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: You gotta love a 2-line python solution

2016-05-01 Thread Steven D'Aprano

On Monday 02 May 2016 15:21, Stephen Hansen wrote:

> On Sun, May 1, 2016, at 10:08 PM, DFS wrote:
>> On 5/2/2016 1:02 AM, Stephen Hansen wrote:
>> >> I actually use "D:\\file.html" in my code.
>> >
>> > Or you can do that. But the whole point of raw strings is not having to
>> > escape slashes :)
>> 
>> 
>> Nice.  Where/how else is 'r' used?
> 
> Raw strings are primarily used A) for windows paths, and more
> universally, B) for regular expressions.

Raw strings are designed for regular expressions. They can be used for 
Windows paths, except for one minor gotcha: you can't end a raw string with 
an odd number of backslashes. So this doesn't work:

directory = r'D:\some\path\dir\'

So it's more of a half-cooked string than a raw string.
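
A minimal Python 3 sketch of the gotcha and one common workaround. The failing literal is kept as source text and handed to `compile()` so that the example file itself still parses; `parses` is a hypothetical helper name:

```python
# The literal from the post, stored as text: r'D:\some\path\dir\'
bad_source = "directory = r'D:\\some\\path\\dir\\'"


def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(parses(bad_source))

# Workaround: keep the raw string free of the trailing backslash
# and append it from an ordinary string literal.
directory = r'D:\some\path\dir' + '\\'
print(directory)
```

The final backslash in the raw literal escapes the closing quote as far as the tokenizer is concerned, so the string never terminates.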

-- 
Steve


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS


On 5/2/2016 1:15 AM, Stephen Hansen wrote:
> On Sun, May 1, 2016, at 10:00 PM, DFS wrote:
>> I tried the 10-loop test several times with all versions.
>
> Also how, _exactly_, are you testing this?
>
> C:\Python27>python -m timeit "filename='C:\\test.txt';
> webpage='http://econpy.pythonanywhere.com/ex/001.html'; import urllib2;
> r = urllib2.urlopen(webpage); f = open(filename, 'w');
> f.write(r.read()); f.close();"
> 10 loops, best of 3: 175 msec per loop
>
> That's a whole lot less than 0.88 secs.


Indeed.


-
import requests, urllib, urllib2, pycurl
import time

webpage = "http://econpy.pythonanywhere.com/ex/001.html"
webfile = "D:\\econpy001.html"
loops   = 10

# urllib: fetch and save in a single call
startTime = time.clock()
for i in range(loops):
    urllib.urlretrieve(webpage,webfile)
endTime = time.clock()
print "Finished urllib in %.2g seconds" %(endTime-startTime)

# urllib2: fetch, then write the body to the file
startTime = time.clock()
for i in range(loops):
    r = urllib2.urlopen(webpage)
    f = open(webfile,"w")
    f.write(r.read())
    f.close()
endTime = time.clock()
print "Finished urllib2 in %.2g seconds" %(endTime-startTime)

# requests: fetch, then write the decoded text to the file
startTime = time.clock()
for i in range(loops):
    r = requests.get(webpage)
    f = open(webfile,"w")
    f.write(r.text)
    f.close()
endTime = time.clock()
print "Finished requests in %.2g seconds" %(endTime-startTime)

# pycurl: stream the response straight into a new file on each pass
startTime = time.clock()
for i in range(loops):
    with open(webfile + str(i) + ".txt", 'wb') as f:
        c = pycurl.Curl()
        c.setopt(c.URL, webpage)
        c.setopt(c.WRITEDATA, f)
        c.perform()
        c.close()
endTime = time.clock()
print "Finished pycurl in %.2g seconds" %(endTime-startTime)
-

$ python getHTML.py
Finished urllib in 0.88 seconds
Finished urllib2 in 0.83 seconds
Finished requests in 0.89 seconds
Finished pycurl in 1.1 seconds

Those results are consistent.  They go up or down a little, but never 
below 0.82 seconds (for urllib2), or above 1.2 seconds (for pycurl)


VBScript is consistently 0.44 to 0.48


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Steven D'Aprano

On Monday 02 May 2016 15:04, DFS wrote:

> 0.2 is half as fast as 0.1, here.
> 
> And two small numbers turn into bigger numbers when the webpage is big,
> and soon the download time differences are measured in minutes, not half
> a second.

It takes twice as long to screw a screw into timber as to hammer a nail 
into the same timber.

Therefore if builders change from screws to nails, they can finish building 
the house in half the time.

-- 
Steve


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Steven D'Aprano

On Monday 02 May 2016 15:00, DFS wrote:

> I tried the 10-loop test several times with all versions.
> 
> The results were 100% consistent: VBSCript xmlHTTP was always 2x faster
> than any python method.

Are you absolutely sure you're comparing the same job in two languages? Is 
VB using a local web cache, and Python not? Are you saving files with both 
tests? To the same local drive? (To ensure you aren't measuring the 
difference between "write this file to a slow IDE hard disk, write that file 
to a fast SSD".)

Once you are sure that you are comparing the same task in two languages, 
then make sure the measurement is meaningful. If you change from a (let's 
say) 1 KB file to a 100 KB file, do you see the same 2 x difference? What if 
you increase it to a 1 KB file?

-- 
Steve


Re: You gotta love a 2-line python solution

2016-05-01 Thread DFS


On 5/2/2016 1:37 AM, Stephen Hansen wrote:
> On Sun, May 1, 2016, at 10:23 PM, DFS wrote:
>> Trying the rawstring thing (say it fast 3x):
>>
>> webpage = "http://econpy.pythonanywhere.com/ex/001.html"
>>
>> webfile = "D:\\econpy001.html"
>> urllib.urlretrieve(webpage,webfile) WORKS
>>
>> webfile = "rD:\econpy001.html"
>
> The r is *outside* the string.
>
> It's: r"D:\econpy001.html"

Got it.  Thanks.





Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread Stephen Hansen

On Sun, May 1, 2016, at 10:59 PM, DFS wrote:
> startTime = time.clock()
> for i in range(loops):
>   r = urllib2.urlopen(webpage)
>   f = open(webfile,"w")
>   f.write(r.read())
>   f.close
> endTime = time.clock()  
> print "Finished urllib2 in %.2g seconds" %(endTime-startTime)

Yeah, on my system I get 1.8 out of this, amounting to 0.18s per request.

I'm again going back to the point of: it's fast enough. When comparing
two small numbers, "twice as slow" is meaningless.

You have an assumption you haven't tested: that downloading a 10 meg
file will be twice as slow as downloading this tiny file. You haven't
proven that at all.

I suspect you have a constant overhead of X, and in this toy example,
that makes it seem twice as slow. But when downloading a file of any
real size, you'll have the same constant overhead, at which point the
difference is irrelevant.

If you believe otherwise, demonstrate it.

-- 
Stephen Hansen
  m e @ i x o k a i . i o

Re: Code Opinion - Enumerate

2016-05-01 Thread Sayth Renshaw

Thanks for the opinion. I should add that the code in the first post is not 
mine; it's the code from Rosetta on how to do Conway's GOL. 

I thought it looked ugly. 

Sayth 

Re: What should Python apps do when asked to show help?

2016-05-01 Thread Terry Reedy


On 5/1/2016 9:48 PM, Steven D'Aprano wrote:
> On Mon, 2 May 2016 03:04 am, Grant Edwards wrote:
>> On 2016-05-01, Steven D'Aprano  wrote:
>>> On Mon, 2 May 2016 02:30 am, Grant Edwards wrote:
>>>>> In discussions like these, it would be important to draw from
>>>>> precedents. Are there commands that have such an option?
>>>>
>>>> It's pretty rare.  It is assumed that Unix users can type " | less"
>>>
>>> Is nobody except me questioning the assumption that we're only
>>> talking about Unix users?
>>
>> Didn't the OP specify that he was writing a command-line utility for
>> Linux/Unix?
>
> *cough* I'm the OP, and no I didn't.
>
> Obviously I'm a Linux user myself, but I'm presumptuous enough to hope that
> when I release the utility publicly[1], others may find it of some small
> use. Including Windows users.
>
> [1] Real Soon Now.

As a Windows user in recent years, I expect -h to give me a list of 
options, hopefully with some annotation beyond the bare bones, that give 
the signature of the command (regarding it as a function call).  'python 
-h' is pretty bare bones.  'python -m test -h' is much better.  I expect 
both to tell me how to properly pass a file argument.  I don't expect 
either to tell me how to write a python or unittest file.  I use the manual 
for this.
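
As one illustration of that kind of -h output, here is a Python 3 sketch using argparse from the standard library. The program name `mytool`, its description and its options are all hypothetical, not the OP's utility:

```python
import argparse


def build_parser():
    """Build a parser whose -h output lists the options with short annotations."""
    parser = argparse.ArgumentParser(
        prog="mytool",  # hypothetical program name
        description="Show the first lines of a text file.",
    )
    parser.add_argument("file", help="path of the file to read")
    parser.add_argument(
        "-n", "--lines", type=int, default=10,
        help="number of lines to show (default: 10)",
    )
    return parser


# format_help() returns the text that -h would print, without exiting.
help_text = build_parser().format_help()
print(help_text)
```

The usage line acts as the "signature" of the command, and the per-option help strings supply the annotation; anything longer belongs in the manual.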


--
Terry Jan Reedy


Re: You gotta love a 2-line python solution

2016-05-01 Thread Terry Reedy


On 5/2/2016 12:31 AM, Stephen Hansen wrote:
> On Sun, May 1, 2016, at 08:39 PM, DFS wrote:
>> To save a webpage to a file:
>> -
>> 1. import urllib
>> 2. urllib.urlretrieve("http://econpy.pythonanywhere.com/ex/001.html","D:\file.html")
>> -
>
> Note, for paths on windows you really want to use a rawstring. Ie,
> r"D:\file.html".

Or use forward slashes "D:/file.html" and avoid the issue.  I don't know 
of anywhere this does not work for file names sent from python directly 
to Windows.
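
A small Python 3 sketch of the forward-slash form. It uses `ntpath` and `PureWindowsPath` so that the Windows path rules can be shown on any platform; the file name is the one from the thread:

```python
import ntpath
from pathlib import PureWindowsPath

forward = "D:/file.html"   # no backslashes, so nothing to escape

# Both helpers apply Windows path rules regardless of the host OS.
print(ntpath.normpath(forward))
print(PureWindowsPath(forward))

# The forward-slash and raw-string spellings name the same file.
same = PureWindowsPath(forward) == PureWindowsPath(r"D:\file.html")
print(same)
```

The forward-slash spelling sidesteps escape sequences entirely, at the cost of looking unfamiliar to Windows users.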


--
Terry Jan Reedy


Re: Fastest way to retrieve and write html contents to file

2016-05-01 Thread DFS


On 5/2/2016 2:05 AM, Steven D'Aprano wrote:
> On Monday 02 May 2016 15:00, DFS wrote:
>> I tried the 10-loop test several times with all versions.
>>
>> The results were 100% consistent: VBScript xmlHTTP was always 2x faster
>> than any python method.
>
> Are you absolutely sure you're comparing the same job in two languages?

As near as I can tell.  In VBScript I'm actually dereferencing various 
objects (that adds to the time), but I don't do that in python.  I don't 
know enough to even know if it's necessary, or good practice, or what.

> Is VB using a local web cache, and Python not?

I'm not specifying a local web cache with either (wouldn't know how or 
where to look).  If you have Windows, you can try it.

---
Option Explicit
Dim xmlHTTP, fso, fOut, startTime, endTime, webpage, webfile,i
webpage = "http://econpy.pythonanywhere.com/ex/001.html"
webfile  = "D:\econpy001.html"
startTime = Timer
For i = 1 to 10
 Set xmlHTTP = CreateObject("MSXML2.serverXMLHTTP")
 xmlHTTP.Open "GET", webpage
 xmlHTTP.Send
 Set fso = CreateObject("Scripting.FileSystemObject")
 Set fOut = fso.CreateTextFile(webfile, True)
 fOut.WriteLine xmlHTTP.ResponseText
 fOut.Close
 Set fOut= Nothing
 Set fso = Nothing
 Set xmlHTTP = Nothing
Next
endTime = Timer
wscript.echo "Finished VBScript in " & FormatNumber(endTime - startTime,3) & " seconds"
---
save it to a .vbs file and run it like this:
$cscript /nologo filename.vbs

> Are you saving files with both
> tests? To the same local drive? (To ensure you aren't measuring the
> difference between "write this file to a slow IDE hard disk, write that file
> to a fast SSD".)

Identical functionality (retrieve webpage, write html to file).  Same 
webpage, written to the same folder on the same hard drive (not SSD).

The 10 file writes (open/write/close) don't make a meaningful difference 
at all:

VBScript 0.0156 seconds
urllib2  0.0034 seconds

This file is 3.55K.

> Once you are sure that you are comparing the same task in two languages,
> then make sure the measurement is meaningful. If you change from a (let's
> say) 1 KB file to a 100 KB file, do you see the same 2 x difference? What if
> you increase it to a 1 KB file?

Do you know a webpage I can hit 10x repeatedly to download a good size 
file?  I'm always paranoid they'll block me thinking I'm a 
"professional" web scraper or something.

Thanks

