Re: Raw Data from Website

2016-08-23 Thread Steven D'Aprano
On Tuesday 23 August 2016 10:28, adam.j.k...@gmail.com wrote:

> Hi,
> 
> I am hoping someone is able to help me.
> 
> Is there a way to pull as much raw data from a website as possible. The
> webpage that I am looking for is as follows:
> 
> http://www.homepriceguide.com.au/Research/ResearchSeeFullList.aspx?LocationType=LGA&State=QLD&LgaID=632
> 
> The main variable that is important is the "632" at the end, by adjusting
> this it changes the postcodes. Each postcode contains a large amount of data.
> Is there a way this all able to be exported into an excel document?

Ideally, the web site itself will offer an Excel download option. If it 
doesn't, you may be able to screen-scrape the data yourself, but:

(1) it may be against the terms of service of the website;
(2) it may be considered unethical or possibly even copyright 
infringement or (worst case) even illegal;
(3) especially if you're thinking of selling the data;
(4) at the very least, unless you take care not to abuse the service, 
it may be rude and the website may even block your access.

There are many tutorials and examples of "screen scraping" or "web scraping" on 
the internet -- try reading them. It's not something I personally have any 
experience with, but I expect that the process goes something like this:

- connect to the website;
- download the particular page you want;
- grab the data that you care about;
- remove HTML tags and extract just the bits needed;
- write them to a CSV file.


You may find the Beautiful Soup third-party library helpful for this.
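
A minimal sketch of the steps listed above, offered as an illustration rather than a tested scraper for that site. It assumes the data sits in ordinary HTML table rows, and it uses only the standard library (html.parser instead of Beautiful Soup) so that it runs without installing anything. The names TableParser, fetch_page and html_to_csv, and the sample HTML, are made up for this example.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.request import urlopen


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped into rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the <tr> being read, or None
        self._cell = None   # text fragments of the open cell, or None

    def _close_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def fetch_page(url):
    """Download one page and return its text (check the site's terms first)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def html_to_csv(html_text):
    """Strip the tags, keep the cell text, and return it as CSV text."""
    parser = TableParser()
    parser.feed(html_text)
    parser.close()
    out = io.StringIO()
    csv.writer(out).writerows(parser.rows)
    return out.getvalue()


# Made-up sample standing in for a downloaded page.
SAMPLE = ("<table><tr><th>Postcode</th><th>Median price</th></tr>"
          "<tr><td>4000</td><td>500,000</td></tr></table>")
print(html_to_csv(SAMPLE))
```

In real use the text returned by fetch_page(url) would be passed to html_to_csv(), with a pause between requests so the site is not hammered.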



-- 
Steve

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Raw Data from Website

2016-08-23 Thread Chris Angelico
On Tue, Aug 23, 2016 at 5:14 PM, Steven D'Aprano wrote:
> There are many tutorials and examples of "screen scraping" or "web scraping"
> on the internet -- try reading them. It's not something I personally have any
> experience with, but I expect that the process goes something like this:
>
> - connect to the website;
> - download the particular page you want;
> - grab the data that you care about;
> - remove HTML tags and extract just the bits needed;
> - write them to a CSV file.

More or less. It's usually more like this:

- import requests and grab the data, nice and easy
- extract some of the info you need
- run into difficulties
- scream in frustration at the stupid inconsistencies in the original site
- mess around with it until your code is nested as deeply as the
site's HTML (minimum 30 levels)
- decide that 90% of the info is good enough
- run the program in production for a month or two, and then discover
that something's been changed and now it doesn't work
- return to step 1, repeat until you run out of hair to pull out

ChrisA


Re: Dynamically import specific names from a module vs importing full module

2016-08-23 Thread Malcolm Greene
Ned and Random832,

Ned: Thank you - your example answered my question.

Random832: Thank you for the reminder about "from <module> import
<name>" still importing the module. Yes, I had forgotten that behavior.

Best,
Malcolm
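
For readers following along, a small sketch of the two points in this message, under the assumption that standard-library modules are acceptable stand-ins: a "from ... import ..." statement still loads the whole module, and specific names can be pulled out dynamically with importlib. The helper import_names is hypothetical, not something from the thread.

```python
import importlib
import sys

# "from module import name" still imports (and caches) the whole module.
from colorsys import rgb_to_hsv
print("colorsys" in sys.modules)


def import_names(module_name, *names):
    """Import a module by its string name and return the requested attributes."""
    module = importlib.import_module(module_name)
    return tuple(getattr(module, name) for name in names)


sqrt, floor = import_names("math", "sqrt", "floor")
print(sqrt(16), floor(2.7))
```

Only the selected names end up bound in the caller's namespace, but the module itself is loaded in full either way.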


Python non blocking multi-client service

2016-08-23 Thread dimao

I am trying to implement a multi-client service. The service must be able to:

- connect to the server;
- send/receive messages;
- wait until a new message is received.

All of this must happen per client and without blocking.

My current code takes almost 99% CPU usage. Here is the main part of my code :


PORT= 33123
HOST= '127.0.0.1'


import asyncio
import os

@asyncio.coroutine
def tcp_echo_client(offset):

  def send(offset):
  MSG = """{"ClientId":"%s", % (str(offset))"}"""

  print("> " + MSG)
  writer.write((MSG).encode("utf-8"))


  def recv():
   msgback = (yield from reader.readline()).decode("utf-8").rstrip()
   print("< " + msgback)
   return msgback

   reader, writer = yield from asyncio.open_connection(HOST, port=PORT)


   print(reader)
   print('Waiting 3 sec for response...')

   while True:
  response = yield from asyncio.wait_for(reader.readline(), timeout=5.0)
  print(response)
  send(offset)

  yield from asyncio.sleep(0.5)

@asyncio.coroutine
def do_work(task_name, work_queue):
while not work_queue.empty():
queue_item = yield from work_queue.get()
print('{0} grabbed item: {1}'.format(task_name, queue_item))
asyncio.Task(tcp_echo_client(offset=queue_item))
yield from asyncio.sleep(0.1)


if __name__ == "__main__":
q = asyncio.Queue()

for x in range(100):
q.put_nowait(x)

print(q)

loop = asyncio.get_event_loop()

tasks = [
asyncio.async(do_work('task1', q)),
asyncio.async(do_work('task2', q)),
asyncio.async(do_work('task3', q)),
asyncio.async(do_work('task4', q)),
asyncio.async(do_work('task5', q)),
asyncio.async(do_work('task6', q))
]

loop.run_until_complete(asyncio.wait(tasks))
loop.run_forever()
loop.close()


Re: Python non blocking multi-client service

2016-08-23 Thread Chris Angelico
On Tue, Aug 23, 2016 at 11:08 PM, dimao  wrote:
> My current code takes almost 99% CPU usage. Here is the main part of my code :
>
>
> PORT= 33123
> HOST= '127.0.0.1'
>
>
> import asyncio
> import os
>
> @asyncio.coroutine
> def tcp_echo_client(offset):
>
>   def send(offset):
>   MSG = """{"ClientId":"%s", % (str(offset))"}"""
>
>   print("> " + MSG)
>   writer.write((MSG).encode("utf-8"))
>
>
>   def recv():
>msgback = (yield from reader.readline()).decode("utf-8").rstrip()
>print("< " + msgback)
>return msgback
>
>reader, writer = yield from asyncio.open_connection(HOST, port=PORT)
>
>
>print(reader)
>print('Waiting 3 sec for response...')
>
>while True:
>   response = yield from asyncio.wait_for(reader.readline(), 
> timeout=5.0)
>   print(response)
>   send(offset)
>
>   yield from asyncio.sleep(0.5)
>

Can you post your actual code, please? I'm not sure that this code is
runnable - at least, not with this indentation. And if I have to fix
your indentation to try to figure out what's nested inside what, I
can't help with your CPU usage problem.

ChrisA


libdivecomputer

2016-08-23 Thread alister

Does anyone know if there is a Python module available to interact with 
this library?
(It is for downloading data from various decompression computers, so a bit 
of a specialist market.)

I already have an application that works fine (sub-surface), so this is 
more of a curiosity; delving into ctypes is not something I want to try 
just yet.


-- 
According to Arkansas law, Section 4761, Pope's Digest:  "No person
shall be permitted under any pretext whatever, to come nearer than
fifty feet of any door or window of any polling room, from the opening
of the polls until the completion of the count and the certification of
the returns."


Re: Python non blocking multi-client service

2016-08-23 Thread dima . olkhov
On Tuesday, August 23, 2016 at 4:09:07 PM UTC+3, dimao wrote:
> I am trying to implement multi-client service. The service must be able to:
> 
> connect to the server;
> send/receive messages;
> wait until a new message will be received
> *** Per each client and non blocking
> 
> My current code takes almost 99% CPU usage. Here is the main part of my code :
> [code snipped]

UPDATED :
PORT = 33123
HOST = '127.0.0.1'
LASTLINE = '#'

import asyncio
import os


@asyncio.coroutine
def tcp_echo_client(offset):

    @asyncio.coroutine
    def send(offset):
        MSG = '{' + "ClientId" + ':' + "%s" % (str(offset)) + '}'

        print("> " + MSG)
        writer.write(MSG.encode("utf-8"))
        yield from writer.drain()

    @asyncio.coroutine
    def recv():
        msgback = (yield from reader.readline()).decode("utf-8").rstrip()
        print("< " + msgback)
        return msgback

    reader, writer = yield from asyncio.open_connection(HOST, port=PORT)

    print(reader)
    print('Waiting 3 sec for response...')

    while True:
        try:
            response = yield from asyncio.wait_for(reader.readline(),
                                                   timeout=10.0)
            print(response)

            if str(response).find('GetClientInfo') > 0:
                # send() contains "yield from", so it must be driven with
                # "yield from"; a bare call would never send anything.
                yield from send(int(offset))

            yield from asyncio.sleep(0.5)
        except:
            print ('Error')


@asyncio.coroutine
def do_work(task_name, work_queue):
    while not work_queue.empty():
        queue_item = yield from work_queue.get()
        print('{0} grabbed item: {1}'.format(task_name, queue_item))
        asyncio.Task(tcp_echo_client(offset=queue_item))
        yield from asyncio.sleep(0.1)


if __name__ == "__main__":
    q = asyncio.Queue()

    for x in range(1):
        q.put_nowait(x)

    print(q)

    loop = asyncio.get_event_loop()

    # asyncio.ensure_future() (Python 3.4.4+) replaces asyncio.async(),
    # which is a SyntaxError from Python 3.7 on.
    tasks = [
        asyncio.ensure_future(do_work('task1', q)),
        asyncio.ensure_future(do_work('task2', q)),
        asyncio.ensure_future(do_work('task3', q)),
        asyncio.ensure_future(do_work('task4', q)),
        asyncio.ensure_future(do_work('task5', q)),
        asyncio.ensure_future(do_work('task6', q))
    ]

    loop.run_until_complete(asyncio.wait(tasks))
    loop.run_forever()
    loop.close()

Thanks


Re: Python non blocking multi-client service

2016-08-23 Thread Chris Angelico
On Tue, Aug 23, 2016 at 11:39 PM,   wrote:
> except:
> print ('Error')
>

Don't do this.

ChrisA


Re: Python non blocking multi-client service

2016-08-23 Thread dima . olkhov
On Tuesday, August 23, 2016 at 4:09:07 PM UTC+3, dimao wrote:
> I am trying to implement multi-client service. The service must be able to:
> 
> connect to the server;
> send/receive messages;
> wait until a new message will be received
> *** Per each client and non blocking
> 
> My current code takes almost 99% CPU usage. Here is the main part of my code :
> [code snipped]

I did that only for the debug reasons :-)


Re: Python non blocking multi-client service

2016-08-23 Thread dima . olkhov
On Tuesday, August 23, 2016 at 4:09:07 PM UTC+3, dimao wrote:
> I am trying to implement multi-client service. The service must be able to:
> 
> connect to the server;
> send/receive messages;
> wait until a new message will be received
> *** Per each client and non blocking
> 
> My current code takes almost 99% CPU usage. Here is the main part of my code :
> [code snipped]

Can anyone help me?


Re: Python non blocking multi-client service

2016-08-23 Thread Zentrader
I don't know anything about asyncio, but multiprocessing would be my tool of 
choice.  The first example should be enough for what you want to do at 
https://pymotw.com/2/multiprocessing/basics.html
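
For comparison, a hedged sketch of how several clients can wait on their own connections in a single process with asyncio. It is not the original poster's program: it uses the newer async/await syntax (Python 3.7+ for asyncio.run), and the handle() server is a hypothetical stand-in that acknowledges each line, included only so the example runs on its own.

```python
import asyncio


async def handle(reader, writer):
    """Hypothetical test server: answer every line with 'ack <line>'."""
    while True:
        line = await reader.readline()
        if not line:            # empty bytes: the client closed the connection
            break
        writer.write(b"ack " + line)
        await writer.drain()
    writer.close()


async def client(client_id, host, port):
    """Connect, send one message, and wait (without blocking others) for a reply."""
    reader, writer = await asyncio.open_connection(host, port)
    writer.write('{{"ClientId": {}}}\n'.format(client_id).encode("utf-8"))
    await writer.drain()
    reply = await asyncio.wait_for(reader.readline(), timeout=5.0)
    writer.close()
    await writer.wait_closed()
    return reply.decode("utf-8").rstrip()


async def main(n_clients=5):
    # Port 0 asks the operating system for any free port.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        # All clients run concurrently; each one sleeps inside await.
        return await asyncio.gather(
            *(client(i, "127.0.0.1", port) for i in range(n_clients)))
    finally:
        server.close()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

While a client is suspended in an await, the event loop is idle, so waiting for messages this way should not by itself keep a CPU core busy.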


Re: Python non blocking multi-client service

2016-08-23 Thread INADA Naoki
> 
> I did that only for the debug reasons :-)

That's bad for debugging too.
A real exception and stack trace are far better than print('Error')
every time.  Never do it, even for debugging.
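
One way to keep the traceback while still handling the error, sketched with the standard logging module. The function parse_client_id and the logger name "client" are hypothetical, chosen only for illustration.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("client")


def parse_client_id(text):
    """Hypothetical helper: turn a received field into an integer id."""
    try:
        return int(text)
    except ValueError:
        # Catch only the error that is expected here, and log it together
        # with its full traceback instead of printing a bare 'Error'.
        log.exception("bad client id: %r", text)
        return None


print(parse_client_id("42"))
print(parse_client_id("4x2"))
```

Anything other than ValueError still propagates, so unexpected failures are not silently swallowed.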


Python 3: Launch multiple commands(subprocesses) in parallel (but upto 4 any time at same time) AND store each of their outputs into a variable

2016-08-23 Thread lax . clarke
Hi,

I've been reading various forums and python documentation on subprocess, 
multithreading, PIPEs, etc.  But I cannot seem to mash together several of my 
requirements into working code.

I am trying to:

1) Use Python 3+ (specifically 3.4 if it matters)
2) Launch N commands in the background (e.g., like subprocess.call would for 
individual commands)
3) But allow at most P commands to run at the same time
4) Wait until all N commands are done
5) Have an array of N strings with the stdout+stderr of each command in it.

What is the best way to do this?
There are so many variations in the Python documentation and on 
Stack Overflow that I am unable to see the forest for the trees (for my problem).

Thank you very much!


Re: degrees and radians.

2016-08-23 Thread murdocksgranpa
On Saturday, May 4, 2002 at 3:37:07 AM UTC-4, Jim Richardson wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> 
> I am trying to get the math module to deal with degrees rather than
> radians. (that it deals with radians for the angular functions like
> sin() isn't mentioned in the docs, which was sort of an eyeopener :)  I
> can't find any info on doing this. I can convert from-to degrees in the
> code calling the function, but that's a bit clunky. Any pointers to an
> FM to R? :)
> 
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.0.6 (GNU/Linux)
> Comment: For info see http://www.gnupg.org
> 
> iD8DBQE804+jd90bcYOAWPYRAt9KAKCuqeC4ozuXSaKZ5xY27Wv+k04QuQCcCrCZ
> WyichPnKgXo+GaDdAebsaeU=
> =h+vc
> -END PGP SIGNATURE-
> 
> -- 
> Jim Richardson
>   Anarchist, pagan and proud of it
> http://www.eskimo.com/~warlock
> Linux, from watches to supercomputers, for grandmas and geeks.

For what it is worth, electrical engineers for the most part work in degrees, 
NOT radians. For example, try doing polar to rectangular or vice versa in polar. 
I have never seen it done. 

Also, Borland C and C++ used degrees and NOT radians; go look at the libraries.

Just for what it's worth.



Re: degrees and radians.

2016-08-23 Thread Gary Herron

On 08/23/2016 09:08 PM, murdocksgra...@gmail.com wrote:

> On Saturday, May 4, 2002 at 3:37:07 AM UTC-4, Jim Richardson wrote:
> > -BEGIN PGP SIGNED MESSAGE-
> > Hash: SHA1
> >
> > I am trying to get the math module to deal with degrees rather than
> > radians. (that it deals with radians for the angular functions like
> > sin() isn't mentioned in the docs, which was sort of an eyeopener :)  I
> > can't find any info on doing this. I can convert from-to degrees in the
> > code calling the function, but that's a bit clunky. Any pointers to an
> > FM to R? :)
> >
> > -BEGIN PGP SIGNATURE-
> > Version: GnuPG v1.0.6 (GNU/Linux)
> > Comment: For info see http://www.gnupg.org
> >
> > iD8DBQE804+jd90bcYOAWPYRAt9KAKCuqeC4ozuXSaKZ5xY27Wv+k04QuQCcCrCZ
> > WyichPnKgXo+GaDdAebsaeU=
> > =h+vc
> > -END PGP SIGNATURE-
> >
> > --
> > Jim Richardson
> > Anarchist, pagan and proud of it
> > http://www.eskimo.com/~warlock
> > Linux, from watches to supercomputers, for grandmas and geeks.
>
> For what it is worth, electrical engineers for the most part work in degrees,
> NOT radians. For example, try doing polar to rectangular or vice versa in polar.
> I have never seen it done.
>
> Also, Borland C and C++ used degrees and NOT radians; go look at the libraries.
>
> Just for what it's worth.



Do you really need anything more complex than this?

>>> import math
>>> toRadians = math.pi/180.0

>>> math.sin(90*toRadians)
1.0

Perhaps I'm not understanding what you mean by "clunky",  but this seems 
pretty clean and simple to me.



Gary Herron


--
Dr. Gary Herron
Professor of Computer Science
DigiPen Institute of Technology
(425) 895-4418



Re: Python 3: Launch multiple commands(subprocesses) in parallel (but upto 4 any time at same time) AND store each of their outputs into a variable

2016-08-23 Thread Dale Marvin via Python-list

On 8/23/16 8:15 PM, lax.cla...@gmail.com wrote:

> I am trying to:
>
> 1) Use Python 3+ (specifically 3.4 if it matters)
> 2) Launch N commands in background (e.g., like subprocess.call would
> for individual commands)
> 3) But only limit P commands to run at same time
> 4) Wait until all N commands are done
> 5) Have an array of N strings with the stdout+stderr of each command
> in it.
>
> What is the best way to do this?

The best way is a matter of opinion; I have had success using Celery 
with Redis. 


Dale



Re: degrees and radians.

2016-08-23 Thread Random832
On Wed, Aug 24, 2016, at 00:26, Gary Herron wrote:
> Perhaps I'm not understanding what you mean by "clunky",  but this seems 
> pretty clean and simple to me.

The original post is from 2002; I don't know why it got a reply just
now.


Re: Python 3: Launch multiple commands(subprocesses) in parallel (but upto 4 any time at same time) AND store each of their outputs into a variable

2016-08-23 Thread Paul Rubin
Dale Marvin  writes:
> The best way is a matter of opinion, I have had success using Celery
> with Redis. 

I generally use GNU Parallel for stuff like that.  Celery looks
interesting though much fancier.
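
For readers who would rather stay inside the standard library, the five requirements in the original question can also be sketched with a thread pool, where the pool size plays the role of P. This is only one possible approach, not a recommendation from anyone in the thread; run_command and run_all are hypothetical names, and the demo commands simply invoke the current Python interpreter.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(argv):
    """Run one command and return its combined stdout+stderr as text."""
    proc = subprocess.Popen(argv,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,   # merge stderr into stdout
                            universal_newlines=True)    # text, not bytes
    output, _ = proc.communicate()
    return output


def run_all(commands, max_parallel=4):
    """Run every command, at most max_parallel at a time; keep input order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() blocks until all commands are done and preserves their order.
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    commands = [[sys.executable, "-c", "print({})".format(n)] for n in range(8)]
    outputs = run_all(commands, max_parallel=4)
    print(outputs)
```

Threads are sufficient here because each worker spends its time waiting on a child process rather than running Python code.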


Re: degrees and radians.

2016-08-23 Thread Steven D'Aprano
On Wednesday 24 August 2016 14:26, Gary Herron wrote:

> Do you really need anything more complex than this?
> 
>  >>> toRadians = math.pi/180.0
> 
>  >>> math.sin(90*toRadians)
> 1.0
> 
> Perhaps I'm not understanding what you mean by "clunky",  but this seems
> pretty clean and simple to me.

The math module has two conversion functions, math.radians() and 
math.degrees().


Some other languages (Julia, by memory, and perhaps others) have dedicated 
sind(), cosd(), tand() or possibly dsin(), dcos(), dtan() functions which take 
their argument in degrees and are more accurate than doing a conversion to 
radians first. I'd like to see that.

I've also seen languages with sinp() etc to calculate the sine of x*pi without 
the intermediate calculation.

But if I were designing Python from scratch, I'd make sin(), cos() and tan() 
call dunder methods __sin__ etc:


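import numbers

def sin(obj):
    if hasattr(type(obj), '__sin__'):
        # Look the method up on the type, as for other dunders, and pass obj.
        y = type(obj).__sin__(obj)
        if y is not NotImplemented:
            return y
    elif isinstance(obj, numbers.Number):
        # float.__sin__ is hypothetical; it does not exist in Python today.
        return float.__sin__(float(obj))
    raise TypeError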

Likewise for asin() etc.

Then you could define your own numeric types, such as a Degrees type, a 
PiRadians type, etc, with their own dedicated trig function implementations, 
without the caller needing to care about which sin* function they call.
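
A runnable sketch of that idea, with two loudly stated assumptions: __sin__ is not a protocol Python actually supports, so the dispatching sin() below is a hand-written stand-in, and plain numbers fall back to math.sin() because float has no __sin__ method. The Degrees class and its table of exact values are illustrative only.

```python
import math
import numbers


def sin(obj):
    """Dispatch to type(obj).__sin__ if defined, else fall back to math.sin."""
    method = getattr(type(obj), "__sin__", None)
    if method is not None:
        result = method(obj)
        if result is not NotImplemented:
            return result
    elif isinstance(obj, numbers.Number):
        return math.sin(float(obj))
    raise TypeError("sin() not supported for {}".format(type(obj).__name__))


class Degrees:
    """An angle measured in degrees (hypothetical example type)."""

    # Multiples of 90 degrees have exact sines; use them directly.
    _EXACT = {0.0: 0.0, 90.0: 1.0, 180.0: 0.0, 270.0: -1.0,
              -90.0: -1.0, -180.0: 0.0, -270.0: 1.0}

    def __init__(self, value):
        self.value = float(value)

    def __sin__(self):
        reduced = math.fmod(self.value, 360.0)
        if reduced in self._EXACT:
            return self._EXACT[reduced]
        return math.sin(math.radians(reduced))


print(sin(Degrees(180)))            # exact zero from the Degrees type
print(math.sin(math.radians(180)))  # tiny non-zero value after conversion
print(sin(0.5) == math.sin(0.5))    # plain floats still work
```

The caller writes sin(x) in every case; whether x is a float or a Degrees instance decides which implementation runs.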




-- 
Steve
