Re: Adding a cookie

2010-01-17 Thread Jens Müller
Hi, I am creating a web scraper for a specific website for an application. That website requires a specific cookie to be set in the request. Otherwise, the request is redirected. I have been trying for the last 6 hours to add a cookie to the HTTP request, but to no avail.
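One way to attach a cookie by hand is to set the `Cookie` header on the request object before sending it. The sketch below is written for Python 3's `urllib.request` (the 2010 thread would have used `urllib2`). The URL, cookie name, and cookie value are hypothetical placeholders, since the original post names none of them.

```python
import urllib.request

# Hypothetical values for illustration only; the original post gives none.
URL = "http://example.com/page"
COOKIE_NAME = "session"
COOKIE_VALUE = "abc123"


def build_request(url, cookies):
    """Return a Request that carries the given cookies in one Cookie header."""
    header = "; ".join(
        "{}={}".format(name, value) for name, value in cookies.items()
    )
    request = urllib.request.Request(url)
    request.add_header("Cookie", header)
    return request


request = build_request(URL, {COOKIE_NAME: COOKIE_VALUE})
# Sending it would then be: urllib.request.urlopen(request).read()
```

If the site also sets cookies in its responses, an opener built with `http.cookiejar.CookieJar` and `urllib.request.HTTPCookieProcessor` may be a better fit, since it stores and resends cookies automatically.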

Re: decode(..., errors='ignore') has no effect

2010-01-12 Thread Jens Müller
To convert unicode into str you have to *encode()* it. u"...".decode(...) will implicitly convert to ASCII first, i.e. it is equivalent to u"...".encode("ascii").decode(...). Hence the error message. Ah, yes, of course. And how can you use the system's default encoding with errors=ignore? The de
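The point of the reply can be shown with a small sketch. It is written for Python 3, where text is `str` and the encoded result is `bytes` (in the thread's Python 2 these would be `unicode` and `str`). The helper name `to_bytes` is made up for this example, and using `locale.getpreferredencoding()` as "the system's default encoding" is an assumption about what the poster meant.

```python
import locale

text = "M\xfcnchen, pronounced [\u02c8m\u028fn\xe7\u0259n]"


def to_bytes(value, encoding=None, errors="ignore"):
    """Encode text to bytes, dropping characters the codec cannot represent.

    With encoding=None the locale's preferred encoding is used (an
    assumption about what "system default encoding" should mean here).
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    return value.encode(encoding, errors)


# encode(), not decode(): the IPA characters missing from cp1252 are dropped.
encoded = to_bytes(text, "cp1252")
print(encoded)  # b'M\xfcnchen, pronounced [mn\xe7n]'
```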

decode(..., errors='ignore') has no effect

2010-01-12 Thread Jens Müller
Hi, I try to decode a string, e.g. u'M\xfcnchen, pronounced [\u02c8m\u028fn\xe7\u0259n]'.decode('cp1252', 'ignore'), but even though I use errors='ignore' I get UnicodeEncodeError: 'charmap' codec can't encode character u'\u02c8' in position 21: character maps to <undefined>. How come? Thanks, Jens

Re: Speeding up network access: threading?

2010-01-05 Thread Jens Müller
Hi, and sorry for double posting; I had mailer problems. Terry said "queue", not "list". Use the Queue class (it's thread-safe) in the "Queue" module (assuming you're using Python 2.x; in Python 3.x it's called the "queue" module). Yes yes, I know. I use a queue to realize the thread pool queue,
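A minimal sketch of the module naming mentioned above, with a queue used to collect worker results. The worker function and its squaring task are placeholders invented for the example.

```python
try:
    import queue            # Python 3 module name
except ImportError:
    import Queue as queue   # Python 2 module name

import threading

results = queue.Queue()


def worker(n):
    # put() is safe to call from several threads at once.
    results.put(n * n)


threads = [threading.Thread(target=worker, args=(n,)) for n in range(5)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# All workers have finished, so the queue can be drained into a plain list.
collected = []
while not results.empty():
    collected.append(results.get())
```

The order of `collected` depends on thread scheduling, which matches the thread's requirement that the original sequence need not be kept.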

Re: Speeding up network access: threading?

2010-01-05 Thread Jens Müller
Hello, The fairly obvious thing to do is use a queue.Queue for tasks and another for results and a pool of threads that read, fetch, and write. Thanks, indeed. Is a list thread-safe or do I need to lock when adding the results of my worker threads to a list? The order of the elements in the l
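If a plain list is preferred over a results queue, one cautious option is to guard it with a lock. In CPython a single `list.append` call is generally treated as atomic, but an explicit lock does not rely on that implementation detail. The names below are illustrative only.

```python
import threading

results = []
results_lock = threading.Lock()


def worker(start, count):
    """Append `count` consecutive integers to the shared list."""
    for value in range(start, start + count):
        with results_lock:      # only one thread mutates the list at a time
            results.append(value)


threads = [
    threading.Thread(target=worker, args=(i * 100, 100)) for i in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```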

Re: Speeding up network access: threading?

2010-01-05 Thread Jens Müller
Hello, The fairly obvious thing to do is use a queue.Queue for tasks and another for results and a pool of threads that read, fetch, and write. Thanks, indeed. Is a list thread-safe or do I need to lock when adding the results of my worker threads to a list? The order of the elements in the li

Speeding up network access: threading?

2010-01-04 Thread Jens Müller
Hello, what would be best practice for speeding up a larger number of HTTP GET requests done via urllib? Until now they are made in sequence, each request taking up to one second. The results must be merged into a list, while the original sequence need not be kept. I think speed could be
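The replies in this thread suggest one queue for tasks, another for results, and a pool of worker threads. A rough Python 3 sketch of that layout follows. The function names, the worker count, and the timeout are assumptions made for illustration, and `fetch_one` can be swapped out, which also makes it possible to try the pool without network access.

```python
import queue
import threading
import urllib.request


def fetch(url):
    """Download one URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def fetch_all(urls, fetch_one=fetch, num_workers=8):
    """Fetch all URLs with a thread pool; result order is not preserved."""
    tasks = queue.Queue()
    results = queue.Queue()
    for url in urls:
        tasks.put(url)

    def worker():
        while True:
            try:
                url = tasks.get_nowait()
            except queue.Empty:
                return  # every task was queued up front, so we are done
            try:
                results.put((url, fetch_one(url), None))
            except Exception as error:  # report failures instead of dying
                results.put((url, None, error))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    merged = []
    while not results.empty():
        merged.append(results.get())
    return merged
```

Each entry in the returned list is a `(url, body, error)` tuple, so a failed request does not discard the results of the others. Because the requests spend most of their time waiting on the network, threads can overlap that waiting even under the GIL.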