Re: Hash stability

2012-01-15 Thread Stefan Behnel
Heiko Wundram, 14.01.2012 23:45:
> On 14.01.2012 10:46, Peter Otten wrote:
>> Steven D'Aprano wrote:
>>> How many people rely on hash(some_string) being stable across Python
>>> versions? Does anyone have code that will be broken if the string hashing
>>> algorithm changes?
>>
>> Nobody who understands the question ;)
> 
> Erm, not exactly true. There are actually some packages out there (take
> suds [https://fedorahosted.org/suds/], for example) that rely on the
> hashing algorithm to be stable to function "properly" (suds uses hash() of
> strings to create caches of objects/XML Schemas on the filesystem).

That's a stupid design. Using a hash function that the application does not
control to index into persistent storage just screams for getting the code
broken at some point.

Stefan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: defining class and subclass in C

2012-01-15 Thread Stefan Behnel
Daniel Franke, 14.01.2012 22:15:
> I spent some days and nights on this already and my google-fu is running out.
> I'd like to implement the equivalent of this Python code in a C-extension:
> 
> >>> class A(object):
> ...     pass
> >>> class B(A):
> ...     pass
> >>> A
> <class '__main__.A'>
> >>> B
> <class '__main__.B'>
> >>> B.__bases__
> (<class '__main__.A'>,)
> 
> However, loading my C-code (quoted below) I get:
> 
> >>> import ca
> >>> ca
> 
> >>> ca.ca
> 
> 
> Here I'd expect "" instead?!

You already got the response (and found for yourself) that this is normal.
CPython makes a distinction between classes defined the Python way and
extension types, the latter of which you define in your code.

As general advice: if your primary interest is in implementing some kind
of functionality, rather than in learning the bits and pieces of
CPython's C-API, you may want to take a look at Cython. It makes writing
efficient C extension modules fast and easy, especially when it comes to
class hierarchies and similar things.

Stefan



Re: Extension module question

2012-01-15 Thread Stefan Behnel
Evan Driscoll, 15.01.2012 08:37:
> As I hinted at in an earlier email, I'm working on a module which will
> allow calling readdir() (and FindFirstFile on Windows, hopefully pretty
> uniformly) from Python. The responses I got convinced me that it was a
> good idea to write a C-to-Python bridge as an extension module.

An even better idea is to write an extension module in Cython. Much faster
and simpler to learn and do.


> What I'm not sure about is how to store pointers to *C* stuff between
> calls. In particular, opendir() returns a DIR* which you then need to
> pass to calls to readdir() in the future (and closedir()).
> 
> So I've got this:
> 
> static PyObject*
> py_opendir(PyObject* self, PyObject* args)
> {
>     const char* dirname = 0;
>     if (!PyArg_ParseTuple(args, "s", &dirname)) {
>         return NULL;
>     }
>     // Eventually want to Py_BEGIN_ALLOW_THREADS here

Cython allows you to do that by simply putting it into a "with nogil" block.


>     DIR* directory = opendir(dirname);
> 
>     PyObject out = PyBuildValue( ???, directory );
>     return out;
> }
> 
> but I don't know what to build. (I might want to wrap it in a custom
> handle class or something, but I still need to know how to build the
> value I eventually store in an attribute of that class.)

I suggest you write an extension type and store the pointer in it directly.

Untested Cython code example:

import sys

filesystem_encoding = sys.getfilesystemencoding()

cdef class Directory:
    cdef DIR* c_directory

    def __cinit__(self, directory):
        if isinstance(directory, unicode):
            directory = directory.encode(filesystem_encoding)
        cdef char* c_dirname = directory  # raises TypeError on failure
        with nogil:
            self.c_directory = opendir(c_dirname)

    def __iter__(self):
        cdef char* name
        cdef size_t name_length
        for name in however_you_list_the_content_of(self.c_directory):
            name_length = length_which_you_may_know_of(name)
            yield name[:name_length].decode(filesystem_encoding)

and so on. Note how Cython does all sorts of things automatically for you
here, e.g. type conversions and the corresponding error handling as well as
all those nasty details of the C-level extension type implementation. Also
note that I'm using __cinit__() instead of __init__() for safety. See here:

http://docs.cython.org/src/userguide/special_methods.html#initialisation-methods-cinit-and-init

To implement the same interface for Unices and Windows, I suggest you write
two separate extension modules and hide them in a Python package that does
the appropriate platform specific imports at runtime.
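
A rough sketch of what that package-level dispatch could look like, in
Python 3 syntax. The package name "dirlist" and the module names "_win32"
and "_posix" are made up for illustration; only sys.platform and
importlib.import_module() are real.

```python
import importlib
import sys


def backend_module_name(platform=None):
    """Return the (hypothetical) backend module name for a platform string."""
    if platform is None:
        platform = sys.platform
    # sys.platform is "win32" on Windows, for 32-bit and 64-bit builds alike.
    if platform.startswith("win"):
        return "dirlist._win32"
    return "dirlist._posix"


def load_backend(platform=None):
    """Import and return the platform-specific extension module."""
    return importlib.import_module(backend_module_name(platform))
```

The package's __init__.py would call load_backend() once and re-export the
names it wants to make public, so user code never sees which module it got.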

Stefan



Re: Hash stability

2012-01-15 Thread Heiko Wundram

On 15.01.2012 11:13, Stefan Behnel wrote:

> That's a stupid design. Using a hash function that the application does not
> control to index into persistent storage just screams for getting the code
> broken at some point.


I agree completely with that (I hit the corresponding problem with suds 
while transitioning from 32-bit Python to 64-bit Python, where hashes 
aren't stable either), but as stated in my mail: that wasn't the 
original question. ;-)


--
--- Heiko.


Re: Hash stability

2012-01-15 Thread Bryan
Chris Angelico wrote:
> Suggestion: Create a subclass of dict, the SecureDict or something,
> which could either perturb the hashes or even use a proper
> cryptographic hash function; normal dictionaries can continue to use
> the current algorithm. The description in Objects/dictnotes.txt
> suggests that it's still well worth keeping the current system for
> programmer-controlled dictionaries, and only change user-controlled
> ones (such as POST data etc).

I have to disagree; that's not how the world works, at least not
anymore. Competent, skilled, dedicated programmers have over and over
again failed to appreciate the importance and the difficulty of
maintaining proper function in an adversarial environment. The tactic
of ignoring security issues unless and until they are proven
problematic stands utterly discredited.

> It would then be up to the individual framework and module authors to
> make use of this, but it would not impose any cost on the myriad other
> uses of dictionaries - there's no point adding extra load to every
> name lookup just because of a security issue in an extremely narrow
> situation. It would also mean that code relying on hash(str) stability
> wouldn't be broken.

That seemingly "extremely narrow situation" turns out to be wide as
Montana. Maybe Siberia. Does your program take input? Does it accept a
format that could possibly be downloaded from a malicious site on the
Internet? Does your market include users who occasionally make
mistakes? If not, enjoy your utter irrelevance. If so,
congratulations: you write Internet software.

Varying the hash function is just the first step. Plausible attacks
dynamically infer how to induce degenerate behavior. Replacing the
dictionary hash function with a "proper cryptographic hash function"
is a naive non-solution; all things considered it's somewhat worse
than useless. An old and interesting and relevant exercise is to
implement a dictionary with O(1) insert, look-up, and delete in the
average non-adversarial case; and O(lg n) insert, look-up, and delete
in the worst case.



Re: Hash stability

2012-01-15 Thread Chris Angelico
On Sun, Jan 15, 2012 at 11:03 PM, Bryan wrote:
> Chris Angelico wrote:
>> Suggestion: Create a subclass of dict, the SecureDict or something,
>> ... there's no point adding extra load to every
>> name lookup just because of a security issue in an extremely narrow
>> situation.
>
> That seemingly "extremely narrow situation" turns out to be wide as
> Montana. Maybe Siberia. Does your program take input? Does it accept a
> format that could possibly be downloaded from a malicious site on the
> Internet? Does your market include users who occasionally make
> mistakes? If not, enjoy your utter irrelevance. If so,
> congratulations: you write Internet software.

Yes, but in that "Internet software", there will only be a small
number of dictionaries that an attacker can stuff with keys (GET/POST
data, headers, cookies, etc, and anything derived therefrom); compare
the huge number of dictionaries that exist elsewhere in your Python
program. Adding load to dictionaries will add load to a huge number of
lookups that can never come under attack.

However, since posting that I've read the entire thread on the
python-dev archive. (It is, I might mention, a LOT of text.) A number
of suggestions and arguments are put forth, including a subclassing
notion similar to my postulation, and the same point is raised: that
app/framework developers won't secure their apps. Other options are
also offered (personally, I'm liking the one where an exception is
raised if something collides with too many keys - current suggestion
1000, although it could possibly work well with something that scales
with the dictionary size), and I'm sure that something will be done
that's a lot smarter than one quick idea spun off in response to a
separate query. So, I retract this idea :)
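
To make the "raise if too many keys collide" option concrete, here is a toy
sketch. It is a simple chained hash table written for illustration, not how
CPython's dict works and not the patch discussed on python-dev; the names
CollisionLimitedDict and TooManyCollisions are invented, and the table
never resizes, which a real implementation would have to do.

```python
class TooManyCollisions(Exception):
    """Raised when one bucket would hold more than the allowed number of keys."""


class CollisionLimitedDict:
    """Toy chained hash table that refuses suspiciously long buckets."""

    def __init__(self, n_buckets=64, max_chain=1000):
        self._buckets = [[] for _ in range(n_buckets)]
        self._max_chain = max_chain

    def _bucket(self, key):
        return self._buckets[hash(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite, chain length unchanged
                return
        if len(bucket) >= self._max_chain:
            raise TooManyCollisions(
                "more than %d keys share one bucket" % self._max_chain)
        bucket.append((key, value))

    def __getitem__(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)
```

The point of the design is that an attacker who feeds in thousands of
colliding keys gets an exception after max_chain of them instead of
quadratic running time.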

ChrisA


Re: Hash stability

2012-01-15 Thread Peter Otten
Heiko Wundram wrote:

> On 15.01.2012 11:13, Stefan Behnel wrote:
>> That's a stupid design. Using a hash function that the application does
>> not control to index into persistent storage just screams for getting the
>> code broken at some point.
> 
> I agree completely with that (I hit the corresponding problem with suds
> while transitioning from 32-bit Python to 64-bit Python, where hashes
> aren't stable either), but as stated in my mail: that wasn't the
> original question. ;-)

I'm curious: did you actually get false cache hits or just slower responses?




problem:emulate it in python with mechanize

2012-01-15 Thread contro opinion
You can do it by hand:
1. Open
http://www.flvcd.com/
2. Enter
http://v.163.com/movie/2008/10/O/Q/M7F57SUCS_M7F5R3DOQ.html
3. Click submit
and you get
http://mov.bn.netease.com/movie/2012/1/V/7/S7MKQOBV7.flv

I want to emulate this in Python with mechanize. Here is my code; why
doesn't it give me the right result:
 http://mov.bn.netease.com/movie/2012/1/V/7/S7MKQOBV7.flv



import mechanize
import cookielib
import lxml.html

# Create the browser before configuring it.
br = mechanize.Browser()
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; '
                  'rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
br.set_handle_robots(False)

r = br.open('http://www.flvcd.com/')
for f in br.forms():
    print f
br.select_form(nr=0)
br.form['kw']='http://v.163.com/movie/2008/10/O/Q/M7F57SUCS_M7F5R3DOQ.html'
print  br.submit().read()

why??


Re: problem:emulate it in python with mechanize

2012-01-15 Thread Kev Dwyer
contro opinion wrote:

> you can do it by hand ,
> 1.open
> http://www.flvcd.com/'
> 2.input
> http://v.163.com/movie/2008/10/O/Q/M7F57SUCS_M7F5R3DOQ.html
> 3.click  submit
> you can get
> http://mov.bn.netease.com/movie/2012/1/V/7/S7MKQOBV7.flv
> 
> i want to  emulate it  in python with  mechanize,here is my code ,why i
> can't get  the  right result:
>  http://mov.bn.netease.com/movie/2012/1/V/7/S7MKQOBV7.flv
> 
> 
> 
> import mechanize
> import cookielib
> import lxml.html
>
> # Create the browser before configuring it.
> br = mechanize.Browser()
> br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; '
>                   'rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
> br.set_handle_robots(False)
> 
> r = br.open('http://www.flvcd.com/')
> for f in br.forms():
>     print f
> br.select_form(nr=0)
> br.form['kw']='http://v.163.com/movie/2008/10/O/Q/M7F57SUCS_M7F5R3DOQ.html'
> print  br.submit().read()
> 
> why??

Hello,

I think the page uses javascript to submit the form, so mechanize may not 
work with it directly.

See 
http://stackoverflow.com/questions/3798550/python-mechanize-javascript-submit-button-problem

for a similar problem and suggested workaround.

Cheers,

Kev



Re: PyWarts: time, datetime, and calendar modules

2012-01-15 Thread Lie Ryan

On 01/15/2012 06:23 AM, Rick Johnson wrote:

> So how do we solve this dilemma you ask??? Well, we need to "mark"
> method OR variable names (OR both!) with syntactic markers so there
> will be NO confusion.
>
> Observe:
>    def $method(self):pass
>    self.@instancevariable
>    self.@@classvariable


There is no need for language-level support for Hungarian notation.



Re: why i can get nothing?

2012-01-15 Thread Jason Friedman
> here is my code :
> import urllib
> import lxml.html
> down='http://download.v.163.com/dl/open/00DL0QDR0QDS0QHH.html'
> file=urllib.urlopen(down).read()
> root=lxml.html.document_fromstring(file)
> tnodes = root.xpath("//a/@href[contains(string(),'mp4')]")
> for i,add in enumerate(tnodes):
>     print  i,add
>
> why i can get nothing?

What version of python is this?  Based on the naked "print" I guess
2.x, and I got:

$ /opt/python2/bin/python2.7
Python 2.7.2 (default, Oct 10 2011, 03:43:34)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import lxml.html
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named lxml.html


Re: Hash stability

2012-01-15 Thread Heiko Wundram

On 15.01.2012 13:22, Peter Otten wrote:

> Heiko Wundram wrote:
>
>> I agree completely with that (I hit the corresponding problem with suds
>> while transitioning from 32-bit Python to 64-bit Python, where hashes
>> aren't stable either), but as stated in my mail: that wasn't the
>> original question. ;-)
>
> I'm curious: did you actually get false cache hits or just slower responses?


It broke the application using suds, not due to false cache hits, but 
due to not getting a cache hit anymore at all.


Long story: to interpret WSDL-files, suds has to get all related DTDs 
for the WSDL file, and Microsoft (as I wrote I was querying Exchange Web 
Services) insists on using http://www.w3.org/2001/xml.dtd for the XML 
spec path. This path is sometimes functional as a GET URL, but mostly 
not (due to overload of the W3-servers), so basically I worked around 
the problem by creating an appropriate cache entry with the appropriate 
name based on hash() using a local copy of xml.dtd I had around. This 
took place on a development machine (32-bit), and when migrating the 
application to a production machine (64-bit), the cache file wasn't used 
anymore (due to the hash not being stable).


It's not that this came as a surprise (I quickly knew the "workaround" 
by simply rehashing on the target machine and moving the cache file 
appropriately), and I already said that this is mostly just a plain bad 
design decision on the part of the suds developers, but it's one of 
those cases where a non-stable hash() can break applications, and except 
if you know the internal workings of suds, this will seriously bite the 
developer.


I don't know the prevalence of suds, but I guess there's more people 
than me using it to query SOAP-services - all of those will be affected 
if the hash() output is changed. Additionally, if hash() isn't stable 
between runs (the randomized hash() solution which is preferred, and 
would also be my preference), suds caching becomes completely useless. 
And for the results, see above.


--
--- Heiko.


Re: Hash stability

2012-01-15 Thread Chris Angelico
On Mon, Jan 16, 2012 at 3:07 AM, Heiko Wundram  wrote:
> I don't know the prevalence of suds, but I guess there's more people than me
> using it to query SOAP-services - all of those will be affected if the
> hash() output is changed. Additionally, if hash() isn't stable between runs
> (the randomized hash() solution which is preferred, and would also be my
> preference), suds caching becomes completely useless. And for the results,
> see above.

Or you could just monkey-patch it so that 'hash' points to an old
hashing function. If the current hash() is kept in builtins as (say)
hash_320() or hash_272() or something, then anyone who wants the old
version of the hash can still get it.

Of course, it's still dodgy to depend on the stability of something
that isn't proclaimed stable, and would be far better to use some
other hashing algorithm (MD5 or SHA for uberreliability).

ChrisA


Re: Hash stability

2012-01-15 Thread Heiko Wundram

On 15.01.2012 17:13, Chris Angelico wrote:

> On Mon, Jan 16, 2012 at 3:07 AM, Heiko Wundram wrote:
>
>> I don't know the prevalence of suds, but I guess there's more people than me
>> using it to query SOAP-services - all of those will be affected if the
>> hash() output is changed. Additionally, if hash() isn't stable between runs
>> (the randomized hash() solution which is preferred, and would also be my
>> preference), suds caching becomes completely useless. And for the results,
>> see above.
>
> Or you could just monkey-patch it so that 'hash' points to an old
> hashing function. If the current hash() is kept in builtins as (say)
> hash_320() or hash_272() or something, then anyone who wants the old
> version of the hash can still get it.


Or even easier: overwrite the default caching module (called FileCache) 
with something that implements "sensible" caching, for example by using 
the complete URL (with special characters replaced) of the DTD as a 
cache index, instead of hash()ing it. ;-)
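
A minimal sketch of that idea in Python 3 syntax. The function name
cache_filename and the ".cache" suffix are made up; this is not suds's
FileCache interface, just the naming scheme on its own.

```python
from urllib.parse import quote


def cache_filename(url, suffix=".cache"):
    """Map a URL to a file name that is the same on every machine and run."""
    # safe="" makes quote() percent-encode "/" and ":" as well, so the
    # result contains no path separators.
    return quote(url, safe="") + suffix
```

The file name stays readable, which helps when you have to drop a cache
entry in by hand as I did. Very long URLs can exceed the file system's name
length limit, so this only suits short, well-known URLs such as DTD paths.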


There's "workarounds", I know - and I may be implementing one of them if 
the time comes. Again, my mail was only to point at the fact that there 
are (serious) projects out there relying on the "stableness" of hash(), 
and that these will get bitten when hash() is replaced. Which is not a 
bad thing if you ask me. ;-)


--
--- Heiko.


Re: Hash stability

2012-01-15 Thread Stefan Behnel
Chris Angelico, 15.01.2012 17:13:
> Of course, it's still dodgy to depend on the stability of something
> that isn't proclaimed stable, and would be far better to use some
> other hashing algorithm (MD5 or SHA for uberreliability).

I've seen things like MD5 or SHA* being used quite commonly for file caches
(or file storage in general, e.g. for related files referenced in a text
document). Given that these algorithms are right there in the stdlib, I
find them a rather obvious choice.

However, note that they may also be subject to complexity attacks at some
point, although likely requiring substantially more input data. In the
specific case of a cache, an attacker may only need an arbitrary set of
colliding hashes. Those can be calculated in advance for a given hash
function. For example, Wikipedia currently presents MD5 with a collision
complexity of ~2^20, which sounds a bit weak. Something like SHA256 should
be substantially more robust.

https://en.wikipedia.org/wiki/Cryptographic_hash_function#Cryptographic_hash_algorithms
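
For illustration, a cache key built this way could look like the following
sketch (Python 3 syntax; the helper name stable_key is made up). Unlike
hash(), the digest is defined by the algorithm, so it does not change
between 32-bit and 64-bit builds, Python versions or runs.

```python
import hashlib


def stable_key(text):
    """Return a hex digest that does not depend on platform or Python version."""
    # Encode explicitly: the digest is defined over bytes, not over str.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

The result is 64 hex characters, which is safe to use as a file name.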

Stefan



Problem while doing a cat on a tabbed file with pexpect

2012-01-15 Thread Saqib Ali

I am using Solaris 10, python 2.6.2, pexpect 2.4

I create a file called me.txt which contains the letters "A", "B", "C"
on the same line separated by tabs.

My shell prompt is "% "

I then do the following in the python shell:


>>> import pexpect
>>> x = pexpect.spawn("/bin/tcsh")
>>> x.sendline("cat me.txt")
11
>>> x.expect([pexpect.TIMEOUT, "% "])
1
>>> x.before
'cat me.txt\r\r\nA   B   C\r\n'
>>> x.before.split("\t")
['cat me.txt\r\r\nA   B   C\r\n']



Now, clearly the file contains tabs. But when I cat it through expect,
and collect cat's output, those tabs have been converted to spaces.
But I need the tabs!

Can anyone explain this phenomenon or suggest how I can fix it?


Re: why i can get nothing?

2012-01-15 Thread Chris Rebert
On Sun, Jan 15, 2012 at 7:40 AM, Jason Friedman  wrote:
>> here is my code :
>> import urllib
>> import lxml.html

> What version of python is this?  Based on the naked "print" I guess
> 2.x, and I got:

> >>> import lxml.html
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> ImportError: No module named lxml.html

lxml is a fairly popular third-party XML package for Python:
http://lxml.de/

Regards,
Chris


Re: why i can get nothing?

2012-01-15 Thread Roy Smith
Chris Rebert wrote:

> On Sun, Jan 15, 2012 at 7:40 AM, Jason Friedman  wrote:
> >> here is my code :
> >> import urllib
> >> import lxml.html
> 
> > What version of python is this?  Based on the naked "print" I guess
> > 2.x, and I got:
> 
> > >>> import lxml.html
> > Traceback (most recent call last):
> >  File "<stdin>", line 1, in <module>
> > ImportError: No module named lxml.html
> 
> lxml is a fairly popular third-party XML package for Python:
> http://lxml.de/

Fairly popular and insanely awesome!


Re: why i can get nothing?

2012-01-15 Thread Robert Helmer
On Sat, Jan 14, 2012 at 7:54 PM, contro opinion  wrote:
> here is my code :
> import urllib
> import lxml.html
> down='http://download.v.163.com/dl/open/00DL0QDR0QDS0QHH.html'
> file=urllib.urlopen(down).read()
> root=lxml.html.document_fromstring(file)
> tnodes = root.xpath("//a/@href[contains(string(),'mp4')]")
> for i,add in enumerate(tnodes):
>     print  i,add
>
> why i can get nothing?


The problem is the document. The links you are trying to match on are
inside the script tags in the document, here's a simplified version:

"""

  obj="";

"""

So the anchor elements are not part of the DOM as far as lxml is
concerned. lxml does not know how to parse JavaScript (and even if it
did, it would have to execute the JS, and the JS would have to modify the
DOM, before you could get at the links via XPath).

You could have lxml return just the script nodes that contain the text
you care about:
tnodes = root.xpath("//script[contains(.,'mp4')]")

Then you will need a different tool for the rest of this, regex is not
perfect but should be good enough. Probably not worth the effort to
use a real javascript parser if you're just trying to scrape the mp4
links out of this, but it's an option.
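
A rough sketch of the regex step, in Python 3 syntax. The pattern and the
helper name find_mp4_links are my own guesses at what is "good enough"; I
have not checked them against every URL form on that page.

```python
import re

# Matches http(s) URLs ending in ".mp4", stopping at quotes, whitespace
# or angle brackets.
MP4_URL = re.compile(r"""https?://[^\s'"<>]+?\.mp4""")


def find_mp4_links(script_text):
    """Return all .mp4 URLs found in a blob of JavaScript source."""
    return MP4_URL.findall(script_text)
```

You would call it on the text of each script node returned by the xpath
query above. Note that it cuts a URL off at the first ".mp4", so any query
string after the extension is dropped.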


Re: Problem while doing a cat on a tabbed file with pexpect

2012-01-15 Thread Steven D'Aprano
On Sun, 15 Jan 2012 09:51:44 -0800, Saqib Ali wrote:

> I am using Solaris 10, python 2.6.2, pexpect 2.4
> 
> I create a file called me.txt which contains the letters "A", "B", "C"
> on the same line separated by tabs.
[...]
> Now, clearly the file contains tabs.

That is not clear at all. How do you know it contains tabs? How was the 
file created in the first place?

Try this:

text = open('me.txt', 'r').read()
print '\t' in text

My guess is that it will print False and that the file does not contain 
tabs. Check your editor used to create the file.



-- 
Steven


Re: Problem while doing a cat on a tabbed file with pexpect

2012-01-15 Thread Cameron Simpson
On 15Jan2012 23:04, Steven D'Aprano wrote:
| On Sun, 15 Jan 2012 09:51:44 -0800, Saqib Ali wrote:
| > I am using Solaris 10, python 2.6.2, pexpect 2.4
| > 
| > I create a file called me.txt which contains the letters "A", "B", "C"
| > on the same line separated by tabs.
| [...]
| > Now, clearly the file contains tabs.
| 
| That is not clear at all. How do you know it contains tabs? How was the 
| file created in the first place?
| 
| Try this:
| 
| text = open('me.txt', 'r').read()
| print '\t' in text
| 
| My guess is that it will print False and that the file does not contain 
| tabs. Check your editor used to create the file.

I was going to post an alternative theory but on more thought I think
Steven is right here.

What does:

  od -c me.txt

show you? TABs or multiple spaces?

What does:

  ls -ld me.txt

tell you about the file size? Is it 6 bytes long (three letters, two
TABs, one newline)?

Steven hasn't been explicit about it, but some editors will write spaces when
you type a TAB. I have configured mine to do so - it makes indentation more
reliable for others. If I really need a TAB character I have a special
finger contortion to get one, but the actual need is rare.

So first check that the file really does contain TABs.

Cheers,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

Yes Officer, yes Officer, I will Officer. Thank you.


Re: Problem while doing a cat on a tabbed file with pexpect

2012-01-15 Thread Saqib Ali

Very good question. Let me explain why I'm not opening me.txt directly
in python with open.

The example I have posted is simplified for illustrative purpose. In
reality, I'm not doing pexpect.spawn("/bin/tcsh"). I'm doing
pexpect.spawn("ssh myuser@ipaddress"). Since I'm operating on a remote
system, I can't simply open the file in my own python context.


On Jan 15, 2:24 pm, Dennis Lee Bieber  wrote:
> On Sun, 15 Jan 2012 09:51:44 -0800 (PST), Saqib Ali wrote:
> >Now, clearly the file contains tabs. But when I cat it through expect,
> >and collect cat's output, those tabs have been converted to spaces.
> >But I need the tabs!
>
> >Can anyone explain this phenomenon or suggest how I can fix it?
>
>         My question is:
>
>         WHY are you doing this?
>
>         Based upon the problem description, as given, the solution would
> seem to be to just open the file IN Python -- whether you read the lines
> and use split() by hand, or pass the open file to the csv module for
> reading/parsing is up to you.
>
> -=-=-=-=-=-=-
> import csv
> import os
>
> TESTFILE = "Test.tsv"
>
> #create data file
> fout = open(TESTFILE, "w")
> for ln in [  "abc",
>             "defg",
>             "hijA"  ]:
>     fout.write("\t".join(list(ln)) + "\n")
> fout.close()
>
> #process tab-separated data
> fin = open(TESTFILE, "rb")
> rdr = csv.reader(fin, dialect="excel-tab")
> for rw in rdr:
>     print rw
>
> fin.close()
> del rdr
> os.remove(TESTFILE)
> -=-=-=-=-=-=-
> ['a', 'b', 'c']
> ['d', 'e', 'f', 'g']
> ['h', 'i', 'j', 'A']
> --
>         Wulfraed                 Dennis Lee Bieber         AF6VN
>         wlfr...@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



Re: Problem while doing a cat on a tabbed file with pexpect

2012-01-15 Thread Saqib Ali

The file me.txt does indeed contain tabs. I created it with vi.

>>> text = open("me.txt", "r").read()
>>> print "\t" in text
True


% od -c me.txt
000   A  \t   B  \t   C  \n
006


% ls -al me.txt
-rw-r--r--   1 myUser myGroup   6 Jan 15 12:42 me.txt



On Jan 15, 6:40 pm, Cameron Simpson  wrote:
> On 15Jan2012 23:04, Steven D'Aprano wrote:
> | On Sun, 15 Jan 2012 09:51:44 -0800, Saqib Ali wrote:
> | > I am using Solaris 10, python 2.6.2, pexpect 2.4
> | >
> | > I create a file called me.txt which contains the letters "A", "B", "C"
> | > on the same line separated by tabs.
> | [...]
> | > Now, clearly the file contains tabs.
> |
> | That is not clear at all. How do you know it contains tabs? How was the
> | file created in the first place?
> |
> | Try this:
> |
> | text = open('me.txt', 'r').read()
> | print '\t' in text
> |
> | My guess is that it will print False and that the file does not contain
> | tabs. Check your editor used to create the file.
>
> I was going to post an alternative theory but on more thought I think
> Steven is right here.
>
> What does:
>
>   od -c me.txt
>
> show you? TABs or multiple spaces?
>
> What does:
>
>   ls -ld me.txt
>
> tell you about the file size? Is it 6 bytes long (three letters, two
> TABs, one newline)?
>
> Steven hasn't been explicit about it, but some editors will write spaces when
> you type a TAB. I have configured mine to do so - it makes indentation more
> reliable for others. If I really need a TAB character I have a special
> finger contortion to get one, but the actual need is rare.
>
> So first check that the file really does contain TABs.
>
> Cheers,
> --
> Cameron Simpson  DoD#743
> http://www.cskk.ezoshosting.com/cs/
>
> Yes Officer, yes Officer, I will Officer. Thank you.



Re: Two questions about logging

2012-01-15 Thread Vinay Sajip
On Jan 12, 2:19 am, Matthew Pounsett  wrote:

> First, I'd like to be able to permit users to do more typical log
> rotation, based on their OS's log rotation handler, rather than
> rotating logs from inside an application.  This is usually handled by
> signalling an application with a HUP, whereupon it closes and then re-
> opens all of its logs, getting new file handles (and new inodes).  I
> don't see anything in the Handler methods (or anywhere else) that
> would let me tell a logger object to refresh the file handles on a log
> file.  Is there some standard way to deal with this?

There's the WatchedFileHandler, which checks to see if a file's device
or inode has changed (which happens when the external rotator does
rotation) and if so, closes and reopens the file. (This handler is for
Unix/Linux/OS X - it doesn't work on Windows).

See

http://docs.python.org/library/logging.handlers.html#watchedfilehandler

for more information.
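
A minimal sketch of wiring it up (Python 3 syntax). The helper name
make_logger, the logger name and the format string are placeholders; the
handler class itself is the one from logging.handlers.

```python
import logging
import logging.handlers


def make_logger(path, name="app"):
    """Return a logger whose file handler survives external rotation of path."""
    handler = logging.handlers.WatchedFileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

With this handler there is no need for a HUP signal: the check happens each
time a record is emitted, so the first message logged after the rotator
moves the file goes to a freshly opened file at the original path.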

Regards,

Vinay Sajip


Re: Problem while doing a cat on a tabbed file with pexpect

2012-01-15 Thread Cameron Simpson
On 15Jan2012 16:14, Saqib Ali  wrote:
| The file me.txt does indeed contain tabs. I created it with vi.
| 
| >>> text = open("me.txt", "r").read()
| >>> print "\t" in text
| True
| 
| % od -c me.txt
| 000   A  \t   B  \t   C  \n
| 006
| 
| % ls -al me.txt
| -rw-r--r--   1 myUser myGroup   6 Jan 15 12:42 me.txt

Ok, your file does indeed contain TABs.

Therefore something is turning the TABs into spaces. Pexpect should be
opening a pty and reading from that, and I do not expect that to expand
TABs. So:

  1: Using subprocess.Popen, invoke "cat me.txt" and check the result
 for TABs.

  2: Using pexpect, run "cat me.txt" instead of "/bin/tcsh" (eliminates a
 layer of complexity; I don't actually expect changed behaviour) and
 check for TABs.

On your Solaris system, read "man termios". Does it have an "expand
TABs" mode switch? This is about the only thing I can think of that
would produce your result: the pty terminal discipline is expanding
TABs for you (unwanted!), so cat is writing TABs to the terminal and the
terminal is passing expanded spaces to pexpect. Certainly terminal line
disciplines do rewrite stuff, most obviously "\n" into "\r\n", but a
quick glance through termios on a Linux box does not show a tab
expansion mode; I do not have access to a Solaris box at present.
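
If such a switch exists, it can be inspected and cleared from Python with
the termios module. The sketch below (Python 3 syntax) assumes the platform
exposes the TABDLY output-flag mask, which Python's termios does on Linux
as far as I know; whether Solaris behaves the same way is an untested
assumption, and the helper names are made up.

```python
import termios


def without_tab_expansion(oflag):
    """Return output flags with the tab-delay/expansion bits cleared (TAB0)."""
    # TABDLY masks the tab handling bits; on Linux the value TAB3 (alias
    # XTABS) within that mask means "expand tabs to spaces".
    return oflag & ~termios.TABDLY


def disable_tab_expansion(fd):
    """Turn off tab expansion on the terminal referred to by fd."""
    attrs = termios.tcgetattr(fd)
    # attrs is [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    attrs[1] = without_tab_expansion(attrs[1])
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
```

This only helps for a pty on the local machine. When the command runs over
ssh, the expansion would happen in the remote pty, and the setting would
have to be changed there instead.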

Cheers,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

Maintainer's Motto: If we can't fix it, it ain't broke.


Re: Problem while doing a cat on a tabbed file with pexpect

2012-01-15 Thread Steven D'Aprano
On Sun, 15 Jan 2012 09:51:44 -0800, Saqib Ali wrote:

> I am using Solaris 10, python 2.6.2, pexpect 2.4

Are you sure about that? As far as I can see, pexpect's current version 
is 2.3 not 2.4.


> I create a file called me.txt which contains the letters "A", "B", "C"
> on the same line separated by tabs.
> 
> My shell prompt is "% "
> 
> I then do the following in the python shell:
> 
> 
> >>> import pexpect
> >>> x = pexpect.spawn("/bin/tcsh")

Can you try another shell, just in case tcsh is converting the tabs to 
spaces?

> >>> x.sendline("cat me.txt")
> 11

What happens if you do this from the shell directly, without pexpect? It 
is unlikely, but perhaps the problem lies with cat rather than pexpect. 
You should eliminate this possibility.


> >>> x.expect([pexpect.TIMEOUT, "% "])
> 1
> >>> x.before
> 'cat me.txt\r\r\nA   B   C\r\n'


Unfortunately I can't replicate the same behaviour, however my setup is 
different. I'm using pexpect 2.3 on Linux, and I tried it using bash and 
sh but not tcsh. In all my tests, the tabs were returned as expected.

(However, the x.expect call returned 0 instead of 1, even with the shell 
prompt set correctly.)



-- 
Steven


Re: PyWarts: time, datetime, and calendar modules

2012-01-15 Thread Michael Torrie
On 01/14/2012 10:27 PM, Rick Johnson wrote:
> Face it, Guido has broken Python's cherry. She is no longer pure.
> You're acting like some over- protective father. WAKE UP! Python is a
> promiscuous little whore and she's on girls gone wild (Volume 4000)
> shaking her little money maker. We should at least profit from the
> immorality.

Hmm, down goes all the credibility you had since your rebirth on this
list.  Back to the old ways.

That said, Rick I think it's time for you to fork python.  Your
brilliance and foresight put Guido's to shame.  No one else has the guts
to stand up and say what needs to be said.  If you can't fix python, no
one can.


Re: Problem while doing a cat on a tabbed file with pexpect

2012-01-15 Thread Michael Torrie
On 01/15/2012 05:11 PM, Saqib Ali wrote:
> 
> Very good question. Let me explain why I'm not opening me.txt directly
> in python with open.
> 
> The example I have posted is simplified for illustrative purpose. In
> reality, I'm not doing pexpect.spawn("/bin/tcsh"). I'm doing
> pexpect.spawn("ssh myuser@ipaddress"). Since I'm operating on a remote
> system, I can't simply open the file in my own python context.

There is a very nice python module called "paramiko" that you could use
to, from python, programmatically ssh to the remote system and cat the
file (bypassing any shells) or use sftp to access it.  Either way you
don't need to use pexpect with it.
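
A rough sketch of the sftp route, to show the shape of it. The function
name and its parameters are hypothetical, and the host, user and path in
the usage comment are placeholders. It assumes paramiko is installed and
that the remote host's key is already in your known_hosts file.

```python
def read_remote_file(host, user, path, password=None):
    """Fetch the raw bytes of a remote file over SFTP (no shell, no pty)."""
    import paramiko  # third-party; imported here so the sketch loads without it

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    # Refuse unknown hosts rather than silently trusting them.
    client.set_missing_host_key_policy(paramiko.RejectPolicy())
    client.connect(host, username=user, password=password)
    try:
        sftp = client.open_sftp()
        try:
            with sftp.open(path, "rb") as remote:
                return remote.read()
        finally:
            sftp.close()
    finally:
        client.close()


# Example call (placeholder values, not from the thread):
# data = read_remote_file("ipaddress", "myuser", "me.txt")
# print(repr(data))   # TABs, if present, show up as \t
```

Since the bytes never pass through a terminal, nothing on the way can
expand the TABs.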


THAT WHAT NEED EXPECT FROM OPERATORS OF PYTHON. (e-mail get by the list moderator)

2012-01-15 Thread _

# THAT WHAT NEED EXPECT FROM OPERATORS OF PYTHON:
Worddr = "56" # CREATE A STRING: "56"
Word = ["12"] # CREATE A LIST WITH ONE SIGNED: "12"
Word = Word.append("34") # APPEND TO LIST ONE MORE SIGNED: "34"
Word = Word + "34" # MUST APPEND TO LIST ONE MORE SIGNED: "34"
Wordpr = Word[1] # MUST SIGNED TO THE Wordpr THE SECOND SIGNED OF THE Word
                 # LIST: "34", AND IT'S ALL PARAMETRS
Wordpr = Wordpr + Worddr[1] # MUST ADD TO THE STRING Wordpr: "34", A SECOND
                            # SIGNED OF STRING Worddr: "6"
Word[1] = Word[1] + Worddr[1] # MUST ADD TO THE SECOND STRING LIST Word:
                              # "346", A SECOND SIGNED OF STRING Worddr: "6"


Re: understanding a program project

2012-01-15 Thread alex23
On Jan 14, 6:29 am, Tracubik  wrote:
> I remember at school time there was some schema or something to create to
> display the interaction of different functions / modules
>
> My idea was to create a model with all the methods and arrows to link
> they...

Do you mean call graphs? http://pycallgraph.slowchop.com/


Re: THAT WHAT NEED EXPECT FROM OPERATORS OF PYTHON. (e-mail get by the list moderator)

2012-01-15 Thread alex23
On Jan 16, 4:03 pm, "_"  wrote:
> # THAT WHAT NEED EXPECT FROM OPERATORS OF PYTHON:
> Word = Word.append("34") # APPEND TO LIST ONE MORE SIGNED: "34"

list.append is an in-place operation; it doesn't return a copy of the
list, so here you're setting Word to None.
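
A short illustration of the difference, using lowercase stand-ins for the
names in the original post (the variable names are only for this sketch):

```python
word = ["12"]

# append() mutates the list in place and returns None.
result = word.append("34")
print(result)   # None
print(word)     # ['12', '34']

# "+" builds a new list, but both operands must be lists.
longer = word + ["56"]
print(longer)   # ['12', '34', '56']

# list + str is an error, not an implicit append.
try:
    word + "56"
except TypeError as exc:
    print("TypeError:", exc)

# Strings are concatenated with "+", so this does what the post expects.
word[1] = word[1] + "56"[1]
print(word)     # ['12', '346']
```

So the fix for the quoted line is simply to call Word.append("34") on its
own, without assigning the result back to Word.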


Re: THAT WHAT NEED EXPECT FROM OPERATORS OF PYTHON. (e-mail get by the list moderator)

2012-01-15 Thread Steven D'Aprano
On Mon, 16 Jan 2012 09:03:54 +0300, _ wrote:

> # THAT WHAT NEED EXPECT FROM OPERATORS OF PYTHON: Worddr = "56" # CREATE
> A STRING: "56" Word = ["12"] # CREATE A LIST WITH ONE SIGNED: "12" Word
> = Word.append("34") 
...


Do you have a question, or are you just dumping a lot of noise in one 
post?




-- 
Steven