RE: xml-filter with XMLFilterBase() and XMLGenerator() shuffles attributes

2007-12-20 Thread Brian Smith
> > I want prevent it from shuffling attributes, i.e. preserve original 
> > file's attribute order. Is there any ContentHandler.features* 
> > responsible for that?
> 
> I suspect not.  attrs is a dictionary which does not maintain 
> order, and XML attributes are unordered to begin with.  Is 
> there any reason other than aesthetics that you want the 
> order preserved?  It shouldn't matter to any upstream 
> consumer of the filtered XML.

I had the same requirements. I also had the requirement to preserve
namespace prefixes. Luckily, for my application I was able to use the
XML_STRING property to handle my requirements. Otherwise, if you drop
down to using PyExpat (not SAX) then you can do what you want. If you
want to keep using SAX, then you need to use a non-default Python SAX
implementation or use one of the Java SAX parsers that have this option.
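
The PyExpat route can be sketched roughly as follows. This is a minimal
Python 3 illustration, not the code from the application mentioned above.
It relies on the expat parser's ordered_attributes setting, which makes the
start-element handler receive attributes as a flat list in document order
instead of a dictionary. The function name attribute_order is made up for
the example.

```python
import xml.parsers.expat


def attribute_order(xml_text):
    """Return [(element_name, [(attr_name, attr_value), ...]), ...]
    with attributes in the order they appear in the source text."""
    parser = xml.parsers.expat.ParserCreate()
    # With ordered_attributes set, attrs arrives as a flat list:
    # [name0, value0, name1, value1, ...] in document order.
    parser.ordered_attributes = True
    seen = []

    def start_element(name, attrs):
        pairs = list(zip(attrs[::2], attrs[1::2]))
        seen.append((name, pairs))

    parser.StartElementHandler = start_element
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return seen
```

A filter built this way has to write the output itself, since it bypasses
the SAX ContentHandler interface entirely.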


BTW, I have never been able to get XMLGenerator to work; it seems really
buggy regarding namespaces. I had to write my own version of it.

- Brian

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: "env" parameter to "popen" won't accept Unicode on Windows -minor Unicode bug

2008-01-15 Thread Brian Smith
Diez B. Roggisch wrote:
> Sure thing, python will just magically convert unicode to the 
> encoding the program YOU invoke will expect. Right after we 
> introduced the
> 
> solve_my_problem()
> 
> built-in-function. Any other wishes?

There's no reason to be rude.

Anyway, at least on Windows it makes perfect sense for people to expect
Unicode to be handled automatically. popen() knows that it is running on
Windows, and it knows what encoding Windows needs for its environment
(it's either UCS-2 or UTF-16 for most Windows APIs). At least when it
receives a unicode string, it has enough information to apply the
conversion automatically, and doing so saves the caller from having to
figure out what exact encoding is to be used.

- Brian



RE: Some Berkeley DB questions (being maintained? queries?)

2008-01-17 Thread Brian Smith
[EMAIL PROTECTED] wrote:
> 1. Now that Berkeley DB is part of Oracle, is it still being
> maintained?   Is it free?

Berkeley DB is owned by Oracle, but it is separate from the Oracle RDBMS
product. Yes, it is free. 

> 2. Are there good python libraries for bdb available, that 
> are being maintained?

I would like to know the answer to this question too--if you have used
the pybsddb/bsddb.db module, please share your experience.

> 3. Is it possible to query a berkeley db database?  Just 
> simple queries like: find me all items where key "name" = "John"

That is basically the only kind of query that a Berkeley DB database can
do: key [<|=|>] value.

> 4. What are good, stable alternatives?

That depends solely on your requirements. Berkeley DB is actually one of
the most complicated persistence solutions. It is more complex than
SQLite, and WAY more complex than gdbm, for example. If you don't need
all its functionality, especially its multi-user capabilities, then I
recommend using something simpler. However, if you DO need its
multi-user capabilities or its advanced features like secondary indexes,
then it is better to use Berkeley DB than to re-invent it.

- Brian



RE: HTTP POST uploading large files

2008-01-20 Thread Brian Smith
Wolfgang Draxinger wrote:
> The problem is, that videos, by nature are rather big files, 
> however urllib2 wants its Request objects being prepared 
> beforehand, which would mean to first load the whole file to memory.

Try using mmap. Here is some untested code:

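from mmap import mmap, ACCESS_READ
from urllib2 import Request

# Length 0 maps the whole file; len(file) is not valid for file objects.
data = mmap(file.fileno(), 0, access=ACCESS_READ)
try:
    # Pass the map itself as the request body; calling read() on it
    # would copy the whole file into memory.
    request = Request(url, data, headers)
    ...
finally:
    data.close()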


- Brian



RE: [XML-SIG] SAX characters() output on multiple lines for non-ascii

2008-02-02 Thread Brian Smith
>   def characters(self, chars):
> 
>   newchars=[]
>   newchars.append(chars.encode('ISO-8859-1'))

The SAX parser may call characters() more than once for the same text block. For 
example, for the input 123, characters() could be called once:
    handler.characters("123")
or twice:
    handler.characters("12")
    handler.characters("3")
or:
    handler.characters("1")
    handler.characters("23")
or three times:
    handler.characters("1")
    handler.characters("2")
    handler.characters("3")
If you want the whole text block, then you need to do something like this:

in __init__:
    self.newchars = []

in startElement:
    self.newchars = []

in characters:
    self.newchars.append(chars)

in endElement:
    if self.newchars:
        combined = "".join(self.newchars).encode('ISO-8859-1')
        print "Stream read is '%s'" % combined

I recommend using ElementTree instead.
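
As a rough sketch of what the ElementTree version might look like (written
for Python 3; the function name element_texts and the sample document in
the usage note are made up for illustration): ElementTree hands over each
element's leading text as one string, so there are no chunks to join.

```python
import xml.etree.ElementTree as ET


def element_texts(xml_text):
    """Return (tag, text) pairs for every element with non-blank text."""
    root = ET.fromstring(xml_text)
    return [
        (elem.tag, elem.text)
        for elem in root.iter()
        # elem.text is None for elements with no leading text
        if elem.text and elem.text.strip()
    ]
```

For example, element_texts("<doc><name>abc</name></doc>") yields a single
pair for the name element. Encoding to ISO-8859-1 can then be applied to
the returned strings when writing them out.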

- Brian



FW: apache/mod_wsgi daemon mode

2008-02-03 Thread Brian Smith
Also, mod_wsgi has its own mailing list:
http://groups.google.com/group/modwsgi

Scott SA wrote:
> I am trying to configure mod_wsgi to run in daemon mode with Apache. I
> can easily get it to run 'normally' under Apache but I obtain 
> permission errors _or_ process-failures in daemon mode. Specifically:
> 
>  ... (13)Permission denied: mod_wsgi (pid=26962): Unable to connect
>  to WSGI daemon process '' on '/etc/httpd/logs/wsgi.26957.0.1.sock'
>  after multiple attempts.

> The host is Apache 2.2n under CentOS 5.1 i386 running Python 2.4

Try again after "sudo /usr/sbin/setenforce 0". If it works with SELinux
disabled then you will have to do a bit more work to get it working with
SELinux enabled. I suggest creating a directory that is labeled with the
Apache read/write type, and setting the WSGI socket prefix such that the
domain sockets will get put in that directory. If that doesn't solve the
problem then use the procedures in the SELinux documentation to create a
security policy. And then, please share it with me. :)

Regards,
Brian



RE: Is there a way to "link" a python program from several files?

2008-02-16 Thread Brian Smith
Diez B. Roggisch wrote:
> Edward A. Falk schrieb:
> > IOW, is there a "linker" for python?  I've written a 
> > program comprised of about five .py files. I'd like to
> > find a way to combine them into a single executable.
> > Obviously, I could hand-edit them into a single 
> > .py file, but I'm looking for a way to keep them as 
> > separate files for development but distribute the
> > result as a single file.

> Depending on the OS, there are several options. Ranging from 
> distributing an .egg (setuptools) over py2exe for windows to 
> py2app on OSX - and some more, e.g. cx_freeze.

I would be interested in a program that can combine multiple modules
into a single module, which removes all the inter-package imports and
fixes other inter-module references, like Haskell All-in-One does for
Haskell: http://www.cs.utah.edu/~hal/HAllInOne/index.html

- Brian



RE: Is there a way to "link" a python program from several files?

2008-02-16 Thread Brian Smith
Diez B. Roggisch wrote:
> Brian Smith wrote:
> > I would be interested in a program that can combine 
> > multiple modules into a single module, which removes
> > all the inter-package imports and fixes other
> > inter-module references, like Haskell 
> > All-in-One does for Haskell:
> > http://www.cs.utah.edu/~hal/HAllInOne/index.html
> 
> won't happen for python. python relies heavily on 
> modules/packages being namespaces.

So does Haskell. Haskell All-In-One handles that by renaming every
top-level artifact.

> Why would you want such a beast anyway? If it's about 
> single-file-distribution, that is solved by some of the above 
> mentioned tools - or even not desired anyway (say OS X bundles)

I want to package a complex WSGI application into a single CGI script,
so that users can copy the script into some CGI-enabled directory and
have it work without any further configuration, and so that it runs
faster (especially when the file is on NFS). If it is possible to run an
egg as a CGI (without modifying the web server configuration file), then
that would work as well.

- Brian



RE: Large file support >2/4GB ?

2008-02-25 Thread Brian Smith
Chris wrote:
> On Feb 25, 12:35 pm, robert <[EMAIL PROTECTED]> wrote:
> > Somebody who uses my app gets a error :
> >
> > os.stat('/path/filename')
> >
> > OSError: [Errno 75] Value too large for defined data type:
> > '/path/filename'
> >
> > on a big file >4GB
> >
> > ( Python 2.4.4 / Linux )
> >
> > How about that? Does Python not support large files? Or which 
> > functions do not support?

It looks like Python is not being compiled with large file support by
default. Most distributions do not enable large file support for Python.
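
One way to check a given interpreter, sketched for a Unix build of CPython
under Python 3: look at the size of the C off_t type recorded at build
time. Treating SIZEOF_OFF_T as the indicator is an assumption of this
sketch, and the function name off_t_is_64_bit is made up. Where the value
is not recorded (for example on Windows), the function returns None rather
than guessing.

```python
import sysconfig


def off_t_is_64_bit():
    """Return True or False if the build records sizeof(off_t), else None."""
    size = sysconfig.get_config_var("SIZEOF_OFF_T")
    if size is None:
        return None  # not recorded for this build
    # An 8-byte off_t can address files beyond the 2 GB / 4 GB limits.
    return int(size) >= 8
```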

- Brian



RE: anydbm safe for simultaneous writes?

2008-02-28 Thread Brian Smith
Chris wrote:
> I need simple data persistence for a cgi application that 
> will be used potentially by multiple clients simultaneously.  
> So I need something that can handle locking among writes.  
> Sqlite probably does this, but I am using Python 2.4.4, which 
> does not include sqlite.  The dbm-style modules would 
> probably be fine, but I have no idea if they are "write safe" 
> (I have no experience with the underlying unix stuff).  Any 
> tips appreciated.

No, you cannot assume that this will work without locking. Locking is
not trivial to do in Python. And, even with a working locking mechanism,
you still have to invalidate the in-memory caches any time a write to
the database is done. Further, most dbm modules do not have ACID
properties.

I suggest installing the pysqlite module and using it, regardless of your
version of CPython. According to the pysqlite developer, the version of
pysqlite included in CPython 2.5 is old.
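
A minimal sketch of what that could look like, written for Python 3 with
the standard sqlite3 module (on Python 2.4 the separately installed package
would be imported instead, typically as "from pysqlite2 import dbapi2").
The table layout and the function name record_visit are made up for
illustration. SQLite does the locking itself; the timeout is how long a
writer waits for another process's lock before giving up.

```python
import sqlite3


def record_visit(db_path, client):
    """Increment and return a per-client counter stored in SQLite."""
    # timeout: seconds to wait if another process holds the write lock
    conn = sqlite3.connect(db_path, timeout=10)
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "CREATE TABLE IF NOT EXISTS visits "
                "(client TEXT PRIMARY KEY, count INTEGER NOT NULL)"
            )
            conn.execute(
                "INSERT OR IGNORE INTO visits VALUES (?, 0)", (client,)
            )
            conn.execute(
                "UPDATE visits SET count = count + 1 WHERE client = ?",
                (client,),
            )
        row = conn.execute(
            "SELECT count FROM visits WHERE client = ?", (client,)
        ).fetchone()
        return row[0]
    finally:
        conn.close()
```

Opening and closing the connection on every call suits a CGI script,
where each request is a separate short-lived process.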

- Brian



RE: Socket Performance

2008-03-13 Thread Brian Smith
[EMAIL PROTECTED] wrote:
> Sent: Wednesday, March 12, 2008 9:47 PM
> To: python-list@python.org
> Subject: Socket Performance
> 
> Can anyone explain why socket performance (throughput) varies 
> depending on the amount of data send and recv are called with?
> 
> For example: try creating a local client/server (running on the same
> computer) where the server sends the client a fixed amount of data.
> Using method A, recv(8192) and sendall( ) with 8192 bytes 
> worth of data. Do this 100 times. Using method B, recv(1) and 
> sendall( ) with 1 byte worth of data. Do this 819200 times.
> 
> If you time both methods, method A has much greater 
> throughput than method B.

Why is it faster to drink a liter of water a cupful at a time than to
drink it out of an eyedropper?
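
To make the analogy concrete: every send or recv call has a roughly fixed
cost, so moving the same data in 1-byte pieces pays that cost thousands of
times more often. The following Python 3 sketch is one way to measure it on
a local socket pair. The function name transfer is made up, total_bytes is
assumed to be a multiple of chunk_size, and the timings will vary from
machine to machine.

```python
import socket
import threading
import time


def transfer(total_bytes, chunk_size):
    """Send total_bytes over a local socket pair in chunk_size pieces.

    Returns (bytes_received, elapsed_seconds).
    """
    sender, receiver = socket.socketpair()
    received = [0]

    def drain():
        # Read until everything has arrived or the sender closes.
        while received[0] < total_bytes:
            data = receiver.recv(chunk_size)
            if not data:
                break
            received[0] += len(data)

    worker = threading.Thread(target=drain)
    start = time.perf_counter()
    worker.start()
    payload = b"x" * chunk_size
    for _ in range(total_bytes // chunk_size):
        sender.sendall(payload)  # one call per chunk
    sender.close()
    worker.join()
    receiver.close()
    return received[0], time.perf_counter() - start
```

Comparing transfer(819200, 8192) with transfer(819200, 1) reproduces the
two methods from the question: 100 calls on each side versus 819200.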

- Brian
