argv[0] and __file__ inconsistency

2007-12-31 Thread Hai Vu
I currently use ActivePython 2.5.1. Consider the following code which
I saved as cmdline.py:
import sys
print sys.argv[0]
If I invoke this code as 'python cmdline.py', then the output is:
cmdline.py
If I invoke it as 'cmdline.py', then the output is:
C:\Users\hai\src\python\cmdline.py

The same happens for __file__. My question: do you have any
suggestions for a more consistent way to figure out the full path of
your script?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: argv[0] and __file__ inconsistency

2007-12-31 Thread Hai Vu
> use os.path.abspath

Bingo! This is just what the doctor ordered. Thank you.
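
For the archive, a minimal sketch of that suggestion. The helper name
script_path is only illustrative and not part of the original reply.
os.path.abspath resolves a relative name against the current working
directory, so it should be called before anything changes directory.

```python
import os
import sys

def script_path(argv0=None):
    """Return an absolute, normalized path for the given script name.

    Falls back to sys.argv[0] when no name is passed in.
    """
    if argv0 is None:
        argv0 = sys.argv[0]
    # abspath joins a relative name with os.getcwd() and normalizes it;
    # an already absolute name is only normalized.
    return os.path.abspath(argv0)

print(script_path())
```

The same call works on __file__, so either invocation style ends up
with a full path.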
Hai


Re: Parsing links within a html file.

2008-01-16 Thread Hai Vu
On Jan 14, 9:59 am, Shriphani <[EMAIL PROTECTED]> wrote:
> Hello,
> I have a html file over here by the name guide_ind.html and it
> contains links to other html files like guides.html#outline . How do I
> point BeautifulSoup (I want to use this module) to
> guides.html#outline ?
> Thanks
> Shriphani P.

Try Mark Pilgrim's excellent example at:
http://www.diveintopython.org/http_web_services/index.html

From the above link, you can retrieve openanything.py which I use in
my example:

# list_url.py
# created by Hai Vu on 1/16/2008

from openanything import fetch
from sgmllib import SGMLParser

class RetrieveURLs(SGMLParser):
    def reset(self):
        SGMLParser.reset(self)
        self.urls = []

    def start_a(self, attributes):
        url = [v for k, v in attributes if k.lower() == 'href']
        self.urls.extend(url)
        print '\t%s' % (url)

# ----------------------------------------------------------------------
# main
def main():
    site = 'http://www.google.com'

    result = fetch(site)
    if result['status'] == 200:
        # Extracts a list of URLs off the top page
        parser = RetrieveURLs()
        parser.feed(result['data'])
        parser.close()

        # Display the URLs we just retrieved
        print '\nURL retrieved from %s' % (site)
        print '\t' + '\n\t'.join(parser.urls)
    else:
        print 'Error (%d) retrieving %s' % (result['status'], site)

if __name__ == '__main__':
    main()
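
sgmllib was removed in Python 3, so here is a rough sketch of the same
idea using the standard html.parser module instead. The class name
LinkCollector, the helper extract_links and the sample HTML string are
made up for illustration; fetching the page is left out.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us
        if tag == 'a':
            self.urls.extend(value for name, value in attrs
                             if name == 'href' and value is not None)

def extract_links(html_text):
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.urls

# Hypothetical input, loosely based on the file names in the question
sample = '<p><a href="guides.html#outline">Outline</a> <a name="top">x</a></p>'
print(extract_links(sample))
```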


Re: reading a specific column from file

2008-01-17 Thread Hai Vu
Here is another suggestion:

col = 2 # third column
filename = '4columns.txt'
third_column = [line.rstrip('\n').split('\t')[col]
                for line in open(filename, 'r')]

third_column now contains a list of items in the third column.

This solution is great for small files (up to a couple of thousand
lines). For larger files, performance could be a problem, so you might
need a different solution.
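
One possible shape for such a different solution is a generator that
yields one value at a time instead of building the whole list. This is
only a sketch in Python 3; the function name iter_column and the sample
lines are invented for the example.

```python
import csv

def iter_column(lines, col, delimiter='\t'):
    """Yield the value at index `col` from each delimited line."""
    for row in csv.reader(lines, delimiter=delimiter):
        # Skip rows that are too short rather than raising IndexError
        if len(row) > col:
            yield row[col]

# Any iterable of strings works, including an open file object
sample_lines = ['a\tb\tc\td\n', 'e\tf\tg\th\n']
print(list(iter_column(sample_lines, 2)))
```

With a real file you would pass the file object opened with
open(filename, newline='') and loop over the generator, so only one
line is held in memory at a time.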


Re: "Code Friendly" Blog?

2008-01-17 Thread Hai Vu
On Jan 17, 2:50 pm, Miki <[EMAIL PROTECTED]> wrote:
> Hello,
>
> Posting code examples to blogger.com hosted blog is not fun (need to
> remember to always escape < and >).
> Is there any free blog hosting that is more "code friendly" (easy to
> post code snippets and such)?
>
> Thanks,
> --
> Miki <[EMAIL PROTECTED]> http://pythonwise.blogspot.com

How about bracketing your code in the  tags?

Something like this:

import sys, os, shutil

def getDir(fullPath):
    dirName, fileName = os.path.split(fullPath)
    return dirName



Re: "Code Friendly" Blog?

2008-01-18 Thread Hai Vu
Miki,
Why don't you try to use Code Colorizer:
http://www.chamisplace.com/colorizer/cc.asp



Re: Creating variables from dicts

2010-02-23 Thread Hai Vu
On Feb 23, 12:53 pm, vsoler wrote:
> Hi,
>
> I have two dicts
>
> n={'a', 'm', 'p'}
> v={1,3,7}
>
> and I'd like to have
>
> a=1
> m=3
> p=7
>
> that is, creating some variables.
>
> How can I do this?

I think you meant to use square brackets [ ] instead of curly
ones { } to define the lists:

>>> n = ['a', 'b', 'c']
>>> v = [3, 5, 7]
>>> for x, y in zip(n, v):
...  exec '%s=%d' % (x, y)
...
>>> a
3
>>> b
5
>>> c
7

---
The key is the use of the exec statement, which executes the strings
"a=3", "b=5", ... as if they were Python statements.
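
In Python 3, exec is a function rather than a statement, and a plain
dictionary is often enough when the names only need to be looked up.
Here is a small sketch of both, with the names lookup and namespace
chosen purely for illustration:

```python
names = ['a', 'b', 'c']
values = [3, 5, 7]

# Option 1: keep the pairs in a dictionary and look them up by name
lookup = dict(zip(names, values))
print(lookup['a'])

# Option 2: Python 3 spelling of the exec approach, run inside an
# explicit namespace dictionary instead of the current scope
namespace = {}
for name, value in zip(names, values):
    exec('%s=%d' % (name, value), namespace)
print(namespace['b'])
```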


Re: stripping fields from xml file into a csv

2010-02-27 Thread Hai Vu
On Feb 27, 12:50 pm, Hal Styli wrote:
> Hello,
>
> Can someone please help.
> I have a sed solution to the problems below but would like to rewrite
> in python...
>
> I need to strip out some data from a quirky xml file into a csv:
>
> from something like this
>
> < . cust="dick"  product="eggs" ... quantity="12"  >
> <  cust="tom"  product="milk" ... quantity="2" ...>
> <  cust="harry"  product="bread" ... quantity="1" ...>
> <  cust="tom"  product="eggs" ... quantity="6" ...>
> < . cust="dick"  product="eggs" ... quantity="6"  >
>
> to this
>
> dick,eggs,12
> tom,milk,2
> harry,bread,1
> tom,eggs,6
> dick,eggs,6
>
> I am new to python and xml and it would be great to see some slick
> ways of achieving the above by using python's XML capabilities to
> parse the original file or python's regex to achieve what I did using
> sed.
>
> Thanks for any constructive help given.
>
> Hal

Here is a sample XML file (I named it data.xml):
--







--

Code:
--
import csv
import xml.sax
import xml.sax.handler

# Handle an XML file in which each <order> element carries its data
# as attributes
class OrdersHandler(xml.sax.handler.ContentHandler):
    def __init__(self, csvfile):
        xml.sax.handler.ContentHandler.__init__(self)
        # Open a csv file for output (binary mode, as the Python 2 csv
        # module expects)
        self.outputFile = open(csvfile, 'wb')
        self.csvWriter = csv.writer(self.outputFile)

    def startElement(self, name, attributes):
        # Only process the <order> element
        if name == 'order':
            # Construct a sorted list of attribute names in order to
            # guarantee columns appear in the same order in every row.
            # We assume the XML elements contain the same attributes
            attributeNames = sorted(attributes.getNames())

            # Construct a row and write it to the csv file
            row = []
            for attributeName in attributeNames:
                row.append(attributes.getValue(attributeName))
            self.csvWriter.writerow(row)

    def endDocument(self):
        # Close the output file so that all rows are flushed to disk
        self.outputFile.close()

# Main
datafile = 'data.xml'
csvfile = 'data.csv'
ordersHandler = OrdersHandler(csvfile)
xml.sax.parse(datafile, ordersHandler)
--

To solve your problem, it is easier to use SAX than DOM. Basically,
use SAX to scan the XML file; when you encounter the element you are
interested in (in this case <order>), process its attributes. Here,
that means sorting the attribute names and writing the values to a
csv file.
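
For anyone on Python 3, here is a self-contained sketch of the same
SAX idea that writes to an in-memory buffer instead of a file. The
sample XML, including the <orders> root element, is an assumption made
up for this example; only the <order> element name and the attribute
names come from the thread.

```python
import csv
import io
import xml.sax

class OrdersHandler(xml.sax.ContentHandler):
    """Write the attributes of each <order> element as one CSV row."""

    def __init__(self, out):
        super().__init__()
        self.writer = csv.writer(out, lineterminator='\n')

    def startElement(self, name, attrs):
        if name == 'order':
            # Sort attribute names so every row has the same column order
            self.writer.writerow(
                [attrs.getValue(key) for key in sorted(attrs.getNames())])

# Hypothetical sample data; the real file layout may differ
SAMPLE = b'''<orders>
  <order cust="dick" product="eggs" quantity="12"/>
  <order cust="tom" product="milk" quantity="2"/>
</orders>'''

out = io.StringIO()
xml.sax.parseString(SAMPLE, OrdersHandler(out))
print(out.getvalue())
```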

--

References:

SAX Parser:
http://docs.python.org/library/xml.sax.html

SAX Content Handler:
http://docs.python.org/library/xml.sax.handler.html

Attributes Object:
http://docs.python.org/library/xml.sax.reader.html#attributes-objects



Re: stripping fields from xml file into a csv

2010-02-28 Thread Hai Vu
On Feb 28, 12:05 am, Stefan Behnel wrote:
> Hal Styli, 27.02.2010 21:50:
>
> > I have a sed solution to the problems below but would like to rewrite
> > in python...
>
> Note that sed (or any other line based or text based tool) is not a
> sensible way to handle XML. If you want to read XML, use an XML parser.
> They are designed to do exactly what you want in a standard compliant way,
> and they can deal with all sorts of XML formatting and encoding, for example.
>
> > I need to strip out some data from a quirky xml file into a csv:
>
> > from something like this
>
> > < . cust="dick"  product="eggs" ... quantity="12"  >
> > <  cust="tom"  product="milk" ... quantity="2" ...>
> > <  cust="harry"  product="bread" ... quantity="1" ...>
> > <  cust="tom"  product="eggs" ... quantity="6" ...>
> > < . cust="dick"  product="eggs" ... quantity="6"  >
>
> As others have noted, this doesn't tell much about your XML. A more
> complete example would be helpful.
>
> > to this
>
> > dick,eggs,12
> > tom,milk,2
> > harry,bread,1
> > tom,eggs,6
> > dick,eggs,6
>
> > I am new to python and xml and it would be great to see some slick
> > ways of achieving the above by using python's XML capabilities to
> > parse the original file or python's regex to achieve what I did using
> > sed.
>
> It's funny how often people still think that SAX is a good way to solve XML
> problems. Here's an untested solution that uses xml.etree.ElementTree:
>
>     from xml.etree import ElementTree as ET
>
>     csv_field_order = ['cust', 'product', 'quantity']
>
>     clean_up_used_elements = None
>     for event, element in ET.iterparse("thefile.xml", events=['start']):
>         # you may want to select a specific element.tag here
>
>         # format and print the CSV line to the standard output
>         print(','.join(element.attrib.get(title, '')
>                        for title in csv_field_order))
>
>         # save some memory (in case the XML file is very large)
>         if clean_up_used_elements is None:
>             # this assigns the clear() method of the root (first) element
>             clean_up_used_elements = element.clear
>         clean_up_used_elements()
>
> You can strip everything dealing with 'clean_up_used_elements' (basically
> the last section) if your XML file is small enough to fit into memory (a
> couple of MB is usually fine).
>
> Stefan

This solution is so beautiful and elegant. Thank you. Now I am off to
learn ElementTree.

By the way, Stefan, I am using Python 2.6. Do you know the differences
between ElementTree and cElementTree?
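
As a note to myself while learning, here is a small Python 3 sketch of
the iterparse approach quoted above, run against in-memory data. The
function name orders_to_csv_lines, the tag name 'order' and the sample
XML are assumptions for illustration. It listens for 'end' events and
clears each element after use, which is a simpler variant of the
memory-saving step in the quoted code.

```python
import io
from xml.etree import ElementTree as ET

CSV_FIELD_ORDER = ['cust', 'product', 'quantity']

def orders_to_csv_lines(source, tag='order'):
    """Yield one comma-joined line per matching element in `source`."""
    for event, element in ET.iterparse(source, events=('end',)):
        if element.tag == tag:
            yield ','.join(element.attrib.get(field, '')
                           for field in CSV_FIELD_ORDER)
        # Drop the element's contents once we are done with it
        element.clear()

# Hypothetical sample data; the root element name is made up
SAMPLE = b'''<orders>
  <order cust="dick" product="eggs" quantity="12"/>
  <order cust="tom" product="milk" quantity="2"/>
</orders>'''

lines = list(orders_to_csv_lines(io.BytesIO(SAMPLE)))
print('\n'.join(lines))
```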