Re: Parsing a serial stream too slowly

2012-01-23 Thread Jon Clements
On Jan 23, 9:48 pm, "M.Pekala"  wrote:
> Hello, I am having some trouble with a serial stream on a project I am
> working on. I have an external board that is attached to a set of
> sensors. The board polls the sensors, filters them, formats the
> values, and sends the formatted values over a serial bus. The serial
> stream comes out like $A1234$$B-10$$C987$,  where "$A.*$" is a sensor
> value, "$B.*$" is a sensor value, "$C.*$" is a sensor value, etc.
>
> When one sensor is running my python script grabs the data just fine,
> removes the formatting, and throws it into a text control box. However
> when 3 or more sensors are running, I get output like the following:
>
> Sensor 1: 373
> Sensor 2: 112$$M-160$G373
> Sensor 3: 763$$A892$
>
> I am fairly certain this means that my code is running too slow to
> catch all the '$' markers. Below is the snippet of code I believe is
> the cause of this problem...
>
> def OnSerialRead(self, event):
>         text = event.data
>         self.sensorabuffer = self.sensorabuffer + text
>         self.sensorbbuffer = self.sensorbbuffer + text
>         self.sensorcbuffer = self.sensorcbuffer + text
>
>         if sensoraenable:
>                 sensorresult = re.search(r'\$A.*\$.*', self.sensorabuffer )
>                         if sensorresult:
>                                 s = sensorresult.group(0)
>                                 s = s[2:-1]
>                                 if self.sensor_enable_chkbox.GetValue():
>                                         self.SensorAValue = s
>                                 self.sensorabuffer = ''
>
>         if sensorbenable:
>                 sensorresult = re.search(r'\$A.*\$.*', self.sensorbenable)
>                         if sensorresult:
>                                 s = sensorresult.group(0)
>                                 s = s[2:-1]
>                                 if self.sensor_enable_chkbox.GetValue():
>                                         self.SensorBValue = s
>                                 self.sensorbenable= ''
>
>         if sensorcenable:
>                 sensorresult = re.search(r'\$A.*\$.*', self.sensorcenable)
>                         if sensorresult:
>                                 s = sensorresult.group(0)
>                                 s = s[2:-1]
>                                 if self.sensor_enable_chkbox.GetValue():
>                                         self.SensorCValue = s
>                                 self.sensorcenable= ''
>
>         self.DisplaySensorReadings()
>
> I think that regex is too slow for this operation, but I'm uncertain
> of another method in python that could be faster. A little help would
> be appreciated.

Are you sure that's your actual code? All three of your re.search() calls
use the same pattern, r'\$A.*\$.*'.
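
For what it's worth, here is a minimal sketch of one way to pull complete
frames out of such a stream with a single buffer. It assumes each frame is
'$', one uppercase sensor letter, a value containing no '$', then a closing
'$'. The StreamParser name and its feed() method are made up for
illustration and are not from the original code.

```python
import re

# '$', a single-letter sensor ID, the value, then the closing '$'.
# [^$]* (rather than .*) stops the value at the next '$'.
FRAME = re.compile(r"\$([A-Z])([^$]*)\$")


class StreamParser(object):
    """Accumulate serial chunks and return complete (sensor, value) frames."""

    def __init__(self):
        self.buffer = ""

    def feed(self, chunk):
        self.buffer += chunk
        frames = []
        last_end = 0
        for match in FRAME.finditer(self.buffer):
            frames.append((match.group(1), match.group(2)))
            last_end = match.end()
        # Keep only the unparsed tail, which may hold a partial frame.
        self.buffer = self.buffer[last_end:]
        return frames
```

Each call to feed() returns whatever frames have been completed so far, so a
chunk that ends halfway through a value is simply finished by the next chunk.
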
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Find the mime type of a file.

2012-01-25 Thread Jon Clements
On Jan 25, 5:04 pm, Olive  wrote:
> I want to have a list of all the images in a directory. To do so I want
> to have a function that finds the MIME type of a file. I have found
> mimetypes.guess_type but it only works by examining the extension. In
> GNU/Linux the "file" utility does much better by actually looking at the
> file. Is there an equivalent function in Python? (As a last resort I can
> always use the external file utility.)
>
> Olive

You could also try using PIL. (I hardly use it, but...)

from PIL import Image
for fname in [some list of filenames here]:
    img = Image.open(fname)
    print img.format

Might be more expensive than the file utility, but that's up to you to
determine (Image.open is lazy: it reads only enough of the file to identify
the format, and pixel data isn't decoded until load() is called or needed).

hth,

Jon.


Re: Constraints -//- first release -//- Flexible abstract class based validation for attributes, functions and code blocks

2012-01-27 Thread Jon Clements
On Jan 27, 6:38 am, Nathan Rice 
wrote:
> > May I suggest a look at languages such as ATS and Epigram? They use
> > types that constrain values specifically to prove things about your
> > program. Haskell is a step, but as far as proving goes, it's less
> > powerful than it could be. ATS allows you to, at compile-time, declare
> > that isinstance(x, 0 <= Symbol() < len(L)) for some list L. So it
> > might align well with your ideas.
>
> Thanks for the tip.
>
> >>> Probably deserves a better name than "constraintslib", that makes one
> >>> think of constraint satisfaction.
>
> >> As you can probably tell from my other projects, I'm bad at coming up
> >> with snappy names.
>
> > I'm bad at doing research on previous projects ;)
>
> I guess I'm not plugging my other projects enough...  You should check
> out elementwise.
>
> Thanks,
>
> Nathan

I love elementwise and this one - thanks.

If I can be so bold, I would call it 'contracts'. Or, if you want to
be more imaginative and esoteric - 'judge'/'barrister'/'solicitor'.

Thanks again,

Jon.


Re: os.stat last accessed attribute updating last accessed value

2012-02-06 Thread Jon Clements
On Feb 4, 9:33 pm, Python_Junkie 
wrote:
> I am trying to obtain the last accessed date.  About 50% of the files'
> attributes were updated such that the file was last accessed when this
> script touches the file.
> I was not opening the files
>
> Anyone have a thought of why this happened.
>
> Python 2.6 on windows xp

Read up on NTFS - on some file systems, checking a file's access time
is, well umm, itself an access. It's also possible that listing a
directory counts as an access. It's the least useful of the timestamps -
I've only ever wanted modification or creation times.
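
A minimal sketch of reading all three timestamps from one os.stat() call,
written for Python 3 (the file_times helper is a made-up name). Note that
st_ctime means creation time on Windows but last metadata change on Unix.

```python
import os
import time


def file_times(path):
    """Return (accessed, modified, ctime) for path as local-time strings."""
    st = os.stat(path)
    fmt = "%Y-%m-%d %H:%M:%S"
    return tuple(
        time.strftime(fmt, time.localtime(stamp))
        for stamp in (st.st_atime, st.st_mtime, st.st_ctime)
    )
```

Comparing st_mtime rather than st_atime sidesteps the question of what
counts as an access on a given file system.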

hth,
Jon.


Re: Finding MIME type for a data stream

2012-03-08 Thread Jon Clements
On Thursday, 8 March 2012 23:40:13 UTC, Tobiah  wrote:
> > I have to assume you're talking python 2, since in python 3, strings 
> > cannot generally contain image data.  In python 2, characters are pretty 
> > much interchangeable with bytes.
> 
> Yeah, python 2
> 
> 
> > if you're looking for a specific, small list of file formats, you could 
> > make yourself a signature list.  Most (not all) formats distinguish 
> > themselves in the first few bytes. 
> 
> Yeah, maybe I'll just do that.  I'm allowing users to paste
> images into a rich-text editor, so I'm pretty much looking 
> at .png, .gif, or .jpg.  Those should be pretty easy to 
> distinguish by looking at the first few bytes.  
> 
> Pasting images may sound weird, but I'm using a jquery
> widget called cleditor that takes image data from the
> clipboard and replaces it with inline base64 data.  
> The html from the editor ends up as an email, and the
> inline images cause the emails to be tossed in the
> spam folder for most people.  So I'm parsing the
> emails, storing the image data, and replacing the
> inline images with an img tag that points to a 
> web2py app that takes arguments that tell it which 
> image to pull from the database.  
> 
> Now that I think of it, I could use php to detect the
> image type, and store that in the database.  Not quite
> as clean, but that would work.
> 
> Tobiah

Something like the following might be worth a go:
(untested)

from StringIO import StringIO  # Python 2; blob is the raw image data
from PIL import Image
img = Image.open(StringIO(blob))
print img.format
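
The signature-list idea quoted above can also be sketched without PIL. This
is only an illustration for the three formats mentioned (PNG, GIF, JPEG);
sniff_image_type is a made-up helper name, and blob is assumed to be a byte
string.

```python
# Leading "magic" bytes for the three formats discussed in the thread.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff_image_type(blob):
    """Return a MIME type based on the first bytes of blob, or None."""
    for magic, mime in SIGNATURES:
        if blob.startswith(magic):
            return mime
    return None
```

This only looks at the header, so it says nothing about whether the rest of
the data is a valid image.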

HTH
Jon.

PIL: http://www.pythonware.com/library/pil/handbook/image.htm


Re: Fast file data retrieval?

2012-03-12 Thread Jon Clements
On Monday, 12 March 2012 20:31:35 UTC, MRAB  wrote:
> On 12/03/2012 19:39, Virgil Stokes wrote:
> > I have a rather large ASCII file that is structured as follows
> >
> > header line
> > 9 nonblank lines with alphanumeric data
> > header line
> > 9 nonblank lines with alphanumeric data
> > ...
> > ...
> > ...
> > header line
> > 9 nonblank lines with alphanumeric data
> > EOF
> >
> > where, a data set contains 10 lines (header + 9 nonblank) and there can
> > be several thousand
> > data sets in a single file. In addition, *each header has a unique ID
> > code*.
> >
> > Is there a fast method for the retrieval of a data set from this large
> > file given its ID code?
> >
> Probably the best solution is to put it into a database. Have a look at
> the sqlite3 module.
> 
> Alternatively, you could scan the file, recording the ID and the file
> offset in a dict so that, given an ID, you can seek directly to that
> file position.

I would have a look at either bsddb, Tokyo (or Kyoto) Cabinet or hamsterdb. If 
it's really going to get large and needs a full blown server, maybe 
MongoDB/redis/hadoop...
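
If a database is overkill, the offset-index idea quoted above can be
sketched as follows. Assumptions (not from the original post): the ID is the
first whitespace-separated token of each header line, and every data set is
exactly 10 lines. build_index and read_set are hypothetical names.

```python
def build_index(path, lines_per_set=10):
    """Map each header's ID to the byte offset where its data set starts."""
    index = {}
    # Binary mode keeps tell()/seek() offsets exact.
    with open(path, "rb") as fh:
        while True:
            offset = fh.tell()
            header = fh.readline()
            if not header.strip():
                break  # end of file (or a trailing blank line)
            set_id = header.split()[0].decode("ascii")
            index[set_id] = offset
            # Skip the data lines that belong to this header.
            for _ in range(lines_per_set - 1):
                fh.readline()
    return index


def read_set(path, index, set_id, lines_per_set=10):
    """Seek straight to one data set and return its lines."""
    with open(path, "rb") as fh:
        fh.seek(index[set_id])
        return [fh.readline().decode("ascii").rstrip("\r\n")
                for _ in range(lines_per_set)]
```

The file is scanned once to build the dict; after that each lookup costs one
seek plus ten readline() calls.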


Re: How to decide if a object is instancemethod?

2012-03-14 Thread Jon Clements
On Wednesday, 14 March 2012 13:28:58 UTC, Cosmia Luna  wrote:
> class Foo(object):
>     def bar(self):
>         return 'Something'
> 
> func = Foo().bar
> 
> if type(func) == : # This should be always true
>     pass # do something here
> 
> What should type at ?
> 
> Thanks
> Cosmia

import inspect
if inspect.ismethod(func):
    # ...

Will return True if func is a bound method.
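
A quick check of how that behaves, as a sketch written for Python 3 (where a
method looked up on the class itself is a plain function, not a method):

```python
import inspect
import types


class Foo(object):
    def bar(self):
        return 'Something'


func = Foo().bar

print(inspect.ismethod(func))              # True: bound method
print(isinstance(func, types.MethodType))  # True: the type-based equivalent
print(inspect.ismethod(Foo.bar))           # False on Python 3: plain function
print(inspect.ismethod(len))               # False: built-in function
```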

hth

Jon


Re: Jinja2 + jQuery tabs widget

2012-03-14 Thread Jon Clements
On Wednesday, 14 March 2012 14:16:35 UTC, JoeM  wrote:
> Hi All,
> 
>  I'm having issues including a {block} of content from Jinja2
> template into a jQueryUI tab. Does anyone know if such a thing is
> possible? An example is below, which gives me a 500 error when loading
> the page.
> 
> Thanks,
> Joe
> 
> 
> 
> 
>   $(function() {
>   $( "#tabs" ).tabs();
>   });
> 
> 
> 
> 
>   
>   Summary
>   Maps
>   Tables
>   Animations
>   Definitions
>   
>   
> 
> {% block map_content %} {% endblock %}
>   
> 

Firstly, this isn't really a Python language question - although jinja2 is a 
commonly used module for web frameworks.

Secondly, the code looks fine, except we don't know what's in the map_content 
block.

Thirdly, 500 is an internal server error - so it's possible it's nothing to do 
with any of this anyway -- could you provide a more comprehensive error message?

Jon.


Re: Global join function?

2012-03-14 Thread Jon Clements
On Wednesday, 14 March 2012 18:41:27 UTC, Darrel Grant  wrote:
> In the virtualenv example bootstrap code, a global join function is used.
> 
> http://pypi.python.org/pypi/virtualenv
> 
> subprocess.call([join(home_dir, 'bin', 'easy_install'),
>  'BlogApplication'])
> 
> 
> In the interpreter, I tried this:
> 
> >>> [join([], 'bin', 'easy_install')]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'join' is not defined
> 
> I think I've seen this used elsewhere, but googling only seems to show
> results about the string method join, not whatever this is.
> 
> To be clear, I understand how to use "".join(list), but have not found
> any information about this other, seemingly global, join function
> which takes multiple arguments. It's been bugging me.

os.path.join
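
There is no built-in join(); a bare name like that only works after an
import such as the one below (that the bootstrap script does exactly this is
my assumption). The directory name is made up for illustration.

```python
import os.path
from os.path import join  # this is what makes a bare join() available

home_dir = "venv"  # hypothetical directory name
path = join(home_dir, "bin", "easy_install")

# Same function, just reached through a shorter name.
assert path == os.path.join(home_dir, "bin", "easy_install")
print(path)  # venv/bin/easy_install on POSIX; backslashes on Windows
```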

Jon


Re: Style question (Poll)

2012-03-15 Thread Jon Clements
On Wednesday, 14 March 2012 21:16:05 UTC, Terry Reedy  wrote:
> On 3/14/2012 4:49 PM, Arnaud Delobelle wrote:
> > On 14 March 2012 20:37, Croepha  wrote:
> >> Which is preferred:
> >>
> >> for value in list:
> >>   if not value is another_value:
> >> value.do_something()
> >> break
> 
> Do you really mean 'is' or '=='?
> 
> If you mean x is not y, write it that way.
> 'not x is y' can be misread and misunderstood, depending on whether
> the 'is' is true or not.
> 
>  >>> not 1 is 1
> False
>  >>> not (1 is 1)
> False
>  >>> (not 1) is 1
> False
> 
> Does not matter how read.
> 
>  >>> not (1 is 0)
> True
>  >>> (not 1) is 0
> False
>  >>> not 1 is 0
> True
> 
> Does matter how read.
> 
> >> if list and not list[0] is another_value:
> >>   list[0].do_something()
> 
> Or
> try:
>     value = mylist[0]
>     if value is not another_value: value.do_something()
> except IndexError:
>     pass
> 
> I would not do this in this case of index 0, but if the index were a 
> complicated expression or expensive function call, making 'if list' an 
> inadequate test, I might.
> 
> > Hard to say, since they don't do the same thing :)
> >
> > I suspect you meant:
> >
> > for value in list:
> > if not value is another_value:
> > value.do_something()
> > break
> >
> > I always feel uncomfortable with this because it's misleading: a loop
> > that never loops.
> 
> I agree. Please do not do this in public ;-).
> 
> -- 
> Terry Jan Reedy

I'm not sure it's efficient or even if I like it, but it avoids try/except and 
the use of a for loop.

if next( iter(mylist), object() ) is not another_value:
    # ...

Just my 2p,

Jon.


Re: urllib.urlretrieve never returns???

2012-03-19 Thread Jon Clements
On Monday, 19 March 2012 19:32:03 UTC, Laszlo Nagy  wrote:
> The pythonw.exe may not have the rights to access network resources.
> >> Have you set a default timeout for sockets?
> >>
> >> import socket
> >> socket.setdefaulttimeout(10) # 10 seconds
> I have added pythonw.exe to allowed exceptions. Disabled firewall 
> completely. Set socket timeout to 10 seconds. Still nothing.
> 
> urllib.urlretrieve does not return from call
> 
> any other ideas?

Maybe try using the reporthook option for urlretrieve, just to see if that does 
anything... If it constantly calls the hook or never calls it, that's one thing.
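
A minimal sketch of such a hook, written against Python 3's
urllib.request.urlretrieve (the thread uses Python 2's urllib.urlretrieve,
which takes the same reporthook argument). make_reporthook and fetch are
made-up helper names.

```python
import urllib.request


def make_reporthook(log):
    """Build a hook that records and prints every progress callback."""
    def hook(block_num, block_size, total_size):
        # total_size is -1 when the server sends no Content-Length header.
        log.append((block_num, block_size, total_size))
        print("block %d, %d bytes per block, total %d"
              % (block_num, block_size, total_size))
    return hook


def fetch(url, filename):
    """Download url to filename and return the list of hook calls seen."""
    calls = []
    urllib.request.urlretrieve(url, filename, reporthook=make_reporthook(calls))
    return calls
```

If nothing is ever printed, the call is stuck before the first block arrives,
which points at the connection rather than the download loop.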

Alternately, tcpdump/wireshark whatever, to see what the heck is going on with 
traffic - if any.

hth

Jon


Re: Python is readable (OT)

2012-03-22 Thread Jon Clements
On Thursday, 22 March 2012 08:56:17 UTC, Steven D'Aprano  wrote:
> On Wed, 21 Mar 2012 18:35:16 -0700, Steve Howell wrote:
> 
> > On Mar 21, 11:06 am, Nathan Rice 
> > wrote:
[snip].
> 
> Different programming languages are good for different things because 
> they have been designed to work in different problem/solution spaces. 
> Although I dislike C with a passion, I do recognise that it is good for 
> when the programmer needs fine control over the smallest details. It is, 
> after all, a high-level assembler. Likewise for Forth, which lets you 
> modify the compiler and language as you go.
> 
> Some languages are optimized for the compiler, some for the writer, and 
> some for the reader. Some are optimized for numeric work, others for 
> database access. Some are Jack-Of-All-Trades. Each language encourages 
> its own idioms and ways of thinking about programming. 
> 
> When it comes to programming, I say, let a thousand voices shout out. 
> Instead of imagining a single language so wonderful that every other 
> language is overshadowed and forgotten, imagine that the single language 
> is the next Java, or C, or even for that matter Python, but whatever it 
> is, it's not ideal for the problems you care about, or the way you think 
> about them. Not so attractive now, is it?
> 
> 
> > The optimistic view is that there will be some kind of inflection point
> > around 2020 or so.  I could imagine a perfect storm of good things
> > happening, like convergence on a single browser platform,
> 
> You call that a perfect storm of good things. I call that sort of 
> intellectual and software monoculture a nightmare.
> 
> I want a dozen browsers, not one of which is so common that web designers 
> can design for it and ignore the rest, not one browser so common that 
> nobody dares try anything new.
> 
> 
> > nearly
> > complete migration to Python 3, further maturity of JVM-based languages,
> > etc., where the bar gets a little higher from what people expect from
> > languages.  Instead of fighting semicolons and braces, we start thinking
> > bigger.  It could also be some sort of hardware advance, like screen
> > resolutions that are so amazing they let us completely rethink our views
> > on terseness, punctuation, code organization, etc.
> 
> And what of those with poor eyesight, or the blind? Are they to be 
> excluded from your "bigger" brave new world?
> 
> 
> 
> -- 
> Steven



On Thursday, 22 March 2012 08:56:17 UTC, Steven D'Aprano  wrote:
> On Wed, 21 Mar 2012 18:35:16 -0700, Steve Howell wrote:
> 
> > On Mar 21, 11:06 am, Nathan Rice 
> > wrote:
> >> As for syntax, we have a lot of "real" domain specific languages, such
> >> as English, math and logic. They are vetted, understood and useful
> >> outside the context of programming.  We should approach the discussion
> >> of language syntax from the perspective of trying to define a unified
> >> syntactical structure for these real DSLs. Ideally it would allow
> >> representation of things in a familiar way where possible, while
> >> providing an elegant mechanism for descriptions that cut across domains
> >> and eliminating redundancy/ambiguity.  This is clearly possible, though
> >> a truly successful attempt would probably be a work of art for the
> >> ages.
> > 
> > If I'm reading you correctly, you're expressing frustration with the
> > state of language syntax unification in 2012.  You mention language in a
> > broad sense (not just programming languages, but also English, math,
> > logic, etc.), but even in the narrow context of programming languages,
> > the current state of the world is pretty chaotic.
> 
> And this is a good thing. Programming languages are chaotic because the 
> universe of programming problems is chaotic, and the strategies available 
> to solve those problems are many and varied.

Re: Data mining/pattern recogniton software in Python?

2012-03-23 Thread Jon Clements
On Friday, 23 March 2012 16:43:40 UTC, Grzegorz Staniak  wrote:
> Hello,
> 
> I've been asked by a colleague for help in a small educational
> project, which would involve the recognition of patterns in a live 
> feed of data points (readings from a measuring appliance), and then 
> a more general search for patterns on archival data. The language 
> of preference is Python, since the lab uses software written in
> Python already. I can see there are packages like Open CV,
> scikit-learn, Orange that could perhaps be of use for the mining
> phase -- and even if they are slanted towards image pattern 
> recognition, I think I'd be able to find an appropriate package
> for the timeseries analyses. But I'm wondering about the "live" 
> phase -- what approach would you suggest? I wouldn't want to 
> force an open door, perhaps there are already packages/modules that 
> could be used to read data in a loop i.e. every 10 seconds, 
> maintain a buffer of 15 readings and ring a bell when the data
> in buffer form a specific pattern (a spike, a trough, whatever)?
> 
> I'll be grateful for a push in the right direction. Thanks,
> 
> GS
> -- 
> Grzegorz Staniak   

It might also be worth checking out pandas[1] and scikits.statsmodels[2].

In terms of reading data in a loop I would probably go for a producer-consumer 
model (possibly using a Queue[3]). Have the producer constantly try to get 
another reading and notify the consumer, which can then determine if it's got 
enough data to calculate a peak/trough. This article is also a fairly good 
read[4].
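
A rough sketch of that producer-consumer shape, written for Python 3 (the
module is called Queue in Python 2). The spike rule here, a reading more
than `threshold` above the mean of the current window, is only a
placeholder I made up; a real producer would poll the appliance every 10
seconds instead of looping over a list.

```python
import collections
import queue
import threading

STOP = object()  # sentinel telling the consumer to finish


def producer(readings, out_queue):
    """Push each reading onto the queue, then signal the end."""
    for value in readings:
        out_queue.put(value)
    out_queue.put(STOP)


def consumer(in_queue, window=15, threshold=10.0):
    """Keep a rolling window of readings and collect the spikes."""
    recent = collections.deque(maxlen=window)
    spikes = []
    while True:
        value = in_queue.get()  # blocks until the producer delivers
        if value is STOP:
            break
        if recent:
            mean = sum(recent) / float(len(recent))
            if value - mean > threshold:
                spikes.append(value)
        recent.append(value)
    return spikes


def run(readings):
    """Wire one producer thread to a consumer running in this thread."""
    q = queue.Queue()
    worker = threading.Thread(target=producer, args=(readings, q))
    worker.start()
    spikes = consumer(q)
    worker.join()
    return spikes
```

The deque with maxlen=15 gives the "buffer of 15 readings" for free: once it
is full, appending a new reading drops the oldest one.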

That's some pointers anyway,

hth,

Jon.


[1] http://pandas.pydata.org/
[2] http://statsmodels.sourceforge.net/
[3] http://docs.python.org/library/queue.html
[4] 
http://www.laurentluce.com/posts/python-threads-synchronization-locks-rlocks-semaphores-conditions-events-and-queues/


Re: Fetching data from a HTML file

2012-03-23 Thread Jon Clements
On Friday, 23 March 2012 13:52:05 UTC, Sangeet  wrote:
> Hi,
> 
> I've got to fetch data from the snippet below and have been trying to match 
> the digits in this to specifically to specific groups. But I can't seem to 
> figure how to go about stripping the tags! :(
> 
> Sum class="green">24511 align='center'>02561.496 
> [min]
> 
> 
> Actually, I'm working on ROBOT Framework, and haven't been able to figure out 
> how to read data from HTML tables. Reading from the source, is the best (read 
> rudimentary) way I could come up with. Any suggestions are welcome!
> 
> Thanks,
> Sangeet

I would personally use lxml - a quick example:

# -*- coding: utf-8 -*-
import lxml.html

text = """
Sum​2451102561.496 
[min]

"""

table = lxml.html.fromstring(text)
for tr in table.xpath('//tr'):
    print [ (el.get('class', ''), el.text_content()) for el in
            tr.iterfind('td') ]

[('', 'Sum'), ('', ''), ('green', '245'), ('red', '11'), ('', '0'), ('', 
'256'), ('', '1.496 [min]')]

It does a reasonable job, but if it doesn't work quite right, then there's a 
.fromstring(parser=...) option, and you should be able to pass in ElementSoup 
and try your luck from there. 

hth,

Jon.


Re: Best way to structure data for efficient searching

2012-04-02 Thread Jon Clements
On Wednesday, 28 March 2012 19:39:54 UTC+1, larry@gmail.com  wrote:
> I have the following use case:
> 
> I have a set of data that contains 3 fields, K1, K2 and a
> timestamp. There are duplicates in the data set, and they all have to
> be processed.
> 
> Then I have another set of data with 4 fields: K3, K4, K5, and a
> timestamp. There are also duplicates in that data set, and they also
> all have to be processed.
> 
> I need to find all the items in the second data set where K1==K3 and
> K2==K4 and the 2 timestamps are within 20 seconds of each other.
> 
> I have this working, but the way I did it seems very inefficient - I
> simply put the data in 2 arrays (as tuples) and then walked through
> the entire second data set once for each item in the first data set,
> looking for matches.
> 
> Is there a better, more efficient way I could have done this?

It might not be more *efficient* but others might find it more readable, and 
it'd be easier to change later. Try an in-memory SQL DB (such as sqlite3) and 
query as (untested)

select t2.* from t1 join t2 on k1=k3 and k2=k4 where abs(t1.timestamp - 
t2.timestamp) < 20
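
A runnable sketch of that idea with the stdlib sqlite3 module. The table
layout, the ts column name and the match_with_sqlite helper are my own
choices for illustration; timestamps are assumed to be plain numbers of
seconds.

```python
import sqlite3


def match_with_sqlite(first, second, window=20):
    """first holds (k1, k2, ts) tuples; second holds (k3, k4, k5, ts) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t1 (k1, k2, ts)")
    conn.execute("create table t2 (k3, k4, k5, ts)")
    conn.executemany("insert into t1 values (?, ?, ?)", first)
    conn.executemany("insert into t2 values (?, ?, ?, ?)", second)
    # Lets the join look rows up by key instead of scanning all of t2.
    conn.execute("create index t2_keys on t2 (k3, k4)")
    rows = conn.execute(
        "select t2.* from t1 join t2 on k1 = k3 and k2 = k4 "
        "where abs(t1.ts - t2.ts) < ? order by t2.ts",
        (window,),
    ).fetchall()
    conn.close()
    return rows
```

Being a join, this returns a row from the second set once for every row of
the first set it matches, so duplicates in the first set multiply the output.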

Failing that, two (default)dicts keyed on the (K1, K2) / (K3, K4) pair as a 
tuple, then use that as your base.
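
A sketch of the dict approach for the matching rule in the question. For
this direction of lookup a single defaultdict turned out to be enough;
match_with_dicts is a made-up name and timestamps are assumed to be numbers
of seconds.

```python
from collections import defaultdict


def match_with_dicts(first, second, window=20):
    """Return rows of second whose keys match a row of first within window."""
    stamps_by_key = defaultdict(list)
    for k1, k2, ts in first:
        stamps_by_key[(k1, k2)].append(ts)

    matches = []
    for k3, k4, k5, ts in second:
        # .get avoids creating empty entries for keys only seen in second.
        candidates = stamps_by_key.get((k3, k4), ())
        if any(abs(ts - other) < window for other in candidates):
            matches.append((k3, k4, k5, ts))
    return matches
```

Each row of the second set is compared only against timestamps that share
its key, instead of against every row of the first set.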

Jon.



Re: help with subclassing problem

2012-04-02 Thread Jon Clements
On Thursday, 29 March 2012 21:23:20 UTC+1, Peter  wrote:
> I am attempting to subclass the date class from the datetime package. 
> Basically I want a subclass that can take the date as a string (in multiple 
> formats), parse the string and derive the year,month and day information to 
> create a date instance i.e. 
> 
> class MyDate(datetime.date):
>   def __init__(self, the_date):
>     # magic happens here to derive year, month and day from the_date
>     datetime.date.__init__(self, year, month, day)
> 
> But no matter what I do, when I attempt to create an instance of the new 
> class, I get the error message:
> 
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: Required argument 'year' (pos 1) not found
> 
> 
> I have even created a class which doesn't include the argument I want to use 
> but with default arguments i.e.
> 
> class MyDate (datetime.date):
>   def __init__(self, year = 1, month = 1, day = 1):
>     datetime.date.__init__(self, year, month, day)
> 
> and get the same error message.
> 
> What am I doing wrong here? 
> 
> Thanks for any help,
> Peter

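datetime.date is immutable, so its arguments are consumed by __new__ rather
than __init__; override __new__ instead. Details here: 
http://stackoverflow.com/questions/399022/why-cant-i-subclass-datetime-date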
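
A minimal sketch of the __new__ approach. The two formats listed are just
examples I picked, and pickling is not handled here. The extra month/day
parameters keep ordinary (year, month, day) construction working, which
date's own methods may rely on.

```python
import datetime


class MyDate(datetime.date):
    # Hypothetical list of accepted formats; extend as needed.
    FORMATS = ("%Y-%m-%d", "%d/%m/%Y")

    def __new__(cls, the_date, month=None, day=None):
        if month is not None:
            # Plain (year, month, day) construction.
            return super(MyDate, cls).__new__(cls, the_date, month, day)
        for fmt in cls.FORMATS:
            try:
                parsed = datetime.datetime.strptime(the_date, fmt)
            except ValueError:
                continue  # try the next format
            return super(MyDate, cls).__new__(
                cls, parsed.year, parsed.month, parsed.day)
        raise ValueError("unrecognised date string: %r" % (the_date,))
```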

Jon.


Re: Async IO Server with Blocking DB

2012-04-04 Thread Jon Clements
On Tuesday, 3 April 2012 23:13:24 UTC+1, looking for  wrote:
> Hi
> 
> We are thinking about building a webservice server and considering
> python event-driven servers i.e. Gevent/Tornado/ Twisted or some
> combination thereof etc.
> 
> We are having doubts about the db io part. Even with connection
> pooling and cache, there is a strong chance that server will block on
> db. Blocking for even few ms is bad.
> 
> Can someone suggest some solutions, or is async IO not ready for prime
> time yet?
> 
> Thanks

Maybe look at Cyclone (a Tornado variation built on Twisted), and various 
modules that will offer synch and events - GIYF! It's doable!

Jon.


Re: Python Gotcha's?

2012-04-05 Thread Jon Clements
On Wednesday, 4 April 2012 23:34:20 UTC+1, Miki Tebeka  wrote:
> Greetings,
> 
> I'm going to give a "Python Gotcha's" talk at work.
> If you have an interesting/common "Gotcha" (warts/dark corners ...) please 
> share.
> 
> (Note that I went over http://wiki.python.org/moin/PythonWarts already).
> 
> Thanks,
> --
> Miki

One I've had to debug...

>>> text = 'abcdef'

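>>> if text.find('abc'):
...     print 'found it!'
...
# Nothing prints: find() returns index 0, and bool(0) is False

>>> if text.find('bob'):
...     print 'found it!'
...
found it!
# Prints, because find() returns -1 for "not found", and bool(-1) is True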

Someone new who hasn't read the docs might try this, but then I guess it's not 
really a gotcha if they haven't bothered doing that.
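
For the talk, the safe spellings might be shown alongside it. A short sketch
(Python 3 print syntax):

```python
text = 'abcdef'

# str.find returns an index (0 is a valid hit) or -1 when nothing is found,
# so its result should be compared with -1, not used as a boolean.
assert text.find('abc') == 0
assert text.find('bob') == -1

# For a plain yes/no test, the `in` operator says what is meant.
print('abc' in text)  # True
print('bob' in text)  # False
```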



ordering with duck typing in 3.1

2012-04-07 Thread Jon Clements
Any reason you can't derive from int instead of object? You may also want to 
check out functions.total_ordering on 2.7+


Re: ordering with duck typing in 3.1

2012-04-09 Thread Jon Clements
On Monday, 9 April 2012 12:33:25 UTC+1, Neil Cerutti  wrote:
> On 2012-04-07, Jon Clements  wrote:
> > Any reason you can't derive from int instead of object? You may
> > also want to check out functions.total_ordering on 2.7+
> 
> functools.total_ordering
> 
> I was temporarily tripped up by the aforementioned documentation,
> myself.
> 
> -- 
> Neil Cerutti

Oops. I sent it from a mobile tablet device - I got auto-corrected. But yes, it 
is functools.total_ordering - thank you, Neil.
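
For reference, a small sketch of what the decorator does: define __eq__ and
one ordering method, and it fills in the rest. The Version class is made up
for illustration.

```python
import functools


@functools.total_ordering
class Version(object):  # hypothetical class, for illustration only
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number == other.number

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number


# <=, > and >= are supplied by total_ordering.
print(Version(1) <= Version(2))  # True
print(Version(3) > Version(2))   # True
```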

Jon.


Re: escaping

2012-04-16 Thread Jon Clements
On Monday, 16 April 2012 11:03:31 UTC+1, Kiuhnm  wrote:
> On 4/16/2012 4:42, Steven D'Aprano wrote:
> > On Sun, 15 Apr 2012 23:07:36 +0200, Kiuhnm wrote:
> >
> >> This is the behavior I need:
> >>   path = path.replace('\\', '')
> >>   msg = ". {} .. '{}' .. {} .".format(a, path, b)
> >> Is there a better way?
> >
> >
> > This works for me:
> >
> > >>> a = "spam"
> > >>> b = "ham"
> > >>> path = r"C:\a\b\c\d\e.txt"
> > >>> msg = ". %s .. %r .. %s ." % (a, path, b)
> > >>> print msg
> > . spam .. 'C:\\a\\b\\c\\d\\e.txt' .. ham .
> 
> I like this one. Since I read somewhere that 'format' is preferred over 
> '%', I was focusing on 'format' and I didn't think of '%'.
> Anyway, it's odd that 'format' doesn't offer something similar.
> 
> Kiuhnm

If you look at http://docs.python.org/library/string.html#format-string-syntax

you'll notice the equiv. of %r is {!r}
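
Side by side, using the values from the quoted example (Python 3 print
syntax; the auto-numbered {} fields need 2.7 or 3.1+):

```python
a = "spam"
b = "ham"
path = r"C:\a\b\c\d\e.txt"

old_style = ". %s .. %r .. %s ." % (a, path, b)
new_style = ". {} .. {!r} .. {} .".format(a, path, b)

# !r applies repr() to the argument, just as %r does.
assert old_style == new_style
print(new_style)  # . spam .. 'C:\\a\\b\\c\\d\\e.txt' .. ham .
```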


Re: Regular expressions, help?

2012-04-19 Thread Jon Clements
On Thursday, 19 April 2012 07:11:54 UTC+1, Sania  wrote:
> Hi,
> So I am trying to get the number of casualties in a text. After 'death
> toll' in the text the number I need is presented as you can see from
> the variable called text. Here is my code
> I'm pretty sure my regex is correct, I think it's the group part
> that's the problem.
> I am using nltk by python. Group grabs the string in parenthesis and
> stores it in deadnum and I make deadnum into a list.
> 
>  text="accounts put the death toll at 637 and those missing at
> 653 , but the total number is likely to be much bigger"
>   dead=re.match(r".*death toll.*(\d[,\d\.]*)", text)
>   deadnum=dead.group(1)
>   deaths.append(deadnum)
>   print deaths
> 
> Any help would be appreciated,
> Thank you,
> Sania

Or just don't rely fully on a regex. For the sake of time, and of the little 
sanity I believe I have left, I would just do something like:

death_toll = re.search(r'death toll.*?\d+', text).group().rsplit(' ', 1)[1]
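
If you'd rather keep a capture group, here is a sketch of one way to do it.
In the pattern from the question, the greedy .* in front of the group
swallows as much as it can, so the group is left with only the last digit in
the text. Skipping non-digits with \D* avoids that.

```python
import re

text = ("accounts put the death toll at 637 and those missing at "
        "653 , but the total number is likely to be much bigger")

# \D* skips the non-digits after the phrase, so the group captures the first
# number that follows 'death toll' rather than a later one.
match = re.search(r"death toll\D*(\d[\d,.]*)", text)
death_toll = match.group(1) if match else None
print(death_toll)  # 637
```

Note that [\d,.]* will also pick up a trailing comma or full stop if one
directly follows the number, so strip those if they can occur.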

hth,

Jon.


Re: How do you refer to an iterator in docs?

2012-04-19 Thread Jon Clements
On Thursday, 19 April 2012 13:21:20 UTC+1, Roy Smith  wrote:
> Let's say I have a function which takes a list of words.  I might write 
> the docstring for it something like:
> 
> def foo(words):
>"Foo-ify words (which must be a list)"
> 
> What if I want words to be the more general case of something you can 
> iterate over?  How do people talk about that in docstrings?  Do you say 
> "something which can be iterated over to yield words", "an iterable over 
> words", or what?
> 
> I can think of lots of ways to describe the concept, but most of them 
> seem rather verbose and awkward compared to "a list of words", "a 
> dictionary whose keys are words", etc.

I would just write the function signature as (very similar to how itertools 
does it):

def func(iterable, ..):
  pass

IMHO that documents itself.

If you need explicit, look at the itertools documentation.

hth

Jon.



Re: Using arguments in a decorator

2012-04-20 Thread Jon Clements
On Friday, 20 April 2012 16:57:06 UTC+1, Rotwang  wrote:
> Hi all, here's a problem I don't know how to solve. I'm using Python 2.7.2.
> 
> I'm doing some stuff in Python which means I have cause to call 
> functions that take a while to return. Since I often want to call such a 
> function more than once with the same arguments, I've written a 
> decorator to eliminate repeated calls by storing a dictionary whose 
> items are arguments and their results:
> 
> def memo(func):
>  def memofunc(*args, **kwargs):
>  twargs = tuple(kwargs.items())
>  if (args, twargs) in memofunc.d:
>  return copy(memofunc.d[(args, twargs)])
>  memofunc.d[(args, twargs)] = func(*args, **kwargs)
>  return copy(memofunc.d[(args, twargs)])
>  memofunc.__name__ = func.__name__
>  memofunc.d = {}
>  return memofunc
> 
> 
> If a function f is decorated by memo, whenever f is called with 
> positional arguments args and keyword arguments kwargs, the decorated 
> function defines twargs as a hashable representation of kwargs, checks 
> whether the tuple (args, twargs) is in f's dictionary d, and if so 
> returns the previously calculated value; otherwise it calculates the 
> value and adds it to the dictionary (copy() is a function that returns 
> an object that compares equal to its argument, but whose identity is 
> different - this is useful if the return value is mutable).
> 
> As far as I know, the decorated function will always return the same 
> value as the original function. The problem is that the dictionary key 
> stored depends on how the function was called, even if two calls should 
> be equivalent; hence the original function gets called more often than 
> necessary. For example, there's this:
> 
>  >>> @memo
> def f(x, y = None, *a, **k):
>   return x, y, a, k
> 
>  >>> f(1, 2)
> (1, 2, (), {})
>  >>> f.d
> {((1, 2), ()): (1, 2, (), {})}
>  >>> f(y = 2, x = 1)
> (1, 2, (), {})
>  >>> f.d
> {((1, 2), ()): (1, 2, (), {}), ((), (('y', 2), ('x', 1))): (1, 2, (), {})}
> 
> 
> What I'd like to be able to do is something like this:
> 
> def memo(func):
>  def memofunc(*args, **kwargs):
>  #
>  # define a tuple consisting of values for all named positional
>  # arguments occurring in the definition of func, including
>  # default arguments if values are not given by the call, call
>  # it named
>  #
>  # define another tuple consisting of any positional arguments
>  # that do not correspond to named arguments in the definition
>  # of func, call it anon
>  #
>  # define a third tuple consisting of pairs of names and values
>  # for those items in kwargs whose keys are not named in the
>  # definition of func, call it key
>  #
>  if (named, anon, key) in memofunc.d:
>  return copy(memofunc.d[(named, anon, key)])
>  memofunc.d[(named, anon, key)] = func(*args, **kwargs)
>  return copy(memofunc.d[(named, anon, key)])
>  memofunc.__name__ = func.__name__
>  memofunc.d = {}
>  return memofunc
> 
> 
> But I don't know how. I know that I can see the default arguments of the 
> original function using func.__defaults__, but without knowing the 
> number and names of func's positional arguments (which I don't know how 
> to find out) this doesn't help me. Any suggestions?
> 
> 
> -- 
> Hate music? Then you'll hate this:
> 
> http://tinyurl.com/psymix

Possibly take a look at functools.lru_cache (which is Python 3.2+), and use the 
code from that (as it's part of the stdlib, someone must have done design and 
testing on it!). http://hg.python.org/cpython/file/default/Lib/functools.py

Jon




Re: Using arguments in a decorator

2012-04-21 Thread Jon Clements
On Saturday, 21 April 2012 09:25:40 UTC+1, Steven D'Aprano  wrote:
> On Fri, 20 Apr 2012 09:10:15 -0700, Jon Clements wrote:
> 
> >> But I don't know how. I know that I can see the default arguments of
> >> the original function using func.__defaults__, but without knowing the
> >> number and names of func's positional arguments (which I don't know how
> >> to find out) this doesn't help me. Any suggestions?
> > 
> > Possibly take a look at functools.lru_cache (which is Python 3.2+), and
> > use the code from that (as it's part of the stdlib, someone must have
> > done design and testing on it!).
> 
> With respect Jon, did you read the Original Poster's question closely? 
> Using a LRU cache doesn't even come close to fixing his problem, which 
> occurs *before* you do the lookup in the cache.

I did indeed Steven - what I was suggesting was that functools.lru_cache would 
be a good starting point. Although I will completely admit that I didn't read 
the code for functools.lru_cache thoroughly enough to realise it wouldn't be 
suitable for the OP (ie, it sounded right, looked okay at a glance, and I 
figured it wasn't a totally unreasonable assumption of a suggestion - so guess 
I fell into the old 'assume' trap! [not the first time, and won't be the last 
for sure!])

> 
> Rotwang's problem is that if you have a function with default arguments:
> 
> def func(spam=42):
> return result_of_time_consuming_calculation()
> 
> then these three function calls are identical and should (but don't) 
> share a single cache entry:
> 
> func()
> func(42)
> func(spam=42)
> 
> The OP would like all three to share a single cache entry without needing 
> two redundant calculations, which take a long time.
> 
> The problem is that the three calls give three different patterns of args 
> and kwargs:
> 
> (), {}
> (42,) {}
> (), {'spam': 42}
> 
> hence three different cache entries, two of which are unnecessary.

I'm wondering if it wouldn't be unreasonable for lru_cache to handle this.
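For what it's worth, here is a rough sketch of how the three call styles above could be collapsed into one cache key. It is written for Python 3 and leans on inspect.signature; the names memo, wrapper and cache are only illustrative, the OP's copy() step is left out, and every argument value is assumed to be hashable. On Python 2.7, inspect.getcallargs should be able to play a similar role.

```python
import functools
import inspect


def memo(func):
    """Cache results keyed on the arguments as func itself would see them."""
    sig = inspect.signature(func)
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Bind the call to func's parameters and fill in any defaults, so
        # func(), func(42) and func(spam=42) all produce the same mapping.
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        key = []
        for name, value in bound.arguments.items():
            if sig.parameters[name].kind is inspect.Parameter.VAR_KEYWORD:
                # **kwargs arrives as a dict; sort it into a hashable tuple.
                value = tuple(sorted(value.items()))
            key.append((name, value))
        key = tuple(key)
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    wrapper.cache = cache  # exposed only so the cache can be inspected
    return wrapper


calls = []


@memo
def func(spam=42):
    calls.append(spam)  # record each real invocation
    return spam * 2


func()
func(42)
func(spam=42)
```

After the three calls, calls holds a single entry and func.cache has one key, which is the behaviour the OP was after.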

Cheers, Jon.





Re: Newbie, homework help, please.

2012-04-21 Thread Jon Clements
On Saturday, 21 April 2012 18:35:26 UTC+1, someone  wrote:
> On Saturday, April 21, 2012 12:28:33 PM UTC-5, someone wrote:
> > Ok, this is my dillema, not only am I new to this programming buisness, 
> > before the last few days, I did not even know what python was, and besides 
> > opening up the internet or word documents, that is most of what I know. 
> > Yet, I have a professor who should be on Psych medication for giving us 3 
> > projects, 2 of which I have not listed here to do. I was able to do 
> > research over the last 3 days, and I have spent 3 days on this project, by 
> > borrowing others ideas on this project. Below, you will find my professors 
> > assignment (oh, and due in one week right before finals, so I am stressing 
> > out so much, cause I don't know why he is crazy enough to assign crap like 
> > this a week before finals when I have Calculus final,chem final, etc. I 
> > have figured out most of the assignment, and below, it will be posted after 
> > the teacher's post of the assignment. What I need help with, and I have 
> > tried relentlessly to find, is how to put freaking stars(asterisks) as 
> > border around a list without installing any other program to a portable 
> > python, of course, this is where my problem lies. Below, you will see what 
> > I have done, please, help!!!
> > You are required to complete and submit the following programming projects 
> > in Python by the indicated deadline:
> > 
> > Standard Header Information project (5 pts):
> > Write a program that will:
> > 1) Ask the user for the following information:
> > - name of file to be created for storing SHI
> > - user’s name (as part of SHI)
> > - user’s course and section (as part of SHI)
> > - user’s semester and year (as part of SHI)
> > - user’s assignment title (as part of SHI)
> > 2) Write the above SHI data to a text (.txt) file with the name chosen by 
> > the user (above)
> > 3) Close the file that the SHI data was written to
> > 4) Open the file with the SHI data (again)
> > 5) Read the data into different (from part 1) variable names
> > 6) Display the SHI data read from the file in the interpreter with a border 
> > around the SHI data (include a buffer of 1 line/space between the border 
> > and SHI data). An example might look like:
> > 
> > ***
> > * *
> > * First Name and Last *
> > * ENGR 109-X  *
> > * Fall 2999   *
> > * Format Example  *
> > * *
> > ***
> > 
> > 
> > textfile=input('Hello, we are about to create a text file. An example would 
> > be: (sample.txt) without the parenthesis. What ever you do name it, it 
> > needs to end in (.txt). What would you like to name your textfile?')
> > userinput=[input('What is your name?'),input('What is your Course Section 
> > and Course number?'),input('What is the Semester and year?'),input('What is 
> > the title of this class assignment?')]
> > for item in userinput:
> > openfile=open(textfile,'w');openfile.writelines("%s\n" % item for item 
> > in userinput);openfile.close()
> > x=textfile;indat=open(x,'r');SHI=indat.read()
> > def border(Sullivan):
> > string=SHI
> > stringlength=len(string)
> > stringlength=stringlength("%s\n" % item for item in stringlength) + 2 * 
> > (3 + 3)
> > hBorder=stringlength//2*"* "+"*"[:stringlength%2]
> > spacer="*"+" "*(stringlength - 2)+"*"
> > fancyText="*  "+string+"  *"
> > return(hBorder,spacer,fancyText,hBorder)
> > 
> > textTuple = border(SHI)
> > for lines in textTuple:
> > print (lines)
> 
> almost forgot, it has to have a 1 inch border around the top, bottom, left, 
> and right, with it being aligned to the left. In the picture above, that is 
> not how it actually looks, the stars to the right are aligned on the right, 
> not right next to each other. Thanks.

Honestly phrased question - well done.

Look at the textwrap module - I have no idea how you'll get an inch when outputting 
plain text, as I might have a slightly different font setting, and logical and 
physical inches are different.
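As a starting point only (not a finished assignment), the framing itself can be done with plain string methods. The sketch below assumes the border is measured in characters rather than inches; the function name border and the pad parameter are made up for illustration. Long lines could be pre-wrapped with textwrap.wrap before being passed in.

```python
def border(lines, pad=1):
    """Return the given lines framed by asterisks, left-aligned, with padding."""
    width = max(len(line) for line in lines)
    inner = width + 2 * pad              # text width plus left/right padding
    edge = "*" * (inner + 2)             # top and bottom rows
    blank = "*" + " " * inner + "*"      # empty spacer row
    framed = [edge] + [blank] * pad
    for line in lines:
        # ljust pads each line to the same width so the right edge lines up
        framed.append("*" + " " * pad + line.ljust(width) + " " * pad + "*")
    framed += [blank] * pad + [edge]
    return framed


print("\n".join(border(["First Name and Last", "ENGR 109-X", "Fall 2999"])))
```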

Good luck, Jon.


Re: Web Scraping - Output File

2012-04-26 Thread Jon Clements
  comcast.net> writes:

> 
> Hello,
> 
> I am having some difficulty generating the output I want from web
> scraping. Specifically, the script I wrote, while it runs without any
> errors, is not writing to the output file correctly. It runs, and
> creates the output .txt file; however, the file is blank (ideally it
> should be populated with a list of names).
> 
> I took the base of a program that I had before for a different data
> gathering task, which worked beautifully, and edited it for my
> purposes here. Any insight as to what I might be doing wrote would be
> highly appreciated. Code is included below. Thanks!

I would approach it like this...

import lxml.html

QUERY = '//tr[@bgcolor="#F1F3F4"][td[starts-with(@class, "body_cols")]]'

url = 'http://www.skadden.com/Index.cfm?contentID=44&alphaSearch=A'


tree = lxml.html.parse(url).getroot()
trs = tree.xpath(QUERY)
for tr in trs:
   tds = [el.text_content() for el in tr.iterfind('td')]
   print tds


hth

Jon.



Re: HTML Code - Line Number

2012-04-27 Thread Jon Clements
  comcast.net> writes:

> 
> Hello,
> 
[snip]
> Any thoughts as to how to define a function to do this, or do this
> some other way? All insight is much appreciated! Thanks.
> 


Did you not see my reply to your previous thread?

And why do you want the line number?

Jon.



Re: HTML Code - Line Number

2012-04-27 Thread Jon Clements
On Friday, 27 April 2012 18:09:57 UTC+1, smac...@comcast.net  wrote:
> Hello,
> 
> For scrapping purposes, I am having a bit of trouble writing a block
> of code to define, and find, the relative position (line number) of a
> string of HTML code. I can pull out one string that I want, and then
> there is always a line of code, directly beneath the one I can pull
> out, that begins with the following:
> 
> 
> However, because this string of HTML code above is not unique to just
> the information I need (which I cannot currently pull out), I was
> hoping there is a way to effectively say "if you find the html string
> _ in the line of HTML code above, and the string  valign="top" class="body_co  comcast.net> writes:

> 
> Hello,
> 
> I am having some difficulty generating the output I want from web
> scraping. Specifically, the script I wrote, while it runs without any
> errors, is not writing to the output file correctly. It runs, and
> creates the output .txt file; however, the file is blank (ideally it
> should be populated with a list of names).
> 
> I took the base of a program that I had before for a different data
> gathering task, which worked beautifully, and edited it for my
> purposes here. Any insight as to what I might be doing wrote would be
> highly appreciated. Code is included below. Thanks!

[quoting reply to first thread]
I would approach it like this...

import lxml.html

QUERY = '//tr[@bgcolor="#F1F3F4"][td[starts-with(@class, "body_cols")]]'

url = 'http://www.skadden.com/Index.cfm?contentID=44&alphaSearch=A'


tree = lxml.html.parse(url).getroot()
trs = tree.xpath(QUERY)
for tr in trs:
   tds = [el.text_content() for el in tr.iterfind('td')]
   print tds


hth

Jon.
[/quote]





> following, then pull everything that follows this second string.
> 
> Any thoughts as to how to define a function to do this, or do this
> some other way? All insight is much appreciated! Thanks.

  comcast.net> writes:

> 
> Hello,
> 
[snip]
> Any thoughts as to how to define a function to do this, or do this
> some other way? All insight is much appreciated! Thanks.
> 

[quote in reply to second thread]
Did you not see my reply to your previous thread?

And why do you want the line number?
[/quote]

I'm trying this on GG, as the mailing list gateway one or t'other does nee seem 
to work (mea culpa no doubt).

So may have obscured the issue more with my quoting and snipping, or what not.

Jon.


Re: key/value store optimized for disk storage

2012-05-05 Thread Jon Clements
On Friday, 4 May 2012 16:27:54 UTC+1, Steve Howell  wrote:
> On May 3, 6:10 pm, Miki Tebeka  wrote:
> > > I'm looking for a fairly lightweight key/value store that works for
> > > this type of problem:
> >
> > I'd start with a benchmark and try some of the things that are already in 
> > the standard library:
> > - bsddb
> > - sqlite3 (table of key, value, index key)
> > - shelve (though I doubt this one)
> >
> 
> Thanks.  I think I'm ruling out bsddb, since it's recently deprecated:
> 
> http://www.gossamer-threads.com/lists/python/python/106494
> 
> I'll give sqlite3 a spin.  Has anybody out there wrapped sqlite3
> behind a hash interface already?  I know it's simple to do
> conceptually, but there are some minor details to work out for large
> amounts of data (like creating the index after all the inserts), so if
> somebody's already tackled this, it would be useful to see their
> code.
> 
> > You might find that for a little effort you get enough out of one of these.
> >
> > Another module which is not in the standard library is hdf5/PyTables and in 
> > my experience very fast.
> 
> Thanks.

Could also look at Tokyo Cabinet or Kyoto Cabinet (but I believe the latter has 
slightly different licensing conditions for commercial use).
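On the sqlite3-behind-a-hash-interface question quoted above, a minimal sketch might look like the following. The class name SqliteDict and the table name kv are made up for illustration, keys and values are assumed to be strings, and committing on every write is the simplest choice rather than the fastest (a bulk load would want one transaction around many inserts).

```python
import sqlite3
from collections.abc import MutableMapping


class SqliteDict(MutableMapping):
    """Hypothetical dict-style wrapper around a single key/value table."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        # PRIMARY KEY gives sqlite an index on key for lookups.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")

    def __getitem__(self, key):
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __setitem__(self, key, value):
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value))
        self.conn.commit()

    def __delitem__(self, key):
        cursor = self.conn.execute("DELETE FROM kv WHERE key = ?", (key,))
        if cursor.rowcount == 0:
            raise KeyError(key)
        self.conn.commit()

    def __iter__(self):
        for (key,) in self.conn.execute("SELECT key FROM kv"):
            yield key

    def __len__(self):
        return self.conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
```

Because it subclasses MutableMapping, methods such as get, items and update come for free once the five methods above exist.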


Re: A question of style (finding item in list of tuples)

2012-05-21 Thread Jon Clements
On Monday, 21 May 2012 13:37:29 UTC+1, Roy Smith  wrote:
> I've got this code in a django app:
> 
> CHOICES = [
>     ('NONE', 'No experience required'),
>     ('SAIL', 'Sailing experience, new to racing'),
>     ('RACE', 'General racing experience'),
>     ('GOOD', 'Experienced racer'),
>     ('ROCK', 'Rock star'),
> ]
> 
> def experience_text(self):
>     for code, text in self.CHOICES:
>         if code == self.level:
>             return text
>     return ""
> 
> Calling experience_text("ROCK") should return "Rock star".  Annoyingly, 
> django handles this for you automatically inside a form, but if you also 
> need it in your application code, you have to roll your own.
> 
> The above code works, but it occurs to me that I could use the much 
> shorter:
> 
> def experience_text(self):
>     return dict(self.CHOICES).get(self.level, "???")
> 
> So, the question is, purely as a matter of readability, which would you 
> find easier to understand when reading some new code?  Assume the list 
> of choices is short enough that the cost of building a temporary dict on 
> each call is negligible.  I'm just after style and readability here.

Haven't used django in a while, but doesn't the model provide a 
get_FOO_display() method for any field with choices (so get_level_display() 
if the field is called level) which you could use...

Failing that, if order isn't important, you can not bother with tuples and have 
CHOICES be a dict, then pass choices=CHOICES.iteritems() as I believe it takes 
any iterable, and maybe plug in a collections.OrderedDict if order is important.
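A sketch of the dict-based variant, outside of any django model so it runs on its own. The module-level experience_text function and the choice_pairs name are illustrative only; in Python 3 the iteritems() call becomes items().

```python
from collections import OrderedDict

# Same data as the list of tuples, but directly usable for lookups.
CHOICES = OrderedDict([
    ('NONE', 'No experience required'),
    ('SAIL', 'Sailing experience, new to racing'),
    ('RACE', 'General racing experience'),
    ('GOOD', 'Experienced racer'),
    ('ROCK', 'Rock star'),
])


def experience_text(level, default=""):
    """Look up the display text for a level code."""
    return CHOICES.get(level, default)


# The (code, text) pairs that a choices= argument would want, in order.
choice_pairs = list(CHOICES.items())
```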

hth

Jon.


Re: Email Id Verification

2012-05-25 Thread Jon Clements
On Friday, 25 May 2012 14:36:18 UTC+1, Grant Edwards  wrote:
> On 2012-05-25, Steven D'Aprano  wrote:
> > On Thu, 24 May 2012 05:32:16 -0700, niks wrote:
> >
> >> Hello everyone..
> >> I am new to asp.net...
> >> I want to use Regular Expression validator in Email id verification..
> >
> > Why do you want to write buggy code that makes your users hate your 
> > program? Don't do it! Write good code, useful code! Validating email 
> > addresses is the wrong thing to do.
> 
> I have to agree with Steven.  Nothing will make your users swear at
> you as certainly as when you refuse to accept the e-mail address at
> which they receive e-mail all day every day.
> 
> -- 
> Grant Edwards   grant.b.edwardsYow! I appoint you
>   at   ambassador to Fantasy
>   gmail.comIsland!!!

Ditto.

This would be my public email, but (like most I believe) also have 'private' 
and work email addresses. 

For the OP, just checking that an email address is syntactically correct is okay-ish 
if done properly. Normally, as mentioned, you just send a confirmation email to 
said address with some id and a link that confirms it (normally with an expiry 
period). Some mail servers support the "does this mailbox exist?" request, but 
I fear these days, due to spam, most will just say no -- so the only option is 
to send and handle a bounce (and some don't even send back bounces). It's also a 
pretty good way for malicious people to make mail servers think you're trying a 
DoS.
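The OP is on asp.net, but the confirmation-link idea is language-neutral; a rough Python sketch follows. Everything here is an assumption for illustration: the one-day expiry, the in-memory _pending store (a real site would use its database), and the function names. Actually sending the email is left out.

```python
import secrets
import time

EXPIRY_SECONDS = 24 * 60 * 60  # assumed: links stay valid for one day
_pending = {}  # token -> (email, time issued); stand-in for a database table


def issue_token(email, now=None):
    """Create a hard-to-guess token to embed in the confirmation link."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _pending[token] = (email, now)
    return token


def confirm(token, now=None):
    """Return the confirmed address, or None if the token is unknown or expired."""
    now = time.time() if now is None else now
    entry = _pending.pop(token, None)  # a token can only be used once
    if entry is None:
        return None
    email, issued = entry
    if now - issued > EXPIRY_SECONDS:
        return None
    return email
```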

Although, what I'm finding useful is an option of "auth'ing" with twitter, 
facebook, google etc... Doesn't require a huge amount of work, and adds a bit 
of validity to the request.

Jon (who still didn't get any bloody Olympic tickets).


Re: Dynamic comparison operators

2012-05-25 Thread Jon Clements
> 
> Any time you find yourself thinking that you want to use eval to solve a 
> problem, take a long, cold shower until the urge goes away.
> 
> If you have to ask why eval is dangerous, then you don't know enough 
> about programming to use it safely. Scrub it out of your life until you 
> have learned about code injection attacks, data sanitation, trusted and 
> untrusted input. Then you can come back to eval and use it safely and 
> appropriately.

I would +1 QOTW - but fear might have to cheat and say +1 to 2 paragraphs of 
the week :)

Jon.


usenet reading

2012-05-25 Thread Jon Clements
Hi All,

Normally use Google Groups but it's becoming absolutely frustrating - not only 
has the interface changed to be frankly impractical, but which posts appear, and 
whether what I post shows up, seems somewhat random. (Ironically posted from GG)

Is there a server out there where I can get my news groups? I used to be with an 
ISP that hosted usenet servers, but alas, it's no longer around...

Only really interested in Python groups and C++.

Any advice appreciated,

Jon.


Re: sqlite INSERT performance

2012-05-31 Thread Jon Clements
On Thursday, 31 May 2012 16:25:10 UTC+1, duncan smith  wrote:
> On 31/05/12 06:15, John Nagle wrote:
> > On 5/30/2012 6:57 PM, duncan smith wrote:
> >> Hello,
> >> I have been attempting to speed up some code by using an sqlite
> >> database, but I'm not getting the performance gains I expected.
> >
> > SQLite is a "lite" database. It's good for data that's read a
> > lot and not changed much. It's good for small data files. It's
> > so-so for large database loads. It's terrible for a heavy load of
> > simultaneous updates from multiple processes.
> >
> 
> Once the table is created the data will not be changed at all. 
> Corresponding integer codes will have to be generated for columns. (I 
> want to do this lazily because some columns might never be needed for 
> output files, and processing all columns was relatively expensive for my 
> initial solution.) After that it's a series of 'SELECT a, b, ... FROM 
> table WHERE f="g" ORDER by a, b, ...' style queries dumped to space 
> separated text files.
> 
> > However, wrapping the inserts into a transaction with BEGIN
> > and COMMIT may help.
> >
> 
> Unfortunately there's no discernible difference.
> 
> > If you have 67 columns in a table, you may be approaching the
> > problem incorrectly.
> >
> 
> Quite possibly. I have defined start and end points. The data are 
> contained in text files. I need to do the mapping to integer codes and 
> generate output files for subsets of variables conditional on the levels 
> of other variables. (I was doing the subsequent sorting separately, but 
> if I'm using SQL I guess I might as well include that in the query.) The 
> output files are inputs for other (C++) code that I have no control over.
> 
> Any approach that doesn't consume large amounts of memory will do. Cheers.
> 
> Duncan

It might be worth checking out https://sdm.lbl.gov/fastbit/ which has Python 
bindings (nb: the library itself takes a while to compile), but I'm not 100% 
sure it would meet all your requirements.
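For reference, the transaction suggestion quoted above usually looks something like this with the standard sqlite3 module. The table layout, the index name and the generated rows are invented for the demo, and whether it helps will depend on the real data (Duncan reported no discernible difference in his case).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumed: in-memory database for the demo
conn.execute("CREATE TABLE data (a INTEGER, b INTEGER, f TEXT)")

# Made-up rows standing in for the parsed text files.
rows = ((i, i % 7, "g" if i % 2 else "h") for i in range(10000))

# One transaction around the whole batch: used as a context manager, the
# connection commits on success and rolls back if an exception is raised.
with conn:
    conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)

# Build the index after the bulk load rather than before it.
with conn:
    conn.execute("CREATE INDEX idx_f ON data (f)")

selected = conn.execute(
    "SELECT a, b FROM data WHERE f = ? ORDER BY a, b", ("g",)).fetchall()
```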

Jon


Re: DBF records API

2012-06-01 Thread Jon Clements

On 01/06/12 23:13, Tim Chase wrote:

On 06/01/12 15:05, Ethan Furman wrote:

MRAB wrote:

I'd probably think of a record as being more like a dict (or an
OrderedDict)
with the fields accessed by key:

 record["name"]

but:

 record.deleted


Record fields are accessible both by key and by attribute -- by key
primarily for those cases when the field name is in a variable:

  for field in ('full_name','nick_name','pet_name'):
  print record[field]

and since dbf record names cannot start with _ and are at most 10
characters long I've used longer than that method names... but if I want
to support dbf version 7 that won't work.


It seems to me that, since you provide both the indexing notation
and the dotted notation, just ensure that the methods such as

   dbf.scatter_fields

*always* trump and refer to the method.  This allows for convenience



of using the .field_name notation for the vast majority of cases,
but ensures that it's still possible for the user (of your API) to
use the indexing method to do things like

   value = dbf["scatter_fields"]

if they have a thusly-named field name and want its value.

-tkc


I did think about *trumping* one way or the other, but both *ugh*.

Ethan:

I think offering both is over-complicating the design for no gain, and 
invites complications later. For instance, what if you introduce a 
method/property called "last" to get the last row of a table? It'll 
cause some head-scratching, as someone will suddenly have to make sure 
your API changes don't conflict with their column names (or, if they've 
used yours as a base and introduced their own methods, that these don't 
interfere with users of their version of the library...)


To most developers, I think blah["whatever"] is perfectly clear as 
looking up a value via key is mostly done that way.


I suppose you could use __getitem__ to grab certain fields in one go (as 
per your example - from any iterable that isn't a basestring? - users 
would probably enjoy not having to keep re-typing "record.xxx", and it would 
save you having to invent another possibly conflicting name) such as:


print record['full_name', 'nick_name', 'pet_name']  # looks clean to me
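A bare-bones sketch of how that could work, in Python 3 syntax. The Record class here is hypothetical and is not the dbf package's actual implementation; it only shows that record['a', 'b'] reaches __getitem__ as a tuple.

```python
class Record:
    """Hypothetical record type, not the dbf package's actual class."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getitem__(self, key):
        # record['a'] passes a str; record['a', 'b'] passes a tuple.
        if isinstance(key, str):
            return self._fields[key]
        # Any other iterable of names yields a tuple of the matching values.
        return tuple(self._fields[name] for name in key)


record = Record(full_name="Ethan", nick_name="E", pet_name="Rex")
print(record['full_name', 'nick_name', 'pet_name'])
```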

In short I totally agree with MRAB here.

Just my 2p,

Jon.


Re: file pointer array

2012-06-06 Thread Jon Clements

On 06/06/12 18:54, Prasad, Ramit wrote:

data= []
for index in range(N, 1): # see Chris Rebert's comment
    with open('data%d.txt' % index,'r') as f:
        data.append( f.readlines() )



I think "data.extend(f)" would be a better choice.

Jon.


Re: file pointer array

2012-06-06 Thread Jon Clements

On 06/06/12 19:51, MRAB wrote:

On 06/06/2012 19:28, Jon Clements wrote:

On 06/06/12 18:54, Prasad, Ramit wrote:

data= []
for index in range(N, 1): # see Chris Rebert's comment
    with open('data%d.txt' % index,'r') as f:
        data.append( f.readlines() )



I think "data.extend(f)" would be a better choice.


.extend does something different, and "range(N, 1)" is an empty range
if N > 0.


Mea culpa -  I had it in my head the OP wanted to treat the files as one 
contiguous one. So yeah:


# something equiv to... (unless it is definitely a fixed range in which
# case (x)range can be used)
from glob import iglob
from itertools import chain

data = [ list(open(fname)) for fname in iglob('/home/jon/data*.txt') ]

# then if they ever need to treat it as a contiguous sequence...
all_data = list(chain.from_iterable(data))

Jon.


Re: Compare 2 times

2012-06-06 Thread Jon Clements

On 06/06/12 14:39, Christian Heimes wrote:

Am 06.06.2012 14:50, schrieb loial:

I have a requirement to test the creation time of a file with the
current time and raise a message if the file is  more than 15 minutes
old.

Platform is Unix.

I have looked at using os.path.getctime for the file creation time and
time.time() for the current time, but is this the best approach?


Lots of people are confused by ctime because they think 'c' stands for
creation. That's wrong. st_ctime is status change time. The ctime is
updated when you change (for example) owner or group of a file, create a
hard link etc. POSIX has no concept of creation time stamp.

Christian


I haven't thought this through too much, but perhaps an ugly 
"work-around" would be to use inotify (in some kind of daemon) to watch 
for the IN_CREATE events and store the crtime in a personal DB. Then 
possibly look at some sort of scheduling to fulfil what happens after 15 
minutes.


I'm sure there's subtleties I'm missing, but just thought it could be 
useful.
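If the last-modified time is an acceptable stand-in for creation time (an assumption, and only true if nothing rewrites the file after it is created), the check itself is short. The names is_stale and MAX_AGE_SECONDS are made up for this sketch.

```python
import os
import time

MAX_AGE_SECONDS = 15 * 60  # the 15 minute limit from the original question


def is_stale(path, max_age=MAX_AGE_SECONDS, now=None):
    """True if the file was last modified more than max_age seconds ago."""
    now = time.time() if now is None else now
    # st_mtime is the modification time; st_ctime would be the inode
    # status-change time on Unix, not the creation time.
    return now - os.stat(path).st_mtime > max_age
```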


Jon.



Re: Is that safe to use ramdom.random() for key to encrypt?

2012-06-16 Thread Jon Clements
On Sun, 17 Jun 2012 12:31:04 +1000, Chris Angelico wrote:

> On Sun, Jun 17, 2012 at 12:15 PM, Yesterday Paid
>  wrote:
>> I'm making cipher program with random.seed(), random.random() as the
>> key table of encryption.
>> I'm not good at security things and don't know much about the algorithm
>> used by random module.
> 
> For security, you don't want any algorithm, you want something like
> /dev/random (on Unix-like platforms).
> 
> I'm pretty sure Python includes crypto facilities. Unless it (most
> oddly) lacks these batteries, I would recommend using one of them
> instead.
> 
> ChrisA

Cryptography is a complex subject - I've had the (mis)fortune to study it 
briefly.

Whatever you do - *do not* attempt to write your own algorithm. 

Python includes hashlib (forms of SHA and MD5) and uuid modules, but I 
take it a symmetric or possibly public/private key system is required - 
depending on what you want to secure, where it's stored and who needs 
access.

I generally find a separate partition with an encrypted file-system 
(which is fairly straightforward on *nix systems, or I think there's a 
product out there that works with Windows) is a lot easier: it puts the 
load on the filesystem/OS instead of it having to be handled in your 
application.
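To show why random.seed()/random.random() is the wrong tool for key material, and what the standard library offers instead, here is a small sketch for current Python 3 (the secrets module did not exist when this thread was written; os.urandom and random.SystemRandom did). The 32-byte key size is just an example, and this generates random bytes only, it is not an encryption scheme.

```python
import random
import secrets

# Seeding makes the "random" stream reproducible: anyone who learns or
# guesses the seed can regenerate the whole key table.
random.seed(1)
first = random.random()
random.seed(1)
second = random.random()   # identical to first

# Key material should come from the operating system's CSPRNG instead.
key = secrets.token_bytes(32)          # 32 bytes = 256 bits

# random.SystemRandom offers the familiar random API on top of the same
# OS source; it ignores seed() and cannot be replayed.
sysrand = random.SystemRandom()
value = sysrand.getrandbits(64)
```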

Jon



Re: Is that safe to use ramdom.random() for key to encrypt?

2012-06-17 Thread Jon Clements
On Sun, 17 Jun 2012 23:17:37 +, Steven D'Aprano wrote:

> On Mon, 18 Jun 2012 08:41:57 +1000, Chris Angelico wrote:
> 
>> On Mon, Jun 18, 2012 at 3:06 AM, Rafael Durán Castañeda
>>  wrote:
>>> The language Python includes a SystemRandom class that obtains
>>> cryptographic grade random bits from /dev/urandom on a Unix-like
>>> system, including Linux and Mac OS X, while on Windows it uses
>>> CryptGenRandom.
>> 
>> /dev/urandom isn't actually cryptographically secure; it promises not
>> to block, even if it has insufficient entropy. But in your instance...
> 
> Correct. /dev/random is meant to be used for long-lasting
> cryptographically-significant uses, such as keys. urandom is not.
> 
> http://en.wikipedia.org/wiki//dev/random
> 
> 
>>> Do you think is secure enough for token generation? (40 chars long
>>> tokens are used for password reset links in a website, there isn't any
>>> special security concern for the web).
>> 
>> ... it probably is fine, since password reset tokens don't need to be
>> as secure as encryption keys (if anyone _does_ figure out how to
>> predict your password resets, all they'll be able to do is lock people
>> out of their accounts one by one, not snoop on them all unbeknownst,
>> and you'll be able to see log entries showing the resets - you DO log
>> them, right?). In fact, you could probably get away with something
>> pretty trivial there, like a SHA1 of the current timestamp, the user
>> name, and the user's current password hash. The chances that anybody
>> would be able to exploit that are fairly low, given that you're not a
>> bank or other high-profile target.
> 
> If I were an identity thief, I would *love* low-profile targets. Even
> though the payoff would be reduced, the cost would be reduced even more:
> 
> - they tend to be complacent, even more so than high-profile targets;
> 
> - they tend to be smaller, with fewer resources for security;
> 
> - mandatory disclosure laws tend not to apply to them;
> 
> - they don't tend to have the resources to look for anomalous usage
>   patterns, if they even cared enough to want to.
> 
> 
> If there was a Facebook-like website that wasn't Facebook[1], but still
> with multiple tens of thousands of users, I reckon a cracker who didn't
> vandalise people's accounts could steal private data from it for *years*
> before anyone noticed, and months or years more before they did
> something about it.
> 
> 
> 
> [1] And very likely a Facebook-like website that *was* Facebook. I
> reckon the odds are about 50:50 that FB would prefer to keep a breach
> secret than risk the bad publicity by fixing it.
> 
> 
> --
> Steven

I'm reminded of:

http://xkcd.com/936/
http://xkcd.com/792/

There's also one where it's pointed out it's easier to brute force a 
person who has the code, than brute force the computer. [but can't find 
that one at the moment]


Re: How does this work?

2011-06-04 Thread Jon Clements
On Jun 5, 4:37 am, Ben Finney  wrote:
>  writes:
> > I was surfing around looking for a way to split a list into equal
> > sections. I came upon this algorithm:
>
> > >>> f = lambda x, n, acc=[]: f(x[n:], n, acc+[(x[:n])]) if x else acc
> > >>> f("Hallo Welt", 3)
> > ['Hal', 'lo ', 'Wel', 't']
>
> > (http://stackoverflow.com/questions/312443/how-do-you-split-a-list-int...)
>
> This is an excellent example of why “clever” code is to be shunned.
> Whoever wrote this needs to spend more time trying to get their code
> past a peer review; the above would be rejected until it was re-written
> to be clear.
>
> Here is my attempt to write the above to be clear (and fixing a couple
> of bugs too):
>
>     def split_slices(seq, slicesize, accumulator=None):
>         """ Return a list of slices from `seq` each of size `slicesize`.
>
>             :param seq: The sequence to split.
>             :param slicesize: The maximum size of each slice.
>             :param accumulator: A sequence of existing slices to which
>                 ours should be appended.
>             :return: A list of the slices. Each item will be a slice
>                 from the original `seq` of `slicesize` length; the last
>                 item may be shorter if there were fewer than `slicesize`
>                 items remaining.
>
>             """
>         if accumulator is None:
>             accumulator = []
>         if seq:
>             slice = seq[:slicesize]
>             result = split_slices(
>                 seq[slicesize:], slicesize, accumulator + [slice])
>         else:
>             result = accumulator
>         return result
>
> > It doesn't work with a huge list, but looks like it could be handy in
> > certain circumstances. I'm trying to understand this code, but am
> > totally lost. I know a little bit about lambda, as well as the ternary
> > operator
>
> In Python, ‘lambda’ is merely an alternative syntax for creating
> function objects. The resulting object *is* a function, so I've written
> the above using the ‘def’ syntax for clarity.
>
> The ternary operator is often useful for very simple expressions, but
> quickly becomes too costly to read when the expression is complex. The
> above is one where the writer is so much in love with the ternary
> operator that they have crammed far too much complexity into a single
> expression.
>
> > Just curious if anyone could explain how this works or maybe share a link
> > to a website that might explain this?
>
> Does the above help?
>
> --
>  \       “We must find our way to a time when faith, without evidence, |
>   `\    disgraces anyone who would claim it.” —Sam Harris, _The End of |
> _o__)                                                     Faith_, 2004 |
> Ben Finney

Just my 2p, but isn't the itertools "grouper" recipe prudent?
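
For reference, a rough Python 3 take on that recipe (the itertools docs
carry the canonical version; the '' fill value and the join back into
strings are my own additions, to match the output quoted above):

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length chunks, padding the last one."""
    # n references to the *same* iterator, so zip pulls n items per tuple
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

chunks = [''.join(group) for group in grouper('Hallo Welt', 3, fillvalue='')]
print(chunks)  # ['Hal', 'lo ', 'Wel', 't']
```

No recursion involved, so a huge list is not a problem either.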
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Need help with simple OOP Python question

2011-09-05 Thread Jon Clements
On Sep 5, 3:43 pm, Peter Otten <__pete...@web.de> wrote:
> Kristofer Tengström wrote:
> > Thanks everyone, moving the declaration to the class's __init__ method
> > did the trick. Now there's just one little problem left. I'm trying to
> > create a list that holds the parents for each instance in the
> > hierarchy. This is what my code looks like now:
>
> > -
>
> > class A:
> >     def __init__(self, parents=None):
> >         self.sub = dict()
> >         if parents:
>
> You should explicitly test for None here; otherwise in a call like
>
> ancestors = []
> a = A(ancestors)
>
> the list passed as an argument will not be used, which makes for confusing
> behaviour.
>
> >             self.parents = parents
> >         else:
> >             self.parents = []
> >     def sub_add(self, cls):
> >         hierarchy = self.parents
> >         hierarchy.append(self)
>
> Here you are adding self to the parents (that should be called ancestors)
> and pass it on to cls(...). Then -- because it's non-empty -- it will be
> used by the child, too, and you end up with a single parents list.
>
> >         obj = cls(hierarchy)
> >         self.sub[obj.id] = obj
>
> While the minimal fix is to pass a copy
>
> def sub_add(self, cls):
>     obj = cls(self.parents + [self])
>     self.sub[obj.id] = obj
>
> I suggest that you modify your node class to keep track only of the direct
> parent instead of all ancestors. That makes the implementation more robust
> when you move a node to another parent.

I may not be understanding the OP correctly, but going by what you've
put here, I might be tempted to take this kind of stuff out of the
classes and use a graph library (such as networkx) - that way if
traversal is necessary, it might be a lot easier. But once again, I
must say I'm not 100% sure what the OP wants to achieve...
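
Short of pulling in a graph library, the "direct parent only" idea
quoted above might look roughly like this. It is only a sketch: Node,
sub_add and ancestors are names I've made up, and I'm guessing at what
the OP's hierarchy is for.

```python
class Node:
    """Tree node that remembers only its direct parent (illustrative names)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.sub = {}

    def sub_add(self, name):
        # The child gets a reference to us; no shared ancestor list to corrupt.
        child = Node(name, parent=self)
        self.sub[name] = child
        return child

    def ancestors(self):
        """Walk up the parent links; return ancestors from the root down."""
        chain = []
        node = self.parent
        while node is not None:
            chain.append(node)
            node = node.parent
        chain.reverse()
        return chain

root = Node('root')
child = root.sub_add('child')
grandchild = child.sub_add('grandchild')
print([n.name for n in grandchild.ancestors()])  # ['root', 'child']
```

Moving a node then only means changing one parent attribute, as the
ancestor list is worked out on demand.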

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: pattern matching

2011-02-24 Thread Jon Clements
On Feb 24, 2:11 am, monkeys paw  wrote:
> if I have a string such as '01/12/2011' and i want
> to reformat it as '20110112', how do i pull out the components
> of the string and reformat them into a DDMM format?
>
> I have:
>
> import re
>
> test = re.compile('\d\d\/')
> f = open('test.html')  # This file contains the html dates
> for line in f:
>      if test.search(line):
>          # I need to pull the date components here

I second using an html parser to extract the content of the TDs, but I
would also go one step further with the reformatting and do something such as:

>>> from time import strptime, strftime
>>> d = '01/12/2011'
>>> strftime('%Y%m%d', strptime(d, '%m/%d/%Y'))
'20110112'

That way you get some validation about the data, ie, if you get
'13/12/2011' you've probably got mixed data formats.
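
Something along these lines shows the validation side of it. A small
Python 3 sketch using datetime rather than time; reformat_date is a
name I've made up, and returning None for bad input is just one choice:

```python
from datetime import datetime

def reformat_date(text):
    """Turn 'MM/DD/YYYY' into 'YYYYMMDD'; return None if it does not parse."""
    try:
        return datetime.strptime(text, '%m/%d/%Y').strftime('%Y%m%d')
    except ValueError:
        # e.g. month 13: probably a DD/MM/YYYY value mixed into the data
        return None

print(reformat_date('01/12/2011'))  # 20110112
print(reformat_date('13/12/2011'))  # None
```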


hth

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: TextWrangler "run" command not working properly

2011-04-14 Thread Jon Clements
On Apr 14, 9:52 pm, Fabio  wrote:
> Hi to all,
> I have troubles with TextWrangler "run" command in the "shebang" (#!)
> menu.
> I am on MacOSX 10.6.7.
> I have the "built-in" Python2.5 which comes installed by "mother Apple".
> Then I installed Python2.6, and left 2.5 untouched (I was suggested to
> leave it on the system, since "something might need it").
>
> I ran the "Update Shell Profile.command", and now if I launch "python"
> in the terminal it happily launches the 2.6 version.
> Then I installed some libraries (scipy and matplotlib) on this newer 2.6
> version.
> They work, and everything is fine.
>
> Then, I started to use TextWrangler, and I wanted to use the "shebang"
> menu, and "run" command.
> I have the "#! first line" pointing to the 2.6 version.
> It works fine, as long as I don't import the libraries, in which case it
> casts an error saying:
>
> ImportError: No module named scipy
>
> Maybe for some reason it points to the old 2.5 version.
> But I might be wrong and the problem is another...
>
> I copy here the first lines in the terminal window if i give the "run in
> terminal" command
>
> Last login: Thu Apr 14 22:38:26 on ttys000
> Fabio-Mac:~ fabio$
> /var/folders/BS/BSS71XvjFKiJPH3Wqtx90k+++TM/-Tmp-/Cleanup\ At\
> Startup/untitled\ text-324506443.860.command ; exit;
> Traceback (most recent call last):
>   File "/Users/fabio/Desktop/test.py", line 3, in <module>
>     import scipy as sp
> ImportError: No module named scipy
> logout
>
> [Process completed]
>
> where the source (test.py) contains just:
>
> #!/usr/bin/python2.6
>
> import scipy as sp
>
> print "hello world"
>
> Any clue?
>
> Thanks
>
> Fabio

http://www.velocityreviews.com/forums/t570137-textwrangler-and-new-python-version-mac.html
?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Finding empty columns. Is there a faster way?

2011-04-21 Thread Jon Clements
On Apr 21, 5:40 pm, nn  wrote:
> time head -100 myfile  >/dev/null
>
> real    0m4.57s
> user    0m3.81s
> sys     0m0.74s
>
> time ./repnullsalt.py '|' myfile
> 0 1 Null columns:
> 11, 20, 21, 22, 23, 24, 25, 26, 27, 30, 31, 33, 45, 50, 68
>
> real    1m28.94s
> user    1m28.11s
> sys     0m0.72s
>
> import sys
> def main():
>     with open(sys.argv[2],'rb') as inf:
>         limit = sys.argv[3] if len(sys.argv)>3 else 1
>         dlm = sys.argv[1].encode('latin1')
>         nulls = [x==b'' for x in next(inf)[:-1].split(dlm)]
>         enum = enumerate
>         split = bytes.split
>         out = sys.stdout
>         prn = print
>         for j, r in enum(inf):
>             if j%100==0:
>                 prn(j//100,end=' ')
>                 out.flush()
>                 if j//100>=limit:
>                     break
>             for i, cur in enum(split(r[:-1],dlm)):
>                 nulls[i] |= cur==b''
>     print('Null columns:')
>     print(', '.join(str(i+1) for i,val in enumerate(nulls) if val))
>
> if not (len(sys.argv)>2):
>     sys.exit("Usage: "+sys.argv[0]+
>          "   ")
>
> main()


What's with the aliasing enumerate and print??? And on heavy disk IO I
can hardly see that name lookups are going to be any problem at all?
And why the time stats with /dev/null ???


I'd probably go for something like:

import csv

with open('somefile') as fin:
    nulls = set()
    for row in csv.reader(fin, delimiter='|'):
        nulls.update(idx for idx, val in enumerate(row, start=1) if not val)
    print('nulls =', sorted(nulls))
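
To try the idea without a real file, here is the same loop wrapped in a
function and fed from a string (Python 3; null_columns and the sample
data are made up for illustration):

```python
import csv
import io

def null_columns(lines, delimiter='|'):
    """Return the sorted 1-based numbers of columns holding any empty value."""
    nulls = set()
    for row in csv.reader(lines, delimiter=delimiter):
        nulls.update(idx for idx, val in enumerate(row, start=1) if not val)
    return sorted(nulls)

sample = io.StringIO("a||c|\n1|2||\n")
print(null_columns(sample))  # [2, 3, 4]
```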

hth
Jon
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: checking if a list is empty

2011-05-06 Thread Jon Clements
On May 7, 12:51 am, Ian Kelly  wrote:
> On Fri, May 6, 2011 at 4:21 PM, Philip Semanchuk  wrote:
> > What if it's not a list but a tuple or a numpy array? Often I just want to 
> > iterate through an element's items and I don't care if it's a list, set, 
> > etc. For instance, given this function definition --
>
> > def print_items(an_iterable):
> >    if not an_iterable:
> >        print "The iterable is empty"
> >    else:
> >        for item in an_iterable:
> >            print item
>
> > I get the output I want with all of these calls:
> > print_items( list() )
> > print_items( tuple() )
> > print_items( set() )
> > print_items( numpy.array([]) )
>
> But sadly it fails on iterators:
> print_items(xrange(0))
> print_items(-x for x in [])
> print_items({}.iteritems())

My stab:

from itertools import chain

def print_it(iterable):
    it = iter(iterable)
    try:
        head = next(it)
    except StopIteration:
        print 'Empty'
        return
    for el in chain( (head,), it ):
        print el

Not sure if I'm truly happy with that though.
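
An alternative that avoids peeking at the first item is a plain flag.
This is a Python 3 sketch; the `out` parameter is my own addition so
the output can be captured, and is not part of anyone's posted code:

```python
def print_items(iterable, out=print):
    """Send each item to `out`, or 'Empty' if the iterable yields nothing."""
    empty = True
    for item in iterable:
        empty = False
        out(item)
    if empty:
        out('Empty')

print_items(-x for x in [])   # Empty
print_items(range(3))         # 0, 1, 2 on separate lines
```

It works the same for lists, sets and one-shot iterators, as it never
asks the object for its truth value.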

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Wrote a new library - Comments and suggestions please!

2011-09-26 Thread Jon Clements
On Sep 26, 12:23 pm, Tal Einat  wrote:
> The library is called RunningCalcs and is useful for running several
> calculations on a single iterable of values.
>
> https://bitbucket.org/taleinat/runningcalcs/
> http://pypi.python.org/pypi/RunningCalcs/
>
> I'd like some input on how this could be made more useful and how to
> spread the word about it.
>
> The library contains the base RunningCalc class and implementations of
> sub-classes for common calculations: sum, min/max, average, variance &
> standard deviation, n-largest & n-smallest. Additionaly a utility
> function apply_in_parallel() is supplied which makes running several
> calculations on an iterable easy (and fast!).
>
> Straight-forward example:
>
> mean_rc, stddev_rc = RunningMean(), RunningStdDev()
> for x in values:
>     mean_rc.feed(x)
>     stddev_rc.feed(x)
> mean, stddev = mean_rc.value, stddev_rc.value
>
> Examples using apply_in_parallel():
>
> mean, stddev = apply_in_parallel(values, [RunningMean(),
> RunningStdDev()])
> five_smallest, five_largest = apply_in_parallel(values,
> [RunningNSmallest(5), RunningNLargest(5)])
>
> Comments and suggestions would be highly appreciated!

You may not have heard of it, but the SAS language has something called
PROC FREQ... I'm imagining that maybe this is where you should be
taking this. Sorry I can't comment on the code, as I haven't really
got time, but have a look! (I'd be willing to invest some time with
you, if you agree that's where something like this should be going...)

Cheers,

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Wrote a new library - Comments and suggestions please!

2011-09-27 Thread Jon Clements
On Sep 27, 6:33 pm, Steven D'Aprano  wrote:
> Robert Kern wrote:
> > On 9/27/11 10:24 AM, Tal Einat wrote:
> >> I don't work with SAS so I have no reason to invest any time developing
> >> for it.
>
> >> Also, as far as I can tell, SAS is far from free or open-source, meaning
> >> I definitely am not interested in developing for it.
>
> > I don't think he's suggesting that you drop what you are doing in Python
> > and start working with SAS. He is suggesting that you look at the similar
> > procedures that exist in the SAS standard library for inspiration.
>
> Yeah, inspiration on what *not* to do.
>
> I googled on "SAS PROC FREQ" and found this:
>
> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/defau...
>
> All the words are in English, but I have no idea what the function does, how
> you would call it, and what it returns. Would it have been so hard to show
> a couple of examples?
>
> Documentation like that really makes me appreciate the sterling work done on
> Python's docs.
>
> This tutorial:
>
> http://www2.sas.com/proceedings/sugi30/263-30.pdf
>
> is much clearer.
>
> --
> Steven

Yes - I definitely do not like the SAS docs - in fact, when I last had
to "buy" the product, it was something like £5k for the "BASE" system,
then if I wanted ODBC it was another £900, and the "proper" manuals
were something stupid like another £1k (and only in hard copy) - this
was a good 5/6 years ago though... (oh, and for a very basic course,
it was £1.2k a day for staff to train) *sighs* [oh, and if I wanted a
'site' licence, we were talking 6 digits]

Anyway, Robert Kern correctly interpreted me. I was not suggesting to
the OP that he move to SAS (heaven forbid), I was indeed suggesting
that he look into what similar systems have (that I have experience
with and appreciate), and he acknowledges that is not present in
Python, and ummm, take inspiration and quite possibly "rip 'em off".

A decent tabulate/cross-tabulation and statistics related there-to
library is something I'd be willing to assist with and put time into.
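
To give a flavour of the sort of thing I mean, a bare-bones
cross-tabulation might start out like this. It is purely a sketch:
crosstab and the sample records are made up, and it is not how PROC
FREQ or RunningCalcs does anything.

```python
from collections import Counter

def crosstab(records, row_field, col_field):
    """Count how often each (row value, column value) pair occurs."""
    return Counter((rec[row_field], rec[col_field]) for rec in records)

survey = [
    {'dept': 'ops', 'passed': 'yes'},
    {'dept': 'ops', 'passed': 'no'},
    {'dept': 'dev', 'passed': 'yes'},
    {'dept': 'ops', 'passed': 'yes'},
]
table = crosstab(survey, 'dept', 'passed')
for (dept, passed), count in sorted(table.items()):
    print(dept, passed, count)
```

Percentages, marginal totals and tests of association would be layered
on top of counts like these.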

Cheers,

Jon.



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Simplest way to resize an image-like array

2011-09-30 Thread Jon Clements
On Sep 30, 5:40 pm, John Ladasky  wrote:
> Hi folks,
>
> I have 500 x 500 arrays of floats, representing 2D "grayscale" images,
> that I need to resample at a lower spatial resolution, say, 120 x 120
> (details to follow, if you feel they are relevant).
>
> I've got the numpy, and scipy, and matplotlib. All of these packages
> hint at the fact that they have the capability to resample an image-
> like array.  But after reading the documentation for all of these
> packages, none of them make it straightforward, which surprises me.
> For example, there are several spline and interpolation methods in
> scipy.interpolate.  They seem to return interpolator classes rather
> than arrays.  Is there no simple method which then calls the
> interpolator, and builds the resampled array?
>
> Yes, I can do this myself if I must -- but over the years, I've come
> to learn that a lot of the code I want is already written, and that
> sometimes I just have to know where to look for it.
>
> Thanks!

Is something like 
http://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.imresize.html#scipy.misc.imresize
any use?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Usefulness of the "not in" operator

2011-10-08 Thread Jon Clements
On Oct 8, 11:42 am, candide  wrote:
> Python provides
>
>      -- the not operator, meaning logical negation
>      -- the in operator, meaning membership
>
> On the other hand, Python provides the not in operator meaning
> non-membership. However, it seems we can reformulate any "not in"
> expression using only "not" and "in" operation. For instance
>
>  >>> 'th' not in "python"
> False
>
>  >>> not ('th' in "python")
> False
>  >>>
>
> So what is the usefulness of the "not in" operator ? Recall what Zen of
> Python tells
>
> There should be one-- and preferably only one --obvious way to do it.

You would seriously prefer the latter?

Guess I'll have to start writing stuff like:

10 - 5 as 10 + -5 (as obviously the - is redundant as an operation),
and 10 / 2 as int(10 * .5) or something, who needs a divide!?

Jokely yours,

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Reading a file into a data structure....

2011-10-13 Thread Jon Clements
On Oct 13, 10:59 pm, MrPink  wrote:
> This is a continuation of a post I made in August:
> http://groups.google.com/group/comp.lang.python/browse_thread/thread/...
>
> I got some free time to work with Python again and have some followup
> questions.
>
> For example, I have a list in a text file like this:
> Example list of lottery drawings:
> date,wb,wb,wb,wb,wb,bb
> 4/1/2011,5,1,45,23,27,27
> 5/1/2011,15,23,8,48,22,32
> 6/1/2011,33,49,21,16,34,1
> 7/1/2011,9,3,13,22,45,41
> 8/1/2011,54,1,24,39,35,18
> 
>
> Ticket:
> startdate,enddate,wb,wb,wb,wb,wb,bb
> 4/1/2011,8/1/2011,5,23,32,21,3,27
>
> I am trying to determine the optimal way to organize the data
> structure of the drawing list, search the drawing list, and mark the
> matches in the drawing list.
>
> f = open("C:\temp\drawinglist.txt", "r")
> lines = f.readlines()
> f.close()
> drawing = lines[1].split()
>
> The results in drawing is this:
> drawing[0] = '4/1/2011'
> drawing[1] = '5'
> drawing[2] = '1'
> drawing[3] = '45'
> drawing[4] = '23'
> drawing[5] = '27'
> drawing[6] = '27'
>
> I need to convert drawing[0] to a date datatype.  This works, but I'm
> sure there is a better way.
> from datetime import date
> month, day, year = drawing[0].split('/')
> drawing[0] = date(int(year), int(month), int(day))
>
> For searching, I need to determine if the date of the drawing is
> within the date range of the ticket.  If yes, then mark which numbers
> in the drawing match the numbers in the ticket.
>
> ticket[0] = '4/1/2011'
> ticket[1] = '8/1/2011'
> ticket[2] = '5'
> ticket[3] = '23'
> ticket[4] = '32'
> ticket[5] = '21'
> ticket[6] = '3'
> ticket[7] = '27'
>
> drawing[0] = '4/1/2011' (match)
> drawing[1] = '5' (match)
> drawing[2] = '1'
> drawing[3] = '45'
> drawing[4] = '23' (match)
> drawing[5] = '27'
> drawing[6] = '27' (match)
>
> I'm debating on structuring the drawing list like this:
> drawing[0] = '4/1/2011'
> drawing[1][0] = '5'
> drawing[1][1] = '1'
> drawing[1][2] = '45'
> drawing[1][3] = '23'
> drawing[1][4] = '27'
> drawing[2] = '27'
>
> Sort drawing[1] from low to high
> drawing[1][0] = '1'
> drawing[1][1] = '5'
> drawing[1][2] = '23'
> drawing[1][3] = '27'
> drawing[1][4] = '45'
>
> I want to keep the drawing list in memory for reuse.
>
> Any guidance would be most helpful and appreciated.
> BTW, I want to learn, so be careful not to do too much of the work for
> me.
> I'm using WingIDE to do my work.
>
> Thanks,

- Use the csv module to read the file
- Use strptime to process the date field
- Use a set for draw numbers (you'd have to do pure equality on the
bb)
- Look at persisting in a sqlite3 DB (maybe with a custom converter)
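
Without doing all of the work, the first three points fit together
roughly like this. It is a Python 3 sketch: the function names are
mine, the data is the first two drawings from your post, and I'm
assuming the dates are month/day/year as in your own split.

```python
import csv
import io
from datetime import datetime

DRAWINGS = """date,wb,wb,wb,wb,wb,bb
4/1/2011,5,1,45,23,27,27
5/1/2011,15,23,8,48,22,32
"""

def parse_date(text):
    return datetime.strptime(text, '%m/%d/%Y').date()

def read_drawings(fileobj):
    """Yield (date, set of white balls, bonus ball) for each data row."""
    reader = csv.reader(fileobj)
    next(reader)  # skip the header line
    for row in reader:
        yield parse_date(row[0]), {int(n) for n in row[1:6]}, int(row[6])

def matches(drawings, start, end, ticket_wb, ticket_bb):
    """List (date, matching white balls, bonus matched?) within the range."""
    return [(when, sorted(wb & ticket_wb), bb == ticket_bb)
            for when, wb, bb in drawings if start <= when <= end]

found = matches(read_drawings(io.StringIO(DRAWINGS)),
                parse_date('4/1/2011'), parse_date('8/1/2011'),
                {5, 23, 32, 21, 3}, 27)
for when, white, bonus in found:
    print(when, white, bonus)
```

The set intersection (&) does the "mark the matches" part; the bonus
ball is a straight comparison as it is drawn separately.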

hth,

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Looking for browser emulator

2011-10-13 Thread Jon Clements
On Oct 14, 3:19 am, Roy Smith  wrote:
> I've got to write some tests in python which simulate getting a page of
> HTML from an http server, finding a link, clicking on it, and then
> examining the HTML on the next page to make sure it has certain features.
>
> I can use urllib to do the basic fetching, and lxml gives me the tools
> to find the link I want and extract its href attribute.  What's missing
> is dealing with turning the href into an absolute URL that I can give to
> urlopen().  Browsers implement all sorts of stateful logic such as "if
> the URL has no hostname, use the same hostname as the current page".  
> I'm talking about something where I can execute this sequence of calls:
>
> urlopen("http://foo.com:/bar")
> urlopen("/baz")
>
> and have the second one know that it needs to get
> "http://foo.com:/baz".  Does anything like that exist?
>
> I'm really trying to stay away from Selenium and go strictly with
> something I can run under unittest.

lxml.html.make_links_absolute() ?
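
If you only need to resolve the odd href by hand, the standard
library's urljoin does the "same host as the current page" logic.
A Python 3 sketch (the 8080 port is just something I've picked for
the example):

```python
from urllib.parse import urljoin

current_page = "http://foo.com:8080/bar"   # hypothetical page URL

print(urljoin(current_page, "/baz"))                  # http://foo.com:8080/baz
print(urljoin(current_page, "other.html"))            # http://foo.com:8080/other.html
print(urljoin(current_page, "http://example.org/x"))  # http://example.org/x
```

You'd keep track of the current page's URL yourself and join each
extracted href against it before calling urlopen().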
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Loop through a dict changing keys

2011-10-16 Thread Jon Clements
On Oct 16, 12:53 am, PoD  wrote:
> On Sat, 15 Oct 2011 11:00:17 -0700, Gnarlodious wrote:
> > What is the best way (Python 3) to loop through dict keys, examine the
> > string, change them if needed, and save the changes to the same dict?
>
> > So for input like this:
> > {'Mobile': 'string', 'context': '', 'order': '7',
> > 'time': 'True'}
>
> > I want to booleanize 'True', turn '7' into an integer, escape
> > '', and ignore 'string'.
>
> > Any elegant Python way to do this?
>
> > -- Gnarlie
>
> How about
>
> data = {
>     'Mobile': 'string',
>     'context': '',
>     'order': '7',
>     'time': 'True'}
> types={'Mobile':str,'context':str,'order':int,'time':bool}
>
> for k,v in data.items():
>     data[k] = types[k](v)

Bit of nit-picking, but:

>>> bool('True')
True
>>> bool('False')
True
>>> bool('')
False
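
So the 'time' entry wants its own converter rather than bool. One
possible sketch (to_bool is a made-up helper, and I've left out the
'context' key):

```python
def to_bool(text):
    """Map the strings 'True'/'False' (any case) to real booleans."""
    lowered = text.strip().lower()
    if lowered == 'true':
        return True
    if lowered == 'false':
        return False
    raise ValueError('not a boolean string: %r' % text)

data = {'Mobile': 'string', 'order': '7', 'time': 'True'}
types = {'Mobile': str, 'order': int, 'time': to_bool}

converted = {k: types[k](v) for k, v in data.items()}
print(converted)  # {'Mobile': 'string', 'order': 7, 'time': True}
```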
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: understand program used to create file

2011-11-01 Thread Jon Clements
On Nov 1, 7:27 pm, pacopyc  wrote:
> Hi, I have about 1 files .doc and I want know the program used to
> create them: writer? word? abiword? else? I'd like develop a script
> python to do this. Is there a module to do it? Can you help me?
>
> Thanks

My suggestion would be the same as DaveA's.

This gives you the format it was *written* in.
(Saved a blank OO document as 95/97/XP Word DOC under Linux)

jon@forseti:~/filetest$ file *
saved-by-OO.doc: CDF V2 Document, Little Endian, Os: Windows, Version
1.0, Code page: -535, Author: jon , Revision Number: 0, Create Time/
Date: Mon Oct 31 20:47:30 2011

I'd be impressed if you could discover the program that did *write*
it; I'd imagine you'd need something that understood some meta-data in
the format (if the format has a kind of 'created_by' field, for
instance), or depend on nuances which give away that a certain
program wrote data in another's native format.

Assuming the former, what might be possible:

1) Grab a "magic number" lookup list
2) Grab 8 (I think that should be all that's needed, but hey ummm..)
bytes from the start of each file
3) Look it up in the "magic number" list
4) If you got something great, if not compare 7, 6, 5, 4 bytes...
etc... until you get a hit or bail out
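
Steps 1 to 4 could be sketched like so. The table is a tiny
hand-picked sample, nowhere near a full "magic number" list, and
identify/identify_file are names I've made up. Note it tells you the
container format, not the program that wrote it.

```python
# A few well-known file signatures; a real list would be far longer.
MAGIC = {
    b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1': 'OLE2 compound document (e.g. old-style .doc)',
    b'PK\x03\x04': 'ZIP container (e.g. .docx, .odt)',
    b'{\\rtf': 'Rich Text Format',
    b'%PDF': 'PDF',
}

def identify(header):
    """Try the longest prefix first, then shorter ones, until one is known."""
    for length in range(min(8, len(header)), 0, -1):
        kind = MAGIC.get(header[:length])
        if kind is not None:
            return kind
    return None

def identify_file(path):
    with open(path, 'rb') as f:
        return identify(f.read(8))

print(identify(b'PK\x03\x04\x14\x00\x06\x00'))  # ZIP container (e.g. .docx, .odt)
```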

(Or just find a Windows port of 'file')

HTH

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: simple file flow question with csv.reader

2011-11-02 Thread Jon Clements
On Nov 2, 11:50 pm, Terry Reedy  wrote:
> On 11/2/2011 7:06 PM, Dennis Lee Bieber wrote:
>
> > On Wed, 2 Nov 2011 14:13:34 -0700 (PDT), Matt
> > declaimed the following in gmane.comp.python.general:
>
> >> I have a few hundred .csv files, and to each file, I want to
> >> manipulate the data, then save back to the original file.
>
> That is dangerous. Better to replace the file with a new one of the same
> name.
>
> > Option 1:  Read the file completely into memory (your example is
> > reading line by line); close the reader and its file; reopen the
> > file for "wb" (delete, create new); open CSV writer on that file;
> > write the memory contents.
>
> and lose data if your system crashes or freezes during the write.
>
> > Option 2:  Open a temporary file "wb"; open a CSV writer on the file;
> > for each line from the reader, update the data, send to the writer;
> > at end of reader, close reader and file; delete original file;
> > rename temporary file to the original name.
>
> This works best if new file is given a name related to the original
> name, in case rename fails. Alternative is to rename original x to
> x.bak, write or rename new file, then delete .bak file.
>
> --
> Terry Jan Reedy

To the OP, I agree with Terry, but will add my 2p.

What is this meant to achieve?

>>> row = range(10)
>>> print ">",row[0],row[4],"\n",row[1], "\n", ">", row[2], "\n", row[3]
> 0 4
1
> 2
3

Is something meant to read this afterwards?

I'd personally create a subdir called db, create a sqlite3 db, then
load all the required fields into it (with a column for filename)...
it will either work or fail, then if it succeeds, start overwriting
the originals - just a "select * from some_table" will do, using
itertools.groupby on the filename column, changing the open() request
etc...
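
Very roughly, and with everything made up for illustration (table and
column names, the sample rows, and an in-memory DB standing in for one
kept in a db subdir), the staging and read-back could look like:

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

# Stand-in for the fields already pulled out of two .csv files.
parsed = [
    ('a.csv', 1, 'x', '1'),
    ('a.csv', 2, 'y', '2'),
    ('b.csv', 1, 'z', '3'),
]

conn = sqlite3.connect(':memory:')
with conn:  # commits if the whole load succeeds, rolls back if it fails
    conn.execute('CREATE TABLE staged '
                 '(filename TEXT, line_no INTEGER, name TEXT, value TEXT)')
    conn.executemany('INSERT INTO staged VALUES (?, ?, ?, ?)', parsed)

query = 'SELECT filename, name, value FROM staged ORDER BY filename, line_no'
rewritten = {}
for filename, rows in groupby(conn.execute(query), key=itemgetter(0)):
    # Here you would open(filename, 'w') and write the rows out instead.
    rewritten[filename] = [row[1:] for row in rows]
print(rewritten)
```

The ORDER BY matters: groupby only groups adjacent rows, so the
filename column has to come back sorted.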

just my 2p mind you,

Jon.






-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Get keys from a dicionary

2011-11-11 Thread Jon Clements
On Nov 11, 1:31 pm, macm  wrote:
> Hi Folks
>
> I pass a nested dictionary to a function.
>
> def Dicty( dict[k1][k2] ):
>         print k1
>         print k2
>
> There is a fast way (trick) to get k1 and k2 as string.
>
> Whithout loop all dict. Just it!
>
> Regards
>
> macm

I've tried to understand this, but can't tell if it's a question or
statement, and even then can't tell what the question or statement
is...

Care to elaborate?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: my new project, is this the right way?

2011-11-14 Thread Jon Clements
On Nov 14, 10:41 am, Tracubik  wrote:
> Hi all,
> i'm developing a new program.
> Mission: learn a bit of database management
> Idea: create a simple, 1 window program that show me a db of movies i've
> seen with few (<10) fields (actors, name, year etc)
> technologies i'll use: python + gtk
> db: that's the question
>
> since i'm mostly a new-bye for as regard databases, my idea is to use
> sqlite at the beginning.
>
> Is that ok? any other db to start with? (pls don't say mysql or similar,
> they are too complex and i'll use this in a second step)
>
> is there any general tutorial of how to start developing a database? i
> mean a general guide to databases you can suggest to me?
> Thank you all
>
> MedeoTL
>
> P.s. since i have a ods sheet files (libreoffice calc), is there a way to
> easily convert it in a sqlite db? (maybe via csv)

I would recommend working through the book "SQL for Dummies". I found
it very clear, and slowly leads you into how to think about design,
not just how to manipulate databases.

Instead of using Python to start with consider using OOo Base or MS
Access (retching noise), so you can use RAD to play with structure and
manipulation of your data and create data-entry forms -- this'll allow
you to enter data, and play with queries and the structure -- as you
won't get it right the first time! You will be able to get either of
these programs to give you the SQL that constructs tables, or makes
queries etc...
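
For a taste of what that generated SQL tends to look like, here it is
run through Python's sqlite3 module. The schema and the two sample
rows are only a guess at what a movie table might hold:

```python
import sqlite3

conn = sqlite3.connect(':memory:')   # throwaway DB, nothing written to disk
conn.execute("""
    CREATE TABLE movie (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        year  INTEGER
    )
""")
conn.executemany("INSERT INTO movie (title, year) VALUES (?, ?)",
                 [('Alien', 1979), ('Brazil', 1985)])

query = "SELECT title FROM movie WHERE year < ? ORDER BY title"
titles = [title for (title,) in conn.execute(query, (1980,))]
print(titles)  # ['Alien']
```

Actors would normally go in their own table with a link table between
the two, which is exactly the sort of design question the book covers.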

That'd be enough to keep you going for a couple of weeks I guess.

Also, some things make more sense in a NoSQL database, so have a look
at something like MongoDB or CouchDB and how their design works
differently.

That's probably another couple of weeks.

Also worth checking out would be http://dabodev.com

hth

Jon.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Py2.7/FreeBSD: maximum number of open files

2011-11-14 Thread Jon Clements
On Nov 14, 5:03 pm, Tobias Oberstein wrote:
> > > I need 50k sockets + 100 files.
>
> > > Thus, this is even more strange: the Python (a Twisted service) will
> > > happily accept 50k sockets, but as soon as you do open() a file, it'll
> > > bail out.
>
> > A limit of 32k smells like an overflow in a signed int. Perhaps your
> > system is able and configured to handle more than 32k FDs but you hit an
> > artificial limit because some C code or API has an overflow. This seems
> > to be a known bug in FreeBSD:
> > http://lists.freebsd.org/pipermail/freebsd-bugs/2010-July/040689.html
>
> This is unbelievable.
>
> I've just tested: the bug (in libc) is still there on FreeBSD 8.2 p3 ...
> both on i386 _and_ amd64.
>
> Now I'm f***d;(
>
> A last chance: is it possible to compile Python for not using libc fopen(),
> but the Posix open()?
>
> Thanks anyway for this hint!

Have you tried, or is it possible, to open your 100 or so files
first, before your sockets?

hth

Jon
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: can some one help me with my code. thanks

2012-01-20 Thread Jon Clements
On Jan 20, 9:26 pm, Terry Reedy  wrote:
> On 1/20/2012 2:46 PM, Terry Reedy wrote:
>
> > On 1/20/2012 1:49 PM, Tamanna Sultana wrote:
>
> >> can some one help me??
> >>> I would like to create a function that, given a bin, which is a list
> >>> (example below), generates averages for the numbers separated by a
> >>> string 'end'. I am expecting to have 4 averages from the above bin,
> >>> since there are 4 sets of numbers separated by 4 'end' strings
>
> > [Posting your overly long set of data lines with a '>' quote at the
> > beginning of each line was a nuisance. Reposted with few lines. I will
> > let you compare your code to mine.]
>
> > bin = ['2598.95165', '2541.220308', '221068.0401', 'end', '4834.581952',
> > '1056.394859', '3010.609563', '2421.437603', '4619.861889',
> > '3682.012227', '3371.092883', '6651.509488', '7906.092773',
> > '7297.133447', 'end', '4566.874299', 'end', '4255.700077',
> > '1857.648393', '11289.48095', '2070.981805', '1817.505094',
> > '563.0265409', '70796.45356', '565.2123689', '6560.030116',
> > '2668.934414', '418.666014', '5216.392132', '760.894589', '8072.957639',
> > '346.5905371', 'end']
>
> > def average(bin):
> >     num=[]
> >     total = 0.0
> >     count=0
> >     for number in bin:
> >         if number!='end':
> >             total += float(number)
> >             count+=1
> >         else:
> >             num.append(total/count)
> >             total = 0.0
> >             count= 0
> >     return num
>
> > print(average(bin))
>
> > [75402.7373526, 4485.0726684, 4566.874299, 7817.36494866]
>
> U're welcome. But do notice Tim's comment. In non-toy situations, you
> have to decide how to handle empty collections (return float('nan')?),
> or whether to just let whatever happens happen.
>
> If you control the input format, a list of lists would be easier than an
> end marker. But sometimes one is handed data and asked to process it as is.
>
> Also note (this is a more advanced topic) that average() could be turned
> into a generator function by replacing 'num.append(total/count)' with
> 'yield total/count' and removing the initialization and return of num.
>
> --
> Terry Jan Reedy

Not directing this at you Terry, and you and Tim have made fine points
-- this just appears to me to be the best point at which to respond to
a thread.

To the OP - you have great answers, and, please note this just happens
to be the way I would do this.

I would separate the parsing of the data, and the calculation code
out. I've whipped this up rather quickly, so it might have a few flaws
but...

from itertools import groupby
def partition(iterable, sep=lambda L: L == 'end', factory=float):
    for key, vals in groupby(iterable, sep):
        if not key: yield map(factory, vals)

# And a pure cheat, but useful if more complex calculations are
# required etc... (Plus covers NaN)
import numpy as np
print map(np.mean, partition(bin))
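
If numpy isn't to hand, much the same thing can be done with the
standard library under Python 3. A sketch only: statistics.mean stands
in for np.mean (so no NaN handling), and the short `readings` list is
made-up sample data:

```python
from itertools import groupby
from statistics import mean

def partition(iterable, is_sep=lambda item: item == 'end', factory=float):
    """Yield one list of converted values per run between separators."""
    for is_separator, vals in groupby(iterable, is_sep):
        if not is_separator:
            yield [factory(v) for v in vals]

readings = ['1.0', '3.0', 'end', '4.0', 'end', '2.0', '4.0', '6.0', 'end']
averages = [mean(group) for group in partition(readings)]
print(averages)  # [2.0, 4.0, 4.0]
```

groupby lumps consecutive 'end' markers together, so an empty group
never reaches mean().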

What you've got will work though, so wouldn't worry too much and this
is just my 2p,

Jon.
-- 
http://mail.python.org/mailman/listinfo/python-list


Automatic email checking - best procedures/suggestions

2006-07-27 Thread Jon Clements
Hi All,

I'm hoping someone has some experience in this field and could give me
a pointer in the right direction - it's not purely python related
though. Any modules/links someone has tried and found useful would be
greatly appreciated...

I want to have an automated process which basically has its own email
account on the LAN. The basic idea is that upon receipt of an email, it
logs this in a database and then forwards the message on to 'suitable'
recipients. (Well it will do more, but this is perfect to be going on
with and buildable on...)

The database and email account are set up and working fine. Using
smtplib, imaplib or poplib I can send and receive mail - this is not a
problem. What I'm unsure of is the best way to design this. Bear in
mind that network/email server configuration changes can be made. For
instance, do I connect to the email server and keep polling it every
'n' whatever for new messages, or should I be looking to the smtpd
module and get mail via that? (or any other way?)

I think I'm basically after the best way to implement:
Email in --> Python process --> Email out

Cheers,

Jon.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How do you implement this Python idiom in C++

2006-07-27 Thread Jon Clements

[EMAIL PROTECTED] wrote:
> // Curious class definitions
> class CountedClass : public Counted<CountedClass> {};
> class CountedClass2 : public Counted<CountedClass2> {};
>
> It apparently works but in fact it doesn't:
> If you derive from such a class, you get the count of the parent class,
> not of the derived class.
> class CountedClass3 : public CountedClass {};
>

Hint: where's the template parameter gone as per the previous two
statements...

Jon.



Re: How do you implement this Python idiom in C++

2006-07-27 Thread Jon Clements

[EMAIL PROTECTED] wrote:
> You miss the point; i want to derive a class and inherit all properties
> without worrying about those implementation details. The Python code is
> much cleaner in that respect. My post is about whether it is possible
> to get such a clean interface in C++

I was simply pointing out that your statement declaring that it didn't
work, wasn't accurate, because the code you'd used was incorrect.

Jon



Re: Pickling an instance of a class containing a dict doesn't work

2006-10-12 Thread Jon Clements

Marco Lierfeld wrote:

> The class looks like this:
> class subproject:
>     configuration   = {}
>     build_steps = []
>     # some functions
>     # ...
>
> Now I create an instance of this class, e.g.
> test = subproject()
> and try to save it with pickle.dump(test, file('test.pickle','wb')) or with
> pickle.Pickler(file('test.pickle','wb')).save(test) it looks like
> everything has worked well, but in the saved file 'test.pickle' only the
> list 'build_steps' is saved - the dictionary 'configuration' is missing.
> There is neither an error message nor an exception.
>
> When I try to save only the dictionary, there is no problem at all - the
> dict is saved to the file.
>
> I also tried the 3 different protocols (0, 1, 2), but none of them worked
> for me.

At a wild guess: since pickle descends the object's hierarchy, and since
configuration and build_steps aren't local to an instance of the class,
it stores only a reference to them (so you won't see values). However,
if you change the above to:

class subproject:
    def __init__(self):
        configuration = { }
        build_steps = [ ]

That'll probably be what you expect...

Jon.



Re: Pickling an instance of a class containing a dict doesn't work

2006-10-12 Thread Jon Clements

Jon Clements wrote:

> if you change the above to:
>
> class subproject:
>     def __init__(self):
>         configuration = { }
>         build_steps = [ ]

Of course, I actually meant to write self.configuration and
self.build_steps; d0h!
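
A minimal sketch of the difference, assuming Python 3. The class names SharedState and OwnState are made up for illustration. By default pickle saves an instance's __dict__, so the sketch round-trips vars(obj) to show what would end up in the file.

```python
import pickle

class SharedState:
    # Class attributes: one dict and one list shared by every instance.
    configuration = {}
    build_steps = []

class OwnState:
    def __init__(self):
        # Instance attributes: these live in the instance's __dict__.
        self.configuration = {}
        self.build_steps = []

shared = SharedState()
shared.configuration['target'] = 'all'    # mutates the class-level dict
own = OwnState()
own.configuration['target'] = 'all'

# vars(obj) is the instance __dict__, i.e. what pickle saves by default.
shared_saved = pickle.loads(pickle.dumps(vars(shared)))
own_saved = pickle.loads(pickle.dumps(vars(own)))
print(shared_saved)   # nothing instance-level to save
print(own_saved)      # both attributes are present
```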



Re: Loops Control with Python

2006-10-13 Thread Jon Clements

Wijaya Edward wrote:
> Can we make loops control in Python?
> What I mean is that whether we can control
> which loops to exit/skip at the given scope.
>
> For example in Perl we can do something like:
>
> OUT:
> foreach my $s1 ( 0 ...100) {
>
> IN:
> foreach my $s2 (@array) {
>
>   if ($s1 == $s2) {
>  next OUT;
>   }
>   else {
>   last IN;
>   }
>
>  }
> }
>
> How can we implement that construct with Python?

Literally.

for s1 in range(100 + 1):
    for s2 in some_array:
        if s1 == s2: break

Same thing, but nicer.

for s1 in range(100 + 1):
    if s1 in some_array:
        # Do something here.
        pass
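
Python has no labelled next/last, but the for/else form gets close. A small sketch, assuming Python 3, with some_array filled with made-up values:

```python
some_array = [3, 7, 250]      # made-up sample values
matches = []

for s1 in range(100 + 1):
    for s2 in some_array:
        if s1 == s2:
            break             # leave the inner loop only
    else:
        # Runs only when the inner loop finished without a break,
        # so this acts like "skip to the next outer iteration".
        continue
    matches.append(s1)        # reached only after a break

print(matches)
```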

Cheers,

Jon.



Re: jython and toString

2006-10-16 Thread Jon Clements

ivansh wrote:

> Hello,
>
> For one java class (Hello) i use another (HelloPrinter) to build the
> string representation of the first one. When i've tried to use this
> from within jython,  HelloPrinter.toString(hello) call gives results
> like Object.toString() of hello has being called. The example below
> shows this behaviour.
> Could somebody explain this?
>
>
> // Hello.java
> package jythontest;
> public class Hello {
>   private String name;
>   public Hello(String name)
>   {
>   this.name = name;
>   }
>   public String sayHello()
>   {
>   return "Hello, "+name;
>   }
> }
>
> // HelloPrinter.java
> package jythontest;
> public class HelloPrinter {
>   public static String toString(Hello h)
>   {
>   return h.sayHello();
>   }
>
>   public static String toMyString(Hello h)
>   {
>   return h.sayHello();
>   }
> }
>
>
>
> #  calljava.py
> from jythontest import *
> h = Hello("theName")
> print h
> print HelloPrinter.toString(h)
> print HelloPrinter.toMyString(h)
>
> OUTPUT:
> [EMAIL PROTECTED]   // GOOD
> [EMAIL PROTECTED]   // WRONG
> Hello, theName // GOOD
>
>
> Jython 2.1 on java (JIT: null)
>
> java version "1.5.0_03"
> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_03-b07)
> Java HotSpot(TM) Server VM (build 1.5.0_03-b07, mixed mode)


I'm guessing your toString(Hello h) is never being called because
there's another toString(something) behind the scenes that's being
preferred. I could well be wrong, but I'm guessing toString isn't meant
to be static, and when you create an object in Java it inherits from
Object, which defines a default toString. It might be that a temporary
object of type HelloPrinter is created in the call to "print
HelloPrinter.toString(h)", and the method you end up calling is the
toString for that temporary (not your static one). This would
especially make sense if there's a version of toString which takes an
object, and returns its toString result...

I'm basing this purely on the fact HelloPrinter.toMyString() works... so
take with a pinch of salt.

Jon.



Re: making a valid file name...

2006-10-17 Thread Jon Clements

SpreadTooThin wrote:

> Hi I'm writing a python script that creates directories from user
> input.
> Sometimes the user inputs characters that aren't valid characters for a
> file or directory name.
> Here are the characters that I consider to be valid characters...
>
> valid =
> ':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '
>
> if I have a string called fname I want to go through each character in
> the filename and if it is not a valid character, then I want to replace
> it with a space.
>
> This is what I have:
>
> def fixfilename(fname):
>   valid =
> ':.\,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '
>   for i in range(len(fname)):
>   if valid.find(fname[i]) < 0:
>   fname[i] = ' '
>return fname
>
> Anyone think of a simpler solution?

If you want to strip 'em:

>>> valid=':./,^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ '
>>> filename = '!"£!£$"$££$%$£%$£lasfjalsfjdlasfjasfd()()()somethingelse.dat'
>>> stripped = ''.join(c for c in filename if c in valid)
>>> stripped
'lasfjalsfjdlasfjasfdsomethingelse.dat'

If you want to replace them with something, be careful of the regex
string being built (ie a space character).
>>> import re
>>> re.sub(r'[^%s]' % valid,' ',filename)
' lasfjalsfjdlasfjasfd  somethingelse.dat'
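
One way to take the worry out of building the character class is to run the valid string through re.escape first. A sketch, assuming Python 3; the function name fix_filename and its replacement argument are made up for illustration:

```python
import re
import string

# Same set of characters as the "valid" string above.
VALID = ':./,^' + string.digits + string.ascii_letters + ' '

def fix_filename(fname, replacement=' '):
    # re.escape() neutralises anything in VALID that is special
    # inside a regex, so the class matches exactly those characters.
    pattern = '[^%s]' % re.escape(VALID)
    return re.sub(pattern, replacement, fname)

print(fix_filename('report*2006?.dat'))
```

Each invalid character is replaced one for one, so the result keeps the original length.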


Jon.



Re: list comprehension (searching for onliners)

2006-10-20 Thread Jon Clements

Gerardo Herzig wrote:

> Hi all: I have this list thing as a result of a db.query: (short version)
> result = [{'service_id' : 1, 'value': 10},
> {'service_id': 2, 'value': 5},
> {'service_id': 1, 'value': 15},
> {'service_id': 2, 'value': 15},
>  ]
>
> and so on...what i need to do is some list comprehension that returns me
> something like
>
> result = [
> {
> 'service_id' : 1, 'values': [ {'value': 10},
> {'value': 15}]
>  },
> {
>   'service_id' : 2, 'values': [ {'value': 5}, {'value': 15}]
> }
>
>
> My problem now is i cant avoid have "repeteated" entries, lets say, in
> this particular case, 2 entries for "service_id = 1", and other 2 for
> "service_id =2".
> Ill keeping blew off my hair and drinking more cofee while searching for
> this damn onliner im looking for.

If you import itertools and have your DB query return in order of
service_id (or sort the list after retrieving result), then...


[ [dict(service_id=key),[dict(value=n['value']) for n in value]] for
key,value in itertools.groupby(result,lambda x: x['service_id']) ]

... is more or less what you want.
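
If you want exactly the shape from the question (a 'values' key per service), something along these lines should do it. A sketch, assuming Python 3; the sort is there because groupby only groups adjacent rows:

```python
from itertools import groupby
from operator import itemgetter

result = [{'service_id': 1, 'value': 10},
          {'service_id': 2, 'value': 5},
          {'service_id': 1, 'value': 15},
          {'service_id': 2, 'value': 15}]

by_service = itemgetter('service_id')

# sorted() is stable, so values keep their original order per service.
grouped = [
    {'service_id': key, 'values': [{'value': row['value']} for row in rows]}
    for key, rows in groupby(sorted(result, key=by_service), key=by_service)
]
print(grouped)
```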

Jon.



Re: why does this unpacking work

2006-10-20 Thread Jon Clements

John Salerno wrote:

> I'm a little confused, but I'm sure this is something trivial. I'm
> confused about why this works:
>
>  >>> t = (('hello', 'goodbye'),
>   ('more', 'less'),
>   ('something', 'nothing'),
>   ('good', 'bad'))
>  >>> t
> (('hello', 'goodbye'), ('more', 'less'), ('something', 'nothing'),
> ('good', 'bad'))
>  >>> for x in t:
>   print x
>
>
> ('hello', 'goodbye')
> ('more', 'less')
> ('something', 'nothing')
> ('good', 'bad')
>  >>> for x,y in t:
>   print x,y
>
>
> hello goodbye
> more less
> something nothing
> good bad
>  >>>
>
> I understand that t returns a single tuple that contains other tuples.
> Then 'for x in t' returns the nested tuples themselves.
>
> But what I don't understand is why you can use 'for x,y in t' when t
> really only returns one thing. I see that this works, but I can't quite
> conceptualize how. I thought 'for x,y in t' would only work if t
> returned a two-tuple, which it doesn't.
>
> What seems to be happening is that 'for x,y in t' is acting like:
>
> for x in t:
>  for y,z in x:
>  #then it does it correctly
>
> But if so, why is this? It doesn't seem like very intuitive behavior.

It makes perfect sense: in fact, you have kind of explained it
yourself!

Think of the for statement as returning the next element of some
sequence; in this case it's a tuple. Then on the left side, the
unpacking occurs. Using "for x in t" means that effectively no
unpacking occurs, so you get the tuple. However, since the "in" is
returning a tuple, using "for x,y in t" means the tuple returned gets
unpacked.
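
Spelled out as code (a small sketch, assuming Python 3), the two loops below do the same thing; the second one just hides the assignment:

```python
t = (('hello', 'goodbye'), ('more', 'less'))

explicit = []
for item in t:
    x, y = item            # the unpacking that "for x, y in t" does for you
    explicit.append((x, y))

implicit = [(x, y) for x, y in t]

# Unpacking needs the element to have exactly as many items as targets.
try:
    for x, y in (('a', 'b', 'c'),):
        pass
except ValueError as exc:
    unpack_error = str(exc)

print(explicit == implicit, unpack_error)
```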

Hope that helps.

Jon.



Re: curious paramstyle qmark behavior

2006-10-20 Thread Jon Clements

BartlebyScrivener wrote:

> With
>
> aColumn = "Topics.Topic1"'
>
> The first statement "works" in the sense that it finds a number of
> matching rows.
>
> c.execute ("SELECT Author, Quote, ID, Topics.Topic1, Topic2 FROM
> QUOTES7 WHERE " + aColumn + " LIKE ?", ("%" + sys.argv[1] + "%",))
>
> I've tried about 20 different variations on this next one. And it finds
> 0 records no matter what I do. Is there some violation when I use two
> qmarks?
>
> c.execute ("SELECT Author, Quote, ID, Topics.Topic1, Topic2 FROM
> QUOTES7 WHERE ? LIKE ?", (aColumn, "%" + sys.argv[1] + "%"))
>
> I'm using mx.ODBC and Python 2.4.3 to connect to an MS Access DB.
>
> Thank you,

At a guess; it's probably translating the first '?' (the one after the
WHERE) as a string literal: so your query string is effectively "select
 from  where 'somestring' like '%%'".

I would try re-writing it like:
c.execute("select  from  where %s like ?" % aColumn,
("%" + sys.argv[1] + "%",))

I don't use mx.ODBC, and definitely don't use Access (gagging sounds...
but if you're stuck with it, so be it)...
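
I can't test against mx.ODBC, so here is a sketch of the same effect using the standard sqlite3 module (also qmark style) on Python 3. The table, the data and the ALLOWED_COLUMNS whitelist are made up for illustration; the point is only that a bound parameter is a value, never a column name.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE quotes (author TEXT, topic1 TEXT)")
conn.executemany("INSERT INTO quotes VALUES (?, ?)",
                 [("Melville", "work"), ("Thoreau", "nature")])

pattern = "%wor%"

# The first ? is bound as the *string* 'topic1', so this compares
# 'topic1' LIKE '%wor%' for every row and matches nothing.
as_value = conn.execute(
    "SELECT author FROM quotes WHERE ? LIKE ?", ("topic1", pattern)).fetchall()

ALLOWED_COLUMNS = {"author", "topic1"}   # hypothetical whitelist

def search(column, pattern):
    # Column names have to go into the SQL text itself, so check them
    # against known names before interpolating.
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unknown column: %r" % column)
    sql = "SELECT author FROM quotes WHERE %s LIKE ?" % column
    return conn.execute(sql, (pattern,)).fetchall()

print(as_value, search("topic1", pattern))
```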

hth,

Jon.



Re: curious paramstyle qmark behavior

2006-10-21 Thread Jon Clements

BartlebyScrivener wrote:

> Thanks, Jon.
>
> I'm moving from Access to MySQL. I can query all I want using Python,
> but so far haven't found  a nifty set of forms (ala Access) for easying
> entering of data into MySQL. My Python is still amateur level and I'm
> not ready for Tkinkter or gui programming yet.

Not wanting to start an RDBMS war, I'd personally choose PostgreSQL over
MySQL. (Quite interestingly, most Python programmers go for PostgreSQL
and most PHP programmers go for MySQL)... However, only you know what
you really want to do, so it's up to you to evaluate which RDBMS to go
for!

In terms of data entry; if you're able to extend the idea of GUI a
little, why not use web forms? The django project, although I've only
played with it, was quite nice to set up and get running straight away:
if your load on the data-entry/browsing side isn't too heavy, you can
use the 'development server' instead of installing a full-blown server
such as Apache (I'm not sure if IIS is supported).

Users need not have any specific software (well, apart from a web
browser), you can change the back-end any time, have authentication,
the database and users can be remote to the actual "GUI" etc

Just some thoughts you can do with as you wish.

Jon.



Re: curious paramstyle qmark behavior

2006-10-21 Thread Jon Clements

BartlebyScrivener wrote:

> Jon Clements wrote:
>
> > if your load on the data-entry/browsing side isn't too heavy, you can
> > use the 'development server' instead of installing a full-blown server
> > such as Apache (I'm not sure if IIS is supported).
>
> What's IIS?

It's Internet Information Services: the MS web/ftp server, that's
standard on some Windows platforms (Control Panel->Add/Remove
Software->Add/Remove Windows Components - or something like that). I
assumed you were on Windows because of you mentioning Access.

Good luck with your project Rick.

All the best,

Jon.



Re: can't open word document after string replacements

2006-10-24 Thread Jon Clements

Antoine De Groote wrote:

> Hi there,
>
> I have a word document containing pictures and text. This documents
> holds several 'ABCDEF' strings which serve as a placeholder for names.
> Now I want to replace these occurences with names in a list (members). I
> open both input and output file in binary mode and do the
> transformation. However, I can't open the resulting file, Word just
> telling that there was an error. Does anybody what I am doing wrong?
>
> Oh, and is this approach pythonic anyway? (I have a strong Java background.)
>
> Regards,
> antoine
>
>
> import os
>
> members = somelist
>
> os.chdir(somefolder)
>
> doc = file('ttt.doc', 'rb')
> docout = file('ttt1.doc', 'wb')
>
> counter = 0
>
> for line in doc:
>  while line.find('ABCDEF') > -1:
>  try:
>  line = line.replace('ABCDEF', members[counter], 1)
>  docout.write(line)
>  counter += 1
>  except:
>  docout.write(line.replace('ABCDEF', '', 1))
>  else:
>  docout.write(line)
>
> doc.close()
> docout.close()

Errr I wouldn't even attempt to do this; how do you know each
'line' isn't going to be split arbitrarily, and that 'ABCDEF' doesn't
happen to be part of an image? As you've noted, this is binary data so
you can't assume anything about it. Doing it this way is a Bad Idea
(tm).

If you want to do something like this, why not use templated HTML, or
possibly templated PDFs? Or heaven forbid, Word's mail-merge facility?


(I think MS Office documents are effectively self-contained file
systems, so there is probably some module out there which can
read/write them).

Jon.



Re: Ctypes Error: Why can't it find the DLL.

2006-10-25 Thread Jon Clements

Mudcat wrote:


> So then I use the find_library function, and it finds it:
>
> >>> find_library('arapi51.dll')
> 'C:\\WINNT\\system32\\arapi51.dll'
>

Notice it's escaped the '\' character.

> At that point I try to use the LoadLibrary function, but it still can't
> find it:
>
> >>> windll.LoadLibrary('C:\WINNT\system32\arapi51.dll')
> Traceback (most recent call last):
>   File "", line 1, in ?
>   File "C:\Python24\Lib\site-packages\ctypes\__init__.py", line 395, in
> LoadLibrary
> return self._dlltype(name)
>   File "C:\Python24\Lib\site-packages\ctypes\__init__.py", line 312, in
> __init__
> self._handle = _dlopen(self._name, mode)
> WindowsError: [Errno 126] The specified module could not be found
>
> What am I doing wrong? [snip]

You need to use either
windll.LoadLibrary(r'c:\winnt\system32\arapi51.dll') or escape the \'s
as the find_library function did. You're getting caught out by \a which
is the alert/bell character. \w and \s aren't valid escape sequences so
you get away with them. I'm guessing it's worked before because you've
been lucky.

Works fine!:
>>> 'C:\winnt\system32\smtpctrs.dll'
'C:\\winnt\\system32\\smtpctrs.dll'

Uh oh, escaped:
>>> 'C:\winnt\system32\arapi51.dll'
'C:\\winnt\\system32\x07rapi51.dll'
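
The same check as a script (a sketch, assuming Python 3, using the path from the question):

```python
raw = r'C:\WINNT\system32\arapi51.dll'        # raw string: backslashes kept as-is
escaped = 'C:\\WINNT\\system32\\arapi51.dll'  # every backslash doubled
broken = 'C:\\WINNT\\system32\arapi51.dll'    # "\a" here is the BEL character

print(raw == escaped)     # both spell the same path
print('\x07' in broken)   # the "\a" collapsed into one control character
print(len(raw) - len(broken))
```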

Jon.



Re: dict problem

2006-10-25 Thread Jon Clements

Alistair King wrote:

> Hi,
>
> ive been trying to update a dictionary containing a molecular formula, but 
> seem to be getting this error:
>
>
> Traceback (most recent call last):
>   File "DS1excessH2O.py", line 242, in ?
> updateDS1v(FCas, C, XDS)
> NameError: name 'C' is not defined
>
> dictionary is:
>
> DS1v = {'C': 6, 'H': 10, 'O': 5}
>
>
>
> #'Fxas' in each case will be integers but 'atoms' should be a float
>
> def updateDS1v(Fxas, x, XDS):
>     while Fxas != 0:
>         atoms = DS1v.get('x') + Fxas*XDS
>         DS1v[x] = atoms
>
> updateDS1v(FCas, C, XDS)
> updateDS1v(FHas, H, XDS)
> updateDS1v(FOas, O, XDS)
> updateDS1v(FNas, N, XDS)
> updateDS1v(FSas, S, XDS)
> updateDS1v(FClas, Cl, XDS)
> updateDS1v(FBras, Br, XDS)
> updateDS1v(FZnas, Zn, XDS)
> print DS1v
>
> I know there is probably a simple solution but im quite new to python and am 
> lost?
>

I strongly suggest reading through the tutorial.

I don't think there's enough code here for anyone to check it properly.
For instance, it looks like FCas exists somewhere as it's barfing on
trying to find C. Where is XDS defined etc...?

I can't see updateDS1v() ever completing: any Fxas passed in not equal
to 0 will repeat indefinitely.

I'm guessing unless C is meant to be a variable, you mean to pass in
the string 'C'.

A dictionary already has its own update method.

Perhaps if you explain what you're trying to do in plain English, we
could give you some pointers.
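
To illustrate the points about the key and the loop, here is a guess at a single update step, assuming Python 3. The name update_atoms and the amounts are made up, since we don't know the real calculation yet.

```python
DS1v = {'C': 6, 'H': 10, 'O': 5}

def update_atoms(formula, element, extra):
    # 'element' is a variable holding the key; writing the quoted
    # string 'element' would look up that literal text instead.
    # No while loop: one call does one update.
    formula[element] = formula.get(element, 0) + extra

missing = DS1v.get('x')          # there is no key called 'x'
update_atoms(DS1v, 'C', 1.5)     # pass the key as a string
update_atoms(DS1v, 'N', 2.0)     # elements not yet present start from 0
print(missing, DS1v)
```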

Jon.



Re: dict problem

2006-10-25 Thread Jon Clements

Alistair King wrote:

> Jon Clements wrote:
>
> > > Alistair King wrote:
> > >
> > >
> >
> >> >> Hi,
> >> >>
> >> >> ive been trying to update a dictionary containing a molecular formula, 
> >> >> but seem to be getting this error:
> >> >>
> >> >>
> >> >> Traceback (most recent call last):
> >> >>   File "DS1excessH2O.py", line 242, in ?
> >> >> updateDS1v(FCas, C, XDS)
> >> >> NameError: name 'C' is not defined
> >> >>
> >> >> dictionary is:
> >> >>
> >> >> DS1v = {'C': 6, 'H': 10, 'O': 5}
> >> >>
> >> >>
> >> >>
> >> >> #'Fxas' in each case will be integers but 'atoms' should be a float
> >> >>
> >> >> def updateDS1v(Fxas, x, XDS):
> >> >>     while Fxas != 0:
> >> >>         atoms = DS1v.get('x') + Fxas*XDS
> >> >>         DS1v[x] = atoms
> >> >>
> >> >> updateDS1v(FCas, C, XDS)
> >> >> updateDS1v(FHas, H, XDS)
> >> >> updateDS1v(FOas, O, XDS)
> >> >> updateDS1v(FNas, N, XDS)
> >> >> updateDS1v(FSas, S, XDS)
> >> >> updateDS1v(FClas, Cl, XDS)
> >> >> updateDS1v(FBras, Br, XDS)
> >> >> updateDS1v(FZnas, Zn, XDS)
> >> >> print DS1v
> >> >>
> >> >> I know there is probably a simple solution but im quite new to python 
> >> >> and am lost?
> >> >>
> >> >>
> >>
> > >
> > > I strongly suggest reading through the tutorial.
> > >
> > > I don't think there's enough code here for anyone to check it properly.
> > > For instance, it looks like FCas exists somewhere as it's barfing on
> > > trying to find C. Where is XDS defined etc...?
> > >
> > > I can't see updateDS1v() ever completing: any Fxas passed in not equal
> > > to 0 will repeat indefinately.
> > >
> > > I'm guessing unless C is meant to be a variable, you mean to pass in
> > > the string 'C'.
> > >
> > > A dictionary already has it's own update method
> > >
> > > Perhaps if you explain what you're trying to do in plain English, we
> > > could give you some pointers.
> > >
> > > Jon.
> > >
> > >
> >
> sorry,
>
> this has been a little rushed
>
> XDS is defined before the function and is a float.
> the Fxas values are also and they are integers
>
>
> now ive tried
>
> def updateDS1v(Fxas, x, XDS):
>     while Fxas != 0:
>         atoms = DS1v.get(x) + Fxas*XDS
>         DS1v['x'] = atoms
>
> updateDS1v(FCas, 'C', XDS)
> updateDS1v(FHas, H, XDS)
> updateDS1v(FOas, O, XDS)
> updateDS1v(FNas, N, XDS)
> updateDS1v(FSas, S, XDS)
> updateDS1v(FClas, Cl, XDS)
> updateDS1v(FBras, Br, XDS)
> updateDS1v(FZnas, Zn, XDS)
> print DS1v
>
> from this i get the error:
>
> Traceback (most recent call last):
> File "DS1excessH2O.py", line 242, in ?
> updateDS1v(FCas, 'C', XDS)
> File "DS1excessH2O.py", line 239, in updateDS1v
> atoms = DS1v.get(x) + Fxas*XDS
> TypeError: unsupported operand type(s) for +: 'int' and 'str'
>
>
> with single quotes (FCas, 'C', XDS) to retrieve the value for that key
> from the dictionary and then create the new value and replace the old value.

One of Fxas or XDS is a string then...

Again, no-one can help you if we can't see what's actually there.

What are FCas, FHas etc... do they relate to the element? If so, isn't
that a dictionary in itself? And your update function is still an
infinite loop!

We're still in the dark as to what you're trying to do, try describing
something like: "for each element there is an associated 'F' value. For
each element in an existing molecule I wish to change the number of
'whatever' to be 'whatever' + my 'F' value * value XDS..."
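
If that description is roughly right, a sketch could look like the following (Python 3). Everything here is assumption: the f_values dictionary, the numbers, and the idea that XDS arrives as a string and needs converting. The first part only shows how an int and a str can produce the TypeError above.

```python
# int * str repeats the string, and int + str is then unsupported.
try:
    6 + 2 * '0.5'
except TypeError as exc:
    error_message = str(exc)

def apply_f_values(formula, f_values, xds):
    xds = float(xds)                 # make sure we multiply numbers
    updated = dict(formula)          # leave the original untouched
    for element, f in f_values.items():
        updated[element] = updated.get(element, 0) + f * xds
    return updated

DS1v = {'C': 6, 'H': 10, 'O': 5}
f_values = {'C': 2, 'H': 1, 'N': 3}  # hypothetical 'F' value per element
new_formula = apply_f_values(DS1v, f_values, '0.5')
print(error_message)
print(new_formula)
```

Keeping the 'F' values in their own dictionary replaces the eight separate calls with one loop.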



Jon.



Re: newbie class-building question

2006-11-09 Thread Jon Clements

jrpfinch wrote:

> I am constructing a simple class to make sure I understand how classes
> work in Python (see below this paragraph).
>
> It works as expected, except the __add__ redefinition.  I get the
> following in the Python interpreter:
>
> >>> a=myListSub()
> >>> a
> []
> >>> a+[5]
> Traceback (most recent call last):
>   File "", line 1, in ?
> TypeError: 'str' object is not callable
> >>>
>
> Please could you let me know where I am going wrong.  I have hacked
> around with the code and tried googling this error message but am
> having difficulty finding the source of the problem.
>
> Many thanks
>
> Jon
>
> class myList:
>     def __init__ (self,value=[]):
>         self.wrapped=[]
>         for x in value :
>             self.wrapped.append(x)
>     def __repr__ (self):
>         return `self.wrapped`
>     def __getattr__ (self,attrib):
>         return getattr(self.wrapped,attrib,'attribute not found')
>     def __len__ (self):
>         return len(self.wrapped)
>     def __getitem__ (self,k):
>         return self.wrapped[k]
>     def __add__(self,other):
>         return self.wrapped+other
>
> class myListSub(myList):
>     classCounter=0
>     def __init__ (self,value=[]):
>         self.instanceCounter=0
>         myList.__init__(self,value)
>     def __add__(self,other):
>         myListSub.classCounter=myListSub.classCounter+1
>         self.instanceCounter=self.instanceCounter+1
>         myList.__add__(self,other)
>     def getCounters (self):
>         return "classCounter=%s instanceCounter=%s" % (myListSub.classCounter,self.classCounter)

I'm not sure what you're trying to achieve with the __getattr__ in
myList, but the reason your __add__ isn't working is that when "a +
[5]" is executed, Python tries to find a __coerce__ attribute (to prove
this just put a "print attrib" as the first line of your __getattr__).
Your custom __getattr__ returns a default string object of 'attribute
not found' when it fails to locate this. Python then attempts to call
this string, and as your exception states -- strings aren't callable.

That should give you enough information to be going on with.

Also, if you're trying to get used to classes, I would suggest you start
off with the 'new-style' classes; those are the ones that derive from
object. A quick google for new-style classes will sort you.
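
A rough sketch of what that might look like as a new-style class, written for Python 3 (where every class derives from object). The class name MyList is made up. The main change is that __getattr__ lets AttributeError propagate instead of handing back a string.

```python
class MyList(object):
    def __init__(self, value=()):
        self.wrapped = list(value)

    def __repr__(self):
        return repr(self.wrapped)

    def __getattr__(self, name):
        # Only reached when normal lookup fails. getattr() without a
        # default raises AttributeError for names the list lacks too,
        # so callers never get a surprise string back.
        return getattr(self.wrapped, name)

    def __add__(self, other):
        return self.wrapped + list(other)

a = MyList([1, 1, 2])
print(a + [5])                          # uses __add__ directly
print(a.count(1))                       # delegated to the wrapped list
print(hasattr(a, 'no_such_attribute'))  # missing names stay missing
```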

hth a bit,

Jon.



Re: newbie class-building question

2006-11-09 Thread Jon Clements

jrpfinch wrote:

> Thank you this is very helpful.  The only thing I now don't understand
> is why it is calling __coerce__.  self.wrapped and other are both
> lists.

Yes, but in "a + [5]", *a* is a myListSub object -- it's not a list! So
__coerce__ is called to try and get a common type...

Try this in myList...

    # We could check the type of 'other' to determine what we return here.
    # At the moment, we return a list, whose + operator requires another
    # list, so this works and gives an exception if it's not for us.
    def __coerce__(self,other):
        return self.wrapped, other

    #
    def __add__(self,other):
        return self + other



hth

Jon.



Re: extract text from a string

2006-11-09 Thread Jon Clements

[EMAIL PROTECTED] wrote:

> Hallo all,
>
> I have tried for a couple of hours to solve my problem with re but I
> have no success.
>
> I have a string containing: "+abc_cde.fgh_jkl\n" and what I need to
> become is "abc_cde.fgh_jkl".  Could anybody be so kind and write me a
> code of how to extract this text from that string?

Perhaps you could describe what the actual *criteria* are for the
translation; for instance, "I need to keep only letters, numbers and
punctuation characters" etc... Going by your example, it's tempting to
suggest the best method would be string_name[1:-1] and that you don't
need a regex.
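
For the one example given, both of these work (a sketch, assuming Python 3). Which is right depends on the real criteria, e.g. whether the '+' and the newline are always present:

```python
raw = "+abc_cde.fgh_jkl\n"

by_slice = raw[1:-1]                         # drop first and last character
by_strip = raw.lstrip('+').rstrip('\n')      # drop only those characters, if present

print(by_slice, by_strip)
```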

Jon.



Re: generating random passwords ... for a csv file with user details

2006-05-28 Thread Jon Clements
Something like:

import csv
in_csv=csv.reader( file('your INPUT filenamehere.csv') )
out_csv=csv.writer( file('your OUTPUT filenamehere.csv','wb') )
## If you have a header record on your input file, then
out_csv.writerow( in_csv.next() )
## Iterate over your input file
for row in in_csv:
    # Row will be a list where row[0]=userid and row[3]=passwd
    password=some_function_as_advised_by_rest_of_group()
    # Either: assuming you want to write password as new field then
    out_csv.writerow( row + [password] )
    # Or: assuming you want to over-write password field then
    row[3] = password
    out_csv.writerow(row)
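
For reference, a self-contained sketch of the over-write variant on Python 3. The names make_password and fill_passwords, the sample rows, and the use of the secrets module are my own stand-ins, not what the rest of the group advised; in-memory files are used so it runs without touching disk.

```python
import csv
import io
import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def make_password(length=10):
    # secrets.choice() draws from a cryptographically strong source.
    return ''.join(secrets.choice(ALPHABET) for _ in range(length))

def fill_passwords(in_file, out_file):
    reader = csv.reader(in_file)
    writer = csv.writer(out_file)
    writer.writerow(next(reader))      # copy the header record
    for row in reader:
        row[3] = make_password()       # row[3] is the passwd column
        writer.writerow(row)

source = io.StringIO("userid,realname,dateofB,passwd\n"
                     "u1,Ann,1980-01-01,\n"
                     "u2,Bob,1975-05-05,\n")
target = io.StringIO()
fill_passwords(source, target)
rows = list(csv.reader(io.StringIO(target.getvalue())))
print(rows)
```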

All the best,

Jon.

k.i.n.g. wrote:
> Hi ALL,
>
> I am sorry for not mentioning that I am new to python and scripting.
> How can I add the above script to handle csv file. I want the script to
> generate passwords in the passwords column/row in a csv file.
>
> userid,realname,dateofB,passwd
>
> The script should read the userid and genrate the password for each
> user id (there are thousands of userids)
> 
> Kanthi



Re: Using print instead of file.write(str)

2006-06-01 Thread Jon Clements
Didn't know of the >> syntax:  lovely to know about it Bruno - thank
you.

To the OP - I find the print statement useful for something like:
print 'this','is','a','test'
>>> 'this is a test'
(with implicit newline and implicit spacing between parameters)

If you want more control (more flexibility, perhaps?) over the
formatting of the output: be it spacing between parameters or newline
control, use the methods Bruno describes below.

I'm not sure if you can suppress the spacing between elements (would
love to be corrected though); to stop the implicit newline use
something like
print 'testing',
>>> 'testing'
(but - with the leading comma, the newline is suppressed)

I personally find that print is convenient for sentences (or writing
'lines').

Thought it worth pointing this out in case, like some I know, you come
a cropper with certain output streams.
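
On the spacing question: with the print function (Python 3, which is an assumption here since this thread uses the print statement) the separator and line ending are plain keyword arguments. A small sketch writing to an in-memory stream:

```python
import io
import sys

buf = io.StringIO()
print('this', 'is', 'a', 'test', file=buf)        # spaces between, newline at end
print('no', 'spaces', sep='', end='', file=buf)   # no separator, no newline
buf.write(' raw write\n')                         # write() adds nothing itself
output = buf.getvalue()
sys.stdout.write(output)
```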

All the best,

Jon.



Bruno Desthuilliers wrote:
> A.M a écrit :
> > Hi,
> >
> >
> > I found print much more flexible that write method. Can I use print instead
> > of file.write method?
> >
>
> f = open("/path/to/file")
> print >> f, "this is my %s message" % "first"
> f.close()
>
> To print to stderr:
>
> import sys
> print >> sys.stderr, "oops"
>
> FWIW, you and use string formating anywhere, not only in print statements:
>
> s = "some %s and % formating" % ("nice", "cool")
> print s
>
> You can also use "dict formating":
>
> names = {"other": "A.M.", "me" : "bruno"}
> s = "hello %(other)s, my name is %(me)s" % names



Re: Using print instead of file.write(str)

2006-06-01 Thread Jon Clements
I meant 'trailing': not leading.

mea culpa.

Jon.

Jon Clements wrote:
> Didn't know of the >> syntax:  lovely to know about it Bruno - thank
> you.
>
> To the OP - I find the print statement useful for something like:
> print 'this','is','a','test'
> >>> 'this is a test'
> (with implicit newline and implicit spacing between parameters)
>
> If you want more control (more flexibility, perhaps?) over the
> formatting of the output: be it spacing between parameters or newline
> control, use the methods Bruno describes below.
>
> I'm not sure if you can suppress the spacing between elements (would
> love to be corrected though); to stop the implicit newline use
> something like
> print 'testing',
> >>> 'testing'
> (but - with the leading comma, the newline is suppressed)
>
> I personally find that print is convenient for sentences (or writing
> 'lines').
>
> Thought it worth pointing this out in case, like some I know, you come
> across a cropper with certain output streams.
>
> All the best,
>
> Jon.
>
>
>
> Bruno Desthuilliers wrote:
> > A.M a écrit :
> > > Hi,
> > >
> > >
> > > I found print much more flexible that write method. Can I use print 
> > > instead
> > > of file.write method?
> > >
> >
> > f = open("/path/to/file")
> > print >> f, "this is my %s message" % "first"
> > f.close()
> >
> > To print to stderr:
> >
> > import sys
> > print >> sys.stderr, "oops"
> >
> > FWIW, you and use string formating anywhere, not only in print statements:
> >
> > s = "some %s and % formating" % ("nice", "cool")
> > print s
> >
> > You can also use "dict formating":
> >
> > names = {"other": "A.M.", "me" : "bruno"}
> > s = "hello %(other)s, my name is %(me)s" % names



Re: How to generate k+1 length strings from a list of k length strings?

2006-06-08 Thread Jon Clements
Are you asking the question, "Which pairs of strings have one character
different in each?", or "Which pairs of strings have a substring of
len(string) - 1 in common?".

Jon.

Girish Sahani wrote:
> I have a list of strings all of length k. For every pair of k length
> strings which have k-1 characters in common, i want to generate a k+1
> length string(the k-1 common characters + 2 not common characters).
> e.g i want to join 'abcd' with bcde' to get 'abcde' but i dont want to
> join 'abcd' with 'cdef'
>  Currently i'm joining every 2 strings, then removing duplicate characters
> from every joined string and finally removing all those strings whose
> length != k+1.Here's the code i've written:
>
>   for i in range(0,len(prunedK) - 1,1):
> if k in range(1,len(prunedK),1) & i+k <= len(prunedK) -1:
> colocn = prunedK[i] + prunedK[i+k]
> prunedNew1.append(colocn)
> continue
> for string in prunedNew1:
> stringNew = withoutDup(string)
> prunedNew.append(stringNew)
> continue
>
> But this one is quite bad in the time aspect :(.
> Thanks in advance,
> girish



Re: "groupby" is brilliant!

2006-06-13 Thread Jon Clements
Not related to itertools.groupby, but the csv.reader object...

If for some reason you have malformed CSV files, with embedded newlines
or something of that effect, it will raise an exception. To skip those,
you will need a construct of something like this:

raw_csv_in = file('filenamehere.csv')
for raw_line in raw_csv_in:
    try:
        # Do something to raw_line here maybe if necessary to "clean it up"
        row = csv.reader( [raw_line] ).next()
        # Do your stuff here
    except csv.Error:
        pass # or do something more appropriate if the record is important

May not be applicable in your case, but has stung me a few times...
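
The same construct as a runnable sketch on Python 3. The sample lines and the name parse_lines are made up; strict=True is used so that the reader actually raises on the badly quoted line. Note that parsing line by line like this also gives up on legitimately quoted fields that span lines.

```python
import csv

def parse_lines(lines):
    good, bad = [], []
    for raw_line in lines:
        try:
            # One reader per line, so a bad record cannot poison the rest.
            row = next(csv.reader([raw_line], strict=True))
        except csv.Error:
            bad.append(raw_line)      # keep it for inspection later
        else:
            good.append(row)
    return good, bad

sample = ['1,"ok",3\n', '2,"bad"x,3\n', '4,fine,6\n']   # made-up data
good, bad = parse_lines(sample)
print(good, bad)
```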

All the best,

Jon.


Frank Millman wrote:
> Paul McGuire wrote:
> > >
> > > reader = csv.reader(open('trans.csv', 'rb'))
> > > rows = []
> > > for row in reader:
> > > rows.append(row)
> > >
> >
> > This is untested, but you might think about converting your explicit "for...
> > append" loop into either a list comp,
> >
> > rows = [row for row in reader]
> >
> > or just a plain list constructor:
> >
> > rows = list(reader)
> >
> > Neh?
> >
> > -- Paul
> >
>
> Yup, they both work fine.
>
> There may be times when you want to massage the data before appending
> it, in which case you obviously have to do it the long way. Otherwise
> these are definitely neater, the last one especially.
>
> You could even do it as a one-liner -
> rows = list(csv.reader(open('trans.csv', 'rb')))
> 
> It still looks perfectly readable to me.
> 
> Thanks
> 
> Frank



Re: __cmp__ method

2006-06-14 Thread Jon Clements
This probably isn't exactly what you want, but, unless you wanted to do
something especially with your own string class, I would just pass a
function to the sorted algorithm.

eg:

sorted( [a,b,c], cmp=lambda a,b: cmp(len(a),len(b)) )

gives you the below in the right order...

Never tried doing what you're doing, but something about builtin types,
and there's a UserString module...
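
As a rough sketch of the same thing without cmp (which no longer exists in Python 3), sorted also accepts a key function. The list below just reuses the strings from the original post:

```python
words = ['Personal firewall', 'Personal', 'abc']

# key=len sorts by length without needing to subclass str
by_length = sorted(words, key=len)
print(by_length)
```

The key function is called once per item, so it is usually cheaper than a comparison function as well.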

Hope that helps a bit anyway,

Jon.

JH wrote:

> Hi
>
> Can anyone explain to me why the following codes do not work? I want to
> try out using __cmp__ method to change the sorting order. I subclass
> the str and override the __cmp__ method so the strings can be sorted by
> the lengh. I expect the shortest string should be in the front. Thanks
>
> >>> class myStr(str):
>         def __init__(self, s):
>                 str.__init__(self, s) # Ensure super class is initialized
>         def __cmp__(self, other):
>                 return cmp(len(self), len(other))
>
> >>> a = myStr('abc')
> >>> b = myStr('Personal')
> >>> c = myStr('Personal firewall')
> >>> sorted([c, b, a])
> ['Personal', 'Personal firewall', 'abc']
> >>>



Re: Iteration over recursion?

2006-06-20 Thread Jon Clements

MTD wrote:
> Hello all,
>

(snip)

> I've been told that iteration in python is generally more
> time-efficient than recursion. Is that true?

(snip)

AFAIK, in most languages it's a memory thing. Each time a function
calls itself, the 'state' of that function has to be stored somewhere
so that it may continue, as was, when the recursive function returns.
Therefore, you can effectively think of it as creating N many objects
which don't start getting released until the very last nested call
returns. This (depending on the stack size and implementation etc...)
may force several memory allocations. Then of course, as it starts
going back 'upwards' (towards the initiator of the recursive call that
is), the garbage collector may kick in freeing memory...

Depending on return values, iterating will just require space for the
returned value from each function in turn (which in most cases, I
would imagine, fits on the stack), so although it's doing effectively
the same thing, it's doing so with less memory.
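
To make that concrete, here is a small sketch (Python 3, function names invented for the example) that sums 1..n both ways. It assumes the default CPython behaviour where exceeding the recursion limit raises RecursionError:

```python
import sys

def total_recursive(n):
    """Sum 1..n recursively; each call adds a stack frame."""
    if n == 0:
        return 0
    return n + total_recursive(n - 1)

def total_iterative(n):
    """Sum 1..n in a loop; no extra frames are created."""
    result = 0
    for i in range(1, n + 1):
        result += i
    return result

# Deliberately go deeper than the interpreter allows for recursion
depth = sys.getrecursionlimit() * 2

iterative_result = total_iterative(depth)

try:
    total_recursive(depth)
    recursion_failed = False
except RecursionError:
    recursion_failed = True

print(iterative_result, recursion_failed)
```

The loop copes with any depth, while the recursive version runs out of frames long before it runs out of numbers.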

I probably haven't explained too well, and I may not even be right. So
take with a pinch of salt.

All the best,

Jon.



Re: Iteration over recursion?

2006-06-20 Thread Jon Clements

Sudden Disruption wrote:

> Bruno,
>
> > It doesn't. Technical possible, but BDFL's decision...
>
> Sure.  But why bother?
>

I agree.

> Anything that can be done with recursion can be done with iteration.
> Turing proved that in 1936.
>
> Recursion was just an attempt to "unify" design approach by abstracting
> iteration and creating a new context.  It allowed the programmer to
> isolate himself from the reality that he was actually iterating.  Talk
> about mind fuck.
>

Well, unless I'm seriously mistaken, it also breaks good design. If a
function calls another function, it's because it requires that
function's specific service. If the service it requires is itself, then
the function should iterate over a set of data and  accumulate/reduce
or whatever else it needs to do. As well as that, I can imagine
exception handling becoming quite cumbersome/clumsy.


> It seems things were just too simple the way they were.
>
> Like all fashion, this too shall pass.

Be great if it does; but I don't imagine this will happen until
examples of traversing a binary tree using recursion disappear from
computer science text books (the ones I have seen anyway...).  Unless,
later in the course (they might do this, I don't know for sure), they
then say, "BTW people, this is the correct way to do it, because the
previous way isn't too good an idea...".
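
For what it's worth, a minimal sketch of walking a binary tree without recursion, using an explicit stack. Node and in_order are names I've made up for the example, not from any textbook or library:

```python
class Node:
    """Hypothetical binary tree node."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def in_order(root):
    """Yield values left-to-right (in-order) using an explicit stack."""
    stack = []
    node = root
    while stack or node is not None:
        # Walk as far left as possible, remembering the path taken
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node.value
        node = node.right

tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
values = list(in_order(tree))
print(values)
```

The stack here holds exactly what the call stack would hold in the recursive version; it is just kept in an ordinary list instead.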


> Sudden Disruption
> --
> Sudden View...
> the radical option for editing text
> http://www.sudden.net/
> http://suddendisruption.blogspot.com

Just my little rant,

Jon.



Re: Iteration over recursion?

2006-06-20 Thread Jon Clements

Kay Schluehr wrote:

> Nick Maclaren wrote:
>
> > Tail recursion removal can often eliminate the memory drain, but the
> > code has to be written so that will work - and I don't know offhand
> > whether Python does it.
>
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/496691
> 
> Regards,
> Kay

Interesting.

Thanks Kay.

Jon.



Re: error with string (beginner)

2006-06-25 Thread Jon Clements

Alex Pavluck wrote:
> Hello. I get the following error with the following code.  Is there
> something wrong with my Python installation?
>
> code:
> import types
> something = input("Enter something and I will tell you the type: ")
>
> if type(something) is types.IntType:
>     print "you entered an integer"
> elif type(something) is types.StringType:
>     print "you entered a string"
>
> error:
> String: Source for exec/eval is unavailable

From the docs:
"""
input( [prompt])

Equivalent to eval(raw_input(prompt)). Warning: This function is not
safe from user errors! It expects a valid Python expression as input;
if the input is not syntactically valid, a SyntaxError will be raised.
Other exceptions may be raised if there is an error during evaluation.
(On the other hand, sometimes this is exactly what you need when
writing a quick script for expert use.)
If the readline module was loaded, then input() will use it to provide
elaborate line editing and history features.

Consider using the raw_input() function for general input from users.
"""

So you want to be using raw_input() for starters...
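
Since raw_input() always hands back a string, the type check has to become a conversion attempt. A rough sketch of that, written for Python 3 (where raw_input() was renamed to input()); the function name describe is just for illustration:

```python
def describe(text):
    """Classify a string as it would arrive from raw_input()."""
    try:
        int(text)
    except ValueError:
        return "you entered a string"
    return "you entered an integer"

# In a real script the text would come from the user, e.g.
#   text = raw_input("Enter something and I will tell you the type: ")
for text in ["42", "hello"]:
    print(describe(text))
```

Nothing typed by the user is ever evaluated as code this way, which is the main reason to avoid the old input().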

Cheers,

Jon.



Extending built-in objects/classes

2006-07-03 Thread Jon Clements
Hi All,

I've reached the point in using Python where projects, instead of being
like 'batch scripts', are becoming more like 'proper' programs.

Therefore, I'm re-designing most of these and have found things in
common which I can use classes for. As I'm only just starting to get
into classes, I see that new style classes are the way to go, so will
be using those. I come from a C++ background, and understand I need to
adjust my thinking in certain ways - I have read
http://www.geocities.com/foetsch/python/new_style_classes.htm.


As a really simple class, I've decided to make a 'str' to include a
'substr' function. Yes, I know this can be done using slicing, and
effectively this is what substr would do: something like;

class mystr(str):
    """
    My rather rubbish but trying to be simple custom string class
    """
    def substr(self,start,length,pad=False):
        """
        Return str of (up to) _length_ chars, starting at _start_ which
        is 1 offset based.
        If pad is True, ensure _length_ chars is returned by padding
        with trailing whitespace.
        """
        return self.[ (start-1): (start-1)+length ]

Ignore the fact pad isn't implemented...

Whatever goes after "self." should be the actual string value of the
string object: How do I work out what this is?
Secondly, I'm not 100% sure what I need for the __init__; is str's
__init__ implicitly called, or do I need to call str's __init__ in
mystr's (I seem to remember seeing some code which did this, as well as
calling super()).

Any criticism is appreciated.

Many thanks,

Jon.



Re: Extending built-in objects/classes

2006-07-03 Thread Jon Clements

[EMAIL PROTECTED] wrote:
> My experience is mostly with old-style classes, but here goes.
>
> first off, the  question is actually easier than you think.
> After all, self is an instance of a string, so   self[3:4] would grab
> the slice of characters between 3 and 4 =)
>

That's kind of funky - I like it. However, I'd still like to know what
the underlying value technically is - any ideas on how to find out?

> as for __init__, what I have found is that if you do not include an
> __init__ function, the parent class's __init__ gets inherited, just
> like any other function, so you dont need one.  If you have multiple
> inheritance, however, you must include an __init__ which calls the
> __init__ on every parent, otherwise only the first parent's gets
> called.

Thanks for your post: it's most appreciated.
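
Putting both points together, a minimal sketch of how the class might look (Python 3 syntax). The padding via ljust is my own guess at what pad should do, and no __init__ is defined because a str's value is already fixed when the object is created:

```python
class mystr(str):
    """Simple str subclass adding a 1-based substr method."""

    def substr(self, start, length, pad=False):
        """Return up to `length` chars starting at 1-based `start`.

        If pad is True, pad with trailing spaces to exactly `length`.
        """
        # self is the string itself, so it can be sliced directly
        piece = self[start - 1:start - 1 + length]
        return piece.ljust(length) if pad else piece

s = mystr('abcdef')
print(repr(s.substr(2, 3)))
print(repr(s.substr(5, 4, pad=True)))
```

Note that the slice comes back as a plain str rather than a mystr; wrap it in mystr(...) if the result needs the extra method too.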

Cheers,

Jon.



Re: Extending built-in objects/classes

2006-07-03 Thread Jon Clements

John Machin wrote:
(snip)
>
> You have already been told: you don't need "self.", you just write
> "self" ... self *is* a reference to the instance of the mystr class that
> is being operated on by the substr method.
>
(snip)

I get that; let me clarify why I asked again.

As far as I'm aware, the actual representation of a string needn't be
the same as its 'physical' value. ie, a string could always appear in
uppercase ('ABCD'), while stored as 'aBcd'. If I need to guarantee that
substr always returns from the physical representation and not the
external appearance, how would I do this? Or would self always return the
internal representation (and if so, how would I get the external appearance)?

Or I could be talking complete _beep_ - in which case I apologise.
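
A small experiment along those lines, as a sketch only (Python 3; the class name shouty is invented). It assumes the "external appearance" is produced by overriding __str__:

```python
class shouty(str):
    """Hypothetical str subclass that displays itself in upper case."""

    def __str__(self):
        return self.upper()

s = shouty('aBcd')
stored = s[0:2]       # slicing works on the stored characters
shown = str(s)[0:2]   # slice the displayed form explicitly if wanted
print(stored, shown)
```

So slicing self goes to the stored characters, and the displayed form only appears when something asks for str(s).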

Jon.



Re: IRC questions!!

2006-07-07 Thread Jon Clements

bruce wrote:
> hi...
>
> i'm trying to figure out what i have to do to setup mIRC to get the #python
> channel on IRC!!
>
> any pointers. the mIRC docs didn't get me very far.
>
> is there an irc.freenode.net that i need to connect to? how do i do it?
>
> thanks..
>
> -bruce

Assuming you're familiar with the basics of IRC.

In mIRC, File->Select Server->Add, enter "Freenode" as description,
enter "irc.freenode.net" as server. Leave the port as 6667, then change
it later if server supports other ports.

Click connect to server.

Cheers,

Jon.



Re: Simple question on indexing

2006-12-01 Thread Jon Clements

Tartifola wrote:

> Hi,
> I would like to obtain the position index in a tuple when an IF
> statement is true. Something like
>
> >>>a=['aaa','bbb','ccc']
> >>>[ ??? for name in a if name == 'bbb']
> >>>1
>
> but I'm not able to find the name of the function ??? in the python 
> documentation, any help?
> Thanks

Ummm, that's a list not a tuple: I'll assume you meant sequence.

This will generate a list of indexes which match the criteria:

>>> a = [ 1, 2, 3, 2, 5, 7]
>>> [elno for elno,el in enumerate(a) if el == 2]
[1, 3]
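
Applied to the list from the question, and assuming only the first match is wanted, list.index is a shorter alternative. A quick sketch of both:

```python
a = ['aaa', 'bbb', 'ccc']

# All positions where the condition holds
matches = [i for i, name in enumerate(a) if name == 'bbb']

# First position only; raises ValueError if 'bbb' is absent
first = a.index('bbb')
print(matches, first)
```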

hth
Jon.



Re: route planning

2006-12-01 Thread Jon Clements
It's not really what you're after, but I hope it might give some ideas
(useful or not, I don't know).

How about considering a vertex as a point in space (most libraries will
allow you to decorate a vertex with additional information), then
creating an edge between vertices, which will be your 'path'. You can
then decorate the edge with information such as distance/maximum speed
etc...

Then all you need to do is use an A* path algorithm or shortest path
search to get the shortest / most efficient route. You might need a
custom visitor to suit the 'weight'/'score' of how efficient the path
is.
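
To give a flavour of the idea without any library, here is a minimal sketch using Dijkstra's algorithm (plain shortest path, not A*) over a dict of dicts. The graph layout, the place names and the function name shortest_path are all assumptions for the example:

```python
import heapq

def shortest_path(graph, start, goal):
    """Return (cost, path) for the cheapest route, or (None, []) if none."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(
                    queue, (cost + weight, neighbour, path + [neighbour]))
    return None, []

# Hypothetical road network: each edge is decorated with a distance
roads = {
    'A': {'B': 4, 'C': 1},
    'B': {'D': 1},
    'C': {'B': 2, 'D': 6},
    'D': {},
}
cost, path = shortest_path(roads, 'A', 'D')
print(cost, path)
```

Swapping the distance for something like distance divided by maximum speed would turn "shortest" into "quickest" without touching the search itself.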

I know this probably isn't of much help, but I hope it comes in useful;
I've only ever used Boost.Graph (which is C++, but I believe it has a
Python binding) and that was for something else -- although I do recall
it had examples involving Kevin Bacon and dependency tracking etc... so
a good old Google might do you some good -- ie, it's not completely
related, but it might give you a few extra things to search on...

All the best with the search.

Jon.

PS. If you do find a library, can you let me know? I'd be interested in
having a play with it...



Re: How to replace a comma

2006-12-18 Thread Jon Clements

Lad wrote:

> In a text I need to
> add a blank(space) after a comma but only if there was no blank(space)
> after the comma
> If there was a blank(space), no change is made.
>
> I think it could be a task for regular expression but can not figure
> out the correct regular expression.
> Can anyone help, please?
> Thank you
> L.

Off the top of my head, something like

re.sub(', *', ', ', 'a, b,c,d,e, f')

meets your requirements (it also ensures the number of spaces after the
comma is exactly one). However, you may need to refine the rules
depending on what you really want to achieve. For instance, what happens
with: a comma appearing before any text, consecutive commas (ie ,,,), or
commas within quotes?
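
A quick sketch of two variants side by side. The second pattern, using a negative lookahead, is my own addition for the case where existing spacing must be left exactly as it is:

```python
import re

text = 'a, b,c,d,e,  f'

# Normalise to exactly one space after every comma
normalised = re.sub(', *', ', ', text)

# Only add a space where none follows; existing spacing is left alone
minimal = re.sub(r',(?! )', ', ', text)

print(repr(normalised))
print(repr(minimal))
```

Neither version knows about quoting, so commas inside quoted text get the same treatment as any other.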

hth
Jon.



Re: Encoding / decoding strings

2007-01-05 Thread Jon Clements

[EMAIL PROTECTED] wrote:

> Hey Everyone,
>
> Was just wondering if anyone here could help me. I want to encode (and
> subsequently decode) email addresses to use in URLs. I believe that
> this can be done using MD5.
>
> I can find documentation for encoding the strings, but not decoding
> them. What should I do to encode =and= decode strings with MD5?
>
> Many Thanks in Advance,
> Oliver Beattie

Depends what you mean by "encode email addresses to use in URLs". MD5
is a cryptographic one-way hash function; it creates a 'fingerprint'
of the input data, and given only that fingerprint it's not feasible
to reproduce the original input.

Is this what you're looking for?

>>> import urllib
>>> urllib.quote('[EMAIL PROTECTED]')
'some.persons%40somedomain.com'
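
For comparison, a small sketch of both approaches in Python 3, where quote lives in urllib.parse. The address is a made-up placeholder:

```python
import hashlib
from urllib.parse import quote, unquote

address = 'user@example.com'  # hypothetical address

encoded = quote(address)      # reversible percent-encoding
decoded = unquote(encoded)

# MD5 gives a fixed-length digest; there is no matching "decode"
digest = hashlib.md5(address.encode('utf-8')).hexdigest()

print(encoded, decoded, digest)
```

Percent-encoding only makes the address safe to put in a URL; it does nothing to hide it.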

hth
Jon.



Re: Writing a nice formatted csv file

2007-05-02 Thread Jon Clements
On 2 May, 15:14, redcic <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I use the csv module of Python to write a file. My code is of the
> form :
>
> cw = csv.writer(open("out.txt", "wb"))
> cw.writerow([1,2,3])
> cw.writerow([10,20,30])
>
> And i get an out.txt file looking like:
> 1,2,3
> 10,20,30
>
> Whereas what I'd like to get is:
> 1,2,3,
> 10,  20,   30
>
> which is more readable.
>
> Can anybody help me to do so ?

How about pre-formatting the columns beforehand, using something
like:

# List of formatting to apply to each column (change this to suit)
format = ['%03d', '%10s', '%10s']

data = [1, 10, 100]
print [fmt % d for fmt,d in zip(format,data)]

Results in: ['001', '        10', '       100']

Then write that using the CSV module.
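
A minimal sketch of the whole round trip in Python 3. The io.StringIO object stands in for a real file (which would be opened with newline=''), and the second data row is invented for the example:

```python
import csv
import io

# One format per column (change these to suit)
formats = ['%03d', '%10s', '%10s']
rows = [[1, 10, 100], [2, 20, 200]]

out = io.StringIO()  # stands in for open("out.txt", "w", newline="")
writer = csv.writer(out)
for row in rows:
    writer.writerow([fmt % value for fmt, value in zip(formats, row)])

text = out.getvalue()
print(text)
```

Bear in mind the padding becomes part of each field, so anything reading the file back will need to strip it.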

hth

Jon.






