It's ...

2009-06-24 Thread Angus Rodgers

... my first Python program!  So please be gentle (no fifty ton
weights on the head!), but tell me if it's properly "Pythonic",
or if it's a dead parrot (and if the latter, how to revive it).

I'm working from Beazley's /Python: Essential Reference/ (2nd
ed. 2001), so my first newbie question is how best to find out
what's changed from version 2.1 to version 2.5. (I've recently
installed 2.5.4 on my creaky old Win98SE system.) I expect to 
be buying the 4th edition when it comes out, which will be soon,
but before then, is there a quick online way to find this out?

Having only got up to page 84 - where we can actually start to
read stuff from the hard disk - I'm emboldened to try to learn
to do something useful, such as removing all those annoying hard
tab characters from my many old text files (before I cottoned on
to using soft tabs in my text editor).

This sort of thing seems to work, in the interpreter (for an 
ASCII text file, named 'h071.txt', in the current directory):

stop = 3   # Tab stops every 3 characters
from types import StringType   # Is this awkwardness necessary?
detab = lambda s : StringType.expandtabs(s, stop)  # Or use def
f = open('h071.txt')   # Do some stuff to f, perhaps, and then:
f.seek(0)
print ''.join(map(detab, f.xreadlines()))
f.close()

Obviously, to turn this into a generally useful program, I need
to learn to write to a new file, and how to parcel up the Python
code, and write a script to apply the "detab" function to all the
files found by searching a Windows directory, and replace the old
files with the new ones; but, for the guts of the program, is this
a reasonable way to write the code to strip tabs from a text file?
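
A minimal sketch of the kind of script described above, and not
code from this thread. It is written for a current Python 3 rather
than the 2.5 discussed here. The names detab_file and detab_tree,
the '.txt' suffix filter, and the choice to write a temporary file
and swap it in with os.replace() are assumptions for illustration.

```python
import os
import tempfile


def detab_file(path, tabsize=3):
    """Rewrite one text file in place with tabs expanded to spaces."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new version to a temporary file in the same directory...
    fd, temp_path = tempfile.mkstemp(dir=directory)
    # newline="" leaves "\r\n" or "\n" line endings exactly as they were.
    with os.fdopen(fd, "w", newline="") as out, open(path, newline="") as src:
        for line in src:
            out.write(line.expandtabs(tabsize))
    # ...and only then replace the old file with the new one.
    os.replace(temp_path, path)


def detab_tree(root, tabsize=3, suffix=".txt"):
    """Apply detab_file to every file under root whose name ends in suffix."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                detab_file(os.path.join(dirpath, name), tabsize)
```

Writing to a temporary file first means the old file is only
replaced once the new one is complete. The files are assumed to be
plain ASCII text, as in the example above.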

For writing the output file, this seems to work in the interpreter:

g = open('temp.txt', 'w')
g.writelines(map(detab, f.xreadlines()))
g.close()

In practice, does this avoid creating the whole string in memory
at one time, as is done by using ''.join()? (I'll have to read up
on "opaque sequence objects", which have only been mentioned once
or twice in passing - another instance perhaps being an xrange()?)
Not that that matters much in practice (in this simple case), but
it seems elegant to avoid creating the whole output file at once.

OK, I'm just getting my feet wet, and I'll try not to ask too many
silly questions!

First impressions are: (1) Python seems both elegant and practical;
and (2) Beazley seems a pleasantly unfussy introduction for someone 
with at least a little programming experience in other languages.

-- 
Angus Rodgers
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: It's ...

2009-06-24 Thread Angus Rodgers
On Wed, 24 Jun 2009 20:53:49 +0100, I wrote:

>[...] my first newbie question is how best to find out
>what's changed from version 2.1 to version 2.5.
>[...] is there a quick online way to find this out?

One way seems to be:

<http://www.python.org/doc/2.3/whatsnew/>
<http://www.python.org/doc/2.4/whatsnew/>
<http://www.python.org/doc/2.5/whatsnew/>

... although there doesn't seem to be any
<http://www.python.org/doc/2.2/whatsnew/>

... ah! ...
<http://www.python.org/doc/2.2.3/whatsnew/>
"What's New in Python 2.2"

-- 
Angus Rodgers


Re: It's ...

2009-06-24 Thread Angus Rodgers
On Wed, 24 Jun 2009 16:40:29 -0400, "J. Cliff Dyer"
 wrote:

>On Wed, 2009-06-24 at 20:53 +0100, Angus Rodgers wrote:
>> [...]
>> from types import StringType   # Is this awkwardness necessary?
>
>Not anymore.  You can just use str for this.
>
>> detab = lambda s : StringType.expandtabs(s, stop)  # Or use def
>
>First, use def.  lambda is a rarity for use when you'd rather not assign
>your function to a variable.  
>
>Second, expandtabs is a method on string objects.  s is a string object,
>so you can just use s.expandtabs(stop)

How exactly do I get detab, as a function from strings to strings
(for a fixed tab size)?  (This is aside from the point, which you
make below, that the whole map/join idea is a bit of a no-no - in
some other context, I might want to isolate a method like this.)

>Third, I'd recommend passing your tabstops into detab with a default
>argument, rather than defining it irrevocably in a global variable
>(which is brittle and ugly)

No argument there - I was just messing about in the interpreter,
to see if the main idea worked.
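
A minimal sketch of the two suggestions quoted above: a def that
calls the string method directly, with the tab size as a default
argument. It uses Python 3 print syntax. The name detab8 and the
use of functools.partial (available from Python 2.5) to fix the
tab size are illustrative assumptions, not something proposed in
the thread.

```python
from functools import partial


def detab(s, tabsize=3):
    """Return a copy of s with tabs expanded to stops every tabsize columns."""
    return s.expandtabs(tabsize)


# A string-to-string function for one fixed tab size, e.g. to pass to map().
detab8 = partial(detab, tabsize=8)

print(repr(detab("a\tb")))
print(repr(detab8("a\tb")))
```

Either detab or detab8 can be handed to map() or used in a loop,
and neither depends on a global variable.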

>> f = open('h071.txt')   # Do some stuff to f, perhaps, and then:
>> f.seek(0)
>
>f is not opened for writing, so if you do stuff to the contents of f,
>you'll have to put the new version in a different variable, so f.seek(0)
>doesn't help.  If you don't do stuff to it, then you're at the beginning
>of the file anyway, so either way, you shouldn't need to f.seek(0).

I seemed to find that if I executed f.xreadlines() or f.readlines()
once, I was somehow positioned at the end of the file or something,
and had to do the f.seek(0) - but maybe I did something else silly.
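
The behaviour described here can be reproduced with a short
experiment. This is a sketch in Python 3 syntax; the scratch file
and its contents are made up for the purpose.

```python
import os
import tempfile

# Create a small scratch file to read back.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as scratch:
    scratch.write("one\ntwo\n")

with open(path) as f:
    first_pass = list(f)     # reads every line, leaving f at end of file
    second_pass = list(f)    # nothing is left to read
    f.seek(0)                # rewind to the beginning
    third_pass = list(f)     # the lines are available again

os.remove(path)
print(first_pass, second_pass, third_pass)
```

Reading all the lines leaves the file positioned at its end, so a
second pass finds nothing until f.seek(0) rewinds it.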

>> print ''.join(map(detab, f.xreadlines()))
>
>Sometime in the history of python, files became iterable, which means
>you can do the following:
>
>for line in f:
>    print detab(line)
>
>Much prettier than running through join/map shenanigans.  This is also
>the place to modify the output before passing it to detab:
>
>for line in f:
>    # do stuff to line
>    print detab(line)
>
>Also note that you can iterate over a file several times, provided
>you rewind it in between:
>
>f = open('foo.txt')
>for line in f:
>    print line[0]  # prints the first character of every line
>f.seek(0)  # rewind: the first loop leaves f at the end of the file
>for line in f:
>    print line[1]  # prints the second character of every line
>> f.close()

This all looks very nice.

>> For writing the output file, this seems to work in the interpreter:
>> 
>> g = open('temp.txt', 'w')
>> g.writelines(map(detab, f.xreadlines()))
>> g.close()
>> 
>
>Doesn't help, as map returns a list.

Pity.   Oh, well.

>You can use itertools.imap, or you
>can use a for loop, as above.

This is whetting my appetite!

>The terms to look for, rather than opaque sequence objects are
>"iterators" and "generators".

OK, will do.
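
As a small illustration of what "generator" means in practice,
here is a sketch in Python 3 syntax. The names detab_lines and
noisy_source are hypothetical; noisy_source exists only to record
when each line is actually read.

```python
def detab_lines(lines, tabsize=3):
    """Yield each line with its tabs expanded, one line at a time."""
    for line in lines:
        yield line.expandtabs(tabsize)


def noisy_source(lines, log):
    """Yield lines unchanged, noting in log when each one is read."""
    for line in lines:
        log.append(line)
        yield line


log = []
pipeline = detab_lines(noisy_source(["\ta\n", "\tb\n"], log))
print(log)                    # nothing has been read yet
print(repr(next(pipeline)))   # asking for one line reads one line
print(log)                    # only the first line has been read
```

Calling a generator function does no work by itself; each line is
read and expanded only when the consumer asks for the next item.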

>Glad you're enjoying Beazley.  I would look for something more
>up-to-date.  Python's come a long way since 2.1.  I'd hate for you to
>miss out on all the iterators, booleans, codecs, subprocess, yield,
>unified int/longs, decorators, decimals, sets, context managers and
>new-style classes that have come since then.

I'll get either Beazley's 4th ed. (due next month, IIRC), or Chun,
/Core Python Programming/ (2nd ed.), or both, unless someone has
a better suggestion. (Eventually I'll migrate from Windows 98SE(!),
and will need info on Python later than 2.5, but that's all I need
for now.)

-- 
Angus Rodgers


Re: It's ...

2009-06-24 Thread Angus Rodgers
On Wed, 24 Jun 2009 22:12:33 +0100, I wrote:

>How exactly do I get detab, as a function from strings to strings
>(for a fixed tab size)?

(It's OK - this has been explained in another reply.  I'm still a
little hazy about what exactly objects are in Python, but the haze
will soon clear, I'm sure, especially after I have written more
than one one-line program!)

-- 
Angus Rodgers


Re: It's ...

2009-06-24 Thread Angus Rodgers
On Wed, 24 Jun 2009 14:10:54 -0700, Scott David Daniels
 wrote:

>Angus Rodgers wrote:
>
>> from types import StringType   # Is this awkwardness necessary?
>Nope

I'm starting to see some of the mental haze that was confusing me.

>Also, expandtabs is an instance method, so the roundabout is not needed.
>
> def detab(s):
>     return s.expandtabs(stop)

I'd forgotten where Beazley had explained that "methods such as
... s.expandtabs() always return a new string as opposed to
modifying the string s."  I must have been hazily thinking of it as
somehow modifying s, even though my awkward code itself depended
on a vague understanding that it didn't.  No point in nailing
this polly to the perch any more!

>I'd simply use:
> for line in f:
>     print detab(line.rstrip())
>or even:
> for line in f:
>     print line.rstrip().expandtabs(stop)

I'll read up on iterating through files, somewhere online for
the moment, and then get a more up-to-date textbook.

And I'll try not to ask too many silly questions like this, but
I wanted to make sure I wasn't getting into any bad programming
habits right at the start - and it's a good thing I did, because
I was!

>Nope.  But you could use a generator expression if you wanted:
>  g.writelines(detab(line) for line in f)

Ah, so that actually does what I was fondly hoping my code would
do.  Thanks!  I must learn about these "generator" thingies.
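
A minimal sketch of the quoted writelines() call, in Python 3
syntax. The in-memory io.StringIO objects stand in for the real
input and output files, and TAB_STOP is an illustrative name.

```python
import io

TAB_STOP = 3  # tab stops every 3 characters, as in the original post

# In-memory stand-ins for the input and output files.
f = io.StringIO("a\tb\n\tc\n")
g = io.StringIO()

# writelines() accepts any iterable of strings, so each line is
# expanded and written as it is read from f.
g.writelines(line.expandtabs(TAB_STOP) for line in f)

print(repr(g.getvalue()))
```

The generator expression hands lines to writelines() one at a
time, so this code never builds a list of all the output lines.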

-- 
Angus Rodgers


Re: It's ...

2009-06-24 Thread Angus Rodgers
On Wed, 24 Jun 2009 22:43:01 +0100, I wrote:

>No point in nailing this polly to the perch any more!

Indeed not, so please skip what follows (I've surely been enough
of an annoying newbie, already!), but I've just remembered why I
wrote my program in such an awkward way.  I wanted to be able to
import the type name t (StringType in this case) so that I could
simply use t.m() as the name of one of its methods [if "method" 
is the correct term]; but in this case, where m is expandtabs(),
an additional parameter (the tab size) is needed; so, I used the
lambda expression to get around this, entirely failing to realise
that (as was clearly shown in the replies I got), if I was going
to use "lambda" at all (not recommended!), then it would be a lot 
simpler to write the function as lambda s : s.m(), with or without
any additional parameters needed. (It didn't really have anything
to do with a separate confusion as to what exactly "objects" are.)

>I wanted to make sure I wasn't getting into any bad programming
>habits right at the start

I'm just trying to make sure I really understand how I screwed up.

(In future, I'll try to work through a textbook with exercises.
But I thought I'd better try to get some quick feedback at the
start, because I knew that I was fumbling around, and that it 
was unlikely to be necessary to use such circumlocutions.)

-- 
Angus Rodgers


Re: [SPAM] It's ...

2009-06-25 Thread Angus Rodgers
Someone has gently directed me to the Tutor mailing list:
<http://mail.python.org/mailman/listinfo/tutor>
which I hadn't known about.  I've joined, and will try to
confine my initial blundering experiments to there.  Sorry
about the spam spam spam spam, lovely spam, wonderful spam!
-- 
Angus Rodgers


Re: It's ...

2009-06-25 Thread Angus Rodgers
On Thu, 25 Jun 2009 17:53:51 +0100, I wrote:

>On Thu, 25 Jun 2009 10:31:47 -0500, Kirk Strauser 
> wrote:
>
>>At 2009-06-24T19:53:49Z, Angus Rodgers  writes:
>>
>>> print ''.join(map(detab, f.xreadlines()))
>>
>>An equivalent in modern Pythons:
>>
>>>>> print ''.join(line.expandtabs(3) for line in file('h071.txt'))
>
>I guess the code below would also have worked in 2.1?
>(It does in 2.5.4.)
>
> print ''.join(line.expandtabs(3) for line in \
> file('h071.txt').xreadlines())

Possibly silly question (in for a penny ...): does the new feature,
by which a file becomes iterable, operate by some kind of coercion
of a file object to a list object, via something like x.readlines()?

-- 
Angus Rodgers


Re: It's ...

2009-06-25 Thread Angus Rodgers
On Thu, 25 Jun 2009 17:56:47 +0100, I burbled incoherently:

>[...] does the new feature,
>by which a file becomes iterable, operate by some kind of coercion
>of a file object to a list object, via something like x.readlines()?

Sorry to follow up my own post yet again (amongst my weapons is
a fanatical attention to detail when it's too late!), but I had 
better rephrase that question:

Scratch "list object", and replace it with something like: "some
kind of iterator object, that is at least already implicit in 2.1
(although the term 'iterator' isn't mentioned in the index to the
2nd edition of Beazley's book)".  Something like that!  8-P
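
One way to explore this question is to drive the iteration by
hand. The sketch below uses Python 3 syntax, with an in-memory
io.StringIO standing in for an open text file; it shows the
iterator protocol at work rather than any conversion to a list.

```python
import io

f = io.StringIO("first\nsecond\n")   # stands in for an open text file

it = iter(f)              # what a for loop asks for behind the scenes
same_object = it is f     # the file object serves as its own iterator
first = next(it)          # each next() reads just one more line
second = next(it)
try:
    next(it)
    exhausted = False
except StopIteration:     # raised when there are no more lines
    exhausted = True

print(same_object, repr(first), repr(second), exhausted)
```

A for loop over a file does the same thing: it calls next()
repeatedly until StopIteration is raised, one line at a time.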

-- 
Angus Rodgers


Re: It's ...

2009-06-25 Thread Angus Rodgers
On Thu, 25 Jun 2009 10:31:47 -0500, Kirk Strauser 
 wrote:

>At 2009-06-24T19:53:49Z, Angus Rodgers  writes:
>
>> print ''.join(map(detab, f.xreadlines()))
>
>An equivalent in modern Pythons:
>
>>>> print ''.join(line.expandtabs(3) for line in file('h071.txt'))

I guess the code below would also have worked in 2.1?
(It does in 2.5.4.)

 print ''.join(line.expandtabs(3) for line in \
 file('h071.txt').xreadlines())

-- 
Angus Rodgers


Re: It's ...

2009-06-25 Thread Angus Rodgers
On Thu, 25 Jun 2009 17:56:47 +0100, I found a new way to disgrace
myself, thus:

>[...] something like x.readlines()?
                       ^
I don't know how that full stop got in there.  Please ignore it!
-- 
Angus Rodgers


Re: It's ...

2009-06-26 Thread Angus Rodgers
On Thu, 25 Jun 2009 18:22:48 +0100, MRAB 
 wrote:

>Angus Rodgers wrote:
>> On Thu, 25 Jun 2009 10:31:47 -0500, Kirk Strauser 
>>  wrote:
>> 
>>> At 2009-06-24T19:53:49Z, Angus Rodgers  writes:
>>>
>>>> print ''.join(map(detab, f.xreadlines()))
>>> An equivalent in modern Pythons:
>>>
>>>>>> print ''.join(line.expandtabs(3) for line in file('h071.txt'))
>> 
>> I guess the code below would also have worked in 2.1?
>> (It does in 2.5.4.)
>> 
>>  print ''.join(line.expandtabs(3) for line in \
>>  file('h071.txt').xreadlines())
>> 
>That uses a generator expression, which was introduced in 2.4.

Sorry, I forgot that list comprehensions need square brackets.

The following code works in 2.1 (I installed version 2.1.3, on
a different machine, to check!):

 f = open('h071.txt')   # Can't use file('h071.txt') in 2.1
 print ''.join([line.expandtabs(3) for line in f.xreadlines()])

(Of course, in practice I'll stick to doing it the more sensible
way that's already been explained to me.  I'm ordering a copy of
Wesley Chun, /Core Python Programming/ (2nd ed., 2006), to learn
about version 2.5.)
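
The difference between the two bracket styles can be seen in a
short sketch (Python 3 syntax; the sample lines are made up):

```python
lines = ["a\tb\n", "\tc\n"]

# Square brackets: a list comprehension builds the whole list at once.
as_list = [line.expandtabs(3) for line in lines]

# Parentheses: a generator expression produces items only on demand.
as_generator = (line.expandtabs(3) for line in lines)

print(type(as_list).__name__, type(as_generator).__name__)
print("".join(as_list) == "".join(as_generator))
```

Both forms give join() the same strings. The list can be reused
afterwards, whereas the generator is used up by the first pass.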
-- 
Angus Rodgers


Re: It's ...

2009-06-27 Thread Angus Rodgers
On Sat, 27 Jun 2009 03:32:12 -0300, "Gabriel Genellina"
 wrote:

>Iterators were added in Python 2.2.

Just my luck.  :-)

>See PEP 234 http://www.python.org/dev/peps/pep-0234/

You've got to love a language whose documentation contains 
sentences beginning like this:

 "Among its chief virtues are the following four -- no, five
 -- no, six -- points: [...]"

-- 
Angus Rodgers


Re: change the first character of the line to uppercase in a text file

2009-06-27 Thread Angus Rodgers
On Fri, 26 Jun 2009 18:58:27 -0700 (PDT), powah 
 wrote:

>On Jun 26, 4:51 pm, Chris Rebert  wrote:
>> On Fri, Jun 26, 2009 at 12:43 PM, powah wrote:
>> > How to change the first character of the line to uppercase in a text
>> > file?
>> > [...]
>>
>> We're not in the business of doing homework. Some hints though:
>>
>> `s.upper()` converts the string in variable `s` to all upper case
>> (e.g. "aBcD".upper() --> "ABCD")
>> `for line in afile:` iterates over each line in a file object.
>> [...]
>>
>> And here are the docs on working with files:
>> http://docs.python.org/library/functions.html#open
>> http://docs.python.org/library/stdtypes.html#file-objects
>>
>> That should be enough to get you started.
>
>Thank you for your hint.
>This is my solution:
>f = open('test', 'r')
>for line in f:
>    print line[0].upper()+line[1:],

I know this is homework, so I didn't want to say anything (especially
as I'm a newcomer, also just starting to learn the language), but it
seems OK to mention that if you hunt around some more in the standard
library documentation, you'll find an even shorter way to write this.
-- 
Angus Rodgers


Re: change the first character of the line to uppercase in a text file

2009-06-27 Thread Angus Rodgers
On Fri, 26 Jun 2009 18:58:27 -0700 (PDT), powah
 wrote:

>Thank you for your hint.
>This is my solution:
>f = open('test', 'r')
>for line in f:
>    print line[0].upper()+line[1:],

Will your program handle empty lines of input correctly?
-- 
Angus Rodgers


Re: change the first character of the line to uppercase in a text file

2009-06-27 Thread Angus Rodgers
On Sat, 27 Jun 2009 11:39:28 +0100, I asked rhetorically:

>On Fri, 26 Jun 2009 18:58:27 -0700 (PDT), powah
> wrote:
>
>>Thank you for your hint.
>>This is my solution:
>>f = open('test', 'r')
>>for line in f:
>>    print line[0].upper()+line[1:],
>
>Will your program handle empty lines of input correctly?

Strangely enough, it seems to do so, but why?
-- 
Angus Rodgers


Re: change the first character of the line to uppercase in a text file

2009-06-27 Thread Angus Rodgers
On Sat, 27 Jun 2009 13:02:47 +0200, Peter Otten 
<__pete...@web.de> wrote:

>Angus Rodgers wrote:
>
>> On Sat, 27 Jun 2009 11:39:28 +0100, I asked rhetorically:
>>
>>>Will your program handle empty lines of input correctly?
>> 
>> Strangely enough, it seems to do so, but why?
>
>Because there aren't any. When you read lines from a file there will always 
>be at least the newline character. Otherwise it would indeed fail:
>
>>>> for line in "peter\npaul\n\nmary".splitlines():
>...     print line[0].upper() + line[1:]
>...
>Peter
>Paul
>Traceback (most recent call last):
>  File "<stdin>", line 2, in <module>
>IndexError: string index out of range

Hmm ... the \r\n sequence at the end of a Win/DOS file seems to be
treated as a single character.

-- 
Angus Rodgers


Re: change the first character of the line to uppercase in a text file

2009-06-27 Thread Angus Rodgers
On Sat, 27 Jun 2009 12:13:57 +0100, I wrote:

>the \r\n sequence at the end of a Win/DOS file

Of course, I meant the end of a line of text, not the end of
the file.

(I promise I'll try to learn to proofread my posts.  This is
getting embarrassing!)
-- 
Angus Rodgers


Re: change the first character of the line to uppercase in a text file

2009-06-27 Thread Angus Rodgers
On Sat, 27 Jun 2009 12:13:57 +0100, I wrote:

>Hmm ... the \r\n sequence at the end of a Win/DOS file seems to be
>treated as a single character.

For instance, if test001A.txt is this file:

abc xyz
Bd ef

gH ij

and test001E.py is this:

f = open('test001A.txt', 'r')
for line in f:
   print repr(line)

then the output from "python test001E.py > temp.txt" is this:

'abc xyz\n'
'Bd ef\n'
'\n'
'gH ij\n'

and indeed the output from "print repr(f.read())" is this:

'abc xyz\nBd ef\n\ngH ij\n'

How do you actually get to see the raw bytes of a file in Windows?

OK, this seems to work:

f = open('test001A.txt', 'rb')   # Binary mode
print repr(f.read())

Output:

'abc xyz\r\nBd ef\r\n\r\ngH ij\r\n'

Indeed, when a Windows file is opened for reading in binary mode, 
the length of an "empty" line is returned as 2.  This is starting
to make some sense to me now.
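
The same comparison can be scripted. This is a sketch in Python 3
syntax, which differs from 2.5 in one relevant respect: text mode
there translates "\r\n" to "\n" on any platform by default. The
scratch file is created only for the experiment.

```python
import os
import tempfile

raw = b"abc xyz\r\nBd ef\r\n\r\ngH ij\r\n"   # the bytes shown above

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as scratch:
    scratch.write(raw)

with open(path, "rb") as f:               # binary mode: bytes as stored
    binary_lines = f.readlines()
with open(path, "r", newline=None) as f:  # text mode: "\r\n" read as "\n"
    text_lines = f.readlines()

os.remove(path)
print([len(line) for line in binary_lines])
print([len(line) for line in text_lines])
```

The "empty" third line has length 2 in binary mode and length 1
in text mode, matching the observation above.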

-- 
Angus Rodgers


Re: change the first character of the line to uppercase in a text file

2009-06-27 Thread Angus Rodgers
On Sat, 27 Jun 2009 13:49:57 +0200, Peter Otten 
<__pete...@web.de> wrote:

>Angus Rodgers wrote:
>
>> On Sat, 27 Jun 2009 13:02:47 +0200, Peter Otten
>> <__pete...@web.de> wrote:
>> 
>>>Angus Rodgers wrote:
>>>
>>>> On Sat, 27 Jun 2009 11:39:28 +0100, I asked rhetorically:
>>>>
>>>>>Will your program handle empty lines of input correctly?
>>>> 
>>>> Strangely enough, it seems to do so, but why?
>>>
>>>Because there aren't any. When you read lines from a file there will
>>>always be at least the newline character. Otherwise it would indeed fail:
>>>
>>>>>> for line in "peter\npaul\n\nmary".splitlines():
>>>...     print line[0].upper() + line[1:]
>>>...
>>>Peter
>>>Paul
>>>Traceback (most recent call last):
>>>  File "<stdin>", line 2, in <module>
>>>IndexError: string index out of range
>> 
>> Hmm ... the \r\n sequence at the end of a Win/DOS 
>
>line
>
>> seems to be treated as a single character.
>
>Yes, but "\n"[1:] will return an empty string rather than fail.

Yes, I understood that, and it's logical, but what was worrying me
was how to understand the cross-platform behaviour of Python with
regard to the different representation of text files in Windows
and Unix-like OSs. (I remember getting all in a tizzy about this
the last time I tried to do any programming.  That was in C++,
about eight years ago.  Since then, I've only written a couple of
short BASIC programs for numerical analysis on a TI-84+ calculator,
and I feel as if I don't understand ANYTHING any more, but I expect
it'll come back to me.  Sorry about my recent flurry of confused
posts!  If I have any silly questions of my own, I'll post them to
the Tutor list, but in this instance, I imagined I knew what I was
talking about, and didn't expect to get into difficulties ...)  8-P
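
The point about slicing quoted above can be turned into a version
that is safe even for a truly empty string. This is a sketch in
Python 3 syntax; the name upper_first is hypothetical.

```python
def upper_first(line):
    """Return line with its first character upper-cased, if it has one."""
    # line[:1] is a slice, so it is "" for an empty string, where
    # line[0] would raise IndexError.
    return line[:1].upper() + line[1:]


for sample in ["peter\n", "\n", ""]:
    print(repr(upper_first(sample)))
```

Slices never raise IndexError, so the function works both on lines
read from a file and on the pieces returned by splitlines().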

-- 
Angus Rodgers


Buffer pair for lexical analysis of raw binary data

2009-06-27 Thread Angus Rodgers

Partly as an educational exercise, and partly for its practical
benefit, I'm trying to pick up a programming project from where
I left off in 2001.  It implemented in slightly generalised form
the "buffer pair" scheme for lexical analysis described on pp.
88--92 of Aho et al., /Compilers: Principles, Techniques and 
Tools/ (1986). (I'm afraid I don't have a page reference for the
2007 second edition.  Presumably it's also in Knuth somewhere.)

Documentation for one of the C++ header files describes it thus
(but I never quite got the hang of C++, so some of the language-
specific details may be very poorly conceived):

"An  object incorporates a handle to a file, opened in 
read-only mode, and a buffer containing (by default) raw binary
data from that file. The constructor also has an option to open
a file in text mode.

The buffer may, optionally, consist of several segments, linked
to one another in cyclic sequence. The number of segments is a
constant class member, nblocks (1 <= nblocks <= 32,767). A second
constant class member, block (1 <= block <= 32,767) gives the size
of each of the segments in bytes.

The purpose of creating a buffer in cyclically linked segments
is to allow reference to the history of reading the file, even
though it is being read sequentially. The bare class  
does not do this itself, but is designed so that classes derived
from it may incorporate one or more pointers to parts of the buffer
that have already been read (assuming these parts have not yet been
overwritten).

If there were only one segment, the length of available history
would periodically be reduced to zero, when the buffer is
refreshed. In general, the available history occupies at least 
a fraction (nblocks - 1)/nblocks of a full buffer."

Aho et al. describe the scheme thus (p. 90):

"Two pointers to the input buffer are maintained.  The string
of characters between the two pointers is the current lexeme.
Initially, both pointers point to the first character of the
next lexeme to be found.  One, called the forward pointer, scans
ahead until a match for a pattern is found.  Once the next lexeme
is determined, the forward pointer is set to the character at
its right end.  After the lexeme is processed, both pointers
are set to the character immediately past the lexeme."

[There follows a description of the use of "sentinels" to test
efficiently for pointers moving past the end of input to date.]

I seem to remember (but my memory is still very hazy) that there
was some annoying difficulty in coding the raw binary input file
reading operation in C++ in an implementation-independent way;
and I'm reluctant to go back and perhaps get bogged down again
in whatever way I got bogged down before; so I would prefer to
use Python for the whole thing, if possible (either using some
existing library, or else by recoding it all myself in Python).

Does some Python library already provide some functionality like
this?  (It's enough to do it with nblocks = 2, as in Aho et al.)

If not, is this a reasonable thing to try to program in Python?
(At the same time as learning the language, and partly as a
fairly demanding exercise intended to help me to learn it.)

Or should I just get my hands dirty with some C++ compiler or
other, and get my original code working on my present machine
(possibly in ANSI C instead of C++), and call it from Python?
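
As a rough illustration that the scheme can be expressed in pure
Python, here is a minimal sketch in Python 3 syntax. It is neither
the C++ code described above nor the algorithm exactly as given by
Aho et al.: it uses no sentinels, it keeps both pointers as
absolute file offsets, and it assumes a binary stream whose read(n)
returns n bytes except at end of input. The names BlockBuffer and
words are hypothetical; block and nblocks follow the documentation
quoted above.

```python
import io


class BlockBuffer:
    """Cyclic buffer of nblocks segments, each holding up to block bytes."""

    def __init__(self, stream, block=4096, nblocks=2):
        self.stream = stream              # binary file-like object
        self.block = block
        self.nblocks = nblocks
        self.segments = [b""] * nblocks   # reused in cyclic order
        self.loaded = 0                   # number of blocks read so far
        self.eof = False
        self.begin = 0                    # offset where the lexeme starts
        self.forward = 0                  # offset of the forward pointer

    def _load_next(self):
        """Read the next block over the oldest segment; False at end."""
        data = self.stream.read(self.block)
        if not data:
            self.eof = True
            return False
        oldest_kept = (self.loaded - self.nblocks + 1) * self.block
        if self.begin < oldest_kept:
            raise OverflowError("lexeme is longer than the buffered history")
        self.segments[self.loaded % self.nblocks] = data
        self.loaded += 1
        return True

    def _byte_at(self, pos):
        index, offset = divmod(pos, self.block)
        return self.segments[index % self.nblocks][offset:offset + 1]

    def advance(self):
        """Return the byte under the forward pointer and step past it.

        Returns b"" at end of input, leaving the pointer where it is.
        """
        index, offset = divmod(self.forward, self.block)
        if index == self.loaded and (self.eof or not self._load_next()):
            return b""
        byte = self._byte_at(self.forward)
        if byte:
            self.forward += 1
        return byte

    def accept(self):
        """Return the current lexeme and move both pointers just past it."""
        lexeme = b"".join(self._byte_at(p)
                          for p in range(self.begin, self.forward))
        self.begin = self.forward
        return lexeme


def words(stream, block=4, nblocks=2):
    """Toy scanner: split the input into space-separated words."""
    buf = BlockBuffer(stream, block, nblocks)
    found = []
    while True:
        byte = buf.advance()
        if byte in (b" ", b""):
            word = buf.accept().rstrip(b" ")
            if word:
                found.append(word)
            if not byte:
                return found


print(words(io.BytesIO(b"ab cde f")))
```

The tiny default block size in words() is chosen only so that a
short input exercises the wrap-around from one segment to the
next. A lexeme that outgrows the buffered history raises an error
instead of being silently overwritten.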

-- 
Angus Rodgers


Re: Buffer pair for lexical analysis of raw binary data

2009-06-28 Thread Angus Rodgers
On 28 Jun 2009 08:00:23 -0700, a...@pythoncraft.com (Aahz) wrote:

>In article <0qec45lho8lkng4n20sb1ad4eguat67...@4ax.com>,
>Angus Rodgers   wrote:
>>
>>Partly as an educational exercise, and partly for its practical
>>benefit, I'm trying to pick up a programming project from where
>>I left off in 2001.  It implemented in slightly generalised form
>>the "buffer pair" scheme for lexical analysis described on pp.
>>88--92 of Aho et al., /Compilers: Principles, Techniques and 
>>Tools/ (1986). (I'm afraid I don't have a page reference for the
>>2007 second edition.  Presumably it's also in Knuth somewhere.)
>>
>>  [...]
>>
>>Does some Python library already provide some functionality like
>>this?  (It's enough to do it with nblocks = 2, as in Aho et al.)
>
>Not AFAIK, but there may well be something in the recipes or PyPI; have
>you tried searching them?

Searching for "buffer" at <http://pypi.python.org/pypi> (which I
didn't know about) gives quite a few hits (including reflex 0.1,
"A lightweight regex-based lexical scanner library").

By "recipes", do you mean
<http://code.activestate.com/recipes/langs/python/> (also new to me)?

There is certainly a lot of relevant code there (e.g. "Recipe 392150:
Buffered Stream with Multiple Forward-Only Readers"), which I can try
to learn from, even if I can't use it directly.

Thanks!
-- 
Angus Rodgers