On Dec 21, 2:01 am, Alexander Kapps wrote:
> On 20.12.2011 22:04, Nick Dokos wrote:
> >>> I have a text file containing such data ;
>
> >>> A B C
> >>> ---
> >>> -2.0100e-01  8.000e-02  8.000e-05
On 20.12.2011 22:04, Nick Dokos wrote:
I have a text file containing such data ;
A            B          C
---
-2.0100e-01  8.000e-02  8.000e-05
-2.e-01      0.000e+00  4.800e-04
-1.9900e-01  4.000e-02  1.600e-04
Jérôme wrote:
> Tue, 20 Dec 2011 11:17:15 -0800 (PST)
> Yigit Turgut wrote:
>
> > Hi all,
> >
> > I have a text file containing such data ;
> >
> > A            B          C
> > ---
> > -2.0100e-01  8.000e-02  8.000e-05
Tue, 20 Dec 2011 11:17:15 -0800 (PST)
Yigit Turgut wrote:
> Hi all,
>
> I have a text file containing such data ;
>
> A            B          C
> ---
> -2.0100e-01  8.000e-02  8.000e-05
> -2.e-01      0.000e+00  4.800e-04
On 12/20/2011 02:17 PM, Yigit Turgut wrote:
Hi all,
I have a text file containing such data ;
A            B          C
---
-2.0100e-01  8.000e-02  8.000e-05
-2.e-01      0.000e+00  4.800e-04
-1.9900e-01  4.000e-02  1.600e-04
Hi all,
I have a text file containing such data ;
A            B          C
---
-2.0100e-01  8.000e-02  8.000e-05
-2.e-01      0.000e+00  4.800e-04
-1.9900e-01  4.000e-02  1.600e-04
But I only need Section B, and I
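Taking "Section B" to mean the B column of the sample above, a minimal
sketch of pulling just that column out of the whitespace-separated file
(the file name 'data.txt' is an assumption):

b_values = []
with open('data.txt') as f:
    for line in f:
        fields = line.split()
        if len(fields) < 2:
            continue                            # skip blank lines and the '---' rule
        try:
            b_values.append(float(fields[1]))   # column B is the second field
        except ValueError:
            pass                                # skip the 'A B C' header row
print(b_values)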
> haven't used XSLT, and don't know if there's one in emacs...
>
> it'd be nice if someone actually give a example...
>
Hi Xah, actually I have to correct myself. HTML is not XML. If it were,
you could use a stylesheet like this:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
On Tue, Jul 5, 2011 at 2:37 PM, Xah Lee wrote:
> but in any case, i can't see how this part would work
> ((?:[^<]|<(?!/p>))+)
It's not that different from the pattern 「alt="[^"]+"」 earlier in the
regex. The capture group accepts one or more characters that either
aren't '<', or that are '<' but aren't followed by '/p>'.
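A small, self-contained demonstration of that capture group (the sample
string here is made up for illustration):

import re

sample = '<p>Caption with <em>markup</em> inside</p>'
m = re.search(r'<p>((?:[^<]|<(?!/p>))+)</p>', sample)
print(m.group(1))   # -> Caption with <em>markup</em> inside

The lookahead lets the group swallow embedded tags such as <em> while still
stopping at the closing </p>.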
On Jul 5, 12:17 pm, Ian Kelly wrote:
> On Mon, Jul 4, 2011 at 12:36 AM, Xah Lee wrote:
> > So, a solution by regex is out.
>
> Actually, none of the complications you listed appear to exclude
> regexes. Here's a possible (untested) solution:
>
>
> ((?:\s* height="[0-9]+">)+)
> \s*((?:[^<]|<(?!/
On Jul 4, 12:13 pm, "S.Mandl" wrote:
> Nice. I guess that XSLT would be another (the official) approach for
> such a task.
> Is there an XSLT-engine for Emacs?
>
> -- Stefan
haven't used XSLT, and don't know if there's one in emacs...
it'd be nice if someone actually gave an example...
Xah
--
On Mon, Jul 4, 2011 at 12:36 AM, Xah Lee wrote:
> So, a solution by regex is out.
Actually, none of the complications you listed appear to exclude
regexes. Here's a possible (untested) solution:
((?:\s*)+)
\s*((?:[^<]|<(?!/p>))+)
\s*
and corresponding replacement string:
\1
\2
I don't kno
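The HTML tags have been lost from the quoted pattern and replacement above,
so here is a runnable sketch of the same kind of substitution; the exact
markup (div class="img", the img attributes, the sample input) is assumed
rather than quoted from the post:

import re

pattern = re.compile(
    r'<div class="img">\s*'
    r'((?:<img src="[^"]+" alt="[^"]+" width="[0-9]+" height="[0-9]+">\s*)+)'
    r'<p>((?:[^<]|<(?!/p>))+)</p>\s*'
    r'</div>')
replacement = '<figure>\n\\1<figcaption>\n\\2\n</figcaption>\n</figure>'

html = ('<div class="img">\n'
        '<img src="cat.jpg" alt="a cat" width="600" height="400">\n'
        '<p>A cat, resting.</p>\n'
        '</div>')
print(pattern.sub(replacement, html))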
Nice. I guess that XSLT would be another (the official) approach for
such a task.
Is there an XSLT-engine for Emacs?
-- Stefan
Emacs Lisp: Processing HTML: Transform Tags to HTML5 “figure” and
“figcaption” Tags
Xah Lee, 2011-07-03
Another triumph of using elisp for text processing over perl/python.
The Problem
--
Summary
I want batch tran
pyt...@bdurham.com, 16.12.2010 21:03:
Is text processing with dicts a good use case for Python
cross-compilers like Cython/Pyrex or ShedSkin? (I've read the
cross compiler claims about massive increases in pure numeric
performance).
Cython is generally a good choice for string processing
Is text processing with dicts a good use case for Python
cross-compilers like Cython/Pyrex or ShedSkin? (I've read the
cross compiler claims about massive increases in pure numeric
performance).
I have 3 use cases I'm considering for Python-to-C++
cross-compilers for generating 32-
Steven D'Aprano wrote:
>On Fri, 11 Sep 2009 21:52:36 -0700, Tim Roberts wrote:
>
>> Basically, when you're good with Perl, you start to think of every task
>> in terms of regular expression matches. When you're good with Python,
>> you start to think of every task in terms of lists and tuples.
>
On Fri, 11 Sep 2009 21:52:36 -0700, Tim Roberts wrote:
> Basically, when you're good with Perl, you start to think of every task
> in terms of regular expression matches. When you're good with Python,
> you start to think of every task in terms of lists and tuples.
Not me -- I think of most such
AJAskey wrote:
>
>Never mind. I guess I had been trying to make it more difficult than
>it is. As a note, I can work on something for 10 hours and not figure
>it out. But the second I post to a group, then I immediately figure
>it out myself. Strange snake this Python...
Come sit on the couch
Never mind. I guess I had been trying to make it more difficult than
it is. As a note, I can work on something for 10 hours and not figure
it out. But the second I post to a group, then I immediately figure
it out myself. Strange snake this Python...
Example for anyone else interested:
line =
On Thu, Sep 10, 2009 at 11:36 AM, AJAskey wrote:
> New to Python. I can solve the problem in perl by using "split()" to
> an array. Can't figure it out in Python.
>
> I'm reading variable lines of text. I want to use the first number I
> find. The problem is the lines are variable.
>
> Input
New to Python. I can solve the problem in perl by using "split()" to
an array. Can't figure it out in Python.
I'm reading variable lines of text. I want to use the first number I
find. The problem is the lines are variable.
Input example:
this is a number: 1
here are some numbers 1 2 3 4
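A minimal sketch of grabbing the first number on each line with split(),
run against the two sample lines above (treating "a number" as anything
float() accepts):

for line in ["this is a number: 1", "here are some numbers 1 2 3 4"]:
    for token in line.split():
        try:
            print(float(token))   # the first numeric token on the line
            break                 # stop after the first one
        except ValueError:
            continue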
Thanks Black Jack
Working
On Sep 25, 9:51 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> I have string like follow
> 12560/ABC,12567/BC,123,567,890/JK
>
> I want above string to group like as follow
> (12560,ABC)
> (12567,BC)
> (123,567,890,JK)
>
> i try regular expression i am able to get first two not the third one.
On Sep 25, 6:34 pm, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> On Thu, 25 Sep 2008 15:51:28 +0100, [EMAIL PROTECTED] wrote:
> > I have string like follow
> > 12560/ABC,12567/BC,123,567,890/JK
>
> > I want above string to group like as follow (12560,ABC)
> > (12567,BC)
> > (123,567,890,JK
You can do it with regexps too:

import re
# a "number" is one or more digit groups, possibly joined by commas,
# followed by "/" and an upper-case "word"
to_watch = re.compile(r"(?P<number>\d+(?:,\d+)*)[/](?P<word>[A-Z]+)")
final_list = to_watch.findall("12560/ABC,12567/BC,123,567,890/JK")
for number, word in final_list:
    print "number: %s -- word: %s" % (number, word)
On Thu, 25 Sep 2008 15:51:28 +0100, [EMAIL PROTECTED] wrote:
> I have string like follow
> 12560/ABC,12567/BC,123,567,890/JK
>
> I want above string to group like as follow (12560,ABC)
> (12567,BC)
> (123,567,890,JK)
>
> i try regular expression i am able to get first two not the third one.
> ca
I have a string like the following:
12560/ABC,12567/BC,123,567,890/JK
I want to group the above string like this:
(12560,ABC)
(12567,BC)
(123,567,890,JK)
I tried a regular expression; I am able to get the first two but not the third one.
Can a regular expression split the given data into different groups?
... continued from previous post.
PS I'm cross-posting this post to perl and python groups because i
find it to be a little-known fact that emacs lisp's power in the
area of text processing is far beyond Perl (or Python).
... i have worked as a professional perl programmer since 1998.
Text Processing with Emacs Lisp
Xah Lee, 2007-10-29
This page gives an outline of how to use emacs lisp to do text
processing, using a specific real-world problem as example. If you
don't know elisp, first take a gander at Emacs Lisp Basics.
HTML version with links and colors is at:
[EMAIL PROTECTED] wrote:
>
>And now for something completely different...
>
>I've been reading up a bit about Python and Excel and I quickly told
>the program to output to Excel quite easily. However, what if the
>input file were a Word document? I can't seem to find much
>information about parsing Word files.
And now for something completely different...
I've been reading up a bit about Python and Excel and I quickly told
the program to output to Excel quite easily. However, what if the
input file were a Word document? I can't seem to find much
information about parsing Word files. What could I add
And now for something completely different...
I see a lot of COM stuff with Python for excel...and I quickly made
the same program output to excel. What if the input file were a Word
document? Where is there information about manipulating word
documents, or what could I add to make the same prog
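For the Word side, a hedged sketch using the same COM route the posts
mention for Excel (this needs pywin32 on Windows; the file path is made up):

import win32com.client

word = win32com.client.Dispatch("Word.Application")
doc = word.Documents.Open(r"C:\data\input.doc")
text = doc.Content.Text      # the whole document body as one plain-text string
doc.Close(False)             # close without saving
word.Quit()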
patrick.waldo wrote:
> manipulation? Also, I conceptually get it, but would you mind walking
> me through
>> for key, group in groupby(instream, unicode.isspace):
>>     if not key:
>>         yield "".join(group)
itertools.groupby() splits a sequence into groups with the same key; e. g
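For example, a short, self-contained run of that idiom on data shaped like
the sample later in the thread: grouping lines by "is this line blank?" and
keeping only the non-blank runs yields one string per record.

from itertools import groupby

lines = ["200-001-8 50-00-0\n", "formaldehyd CH2O\n", "\n",
         "200-002-3\n", "50-01-1\n", "guanidinium-chlorid CH5N3.ClH\n"]
records = ["".join(group)
           for key, group in groupby(lines, key=str.isspace)
           if not key]
print(records)   # two records, one per blank-line-separated block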
On Oct 14, 8:48 am, [EMAIL PROTECTED] wrote:
> Hi all,
>
> I started Python just a little while ago and I am stuck on something
> that is really simple, but I just can't figure out.
>
> Essentially I need to take a text document with some chemical
> information in Czech and organize it into another
On Oct 15, 10:08 pm, [EMAIL PROTECTED] wrote:
> Because of my limited Python knowledge, I will need to try to figure
> out exactly how they work for future text manipulation and for my own
> knowledge. Could you recommend some resources for this kind of text
> manipulation? Also, I conceptually g
Wow, thank you all. All three work. To output correctly I needed to
add:
output.write("\r\n")
This is really a great help!!
Because of my limited Python knowledge, I will need to try to figure
out exactly how they work for future text manipulation and for my own
knowledge. Could you recommend some resources for this kind of text manipulation?
patrick.waldo wrote:
> my sample input file looks like this( not organized,as you see it):
> 200-720-769-93-2
> kyselina mocová C5H4N4O3
>
> 200-001-8 50-00-0
> formaldehyd CH2O
>
> 200-002-3
> 50-01-1
> guanidínium-chlorid CH5N3.ClH
Assuming that the records are al
On Oct 15, 12:20 pm, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> On Mon, 15 Oct 2007 10:47:16 +, patrick.waldo wrote:
> > my sample input file looks like this( not organized,as you see it):
> > 200-720-769-93-2
> > kyselina mocová C5H4N4O3
>
> > 200-001-8 50-00-0
>
On Mon, 15 Oct 2007 10:47:16 +, patrick.waldo wrote:
> my sample input file looks like this( not organized,as you see it):
> 200-720-769-93-2
> kyselina mocová C5H4N4O3
>
> 200-001-8 50-00-0
> formaldehyd CH2O
>
> 200-002-3
> 50-01-1
> > guanidínium-chlorid CH5N3.ClH
> lines = open('your_file.txt').readlines()[:4]
> print lines
> print map(len, lines)
gave me:
['\xef\xbb\xbf200-720-769-93-2\n', 'kyselina mo\xc4\x8dov\xc3\xa1 C5H4N4O3\n', '\n', '200-001-8\t50-00-0\n']
[28, 32, 1, 18]
I think it means that I'm still at option 3. I got
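The \xef\xbb\xbf prefix on the first line is a UTF-8 byte order mark. A
minimal sketch of reading the file so the BOM is stripped and the Czech
characters decode properly (the 'utf-8-sig' codec does exactly this; the
file name is the one from the snippet above):

import codecs

f = codecs.open('your_file.txt', encoding='utf-8-sig')
lines = f.readlines()
f.close()
print(lines[:4])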
On Oct 14, 11:48 pm, [EMAIL PROTECTED] wrote:
> Hi all,
>
> I started Python just a little while ago and I am stuck on something
> that is really simple, but I just can't figure out.
>
> Essentially I need to take a text document with some chemical
> information in Czech and organize it into anothe
On Sun, 14 Oct 2007 16:57:06 +, patrick.waldo wrote:
> Thank you both for helping me out. I am still rather new to Python
> and so I'm probably trying to reinvent the wheel here.
>
> When I try to do Paul's response, I get
tokens = line.strip().split()
> []
What is in `line`? Paul wrot
Thank you both for helping me out. I am still rather new to Python
and so I'm probably trying to reinvent the wheel here.
When I try to do Paul's response, I get
>>>tokens = line.strip().split()
[]
So I am not quite sure how to read line by line.
tokens = input.read().split() gets me all the in
On Oct 14, 2:48 pm, [EMAIL PROTECTED] wrote:
> Hi all,
>
> I started Python just a little while ago and I am stuck on something
> that is really simple, but I just can't figure out.
>
> Essentially I need to take a text document with some chemical
> information in Czech and organize it into another
On Sun, 14 Oct 2007 13:48:51 +, patrick.waldo wrote:
> Essentially I need to take a text document with some chemical
> information in Czech and organize it into another text file. The
> information is always EINECS number, CAS, chemical name, and formula
> in tables. I need to organize them
Hi all,
I started Python just a little while ago and I am stuck on something
that is really simple, but I just can't figure it out.
Essentially I need to take a text document with some chemical
information in Czech and organize it into another text file. The
information is always EINECS number, CAS
On Sep 7, 3:50 am, George Sakkis <[EMAIL PROTECTED]> wrote:
> On Sep 5, 5:17 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> wrote:
> If this was a code golf challenge,
I'd choose the Unix split solution and be both maintainable as well as
concise :-)
- Paddy.
On Sep 5, 5:17 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> On Sep 5, 1:28 pm, Paddy <[EMAIL PROTECTED]> wrote:
>
> > On Sep 5, 5:13 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> > wrote:
>
> > > I have a text source file of about 20.000 lines. From this file, I like
> > > to write the fir
Shawn Milochik wrote:
> On 9/5/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>> I have a text source file of about 20.000 lines.
>> From this file, I like to write the first 5 lines to a new file. Close
>> that file, grab the next 5 lines write these to a new file... grabbing
>> 5 lines and cre
Here's my solution, for what it's worth:
#!/usr/bin/env python
import os
input = open("test.txt", "r")
counter = 0
fileNum = 0
fileName = ""
def newFileName():
    global fileNum, fileName
    while os.path.exists(fileName) or fileName == "":
        fileNum += 1
        x = "%0.5d" % fileNum
[EMAIL PROTECTED] wrote:
> I am still wondering how to do this efficiently in Python (being kind
> of new to it... and it's not for homework).
You should post some code anyway; it would make it easier to give useful advice (it
would also demonstrate that you put some effort into it).
Anyway, here i
> Thanks for making me aware of the (UNIX) split command (split -l 5
> inFile.txt), it's short, it's fast, it's beautiful.
>
> I am still wondering how to do this efficiently in Python (being kind
> of new to it... and it's not for homework).
Something like this should do the job:
def nlines(num
On Sep 6, 12:46 am, Steve Holden <[EMAIL PROTECTED]> wrote:
> Arnaud Delobelle wrote:
[...]
> > print "all done!" # All done
> > print "Now there are 4000 files in this directory..."
>
> > Python 3.0 - ready (I've used open() instead of file())
>
> bzzt!
>
> Python 3.0a1 (py3k:57844, Aug 31
and you can
parse lines from the read buffer freely.
Have fun!
- Original Message -
From: "Shawn Milochik" <[EMAIL PROTECTED]>
To:
Sent: Thursday, September 06, 2007 1:03 AM
Subject: Re: Text processing and file creation
> On 9/5/07, [EMAIL PROTECTED] <[EMAIL PROTECTE
Arnaud Delobelle wrote:
[...]
> from my_useful_functions import new_file, write_first_5_lines,
> done_processing_file, grab_next_5_lines, another_new_file, write_these
>
> in_f = open('myfile')
> out_f = new_file()
> write_first_5_lines(in_f, out_f) # write first 5 lines
> close(out_f)
> while not
On Sep 5, 5:13 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> I have a text source file of about 20.000 lines. From this file, I like to
> write the first 5 lines to a new file. Close
>
> that file, grab the next 5 lines write these to a new file... grabbing
> 5 lines and creating new files
On Sep 5, 1:28 pm, Paddy <[EMAIL PROTECTED]> wrote:
> On Sep 5, 5:13 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> wrote:
>
> > I have a text source file of about 20.000 lines. From this file, I like to
> > write the first 5 lines to a new file. Close
>
> > that file, grab the next 5 lines write t
On Sep 5, 5:13 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> I have a text source file of about 20.000 lines. From this file, I like to
> write the first 5 lines to a new file. Close
>
> that file, grab the next 5 lines write these to a new file... grabbing
> 5 lines and creating new files
[EMAIL PROTECTED] wrote:
> I have a text source file of about 20.000 lines.
> From this file, I like to write the first 5 lines to a new file. Close
> that file, grab the next 5 lines write these to a new file... grabbing
> 5 lines and creating new files until processing of all 20.000 lines is
> done.
On Sep 5, 11:57 am, Bjoern Schliessmann wrote:
> [EMAIL PROTECTED] wrote:
> > I would use a counter in a for loop using the readline method to
> > iterate over the 20,000 line file.
>
> file objects are iterables themselves, so there's no need to do that
> by using a method.
Very true! Darn it!
On 9/5/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> I have a text source file of about 20.000 lines.
> From this file, I like to write the first 5 lines to a new file. Close
> that file, grab the next 5 lines write these to a new file... grabbing
> 5 lines and creating new files until proces
[EMAIL PROTECTED] wrote:
> I would use a counter in a for loop using the readline method to
> iterate over the 20,000 line file.
file objects are iterables themselves, so there's no need to do that
by using a method.
> Reset the counter every 5 lines/ iterations and close the file.
I'd use a
[EMAIL PROTECTED] wrote:
> I have a text source file of about 20.000 lines.
> From this file, I like to write the first 5 lines to a new file. Close
> that file, grab the next 5 lines write these to a new file... grabbing
> 5 lines and creating new files until processing of all 20.000 lines is
On Sep 5, 11:13 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
I have a text source file of about 20.000 lines. From this file, I like to
> write the first 5 lines to a new file. Close
>
> that file, grab the next 5 lines write these to a new file... grabbing
> 5 lines and creating new files
I have a text source file of about 20,000 lines.
From this file, I'd like to write the first 5 lines to a new file, close
that file, grab the next 5 lines, write these to a new file... grabbing
5 lines and creating new files until processing of all 20,000 lines is
done.
Is there an efficient way to do this?
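A minimal sketch of the 5-lines-per-file split being asked for, using
itertools.islice so the source is read only once (the file names are
assumptions):

from itertools import islice

with open('source.txt') as src:
    part = 0
    while True:
        chunk = list(islice(src, 5))   # next 5 lines, fewer at the very end
        if not chunk:
            break
        part += 1
        with open('part_%05d.txt' % part, 'w') as out:
            out.writelines(chunk)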
> > I'm in a process of rewriting a bash/awk/sed script -- that grew to
> > big -- in python. I can rewrite it in a simple line-by-line way but
> > that results in ugly python code and I'm sure there is a simple
> > pythonic way.
> >
> > The bash script processed text files of the form:
> >
> > ###
On Mar 23, 5:30 pm, "Daniel Nogradi" <[EMAIL PROTECTED]> wrote:
> Hi list,
>
> I'm in a process of rewriting a bash/awk/sed script -- that grew to
> big -- in python. I can rewrite it in a simple line-by-line way but
> that results in ugly python code and I'm sure there is a simple
> pythonic way.
On Mar 23, 10:30 pm, "Daniel Nogradi" <[EMAIL PROTECTED]> wrote:
> Hi list,
>
> I'm in a process of rewriting a bash/awk/sed script -- that grew to
> big -- in python. I can rewrite it in a simple line-by-line way but
> that results in ugly python code and I'm sure there is a simple
> pythonic way.
> This is my first try:
>
> ddata = {}
>
> inside_matrix = False
> for row in file("data.txt"):
>     if row.strip():
>         fields = row.split()
>         if len(fields) == 2:
>             inside_matrix = False
>             ddata[fields[0]] = [fields[1]]
>             lastkey = fields[0]
>
Daniel Nogradi:
> Any elegant solution for this?
This is my first try:
ddata = {}
inside_matrix = False
for row in file("data.txt"):
    if row.strip():
        fields = row.split()
        if len(fields) == 2:
            inside_matrix = False
            ddata[fields[0]] = [fields[1]]
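The snippet stops before the matrix rows are handled; a self-contained
sketch of one way it could continue, with the else branch guessed from the
inside_matrix/lastkey names rather than taken from the original code:

ddata = {}
lastkey = None
for row in open("data.txt"):
    fields = row.split()
    if not fields:
        continue                          # blank line
    if len(fields) == 2:                  # a "key value" line starts a new entry
        ddata[fields[0]] = [fields[1]]
        lastkey = fields[0]
    else:                                 # anything else is a matrix row for the previous key
        ddata[lastkey].append(fields)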
Hi list,
I'm in the process of rewriting a bash/awk/sed script -- that grew too
big -- in python. I can rewrite it in a simple line-by-line way but
that results in ugly python code and I'm sure there is a simple
pythonic way.
The bash script processed text files of the form:
###
I remember something about it coming up in some of the discussions of
free lists and better behavior in this regard in 2.5, but I don't
remember the details.
Under Python 2.5, my original code posting no longer exhibits the bug - upon
calling del(a), python's size shrinks back to ~4 MB, which i
On 1/8/07, tsuraan <[EMAIL PROTECTED]> wrote:
>
>
> > My first thought was that interned strings were causing the growth,
> > but that doesn't seem to be the case.
>
> Interned strings, as of 2.3, are no longer immortal, right? The intern doc
> says you have to keep a reference around to the strin
My first thought was that interned strings were causing the growth,
but that doesn't seem to be the case.
Interned strings, as of 2.3, are no longer immortal, right? The intern doc
says you have to keep a reference around to the string now, anyhow. I
really wish I could find that thing I read
$ python
Python 2.4.4c1 (#2, Oct 11 2006, 21:51:02)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> # Python is using 2.7 MiB
... a = ['1234' for i in xrange(10 << 20)]
>>> # Python is using 42.9 MiB
..
On 1/8/07, Felipe Almeida Lessa <[EMAIL PROTECTED]> wrote:
> On 1/8/07, tsuraan <[EMAIL PROTECTED]> wrote:
> >
> >
> > > I just tried on my system
> > >
> > > (Python is using 2.9 MiB)
> > > >>> a = ['a' * (1 << 20) for i in xrange(300)]
> > > (Python is using 304.1 MiB)
> > > >>> del a
> > > (Pyth
On 1/8/07, tsuraan <[EMAIL PROTECTED]> wrote:
>
>
> > I just tried on my system
> >
> > (Python is using 2.9 MiB)
> > >>> a = ['a' * (1 << 20) for i in xrange(300)]
> > (Python is using 304.1 MiB)
> > >>> del a
> > (Python is using 2.9 MiB -- as before)
> >
> > And I didn't even need to tell the ga
I just tried on my system
(Python is using 2.9 MiB)
>>> a = ['a' * (1 << 20) for i in xrange(300)]
(Python is using 304.1 MiB)
>>> del a
(Python is using 2.9 MiB -- as before)
And I didn't even need to tell the garbage collector to do its job. Some
info:
It looks like the big difference betwe
On 1/8/07, tsuraan <[EMAIL PROTECTED]> wrote:
[snip]
> The loop is deep enough that I always interrupt it once python's size is
> around 250 MB. Once the gc.collect() call is finished, python's size has
> not changed a bit.
[snip]
> This has been tried under python 2.4.3 in gentoo linux and python
After reading
http://www.python.org/doc/faq/general/#how-does-python-manage-memory, I
tried modifying this program as below:
a = []
for i in xrange(33, 127):
    for j in xrange(33, 127):
        for k in xrange(33, 127):
            for l in xrange(33, 127):
                a.append(chr(i) + chr(j) + chr(k) + chr(l))
import sys
sys
I have a pair of python programs that parse and index files on my computer
to make them searchable. The problem that I have is that they continually
grow until my system is out of memory, and then things get ugly. I
remember, when I was first learning python, reading that the python
interpreter
Harold> To illustrate, assume I have a text file, call it test.txt, with
Harold> the following information:
Harold> X11 .32
Harold> X22 .45
Harold> My goal in the python program is to manipulate this file such
Harold> that a new file would be created that looks like:
I am beginning to use python primarily to organize data into formats
needed for input into some statistical packages. I do not have much
programming experience outside of LaTeX and R, so some of this is a bit
new. I am attempting to write a program that reads in a text file that
contains some value
Alexis Gallagher wrote:
> Steve,
>
> First, many thanks!
>
> Steve Holden wrote:
>> Alexis Gallagher wrote:
>>>
>>> filehandle = open("data",'r',buffering=1000)
>>
>> This buffer size seems, shall we say, unadventurous? It's likely to
>> slow things down considerably, since the filesystem is pr
Steve,
First, many thanks!
Steve Holden wrote:
> Alexis Gallagher wrote:
>>
>> filehandle = open("data",'r',buffering=1000)
>
> This buffer size seems, shall we say, unadventurous? It's likely to slow
> things down considerably, since the filesystem is probably going to
naturally want to use
Maybe this code will be faster? (If it even does the same thing:
largely untested)
filehandle = open("data",'r',buffering=1000)
fileIter = iter(filehandle)
lastLine = fileIter.next()
lastTokens = lastLine.strip().split(delimiter)
lastGeno = extract(lastTokens[0])
for currentLine in fileIter:
Alexis Gallagher wrote:
> (I tried to post this yesterday but I think my ISP ate it. Apologies if
> this is a double-post.)
>
> Is it possible to do very fast string processing in python? My
> bioinformatics application needs to scan very large ASCII files (80GB+),
> compare adjacent lines, and
(I tried to post this yesterday but I think my ISP ate it. Apologies if
this is a double-post.)
Is it possible to do very fast string processing in python? My
bioinformatics application needs to scan very large ASCII files (80GB+),
compare adjacent lines, and conditionally do some further processing.
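A hedged sketch of that kind of scan: read with a generous buffer, keep the
previous line's fields around, and compare them with the current line's
(the field layout and the comparison are placeholders, not the poster's
actual logic):

def scan(path, bufsize=1 << 20):
    previous = None
    with open(path, 'r', buffering=bufsize) as f:
        for line in f:
            fields = line.strip().split()
            if previous is not None and fields[0] == previous[0]:
                pass   # adjacent records share a key: further processing goes here
            previous = fields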
Gregory Piñero wrote:
>That's how Python works. You read in the whole file, edit it, and write it
> back out.
that's how file systems work. if file systems generally supported insert
operations, Python would of course support that feature.
[EMAIL PROTECTED] writes:
> I'm a total newbie to Python so any and all advice is greatly
> appreciated.
Well, I've got some for you.
> I'm trying to use regular expressions to process text in an SGML file
> but only in one section.
This is generally a bad idea. SGML family languages aren't easy
You can edit a file in place, but it is not applicable to what you are doing.
As soon as you insert the first new string, you've shifted everything
downstream by those 8 bytes. Since they map to physically located blocks on
a physical drive, you will have to rewrite those blocks. If it is a big file
That's how Python works. You read in the whole file, edit it, and
write it back out. As far as I know there's no way to edit a file
"in place" which I'm assuming is what you're asking?
And now, cue the responses telling you to use a fancy parser (XML?) for your project ;-)
-Greg
On 4 Oct 2005 2
Hi,
I'm a total newbie to Python so any and all advice is greatly
appreciated.
I'm trying to use regular expressions to process text in an SGML file
but only in one section.
So the input would look like this:
RESEARCH GUIDE
content
content
content
content
FORMS
content
content
content
cont
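A minimal sketch of restricting the regex work to one section: collect only
the lines between the RESEARCH GUIDE heading and the FORMS heading, run the
substitution on that slice, and leave everything else untouched (the file
name and the substitution itself are placeholders):

import re

in_section = False
section, rest = [], []
for line in open('guide.sgml'):
    if line.strip() == 'RESEARCH GUIDE':
        in_section = True
    elif line.strip() == 'FORMS':
        in_section = False
    (section if in_section else rest).append(line)

processed = re.sub(r'content', 'CONTENT', ''.join(section))   # placeholder substitution
print(processed)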
Yes indeed, the real data often has surprising differences from the
simulations! :)
It turns out that pyparsing LineStart()'s are pretty fussy. Usually,
pyparsing is very forgiving about whitespace between expressions, but
it turns out that LineStart *must* be followed by the next expression,
wit
[EMAIL PROTECTED] wrote:
> Paul McGuire wrote:
> > match...), this program has quite a few holes.
> I tried running it, though, and it is not working for me. The following code
> runs but prints nothing at all:
>
> import pyparsing as prs
>
And this is the point where I have to post the real stuff because
Paul McGuire wrote:
> match...), this program has quite a few holes.
>
> What if the word "Identifier" is inside one of the quoted strings?
> What if the actual value is "tablename10"? This will match your
> "tablename1" string search, but it is certainly not what you want.
> Did you know there ar
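A small illustration of the "tablename10" pitfall raised above, together
with the usual word-boundary guard against it (the sample text is made up):

import re

text = 'Identifier "tablename1" ... Identifier "tablename10"'
print(re.findall(r'tablename1', text))       # also matches inside "tablename10"
print(re.findall(r'\btablename1\b', text))   # word boundaries: the exact name only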
Miki Tebeka wrote:
> Look at re.findall, I think it'll be easier.
Minor changes aside, the interesting thing, as you pointed out, would be
using re.findall. I could not figure out how to.
Hello pruebauno,
> import re
> f=file('tlst')
> tlst=f.read().split('\n')
> f.close()
tlst = open("tlst").readlines()
> f=file('plst')
> sep=re.compile('Identifier "(.*?)"')
> plst=[]
> for elem in f.read().split('Identifier'):
>     content = 'Identifier' + elem
>     match = sep.search(content)
>
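A minimal sketch of the re.findall() route Miki suggests, applied to the
same 'plst' file and Identifier "..." layout as the snippet above:

import re

plst = re.findall(r'Identifier "(.*?)"', open('plst').read())
print(plst)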