ElementTree: can't figure out a mismatched-tag error

2013-07-11 Thread F.R.

Hi all,

I haven't been able to get up to speed with XML. I do examples from the 
tutorials and experiment with variations. Time and time again I fail 
with error messages I can't make sense of. Here's the latest one. The 
URL is "http://finance.yahoo.com/q?s=XIDEQ&ql=0". Ubuntu 12.04 LTS, 
Python 2.7.3 (default, Aug  1 2012, 05:16:07) [GCC 4.6.3]


>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('q?s=XIDEQ')  # output of wget http://finance.yahoo.com/q?s=XIDEQ&ql=0

Traceback (most recent call last):
  File "", line 1, in 
    tree = ET.parse('q?s=XIDEQ')
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1183, in parse
    tree.parse(source, parser)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1643, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1507, in _raiseerror
    raise err
ParseError: mismatched tag: line 9, column 2

Below are the first nine lines. The line numbers and the following space are 
hand-edited in. Three dots stand for sections cut out to fit long lines. 
Line 6 is a bunch of "meta" statements, all of which I show on a 
separate line each in order to preserve the angled brackets. On all 
lines the angled brackets have been preserved. The mismatched character 
is the slash of the closing tag . What could be wrong with it? 
And if it is, what about fault tolerance?


1 
2 
3 
4 XIDEQ: Summary for EXIDE TECH NEW- Yahoo! Finance
5 

  
  
  
  
  
  content="http://l.yimg.com/a/p/fi/31/09/00.jpg";>

  http://finance.yahoo.com/q?s=XIDEQ";>
  href="http://finance.yahoo.com/q?s=XIDEQ";>

8 http://l.yimg.com/zz/ . . . type="text/css">
9 
   ^
Mismatch!

Thanks for suggestions

Frederic

--
http://mail.python.org/mailman/listinfo/python-list


Re: ElementTree: can't figure out a mismatched-tag error

2013-07-11 Thread F.R.

On 07/11/2013 10:59 AM, F.R. wrote:

[...]


Thank you all!

I was a little apprehensive it could be a silly mistake. And so it was. 
I have BeautifulSoup somewhere. Having had no urgent need for it I 
remember shirking the learning curve.


lxml seems to be a package with these components (from help (lxml)):

PACKAGE CONTENTS
ElementInclude
_elementpath
builder
cssselect
doctestcompare
etree
html (package)
isoschematron (package)
objectify
pyclasslookup
sax
usedoctest

I would start with "from lxml import html" and see what comes out.
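
For later readers, here is a minimal sketch of the kind of failure discussed in this thread. It assumes the cause was feeding an HTML page to an XML parser: HTML allows tags such as "meta" to stay unclosed, and a strict XML parser then reports a mismatched tag at the next closing tag. The page text below is made up for illustration (it is not the Yahoo page), the code is Python 3, and the standard library's html.parser stands in for BeautifulSoup or lxml, which are not used here.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Made-up HTML page: the <meta> tag is never closed, which is legal HTML
# but not well-formed XML.
HTML = """<html>
<head>
<title>XIDEQ: Summary</title>
<meta charset="utf-8">
</head>
<body><p>quote page</p></body>
</html>"""


def try_xml(text):
    """Return 'ok' if text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(text)
        return "ok"
    except ET.ParseError as err:
        return str(err)


class TitleGrabber(HTMLParser):
    """Collects the text inside <title>, tolerating unclosed tags."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


xml_result = try_xml(HTML)   # the XML parser rejects the page
grabber = TitleGrabber()
grabber.feed(HTML)           # the HTML parser reads it without complaint
print(xml_result)
print(grabber.title)
```

The XML parser is behaving as designed here; fault tolerance has to come from a parser written for HTML.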

Break time now. Thanks again!

Frederic



Re: attaching names to subexpressions

2012-10-28 Thread F.R.

On 10/28/2012 06:57 AM, Devin Jeanpierre wrote:

line = function(x, y, z)
while line:
    do something with(line)
    line = function(x, y, z)



How about:

line = True
while line:
    line = function(x, y, z)
    do something with(line)

?
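
A note on the variant above: it also runs the loop body once on the final empty value, which the quoted original does not. Two common ways to call the function in one place only, while still stopping before the empty value, might look like the sketch below. It is Python 3; "make_reader" is a hypothetical stand-in for whatever "function(x, y, z)" really does, and the empty string is assumed to be the end marker.

```python
from functools import partial


def make_reader(items):
    """Hypothetical stand-in: returns a function(x, y, z) that yields the
    items one per call and then the empty string."""
    source = iter(items)

    def function(x, y, z):
        return next(source, "")

    return function


def drain_with_break(function, x, y, z):
    """Loop-and-a-half: one call site, leave the loop on a falsy result."""
    seen = []
    while True:
        line = function(x, y, z)
        if not line:
            break
        seen.append(line)      # "do something with(line)"
    return seen


def drain_with_iter(function, x, y, z):
    """Two-argument iter(): call until the sentinel "" comes back."""
    return [line for line in iter(partial(function, x, y, z), "")]


print(drain_with_break(make_reader(["a", "b"]), 1, 2, 3))
print(drain_with_iter(make_reader(["a", "b"]), 1, 2, 3))
```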

Frederic



Re: [Gimp-user] export to non xcf

2012-10-29 Thread F.R.

On 10/28/2012 09:09 PM, Michael Schumacher wrote:

From: Donald Miller
Can't directly save to jpg, so exported.
Export to jpg made png. Same for psd.
Shouldn't name track chosen format, so no manual override needed?

Maybe you had set the file-type chooser to this format?
The default "By extension" should do what you want, any other value is for 
special cases like you've discovered, like ambiguous file name extensions.


Regards,
Michael

I am contending with a similar malfunction: I can export by extension, 
but when done Gimp closes. The exported file exists, but I need to 
restart Gimp after every export. No loss of capability, but annoying.


Judging by the flood of Gimp-related posts of late and the great variety 
of the issues raised, there seem to be major stability or environmental 
compatibility problems. I am running Gimp 2.6.12 on Ubuntu 12.04.
I found a Gimp user list 
(https://mail.gnome.org/mailman/listinfo/gimp-user-list) and intend to 
check it out the moment I get around to it.


Frederic



Strange object identity problem

2012-11-12 Thread F.R.

Hi all,

Once in a while I write simple routine stuff and spend the next few hours
trying to understand why it doesn't behave as I expect. Here is an example
holding me up: I have a module "st" with a class "runs". In a loop I repeatedly
create an object "ba" and call the method "ba.run ()" which processes the
constructor's arguments. Next I store the object in a dictionary "bas". It
then turns out that all keys hold the same object, namely the one created
last in the loop.

Verifying the identity of each object when it is being assigned to
the dictionary reveals different identities. Repeating the verification
after the loop is done shows the same object in all key positions:

>>> bas = {}
>>> for year in range (2010, 2013):
        ba = st.runs ('BA', '%d-01-01' % year, '%d-12-31' % year)
        ba.run ()
        print year, id (ba)
        bas [year] = ba

2010 150289932
2011 150835852
2012 149727788

>>> for y in sorted (bas.keys ()):
        b = bas [year]
        print y, id (b)

2010 149727788
2011 149727788
2012 149727788



The class "runs" has a bunch of attributes, among which an object "parameters"
for tweaking processing runs and an object "quotes" containing a list of
database records. Both objects are created by "runs.__init__ (...)".

Trying something similar with a simpler class works as expected:

>>> class C:
        def __init__ (self, i):
            self.i = i
        def run (self):
            self.ii = self.i * self.i

>>> cees = {}
>>> for year in range (2010, 2013):
        c = C (year)
        c.run ()
        print year, id (c)
        cees [year] = c

2010 150837804
2011 148275756
2012 146131212

>>> for year in sorted (cees.keys ()):
        print year, id (cees [year]), cees [year].ii

2010 150837804 4040100
2011 148275756 4044121
2012 146131212 4048144




I have checked for name clashes and found none, wondering what to check
next for. Desperate for suggestions.


Frederic


(Python 2.7 on Ubuntu 12.04)



Re: Strange object identity problem

2012-11-12 Thread F.R.

On 11/12/2012 02:27 PM, Robert Franke wrote:

Hi Frederic,

[...]


bas = {}
for year in range (2010, 2013):
    ba = st.runs ('BA', '%d-01-01' % year, '%d-12-31' % year)
    ba.run ()
    print year, id (ba)
    bas [year] = ba

2010 150289932
2011 150835852
2012 149727788

for y in sorted (bas.keys ()):
    b = bas [year]

Shouldn't that be b = bas[y]?

Yes, it should, indeed! What's more, I should have closed and restarted 
IDLE. There must have been a name clash somewhere in the name space. The 
problem no longer exists. Sorry about that. And thanks to all who paused 
to reflect on this non-problem. - Frederic.

    print y, id (b)

2010 149727788
2011 149727788
2012 149727788

[...]

Cheers,

Robert





Re: Strange object identity problem

2012-11-12 Thread F.R.

On 11/12/2012 06:02 PM, duncan smith wrote:

[...]


The problem was that year was still bound to the integer 2012, the last 
value from the first loop. When you subsequently looped over the keys you 
printed each key followed by id(bas[2012]). Restarting IDLE only helped 
because you presumably didn't repeat the error.


Duncan

That's it! Isn't it strange how on occasion one doesn't see the most 
obvious and simple mistake, focusing beyond the realm of foolishness. 
Thanks all . . .
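
The mistake Duncan describes can be reproduced in a few lines. The sketch below is Python 3 and uses a hypothetical class "Run" in place of "st.runs"; the point is only that a for-loop variable stays bound to its last value after the loop ends, so a later loop that indexes with the old name keeps fetching the same object.

```python
class Run:
    """Hypothetical stand-in for st.runs: just remembers its year."""

    def __init__(self, year):
        self.year = year


def build_runs():
    bas = {}
    for year in range(2010, 2013):
        bas[year] = Run(year)
    # The loop is over, but "year" is still bound to its last value.
    return bas, year


def lookup_with_stale_name(bas, year):
    """Reproduces the bug: iterates with y but indexes with year."""
    result = []
    for y in sorted(bas):
        result.append((y, bas[year].year))
    return result


def lookup_with_loop_name(bas):
    """The fix: index with the variable of the current loop."""
    result = []
    for y in sorted(bas):
        result.append((y, bas[y].year))
    return result


bas, leftover_year = build_runs()
print(leftover_year)
print(lookup_with_stale_name(bas, leftover_year))
print(lookup_with_loop_name(bas))
```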


Frederic



MySQL - "create table" creates malfunctioning tables

2013-01-24 Thread F.R.
The other day, for unfathomable reasons, I lost control over tables 
which I create. There was no concurrent change of anything on the 
machine, such as an update. So I have no suspect. Does the following 
action log suggest any recommendation to experienced SQL programmers?



1. A table:

mysql> select * from  expenses;
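+----+------------+-------+----------+--------+-------------+-------+
| id | date       | place | stuff    | amount | category    | flags |
+----+------------+-------+----------+--------+-------------+-------+
| 38 | 2013-01-15 | ATT   | Sim card |  25.00 | Visa credit |       |
+----+------------+-------+----------+--------+-------------+-------+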
1 row in set (0.00 sec)

2. I want to delete everything:

mysql> delete from expenses;

3. Nothing happens for about one minute. Then this:

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql>

4. I want to delete the table:

mysql> drop table expenses;

5. Nothing happens indefinitely. I have to hit Ctrl-C

^CCtrl-C -- sending "KILL QUERY 167" to server ...
Ctrl-C -- query aborted.
ERROR 1317 (70100): Query execution was interrupted
mysql>

6. I expect to find "expenses.frm", "expenses.MYD" and "expenses.MYI".

I find only the format file:

root@hatchbox-one:/home/fr# find / -name 'expenses.*' -ls
1055950   12 -rw-rw----   1 mysql    mysql        8783 Jan 25 01:51 /var/lib/mysql/fr/expenses.frm

root@hatchbox-one:/home/fr#

7. All the tables I regularly work with have a "frm", a "MYD" and a 
"MYI" file in the directory "/var/lib/mysql/fr/".


8. I'd like to drop the table "expenses", plus a number of other tables 
I created, hoping to hit it right by chance. I didn't and now I have a 
bunch of unusable tables which I can't drop and I don't know where their 
files are hiding. If I did, I could remove them.




Grateful for any hint, comment, suggestion

Frederic



Dell E6500
Ubuntu 10.04 LTS
Python 2.7.3 (default, Aug  1 2012, 05:16:07) [GCC 4.6.3]
MySQL (Can't find a version number)
Server version: 5.5.28-0ubuntu0.12.04.3 (Ubuntu)



Puzzling PDF

2014-02-16 Thread F.R.

Hi all,

Struggling to parse bank statements unavailable in sensible 
data-transfer formats, I use pdftotext, which solves part of the 
problem. The other day I encountered a strange thing: one single 
figure out of many was erroneously converted into letters. Adobe Reader 
displays the figure 50'000 correctly, but pdftotext makes it into 
"SO'OOO" (The letters "S" as in Susan and "O" as in Otto). One would 
expect such a mistake from an OCR. However, the statement is not a scan, 
but is made up of text. Because malfunctions like this put a damper on 
the hope to ever have a reliable reader that doesn't require 
time-consuming manual verification, I played around a bit and ended up 
even more confused: When I lift the figure off the Adobe display (mark, 
copy) and paste it into a Python IDLE window, it is again letters (ascii 
83 and 79), when on the Adobe display it shows correctly as digits. How 
can that be?


Frederic










Re: Puzzling PDF

2014-02-16 Thread F.R.

On 02/16/2014 05:29 PM, Emile van Sebille wrote:

On 2/16/2014 6:00 AM, F.R. wrote:

[...]




I've also gotten inconsistent results using various pdf to text 
converters[1], but getting an explanation for pdftotext's failings 
here isn't likely to happen.  I'd first try Google Docs' on-line 
conversion tool to see if you get better results.  If you're lucky 
it'll do the job and you'll have confirmation that better tools 
exist.  Otherwise, I'd look for an alternate way of getting the bank 
info than working from the pdf statement.  At one site I've scripted 
firefox to access the bank's web based inquiry to retrieve the new 
activity overnight and use that to complete a daily bank reconciliation.


HTH,

Emile


[1] I wrote my own once to get data out of a particularly gnarly EDI 
specification pdf.





Emile, thanks for your response. Thanks to Roy Smith and Alister, too.

pdftotext has been working just fine. So much so that this freak 
incident is all the more puzzling. It smacks of an OCR error, but where 
does OCR come in, I wonder. I certainly suspected that the font I was 
looking at had fives and zeroes identical to esses and ohs, 
respectively, but the suspicion didn't hold up to scrutiny. I attach a 
little screen shot: At the top, the way it looks on the statement. Next, 
two words marked with the mouse. (One single marking, doesn't color the 
space.) Ctl-c puts both words to the clip board. Ctl-v drops them into 
the python IDLE window between the quotation marks. Lo and behold: 
they're clearly different! A little bit of code around displays the 
ascii numbers. Isn't that interesting?
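
The "little bit of code" is not included in the post; a minimal sketch of such a check could look like this (Python 3). The two strings are the ones quoted in this thread. The repair function is a hypothetical workaround, not something proposed in the thread: it assumes that a token which becomes purely numeric after swapping S for 5 and O for 0 was meant to be a number.

```python
def code_points(text):
    """Return the character codes of text, one per character."""
    return [ord(ch) for ch in text]


pasted = "SO'OOO"      # what pdftotext and copy-and-paste deliver
expected = "50'000"    # what Adobe Reader displays

# Hypothetical repair table for the two look-alike pairs seen here.
LOOKALIKES = str.maketrans({"S": "5", "O": "0"})


def repair_amount(token):
    """Swap look-alike letters, but only if the result is purely numeric
    (digits plus the separators ' . and ,). Otherwise leave token alone."""
    candidate = token.translate(LOOKALIKES)
    if all(ch.isdigit() or ch in "'.," for ch in candidate):
        return candidate
    return token


print(code_points(pasted))
print(code_points(expected))
print(repair_amount(pasted))
print(repair_amount("SOLD"))
```

A repair like this only hides the symptom; any amount it touches would still deserve a manual check.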


Frederic



No matter. You're both right. There are alternatives. The best would be 
to get the data in a CSV format. Alas, I am so lightweight a client that 
banks don't even bother to find out what I am talking about.


I know how to access web pages programmatically, but haven't gotten 
around to dealing with password-protected log-ins and to sending such 
data as one writes into templates interactively.


Frederic




Re: Storing the state of script between steps

2014-02-22 Thread F.R.

On 02/21/2014 09:59 PM, Denis Usanov wrote:

Good evening.

First of all I would like to apologize for the name of topic. I really didn't 
know how to name it more correctly.

I mostly develop on Python some automation scripts such as deployment (it's not about 
fabric and may be not ssh at all), testing something, etc. In this terms I have such 
abstraction as "step".

Some code:

class IStep(object):
    def run(self):
        raise NotImplementedError()

And the certain steps:

class DeployStep: ...
class ValidateUSBFlash: ...
class SwitchVersionS: ...

Where I implement run method.
Then I use some "builder" class which can add steps to internal list and has a method 
"start" running all step one by one.

And I like this. It's loosely coupled system. It works fine in simple cases. But sometimes some 
steps have to use the results from previous steps. And now I have problems. Before now I had 
internal dict in "builder" and named it as "world" and passed it to each run() 
methods of steps. It worked but I disliked this.

How would you solve this problem and how would you do it? I understand that 
it's more architecture specific question, not a python one.

I bet I wouldn't have asked this if I had worked with some of functional 
programming languages.


A few months ago I posted a summary of a data transformation framework 
inviting commentary. 
(https://mail.python.org/pipermail/python-list/2013-August/654226.html). 
It didn't meet with much interest and I forgot about it. Now that 
someone is looking for something along the line as I understand his 
post, there might be some interest after all.



My module is called TX. A base class "Transformer" handles the flow of 
data. A custom Transformer defines a method "T.transform (self)" which 
transforms input to output. Transformers are callable, taking input as 
an argument and returning the output:


transformed_input = T (some_input)

A Transformer object retains both input and output after a run. If it is 
called a second time without input, it simply returns its output, 
without needlessly repeating its job:


same_transformed_input = T ()

Because of this IO design, Transformers nest:

csv_text = CSV_Maker (Data_Line_Picker (Line_Splitter (File_Reader ('1st-quarter-2013.statement'))))


A better alternative to nesting is to build a Chain:

Statement_To_CSV = TX.Chain (File_Reader, Line_Splitter, Data_Line_Picker, CSV_Maker)


A Chain is functionally equivalent to a Transformer:

csv_text = Statement_To_CSV ('1st-quarter-2013.statement')

Since Transformers retain their data, developing or debugging a Chain is 
a relatively simple affair. If a Chain fails, the method "show ()" 
displays the innards of its elements one by one. The failing element is 
the first one that has no output. It also displays such messages as the 
method "transform (self)" would have logged. (self.log (message)). While 
fixing the failing element, the element preceding keeps providing the 
original input for testing, until the repair is done.
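
Since the TX source is not part of this post, here is a minimal sketch of the design as described so far, written from the description alone (Python 3). Everything in it is an assumption for illustration: the attribute names "input" and "output", the method "first_failing" (a reduced stand-in for "show ()"), and the two small example Transformers. It is not the TX module.

```python
class Transformer:
    """Hypothetical stand-in for TX.Transformer (not the author's code)."""

    def __init__(self, name=None):
        self.name = name or type(self).__name__
        self.input = None
        self.output = None

    def transform(self):
        """Subclasses read self.input and set self.output."""
        raise NotImplementedError

    def __call__(self, *args):
        if args:                     # fresh input: run the transformation
            self.input = args[0]
            self.output = None
            self.transform()
        return self.output           # no input: hand back the retained output


class Chain(Transformer, list):
    """A list of Transformers that itself behaves like one Transformer."""

    def __init__(self, *links, name=None):
        Transformer.__init__(self, name)
        list.__init__(self, links)

    def transform(self):
        for link in self:            # forget results of any earlier run
            link.output = None
        data = self.input
        for link in self:
            data = link(data)
        self.output = data

    def first_failing(self):
        """Name of the first link without output, or None if all ran."""
        for link in self:
            if link.output is None:
                return link.name
        return None


class Line_Splitter(Transformer):
    def transform(self):
        self.output = self.input.splitlines()


class Numerizer(Transformer):
    def transform(self):
        self.output = [float(item) for item in self.input]


to_numbers = Chain(Line_Splitter(), Numerizer())
numbers = to_numbers("1\n2.5")     # runs both links
same_numbers = to_numbers()        # no input: returns the retained output

failing = None
try:
    to_numbers("1\nx")             # "x" cannot be converted
except ValueError:
    failing = to_numbers.first_failing()
print(numbers, same_numbers, failing)
```

Because a Chain is itself a Transformer in this sketch, one Chain can be passed as a link to another, which is the nesting described in the next paragraph.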


Since a Chain is functionally equivalent to a Transformer, a Chain can 
be placed into a containing Chain alongside Transformers:


Table_Maker = TX.Chain (TX.File_Reader (), TX.Line_Splitter (), TX.Table_Maker ())
Table_Writer = TX.Chain (Table_Maker, Table_Formatter, TX.File_Writer (file_name = '/home/xy/office/addresses-4214'))
DB_Writer = TX.Chain (Table_Maker, DB_Formatter, TX.DB_Writer (table_name = 'contacts'))


Better:

Splitter = TX.Splitter (TX.Table_Writer (), TX.DB_Writer ())
Table_Handler = TX.Chain (Table_Maker, Splitter)

Table_Handler ('home/xy/Downloads/report-4214')  # Writes to both file and to DB



If a structure builds up too complex to remember, the method "show_tree 
()" would display something like this:


Chain
    Chain[0] - Chain
        Chain[0][0] - Quotes
        Chain[0][1] - Adjust Splits
    Chain[1] - Splitter
        Chain[1][0] - Chain
            Chain[1][0][0] - High_Low_Range
            Chain[1][0][1] - Splitter
                Chain[1][0][1][0] - Trailing_High_Low_Ratio
                Chain[1][0][1][1] - Standard Deviations
        Chain[1][1] - Chain
            Chain[1][1][0] - Trailing Trend
            Chain[1][1][1] - Pegs

Following a run, all intermediary formats are accessible:

standard_deviations = C[1][0][1][1]()

TM = TX.Table_Maker ()
TM (standard_deviations).write ()

 0  | 1  | 2 |

 116.49 | 132.93 | 11.53 |
 115.15 | 128.70 | 11.34 |
   1.01 |   0.00 |  0.01 |

A Transformer takes parameters, either at construction time or by means 
of the method "T.set (key = parameter)". Whereas a File Reader doesn't 
get payload passed and may take a file name as input argument, as a 
convenient alternative, a File Writer does take payload and the file 
name must be set by keyword:


File_Writer = TX.File_Writer (file_name = '/tmp/memos-with-dates-1')
File_Writer (input)  # Writes file
File_Writer.set (file_name = '/tmp/memos-with-dates-2')
File_Writer ()

A data transformation framework. A presentation inviting commentary.

2013-08-21 Thread F.R.

Hi all,

In an effort to do some serious cleaning up of a hopelessly cluttered 
working environment, I developed a modular data transformation system 
that pretty much stands. I am very pleased with it. I expect huge time 
savings. I would share it if I had a sense that there is interest out 
there and would appreciate comments. Here's a description. I named the 
module TX:


The nucleus of the TX system is a Transformer class, a wrapper for any 
kind of transformation functionality. The Transformer takes input as a 
calling argument and returns it transformed. This design allows the 
assembly of transformation chains, either by nesting calls or, better, by 
using the class Chain, derived from 'Transformer' and 'list'. A Chain 
consists of a sequence of Transformers and is functionally equivalent to 
an individual Transformer. A high degree of modularity results: Chains 
nest. Another consequence is that many transformation tasks can be 
handled with a relatively modest library of a few basic prefabricated 
Transformers from which many different Chains can be assembled on the 
fly. A custom Transformer to bridge an occasional gap is quickly written 
and tested, because the task likely is trivial.

A good analogy of the TX methodology is a road map with towns 
scattered all over it and highways connecting them. To get from any town 
to any other one is a simple matter of hopping the towns in between. The 
TX equivalent of the towns are data formats, the equivalent of the 
highways are TX Transformers. They are not so much thought of in terms 
of what they do than in terms of the formats they take and give. 
Designing a library of Transformers is essentially a matter of 
establishing a collection of standard data formats. First the towns, 
then the highways.

A feature of the TX Transformer is that it retains both its input 
and output. This makes a Chain a breeze to build progressively, link by 
link, and also makes debugging easy: If a Chain doesn't work, Chain.show 
() reveals the failing link as the first one that has no output. It can 
be replaced with a corrected instance, as one would replace a blown 
fuse. Running the Chain again without input makes it have another try.

Parameter passing runs on a track that is completely separate from 
the payload track. Parameters can be set in order to configure a Chain 
prior to running it, or can be sent at runtime by individual 
Transformers to their siblings and their progeny. Parameters are keyed 
and get picked up by those Chain links whose set of pre-defined keys 
includes the parameter's key. Unintended pick-ups with coincidentally 
shared keys for unrelated parameters can be prevented by addressing 
parameters to individual Transformers.


Below an application example. Five custom classes at the end exemplify 
the pattern. I join the post also as attachment, in case some 
auto-line-wrap messes up this text.


Commentary welcome

Frederic





An example of use: Download historic stock quotes from Yahoo Finance for 
a given range of dates and a list of symbols, delete a column and add 
three, insert the data in a MySQL table. Also write them to temporary 
files in tabular form for verification.

"make_quotes_writer ()" returns a custom transformation tree. 
"run_quotes ()" makes such a tree, sets it on a given time range and 
runs it on a list of symbols.

(Since Yahoo publishes the data for downloading, I presume it's 
okay to do it this way. This is a demo of TX, however, and should not be 
misconstrued as an encouragement to violate any publisher's terms of 
service.)



import TX, yahoo_historic_quotes as yhq

def make_quotes_writer ():

    Visualizer = TX.Chain (
        yhq.percent (),
        TX.Table_Maker (has_header = True),
        TX.Table_Writer (),
        name = 'Visualizer'
    )

    To_DB = TX.Chain (yhq.header_stripper(), TX.DB_Writer(table_name = 'quotes'), name = 'To DB')

    To_File = TX.Chain (Visualizer, TX.File_Writer (), name = 'To File')

    Splitter = TX.Splitter (To_DB, To_File, name = 'Splitter')

    Quotes = TX.Chain (
        yhq.yahoo_quotes (),
        TX.CSV_To_List (delimiter = ','),
        TX.Numerizer (),
        yhq.wiggle_and_trend (),
        yhq.symbol (),
        Splitter,
        name = 'Quotes'
    )

    return Quotes


>>> Quotes = make_quotes_writer ()
>>> Quotes.show_tree()

Quotes
    Quotes[0] - Yahoo Quotes
    Quotes[1] - CSV To List
    Quotes[2] - Numerizer
    Quotes[3] - Wiggle and Trend
    Quotes[4] - Symbol
    Quotes[5] - Splitter
        Quotes[5][0] - To DB
            Quotes[5][0][0] - Header Stripper
            Quotes[5][0][1] - DB Writer
        Quotes[5][1] - To File
            Quotes[5][1][0] - Visualizer
                Quotes[5][1][0][0] - Percent
                Quotes[5][1][0][1] - Table Maker
                Quotes[5][1][0][2] - Table Writer
            Quotes[5][1][1] - File Writer


def run_quotes (symbols, from_date = '1970-01-01', to_date = '2099-12-31'):
'''Downloads 

Re: A data transformation framework. A presentation inviting commentary.

2013-08-22 Thread F.R.

On 08/21/2013 06:29 PM, F.R. wrote:

Hi all,

In an effort to do some serious cleaning up of a hopelessly cluttered 
working environment, I developed a modular data transformation system 
that pretty much stands. I am very

. . . etc


Chris, Terry, Dieter, thanks for your suggestions.

Chris: If my Transformer looks like a function, that's because it is 
(__call__). My idea was to have something like an erector set of 
elementary transformation machines that can be assembled into chains. 
There may be some processing overhead in managing the data flow, but I'm 
not even sure of that, because the flow needs to be managed somehow and 
throwing one's stones into someone else's garden doesn't get rid of the 
stones. My idea was to simplify, generalize and automate in order to 
deal with the kind of overhead that matters most to me: my own mental 
overhead.


Terry: I am aware of the memory-load aspect. It is no constraint for the 
things I do. If it became one, I'd develop a transformation assembly using 
a small data sample and when it reaches the stage of reliability, I'd 
add a line to have each Transformer delete its input the moment it is 
done. I shall certainly look at itertools. Thanks for your suggestions 
and explanations.


Dieter: I wish I could respond to the points you raise. I am unfamiliar 
with the details and they don't seem like they can be looked up in five 
minutes. I do make a note of your thoughts.



Frederic




Where does MySQLdb put inserted data?

2013-10-04 Thread F.R.

Hi,
Since clipboard pasting into a terminal has been failing of late (a 
known bug, apparently), I use MySQLdb to access MySQL tables. In general 
this works just fine. But now I fail filling a new table. The table 
exists. "mysql>EXPLAIN new_table;" explains and "root@blackbox-one:/# 
sudo find / -name 'new_table*'" finds "/var/lib/mysql/fr/new_table.frm". 
So I do "cursor.executemany ('insert into new_table values (%s)' % 
format, data)". No error occurs and "cursor.execute ('select * from 
new_table;')" returns the number of records read, and "cursor.fetchall 
()" returns all new records. All looks fine, but "mysql>SELECT * FROM 
new_table;" produces an "Empty set" and "sudo find / -name 'new_table*'" 
still finds only the format file, same as before.

Could it have to do with COMMIT? I believe I am using ISAM tables 
(default?) and those don't recognize undo commands, right? Anyway, an 
experimental "cursor.execute ('COMMIT')" didn't make a difference. It 
looks like MySQLdb puts the data into a cache and that cache should be 
saved either by the OS or by me. Strange thing is that this is one freak 
incident in an almost daily routine going back years and involving 
thousands of access operations in and out acting instantaneously. I seem 
to remember a similar case some time ago and it also involved a new 
empty table.


Thanks for hints

Frederic



mysql> select version()
-> ;
+-------------------------+
| version()               |
+-------------------------+
| 5.5.31-0ubuntu0.12.04.1 |
+-------------------------+
1 row in set (0.00 sec)



Re: Where does MySQLdb put inserted data?

2013-10-04 Thread F.R.

On 10/04/2013 09:38 AM, F.R. wrote:

[...]


Thank you Chris, thank you Steven,
The suggestion to switch to PostgreSQL isn't lost on me. I have it 
installed, but have been putting off the change, apprehensive of getting 
slowed down by many annoying side effects for some time to come. This 
may be the moment . . .
 Off list? MySQL is. MySQLdb is not. Before I know which of the two 
is the culprit, I don't know whether I'm off list or not and take the 
risk, prepared to beg pardon if I am.


Frederic




Re: Where does MySQLdb put inserted data?

2013-10-04 Thread F.R.

On 10/04/2013 12:11 PM, Chris Angelico wrote:

On Fri, Oct 4, 2013 at 8:05 PM, F.R.  wrote:

  Off list? MySQL is. MySQLdb is not. Before I know which of the two is
the culprit, I don't know whether I'm off list or not and take the risk,
prepared to beg pardon if I am.


Just to clarify: Off-topic means discussing stuff that isn't about
Python; off-list means sending private emails, not to
python-list@python.org / comp.lang.python. You're uncertain as to
whether you're off-topic or not, but you're definitely on-list; and my
previous mail to you was off-list, so people here are going to be a
little confused, as they lack context. (I merely suggested that
switching to PostgreSQL would quite probably be a worthwhile time
investment.)

ChrisA


I shall switch and give you credit for the impulse in addition to the 
terminological clarification on being off something or other . . .


Frederic

--
https://mail.python.org/mailman/listinfo/python-list


Re: Where does MySQLdb put inserted data?

2013-10-05 Thread F.R.

On 10/05/2013 12:55 AM, Dennis Lee Bieber wrote:

On Fri, 04 Oct 2013 09:38:41 +0200, "F.R." 
declaimed the following:



MySQLdb, as with all DB-API compliant adapters, does NOT do
"auto-commit" -- you MUST execute a con.commit() after any query (sequence)
that modifies data. Without it, closing the connection will invoke a
ROLLBACK operation, removing any attempted changes.


That's it! It works! Thank you sooo much. A miracle how I could go 
without commits for years and never have missing data. Anyway, another 
lesson learned . . .
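
To illustrate the point about commits, here is a minimal sketch. It is 
not MySQLdb code: it uses the standard-library sqlite3 module (Python 3) 
as a stand-in so that it runs without a database server, and the table 
name new_table and the helper count_rows_after_reopen are made up for 
the illustration. The pattern shown is the one described above: insert, 
optionally commit, close, then reopen and look.

```python
import os
import sqlite3
import tempfile


def count_rows_after_reopen(commit):
    """Insert one row, optionally commit, close, reopen, and count the rows."""
    # A file-based database is needed; an in-memory one vanishes on close.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    con = sqlite3.connect(path)
    con.execute("CREATE TABLE new_table (n INTEGER)")
    con.commit()                      # make sure the empty table itself is saved
    con.execute("INSERT INTO new_table VALUES (1)")
    if commit:
        con.commit()                  # the step that was missing
    con.close()                       # uncommitted changes are discarded here

    con = sqlite3.connect(path)
    (count,) = con.execute("SELECT COUNT(*) FROM new_table").fetchone()
    con.close()
    return count


if __name__ == "__main__":
    print(count_rows_after_reopen(commit=False))
    print(count_rows_after_reopen(commit=True))
```

With MySQLdb the equivalent call would be con.commit () on the 
connection object after the INSERT statements, as stated in the quoted 
reply.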


Thanks

Frederic

--
https://mail.python.org/mailman/listinfo/python-list


Re: fixing an horrific formatted csv file.

2014-07-01 Thread F.R.

On 07/01/2014 04:04 PM, flebber wrote:

What I am trying to do is to reformat a csv file into something more usable.
currently the file has no headers, multiple lines with varying columns that are 
not related.

This is a sample

Meeting,05/07/14,RHIL,Rosehill Gardens,Weights,TAB,+3m Entire Circuit,  
,
Race,1,CIVIC STAKES,CIVIC,CIVIC,1350,~ ,3U,~ ,QLT   
,54,0,0,5/07/2014,,  ,  ,  ,  ,No class 
restriction, Quality, For Three-Years-Old and Upwards, No sex restriction, 
(Listed),Of $10. First $6, second $2, third $1, fourth $5000, 
fifth $2000, sixth $1000, seventh $1000, eighth $1000
Horse,1,Bennetta,0,"Grahame Begg",Randwick,,0,0,16-3-1-3 
$390450.00,,0,0,0,,98.00,M,
Horse,2,Breakfast in Bed,0,"David Vandyke",Warwick Farm,,0,0,20-6-1-5 
$201250.00,,0,0,0,,81.00,M,
Horse,3,Capital Commander,0,"Gerald Ryan",Rosehill,,0,0,43-9-9-3 
$438625.00,,0,0,0,,85.00,M,
Horse,4,Coup Ay Tee (NZ),0,"Chris Waller",Rosehill,,0,0,35-9-6-5 
$519811.00,,0,0,0,,101.00,G,
Horse,5,Generalife,0,"John O'Shea",Warwick Farm,,0,0,19-6-1-3 
$235045.00,,0,0,0,,87.00,G,
Horse,6,He's Your Man (FR),0,"Chris Waller",Rosehill,,0,0,13-2-3-1 
$108110.00,,0,0,0,,93.00,G,
Horse,7,Hidden Kisses,0,"Chris Waller",Rosehill,,0,0,40-8-8-5 
$565750.00,,0,0,0,,96.00,M,
Horse,8,Oakfield Commands,0,"Gerald Ryan",Rosehill,,0,0,22-7-4-6 
$269530.00,,0,0,0,,94.00,G,
Horse,9,Taxmeifyoucan,0,"Gregory Hickman",Warwick Farm,,0,0,18-2-4-4 
$539730.00,,0,0,0,,91.00,G,
Horse,10,The Peak,0,"Bart & James Cummings",Randwick,,0,0,15-6-1-0 
$426732.00,,0,0,0,,95.00,G,
Horse,11,Tougher Than Ever (NZ),0,"Chris Waller",Rosehill,,0,0,17-3-2-3 
$321613.00,,0,0,0,,97.00,H,
Horse,12,TROMSO,0,"Chris Waller",Rosehill,,0,0,47-8-11-2 
$622300.00,,0,0,0,,103.00,G,
Race,2,FLYING WELTER - BENCHMARK 95 HCP,BM95,BM95,1100,BM95  ,3U,~  
   ,HCP   ,54,0,0,5/07/2014,,  ,  ,  ,  
,BenchMark 95, Handicap, For Three-Years-Old and Upwards, No sex restriction,Of 
$85000. First $48750, second $16750, third $8350, fourth $4150, fifth $2000, 
sixth $1000, seventh $1000, eighth $1000, ninth $1000, tenth $1000
Horse,1,Big Bonanza,0,"Don Robb",Wyong,,0,57.5,31-9-4-3 
$366860.00,,0,0,0,,92.00,G,
Horse,2,Casual Choice,0,"Joseph Pride",Warwick Farm,,0,54,8-2-3-0 
$105930.00,,0,0,0,

So what I am trying to do is end up with an output like this.

Meeting, Date, Race, Number, Name, Trainer, Location
Rosehill, 05/07/14, 1, 1,Bennetta,"Grahame Begg",Randwick,
Rosehill, 05/07/14, 1, 2,Breakfast in Bed,"David Vandyke",Warwick Farm,

So as a start i thought i would try inserting the Meeting and Race number 
however I am just not getting it right.

import csv

outfile = open("/home/sayth/Scripts/cleancsv.csv", "w")
with open('/home/sayth/Scripts/test.csv') as f:
    f_csv = csv.reader(f)
    headers = next(f_csv)
    for row in f_csv:
        meeting = row[3] in row[0] == 'Meeting'
        new = row.insert(0, meeting)
        while row[1] in row[0] == 'Race' < 9:  # pref less than next found row[0]

            # grab row[1] as id number
            id = row[1]
            # from row[0] and insert it in first position
            new_lines = new.insert(1, id)
            outfile.write(new_lines)
    outfile.close()

How should I go about this?

Thanks

Sayth


Reformatting is what I do most, and over time I have acquired some 
practice. Complete solutions are not often proposed here and are 
possibly frowned upon as officious. If so, I apologize. I couldn't 
resist: it is such a nice example. Having solved it, I figured why not 
share it . . .


Frederic



def race_table (csv_text):
    input_table = [[item.strip (' "') for item in record.split (',')]
                   for record in csv_text.splitlines ()]

    # At this point look at input_table to find the record indices
    output_table = []
    for record in input_table:
        if record [0] == 'Meeting':
            meeting = record [3]
        elif record [0] == 'Race':
            date = record [13]
            race = record [1]
        elif record [0] == 'Horse':
            number = record [1]
            name = record [2]
            trainer = record [4]
            location = record [5]
            output_table.append ((meeting, date, race, number, name,
                                  trainer, location))

    return output_table


>>> for record in race_table (your_csv_text): print record

('Rosehill Gardens', '5/07/2014', '1', '1', 'Bennetta', 'Grahame Begg', 'Randwick')
('Rosehill Gardens', '5/07/2014', '1', '2', 'Breakfast in Bed', 'David Vandyke', 'Warwick Farm')
('Rosehill Gardens', '5/07/2014', '1', '3', 'Capital Commander', 'Gerald Ryan', 'Rosehill')
('Rosehill Gardens', '5/07/2014', '1', '4', 'Coup Ay Tee (NZ)', 'Chris Waller', 'Rosehill')
('Rosehill Gardens', '5/07/2014', '1', '5', 'Generalife', "John O'Shea", 'Warwick Far

Re: fixing an horrific formatted csv file.

2014-07-02 Thread F.R.

On 07/02/2014 11:13 AM, flebber wrote:

TM = TX.Table_Maker (headings =

('Meeting','Date','Race','Number','Name','Trainer','Location'))

TM (race_table (your_csv_text)).write ()

Where do I find TX? Found this mention in the list, was it available in pip by 
any name?
https://mail.python.org/pipermail/python-list/2014-February/667464.html

Sayth


I'd have to make it available. I proposed it some time ago and received 
a couple of suggestions in return. It is a modular transformation 
framework written entirely in Python (2.7). It consists essentially of a 
base class "Transformer" that handles input and output in such a way 
that Transformer objects can be chained. It saved me from drowning in a 
horrible and growing tangle of hacks. Finding something usable I had 
previously done took time. Understanding how it worked took more time 
and adapting it took still more time, so that writing yet another hack 
from scratch was faster.
A number of hacks I could quickly wrap into a Transformer object 
and so could start building a library of standard Transformers. The 
Table_Maker is one of them. The table making code is quite bad. It 
suffers from feature overload. I would clean it up for distribution.
I'd be happy to distribute the base class and a few standard 
Transformers, such as I use every day. (File Reader, File Writer, DB Run 
Command, DB Write, Table Maker, PDF To Text, Text To Lines, Lines To 
Text, Sort, Sort And Unique, etc.) Writing one's own Transformers is a 
breeze. Testing too, because a Transformer keeps its input and output 
and, in line with the system's design philosophy, does only its own 
single thing.
A Chain is a list of Transformers that run in sequence. It is 
itself derived from Transformer and is a functional equivalent. So 
Chains nest. Fixing a Chain that nothing comes out of is a 
straightforward matter too. It will still have run up to the failing 
element. Chain.show () reveals the culprit as the first one to have no 
output.
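
The chaining idea can be sketched in a few lines. This is NOT the TX 
module, which is unpublished; it is a hypothetical minimal version 
(Python 3) written only to illustrate the description above. All names 
in it (Transformer, Chain, first_without_output and the three small 
example steps) are assumptions made for the sketch.

```python
class Transformer:
    """Hypothetical base class: one step that remembers its input and output."""
    name = "Transformer"

    def __init__(self):
        self.input = None
        self.output = None

    def transform(self, data):
        raise NotImplementedError

    def __call__(self, data):
        self.input = data
        self.output = None              # stays None if transform() fails
        self.output = self.transform(data)
        return self.output


class Chain(Transformer):
    """A sequence of Transformers; itself a Transformer, so Chains nest."""
    name = "Chain"

    def __init__(self, *links):
        super().__init__()
        self.links = list(links)

    def transform(self, data):
        for link in self.links:
            link.output = None          # forget results of earlier runs
        for link in self.links:
            data = link(data)           # each step feeds the next one
        return data

    def first_without_output(self):
        """Return the first step that produced nothing, or None if all did."""
        for link in self.links:
            if link.output is None:
                return link
        return None


class Text_To_Lines(Transformer):
    name = "Text To Lines"

    def transform(self, data):
        return data.splitlines()


class Sort(Transformer):
    name = "Sort"

    def transform(self, data):
        return sorted(data)


class Lines_To_Text(Transformer):
    name = "Lines To Text"

    def transform(self, data):
        return "\n".join(data)


if __name__ == "__main__":
    sort_text = Chain(Text_To_Lines(), Sort(), Lines_To_Text())
    print(sort_text("pear\napple\nfig"))
```

Because every step keeps its own output, a step that raises an error 
leaves the earlier results in place, and the first step without output 
is the one to look at.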
I am not up to date on distributing and would depend on qualified 
help on that.


Frederic





A brief overview


The TX solution to your race table would be (TX is the name of the module):

class Race_Table (TX.Transformer):
    '''
    In: CSV text
    Out: Tabular data (2-dimensional list)
    '''
    name = 'Race_Table'

    @TX.setup   # Checks timestamps to prevent needless reruns in the absence of new input
    def transform (self):
        for line in self.Input.data:
            # See my post
        self.Output.take (output_table)

Example file to file:
>>> Race_Schedule_F2F = TX.Chain (TX.File_Reader (), Race_Table (),
...     TX.List_To_CSV (delimiter = ';'), TX.File_Writer (terminal = out_file_name))

>>> Race_Schedule_F2F (input_file_name)   # Does it all!

Example web to database:
>>> Race_Schedule_WWW2DB = TX.Chain (TX.WWW_Reader (),
...     Race_Schedule_HTML_Reader (), Race_Table (),
...     TX.DB_Writer (table_name = 'horses'))
>>> Race_Schedule_WWW2DB (url)   # Does it all! You'd have to write the Race_Schedule_HTML_Reader


Verify your table:
>>> Table_Viewer = TX.Chain (TX.Table_Maker (), TX.Table_Writer ())
>>> Race_Schedule_WWW2DB.show_tree () # See which one should display
Chain
Chain[0] - WWW Reader
Chain[1] - Race_Schedule_HTML_Reader
Chain[2] - Race_Table
Chain[3] - DB Writer
>>> print Table_Viewer (Race_Schedule_WWW2DB[2]()) # All Transformers keep their data
(Display of table)

Verify database:
>>> print Table_Viewer (TX.DB_Reader (table_name = 'horses')())
(Display of database table)

--
https://mail.python.org/mailman/listinfo/python-list


Re: fixing an horrific formatted csv file.

2014-07-04 Thread F.R.

On 07/04/2014 12:28 PM, flebber wrote:

On Friday, 4 July 2014 14:12:15 UTC+10, flebber  wrote:

I have taken the code and gone a little further, but I need to be able to 
protect myself against commas and single quotes in names.

How is it best to do this?

So in my file I had on line 44 this trainer name:

"Michael, Wayne & John Hawkes"

and on line 95 this horse name:

Inz'n'out

This throws off my capturing the correct item 9. How do I protect against this?

Here is the current code.



import re
from sys import argv

SCRIPT, FILENAME = argv


def out_file_name(file_name):
    """take an input file and keep the name with appended _clean"""
    file_parts = file_name.split(".",)
    output_file = file_parts[0] + '_clean.' + file_parts[1]
    return output_file


def race_table(text_file):
    """utility to reorganise poorly made csv entry"""
    input_table = [[item.strip(' "') for item in record.split(',')]
                   for record in text_file.splitlines()]
    # At this point look at input_table to find the record indices
    output_table = []
    for record in input_table:
        if record[0] == 'Meeting':
            meeting = record[3]
        elif record[0] == 'Race':
            date = record[13]
            race = record[1]
        elif record[0] == 'Horse':
            number = record[1]
            name = record[2]
            results = record[9]
            res_split = re.split('[- ]', results)
            starts = res_split[0]
            wins = res_split[1]
            seconds = res_split[2]
            thirds = res_split[3]
            prizemoney = res_split[4]
            trainer = record[4]
            location = record[5]
            print(name, wins, seconds)
            output_table.append((meeting, date, race, number, name,
                                 starts, wins, seconds, thirds, prizemoney,
                                 trainer, location))
    return output_table


MY_FILE = out_file_name(FILENAME)

# with open(FILENAME, 'r') as f_in, open(MY_FILE, 'w') as f_out:
#     for line in race_table(f_in.readline()):
#         new_row = line
with open(FILENAME, 'r') as f_in, open(MY_FILE, 'w') as f_out:
    CONTENT = f_in.read()
    # print(content)
    FILE_CONTENTS = race_table(CONTENT)
    # print new_name
    f_out.write(str(FILE_CONTENTS))


if __name__ == '__main__':
    pass

So I found this on stack overflow

In [2]: import string

In [3]: identity = string.maketrans("", "")

In [4]: x = ['+5556', '-1539', '-99', '+1500']

In [5]: x = [s.translate(identity, "+-") for s in x]

In [6]: x
Out[6]: ['5556', '1539', '99', '1500']

but it fails in my file, due to I believe mine being a list of list. Is there 
an easy way to iterate the sublists without flattening?

Current code.

    input_table = [[item.strip(' "') for item in record.split(',')]
                   for record in text_file.splitlines()]
    # At this point look at input_table to find the record indices
    identity = string.maketrans("", "")
    print(input_table)
    input_table = [s.translate(identity, ",'") for s
                   in input_table]

Sayth


Take Gregory's advice and use the csv module. Don't reinvent a csv 
parser. My "csv" splitter was the simplest approach possible, which I 
tend to use with undocumented formats, tweaking for unexpected features 
as they come along.
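
A minimal sketch of what that could look like, assuming Python 3 and 
the same record indices as in the earlier race_table (3 for the meeting 
name, 13 for the date, and so on). The csv module keeps a quoted field 
such as "Michael, Wayne & John Hawkes" in one piece and treats the 
single quotes in Inz'n'out as ordinary characters, so the later fields 
stay at their expected positions.

```python
import csv
import io


def race_table(csv_text):
    """Build (meeting, date, race, number, name, trainer, location) rows."""
    output_table = []
    meeting = date = race = None
    # csv.reader honours double-quoted fields, so commas inside them
    # do not split the field.
    for record in csv.reader(io.StringIO(csv_text)):
        if not record:
            continue                      # skip blank lines
        record = [item.strip() for item in record]
        kind = record[0]
        if kind == 'Meeting':
            meeting = record[3]
        elif kind == 'Race':
            race = record[1]
            date = record[13]
        elif kind == 'Horse':
            number, name = record[1], record[2]
            trainer, location = record[4], record[5]
            output_table.append((meeting, date, race, number, name,
                                 trainer, location))
    return output_table
```

When reading from a file instead of a string, csv.reader can be given 
the open file object directly.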


Frederic


--
https://mail.python.org/mailman/listinfo/python-list


Re: draw a line if the color of points of beginning and end are différent from white

2013-03-06 Thread F.R.

On 03/06/2013 06:46 PM, olsr.ka...@gmail.com wrote:

How can I draw a line only if the points at its beginning and end are 
different from white?
In other words, how can I get the color of the two points at the 
beginning and the end?
Please help me.


This should get you going. If it doesn't work it will
still direct you to the relevant chapters in the tutorial.

Frederic


from PIL import ImageDraw

def draw_line (image):

    # image is a PIL Image in mode 'L' (8-bit grayscale);
    # convert other modes first with image.convert ('L')

    # Define your colors
    WHITE = 255     # White in mode 'L' (~0 is -1 and never equals a pixel value)
    LINE_COLOR = 0  # Black; define as needed

    # Find end points: every non-white pixel counts as a point
    points = []
    pixels = image.load () # Fast pixel access
    for y in range (image.size [1]):
        for x in range (image.size [0]):
            if pixels [x, y] != WHITE:
                points.append ((x, y))

    # Join end points
    draw = ImageDraw.Draw (image)
    draw.line (points, fill = LINE_COLOR)

--
http://mail.python.org/mailman/listinfo/python-list


Re: Regex help needed!

2009-12-24 Thread F.R.



On 21.12.2009 12:38, Oltmans wrote:

Hello,. everyone.

I've a string that looks something like

lksjdfls  kdjff lsdfs  sdjflssdfsdwelcome


From above string I need the digits within the ID attribute. For
example, required output from above string is
- 35343433
- 345343
- 8898

I've written this regex that's kind of working
re.findall("\w+\s*\W+amazon_(\d+)",str)

but I was just wondering that there might be a better RegEx to do that
same thing. Can you kindly suggest a better/improved Regex. Thank you
in advance.
   


If you filter in two or even more sequential steps the problem becomes a 
lot simpler, not least because you can test each step separately:

>>> r1 = re.compile (']*')   # Add ignore case and variable white space
>>> r2 = re.compile ('\d+')
>>> [r2.search (item).group () for item in r1.findall (s) if item]   # s is your sample
['345343', '35343433', '8898'] # Supposing all ids have digits
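
The markup in the sample string and in the first pattern did not 
survive in this archive, so the following is only a sketch of the 
two-step idea under assumptions: the sample string s, the tag layout 
and the pattern r1 are made up to match the description in the question 
(ids of the form amazon_ followed by digits inside an id attribute). 
The helper name amazon_ids is hypothetical as well.

```python
import re

# Hypothetical sample; the original markup was lost.
s = ('lksjdfls <div id="amazon_345343"> kdjff lsdfs </div> sdjfls '
     '<div id = "amazon_35343433">sdfsd</div>'
     "<div id='amazon_8898'>welcome</div>")

# Step 1: isolate each id attribute whose value starts with "amazon_",
# ignoring case and allowing variable white space around "=".
r1 = re.compile(r'''id\s*=\s*["']amazon_[^"']*''', re.IGNORECASE)

# Step 2: pull the digits out of each isolated piece.
r2 = re.compile(r'\d+')


def amazon_ids(text):
    """Return the digit strings found in amazon_ id attributes, in order."""
    ids = []
    for item in r1.findall(text):
        match = r2.search(item)
        if match:                     # skip ids that carry no digits
            ids.append(match.group())
    return ids


if __name__ == "__main__":
    print(amazon_ids(s))
```

Each step can be checked on its own: r1.findall (s) shows what the 
first filter isolates before the digits are extracted.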

Frederic

--
http://mail.python.org/mailman/listinfo/python-list