On 02/21/2014 09:59 PM, Denis Usanov wrote:
Good evening.
First of all I would like to apologize for the name of the topic. I really didn't
know how to name it more appropriately.
I mostly develop automation scripts in Python, for things such as deployment (it's not
about fabric, and maybe not about ssh at all), testing, and so on. In these terms I have
an abstraction called a "step".
Some code:
    class IStep(object):
        def run(self):
            raise NotImplementedError()
And the concrete steps:
    class DeployStep: ...
    class ValidateUSBFlash: ...
    class SwitchVersionS: ...
In these classes I implement the run method.
Then I use a "builder" class which can add steps to an internal list and has a method
"start" that runs all the steps one by one.
And I like this. It's a loosely coupled system. It works fine in simple cases. But sometimes
some steps have to use results from previous steps, and that's where I run into problems.
Until now I kept an internal dict in the "builder", named it "world", and passed it to the
run() method of each step. It worked, but I disliked it.
How would you solve this problem, and how would you do it? I understand that
it's more of an architecture question than a Python one.
I bet I wouldn't have asked it if I had worked with a functional
programming language.
A few months ago I posted a summary of a data transformation framework
inviting commentary.
(https://mail.python.org/pipermail/python-list/2013-August/654226.html).
It didn't meet with much interest and I forgot about it. Now that
someone is looking for something along these lines, as I understand his
post, there might be some interest after all.
My module is called TX. A base class "Transformer" handles the flow of
data. A custom Transformer defines a method "T.transform (self)" which
transforms input to output. Transformers are callable, taking input as
an argument and returning the output:
transformed_input = T (some_input)
A Transformer object retains both input and output after a run. If it is
called a second time without input, it simply returns its output,
without needlessly repeating its job:
same_transformed_input = T ()
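In outline, a base class along these lines would produce that behavior (a
simplified sketch with assumed details, not the actual TX code):

    class Transformer (object):

        def __init__ (self, **params):
            self.params = params        # keyword parameters, e.g. file_name
            self.input = None
            self.output = None

        def set (self, **params):
            # Changing a parameter invalidates the cached output
            self.params.update (params)
            self.output = None

        def __call__ (self, input = None):
            if input is not None:
                self.input = input
                self.output = None      # new input invalidates the cache
            if self.output is None:
                self.output = self.transform ()
            return self.output

        def transform (self):
            # Custom classes override this, producing output from self.input
            raise NotImplementedError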
Because of this IO design, Transformers nest:
csv_text = CSV_Maker (Data_Line_Picker (Line_Splitter (File_Reader ('1st-quarter-2013.statement'))))
A better alternative to nesting is to build a Chain:
Statement_To_CSV = TX.Chain (File_Reader, Line_Splitter,
Data_Line_Picker, CSV_Maker)
A Chain is functionally equivalent to a Transformer:
csv_text = Statement_To_CSV ('1st-quarter-2013.statement')
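A Chain can be pictured as a Transformer that pipes each element's output
into the next (again a sketch with assumed internals; note that it accepts
bare classes as well as ready-made instances):

    class Chain (Transformer):

        def __init__ (self, *elements, **params):
            Transformer.__init__ (self, **params)
            # Accept classes as well as ready-made instances
            self.elements = [e () if isinstance (e, type) else e
                             for e in elements]

        def transform (self):
            data = self.input
            for element in self.elements:
                data = element (data)   # each element caches its own data
            return data

        def __getitem__ (self, index):
            return self.elements[index]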
Since Transformers retain their data, developing or debugging a Chain is
a relatively simple affair. If a Chain fails, the method "show ()"
displays the innards of its elements one by one. The failing element is
the first one that has no output. show () also displays any messages the
method "transform (self)" has logged (self.log (message)). While you fix
the failing element, the preceding element keeps providing the original
input for testing, until the repair is done.
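show () need not be more than a loop over the elements; something in this
vein would do (continuing the sketch, with the logged messages assumed to
sit in a 'messages' list):

    # A method to add to the Chain sketch above
    def show (self):
        for i, element in enumerate (self.elements):
            print ('[%d] %s' % (i, element.__class__.__name__))
            print ('    input:  %r' % (element.input,))
            print ('    output: %r' % (element.output,))   # None marks the failing element
            for message in getattr (element, 'messages', ()):
                print ('    log: %s' % message)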
Since a Chain is functionally equivalent to a Transformer, a Chain can
be placed into a containing Chain alongside Transformers:
Table_Maker = TX.Chain (TX.File_Reader (), TX.Line_Splitter (),
                        TX.Table_Maker ())
Table_Writer = TX.Chain (Table_Maker, Table_Formatter,
                         TX.File_Writer (file_name = '/home/xy/office/addresses-4214'))
DB_Writer = TX.Chain (Table_Maker, DB_Formatter,
                      TX.DB_Writer (table_name = 'contacts'))
Better:
Splitter = TX.Splitter (TX.Table_Writer (), TX.DB_Writer ())
Table_Handler = TX.Chain (Table_Maker, Splitter)
Table_Handler ('/home/xy/Downloads/report-4214')   # Writes both to the file and to the DB
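Splitter would be the fan-out counterpart of Chain (same caveat: an
illustrative sketch, not the actual code): it feeds one and the same input
to every branch and collects the branch outputs in a list:

    class Splitter (Transformer):

        def __init__ (self, *elements, **params):
            Transformer.__init__ (self, **params)
            self.elements = [e () if isinstance (e, type) else e
                             for e in elements]

        def transform (self):
            # The same input goes to every branch
            return [element (self.input) for element in self.elements]

        def __getitem__ (self, index):
            return self.elements[index]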
If a structure grows too complex to remember, the method "show_tree ()"
displays something like this:
Chain
    Chain[0] - Chain
        Chain[0][0] - Quotes
        Chain[0][1] - Adjust Splits
    Chain[1] - Splitter
        Chain[1][0] - Chain
            Chain[1][0][0] - High_Low_Range
            Chain[1][0][1] - Splitter
                Chain[1][0][1][0] - Trailing_High_Low_Ratio
                Chain[1][0][1][1] - Standard Deviations
        Chain[1][1] - Chain
            Chain[1][1][0] - Trailing Trend
            Chain[1][1][1] - Pegs
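A short recursive printer over the same element lists would produce that
display (hypothetical like the other sketches; leaf names such as
'Adjust Splits' are assumed to come from a 'name' attribute):

    # A method to add to both the Chain and the Splitter sketch
    def show_tree (self, label = 'Chain', depth = 0):
        if depth == 0:
            print (label)
        for i, element in enumerate (self.elements):
            sub_label = '%s[%d]' % (label, i)
            name = getattr (element, 'name', element.__class__.__name__)
            print ('%s%s - %s' % ('    ' * (depth + 1), sub_label, name))
            if hasattr (element, 'elements'):
                element.show_tree (sub_label, depth + 1)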
Following a run, all intermediate formats are accessible (C being the top-level Chain):
standard_deviations = C[1][0][1][1]()
TM = TX.Table_Maker ()
TM (standard_deviations).write ()
     0 |      1 |     2 |
116.49 | 132.93 | 11.53 |
115.15 | 128.70 | 11.34 |
  1.01 |   0.00 |  0.01 |
A Transformer takes parameters, either at construction time or by means
of the method "T.set (key = parameter)". A File Reader gets no payload
passed and so, as a convenient alternative, may take a file name as its
input argument; a File Writer, by contrast, does take payload, and its
file name must be set by keyword:
File_Writer = TX.File_Writer (file_name = '/tmp/memos-with-dates-1')
File_Writer (input) # Writes file
File_Writer.set (file_name = '/tmp/memos-with-dates-2')
File_Writer () # Writes the same thing to the second file
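Concretely, the two classes might differ no more than this (illustrative
sketches on top of the Transformer base above; the real ones surely do
more, e.g. encodings and error handling):

    class File_Reader (Transformer):

        def transform (self):
            # No payload: the input argument is the file name itself
            with open (self.input) as f:
                return f.read ()

    class File_Writer (Transformer):

        def transform (self):
            # Payload arrives as input; the file name comes by keyword
            with open (self.params['file_name'], 'w') as f:
                f.write (self.input)
            return self.input       # pass the payload through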
That's about it. I am very pleased with the design. I developed it to
wrap a growing jungle of existing modules and classes that had no
interconnectability and no common input-output specifications. The
improvement in terms of work time and resource management is enormous. I
would gladly share the base class and a few custom classes that are
reasonably autonomous and don't require surgical extraction from the jungle.
Writing a custom class requires no more than defining private keywords,
if any, and writing the method "transform (self)", or "process_record
(self, record)" if the input is a list of records, which it often is.
The modular design encourages having a Transformer do just one simple
thing, easy to write and easy to debug. Complexity comes from assembling
simple Transformers in a great variety of configurations.
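For example, a line splitter and a record-by-record field picker would
need no more than this (hypothetical examples; the record-by-record
dispatch is assumed to live in the base class):

    class Line_Splitter (Transformer):

        def transform (self):
            return self.input.splitlines ()

    class First_Field_Picker (Transformer):

        # With the input a list of records, only process_record
        # needs defining
        def process_record (self, record):
            return record.split ()[0]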
Frederic