> Sorry about "obfuscation contest," I just balked at the reduce code,
> which seemed like premature overgeneralization ;-)
It's a personal preference, but the only thing I consider premature is
optimization, not generalization. I prefer to try to capture any concept
I can as its own abstracted procedure. In practice this usually means
using a lot of lazy data structures at the end of the day :)

> Yes, but does collect(fields, fun) always know that fun takes two args?
> And that it makes sense to apply fun pairwise as a boolean relation to
> reduce to a single boolean value? You could say that by definition it
> does. In that case, will your future library extender need to add some
> other kind of collect-like function?

collect does assume that the function takes 2 parameters, and the logic
is to apply the function to the result of the previous call and the next
item in the list. The only more general way I can think of doing this
would be to inspect the function to determine its number of arguments.
The general algorithm would be something like:

    let n = the number of arguments a given function f takes
    apply the first n elements of the list to f
    given the result r of that call, apply [r, *(the next n-1 elements)]
    to f, and recurse

    def f(a, b, c):
        return (b+c)*a

    map_col([1, 2, 3, 4, 5], f)  =>  45

However, I don't know how applicable that would be in this context, since
this is a boolean system, and in the simple binary setup it makes sense
to feed the last operand back into the beginning of the next evaluation.
In a more general case like this one, the value of the evaluation would
be the most useful result, not the boolean truth value.
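For what it's worth, here is a rough sketch of that arity-driven fold.
map_col is just an illustrative name, not part of the library, and
inspect.getargspec counts defaulted arguments too, so a function like
do_complicated_test below would need extra care:

    import inspect

    def map_col(items, f):
        # number of arguments f expects (assumed to be >= 2)
        n = len(inspect.getargspec(f)[0])
        items = list(items)
        # prime the fold with the first n elements
        result = f(*items[:n])
        rest = items[n:]
        # feed the previous result back in as the first argument,
        # consuming n-1 fresh items on each step
        while rest:
            result = f(result, *rest[:n-1])
            rest = rest[n-1:]
        return result

    def f(a, b, c):
        return (b+c)*a

    assert map_col([1, 2, 3, 4, 5], f) == 45

With n == 2 this degenerates to an ordinary reduce, which is the boolean
case the library already handles.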
I do have one other generic routine besides the booleans and the collect
collectors, which is called check:

    def check(fields, test):
        """ check(list fields, func test) -> rule """
        def rule(record):
            values = [record[field] for field in fields]
            return test(*values)
        return rule

It basically lets you pull out whatever fields you want and pass them to
the provided function:

    def do_complicated_test(a, b, tolerance=0.001):
        return a - ((a+b)/b) < a*tolerance

    rule = check(('x', 'y'), do_complicated_test)

Now that I think of it, I should probably swap this to be
check(do_complicated_test, 'x', 'y'). When I first wrote this thing it
was a 3-stage construction, not 2. I wanted to abstract out specific
tests and then JUST apply fields to them in the actual rules for the
business code. I then figured out that if I standardize on having the
fields as the last arguments of the check routines, I can use a simple
curry technique to build the abstraction:

    from functional import curry

    comp_rule = curry(check)(do_complicated_test)

Then the actual business code can use it like this:

    comp_rule(('a', 'b'))

Or:

    ssn_rule = curry(match)(is_ssn)
    ssn_rule('my-social-security-field')

For this to work properly and easily for the business logic, I do need
to nail down some more solid rule invocation standards.

>> (have('length'), have('width')),
>>     check(['length', 'width'], lambda x, y: x == y))
>> assert rule({'length' : '2', 'width' : '2'}) == True
>> assert rule({'length' : '2', 'width' : '1'}) == False
>> assert rule({'length' : '1', 'width' : '2'}) == False
>
> But what about when the "when" clause says the rule does not apply?
> Maybe return NotImplemented, (which passes as True in an if test) e.g.,

I don't really see why a special value is needed here, because "when"
falls straight out of the logic. when is basically material implication,
"if A then B":

    when(A, B)  =>  NOT A OR B

If A is false, then the expression is always true.

    def when(predicate, expr):
        """ when(rule predicate, rule expr) -> rule """
        def rule(record):
            if predicate(record):
                return expr(record)
            else:
                return True
        return rule

So if the predicate fails, the expression is True OR X, which is always
True. Or, from another POV, the entire system is based on conjunction:
returning True just means the rule doesn't force the entire validation
False. When the predicate of a when expression fails, the "therefore"
expression doesn't matter, so the whole expression is true no matter
what. Trying not to be condescending; I just think ordinary
propositional logic works well in this case. Since this fits well into
the system, can you give me an example of why "when" should be an
exception to the boolean logic and return a special value like
NotImplemented?

The inline code generation is an interesting concept. It would
definitely run faster. Hmm, I did this little test:

    # does arbitrarily deep recursive calls
    >>> def doit(v, i):
    ...     if not i:
    ...         return v
    ...     else:
    ...         return doit(v, i-1)

    # simple adding while stressing function calls
    >>> def run(times, depth):
    ...     for i in range(times):
    ...         doit(i, depth/2) + doit(i, depth/2)

    # baseline benchmark
    >>> def ezrun(times):
    ...     for i in range(times):
    ...         i + i

    # timer
    >>> def efunc(f, *a, **k):
    ...     s = time.time()
    ...     f(*a, **k)
    ...     return time.time() - s
    ...
    >>> efunc(run, 10000, 8)
    0.061595916748046875
    >>> efunc(run, 10000, 4)
    0.038624048233032227
    >>> efunc(ezrun, 10000)
    0.0024700164794921875
    >>> efunc(run, 1000, 4)
    0.0038750171661376953
    >>> efunc(ezrun, 1000)
    0.00024008750915527344
    >>> efunc(run, 1000, 2)
    0.002780914306640625
    >>> efunc(ezrun, 1000)
    0.00023913383483886719

Even in the best case, where there are 2 function calls per evaluation,
it's at least an order of magnitude slower. However, for a more
realistic profile of the kind of domain I'm going to apply this to: I'd
say a record with 500 different fields and an average of 3 rules per
field is a common upper bound, so using the product of those as a
profile:

    >>> efunc(run, 1500, 4)
    0.0058319568634033203
    >>> efunc(ezrun, 1500)
    0.00034809112548828125

I can probably live with those results for now, though optimization in
the future is always an option :)

> >>> jslowery.rule({'length':2, 'width':2})
> True
> >>> jslowery.rule({'length':2, 'width':1})
> False
> >>> jslowery.rule({'height':2, 'width':1})
> NotImplemented
> >>> jslowery.box_rule({'height':2, 'width':1})
> NotImplemented
> >>> jslowery.box_rule({'length':2, 'width':1})
> False
> >>> jslowery.box_rule({'length':2, 'width':2})
> True

You know, the first implementation of this thing I did had the fields
separated from the complete rules. When I realized that this was just a
basic logic system, it became a problem having the fields separated from
the rules. In practice, I'm looking to use this for easily collecting
many rules together and applying them to a data set:

    def test_validator2(self):
        from strlib import is_ssn
        rules = [
            have('first_name'),
            have('last_name'),
            all(have('ssn'), match(is_ssn, 'ssn')),
            when(all(have('birth_date'), have('hire_date')),
                 lt('birth_date', 'hire_date'))
        ]
        v = validator(rules)

        data = {'first_name' : 'John', 'last_name' : 'Smith',
                'ssn' : '123-34-2343', 'birth_date' : 0, 'hire_date' : 0}
        assert v(data) == []

        data = {'first_name' : 'John', 'last_name' : 'Smith',
                'ssn' : '12-34-2343', 'birth_date' : 0, 'hire_date' : 0}
        assert v(data) == [rules[2]]
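Roughly, validator itself is just a conjunction over the rules that
remembers which ones failed. A minimal sketch that satisfies the asserts
above (the real thing would likely carry more context):

    def validator(rules):
        """ validator(list rules) -> func(record) -> failed rules """
        def validate(record):
            # collect every rule that doesn't hold for this record;
            # an empty list means the record passed all rules
            return [rule for rule in rules if not rule(record)]
        return validate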
The thing I'm wrestling with now is the identity of rules and fields.
I'd like to fully support localization by making a localization library
that provides string tables for "error messages". Positions could be
used to do this:

    rules = [
        have('first_name'),
        have('last_name'),
        all(have('ssn'), match(is_ssn, 'ssn')),
        when(all(have('birth_date'), have('hire_date')),
             lt('birth_date', 'hire_date'))
    ]

    msgs = [
        locale('MISSING_VALUE', 'FIRST_NAME'),
        locale('MISSING_VALUE', 'LAST_NAME'),
        locale('INVALID_SSN'),
        locale('INVALID_BIRTH_HIRE_DATE'),
    ]

A functional construction could pull all of the locale references out of
here if wanted, and then a generic routine could join the two lists:

    error_map = assoc_list(rules, msgs)

    errors = validate(record)
    for e in errors:
        print locale.lookup(error_map[e])
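Since the rules are plain function objects, which hash and compare by
identity, assoc_list only needs to zip the two parallel lists into a
lookup table; a sketch under that assumption:

    def assoc_list(keys, values):
        """ pair two parallel lists into a dict keyed by identity """
        # each rule object serves as its own dictionary key, so the
        # failed rules returned by the validator map straight to
        # their locale entries
        return dict(zip(keys, values))

The fragile part is the positional pairing: if the rules and msgs lists
drift out of sync, a failed rule silently gets the wrong message, which
is another argument for giving rules an explicit identity.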