On Sunday, December 28, 2014 5:34:11 PM UTC-6, Vincent Davis wrote: > > [snip: code sample with Unicode spaces! Yes, *UNICODE SPACES*!]
Oh my! Might i offer some suggestions to improve the readability of this code? 1. Indexing is syntactically noisy, so if you find yourself fetching the same index more than once, then that is a good time to store the indexed value into a local variable. 2. The only thing worse than duplicating code which fetches the same index over and over again, is wrapping the fetch in casting function (in this case: "int()") OVER and OVER again! 3. I see that you are utilizing regexps to aid in the logic, and although i agree that regexps are overkill for this problem (since it could "technically" be solved with string methods) if *I* had to solve this problem, i would use the power of regexps -- although i would use them more wisely ;-) I have not studied the data thoroughly, but just by "grazing over" the code you posted i can see a few distinct patterns that emerge from the VIN data-set. Here is a description of the patterns: "\d+n" "\d+na" "d\d+" "du\d+" and the last pattern being all digits: "\d+" Even though your "verbose-run-on-conditional" would most likely execute faster, i prefer to write code (when performance is not mission critical!) in the most readable and maintainable fashion. And in order to achieve that goal, you always want to keep the main logic as succinct as possible whist encapsulating the difficult bits in "suitably abstracted structures". DIVIDE AND CONQUER! ============================================================ My approach would be as follows: ============================================================ 1. Create a map for each distinct set of VIN patterns with the keys being a two-tuple that represents the low and high limits of the serial number, and the values being the year of that range.. database = { 'map_NA':{ (101, 15808): "Triumph 1951", (15809, 25000): "Triumph 1952", ..., }, 'map_N':{ ..., }, 'map_H':{ ..., }, 'map_D':{ ..., }, 'map_DU':{ ..., }, } 2. Create a regexp pattern for each "distinct VIN pattern". The group captures will be used to strip-out *ONLY* the numeric parts! Then concatenate all the regexp patterns into a single monolithic program utilizing "named groups". (The group names will be the corresponding "map_*" for which to search) [code stub here] :-P" 3. Now you can write some fairly simple logic. prog = re.compile("pat1|pat2|pat3...") def parse_vin(vin): match = prog.search(vin) if match: gname = # Fetch the groupname from the match object. number = # Fetch the digits from the group capture. d = database[gname] for k in d: low, high = d[k] if low <= number <= high: return d[k] return None While this approach could be "heavy handed", i feel it will be much easier to maintain and expand. I'd argue that if you're going to utilize re's, then you should wield the full power they provide, else, use some other method. PS: You know you have a Unicode monkey on your back when you use tools that insert Unicode spaces! PPS: Hopefully i did not make any stupid mistakes, it's past my bedtime! -- https://mail.python.org/mailman/listinfo/python-list