Hi Richard, +# Field examples: +# +# %disp 0:s16 -- sextract(i, 0, 16) +# %imm9 16:6 10:3 -- extract(i, 16, 6) << 3 | extract(i, 10, 3)
startindex:endindex for unnamed_field is more intuitive. As any ISA manual would specify those. +# +# It is recommended, but not required, that all field_ref and args_ref +# appear at the end of the line, not interleaving with fixedbit_elf or +# field_elt. This seems not very intuitive to me. Why specify the field_refs at the end and fill the space they occupy with '.'? I'd prefer interleaving the reference to make it more readable (which doesn't work so far) +# Pattern examples: +# +# addl_r 010000 ..... ..... .... 0000000 ..... @opr +# addl_i 010000 ..... ..... .... 0000000 ..... @opi We should note in the documentation where the LSB is. Got that the wrong way in the first time I implemented RISC-V . +def str_indent(c): + """Return a string with C spaces""" + r = '' + for i in range(0, c): + r += ' ' + return r return ' ' * c +def popcount(b): + b = (b & 0x55555555) + ((b >> 1) & 0x55555555) + b = (b & 0x33333333) + ((b >> 2) & 0x33333333) + b = (b & 0x0f0f0f0f) + ((b >> 4) & 0x0f0f0f0f) + b = (b + (b >> 8)) & 0x00ff00ff + b = (b + (b >> 16)) & 0xffff + return b We really don't need to look for efficiency here? return bin(b).count('1') +def parse_arguments(lineno, name, toks): + """Parse one argument set from TOKS at LINENO""" + global arguments + global re_ident + + flds = [] arg_fields is a better name. Likewise for other places containing flds. Such that one knows where this fields belongs to. + elif name[0] == '&': + parse_arguments(lineno, name[1:], toks) + elif name[0] == '@': + parse_generic(lineno, True, name[1:], toks) + else: + parse_generic(lineno, False, name, toks) + toks = [] Why do we have a generic function for pat and fields? Rather make it two parse functions and split subparsings that they share into generic functions, instead of doing the distinction in parse_generic(). + for p in pats: + pnames.append(p.name) + #pdb.set_trace() Guess you forgot that ;) + long_opts = [ 'decode=', 'translate=', 'header=', 'output=' ] + try: + (opts, args) = getopt.getopt(sys.argv[1:], 'h:o:', long_opts) + except getopt.GetoptError as err: + error(0, err) + for o, a in opts: + if o in ('-h', '--header'): + h_file = a + elif o in ('-o', '--output'): + c_file = a + elif o == '--decode': + decode_function = a + elif o == '--translate': + translate_prefix = a + else: + assert False, 'unhandled option' Please use argsparse On a more general note. Do you really need to invent your own description format + parser? Why not reuse an existing markup language? Can you give me a rationale? That said, your description language looks very readable, even though I'm not a big fan of symbol prefixes ;). Thanks for improving my original approach. Cheers, Bastian