"cjl" <[EMAIL PROTECTED]> writes: > Fredrik Lundh wrote: > >> something like this could work: >> >> import re >> >> text = open(file, "rb").read() >> >> for m in re.finditer("([\x20-\x7f]{4,})[\n\0]", text): >> print m.start(), repr(m.group(1)) > > Hey...that worked. I actually modified: > > for m in re.finditer("([\x20-\x7f]{4,})[\n\0]", text): > > to > > for m in re.finditer("([\x20-\x7f]{4,})", text): > > and now the output is nearly identical to 'strings'. One problem > exists, in that if the binary file contains a string > "monkey/chicken/dog/cat" it is printed as "mokey//chicken//dog//cat", > and I don't know enough to figure out where the extra "/" is coming > from.
Are you sure it's monkey/chicken/dog/cat, and not
monkey\chicken\dog\cat? The latter will print as monkey\\chicken...
because of the repr() call, which escapes each backslash.

Also, you probably want the character class to be [\x20-\x7e] (the DEL
character \x7f isn't printable). You're also missing tabs (\t): the GNU
binutils strings utility looks for \t or [\x20-\x7e].

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca
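
To make both points concrete, here is a rough sketch, assuming Python 3
(so the file contents are bytes and the pattern is a bytes pattern).
The names PRINTABLE_RUN, find_strings and sample are made up for
illustration, and the sample bytes just stand in for a real binary
file. It is not a replacement for 'strings', only a demonstration of
the [\t\x20-\x7e] class and of what repr() does to backslashes.

```python
import re

# Tab plus printable ASCII; \x7f (DEL) is deliberately left out.
PRINTABLE_RUN = re.compile(rb"[\t\x20-\x7e]{4,}")


def find_strings(data):
    """Yield (offset, text) for each run of 4 or more printable bytes."""
    for m in PRINTABLE_RUN.finditer(data):
        yield m.start(), m.group().decode("ascii")


# Hypothetical sample data standing in for the contents of a binary file.
sample = b"\x00\x01monkey\\chicken\\dog\\cat\x00\x7f\x7f\x7f\x7fab\x00tab\there\x00"

for offset, text in find_strings(sample):
    print(offset, text)        # each backslash is printed once
    print(offset, repr(text))  # repr() escapes each backslash as \\
```

Printing the text directly shows single backslashes; only the repr()
form doubles them. Forward slashes are never doubled by repr(), which
is why the original string most likely contains backslashes.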