Paul wrote:
> Suppose I have some text that has a lot of quoted speech in it, but it's
> supplied using standard (") straight single and double quotes.
>
> Is there some pre-processing tool that will try to convert them to
> proper curly quotes suitable for LaTeX (``) and ('')?
>
> I know it can't be done perfectly and will need manual tweaking, but
> there must be something to do most of the work.
>
> All I can think of is to use some word processor that has a "clever
> quotes" function (e.g. MS Word) and then use something like wvLatex to
> export that to LaTeX format. Or do a search and replace.
>
> Is there a command-line tool that does this using some heuristics to
> cover most areas that could be problematic?

sed, Perl, Python, or any other scripting language would do it for you.
Here's something in Python:

$ cat trial.txt
The newspaper reported "he said 'The quick brown fox jumped over the lazy dog' 500 times in a row and then dropped down dead".

$ python quotes.py trial.txt
The newspaper reported ``he said `The quick brown fox jumped over the lazy dog' 500 times in a row and then dropped down dead''.

Regards,
Angus

#! /usr/bin/env python

import sys


def usage(prog_name):
    return "Usage: %s 'input text file'" % prog_name


def warning(message):
    sys.stderr.write(message + '\n')


def error(message):
    sys.stderr.write(message + '\n')
    sys.exit(1)


def manipulate(filename):
    """Return the file's text with straight quotes replaced by LaTeX quotes."""
    doubleq = '"'
    singleq = "'"
    double_latex_lq = '``'
    double_latex_rq = "''"
    single_latex_lq = '`'
    single_latex_rq = "'"

    # Each quote character simply toggles between "opening" and "closing".
    # Note that an apostrophe (as in "don't") also toggles the single-quote
    # state, so text containing apostrophes needs checking by hand.
    inside_double = False
    inside_single = False

    try:
        output = []
        with open(filename, 'r') as infile:
            for line in infile:
                for c in line:
                    if c == doubleq:
                        if not inside_double:
                            output.append(double_latex_lq)
                        else:
                            output.append(double_latex_rq)
                        inside_double = not inside_double
                    elif c == singleq:
                        if not inside_single:
                            output.append(single_latex_lq)
                        else:
                            output.append(single_latex_rq)
                        inside_single = not inside_single
                    else:
                        output.append(c)
        return ''.join(output)
    except IOError:
        warning('Unable to read %s' % filename)
        return None


def main(argv):
    if len(argv) != 2:
        error(usage(argv[0]))

    input_file = argv[1]
    manipulated_text = manipulate(input_file)
    if manipulated_text is not None:
        # Write the text as-is, without adding an extra trailing newline.
        sys.stdout.write(manipulated_text)


if __name__ == "__main__":
    main(sys.argv)
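
One caveat with the toggling approach in the script above: every ' flips the single-quote state, so an apostrophe in a word like "don't" gets treated as an opening quote. Since the question asked about heuristics, here is a rough sketch of one possible alternative, not a tested replacement. It decides "opening" versus "closing" from the character before the quote, and relies on the fact that LaTeX uses the same ' for an apostrophe and a closing single quote. The names `latex_quotes` and `OPENERS` are made up for illustration.

```python
# Characters after which a quote is assumed to be an opening quote.
OPENERS = " \t\n([{"


def latex_quotes(text):
    """Convert straight quotes to LaTeX quotes by looking at the previous character.

    A quote counts as opening if it follows whitespace, an opening bracket,
    the start of the text, or another opening quote. Anything else is treated
    as closing, which also covers apostrophes inside words.
    """
    out = []
    prev = " "           # treat the start of the text like whitespace
    prev_opened = False  # True if the previous character was an opening quote
    for c in text:
        if c in ('"', "'"):
            opening = prev in OPENERS or prev_opened
            if c == '"':
                out.append("``" if opening else "''")
            else:
                out.append("`" if opening else "'")
            prev_opened = opening
        else:
            out.append(c)
            prev_opened = False
        prev = c
    return "".join(out)


if __name__ == "__main__":
    print(latex_quotes('"I don\'t know," she said.'))
```

This still gets some cases wrong, for example a leading apostrophe as in 'tis, or a quote that follows a dash with no space, so the output will still need a read-through.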