On May 11, 11:04 am, "Tim Arnold" <tim.arn...@sas.com> wrote:
> Hi, I have some html files that I want to validate by using an external
> script 'validate'. The html files need a doctype header attached before
> validation. The files are in utf8 encoding. My code:
> ---------------
> import os,sys
> import codecs,subprocess
> HEADER = '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'
>
> filename = 'mytest.html'
> fd = codecs.open(filename,'rb',encoding='utf8')
> s = HEADER + fd.read()
> fd.close()
>
> p = subprocess.Popen(['validate'],
>                      stdin=subprocess.PIPE,
>                      stdout=subprocess.PIPE,
>                      stderr=subprocess.STDOUT)
> validate = p.communicate(unicode(s,encoding='utf8'))
> print validate
> ---------------
>
> I get lots of lines like this:
> Error at line 1, character 66:\tillegal character number 0
> etc etc.
>
> But I can give the command in a terminal 'cat mytest.html | validate' and
> get reasonable output. My subprocess code must be wrong, but I could use
> some help to see what the problem is.

My guess is that a newline is missing after the header: HEADER + fd.read() puts the doctype and the first line of the file on the same line.
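
A minimal sketch of that guess, written for Python 3 rather than the Python 2 of the quoted code. The helper names build_payload and run_validator are made up for illustration. The sketch assumes 'validate' reads the document from stdin, as the 'cat mytest.html | validate' usage suggests. It also encodes the text to UTF-8 bytes explicitly before writing to the pipe, which is a second assumption about what the script expects, not something confirmed in this thread.

```python
import subprocess

HEADER = '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'


def build_payload(html_text, header=HEADER):
    """Put the doctype on its own line, then encode the result as UTF-8 bytes."""
    return (header + "\n" + html_text).encode("utf-8")


def run_validator(payload, command=("validate",)):
    """Pipe the payload to the validator's stdin and return its combined output."""
    proc = subprocess.Popen(
        list(command),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold error messages into stdout
    )
    output, _ = proc.communicate(payload)
    return output.decode("utf-8", errors="replace")


# Possible usage (needs the external 'validate' script on PATH):
#
#     with open("mytest.html", encoding="utf-8") as fd:
#         print(run_validator(build_payload(fd.read())))
```

If the errors persist with the newline in place, comparing the bytes from build_payload against the raw file contents would show whether anything unexpected is reaching the pipe.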