And using the codecs module:
[CODE]
import codecs

f = codecs.open("show_btchina.user.js","r","utf-8")
modf = codecs.open("modified.js","w","utf-8")
for line in f:
    idx = line.find(u"//")
    if idx==0:
        continue
    elif idx>0:
        line = line[:idx]+u'\n'
    modf.write(line)
modf.close()
f.close()
[/CODE]

Gabriel Genellina wrote:
> At Friday 26/1/2007 06:54, Frank Potter wrote:
>
>>[CODE]
>>import re
>>
>>f=open("show_btchina.user.js","r").read()
>>f=unicode(f,"utf8")
>>
>>r=re.compile(ur"//[^\r\n]+$", re.UNICODE|re.VERBOSE)
>>f_new=r.sub(ur"",f)
>>
>>open("modified.js","w").write(f_new.encode("utf8"))
>>[/CODE]
>>
>>And, the problem is, it seems that only the last comment is removed.
>>How can I remove all of the comments, please?
>
> Note that it's not as easy as simply deleting from // to end of line,
> because those characters might be inside a string literal. But if you
> can afford the risk, this is a simple way without re:
>
> f = open("show_btchina.user.js","r")
> modf = open("modified.js","w")
> for line in f:
>     uline=unicode(line,"utf8")
>     idx = uline.find("//")
>     if idx==0:
>         continue
>     elif idx>0:
>         uline = uline[:idx]+'\n'
>     modf.write(uline.encode("utf8"))
> modf.close()
> f.close()
>
> -- http://mail.python.org/mailman/listinfo/python-list
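
If the string-literal risk mentioned in the quoted message matters (for example a URL like "http://..." inside a quoted string), a rough sketch of a quote-aware variant might look like the following. The helper names find_comment_start and strip_line_comments are made up for illustration and are not from the thread. It assumes strings do not span lines, and it does not try to handle /* */ block comments or regex literals, so treat it as a starting point rather than a real JavaScript parser.

```python
def find_comment_start(line):
    """Return the index of the first // outside a quoted string, or -1."""
    quote = None              # quote character of the string we are inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote is not None:
            if ch == "\\":
                i += 2        # skip the backslash and the escaped character
                continue
            if ch == quote:
                quote = None  # closing quote reached
        elif ch in "'\"":
            quote = ch        # opening quote of a string literal
        elif line.startswith("//", i):
            return i
        i += 1
    return -1


def strip_line_comments(text):
    """Drop lines that start with // and cut trailing // comments."""
    out = []
    for line in text.splitlines(True):    # True keeps the line endings
        idx = find_comment_start(line)
        if idx == 0:
            continue                      # the whole line is a comment
        if idx > 0:
            line = line[:idx] + "\n"
        out.append(line)
    return "".join(out)
```

With the codecs version at the top of this message, the loop could then be replaced by something like modf.write(strip_line_comments(f.read())).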