Bugs item #1610654, was opened at 2006-12-07 09:18
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1610654&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Performance
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Chui Tey (teyc)
Assigned to: Nobody/Anonymous (nobody)
Summary: cgi.py multipart/form-data

Initial Comment:
Uploading large binary files using multipart/form-data can be very
inefficient. Binary data may contain LF characters very frequently,
and read_lines_to_outerboundary calls self.fp.readline() once for
every LF-terminated piece, so the loop runs far too many times.

*** cgi.py.Py24	Thu Dec  7 18:46:13 2006
--- cgi.py	Thu Dec  7 16:38:04 2006
***************
*** 707,713 ****
          last = next + "--"
          delim = ""
          while 1:
!             line = self.fp.readline()
              if not line:
                  self.done = -1
                  break
--- 703,709 ----
          last = next + "--"
          delim = ""
          while 1:
!             line = self.fp_readline()
              if not line:
                  self.done = -1
                  break
***************
*** 729,734 ****
--- 730,753 ----
                  delim = ""
              self.__write(odelim + line)
  
+     def fp_readline(self):
+ 
+         tell = self.fp.tell()
+         buffer = self.fp.read(1 << 17)
+         parts = buffer.split("\n")
+         retlst = []
+         for part in parts:
+             if part.startswith("--"):
+                 if retlst:
+                     retval = "\n".join(retlst) + "\n"
+                 else:
+                     retval = part + "\n"
+                 self.fp.seek(tell + len(retval))
+                 return retval
+             else:
+                 retlst.append(part)
+         return buffer
+ 
      def skip_lines(self):
          """Internal: skip lines until outer boundary if defined."""
          if not self.outerboundary or self.done:

The patch reads the file in larger increments (1 << 17 bytes, i.e.
128 KiB, per read). For my 138 MB test file, it reduced parsing time
from 168 seconds to 19 seconds.

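To illustrate the general idea behind reading in larger increments, here is a minimal sketch of chunked boundary scanning. It is not the patch above and not how cgi.py works: it searches each chunk for the delimiter directly instead of splitting on LF, and it is written for Python 3 with bytes. The function name copy_until_boundary and its parameters are hypothetical, and the sketch assumes the delimiter is a plain LF followed by "--" and the boundary string, ignoring CRLF handling and nested parts.

```python
import io

CHUNK_SIZE = 1 << 17  # 128 KiB, the same read size the patch uses


def copy_until_boundary(fp, out, boundary, chunk_size=CHUNK_SIZE):
    """Copy bytes from fp to out until LF + '--' + boundary is seen.

    Hypothetical helper for illustration only.
    Returns True if the delimiter was found, False if EOF came first.
    """
    delim = b"\n--" + boundary
    # Hold back this many bytes after each chunk, so that a delimiter
    # split across two reads is still found on the next pass.
    keep = len(delim) - 1
    pending = b""
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:
            # EOF without a delimiter: everything left is payload.
            out.write(pending)
            return False
        pending += chunk
        idx = pending.find(delim)
        if idx != -1:
            # Payload ends just before the delimiter.
            out.write(pending[:idx])
            return True
        if len(pending) > keep:
            # Flush all but the tail that might start a delimiter.
            out.write(pending[:-keep])
            pending = pending[-keep:]


if __name__ == "__main__":
    payload = b"\x00\n\n--not-a-boundary\n" * 4
    source = io.BytesIO(payload + b"\n--XYZ--\n")
    sink = io.BytesIO()
    found = copy_until_boundary(source, sink, b"XYZ")
    print(found, sink.getvalue() == payload)
```

With this approach the number of loop iterations depends on the chunk size rather than on how many LF bytes the payload happens to contain, which is the effect the report is after.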
#------------ test script --------------------
# Written for Python 2 (print statement, hotshot).
import cgi
import os
import stat
import time

import hotshot, hotshot.stats

def run():
    filename = 'body.txt'
    size = os.stat(filename)[stat.ST_SIZE]
    fp = open(filename, 'rb')
    environ = {}
    environ["CONTENT_TYPE"] = open('content_type.txt', 'rb').read()
    environ["REQUEST_METHOD"] = "POST"
    environ["CONTENT_LENGTH"] = str(size)
    fieldstorage = cgi.FieldStorage(fp, None, environ=environ)
    return fieldstorage

if 1:
    t1 = time.time()
    prof = hotshot.Profile("bug1718.prof")
    # hotshot profiler will crash with the
    # patch applied on windows xp
    #prof_results = prof.runcall(run)
    prof_results = run()
    prof.close()
    t2 = time.time()
    print t2-t1

if 0:
    for key in prof_results.keys():
        if len(prof_results[key].value) > 100:
            print key, prof_results[key].value[:80] + "..."
        else:
            print key, prof_results[key]

content_type.txt
----------------------------
multipart/form-data; boundary=----------ThIs_Is_tHe_bouNdaRY_$
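
The test script expects body.txt and content_type.txt in the working directory, but the report does not include body.txt. The following is a minimal sketch for generating comparable fixtures, written for Python 3. The helper names, the field name "upload", the file name "blob.bin", and the use of random bytes as payload are all assumptions for illustration; the submitter's actual 138 MB file is not available.

```python
import os

# Boundary string taken from content_type.txt in the report.
BOUNDARY = "----------ThIs_Is_tHe_bouNdaRY_$"


def part_header(field="upload", filename="blob.bin"):
    """Header block for a single file part (field/filename are made up)."""
    text = (
        "--%s\r\n"
        'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n" % (BOUNDARY, field, filename)
    )
    return text.encode("ascii")


def closing_delimiter():
    """Final delimiter line that ends the multipart body."""
    return ("\r\n--%s--\r\n" % BOUNDARY).encode("ascii")


def write_fixtures(directory, payload_size, chunk_size=1 << 20):
    """Write body.txt and content_type.txt into directory.

    The payload is random bytes, written in chunks so that a large
    payload_size does not have to fit in memory at once.
    """
    body_path = os.path.join(directory, "body.txt")
    ct_path = os.path.join(directory, "content_type.txt")
    with open(body_path, "wb") as body:
        body.write(part_header())
        remaining = payload_size
        while remaining > 0:
            n = min(chunk_size, remaining)
            body.write(os.urandom(n))
            remaining -= n
        body.write(closing_delimiter())
    with open(ct_path, "w") as ct:
        ct.write("multipart/form-data; boundary=" + BOUNDARY)
    return body_path, ct_path


if __name__ == "__main__":
    # 1 MiB payload as a quick smoke test; raise the size to approach
    # the file size mentioned in the report.
    print(write_fixtures(".", 1 << 20))
```

Random payload is used because it contains LF bytes at a high rate, which is the condition the report describes. The chance that random data reproduces the full boundary string is negligible for this purpose.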