On 7/16/07, John Nagle <[EMAIL PROTECTED]> wrote: > I'm reading the PhishTank XML file of active phishing sites, > at "http://data.phishtank.com/data/online-valid/" This changes > frequently, and it's big (about 10MB right now) and on a busy server. > So once in a while I get a bogus copy of the file because the file > was rewritten while being sent by the server. > > Any good way to deal with this, short of reading it twice > and comparing? > If you have: 1. Ball park estimate of the size of XML 2. Some footers or "last tags" in the XML
May be you can use the above to check the xml and catch the "bogus" ones ! cheers, -- ---- Amit Khemka website: www.onyomo.com wap-site: www.owap.in Home Page: www.cse.iitd.ernet.in/~csd00377 Endless the world's turn, endless the sun's Spinning, Endless the quest; I turn again, back to my own beginning, And here, find rest. -- http://mail.python.org/mailman/listinfo/python-list