On 7/16/07, John Nagle <[EMAIL PROTECTED]> wrote:
>     I'm reading the PhishTank XML file of active phishing sites,
> at "http://data.phishtank.com/data/online-valid/".  This changes
> frequently, and it's big (about 10MB right now) and on a busy server.
> So once in a while I get a bogus copy of the file because the file
> was rewritten while being sent by the server.
>
>     Any good way to deal with this, short of reading it twice
> and comparing?
>
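If you have:
1. A ballpark estimate of the size of the XML
2. Some footer or "last tags" in the XML (for example, the closing tag
   of the root element)

then maybe you can use these to check the XML and catch the "bogus"
copies: a file that is much smaller than expected, or that does not end
with the expected closing tag, was probably damaged during the transfer.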
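
A minimal sketch of that check, under some assumptions: the function
name looks_complete, the min_size threshold, and the closing_tag value
are all hypothetical, and you would have to pick values that match the
real file. It also tries to parse the data, on the assumption that a
copy rewritten mid-transfer is usually not well-formed XML.

```python
import xml.etree.ElementTree as ET

def looks_complete(data, min_size, closing_tag):
    """Return True if the downloaded bytes pass three sanity checks.

    data        -- the downloaded file contents, as bytes
    min_size    -- ballpark lower bound on the size (an assumption you choose)
    closing_tag -- expected last tag, as bytes (hypothetical, e.g. b"</feed>")
    """
    # 1. Size check: a copy far smaller than expected is suspect.
    if len(data) < min_size:
        return False
    # 2. Footer check: the data should end with the expected closing tag.
    if not data.rstrip().endswith(closing_tag):
        return False
    # 3. Well-formedness check: mixed or truncated content usually fails here.
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True
```

If the check fails, fetching the file again after a short delay is
probably simpler than always reading it twice and comparing. Note that
these checks can only catch copies that are short, cut off, or
malformed; a copy that happens to be well-formed but mixes two versions
would still slip through.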

cheers,

-- 
----
Amit Khemka
website: www.onyomo.com
wap-site: www.owap.in
Home Page: www.cse.iitd.ernet.in/~csd00377

Endless the world's turn, endless the sun's Spinning, Endless the quest;
I turn again, back to my own beginning, And here, find rest.