Hello there,

Depending on the firmware version of the HP printer and the model type, one will encounter a myriad of combinations of the following strings while reading the index page:

hp
HP
color
Color
Printer
Printer Status
Status:
Device:
Device Status
laserjet
LaserJet

How can I go about determining if a site is indeed the Web interface to an HP printer? The goal is to remove all HP printers from a list of publicly available Web sites. I've tried this approach, but it gets messy quickly when I attempt to account for all possible combinations that HP uses:

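import urllib2

f = urllib2.urlopen("http://%s" % host)
data = f.read()
f.close()
# Each keyword needs its own "in data" test; a bare 'hp' is always true.
if (('hp' in data or 'HP' in data) and
        ('color' in data or 'Color' in data) and
        'Printer' in data):
    pass  # DISREGARD THE IP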


I'm sure there's a more graceful way to go about this while maintaining a high degree of accuracy and as few false positives as possible. Any tips or pointers?
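
To make the idea concrete, here is a rough sketch of the direction I have in mind: fold the upper/lower-case variants into case-insensitive patterns and require both a brand match and a printer-related match. The helper name looks_like_hp_printer and the "two signals" rule are my own assumptions, not anything HP documents, and the patterns only cover the strings listed above.

```python
import re

# Hypothetical patterns built only from the strings listed in this post.
# \b keeps "hp" from matching inside words such as "php".
BRAND = re.compile(r"\bhp\b", re.IGNORECASE)
PRINTER = re.compile(r"laserjet|printer|device\s+status", re.IGNORECASE)


def looks_like_hp_printer(page_text):
    """Return True when the page mentions both the brand and a printer term.

    This is a heuristic and an assumption on my part: a page must show
    two independent signals before it is treated as an HP printer.
    """
    has_brand = BRAND.search(page_text) is not None
    has_printer_term = PRINTER.search(page_text) is not None
    return has_brand and has_printer_term
```

With something like this, the fetch loop would only need to call looks_like_hp_printer(data) and skip the host when it returns True. It would still miss pages that use wording I haven't seen yet, so I'd welcome better ideas.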

Thanks in advance!
--
http://mail.python.org/mailman/listinfo/python-list
