Just crawling is super easy; it's indexing and searching that are hard. Start at yahoo.com, scrape out all the links, and then visit every link on every page you reach. I wrote a crawler in 15 lines of code, but all it did was visit the sites, not index them or anything.
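Here is a rough sketch of that kind of crawler, in case it helps. It is not the 15-line one I mentioned, just an illustration using only the standard library. The names `LinkParser`, `extract_links`, `fetch`, and `crawl` are made up for this example, and the `max_pages` cap and the pluggable `fetch_page` argument are my own assumptions so you can try it without hitting the network. It only visits pages; it does no indexing, and a real crawler should also respect robots.txt and rate limits.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) links found in html, with fragments stripped."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        url, _fragment = urldefrag(urljoin(base_url, href))
        if url.startswith(("http://", "https://")):
            links.append(url)
    return links


def fetch(url):
    """Download a page as text (hypothetical default; swap in your own)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl; returns the URLs visited, in visit order."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to download
        visited.append(url)
        for link in extract_links(html, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Calling `crawl("http://yahoo.com/", max_pages=20)` would visit up to 20 pages starting there. Passing your own `fetch_page` (for example, a dict lookup over saved HTML) lets you experiment offline.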
You could probably write a faster one in C++, but if you're new to this, doing it in Python will let you experiment and learn faster. Some links:

http://infolab.stanford.edu/~backrub/google.html
http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html
http://www.example-code.com/python/pythonspider.asp
http://www.example-code.com/python/spider_simpleCrawler.asp