Subject: ITP: parsero -- Audit tool for robots.txt of a site

Package: wnpp
Owner: Thiago Andrade Marques <thmarq...@gmail.com>
Severity: wishlist
* Package name    : parsero
  Version         : 0.0+git20140929.e5b585a
  Upstream Author : Javier Nieto <javier.ni...@behindthefirewalls.com>
* URL             : https://github.com/behindthefirewalls/Parsero/
* License         : GPL-2+
  Programming Lang: Python3
  Description     : Audit tool for robots.txt of a site

Parsero reads the robots.txt file of a web server and looks at the
Disallow entries. The Disallow entries tell search engines which
directories or files hosted on the web server must not be indexed.
For example, "Disallow: /portal/login" means that the content at
www.example.com/portal/login is not allowed to be indexed by crawlers
like Google, Bing, or Yahoo. This is the mechanism administrators use
to keep sensitive or private information out of search engines.
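The idea can be sketched in a few lines of Python. This is only an
illustration of what parsing Disallow entries involves, not Parsero's
actual code; the function name and sample file are made up here:

```python
# Sample robots.txt content, as it might be fetched from a web server.
SAMPLE = """\
User-agent: *
Disallow: /portal/login
Disallow: /admin/
Allow: /public/
"""

def disallowed_paths(robots_txt):
    """Return the paths listed in Disallow directives of a robots.txt."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line.lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:  # a bare "Disallow:" means everything is allowed
                paths.append(path)
    return paths

print(disallowed_paths(SAMPLE))  # ['/portal/login', '/admin/']
```

Each path returned is a candidate URL an auditor might then check by
hand, since "hidden from crawlers" often correlates with "interesting
to an attacker".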