Andre Burgaud added the comment:
Hi Matele,
Thanks for looking into this issue.
I have indeed seen some implementations that were based on the Python
implementation and that had the same problems, in particular the Crystal
implementation (as far as I remember; it was a while ago). As a
Andre Burgaud added the comment:
Thanks @xtreak for providing some clarification on this behavior! I can write
some tests to cover this behavior, assuming we agree that an empty file
means "unlimited access". This was worded as such in the old internet draft
from 1996 (section
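A minimal sketch of such a test, using the existing urllib.robotparser API
(the test name, user agent, and URLs are illustrative, not from the patch):

import unittest
import urllib.robotparser


class EmptyRobotsTxtTest(unittest.TestCase):
    def test_empty_file_allows_everything(self):
        parser = urllib.robotparser.RobotFileParser()
        parser.parse([])  # an empty robots.txt: no rules at all
        # With no rules, any user agent may fetch any path.
        self.assertTrue(parser.can_fetch("AnyBot", "http://example.com/"))
        self.assertTrue(parser.can_fetch("AnyBot", "http://example.com/private/page"))


if __name__ == "__main__":
    unittest.main()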
Andre Burgaud added the comment:
During testing, I identified a related issue that is fixed by the same sort
function implemented to address the longest-match rule.
This related problem, also addressed by this change, concerns the
situation where two equivalent rules (same path for allow
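To illustrate the idea (a sketch of the selection logic, not the code from
the pull request): matching rules are sorted by path length, longest first,
and when two rules have the same path, an "allow" rule outranks a
"disallow" rule, since the draft says the least restrictive rule should be
used for equivalent matches.

def winning_rule(rules, path):
    # rules is a list of (directive, path_prefix) tuples.
    matching = [(directive, prefix) for directive, prefix in rules
                if path.startswith(prefix)]
    if not matching:
        return None  # no rule applies: access is allowed by default
    # Sort by (prefix length, directive == "allow"), descending: longest
    # match first; for equal lengths, "allow" wins over "disallow".
    matching.sort(key=lambda r: (len(r[1]), r[0] == "allow"), reverse=True)
    return matching[0]

rules = [("disallow", "/folder"), ("allow", "/folder")]
print(winning_rule(rules, "/folder/page"))  # ('allow', '/folder')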
Change by Andre Burgaud :
----------
keywords: +patch
pull_requests: +17227
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/17794
New submission from Andre Burgaud :
As per the current Robots Exclusion Protocol internet draft,
https://tools.ietf.org/html/draft-koster-rep-00#section-3.2, a robot should
apply the rules respecting the longest match.
urllib.robotparser relies on the order of the rules in the robots.txt
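To make the difference concrete, here is a small sketch (illustrative names,
not the stdlib code) contrasting first-match evaluation, which depends on
rule order, with the longest-match evaluation required by the draft:

rules = [
    ("allow", "/folder"),
    ("disallow", "/folder/page"),
]

def first_match(rules, path):
    # First matching rule in file order wins.
    for directive, prefix in rules:
        if path.startswith(prefix):
            return directive == "allow"
    return True

def longest_match(rules, path):
    # The most specific (longest) matching rule wins, regardless of order.
    matching = [r for r in rules if path.startswith(r[1])]
    if not matching:
        return True
    directive, _ = max(matching, key=lambda r: len(r[1]))
    return directive == "allow"

print(first_match(rules, "/folder/page"))    # True: "/folder" comes first
print(longest_match(rules, "/folder/page"))  # False: "/folder/page" is more specific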
Andre Burgaud added the comment:
Hi,
Is this ticket still relevant for Python 3.8?
While running some tests with an empty robots.txt file, I realized that it was
returning "ALLOWED" for any path (as per the current draft of the Robots
Exclusion Protocol:
https://tools.ietf.org/html/draft-koster-rep-00).
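The behavior can be reproduced in a few lines at the interpreter (the user
agent and URL are just examples):

>>> import urllib.robotparser
>>> rp = urllib.robotparser.RobotFileParser()
>>> rp.parse([])  # parse an empty robots.txt
>>> rp.can_fetch("MyBot", "https://example.com/some/path")
True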