On Wed, Aug 16, 2017 at 6:12 AM, sebb <seb...@gmail.com> wrote:
> I think this is caused by the proxy server which sometimes returns an
> error for a valid URL.

That's not how I read this stack trace.  Mind you, the error makes no
sense the way I do read it.

The message says:

bad URI(is not URI?): http://spamassassin.apache.org/

The cause for this message is that the uri passed to
URI::RFC3986_Parser.split doesn't respond to "to_str".  Which is a
roundabout way of saying that the uri isn't a String (or more
precisely, doesn't behave like a String in a duck-typing sense).

That uri is obtained by ASF::Site, and the value is obtained by
parsing contents in svn working directories (i.e., no network access
required - updates are done by cronjobs).  And in nearly every run, it
is a string.

What's particularly puzzling is that I don't even see recursion here,
which would indicate following a redirect or the like.  What the
message is clearly saying is that the uri isn't a string.

And oddly enough, when the uri is included in the error message (under
the covers using to_s instead of to_str), it looks right.
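
One way to find out what the value really is would be to report its class and
inspect output rather than relying on to_s.  This is only a sketch:
parse_with_diagnostics is a hypothetical helper, not something in whimsy, and
the Symbol is just one example of a non-String whose to_s looks like a URL.

```ruby
require 'uri'

# Hypothetical helper (not part of whimsy): refuse anything that is not
# String-like, and say what it actually was instead of printing only to_s.
def parse_with_diagnostics(value)
  unless value.respond_to?(:to_str)
    raise ArgumentError,
          "expected a String-like uri, got #{value.class}: #{value.inspect}"
  end
  URI.parse(value.to_str)
end

# A Symbol has to_s but not to_str, so it is reported rather than parsed.
diagnostic =
  begin
    parse_with_diagnostics(:'http://spamassassin.apache.org/')
    nil
  rescue ArgumentError => e
    e.message
  end

puts diagnostic
```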

> I'm not quite sure why a network access should be triggered by
> URI.parse().
> Whilst the code could retry (as for timeout) how can we then
> distinguish bad URIs from a temporary proxy fail?
> Ideally we don't want to retry repeatedly for invalid URIs until the
> depth is exceeded, but I don't see how to distinguish them from the
> proxy errors...
>
> On 16 August 2017 at 08:03, Ping My Box <no-re...@pingmybox.com> wrote:
>>
>> Hi there!
>> The service at whimsy.apache.org (whimsy.apache.org (https)) appears to be 
>> down from multiple locations around the world.
>>
>> The exact error is:
>>
>> Error component: response
>> Error code: Internal Server Error or equivalent bad message received: 
>> HTTP/1.1 400 site_scan 
>> ["/usr/local/rvm/rubies/ruby-2.4.1/lib/ruby/2.4.0/uri/rfc3986_parser.rb:18:in
>>  `rescue in split': bad URI(is not URI?): http://spamassassin.apache.org/ 
>> (URI::InvalidURIError)", "\tfrom 
>> /usr/local/rvm/rubies/ruby-2.4.1/lib/ruby/2.4.0/uri/rfc3986_parser.rb:15:in 
>> `split'", "\tfrom 
>> /usr/local/rvm/rubies/ruby-2.4.1/lib/ruby/2.4.0/uri/rfc3986_parser.rb:73:in 
>> `parse'", "\tfrom 
>> /usr/local/rvm/rubies/ruby-2.4.1/lib/ruby/2.4.0/uri/common.rb:231:in 
>> `parse'", "\tfrom /x1/srv/whimsy/lib/whimsy/cache.rb:104:in `fetch'", 
>> "\tfrom /x1/srv/whimsy/lib/whimsy/cache.rb:124:in `rescue in fetch'", 
>> "\tfrom /x1/srv/whimsy/lib/whimsy/cache.rb:101:in `fetch'", "\tfrom 
>> /x1/srv/whimsy/lib/whimsy/cache.rb:66:in `get'", "\tfrom site-scan.rb:28:in 
>> `parse'", "\tfrom site-scan.rb:184:in `block in <main>'", "\tfrom 
>> site-scan.rb:176:in `each'", "\tfrom site-scan.rb:176:in `<main>'"]
>> Debug output:
>> [Wed Aug 16 07:03:40 2017]: Initialising socket
>> [Wed Aug 16 07:03:40 2017]: Looking up hostname whimsy.apache.org...
>> [Wed Aug 16 07:03:40 2017]: Connecting to 54.91.229.41:443
>> [Wed Aug 16 07:03:41 2017]: Connected, sending HTTPS payload.
>> [Wed Aug 16 07:03:41 2017]: Analyzing server certificate
>> [Wed Aug 16 07:03:41 2017]: Saving certificate data
>> [Wed Aug 16 07:03:41 2017]: Reading response header from server
>> [Wed Aug 16 07:03:42 2017]: Caught exception: Internal Server Error or 
>> equivalent bad message received: HTTP/1.1 400 site_scan 
>> ["/usr/local/rvm/rubies/ruby-2.4.1/lib/ruby/2.4.0/uri/rfc3986_parser.rb:18:in
>>  `rescue in split': bad URI(is not URI?): http://spamassassin.apache.org/ 
>> (URI::InvalidURIError)", "\tfrom 
>> /usr/local/rvm/rubies/ruby-2.4.1/lib/ruby/2.4.0/uri/rfc3986_parser.rb:15:in 
>> `split'", "\tfrom 
>> /usr/local/rvm/rubies/ruby-2.4.1/lib/ruby/2.4.0/uri/rfc3986_parser.rb:73:in 
>> `parse'", "\tfrom 
>> /usr/local/rvm/rubies/ruby-2.4.1/lib/ruby/2.4.0/uri/common.rb:231:in 
>> `parse'", "\tfrom /x1/srv/whimsy/lib/whimsy/cache.rb:104:in `fetch'", 
>> "\tfrom /x1/srv/whimsy/lib/whimsy/cache.rb:124:in `rescue in fetch'", 
>> "\tfrom /x1/srv/whimsy/lib/whimsy/cache.rb:101:in `fetch'", "\tfrom 
>> /x1/srv/whimsy/lib/whimsy/cache.rb:66:in `get'", "\tfrom site-scan.rb:28:in 
>> `parse'", "\tfrom site-scan.rb:184:in `block in <main>'", "\tfrom 
>> site-scan.rb:176:in `each'", "\tfrom site-scan.rb:176:in `<main>'"]
>>
>>
>>
>>
>>
>> With regards,
>> Ping My Box - https://www.pingmybox.com/
