The IP addresses from the Google app aren't Google's. They generally belong to ISPs. What bugs me is that a fair number of these IP addresses never read my web pages, which is easy enough to see from access.log. They just look for photos. If I served ads, I would be furious. As I see it, Google is providing hot linking, pure and simple. I find it annoying. So now the app is tamed. Users can always click on "Visit page".

At one time, Google image search run from the browser would be blocked if the user clicked on the image. I have the code to stop hot linking in my conf file. But now Google does some weird thing where the image link does not point to my website, but is some conglomeration of my URL embedded in a Google URL. I assume there is a redirect scheme going on, but the bottom line is that the browser gets the full-size image without ever loading an HTML file.

I try to be as unobtrusive as possible on my website. I don't use Google Analytics. I don't serve ads. Most pages have no JavaScript, so you can use NoScript if you want. All that said, I'm probably going to set up a scheme where, if the IP hasn't read an HTML file within a given time period, I will 403 image requests. I'd like to do it without a session cookie.

I don't have an issue with the Google bot reading image files for indexing. What I want is for Google to provide links to the relevant page, not serve the image directly. I've used Google image search from time to time to judge the user experience, and it isn't good in general, other than for finding photos of famous people. Case in point: do a search on the SU-27, which is a plane recently in the news. You get a lot of SU-35s. Is this really rocket science? I assume Google has no trust in image tags. But many images have SU-35 in text, which could be read using OpenCV, as is done with OpenALPR. But I'm rambling...
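A minimal sketch of the cookie-free scheme described above, written in Python only to illustrate the logic and not as nginx configuration. Everything named here is an assumption made for illustration: the class `RecentPageGate`, the 600-second window, and the suffix lists that decide what counts as a page or an image. It keeps the time of each IP's last page request in memory and answers 403 for an image request when that IP has no page request inside the window. An exception for a verified Googlebot is left out.

```python
import time
from typing import Dict, Optional

# Assumed suffix lists; adjust to whatever the site actually serves.
PAGE_SUFFIXES = (".html", ".htm", "/")
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif", ".webp")


class RecentPageGate:
    """Allow image requests only from IPs that recently fetched a page."""

    def __init__(self, window_seconds: float = 600.0) -> None:
        self.window_seconds = window_seconds
        # Maps client IP -> timestamp of its most recent page request.
        self._last_page_hit: Dict[str, float] = {}

    def status_for(self, ip: str, path: str, now: Optional[float] = None) -> int:
        """Return the HTTP status (200 or 403) to use for this request."""
        now = time.time() if now is None else now
        lowered = path.lower()
        if lowered.endswith(PAGE_SUFFIXES):
            # A page view: remember it so later image requests are allowed.
            self._last_page_hit[ip] = now
            return 200
        if lowered.endswith(IMAGE_SUFFIXES):
            last = self._last_page_hit.get(ip)
            if last is None or now - last > self.window_seconds:
                return 403  # no recent page view from this IP
        # Images inside the window, and all other files, are served normally.
        return 200

    def prune(self, now: Optional[float] = None) -> None:
        """Drop entries older than the window so the table stays small."""
        now = time.time() if now is None else now
        stale = [ip for ip, t in self._last_page_hit.items()
                 if now - t > self.window_seconds]
        for ip in stale:
            del self._last_page_hit[ip]
```

With a gate created as `RecentPageGate(window_seconds=600)`, an image request from an IP that has not yet loaded a page gets 403, the same request shortly after a page load gets 200, and it falls back to 403 once the window has passed. One trade-off of keying on the IP alone is that users behind a shared address (NAT or a carrier proxy) are treated as one visitor.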
That user agent doesn't belong to a Google crawler. Those are end-user requests from the Google App (the mobile application). I'm not sure what the motivation is for blocking them, but I wouldn't consider it malicious or unwanted traffic.

On Thu, Jun 22, 2017 at 4:47 PM, Jeff Dyke <jeff.d...@gmail.com> wrote:
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx