On 19/05/2020 at 08:35, Peter Kovacs wrote:
What is your search string? I do not get the line saying that Google has no hits.
The string is the one in the thread in the forum:
"text lines are overwriting margins" site:forum.openoffice.org
The result page says (in French):
No result found for...
Then it says:
Results for... (without quotes)
It is followed by a list of topics from the forum but, of course, with no match
for the exact string.
I've posted a screenshot in the forum:
https://forum.openoffice.org/en/forum/viewtopic.php?f=50&t=102021&p=492807#p492807
Hagar
On 18.05.20 at 22:20, Hagar Delest wrote:
Hi Peter,
I noticed that Google provides hits nevertheless. But the first line
does say that there are no hits for the specified string.
Hagar
On 18/05/2020 at 18:48, Peter Kovacs wrote:
I am already on it. It has worked for me so far; I get search
results. Maybe it has to do with the cache.
Not sure.
On 18.05.20 at 18:22, Rory O'Farrell wrote:
On Mon, 18 May 2020 15:44:42 +0100
Rory O'Farrell <ofarr...@iol.ie> wrote:
On Tue, 12 May 2020 17:41:09 +0200
Peter Kovacs <pe...@apache.org> wrote:
Okay, I had a short debug session with Dave and Humbedooh.
We are now sure that the crawlers are not blocked. The 301 Response
comes from the fact that Yandex still defaults to http and not
https.
This post on User Forum might be relevant
https://forum.openoffice.org/en/forum/viewtopic.php?f=50&t=102021#p492756
Rory
A more detailed examination today shows that
Google search in French seems to have dropped out six days ago, in Italian
five days ago, and in English around 23rd April. Try a search for
openoffice with the site specifier.
See the above URL for details.
Rory
After I added https to the URL, everything worked fine.
Wave also did a curl request, which worked fine as well.
We have now agreed that I will pass the ball back to Google, with the
feedback that this looks like a Google-internal issue.
The robots.txt has not been changed for 11 years. Yandex can crawl the
URL and we can curl the web page. So we think it is a Google issue.
I very much appreciated the quick session. Thanks.
All the best
Peter
On 12.05.20 at 17:24, Dave Fisher wrote:
It’s not an IP Ban. Infra tells me that would not be a 301.
Ah-ha - here is the 301:
% curl -D headers http://forum.openoffice.org/
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a
href="https://forum.openoffice.org/">here</a>.</p>
</body></html>
Surprising that they cannot shift from HTTP to HTTPS via a 301!
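For anyone who wants to repeat the check above without curl, here is a minimal Python sketch. It sends one request without following redirects and reports whether the answer is nothing more than an HTTP-to-HTTPS upgrade on the same host. The helper names `is_https_upgrade` and `probe` are made up for this illustration and are not part of any tooling discussed in this thread.

```python
import http.client
from urllib.parse import urlsplit

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 307, 308}


def is_https_upgrade(requested_url, status, location):
    """Return True if the response only redirects http:// to https:// on the same host."""
    if status not in REDIRECT_CODES or not location:
        return False
    src, dst = urlsplit(requested_url), urlsplit(location)
    return (
        src.scheme == "http"
        and dst.scheme == "https"
        and src.hostname == dst.hostname
    )


def probe(url, timeout=10):
    """Fetch `url` once, without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling `probe("http://forum.openoffice.org/")` and handing the result to `is_https_upgrade` separates a harmless scheme upgrade from a redirect to a different host, such as the forums/forum mix-up asked about elsewhere in this thread.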
Regards,
Dave
On May 12, 2020, at 8:04 AM, Dave Fisher <w...@apache.org> wrote:
Information about Infra IP Bans is here:
https://infra.apache.org/infra-ban.html
Please direct the Google engineer to that resource.
Regards,
Dave
On May 12, 2020, at 7:55 AM, Dave Fisher <w...@apache.org> wrote:
Are you sure you weren’t using forums.openoffice.org instead
of forum.openoffice.org?
curl -D headers https://forum.openoffice.org/ does return the
correct page.
The robots.txt is this:
curl -D headers https://forum.openoffice.org/robots.txt
User-agent: *
Crawl-delay: 1
Disallow: /en/forum/common.php
Disallow: /en/forum/config.php
Disallow: /en/forum/con.php
Disallow: /en/forum/faq.php
Disallow: /en/forum/mcp.php
Disallow: /en/forum/memberlist.php
Disallow: /en/forum/posting.php
Disallow: /en/forum/report.php
Disallow: /en/forum/search.php
Disallow: /en/forum/style.php
Disallow: /en/forum/ucp.php
Disallow: /en/forum/viewonline.php
Disallow: /en/forum/adm
Disallow: /en/forum/cache
Disallow: /en/forum/docs
Disallow: /en/forum/files
Disallow: /en/forum/images
Disallow: /en/forum/includes
Disallow: /en/forum/language
Disallow: /en/forum/store
Disallow: /en/forum/styles
Disallow: /es/forum/common.php
Disallow: /es/forum/config.php
Disallow: /es/forum/con.php
Disallow: /es/forum/faq.php
Disallow: /es/forum/mcp.php
Disallow: /es/forum/memberlist.php
Disallow: /es/forum/posting.php
Disallow: /es/forum/report.php
Disallow: /es/forum/search.php
Disallow: /es/forum/style.php
Disallow: /es/forum/ucp.php
Disallow: /es/forum/viewonline.php
Disallow: /es/forum/adm
Disallow: /es/forum/cache
Disallow: /es/forum/docs
Disallow: /es/forum/files
Disallow: /es/forum/images
Disallow: /es/forum/includes
Disallow: /es/forum/language
Disallow: /es/forum/store
Disallow: /es/forum/styles
Disallow: /fr/forum/common.php
Disallow: /fr/forum/config.php
Disallow: /fr/forum/con.php
Disallow: /fr/forum/faq.php
Disallow: /fr/forum/mcp.php
Disallow: /fr/forum/memberlist.php
Disallow: /fr/forum/posting.php
Disallow: /fr/forum/report.php
Disallow: /fr/forum/search.php
Disallow: /fr/forum/style.php
Disallow: /fr/forum/ucp.php
Disallow: /fr/forum/viewonline.php
Disallow: /fr/forum/adm
Disallow: /fr/forum/cache
Disallow: /fr/forum/docs
Disallow: /fr/forum/files
Disallow: /fr/forum/images
Disallow: /fr/forum/includes
Disallow: /fr/forum/language
Disallow: /fr/forum/store
Disallow: /fr/forum/styles
Disallow: /fr/ci-joint
Disallow: /hu/forum/common.php
Disallow: /hu/forum/config.php
Disallow: /hu/forum/con.php
Disallow: /hu/forum/faq.php
Disallow: /hu/forum/mcp.php
Disallow: /hu/forum/memberlist.php
Disallow: /hu/forum/posting.php
Disallow: /hu/forum/report.php
Disallow: /hu/forum/search.php
Disallow: /hu/forum/style.php
Disallow: /hu/forum/ucp.php
Disallow: /hu/forum/viewonline.php
Disallow: /hu/forum/adm
Disallow: /hu/forum/cache
Disallow: /hu/forum/docs
Disallow: /hu/forum/files
Disallow: /hu/forum/images
Disallow: /hu/forum/includes
Disallow: /hu/forum/language
Disallow: /hu/forum/store
Disallow: /hu/forum/styles
Disallow: /ja/forum/common.php
Disallow: /ja/forum/config.php
Disallow: /ja/forum/con.php
Disallow: /ja/forum/faq.php
Disallow: /ja/forum/mcp.php
Disallow: /ja/forum/memberlist.php
Disallow: /ja/forum/posting.php
Disallow: /ja/forum/report.php
Disallow: /ja/forum/search.php
Disallow: /ja/forum/style.php
Disallow: /ja/forum/ucp.php
Disallow: /ja/forum/viewonline.php
Disallow: /ja/forum/adm
Disallow: /ja/forum/cache
Disallow: /ja/forum/docs
Disallow: /ja/forum/files
Disallow: /ja/forum/images
Disallow: /ja/forum/includes
Disallow: /ja/forum/language
Disallow: /ja/forum/store
Disallow: /ja/forum/styles
Disallow: /test
Disallow: /nl/forum/common.php
Disallow: /nl/forum/config.php
Disallow: /nl/forum/con.php
Disallow: /nl/forum/faq.php
Disallow: /nl/forum/mcp.php
Disallow: /nl/forum/memberlist.php
Disallow: /nl/forum/posting.php
Disallow: /nl/forum/report.php
Disallow: /nl/forum/search.php
Disallow: /nl/forum/style.php
Disallow: /nl/forum/ucp.php
Disallow: /nl/forum/viewonline.php
Disallow: /nl/forum/adm
Disallow: /nl/forum/cache
Disallow: /nl/forum/docs
Disallow: /nl/forum/files
Disallow: /nl/forum/images
Disallow: /nl/forum/includes
Disallow: /nl/forum/language
Disallow: /nl/forum/store
Disallow: /nl/forum/styles
Disallow: /vi/forum/common.php
Disallow: /vi/forum/config.php
Disallow: /vi/forum/con.php
Disallow: /vi/forum/faq.php
Disallow: /vi/forum/mcp.php
Disallow: /vi/forum/memberlist.php
Disallow: /vi/forum/posting.php
Disallow: /vi/forum/report.php
Disallow: /vi/forum/search.php
Disallow: /vi/forum/style.php
Disallow: /vi/forum/ucp.php
Disallow: /vi/forum/viewonline.php
Disallow: /vi/forum/adm
Disallow: /vi/forum/cache
Disallow: /vi/forum/docs
Disallow: /vi/forum/files
Disallow: /vi/forum/images
Disallow: /vi/forum/includes
Disallow: /vi/forum/language
Disallow: /vi/forum/store
Disallow: /vi/forum/styles
Disallow: /zh/forum/common.php
Disallow: /zh/forum/config.php
Disallow: /zh/forum/con.php
Disallow: /zh/forum/faq.php
Disallow: /zh/forum/mcp.php
Disallow: /zh/forum/memberlist.php
Disallow: /zh/forum/posting.php
Disallow: /zh/forum/report.php
Disallow: /zh/forum/search.php
Disallow: /zh/forum/style.php
Disallow: /zh/forum/ucp.php
Disallow: /zh/forum/viewonline.php
Disallow: /zh/forum/adm
Disallow: /zh/forum/cache
Disallow: /zh/forum/docs
Disallow: /zh/forum/files
Disallow: /zh/forum/images
Disallow: /zh/forum/includes
Disallow: /zh/forum/language
Disallow: /zh/forum/store
Disallow: /zh/forum/styles
This has been the robots.txt file since: Last-Modified: Sat,
06 Jun 2009 23:40:14 GMT
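To see what these rules mean for a crawler, here is a small sketch using Python's standard `urllib.robotparser`. Only a handful of the English-forum lines are copied in, so it is an excerpt and not the full file, and the standard-library matcher may differ in detail from what Googlebot itself does. The helper `allowed` is a name invented for this example.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of the robots.txt quoted above (a few English-forum rules only).
ROBOTS_EXCERPT = """\
User-agent: *
Crawl-delay: 1
Disallow: /en/forum/search.php
Disallow: /en/forum/posting.php
Disallow: /en/forum/ucp.php
Disallow: /en/forum/files
"""

BASE = "https://forum.openoffice.org"

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())


def allowed(path, agent="Googlebot"):
    """Return True if `agent` may fetch BASE + path under the excerpt's rules."""
    return parser.can_fetch(agent, BASE + path)


# A topic page is not covered by any Disallow prefix; search.php is.
topic_ok = allowed("/en/forum/viewtopic.php?f=50&t=102021")
search_ok = allowed("/en/forum/search.php")
delay = parser.crawl_delay("Googlebot")
print(topic_ok, search_ok, delay)
```

Under these rules the topic pages stay crawlable while the listed scripts and directories are excluded, so the file by itself would not explain Googlebot failing to fetch topic URLs.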
Forum search uses phpBB
We haven’t allowed search engines to crawl
forum.openoffice.org since before the Oracle donation to the ASF.
Crawlers' IP addresses might be blocked by ASF Infra if their
use is excessive. That could give the 301.
Regards,
Dave
On May 12, 2020, at 3:55 AM, Peter Kovacs <leg...@posteo.de>
wrote:
Hello all,
What I found is that the URL forum.openoffice.org is not reachable
from the Google search tool.
So I checked with DuckDuckGo (my preferred search engine); they
don't use a crawler of their own and point to the infrastructure of
Google, Bing and Yandex.
I then checked with Bing, but could not figure out how to check the
bot's feedback on a URL, so I moved on.
I checked with Yandex. They have a search URL test page, where I
entered forum.openoffice.org.
The response is:
------------------------------------------------------------------------
* Date: Tue, 12 May 2020 10:37:47 GMT
* Server: Apache/2.4.18 (Ubuntu)
* Location: https://forum.openoffice.org/
* Content-Length: 237
* Keep-Alive: timeout=15, max=100
* Connection: Keep-Alive
* Content-Type: text/html; charset=iso-8859-1
------------------------------------------------------------------------
HTTP status code 301 Moved Permanently
Server response time 133 ms
IP address 54.84.201.130
Encoding UTF-8(unicode-1-1-utf-8, UTF8)
Page size 237 B
I am not sure what that means. The HTTP status code "Moved
Permanently" reads wrong. I just don't know if this is the
return code from our web server or a response code from the
crawler.
I will try to get someone from Infra, or I'll open a ticket.
All the best
Peter
On 12.05.20 at 10:39, Matthias Seidel wrote:
Hi Kay,
On 12.05.20 at 01:21, Kay Schenk wrote:
On 5/11/20 12:33 PM, Matthias Seidel wrote:
Hi Kay,
On 11.05.20 at 21:23, Kay Schenk wrote:
Hi Peter...
Since I am a Google Search admin for www.openoffice.org and
openoffice.apache.org, I got this also. Disclaimer: I have not done
ANY work with the Google Search APIs on these sites in quite some time.
I actually was NOT aware forum.openoffice.org was set up to use Google
Search until I saw this.
I think I added it to the list when we had a discussion about outdated
information regarding SourceForge found by Google Search.
But I don't have access to forum.openoffice.org, so I could never
complete the step.
Regards,
Matthias
OK. In the top level of the website source, there is a file called
"skeleton.html" which references the following bit of code:
<!--#include virtual="/scripts/google-analytics.js" -->
I didn't dig far enough to find how "skeleton.html" is used (I
forgot), but this is an example of the google-analytics code snippet
that is used. Basically, this needs to be included in the site you
want analytics to be used on by putting it in the (header) files that
generate the site. And you might take a look at recent instructions
from Google. Things change.
https://support.google.com/analytics/answer/1008080
Yes, but this is for Google Analytics. I wouldn't want to "analyze" the
forum...
The procedure for the Google Search Console is the same: it needs access
to the root directory.
Maybe Andrea can help if he is available again?
Regards,
Matthias
Regards,
Kay
One of the Google Search admins for forum.openoffice.org could check
the current Google Search APIs that are in use on that site. Changes
are occasionally made to the calls, and maybe that is the issue, or a
robots.txt for that site is causing this. I don't think it requires a
response, but maybe some investigation.
Just some ideas...
Regards,
Kay
On 5/11/20 6:02 AM, Peter Kovacs wrote:
Hi all,
I have received the following mail, probably because I am listed on the
Google Analytics page.
Does this have any action items? What can we answer Mr John Mueller?
All the Best
Peter
-------- Forwarded Message --------
Subject: Critical issue on forum.openoffice.org and Google Search
Date: Mon, 11 May 2020 13:37:27 +0200
From: John Mueller <joh...@google.com>
To: morsei...@gmail.com, kay.sch...@gmail.com, legi...@gmail.com
Dear webmaster of forum.openoffice.org
<http://forum.openoffice.org>
I'm an analyst at Google in Switzerland. We wanted to bring your
attention to a critical issue with your website, and how it's
available for Google's web search.
In particular, Googlebot has been unable to crawl URLs from
https://forum.openoffice.org/ . This will cause those pages to drop
out of Google's search results, and will prevent new pages from being
picked up for Search. If you're not aware of this issue, you may be
accidentally blocking these pages from Google Search due to a server
issue. If you need to block Googlebot from crawling pages on your
website, we'd recommend using the robots.txt file instead.
Should you need to recognize IP addresses of Googlebot requests, you
can use a reverse IP lookup to do so:
https://support.google.com/webmasters/answer/80553
Should you have any questions, feel free to contact me directly. For
verification purposes, we are sending a copy of this message to your
site's Search Console account.
Thank you,
John Mueller (joh...@google.com)
Webmaster Trends Analyst
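The reverse IP lookup mentioned in the message above can be sketched roughly as follows: resolve the requesting IP to a hostname, check that the hostname sits under a Google crawler domain, then resolve that hostname forward and confirm it maps back to the same IP. This is an illustration only. The suffix list is an assumption based on commonly cited Googlebot hostnames, and `looks_like_googlebot` is a hypothetical helper, not something from Google or ASF Infra.

```python
import socket

# Hostname suffixes commonly cited for Googlebot; treat this list as an assumption.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip):
    """Reverse lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname):
    """Forward lookup: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def looks_like_googlebot(ip, reverse=_reverse, forward=_forward):
    """Reverse-resolve `ip`, check the domain, then confirm the name resolves back to `ip`."""
    try:
        hostname = reverse(ip)
    except OSError:
        # No PTR record or lookup failure: cannot be verified.
        return False
    if not hostname.lower().endswith(GOOGLE_SUFFIXES):
        return False
    try:
        return ip in forward(hostname)
    except OSError:
        return False
```

The lookups are passed in as parameters so the check can be exercised without DNS access; in normal use `looks_like_googlebot(ip)` is called with an address taken from the server's access log.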
---------------------------------------------------------------------
To unsubscribe, e-mail:
dev-unsubscr...@openoffice.apache.org
For additional commands, e-mail:
dev-h...@openoffice.apache.org
--
Rory O'Farrell <ofarr...@iol.ie>