On Tue, 16 Apr 2024 at 22:47, James Addison <j...@jp-hosting.net> wrote:
> Thanks Holger,
>
> On Mon, 15 Apr 2024 at 20:43, Holger Wansing <hwans...@mailbox.org> wrote:
> >
> > Hi,
> >
> > James Addison <j...@jp-hosting.net> wrote (Sun, 14 Apr 2024 23:52:03 +0100):
> > > From some testing of these: the search results have a problem that they
> > > hyperlink to a language-less .html URL, meaning that clicking a result
> > > link in the DE-language search results takes the user to an EN-language
> > > page.
> >
> > Yes, good catch.
> > However it's even worse: if you view a German page and use the Search
> > function, the search index for English is used.
>
> Wait, I'm confused.  Not on your site, though.  There you have the
> per-language search indices.
>
> And in fact, when deployed to the debian.org website, requested-language
> search (mapping of the browser language to an appropriate searchindex.*.js
> file) could be (and is already) a better user experience.  If you
> hypothetically send me a hyperlink to a policy section auf Deutsch, that's
> fine, but when I search for 'configuration' after reading that, I might want
> my browser settings to have been respected, in terms of what is searched.
>
> > I have tried to deal with this by some adaptions in the cronjob - see the
> > first two additions in my patch: change all links to search.html into
> > search.$lang.html, and rename the language-specific searchindex files into
> > searchindex.$lang.js.
> > However, that does not seem to be enough.
>
> When you say it is not enough: could I check what you mean by that?
>
> > > The _other_ hyperlinks in the static content are replaced as part of the
> > > cronjob[1] - but that doesn't work for items in the searchindex.js file.
> >
> > Why is that a problem at all (maybe you know this already):
> > since we use content negotiation at Debian website (so the pages are
> > delivered in the correct language according to user's browser setting), we
> > change the directory structure away from how Sphinx builds it by default:
> > after Sphinx' make there are separate directories for every language,
> > and they contain everything for that language, including the searchindex.js
> > file. And in that structure it works fine.
> > On Debian website we put everything in one directory, adding the language
> > code into the filename in front of the .html extension.
> > While this works fine for static content, it does not for the search
> > function here.
>
> I think this is a reasonable solution; serving the content from a single
> directory is simple and logical because the permissions and content should
> be the same; the latter only differs as a result of locale and therefore
> translation.
>
> > > Fortunately I think there might be a better way to do this.  Sphinx
> > > itself has an HTML builder option 'html_file_suffix' and I think we could
> > > use that instead to define the filenames.  That option is respected by
> > > the search JavaScript using a template variable[3] in the
> > > documentation_options.js file.
> > >
> > > We should be careful of other side-effects if making that change, but it
> > > would remove a deployment transformation step on the static content, and I
> > > think that's beneficial.
> >
> > I don't understand how that could affect our search function problem,
> > but I could give it a try.
>
> The main change that it would introduce is that the dynamic search results
> (as gathered by the JavaScript) have hyperlinks that include the build-time
> suffix in the filename.  So in the example above, you have linked me to a
> German-language dokumentation page, and when I search from that page, I find
> (based on a DE search index) and am linked to (based on DE file suffixes)
> Deutsch results; foo.de.html instead of foo.html for example.
>
> I'm in two minds about this: if my browser settings say that my locale is
> en-150 and I land on a de-DE page, what language should search be performed
> in, and what language should the results link to?
>
> An answer that I find straightforward is that if the page is de-DE -- which
> your hypothetical link to me was -- then because everything on that page
> _should_ (with sufficient translation availability) be in German, I would
> expect to search and be linked to pages accordingly.  If you'd linked to a
> language-less URL, then that would (a) have been thoughtful if you suspected
> that I did not comprehend Deutsch, and (b) also be provided in my default
> locale, with search and results taking place accordingly, and without any
> specific locale in the result hyperlinks (because the server will select a
> resource to serve).
>
> Note also: there does _not_ appear to be an equivalent to the
> 'html_file_suffix' config setting to adjust the search index filename.
>
> Regards,
> James

I don't think that I communicated clearly, which means that I should have
taken more time before adding a reply; I'd also like to apologize for the
erratic formatting of my messages.

My understanding is that we want to build a single RST project into multiple
languages in HTML format (multi-page, currently), and that each of those
per-language sites should be internally consistent.
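
To make that concrete, here is a minimal sketch of one build invocation per
language, assuming the 'html_file_suffix' option discussed earlier in this
thread is passed on the command line.  The helper name, the directory layout
and the language list are placeholders of mine, not taken from the existing
cronjob.

```python
# Hypothetical helper: builds the sphinx-build argument list for one language.
def sphinx_build_args(lang, source_dir="source", output_root="build"):
    """Return argv for building the HTML site in a single language."""
    return [
        "sphinx-build",
        "-b", "html",
        # -D overrides a conf.py setting for this invocation only.
        "-D", f"language={lang}",
        # Emit foo.de.html rather than foo.html for this language.
        "-D", f"html_file_suffix=.{lang}.html",
        source_dir,
        f"{output_root}/{lang}",  # assumed layout: one output dir per language
    ]


if __name__ == "__main__":
    for lang in ("en", "de"):  # placeholder language list
        print(" ".join(sphinx_build_args(lang)))
```

Whether the suffix alone is enough for internal consistency is exactly what I
would like to verify; the search index filename is not covered by it.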

We'd like some language-agnostic URLs to exist for those resources, and when
those are used, the webserver should select the appropriate files to serve;
currently this uses language-to-filename mapping, and that seems reasonable.
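
As an illustration only, the language-to-filename mapping could be thought of
roughly as below.  This is a simplified sketch of the idea and not how the
webserver actually implements content negotiation; the function name, the
'available' set and the fallback language are assumptions of mine.

```python
# Hypothetical sketch of language selection; not the webserver's algorithm.
def pick_variant(page, accept_language, available, default="en"):
    """Choose e.g. "index.de.html" from an Accept-Language style header."""
    prefs = []
    for item in accept_language.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without a q-value has full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Only the primary subtag is compared: "de-DE" is treated as "de".
        prefs.append((quality, tag.strip().lower().split("-")[0]))
    # Stable sort: highest quality first, ties keep the header order.
    prefs.sort(key=lambda pref: -pref[0])
    for quality, lang in prefs:
        if quality > 0 and lang in available:
            return f"{page}.{lang}.html"
    # Nothing acceptable is available: fall back to the default language.
    return f"{page}.{default}.html"
```

With available languages {"en", "de"}, a header of "de-DE,de;q=0.9,en;q=0.5"
selects index.de.html, and a header of "fr" falls back to index.en.html.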

Per-language search is important; precisely how this should function may or may
not have been specified.

Currently Debian uses some custom scripting to build its documentation using
Sphinx into multiple languages, and this includes some post-processing of the
results -- outside of standard .deb packaging -- before they reach the
webserver.
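
For readers unfamiliar with that step, the renaming part of the post-processing
described earlier in the thread could look roughly like the sketch below.  It
is not the actual cronjob; the function name and directory arguments are
hypothetical, and it deliberately leaves hyperlinks inside the files untouched,
which is where the search results problem comes from.

```python
import shutil
from pathlib import Path


# Hypothetical sketch of the flattening step; not the real cronjob.
def flatten_language_build(build_root, lang, dest):
    """Copy build_root/<lang>/ into dest/, putting <lang> into the filenames."""
    src = Path(build_root) / lang
    dest = Path(dest)
    copied = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(src)
        if path.suffix == ".html":
            # foo.html -> foo.de.html
            rel = rel.with_name(f"{path.stem}.{lang}.html")
        elif path.name == "searchindex.js":
            # searchindex.js -> searchindex.de.js
            rel = rel.with_name(f"searchindex.{lang}.js")
        # Links *inside* the copied files are not rewritten here.
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

Files that are neither HTML pages nor the search index (stylesheets, images)
are copied under their original names in this sketch.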

Sphinx itself has an open feature request for multi-language builds[2]; Debian
should not bind itself to any solution proposed there, but may be able to offer
constructive feedback.  Similarly, Sphinx dev/users may have stories about what
has worked well (or not) for them.

We would like the documentation to build using both Sphinx as packaged in
Debian stable (currently v5.3.0) and also testing (currently v7.2.6).

Tangentially, there may be some cases where existing Sphinx-built Debian
HTML documentation contains hyperlink/reference consistency problems[1].

My hope is that I will be able to attend next month's MiniDebConf, and if so
then I would like to work on trying to clarify and improve the situation here,
to the benefit of both Debian and Sphinx.

[1] - https://lists.debian.org/debian-www/2024/04/msg00041.html

[2] - https://github.com/sphinx-doc/sphinx/issues/788
