On Wed, Nov 13, 2024 at 10:08:10PM +0000, Gavin Smith wrote:
> I feel like this should be quite easy to achieve for the input
> files, at least, as we already do something similar for an input
> file with a Latin-1 name, which we place in a directory in the
> build directory ("built_input").  This is the rule to build
> "input_file_names_recoded_stamp.txt" in tp/tests/Makefile.am.
> 
> I've started work for something similar for files such as
> tp/tests/encoded/osé_utf8.texi - this will take me some time to finish
> properly.

You could probably reuse tp/maintain/copy_change_file_name_encoding.pl.
The substitution is hardcoded in that file, but substitutions stored in
a hash could perhaps be selected with a command-line option.  As run in
Makefile.am, a side effect of copy_change_file_name_encoding.pl is to
set input_file_names_recoded_stamp.txt, which is later used to
determine whether tests with 'Need recoded file names' in their
specification in list-of-tests are run.  Doing the same for encoding
file names to UTF-8 could be right, provided it fails when it should on
MS-Windows.

> For the reference test results, this does not work, as these results
> are checked in a fixed location in srcdir.
> 
> The easiest solution that comes to mind is to post-process the test
> results to escape file names.  So instead of
> 
> encoded/res_parser/non_ascii_command_line/intérnal.txt
> 
> you would have something like
> 
> encoded/res_parser/non_ascii_command_line/int\xc3\xa9rnal.txt

The unescaping would be done in tp/tests/run_parser_all.sh?
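
As a rough illustration of the escaping scheme quoted above, here is a
minimal Python sketch (not the code in run_parser_all.sh; the function
names are hypothetical).  It assumes file names are UTF-8 and that a
literal backslash is also escaped so that the mapping is reversible:

```python
import re

def escape_name(name: str) -> str:
    """Replace each non-ASCII byte of the UTF-8 name with \\xNN."""
    out = []
    for b in name.encode("utf-8"):
        if b < 0x80 and b != 0x5C:
            out.append(chr(b))          # plain ASCII is kept as-is
        else:
            out.append("\\x%02x" % b)   # non-ASCII bytes and backslash
    return "".join(out)

_ESCAPE_RE = re.compile(rb"\\x([0-9a-f]{2})")

def unescape_name(escaped: str) -> str:
    """Inverse of escape_name: turn \\xNN sequences back into bytes."""
    raw = _ESCAPE_RE.sub(
        lambda m: bytes([int(m.group(1), 16)]),
        escaped.encode("ascii"),
    )
    return raw.decode("utf-8")
```

With this, "intérnal.txt" is stored as "int\xc3\xa9rnal.txt" and
restored to the original name before the comparison.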

It might also be possible to generate test result files in
built_input/my_test/res_html/file_é.html and copy all those files
before doing the diff in run_parser_all.sh.
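
The copy step could look something like the following Python sketch.
It is only an illustration under assumptions: the function name and the
directory layout are hypothetical, and the real step would live in
run_parser_all.sh.

```python
import shutil
from pathlib import Path

def merge_built_results(built_dir, results_dir):
    """Copy generated files into results_dir before the comparison.

    Returns the copied files as paths relative to built_dir.
    """
    built = Path(built_dir)
    if not built.is_dir():
        return []                       # nothing was generated
    # dirs_exist_ok (Python 3.8+) merges into an existing directory.
    shutil.copytree(built, results_dir, dirs_exist_ok=True)
    return sorted(
        p.relative_to(built).as_posix()
        for p in built.rglob("*")
        if p.is_file()
    )
```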

> All the distributed reference test results would have ASCII file names
> and the non-ASCII file names would only be used if the tests were actually
> run, which would be disabled on MS-Windows (again, as already happens
> for some tests, I believe).

Currently, 'Need recoded file names' can appear on a list-of-tests test
command line, such that the test is skipped if there was an error
calling copy_change_file_name_encoding.pl for a Latin-1 file name.
'Need command-line unicode' is also recognized on a list-of-tests test
command line, such that the test is skipped if HOST_IS_WINDOWS_VARIABLE
is set (from configure.ac, I believe).
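
The skip logic described above amounts to roughly the following,
written as a Python sketch.  The function and its arguments are
hypothetical, and the exact way the markers are matched in
list-of-tests lines is an assumption; the real logic is in the test
scripts.

```python
def should_skip(test_line: str, recoding_ok: bool,
                host_is_windows: bool) -> bool:
    """Decide whether a list-of-tests entry should be skipped."""
    # Skipped when copying files with recoded names failed.
    if "Need recoded file names" in test_line and not recoding_ok:
        return True
    # Skipped when the host cannot pass non-ASCII command-line arguments.
    if "Need command-line unicode" in test_line and host_is_windows:
        return True
    return False
```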

-- 
Pat
