On Wed, Nov 13, 2024 at 10:08:10PM +0000, Gavin Smith wrote:
> I feel like this should be quite easy to achieve for the input
> files, at least, as we already do something similar for an input
> file with a Latin-1 name, which we place in a directory in the
> build directory ("built_input").  This is the rule to build
> "input_file_names_recoded_stamp.txt" in tp/tests/Makefile.am.
>
> I've started work for something similar for files such as
> tp/tests/encoded/osé_utf8.texi - this will take me some time to finish
> properly.

You could probably reuse tp/maintain/copy_change_file_name_encoding.pl.
The substitution is currently hardcoded in that file, but substitutions
stored in a hash could be selected with a command-line option.

As run in Makefile.am, a side effect of copy_change_file_name_encoding.pl
is to set input_file_names_recoded_stamp.txt, which is later used to
determine whether tests with 'Need recoded file names' in their
specification in list-of-tests are run.  Doing the same for encoding
file names to UTF-8 could be right, provided it fails when it should on
MS-Windows.

> For the reference test results, this does not work, as these results
> are checked in a fixed location in srcdir.
>
> The easiest solution that comes to mind is to post-process the test
> results to escape file names.  So instead of
>
> encoded/res_parser/non_ascii_command_line/intérnal.txt
>
> you would have something like
>
> encoded/res_parser/non_ascii_command_line/int\xc3\xa9rnal.txt

Would the unescaping be done in tp/tests/run_parser_all.sh?

It seems to me that it would also be possible to generate the test
result files in built_input/my_test/res_html/file_é.html and copy all
those files before doing the diff in run_parser_all.sh.

> All the distributed reference test results would have ASCII file names
> and the non-ASCII file names would only be used if the tests were actually
> run, which would be disabled on MS-Windows (again, as already happens
> for some tests, I believe).

Currently, 'Need recoded file names' is recognized on the list-of-tests
test command line, such that the test is skipped if there was an error
calling copy_change_file_name_encoding.pl for a Latin-1 file name.

'Need command-line unicode' is also recognized on the list-of-tests test
command line, such that the test is skipped if HOST_IS_WINDOWS_VARIABLE
is set (from configure.ac, I believe).

-- 
Pat
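
P.S. To make the escaping idea concrete, here is a rough sketch of the
mapping between intérnal.txt and int\xc3\xa9rnal.txt.  It is written in
Python only for brevity; nothing like it exists in the tree, the names
escape_name and unescape_name are made up, and doubling literal
backslashes is my own assumption so that the mapping can be reversed.

```python
import re


def escape_name(name: str) -> str:
    """Return an ASCII-only form of a file name.

    Each non-ASCII UTF-8 byte becomes \\xNN (lowercase hex).  A literal
    backslash is doubled (an assumption, not part of the proposal) so
    that the result can be unescaped without ambiguity.
    """
    out = []
    for byte in name.encode("utf-8"):
        if byte == 0x5C:          # literal backslash
            out.append("\\\\")
        elif byte < 0x80:         # plain ASCII, kept as-is
            out.append(chr(byte))
        else:                     # byte of a multi-byte UTF-8 sequence
            out.append("\\x%02x" % byte)
    return "".join(out)


def unescape_name(escaped: str) -> str:
    """Reverse escape_name: turn \\xNN back into bytes, decode as UTF-8."""

    def repl(match):
        token = match.group(1)
        if token == b"\\":        # doubled backslash -> single backslash
            return b"\\"
        return bytes([int(token[1:], 16)])

    raw = re.sub(rb"\\(\\|x[0-9a-f]{2})", repl, escaped.encode("ascii"))
    return raw.decode("utf-8")


if __name__ == "__main__":
    print(escape_name("int\u00e9rnal.txt"))
    print(unescape_name("int\\xc3\\xa9rnal.txt"))
```

Whether this would live in run_parser_all.sh or in a small helper script
is a separate question; the sketch only shows that the mapping can be
made reversible.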