On Sat, Oct 28, 2023 at 11:15:36AM +0200, Patrice Dumas wrote: > > At the moment, performance improvements from the new code seem > > hypothetical. > > Actually, my guess was that there would also be a performance > improvement with the XS overrides of parser > post-processing/Structuring/Transformations, but very small or > unnoticeable. The fact that there is some worsening is actually a bit > mysterious to me -- except if it is explained by the rebuild of the > tree, as one should normally easily be avoided as said in another mail. > I can work on that later, but I would like to finish at least partly the > HTML XS converter before. > > The code is very new, too, so bugs are not out of the question, > including big ones.
I managed to disable a lot of the new XS code and get the test suite to pass. I had to leave the XS translation module active due to the coupling that now exists between it and the XS parser. (My previous attempts at disabling the new code probably stopped the translation module being initialised properly, even though it was called from the XS parser. It appears that the new translation module is more functional than I had previously surmised.) Here are some typical timing reports, run on a version of the glibc manual I downloaded for testing. (I run the timing several times, and chose a typical one for each.) Current master code (commit b81328d2): $ time ../tp/texi2any.pl ../../libc/libc.texinfo creature.texi:309: warning: `.' or `,' must follow @xref, not f real 0m6.364s user 0m6.094s sys 0m0.261s With new modules disabled (patch at bottom of this email): $ time ../tp/texi2any.pl ../../libc/libc.texinfo creature.texi:309: warning: `.' or `,' must follow @xref, not f real 0m6.123s user 0m5.822s sys 0m0.273s texi2any 7.1: $ time texi2any ../../libc/libc.texinfo creature.texi:309: warning: `.' or `,' must follow @xref, not f real 0m6.019s user 0m5.716s sys 0m0.277s As you can see, my attempt at disabling the new modules reverses most of, but not all, of the slowdown. I'm still trying to find causes for the remaining slowdown. I profiled with NYTProf and think that build_document is one possibility, as it may does more than build_texinfo_tree did. For the glibc manual, it is called 2412 times (at least once per parser object). As you know, there is a new parser for every @def* command in the Texinfo sources, so per-parser overhead can be significant. I see there are also changes to index sorting, but haven't investigated them enough to understand if this would have a performance impact. It was important to be able to disable these new modules in order to see this remaining slowdown. I still argue for making it easy to cleanly disable these new modules unless or until they do not slow down the program as much. I think I've explained how I view it in several other emails - there is a large amount of new code that was added since the release, that has proven difficult to extricate the program from. Dependencies of one part of the program on others, such as the parser module on the translation module, make it harder. As time goes on, and with more development, it would become even harder if not impossible. If the promised benefits of the new development never materialised, it would mean that the post-7.1 development of texi2any was not worth pursuing. This is from my perspective of somebody who is not familiar with the new code and doesn't understand how it all works. I've spent hours trying to work this out over the last few days because I view it as a threat to the future development of the program. If the Perl object for the parse tree is built twice, this is a definite problem, and something that needs to be remedied before the new XS code can be considered to be in a finished state. However, I'm hopeful we are now close to understanding where the slowdown is occuring, so texi2any can be developed into a superior program. diff --git a/tp/Texinfo/Common.pm b/tp/Texinfo/Common.pm index 01c5c118eb..e3f554b83f 100644 --- a/tp/Texinfo/Common.pm +++ b/tp/Texinfo/Common.pm @@ -83,32 +83,32 @@ sub import { if (!$module_loaded) { if (!defined $ENV{TEXINFO_XS_PARSER} or $ENV{TEXINFO_XS_PARSER} eq '1') { - Texinfo::XSLoader::override( - "Texinfo::Common::set_document_options", - "Texinfo::StructTransf::set_document_options"); - Texinfo::XSLoader::override( - "Texinfo::Common::copy_tree", - "Texinfo::StructTransf::copy_tree"); - Texinfo::XSLoader::override( - "Texinfo::Common::relate_index_entries_to_table_items_in_tree", - "Texinfo::StructTransf::relate_index_entries_to_table_items_in_tree" - ); - Texinfo::XSLoader::override( - "Texinfo::Common::move_index_entries_after_items_in_tree", - "Texinfo::StructTransf::move_index_entries_after_items_in_tree" - ); - Texinfo::XSLoader::override( - "Texinfo::Common::protect_colon_in_tree", - "Texinfo::StructTransf::protect_colon_in_tree" - ); - Texinfo::XSLoader::override( - "Texinfo::Common::protect_comma_in_tree", - "Texinfo::StructTransf::protect_comma_in_tree" - ); - Texinfo::XSLoader::override( - "Texinfo::Common::protect_node_after_label_in_tree", - "Texinfo::StructTransf::protect_node_after_label_in_tree" - ); +# Texinfo::XSLoader::override( +# "Texinfo::Common::set_document_options", +# "Texinfo::StructTransf::set_document_options"); +# Texinfo::XSLoader::override( +# "Texinfo::Common::copy_tree", +# "Texinfo::StructTransf::copy_tree"); +# Texinfo::XSLoader::override( +# "Texinfo::Common::relate_index_entries_to_table_items_in_tree", +# "Texinfo::StructTransf::relate_index_entries_to_table_items_in_tree" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Common::move_index_entries_after_items_in_tree", +# "Texinfo::StructTransf::move_index_entries_after_items_in_tree" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Common::protect_colon_in_tree", +# "Texinfo::StructTransf::protect_colon_in_tree" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Common::protect_comma_in_tree", +# "Texinfo::StructTransf::protect_comma_in_tree" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Common::protect_node_after_label_in_tree", +# "Texinfo::StructTransf::protect_node_after_label_in_tree" +# ); } $module_loaded = 1; } diff --git a/tp/Texinfo/Convert/ConvertXS.pm b/tp/Texinfo/Convert/ConvertXS.pm index b5e620f091..b3e6b2981a 100644 --- a/tp/Texinfo/Convert/ConvertXS.pm +++ b/tp/Texinfo/Convert/ConvertXS.pm @@ -26,7 +26,7 @@ our $VERSION = '7.1'; use Texinfo::XSLoader; -BEGIN { +if (0) { Texinfo::XSLoader::init ( "Texinfo::Convert::ConvertXS", "Texinfo::Convert::ConvertXS", diff --git a/tp/Texinfo/Convert/HTML.pm b/tp/Texinfo/Convert/HTML.pm index bdd495e48d..35c30ecfc7 100644 --- a/tp/Texinfo/Convert/HTML.pm +++ b/tp/Texinfo/Convert/HTML.pm @@ -87,47 +87,45 @@ $VERSION = '7.1'; our $module_loaded = 0; sub import { if (!$module_loaded) { - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_default_format_protect_text", - "Texinfo::MiscXS::default_format_protect_text"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_entity_text", - "Texinfo::MiscXS::entity_text"); - - if (defined $ENV{TEXINFO_XS_CONVERT} - and $ENV{TEXINFO_XS_CONVERT} eq '1') { - - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_converter_initialize", - "Texinfo::Convert::ConvertXS::html_converter_initialize_sv"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_initialize_output_state", - "Texinfo::Convert::ConvertXS::html_initialize_output_state"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_sort_sortable_index_entries_by_letter", - "Texinfo::Convert::ConvertXS::sort_sortable_index_entries_by_letter"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_get_index_entries_sorted_by_letter", - "Texinfo::Convert::ConvertXS::get_index_entries_sorted_by_letter"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_prepare_conversion_units", - "Texinfo::Convert::ConvertXS::html_prepare_conversion_units"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_prepare_units_directions_files", - "Texinfo::Convert::ConvertXS::html_prepare_units_directions_files"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_prepare_output_units_global_targets", - "Texinfo::Convert::ConvertXS::html_prepare_output_units_global_targets"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_translate_names", - "Texinfo::Convert::ConvertXS::html_translate_names"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_html_convert_init", - "Texinfo::Convert::ConvertXS::html_convert_init"); - Texinfo::XSLoader::override( - "Texinfo::Convert::HTML::_XS_html_convert_convert", - "Texinfo::Convert::ConvertXS::html_convert_convert"); - } +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_default_format_protect_text", +# "Texinfo::MiscXS::default_format_protect_text"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_entity_text", +# "Texinfo::MiscXS::entity_text"); +## if (defined $ENV{TEXINFO_XS_CONVERT} +# and $ENV{TEXINFO_XS_CONVERT} eq '1') { +## Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_converter_initialize", +# "Texinfo::Convert::ConvertXS::html_converter_initialize_sv"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_initialize_output_state", +# "Texinfo::Convert::ConvertXS::html_initialize_output_state"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_sort_sortable_index_entries_by_letter", +# "Texinfo::Convert::ConvertXS::sort_sortable_index_entries_by_letter"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_get_index_entries_sorted_by_letter", +# "Texinfo::Convert::ConvertXS::get_index_entries_sorted_by_letter"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_prepare_conversion_units", +# "Texinfo::Convert::ConvertXS::html_prepare_conversion_units"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_prepare_units_directions_files", +# "Texinfo::Convert::ConvertXS::html_prepare_units_directions_files"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_prepare_output_units_global_targets", +# "Texinfo::Convert::ConvertXS::html_prepare_output_units_global_targets"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_translate_names", +# "Texinfo::Convert::ConvertXS::html_translate_names"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_html_convert_init", +# "Texinfo::Convert::ConvertXS::html_convert_init"); +# Texinfo::XSLoader::override( +# "Texinfo::Convert::HTML::_XS_html_convert_convert", +# "Texinfo::Convert::ConvertXS::html_convert_convert"); +# } $module_loaded = 1; } diff --git a/tp/Texinfo/Document.pm b/tp/Texinfo/Document.pm index 8b12698591..037e1b71c4 100644 --- a/tp/Texinfo/Document.pm +++ b/tp/Texinfo/Document.pm @@ -33,14 +33,14 @@ use Texinfo::StructTransf; our $module_loaded = 0; sub import { if (!$module_loaded) { - if (!defined $ENV{TEXINFO_XS_PARSER} - or $ENV{TEXINFO_XS_PARSER} eq '1') { - Texinfo::XSLoader::override( - "Texinfo::Document::remove_document", - # TODO add a Document XS .xs file and move to that file? - "Texinfo::StructTransf::remove_document" - ); - } + # if (!defined $ENV{TEXINFO_XS_PARSER} + # or $ENV{TEXINFO_XS_PARSER} eq '1') { + # Texinfo::XSLoader::override( + # "Texinfo::Document::remove_document", + # # TODO add a Document XS .xs file and move to that file? + # "Texinfo::StructTransf::remove_document" + # ); + # } $module_loaded = 1; } # The usual import method diff --git a/tp/Texinfo/StructTransf.pm b/tp/Texinfo/StructTransf.pm index d8749727f1..73262fd645 100644 --- a/tp/Texinfo/StructTransf.pm +++ b/tp/Texinfo/StructTransf.pm @@ -26,7 +26,8 @@ our $VERSION = '7.1'; use Texinfo::XSLoader; -BEGIN { +#BEGIN { +if (0) { Texinfo::XSLoader::init ( "Texinfo::StructTransf", "Texinfo::StructTransf", diff --git a/tp/Texinfo/Structuring.pm b/tp/Texinfo/Structuring.pm index 2b5b4f5a86..7acaf91b10 100644 --- a/tp/Texinfo/Structuring.pm +++ b/tp/Texinfo/Structuring.pm @@ -90,63 +90,63 @@ sub import { Texinfo::XSLoader::override( "Texinfo::Structuring::rebuild_document", "Texinfo::StructTransf::rebuild_document"); - Texinfo::XSLoader::override( - "Texinfo::Structuring::rebuild_tree", - "Texinfo::StructTransf::rebuild_tree"); - Texinfo::XSLoader::override( - "Texinfo::Structuring::clear_document_errors", - "Texinfo::StructTransf::clear_document_errors"); - Texinfo::XSLoader::override( - "Texinfo::Structuring::associate_internal_references", - "Texinfo::StructTransf::associate_internal_references" - ); - Texinfo::XSLoader::override( - "Texinfo::Structuring::sectioning_structure", - "Texinfo::StructTransf::sectioning_structure" - ); - Texinfo::XSLoader::override( - "Texinfo::Structuring::warn_non_empty_parts", - "Texinfo::StructTransf::warn_non_empty_parts" - ); - Texinfo::XSLoader::override( - "Texinfo::Structuring::nodes_tree", - "Texinfo::StructTransf::nodes_tree" - ); - Texinfo::XSLoader::override( - "Texinfo::Structuring::set_menus_node_directions", - "Texinfo::StructTransf::set_menus_node_directions" - ); - Texinfo::XSLoader::override( - "Texinfo::Structuring::complete_node_tree_with_menus", - "Texinfo::StructTransf::complete_node_tree_with_menus" - ); - Texinfo::XSLoader::override( - "Texinfo::Structuring::check_nodes_are_referenced", - "Texinfo::StructTransf::check_nodes_are_referenced" - ); - Texinfo::XSLoader::override( - "Texinfo::Structuring::number_floats", - "Texinfo::StructTransf::number_floats" - ); - Texinfo::XSLoader::override( - "Texinfo::Structuring::rebuild_output_units", - "Texinfo::StructTransf::rebuild_output_units"); - Texinfo::XSLoader::override( - "Texinfo::Structuring::_XS_unsplit", - "Texinfo::StructTransf::unsplit"); - # Not useful for HTML as functions, as the calling functions are - # already overriden - # Could be readded when other converters than HTML are done in C - #Texinfo::XSLoader::override( - # "Texinfo::Structuring::split_by_node", - # "Texinfo::StructTransf::split_by_node"); - #Texinfo::XSLoader::override( - # "Texinfo::Structuring::split_by_section", - # "Texinfo::StructTransf::split_by_section"); - #Texinfo::XSLoader::override( - # "Texinfo::Structuring::split_pages", - # "Texinfo::StructTransf::split_pages" - #); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::rebuild_tree", +# "Texinfo::StructTransf::rebuild_tree"); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::clear_document_errors", +# "Texinfo::StructTransf::clear_document_errors"); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::associate_internal_references", +# "Texinfo::StructTransf::associate_internal_references" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::sectioning_structure", +# "Texinfo::StructTransf::sectioning_structure" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::warn_non_empty_parts", +# "Texinfo::StructTransf::warn_non_empty_parts" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::nodes_tree", +# "Texinfo::StructTransf::nodes_tree" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::set_menus_node_directions", +# "Texinfo::StructTransf::set_menus_node_directions" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::complete_node_tree_with_menus", +# "Texinfo::StructTransf::complete_node_tree_with_menus" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::check_nodes_are_referenced", +# "Texinfo::StructTransf::check_nodes_are_referenced" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::number_floats", +# "Texinfo::StructTransf::number_floats" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::rebuild_output_units", +# "Texinfo::StructTransf::rebuild_output_units"); +# Texinfo::XSLoader::override( +# "Texinfo::Structuring::_XS_unsplit", +# "Texinfo::StructTransf::unsplit"); +# # Not useful for HTML as functions, as the calling functions are +# # already overriden +# # Could be readded when other converters than HTML are done in C +# #Texinfo::XSLoader::override( +# # "Texinfo::Structuring::split_by_node", +# # "Texinfo::StructTransf::split_by_node"); +# #Texinfo::XSLoader::override( +# # "Texinfo::Structuring::split_by_section", +# # "Texinfo::StructTransf::split_by_section"); +# #Texinfo::XSLoader::override( +# # "Texinfo::Structuring::split_pages", +# # "Texinfo::StructTransf::split_pages" +# #); } $module_loaded = 1; } diff --git a/tp/Texinfo/Transformations.pm b/tp/Texinfo/Transformations.pm index 836e1b4aae..0a98e2016b 100644 --- a/tp/Texinfo/Transformations.pm +++ b/tp/Texinfo/Transformations.pm @@ -54,41 +54,41 @@ $VERSION = '7.1'; our $module_loaded = 0; sub import { if (!$module_loaded) { - if (!defined $ENV{TEXINFO_XS_PARSER} - or $ENV{TEXINFO_XS_PARSER} eq '1') { - Texinfo::XSLoader::override( - "Texinfo::Transformations::fill_gaps_in_sectioning", - "Texinfo::StructTransf::fill_gaps_in_sectioning" - ); - Texinfo::XSLoader::override( - "Texinfo::Transformations::reference_to_arg_in_tree", - "Texinfo::StructTransf::reference_to_arg_in_tree" - ); - Texinfo::XSLoader::override( - "Texinfo::Transformations::complete_tree_nodes_menus", - "Texinfo::StructTransf::complete_tree_nodes_menus" - ); - Texinfo::XSLoader::override( - "Texinfo::Transformations::complete_tree_nodes_missing_menu", - "Texinfo::StructTransf::complete_tree_nodes_missing_menu" - ); - Texinfo::XSLoader::override( - "Texinfo::Transformations::regenerate_master_menu", - "Texinfo::StructTransf::regenerate_master_menu" - ); - Texinfo::XSLoader::override( - "Texinfo::Transformations::insert_nodes_for_sectioning_commands", - "Texinfo::StructTransf::insert_nodes_for_sectioning_commands" - ); - Texinfo::XSLoader::override( - "Texinfo::Transformations::protect_hashchar_at_line_beginning", - "Texinfo::StructTransf::protect_hashchar_at_line_beginning" - ); - Texinfo::XSLoader::override( - "Texinfo::Transformations::protect_first_parenthesis_in_targets", - "Texinfo::StructTransf::protect_first_parenthesis_in_targets" - ); - } +# if (!defined $ENV{TEXINFO_XS_PARSER} +# or $ENV{TEXINFO_XS_PARSER} eq '1') { +# Texinfo::XSLoader::override( +# "Texinfo::Transformations::fill_gaps_in_sectioning", +# "Texinfo::StructTransf::fill_gaps_in_sectioning" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Transformations::reference_to_arg_in_tree", +# "Texinfo::StructTransf::reference_to_arg_in_tree" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Transformations::complete_tree_nodes_menus", +# "Texinfo::StructTransf::complete_tree_nodes_menus" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Transformations::complete_tree_nodes_missing_menu", +# "Texinfo::StructTransf::complete_tree_nodes_missing_menu" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Transformations::regenerate_master_menu", +# "Texinfo::StructTransf::regenerate_master_menu" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Transformations::insert_nodes_for_sectioning_commands", +# "Texinfo::StructTransf::insert_nodes_for_sectioning_commands" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Transformations::protect_hashchar_at_line_beginning", +# "Texinfo::StructTransf::protect_hashchar_at_line_beginning" +# ); +# Texinfo::XSLoader::override( +# "Texinfo::Transformations::protect_first_parenthesis_in_targets", +# "Texinfo::StructTransf::protect_first_parenthesis_in_targets" +# ); +# } $module_loaded = 1; } # The usual import method diff --git a/tp/t/test_utils.pl b/tp/t/test_utils.pl index 29fdc521b9..597cb83874 100644 --- a/tp/t/test_utils.pl +++ b/tp/t/test_utils.pl @@ -1084,13 +1084,13 @@ sub test($$) # should not actually be useful, as the same element should be reused. $tree = $document->tree(); - if ($with_XS) { - foreach my $error (@{$document->{'errors'}}) { - $registrar->add_formatted_message($error); - } - Texinfo::Structuring::clear_document_errors( - $document->document_descriptor()); - } +# if ($with_XS) { +# foreach my $error (@{$document->{'errors'}}) { +# $registrar->add_formatted_message($error); +# } +# Texinfo::Structuring::clear_document_errors( +# $document->document_descriptor()); +# } my ($errors, $error_nrs) = $registrar->errors(); # FIXME maybe it would be good to compare $merged_index_entries? diff --git a/tp/texi2any.pl b/tp/texi2any.pl index 26cb16e2c0..0eaba5d07b 100755 --- a/tp/texi2any.pl +++ b/tp/texi2any.pl @@ -1634,13 +1634,13 @@ while(@input_files) { $document = Texinfo::Structuring::rebuild_document($document); - if ($with_XS) { - foreach my $error (@{$document->{'errors'}}) { - $registrar->add_formatted_message($error); - } - Texinfo::Structuring::clear_document_errors( - $document->document_descriptor()); - } +# if ($with_XS) { +# foreach my $error (@{$document->{'errors'}}) { +# $registrar->add_formatted_message($error); +# } +# Texinfo::Structuring::clear_document_errors( +# $document->document_descriptor()); +# } $error_count = handle_errors($registrar, $error_count, \@opened_files);