https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117091
--- Comment #8 from Filip Kastl <pheeck at gcc dot gnu.org> ---
I've looked into analyze_switch_statement(), find_bit_tests() and
find_jump_tables() and did some perf-ing. Some observations:
1) I don't think that the code in SNIPPET 1 is responsible for the slowness.
The comment here is correct: the body of the for loop is indeed constant time.
I think the cause of the slowness is the code in SNIPPET 2 instead (it's from
find_bit_tests()). I believe this code is O(n^3) (Maybe careful analysis would
reveal it's actually O(n^2) but that is besides the point).
2) I think that find_jump_tables() is also O(n^3) or O(n^2). Andi is calling
for dynamic programming, but I think this already uses dynamic programming (as
does find_bit_tests()).
3) I don't think there is a limit on clusters.length(). At least I didn't find
it. Since we're dealing with at least O(n^2) I agree that there should be a
limit.
I think that (at least asymptotically) there isn't much to improve. I don't
think we can find the best possible clustering in time under O(n^2). What I
would do now is add the limit for number of clusters (although I'm not sure
what to do if we hit it -- just lower the switch as a decision tree?) and
disable bit tests and jump tables for -O0.
Andi, could you point me to the recursive function you mentioned? I'm not sure
which function you mean.
P.S. I guess I should be technically using Theta instead of O but I think it is
clear what I'm trying to say.
--- SNIPPET 1 ---
auto_vec<int, m_max_case_bit_tests> dest_bbs;
for (unsigned i = start; i <= end; i++)
{
simple_cluster *sc = static_cast<simple_cluster *> (clusters[i]);
/* m_max_case_bit_tests is very small integer, thus the operation
is constant. */
if (!dest_bbs.contains (sc->m_case_bb->index))
--- ---
--- SNIPPET 2 ---
for (unsigned i = 1; i <= l; i++)
{
/* Set minimal # of clusters with i-th item to infinite. */
min.quick_push (min_cluster_item (INT_MAX, INT_MAX, INT_MAX));
for (unsigned j = 0; j < i; j++)
{
if (min[j].m_count + 1 < min[i].m_count
&& can_be_handled (clusters, j, i - 1))
min[i] = min_cluster_item (min[j].m_count + 1, j, INT_MAX);
}
gcc_checking_assert (min[i].m_count != INT_MAX);
}
--- ---