https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113419
Bug ID: 113419
Summary: SRA should replace some aggregate copies by load/store
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
For example gcc.dg/tree-ssa/pr94969.c has
int a = 0, b = 0, c = 0;
struct S {
  signed m : 7;
  signed e : 2;
};
struct S f[2] = {{0, 0}, {0, 0}};
struct S g = {0, 0};
void __attribute__((noinline))
k()
{
  for (; c <= 1; c++) {
    f[b] = g;
    f[b].e ^= 1;
  }
}
the aggregate copy f[b] = g isn't touched by SRA because it involves global
variables. For locals we'd end up with something like
<unnamed-signed:2> g$e;
<unnamed-signed:7> g$m;
struct S g;
struct S a[2];
<bb 2> :
g$m_7 = 0;
g$e_8 = 0;
MEM[(struct S *)&a + 4B].m = g$m_7;
MEM[(struct S *)&a + 4B].e = g$e_8;
so bit-precision integers. That might be good, especially if there are field
uses around.
When the global variable variant is expanded to RTL we see a simple
SImode load and a SImode store. That means we should ideally treat
aggregate copies like memcpy (&dest, &src, sizeof (dest)) and fold
them that way.
But we should let SRA have a chance at decomposing first, so this
shouldn't be done as part of general folding but instead by late SRA
for aggregate copies it didn't touch. For the gcc.dg/tree-ssa/pr94969.c
testcase this then allows GIMPLE invariant motion to hoist the load
from g; otherwise we rely on RTL PRE for this, which is prone to PR113395.