Hi Jason,
On 24 Jan 2025, at 16:51, Jason Merrill wrote:
> On 1/24/25 6:19 AM, Simon Martin wrote:
>> Hi Jason,
>>
>> On 23 Jan 2025, at 22:56, Jason Merrill wrote:
>>
>>> On 1/23/25 12:06 PM, Simon Martin wrote:
>>>> Hi Marek,
>>>>
>>>> On 23 Jan 2025, at 16:45, Marek Polacek wrote:
>>>>
>>>>> On Thu, Jan 23, 2025 at 02:40:09PM +0000, Simon Martin wrote:
>>>>>> Hi Jason,
>>>>>>
>>>>>> On 20 Jan 2025, at 22:49, Jason Merrill wrote:
>>>>>>
>>>>>>> On 1/20/25 2:11 PM, Simon Martin wrote:
>>>>>>>> Hi Jason,
>>>>>>>>
>>>>>>>> On 15 Jan 2025, at 22:42, Jason Merrill wrote:
>>>>>>>>
>>>>>>>>> On 12/12/24 2:07 PM, Simon Martin wrote:
>>>>>>>>>> We currently ICE upon the following valid (under -Wno-vla)
>>>>>>>>>> code
>>
>>>>>>>>>>
>>>>>>>>>> === cut here ===
>>>>>>>>>> void f(int c) {
>>>>>>>>>> constexpr int r = 4;
>>>>>>>>>> [&](auto) { int t[r * c]; }(0);
>>>>>>>>>> }
>>>>>>>>>> === cut here ===
>>>>>>>>>>
>>>>>>>>>> The problem is that when parsing the lambda body, and more
>>>>>>>>>> specifically
>>>>>>>>>> the multiplication, we mark the lambda as
>>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>>>>>>>>> even though the replacement of r by 4 is "undone" by the call
>>>>>>>>>> to
>>>>>>>>>> build_min_non_dep in build_x_binary_op. This makes
>>>>>>>>>> prune_lambda_captures
>>>>>>>>>> remove the proxy declaration while it should not, and we trip
>>>>>>>>>> on
>>>>>>>>>> an
>>>>>>
>>>>>>>>>> assert at instantiation time.
>>>>>>>>>
>>>>>>>>> Why does prune_lambda_captures remove the declaration if it's
>>>>>>>>> still
>>>>>>>>> used in the function body? Setting
>>>>>>>>> LAMBDA_EXPR_CAPTURE_OPTIMIZED
>>>>>>>>> just
>>>>>>>>> means "we might have optimized away some captures", the tree
>>>>>>>>> walk
>>>>>>>>> should have found the use put back by build_x_binary_op.
>>>>>>>> I think that this is due to a bug in mark_const_cap_r, that’s
>>>>>>>> been
>>>>>>>> here since the beginning, i.e. r8-7213-g1577f10a637352: it
>>>>>>>> decides
>>>>>>
>>>>>>>> NOT
>>>>>>>> to walk sub-trees when encountering a DECL_EXPR for a constant
>>>>>>>> capture
>>>>>>>> proxy (at lambda.cc:1769). I don’t understand why we
>>>>>>>> wouldn’t
>>>>>>>> want
>>>>>>>> to continue.
>>>>>>>
>>>>>>> Because that's the declaration of the capture proxy, not a use
>>>>>>> of
>>>>>>> it.
>>>>>> Indeed, thanks for clarifying.
>>>>>>
>>>>>>> Why aren't we finding the use in the declaration of t?
>>>>>> After further investigation, the problem is rather that neither
>>>>>> walk_tree_1 nor cp_walk_subtrees walk the dimensions of array
>>>>>> types,
>>>>>> so
>>>>>> we miss the uses.
>>>>>>
>>>>>>>> Removing that line fixes the PR, but breaks 3 existing tests
>>>>>>>> (lambda-capture1.C, lambda-const11.C and lambda-const11a.C,
>>>>>>>> that
>>>>>>>> all
>>>>>>>> assert that we optimise out the capture); that’s why I did
>>>>>>>> not
>>>>>>>> pursue
>>>>>>>> it in the first place.
>>>>>>>
>>>>>>> Right.
>>>>>>>
>>>>>>>> But taking another look, it might not be that big a deal that
>>>>>>>> we
>>>>>>>> don’t
>>>>>>>> optimise those out: as soon as we use -O1 or above, the
>>>>>>>> assignment
>>>>>>>> to
>>>>>>
>>>>>>>> the closure field actually disappears.
>>>>>>>
>>>>>>> Completely breaking this optimization is a big deal,
>>>>>>> particularly
>>>>>>> since it affects the layout of closure objects. We can't always
>>>>>>> optimize everything away.
>>>>>> ACK.
>>>>>>
>>>>>> The attached updated patch makes cp_walk_subtrees walk array type
>>>>>> dimensions, which fixes the initial PR without those 3
>>>>>> regressions.
>>
>>>>>>
>>>>>> Successfully tested on x86_64-pc-linux-gnu. Is it OK?
>>>>>>
>>>>>> Simon
>>>>>>
>>>>>> PS: In theory I think it’d make sense to do do this change in
>>>>>>
>>>>>> walk_tree_1 since C also supports VLAs, but doing so breaks some
>>
>>>>>> OMP
>>>>>> tests. OMP can do interesting things with array bounds (see
>>>>>> r9-5354-gda972c05f48637), and fixing those would require teaching
>>>>>> walk_tree_1 about the “omp dummy var” array bounds, which I
>>>>>> think
>>>>>> would be a bit ugly. And I’m not aware of any C case that would
>>>>>> be
>>>>>> improved/fixed by doing this change, so we’re probably fine not
>>>>>> doing
>>>>>> it.
>>>>>
>>>>>> From e19bb6c943a422b1201c5c9a2a1d4f32141baf84 Mon Sep 17
>>>>>> 00:00:00
>>>>>> 2001
>>>>>> From: Simon Martin <si...@nasilyan.com>
>>>>>> Date: Wed, 22 Jan 2025 16:19:47 +0100
>>>>>> Subject: [PATCH] c++: Don't prune constant capture proxies only
>>>>>> used
>>>>>> in array
>>>>>> dimensions [PR114292]
>>>>>>
>>>>>> We currently ICE upon the following valid (under -Wno-vla) code
>>>>>>
>>>>>> === cut here ===
>>>>>> void f(int c) {
>>>>>> constexpr int r = 4;
>>>>>> [&](auto) { int t[r * c]; }(0);
>>>>>> }
>>>>>> === cut here ===
>>>>>>
>>>>>> When parsing the lambda body, and more specifically the
>>>>>> multiplication,
>>>>>> we mark the lambda as LAMBDA_EXPR_CAPTURE_OPTIMIZED, which
>>>>>> indicates
>>>>>> to
>>>>>> prune_lambda_captures that it might be possible to optimize out
>>>>>> some
>>>>>> captures.
>>>>>>
>>>>>> The problem is that prune_lambda_captures then misses the use of
>>
>>>>>> the
>>>>>> r
>>>>>> capture (because neither walk_tree_1 nor cp_walk_subtrees walks
>>>>>> the
>>
>>>>>> dimensions of array types - here "r * c"), hence believes the
>>>>>> capture
>>>>>> can be pruned... and we trip on an assert when instantiating the
>>>>>> lambda.
>>>>>>
>>>>>> This patch changes cp_walk_subtrees so that when walking a
>>>>>> declaration
>>>>>> with an array type, it walks that array type's dimensions. Since
>>>>>> C
>>>>>> also
>>>>>> supports VLAs, I thought it'd make sense to do this in
>>>>>> walk_tree_1,
>>
>>>>>> but
>>>>>> this breaks some omp-low related test cases (because OMP can do
>>>>>> funky
>>>>>> things with array bounds when adjust_array_error_bounds is set),
>>
>>>>>> and
>>>>>> I
>>>>>> don't really want to add code to walk_tree_1 to skips arrays
>>>>>> whose
>>>>>> dimension is a temporary variable with the "omp dummy var"
>>>>>> attribute.
>>>>>>
>>>>>> Successfully tested on x86_64-pc-linux-gnu.
>>>>>>
>>>>>> PR c++/114292
>>>>>>
>>>>>> gcc/cp/ChangeLog:
>>>>>>
>>>>>> * tree.cc (cp_walk_subtrees): Walk array type dimensions.
>>>>>>
>>>>>> gcc/testsuite/ChangeLog:
>>>>>>
>>>>>> * g++.dg/cpp1y/lambda-ice4.C: New test.
>>>>>>
>>>>>> ---
>>>>>> gcc/cp/tree.cc | 11 ++++++
>>>>>> gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C | 44
>>>>>> ++++++++++++++++++++++++
>>>>>> 2 files changed, 55 insertions(+)
>>>>>> create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
>>>>>>
>>>>>> diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
>>>>>> index 36581865a17..fc9a2fbff32 100644
>>>>>> --- a/gcc/cp/tree.cc
>>>>>> +++ b/gcc/cp/tree.cc
>>>>>> @@ -5844,6 +5844,17 @@ cp_walk_subtrees (tree *tp, int
>>>>>> *walk_subtrees_p, walk_tree_fn func,
>>>>>> break;
>>>>>>
>>>>>> default:
>>>>>> + /* If this is an array, walk its dimensions. */
>>>>>> + if (DECL_P (t) && TREE_TYPE (t)
>>>>>> + && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
>>>>>> + {
>>>>>> + tree domain = TYPE_DOMAIN (TREE_TYPE (t));
>>>>>> + if (domain)
>>>>>> + {
>>>>>> + WALK_SUBTREE (TYPE_MIN_VALUE (domain));
>>>>>> + WALK_SUBTREE (TYPE_MAX_VALUE (domain));
>>>>>> + }
>>>>>> + }
>>>>>> return NULL_TREE;
>>>>>> }
>>>>>
>>>>> Is there a reason I'm missing not to put this into the TYPE_P
>>>>> block
>>>>> above?
>>>> Do you mean put it in that block as well, or instead of where I put
>>>> it?
>>>> If the latter, we don’t see any TYPE_P node for “int[r*n]”,
>>>> so
>>>> we
>>>> don’t go into that block, and continue ICE’ing without the
>>>> change
>>>> in
>>>> the “default:”.
>>>>
>>>> Anyway it’s a very good call (thanks!), because it got me to
>>>> check
>>>> what we get if we change the lambda body to just do “typedef int
>>>> MyT[c*r];”, and we still ICE. And from a quick look, doing
>>
>>>> something
>>>> similar in the TYPE_P block does not fix it.
>>>>
>>>> I’ll work something out and report back.
>>>
>>> I would think this should go in the DECL_EXPR handling, so we don't
>>> walk into the type every time we see a use of an array variable.
>> Indeed, we can do the check there, thanks a lot.
>>
>> And concerning the case of a single “typedef int MyT[r*c];”,
>> there
>> already is code in walk_tree_1 that covers such typedefs, but that
>> does
>> not walk array type dimensions.
>>
>> So the result is the attached updated patch, successfully tested on
>> x86_64-pc-linux-gnu. OK for trunk? And if so, also for branches?
>
>> * tree.cc (walk_tree_1): Walk array type dimensions, if any, for
>> TYPE_DECLs.
>
> Hmm, why isn't this covered by the existing call to walk_type_fields?
>
> Ah, I guess because that just walks into TYPE_DOMAIN, and nothing
> walks from there into the TYPE_MIN/MAX_VALUE.
>
> What if instead of touching walk_tree_1, mark_const_cap_r handled
> INTEGER_TYPE?
I don’t think we can look into TYPE_{MIN,MAX}_VALUE from
mark_const_cap_r since we might need to recurse (here we have a
MINUS_EXPR in TYPE_MAX_VALUE) and I don’t think we can from there?
> I know walk_tree used to walk into VLA bounds, which is why we have
> stabilize_vla_size to split out any relevant expressions into
> temporaries; I'm reluctant to re-add that without a lot of
> investigation into the history.
Yeah, makes sense. I learnt that touching walk_tree requires caution
through all that I broke with my few attempts at changing it :-)
> In cp_walk_subtrees, I think we also want to go through the typedef
> handling at the top, so in the DECL_EXPR handling, let's just walk
> into the type of the decl. So we walk into the ARRAY_TYPE, check
> whether it's a typedef, if not walk_type_fields walks into the
> TYPE_DOMAIN, and the mark_const_cap_r adjustment I suggested above
> checks the TYPE_MAX_VALUE from there.
I think that the typedef handling is already done in walk_tree (at least
I was unable to craft a test case where doing something extra for
typedefs in cp_walk_subtrees was required).
And following up on your suggestion, things end up pretty simple: just
walk the type of declarations in DECL_EXPRs, and explicitly handle the
min/max value of INTEGER_TYPEs in cp_walk_subtree.
At least that’s what the updated patch does, successfully tested on
x86_64-pc-linux-gnu. Ok for trunk?
Simon
From 5145b47f01718aa279220a517b06272c9467cae6 Mon Sep 17 00:00:00 2001
From: Simon Martin <si...@nasilyan.com>
Date: Wed, 22 Jan 2025 16:19:47 +0100
Subject: [PATCH] c++: Don't prune constant capture proxies only used in array
dimensions [PR114292]
We currently ICE upon the following valid (under -Wno-vla) code
=== cut here ===
void f(int c) {
constexpr int r = 4;
[&](auto) { int t[r * c]; }(0);
}
=== cut here ===
When parsing the lambda body, and more specifically the multiplication,
we mark the lambda as LAMBDA_EXPR_CAPTURE_OPTIMIZED, which indicates to
prune_lambda_captures that it might be possible to optimize out some
captures.
The problem is that prune_lambda_captures then misses the use of the r
capture (because neither walk_tree_1 nor cp_walk_subtrees walks the
dimensions of array types - here "r * c"), hence believes the capture
can be pruned... and we trip on an assert when instantiating the lambda.
This patch changes cp_walk_subtrees so that (1) when walking a
DECL_EXPR, it also walks the DECL's type, and (2) when walking an
INTEGER_TYPE, it also walks its TYPE_{MIN,MAX}_VALUE.
Successfully tested on x86_64-pc-linux-gnu.
PR c++/114292
gcc/cp/ChangeLog:
* tree.cc (cp_walk_subtrees): Walk the type of DECL_EXPR
declarations, as well as the TYPE_{MIN,MAX}_VALUE of
INTEGER_TYPEs.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/lambda-ice4.C: New test.
---
gcc/cp/tree.cc | 6 +++
gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C | 63 ++++++++++++++++++++++++
2 files changed, 69 insertions(+)
create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 36581865a17..832e3f4ba7e 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -5793,6 +5793,7 @@ cp_walk_subtrees (tree *tp, int *walk_subtrees_p,
walk_tree_fn func,
&& !TREE_STATIC (TREE_OPERAND (t, 0)))))
{
tree decl = TREE_OPERAND (t, 0);
+ WALK_SUBTREE (TREE_TYPE (decl));
WALK_SUBTREE (DECL_INITIAL (decl));
WALK_SUBTREE (DECL_SIZE (decl));
WALK_SUBTREE (DECL_SIZE_UNIT (decl));
@@ -5843,6 +5844,11 @@ cp_walk_subtrees (tree *tp, int *walk_subtrees_p,
walk_tree_fn func,
WALK_SUBTREE (STATIC_ASSERT_MESSAGE (t));
break;
+ case INTEGER_TYPE:
+ WALK_SUBTREE (TYPE_MIN_VALUE (t));
+ WALK_SUBTREE (TYPE_MAX_VALUE (t));
+ break;
+
default:
return NULL_TREE;
}
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
b/gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
new file mode 100644
index 00000000000..d8b7af9f992
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-ice4.C
@@ -0,0 +1,63 @@
+// PR c++/114292
+// { dg-do "compile" { target c++14 } }
+// { dg-additional-options "-Wno-vla" }
+
+#define ASSERT_CAPTURE_NUMBER(Lambda, NumCaptures) \
+ { \
+ auto oneCapture = [&](auto) { int t[c]; }; \
+ const auto sizeOneCapture = sizeof (oneCapture); \
+ const auto expected = NumCaptures ? NumCaptures * sizeOneCapture : 1; \
+ static_assert (sizeof (Lambda) == expected, ""); \
+ }
+
+template<int r, int c>
+struct Want_a_Typedef { typedef int Type[r*c]; };
+
+void foo (int c)
+{
+ constexpr int r = 4;
+
+ // This used to ICE.
+ auto ice_1 = [&](auto) { int t[c * r]; };
+ ice_1 (0);
+ ASSERT_CAPTURE_NUMBER (ice_1, 2);
+
+ // Another ICE identified following a great question in the patch submission
+ // mail thread.
+ auto ice_2 = [&](auto) { typedef int MyT[c*r]; };
+ ice_2 (0);
+ ASSERT_CAPTURE_NUMBER (ice_2, 2);
+
+ // All those worked already, but were not covered by any test - do it here.
+ auto ok_0 = [&](auto) { typedef int MyT[c*r]; MyT t; };
+ ok_0 (0);
+ ASSERT_CAPTURE_NUMBER (ok_0, 2);
+
+ auto ok_1 = [&](auto) { Want_a_Typedef<r, 42>::Type t; };
+ ok_1 (0);
+ ASSERT_CAPTURE_NUMBER (ok_1, 0);
+
+ auto ok_2 = [&](auto) { int t[c]; };
+ ok_2 (0);
+ ASSERT_CAPTURE_NUMBER (ok_2, 1);
+
+ auto ok_3 = [&](auto) { int n = r * c; int t[n]; };
+ ok_3 (0);
+ ASSERT_CAPTURE_NUMBER (ok_3, 2);
+
+ auto ok_4 = [&](auto) { int t[r]; };
+ ok_4 (0);
+ ASSERT_CAPTURE_NUMBER (ok_4, 0);
+
+ auto ok_5 = [&](auto) { int t[c * 4]; };
+ ok_5 (0);
+ ASSERT_CAPTURE_NUMBER (ok_5, 1);
+
+ auto ok_6 = [&](auto) { int t[1]; };
+ ok_6 (0);
+ ASSERT_CAPTURE_NUMBER (ok_6, 0);
+
+ auto ok_7 = [&](auto) { int t[c * r]; };
+ ok_7 (0);
+ ASSERT_CAPTURE_NUMBER (ok_7, 2);
+}
--
2.44.0