This is the IR I see in today’s trunk:

*** IR Dump Before Module Verifier ***
; Function Attrs: noinline optsize ssp
define i32 @_ZThn4_N1C4SeekE6_LARGE(%class.C* %this, %union._LARGE* byval align 4) unnamed_addr #2 align 2 {
entry:
  %L = alloca %union._LARGE, align 8
  %this.addr = alloca %class.C*, align 4
  %1 = bitcast %union._LARGE* %L to i8*
  %2 = bitcast %union._LARGE* %0 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %1, i8* %2, i32 8, i32 4, i1 false)
  store %class.C* %this, %class.C** %this.addr, align 4, !tbaa !2
  %this1 = load %class.C*, %class.C** %this.addr, align 4
  %3 = bitcast %class.C* %this1 to i8*
  %4 = getelementptr inbounds i8, i8* %3, i32 -4
  %5 = bitcast i8* %4 to %class.C*
  %call = tail call i32 @_ZN1C4SeekE6_LARGE(%class.C* %5, %union._LARGE* byval align 4 %L) #6
  ret i32 %call
}

> On Aug 1, 2016, at 12:55 PM, Reid Kleckner <r...@google.com> wrote:
>
> rnk added a comment.
>
> So, if clang were to use a temporary alloca for the byval parameter, then
> yes, I agree marking it as a tail call would be incorrect. However, clang
> doesn't use an alloca, it forwards the byval pointer parameter directly to
> the callee:

That would be cleaner code, which I had in mind as a follow-up optimization. I think it would also require clang to verify that wherever tail and byval are combined, there cannot be an alloca.

>
> define i32 @_ZThn4_N1C4SeekE6_LARGE(%class.C* nocapture readnone %this,
> %union._LARGE* byval nocapture readonly align 4 %L) unnamed_addr #0 align 2 {
> entry:
>   %call = tail call i32 @_ZN1C4SeekE6_LARGE(%class.C* undef, %union._LARGE*
> byval nonnull align 4 %L)
>   ret i32 %call
> }
>
> Maybe the test case is over-reduced, or the problematic IR was produced by an
> older version of clang?

You can always double check the larger test case in the PR.

>
> https://reviews.llvm.org/D22900