On Fri, Oct 21, 2011 at 03:20:54PM +0200, Andreas Krebbel wrote: > on s390 a strcat is already decomposed by fold_builtin_strcat into a > strlen and a strcpy. Due to that 3 strlenopt testcases currently > fail: strlenopt-17g.c, strlenopt-4.c, strlenopt-4g.c.
Well, fold_builtin_strcat does such kind of optimization on all targets, but only for known lengths of the second argument. > Perhaps we can remove the strcat folding relying on the strlen > optimization pass to catch all relevant cases? I think that would be the best thing to do, nuke the if (optimize_insn_for_speed_p ()) part of fold_builtin_strcat and add it instead into handle_builtin_strcat if we'd keep it as strcat after that pass. That said, your change makes sense and the nuking of that folding can be done separately. > *** gcc/tree-ssa-strlen.c.orig > --- gcc/tree-ssa-strlen.c > *************** get_string_length (strinfo si) > *** 397,403 **** > callee = gimple_call_fndecl (stmt); > gcc_assert (callee && DECL_BUILT_IN_CLASS (callee) == > BUILT_IN_NORMAL); > lhs = gimple_call_lhs (stmt); > ! gcc_assert (builtin_decl_implicit_p (BUILT_IN_STRCPY)); > /* unshare_strinfo is intentionally not called here. The (delayed) > transformation of strcpy or strcat into stpcpy is done at the place > of the former strcpy/strcat call and so can affect all the strinfos > --- 397,403 ---- > callee = gimple_call_fndecl (stmt); > gcc_assert (callee && DECL_BUILT_IN_CLASS (callee) == > BUILT_IN_NORMAL); > lhs = gimple_call_lhs (stmt); > ! gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY)); > /* unshare_strinfo is intentionally not called here. The (delayed) > transformation of strcpy or strcat into stpcpy is done at the place > of the former strcpy/strcat call and so can affect all the strinfos The above hunk is correct. > *************** handle_builtin_strcpy (enum built_in_fun > *** 1115,1127 **** > dsi->writable = true; > dsi->dont_invalidate = true; > > ! if (dsi->length == NULL_TREE) > { > /* If string length of src is unknown, use delayed length > computation. If string lenth of dst will be needed, it > can be computed by transforming this strcpy call into > stpcpy and subtracting dst from the return value. */ > ! dsi->stmt = stmt; > return; > } > > --- 1115,1144 ---- > dsi->writable = true; > dsi->dont_invalidate = true; > > ! if (builtin_decl_implicit_p (BUILT_IN_STPCPY) && dsi->length == NULL_TREE) Why this? dsi->length will be NULL only if src length is unknown. And at that point case BUILT_IN_STRCPY: case BUILT_IN_STRCPY_CHK: if (lhs != NULL_TREE || !builtin_decl_implicit_p (BUILT_IN_STPCPY)) return; break; should have already returned if it is not true. > { > + strinfo psi; > + > /* If string length of src is unknown, use delayed length > computation. If string lenth of dst will be needed, it > can be computed by transforming this strcpy call into > stpcpy and subtracting dst from the return value. */ > ! psi = get_strinfo (dsi->prev); I think you should do if (dsi->prev != 0 && verify_related_strinfos (dsi) != NULL) { psi = get_strinfo (dsi->prev); if (psi->endptr == dsi->ptr) ... } > ! strinfo npsi = unshare_strinfo (psi); No need to add a new pointer for that. Just do psi = unshare_strinfo (psi); > ! npsi->stmt = stmt; > ! npsi->next = 0; > ! npsi->length = NULL_TREE; > ! npsi->endptr = NULL_TREE; > ! npsi->dont_invalidate = true; > ! } > ! else > ! dsi->stmt = stmt; Do you really need to clear npsi->next and not set dsi->stmt if you tweak npsi->stmt? I mean, if you have: #define _GNU_SOURCE /* to make stpcpy prototype visible */ #include <string.h> size_t foo (char *p, char *q) { size_t l1, l2; char *r = strchr (p, '\0'); strcpy (r, q); l1 = strlen (p); l2 = strlen (r); /* Or with swapped order. */ return l1 + l2; } you can compute either l1, or l2, or both using get_string_length strcpy -> stpcpy transformation (and in any order). Perhaps we could avoid invalidating not just the previous one, but also all earlier ones with endptr == dsi->ptr, say if you have size_t bar (char *p, char *q) { size_t l1, l2, l3; char *r = strchr (p, '\0'); strcpy (r, "abcde"); char *s = strchr (r, '\0'); strcpy (s, q); l1 = strlen (p); l2 = strlen (r); l3 = strlen (s); return l1 + l2 + l3; } No matter what, it needs to have sufficient testsuite coverage added, using something that will not depend on the HAVE_movstr strcat folding (i.e. strchr + strcpy (or strlen + strcpy)). Jakub