On Fri, Aug 18, 2017 at 3:23 PM, Kamil Dudka <kdu...@redhat.com> wrote:
>
> We have multiple occurrences of automated crash reports:
>
> https://retrace.fedoraproject.org/faf/reports/1209278/
>
> ... about SIGSEGV in partial_quotearg_n() at this location in the source code:
>
> http://git.savannah.gnu.org/cgit/findutils.git/tree/find/ftsfind.c?h=v4.6.0#n219
>
> I guess it crashes because ent->fts_cycle->fts_pathlen is out of bound of
> ent->fts_cycle->fts_path but I do not fully understand what the function
> is supposed to do...

This function reports a directory loop (which is something for which
POSIX requires a diagnostic).

There isn't enough information in the bug reports to narrow down the
cause of the problem. However, ftsent->fts_path is managed in a
somewhat complex way in fts.c; perhaps one of the gnulib maintainers
will have an insight.

The section of code in findutils 4.5.16 we're talking about is this one:

   207  static const char*
   208  partial_quotearg_n (int n, char *s, size_t len, enum quoting_style style)
   209  {
   210    if (0 == len)
   211      {
   212        return quotearg_n_style (n, style, "");
   213      }
   214    else
   215      {
   216        char saved;
   217        const char *result;
   218
   219        saved = s[len];
   220        s[len] = 0;
   221        result = quotearg_n_style (n, style, s);
   222        s[len] = saved;
   223        return result;
   224      }
   225  }
   226
   227
   228  /* We've detected a file system loop.  This is caused by one of
   229   * two things:
   230   *
   231   * 1. Option -L is in effect and we've hit a symbolic link that
   232   *    points to an ancestor.  This is harmless.  We won't traverse the
   233   *    symbolic link.
   234   *
   235   * 2. We have hit a real cycle in the directory hierarchy.  In this
   236   *    case, we issue a diagnostic message (POSIX requires this) and we
   237   *    skip that directory entry.
   238   */
   239  static void
   240  issue_loop_warning (FTSENT * ent)
   241  {
   242    if (S_ISLNK(ent->fts_statp->st_mode))
   243      {
   244        error (0, 0,
   245               _("Symbolic link %s is part of a loop in the directory hierarchy; we have already visited the directory to which it points."),
   246               safely_quote_err_filename (0, ent->fts_path));
   247      }
   248    else
   249      {
   250        /* We have found an infinite loop.  POSIX requires us to
   251         * issue a diagnostic.  Usually we won't get to here
   252         * because when the leaf optimisation is on, it will cause
   253         * the subdirectory to be skipped.  If /a/b/c/d is a hard
   254         * link to /a/b, then the link count of /a/b/c is 2,
   255         * because the ".." entry of /a/b/c/d points to /a, not
   256         * to /a/b/c.
   257         */
   258        error (0, 0,
   259               _("File system loop detected; "
   260                 "%s is part of the same file system loop as %s."),
   261               safely_quote_err_filename (0, ent->fts_path),
   262               partial_quotearg_n (1,
   263                                   ent->fts_cycle->fts_path,
   264                                   ent->fts_cycle->fts_pathlen,
   265                                   options.err_quoting_style));
   266      }
   267  }

As you will see from the above, the loop we're diagnosing is not a
symbolic link loop, but a hard link loop.

I just tried creating a suitable test file system (with debugfs), but
the problem was diagnosed before this particular code path was hit.

gnulib folks, do you have test data which results in FTS_DC being
returned by fts_read? If not, have you tested in any other way that
ent->fts_cycle->fts_pathlen is in-bounds for this case?

We're using the options
FTS_NOSTAT|FTS_TIGHT_CYCLE_CHECK|FTS_CWDFD|FTS_VERBATIM at least. Some
other options may also be set in the crashing case; we can't tell,
since we don't know what the affected users' path name is.

Thanks,
James.
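
For illustration only: a minimal sketch of a variant that copies the
prefix rather than temporarily overwriting s[len] in place, as
partial_quotearg_n does above. The name copy_prefix is hypothetical (it
is not findutils or gnulib code), and the sketch assumes the path buffer
is itself NUL-terminated. Under that assumption strnlen() stops at the
existing terminator, so an over-large length never reads or writes past
the end of the string. It would not explain why fts_pathlen might be
wrong; it would only limit the damage.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of findutils or gnulib: return a
 * freshly allocated copy of at most LEN bytes of S.  S is never
 * modified.  Assumes S is NUL-terminated, so strnlen() cannot run
 * past the end of the string even when LEN is too large.
 * The caller frees the result.  Returns NULL if malloc() fails. */
static char *
copy_prefix (const char *s, size_t len)
{
  size_t n;
  char *copy;

  assert (s != NULL);
  n = strnlen (s, len);       /* min (len, strlen (s)) */
  copy = malloc (n + 1);
  if (copy == NULL)
    return NULL;
  memcpy (copy, s, n);
  copy[n] = '\0';
  return copy;
}
```

The copy could then be passed to quotearg_n_style() in place of s and
freed afterwards, at the cost of one allocation on a path that is only
taken when a loop is reported.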