On 21/01/2023 15:53, Pádraig Brady <p...@draigbrady.com> wrote: > > On 21/01/2023 13:05, Łukasz Sroka wrote: > > When the input files contain duplicates, then the rm fails. Because > > duplicates occur most often when the * is used and the shell unwraps > > it. > > There is a very common scenario when a user accidentally enters space > > after a filename, or enters space instead of forward slash. > > Example: > > > > rm prefix_ * > > > > The user intended to remove all files with a `prefix_` but removed all > > of the files in cwd. > > The program quits immediately when a duplicate is detected, to prevent > > pressing `y` because user expected a prompt regarding removing multiple > > files. > > The force option disables this function to enable scripts to work > > without modifying them. > > > > ``` > > diff --git a/src/rm.c b/src/rm.c > > index 354e2b0df..e4f9949f0 100644 > > --- a/src/rm.c > > +++ b/src/rm.c > > @@ -123,6 +123,16 @@ diagnose_leading_hyphen (int argc, char **argv) > > } > > } > > > > +static bool > > +find_duplicates (int n_files, char **files) > > +{ > > + for (int l = 0; l < n_files-1; l++) > > + for (int r = l+1; r < n_files; r++) > > + if (strcmp(files[l], files[r]) == 0) > > + return true; > > + return false; > > +} > > + > > void > > usage (int status) > > { > > @@ -211,6 +221,7 @@ main (int argc, char **argv) > > bool preserve_root = true; > > struct rm_options x; > > bool prompt_once = false; > > + bool force_rm = false; > > int c; > > > > initialize_main (&argc, &argv); > > @@ -238,6 +249,7 @@ main (int argc, char **argv) > > x.interactive = RMI_NEVER; > > x.ignore_missing_files = true; > > prompt_once = false; > > + force_rm = true; > > break; > > > > case 'i': > > @@ -352,6 +364,17 @@ main (int argc, char **argv) > > uintmax_t n_files = argc - optind; > > char **file = argv + optind; > > > > + if (!force_rm && find_duplicates(n_files, file)) > > + { > > + /* Because usually when the input files are duplicated it means > > that the user > > + sumbitted both a directory and an * as separate arguments, > > probably by accident */ > > + fprintf (stderr, > > + "%s: input contains duplicates, most likely you've put " > > + "both * and a file from the same directory.\n", > > + program_name); > > + return EXIT_FAILURE; > > + } > > + > > if (prompt_once && (x.recursive || 3 < n_files)) > > { > > fprintf (stderr, > > ``` > > An interesting proposal. > The main protection would be for `dir/ *` rather than `file_prefix_ *`. > The former would be unusual for a user to type, while the latter more usual, > but wouldn't trigger the protection AFAICS. > This ads O(N^2) on each interaction, so if it was to be included probably > only enabled with --interactive. > > cheers, > Pádraig
Yeah, true. Implemented it quickly and focused on the `dir/ *` scanerio at first, because that has happened to me -.- It was because I tried to do something quickly while editing command from backsearch and ended up with `rm tmp/ *` instead of `rm tmp/*`. To make the prefix_ protection it won't be possible without checking if a "prefix" exits on disk (eg. file and file1), which would be more time complex. I don't like the --interactive approach, as when you're trying to do something quickly, you probably won't opt for doing it interactively and you're probably either disabled or ignoring shell warnings. The more I think about this problem, the more I tilt towards implementing it in shell. If you'd've written "rm prefix_ *" and had 10k files there, it would be so much easier to detect it on the shell side than checking every filename with the other and triggering syscalls for every one... Either way if you have other solutions in mind, please update me, as it wouldn't require to configure shell in dockers or remote machines, thus making the solution more available.