Denis Prost writes:
> Attached are 4 log files:
> * one from "recoll -t -q gazette" (155 results)
> * one from recollrunner with the same query (only "default query
>   language" checked in recollrunner config) (3 results: only the
>   ones among the 155 which do not contain spaces in their paths)
> * one from "recoll -t -f -q gazette" (46 results)
> * one from recollrunner with the same query ("default query language"
>   checked and "match filenames" checked in recollrunner config) (0
>   results)
>
> I hope it will help solve this issue.
> Regards
> Denis

Thanks a lot for the log files. My comments are below.

First:

> :4:../rcldb/rcldb.cpp:1525:Rcl::Db::filenameWildExp: pattern: [*gazette*]

My guess is that this is from the 3rd query (recoll -t -f -q gazette).
The "-q", which would specify a "query language" query, is ignored
(because of how the options are parsed), so this is a filename query in
which gazette is transformed to *gazette* because it is neither
capitalized nor contains wildcards. It is supposed to return all
documents with [gazette] as part of their file name.

Second:

> :4:../rcldb/searchdata.cpp:782:StringToXapianQ:: query string: [gazette]

This is from [recoll -t -q gazette], which is a regular text search
query, returning all documents with gazette or a derivative ([gazettes])
in the contents, or possibly in the file name field processed as text.

Third:

> :4:../rcldb/searchdata.cpp:782:StringToXapianQ:: query string: ['gazette']

This is probably from recollrunner with only "default query language"
checked. There is excessive quoting, but it doesn't hurt much because
this is a full text search and the quotes get eliminated. I don't know
why recollrunner returns so few results, but since you mention that
these are only the ones without spaces in the file name, I'd suspect a
problem parsing the output from recoll.

Fourth:

> :4:../rcldb/rcldb.cpp:1525:Rcl::Db::filenameWildExp: pattern: [*'gazette'*]

This is with recollrunner, with both "match filenames" and "default
query language" checked. "Match filenames" takes precedence, and the
query fails because of the excessive quoting: the single quotes end up
inside the file name pattern.

The only thing that I find strange in the logs is that the 3rd one seems
to indicate that the query actually returns more results than the 1st
one, when I would have thought that they would be identical.
But the quoting may have affected the query. The actual Xapian query is
truncated in the log for some reason, so we can't be sure:

:4:../rcldb/rclquery.cpp:237:Query::SetQuery: Q: ((gazette:(wqf=11) OR gazettes OR gazet:4:../rcldb/rclquery.cpp:344:Fetching for first 50, count 50

So I think that the first fixes should be for recollrunner to:

- Avoid the excessive single-quote quoting.
- Indicate somehow that "query language" and "file name search" are
  different and mutually exclusive modes.
- Try to parse the query output better when there are spaces in the
  file names.

And then we may get into possible Recoll issues. I'd be quite
interested, though, in the logs from the following 2 commands:

  recoll -t -q gazette
  recoll -t -q "'gazette'"

Cheers,
Jf
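
PS: to make the three suggestions above a bit more concrete, here is a
minimal Python sketch. It is not recollrunner's or Recoll's actual code.
All function names are hypothetical, the wildcard set "*?[" is an
assumption, and so is the result line layout (the document URL is
assumed to sit inside the first [...] pair of each output line). The
argument lists simply mirror the two command lines quoted in this
thread.

```python
import re

# ASSUMPTION: characters treated as wildcards in a file name pattern.
WILDCARDS = "*?["

# ASSUMPTION: a result line carries the document URL inside the first
# [...] pair, e.g. "text/plain\t[file:///a/b c.txt]\t[title]\t...".
URL_RE = re.compile(r"\[(file://[^\]]*)\]")


def filename_pattern(term):
    """Mimic the rule described in this message: wrap the term in '*'
    unless it is capitalized or already contains a wildcard."""
    if term[:1].isupper() or any(c in WILDCARDS for c in term):
        return term
    return "*" + term + "*"


def build_argv(term, match_filenames):
    """Build the argument list for recoll without adding any quotes.

    Handing a list to subprocess.run() bypasses the shell, so the term
    needs no quoting even if it contains spaces. The two modes are kept
    exclusive: exactly one of the two command lines is produced.
    """
    if match_filenames:
        return ["recoll", "-t", "-f", "-q", term]
    return ["recoll", "-t", "-q", term]


def extract_url(line):
    """Return the bracketed URL of a result line, or None if absent.

    Matching the brackets instead of splitting on whitespace keeps
    paths with spaces in one piece. A path containing ']' would still
    be cut short by this simple pattern.
    """
    match = URL_RE.search(line)
    return match.group(1) if match else None
```

With this rule, filename_pattern("'gazette'") gives *'gazette'*, which
is the pattern seen in the fourth log excerpt, whereas the unquoted term
gives *gazette* as in the first one.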