On Tue, Nov 18, 2014 at 8:03 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Peter Geoghegan <p...@heroku.com> writes:
>> On Tue, Nov 18, 2014 at 3:29 PM, Robert Haas <robertmh...@gmail.com> wrote:
>>> On Mon, Nov 17, 2014 at 3:04 PM, Peter Geoghegan <p...@heroku.com> wrote:
>>>> postgres=# select qty from orderlines ;
>>>> ERROR: 42703: column "qty" does not exist
>>>> HINT: Perhaps you meant to reference the column "orderlines"."quantity".
>
>>> I don't buy this example, because it would give you the same hint if
>>> you told it you wanted to access a column called ant, or uay, or tit.
>>> And that's clearly ridiculous. The reason why quantity looks like a
>>> reasonable suggestion for qty is because it's a conventional
>>> abbreviation, but an extremely high percentage of comparable cases
>>> won't be.
>
>> I maintain that omission of part of the correct spelling should be
>> weighed less.
>
> I would say that omission of the first letter should completely disqualify
> suggestions based on this heuristic; but it might make sense to weight
> omissions less after the first letter.

I think we would be well-advised not to start inventing our own
approximate matching algorithm. Peter's suggestion boils down to a
guess that the default cost parameters for Levenshtein suck, and your
suggestion boils down to a guess that we can fix the problems with
Peter's suggestion by bolting another heuristic on top of it - and
possibly running Levenshtein twice with different sets of cost
parameters. Ugh.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
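
For anyone who wants to check the numbers behind the ant/uay/tit objection, here is a minimal Python sketch of Levenshtein distance with per-operation costs. It is only an illustration: it is not the code from the patch under discussion and not PostgreSQL's C implementation. The name weighted_levenshtein is made up for this example; the argument order (source, target, insertion cost, deletion cost, substitution cost) is chosen to mirror the levenshtein() function in the fuzzystrmatch module, and the 1/3/3 cost set is an arbitrary assumption standing in for "weigh omissions less".

```python
def weighted_levenshtein(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Cheapest cost of editing source into target (hypothetical helper)."""
    # prev[j] holds the cost of turning the source prefix seen so far
    # into target[:j]; for the empty prefix that is j insertions.
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s_ch in enumerate(source, start=1):
        # Turning source[:i] into the empty string takes i deletions.
        cur = [i * del_cost]
        for j, t_ch in enumerate(target, start=1):
            cur.append(min(
                prev[j] + del_cost,     # delete s_ch from the source
                cur[j - 1] + ins_cost,  # insert t_ch (a letter the user omitted)
                prev[j - 1] + (0 if s_ch == t_ch else sub_cost),  # match or substitute
            ))
        prev = cur
    return prev[-1]


candidates = ["qty", "ant", "uay", "tit"]

# Default costs (1, 1, 1).
default_scores = {c: weighted_levenshtein(c, "quantity") for c in candidates}

# Assumed alternative: omitted letters (insertions) cost less than the rest.
cheap_omission_scores = {
    c: weighted_levenshtein(c, "quantity", ins_cost=1, del_cost=3, sub_cost=3)
    for c in candidates
}

print(default_scores)
print(cheap_omission_scores)
```

All four candidates come out at a distance of 5 from "quantity" under both cost sets, because each one is a subsequence of "quantity" and needs exactly five insertions and nothing else. Changing the cost parameters by itself therefore cannot rank qty ahead of ant, uay, or tit; separating them takes an extra rule such as the first-letter test suggested upthread.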