On Wed, Dec 3, 2014 at 9:14 AM, pinker [via PostgreSQL] < ml-node+s1045698n5829111...@n5.nabble.com> wrote:
> There is nice extension in postgres: fuzzystrmatch > <http://www.postgresql.org/docs/9.3/static/fuzzystrmatch.html> I have > used to calculate the distance. From documetation: > > SELECT levenshtein_less_equal('extensive', 'exhaustive',2); > > You can use it then with your group by query. > > Something like this - replace the substring(...) comparison with legenshtein_less_equal(...) or whatever comparison you find applicable. In the case below new groups are started whenever the first letter of the value changes. The first group would be NULL so I add a COALESCE() call to make it 0 - subsequent groups start with 1 and increment properly. WITH src (val) AS ( VALUES ('A1'::varchar),('A2'),('B1'),('B2'),('B3'),('C1'),('D1') ) , grp AS ( SELECT val , CASE WHEN substring(val,1,1) <> substring(lag(val) OVER (ORDER BY val),1,1) THEN 1 ELSE NULL END AS changed , ROW_NUMBER() OVER (ORDER BY val) AS val_idx FROM src ) SELECT val, COALESCE(sum(changed) OVER (ORDER BY val_idx), 0) AS group_id FROM grp ; David J. -- View this message in context: http://postgresql.nabble.com/Merge-rows-based-on-Levenshtein-distance-tp5828841p5829143.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.