On Oct 29, 2021, at 9:46 AM, Eric Lease Morgan <emor...@nd.edu> wrote:
> I have a database with the following structure: > > create table bibliographics ( > identifier TEXT PRIMARY KEY, > title TEXT > ); > > create table keywords ( > identifier TEXT, > keyword TEXT > ); > > create table entities ( > identifier TEXT, > entity TEXT > ); Thank you for all the useful replies! After plenty o' reading and plenty o' experimentation, I've decided the use of DISTINCT is the quickest and most immediately practical way for me to remove duplicates, like this: SELECT b.identifier, b.title, GROUP_CONCAT( DISTINCT( e.entity ) ) AS entities, GROUP_CONCAT( DISTINCT( k.keyword ) ) AS keywords FROM bibliographics AS b LEFT JOIN entities AS e ON e.identifier = b.identifier LEFT JOIN keywords AS k ON k.identifier = e.identifier GROUP BY b.identifier; The result is not beautiful but more than functional. For example, here are a few results from a few different records: identifier = author-homer-gutenberg title = The Works of Homer entities = achilles,apollo,hector,ilius,jove,menelaus,minerva,son of atreus,troy,ulysses keywords = achaeans,achilles,hector,jove,menelaus,telemachus,trojans,ulysses identifier = title-AustenSense_1811-gutenberg title = Sense and Sensibility by Jane Austen entities = barton,cleveland,dashwood,dear ma'am,dear, dear elinor,delaford,devonshire,edward,jennings,john willoughby,london,lucy,marianne,marlborough,middletons,no, elinor,norland,now, edward,selfish,willoughby keywords = dashwood,edward,elinor,jennings,lucy,marianne,palmer,willoughby identifier = crawl-planet20210207-code4lib title = A crawl of Planet Code4Lib (October 7, 2021) entities = detroit,donald trump,function,google,leaflet,madoff,rails autoscale,tether,the readme project,topics,trump,twitter,u.s.,us,windsor keywords = data,digital,google,university While I know the use of a sub SELECT would produce more expressive results, I could never get it to work. Too complicated; the use of DISTINCT is more elegant. Fun with SQL. -- Eric Morgan