Re: [CODE4LIB] sql join query

Eric Lease Morgan Sat, 30 Oct 2021 07:50:18 -0700

On Oct 29, 2021, at 9:46 AM, Eric Lease Morgan <emor...@nd.edu> wrote:


> I have a database with the following structure:
> 
>  create table bibliographics (
>      identifier  TEXT PRIMARY KEY,
>      title       TEXT
>  );
> 
>  create table keywords (
>      identifier TEXT,
>      keyword    TEXT
>  );
> 
>  create table entities (
>      identifier TEXT,
>      entity     TEXT
>  );


Thank you for all the useful replies! After plenty o' reading and plenty o' 
experimentation, I've decided the use of DISTINCT is the quickest and most 
immediately practical way for me to remove duplicates, like this:

  SELECT b.identifier,
         b.title,
         GROUP_CONCAT( DISTINCT( e.entity  ) ) AS entities,
         GROUP_CONCAT( DISTINCT( k.keyword ) ) AS keywords
   FROM bibliographics AS b
   LEFT JOIN entities  AS e ON e.identifier = b.identifier
   LEFT JOIN keywords  AS k ON k.identifier = e.identifier
   GROUP BY b.identifier;


The result is not beautiful but more than functional. For example, here are a 
few results from a few different records:

  identifier = author-homer-gutenberg
       title = The Works of Homer
    entities = achilles,apollo,hector,ilius,jove,menelaus,minerva,son of 
atreus,troy,ulysses
    keywords = achaeans,achilles,hector,jove,menelaus,telemachus,trojans,ulysses

 identifier = title-AustenSense_1811-gutenberg
       title = Sense and Sensibility by Jane Austen
    entities = barton,cleveland,dashwood,dear ma'am,dear, dear 
elinor,delaford,devonshire,edward,jennings,john 
willoughby,london,lucy,marianne,marlborough,middletons,no, elinor,norland,now, 
edward,selfish,willoughby
    keywords = dashwood,edward,elinor,jennings,lucy,marianne,palmer,willoughby

 identifier = crawl-planet20210207-code4lib
       title = A crawl of Planet Code4Lib (October 7, 2021)
    entities = detroit,donald trump,function,google,leaflet,madoff,rails 
autoscale,tether,the readme project,topics,trump,twitter,u.s.,us,windsor
    keywords = data,digital,google,university


While I know the use of a sub SELECT would produce more expressive results, I 
could never get it to work. Too complicated; the use of DISTINCT is more 
elegant.

Fun with SQL.

--
Eric Morgan

Re: [CODE4LIB] sql join query

Reply via email to