Hi all, I'm resending this now that I'm subscribed. Any advice would be much appreciated! Thank you,
---------- Forwarded message ----------
From: Jennifer Wilson <[email protected]>
Date: Tue, May 23, 2017 at 6:13 PM
Subject: Help with the best approach for using the query-UMLS interface
To: [email protected]

Hello UMLS similarity team,

I am trying to compute the similarity between ~30K disease/phenotype terms. Ideally, I would end up with a similarity matrix for these terms. My first attempt was to write a Python script that calls the query-umls-similarity-webinterface.pl script. Before running the script on my full dataset, though, I tried to recreate the scores in Table 1 of this paper (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2815481/). Here's the command I am using, followed by its output:

./query-umls-similarity-webinterface.pl --sab MSH --rel PAR/CHD "Abortion" "Miscarriage"

Default Settings:
  --default http://atlas.ahc.umn.edu/
  --measure path
User Settings:
  --rel PAR/CHD
(-1.0, 'Abortion', 'Miscarriage')

I also have not processed the text in my dataset much. I have basically pulled diseases and phenotypes from DisGeNET, OMIM, PheWAS, and the GWAS Catalog. If I'm using data from all of these sources, do you recommend sending the terms directly to the query interface? Should I try to map them to CUIs first? (examples below)

Before I download the database and attempt to query it (it's not a language that I use in my current work), I just wanted an outside perspective on whether there are best practices for using this data.

Thank you in advance for your time!

(examples)
Here are two more examples showing the disease descriptions in my dataset. Is the UMLS interface robust to these various formats, or do the terms need to be an exact match?

./query-umls-similarity-webinterface.pl --sab MSH --rel PAR/CHD "Testicular Neoplasms" "Amelogenesis imperfecta local hypoplastic form"

Default Settings:
  --default http://atlas.ahc.umn.edu/
  --measure path
User Settings:
  --rel PAR/CHD
(-1.0, 'Testicular Neoplasms', 'Amelogenesis imperfecta local hypoplastic form')

./query-umls-similarity-webinterface.pl --sab MSH --rel PAR/CHD "Hypotrichosis 2, 146520 (3)" "PERIODONTITIS, LOCALIZED AGGRESSIVE"

Default Settings:
  --default http://atlas.ahc.umn.edu/
  --measure path
User Settings:
  --rel PAR/CHD
(-1.0, 'Hypotrichosis 2, 146520 (3)', 'PERIODONTITIS, LOCALIZED AGGRESSIVE')

--
Jennifer L. Wilson
Bioengineering, Stanford University
[email protected] / 703.969.3318
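
P.S. To make the matrix question concrete, here is a minimal sketch of the kind of wrapper I have in mind. The names `pair_count`, `similarity_matrix`, and `score_fn` are hypothetical and not part of UMLS::Similarity; `score_fn` stands in for whatever calls the query script and extracts the score. The sketch assumes the measure is symmetric (so only one triangle is queried) and that a score of -1.0 means no usable score came back; both are assumptions on my part.

```python
import math
from itertools import combinations


def pair_count(n):
    """Number of unordered term pairs that would need a query."""
    return n * (n - 1) // 2


def similarity_matrix(terms, score_fn, missing=-1.0):
    """Build an n x n matrix by scoring each unordered pair once.

    score_fn(term_a, term_b) is a hypothetical callable that returns a
    float. Scores equal to `missing` are stored as NaN (assumption:
    -1.0 signals that no score was available). The diagonal is left as
    NaN because self-similarity is never queried here.
    """
    n = len(terms)
    matrix = [[math.nan] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        score = score_fn(terms[i], terms[j])
        if score == missing:
            score = math.nan
        # Assumed symmetric measure: mirror the score across the diagonal.
        matrix[i][j] = score
        matrix[j][i] = score
    return matrix
```

With ~30K terms, `pair_count(30000)` comes to 449,985,000 pairs, which is part of why I am asking whether the web interface is the right tool or whether a local install is the better route.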
