Hi, I have some questions about indexing and configuration.
I have tried running UMLS::Similarity with the following configuration:

SAB :: include MSH, RXNORM, ICD9CM, NCI, SNOMEDCT_US
REL :: include PAR, CHD

I was running the indexing on a fairly powerful machine (16-core CPU with 64 GB of RAM). I let the indexing run for a week; it occupied more than 500 GB and was still running. This is understandable, considering that I have added a large number of sources and that the graph size would grow exponentially. From previous threads I understand that SNOMEDCT alone takes about a day. I can certainly afford to let it run longer than that if it means I can include more sources.

I would like better coverage of concepts, which is why I want to add more sources. What is the best compromise for including more sources while keeping the indexing time reasonable?

Also, what is the exact configuration used for the following paper?

Pedersen, T., Pakhomov, S. V. S., Patwardhan, S., & Chute, C. G. (2007). Measures of semantic similarity and relatedness in the biomedical domain.

Your input would be very helpful.

Chaitanya
