Chaitanya and I are getting something much larger, and I think the indexing has now been running for around 71 hours. We've generated 513 million rows in the MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_table. See below. (I noticed that yours says SNOMEDCT and not SNOMEDCT_US, though when we try SNOMEDCT, it says "SAB (SNOMEDCT) does not exist in your current UMLS view.")
-Albert

mysql> select * from tableindex;
+-----------------------------------------------------+-------------------------------------------+
| TABLENAME                                           | HEX                                       |
+-----------------------------------------------------+-------------------------------------------+
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_intrinsic | af2b21ff9b5244bb5455ca8bc79d0257c2385d17a |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_parent    | a2cc318432dc1b3e4f397f102b3ab8e705e4ffc34 |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_child     | a1a69641a1b573c0baa75f92885a39bb576bceef1 |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_info      | ad174db1eda8d0e60161f39ebf25429ecf62db544 |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_cache     | ac59c43986dcdfe4e5770d8a032adc8118a9b8acf |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_table     | a360e225eec93d0400d75f5faee4256db6e22417b |
+-----------------------------------------------------+-------------------------------------------+
6 rows in set (0.00 sec)

mysql> select TABLE_NAME,TABLE_ROWS,DATA_LENGTH,UPDATE_TIME from information_schema.tables where TABLE_SCHEMA='umlsinterfaceindex';
+-------------------------------------------+------------+--------------+---------------------+
| TABLE_NAME                                | TABLE_ROWS | DATA_LENGTH  | UPDATE_TIME         |
+-------------------------------------------+------------+--------------+---------------------+
| a1a69641a1b573c0baa75f92885a39bb576bceef1 |          1 |           17 | 2014-07-14 15:14:07 |
| a2cc318432dc1b3e4f397f102b3ab8e705e4ffc34 |          1 |           17 | 2014-07-14 15:14:07 |
| a360e225eec93d0400d75f5faee4256db6e22417b |  513902275 | 100299885452 | 2014-07-18 11:52:24 |
| ac59c43986dcdfe4e5770d8a032adc8118a9b8acf |          0 |            0 | 2014-07-14 15:14:08 |
| ad174db1eda8d0e60161f39ebf25429ecf62db544 |          0 |            0 | 2014-07-14 15:13:36 |
| af2b21ff9b5244bb5455ca8bc79d0257c2385d17a |     158185 |      2689145 | 2014-07-17 03:06:15 |
| tableindex                                |          6 |          584 | 2014-07-14 15:14:38 |
+-------------------------------------------+------------+--------------+---------------------+
7 rows in set (0.00 sec)

On Friday, July 18, 2014 11:38 AM, "Ted Pedersen [email protected] [umls-similarity]" <[email protected]> wrote:

And here are the specific tables associated with SNOMEDCT.

acda27da591145d3b4e9ebf9d3c2a3e1dd4d0f40b |        0 |      16384 |
a61994af45895a14eeb8f720eaba083cb9383b698 |        1 |      16384 |
a494252ed0bdc44c0cac223f40735da144a347be0 |        1 |      16384 |
aa337e568342857c2f7b00366cfba8c8c2859621e | 12761986 | 2353004544 |

So it's clearly the last one that takes up the most space. How close have you gotten?

BTW, I think I'm going to go ahead and remove the SNOMEDCT index I have and re-create it, just to get a sense of how long that takes (it's been a while since I've done that so I don't exactly remember).

Good luck,
Ted

On Fri, Jul 18, 2014 at 10:29 AM, Ted Pedersen <[email protected]> wrote:

We have quite a few different resources indexed, so the output from your command is a little messy. So to start with, here are the table names for the indices associated with SNOMEDCT (available via the command shown), This is using PAR CHD relations with SNOMEDCT in 2013AA version of UMLS...
>ted@maraca:~$ getTableNames.pl --config config/snomedct.config
>
>CuiFinder User Options:
>  --config option set
>
>UMLS-Interface Configuration Information
>  Sources (SAB):
>    SNOMEDCT
>  Relations (REL):
>    CHD
>    PAR
>  Database:
>    umls (MMSYS-2013AA-20130404)
>
>The tables associated with the given configuration file are as follows:
>
>  Table                                      Table Name
>  acda27da591145d3b4e9ebf9d3c2a3e1dd4d0f40b  MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_cache
>  a61994af45895a14eeb8f720eaba083cb9383b698  MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_child
>  a494252ed0bdc44c0cac223f40735da144a347be0  MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_parent
>  aa337e568342857c2f7b00366cfba8c8c2859621e  MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_table
>
>On Fri, Jul 18, 2014 at 8:57 AM, [email protected] [umls-similarity] <[email protected]> wrote:
>
>>Hello
>>
>>Can you share the output of the following SQL query:
>>
>>  select TABLE_NAME,TABLE_ROWS,DATA_LENGTH from information_schema.tables where TABLE_SCHEMA='umlsinterfaceindex';
>>
>>This will give us an idea of how much more we have to go.
>>
>>Also, is it possible for you to share the SQL dumps of the indexed tables.
>>That would be awesome !
>>
>>Thanks,
>>Chaitanya.
>
>--
>Ted Pedersen
>http://www.d.umn.edu/~tpederse

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
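
For anyone comparing the two sets of figures quoted in this thread, here is a minimal Python sketch that puts them side by side. The row counts and byte sizes are copied from the two information_schema outputs above; the dictionary keys and function names are made up for illustration and are not part of UMLS-Interface. Keep in mind that TABLE_ROWS can be an estimate rather than an exact count for some storage engines (InnoDB in particular), so the ratio is only a rough guide.

```python
# Figures copied from the information_schema output quoted in this thread.
# The keys "in_progress" and "finished" are illustrative labels only.
REPORTED = {
    # a360e225... (MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_table), still indexing
    "in_progress": {"rows": 513_902_275, "bytes": 100_299_885_452},
    # aa337e56... (MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_table), completed index
    "finished": {"rows": 12_761_986, "bytes": 2_353_004_544},
}


def summarize(stats):
    """Return the table size in GiB and the average bytes per row."""
    return {
        "gib": stats["bytes"] / 2**30,
        "bytes_per_row": stats["bytes"] / stats["rows"],
    }


def row_ratio(a, b):
    """Return how many times more rows table a has than table b."""
    return a["rows"] / b["rows"]


if __name__ == "__main__":
    for label, stats in REPORTED.items():
        s = summarize(stats)
        print(f"{label}: {stats['rows']} rows, "
              f"{s['gib']:.2f} GiB, {s['bytes_per_row']:.1f} bytes/row")
    ratio = row_ratio(REPORTED["in_progress"], REPORTED["finished"])
    print(f"in_progress has {ratio:.1f}x the rows of finished")
```

If the in-progress table already holds many times the rows of a completed index for a comparable configuration, that may point to a difference in the source data or configuration rather than an index that simply needs more time, though the numbers alone cannot confirm that.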
