Chaitanya and I are getting something much larger, and I think the indexing has 
now been running for around 71 hours. So far we've generated 513 million rows in 
the MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_table table. See below. (I noticed 
that yours says SNOMEDCT and not SNOMEDCT_US, though when we try SNOMEDCT, it 
says "SAB (SNOMEDCT) does not exist in your current UMLS view.")

-Albert


mysql> select * from tableindex;
+-----------------------------------------------------+-------------------------------------------+
| TABLENAME                                           | HEX                                       |
+-----------------------------------------------------+-------------------------------------------+
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_intrinsic | af2b21ff9b5244bb5455ca8bc79d0257c2385d17a |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_parent    | a2cc318432dc1b3e4f397f102b3ab8e705e4ffc34 |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_child     | a1a69641a1b573c0baa75f92885a39bb576bceef1 |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_info      | ad174db1eda8d0e60161f39ebf25429ecf62db544 |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_cache     | ac59c43986dcdfe4e5770d8a032adc8118a9b8acf |
| MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_table     | a360e225eec93d0400d75f5faee4256db6e22417b |
+-----------------------------------------------------+-------------------------------------------+
6 rows in set (0.00 sec)

mysql> select TABLE_NAME,TABLE_ROWS,DATA_LENGTH,UPDATE_TIME from information_schema.tables where TABLE_SCHEMA='umlsinterfaceindex';
+-------------------------------------------+------------+--------------+---------------------+
| TABLE_NAME                                | TABLE_ROWS | DATA_LENGTH  | UPDATE_TIME         |
+-------------------------------------------+------------+--------------+---------------------+
| a1a69641a1b573c0baa75f92885a39bb576bceef1 |          1 |           17 | 2014-07-14 15:14:07 |
| a2cc318432dc1b3e4f397f102b3ab8e705e4ffc34 |          1 |           17 | 2014-07-14 15:14:07 |
| a360e225eec93d0400d75f5faee4256db6e22417b |  513902275 | 100299885452 | 2014-07-18 11:52:24 |
| ac59c43986dcdfe4e5770d8a032adc8118a9b8acf |          0 |            0 | 2014-07-14 15:14:08 |
| ad174db1eda8d0e60161f39ebf25429ecf62db544 |          0 |            0 | 2014-07-14 15:13:36 |
| af2b21ff9b5244bb5455ca8bc79d0257c2385d17a |     158185 |      2689145 | 2014-07-17 03:06:15 |
| tableindex                                |          6 |          584 | 2014-07-14 15:14:38 |
+-------------------------------------------+------------+--------------+---------------------+
7 rows in set (0.00 sec)
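
For a rough sense of scale, the figures for the two "_table" indexes quoted in this thread can be compared with a few lines of Python. This is only a sketch: the numbers are copied from the outputs in these messages, the class and variable names are made up for illustration, and TABLE_ROWS from information_schema may be an estimate depending on the storage engine. The two indexes were also built from different UMLS releases and source names (2013AB SNOMEDCT_US versus 2013AA SNOMEDCT), so the ratio is indicative at best.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    """Row count and data size as reported by information_schema.tables."""
    label: str
    rows: int        # TABLE_ROWS (may be approximate)
    data_bytes: int  # DATA_LENGTH in bytes

    @property
    def bytes_per_row(self) -> float:
        return self.data_bytes / self.rows

    @property
    def gib(self) -> float:
        return self.data_bytes / 2**30


# Values copied from the outputs posted in this thread.
albert = TableStats("2013AB SNOMEDCT_US CHD/PAR table", 513_902_275, 100_299_885_452)
ted = TableStats("2013AA SNOMEDCT CHD/PAR table", 12_761_986, 2_353_004_544)

row_ratio = albert.rows / ted.rows
size_ratio = albert.data_bytes / ted.data_bytes

for t in (albert, ted):
    print(f"{t.label}: {t.rows:,} rows, {t.gib:.2f} GiB, "
          f"{t.bytes_per_row:.1f} bytes/row")
print(f"row ratio:  {row_ratio:.1f}x")
print(f"size ratio: {size_ratio:.1f}x")
```

The bytes-per-row figures come out in the same ballpark for both tables, so the difference in size appears to come from the number of rows rather than from wider rows.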


On Friday, July 18, 2014 11:38 AM, "Ted Pedersen [email protected] 
[umls-similarity]" <[email protected]> wrote:
 


  
And here are the specific tables associated with SNOMEDCT.
 acda27da591145d3b4e9ebf9d3c2a3e1dd4d0f40b |          0 |        16384 |
 a61994af45895a14eeb8f720eaba083cb9383b698 |          1 |        16384 |
 a494252ed0bdc44c0cac223f40735da144a347be0 |          1 |        16384 |
 aa337e568342857c2f7b00366cfba8c8c2859621e |   12761986 |   2353004544 |


So it's clearly the last one that takes up the most space. How close have you 
gotten?

BTW, I think I'm going to go ahead and remove the SNOMEDCT index I have and 
re-create it, just to get a sense of how long that takes (it's been a while 
since I've done that so I don't exactly remember). 

Good luck,
Ted



On Fri, Jul 18, 2014 at 10:29 AM, Ted Pedersen <[email protected]> wrote:

We have quite a few different resources indexed, so the output from your 
command is a little messy. To start with, here are the table names for the 
indices associated with SNOMEDCT (available via the command shown). This is 
using PAR and CHD relations with SNOMEDCT in the 2013AA version of UMLS...
>
>
>ted@maraca:~$ getTableNames.pl --config config/snomedct.config
>
>
>CuiFinder User Options:
>   --config option set
>
>
>
>
>UMLS-Interface Configuration Information
>  Sources (SAB):
>    SNOMEDCT
>  Relations (REL):
>    CHD
>    PAR
>  Database:
>    umls (MMSYS-2013AA-20130404)
>
>
>
>
>
>
>
>
>
>
>The tables associated with the given configuration file are as follows:
>
>
>    Table                                       Table Name
>    acda27da591145d3b4e9ebf9d3c2a3e1dd4d0f40b   MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_cache
>    a61994af45895a14eeb8f720eaba083cb9383b698   MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_child
>    a494252ed0bdc44c0cac223f40735da144a347be0   MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_parent
>    aa337e568342857c2f7b00366cfba8c8c2859621e   MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_table
>
>
>
>
>
>On Fri, Jul 18, 2014 at 8:57 AM, [email protected] 
>[umls-similarity] <[email protected]> wrote:
>
> 
>>  
>>Hello
>>
>>
>>Can you share the output of the following SQL query:
>>
>>
>> select TABLE_NAME,TABLE_ROWS,DATA_LENGTH from information_schema.tables 
>>where TABLE_SCHEMA='umlsinterfaceindex';
>>
>>
>>
>>
>>
>>This will give us an idea of how much more we have to go.
>>
>>
>>Also, is it possible for you to share the SQL dumps of the indexed tables? 
>>That would be awesome!
>>
>>
>>Thanks,
>>Chaitanya.
>
>
>
>
>-- 
>Ted Pedersen
>http://www.d.umn.edu/~tpederse 


-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse 
