Thank you for pointing out that you are using SNOMEDCT_US, which I think is
new as of 2013AB. I am using 2013AA, so my comparison is not exact. I will
see if I have 2013AB or a later version installed somewhere.

SNOMEDCT_US is the US edition of SNOMEDCT, which does seem to have some
additional content.

http://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/SNOMEDCT_US/
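
For a rough sense of how far apart the two indexes are, here is a minimal
Python sketch that works only from the row counts and data lengths quoted
below. It assumes DATA_LENGTH is in bytes and that TABLE_ROWS is close to the
true count (it can be an estimate, depending on the storage engine). The
labels and the helper name summarize are just for illustration.

```python
# Figures copied from the information_schema output quoted in this thread.
# Assumption: DATA_LENGTH is in bytes; TABLE_ROWS may only be an estimate.
tables = {
    "2013AB SNOMEDCT_US _table (in progress)": (513_902_275, 100_299_885_452),
    "2013AA SNOMEDCT _table": (12_761_986, 2_353_004_544),
}


def summarize(rows, data_length):
    """Return (size in GiB, average bytes per row) for one table."""
    return data_length / 2**30, data_length / rows


for label, (rows, data_length) in tables.items():
    gib, per_row = summarize(rows, data_length)
    print(f"{label}: {rows:,} rows, {gib:.1f} GiB, ~{per_row:.0f} bytes/row")

# How many times more rows the in-progress table already holds.
in_progress, reference = tables.values()
row_ratio = in_progress[0] / reference[0]
print(f"row ratio: {row_ratio:.1f}x")
```

The ratio only compares the in-progress 2013AB table against the 2013AA one;
it does not say how many rows are still to come.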


On Fri, Jul 18, 2014 at 11:21 AM, Albert Lai [email protected]
[umls-similarity] <[email protected]> wrote:

>
>
> Chaitanya and I are getting something much larger, and I think the indexing
> has now been running for around 71 hours. We've generated 513 million rows
> in the MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_table. See below. (I noticed
> that yours says SNOMEDCT and not SNOMEDCT_US, though when we try SNOMEDCT,
> it says "SAB (SNOMEDCT) does not exist in your current UMLS view".)
>
> -Albert
>
>
> mysql> select * from tableindex;
> +-----------------------------------------------------+-------------------------------------------+
> | TABLENAME                                           | HEX                                       |
> +-----------------------------------------------------+-------------------------------------------+
> | MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_intrinsic | af2b21ff9b5244bb5455ca8bc79d0257c2385d17a |
> | MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_parent    | a2cc318432dc1b3e4f397f102b3ab8e705e4ffc34 |
> | MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_child     | a1a69641a1b573c0baa75f92885a39bb576bceef1 |
> | MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_info      | ad174db1eda8d0e60161f39ebf25429ecf62db544 |
> | MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_cache     | ac59c43986dcdfe4e5770d8a032adc8118a9b8acf |
> | MMSYS_2013AB_20131113_SNOMEDCT_US_CHD_PAR_table     | a360e225eec93d0400d75f5faee4256db6e22417b |
> +-----------------------------------------------------+-------------------------------------------+
> 6 rows in set (0.00 sec)
>
> mysql> select TABLE_NAME,TABLE_ROWS,DATA_LENGTH,UPDATE_TIME from information_schema.tables where TABLE_SCHEMA='umlsinterfaceindex';
> +-------------------------------------------+------------+--------------+---------------------+
> | TABLE_NAME                                | TABLE_ROWS | DATA_LENGTH  | UPDATE_TIME         |
> +-------------------------------------------+------------+--------------+---------------------+
> | a1a69641a1b573c0baa75f92885a39bb576bceef1 |          1 |           17 | 2014-07-14 15:14:07 |
> | a2cc318432dc1b3e4f397f102b3ab8e705e4ffc34 |          1 |           17 | 2014-07-14 15:14:07 |
> | a360e225eec93d0400d75f5faee4256db6e22417b |  513902275 | 100299885452 | 2014-07-18 11:52:24 |
> | ac59c43986dcdfe4e5770d8a032adc8118a9b8acf |          0 |            0 | 2014-07-14 15:14:08 |
> | ad174db1eda8d0e60161f39ebf25429ecf62db544 |          0 |            0 | 2014-07-14 15:13:36 |
> | af2b21ff9b5244bb5455ca8bc79d0257c2385d17a |     158185 |      2689145 | 2014-07-17 03:06:15 |
> | tableindex                                |          6 |          584 | 2014-07-14 15:14:38 |
> +-------------------------------------------+------------+--------------+---------------------+
> 7 rows in set (0.00 sec)
>
>
>    On Friday, July 18, 2014 11:38 AM, "Ted Pedersen [email protected]
> [umls-similarity]" <[email protected]> wrote:
>
>
>
>  And here are the specific tables associated with SNOMEDCT.
>
>  acda27da591145d3b4e9ebf9d3c2a3e1dd4d0f40b |          0 |        16384 |
>  a61994af45895a14eeb8f720eaba083cb9383b698 |          1 |        16384 |
>  a494252ed0bdc44c0cac223f40735da144a347be0 |          1 |        16384 |
>  aa337e568342857c2f7b00366cfba8c8c2859621e |   12761986 |   2353004544 |
>
> So it's clearly the last one that takes up the most space. How close have
> you gotten?
>
> BTW, I think I'm going to go ahead and remove the SNOMEDCT index I have
> and re-create it, just to get a sense of how long that takes (it's been a
> while since I've done that so I don't exactly remember).
>
> Good luck,
> Ted
>
>
> On Fri, Jul 18, 2014 at 10:29 AM, Ted Pedersen <[email protected]> wrote:
>
> We have quite a few different resources indexed, so the output from your
> command is a little messy. To start with, here are the table names for
> the indices associated with SNOMEDCT (available via the command shown).
> This is using PAR CHD relations with SNOMEDCT in the 2013AA version of UMLS.
>
> ted@maraca:~$ getTableNames.pl --config config/snomedct.config
>
> CuiFinder User Options:
>    --config option set
>
>
> UMLS-Interface Configuration Information
>   Sources (SAB):
>     SNOMEDCT
>   Relations (REL):
>     CHD
>     PAR
>   Database:
>     umls (MMSYS-2013AA-20130404)
>
>
>
>
>
> The tables associated with the given configuration file are as follows:
>
>     Table                                       Table Name
>     acda27da591145d3b4e9ebf9d3c2a3e1dd4d0f40b   MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_cache
>     a61994af45895a14eeb8f720eaba083cb9383b698   MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_child
>     a494252ed0bdc44c0cac223f40735da144a347be0   MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_parent
>     aa337e568342857c2f7b00366cfba8c8c2859621e   MMSYS_2013AA_20130404_SNOMEDCT_CHD_PAR_table
>
>
>
> On Fri, Jul 18, 2014 at 8:57 AM, [email protected]
> [umls-similarity] <[email protected]> wrote:
>
>
>  Hello
>
> Can you share the output of the following SQL query:
>
>  select TABLE_NAME,TABLE_ROWS,DATA_LENGTH from information_schema.tables
> where TABLE_SCHEMA='umlsinterfaceindex';
>
>
> This will give us an idea of how much more we have to go.
>
> Also, is it possible for you to share the SQL dumps of the indexed tables?
> That would be awesome!
>
> Thanks,
> Chaitanya.
>
>
>
>
> --
> Ted Pedersen
> http://www.d.umn.edu/~tpederse
>
>
>
>
> --
> Ted Pedersen
> http://www.d.umn.edu/~tpederse
>
>
>    
>



-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse
