Re: Choosing between FULLTEXT INDEX and regular INDEX

SGreen Wed, 22 Jun 2005 14:17:43 -0700

Responses interspersed....

James <[EMAIL PROTECTED]> wrote on 06/22/2005 04:48:59 PM:


> I did a little test by configure the `name` and `city` with FULLTEXT
> INDEX and a regular INDEX.

> 
> mysql> describe testing_text_performance;
> +-----------+------------------+------+-----+-------------------
> +----------------+
> | Field     | Type             | Null | Key | Default           |
> Extra          |
> +-----------+------------------+------+-----+-------------------
> +----------------+
> | id        | int(10) unsigned |      | PRI | NULL              |
> auto_increment |
> | name      | varchar(100)     |      | MUL |                   |
> |
> | city      | varchar(100)     |      | MUL |                   |
> |
> | age       | tinyint(3)       |      |     | 0                 |
> |
> | timestamp | timestamp        | YES  |     | CURRENT_TIMESTAMP |
> |
> +-----------+------------------+------+-----+-------------------
> +----------------+
> 5 rows in set (0.00 sec)

> 
> select count(*) from testing_text_performance where match(name)
> against ('marge'); --> average of about 1.25 sec

> select count(*) from testing_text_performance where name like
> '%marge%'; --> average of about 0.35 sec

> The FULLTEXT searches were about 3 times as slow.  For 120,000 rows

You should have also tried:
select count(*) from testing_text_performance where name like
'marge%';

and 
select count(*) from testing_text_performance where name =
'marge';

> 
> Also I found that with the two fields the performance between a
> regular INDEX and NO INDEX were pretty much the same...I measured the
> times...with about 120,000 rows, they both took about .35 seconds.

> mysql> explain select count(*) from testing_text_perf_no_index where
> name like '%marge%';
> +----+-------------+----------------------------+------
> +---------------+------+---------+------+--------+-------------+
> | id | select_type | table                      | type |
> possible_keys | key  | key_len | ref  | rows   | Extra       |
> +----+-------------+----------------------------+------
> +---------------+------+---------+------+--------+-------------+
> |  1 | SIMPLE      | testing_text_perf_no_index | ALL  | NULL
> | NULL |    NULL | NULL | 120015 | Using where |
> +----+-------------+----------------------------+------
> +---------------+------+---------+------+--------+-------------+
> 1 row in set (0.00 sec)

> mysql> explain select count(*) from testing_text_performance where
> name like '%marge%';
> +----+-------------+--------------------------+-------
> 
+---------------+------+---------+------+--------+--------------------------+
> | id | select_type | table                    | type  | possible_keys
> | key  | key_len | ref  | rows   | Extra                    |
> +----+-------------+--------------------------+-------
> 
+---------------+------+---------+------+--------+--------------------------+
> |  1 | SIMPLE      | testing_text_performance | index | NULL
> | name |     100 | NULL | 120015 | Using where; Using index |
> +----+-------------+--------------------------+-------
> 
+---------------+------+---------+------+--------+--------------------------+
> 1 row in set (0.00 sec)

> 
> So based on this test it seems that I should either use a regular
> INDEX on `name` or NO INDEX at all.  Does this sound right?
> Shouldn't an INDEX help performance?


Indexes work exactly like a telphone book (some people call them 
directories). If want to find the entries for people whose last name is or 
begins with the letters "Jam" (like "James" or "Jameson") it's pretty easy 
because the information is sorted that way. However, if you wanted to find 
all of the names that contained  or ended with the letters 'son' (like 
Jameson, Carson, Deesonay,...) you would have to look at each and every 
name in the book so that you can tease out those "inner" matches.

If you want to do exact matches or prefix matches (words that are or start 
with...), INDEXES are the perfect tool for the job. However if you are 
always doing substring matches, an INDEX cannot help. The engine would 
have to search every record for a match, just as you would have to search 
the entire phone book to find those names.

There is a technique for creating an index to handle searches of the "ends 
with" type. Store the information reversed and index the column of 
backwards text. Then when you want to look for things that end with "son", 
you search the backwards column for words that _start_ with "nos".  The 
penalty you pay is the space it takes to store and maintain the reversed 
string and extra index but it helps to optimize another type of substring 
search.

Normal and reversed-text indexes cover 2 of the 3 types of substring 
searches. The third kind looks for information anywhere else in the string 
(in the middle). So far, there is not an indexing strategy to optimize 
that kind of search available to use with MySQL (that I know of...) Does 
anyone else out there know of one?


> -James

> 
> At 1:06 PM -0400 6/22/05, James wrote:
> >I have a table which includes the following columns in addition to
> >lots of other ones
> >
> >name - populated with just one name
> >city - populated with just one city
> >keywords - lots of keywords
> >
> >I'm definitely going to use a FULLTEXT on the `keywords` column
> >
> >For `name` and `city` ...I will allow users to search on one name
> >and one city.
> >Should I just stick to regular indices for those two columns and use
> >
> >...WHERE
> >name LIKE  '%bob%' OR
> >city LIKE '%montreal%'
> >
> >
> >Will FULLTEXT indices for these two columns give me any sort of
> >advantage, such as performance boosts?
> >
> >I know that if I use FULLTEXT indices on `name` I would need it to
> >index words that are 2 characters and above!  The reason is that
> >there are names such as `ed` or `bo` that will be lost if I don't.
> >And I can't turn this on a column to column basis.  If I set
> >ft_min_word_length = 2, that will make ALL FULLTEXT indices handle 2
> >chars and above.  This will make the index files huge no?
> >
> >--
> >-James
> >


Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine

Re: Choosing between FULLTEXT INDEX and regular INDEX

Reply via email to