How to speed up SELECT .. LIKE queries in MySQL on multiple columns?

I have a MySQL table for which I do very frequent SELECT x, y, z FROM table WHERE x LIKE '%text%' OR y LIKE '%text%' OR z LIKE '%text%' queries. Would any kind of index help speed things up?

There are a few million records in the table. If there is anything that would speed up the search, would it seriously impact disk usage by the database files and the speed of INSERT and DELETE statements? (no UPDATE is ever performed)

Update: Quickly after posting, I have seen a lot of information and discussion about the way LIKE is used in the query; I would like to point out that the solution must use LIKE '%text%' (that is, the text I am looking for is prepended and appended with a % wildcard). The database also has to be local, for many reasons, including security.


An index cannot be used to speed up queries where the search term starts with a wildcard:

LIKE '%text%'

An index can be used (and might be, depending on selectivity) for search terms of the form:

LIKE 'text%'

An index won't help text matching with a leading wildcard. An index can be used for:

LIKE 'text%'

But I'm guessing that won't cut it. For this type of query you really should be looking at a full-text search provider if you want to scale the number of records you can search across. My preferred provider is Sphinx, which is full-featured and fast. Lucene might also be worth a look. A FULLTEXT index on a MyISAM table will also work, but ultimately pursuing MyISAM for any database that has a significant number of writes isn't a good idea.

An index wouldn't speed up the query, because for textual columns indexes work by indexing N characters starting from the left. When you do LIKE '%text%', MySQL can't use the index because there can be a variable number of characters before the text.
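To make the left-to-right argument concrete, here is a minimal Python sketch. It is not MySQL internals: it assumes a sorted list is a fair stand-in for an ordered index, and the sample values and function names are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical values standing in for an indexed text column.
# An ordered index keeps them sorted from the leftmost character.
index = sorted(["apple", "context", "pretext", "text", "textbook", "texture", "zebra"])

def prefix_search(sorted_values, prefix):
    """Emulates LIKE 'prefix%': binary-search to the first candidate, then read a contiguous range."""
    start = bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break  # sorted order guarantees that no later value can match
        matches.append(value)
    return matches

def substring_search(values, needle):
    """Emulates LIKE '%needle%': matches are scattered, so every value has to be inspected."""
    return [value for value in values if needle in value]

print(prefix_search(index, "text"))     # touches only the matching range
print(substring_search(index, "text"))  # has to look at all seven values
```

The prefix search stops as soon as it leaves the matching range, while the substring search has no way to skip anything, which is the behaviour described above.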

What you should be doing is not to use a query like that at all. Instead you should use something like FTS (Full Text Search), which MySQL supports for MyISAM tables. It's also pretty easy to build such an indexing system yourself for non-MyISAM tables: you just need a separate index table where you store words and the IDs of the rows in the actual table that contain them.
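The do-it-yourself word index could look roughly like the following Python sketch. In a real setup the mapping would live in a separate MySQL table. The sample rows, the tokenizer, and the function names here are assumptions for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical rows: primary key -> searchable text.
rows = {
    1: "Plain text search in MySQL",
    2: "Full table scans are slow",
    3: "Text indexes speed up search",
}

def build_word_index(rows):
    """Maps each lowercased word to the set of row IDs that contain it."""
    word_index = defaultdict(set)
    for row_id, text in rows.items():
        for word in re.findall(r"\w+", text.lower()):
            word_index[word].add(row_id)
    return word_index

def lookup(word_index, word):
    """Returns the sorted IDs of the rows that contain the whole word."""
    return sorted(word_index.get(word.lower(), set()))

word_index = build_word_index(rows)
print(lookup(word_index, "text"))    # rows containing the word "text"
print(lookup(word_index, "search"))
print(lookup(word_index, "ext"))     # a fragment of a word finds nothing
```

Note the trade-off: a word index answers whole-word lookups, so it is not a drop-in replacement for LIKE '%text%', which also matches fragments inside words.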

Update

Full-text search is available for InnoDB tables as of MySQL 5.6.

I would add that in some cases you can speed up the query by using an index together with LIKE/RLIKE if the field you are looking at is often empty or contains something constant.

In that case it seems that you can limit the rows which are visited using the index by adding an AND clause with the fixed value.

I tried this for searching 'tags' in a huge table which usually does not contain a lot of tags.

SELECT * FROM objects WHERE tags RLIKE '(^|,)tag(,|$)' AND tags != '';

If you have an index on tags you will see that it is used to limit the rows which are being searched.
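For readers unsure what the pattern accepts, here is a small Python sketch of the same matching logic. It assumes Python's `re` behaves like MySQL's RLIKE for this simple pattern. The sample rows and the `has_tag` helper are hypothetical.

```python
import re

# Same pattern as in the query: "tag" must be a whole item of a comma-separated list.
TAG_PATTERN = re.compile(r"(^|,)tag(,|$)")

def has_tag(tags):
    """True when the comma-separated string contains the exact item 'tag'."""
    return TAG_PATTERN.search(tags) is not None

# Hypothetical rows: most of them carry no tags at all.
rows = {1: "", 2: "a,tag,b", 3: "", 4: "tagged", 5: "tag", 6: ""}

# Mirrors the AND tags != '' condition: empty rows are skipped before the regex runs.
candidates = {row_id: tags for row_id, tags in rows.items() if tags != ""}
matching_ids = sorted(row_id for row_id, tags in candidates.items() if has_tag(tags))

print(len(candidates), "of", len(rows), "rows reach the regex")
print(matching_ids)
```

The cheap non-empty check removes half of the sample rows before the expensive pattern match, which is the effect the extra condition is meant to achieve with the index.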

Maybe you can try upgrading from MySQL 5.1 to MySQL 5.7.

I have about 70,000 records and ran the following SQL:

select * from comics where name like '%test%';

It takes 2000ms in MySQL 5.1 and 200ms in MySQL 5.6 or 5.7.

Another alternative to avoid full table scans is selecting substrings and checking them in the HAVING clause:

SELECT
    al3.article_number,
    SUBSTR(al3.article_number, 2, 3) AS art_nr_substr,
    SUBSTR(al3.article_number, 1, 3) AS art_nr_substr2,
    al1.*
FROM
    t1 al1
    INNER JOIN t2 al2 ON al2.t1_id = al1.id
    INNER JOIN t3 al3 ON al3.id = al2.t3_id
WHERE
    al1.created_at > '2018-05-29'
HAVING
    (art_nr_substr = 'FLA' OR art_nr_substr = 'VKV' OR art_nr_substr2 = 'PBR');

Another way:

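You can maintain calculated columns holding those strings REVERSEd and query both. Note that this matches values that start or end with the text, not values that contain it in the middle: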

SELECT x, y, z FROM table WHERE x LIKE 'text%' OR y LIKE 'text%' OR z LIKE 'text%' OR xRev LIKE 'txet%' OR yRev LIKE 'txet%' OR zRev LIKE 'txet%'

Example of how to add a stored (persisted) generated column:

ALTER TABLE table ADD COLUMN xRev VARCHAR(N) GENERATED ALWAYS AS (REVERSE(x)) STORED;

and then create indexes on xRev, yRev, etc.
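A rough Python sketch of why the reversed column helps, again assuming a sorted list as a stand-in for an index. The sample values and helper names are invented for illustration.

```python
from bisect import bisect_left

# Hypothetical column values.
values = ["context", "pretext", "text", "textbook", "subtextual"]

forward_index = sorted(values)                     # stands in for an index on x
reversed_index = sorted(v[::-1] for v in values)   # stands in for an index on xRev

def range_scan(sorted_values, prefix):
    """Emulates LIKE 'prefix%' on an ordered index."""
    start = bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break
        matches.append(value)
    return matches

def starts_or_ends_with(term):
    """x LIKE 'term%' OR xRev LIKE '<reversed term>%', both answered by range scans."""
    starts = set(range_scan(forward_index, term))
    ends = {v[::-1] for v in range_scan(reversed_index, term[::-1])}
    return sorted(starts | ends)

print(starts_or_ends_with("text"))
```

"subtextual" is not returned, because the text sits in the middle of the value. The reversed column turns a suffix search into a prefix search, but it does not cover the general '%text%' case.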

Add a Full Text Index and Use MATCH() AGAINST().

Normal indexes will not help you with like queries, especially those that utilize wildcards on both sides of the search term.

What you can do is add a full text index on the columns that you're interested in searching and then use a MATCH() AGAINST() query to search those full text indexes.

  1. Add a full text index on the columns that you need:

    ALTER TABLE table ADD FULLTEXT INDEX index_table_on_x_y_z (x, y, z);
    
  2. Then query those columns:

    SELECT * FROM table WHERE MATCH(x,y,z) AGAINST("text")
    

From our trials, we found these queries to take around 1ms in a table with over 1 million records. Not bad, especially compared to the equivalent wildcard LIKE '%text%' query, which takes 16,400ms.

Benchmarks

MATCH(x,y,z) AGAINST("text") takes 1ms

LIKE '%text%' takes 16400ms

16400x faster!

When you optimize a SELECT foo FROM bar WHERE baz LIKE 'ZOT%' query, you want the index length to at least match the number of characters in the request.

Here is a real life example from just now:

Here is the query:

EXPLAIN SELECT COUNT(*) FROM client_detail cd
JOIN client_account ca ON cd.client_acct_id = ca.client_acct_id
WHERE cd.first_name LIKE 'XX%' AND cd.last_name_index LIKE 'YY%';

With no index:

+-------+
| rows  |
+-------+
| 13994 |
|     1 |
+-------+

So first try a 4x4 index:

CREATE INDEX idx_last_first_4x4 on client_detail(last_name_index(4), first_name(4));
+------+
| rows |
+------+
| 7035 |
|    1 |
+------+

A bit better, but COUNT(*) shows there are only 102 results. So let's now add a 2x2 index:

CREATE INDEX idx_last_first_2x2 on client_detail(last_name_index(2), first_name(2));

yields:

+------+
| rows |
+------+
|  102 |
|    1 |
+------+

Both indexes are still in place at this point, and MySQL chose the latter index for this query; however, it will still choose the 4x4 index if it is more efficient.

Index ordering may be useful. Try the 2x2 before the 4x4 or vice versa to see how it performs in your environment. To re-order an index you have to drop and re-create the earlier one.