复合索引是如何工作的?

我之前已经在表格上创建了复合索引(指数对于你们这些数学家来说是这样的) ,并假设了它们是如何工作的。我只是好奇我的假设是否正确。

我假设在列出索引的列顺序时,还要指定如何对索引进行分组。例如,如果有 abc列,并且按照相同的顺序指定索引 a ASCb ASCc ASC,那么结果索引基本上是 a中每个“组”的多个索引。

这是正确的吗? 如果不是,结果索引实际上会是什么样子?

89836 次浏览

No. Resultant index will be single index but with compound key.

KeyX = A,B,C,D; KeyY = 1,2,3,4;

Index KeyX, KeyY will be actually: A1,A2,A3,B1,B3,C3,C4,D2

So that in case you need to find something by KeyX and KeyY - that will be fast and will use single index. Something like SELECT ... WHERE KeyX = "B" AND KeyY = 3.

But it's important to understand: WHERE KeyX = ? requests will use that index, while WHERE KeyY = ? will NOT use such index at all.

Composite indexes work just like regular indexes, except they have multi-values keys.

If you define an index on the fields (a,b,c) , the records are sorted first on a, then b, then c.

Example:

| A | B | C |
-------------
| 1 | 2 | 3 |
| 1 | 4 | 2 |
| 1 | 4 | 4 |
| 2 | 3 | 5 |
| 2 | 4 | 4 |
| 2 | 4 | 5 |

The most common implementation of indices uses B-trees to allow somewhat rapid lookups, and also reasonably rapid range scans. It's too much to explain here, but here's the Wikipedia article on B-trees. And you are right, the first column you declare in the create index will be the high order column in the resulting B-tree.

A search on the high order column amounts to a range scan, and a B-tree index can be very useful for such a search. The easiest way to see this is by analogy with the old card catalogs you have in libraries that have not yet converted to on line catalogs.

If you are looking for all the cards for Authors whose last name is "Clemens", you just go to the author catalog, and very quickly find a drawer that says "CLE- CLI" on the front. That's the right drawer. Now you do a kind of informal binary search in that drawer to quickly find all the cards that say "Clemens, Roger", or "Clemens, Samuel" on them.

But suppose you want to find all the cards for the authors whose first name is "Samuel". Now you're up the creek, because those cards are not gathered together in one place in the Author catalog. A similar phenomenon happens with composite indices in a database.

Different DBMSes differ in how clever their optimizer is at detecting index range scans, and accurately estimating their cost. And not all indices are B-trees. You'll have to read the docs for your specific DBMS to get the real info.

Composite index is like a plain alphabet index in a dictionary, but covering two or more letters, like this:

AA - page 1
AB - page 12

etc.

Table rows are ordered first by the first column in the index, then by the second one etc.

It's usable when you search by both columns OR by first column. If your index is like this:

AA - page 1
AB - page 12
…
AZ - page 245
BA - page 246
…

you can use it for searching on 2 letters ( = 2 columns in a table), or like a plain index on one letter:

A - page 1
B - page 246
…

Note that in case of a dictionary, the pages themself are alphabetically ordered. That's an example of a CLUSTERED index.

In a plain, non-CLUSTERED index, the references to pages are ordered, like in a history book:

Gaul, Alesia: pages 12, 56, 78
Gaul, Augustodonum Aeduorum: page 145
…
Gaul, Vellaunodunum: page 24
Egypt, Alexandria: pages 56, 194, 213, 234, 267

Composite indexes may also be used when you ORDER BY two or more columns. In this case a DESC clause may come handy.

See this article in my blog about using DESC clause in a composite index: