MySQL 匹配()对()-按相关性和列顺序排列?

好的,我试着在多个栏目中进行全文搜索,就像这样简单:

SELECT * FROM pages WHERE MATCH(head, body) AGAINST('some words' IN BOOLEAN MODE)

现在我要按照相关性排序,(找到了多少个单词?)我可以用这样的东西做到:

SELECT * , MATCH (head, body) AGAINST ('some words' IN BOOLEAN MODE) AS relevance
FROM pages
WHERE MATCH (head, body) AGAINST ('some words' IN BOOLEAN MODE)
ORDER BY relevance

现在到了我感到困惑的部分,我想优先考虑 head专栏中的相关性。

我想我可以创建两个相关列,一个用于 head,一个用于 body,但是在那个时候,我会在表中进行三次相同的搜索,对于我创建的这个函数来说,性能很重要,因为查询将与其他表进行连接和匹配。

因此,我的主要问题是 ,是否有一种更快的方法来搜索相关性并对某些列进行优先排序?(作为额外的奖励,甚至可能使相关性计数的话出现在专栏中的次数?)

任何建议或意见都可以。

注意: 我将在 LAMP 服务器上运行这个命令(本地测试中的 WAMP)

139784 次浏览

I have never done so, but it seems like

MATCH (head, head, body) AGAINST ('some words' IN BOOLEAN MODE)

Should give a double weight to matches found in the head.


Just read this comment on the docs page, Thought it might be of value to you:

Posted by Patrick O'Lone on December 9 2002 6:51am

It should be noted in the documentation that IN BOOLEAN MODE will almost always return a relevance of 1.0. In order to get a relevance that is meaningful, you'll need to:

SELECT MATCH('Content') AGAINST ('keyword1 keyword2') as Relevance
FROM table
WHERE MATCH ('Content') AGAINST('+keyword1+keyword2' IN BOOLEAN MODE)
HAVING Relevance > 0.2
ORDER BY Relevance DESC

Notice that you are doing a regular relevance query to obtain relevance factors combined with a WHERE clause that uses BOOLEAN MODE. The BOOLEAN MODE gives you the subset that fulfills the requirements of the BOOLEAN search, the relevance query fulfills the relevance factor, and the HAVING clause (in this case) ensures that the document is relevant to the search (i.e. documents that score less than 0.2 are considered irrelevant). This also allows you to order by relevance.

This may or may not be a bug in the way that IN BOOLEAN MODE operates, although the comments I've read on the mailing list suggest that IN BOOLEAN MODE's relevance ranking is not very complicated, thus lending itself poorly for actually providing relevant documents. BTW - I didn't notice a performance loss for doing this, since it appears MySQL only performs the FULLTEXT search once, even though the two MATCH clauses are different. Use EXPLAIN to prove this.

So it would seem you may not need to worry about calling the fulltext search twice, though you still should "use EXPLAIN to prove this"

This might give the increased relevance to the head part that you want. It won't double it, but it might possibly good enough for your sake:

SELECT pages.*,
MATCH (head, body) AGAINST ('some words') AS relevance,
MATCH (head) AGAINST ('some words') AS title_relevance
FROM pages
WHERE MATCH (head, body) AGAINST ('some words')
ORDER BY title_relevance DESC, relevance DESC


-- alternatively:
ORDER BY title_relevance + relevance DESC

An alternative that you also want to investigate, if you've the flexibility to switch DB engine, is Postgres. It allows to set the weight of operators and to play around with the ranking.

Just adding for who might need.. Don't forget to alter the table!

ALTER TABLE table_name ADD FULLTEXT(column_name);

I was just playing around with this, too. One way you can add extra weight is in the ORDER BY area of the code.

For example, if you were matching 3 different columns and wanted to more heavily weight certain columns:

SELECT search.*,
MATCH (name) AGAINST ('black' IN BOOLEAN MODE) AS name_match,
MATCH (keywords) AGAINST ('black' IN BOOLEAN MODE) AS keyword_match,
MATCH (description) AGAINST ('black' IN BOOLEAN MODE) AS description_match
FROM search
WHERE MATCH (name, keywords, description) AGAINST ('black' IN BOOLEAN MODE)
ORDER BY (name_match * 3  + keyword_match * 2  + description_match) DESC LIMIT 0,100;

Just to add that if you're using custom ranking, remember to use HAVING instead of WHERE to reduce the load.

SELECT MATCH(x,y) AGAINST (? IN BOOLEAN MODE) AS r1,
MATCH(z) AGAINST (? IN BOOLEAN MODE) AS r2,
...
FROM table
HAVING (r1 + r2) > 0
ORDER BY (r1 * 3 + r2) DESC
LIMIT 10