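I have three tables like this:
movie: id, name
tag: id, name, value
tagged: id, movie (FK), tag (FK)
So each movie has its own set of tags. What I need is to retrieve similar movies based on that tag set: say, 10 movies ordered by the number of matching tags.
If I create a view like the one below, it makes MySQL go away. Both the "tag" and "tagged" tables have over 30k records.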
create view relatedtags as
select entityLeft.id as id,
       entityRight.id as rightId,
       count(rightTagged.id) as matches
from entity as entityLeft
     join tagged as leftTagged on leftTagged.entity = entityLeft.id,
     entity as entityRight
     join tagged as rightTagged on rightTagged.entity = entityRight.id
where leftTagged.tag = rightTagged.tag
  and entityLeft.id != entityRight.id
group by entityLeft.id, entityRight.id
This returns a list of all movies that share at least 1 tag with the given <current_movie_id>, ordered by decreasing number of tags in common:
SELECT movie.*, count(DISTINCT similar.tag) AS shared_tags
FROM movie
INNER JOIN ( tagged AS this_movie INNER JOIN tagged AS similar USING (tag) )
        ON similar.movie = movie.id
WHERE this_movie.movie = <current_movie_id>
  AND movie.id != this_movie.movie
GROUP BY movie.id
ORDER BY shared_tags DESC
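If you want to try the idea on toy data first, here is a minimal sketch. It makes several assumptions that are not from the question: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, the sample rows are made up, the two indexes on tagged are only a guess at what helps the self join, and the helper name similar_movies and the alias other are hypothetical. It also adds a LIMIT, since the question asks for about 10 movies.

```python
import sqlite3

# Hypothetical toy data; in-memory SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tag    (id INTEGER PRIMARY KEY, name TEXT, value TEXT);
CREATE TABLE tagged (
    id    INTEGER PRIMARY KEY,
    movie INTEGER REFERENCES movie(id),
    tag   INTEGER REFERENCES tag(id)
);
-- Assumed indexes so each side of the self join can use an index lookup.
CREATE INDEX tagged_movie_tag ON tagged (movie, tag);
CREATE INDEX tagged_tag_movie ON tagged (tag, movie);

INSERT INTO movie (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO tag (id, name) VALUES (1, 'x'), (2, 'y'), (3, 'z'), (4, 'w');
INSERT INTO tagged (movie, tag) VALUES
    (1, 1), (1, 2), (1, 3),   -- movie A: tags x, y, z
    (2, 1), (2, 2), (2, 3),   -- movie B: shares three tags with A
    (3, 1),                   -- movie C: shares one tag with A
    (4, 4);                   -- movie D: shares none
""")


def similar_movies(conn, movie_id, limit=10):
    """Return (id, name, shared_tags) rows for movies sharing tags with movie_id."""
    sql = """
        SELECT movie.id, movie.name, COUNT(DISTINCT other.tag) AS shared_tags
        FROM tagged AS this_movie
        INNER JOIN tagged AS other ON other.tag = this_movie.tag
        INNER JOIN movie ON movie.id = other.movie
        WHERE this_movie.movie = ?
          AND other.movie != this_movie.movie
        GROUP BY movie.id, movie.name
        ORDER BY shared_tags DESC, movie.id
        LIMIT ?
    """
    return conn.execute(sql, (movie_id, limit)).fetchall()


for row in similar_movies(conn, 1):
    print(row)
```

The point of filtering on one movie up front is that the join only touches the rows of tagged for that movie's tags, instead of pairing every movie with every other movie the way the view does.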
Hope that helps.