编写复杂的MySQL查询

小编典典

编写复杂的MySQL查询

sql

表Words_Learned包含用户已知的所有单词以及
单词学习的顺序。它具有3列：1）单词ID和2）用户ID，以及3）
单词学习的顺序。

表中Article包含文章。它具有3列：1）文章ID，2）
唯一字数和3）文章内容。

该表Words包含每篇文章中包含的所有唯一单词的列表。
它有2列1）单词ID和2）文章ID

数据库图如下/

在此处输入图片说明

您可以从此处下载数据库代码：https :
//www.dropbox.com/s/3gr659y5mk05i5w/tests.sql?dl=0

现在，使用此数据库并使用“仅” mysql，我需要完成以下工作。

给定一个用户ID，它应该获得该用户已知的所有单词的列表，并按照
从中得知的顺序重新排序。换句话说，
最近学到的单词将位于列表的顶部。

假设对用户ID的查询表明他们记住了以下
3个单词，我们将跟踪他们学习单词的顺序。章鱼
-3只狗-2匙-1

首先，我们获得所有包含“章鱼”一词的文章的列表，然后
使用表格Words对这些文章进行计算。计算意味着，
如果该文章包含10个以上未出现在用户
词汇表中的单词（从表格中拉出words_learned），则将其从
列表中排除。

然后，我们对所有包含dog但不包含
‘ctopus’的记录进行查询。

然后，我们对所有包含汤匙但不包含“
章鱼”或“狗”的记录进行查询

您将继续执行此重复过程，直到找到100个
符合此条件的记录。

为了实现这一过程，我做了以下工作

SELECT `words_learned`.`idwords`,
Words.`idArticle`
FROM words_learned
INNER JOIN Words ON Words.idWords = Words_Learned.`idwords`
WHERE words_learned.userId = 1
ORDER BY Words_Learned.`order` DESC

在我的查询中，我已经掩盖了获得文章的意思，这意味着到这里-
First we get a list of all articles containing the word Octopus, and then do the calculation using table Words on just those articles.。但是我应该怎么
做才能覆盖其余部分？

更新

这是用于更好理解的伪代码。

Do while articles found < 100
{
 for each ($X as known words, in order that those words were learned)
 {
  Select all articles that contain the word $X, where the 1) article has not been included in any previous loops, and 2)where the count of "unknown" words is less than 10.

  Keep these articles in order. 
 }
}

阅读 435

2021-04-07

共1个答案

小编典典

我很想拥有一个子查询，该查询将获取一个人
学习的所有单词并将其与自身相结合，并带有GROUP_CONCAT单词
和一个计数。所以给：

Octopus, NULL, 0
Dog, "Octopus", 1
Spoon, "Octopus,Dog", 2

因此，子查询将类似于：

SELECT sub0.idwords, GROUP_CONCAT(sub1.idwords) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
FROM words_learned sub0
LEFT OUTER JOIN words_learned sub1
ON sub0.userId = sub1.userId
AND sub0.order_learned < sub1.order_learned
WHERE sub0.userId = 1
GROUP BY sub0.idwords

giving

idwords    excl_words    older_words_cnt
1          NULL          0
2          1             1
3          1,2           2

然后将其结果与其他表结合起来，检查
主要idword匹配但没有其他匹配的文章。

像这样的东西（尽管没有作为测试数据进行测试）：

SELECT sub_words.idwords, words_inc.idArticle
(
    SELECT sub0.idwords, SUBSTRING_INDEX(GROUP_CONCAT(sub1.idwords), ',', 10) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
    FROM words_learned sub0
    LEFT OUTER JOIN words_learned sub1
    ON sub0.userId = sub1.userId
    AND sub0.order_learned < sub1.order_learned
    WHERE sub0.userId = 1
    GROUP BY sub0.idwords
) sub_words
INNER JOIN words words_inc
ON sub_words.idwords = words_inc.idwords
LEFT OUTER JOIN words words_exc
ON words_inc.idArticle = words_exc.idArticle
AND FIND_IN_SET(words_exc.idwords, sub_words.excl_words)
WHERE words_exc.idwords IS NULL
ORDER BY older_words_cnt
LIMIT 100

编辑-已更新，以排除
尚未学习的超过10个单词的文章。

SELECT sub_words.idwords, words_inc.idArticle,
sub2.idArticle, sub2.count, sub2.content
FROM
(
    SELECT sub0.idwords, GROUP_CONCAT(sub1.idwords) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
    FROM words_learned sub0
    LEFT OUTER JOIN words_learned sub1
    ON sub0.userId = sub1.userId
    AND sub0.order_learned < sub1.order_learned
    WHERE sub0.userId = 1
    GROUP BY sub0.idwords
) sub_words 
INNER JOIN words words_inc
ON sub_words.idwords = words_inc.idwords
INNER JOIN
(
    SELECT a.idArticle, a.count, a.content, SUM(IF(c.idwords_learned IS NULL, 1, 0)) AS unlearned_words_count
    FROM Article a
    INNER JOIN words b
    ON a.idArticle = b.idArticle
    LEFT OUTER JOIN words_learned c
    ON b.idwords = c.idwords
    AND c.userId = 1
    GROUP BY a.idArticle, a.count, a.content
    HAVING unlearned_words_count < 10
) sub2
ON words_inc.idArticle = sub2.idArticle
LEFT OUTER JOIN words words_exc
ON words_inc.idArticle = words_exc.idArticle
AND FIND_IN_SET(words_exc.idwords, sub_words.excl_words)
WHERE words_exc.idwords IS NULL
ORDER BY older_words_cnt
LIMIT 100

EDIT - attempt at commenting the above query:-

This just selects the columns

SELECT sub_words.idwords, words_inc.idArticle,
sub2.idArticle, sub2.count, sub2.content
FROM

此子查询获取每个已学习的单词，以及以逗号分隔
的具有较大order_learned的单词列表。这是针对特定的用户
ID

(
    SELECT sub0.idwords, GROUP_CONCAT(sub1.idwords) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
    FROM words_learned sub0
    LEFT OUTER JOIN words_learned sub1
    ON sub0.userId = sub1.userId
    AND sub0.order_learned < sub1.order_learned
    WHERE sub0.userId = 1
    GROUP BY sub0.idwords
) sub_words

This is just to get the articles the words (ie, the words learned from the
above sub query) are used in

INNER JOIN words words_inc
ON sub_words.idwords = words_inc.idwords

此子查询获取的文章中少于10个单词的文章
尚未被特定用户学习。

INNER JOIN
(
    SELECT a.idArticle, a.count, a.content, SUM(IF(c.idwords_learned IS NULL, 1, 0)) AS unlearned_words_count
    FROM Article a
    INNER JOIN words b
    ON a.idArticle = b.idArticle
    LEFT OUTER JOIN words_learned c
    ON b.idwords = c.idwords
    AND c.userId = 1
    GROUP BY a.idArticle, a.count, a.content
    HAVING unlearned_words_count < 10
) sub2
ON words_inc.idArticle = sub2.idArticle

该联接用于从第一个子查询中查找在逗号分隔列表中包含单词的文章（即，order_learned较大的单词）。这是作为LEFT OUTER JOIN完成的，因为我想排除找到的任何单词（这在
WHERE子句中通过检查NULL来完成）

LEFT OUTER JOIN words words_exc
ON words_inc.idArticle = words_exc.idArticle
AND FIND_IN_SET(words_exc.idwords, sub_words.excl_words)
WHERE words_exc.idwords IS NULL
ORDER BY older_words_cnt
LIMIT 100

2021-04-07