Find the most common words in a column using SQLite?

小编典典

sql

I have data that looks like this:

            movie_id    comment
            1           tom cruise is great
            1           great action movie
            2           got teary eyed
            2           great cast
            1           tom cruise is hott

I want a function that returns the most frequently used words in the comments for a movie_id of my choosing. So if I query movie_id = 1, I would get:

            tom, 2
            cruise, 2
            is, 2
            great, 2
            hott, 1
            action, 1
            movie, 1

If I query movie_id = 2, I would get:

            got, 1
            teary, 1
            eyed, 1
            great, 1
            cast, 1

I have seen some solutions that use T-SQL, but I have never used it and don't understand the code. I'm looking for a way to do this in sqlite3.

2021-03-08

1 answer

小编典典

You can do this with a really ugly query.

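-- Assumes the data lives in a table named m with columns movie_id and comment.
select word, count(*) from (
select (case when instr(substr(m.comments, nums.n+1), ' ') = 0
             then substr(m.comments, nums.n+1)  -- no later space: last word, take the rest
             else substr(m.comments, nums.n+1, instr(substr(m.comments, nums.n+1), ' ') - 1)
        end) as word
from (select ' '||comment as comments  -- leading space so every word follows a space
      from m
      where movie_id = 1
     )m cross join
     (select 1 as n union all select 2 union all select 3
     ) nums
-- a word starts at n+1 when position n is a space and position n+1 is a real character
where substr(m.comments, nums.n, 1) = ' ' and substr(m.comments, nums.n+1, 1) not in (' ', '')
) w
group by word
order by count(*) desc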

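This is untested. The inner query needs a list of numbers (limited to 3 here; you can see how to add more). The list has to reach the length of the longest comment, otherwise words that start further to the right are never found. The query then checks whether a word starts at position n + 1. A word starts right after a space, so I put a space at the beginning of each comment.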
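
Rather than spelling out the list of numbers by hand with `union all`, one option is a recursive common table expression (available in SQLite 3.8.3 and later). The sketch below is only an illustration, run through Python's standard `sqlite3` module; the bound `max_n` is a hypothetical value and should be at least the length of the longest comment plus one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical upper bound for the number list; pick something at least
# as large as the longest comment plus the prepended space.
max_n = 50

rows = conn.execute(
    """
    with recursive nums(n) as (
        select 1                               -- first number
        union all
        select n + 1 from nums where n < ?     -- keep counting up to the bound
    )
    select n from nums order by n
    """,
    (max_n,),
).fetchall()

numbers = [n for (n,) in rows]
print(numbers[:5], len(numbers))
```

The same `nums` CTE can stand in for the hand-written `select 1 as n union all select 2 ...` subquery.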

Then it pulls the word out so that the outer query can aggregate it.
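
To try the idea end to end, here is a minimal sketch that loads the sample rows from the question into an in-memory database and runs a variant of the query from Python's standard `sqlite3` module. The table name `m`, the CTE names `padded` and `nums`, and the helper `word_counts` are assumptions made for illustration. The number list is generated with a recursive CTE sized to the longest comment, which is a substitution for the fixed list of three numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table m (movie_id integer, comment text)")
conn.executemany(
    "insert into m values (?, ?)",
    [
        (1, "tom cruise is great"),
        (1, "great action movie"),
        (2, "got teary eyed"),
        (2, "great cast"),
        (1, "tom cruise is hott"),
    ],
)

QUERY = """
with recursive
  -- comments of the chosen movie, each with a leading space
  padded(comments) as (
      select ' ' || comment from m where movie_id = :movie_id
  ),
  -- 1 .. length of the longest padded comment
  nums(n) as (
      select 1
      union all
      select n + 1 from nums
      where n < (select max(length(comments)) from padded)
  )
select word, count(*) as cnt
from (
    select case when instr(substr(p.comments, nums.n + 1), ' ') = 0
                then substr(p.comments, nums.n + 1)      -- last word
                else substr(p.comments, nums.n + 1,
                            instr(substr(p.comments, nums.n + 1), ' ') - 1)
           end as word
    from padded p cross join nums
    -- a word starts at n + 1 when n is a space and n + 1 is a real character
    where substr(p.comments, nums.n, 1) = ' '
      and substr(p.comments, nums.n + 1, 1) not in (' ', '')
) w
group by word
order by cnt desc, word
"""


def word_counts(movie_id):
    """Return (word, count) rows for one movie, most frequent first."""
    return conn.execute(QUERY, {"movie_id": movie_id}).fetchall()


for word, cnt in word_counts(1):
    print(word, cnt)
```

Words are split on single spaces only; punctuation and letter case are left untouched, so "Great" and "great" would be counted separately.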

2021-03-08