用动态编程实现文本对齐

小编典典

用动态编程实现文本对齐

algorithm

我想了解动态规划的概念，通过对MIT OCW课程在这里。关于OCW视频的解释非常棒，所有内容，但是我觉得直到我将解释实现到代码中之前，我才真正理解它。在实施过程中，我会在这里参考讲义中的一些注释，尤其是注释的第3页。

问题是，我不知道如何将一些数学符号转换为代码。这是我已实施的解决方案的一部分（并认为已正确实施）：

import math

paragraph = "Some long lorem ipsum text."
words = paragraph.split(" ")

# Count total length for all strings in a list of strings.
# This function will be used by the badness function below.
def total_length(str_arr):
    total = 0

    for string in str_arr:
        total = total + len(string)

    total = total + len(str_arr) # spaces
    return total

# Calculate the badness score for a word.
# str_arr is assumed be send as word[i:j] as in the notes
# we don't make i and j as argument since it will require
# global vars then.
def badness(str_arr, page_width):
    line_len = total_length(str_arr)
    if line_len > page_width:
        return float('nan') 
    else:
        return math.pow(page_width - line_len, 3)

现在我不理解的部分在讲义中的第3至5点。我真的不明白，也不知道从哪里开始实现这些。到目前为止，我已经尝试过迭代单词列表，并计算每个据称行尾的不良情况，如下所示：

def justifier(str_arr, page_width):
    paragraph = str_arr
    par_len = len(paragraph)
    result = [] # stores each line as list of strings
    for i in range(0, par_len):
        if i == (par_len - 1):
            result.append(paragraph)
        else:
            dag = [badness(paragraph[i:j], page_width) + justifier(paragraph[j:], page_width) for j in range(i + 1, par_len + 1)] 
            # Should I do a min(dag), get the index, and declares it as end of line?

但是然后，我不知道如何继续执行该功能，说实话，我不明白这一行：

dag = [badness(paragraph[i:j], page_width) + justifier(paragraph[j:], page_width) for j in range(i + 1, par_len + 1)]

以及如何将其justifier作为int返回值（因为我已经决定将返回值存储在result，它是一个列表。我应该创建另一个函数并从那里进行递归吗？是否应该有递归？

您能告诉我下一步做什么，并解释一下这是动态编程吗？ 我真的看不到递归在哪里，子问题是什么。

以前谢谢

阅读 363

2020-07-28

共1个答案

小编典典

如果您在理解动态编程本身的核心思想时遇到困难，这里是我的看法：

动态规划本质上是牺牲 空间复杂度 的 时间复杂度 （但使用额外的空间通常是非常
小相比，节省您的时间，使动态，如果正确实施编程完全值得的）。您可以随时存储每个递归调用的值（例如，存储在数组或字典中），这样当您在递归树的另一个分支中遇到相同的递归调用时，可以避免第二次进行计算。

而且不，你不
必须使用递归。这是我正在使用Just循环处理的问题的实现。我非常关注AlexSilva链接的TextAlignment.pdf。希望对您有所帮助。

def length(wordLengths, i, j):
    return sum(wordLengths[i- 1:j]) + j - i + 1


def breakLine(text, L):
    # wl = lengths of words
    wl = [len(word) for word in text.split()]

    # n = number of words in the text
    n = len(wl)

    # total badness of a text l1 ... li
    m = dict()
    # initialization
    m[0] = 0

    # auxiliary array
    s = dict()

    # the actual algorithm
    for i in range(1, n + 1):
        sums = dict()
        k = i
        while (length(wl, k, i) <= L and k > 0):
            sums[(L - length(wl, k, i))**3 + m[k - 1]] = k
            k -= 1
        m[i] = min(sums)
        s[i] = sums[min(sums)]

    # actually do the splitting by working backwords
    line = 1
    while n > 1:
        print("line " + str(line) + ": " + str(s[n]) + "->" + str(n))
        n = s[n] - 1
        line += 1

2020-07-28