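Python - Processing Word Documents

To read a Word document, we use the docx module, which is provided by the python-docx package. We first install the package, then write a program that uses the module to read the whole file paragraph by paragraph.

The following command installs the module into our environment. The package is published on PyPI as python-docx, although it is imported as docx.

    pip install python-docx

In the example below, we read the contents of a Word document by appending the text of each paragraph to a list and then printing all of the paragraph text, joined by newlines.

    import docx

    def readtxt(filename):
        # Open the .docx file and collect the text of every paragraph
        doc = docx.Document(filename)
        fullText = []
        for para in doc.paragraphs:
            fullText.append(para.text)
        return '\n'.join(fullText)

    # Raw string so the backslash in the Windows-style path is taken literally
    print(readtxt(r'path\codingdict.docx'))

When we run the above program, we get the following output:

Tutorials Point originated from the idea that there exists a class of readers who respond better to online content and prefer to learn new skills at their own pace from the comforts of their drawing rooms.
The journey commenced with a single tutorial on HTML in 2006 and elated by the response it generated, we worked our way to adding fresh tutorials to our repository which now proudly flaunts a wealth of tutorials and allied articles on topics ranging from programming languages to web designing to academics and much more.

Reading individual paragraphs

We can read a specific paragraph from the Word document using the paragraphs attribute, which is a list indexed from zero. In the example below, we read only the second paragraph of the document, which is at index 1.

    import docx

    doc = docx.Document(r'path\codingdict.docx')
    # Number of paragraphs in the document
    print(len(doc.paragraphs))
    # Index 1 selects the second paragraph
    print(doc.paragraphs[1].text)

When we run the above program, it prints the number of paragraphs followed by the text of the second paragraph:

The journey commenced with a single tutorial on HTML in 2006 and elated by the response it generated, we worked our way to adding fresh tutorials to our repository which now proudly flaunts a wealth of tutorials and allied articles on topics ranging from programming languages to web designing to academics and much more.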
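Selecting a paragraph by position is easy to get wrong, because the paragraphs list is zero-indexed and also contains empty paragraphs (blank lines in the document). The following is a minimal sketch of one way to pick "the n-th paragraph with text". The helper names non_empty_paragraphs and nth_paragraph are hypothetical and not part of python-docx. The fake_doc object is only a stand-in that mimics the .paragraphs and .text attributes, so the sketch runs without a .docx file.

```python
from types import SimpleNamespace


def non_empty_paragraphs(doc):
    """Return the text of every paragraph that contains visible text."""
    return [p.text for p in doc.paragraphs if p.text.strip()]


def nth_paragraph(doc, n):
    """Return the n-th non-empty paragraph (1-based), or None if there is none."""
    texts = non_empty_paragraphs(doc)
    if 1 <= n <= len(texts):
        return texts[n - 1]
    return None


# Stand-in for a document object (assumption: only .paragraphs and .text are used).
fake_doc = SimpleNamespace(paragraphs=[
    SimpleNamespace(text='First paragraph.'),
    SimpleNamespace(text=''),  # a blank paragraph, as produced by an empty line
    SimpleNamespace(text='Second paragraph.'),
])

print(nth_paragraph(fake_doc, 2))  # prints "Second paragraph."
```

With python-docx installed, the same helpers should work on a real document by passing docx.Document(r'path\codingdict.docx') in place of fake_doc, since they rely only on the paragraphs list and each paragraph's text.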