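I have a project where I need to fill out pre-made PDFs. The most logical solution I could think of was to turn each pre-made PDF into a PDF form, so that there are tags where the input values should go. I can then walk through the form tags in the PDF and line them up with a dictionary of values.
I have accomplished this task using PyPDF2. In short, I took an image of a web form, opened Acrobat, and created a PDF form based on the fields seen in the image, then used PyPDF2 to fill in the PDF form fields. The caveat is that printing those filled-in values seems to be buggy in some browsers.
How can I convert the PDF form into a standard/flat PDF, so that I keep the pre-filled values but lose the editable fields (since I think those are the problem)?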
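As a rough sketch of that matching step, the snippet below lines up a list of form field names with a dictionary of values and reports what did not match. It uses plain Python only; the function name `align_fields` and the sample field list are hypothetical and not part of PyPDF2.

```python
def align_fields(field_names, values):
    """Line up form field names with a dictionary of values.

    Returns a tuple (matched, missing, unused):
      matched: {field name: value} for fields that have a value
      missing: field names with no entry in `values`
      unused:  keys in `values` that match no field name
    """
    known = set(field_names)
    matched = {name: values[name] for name in field_names if name in values}
    missing = [name for name in field_names if name not in values]
    unused = [key for key in values if key not in known]
    return matched, missing, unused


if __name__ == "__main__":
    # Hypothetical field names, as they might be read from the form.
    fields = ["first_name", "last_name", "signature"]
    data = {"first_name": "John", "last_name": "Smith", "email": "mail@mail.com"}
    matched, missing, unused = align_fields(fields, data)
    print(matched)  # {'first_name': 'John', 'last_name': 'Smith'}
    print(missing)  # ['signature']
    print(unused)   # ['email']
```

Checking `missing` and `unused` before filling the form can help catch field names that were misspelled when the form was built in Acrobat.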
    from io import BytesIO

    import PyPDF2
    from django.http import HttpResponse
    from PyPDF2.generic import BooleanObject, NameObject, IndirectObject


    def pdf_view(request):
        template = 'templates/template.pdf'
        outfile = "templates/test.pdf"

        input_stream = open(template, "rb")
        pdf_reader = PyPDF2.PdfFileReader(input_stream, strict=False)
        if "/AcroForm" in pdf_reader.trailer["/Root"]:
            pdf_reader.trailer["/Root"]["/AcroForm"].update(
                {NameObject("/NeedAppearances"): BooleanObject(True)})

        pdf_writer = PyPDF2.PdfFileWriter()
        set_need_appearances_writer(pdf_writer)
        if "/AcroForm" in pdf_writer._root_object:
            # Acro form is form field, set needs appearances to fix printing issues
            pdf_writer._root_object["/AcroForm"].update(
                {NameObject("/NeedAppearances"): BooleanObject(True)})

        data_dict = {
            'first_name': 'John',
            'last_name': 'Smith',
            'email': 'mail@mail.com',
            'phone': '889-998-9967',
            'company': 'Amazing Inc.',
            'job_title': 'Dev',
            'street': '123 Main Way',
            'city': 'Johannesburg',
            'state': 'New Mexico',
            'zip': 96705,
            'country': 'USA',
            'topic': 'Who cares...'
        }

        pdf_writer.addPage(pdf_reader.getPage(0))
        pdf_writer.updatePageFormFieldValues(pdf_writer.getPage(0), data_dict)

        output_stream = BytesIO()
        pdf_writer.write(output_stream)

        # print(fill_in_pdf(template, data_dict).getvalue())
        # fill_in_pdf(template, data_dict).getvalue()
        response = HttpResponse(output_stream.getvalue(), content_type='application/pdf')
        response['Content-Disposition'] = 'inline; filename="completed.pdf"'
        input_stream.close()

        return response


    def set_need_appearances_writer(writer):
        try:
            catalog = writer._root_object
            # get the AcroForm tree and add "/NeedAppearances attribute
            if "/AcroForm" not in catalog:
                writer._root_object.update({
                    NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)})

            need_appearances = NameObject("/NeedAppearances")
            writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)
        except Exception as e:
            print('set_need_appearances_writer() catch : ', repr(e))

        return writer
The solution was very simple. When in doubt, read the docs (page 552 of 978):
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/pdf_reference_archives/PDFReference.pdf
All I had to do was set the field flags entry (/Ff) to 1, which turns on bit position 1 and makes the field ReadOnly, like so:
    from io import BytesIO

    import PyPDF2
    from django.http import HttpResponse
    from PyPDF2.generic import BooleanObject, NameObject, IndirectObject, NumberObject


    def pdf(request):
        template = 'templates/template.pdf'
        outfile = "templates/test.pdf"

        input_stream = open(template, "rb")
        pdf_reader = PyPDF2.PdfFileReader(input_stream, strict=False)
        if "/AcroForm" in pdf_reader.trailer["/Root"]:
            pdf_reader.trailer["/Root"]["/AcroForm"].update(
                {NameObject("/NeedAppearances"): BooleanObject(True)})

        pdf_writer = PyPDF2.PdfFileWriter()
        set_need_appearances_writer(pdf_writer)
        if "/AcroForm" in pdf_writer._root_object:
            # Acro form is form field, set needs appearances to fix printing issues
            pdf_writer._root_object["/AcroForm"].update(
                {NameObject("/NeedAppearances"): BooleanObject(True)})

        data_dict = {
            'first_name': 'John\n',
            'last_name': 'Smith\n',
            'email': 'mail@mail.com\n',
            'phone': '889-998-9967\n',
            'company': 'Amazing Inc.\n',
            'job_title': 'Dev\n',
            'street': '123 Main Way\n',
            'city': 'Johannesburg\n',
            'state': 'New Mexico\n',
            'zip': 96705,
            'country': 'USA\n',
            'topic': 'Who cares...\n'
        }

        pdf_writer.addPage(pdf_reader.getPage(0))
        page = pdf_writer.getPage(0)
        pdf_writer.updatePageFormFieldValues(page, data_dict)
        for j in range(0, len(page['/Annots'])):
            writer_annot = page['/Annots'][j].getObject()
            for field in data_dict:
                # -----------------------------------------------------BOOYAH!
                if writer_annot.get('/T') == field:
                    # /Ff = 1 sets bit position 1 of the field flags: ReadOnly
                    writer_annot.update({
                        NameObject("/Ff"): NumberObject(1)
                    })
                # -----------------------------------------------------

        output_stream = BytesIO()
        pdf_writer.write(output_stream)

        response = HttpResponse(output_stream.getvalue(), content_type='application/pdf')
        response['Content-Disposition'] = 'inline; filename="completed.pdf"'
        input_stream.close()

        return response


    def set_need_appearances_writer(writer):
        try:
            catalog = writer._root_object
            # get the AcroForm tree and add "/NeedAppearances attribute
            if "/AcroForm" not in catalog:
                writer._root_object.update({
                    NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)})

            need_appearances = NameObject("/NeedAppearances")
            writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)
        except Exception as e:
            print('set_need_appearances_writer() catch : ', repr(e))

        return writer
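One caveat with writing `NumberObject(1)` directly: it replaces the whole /Ff value, so any other flag already set on the field (for example Multiline on a text field) is lost. The sketch below shows the bit arithmetic for turning on ReadOnly while keeping the other bits. It is plain Python with no PyPDF2 dependency; the helper names `with_read_only` and `is_read_only` are hypothetical, and the bit positions are the ones listed in the field-flags tables of the PDF Reference.

```python
# Field-flag masks: bit position n in the PDF Reference has value 1 << (n - 1).
READ_ONLY = 1 << 0   # bit position 1, all field types
REQUIRED = 1 << 1    # bit position 2, all field types
MULTILINE = 1 << 12  # bit position 13, text fields only


def with_read_only(flags: int) -> int:
    """Return `flags` with the ReadOnly bit set and all other bits unchanged."""
    return flags | READ_ONLY


def is_read_only(flags: int) -> bool:
    """Report whether the ReadOnly bit is set in `flags`."""
    return bool(flags & READ_ONLY)


if __name__ == "__main__":
    existing = MULTILINE              # a multiline text field: 4096
    updated = with_read_only(existing)
    print(updated)                    # 4097
    print(is_read_only(updated))      # True
    print(bool(updated & MULTILINE))  # True, Multiline is preserved
```

Inside the loop above, this might be applied as `NumberObject(with_read_only(int(writer_annot.get('/Ff', 0))))`, assuming /Ff is stored on the annotation itself rather than inherited from a parent field.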