How to Find and Replace Text in PDF using C#

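This quick guide explains how to find and replace text in a PDF using C#, with detailed steps and runnable code. It covers configuring the environment and then gives a step-by-step process for replacing text in a PDF using C#. Once the file is updated, you can save it to disk in its original format (PDF) or in another format such as DOCX, Excel, or HTML.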
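
As a minimal sketch of saving in another format, the snippet below loads a PDF and writes DOCX and HTML copies through the `Document.Save(string, SaveFormat)` overload. The file names are hypothetical placeholders, and the exact conversion output may vary with the Aspose.PDF for .NET version in use.

```csharp
using Aspose.Pdf;

class SaveInOtherFormatsSketch
{
    static void Main()
    {
        // "Updated.pdf" is a hypothetical file name; point it at your own updated PDF
        Document pdf = new Document("Updated.pdf");

        // Save a copy in the original PDF format
        pdf.Save("Updated_copy.pdf");

        // Save the same document as DOCX
        pdf.Save("Updated.docx", SaveFormat.DocX);

        // Save the same document as HTML
        pdf.Save("Updated.html", SaveFormat.Html);
    }
}
```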

Steps to Find and Replace Text in PDF using C#

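Configure the project to use Aspose.PDF for .NET through the NuGet package manager
Create or load a PDF file containing sample text using a Document class object
Create a TextFragmentAbsorber class object and set the text to search for
Accept the text absorber for all pages in the input PDF file
Get the collection of fragments containing the text extracted from the loaded PDF file
Go through all the fragments and set the new text
Save the updated PDF file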

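These steps describe how to search and replace text in a PDF using C#. The example creates a new file containing some sample text, but you can instead load an existing PDF file whose text is to be replaced. Several options are available for searching text in a PDF, such as ignoring shadow text or limiting the search to page bounds.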
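
The search options mentioned above can be sketched as follows. This is an illustration rather than the article's own code: it assumes the `IgnoreShadowText` and `LimitToPageBounds` properties of `TextSearchOptions` are available in your version of Aspose.PDF for .NET, and it reuses the input file name from the main example.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

class SearchOptionsSketch
{
    static void Main()
    {
        // Assumes this file already exists (the main example creates it)
        Document pdf = new Document("InputPDFToReplaceText.pdf");

        // false = treat the search phrase as plain text, not as a regular expression
        TextSearchOptions options = new TextSearchOptions(false);
        // Skip shadow text while searching
        options.IgnoreShadowText = true;
        // Consider only text that lies within the page bounds
        options.LimitToPageBounds = true;

        // Pass the options together with the phrase to search for
        TextFragmentAbsorber absorber = new TextFragmentAbsorber("my_data", options);
        pdf.Pages.Accept(absorber);

        System.Console.WriteLine($"Matches found: {absorber.TextFragments.Count}");
    }
}
```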

Code to Replace Text in PDF using C#

 using Aspose.Pdf;
 using Aspose.Pdf.Text;
 namespace FindAndReplaceTextInPdfUsingCSharp
 {
     class Program
     {
         static void Main(string[] args) // Entry point: create a sample PDF, then find and replace text in it
         {
             // Instantiate a license to avoid watermark in output PDF
             Aspose.Pdf.License licForPdf = new Aspose.Pdf.License();
             licForPdf.SetLicense("Aspose.Pdf.lic");
             // Create an empty PDF document
             Document newPDFFile = new Document();
             // Add an empty page in the newly created PDF
             Page page = newPDFFile.Pages.Add();
             // Add sample text in the PDF file
             for (int iTxtCounter = 0; iTxtCounter < 15; iTxtCounter++)
                 page.Paragraphs.Add(new TextFragment("my_data\nanother data"));
             // Save the newly created PDF file containing the test data in it
             newPDFFile.Save("InputPDFToReplaceText.pdf");
             // Open PDF document to replace text in it
             Document inputPDFFile = new Document("InputPDFToReplaceText.pdf");
             // Set the text to search for in the TextFragmentAbsorber object
             TextFragmentAbsorber txtAbsorber = new TextFragmentAbsorber("my_data");
             // Apply the text absorber for all the pages in the input PDF file
             inputPDFFile.Pages.Accept(txtAbsorber);
             // Get the collection of fragments containing extracted text from the PDF
             TextFragmentCollection textFragmentCollection = txtAbsorber.TextFragments;
             // Go through all the fragments and replace the matched text
             foreach (TextFragment txtFragment in textFragmentCollection)
                 txtFragment.Text = "MY_DATA";
             // Save resulting PDF document.
             inputPDFFile.Save("OutputPDFAfterReplacingText.pdf");
             System.Console.WriteLine("Done");
         }
     }
 }


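This code uses TextFragmentAbsorber and TextFragment to find and replace text in a PDF using C#. Besides replacing the text, you can also change the font family, size, foreground color, and background color in the resulting PDF file. Options are also available to replace text across the entire PDF at once or to replace text based on a regular expression.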
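
A minimal sketch of both ideas, styling the replaced text and matching with a regular expression, might look like the following. The pattern, font name, colors, and output file name are illustrative assumptions; the code also assumes the "Arial" font is installed on the machine, since `FontRepository.FindFont` needs to locate it.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

class StyledRegexReplaceSketch
{
    static void Main()
    {
        // Assumes this file already exists (the main example creates it)
        Document pdf = new Document("InputPDFToReplaceText.pdf");

        // Illustrative pattern: "my_" followed by one or more word characters
        TextFragmentAbsorber absorber = new TextFragmentAbsorber(@"my_\w+");
        // true = interpret the phrase as a regular expression
        absorber.TextSearchOptions = new TextSearchOptions(true);
        pdf.Pages.Accept(absorber);

        foreach (TextFragment fragment in absorber.TextFragments)
        {
            // Replace the matched text
            fragment.Text = "MY_DATA";
            // Change the appearance of the replaced text (values are examples only)
            fragment.TextState.Font = FontRepository.FindFont("Arial");
            fragment.TextState.FontSize = 14;
            fragment.TextState.ForegroundColor = Aspose.Pdf.Color.Blue;
            fragment.TextState.BackgroundColor = Aspose.Pdf.Color.Yellow;
        }

        // Hypothetical output file name
        pdf.Save("OutputPDFStyled.pdf");
    }
}
```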

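In this topic, we learned how to find and replace text in a PDF. If you want to learn how to split a PDF file by page, refer to the article on how to split a PDF file by page in C#.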
