How to Read PDF File in C#

Reading different types of documents programmatically is common practice these days. In this how-to guide, you will learn how to read a PDF file in C# by following the simple steps below.

Steps to Read PDF File in C#

  1. Create an empty C# Console Application in Visual Studio
  2. Add a reference to Aspose.PDF for .NET by installing it from NuGet.org
  3. Load an existing PDF file into a Document object
  4. Initialize the TextAbsorber class to read the PDF file
  5. Extract the PDF text and write it to the Console output
  6. Iterate through the PDF page resources to find images
  7. Create a FileStream object for each image found
  8. Save the image to local disk

The code snippet below explains how to open and read a PDF file in C#. You can use it to read text and extract images from a PDF file. The API offers the TextAbsorber class for reading text from a PDF file, and you can retrieve the extracted result through its Text property. You can also find images and save them to local disk by looping through the PDF page resources.

Code to Read PDF File in C#
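
The steps above can be sketched roughly as follows. This is a minimal sketch, assuming the Aspose.PDF NuGet package is installed; the input path input.pdf, the namespace name, and the output file-name pattern are placeholders chosen for illustration and do not come from this guide.

```csharp
using System;
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Text;

namespace ReadPdfExample // hypothetical namespace name
{
    class Program
    {
        static void Main(string[] args)
        {
            // Placeholder input path (an assumption); point it at a real PDF file.
            string inputPath = "input.pdf";

            // Load an existing PDF file into a Document object.
            using (Document pdf = new Document(inputPath))
            {
                // Initialize TextAbsorber and let it visit every page of the document.
                TextAbsorber textAbsorber = new TextAbsorber();
                pdf.Pages.Accept(textAbsorber);

                // Write the extracted text to the Console output.
                Console.WriteLine(textAbsorber.Text);

                // Iterate through the resources of each page to find images.
                foreach (Page page in pdf.Pages)
                {
                    int imageIndex = 1; // restarts at 1 on every page

                    foreach (XImage image in page.Resources.Images)
                    {
                        // Hypothetical naming pattern: Page<number>_Image<index>.jpg
                        string outputPath = string.Format(
                            "Page{0}_Image{1}.jpg", page.Number, imageIndex);

                        // Create a FileStream for the found image and save it to local disk.
                        using (FileStream imageStream = new FileStream(outputPath, FileMode.Create))
                        {
                            image.Save(imageStream);
                        }

                        imageIndex++;
                    }
                }
            }
        }
    }
}
```

The using blocks dispose of the document and close each file stream even if an exception is thrown part-way through. The .jpg extension assumes that the single-argument XImage.Save overload writes JPEG data; if a different format is needed, an overload that accepts an image format may be a better fit. A licensed setup would typically apply the license through Aspose.Pdf.License before loading the document.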

In the previous topic, you learned how to process large PDF files in C#. The information and code example above enable you to open and read PDF files in C# to extract text and images.
