Extract a Table from PDF to Excel using C#

This short guide describes how to extract a table from PDF to Excel using C#. It covers setting up the IDE for Aspose.PDF and Aspose.Cells, lists the steps, and provides sample code to get a table from PDF into Excel using C#. The sample code demonstrates the complete flow required to perform the task.

Steps to Extract Data from PDF Table to Excel using C#

  1. Set up the IDE to use both Aspose.PDF for .NET and Aspose.Cells for .NET in the same project
  2. Apply the Aspose.Total license to use the API features without any limitations or watermarks
  3. Load the source PDF file into the Document class object
  4. Create a new Excel file using the Workbook class and set a name for the first sheet
  5. Iterate through each page in the PDF file
  6. Access all the tables on each page, and for each table, access the text in each row and column
  7. Write each cell’s content into the corresponding row and column in the destination sheet of the Excel file
  8. Adjust the row heights and column widths, then save the workbook

These steps summarize how to extract a PDF table to Excel using C#. Set up the IDE for development, apply the license, load the source PDF, create a new Excel file, and iterate through each page of the PDF. Fetch the collection of tables on each page, read the content of each cell in every table, and copy it to the corresponding row and column in the destination sheet of the output Excel file.

Code to Extract Table from PDF to Excel using C#
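
The steps above can be sketched roughly as follows. This is a minimal sketch rather than a definitive implementation: it assumes the Aspose.PDF and Aspose.Cells NuGet packages are referenced, and the file names "License.lic", "input.pdf", and "output.xlsx", the sheet name, and the blank row left between tables are placeholders chosen for illustration. Namespace aliases are used because both libraries define types with the same names, such as Cell, Row, and License.

```csharp
using System.Text;
using Pdf = Aspose.Pdf;
using Cells = Aspose.Cells;

class PdfTableToExcel
{
    static void Main()
    {
        // Assumption: "License.lic" is a placeholder path to your license file.
        new Pdf.License().SetLicense("License.lic");
        new Cells.License().SetLicense("License.lic");

        // Load the source PDF (placeholder file name).
        var pdfDocument = new Pdf.Document("input.pdf");

        // Create the destination workbook and name its first sheet.
        var workbook = new Cells.Workbook();
        Cells.Worksheet sheet = workbook.Worksheets[0];
        sheet.Name = "Extracted Tables";

        int excelRow = 0;

        foreach (Pdf.Page page in pdfDocument.Pages)
        {
            // TableAbsorber collects the tables found on the visited page.
            var absorber = new Pdf.Text.TableAbsorber();
            absorber.Visit(page);

            foreach (Pdf.Text.AbsorbedTable table in absorber.TableList)
            {
                foreach (Pdf.Text.AbsorbedRow row in table.RowList)
                {
                    int excelColumn = 0;

                    foreach (Pdf.Text.AbsorbedCell cell in row.CellList)
                    {
                        // A PDF cell may hold several text fragments; join them.
                        var text = new StringBuilder();
                        foreach (Pdf.Text.TextFragment fragment in cell.TextFragments)
                        {
                            text.Append(fragment.Text);
                        }

                        // Write plain text into the matching row and column.
                        sheet.Cells[excelRow, excelColumn].PutValue(text.ToString().Trim());
                        excelColumn++;
                    }

                    excelRow++;
                }

                // Illustrative choice: leave one blank row between tables.
                excelRow++;
            }
        }

        // Fit rows and columns to their content, then save the workbook.
        sheet.AutoFitColumns();
        sheet.AutoFitRows();
        workbook.Save("output.xlsx");
    }
}
```

All tables are written one below another into a single sheet here; writing each table to its own sheet would be an equally reasonable design.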

This code demonstrates how to pull a table from PDF into Excel using C#. The sample code writes plain text from the PDF table into the Excel cells. You can extend it to preserve formatting by applying the PDF cell’s font, size, bold/italic style, and colour to the corresponding Excel cell. You can also detect numeric and date-like values in the PDF table and apply the appropriate format while writing to the Excel file.
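
One possible way to detect numeric and date-like values is sketched below. The CellValueParser class and its ParseCellText method are hypothetical helpers introduced only for illustration and are not part of either Aspose library. The sketch assumes the PDF text uses invariant-culture formatting (a dot as the decimal separator, a comma as the thousands separator); a different culture may be needed for your documents.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of Aspose.PDF or Aspose.Cells.
static class CellValueParser
{
    // Returns a double for numeric text, a DateTime for date-like text,
    // and the trimmed string otherwise.
    public static object ParseCellText(string text)
    {
        string trimmed = (text ?? string.Empty).Trim();

        // Try numbers first so that values such as "1,234.50" are not read as dates.
        if (double.TryParse(trimmed,
                            NumberStyles.Float | NumberStyles.AllowThousands,
                            CultureInfo.InvariantCulture,
                            out double number))
        {
            return number;
        }

        // Assumption: dates follow invariant-culture conventions, e.g. "2024-01-15".
        if (DateTime.TryParse(trimmed,
                              CultureInfo.InvariantCulture,
                              DateTimeStyles.None,
                              out DateTime date))
        {
            return date;
        }

        return trimmed;
    }
}
```

The returned value can then be passed to the PutValue overload of the Excel cell that matches its type (double, DateTime, or string), so that Excel stores it as a number or date rather than as text. A date value will usually also need a date number format set on the cell's style to display as a date.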

This tutorial explains how to transfer the contents of a PDF table into an Excel sheet. To convert a scanned PDF to an editable PDF, refer to the article Convert scanned PDF to editable PDF using C#.