How to Convert PDF to HTML in C#

This brief how-to topic explains how to convert PDF to HTML in C#. You can export PDF to HTML in C# with a few simple API calls, with no dependence on Adobe Acrobat or any other third-party tool. The approach can be used in any .NET-based application running on Windows, Linux, or macOS.

Steps to Convert PDF to HTML in C#

  1. Install the Aspose.PDF package in your application using the NuGet package manager
  2. Add a reference to the Aspose.Pdf namespace in your application
  3. Initialize a Document class instance to load the source PDF for conversion to HTML
  4. Initialize an HtmlSaveOptions object to set the font, SVG, and image save options
  5. Finally, convert the PDF to HTML in C# by calling the Save method

By following the above steps, PDF to HTML conversion in C# takes only a few simple API calls. You start by adding the necessary API references and then loading the source PDF file. Next, you set the options required for the exported HTML by using the HtmlSaveOptions class. Finally, by using the SaveFormat.Html enumeration value with the Save method, the HTML is saved on disk.

Code to Convert PDF to HTML in C#
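
The steps above can be sketched as follows. This is a minimal sketch rather than a definitive implementation: the file names (Input.pdf, Output.html) and folder names (SvgData, ImagesData) are placeholders chosen for illustration, and the specific option values are assumptions. The property names are those of HtmlSaveOptions in Aspose.PDF for .NET as far as we know; check them against the version you have installed. The sketch passes the HtmlSaveOptions object to Save so that the configured options take effect.

```csharp
using Aspose.Pdf;

namespace ConvertPdfToHtml
{
    class Program
    {
        static void Main(string[] args)
        {
            // Load the source PDF. "Input.pdf" is a placeholder path.
            using (Document pdfDocument = new Document("Input.pdf"))
            {
                // Configure the HTML export. All values below are illustrative.
                HtmlSaveOptions htmlOptions = new HtmlSaveOptions
                {
                    // Write one HTML file per PDF page instead of a single file.
                    SplitIntoPages = true,

                    // Export fonts used by the PDF as TTF files.
                    FontSavingMode = HtmlSaveOptions.FontSavingModes.AlwaysSaveAsTTF,

                    // Leave SVG graphics uncompressed.
                    CompressSvgGraphicsIfAny = false,

                    // Folders for SVG content and for exported images
                    // (hypothetical names).
                    SpecialFolderForSvgImages = "SvgData",
                    SpecialFolderForAllImages = "ImagesData"
                };

                // Convert and write the HTML to disk. "Output.html" is a placeholder.
                pdfDocument.Save("Output.html", htmlOptions);
            }
        }
    }
}
```

Without a license applied, the library is expected to run in evaluation mode, which may add watermarks or limit the output.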

The conversion relies on the HtmlSaveOptions class, which lets you set options such as splitting the PDF into multiple pages and managing font settings. It also lets you set SVG export options, such as compression and the path for SVG content, along with the path for images exported from the source PDF. Finally, the resulting HTML file is saved on disk or in a MemoryStream for further use.

As we have seen, it is convenient to convert PDF to HTML in C# and get customized output. If you are interested in saving a PDF file as images, refer to the article on how to convert PDF to Image in C#.
