Export Data from a PDF Form to Excel using Java

This tutorial describes how to export data from a PDF form to Excel using Java. It covers setting up the IDE, lists the steps, and provides sample code to extract data from a fillable PDF to Excel using Java. You will see how to export the PDF Form data to an XML file and then load that XML file into a Workbook to save it as an XLSX file.

Steps to Export PDF Fields to Excel using Java

  1. Set up the IDE to use Aspose.PDF and Aspose.Cells for Java to export PDF Form data
  2. Create a Form object from Aspose.PDF to work with the PDF file that contains the form fields
  3. Invoke the Form.bindPdf() method to link the PDF with the Form object
  4. Create a FileOutputStream for the output XML file
  5. Call the Form.exportXml() method to write the form data to the XML file
  6. Create an XmlLoadOptions object from the Aspose.Cells API for loading the XML file
  7. Create a Workbook object from the XML file and save it as an XLSX file

The above steps summarize how to extract PDF fields to Excel using Java. First, transfer the PDF Form data to an XML file using the Aspose.PDF API, which provides the Form.exportXml() method for this purpose. Then, use the Aspose.Cells API to load this XML file into a Workbook object and save it in the Excel XLSX format.
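Before loading the intermediate XML file into a Workbook, it can help to check which fields were actually exported. The sketch below uses only the Java standard library and assumes the exported file has a fields root element containing field elements with a name attribute and a nested value element; that layout is an assumption, so compare it against your own input.xml. The class FormXmlPreview and the method readFields are hypothetical helpers and are not part of the Aspose.PDF or Aspose.Cells APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class FormXmlPreview {

    // Hypothetical helper: collects field name -> value pairs from the exported XML.
    // Assumed layout: <fields><field name="..."><value>...</value></field></fields>
    public static Map<String, String> readFields(InputStream xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(xml);

        // LinkedHashMap keeps the fields in document order
        Map<String, String> fields = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("field");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element field = (Element) nodes.item(i);
            NodeList values = field.getElementsByTagName("value");
            // A field without a <value> child is recorded as an empty string
            String value = values.getLength() > 0
                    ? values.item(0).getTextContent().trim()
                    : "";
            fields.put(field.getAttribute("name"), value);
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // Inline sample data standing in for input.xml (assumed layout)
        String sample = "<fields>"
                + "<field name=\"firstName\"><value>Ada</value></field>"
                + "<field name=\"city\"><value>London</value></field>"
                + "</fields>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(readFields(in));
    }
}
```

To inspect a real export, pass a FileInputStream for input.xml to readFields instead of the inline sample.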

Code to Extract Data from PDF Form to Excel using Java

import com.aspose.pdf.License;
import com.aspose.pdf.facades.Form;
import com.aspose.cells.Workbook;
import com.aspose.cells.XmlLoadOptions;
import com.aspose.cells.SaveFormat;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class Main {
    public static void main(String[] args) throws Exception {
        // Set license for Aspose.PDF
        License pdfLic = new License();
        try {
            pdfLic.setLicense("license.lic");
        } catch (Exception e) {
            e.printStackTrace();
        }

        // Set license for Aspose.Cells (fully qualified to avoid a clash with com.aspose.pdf.License)
        com.aspose.cells.License cellsLic = new com.aspose.cells.License();
        try {
            cellsLic.setLicense("license.lic");
        } catch (Exception e) {
            e.printStackTrace();
        }

        exportDataToXml();
        convertXmlToXlsx();
    }

    // Export the form field data from the PDF to an intermediate XML file
    public static void exportDataToXml() {
        Form pdfForm = new Form();
        try {
            // Link the PDF that contains the form fields with the Form object
            pdfForm.bindPdf("TextBox_out.pdf");

            // try-with-resources closes the stream even if the export fails
            try (FileOutputStream xmlOutputStream = new FileOutputStream(new File("input.xml"))) {
                pdfForm.exportXml(xmlOutputStream);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Release the PDF held by the facade
            pdfForm.close();
        }
    }

    // Load the intermediate XML file into a Workbook and save it as XLSX
    public static void convertXmlToXlsx() {
        try {
            XmlLoadOptions options = new XmlLoadOptions();
            options.setCheckDataValid(true);

            Workbook wb = new Workbook("input.xml", options);
            wb.save("XmlToXlsx.xlsx", SaveFormat.XLSX);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The above code demonstrates how to export data from a fillable PDF to Excel using Java. It lets the developer customize how the exported XML file is loaded through the methods and properties of the XmlLoadOptions object. You may also use the setLoadFilter(LoadFilter value) method to filter the data while loading from the XML file.

This article has explained how to convert PDF Form data to an Excel file. To extract a selected form field from a particular page, refer to the article on how to Extract data from PDF Form using Java.
