How to Get PDF Metadata in Java

This brief tutorial describes how to get PDF metadata in Java. It walks through the process step by step: after the environment is configured, the source PDF file is opened and its metadata is extracted. You will learn how to check PDF metadata in Java and how to add custom metadata to a PDF file.

Steps to Read PDF Metadata in Java

  1. Set up your IDE to use Aspose.PDF for Java from the repository
  2. Load the input PDF file into a Document class object
  3. Get the DocumentInfo class object from the loaded PDF using the getInfo() method
  4. Display the desired properties from the DocumentInfo object

These simple steps explain how to see PDF metadata in Java. First, load the source PDF file from disk or from a stream. Then call getInfo() to get a reference to the DocumentInfo object, which holds properties such as the creator, producer, creation date, modification date, and modification date time zone, to name a few. The class also has methods to update the existing metadata and to add custom information.
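
The creation and modification dates mentioned above are stored inside the PDF file as strings of the form D:YYYYMMDDHHmmSSOHH'mm', where O is +, - or Z and every field after the year is optional. The following is a minimal standalone sketch, using only the Java standard library, of how such a raw string maps to a date and a time zone offset. The class name PdfDateParser is hypothetical and is not part of Aspose.PDF, and treating a missing time zone as UTC is an assumption made here for simplicity.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Aspose.PDF.
public class PdfDateParser {

    // D:YYYYMMDDHHmmSSOHH'mm' with every field after the year optional.
    private static final Pattern PDF_DATE = Pattern.compile(
        "^(?:D:)?(\\d{4})(\\d{2})?(\\d{2})?(\\d{2})?(\\d{2})?(\\d{2})?"
        + "(?:([+\\-Z])(?:(\\d{2})'?(?:(\\d{2})'?)?)?)?$");

    public static OffsetDateTime parse(String raw) {
        Matcher m = PDF_DATE.matcher(raw.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a PDF date string: " + raw);
        }
        int year = Integer.parseInt(m.group(1));
        int month = field(m.group(2), 1);   // month and day default to 1
        int day = field(m.group(3), 1);
        int hour = field(m.group(4), 0);    // time fields default to 0
        int minute = field(m.group(5), 0);
        int second = field(m.group(6), 0);

        // Assumption: a missing time zone is treated as UTC.
        ZoneOffset offset = ZoneOffset.UTC;
        String sign = m.group(7);
        if (sign != null && !sign.equals("Z")) {
            int total = field(m.group(8), 0) * 3600 + field(m.group(9), 0) * 60;
            offset = ZoneOffset.ofTotalSeconds(sign.equals("-") ? -total : total);
        }
        return OffsetDateTime.of(year, month, day, hour, minute, second, 0, offset);
    }

    // Returns the parsed number, or the default when the field is absent.
    private static int field(String text, int defaultValue) {
        return text == null ? defaultValue : Integer.parseInt(text);
    }

    public static void main(String[] args) {
        OffsetDateTime modified = parse("D:20240115103000+05'30'");
        System.out.println("Modified: " + modified);
        System.out.println("Offset:   " + modified.getOffset());
    }
}
```

A helper like this is only needed when you work with the raw date string; accessors that already return a date object make it unnecessary.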

Code for Extracting Metadata from PDF in Java
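
The steps above can be sketched roughly as follows. This is a minimal example, assuming Aspose.PDF for Java is already on the classpath; the file name Input.pdf and the class name ReadPdfMetadata are placeholders, and licensing setup is left out.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.DocumentInfo;

public class ReadPdfMetadata {
    public static void main(String[] args) throws Exception {
        // "Input.pdf" is a placeholder; point this at your own file.
        Document pdfDocument = new Document("Input.pdf");
        try {
            // Get the metadata container of the loaded document.
            DocumentInfo info = pdfDocument.getInfo();

            // Print a selection of the predefined properties.
            System.out.println("Title: " + info.getTitle());
            System.out.println("Author: " + info.getAuthor());
            System.out.println("Subject: " + info.getSubject());
            System.out.println("Keywords: " + info.getKeywords());
            System.out.println("Creator: " + info.getCreator());
            System.out.println("Producer: " + info.getProducer());
            System.out.println("Creation Date: " + info.getCreationDate());
            System.out.println("Modify Date: " + info.getModDate());
        } finally {
            // Release the resources held by the document.
            pdfDocument.close();
        }
    }
}
```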

The process of getting PDF metadata in Java centers on the DocumentInfo class. Besides the properties listed earlier, you can fetch the title, subject, author, and the trapped flag, and check whether a particular property is predefined. All of these properties can be set using the corresponding setter methods, and you can add custom properties with the set_Item() method and retrieve them with get_Item().
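
To make the difference between predefined and custom properties concrete, here is a small standalone model of a PDF document information dictionary. It is a sketch under stated assumptions, not the library's implementation: the class InfoDictionarySketch and its methods are hypothetical, and the predefined keys listed are the standard entries of the PDF Info dictionary.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration only; not part of Aspose.PDF.
public class InfoDictionarySketch {

    // Standard entries of the PDF document information dictionary.
    private static final Set<String> PREDEFINED_KEYS = Set.of(
        "Title", "Author", "Subject", "Keywords", "Creator",
        "Producer", "CreationDate", "ModDate", "Trapped");

    private final Map<String, String> entries = new LinkedHashMap<>();

    // True when the key is one of the standard entries.
    public static boolean isPredefined(String key) {
        return PREDEFINED_KEYS.contains(key);
    }

    // Stores a predefined or custom entry.
    public void put(String key, String value) {
        entries.put(key, value);
    }

    // Returns the stored value, or null when the key is absent.
    public String get(String key) {
        return entries.get(key);
    }

    // Returns only the entries whose keys are not predefined.
    public Map<String, String> customEntries() {
        Map<String, String> custom = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : entries.entrySet()) {
            if (!isPredefined(entry.getKey())) {
                custom.put(entry.getKey(), entry.getValue());
            }
        }
        return custom;
    }

    public static void main(String[] args) {
        InfoDictionarySketch info = new InfoDictionarySketch();
        info.put("Author", "Jane Doe");          // predefined key
        info.put("ReviewedBy", "Quality Team");  // custom key
        System.out.println("Custom entries: " + info.customEntries());
    }
}
```

Keeping custom keys distinct from the predefined names avoids accidentally overwriting a standard property.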

In this quick tutorial, we have learned to extract metadata from a PDF file. If you want to learn the process of reading bookmarks in a PDF file, refer to the article on how to read bookmarks in PDF using Java.
