Create Table of Contents in PDF using Python

This topic explains how to create a table of contents in a PDF using Python. It covers setting up the development environment, a list of steps, and working code to add a table of contents to a PDF using Python. You will also learn about configuring the table of contents, including its hyperlinks, its text, and its links to different pages of the PDF file.

Steps to Add Table of Contents to PDF using Python

  1. Set the environment to use Aspose.PDF for Python via .NET to add a table of contents
  2. Access the sample PDF Document and insert a page at the start for adding TOC
  3. Create instances of the TocInfo and TextFragment classes for setting the TOC title
  4. Set the headings text for the table of contents
  5. Iterate through all PDF pages to add a respective TOC heading
  6. Set the target page, its coordinates and heading text during each iteration
  7. Save the resultant PDF file having TOC on the first page

The above steps describe the process to create a clickable table of contents in a PDF using Python. Access the source PDF file, add a page at the start of the document to hold the table of contents, and use instances of TocInfo and TextFragment to set the characteristics of the TOC. For every page in the loaded PDF document, insert a hyperlink in the table of contents, set its text, and link it to the required page.

Code to Add Clickable Table of Contents to PDF using Python

import aspose.pdf as pdf

# Set the source directory path
filePath = "C://Words//"

# Load the license in your application to create TOC in PDF
pdf.License().set_license(filePath + "Conholdate.Total.Product.Family.lic")

# Open the sample PDF document file from the disk
pdfDoc = pdf.Document(filePath + "Sample.pdf")

# Insert a page at the start of the document for the table of contents
pageTOC = pdfDoc.pages.insert(1)

# Instantiate an object of TocInfo for TOC information
tocInfo = pdf.TocInfo()

# Create an object of TextFragment for setting TOC title
title = pdf.text.TextFragment("Table Of Contents")
title.text_state.font_size = 20

# Set the title for Table of contents
tocInfo.title = title
pageTOC.toc_info = tocInfo

# Generate a list of strings for TOC
tocTitles = []

# Get count of pages in the PDF (read after the TOC page was inserted)
count = pdfDoc.pages.length

for j in range(0, count):
    tocTitles.insert(j, "Page " + str(j + 1))

i = 0
while i < count:
    # Instantiate an object of the Heading class
    heading = pdf.Heading(1)
    heading.toc_page = pageTOC

    # Set the destination page for the heading object (pages are 1-based)
    heading.destination_page = pdfDoc.pages[i + 1]

    # Set the destination coordinates for TOC item (top of the target page)
    heading.top = pdfDoc.pages[i + 1].rect.height

    # Set TOC item text
    textSegment = pdf.text.TextSegment()
    textSegment.text = tocTitles[i]
    heading.segments.append(textSegment)

    # Add heading to the TOC page
    pageTOC.paragraphs.add(heading)
    i += 1

# Save document with TOC
pdfDoc.save("outputwithToc.pdf")
print("Operation finished successfully")
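
Note that in the listing above the page count is read after the TOC page has been inserted, so the count appears to include the TOC page, and the first entry, "Page 1", would then point at the TOC page itself. If the entries should cover only the original pages, the labels and destination indices can be worked out separately first. The following is a minimal sketch in plain Python, independent of Aspose.PDF. The helper name build_toc_entries and its parameters are hypothetical, and it assumes 1-based page indices with the TOC pages at the front of the document.

```python
def build_toc_entries(total_pages, toc_pages=1):
    """Return (label, destination_index) pairs for the non-TOC pages.

    total_pages: page count of the document after the TOC pages were inserted.
    toc_pages:   number of TOC pages at the start of the document.
    Indices are 1-based, matching pdfDoc.pages[...] in the listing above.
    """
    if total_pages < toc_pages:
        raise ValueError("total_pages must be at least toc_pages")

    entries = []
    content_pages = total_pages - toc_pages
    for offset in range(1, content_pages + 1):
        label = "Page " + str(offset)          # label uses the original page number
        destination = toc_pages + offset       # index shifted past the TOC pages
        entries.append((label, destination))
    return entries
```

Under these assumptions, each pair could supply the heading text and the index passed to pdfDoc.pages[...] inside the loop, in place of tocTitles[i] and i + 1.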

This example demonstrates how to add a table of contents to a PDF using Python. In this example, the TOC entries are generated manually by building a list of strings from the page numbers. However, you can parse the file contents and set the entries as in a standard table of contents, where headings from the PDF content are used inside the TOC and linked to the corresponding content in the PDF file.
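
As a rough illustration of building entries from headings rather than page numbers, the sketch below picks heading-like lines out of per-page text and pairs each with its page number. It is plain Python and rests on several assumptions: the text of each page has already been extracted by some other means, the function name find_headings is hypothetical, and the heuristic (short lines that start with a section number such as "1." or "2.3") is only one possible rule, not something prescribed by Aspose.PDF.

```python
import re

# Hypothetical heuristic: a heading starts with a section number
# such as "1." or "2.3", followed by whitespace and some text.
HEADING_PATTERN = re.compile(r"^\d+(\.\d+)*\.?\s+\S")


def find_headings(page_texts, max_length=60):
    """Return (heading_text, page_number) pairs; page numbers are 1-based.

    page_texts: list of strings, one per page, in document order.
    max_length: lines longer than this are treated as body text.
    """
    headings = []
    for page_number, text in enumerate(page_texts, start=1):
        for line in text.splitlines():
            line = line.strip()
            if len(line) <= max_length and HEADING_PATTERN.match(line):
                headings.append((line, page_number))
    return headings
```

Each resulting pair could then play the role of a TOC title and its destination page, with the page number adjusted for the inserted TOC page.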

This topic has taught us how to create a PDF table of contents using Python. If you are interested in adding hyperlinks to the contents of a PDF file, refer to the article on how to add hyperlink in PDF using Python.
