How to Read a PDF Table in C#

This short tutorial explains how to read a PDF table in C# and retrieve all of its contents. It describes how to parse the tables on a page of a PDF file and then access each individual row and cell of a given table. The C# code for reading a table from a PDF takes only a few lines: it loads the source PDF file and then parses the tables so that their content can be read.

Steps to Read a PDF Table in C#

Add a reference to Aspose.PDF for .NET to read table data from the PDF
Load the source PDF file using a Document class object
Create an instance of the TableAbsorber class and read all the tables from the desired PDF page
Iterate through all the rows in the target PDF table
Iterate through all the cells in each row and fetch all the text fragments
Display or process each text fragment in a cell

These steps follow a systematic approach to reading a PDF table in C#: the PDF file is loaded first, and the tables are then parsed using a TableAbsorber class object. Once the tables have been visited, you can obtain a reference to any table in the parsed collection. From there you can access any table, row, cell, and text fragment in the PDF file to process or display it.

Code to Read a PDF Table in C#

	using System;
	using Aspose.Pdf;
	using Aspose.Pdf.Text;

	namespace ReadPDFTableInCSharp
	{
	    class Program
	    {
	        static void Main(string[] args)
	        {
	            // Instantiate the license to avoid trial limitations while reading table data from PDF
	            License asposePdfLicense = new License();
	            asposePdfLicense.SetLicense("Aspose.pdf.lic");

	            // Load source PDF document having a table in it
	            Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(@"PdfWithTable.pdf");

	            // Declare and initialize TableAbsorber class object for reading table from the PDF
	            Aspose.Pdf.Text.TableAbsorber tableAbsorber = new Aspose.Pdf.Text.TableAbsorber();

	            // Parse all the tables from the desired page in the PDF (page numbering starts at 1)
	            tableAbsorber.Visit(pdfDocument.Pages[1]);

	            // Stop early if the page has no tables, otherwise TableList[0] would throw
	            if (tableAbsorber.TableList.Count == 0)
	            {
	                Console.WriteLine("No tables found on the page");
	                return;
	            }

	            // Get reference to the first table in the parsed collection
	            AbsorbedTable absorbedTable = tableAbsorber.TableList[0];

	            // Iterate through all the rows in the PDF table
	            foreach (AbsorbedRow pdfTableRow in absorbedTable.RowList)
	            {
	                // Iterate through all the cells in the PDF table row
	                foreach (AbsorbedCell pdfTableCell in pdfTableRow.CellList)
	                {
	                    // Fetch all the text fragments in the cell
	                    TextFragmentCollection textFragmentCollection = pdfTableCell.TextFragments;

	                    // Iterate through all the text fragments
	                    foreach (TextFragment textFragment in textFragmentCollection)
	                    {
	                        // Display the text
	                        Console.WriteLine(textFragment.Text);
	                    }
	                }
	            }
	            Console.WriteLine("Done");
	        }
	    }
	}
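
The sample above reads only the first table on page 1. As a rough sketch of how the same classes could be used to walk every table on every page, the variant below creates a new TableAbsorber for each page and prints each row as one tab-separated line. The class name ReadAllPdfTables and the helper JoinRow are hypothetical names introduced for illustration, the file name is reused from the sample above, and the license setup is omitted for brevity.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Pdf;
using Aspose.Pdf.Text;

public class ReadAllPdfTables
{
    // Hypothetical helper: joins the texts of one row's cells with tabs
    public static string JoinRow(IEnumerable<string> cellTexts)
    {
        return string.Join("\t", cellTexts);
    }

    public static void Main()
    {
        // Same sample file name as in the code above (an assumption)
        Document pdfDocument = new Document("PdfWithTable.pdf");

        int pageNumber = 0;
        foreach (Page page in pdfDocument.Pages)
        {
            pageNumber++;

            // A fresh absorber per page, so TableList holds only this page's tables
            TableAbsorber tableAbsorber = new TableAbsorber();
            tableAbsorber.Visit(page);

            int tableNumber = 0;
            foreach (AbsorbedTable table in tableAbsorber.TableList)
            {
                tableNumber++;
                Console.WriteLine("Page " + pageNumber + ", table " + tableNumber);

                foreach (AbsorbedRow row in table.RowList)
                {
                    List<string> cellTexts = new List<string>();
                    foreach (AbsorbedCell cell in row.CellList)
                    {
                        // A cell may hold several text fragments; merge them with spaces
                        List<string> fragments = new List<string>();
                        foreach (TextFragment fragment in cell.TextFragments)
                        {
                            fragments.Add(fragment.Text);
                        }
                        cellTexts.Add(string.Join(" ", fragments));
                    }
                    Console.WriteLine(JoinRow(cellTexts));
                }
            }
        }
    }
}
```

Printing one line per row keeps the table shape visible in the console; if the data is meant for further processing, the per-row list of cell texts can be collected instead of printed.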


In this sample code, the PDF table is parsed in C# with the TableAbsorber class, which is dedicated to reading tables. However, you can also use other options such as TextAbsorber, ParagraphAbsorber, FontAbsorber, and TextFragmentAbsorber to access different elements of the document. You can iterate through an entire collection or access individual elements by index.
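
As a minimal sketch of two of those alternatives, the following shows TextAbsorber returning the plain text of the document and TextFragmentAbsorber exposing the same text as individual fragments. The class name ReadPdfText is a hypothetical name for illustration, the file name is assumed to be the same sample file, and the license setup is again omitted.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

public class ReadPdfText
{
    public static void Main()
    {
        // Same sample file name as in the table example (an assumption)
        Document pdfDocument = new Document("PdfWithTable.pdf");

        // TextAbsorber collects the plain text of every page it visits
        TextAbsorber textAbsorber = new TextAbsorber();
        pdfDocument.Pages.Accept(textAbsorber);
        Console.WriteLine(textAbsorber.Text);

        // TextFragmentAbsorber exposes the text as individual fragments
        TextFragmentAbsorber fragmentAbsorber = new TextFragmentAbsorber();
        pdfDocument.Pages.Accept(fragmentAbsorber);
        foreach (TextFragment fragment in fragmentAbsorber.TextFragments)
        {
            Console.WriteLine(fragment.Text);
        }
    }
}
```

Unlike TableAbsorber, these absorbers do not preserve row and cell boundaries, so they are better suited to free-flowing text than to tabular data.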

In this topic, we learned how to read a PDF table in C#. If you also want to read PDF bookmarks, refer to the article on how to read bookmarks in a PDF using C#.
