How to Read PDF Table in C#

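This short how-to tutorial explains how to read a PDF table in C# and retrieve all of its contents. It describes in detail how to parse all the tables in a PDF file and access the individual rows and cells of a particular table. The C# code to read a table from a PDF is only a few lines long: it loads the source PDF file, parses the tables, and reads their contents.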

Steps to Read PDF Table in C#

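Add a reference to Aspose.PDF for .NET to read table data from the PDF
Load the source PDF file using a Document class object
Instantiate a TableAbsorber class object and read all the tables from the desired PDF page
Iterate through all the rows of the target PDF table
Iterate through all the cells of each row and fetch all the text fragments
Display or process each text fragment in the cell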

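These steps follow a systematic approach to reading a PDF table in C#. First the PDF file is loaded, and then all the tables are parsed using a TableAbsorber class object. Once the tables have been parsed, you can get a reference to any table in the parsed collection. From there you can access any row, cell, and text fragment of that table and process or display it.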
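
The steps above can also be applied to every table in a document rather than to a single table. The following is a minimal sketch under that assumption: it visits each page in turn and prints every table it finds, one row per line. The file name PdfWithTable.pdf is a placeholder, the class name ReadAllTables is only illustrative, and the license setup is left out, so evaluation limits may apply.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ReadAllTables
{
    static void Main()
    {
        // Placeholder path: replace with your own PDF file
        Document pdfDocument = new Document("PdfWithTable.pdf");

        foreach (Page page in pdfDocument.Pages)
        {
            // Use a fresh absorber for each page so TableList holds only this page's tables
            TableAbsorber absorber = new TableAbsorber();
            absorber.Visit(page);

            for (int t = 0; t < absorber.TableList.Count; t++)
            {
                Console.WriteLine($"Page {page.Number}, table {t}");

                foreach (AbsorbedRow row in absorber.TableList[t].RowList)
                {
                    foreach (AbsorbedCell cell in row.CellList)
                    {
                        // A cell may hold several text fragments; print them side by side
                        foreach (TextFragment fragment in cell.TextFragments)
                        {
                            Console.Write(fragment.Text + " ");
                        }
                        Console.Write("| ");
                    }
                    Console.WriteLine();
                }
            }
        }
    }
}
```

Pages without tables simply yield an empty TableList, so the inner loops are skipped for them.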

Code to Read PDF Table in C#

	using System;
	using Aspose.Pdf;
	using Aspose.Pdf.Text;

	namespace ReadPDFTableInCSharp
	{
	    class Program
	    {
	        static void Main(string[] args)
	        {
	            // Instantiate the license to avoid trial limitations while reading table data from PDF
	            License asposePdfLicense = new License();
	            asposePdfLicense.SetLicense("Aspose.pdf.lic");

	            // Load source PDF document having a table in it
	            Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(@"PdfWithTable.pdf");

	            // Declare and initialize TableAbsorber class object for reading table from the PDF
	            Aspose.Pdf.Text.TableAbsorber tableAbsorber = new Aspose.Pdf.Text.TableAbsorber();

	            // Parse all the tables from the desired page in the PDF
	            tableAbsorber.Visit(pdfDocument.Pages[1]);

	            // Get reference to the first table in the parsed collection
	            AbsorbedTable absorbedTable = tableAbsorber.TableList[0];

	            // Iterate through all the rows in the PDF table
	            foreach (AbsorbedRow pdfTableRow in absorbedTable.RowList)
	            {
	                // Iterate through all the cells in the pdf table row
	                foreach (AbsorbedCell pdfTableCell in pdfTableRow.CellList)
	                {
	                    // Fetch all the text fragments in the cell
	                    TextFragmentCollection textFragmentCollection = pdfTableCell.TextFragments;

	                    // Iterate through all the text fragments
	                    foreach (TextFragment textFragment in textFragmentCollection)
	                    {
	                        // Display the text
	                        Console.WriteLine(textFragment.Text);
	                    }
	                }
	            }
	            System.Console.WriteLine("Done");
	        }
	    }
	}
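
The program above indexes TableList[0] directly, which throws if the page contains no table. As a hedged illustration of safer index-based access, the sketch below wraps the lookup in a helper that checks each index first. TableCellReader and ReadCell are hypothetical names introduced here and are not part of Aspose.PDF; only the TableAbsorber, TableList, RowList, CellList, and TextFragments members come from the code above.

```csharp
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Text;

static class TableCellReader
{
    // Hypothetical helper (not an Aspose.PDF API): returns the joined text of one cell,
    // or null when the table, row, or cell index is out of range. All indexes are 0-based.
    public static string ReadCell(Page page, int tableIndex, int rowIndex, int cellIndex)
    {
        TableAbsorber absorber = new TableAbsorber();
        absorber.Visit(page);

        if (tableIndex < 0 || tableIndex >= absorber.TableList.Count) return null;
        AbsorbedTable table = absorber.TableList[tableIndex];

        if (rowIndex < 0 || rowIndex >= table.RowList.Count) return null;
        AbsorbedRow row = table.RowList[rowIndex];

        if (cellIndex < 0 || cellIndex >= row.CellList.Count) return null;

        // Concatenate every text fragment found in the requested cell
        StringBuilder text = new StringBuilder();
        foreach (TextFragment fragment in row.CellList[cellIndex].TextFragments)
        {
            text.Append(fragment.Text);
        }
        return text.ToString();
    }
}
```

For example, TableCellReader.ReadCell(pdfDocument.Pages[1], 0, 0, 0) would return the text of the first cell in the first row of the first table on page 1, or null if the page has no table.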


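This sample code parses PDF tables in C# using the TableAbsorber class, which is dedicated to reading tables. However, other options such as TextAbsorber, ParagraphAbsorber, FontAbsorber, and TextFragmentAbsorber can be used to access different elements of the document. You can iterate through an entire collection or access individual elements by index.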
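
As a brief illustration of one of those alternatives, the sketch below uses TextAbsorber to pull the plain text of the whole document instead of its tables. It assumes the same placeholder file name as above and omits the license setup; the class name ReadAllText is only illustrative.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ReadAllText
{
    static void Main()
    {
        // Placeholder path: replace with your own PDF file
        Document pdfDocument = new Document("PdfWithTable.pdf");

        // TextAbsorber collects text rather than table structure
        TextAbsorber textAbsorber = new TextAbsorber();

        // Accept the absorber on the page collection to process every page
        pdfDocument.Pages.Accept(textAbsorber);

        // Print the extracted text
        Console.WriteLine(textAbsorber.Text);
    }
}
```

The text comes back as one string without row or cell boundaries, so TableAbsorber remains the better fit when the table structure matters.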

In this topic, you learned how to read a PDF table in C#. If you also want to read PDF bookmarks, refer to the article on how to read bookmarks in a PDF using C#.

Aspose Knowledge Base
