How to Convert PDF to Text in Java

This short tutorial explains how to convert PDF to text in Java by loading an input PDF document and saving its contents in text format. The Java PDF to Text converter can also be customized to control whether the output text keeps or drops the formatting of the source PDF file.

Steps to Convert PDF to Text in Java

Configure your application by adding a reference to Aspose.PDF from the Maven repository to convert PDF to a text file
Load the input PDF file with a Document class object for the PDF to text conversion
Create an object of the TextAbsorber class to set the text extraction options
Write the extracted text to a text file

The steps above outline how to develop a PDF to Text converter application in Java. First, the input PDF document is loaded using an instance of the Document class. Next, you choose whether the extracted text should keep its formatting. Finally, you can write the resulting text string to a file or process it further as your requirements dictate.

Code to Convert PDF to Text in Java

	import com.aspose.pdf.Document;
	import com.aspose.pdf.License;
	import com.aspose.pdf.TextAbsorber;
	import com.aspose.pdf.TextExtractionOptions;
	import java.io.BufferedWriter;
	import java.io.File;
	import java.io.FileWriter;

	public class ConvertPdfToTextInJava {
		// Main method to convert a PDF document to a text file
		public static void main(String[] args) throws Exception {
			// Instantiate the license to avoid trial limitations while converting the PDF to a text file
			License asposePdfLicenseText = new License();
			asposePdfLicenseText.setLicense("Aspose.pdf.lic");

			// Load the source PDF file that is to be converted to a text file
			Document convertPDFDocumentToText = new Document("input.pdf");

			// Instantiate a TextAbsorber class object for converting PDF to text
			TextAbsorber textAbsorber = new TextAbsorber(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Pure));

			// Call the accept method so the absorber visits every page
			convertPDFDocumentToText.getPages().accept(textAbsorber);

			// Read the extracted text as a string
			String extractedText = textAbsorber.getText();

			// Write the extracted contents to the file; try-with-resources closes the writer even on failure
			try (BufferedWriter writer = new BufferedWriter(new FileWriter(new File("SampleOutput.txt")))) {
				writer.write(extractedText);
			}

			System.out.println("Done");
		}
	}
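
The final step, writing the string or processing it further, can be done with the Java standard library alone. The following is a minimal sketch, not part of the Aspose sample: the class name ExtractedTextWriter and its tidy and writeUtf8 helpers are hypothetical, and the hard-coded string merely stands in for whatever textAbsorber.getText() returns. It normalizes line endings, trims trailing whitespace, and writes the result with an explicit UTF-8 encoding rather than relying on the platform default that FileWriter may use.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExtractedTextWriter {

    // Normalizes every line break to "\n", removes trailing whitespace on each line,
    // and trims leading and trailing whitespace from the text as a whole.
    public static String tidy(String extractedText) {
        StringBuilder out = new StringBuilder();
        for (String line : extractedText.split("\\R", -1)) {
            out.append(line.replaceAll("\\s+$", "")).append('\n');
        }
        return out.toString().trim();
    }

    // Writes the text to the target file as UTF-8, replacing any existing content.
    public static void writeUtf8(Path target, String text) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the string returned by the text extraction step (assumption).
        String extractedText = "Page one heading   \r\nFirst paragraph.\t\n\n";

        // A temporary file is used here so the sketch does not overwrite real output.
        Path target = Files.createTempFile("SampleOutput", ".txt");
        writeUtf8(target, tidy(extractedText));

        System.out.println("Wrote " + target);
    }
}
```

An explicit charset matters when the PDF contains non-ASCII text, because the default encoding can differ between machines and Java versions.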


This sample code shows how to convert PDF to text in Java with full control over the extraction through several options. For example, the TextAbsorber class has multiple constructors, one of which accepts TextSearchOptions, which includes an option to convert shaded text in the source PDF into separate text. Similarly, you can set a flag to limit the search to text within the page bounds, or set a rectangle so that text is extracted only from that specific area on every page.

In this article, we learned how to convert PDF to text in Java, along with a code snippet. If you want to learn how to convert PDF to Word, refer to the article How to Convert PDF to Word in Java.
