How to Extract All Images from a Word Document in Python

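This quick tutorial explains how to extract all images from a Word document in Python. It covers the resources needed to configure the environment and introduces the API elements required for the task, including class names, methods, and properties. By following these steps you can write a complete program that extracts images from a Word document in Python, for example pulling the images out of a DOCX file and saving them to disk as PNG, JPG, or other image types.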

Steps to Extract Images from a Word File in Python

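Configure the environment to use Aspose.Words for Python via .NET to extract images
Load the source Word file containing images using a Document class object
Get the list of all shapes from the loaded document using the get_child_nodes() method
Iterate over every element in the shapes collection and detect the images
Create a unique file name for each image detected in the shapes collection
Save each extracted image to disk under its unique name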

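These steps describe the process of extracting pictures from a Word document in Python, covering both the environment configuration and the programming steps. Once the environment is set up, load the Word file containing images with a Document class object and fetch the collection of all shapes from it. Because the Shape class exposes the has_image property for detecting images, you can extract each image and save it to disk under a specified name.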
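The file-naming step can be isolated into a small helper. The following is a minimal sketch, not part of the Aspose API: the function name `build_image_path` and its `output_dir` and `prefix` parameters are hypothetical, chosen only for illustration. It assumes the extension string comes from a call such as `FileFormatUtil.image_type_to_extension`, as in the article's code, and it accepts the extension with or without a leading dot.

```python
from pathlib import Path


def build_image_path(index, extension, output_dir=".", prefix="File.extract_images"):
    """Return an output path for the index-th extracted image.

    Uniqueness comes from the running image counter, as in the article's code.
    """
    # Accept both "png" and ".png" so the caller does not have to normalize.
    suffix = extension if extension.startswith(".") else f".{extension}"
    return Path(output_dir) / f"{prefix}.{index}{suffix}"
```

Because the counter is part of the name, two images of the same type never collide, and passing a different `output_dir` keeps the extracted files out of the working directory.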

Code to Extract Photos from a Word Document in Python

	import aspose.words as aw

	# Load the license
	wordProtected = aw.License()
	wordProtected.set_license("Aspose.Total.lic")

	# Load a document
	wordDocument = aw.Document("WordFileWithImages.docx")

	# Get shapes collection
	allShapes = wordDocument.get_child_nodes(aw.NodeType.SHAPE, True)

	# Declare counter for images
	index = 0

	# Iterate through all the shapes to detect and save images
	for shape in allShapes:
	    # Type cast the node object to shape
	    shape = shape.as_shape()
	    if shape.has_image:
	        index = index + 1

	        # Prepare file name using the image counter and image type in the shape object
	        image_file_name = f"File.extract_images.{index}{aw.FileFormatUtil.image_type_to_extension(shape.image_data.image_type)}"

	        # Save the extracted image on the disk
	        shape.image_data.save(image_file_name)

	print("Images extracted successfully from the Word file")


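The code provided here demonstrates extracting pictures from a Word document in Python by loading the source file with a Document class object, which supports many options such as supplying a password for protected files, setting the encoding, and attaching a warning callback to control the loading process. Similarly, the get_child_nodes() method is used here to fetch shapes, but you can also use it to get other nodes such as headers, footers, tables, comments, footnotes, and the body.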
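As a rough illustration of those two points, the sketch below loads a file with an optional password and counts a few other node types with the same get_child_nodes() call. It is written under assumptions rather than taken from the article: the function names `count_nodes_by_type` and `summarize_word_file` are hypothetical, and the code assumes that `aw.loading.LoadOptions` exposes a `password` property and that `aw.NodeType` provides the `TABLE`, `COMMENT`, `FOOTNOTE`, and `HEADER_FOOTER` members. Check these names against the Aspose.Words for Python API reference before relying on them.

```python
def count_nodes_by_type(document, node_types):
    """Map each label to the number of matching nodes anywhere in the document."""
    # The second argument (True) asks get_child_nodes() to search the whole tree,
    # mirroring the call used for shapes in the article's code.
    return {
        label: document.get_child_nodes(node_type, True).count
        for label, node_type in node_types.items()
    }


def summarize_word_file(path, password=None):
    """Load a Word file, optionally with a password, and count a few node types."""
    # Imported here so count_nodes_by_type() stays usable without the package installed.
    import aspose.words as aw

    # Assumption: LoadOptions lives in aw.loading and exposes a `password` property.
    load_options = aw.loading.LoadOptions()
    if password is not None:
        load_options.password = password
    document = aw.Document(path, load_options)

    return count_nodes_by_type(document, {
        "shapes": aw.NodeType.SHAPE,
        "tables": aw.NodeType.TABLE,
        "comments": aw.NodeType.COMMENT,
        "footnotes": aw.NodeType.FOOTNOTE,
        "headers_footers": aw.NodeType.HEADER_FOOTER,
    })
```

Calling `summarize_word_file("WordFileWithImages.docx")` would return a dictionary of counts keyed by the labels above; the same pattern extends to any other node type the library defines.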

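This article has guided us through retrieving all photos from a Word file in Python. If you want to learn how to insert a photo into a Word file, refer to the article on how to insert a picture in Word using Python.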
