adobe.pdfservices.operation.pdfops package
Subpackages
- adobe.pdfservices.operation.pdfops.options package
- Subpackages
- adobe.pdfservices.operation.pdfops.options.extractpdf package
- Submodules
- adobe.pdfservices.operation.pdfops.options.extractpdf.extract_pdf_options module
- adobe.pdfservices.operation.pdfops.options.extractpdf.extract_element_type module
- adobe.pdfservices.operation.pdfops.options.extractpdf.extract_renditions_element_type module
- adobe.pdfservices.operation.pdfops.options.extractpdf.table_structure_type module
- Module contents
- adobe.pdfservices.operation.pdfops.options.autotagpdf package
- Module contents
Submodules
adobe.pdfservices.operation.pdfops.extract_pdf_operation module
- class adobe.pdfservices.operation.pdfops.extract_pdf_operation.ExtractPDFOperation(create_key)
Bases:
Operation
An Operation that extracts PDF elements, such as text and tables, from a PDF in a structured format, along with renditions for tables and figures.
Sample usage.
    try:
        # Resolve paths relative to the parent of this file's directory.
        base_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

        # Build credentials from environment variables.
        credentials = Credentials.service_principal_credentials_builder(). \
            with_client_id(os.getenv('PDF_SERVICES_CLIENT_ID')). \
            with_client_secret(os.getenv('PDF_SERVICES_CLIENT_SECRET')). \
            build()

        execution_context = ExecutionContext.create(credentials)
        extract_pdf_operation = ExtractPDFOperation.create_new()

        # Set the input file.
        source = FileRef.create_from_local_file(base_path + "/resources/extractPdfInput.pdf")
        extract_pdf_operation.set_input(source)

        # Extract text and tables, plus renditions of tables and figures.
        extract_pdf_options: ExtractPDFOptions = ExtractPDFOptions.builder() \
            .with_elements_to_extract([ExtractElementType.TEXT, ExtractElementType.TABLES]) \
            .with_elements_to_extract_renditions([ExtractRenditionsElementType.TABLES,
                                                  ExtractRenditionsElementType.FIGURES]) \
            .with_get_char_info(True) \
            .with_include_styling_info(True) \
            .build()
        extract_pdf_operation.set_options(extract_pdf_options)

        # Run the operation and save the resulting ZIP file.
        result: FileRef = extract_pdf_operation.execute(execution_context)
        result.save_as(base_path + "/output/ExtractTextTableWithFigureTableRendition.zip")
    except (ServiceApiException, ServiceUsageException, SdkException):
        logging.exception("Exception encountered while executing operation")
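The sample above reads both credentials with os.getenv, which returns None when a variable is unset. A small pre-flight check can report a missing variable before the credentials builder is called. The sketch below uses only the standard library; `read_pdf_services_env` and `REQUIRED_VARS` are hypothetical helper names, not part of the SDK.

```python
import os

# Variable names taken from the sample usage above.
REQUIRED_VARS = ("PDF_SERVICES_CLIENT_ID", "PDF_SERVICES_CLIENT_SECRET")


def read_pdf_services_env(environ=os.environ):
    """Return (client_id, client_secret); raise if either is unset or empty.

    Hypothetical helper for illustration only, not an SDK function.
    """
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ[REQUIRED_VARS[0]], environ[REQUIRED_VARS[1]]
```

The returned pair can then be passed to with_client_id and with_client_secret in place of the direct os.getenv calls.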
- SUPPORTED_SOURCE_MEDIA_TYPES = {adobe.pdfservices.operation.internal.extension_media_type_mapping.ExtensionMediaTypeMapping.PDF.mime_type}
The only supported source file format for ExtractPDFOperation is .pdf.
- classmethod create_new()
Creates a new instance of ExtractPDFOperation.
- Returns:
A new instance of ExtractPDFOperation
- Return type:
ExtractPDFOperation
- execute(execution_context: ExecutionContext)
Executes this operation synchronously using the supplied context and returns a new FileRef instance for the resulting ZIP file. The resulting file may be stored in the system temporary directory. See adobe.pdfservices.operation.io.file_ref.FileRef for how temporary resources are cleaned up.
- Parameters:
execution_context (ExecutionContext) – The context in which the operation will be executed.
- Returns:
The FileRef to the result.
- Return type:
FileRef
- Raises:
ServiceApiException – if an API call results in an error response.
- get_options()
Gets the ExtractPDFOptions.
- Returns:
The options parameter of the operation
- Return type:
ExtractPDFOptions
- set_input(source_file_ref: FileRef)
Sets an input file.
- Parameters:
source_file_ref (FileRef) – An input file.
- Returns:
This instance, so that additional parameters can be set.
- Return type:
ExtractPDFOperation
- set_options(extract_pdf_options: ExtractPDFOptions)
Sets the ExtractPDFOptions.
- Parameters:
extract_pdf_options (ExtractPDFOptions) – ExtractPDFOptions to set.
- Returns:
This instance, so that additional parameters can be set.
- Return type:
ExtractPDFOperation
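execute() returns a FileRef for a ZIP archive, which the sample saves with save_as. The following sketch shows one way to read extracted text back out of such an archive with the standard library. It assumes the archive holds a JSON file named structuredData.json whose top-level `elements` list carries `Text` entries; this page does not describe the archive layout, so treat those names, and the helper `read_extracted_text`, as assumptions to check against an actual result.

```python
import json
import zipfile


def read_extracted_text(zip_file, json_name="structuredData.json"):
    """Return the "Text" value of every element that has one.

    zip_file may be a path or a binary file-like object.
    The JSON file name and the "elements"/"Text" keys are assumptions
    about the result archive, not something this page documents.
    """
    with zipfile.ZipFile(zip_file) as archive:
        with archive.open(json_name) as handle:
            data = json.load(handle)
    return [el["Text"] for el in data.get("elements", []) if "Text" in el]
```

Called with the path passed to save_as, the helper returns a list of strings in the order the elements appear in the JSON file. Elements without a `Text` key are skipped.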
adobe.pdfservices.operation.pdfops.autotag_pdf_operation module
- class adobe.pdfservices.operation.pdfops.autotag_pdf_operation.AutotagPDFOperation(create_key)
Bases:
Operation
An operation that enables clients to improve the accessibility of a PDF document. It generates a tagged PDF, along with an optional XLSX report providing detailed information about the added tags. The operation replaces any existing tags within the input document, so it provides the most benefit for PDFs that have no tags or low-quality tags.
Sample usage.
    try:
        # Resolve paths relative to the grandparent of this file's directory.
        base_path = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

        # Build credentials from environment variables.
        credentials = Credentials.service_principal_credentials_builder(). \
            with_client_id(os.getenv('PDF_SERVICES_CLIENT_ID')). \
            with_client_secret(os.getenv('PDF_SERVICES_CLIENT_SECRET')). \
            build()

        execution_context = ExecutionContext.create(credentials)
        autotag_pdf_operation = AutotagPDFOperation.create_new()

        # Set the input file.
        input_file_path = 'autotagPdfInput.pdf'
        source = FileRef.create_from_local_file(base_path + "/resources/" + input_file_path)
        autotag_pdf_operation.set_input(source)

        # Enable heading shifting and generation of the report.
        autotag_pdf_options: AutotagPDFOptions = AutotagPDFOptions.builder() \
            .with_shift_headings() \
            .with_generate_report() \
            .build()
        autotag_pdf_operation.set_options(autotag_pdf_options)

        autotag_pdf_output: AutotagPDFOutput = autotag_pdf_operation.execute(execution_context)

        # Create the output directory, then save the tagged PDF and the report.
        input_file_name = Path(input_file_path).stem
        base_output_path = base_path + "/output/AutotagPDFWithOptions/"
        Path(base_output_path).mkdir(parents=True, exist_ok=True)

        tagged_pdf_path = f'{base_output_path}{input_file_name}-tagged.pdf'
        report_path = f'{base_output_path}{input_file_name}-report.xlsx'
        autotag_pdf_output.get_tagged_pdf().save_as(tagged_pdf_path)
        autotag_pdf_output.get_report().save_as(report_path)
    except (ServiceApiException, ServiceUsageException, SdkException) as e:
        logging.exception(f'Exception encountered while executing operation: {e}')
- SUPPORTED_SOURCE_MEDIA_TYPES = {adobe.pdfservices.operation.internal.extension_media_type_mapping.ExtensionMediaTypeMapping.PDF.mime_type}
The only supported source file format for AutotagPDFOperation is .pdf.
- classmethod create_new()
Creates a new instance of AutotagPDFOperation.
- Returns:
A new instance of AutotagPDFOperation
- Return type:
AutotagPDFOperation
- execute(execution_context: ExecutionContext)
Executes this operation synchronously using the supplied context and returns a new AutotagPDFOutput instance for the generated tagged PDF file and XLSX report file. The resulting files may be stored in the system temporary directory. See adobe.pdfservices.operation.io.file_ref.FileRef for how temporary resources are cleaned up.
- Parameters:
execution_context (ExecutionContext) – The context in which the operation will be executed.
- Returns:
The instance of AutotagPDFOutput.
- Return type:
AutotagPDFOutput
- Raises:
ServiceApiException – if an API call results in an error response.
- get_options()
Gets the AutotagPDFOptions.
- Returns:
The options parameter of the operation
- Return type:
AutotagPDFOptions
- set_input(source_file_ref: FileRef)
Sets an input file.
- Parameters:
source_file_ref (FileRef) – An input file.
- Returns:
This instance, so that additional parameters can be set.
- Return type:
AutotagPDFOperation
- set_options(autotag_pdf_options: AutotagPDFOptions)
Sets the AutotagPDFOptions.
- Parameters:
autotag_pdf_options (AutotagPDFOptions) – AutotagPDFOptions to set.
- Returns:
This instance, so that additional parameters can be set.
- Return type:
AutotagPDFOperation
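Both operations on this page list only the PDF media type in SUPPORTED_SOURCE_MEDIA_TYPES. A cheap local check before set_input can catch an obviously wrong input file early. The sketch below is a heuristic and does not reflect how the SDK itself validates input: `looks_like_pdf` is a hypothetical helper that checks the .pdf suffix and the `%PDF-` bytes that open a PDF file.

```python
from pathlib import Path

# Leading bytes of a PDF file header, e.g. b"%PDF-1.7".
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(path):
    """Return True if the file has a .pdf suffix and starts with b"%PDF-".

    Hypothetical pre-flight helper for illustration, not an SDK function.
    """
    path = Path(path)
    if path.suffix.lower() != ".pdf":
        return False
    with path.open("rb") as handle:
        return handle.read(len(PDF_MAGIC)) == PDF_MAGIC
```

A caller could skip the operation, or log a warning, when the helper returns False for the intended source path.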