adobe.pdfservices.operation.pdfjobs.params.extract_pdf package
Submodules
adobe.pdfservices.operation.pdfjobs.params.extract_pdf.extract_element_type module
- class adobe.pdfservices.operation.pdfjobs.params.extract_pdf.extract_element_type.ExtractElementType(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
Supported inputs for Elements to Extract in ExtractPDFJob.
- TABLES = 'tables'
Tabular Data
- TEXT = 'text'
Textual Data
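Because the members carry plain string values, a requested element name can be turned back into a member by value lookup. The sketch below illustrates this with a local stand-in enum that only mirrors the two documented members, so it runs without the SDK installed. With the SDK available, the class would be imported from the module path listed above rather than redefined.

```python
from enum import Enum


# Local stand-in that mirrors the documented members (an assumption for
# illustration; the real class lives in the extract_element_type module).
class ExtractElementType(Enum):
    TABLES = "tables"
    TEXT = "text"


# Standard Enum lookup by value, e.g. for names read from a config file.
requested = [ExtractElementType(name) for name in ("text", "tables")]

# The string values can be recovered from the members again.
values = [element.value for element in requested]
print(requested, values)
```

An unknown string such as `"images"` would raise `ValueError`, which is standard `Enum` behavior rather than anything specific to this package.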
adobe.pdfservices.operation.pdfjobs.params.extract_pdf.extract_pdf_params module
- class adobe.pdfservices.operation.pdfjobs.params.extract_pdf.extract_pdf_params.ExtractPDFParams(*, table_structure_type: TableStructureType = TableStructureType.XLSX, add_char_info: bool = False, styling_info: bool = False, elements_to_extract: List | None = None, elements_to_extract_renditions: List | None = None)
Bases: PDFServicesJobParams
Parameters to extract content from PDF using ExtractPDFJob.
Construct a new ExtractPDFParams.
- Parameters:
table_structure_type (TableStructureType) – TableStructureType for the output format of the table structure. (Optional, use key-value)
add_char_info (bool) – Boolean specifying whether to add character-level bounding boxes to the output JSON. (Optional, use key-value)
styling_info (bool) – Boolean specifying whether to add styling information to the output JSON. (Optional, use key-value)
elements_to_extract (List) – List of ExtractElementType to be extracted. (Optional, use key-value)
elements_to_extract_renditions (List) – List of ExtractRenditionsElementType to be extracted as renditions. (Optional, use key-value)
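The leading `*` in the signature means every parameter in the list above is keyword-only, which is what "use key-value" refers to. The sketch below illustrates this with stand-in classes that copy only the documented signature and defaults. The stand-ins are assumptions for illustration, not the SDK implementation, and with the SDK installed the real classes would be imported from the module paths on this page.

```python
from enum import Enum
from typing import List, Optional


# Stand-in enums mirroring the documented members (illustrative only).
class TableStructureType(Enum):
    CSV = "csv"
    XLSX = "xlsx"


class ExtractElementType(Enum):
    TABLES = "tables"
    TEXT = "text"


class ExtractPDFParams:
    """Stand-in with the documented keyword-only signature; not the SDK class."""

    def __init__(
        self,
        *,
        table_structure_type: TableStructureType = TableStructureType.XLSX,
        add_char_info: bool = False,
        styling_info: bool = False,
        elements_to_extract: Optional[List] = None,
        elements_to_extract_renditions: Optional[List] = None,
    ):
        self._table_structure_type = table_structure_type
        self._add_char_info = add_char_info
        self._styling_info = styling_info
        self._elements_to_extract = elements_to_extract
        self._elements_to_extract_renditions = elements_to_extract_renditions

    def get_table_structure_type(self) -> TableStructureType:
        return self._table_structure_type

    def get_add_char_info(self) -> bool:
        return self._add_char_info

    def get_elements_to_extract(self) -> Optional[List]:
        return self._elements_to_extract


# Every argument is passed by keyword; omitted ones keep their defaults.
params = ExtractPDFParams(
    table_structure_type=TableStructureType.CSV,
    elements_to_extract=[ExtractElementType.TEXT, ExtractElementType.TABLES],
)

# A positional argument is rejected because of the bare `*` in the signature.
try:
    ExtractPDFParams(TableStructureType.CSV)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Here `params.get_table_structure_type()` returns `TableStructureType.CSV`, while `params.get_add_char_info()` keeps the documented default of `False`.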
- get_add_char_info()
- Returns:
Whether character-level information was requested for the operation.
- Return type:
bool
- get_elements_to_extract()
- Returns:
The list of elements (Text and/or Tables) requested for the operation.
- Return type:
list
- get_elements_to_extract_renditions()
- Returns:
The list of ExtractRenditionsElementType requested for the job.
- Return type:
list
- get_styling_info()
- Returns:
Whether styling information was requested for the operation.
- Return type:
bool
- get_table_structure_type()
- Returns:
The TableStructureType of the resulting rendition.
- Return type:
TableStructureType
adobe.pdfservices.operation.pdfjobs.params.extract_pdf.extract_renditions_element_type module
- class adobe.pdfservices.operation.pdfjobs.params.extract_pdf.extract_renditions_element_type.ExtractRenditionsElementType(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
Supported inputs for Renditions To Extract in ExtractPDFJob.
- FIGURES = 'figures'
Image Data
- TABLES = 'tables'
Tabular Data
adobe.pdfservices.operation.pdfjobs.params.extract_pdf.table_structure_type module
- class adobe.pdfservices.operation.pdfjobs.params.extract_pdf.table_structure_type.TableStructureType(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)
Bases: Enum
Supported Formats for exporting Table Data in ExtractPDFJob.
- CSV = 'csv'
CSV Format
- XLSX = 'xlsx'
XLSX Format