API Reference
ClientConfig
class adobe.pdfservices.operation.client_config.ClientConfig
Bases: object
Encapsulates the API request configurations.

class adobe.pdfservices.operation.client_config.ClientConfig.Builder
Bases: object
Builds a ClientConfig instance.

build()
Returns a new ClientConfig instance built from the current state of this builder.
- Returns
A ClientConfig instance.
- Return type
ClientConfig

from_file(client_config_file_path: str)
Sets the connect timeout and read timeout using the JSON client config file path. All the keys in the JSON structure are optional.
JSON structure:
    {
      "connectTimeout": "4000",
      "readTimeout": "20000"
    }
- Parameters
client_config_file_path (str) – JSON client config file path
- Returns
This Builder instance to add any additional parameters.
- Return type
ClientConfig.Builder
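
As a rough illustration of the JSON structure above, the following sketch writes a client config file using only the standard library. The helper name `write_client_config` and the temporary file location are assumptions made for this example and are not part of the SDK. The values are written as strings only because the sample structure shows them that way.

```python
import json
import os
import tempfile


def write_client_config(path, connect_timeout_ms=None, read_timeout_ms=None):
    """Write a client config JSON file; both keys are optional (hypothetical helper)."""
    config = {}
    for key, value in (("connectTimeout", connect_timeout_ms),
                       ("readTimeout", read_timeout_ms)):
        if value is None:
            continue  # the key is optional, so it is simply left out
        if value <= 0:
            raise ValueError(f"{key} must be greater than zero")
        # The documented sample stores the values as strings, so mirror that.
        config[key] = str(value)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    return config


config_path = os.path.join(tempfile.mkdtemp(), "client_config.json")
written = write_client_config(config_path, connect_timeout_ms=4000, read_timeout_ms=20000)
print(written)
```

With the SDK installed, the resulting path could then be handed to `ClientConfig.builder().from_file(config_path).build()`.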

with_connect_timeout(connect_timeout: int)
Sets the connect timeout. It should be greater than zero.
- Parameters
connect_timeout (int) – The timeout in milliseconds until a connection is established in the API calls. The default value is 4000 milliseconds.
- Returns
This Builder instance to add any additional parameters.
- Return type
ClientConfig.Builder

with_read_timeout(read_timeout: int)
Sets the read timeout. It should be greater than zero.
- Parameters
read_timeout (int) – The read timeout in milliseconds, that is, the number of milliseconds the client will wait for the server to send a response after the connection is established. The default value is 10000 milliseconds.
- Returns
This Builder instance to add any additional parameters.
- Return type
ClientConfig.Builder

static builder()
Creates a new ClientConfig builder.
- Returns
A ClientConfig.Builder instance.
- Return type
ClientConfig.Builder

Credentials
class adobe.pdfservices.operation.auth.credentials.Credentials
Bases: abc.ABC
Marker base class for different types of credentials. Currently it supports only ServiceAccountCredentials. The factory methods within this class can be used to create instances of credentials classes.

static service_account_credentials_builder()
Creates a new ServiceAccountCredentials builder.
- Returns
An instance of ServiceAccountCredentials.Builder.
- Return type
ServiceAccountCredentials.Builder
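
ExecutionContext
class adobe.pdfservices.operation.execution_context.ExecutionContext
Bases: object
Represents the execution context of an Operation. An execution context typically consists of the desired authentication credentials and client configurations such as timeouts.
For each set of credentials, an ExecutionContext instance can be reused across operations.
Sample usage:
    import logging
    import os

    from adobe.pdfservices.operation.auth.credentials import Credentials
    from adobe.pdfservices.operation.exception.exceptions import ServiceApiException, ServiceUsageException, SdkException
    from adobe.pdfservices.operation.execution_context import ExecutionContext
    from adobe.pdfservices.operation.io.file_ref import FileRef
    from adobe.pdfservices.operation.pdfops.extract_pdf_operation import ExtractPDFOperation
    from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_pdf_options import ExtractPDFOptions
    from adobe.pdfservices.operation.pdfops.options.extractpdf.pdf_element_type import PDFElementType

    try:
        base_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

        # Load the credentials and create the execution context.
        credentials = Credentials.service_account_credentials_builder() \
            .from_file(base_path + "/pdftools-api-credentials.json") \
            .build()
        execution_context = ExecutionContext.create(credentials)

        # Create the operation and set its input file.
        extract_pdf_operation = ExtractPDFOperation.create_new()
        source = FileRef.create_from_local_file(base_path + "/resources/extractPdfInput.pdf")
        extract_pdf_operation.set_input(source)

        # Choose what to extract.
        extract_pdf_options: ExtractPDFOptions = ExtractPDFOptions.builder() \
            .with_elements_to_extract([PDFElementType.TEXT, PDFElementType.TABLES]) \
            .with_elements_to_extract_renditions([PDFElementType.TABLES, PDFElementType.FIGURES]) \
            .with_get_char_info(True) \
            .build()
        extract_pdf_operation.set_options(extract_pdf_options)

        # Execute the operation and save the resulting Zip file.
        result: FileRef = extract_pdf_operation.execute(execution_context)
        result.save_as(base_path + "/output/ExtractTextTableWithFigureTableRendition.zip")
    except (ServiceApiException, ServiceUsageException, SdkException):
        logging.exception("Exception encountered while executing operation")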

static create(credentials: adobe.pdfservices.operation.auth.credentials.Credentials, client_config: Optional[adobe.pdfservices.operation.client_config.ClientConfig] = None)
Creates a context instance using the provided Credentials and ClientConfig.
- Parameters
credentials (Credentials) – A Credentials instance
client_config (ClientConfig, optional) – A ClientConfig instance for providing custom HTTP timeouts, defaults to None
- Returns
A new ExecutionContext instance
- Return type
ExecutionContext
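
To show how the optional client_config argument fits in, here is a minimal sketch that builds a ClientConfig with custom timeouts and passes it to create(). The wrapper `create_context` and its default values are assumptions for illustration. The SDK imports are deferred so that the validation step can be tried without the SDK installed.

```python
def create_context(credentials, connect_timeout_ms=4000, read_timeout_ms=10000):
    """Create an ExecutionContext with custom HTTP timeouts (hypothetical helper)."""
    # Validate early: the builder documents that both timeouts should be positive.
    if connect_timeout_ms <= 0 or read_timeout_ms <= 0:
        raise ValueError("timeouts must be greater than zero")

    # Imported lazily so this function can be defined without the SDK installed.
    from adobe.pdfservices.operation.client_config import ClientConfig
    from adobe.pdfservices.operation.execution_context import ExecutionContext

    client_config = ClientConfig.builder() \
        .with_connect_timeout(connect_timeout_ms) \
        .with_read_timeout(read_timeout_ms) \
        .build()
    return ExecutionContext.create(credentials, client_config)
```

Because a context can be reused across operations for the same credentials, a helper like this would typically be called once and its result shared.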
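
ExtractPDFOperation
class adobe.pdfservices.operation.pdfops.extract_pdf_operation.ExtractPDFOperation(create_key)
Bases: adobe.pdfservices.operation.operation.Operation
An Operation that extracts PDF elements such as text, images, and tables in a structured format from a PDF.
Sample usage:
    import logging
    import os

    from adobe.pdfservices.operation.auth.credentials import Credentials
    from adobe.pdfservices.operation.exception.exceptions import ServiceApiException, ServiceUsageException, SdkException
    from adobe.pdfservices.operation.execution_context import ExecutionContext
    from adobe.pdfservices.operation.io.file_ref import FileRef
    from adobe.pdfservices.operation.pdfops.extract_pdf_operation import ExtractPDFOperation
    from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_pdf_options import ExtractPDFOptions
    from adobe.pdfservices.operation.pdfops.options.extractpdf.pdf_element_type import PDFElementType

    try:
        base_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

        # Load the credentials and create the execution context.
        credentials = Credentials.service_account_credentials_builder() \
            .from_file(base_path + "/pdftools-api-credentials.json") \
            .build()
        execution_context = ExecutionContext.create(credentials)

        # Create the operation and set its input file.
        extract_pdf_operation = ExtractPDFOperation.create_new()
        source = FileRef.create_from_local_file(base_path + "/resources/extractPdfInput.pdf")
        extract_pdf_operation.set_input(source)

        # Choose what to extract.
        extract_pdf_options: ExtractPDFOptions = ExtractPDFOptions.builder() \
            .with_elements_to_extract([PDFElementType.TEXT, PDFElementType.TABLES]) \
            .with_elements_to_extract_renditions([PDFElementType.TABLES, PDFElementType.FIGURES]) \
            .with_get_char_info(True) \
            .build()
        extract_pdf_operation.set_options(extract_pdf_options)

        # Execute the operation and save the resulting Zip file.
        result: FileRef = extract_pdf_operation.execute(execution_context)
        result.save_as(base_path + "/output/ExtractTextTableWithFigureTableRendition.zip")
    except (ServiceApiException, ServiceUsageException, SdkException):
        logging.exception("Exception encountered while executing operation")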

SUPPORTED_SOURCE_MEDIA_TYPES = {'application/pdf'}
The supported source file format for ExtractPDFOperation is .pdf.

classmethod create_new()
Creates a new instance of ExtractPDFOperation.
- Returns
A new instance of ExtractPDFOperation
- Return type
ExtractPDFOperation

execute(execution_context: adobe.pdfservices.operation.execution_context.ExecutionContext)
Executes this operation synchronously using the supplied context and returns a new FileRef instance for the resulting Zip file. The resulting file may be stored in the system temporary directory. See adobe.pdfservices.operation.io.file_ref.FileRef for how temporary resources are cleaned up.
- Parameters
execution_context (ExecutionContext) – The context in which the operation will be executed.
- Returns
The FileRef to the result.
- Return type
FileRef
- Raises
ServiceApiException – if an API call results in an error response.
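
Since execute() yields a Zip file, a natural next step after saving it is to open the archive. The sketch below reads one JSON member with the standard library. The member name `structuredData.json` and the stand-in archive contents are assumptions for illustration only; this reference does not list the archive layout, so inspect a real result before relying on it.

```python
import io
import json
import zipfile


def read_structured_data(zip_source, member="structuredData.json"):
    """Return the parsed JSON member of a result archive.

    The default member name is an assumption; check the listing of a real
    result archive before relying on it.
    """
    with zipfile.ZipFile(zip_source) as archive:
        names = archive.namelist()
        if member not in names:
            raise KeyError(f"{member} not found; archive contains {names}")
        with archive.open(member) as handle:
            return json.load(handle)


# Stand-in archive so the sketch runs without calling the service.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("structuredData.json", json.dumps({"elements": []}))
buffer.seek(0)

data = read_structured_data(buffer)
print(data)
```

In real use, `zip_source` would be the path previously passed to `save_as()`.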

set_input(source_file_ref: adobe.pdfservices.operation.io.file_ref.FileRef)
Sets an input file.
- Parameters
source_file_ref (FileRef) – An input file.
- Returns
This instance to add any additional parameters.
- Return type
ExtractPDFOperation

set_options(extract_pdf_options: adobe.pdfservices.operation.pdfops.options.extractpdf.extract_pdf_options.ExtractPDFOptions)
Sets the ExtractPDFOptions.
- Parameters
extract_pdf_options (ExtractPDFOptions) – ExtractPDFOptions to set.
- Returns
This instance to add any additional parameters.
- Return type
ExtractPDFOperation

ExtractPDFOptions
class adobe.pdfservices.operation.pdfops.options.extractpdf.extract_pdf_options.ExtractPDFOptions(elements_to_extract, elements_to_extract_renditions, get_char_info, table_output_format)
Bases: object
An options class that defines the options for ExtractPDFOperation.
    extract_pdf_options: ExtractPDFOptions = ExtractPDFOptions.builder() \
        .with_elements_to_extract([PDFElementType.TEXT, PDFElementType.TABLES]) \
        .with_get_char_info(True) \
        .with_table_structure_format(TableStructureType.CSV) \
        .with_elements_to_extract_renditions([PDFElementType.FIGURES, PDFElementType.TABLES]) \
        .build()

class Builder
Bases: object
The builder for ExtractPDFOptions.

build()

with_element_to_extract(element_to_extract: adobe.pdfservices.operation.pdfops.options.extractpdf.pdf_element_type.PDFElementType)
Adds a PDF element type for extracting structured information.
- Parameters
element_to_extract (PDFElementType) – PDFElementType to be extracted
- Returns
This Builder instance to add any additional parameters.
- Return type
ExtractPDFOptions.Builder
- Raises
ValueError – if element_to_extract is None.

with_element_to_extract_renditions(element_to_extract_renditions: adobe.pdfservices.operation.pdfops.options.extractpdf.pdf_element_type.PDFElementType)
Adds a PDF element type for extracting renditions.
- Parameters
element_to_extract_renditions (PDFElementType) – PDFElementType whose renditions have to be extracted
- Returns
This Builder instance to add any additional parameters.
- Return type
ExtractPDFOptions.Builder
- Raises
ValueError – if element_to_extract_renditions is None.

with_elements_to_extract(elements_to_extract: List[adobe.pdfservices.operation.pdfops.options.extractpdf.pdf_element_type.PDFElementType])
Adds a list of PDF element types for extracting structured information.
- Parameters
elements_to_extract (List[PDFElementType]) – List of PDFElementType to be extracted
- Returns
This Builder instance to add any additional parameters.
- Return type
ExtractPDFOptions.Builder
- Raises
ValueError – if elements_to_extract is None or an empty list.

with_elements_to_extract_renditions(elements_to_extract_renditions: List[adobe.pdfservices.operation.pdfops.options.extractpdf.pdf_element_type.PDFElementType])
Adds a list of PDF element types for extracting renditions.
- Parameters
elements_to_extract_renditions (List[PDFElementType]) – List of PDFElementType whose renditions have to be extracted
- Returns
This Builder instance to add any additional parameters.
- Return type
ExtractPDFOptions.Builder
- Raises
ValueError – if elements_to_extract_renditions is None or an empty list.

with_get_char_info(get_char_info: bool)
Sets the Boolean specifying whether to add character-level bounding boxes to the output JSON.
- Parameters
get_char_info (bool) – Set True to extract character-level bounding box information
- Returns
This Builder instance to add any additional parameters.
- Return type
ExtractPDFOptions.Builder

with_table_structure_format(table_structure: adobe.pdfservices.operation.pdfops.options.extractpdf.table_structure_type.TableStructureType)
Sets the table structure format (currently CSV only) for extracting structured information.
- Parameters
table_structure (TableStructureType) – TableStructureType to be extracted
- Returns
This Builder instance to add any additional parameters.
- Return type
ExtractPDFOptions.Builder
- Raises
ValueError – if table_structure is None.
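
The single-element methods can be useful when the set of elements is decided at run time. The following is a sketch under that assumption; the helper `build_extract_options` is hypothetical, and it relies on the documented behaviour that each with_* call returns the same Builder instance.

```python
def build_extract_options(elements, rendition_elements=(), char_info=False):
    """Assemble ExtractPDFOptions one element at a time (hypothetical helper)."""
    if not elements:
        raise ValueError("at least one element type is required")

    # Imported lazily so this function can be defined without the SDK installed.
    from adobe.pdfservices.operation.pdfops.options.extractpdf.extract_pdf_options import ExtractPDFOptions

    builder = ExtractPDFOptions.builder()
    for element in elements:
        builder.with_element_to_extract(element)
    for element in rendition_elements:
        builder.with_element_to_extract_renditions(element)
    return builder.with_get_char_info(char_info).build()
```

A call might look like `build_extract_options([PDFElementType.TEXT], [PDFElementType.FIGURES])`, with the result passed to `set_options()`.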

static builder()
Returns a Builder for ExtractPDFOptions.
- Returns
The builder class for ExtractPDFOptions
- Return type
ExtractPDFOptions.Builder

property elements_to_extract
List of PDF element types to be extracted in a structured format from the input file.

property elements_to_extract_renditions
List of PDF element types whose renditions need to be extracted from the input file.

property get_char_info
Boolean specifying whether to add character-level bounding boxes to the output JSON.

property table_output_format
The format in which tables are exported. Currently only CSV is supported.

FileRef
class adobe.pdfservices.operation.io.file_ref.FileRef
Bases: abc.ABC
This class represents a local file. It is typically used by an SDK Operation which accepts or returns files.
When a FileRef instance is created by this SDK while referring to a temporary file location, calling any of the methods to save the FileRef (for example, save_as()) will delete the temporary file.

static create_from_local_file(local_source: str, media_type: Optional[str] = None)
Creates a FileRef instance from a local file path. If no media type is provided, it will be inferred from the file extension.
- Parameters
local_source (str) – Local file path, either absolute or relative to the working directory.
media_type (str, optional) – Media type to identify the local file format, defaults to None
- Returns
A FileRef instance.
- Return type
FileRef

static create_from_stream(input_stream: _io.BufferedReader, media_type: str)
Creates a FileRef instance from a readable stream using the specified media type. The stream is not read by this method but by consumers of the file content, i.e. the execute method of an operation such as ExtractPDFOperation.execute().
- Parameters
input_stream (BufferedReader) – Readable stream representing the file.
media_type (str) – Media type to identify the file format.
- Returns
A FileRef instance.
- Return type
FileRef
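
Because create_from_stream() requires an explicit media type, it can help to check the type locally before building the FileRef. The sketch below uses the standard `mimetypes` module for that check. The helper names are hypothetical, and the SDK's own inference from the file extension may not match `mimetypes` in every case.

```python
import mimetypes

# Mirrors ExtractPDFOperation.SUPPORTED_SOURCE_MEDIA_TYPES as documented above.
SUPPORTED_SOURCE_MEDIA_TYPES = {"application/pdf"}


def guess_media_type(path):
    """Guess a media type from the file extension, or None if unknown."""
    media_type, _encoding = mimetypes.guess_type(path)
    return media_type


def open_as_file_ref(path):
    """Create a FileRef from a stream after a local media type check."""
    media_type = guess_media_type(path)
    if media_type not in SUPPORTED_SOURCE_MEDIA_TYPES:
        raise ValueError(f"unsupported media type for {path!r}: {media_type}")

    # Imported lazily so the check above works without the SDK installed.
    from adobe.pdfservices.operation.io.file_ref import FileRef

    # The stream is read later, when the operation executes, so it stays open here.
    stream = open(path, "rb")
    return FileRef.create_from_stream(stream, media_type)


print(guess_media_type("extractPdfInput.pdf"))
```

The caller remains responsible for closing the stream once the operation has finished.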

abstract save_as(local_file_path: str)

abstract write_to_stream(writer_stream)

Exceptions
exception adobe.pdfservices.operation.exception.exceptions.SdkException(message, request_tracking_id=None)
Bases: Exception
SdkException is typically thrown for client-side or network errors.

property request_tracking_id
The request tracking id of the exception.

exception adobe.pdfservices.operation.exception.exceptions.ServiceApiException(message, request_tracking_id, status_code=0, error_code='UNKNOWN')
Bases: Exception
ServiceApiException is thrown when an underlying service API call results in an error.

DEFAULT_ERROR_CODE = 'UNKNOWN'
The default value of error code if there is no error code for this service exception.

DEFAULT_STATUS_CODE = 0
The default value of status code if there is no status code for this service exception.

property error_code
Returns the detailed message of this error.

property request_tracking_id
The request tracking id of the exception.

property status_code
Returns the HTTP status code or DEFAULT_STATUS_CODE if the status code doesn’t adequately represent the error.

exception adobe.pdfservices.operation.exception.exceptions.ServiceUsageException(message, request_tracking_id, status_code=429, error_code='UNKNOWN')
Bases: Exception
ServiceUsageException is thrown when either the service usage limit has been reached or the credentials quota has been exhausted.

DEFAULT_ERROR_CODE = 'UNKNOWN'
The default value of error code if there is no error code for this service failure.

DEFAULT_STATUS_CODE = 429
The default value of status code if there is no status code for this service failure.

property error_code
Returns the detailed message of this error.

property request_tracking_id
The request tracking id of the exception.

property status_code
Returns the HTTP status code or DEFAULT_STATUS_CODE if the status code doesn’t adequately represent the error.
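
The properties above lend themselves to structured logging. Here is a minimal sketch that collects them into a dictionary; `describe_service_error` and the stand-in `FakeUsageError` class are hypothetical and exist only so the example runs without the SDK. With the SDK installed, the same function could be given a caught ServiceApiException or ServiceUsageException.

```python
def describe_service_error(error):
    """Summarise a service exception using the documented property names."""
    status_code = getattr(error, "status_code", None)
    return {
        "message": str(error),
        "status_code": status_code,
        "error_code": getattr(error, "error_code", None),
        "request_tracking_id": getattr(error, "request_tracking_id", None),
        # 429 is the documented DEFAULT_STATUS_CODE of ServiceUsageException.
        "usage_limit_hit": status_code == 429,
    }


class FakeUsageError(Exception):
    """Stand-in with the same attribute names as the SDK exceptions."""

    def __init__(self, message, request_tracking_id, status_code=429, error_code="UNKNOWN"):
        super().__init__(message)
        self.request_tracking_id = request_tracking_id
        self.status_code = status_code
        self.error_code = error_code


summary = describe_service_error(FakeUsageError("quota exhausted", "req-123"))
print(summary)
```

Recording the request tracking id alongside the status code is likely to be the most useful detail when reporting a failed call.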

ServiceAccountCredentials
class adobe.pdfservices.operation.auth.service_account_credentials.ServiceAccountCredentials(client_id, client_secret, private_key, organization_id, account_id, ims_base_uri='https://ims-na1.adobelogin.com', claim=None)
Bases: adobe.pdfservices.operation.auth.credentials.Credentials, abc.ABC
Service Account credentials allow your application to call PDF Tools Extract API on behalf of the application itself, or on behalf of an enterprise organization. For getting the credentials, Click Here.

class adobe.pdfservices.operation.auth.service_account_credentials.ServiceAccountCredentials.Builder
Bases: object
Builds a ServiceAccountCredentials instance.

build()
Returns a new ServiceAccountCredentials instance built from the current state of this builder.
- Returns
A ServiceAccountCredentials instance.
- Return type
ServiceAccountCredentials

from_file(credentials_file_path: str)
Sets Service Account Credentials using the JSON credentials file path. All the keys in the JSON structure are optional.
JSON structure:
    {
      "client_credentials": {
        "client_id": "CLIENT_ID",
        "client_secret": "CLIENT_SECRET"
      },
      "service_account_credentials": {
        "organization_id": "org_ident@AdobeOrg",
        "account_id": "id@techacct.adobe.com",
        "private_key_file": "private.key"
      }
    }
private_key_file is the path of the private key file. It will be looked up in the classpath and the directory of the JSON credentials file.
- Parameters
credentials_file_path (str) – JSON credentials file path
- Returns
This Builder instance to add any additional parameters.
- Return type
ServiceAccountCredentials.Builder

with_account_id(account_id: str)
Sets the Account ID (format: id@techacct.adobe.com).
- Parameters
account_id (str) – Account ID (format: id@techacct.adobe.com)
- Returns
This Builder instance to add any additional parameters.
- Return type
ServiceAccountCredentials.Builder

with_client_id(client_id: str)
Sets the Client ID (API Key).
- Parameters
client_id (str) – Client ID (API Key)
- Returns
This Builder instance to add any additional parameters.
- Return type
ServiceAccountCredentials.Builder

with_client_secret(client_secret: str)
Sets the Client Secret.
- Parameters
client_secret (str) – Client Secret
- Returns
This Builder instance to add any additional parameters.
- Return type
ServiceAccountCredentials.Builder

with_organization_id(organization_id: str)
Sets the Organization ID (format: org_ident@AdobeOrg) that has been configured for access to PDF Tools API.
- Parameters
organization_id (str) – Organization ID (format: org_ident@AdobeOrg)
- Returns
This Builder instance to add any additional parameters.
- Return type
ServiceAccountCredentials.Builder

with_private_key(private_key: str)
Sets the private key.
- Parameters
private_key (str) – Content of the Private Key (PEM format)
- Returns
This Builder instance to add any additional parameters.
- Return type
ServiceAccountCredentials.Builder
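
As an alternative to from_file(), the with_* methods allow credentials to come from another source. The sketch below reads them from environment variables. The variable names and both helper functions are assumptions chosen for this example, not names defined by the SDK. Note that with_private_key() expects the PEM content itself rather than a file path.

```python
import os

# Hypothetical environment variable names; choose whatever fits your deployment.
ENV_NAMES = {
    "client_id": "PDF_SERVICES_CLIENT_ID",
    "client_secret": "PDF_SERVICES_CLIENT_SECRET",
    "organization_id": "PDF_SERVICES_ORGANIZATION_ID",
    "account_id": "PDF_SERVICES_ACCOUNT_ID",
    "private_key": "PDF_SERVICES_PRIVATE_KEY",
}


def read_credential_settings(environ=None):
    """Collect credential values, failing loudly if any are missing or empty."""
    environ = os.environ if environ is None else environ
    missing = sorted(env for env in ENV_NAMES.values() if not environ.get(env))
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {field: environ[env] for field, env in ENV_NAMES.items()}


def build_credentials(settings):
    """Build ServiceAccountCredentials from a settings dict; requires the SDK."""
    # Imported lazily so the settings check works without the SDK installed.
    from adobe.pdfservices.operation.auth.credentials import Credentials

    return Credentials.service_account_credentials_builder() \
        .with_client_id(settings["client_id"]) \
        .with_client_secret(settings["client_secret"]) \
        .with_organization_id(settings["organization_id"]) \
        .with_account_id(settings["account_id"]) \
        .with_private_key(settings["private_key"]) \
        .build()
```

In an application this would be used as `build_credentials(read_credential_settings())`, which keeps secrets out of files stored next to the code.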

property account_id
Account ID (format: id@techacct.adobe.com)

property claim
Identifies the service for which the authorization (access) token will be issued.

property client_id
Client ID (API Key)

property client_secret
Client Secret

property organization_id
Identifies the organization (format: org_ident@AdobeOrg) that has been configured for access to PDF Tools API.

property private_key
Content of the Private Key (PEM format)