The `lineitems` data type is used for fields that extract tabular information for a specific type of table with pre-defined columns. There are many different `lineitems` fields tailored to different extraction use-cases. The `invoice.lineitems` field captures tables containing invoice line items, while the `statement.transactions` field returns credit and debit transaction rows from bank and credit card statements. The `ndis.lineitems` field includes inference of Support Item Reference Numbers from line-level description text. When available, always prefer the `lineitems` field that best matches your use-case over generic table extraction (i.e. `generic.table`).

Fields of the `lineitems` data type return a common data structure. Each prediction is a list of tables, one for each table found in the source document. Each of these tables is a JSON object with three keys:

`types` aligns each extracted column to a specific column type.
`headers` identifies the text and position of header cells within the source document.
`cells` contains the content of the table as an array of arrays, organised as rows by columns.

Each item in the `types` array contains a type identifier and confidence score for the corresponding table column. These identifiers can be used to interpret the cell content for that column. For example, a column labelled "Item Price" might be classified as a `sypht.invoice.lineitems.unitPrice` column and contain prices for each listed item. A `null` type indicates that the corresponding column does not match a pre-defined column type for the field. Header and cell content is still returned for these columns.

`headers` are not needed to interpret the content of the table for a `lineitems` field, but may be useful for understanding the content of non-aligned columns and how the data was originally presented in the source document. Each header carries the `text` and `bounds` information used to locate it in the source.

Each item in the `cells` array represents a row, and each row contains one item per column in the table. Row items may be `null`, indicating an empty cell for a given column. Rows with no extracted cells are omitted from the output. Each non-empty cell carries `text` and `bounds` information.

When processing a `lineitems` field in Python, we utilise the `pandas` library to format tabular results.

`invoice.lineitems` types:
`invoice.lineitems.date`
`invoice.lineitems.description`
`null`
`invoice.lineitems.total`
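
As a minimal sketch of that processing, assume one table from an `invoice.lineitems` prediction has already been parsed into a Python dict. The sample values, the `id` and `confidence` key names inside each type entry, and the helper names `cell`, `column_labels` and `table_rows` are assumptions made for illustration; only the `types`, `headers`, `cells`, `text` and `bounds` names come from the description above. Columns are labelled by their type identifier, falling back to the header text for non-aligned (`null`) columns.

```python
# Hypothetical sample of one table from an invoice.lineitems prediction.
# The "id" and "confidence" key names are assumptions; "bounds" is omitted
# from every header and cell for brevity.
def cell(text):
    return {"text": text}


table = {
    "types": [
        {"id": "invoice.lineitems.date", "confidence": 0.98},
        {"id": "invoice.lineitems.description", "confidence": 0.95},
        None,  # column that matched no pre-defined column type
        {"id": "invoice.lineitems.total", "confidence": 0.97},
    ],
    "headers": [cell("Date"), cell("Description"), cell("Ref"), cell("Amount")],
    "cells": [
        [cell("2021-03-01"), cell("Widget"), cell("A1"), cell("10.00")],
        [cell("2021-03-02"), cell("Gadget"), None, cell("25.50")],  # empty cell
    ],
}


def column_labels(table):
    """Label each column by its type id, falling back to the header text."""
    labels = []
    for index, (col_type, header) in enumerate(zip(table["types"], table["headers"])):
        if col_type is not None:
            labels.append(col_type["id"])
        elif header is not None and header.get("text"):
            labels.append(header["text"])
        else:
            labels.append(f"column_{index}")
    return labels


def table_rows(table):
    """Return the cell text row by row; empty (null) cells become None."""
    return [
        [item["text"] if item is not None else None for item in row]
        for row in table["cells"]
    ]


labels = column_labels(table)
rows = table_rows(table)
for row in rows:
    print(dict(zip(labels, row)))
```

With pandas installed, `pandas.DataFrame(rows, columns=labels)` turns the same labels and rows into a table. Falling back to header text keeps non-aligned columns readable instead of discarding them.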