Classification fields
How to use classification field types
What are classification fields?
Classification fields categorise the content of a document into a predefined set of classifications. They are distinct from other field types, such as extraction fields, which make predictions about the content of a document.
Classification types
Multi-class
A multi-class field is a type of document classification that assigns a single category, or class, to a document. For example, the invoice.currencyCode field determines the currency used in an invoice. It is implemented as a multi-class field because an invoice usually uses only one currency.
Multi-label
A multi-label field is a similar type of classification, but instead of assigning a single category to a document, it can assign many categories, or labels, to a document. For example, the document.type field determines which of Sypht's fieldsets are applicable to a given document. Since multiple fieldsets can be applicable to a single document, this field is implemented as a multi-label field.
Samples
For multi-class and multi-label fields, the value returned for each field is a JSON object.
Here is an example of output for the invoice.currencyCode multi-class field:
The keys in the object enumerate every possible class/label for the field. For the invoice.currencyCode field, there are seven possible classes: six that each represent a currency code, and a seventh, Unknown, for cases where Sypht cannot determine the currency. Note that each class is qualified with the field name.
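As a rough illustration of how the keys enumerate the classes, the sketch below strips the field-name qualifier from each key. The payload shape, the dot-separated key format, the `value` and `confidence` key names, the currency codes and the confidence numbers are all assumptions made for this example, not confirmed Sypht output. The `list_classes` helper is hypothetical, and the payload is abbreviated to three of the seven classes.

```python
def list_classes(field_name: str, prediction: dict) -> list:
    """Return the class/label names found in a prediction object.

    Assumes each key looks like "<field_name>.<class>" (an assumption).
    Keys that do not carry that prefix are returned unchanged.
    """
    prefix = field_name + "."
    return [
        key[len(prefix):] if key.startswith(prefix) else key
        for key in prediction
    ]


# Hypothetical, abbreviated payload: key format and values are illustrative only.
sample = {
    "invoice.currencyCode.AUD": {"value": True, "confidence": 0.97},
    "invoice.currencyCode.USD": {"value": False, "confidence": 0.02},
    "invoice.currencyCode.Unknown": {"value": False, "confidence": 0.01},
}

print(list_classes("invoice.currencyCode", sample))
```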
Decoding classifications
Each class/label has a value and a confidence. For multi-class fields (as above), only a single class will have a value of true.
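Decoding a multi-class field therefore amounts to finding the one class whose value is true. The following is a minimal sketch under assumed names: the `value` and `confidence` keys, the qualified key format and the sample numbers are illustrative assumptions, and `decode_multi_class` is a hypothetical helper rather than part of any Sypht client library.

```python
from typing import Optional, Tuple


def decode_multi_class(field_name: str, prediction: dict) -> Optional[Tuple[str, float]]:
    """Return (class_name, confidence) for the class whose value is true.

    Returns None if no class is marked true. The "value" and "confidence"
    key names are assumptions about the payload shape.
    """
    prefix = field_name + "."
    for key, entry in prediction.items():
        if entry.get("value") is True:
            # Drop the field-name qualifier so only the class name remains.
            name = key[len(prefix):] if key.startswith(prefix) else key
            return name, entry.get("confidence")
    return None


# Hypothetical, abbreviated payload for illustration only.
sample = {
    "invoice.currencyCode.AUD": {"value": True, "confidence": 0.97},
    "invoice.currencyCode.USD": {"value": False, "confidence": 0.02},
    "invoice.currencyCode.Unknown": {"value": False, "confidence": 0.01},
}

result = decode_multi_class("invoice.currencyCode", sample)
print(result)
```

Returning None when nothing is true keeps the caller in charge of how to treat an unexpected payload, rather than guessing a class.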
For multi-label fields (such as document.type), multiple labels can have a value of true. Here is an example of output for the document.type multi-label field that demonstrates how multiple labels can simultaneously have a value of true:
Note that invoice, issuer, recipient and vehicle all have values of true.
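For a multi-label field, decoding means collecting every label whose value is true instead of stopping at the first. The sketch below assumes the same payload shape as the rest of this page describes (a `value` and a `confidence` per label, with keys qualified by the field name). The exact key names, the `receipt` label, the confidence numbers and the optional `min_confidence` filter are assumptions added for illustration, and `decode_multi_label` is a hypothetical helper.

```python
from typing import Dict


def decode_multi_label(
    field_name: str, prediction: dict, min_confidence: float = 0.0
) -> Dict[str, float]:
    """Return {label: confidence} for every label whose value is true.

    min_confidence is an optional, illustrative filter and is not
    something the source describes.
    """
    prefix = field_name + "."
    labels = {}
    for key, entry in prediction.items():
        confidence = entry.get("confidence", 0.0)
        if entry.get("value") is True and confidence >= min_confidence:
            # Drop the field-name qualifier so only the label name remains.
            name = key[len(prefix):] if key.startswith(prefix) else key
            labels[name] = confidence
    return labels


# Hypothetical payload: label keys and confidences are illustrative only.
sample = {
    "document.type.invoice": {"value": True, "confidence": 0.99},
    "document.type.issuer": {"value": True, "confidence": 0.95},
    "document.type.recipient": {"value": True, "confidence": 0.90},
    "document.type.vehicle": {"value": True, "confidence": 0.80},
    "document.type.receipt": {"value": False, "confidence": 0.10},
}

print(sorted(decode_multi_label("document.type", sample)))
```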