Introduction

Sypht is an API for document data extraction. This documentation lays out how to extract unstructured data using our AI products. A Sypht AI product is essentially an intelligent model that extracts a collection of fields from a document. We have a whole range of different AI products that meet different business needs. To learn more about our AI products you can contact us here.

Before getting started, you'll need to sign up for a Sypht account. Signing up gives you free trial subscriptions to a handful of our AI products, along with API credentials that you can use to generate a bearerToken for accessing the Sypht API.

Just Getting Started?

You can check out our development quickstart guide.

Not a Developer?

You can use our app to start Sypht-ing! No coding required. Check out app.sypht.com.

AI Products

For a complete list of the AI products offered, visit app.sypht.com/marketplace. You'll gain a free trial subscription upon signing up.

How to Sypht

In as little as three steps, you can start Sypht-ing.

  1. Get your bearer token
  2. Upload a document
  3. Retrieve results

Response Codes

Success

In the responses, Sypht returns these HTTP status codes for successful requests:

Status    Description
200 OK    The request succeeded

Error

Status            Message              Cause
400 Bad Request   Validation errors    A descriptive message will be returned to explain the specific error encountered.
401 Unauthorized  Invalid credentials  Credentials entered are incorrect.
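
A minimal sketch of how a client might branch on the status codes listed above. The function name describe_status and the returned strings are illustrative assumptions, not part of the Sypht API.

```python
def describe_status(status_code: int) -> str:
    """Map an HTTP status code from the tables above to a suggested client action."""
    if status_code == 200:
        return "ok"
    if status_code == 400:
        # Validation error: the response body carries a descriptive message.
        return "fix request"
    if status_code == 401:
        # Invalid credentials: check client_id / client_secret or fetch a new token.
        return "re-authenticate"
    # Codes not listed in this section are handled generically.
    return "unexpected"
```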

Authentication

OAuth2

We use an OAuth2 authentication flow. You can create a client_id and client_secret after signing up with Sypht. Use these to generate a reusable bearer token, which must be included in the Authorization header of each request. You can store and reuse a token for up to an hour.

To get a token, POST to the Token URL with client_id and grant_type in the request body, and the colon-separated, base64-encoded client_id and client_secret in a Basic Authorization header. For example: Basic {Base64(client_id:client_secret)}
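
Because a token can be reused for up to an hour, a client may want to cache it rather than request a new one per call. The sketch below is one possible approach: basic_auth_header builds the header value described above, and TokenCache (a hypothetical helper, not part of any Sypht client) refetches only when the cached token is close to expiry. The fetch callable is assumed to return the parsed JSON token response, including access_token and expires_in.

```python
import base64
import time
from typing import Callable, Optional


def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Return 'Basic {Base64(client_id:client_secret)}'."""
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


class TokenCache:
    """Reuse a bearer token until shortly before it expires (hypothetical helper)."""

    def __init__(self, fetch: Callable[[], dict], margin_seconds: int = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch            # performs the actual token request
        self._margin = margin_seconds  # refresh slightly early to avoid races
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            response = self._fetch()
            self._token = response["access_token"]
            self._expires_at = now + response["expires_in"] - self._margin
        return self._token
```

The 60-second safety margin is an arbitrary choice; any value comfortably below the token lifetime would serve the same purpose.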

Security Scheme Type: OAuth2
OAuth Flow: clientCredentials
Token URL: https://auth.sypht.com/oauth2/token
Scopes:
  • fileupload:all - Provides access to the Fileupload Service.
  • result:all - Provides access to the Result Service.
  • validate:annotation - Upload annotation data for your documents for training purposes.

Legacy

As of 16/06/2020 we have switched to a new OAuth provider. You can still use older client keys, which are marked as "legacy" keys in the app, to generate reusable bearer tokens. You can store and reuse these tokens for up to 24 hours.

To get a token for these keys POST to https://login.sypht.com/oauth/token with client_id, client_secret, grant_type and audience in the request body.
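
A minimal sketch of the legacy token request, built with the Python standard library. The function name build_legacy_token_request is an assumption for illustration; the JSON body and the fixed audience and grant_type values follow the "Authenticate (legacy)" reference in this document. The request is only constructed here, not sent.

```python
import json
import urllib.request

LEGACY_TOKEN_URL = "https://login.sypht.com/oauth/token"


def build_legacy_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a legacy token request with a JSON body."""
    body = {
        "client_id": client_id,
        "client_secret": client_secret,
        "audience": "https://api.sypht.com",
        "grant_type": "client_credentials",
    }
    return urllib.request.Request(
        LEGACY_TOKEN_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the response is expected to contain access_token and expires_in as in the legacy response sample.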

Security Scheme Type: OAuth2
OAuth Flow: clientCredentials
Token URL: https://login.sypht.com/oauth/token
Scopes:
  • fileupload:all - Provides access to the Fileupload Service.
  • result:all - Provides access to the Result Service.
  • validate:annotation - Upload annotation data for your documents for training purposes.

Authenticate

/oauth2/token

header Parameters
Authorization
required
string <Basic {Base64(Client ID:Client Secret)}>
Request Body schema: application/x-www-form-urlencoded
client_id
string

The client id obtained on sign-up with Sypht.

grant_type
string

The type of authentication request - always set this to "client_credentials"

Responses

200

OK

post /oauth2/token
https://auth.sypht.com/oauth2/token

Request samples

import base64
import os

import requests

url = 'https://auth.sypht.com/oauth2/token'

# Form-encoded request body: client_id and grant_type.
payload = {
  'client_id': os.environ['CLIENT_ID'],
  'grant_type': 'client_credentials'
}

# Basic auth value: base64 of "client_id:client_secret".
auth_slug = os.environ['CLIENT_ID'] + ':' + os.environ['CLIENT_SECRET']
auth_slug_enc = base64.b64encode(auth_slug.encode('utf-8')).decode('utf-8')
headers = {
  'Accept': 'application/json',
  'Content-Type': 'application/x-www-form-urlencoded',
  'Authorization': 'Basic ' + auth_slug_enc
}

response = requests.post(url, headers=headers, data=payload, allow_redirects=False)
print(response.text)

Response samples

Content type
application/json
{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImtpZCI6Ik9VUkRNelV3UVVGQ01qRXhNakF5T1Rrek5VSTBORU5DUXpORE5EZzBOak5ETWtZMU5FSTFOdyJ9.eyJpc3MiOiJodHRwczovL3N5cGhlbi5hdS5hdXRoMC5jb20vIiwic3ViIjoiOXQzeHdUSGhDVzY4MTk5MTBzRjR2RUV2SWN2QkdYQkZAY2xpZW50cyIsImF1ZCI6Imh0dHBzOi8vYXBpLnN5cGh0LmNvbSIsImlhdCI6MTUyNjg2NzAyNSwiZXhwIjoxNTI2OTUzNDI1LCJhenAiOiI5dDN4d1RIaENXNjgxOTkxMHNGNHZFRXZJY3ZCR1hCRiIsInNjb3BlIjoicmVzdWx0OmFsbCBmaWxldXBsb2FkOmFsbCIsImd0eSI6ImNsaWVudC1jcmVkZW50aWFscyJ9.samplesamplesamplesamplesample-ZWPj3qNo0NYFUFZ3UvcWp2EZHdjP6xpGa8hA04k1M4Abad0IPkPBPP9I3WzcIpHGALgTslDOOL_sl7wWgscU9gkaq5mePh25DqrAs6cE1YXqaixyM3y3FS8EO8jRaD8m_AuKkYJxttqZNmKf6c7PgyT_w5_thObzxa4yvv-ULMDOj4WYlc6qurdKYZkg3KC_EaYOwTrdOirOy40FtHm6hLNY1_yikImtBJ3MGbyR_pz54GA",
  "expires_in": 3600,
  "token_type": "Bearer"
}

Upload Document

Use one of these endpoints to upload a document in PDF, PNG or JPG format to the Sypht platform for extraction. The maximum supported file size is 20 MB and the maximum number of pages per document is 16.

Note: With the introduction of AI products, the fieldSet and fieldSets parameters remain supported for existing integrations. For ease of use we recommend using the products parameter when uploading a document.

/fileupload/

Please supply a valid bearerToken and one of products, fieldSet or fieldSets.

Authorizations:
OAuth2 (fileupload:all)
Request Body schema: multipart/form-data
fileToUpload
required
string <binary>

Document for extraction.

products
Array of strings

Array of products to predict on. Cannot be supplied with fieldSet or fieldSets

fieldSet
string

A single field set for extraction

fieldSets
Array of strings

Optionally use this parameter to extract an array of field sets. Must be a JSON array serialised as a string.

tags
string

Tags to identify upload set. Only alphanumeric & hyphen characters are allowed

workflowId
string

Workflow to execute, defaults to the extraction workflow

workflowOptions
string

Options to pass to the workflow.

Responses

200

File was successfully uploaded.

default

Error

post /fileupload/
https://api.sypht.com/fileupload/

Request samples

curl --location --request POST "https://api.sypht.com/fileupload" \
  --header "Accept: application/json" \
  --header "Content-Type: multipart/form-data" \
  --header "Authorization: Bearer {{access token}}" \
  --form "fileToUpload=@{{path to file}}" \
  --form "fieldSets=[\"sypht.invoice\",\"sypht.document\"]"

Response samples

Content type
application/json
{
  "fileId": "30d3e543-47c8-42e6-a6f6-40c50d199395",
  "uploadedAt": "2018-05-15T04:23:14.027Z",
  "status": "RECEIVED"
}
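
The multipart upload can also be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than an official client: build_upload_request is a hypothetical helper, the file part is labelled application/octet-stream as a generic choice, and fieldSets is used (as in the curl sample) because its encoding as a JSON array in a string is documented, whereas the multipart encoding of products is not described here. The request is built but not sent.

```python
import json
import re
import urllib.request
import uuid

UPLOAD_URL = "https://api.sypht.com/fileupload/"
TAG_PATTERN = re.compile(r"[A-Za-z0-9-]+")  # alphanumeric and hyphen characters only


def build_upload_request(token, filename, file_bytes, field_sets, tags=None):
    """Build (but do not send) a multipart/form-data upload request."""
    if tags is not None and not TAG_PATTERN.fullmatch(tags):
        raise ValueError("tags may contain only alphanumeric and hyphen characters")

    # fieldSets must be a JSON array serialised into a string.
    text_fields = {"fieldSets": json.dumps(field_sets)}
    if tags is not None:
        text_fields["tags"] = tags

    boundary = uuid.uuid4().hex
    body = b""
    for name, value in text_fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        body += part.encode("utf-8")

    # The document itself goes in the fileToUpload part.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="fileToUpload"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    body += file_header.encode("utf-8") + file_bytes
    body += f"\r\n--{boundary}--\r\n".encode("utf-8")

    return urllib.request.Request(
        UPLOAD_URL,
        data=body,
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Sending the request with urllib.request.urlopen should return a JSON body containing the fileId needed to retrieve results later.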

/fileupload/json

Optional upload endpoint that allows uploading files as base64 in an application/json payload. Please supply a valid bearerToken and one of products, fieldSet or fieldSets.

Authorizations:
OAuth2 (fileupload:all)
Request Body schema: application/json

Upload a file as part of JSON payload

file
required
object
tags
Array of strings

Tags associated with your file.

workflow
object
products
Array of strings

Array of products to predict on. Cannot be supplied with fieldSet or fieldSets

fieldSet
string

A single field set for extraction

fieldSets
Array of strings

Optionally use this parameter to extract an array of field sets. Must be a JSON array serialised as a string.

Responses

200

File was successfully uploaded.

default

Error

post /fileupload/json
https://api.sypht.com/fileupload/json

Request samples

Content type
application/json
{
  "file": {},
  "tags": [],
  "workflow": {},
  "products": [],
  "fieldSet": "string",
  "fieldSets": []
}

Response samples

Content type
application/json
{
  "fileId": "30d3e543-47c8-42e6-a6f6-40c50d199395",
  "uploadedAt": "2018-05-15T04:23:14.027Z",
  "status": "RECEIVED"
}

Upload Annotation

This endpoint allows you to feed data back to Sypht for training and evaluation. This helps our models learn faster and lets us evaluate and compare them against other systems and reference extractions. For reference, here is a working example of the endpoint in our Python client.

The body content of this request should contain a JSON-encoded object with the gold-standard (i.e. human-annotated) extraction data in the following format:

  {
    "origin": "external",
    "fields": [
      {
        "id": "issuer.name",
        "type": "simple",
        "data": {
          "value": "John Smith"
        }
      },
      // ... etc for additional fields
    ]
  }

For the best performance, you should only push data here that you know has been checked by a human reviewer (even if that means some fields are missing on certain documents). If that data changes over time, you can override or clear the record through subsequent PUT requests.
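
As a rough sketch, the annotation object shown above could be assembled from a plain mapping of field ids to reviewed values. The helper names build_annotation and annotation_url are hypothetical, and treating every field as type "simple" is an assumption based on the single example in this section.

```python
import urllib.parse

API_BASE = "https://api.sypht.com"


def build_annotation(field_values: dict) -> dict:
    """Turn {field_id: value} into the annotation structure shown above."""
    return {
        "origin": "external",
        "fields": [
            # Assumes every field uses the "simple" type from the example.
            {"id": field_id, "type": "simple", "data": {"value": value}}
            for field_id, value in field_values.items()
        ],
    }


def annotation_url(doc_id: str, company_id: str) -> str:
    """Fill in the path parameters of the annotation endpoint."""
    doc = urllib.parse.quote(doc_id, safe="")
    company = urllib.parse.quote(company_id, safe="")
    return f"{API_BASE}/app/docs/{doc}/companyannotation/{company}/data"
```

Only fields that a human reviewer has checked should be passed in; omitted fields are simply left out of the fields array.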

/app/docs/{docId}/companyannotation/{companyId}/data

Add company annotation for a document.

Authorizations:
OAuth2 (validate:annotation)
path Parameters
docId
required
string

The id of the document being annotated.

companyId
required
string

The id of the company providing the annotation.

Request Body schema: application/json

The body content of this request should contain a JSON-encoded object with the gold-standard (i.e. human annotated) extraction data.

data
object

Responses

200

Success

default

Error

put /app/docs/{docId}/companyannotation/{companyId}/data
https://api.sypht.com/app/docs/{docId}/companyannotation/{companyId}/data

Request samples

Content type
application/json
{
  "data": {}
}

Response samples

Content type
application/json
{
  "id": "string",
  "userId": "string",
  "docId": "string",
  "data": {}
}

Results

You can query these endpoints to retrieve the results produced by Sypht.

/result/final/{fileId}

This endpoint allows you to query the results gathered from Sypht's field extraction.

Refer to AI Products for a detailed description of the fields extracted for each AI product.

Authorizations:
OAuth2 (result:all)
path Parameters
fileId
required
string

The unique identifier associated with the document, received at the time of upload.

Responses

200

OK

202

Results are in progress and yet to be finalised.

default

Error

get /result/final/{fileId}
https://api.sypht.com/result/final/{fileId}

Request samples

curl --location --request GET "https://api.sypht.com/result/final/{{file id}}" \
  --header "Accept: application/json" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer {{access token}}" \
  --data ""

Response samples

Content type
application/json
{
  "fileId": "a44b272a-aabb-4c7d-a473-769e09344044",
  "status": "FINALISED",
  "results": "{\n \"timestamp\": \"2018-05-10T01:42:12.078Z\"\n \"fields\": |\n [\n {\n \"name\": \"bpayBillerCode\",\n \"value\": \"7773\",\n \"confidence\": 1\n },\n {\n \"name\": \"dueDate\",\n \"value\": \"2018-01-17\",\n \"confidence\": 1\n },\n {\n \"name\": \"bpayCRN\",\n \"value\": \"2376513400\",\n \"confidence\": 1\n },\n {\n \"name\": \"amountDueAfterDiscount\",\n \"value\": \"27.50\",\n \"confidence\": 1\n }\n ]\n }\n"
}
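
Since this endpoint returns 202 while results are still in progress, a client typically polls until it receives 200. The sketch below shows one way to do that; wait_for_result is a hypothetical helper, the polling interval and attempt limit are arbitrary assumptions, and fetch stands for any callable that performs the GET request and returns the status code with the parsed JSON body.

```python
import time
from typing import Callable, Tuple


def wait_for_result(fetch: Callable[[], Tuple[int, dict]],
                    interval_seconds: float = 2.0,
                    max_attempts: int = 30,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the results are finalised (200) or attempts run out."""
    for attempt in range(max_attempts):
        status_code, body = fetch()
        if status_code == 200:
            return body  # results are finalised
        if status_code != 202:
            raise RuntimeError(f"unexpected status {status_code}")
        # 202: still in progress, so wait before the next attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("results were not finalised in time")
```

Injecting fetch and sleep keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.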

/result/image/{fileId}

This endpoint allows you to retrieve an image copy of the uploaded document.

Authorizations:
OAuth2 (result:all)
path Parameters
fileId
required
string

The unique identifier associated with the document, received at the time of upload.

query Parameters
page
string

The page to serve, defaults to one

Responses

200

The processed image

202

Image processing still in progress.

404

No document requiring transcription matching the uuid

get /result/image/{fileId}
https://api.sypht.com/result/image/{fileId}

Request samples

curl --location --request GET "https://api.sypht.com/result/image/{{file id}}" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer {{access token}}" \
  --data "" > {{target filename}}
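
The optional page query parameter can be added when building the image request. A minimal sketch, assuming page numbers start at one (the documented default) and using a hypothetical helper name build_image_request; the request is constructed but not sent.

```python
import urllib.parse
import urllib.request


def build_image_request(token: str, file_id: str, page: int = 1) -> urllib.request.Request:
    """Build (but do not send) a GET request for one page image of a document."""
    query = urllib.parse.urlencode({"page": page})
    safe_id = urllib.parse.quote(file_id, safe="")
    url = f"https://api.sypht.com/result/image/{safe_id}?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

The response body of a successful request is the image itself, so it should be written to a file in binary mode rather than decoded as text.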

Authenticate (legacy)

/oauth/token

Request Body schema: application/json
client_id
string

The client id obtained on sign-up with Sypht.

client_secret
string

The client secret obtained on sign-up with Sypht.

audience
string

The API audience you are authenticating for - always set this to "https://api.sypht.com"

grant_type
string

The type of authentication request - always set this to "client_credentials"

Responses

200

OK

post /oauth/token
https://login.sypht.com/oauth/token

Request samples

Content type
application/json
{
  "client_id": "string",
  "client_secret": "string",
  "audience": "string",
  "grant_type": "string"
}

Response samples

Content type
application/json
{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImtpZCI6Ik9VUkRNelV3UVVGQ01qRXhNakF5T1Rrek5VSTBORU5DUXpORE5EZzBOak5ETWtZMU5FSTFOdyJ9.eyJpc3MiOiJodHRwczovL3N5cGhlbi5hdS5hdXRoMC5jb20vIiwic3ViIjoiOXQzeHdUSGhDVzY4MTk5MTBzRjR2RUV2SWN2QkdYQkZAY2xpZW50cyIsImF1ZCI6Imh0dHBzOi8vYXBpLnN5cGh0LmNvbSIsImlhdCI6MTUyNjg2NzAyNSwiZXhwIjoxNTI2OTUzNDI1LCJhenAiOiI5dDN4d1RIaENXNjgxOTkxMHNGNHZFRXZJY3ZCR1hCRiIsInNjb3BlIjoicmVzdWx0OmFsbCBmaWxldXBsb2FkOmFsbCIsImd0eSI6ImNsaWVudC1jcmVkZW50aWFscyJ9.samplesamplesamplesamplesample-ZWPj3qNo0NYFUFZ3UvcWp2EZHdjP6xpGa8hA04k1M4Abad0IPkPBPP9I3WzcIpHGALgTslDOOL_sl7wWgscU9gkaq5mePh25DqrAs6cE1YXqaixyM3y3FS8EO8jRaD8m_AuKkYJxttqZNmKf6c7PgyT_w5_thObzxa4yvv-ULMDOj4WYlc6qurdKYZkg3KC_EaYOwTrdOirOy40FtHm6hLNY1_yikImtBJ3MGbyR_pz54GA",
  "scope": "result:all fileupload:all",
  "expires_in": 86400,
  "token_type": "Bearer"
}