
[POST] Create Task

Use this request to create a new Task and add it to the processing queue.

POST /api/v1.0/external/sdtask/process

Processing times

When a task is created, it is added to the pre-processing queue and processed asynchronously.

Processing times can vary with:

  • the global load of the system;

  • the size of your file;

  • the number of pages/minutes in your file;

  • the complexity of the model used to detect fields.

Body application/json CreateTaskRequest

| Name | Type | Description |
| --- | --- | --- |
| displayName | string (optional) | An optional display name for your task. If omitted, a random one is generated automatically. |
| description | string (optional) | An optional description for your task. |
| workspaceId | integer (optional) | |
| batchId | integer (optional) | The ID of the Batch to add this task to. The Batch must belong to the workspace specified in this request (or to the default workspace, if none is specified). To find the ID of your Batch, open the Batches screen and click Import -> API on the relevant Batch. |
| priority | integer (optional) | Defaults to 1 (low). There are only two priorities: 1 is low, and any value greater than 1 is high. Larger values do not mean higher priority. High priority is only available to certain Organizations, and requires intervention by the SmartDocumentor development team. |
| context | object | |
| taskAssignment | object (optional) | |
| customFormValues | object (optional) | An untyped object, used to pass custom form data to the Document Viewer in Advanced Edit mode. Only available to certain Organizations, and requires intervention by the SmartDocumentor development team. |
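The request body above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper names `build_create_task_request` and `priority_label` are hypothetical, the file URL is a placeholder, and rejecting priorities below 1 on the client side is an assumption drawn from the description of priority.

```python
import json


def build_create_task_request(url, display_name=None, description=None,
                              workspace_id=None, batch_id=None, priority=None):
    """Assemble a CreateTaskRequest body, omitting optional fields that were not set."""
    if not url:
        raise ValueError("context.url is required")
    # Assumption: values below 1 are never meaningful, so reject them early.
    if priority is not None and priority < 1:
        raise ValueError("priority must be 1 (low) or greater than 1 (high)")

    body = {"context": {"url": url}}
    optional = {
        "displayName": display_name,
        "description": description,
        "workspaceId": workspace_id,
        "batchId": batch_id,
        "priority": priority,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


def priority_label(priority):
    """Only two levels exist: 1 is low, anything above 1 is high."""
    return "low" if priority == 1 else "high"


body = build_create_task_request(
    "https://example.com/invoice.pdf",  # hypothetical file URL
    display_name="Example Document",
    workspace_id=1,
)
print(json.dumps(body, indent=2))
```

Leaving unset fields out of the body, rather than sending them as null, lets the documented defaults (random display name, default workspace, low priority) apply.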

Working with context

The context object is fed directly to our AI algorithm during processing, and directly influences the detection results.

The context property supports the following configuration values:

| Property Name | Type | Description | Supported Values |
| --- | --- | --- | --- |
| url | string | Required. Must point to either a public or SAS-secured URL that grants SmartDocumentor sufficient access permissions during the task's lifetime. | Any valid public or SAS-secured URL. |
| transcriptLanguage | string (optional, only in Transcription Workspaces) | Explicitly sets the document language for transcription purposes, improving model performance. If this value is not included, the model defaults to the language configured for the Workspace, under Advanced Settings. | pt-PT, en-US, fr-FR, de-DE, es-ES |
| splitTaskInChunks | boolean (optional, not available in Template Workspaces) | | true or false |
| splitChunkDurationMinutes | integer (optional, only in Transcription Workspaces) | Only applies when splitTaskInChunks is set to true. Approximate maximum duration of each split part; defaults to 10 minutes. Silence detection is used to find where it is safe to split audio and video. A part is only split once a sufficiently long stretch of silence is detected, so the effective duration may exceed the configured value. | Any integer above 0. |
| splitChunkNumberPages | integer (optional, in all Workspaces except Transcripts and Templates) | Only applies when splitTaskInChunks is set to true. Maximum size, in pages, of each split document. | Any integer above 0. |

Note that other data within context or customFormValues is validated. Only supported fields can be included.
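Because unsupported context properties cause the request to be rejected, it can help to check the object before sending it. The sketch below is a hypothetical client-side check in Python, not the server's actual validation: `validate_context` is an illustrative name, and treating the chunk-size properties as a problem when splitTaskInChunks is not true is an assumption based on the descriptions above.

```python
SUPPORTED_LANGUAGES = {"pt-PT", "en-US", "fr-FR", "de-DE", "es-ES"}
SUPPORTED_KEYS = {
    "url",
    "transcriptLanguage",
    "splitTaskInChunks",
    "splitChunkDurationMinutes",
    "splitChunkNumberPages",
}


def validate_context(context):
    """Return a list of problems found in a context object (empty if none)."""
    problems = []

    unknown = sorted(set(context) - SUPPORTED_KEYS)
    if unknown:
        problems.append(f"unsupported properties: {', '.join(unknown)}")

    url = context.get("url")
    if not isinstance(url, str) or not url:
        problems.append("url is required")

    lang = context.get("transcriptLanguage")
    if lang is not None and lang not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported transcriptLanguage: {lang}")

    split = context.get("splitTaskInChunks")
    if split is not None and not isinstance(split, bool):
        problems.append("splitTaskInChunks must be true or false")

    for key in ("splitChunkDurationMinutes", "splitChunkNumberPages"):
        value = context.get(key)
        if value is None:
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{key} must be an integer above 0")
        elif split is not True:
            problems.append(f"{key} only applies when splitTaskInChunks is true")

    return problems


print(validate_context({"url": "https://example.com/meeting.mp3", "foo": 1}))
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a single pass.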

Working with taskAssignment

The taskAssignment object defines which users can review a task, and the order in which they must do so before the task is finished.

Usage in split tasks

When this property is used in a split task scenario, all resulting smaller tasks will inherit these properties from the original request.

The taskAssignment property supports the following configuration values:

| Property Name | Type | Description | Supported Values |
| --- | --- | --- | --- |
| assignedUsers | string[] | Required. A list of user email addresses to be added as task reviewers. Their order in the list determines the review sequence when a sequential review process is required. | Array of strings containing valid email addresses. |
| order | int | Required. Determines which type of assignment order is applied to the task. | |
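As an illustration of the two required properties, here is a small Python sketch that builds a taskAssignment object. The helper name `build_task_assignment` and the email pattern are hypothetical, and `ORDER_VALUE = 1` is only a placeholder: the actual integer values come from the Task Assignment Order Type enum, which is not listed on this page.

```python
import re

# Deliberately loose check; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def build_task_assignment(assigned_users, order):
    """Build a taskAssignment object; list order is preserved as the review order."""
    if not assigned_users:
        raise ValueError("assignedUsers is required and cannot be empty")
    invalid = [user for user in assigned_users if not EMAIL_PATTERN.match(user)]
    if invalid:
        raise ValueError(f"invalid email addresses: {invalid}")
    if isinstance(order, bool) or not isinstance(order, int):
        raise ValueError("order must be an integer enum value")
    return {"assignedUsers": list(assigned_users), "order": order}


ORDER_VALUE = 1  # placeholder, not a documented enum value
assignment = build_task_assignment(
    ["first.reviewer@example.com", "second.reviewer@example.com"],
    ORDER_VALUE,
)
print(assignment)
```

The result can be placed under the taskAssignment key of the request body. Checks that depend on server state, such as whether a user has review permissions or is disabled, cannot be done on the client.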

Working with metadata

The metadata object defines textual key-value pairs of information that the user may want to store. What is stored is up to the user; however, the following properties cause the application to trigger specific behavior.

| Property Name | Type | Description | Supported Values | Behavior |
| --- | --- | --- | --- | --- |
| SentByEmail | string | Email address of the user that has inserted the document/task in the system | Any valid email address | None |
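Since metadata holds textual key-value pairs, a client may want to normalize values to strings before sending them. The Python sketch below shows one way to do that; `build_metadata` and the extra keys are hypothetical, and converting non-string values with `str()` is an assumption rather than documented behavior.

```python
def build_metadata(sent_by_email=None, **extra):
    """Build a metadata object whose keys and values are all strings."""
    metadata = {str(key): str(value) for key, value in extra.items()}
    if sent_by_email is not None:
        metadata["SentByEmail"] = sent_by_email
    return metadata


metadata = build_metadata(
    "someUser@someaddress.net",
    department="Finance",  # hypothetical custom key
    invoiceYear=2024,      # hypothetical custom key, stored as the string "2024"
)
print(metadata)
```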

Document Size Limits

For text documents, the following limits apply:

  • Images must be larger than 50x50 pixels and smaller than 10000x10000 pixels.

  • PDFs must be smaller than 17x17 inches.

A document may begin processing before these limits are verified. It will transition to the ValidationErrorInvalidDimensions status if it does not comply with these limits.

Additionally, when dealing with PDFs and OCR-enabled Workspaces, only the first 200 pages of the document will be processed. The remaining pages will contain no detections.
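These limits can be checked locally before a file is submitted. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, "larger than" and "smaller than" are read as strict comparisons applied to each side, and PDF page sizes are assumed to be given in points at the default 72 points per inch.

```python
MIN_IMAGE_SIDE_PX = 50
MAX_IMAGE_SIDE_PX = 10_000
MAX_PDF_SIDE_INCHES = 17
MAX_OCR_PAGES = 200
POINTS_PER_INCH = 72  # default PDF user-space unit


def image_within_limits(width_px, height_px):
    """True if both sides are strictly between 50 and 10000 pixels."""
    return (MIN_IMAGE_SIDE_PX < width_px < MAX_IMAGE_SIDE_PX
            and MIN_IMAGE_SIDE_PX < height_px < MAX_IMAGE_SIDE_PX)


def pdf_page_within_limits(width_pt, height_pt):
    """True if both sides of the page are strictly under 17 inches."""
    return (width_pt / POINTS_PER_INCH < MAX_PDF_SIDE_INCHES
            and height_pt / POINTS_PER_INCH < MAX_PDF_SIDE_INCHES)


def pages_processed_by_ocr(total_pages):
    """Number of pages that receive detections in an OCR-enabled Workspace."""
    return min(total_pages, MAX_OCR_PAGES)


print(image_within_limits(1240, 1754))   # typical scanned page
print(pdf_page_within_limits(612, 792))  # US Letter, 8.5 x 11 inches
print(pages_processed_by_ocr(350))
```

A local check like this only avoids obviously invalid uploads; the server still performs its own verification and reports ValidationErrorInvalidDimensions when a document does not comply.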

Uploading your own files

You can now upload your own files using the TUS Protocol - https://tus.io.

TUS is an open-source file upload protocol with widespread adoption by the community. It facilitates resumable uploads, with simple client and server implementations in many languages and frameworks.

To begin uploading content to SmartDocumentor, please choose a TUS Client in your language or framework of choice from the following page: https://tus.io/implementations. You may also find additional community client implementations around the web.

Here is an example implementation in C#, using the Tus.Net.Client .NET Client (https://github.com/hoss-green/Tus.Net.Client).

using System.Text;       // Encoding
using System.Text.Json;  // JsonSerializer
// Also import the Tus.Net.Client namespaces that provide TusClient, TusFile and TusOptions.

var tusUploadServerUrl = "https://cloud.smartdocumentor.net";
var serverEndpoint = $"{tusUploadServerUrl}/files/";
var token = "your access token";                        // Obtained via the "Client Credentials Access Token" request.
var contentType = "the content type of your file";      // e.g. application/pdf, image/png, image/jpeg, etc.
var workspaceId = 1234;                                 // The workspace ID you want to upload to.
var filePath = "/Path/To/Your/File";                    // The path to the file you wish to upload.
var customHeaders = new Dictionary<string, string>
{
    ["Authorization"] = $"Bearer {token}"
};
var tusOptions = new TusOptions { LogRequests = true };
var metadata = new Dictionary<string, string>
{
    ["workspaceid"] = workspaceId.ToString(),
    ["filetype"] = contentType
};
// Get a stream to read the contents
// This implementation assumes your file is stored locally on disk. 
// If not the case, you can create a temporary file to write data to, then delete it when upload is done. 
await using var fileStream = File.OpenRead(filePath);
var fileSize = fileStream.Length;
var createdEndpoint = await TusClient.CreateEndpointAsync(
    serverEndpoint,
    fileSize,
    filePath,
    contentType,
    customHeaders,
    metadata,
    tusOptions
);
var uploadUrl = $"{tusUploadServerUrl}{createdEndpoint.Location}";
var tusFile = new TusFile(
    fileStream,
    filePath,
    fileSize,
    contentType,
    uploadUrl,
    metadata,
    async (_, errorEvent) =>
    {
        // Handle Errors Here
        // Here you would also perform any cleanup logic if upload fails, such as deleting temporary files. 
    },
    (_, progress) =>
    {
        Console.WriteLine($"{progress.Percentage}%"); // Optionally show file upload progress here
    },
    async (_, _) =>
    {
        // On success, make a "POST Task Process" request here following the documentation,
        // but replacing the "url" parameter with the contents of the "uploadUrl" variable
        // Here you would also perform any cleanup logic after upload is complete, such as deleting temporary files.
        var client = new HttpClient();
        var request = new HttpRequestMessage(HttpMethod.Post, "https://cloud.smartdocumentor.net/api/v1.0/sdtask/process");
        request.Headers.Add("Authorization", $"Bearer {token}");
        var payload = new
        {
            displayName = "Example Document",
            workspaceId = workspaceId,
            context = new
            {
                url = uploadUrl
            }
        };
        string jsonPayload = JsonSerializer.Serialize(payload);
        var content = new StringContent(jsonPayload, Encoding.UTF8, "application/json");
        request.Content = content;
        var response = await client.SendAsync(request);
        response.EnsureSuccessStatusCode();
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
);
// Tus uploads are asynchronous, calling this method will trigger the callbacks above. 
await tusFile.UploadAsync(customHeaders, new TusOptions { LogRequests = true });

Regardless of your client choice, you must always include the following details in your request:

  • A custom header, Authorization, in the format "Bearer {token}" to authenticate the request with your access token.

  • A metadata dictionary, containing two entries:

    • workspaceid: the Workspace ID you wish to upload to. If null, will upload to the default workspace in your Organization.

    • filetype: the content type / MIME type of the file you are uploading, e.g. application/pdf, image/png, image/jpeg, etc.

Note that this example implementation only supports uploads up to 100MB, which is the server's maximum allowed upload chunk size. When working with other clients and/or larger files, you may set the chunk size to a value under 100MB. For best performance, we recommend 95MB.

Other client implementations may differ from the example above in the way these properties are passed to the request.
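For clients that expose raw headers, the requirements above map onto the standard TUS creation headers. The Python sketch below shows how the Authorization header, the metadata entries, and a chunking plan could be prepared without any TUS library. The function names are hypothetical, and reading "95MB" as 95 MiB is an assumption (95 MiB is 99,614,720 bytes, which stays under 100MB under either interpretation). In TUS 1.0.0, Upload-Metadata is a comma-separated list of pairs, each a key followed by a space and the Base64-encoded value.

```python
import base64

RECOMMENDED_CHUNK_BYTES = 95 * 1024 * 1024  # assumption: MB read as MiB


def encode_upload_metadata(metadata):
    """Encode a dict as a TUS Upload-Metadata header value."""
    pairs = []
    for key, value in metadata.items():
        encoded = base64.b64encode(str(value).encode("utf-8")).decode("ascii")
        pairs.append(f"{key} {encoded}")
    return ",".join(pairs)


def tus_creation_headers(token, file_size, content_type, workspace_id=None):
    """Headers for the TUS creation request that starts an upload."""
    metadata = {"filetype": content_type}
    if workspace_id is not None:  # omitted: the default workspace is used
        metadata["workspaceid"] = workspace_id
    return {
        "Authorization": f"Bearer {token}",
        "Tus-Resumable": "1.0.0",
        "Upload-Length": str(file_size),
        "Upload-Metadata": encode_upload_metadata(metadata),
    }


def chunk_ranges(file_size, chunk_size=RECOMMENDED_CHUNK_BYTES):
    """Yield (offset, length) pairs that cover the file in order."""
    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        yield offset, length
        offset += length


headers = tus_creation_headers("your-access-token", 250 * 1024 * 1024,
                               "application/pdf", workspace_id=1234)
print(headers["Upload-Metadata"])
print(list(chunk_ranges(250 * 1024 * 1024)))
```

Most TUS client libraries build these headers internally, so this is mainly useful for checking what a given client sends or for writing a thin client by hand.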

Example Request

cURL

curl --location 'https://cloud.smartdocumentor.net/api/v1.0/sdtask/process' \
--header 'Authorization: Bearer YOUR_ACCESS_TOKEN' \
--header 'Content-Type: application/json' \
--data '{
    "displayName": "Example Document",
    "workspaceId": 1,
    "context": {
        "url": "https://formrecognizer.appliedai.azure.com/documents/samples/prebuilt/invoice-english.pdf"
    },
    "metadata": {
        "SentByEmail": "someUser@someaddress.net"
    }
}'

C#

var client = new HttpClient();

var request = new HttpRequestMessage(HttpMethod.Post, "https://cloud.smartdocumentor.net/api/v1.0/sdtask/process");
request.Headers.Add("Authorization", "Bearer YOUR_ACCESS_TOKEN");

var content = new StringContent("{\r\n    \"displayName\": \"Example Document\",\r\n    \"workspaceId\": 1,\r\n    \"context\": {\r\n        \"url\": \"https://formrecognizer.appliedai.azure.com/documents/samples/prebuilt/invoice-english.pdf\"\r\n    },\r\n    \"metadata\": {\r\n        \"SentByEmail\": \"someUser@someaddress.net\"\r\n    }\r\n}", null, "application/json");
request.Content = content;

var response = await client.SendAsync(request);
response.EnsureSuccessStatusCode();

Console.WriteLine(await response.Content.ReadAsStringAsync());

Python

import requests
import json

url = "https://cloud.smartdocumentor.net/api/v1.0/sdtask/process"

payload = json.dumps({
  "displayName": "Example Document",
  "workspaceId": 1,
  "context": {
    "url": "https://formrecognizer.appliedai.azure.com/documents/samples/prebuilt/invoice-english.pdf"
  },
  "metadata": {
    "SentByEmail": "someUser@someaddress.net"
  }
})
headers = {
  'Authorization': 'Bearer YOUR_ACCESS_TOKEN',
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

JavaScript

var myHeaders = new Headers();
myHeaders.append("Authorization", "Bearer YOUR_ACCESS_TOKEN");
myHeaders.append("Content-Type", "application/json");

var raw = JSON.stringify({
    "displayName": "Example Document",
    "workspaceId": 1,
    "context": {
        "url": "https://formrecognizer.appliedai.azure.com/documents/samples/prebuilt/invoice-english.pdf"
    },
    "metadata": {
        "SentByEmail": "someUser@someaddress.net"
    }
});

var requestOptions = {
  method: 'POST',
  headers: myHeaders,
  body: raw,
  redirect: 'follow'
};

fetch("https://cloud.smartdocumentor.net/api/v1.0/sdtask/process", requestOptions)
  .then(response => response.text())
  .then(result => console.log(result))
  .catch(error => console.log('error', error));

Responses

{
  "taskId": 123123,
  "displayName": "Example Document",
  "description": null,
  "totalPages": 111,
  "taskStatus": 30,
  "taskStatusText": "QueuedForPreProcessing",
  "integrationStatus": 1,
  "integrationStatusText": "Never",
  "workspaceId": 1,
  "requestId": "43af5dd0-62cf-4fe5-93d7-2c4f5f65da47",
  "taskAssignment": null
}

Response Body CreateTaskResponse

| Name | Type | Description |
| --- | --- | --- |
| taskId | integer | The unique identifier assigned to your Task. |
| displayName | string | The display name you assigned to your Task, or an auto-generated one in case you did not provide a name. |
| description | string (optional) | The description you provided to your Task. |
| externalId | string (optional) | The External ID you assigned to your Task. |
| taskStatus | | The processing status of this Task. The initial Task status is QueuedForPreProcessing. |
| taskStatusText | string | A textual representation of the Task Status value for ease of use. |
| integrationStatus | | The integration status of this Task. The initial Task integration status is Never. |
| integrationStatusText | string | A textual representation of the Task Integration Status value for ease of use. |
| workspaceId | integer | The workspace ID you assigned to this Task, or the ID of the default Workspace in your Organization, if none was provided. |
| requestId | guid | Issued by the system and unique to the request just performed. |
| batchId | integer (optional) | The ID of the Batch this task belongs to. |
| taskAssignment | object (optional) | |
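A client will typically read the response into a typed structure and keep the taskId for later status checks. The Python sketch below parses the example response shown earlier. The `CreateTaskResponse` dataclass and `parse_create_task_response` are hypothetical names, and treating taskStatus and integrationStatus as plain integers follows the example payload rather than a documented type.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class CreateTaskResponse:
    task_id: int
    display_name: str
    task_status: int
    task_status_text: str
    integration_status: int
    integration_status_text: str
    workspace_id: int
    request_id: str
    description: Optional[str] = None
    external_id: Optional[str] = None
    batch_id: Optional[int] = None
    task_assignment: Optional[dict] = None


def parse_create_task_response(raw):
    """Map the JSON response body onto a CreateTaskResponse."""
    data = json.loads(raw)
    return CreateTaskResponse(
        task_id=data["taskId"],
        display_name=data["displayName"],
        task_status=data["taskStatus"],
        task_status_text=data["taskStatusText"],
        integration_status=data["integrationStatus"],
        integration_status_text=data["integrationStatusText"],
        workspace_id=data["workspaceId"],
        request_id=data["requestId"],
        # Optional fields may be missing or null in the response.
        description=data.get("description"),
        external_id=data.get("externalId"),
        batch_id=data.get("batchId"),
        task_assignment=data.get("taskAssignment"),
    )


raw = """{
  "taskId": 123123,
  "displayName": "Example Document",
  "description": null,
  "totalPages": 111,
  "taskStatus": 30,
  "taskStatusText": "QueuedForPreProcessing",
  "integrationStatus": 1,
  "integrationStatusText": "Never",
  "workspaceId": 1,
  "requestId": "43af5dd0-62cf-4fe5-93d7-2c4f5f65da47",
  "taskAssignment": null
}"""
task = parse_create_task_response(raw)
print(task.task_id, task.task_status_text)
```

Fields that are not modelled, such as totalPages in the example, are ignored by this sketch.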

Response body has no content.

Note: Returned if:

  • The provided workspace (or the default workspace) does not have a processing pipeline defined. All default workspaces have a pipeline.

  • The context property contains unsupported properties.

  • Your license has expired or has no free volume.

  • The provided batchId does not belong to the provided workspace (or to the default workspace, if no workspace is provided).

  • You are trying to split a task in a Template workspace.

  • You are trying to use the taskAssignment object property under the following conditions:

    • In a Template workspace.

    • Without providing the assignedUsers property or leaving it empty.

    • By including an email in the assignedUsers property that belongs to an existing tenant user who lacks task review permissions.

    • By including an email in the assignedUsers property that belongs to an existing tenant user who is currently disabled.

Note: Returned if:

  • The provided workspace does not exist.

  • The provided batch does not exist.

Response body has no content.

Note: Returned if:

  • The URL points to an unreachable resource or machine, causing timeouts.


Last updated 16 days ago

Field descriptions

The following descriptions complete the entries in the tables on this page.

  • workspaceId (request body): The ID of the workspace to add this task to. Note that each workspace processes a specific Task Type. If not specified, the default workspace in your tenant is used.

  • context (request body): An untyped object, used by our AI backend to initiate processing. The url property is required and must point to a public or SAS-secured file. See the Working with context section for details.

  • taskAssignment (request body): Defines the review process for an imported task by specifying assigned users and the order in which they must review it. When set, only the designated users can review the task, in the specified order. See the Working with taskAssignment section for details. This parameter is not applicable to tasks with Labelling.

  • splitTaskInChunks (context): Set to true to separate large files into smaller parts, which can be reviewed individually and are automatically exported as a single document when all tasks are reviewed. When taskAssignment is defined, its properties are inherited by the resulting smaller parts.

  • order (taskAssignment), supported values: Any value from the Task Assignment Order Type enum.

  • taskStatus and integrationStatus (response body): Typed as the TaskStatus and IntegrationStatus enums, respectively.

  • taskAssignment (response body): Object containing the email addresses of users assigned to the task, along with the designated Task Assignment Order Type.

  • taskAssignment (error conditions): The request is also rejected when taskAssignment is used without providing the order property, or with an invalid value for it.