Task Operations¶
This page discusses operations relevant to Task management. Please go over the Task State Machine to understand the different states a task can be in and how operations (and external changes) move a task from one state to another.
Note
Please go through Cluster Op Spec to understand the operation parameters being sent.
For tasks, only the timeout parameter is relevant.
Note
Only one operation can be active at a time on a particular task, identified by {sourceAppName,taskId}.
Warning
Only the leader controller will accept and process operations. To avoid confusion, use the controller endpoint exposed by Drove Gateway to issue commands.
Cluster Operation Specification¶
When an operation is submitted to the cluster, a cluster op spec needs to be specified. It controls different aspects of the operation, such as its parallelism and its timeout.
The following aspects of an operation can be configured:
Name | Option | Description |
---|---|---|
Timeout | timeout | The duration after which Drove considers the operation to have timed out. |
Parallelism | parallelism | Parallelism of the task. (Range: 1-32) |
Failure Strategy | failureStrategy | Set this to STOP. |
Note
For internal recovery operations, Drove generates its own operations. For these, Drove applies the following cluster operation spec:
- timeout - 300 seconds
- parallelism - 1
- failureStrategy - STOP
The default operation spec can be configured in the controller configuration file. It is recommended to set the parallelism to something like 8 for faster recovery.
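As a rough illustration of the options in the table above, the following Python sketch assembles an op spec and rejects a parallelism value outside the documented 1-32 range. The helper name `build_op_spec` and its defaults are hypothetical and not part of Drove. The check runs only on the client side and is no substitute for the controller's own validation.

```python
from typing import Any, Dict

# Limits taken from the table above; the helper itself is hypothetical.
MIN_PARALLELISM = 1
MAX_PARALLELISM = 32


def build_op_spec(timeout: str = "5m", parallelism: int = 1) -> Dict[str, Any]:
    """Return a cluster op spec dict, rejecting out-of-range parallelism."""
    if not MIN_PARALLELISM <= parallelism <= MAX_PARALLELISM:
        raise ValueError(
            f"parallelism must be between {MIN_PARALLELISM} and {MAX_PARALLELISM}"
        )
    return {
        "timeout": timeout,
        "parallelism": parallelism,
        # STOP is the only failure strategy this page documents.
        "failureStrategy": "STOP",
    }
```

For example, `build_op_spec("5m", 8)` yields a spec with the parallelism recommended above for faster recovery.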
How to initiate an operation¶
Tip
Use the Drove CLI to perform all manual operations.
All operations for task lifecycle management need to be issued via a POST HTTP call to the leader controller endpoint on the path /apis/v1/tasks/operations. The API will return HTTP OK/200 and a relevant JSON response as payload.
Sample api call:
curl --location 'http://drove.local:7000/apis/v1/tasks/operations' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Basic YWRtaW46YWRtaW4=' \
    --data '{
        "type": "KILL",
        "sourceAppName": "TEST_APP",
        "taskId": "T0012",
        "opSpec": {
            "timeout": "5m",
            "parallelism": 1,
            "failureStrategy": "STOP"
        }
    }'
Note
In the above example, http://drove.local:7000 is the endpoint of the leader, TEST_APP is the name of the application that started this task, and taskId is a unique client-generated ID. Authorization is basic auth.
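The same call can be prepared from a script. The sketch below uses only the Python standard library and builds the POST request without sending it. The function name `build_operation_request` is hypothetical, and the host and the `admin`/`admin` credentials are simply the values from the sample above (the sample header is the Base64 encoding of `admin:admin`).

```python
import base64
import json
import urllib.request
from typing import Any, Dict

# Path documented above for all task lifecycle operations.
OPERATIONS_PATH = "/apis/v1/tasks/operations"


def build_operation_request(
    base_url: str, username: str, password: str, payload: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying a task operation."""
    # Basic auth: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + OPERATIONS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the request easy to inspect before it reaches the controller.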
Warning
Task operations are not cancellable.
Create a task¶
A task can be created by issuing the following command.
Preconditions:
- Task with same {sourceAppName,taskId} should not exist on the cluster.
State Transition:
- none → PENDING → PROVISIONING → STARTING → RUNNING → RUN_COMPLETED → DEPROVISIONING → STOPPED
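For client code that tracks progress, the transition order above can be encoded as a list. This is a minimal sketch covering only the states listed here. Any other states defined in the Task State Machine are not handled, and the helper names are hypothetical.

```python
from typing import List

# Happy-path order for a created task, as listed above.
CREATE_PATH: List[str] = [
    "PENDING",
    "PROVISIONING",
    "STARTING",
    "RUNNING",
    "RUN_COMPLETED",
    "DEPROVISIONING",
    "STOPPED",
]


def has_reached(current: str, target: str) -> bool:
    """True if `current` is at or past `target` on the happy path.

    Raises ValueError for a state that is not on this path.
    """
    return CREATE_PATH.index(current) >= CREATE_PATH.index(target)


def is_finished(current: str) -> bool:
    """True once the task is in the last state of the path."""
    return current == CREATE_PATH[-1]
```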
To create a task a Task Spec needs to be created first.
Once the spec is ready, issue the following CLI command or send the payload shown below:
drove -c local tasks create sample/test_task.json
Sample Request Payload
{
    "type": "CREATE",
    "spec": {...}, //(1)!
    "opSpec": { //(2)!
        "timeout": "5m",
        "parallelism": 1,
        "failureStrategy": "STOP"
    }
}
- Spec as mentioned in Task Specification
- Operation spec as mentioned in Cluster Op Spec
Sample response
{
    "status": "SUCCESS",
    "data": {
        "taskId": "TEST_APP-T0012"
    },
    "message": "success"
}
Warning
There are no separate create/run steps in a task. Creation will start execution automatically and immediately.
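A script that creates tasks has to wrap the task spec in a CREATE payload and read the task ID back from the response. The sketch below shows one way to do both. The function names are hypothetical, the task spec is passed in as an opaque dict because its contents are defined in Task Specification, and treating any status other than SUCCESS as an error is an assumption, since this page shows only the success response.

```python
import json
from typing import Any, Dict


def build_create_payload(
    task_spec: Dict[str, Any], op_spec: Dict[str, Any]
) -> Dict[str, Any]:
    """Wrap a task spec and an op spec into a CREATE operation payload."""
    return {"type": "CREATE", "spec": task_spec, "opSpec": op_spec}


def extract_task_id(response_body: str) -> str:
    """Return data.taskId from a JSON response body.

    Assumption: any status other than SUCCESS is treated as a failure.
    """
    body = json.loads(response_body)
    if body.get("status") != "SUCCESS":
        raise RuntimeError(f"operation was not accepted: {body.get('message')}")
    return body["data"]["taskId"]
```

The `task_spec` argument would typically be loaded with `json.load` from a file such as `sample/test_task.json`.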
Kill a task¶
A task can be killed by issuing the following command.
Preconditions:
- Task with same {sourceAppName,taskId} needs to exist on the cluster.
State Transition:
- RUNNING → RUN_COMPLETED → DEPROVISIONING → STOPPED
Issue the following CLI command or send the payload shown below:
drove -c local tasks kill TEST_APP T0012
Sample Request Payload
{
    "type": "KILL",
    "sourceAppName": "TEST_APP", //(1)!
    "taskId": "T0012", //(2)!
    "opSpec": { //(3)!
        "timeout": "5m",
        "parallelism": 1,
        "failureStrategy": "STOP"
    }
}
- Source app name as mentioned in spec during task creation
- Task ID as mentioned in the spec
- Operation spec as mentioned in Cluster Op Spec
Sample response
{
    "status": "SUCCESS",
    "data": {
        "taskId": "T0012"
    },
    "message": "success"
}
Note
Task metadata will remain on the cluster for some time. Metadata cleanup for tasks is automatic and can be configured in the controller configuration.
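The KILL payload identifies the task by the same {sourceAppName,taskId} pair used at creation. A minimal Python sketch of assembling it follows. The helper name is hypothetical. Only the timeout is exposed as a parameter because, as noted at the top of this page, it is the only op spec parameter relevant for tasks. The other two values are copied from the sample payload.

```python
from typing import Any, Dict


def build_kill_payload(
    source_app_name: str, task_id: str, timeout: str = "5m"
) -> Dict[str, Any]:
    """Assemble a KILL operation payload for one task."""
    return {
        "type": "KILL",
        "sourceAppName": source_app_name,
        "taskId": task_id,
        "opSpec": {
            "timeout": timeout,
            # Values copied from the sample payload above.
            "parallelism": 1,
            "failureStrategy": "STOP",
        },
    }
```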