The following activities are automatically triggered when a new Batch process is executed:
A new Batch entry is added to Cassandra.
A new Job entry is also recorded in the k2system_jobs table, with the following parameters:
The WAITING_FOR_JOB status is then assigned to the Batch process.
After this, any available node (or the node whose affinity has been specified in the Batch command) handles the execution of the Job and of the subcommand it carries.
Once the corresponding Job starts and is set to the IN_PROCESS stage, the Batch process undergoes the following stages:
The illustration below shows how, once triggered from the command line, an asynchronous batch process is automatically encapsulated into a Job process.
The Job process then launches the batch command, which in turn is executed through its lifecycle phases.
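The Job entry created for the Batch process can be inspected from cqlsh at any stage. As a minimal sketch, assuming the jobs table described above resides in a keyspace named k2system:
cassandra@cqlsh:k2system> select * from k2system_jobs;
The Job's status should move from WAITING_FOR_JOB to IN_PROCESS once a node picks it up.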
To schedule a Batch process to run at a given time or on a recurring basis, a scheduled Job process must be created. This can be achieved using a user job that contains the batch command to be invoked repeatedly.
In practice, this consists of creating a scheduled Job that calls a Batch process, which in turn creates one-time Jobs (scheduled or multiple), each parameterized with the execution settings parsed from the Batch command. An example of such a wrapped command is sketched below.
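For illustration, assuming a scheduled user job named sync_autodata_job (a hypothetical name), each of its runs would simply invoke the wrapped batch command, for example:
BATCH AUTODATA_DELTA FROM idsFile USING ('select id from ids limit 100') FABRIC_COMMAND="sync_instance AUTODATA_DELTA.?" with ASYNC='true';
Each such invocation then goes through the flow described in the previous section: a temporary Job is created, runs the batch command, and terminates.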
The illustration below describes the following steps:
The user defines a scheduled job to run a specific batch command. Fabric assigns a Job process to this batch command.
The dedicated job runs the scheduled or repeated instances of the batch command.
The Batch process triggers a new (temporary) job dedicated to this specific process, as described in the section above.
The new job runs the batch command.
The Jobs table is updated for the next run, and the dedicated job waits for the next instance of the scheduled batch process.
All batch-related information is stored in the batchprocess_list table, in the k2batchprocess keyspace.
cassandra@cqlsh:k2batchprocess> select * from batchprocess_list;
Additional fields featured in the table:
Command
BATCH AUTODATA_DELTA FROM idsFile USING ('select id from ids limit 100') FABRIC_COMMAND="sync_instance AUTODATA_DELTA.?" with JOB_AFFINITY='10.21.2.102' ASYNC='true';
In this case, the command describes the synchronization of a list of IDs, with affinity set to node 10.21.2.102.
extra_stats
This field lists the slowest-processed entities, along with each one's ID, processing time, status, and field changes:
{"slowestProcessed":[{"entityId":"4","processTimeMS":572,"status":"COMPLETED","result":"{\"Added\":1,\"Updated\":0,\"Unchanged\":0}"},{"entityId":"5","processTimeMS":573,"status":"COMPLETED","result":"{\"Added\":1,\"Updated\":0,\"Unchanged\":0}"},{"entityId":"47","processTimeMS":645,"status":"COMPLETED","result":"{\"Added\":1,\"Updated\":0,\"Unchanged\":0}"}
When executed asynchronously (ASYNC flag set to 'true'), the batch process inherits the Job's ability to be transferred to a different node when its current node is no longer active or no longer responding.
This handover mechanism uses the heartbeat and keepalive parameters defined in the node.id file.
The next handling node picks up the batch process (via its associated job) and resumes its execution from the latest recorded stage.
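As an illustrative sketch only, such parameters are plain key=value entries; the key names below are hypothetical, not the actual node.id keys:
# hypothetical key names - consult the actual node.id file for the exact parameters
heartbeat_interval_ms=5000
keepalive_timeout_ms=30000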
Each Fabric node uses the built-in BatchProcessAPI and Job Manager classes to manage the Batch process through its lifecycle stages, as shown in the illustrations above.
When a migration process is initiated, it is treated as a batch of multiple entity synchronization processes (as in the BATCH command shown above, where each fetched ID is synchronized by a sync_instance subcommand).
The illustration below shows the sequence of actions involved in this process.