Take Migration Process To Next Level Using Dexecutor

Suppose you have a data migration process that updates an application from version X to X+1 by running migration scripts sequentially (each script is a sequence of instructions) to bring the application to the desired state.

Problem

The sequential process causes delays, leading to unproductive wait times and dissatisfied users. We need a way to reduce script execution time by running tasks in parallel wherever possible, while still reaching the desired state.

Driving Forces

The following are the driving forces behind choosing Dexecutor.

  • Supports parallel execution, and can conditionally fall back to sequential execution (if you supply that logic)
  • Ultra light (Version 1.1.1 is 44KB)
  • Ultra fast
  • Distributed Execution supported
  • Immediate/Scheduled Retry logic supported
  • Non-terminating behaviour supported
  • Conditionally skip the task execution

Solution

Incorporate Dexecutor into your script execution logic, and optionally distribute the execution using Infinispan, Hazelcast or Ignite. Here is a sample application that demonstrates this functionality; fork it and have fun 🙂

Dexecutor fits this case well: add an algorithm on top of it that builds the dependency graph based on table names. Let's assume the following scripts:

Script 1 ==> operates on Tables t1 and t2 and takes 5 minutes
Script 2 ==> operates on Tables t1 and t3 and takes 5 minutes
Script 3 ==> operates on Tables t2 and t4 and takes 5 minutes
Script 4 ==> operates on Tables t5 and t6 and takes 5 minutes
Script 5 ==> operates on Tables t5 and t7 and takes 5 minutes
Script 6 ==> operates on Tables t6 and t8 and takes 5 minutes

Normally these scripts are executed sequentially as follows.

Script 1  5 minutes
  |
  V
Script 2  5 minutes
  |
  V
Script 3  5 minutes
  |
  V
Script 4  5 minutes
  |
  V
Script 5  5 minutes
  |
  V
Script 6  5 minutes

Total time 30 minutes 

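In the sequential case, the total execution time is 30 minutes. However, if we parallelize the script execution while making sure dependent scripts still run in the right order, we can cut the total execution time to just 10 minutes. Scripts 2 and 3 each share a table with Script 1 (t1 and t2 respectively) but none with each other, so they can run side by side once Script 1 finishes; Scripts 4, 5 and 6 form a second, independent group of the same shape.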

       +----------+                       +----------+
       | Script 1 |                       | Script 4 |             ==> 5 minutes
  +----+----------+--+               +----+----------+-----+
  |                  |               |                     |
  |                  |               |                     |
+-----v----+   +-----v----+     +----v-----+        +------v---+
| Script 2 |   | Script 3 |     | Script 5 |        | Script 6 |   ==> 5 minutes
+----------+   +----------+     +----------+        +----------+

Total Time 10 minutes
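The grouping in the diagram can be derived mechanically from the table names. The following is a minimal JDK-only sketch, not Dexecutor code and not necessarily the rule MigrationTasksExecutor applies: it assumes a script depends on the most recent earlier script that touched any of its tables, and that enough worker threads are available to run every ready script at once. The class and method names (TableDependencySketch, buildParents, parallelMinutes) are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TableDependencySketch {

    /** The six scripts from the text, in declaration order, with the tables each one touches. */
    static Map<String, List<String>> sampleScripts() {
        Map<String, List<String>> scripts = new LinkedHashMap<>();
        scripts.put("Script 1", List.of("t1", "t2"));
        scripts.put("Script 2", List.of("t1", "t3"));
        scripts.put("Script 3", List.of("t2", "t4"));
        scripts.put("Script 4", List.of("t5", "t6"));
        scripts.put("Script 5", List.of("t5", "t7"));
        scripts.put("Script 6", List.of("t6", "t8"));
        return scripts;
    }

    /**
     * Maps every script to its parents. Assumed rule for this sketch: a parent is
     * the most recent earlier script that touched one of the script's tables.
     */
    static Map<String, Set<String>> buildParents(Map<String, List<String>> scripts) {
        Map<String, String> lastScriptForTable = new HashMap<>();
        Map<String, Set<String>> parents = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> script : scripts.entrySet()) {
            Set<String> found = new LinkedHashSet<>();
            for (String table : script.getValue()) {
                String previous = lastScriptForTable.get(table);
                if (previous != null) {
                    found.add(previous);
                }
            }
            // Register this script as the latest one for its tables only after the lookups.
            for (String table : script.getValue()) {
                lastScriptForTable.put(table, script.getKey());
            }
            parents.put(script.getKey(), found);
        }
        return parents;
    }

    /** Elapsed minutes if each script starts as soon as all of its parents have finished. */
    static int parallelMinutes(Map<String, Set<String>> parents, int minutesPerScript) {
        Map<String, Integer> finishedAt = new HashMap<>();
        int total = 0;
        // Parents always come from earlier scripts, so they are already in finishedAt.
        for (Map.Entry<String, Set<String>> script : parents.entrySet()) {
            int start = 0;
            for (String parent : script.getValue()) {
                start = Math.max(start, finishedAt.get(parent));
            }
            int end = start + minutesPerScript;
            finishedAt.put(script.getKey(), end);
            total = Math.max(total, end);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> parents = buildParents(sampleScripts());
        System.out.println(parents);
        System.out.println("Sequential: " + parents.size() * 5 + " minutes");
        System.out.println("Parallel:   " + parallelMinutes(parents, 5) + " minutes");
    }
}
```

For the six scripts above, this reports 30 minutes sequentially and 10 minutes in parallel, with Script 1 as the only parent of Scripts 2 and 3, and Script 4 as the only parent of Scripts 5 and 6.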

Using Dexecutor, we only have to write the algorithm that builds the graph through the API Dexecutor exposes; Dexecutor takes care of the rest. MigrationTasksExecutor implements that algorithm, based on the SQL statements in the migration scripts. Since the table names in the SQL play a crucial role in building the graph, we need an efficient, lightweight and fast library to extract table names from SQL, so we use sql-table-name-parser. Add the following dependency to your POM.

<dependency>
    <groupId>com.github.mnadeem</groupId>
    <artifactId>sql-table-name-parser</artifactId>
    <version>0.0.2</version>
</dependency>

And of course, Dexecutor should be added as a dependency as well:

<dependency>
    <groupId>com.github.dexecutor</groupId>
    <artifactId>dexecutor-core</artifactId>
    <version>LATEST_VERSION</version>
</dependency>
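To show why table names matter for the graph, here is a deliberately naive, JDK-only stand-in for table-name extraction. It is not sql-table-name-parser and does not use its API; the class name NaiveTableNameSketch and the regular expression are assumptions made purely for illustration. A real parser handles many cases this misses, such as comments, quoted identifiers and comma-separated FROM lists.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveTableNameSketch {

    // Assumption: a table name is the identifier right after one of these keywords.
    private static final Pattern TABLE_AFTER_KEYWORD = Pattern.compile(
            "\\b(?:from|join|into|update|table)\\s+([A-Za-z_][A-Za-z0-9_.]*)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the lower-cased table names found in the statement, in order of appearance. */
    static Set<String> tables(String sql) {
        Set<String> names = new LinkedHashSet<>();
        Matcher matcher = TABLE_AFTER_KEYWORD.matcher(sql);
        while (matcher.find()) {
            names.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(tables("UPDATE t1 SET a = (SELECT b FROM t2 WHERE t2.id = t1.id)"));
        System.out.println(tables("INSERT INTO t3 SELECT x FROM t1"));
    }
}
```

Two scripts whose extracted table sets overlap must be ordered relative to each other; scripts with disjoint table sets are candidates for parallel execution.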

The graph built from the migration scripts is the following.

 

[Figure: dexecutor-graph]

As can be seen here, nodes base1, base3 and base4 run in parallel, and as soon as one of them finishes, its children are executed. For example, once node base1 finishes, its children base2 and app3-1 are executed, and so on.

Notice that for node app2-4 to start, both app1-4 and app2-1 must finish; similarly, for node app3-2 to start, both app3-1 and app2-4 must finish.
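The "a node starts only after all of its parents finish" rule can be illustrated with plain JDK futures. This is a sketch of the scheduling idea only, not Dexecutor's implementation or API. It uses just the edges named in the text; treating app1-4 and app2-1 as roots is an assumption, since their own parents are not described here, and the class name ParentGatedExecutionSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParentGatedExecutionSketch {

    /** Node -> parents, listed so that every parent appears before its children. */
    static Map<String, List<String>> sampleParents() {
        Map<String, List<String>> parents = new LinkedHashMap<>();
        parents.put("base1", List.of());
        parents.put("app1-4", List.of());   // treated as a root here (assumption)
        parents.put("app2-1", List.of());   // treated as a root here (assumption)
        parents.put("base2", List.of("base1"));
        parents.put("app3-1", List.of("base1"));
        parents.put("app2-4", List.of("app1-4", "app2-1"));
        parents.put("app3-2", List.of("app3-1", "app2-4"));
        return parents;
    }

    /** Runs every node on a thread pool and returns the order in which nodes completed. */
    static List<String> run(Map<String, List<String>> parents) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<String> completionOrder = Collections.synchronizedList(new ArrayList<>());
        Map<String, CompletableFuture<Void>> futures = new LinkedHashMap<>();
        try {
            for (Map.Entry<String, List<String>> node : parents.entrySet()) {
                // Futures of all parents; an empty array means the node can start right away.
                CompletableFuture<?>[] waitFor = node.getValue().stream()
                        .map(futures::get)
                        .toArray(CompletableFuture[]::new);
                // The "script" here only records its name; real work would go in this lambda.
                futures.put(node.getKey(), CompletableFuture.allOf(waitFor)
                        .thenRunAsync(() -> completionOrder.add(node.getKey()), pool));
            }
            // Block until every node in the graph has finished.
            CompletableFuture.allOf(futures.values().toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
        return completionOrder;
    }

    public static void main(String[] args) {
        System.out.println(run(sampleParents()));
    }
}
```

The exact completion order varies between runs, but app2-4 always appears after both app1-4 and app2-1, and app3-2 always appears after both app3-1 and app2-4.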

Just run this class to see how things proceed.

Conclusion

We can indeed run dependent and independent tasks in an easy and reliable way with Dexecutor.

References