Updated March 30, 2023
Introduction to Spring Batch Partitioner
Spring Batch partitioning splits a large data set into ranges and processes each range on its own thread. The ranges are defined programmatically, and the number of threads depends on the use case. To make this parallel processing possible, the step of the batch job must be partitioned.
What is a spring batch partitioner?
- Spring Batch is an open-source batch processing framework. By default, a Spring Batch job runs on a single thread.
- With large inputs, single-threaded execution can take a significant amount of time.
- Partitioned step execution addresses this: each worker handles a small chunk of the work, and all workers run at the same time.
- For example, data from several CSV files can be loaded into the database at the same time, which reduces elapsed time and improves efficiency.
- Several options are available for implementing a job with parallel processing. At a high level, there are two types of parallel processing:
- Multi-Process
- Multi-threaded, single-process
- A partitioner is useful when there are millions of records to read from a source system and processing them all on a single thread would take too long.
- Reading and processing the data on several threads makes better use of the system's resources.
- The Spring Batch partitioner divides the batch job into parts and runs them on several threads, which speeds up the job.
- In the CSV case, the partitioner lets each file be handled by a different thread.
- Batch jobs run in the background and do not require user interaction, so scaling them differs from scaling an online application.
- As a result, the response time to a user request is not a useful performance indicator for a batch job.
- Batch jobs do have time constraints. They typically run at night and have a fixed window in which to finish, so the goal of scaling a batch job is to meet the required execution time.
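The idea of one thread per CSV file can be sketched in plain Java before any Spring Batch code is involved. This is only an illustration under assumptions: the class name ParallelCsvSketch is hypothetical, each "file" is an in-memory list of lines whose first line is a header, and a fixed thread pool stands in for what a partitioned step would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java illustration only; this is not Spring Batch API.
final class ParallelCsvSketch {

    // Counts the data rows of one CSV file given as lines; the first line is the header.
    static int countRows(List<String> csvLines) {
        return Math.max(0, csvLines.size() - 1);
    }

    // Runs one counting task per file on a fixed pool and adds up the results.
    static int countAll(List<List<String>> files, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (List<String> file : files) {
                tasks.add(() -> countRows(file));
            }
            int total = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Integer> done : pool.invokeAll(tasks)) {
                total += done.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("a file task failed", ex);
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool size plays the role that the number of threads plays for the partitioner: more threads only help while there are files left to hand out.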
Spring batch partitioner step
- The Partitioner is the central strategy interface for producing ExecutionContext instances as input parameters for a partitioned step.
- In this example, the partitioner queries the table for the MIN and MAX id values and divides all records between them into partitions.
- We use gridSize = number of threads for the partitioner. A custom value can be chosen as needed.
- The following steps are used when developing a partitioner application:
- Create a project template by using Spring Initializr.
- After creating the project, open the project template in Spring Tool Suite.
- Add the Spring Batch dependency.
- Then create a partitioner and define the range of records.
- After creating the partitioner, create the batch job and the processor.
- Load the data and run the application.
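The MIN/MAX split described above can be sketched in plain Java. This is a minimal sketch, not the article's partitioner: the class name RangeSplitSketch and the key prefix "partition" are hypothetical, and an int[] {min, max} pair stands in for the ExecutionContext that a real Partitioner would return per partition.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in: int[] {min, max} plays the role of an ExecutionContext.
final class RangeSplitSketch {

    // Splits the inclusive id range [minId, maxId] into at most gridSize ranges.
    static Map<String, int[]> split(int minId, int maxId, int gridSize) {
        if (gridSize < 1 || maxId < minId) {
            throw new IllegalArgumentException("need gridSize >= 1 and maxId >= minId");
        }
        int size = (maxId - minId) / gridSize + 1; // ids per partition
        Map<String, int[]> partitions = new LinkedHashMap<>();
        int number = 0;
        for (int start = minId; start <= maxId; start += size) {
            int end = Math.min(start + size - 1, maxId); // last range may be shorter
            partitions.put("partition" + number++, new int[] {start, end});
        }
        return partitions;
    }
}
```

With ids 1 to 100 and a gridSize of 4, split(1, 100, 4) yields the ranges 1-25, 26-50, 51-75 and 76-100. When there are fewer ids than gridSize, fewer partitions are produced.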
Spring batch partitioner example
The following example shows how to use the Spring Batch partitioner.
- Create the project template for the Spring Batch partitioner by using Spring Initializr
In this step, we provide the project group name com.example, the artifact name SpringBatchPartitioner, the project name SpringBatchPartitioner, and Java version 8. We also set the Spring Boot version to 2.6.0 and the project type to Maven.
We select the Spring Web, Spring Batch, and PostgreSQL Driver dependencies to implement the Spring Batch partitioner project.
Group – com.example
Artifact name – SpringBatchPartitioner
Name – SpringBatchPartitioner
Spring boot – 2.6.0
Project – Maven
Java – 8
Package name – com.example.SpringBatchPartitioner
Project Description – Project for SpringBatchPartitioner
Dependencies – spring web, PostgreSQL driver, spring batch.
- After generating the project, extract the files and open the project by using Spring Tool Suite –
After generating the project with Spring Initializr, we extract the downloaded archive and open the project in Spring Tool Suite.
- After opening the project in Spring Tool Suite, check the project and its files –
In this step, we check all the project template files, as well as the Maven dependencies and system libraries.
- Add dependency packages –
In this step, we add the Spring Batch and PostgreSQL dependencies to the project.
Code –
<!-- Spring Batch starter -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<!-- PostgreSQL JDBC driver -->
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
</dependency>
- Create database, tables and add records in stud table –
Code –
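create database springbatchpartitioner;
create table stud(user_id int);
-- Example rows only; the row count of 100 is an assumption.
insert into stud (user_id) select generate_series(1, 100);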
- Configure application.properties file –
Code –
spring.datasource.url=jdbc:postgresql://localhost/springbatchpartitioner
spring.datasource.driverClassName=org.postgresql.Driver
spring.datasource.username=postgres
spring.datasource.password=postgres
spring.batch.job.enabled=false
- Create partitioner –
Code –
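// Range partitioner: splits the MIN..MAX values of one column into up to gridSize ranges.
// The context keys "minValue" and "maxValue" are illustrative names.
public class springbatchpartition implements Partitioner {
    private JdbcOperations JT;
    private String t; // table name
    private String c; // column name
    public void setTable(String tab) { this.t = tab; }
    public void setColumn(String col) { this.c = col; }
    public void setDataSource(DataSource ds) { this.JT = new JdbcTemplate(ds); }
    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        int min = JT.queryForObject("select min(" + c + ") from " + t, Integer.class);
        int max = JT.queryForObject("select max(" + c + ") from " + t, Integer.class);
        int size = (max - min) / gridSize + 1; // values per partition
        Map<String, ExecutionContext> result = new HashMap<>();
        for (int i = 0, start = min; start <= max; i++, start += size) {
            ExecutionContext ctx = new ExecutionContext();
            ctx.putInt("minValue", start);
            ctx.putInt("maxValue", Math.min(start + size - 1, max));
            result.put("partition" + i, ctx);
        }
        return result;
    }
}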
- Configure spring batch job –
Code –
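import javax.sql.DataSource;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class configure {
    @Autowired
    private JobBuilderFactory JBF; // builds the job
    @Autowired
    private StepBuilderFactory SBF; // builds the steps
    @Autowired
    private DataSource DS;

    // Partitioner over the user_id column of the stud table.
    // Assumes the partitioner class exposes setColumn, setTable and setDataSource.
    @Bean
    public springbatchpartition partitioner() {
        springbatchpartition p = new springbatchpartition();
        p.setColumn("user_id");
        p.setDataSource(DS);
        p.setTable("stud");
        return p;
    }
}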
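The configuration above stops at the partitioner bean. What a partitioned step then does with the resulting ranges can be imitated in plain Java: every range goes to its own worker thread, and each worker reads only the ids inside its range. This is a hedged sketch, not Spring Batch code. The class name WorkerSliceSketch is hypothetical, a List<Integer> stands in for the stud table, and an int[] {min, max} pair stands in for the values a worker would read from its ExecutionContext.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java imitation of a partitioned step; not Spring Batch API.
final class WorkerSliceSketch {

    // What one worker does: keep only the ids inside its own [min, max] range.
    static List<Integer> readSlice(List<Integer> table, int min, int max) {
        List<Integer> slice = new ArrayList<>();
        for (int id : table) {
            if (id >= min && id <= max) {
                slice.add(id);
            }
        }
        return slice;
    }

    // What the manager does: hand each range to a worker thread and collect the results.
    static Map<String, List<Integer>> run(List<Integer> table, Map<String, int[]> ranges) {
        // One thread per partition, mirroring gridSize = number of threads.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, ranges.size()));
        try {
            Map<String, Future<List<Integer>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, int[]> e : ranges.entrySet()) {
                int[] r = e.getValue();
                Callable<List<Integer>> worker = () -> readSlice(table, r[0], r[1]);
                pending.put(e.getKey(), pool.submit(worker));
            }
            Map<String, List<Integer>> handled = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<Integer>>> e : pending.entrySet()) {
                handled.put(e.getKey(), e.getValue().get()); // waits for that worker
            }
            return handled;
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("a worker failed", ex);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the ranges do not overlap, every id is handled by exactly one worker, which is the property a range partitioner has to guarantee.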
- Create entity and mapper class –
Code –
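import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

// Entity for one row of the stud table (requires Lombok on the classpath).
@Data
@AllArgsConstructor
@Builder
@NoArgsConstructor
public class Stud {
    private int userId; // maps the user_id column
}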
- Create the main class –
Code –
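@SpringBootApplication
@EnableBatchProcessing
public class springpartitioner implements CommandLineRunner {
    @Autowired
    private JobLauncher JL;
    @Autowired
    private Job J;

    public static void main(String[] args) {
        SpringApplication.run(springpartitioner.class, args);
    }

    // Launches the job once at startup. The "startedAt" parameter name is
    // illustrative; a changing value lets the job be launched again on each start.
    @Override
    public void run(String... args) throws Exception {
        JL.run(J, new JobParametersBuilder()
                .addLong("startedAt", System.currentTimeMillis())
                .toJobParameters());
    }
}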
- Run the application –
Conclusion
The Partitioner is the central strategy interface for producing ExecutionContext instances as input parameters for a partitioned step. Partitioning splits a data set into ranges and processes them on several threads. To make this parallel processing possible, the step of the batch job must be partitioned.