From a step, you can put data into the StepExecutionContext.
Then, with a listener, you can promote data from StepExecutionContext to JobExecutionContext.
This JobExecutionContext is available in all subsequent steps.
Be careful: the data must be small.
These contexts are saved in the JobRepository by serialization, and their length is limited (the default schema's SHORT_CONTEXT column is VARCHAR(2500), if I remember correctly).
So these contexts are good for sharing strings or simple values, but not for sharing collections or huge amounts of data.
Sharing huge amounts of data is not the philosophy of Spring Batch: Spring Batch is a set of distinct actions, not one huge business processing unit.
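For illustration, a hand-rolled version of such a promotion listener might look like the minimal sketch below ("someKey" is a placeholder; in practice you would use the ExecutionContextPromotionListener described in the next answer):

import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;
import org.springframework.batch.item.ExecutionContext;

public class ManualPromotionListener implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // nothing to do before the step
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        ExecutionContext stepContext = stepExecution.getExecutionContext();
        ExecutionContext jobContext = stepExecution.getJobExecution().getExecutionContext();
        // promote a small value from the step context to the job context
        if (stepContext.containsKey("someKey")) {
            jobContext.put("someKey", stepContext.get("someKey"));
        }
        return stepExecution.getExitStatus();
    }
}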
The job repository is used indirectly for passing data between steps. Jean-Philippe is right that the best way to do that is to put data into the StepExecutionContext and then use the verbosely named ExecutionContextPromotionListener to promote the step execution context keys to the JobExecutionContext.
It's helpful to note that there is a listener for promoting JobParameter keys to a StepExecutionContext as well (the even more verbosely named JobParameterExecutionContextCopyListener); you will find that you use these a lot if your job steps aren't completely independent of one another.
Otherwise you're left passing data between steps using even more elaborate schemes, like JMS queues or (heaven forbid) hard-coded file locations.
As to the size of the data passed in the context, I would also suggest that you keep it small (though I don't have any specifics on the exact limit).
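For completeness, wiring that copy listener might look like the sketch below (the "inputFile" parameter name is illustrative):

@Bean
public JobParameterExecutionContextCopyListener jobParameterCopyListener() {
    JobParameterExecutionContextCopyListener listener =
            new JobParameterExecutionContextCopyListener();
    // copy only the named JobParameter keys into the StepExecutionContext
    listener.setKeys(new String[] { "inputFile" });
    return listener;
}

Attach it to a step with .listener(jobParameterCopyListener()), just like the promotion listener.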
1) Use the StepExecutionContext and promote it to the JobExecutionContext; you then have access to it from each step, but you must, as noted, obey the limit on size.
2) Create a @JobScope bean and add data to that bean; @Autowire it where needed and use it (the drawback is that it is an in-memory structure, so if the job fails the data is lost, which might cause problems with restartability); see the sketch after this list.
3) We had larger datasets that needed to be processed across steps (read each line in a CSV and write to the DB, read from the DB, aggregate and send to an API), so we decided to model the data in a new table in the same DB as the Spring Batch meta tables, keep the ids in the JobExecutionContext, access them when needed, and delete that temporary table when the job finishes successfully.
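A minimal sketch of option 2, with illustrative names (the JobDataHolder class is hypothetical):

import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.core.configuration.annotation.JobScope;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SharedJobDataConfig {

    // lives only for the duration of one job execution
    @Bean
    @JobScope
    public JobDataHolder jobDataHolder() {
        return new JobDataHolder();
    }
}

// simple in-memory holder; its contents are lost if the job crashes
class JobDataHolder {
    private final List<Long> processedIds = new ArrayList<>();

    public List<Long> getProcessedIds() {
        return processedIds;
    }
}

Any reader, processor, or writer in the job can then @Autowire the JobDataHolder and read or populate it.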
I was given a task to invoke batch jobs one by one. Each job depends on another, and the result of the first job is needed to execute the subsequent job.
I was searching for how to pass data after job execution, and found that ExecutionContextPromotionListener comes in handy.
1) I added a bean for ExecutionContextPromotionListener like below ("entityRef" was the key I needed to promote in my case; the generic example further down uses "someKey"):
@Bean
public ExecutionContextPromotionListener promotionListener()
{
ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
listener.setKeys( new String[] { "entityRef" } );
return listener;
}
2) Then, in my ItemWriter, I put the data into the StepExecutionContext:
import java.util.List;

import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.annotation.BeforeStep;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemWriter;

public class YourItemWriter implements ItemWriter<Object> {
    private StepExecution stepExecution;

    @Override
    public void write(List<? extends Object> items) throws Exception {
        // Some business logic producing someObject

        // put your data into the step execution context
        ExecutionContext stepContext = this.stepExecution.getExecutionContext();
        stepContext.put("someKey", someObject);
    }

    @BeforeStep
    public void saveStepExecution(final StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }
}
3) Now you need to add the promotionListener to your step:
@Bean
public Step step1() {
    return stepBuilderFactory
            .get("step1")
            .<Company, Company>chunk(10)
            .reader(reader()).processor(processor()).writer(writer())
            .listener(promotionListener())
            .build();
}
@Bean
public ExecutionContextPromotionListener promotionListener() {
    ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
    listener.setKeys(new String[] { "someKey" });
    // strict mode: fail the step if "someKey" is not found in the step context
    listener.setStrict(true);
    return listener;
}
4) Now, in step2, get your data from the job ExecutionContext:
import java.util.List;

import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.annotation.BeforeStep;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemWriter;

public class RetrievingItemWriter implements ItemWriter<Object> {
    private Object someObject;

    @Override
    public void write(List<? extends Object> items) throws Exception {
        // ...
    }

    // grab the promoted value from the job context before the step starts
    @BeforeStep
    public void retrieveInterstepData(StepExecution stepExecution) {
        JobExecution jobExecution = stepExecution.getJobExecution();
        ExecutionContext jobContext = jobExecution.getExecutionContext();
        this.someObject = jobContext.get("someKey");
    }
}
If you are working with tasklets, then use the following to get or put the ExecutionContext:
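A minimal sketch, assuming a plain Tasklet implementation (the key name "someKey" is illustrative):

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.repeat.RepeatStatus;

public class MyTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        // reach the job-level ExecutionContext through the ChunkContext
        ExecutionContext jobContext = chunkContext.getStepContext()
                .getStepExecution()
                .getJobExecution()
                .getExecutionContext();

        jobContext.put("someKey", "someValue");   // put
        Object value = jobContext.get("someKey"); // get

        return RepeatStatus.FINISHED;
    }
}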
As Nenad Bozic said in his 3rd option, use temp tables to share data between steps. Sharing through the context ultimately does the same thing (it writes to a table and loads the data back in the next step), but if you write into temp tables you can clean them up at the end of the job.
Spring Batch creates metadata tables for itself (like batch_job_execution, batch_job_execution_context, batch_step_execution, etc).
And I have tested (using a Postgres DB) that you can store at least 51,428 chars worth of data in one column (batch_job_execution_context.serialized_content). It could be more; that is just how much I tested.
When you are using a Tasklet for your step (like class MyTasklet implements Tasklet) and override the execute method (which returns a RepeatStatus), you have immediate access to the ChunkContext there.
Or if you don't have a Tasklet and instead have a Reader/Writer/Processor, then:
@StepScope // needed so the #{jobExecutionContext[...]} expression is resolved at step time
class MyReader implements ItemReader<MyObject> {

    @Value("#{jobExecutionContext['mydatakey']}")
    private List<MyObject> myObjects;

    // And now myObjects are available in here
    @Override
    public MyObject read() throws Exception {
        // hand the items out one at a time; returning null signals the end of data
        return (myObjects == null || myObjects.isEmpty()) ? null : myObjects.remove(0);
    }
}