I'll try to give useful information, but I am far from being a data engineer.
I am currently using the Python library pandas to execute a long series of transformations on my data, which comes from many inputs (currently CSV and Excel files). The outputs are several Excel files. I would like to run scheduled, monitored batch jobs once a month, with parallel computation (i.e., less sequential than what I'm currently doing with pandas). A simplified sketch of what I mean is below.
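Roughly, my current script looks something like this (file names, column names, and the actual transformations are invented for illustration; the real pipeline has many more steps, all run one after another):

```python
import pandas as pd

# One of many CSV / Excel inputs (names are made up)
sales = pd.read_csv("sales.csv")
products = pd.read_excel("products.xlsx")

# A long sequential chain of transformations like this one
merged = sales.merge(products, on="product_id", how="left")
monthly = (
    merged
    .assign(revenue=lambda df: df["quantity"] * df["unit_price"])
    .groupby(["month", "category"], as_index=False)["revenue"]
    .sum()
)

# Several Excel files as outputs
monthly.to_excel("monthly_report.xlsx", index=False)
```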
I don't really know Beam or Airflow; I quickly read through the docs, and it seems that both can achieve this. Which one should I use?