In one project, I ended up reverting from Parquet to Avro containers because the schema was too extensive and nested (being derived from some fairly hierarchical object-oriented classes) and resulted in 1000s of Parquet columns. In turn, our row groups were really wide and shallow which meant that it took forever before we could process a small number of rows in the last column of each group.