Going from EJB2 to JPA Made Us Discover Bad Data
It is funny how moving from EJB2.x CMP data access code to JPA ORM data access code in one of my projects made us discover bad data.
Over a series of years, this project has moved through EJB1.x, EJB2.0 and EJB2.1. One consequence of using the old EJB standard is that the domain model has become very coarse grained. Basically, there are only entities, one for each table, with simple data types for all properties. Lately, we have started to map that same data model again, but this time using JPA. With JPA ORM mapping, we have greater flexibility to build a more fine grained domain model, with things like enumerated types and value objects.
Then we wrote a batch job that had to dig through many millions of rows using the new JPA mapped domain code. The job bombed out many times while digging through the data, because of data inconsistencies: when Hibernate (our JPA provider) tried to instantiate our model and populate it with data from the database, the batch job failed. Neither JPA nor Hibernate is to blame. Our old, coarse grained model, mapped with simple data types, simply did not detect these problems. In the JPA model we have things like enumerated types, and these require the column to contain only values that the enumerated type defines, or loading will fail.
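To illustrate the kind of failure, here is a minimal sketch in plain Java, without JPA or Hibernate on the classpath. The enum `OrderStatus`, the class `EnumLoadCheck` and the sample column values are all hypothetical and not taken from our project. The sketch assumes a string column that is converted to an enum by name, roughly what a field mapped with `@Enumerated(EnumType.STRING)` would need, and it uses `Enum.valueOf` to stand in for that conversion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical enum, standing in for an enumerated type in the new model.
enum OrderStatus { OPEN, CLOSED, CANCELLED }

class EnumLoadCheck {

    // Stand-in for converting a string column value to an enum by name.
    // Enum.valueOf is case sensitive and throws IllegalArgumentException
    // for a name the enum does not define (NullPointerException for null).
    static OrderStatus load(String columnValue) {
        return OrderStatus.valueOf(columnValue);
    }

    // Collects the values that would make loading fail, so they can be
    // corrected in the database before the real job runs.
    static List<String> findInvalid(List<String> columnValues) {
        List<String> invalid = new ArrayList<>();
        for (String value : columnValues) {
            try {
                load(value);
            } catch (IllegalArgumentException | NullPointerException e) {
                invalid.add(value);
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        // Hypothetical column contents: two valid values and two bad ones.
        List<String> rows = List.of("OPEN", "CLOSED", "open", "PENDING");
        System.out.println(findInvalid(rows)); // prints [open, PENDING]
    }
}
```

The old model would have read all four values as plain strings without complaint. A check along these lines, or an equivalent SQL query per enumerated column, is one way to find the bad rows up front instead of letting the batch job discover them one failure at a time.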
So now, we correct the data inconsistencies, and go on with JPA.

