Sep 27 • Posted 4 years ago
Someone recently pointed out to me, rather insightfully, that MongoDB is a good fit for archival of relational data.
I had not really considered this before, but it is a good point: flexible schemas are very helpful for archival. How do we keep an archive of, say, 10 or more years of data history when the schema will undergo significant changes over that period? It is not so easy.
One approach would be to apply every schema change made to the online / operational database to the archival database too. However, there are some issues. First, the archival database may be huge, making schema migrations impractical. More importantly, these changes may not be what we want in an archive. Imagine we decide to drop a column in the online db because it is now deprecated and unneeded. A true and complete archive would still have that data, so dropping the column in the archive is not what we want.
Document-oriented databases, with their flexible schemas, provide a nice solution. We can have older documents which vary a bit from the newer ones in the archive. The lack of homogeneity over time may mean that querying the archive is a little harder. However, keeping the data is potentially much easier.
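To make the idea concrete, here is a minimal sketch in plain Python, with dicts standing in for archived documents. The `customers` data, the field names (`fax`, `email`, `archived_year`), and the helper functions are all hypothetical and only for illustration; no database driver is used.

```python
from typing import Any, Dict, List

# Hypothetical archive of rows from a "customers" table, stored as documents.
archive: List[Dict[str, Any]] = [
    # Archived while the online table still had a "fax" column.
    {"_id": 1, "name": "Ada", "fax": "555-0100", "archived_year": 2012},
    # Archived after "fax" was dropped and "email" was added online.
    {"_id": 2, "name": "Grace", "email": "grace@example.com", "archived_year": 2016},
]


def find_with_field(docs: List[Dict[str, Any]], field: str) -> List[Dict[str, Any]]:
    """Return only the documents that actually carry the given field."""
    return [doc for doc in docs if field in doc]


def project(docs: List[Dict[str, Any]], fields: List[str], default: Any = None) -> List[Dict[str, Any]]:
    """Read the same fields from every document, filling gaps with a default."""
    return [{f: doc.get(f, default) for f in fields} for doc in docs]


# The dropped column survives in the older document, untouched by the migration.
old_style = find_with_field(archive, "fax")

# Querying across schema versions means tolerating missing fields.
contacts = project(archive, ["name", "email"])
```

Both documents sit side by side without any migration of the older one, which is the "keeping the data is easier" half of the trade-off. The `project` helper shows the "querying is a little harder" half: the reader of the archive has to decide what a missing field means. In MongoDB itself, the `$exists` query operator plays roughly the role of `find_with_field` here.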