Using Apache Cassandra as a highly available big data platform is, in my view, a good choice, as a Cassandra cluster is easy to handle. We use Apache Cassandra as an archive platform for our human-centric workflow engine Imixs-Workflow. But how does Cassandra perform with large files?
If you are starting out with Cassandra, you have to change the way you use databases, especially if you come from an SQL background. Although Cassandra handles very large amounts of data easily, you have to take the partition size into account. In short, the data within a partition (defined by the partition key) should not exceed 100 MB. If you plan to store large files (e.g. media files), you need to split your data into smaller chunks. In the following I will briefly explain how this can be done.
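To make the chunking idea concrete, here is a minimal sketch in plain Python, without any Cassandra driver. It only shows how a file could be split into rows that each fit into their own small partition and how they could be put back together. The table layout in the comment, the 2 MB chunk size, and the helper names `split_into_chunks`, `build_rows` and `reassemble` are assumptions for illustration, not the schema used by Imixs-Workflow.

```python
import hashlib
from typing import Iterator, List, Tuple

# Assumed chunk size (2 MB); chosen for illustration, well below the 100 MB limit.
CHUNK_SIZE = 2 * 1024 * 1024

# Hypothetical table layout: the composite partition key (file_id, chunk_no)
# places every chunk in its own partition.
#
#   CREATE TABLE media_chunks (
#       file_id  text,
#       chunk_no int,
#       data     blob,
#       PRIMARY KEY ((file_id, chunk_no))
#   );

Row = Tuple[str, int, bytes]


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk_no, chunk_bytes) pairs of at most chunk_size bytes each."""
    for chunk_no, offset in enumerate(range(0, len(data), chunk_size)):
        yield chunk_no, data[offset:offset + chunk_size]


def build_rows(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[Row]:
    """Build one (file_id, chunk_no, data) row per chunk.

    The SHA-256 digest of the content serves as file_id here (an assumption).
    """
    file_id = hashlib.sha256(data).hexdigest()
    return [(file_id, no, chunk) for no, chunk in split_into_chunks(data, chunk_size)]


def reassemble(rows: List[Row]) -> bytes:
    """Concatenate the chunks in chunk_no order to restore the original bytes."""
    return b"".join(chunk for _, _, chunk in sorted(rows, key=lambda row: row[1]))
```

With this layout a reader has to know how many chunks belong to a file, so in practice the chunk count (or the list of chunk keys) would be kept in a separate metadata row.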