The last 15 years have seen successive waves of big data platforms and companies offering new database and analytics capabilities. Leveraging the public cloud, these large-scale distributed data systems work well in centralized, single-region or single-data-center topologies, and they are geared toward “read-intensive” data problems.
Companies such as Cloudera, Snowflake, and more recently Databricks have succeeded by innovating with novel “read architectures” that reduce the structural cost of reading immutable data, and by exploiting those lower read costs to build new big-data-oriented analytics applications and capabilities.
We now face new challenges from workloads and use cases that are “write-intensive”. The write path is far more difficult than the read path: the read path is built on immutable, unchanging data, whereas the write path necessarily involves mutable data whose rate of change, and the growth of the underlying data sets, vary widely across use cases.
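The asymmetry above can be illustrated with a minimal sketch (all names here are hypothetical, not from any particular platform): reads over an immutable snapshot need no coordination at all, while writes over mutable shared state force every participant, even readers, to coordinate.

```python
import threading

# Read path: an immutable snapshot can be shared freely across readers.
# No locks are needed, results are cacheable, and a reader can never
# observe partially applied state.
IMMUTABLE_SNAPSHOT = (("user_1", 10), ("user_2", 25))

def read_total(snapshot):
    # Safe from any thread: the data cannot change underneath us.
    return sum(v for _, v in snapshot)

# Write path: mutable shared state forces coordination. Every writer must
# take a lock (or rely on versioning/MVCC) so concurrent updates don't race.
class MutableCounter:
    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def add(self, key, amount):
        with self._lock:  # coordination cost the read path avoids
            self._values[key] = self._values.get(key, 0) + amount

    def total(self):
        with self._lock:  # even reads must coordinate once data is mutable
            return sum(self._values.values())

counter = MutableCounter()
writers = [threading.Thread(target=counter.add, args=("user_1", 1))
           for _ in range(100)]
for t in writers:
    t.start()
for t in writers:
    t.join()

print(read_total(IMMUTABLE_SNAPSHOT))  # 35, with zero coordination
print(counter.total())                 # 100, only correct because of the lock
```

The point of the sketch is the structural difference, not the specific data structure: an immutable read path lets a system fan out, cache, and replicate cheaply, while a mutable write path carries locking, versioning, and consistency costs that grow with the rate of change.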