Data Lake – 3 Principles

That being said, there are generally three principles to follow when building and using a data lake:

  1. Store all data in its original format, and maintain a curated layer in an open-source format.
  2. Have a foundational compute layer that supports all of the core lake-house use cases: ETL (with or without stream processing), data science with machine learning, and SQL analytics on the data lake.
  3. Be able to accept new or additional use cases, by way of integration, that are not part of the core lake-house use cases.

On the last point, the curated data lake, the foundational compute layer, and the other services and tools are the key requirements for supporting easy integration.
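
As a rough illustration of the first principle, here is a minimal sketch of a raw zone that keeps files in their original format next to a curated zone in an open format. Everything in it is an assumption made for the example: the `raw` and `curated` folder names, the `orders.csv` file and its columns, and the helper functions are hypothetical. JSON Lines is used only so the sketch runs with the Python standard library; a real curated layer would more likely use a columnar open format such as Parquet.

```python
import csv
import json
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List


def ingest_raw(source: Path, raw_zone: Path) -> Path:
    """Copy the source file into the raw zone unchanged (original format kept)."""
    raw_zone.mkdir(parents=True, exist_ok=True)
    target = raw_zone / source.name
    shutil.copyfile(source, target)
    return target


def curate(raw_file: Path, curated_zone: Path) -> Path:
    """Read a raw CSV file and write typed records as JSON Lines.

    JSON Lines is a stand-in for the open curated format (assumption).
    """
    curated_zone.mkdir(parents=True, exist_ok=True)
    target = curated_zone / (raw_file.stem + ".jsonl")
    with raw_file.open(newline="") as src, target.open("w") as dst:
        for row in csv.DictReader(src):
            # Hypothetical schema: cast the string fields to proper types.
            record = {
                "order_id": int(row["order_id"]),
                "amount": float(row["amount"]),
            }
            dst.write(json.dumps(record) + "\n")
    return target


def read_curated(curated_file: Path) -> List[Dict]:
    """Load curated records; any tool that reads JSON Lines could do the same."""
    with curated_file.open() as f:
        return [json.loads(line) for line in f if line.strip()]


def demo() -> List[Dict]:
    """Run the raw -> curated flow in a temporary folder and return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        source = lake / "orders.csv"  # hypothetical incoming file
        source.write_text("order_id,amount\n1,9.50\n2,20.00\n")
        raw_file = ingest_raw(source, lake / "raw")
        curated_file = curate(raw_file, lake / "curated")
        return read_curated(curated_file)


if __name__ == "__main__":
    print(demo())
```

The raw copy is never modified, so the curated layer can always be rebuilt from it, and because the curated files are in an open format, a new tool only needs to read that format in order to be integrated.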

August 2, 2021