In our daily lives, we often use words that are so ingrained in our vocabulary that we rarely stop to consider their true meaning.
These words, simple as they may seem, carry profound significance and can be the subject of extensive study and interpretation. They form the foundation of our communication and understanding, and yet, we might struggle to provide a comprehensive definition if asked.
In this exploration, we will delve into seven such words. Each of these terms is a universe unto itself, rich in nuance and depth. They are words that we may take for granted, but each could easily fill the pages of numerous books…
- Data warehouse: The data warehouse is a central data repository that stores data in a predefined model for business intelligence. Analysts and managers use the data warehouse to gain a business view of the data that supports their decision-making.
- Data lake: The data lake is a repository that stores structured, semi-structured and unstructured data in its native format. Data lakes originated as on-premises repositories running on Apache Hadoop, then evolved to run in the cloud as object stores.
- Data lakehouse: The data lakehouse combines elements of a data lake and a data warehouse in a hybrid repository. It applies SQL queries to cloud object stores to support business intelligence, data science, and self-service analytics.
- Data vault: The data vault is a data modeling approach, architecture, and methodology that builds on elements of Ralph Kimball’s star schema model and Bill Inmon’s third normal form framework. Dan Linstedt and his team at Lockheed Martin created the data vault as a hybrid approach that stores all data, tracks history, and accommodates changing schemas and data containers.
- Data mesh: The data mesh is a distributed data architecture in which business units own, manage, and publish data as a product for others to consume. Analysts and other data consumers use a self-service platform in a federated governance model.
- Data fabric: The data fabric unifies data integration, preparation, cataloging, security, and discovery into a cohesive and automated process. It uses metadata, machine learning, and automation to combine data across formats and locations.
- Data vault 2.0: The data vault 2.0 solution incorporates people, process, and technology. It includes prescriptive methodologies and reference architectures for technologies such as the data warehouse, data lake, data lakehouse, virtualization, data fabric, and data mesh. The 2.0 methodology is founded on the Software Engineering Institute’s (SEI) Capability Maturity Model and draws on Six Sigma, total quality management, disciplined agile delivery, and lean.
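
To make the first two definitions more concrete, here is a minimal Python sketch of the difference between a warehouse's predefined model and a lake's native-format storage. Everything in it is an assumption made for illustration: the sample records, the `orders` table, and the function names are hypothetical, an in-memory SQLite database stands in for a real warehouse, and a plain Python list stands in for an object store.

```python
import json
import sqlite3

# Hypothetical sample records; the field names are illustrative only.
RAW_EVENTS = [
    '{"order_id": 1, "amount": 19.99, "region": "EU"}',
    '{"order_id": 2, "amount": 5.00, "region": "US", "coupon": "SPRING"}',
    '{"order_id": 3, "note": "free-text entry with no amount"}',
]


def load_warehouse(raw_events):
    """Warehouse-style load: each row must fit a predefined table model."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, "
        "amount REAL NOT NULL, "
        "region TEXT NOT NULL)"
    )
    rejected = []
    for line in raw_events:
        record = json.loads(line)
        try:
            # Only the modeled columns are kept; extra fields are dropped.
            conn.execute(
                "INSERT INTO orders (order_id, amount, region) VALUES (?, ?, ?)",
                (record.get("order_id"), record.get("amount"), record.get("region")),
            )
        except sqlite3.IntegrityError:
            # The record violates the model (a NOT NULL column is missing).
            rejected.append(record)
    return conn, rejected


def load_lake(raw_events):
    """Lake-style load: keep every record unchanged, in its native format."""
    return list(raw_events)


def total_amount_from_lake(lake):
    """Apply structure at read time, skipping records without an amount."""
    total = 0.0
    for line in lake:
        record = json.loads(line)
        if "amount" in record:
            total += record["amount"]
    return total


conn, rejected = load_warehouse(RAW_EVENTS)
warehouse_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
lake = load_lake(RAW_EVENTS)
lake_total = total_amount_from_lake(lake)
```

In this sketch the warehouse rejects the third record and silently drops the `coupon` field because neither fits the model, while the lake keeps all three records untouched and leaves interpretation to whoever reads them. A lakehouse, loosely speaking, aims to offer the SQL-style query on the left over the storage on the right.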