By Thibaut De Vylder

We have seen the origins and limitations of the approaches that gave rise to data-vaulting.

Data-vaulting is three things in one: a technique, a methodology, and an approach. It is more than just a tool: it represents a trend in the modelling of an organisation’s data, whether that organisation is public or private.

Invented and continually updated by Dan Linstedt, data-vaulting is now expanding in scope. It was initially designed to interconnect large, independent data warehouses. However, having used it for more than 10 years on smaller assignments that combine a variety of sources, dFakto has demonstrated that its potential uses go far beyond these large-scale applications.

different data models and architectures

Let’s take, for each data-warehouse model (enterprise data warehouse, dimensional design, and data-vaulting), a representation in which the data storage and/or processing spaces are represented by black bars, in order to build a data warehouse that is fed by three sources and that provides information to three groups of users:

 

Enterprise Data Warehouse

  • Two linked processes: centralisation and restitution
  • Long start-up time
  • Requirement to use the same team for both processes
  • Training and expertise available on the market
  • Scalability
  • The model respects the third normal form, which is used in IT

Dimensional Design

  • A series of “extract, transform and load” (ETL) processes that are based on the same data and are interdependent (for instance, the same enhancements in two ETL processes). What happens when an enhancement has to be updated?
  • Short start-up time
  • Heavy maintenance
  • Complexity of ETL processes
  • Limited resources
  • No prior knowledge of concepts required: anyone can programme a basic ETL
  • Scalability
  • There is no model and therefore no constraints

Data Vaulting

  • Three families of independent processes: data-vault storage, enhancement in the business vault, and restitution through data marts
  • Long-term consistency
  • Short start-up time
  • Light maintenance
  • Three independent processes: teams can work in parallel (without even knowing each other)
  • An understanding of data-vaulting concepts is required
  • Designed to be scalable
  • A model that tends to add tables in which data is stored and enriched, thus giving the appearance of complexity
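To make the three families of independent processes more concrete, here is a minimal Python sketch. It is an illustration only, not dFakto's implementation nor a reference design: the names `hub_customer`, `sat_customer`, `hash_key` and the "VIP" rule are hypothetical, and the hub/satellite vocabulary comes from general data-vault practice rather than from this article. The point it tries to show is that an enhancement is written once, in the business vault, so that updating it does not mean touching several ETL processes.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(business_key: str) -> str:
    """Derive a deterministic surrogate key from a business key (hypothetical helper)."""
    return hashlib.sha256(business_key.strip().upper().encode("utf-8")).hexdigest()


# Process 1: data-vault storage.
# Source rows are stored as they arrive, with their origin and load time,
# and without any business logic applied.
def load_raw_vault(vault: dict, source: str, rows: list) -> None:
    loaded_at = datetime.now(timezone.utc)
    for row in rows:
        key = hash_key(row["customer_id"])
        vault["hub_customer"].setdefault(key, row["customer_id"])
        vault["sat_customer"].append(
            {"key": key, "source": source, "loaded_at": loaded_at, "revenue": row["revenue"]}
        )


# Process 2: enhancement in the business vault.
# The (hypothetical) VIP rule is defined here and nowhere else.
def build_business_vault(vault: dict, vip_threshold: float = 1000.0) -> dict:
    totals = {}
    for record in vault["sat_customer"]:
        totals[record["key"]] = totals.get(record["key"], 0.0) + record["revenue"]
    return {
        key: {"total_revenue": total, "is_vip": total >= vip_threshold}
        for key, total in totals.items()
    }


# Process 3: restitution through data marts.
# Marts only read from the vaults; they never recompute the enhancement.
def sales_mart(vault: dict, business_vault: dict) -> list:
    rows = [
        {"customer_id": vault["hub_customer"][key], "total_revenue": facts["total_revenue"]}
        for key, facts in business_vault.items()
    ]
    return sorted(rows, key=lambda r: r["customer_id"])


def marketing_mart(vault: dict, business_vault: dict) -> list:
    return sorted(
        vault["hub_customer"][key] for key, facts in business_vault.items() if facts["is_vip"]
    )


# Two sources feed the same vault; two user groups read from it.
vault = {"hub_customer": {}, "sat_customer": []}
load_raw_vault(vault, "crm", [{"customer_id": "C1", "revenue": 700.0},
                              {"customer_id": "C2", "revenue": 200.0}])
load_raw_vault(vault, "billing", [{"customer_id": "C1", "revenue": 400.0}])

business_vault = build_business_vault(vault)
print(sales_mart(vault, business_vault))
print(marketing_mart(vault, business_vault))
```

In this sketch, changing the VIP threshold only affects `build_business_vault`; the loading step and both marts stay untouched, which is the kind of independence the comparison attributes to data vaulting.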

strict adherence to simple rules

It is sometimes tempting to go off on a tangent and break the ground rules. dFakto’s 10 years of experience in implementing data-vaulting have taught us that, in every case where we did not follow the rules, we paid the price afterwards and had to realign the installation with the basics.

data-faulting

In the same vein, some of the customers and prospects we’ve met over the last 10 years have shown us one implementation after another of a data vault that was in fact anything but. Rather than looking at the theory, I recommend, especially for early implementations, working with consultants (1) who speak your business language, (2) who have already implemented data-vaulting and can show you a real data-vaulting solution that is actually being used by customers who can attest to their experience, and (3) with whom you feel comfortable.
