The data language for business

#AgileDataWow

Jul 23, 2024

I saw a post the other day where somebody said One Big Table (OBT) was a form of Dimensional Modeling, and it triggered me. So I posted in the Practical Data Modeling community on Substack, both to rant and to check I hadn’t missed a reclassification of what a Dimensional Model actually is (hint: it has Facts and Dims).

One of the responses to my post got me thinking about the difference between the terms non data people typically understand vs the terms we use as data professionals.

Define one thing as only one thing

When I work in an organisation I become incredibly pedantic about the semantic terms I use. Is Active Marketing Customer calculated differently to Active Risk Customer?

Is there an agreed definition of the term Customer on its own?

As part of building AgileData I have been constantly testing data terms with non data people and seeing what they get straight away and what they don’t.

And when they don’t get it straight away I try something else until they do.

I say things like “we need to quickly Design the Data so we can make sure it will be fit for purpose”.

To me this is a form of Conceptual Data Modeling, but that is only a term I use with data professionals (and more and more only with a subset of them).

If I start talking about Data Modeling I can see the non data professionals’ eyes immediately start to glaze over.

I use Data Design over Data Modeling.

Which made me think about the other semantic terms I switch between.

Information Product vs Data Asset

An Information Product is the thing the stakeholder consumes. Data Assets are the things we use to build it: tables, code, etc.

Concept vs Entity/Dim/Hub

A Core Business Concept is something in the organisation we want to manage and typically count. We have a concept of a Customer, a Supplier, an Employee, a Product, an Order, a Payment.

These become entities, dimensions, hubs depending on the physical data modeling pattern we are using.

Details vs attributes/fields/columns

Details describe the Concepts. I have a Customer Name, a Product SKU, an Order Date, a Payment Amount.

Again these can be physically modeled / stored in different ways depending on your favourite physical data modeling pattern.

Events vs Relationships/Facts/Links

This one I am not 💯 on, but it’s the best I have found (for now) and it does work to a degree.

But I tend to use some other words to describe it, which proves it’s not perfect.

I can see an Event happen in the data that represents a Core Business Process. I can see a bunch of Concepts happen together that proves the Core Business Process happened. A Customer Orders a Product, a Customer Pays for the Order.

Again I can physically model this as a data event, a set of relations, a fact, a link.

But see how I have reused the term Event to describe both the Data Design version and the Data Modeling version? This is how I know I haven’t got it 💯. Ideally a term should only be a single thing.

Not perfect, but still valuable.
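For the data professionals reading, here is a minimal sketch of how the Concept, Detail and Event terms above could hang together. It is only an illustration, not how AgileData stores anything: the class names, field names, the `concepts_in` helper and the sample values are all made up for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Customer:
    """A Concept: something the organisation manages and counts."""
    customer_name: str  # a Detail describing the Concept


@dataclass(frozen=True)
class Product:
    """Another Concept."""
    product_sku: str  # a Detail


@dataclass(frozen=True)
class CustomerOrdersProduct:
    """An Event: Concepts seen together, showing the process happened."""
    customer: Customer
    product: Product
    order_date: date  # a Detail of the Event


def concepts_in(event) -> list[str]:
    """Return the names of the Concepts taking part in an Event (hypothetical helper)."""
    return [
        type(value).__name__
        for value in vars(event).values()
        if isinstance(value, (Customer, Product))
    ]


# Hypothetical sample data, purely for illustration.
event = CustomerOrdersProduct(
    customer=Customer(customer_name="Example Customer"),
    product=Product(product_sku="SKU-001"),
    order_date=date(2024, 7, 23),
)
print(concepts_in(event))
```

Whether `CustomerOrdersProduct` ends up as a fact, a link or a set of relations is a physical modeling choice. The conversation with stakeholders only needs the three words.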

I have a bunch of other terms that I find valuable to use when talking to non data people.

But that is a post for another day.

Márton Horváth

Jul 25

Imagine us in Europe where we also have to switch to the local languages. It's a mess! :D


Johnny Winter

Jul 23

We currently refer to our conceptual data modelling as "data product design". When I run those conceptual modelling sessions, I very deliberately avoid the terms fact and dimension (even though our conceptual models are ultimately high level star schemas). I often use the word entity as a substitute for dimension, but I think I prefer your terms of concept and detail. Event as fact makes sense to me.


Agile Data N’ Info
