Things in the "Context Plane"
How I think about the things that need to be in the "Context Plane" to power "AI Agents"
“Context” of this post
I often find writing helps me coalesce and refine my thoughts when new patterns start to emerge, but aren’t very clear yet.
So this article is a brain dump / train of thought continuation of the architecture needed to have one Context Plane to rule them all, as part of a proposed “AI Data Stack”.
This article provides an overview of the things or metadata I think should be included in the Context Plane as part of a new “AI Data Stack”.
The “Context Plane”
My current thinking is the architecture diagram for the “Context Plane” should look something like this:
Things stored in the Context Plane
So with that context for how I currently think of the “Context Plane”, let’s get into the meat of this article: what I think should be stored in that plane.
I have worked in the data domain for over three decades. The data patterns and the data terms I regularly use I have learnt over those decades. I find them very difficult to unlearn.
As somebody mentioned to me lately I am haunted by “the Ghost of Data Past”.
But also there is a recurring pattern where people seem to invent a new term for a pattern that has been around for decades (medallion data layered architecture anyone), so the patterns of the past do have value.
(and yes I realise the dichotomy of me talking about “Context” not “Semantics” in the same article where I point the finger at the “Medallion Architecture”)
So with all that said, let me use the data components of the past to frame what we need to hold and surface via a “Context Plane” to support what an “AI Data Stack” future might look like.
I originally thought of the boxes in the “Context Plane” as components, things with clear boundaries.
But as I started to try and define the descriptions for each of these components with clarity, and how they could be used by the AI Agent, I realised that I don’t have that clarity, yet.
I can’t clearly articulate the boundary of some of the components as they clearly overlap, or hold things that are subsets of another component.
So I iterated the focus of this article to be describing the things, and try and create an abstracted version of those things that do not overlap.
This is deffo a train of thought article, the journey is as important as the final destination.
Business Glossary
The Business Glossary stores a list of agreed-upon definitions for organisational terms.
This holds a list of plain-language descriptions and aliases that standardise terminology across stakeholders, ensuring consistent interpretation and usage of key terms such as ‘Customer’, ‘Order’, or ‘Churn’.
Core Business Concepts
Core Business Concepts stores the primary things (‘entities’) an organisation manages or counts. This includes concepts such as ‘Customer’, ‘Product’, ‘Order’, and ‘Payment’.
This holds a list of the definitions and identifiers for the Core Business Concepts.
It contains a subset of the terms that exist in the Business Glossary and forms the foundation of the organisation’s data design.
This describes a list of “things”.
Core Business Processes
Core Business Processes stores the major operational workflows, events or life cycles (‘relationships’) within an organisation.
This holds a list of the interactions between Core Business Concepts, representing relationships such as “Customer Places Order” and “Customer Pays for Order”
This also holds the sequence of activities or states that Core Business Concepts transition through, such as invoice approval or support resolution (which I call “Administration Processes”).
This describes ‘relationships’ between “things”.
With the current definition it’s not going to hold hierarchy relationships such as Product Category > Product, as that relationship is not part of a relationship between Core Business Concepts, nor is it a state change within a Core Business Concept.
Something to be resolved.
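A minimal sketch of how “things” and the ‘relationships’ between them could be held side by side. The class names, the `identifier` field and the `participants` tuple are assumptions made for illustration, not an agreed structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A Core Business Concept: a 'thing' the organisation manages or counts."""
    name: str
    identifier: str  # hypothetical business key, e.g. "customer_id"


@dataclass(frozen=True)
class Process:
    """A Core Business Process: a 'relationship' between Concepts."""
    name: str
    participants: tuple  # names of the Concepts involved


concepts = [
    Concept("Customer", "customer_id"),
    Concept("Product", "product_id"),
    Concept("Order", "order_id"),
]

processes = [
    Process("Customer Places Order", ("Customer", "Order")),
    Process("Customer Pays for Order", ("Customer", "Order")),
]


def processes_for(concept_name):
    """Return the names of the processes a given Concept takes part in."""
    return [p.name for p in processes if concept_name in p.participants]


print(processes_for("Order"))
print(processes_for("Product"))  # empty: no process here involves Product
```

With this shape a hierarchy such as Product Category > Product has nowhere to live, which is the same gap noted above.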
Conceptual Data Model
The Conceptual Data Model stores a high-level map or diagram that provides an abstraction of the organisation’s data design.
This holds information that outlines the Core Business Concepts, the Core Business Processes and the other relationships between Core Business Concepts, in a platform-independent format and without specifying implementation details.
The Conceptual Data Model can also hold the objects that generate the Diagram, so overlaps with the Core Business Concepts and Core Business Processes.
This describes “things” and the ‘relationships’ between “things”
Logical Data Model
The Logical Data Model stores a more detailed map or diagram that provides Details (‘attributes’) that are part of the organisation’s data design.
This holds information that outlines the Core Business Concepts and the Core Business Processes / Relationships in a platform-independent format.
It also holds the Details (‘attributes’) that are related to a Core Business Concept, in a platform-independent format.
This describes the Details of “things”.
Physical Data Model
The Physical Data Model stores a technical implementation view of how the organisation’s data is stored and accessed within a specific system.
This holds information on the physical structures such as tables, columns, data types, indexes, partitions, and naming conventions used in the data platform.
It is derived from the Logical Data Model but adds system-specific optimisations and constraints.
It overlaps with the Logical Data Model by representing the same Core Business Concepts and their Details but also includes physical performance and deployment details that are specific to the technology used.
This describes how to implement the “things”, their ‘relationships’, and their Details.
Data Contracts
Data Contracts store the expected structure, quality, and behaviour of data exchanged between systems or components. They describe formal agreements between a data producer and a data consumer.
This holds information such as schema definitions, required fields, data types, validation rules, and guarantees such as update frequency, timeliness, or completeness.
It overlaps with the Logical Data Model and Physical Data Model by referencing the same attributes and structures, but focuses on expectations and enforcement rather than storage or modelling.
This describes the Rules that govern the movement of the data.
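As a rough illustration of “expectations and enforcement”, here is a small sketch of a contract and a check against it. The feed, the field names and the rule keys (`type`, `required`, `min`) are all hypothetical; real data contract tools have their own formats.

```python
# Hypothetical contract for an "orders" feed; field names are illustrative only.
ORDER_CONTRACT = {
    "order_id": {"type": str, "required": True},
    "customer_id": {"type": str, "required": True},
    "amount": {"type": float, "required": True, "min": 0.0},
    "coupon_code": {"type": str, "required": False},
}


def validate(record, contract):
    """Return a list of human-readable contract violations for one record."""
    errors = []
    for field, rule in contract.items():
        value = record.get(field)
        if value is None:
            if rule["required"]:
                errors.append(f"{field}: missing required field")
            continue
        if not isinstance(value, rule["type"]):
            errors.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{field}: below minimum {rule['min']}")
    return errors


good = {"order_id": "o-1", "customer_id": "c-1", "amount": 25.0}
bad = {"order_id": "o-2", "amount": -5.0}

print(validate(good, ORDER_CONTRACT))  # no violations
print(validate(bad, ORDER_CONTRACT))   # missing customer_id, negative amount
```

The contract references the same fields a data model would, but what it stores is the rule, not the structure.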
Data Dictionary
The Data Dictionary lists and describes the metadata for data fields used across the organisation’s data assets.
This holds a list and information for data fields, such as field names, data types, descriptions, allowable values, formats and default values.
It overlaps with the Business Glossary and the Logical and Physical Data Models by documenting the same ‘attributes’.
This describes how “things” and their Details were implemented.
Data Profiles
Data Profiles provide statistical summaries and characteristics of the data physically stored.
This holds data such as minimum and maximum values, distribution ranges, cardinality, null counts, uniqueness, and data type conformity.
They overlap with the Data Dictionary (by describing the same fields) and with Facts (by profiling the raw values captured).
This describes the observed characteristics of the Details of “things” and ‘relationships’ between “things”.
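A minimal sketch of what a Data Profile for a single field might hold, assuming the values are already loaded into a Python list. The `profile` function and its output keys are illustrative names, not taken from any profiling tool.

```python
def profile(values):
    """Summarise one field: null count, cardinality, uniqueness, min and max."""
    present = [v for v in values if v is not None]
    distinct = set(present)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(present),
        "cardinality": len(distinct),
        "is_unique": len(distinct) == len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Hypothetical order amounts, including one missing value.
order_amounts = [20.0, 35.5, None, 20.0, 99.0]
print(profile(order_amounts))
```

The profile describes what was observed in the data, which is why it can disagree with what the Data Dictionary or a Data Contract says should be there.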
Transformation Logic
Transformation Logic stores how raw or source data is converted into a trusted, usable form.
This holds the business rules, mappings, calculations, filters, joins and aggregation logic or code applied to reshape data.
It may reference fields defined in the Logical or Physical Data Models and rely on terms from the Business Glossary and Core Business Concepts to apply meaning.
It overlaps with Measures, Metrics, Actions, and Information Products by forming part of the logic used to answer questions or trigger downstream outcomes.
This describes how the raw “things”, Details of “things” and the ‘relationships’ between “things” were reshaped into something more useful.
Facts
Facts are raw numerical values stored or sourced directly from systems.
This holds a list of data such as quantities, amounts, durations.
Facts are the foundational inputs used in downstream calculations. They are not aggregated or transformed, but instead represent the atomic, immutable data from systems where that data is first captured.
They overlap with the Business Glossary, Physical Data Model and Data Dictionary where they are described and with Measures and Metrics, which are derived from them.
This describes the raw numerical values about “things”.
Yes I know this is a different definition from the typical definition of a “Fact” from the Dimensional Modeling pattern, but I am trying to abstract these things from specific data modeling or technology patterns.
Measures
Measures are standardised aggregations of raw Facts.
This holds the logic or code for calculations such as “total revenue”, “count of orders”, or “average order amount”, typically derived using functions like sum, count, min, max, or average based on a defined Fact.
They overlap with Facts (as their source), Transformation Logic (which defines how they are calculated), and Metrics (which may use them as components).
This describes aggregated values about the Details of “things”.
Metrics
Metrics are calculated formulas that combine Measures, Facts, or other Metrics to express business performance or operational efficiency.
This holds the logic or code for calculations such as “average revenue per customer”, “conversion rate”, or “churn percentage”, typically involving arithmetic, ratios, or conditional logic.
They overlap with Measures (which they reference), Transformation Logic (which defines their formulas), and Business Questions (which they help answer).
This describes derived insights from aggregations of the Details of “things”.
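The layering of Facts, Measures and Metrics can be sketched in a few lines. The order records and the `count_of_customers` helper are assumptions added for illustration; the other names come from the examples above.

```python
# Facts: raw, untransformed values (hypothetical order records).
facts = [
    {"customer": "c-1", "amount": 40.0},
    {"customer": "c-1", "amount": 60.0},
    {"customer": "c-2", "amount": 100.0},
]


# Measures: standardised aggregations of a Fact.
def total_revenue(rows):
    return sum(r["amount"] for r in rows)


def count_of_orders(rows):
    return len(rows)


def count_of_customers(rows):
    # Assumed helper: distinct customers seen in the Facts.
    return len({r["customer"] for r in rows})


# Metric: a formula that combines Measures.
def average_revenue_per_customer(rows):
    return total_revenue(rows) / count_of_customers(rows)


print(total_revenue(facts))                 # 200.0
print(count_of_orders(facts))               # 3
print(average_revenue_per_customer(facts))  # 100.0
```

Each layer only references the one below it, which is the overlap described above: the Metric cannot be defined without naming its Measures, and the Measures cannot be defined without naming a Fact.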
Business Questions
Business Questions are predefined or commonly asked queries that reflect specific decision-making needs within the organisation.
This holds a list of the questions that have been asked before such as “What is our monthly churn rate?”, “Which products are underperforming?”, or “How many new customers joined last quarter?”
They overlap with Metrics (which may be the answer), Transformation Logic (which defines how to calculate the answer), and Information Apps (where answers are delivered).
This describes the repeatable questions asked about the “things”, Details of “things” and ‘relationships’ between “things” to support decisions to take action.
Actions
Actions define the decisions, tasks, or system behaviours triggered as a result of answering a Business Question.
This holds a list of actions that have been taken in the past or need to be taken in the future.
It also holds information on the mappings between specific Business Questions and the Actions they enable, such as “Identify outstanding support tickets” and “Resolve outstanding support tickets,” or “Identify Customers who haven’t placed an order in the last 6 months” and “Offer discount to Customers to reduce the rate of customer churn.”
They overlap with Business Questions (which initiate them), Information Apps (where they may be launched), and Core Business Processes (which they may automate or influence).
This describes the decisions and actions we take based on what we know about the “things”, Details of “things” and ‘relationships’ between “things”
Information Apps
Information Apps are curated outputs that present data and information in a usable form to answer Business Questions and support Actions.
This holds lists and information about dashboards, reports, visualisations, datasets, APIs, data services, and user interfaces that package answers to Business Questions and trigger related Actions.
They overlap with Business Questions (which they answer), Actions (which they enable), and Facts, Metrics and Measures (which they visualise or expose).
This describes the delivery mechanism that provides access to what we know about the “things”, Details of “things” and ‘relationships’ between “things”
Fragmentation and Overlaps
I realised as I wrote this that one of the major problems is deffo “the Ghost of Data Past”, and specifically the fragmented set of data technologies and tools that I have used over the years. This set of data technologies and tools helps define the architecture and language I use, and also influences the patterns I applied to list the things that should be stored in the “Context Plane”.
I have ended up using these technologies and tools to define the things that are needed in the “Context Plane” and also to define boxes or boundary around things.
The problem I hit is that when I use these standard languages / boundaries from the data domain for those things I get overlaps: there are things that sit within or across multiple boundaries.
For example Core Business Concepts are held/described within both a Business Glossary and a Conceptual Model.
An example using a Metric
Metrics are a part of a Business Question, but are also defined in a Metrics Layer / Tool, a Business Glossary and potentially a Data Dictionary.
Let’s look at an example of the overlaps in more detail using a specific Metric, ‘Active Users’.
Metrics:
'Active users' is the number of people who engaged with our site or app in the specified date range.
An active user is any user who has an engaged session or when Analytics collects:
* the first_visit event or engagement_time_msec parameter from a website
* the first_open event or engagement_time_msec parameter from an Android app
* the first_open or user_engagement event from an iOS app
The user is considered an active user as soon as the user_engagement event is detected within a second.
(Source: https://support.google.com/analytics/answer/12253918?hl=en)
Business Questions:
"The Number of Active Users who accessed our Website last month."
"The Number of Active Users who accessed our App last month."
The Metrics Layer often holds the relationship the Metric has with the Core Business Processes, the Business Glossary often does not.
Yaml Metric Definition in a Metrics Layer tool:
views:
  - name: active_users
    description: "14 days rolling count of active users"
    includes:
      # Measure
      - users.rolling_count
      # Dimensions
      - users.is_paying
      - users.signup_date
      - company.name
(Source: https://cube.dev/blog/introducing-views)
Business Glossary:
Active user
Active users are the people who currently use your product. A user who becomes inactive may have churned.
(Source: https://posthog.com/docs/glossary)
As you can see there are overlaps in what we define for the Metric of “Active Users”, depending on which traditional data component we define it in.
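One way to make the overlap visible is to put the three definitions quoted above next to each other and ask which attributes are defined in more than one place. This is a sketch only: the component keys (`analytics_docs`, `metrics_layer`, `business_glossary`) and the attribute names are assumptions, and the descriptions are abridged from the sources quoted above.

```python
from collections import defaultdict

# The same Metric as described by three different components.
definitions = {
    "analytics_docs": {
        "name": "Active users",
        "description": "People who engaged with our site or app in the specified date range",
        "events": ["first_visit", "first_open", "user_engagement"],
    },
    "metrics_layer": {
        "name": "active_users",
        "description": "14 days rolling count of active users",
        "measure": "users.rolling_count",
        "dimensions": ["users.is_paying", "users.signup_date", "company.name"],
    },
    "business_glossary": {
        "name": "Active user",
        "description": "Active users are the people who currently use your product.",
    },
}


def overlapping_attributes(components):
    """Map each attribute to the components that define it, keeping only overlaps."""
    holders = defaultdict(list)
    for component, attributes in components.items():
        for attribute in attributes:
            holders[attribute].append(component)
    return {a: c for a, c in holders.items() if len(c) > 1}


for attribute, components in overlapping_attributes(definitions).items():
    print(attribute, "is defined in", components)
```

Here `name` and `description` are held by all three components, each with different wording, while the measure, dimensions and events each live in only one of them.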
Removing the Boundaries and Overlaps
When I get stuck with a problem like this I find the key is to keep breaking it down into smaller and smaller things until I get to a list of atomic things that are unique.
Here is where I have ended up with two different languages I can use so far:
(compiled with some help from my ChatGPT friend using the above as the input, and then the human iterated it)
The list of Context we store
Or put another way the things we store.
Business Terms
Agreed-upon definitions and aliases used across the organisation (from the Business Glossary).
Core Business Concepts (entities)
The primary entities the organisation manages or counts (e.g. Customer, Product, Order).
Core Business Processes
The workflows, events, or life cycles that define how Core Business Concepts interact and change over time.
Conceptual Relationships
High-level, platform-independent relationships between Core Business Concepts.
Details (attributes)
Descriptive details of Core Business Concepts, such as name, status, date of birth (from the Logical Model).
Field Metadata
Technical descriptions of fields, such as data types, formats, default values, and allowable values (from Logical Data Model and Data Dictionary).
Physical Structures
Tables, columns, data types, and indexes used to implement data in a specific platform (from the Physical Model and Data Dictionary).
Data Rules and Expectations
Contractual schema definitions and guarantees that govern how data is exchanged and validated (from Data Contracts).
Statistical Profiles
Observed characteristics of data, including distributions, cardinality, and null rates (from Data Profiles).
Transformation Logic
Business rules and operations that convert raw data into usable outputs (e.g. mappings, joins, calculations).
Raw Numerical Values
Source data captured as facts, such as quantity, amount, or duration (from Facts).
Aggregated Values
Summarised calculations based on raw numerical values (from Measures).
Derived Metrics
Formulas combining multiple measures or facts to express performance or ratios (from Metrics).
Business Questions
Repeatable questions that drive insight and inform decisions (from Business Questions).
Actions
Operational steps or decisions taken based on answers to Business Questions.
Delivery Interfaces
Mechanisms that present data and enable interactions, such as dashboards, reports, datasets, and APIs (from Information Apps).
The types of Context we store
Or put it another way, how it is stored.
Lists
Repeating sets of labelled items.
Examples:
– List of business terms (Business Glossary)
– List of fields (Data Dictionary)
– List of metrics, measures, concepts, questions, or actions
– List of attributes, rules, or processes
Definitions
Text-based descriptions that explain meaning, purpose, or intent.
Examples:
– Term definitions
– Field descriptions
– Action explanations
– Business question phrasing
Identifiers
Unique keys, codes, or labels used to reference or join data.
Examples:
– Concept identifiers
– Field names
– Relationship keys
Relationships
Structured mappings between two or more things.
Examples:
– Concept A interacts with Concept B
– Concept A, Concept B and Concept C are involved in Process A
– Detail belongs to Concept (attribute belongs to entity)
– Measure uses Fact
– Metric uses Measure
– Question → Metric → Action
– Term → Concept → Field
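Chains such as Question → Metric → Action and Term → Concept → Field can be sketched as a small directed graph that an AI Agent could walk. The node labels and the `reachable` helper are hypothetical, chosen only to mirror the examples in the list above.

```python
from collections import defaultdict

# Each pair is one relationship: (source thing, target thing).
edges = [
    ("Question: monthly churn rate", "Metric: churn percentage"),
    ("Metric: churn percentage", "Action: offer discount"),
    ("Term: Customer", "Concept: Customer"),
    ("Concept: Customer", "Field: customer.customer_id"),
]

graph = defaultdict(list)
for source, target in edges:
    graph[source].append(target)


def reachable(start):
    """Walk the relationships and list everything reachable from start."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                stack.append(nxt)
    return seen


print(reachable("Question: monthly churn rate"))
print(reachable("Term: Customer"))
```

Starting from a Business Question the walk ends at an Action, and starting from a Business Term it ends at a physical Field, without either chain needing to know which traditional component each thing used to sit in.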
Rules / Logic
Logic, formulas, or expressions that define behaviour or constraints, or that produce outputs from raw data.
Examples:
– Validation rules
– Data contracts
– Calculation logic
– Transformation pipelines
– Measures (aggregated)
– Metrics (calculated)
Structures / Schemas
Models that define how things are organised or composed.
Examples:
– Logical/physical data models
– API schemas
– Table definitions
Raw Data / Values
Captured data points or measurements.
Examples:
– Facts (quantities, amounts)
– Timestamps
– Event logs
– Data Profile results
– Data Quality results
Interfaces / Outputs
Representations used for delivery or consumption.
Examples:
– Information Apps
– Dashboards
– Reports
– APIs
– Data services
Time to Cross Domains
As I mentioned at the start, my experience and expertise is founded in the data domain.
I think it’s time to look for help from the other domains, and for this one specifically the Library and Information Sciences domain.
Juan Sequeda posted a LinkedIn article on how he is thinking about metadata/ontologies/knowledge graph/semantic layers. In that article he uses this framing:
I can probably simplify “the List of Context we store” and “the Type of Context we store” above to the Business/Technical/Mapping metadata categories in that diagram.
But I’m not sure simplifying it will get me any closer to achieving my goal; I think I need to go down into the weeds a little more to get the clarity I need.
One option to do this safely is to identify a number of use cases where “AI” will use the “Context Plane” and see what I need to provide for it to be successful.
Another option is to try and collaborate with an expert from the Library and Information Sciences domain to help add their views and language to see if it gets me to the next step.
The other option is to follow a suggestion Joe Reis made on the Practical Data Modeling Discord,
“As a thought experiment, ask any AI what it needs. It’s very different from what we’ve devised so far”.
Time to have a little think about what the next step in my train of thought for the “AI Data Stack” will be.
Wood from the Trees
Still a way to go before I have a coherent set of Patterns for the “Context Plane” and the “AI Data Stack” that I can Coach / Mentor / Teach somebody else, or present as a robust Architecture map.
But as I have already said, writing my half formed ideas helps me think.
An incoherent stream of thought
You can find all the previous articles with my train of thought combined on this article over at: