Background to the pattern of the "Context Plane"
How I think about our current version of the "Context Plane"
How I think about our current version of the "Context Plane".
“Context” of this post
I often find writing helps me coalesce and refine my thoughts when new patterns start to emerge, but aren’t very clear yet.
So this article is a brain dump / train of thought continuation of the architecture needed to have one Context Plane to rule them all, as part of a proposed “AI Data Stack”.
This article provides an overview of how I think about the Context Plane in our AgileData Product, as that anchors a lot of my thinking about what the Context Plane in a new “AI Data Stack” would look like.
First defining our current Context Plane pattern
When we started building the AgileData Platform and AgileData App, Nigel and I agreed on a set of core principles and patterns that we would align to as much as was practical.
One of these was the use of what we called Config (what I now refer to as Context), which would sit at the center of everything we built.
Context is a Pet not Cattle
The Context Plane is where we hold everything we care about that makes our AgileData Platform and AgileData App work, apart from our customers' actual data.
We treat both the Context we hold and the Customers' data we hold as Pets, and we treat everything else in our Platform as Cattle.
We generate, deploy, execute and destroy code at will. We automate the generation of all code from the Context we hold.
Context drives our App and our Platform
For example, everything displayed on our Data Catalog screen is stored in our Context Plane, and then rendered in the AgileData App as needed.
The Concept, Conceptual, Logical and Physical data models are stored in the Context Plane and again rendered in the AgileData App, or used as part of our DataOps processes as and when required.
We define and hold Core Business Concepts, Detail for those Concepts and Core Business Events as the core “entities” in our Context Model.
Concepts, Details and Events are the semantic language we use as part of our Context Language.
We hold the relationships of how Core Business Concepts relate to Core Business Events in the Context Plane.
Holding these relationships as Context allows us to render and explore a “Graph” view of them in our App.
For those of us who are old school, we also render these relationships as a Bus Matrix in our App. Our Context Plane allows us to render the same Context in any visualisation or Map format that helps simplify complex data tasks.
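To make the "same Context, many views" idea concrete, here is a minimal sketch in Python. The `EVENT_CONCEPTS` structure, the function names and the example Events are hypothetical illustrations, not how the AgileData Context Plane is actually stored. It shows one set of Concept-to-Event relationships rendered both as graph edges and as a Bus Matrix.

```python
# Hypothetical Context: which Core Business Concepts take part in which
# Core Business Events. The shape of this dict is an assumption.
EVENT_CONCEPTS = {
    "customer orders product": ["customer", "product", "order"],
    "customer pays for order": ["customer", "order", "payment"],
}


def graph_edges(event_concepts):
    """Render the relationships as (event, concept) edges for a graph view."""
    return sorted(
        (event, concept)
        for event, concepts in event_concepts.items()
        for concept in concepts
    )


def bus_matrix(event_concepts):
    """Render the same relationships as a matrix: one row per Event,
    one boolean per Concept (True when the Concept is part of the Event)."""
    concepts = sorted({c for cs in event_concepts.values() for c in cs})
    rows = [
        (event, [c in cs for c in concepts])
        for event, cs in event_concepts.items()
    ]
    return concepts, rows


def render_bus_matrix(event_concepts):
    """Format the matrix as plain text, marking participation with an x."""
    concepts, rows = bus_matrix(event_concepts)
    width = max((len(event) for event in event_concepts), default=0)
    lines = [" " * width + " | " + " | ".join(concepts)]
    for event, flags in rows:
        cells = [
            ("x" if flag else "").center(len(concept))
            for flag, concept in zip(flags, concepts)
        ]
        lines.append(event.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_bus_matrix(EVENT_CONCEPTS))
```

Both views read from the same dictionary, so adding a new Event to the Context changes the graph and the matrix at the same time.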
Our version of Data Quality (Trust Rules) and Observability (Notifications) are again stored in the Context Plane and rendered / used as needed.
Context drives the generation of our code
Our version of Data Transformation logic (Change Rules) is, you guessed it, stored in the Context Plane and then used as and when required.
If you perused that last series of screenshots you will see we actually surface the Change Rule logic in multiple languages: we surface it in the UI in a notebook style, we surface it in the form of a Gherkin script, and we surface the SQL that it generates to load or transform data in Google BigQuery.
These are all dynamically generated from the Context stored in the Context Plane for each Change Rule we hold. We have defined it once and we can render it in many different ways, many different languages and for many different use cases.
Some people call this a table driven pattern, a metadata driven pattern or a model driven pattern etc.
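A small sketch of the define-once, render-many pattern, written in Python. The `CHANGE_RULE` dictionary, its keys and the table names are hypothetical, and the generated Gherkin and SQL are simplified illustrations rather than what the AgileData Platform actually produces.

```python
# Hypothetical Context for a single Change Rule. The keys are assumptions
# made for this sketch, not the real Context model.
CHANGE_RULE = {
    "name": "load_customer",
    "source": "landing.crm_customers",
    "target": "concept.customer",
    # target column -> source expression
    "columns": {"customer_id": "id", "customer_name": "full_name"},
    "filter": "is_deleted = FALSE",
}


def render_sql(rule):
    """Render the rule as an INSERT ... SELECT statement."""
    select_list = ",\n  ".join(
        f"{expr} AS {col}" for col, expr in rule["columns"].items()
    )
    target_cols = ", ".join(rule["columns"])
    sql = (
        f"INSERT INTO {rule['target']} ({target_cols})\n"
        f"SELECT\n  {select_list}\n"
        f"FROM {rule['source']}"
    )
    if rule.get("filter"):
        sql += f"\nWHERE {rule['filter']}"
    return sql


def render_gherkin(rule):
    """Render the same rule as a Gherkin-style scenario."""
    lines = [
        f"Scenario: {rule['name']}",
        f"  Given the table {rule['source']}",
    ]
    if rule.get("filter"):
        lines.append(f"  And only rows where {rule['filter']}")
    for col, expr in rule["columns"].items():
        lines.append(f"  And {expr} is mapped to {col}")
    lines.append(f"  Then load {rule['target']}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_gherkin(CHANGE_RULE))
    print()
    print(render_sql(CHANGE_RULE))
```

Neither rendering is stored anywhere. Change the rule's Context and both outputs change the next time they are generated.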
Orchestrate via dynamic manifests based on Context
To orchestrate the execution of the Data Transformations (Change Rules) we of course build out a Directed Graph to run them in the order they need to be run, and to honour the dependencies they need to honour.
But this Directed Graph is dynamically generated.
If I go and add another Change Rule, it is stored in the Context Plane and the next time I come into this screen it will be dynamically rendered including that rule.
When a new record turns up in a table on the left, it will trigger a bunch of steps.
The Context Plane will be queried to see all the Change Rules that are dependent on that data.
The Context Plane will be queried to find the dependencies between the Change Rules that will run, and the tables they will load.
A “Manifest” is created to describe what Change Rules need to run.
The SQL code for each relevant Change Rule will be generated from the Context stored in the Context Plane about that Change Rule.
The code will be submitted to run using a Pub/Sub model, in parallel.
When all the code has been successfully run, the code and the manifest will be destroyed. (We do hold a copy of the code that was executed in our Audit Vault, so we can always view what was run when).
The latest Observability and Trust data will be stored in the Context Plane.
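The steps above can be sketched roughly as follows, in Python. Everything here is an assumption made for illustration: the `CHANGE_RULES` structure, the function names, and the idea of expressing the manifest as ordered batches. The sketch only builds the manifest. It does not generate SQL or submit anything to Pub/Sub.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical Context: each Change Rule reads some tables and loads one.
CHANGE_RULES = {
    "load_customer": {
        "reads": ["landing.crm"],
        "loads": "concept.customer",
    },
    "load_order": {
        "reads": ["landing.orders"],
        "loads": "concept.order",
    },
    "load_customer_orders": {
        "reads": ["concept.customer", "concept.order"],
        "loads": "event.customer_orders",
    },
}


def affected_rules(changed_table, rules):
    """Find every rule that depends on the changed table, directly or
    through a table loaded by another affected rule."""
    affected, frontier = set(), [changed_table]
    while frontier:
        table = frontier.pop()
        for name, rule in rules.items():
            if table in rule["reads"] and name not in affected:
                affected.add(name)
                frontier.append(rule["loads"])
    return affected


def build_manifest(changed_table, rules):
    """Return ordered batches of rule names. Rules in the same batch do not
    depend on each other, so they could be submitted in parallel."""
    affected = affected_rules(changed_table, rules)
    loader = {rules[name]["loads"]: name for name in affected}
    # Map each affected rule to the affected rules that must run before it.
    graph = {
        name: {loader[t] for t in rules[name]["reads"] if t in loader}
        for name in affected
    }
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    print(build_manifest("landing.crm", CHANGE_RULES))
```

With this example Context, a new record in `landing.crm` gives a manifest with `load_customer` in the first batch and `load_customer_orders` in the second. `load_order` is left out because nothing it reads has changed.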
As we hold the Context of how every physical table (Tile in our language) relates to every other table (Tile), via these Data Models / Change Rules, we can easily show all the dependencies (Related Tiles) for a specific table (Tile), when viewing the Context of that table (Tile).
This includes the relationships in the Data Model, i.e. for this Concept of “customer” we hold these Details, and that Concept of “customer” is part of these four Events. It also includes the relationships for all the Change, Trust and Consume rules that the “customer” Concept is part of, i.e. this Detail tile is loaded by this Change Rule and validated by these Trust Rules.
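As an illustration of that lookup, here is a minimal Python sketch. The `RELATIONSHIPS` triples, the relationship names and the tile names are all hypothetical. The point is only that once every relationship is held as Context, finding the Related Tiles for a given Tile is a simple filter.

```python
# Hypothetical Context: (source, relationship, target) triples covering both
# the Data Model and the rules that load or validate each Tile.
RELATIONSHIPS = [
    ("customer", "has_detail", "customer_contact"),
    ("customer", "part_of_event", "customer_orders_product"),
    ("load_customer_contact", "loads", "customer_contact"),
    ("contact_email_not_null", "validates", "customer_contact"),
]


def related_tiles(tile, relationships):
    """Group everything related to a Tile by relationship type.

    Outbound relationships keep their name. Inbound ones are prefixed with
    'inbound:' so the direction stays visible to the caller.
    """
    related = {}
    for source, kind, target in relationships:
        if source == tile:
            related.setdefault(kind, []).append(target)
        elif target == tile:
            related.setdefault(f"inbound:{kind}", []).append(source)
    return related


if __name__ == "__main__":
    for kind, names in related_tiles("customer_contact", RELATIONSHIPS).items():
        print(kind, names)
```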
Multiple languages to create Context
You can create new Context in the Context Plane via the AgileData App UI, or you can use the APIs to create the Context.
It would be relatively simple to allow the creation of the Context in yet another language, for example by uploading YAML files.
(Fun fact: the first interface we had to create Context was Google Sheets!)
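A rough Python sketch of why adding another input language is cheap: each interface only needs a small adapter that produces the same Context record. The record shape, the function names, the JSON payload and the sheet columns are all hypothetical, and a CSV export stands in for a spreadsheet so the sketch needs only the standard library.

```python
import csv
import io
import json


def from_api_payload(payload):
    """Turn a JSON payload (as an API might receive) into a Context record."""
    data = json.loads(payload)
    return {"concept": data["concept"], "details": sorted(data["details"])}


def from_sheet_rows(csv_text):
    """Turn spreadsheet-style rows (one Detail per row) into Context records."""
    grouped = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped.setdefault(row["concept"], []).append(row["detail"])
    return [
        {"concept": concept, "details": sorted(details)}
        for concept, details in grouped.items()
    ]


if __name__ == "__main__":
    api_record = from_api_payload(
        '{"concept": "customer", "details": ["name", "email"]}'
    )
    sheet_records = from_sheet_rows(
        "concept,detail\ncustomer,name\ncustomer,email\n"
    )
    # Both interfaces end up describing the same Context.
    print(api_record == sheet_records[0])
```

A YAML adapter would follow the same shape: parse the file, then emit the same record.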
One set of Context to rule them all
So you can see how the Context we hold in our Context Plane is the thing that drives everything our AgileData Platform and AgileData App does.
Context is first and powers everything that happens
I know what you're thinking: this sounds a little bit like a data catalog, or a metadata repository.
But here's the thing: data catalogs run by sucking on the Data and Metadata exhaust.
We treat Context as the fuel that powers everything we do.
You cannot run anything if Context does not exist for that thing.
I cannot create a data catalog object, I cannot create a table, I cannot create code or run it, without it existing in the Context Plane.
I first create the Context, and then the AgileData Platform does all the other horrible and complex data work for me.
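A minimal Python sketch of that Context-first rule. The `ContextPlane` class, its methods and the generated DDL are hypothetical illustrations, not the AgileData implementation. The only point is that the code generator refuses to produce anything for a Tile that has no Context.

```python
class ContextPlane:
    """Toy in-memory stand-in for a Context Plane (an assumption for
    illustration only)."""

    def __init__(self):
        self._tiles = {}

    def register_tile(self, name, columns):
        """Create the Context for a Tile: its name and column types."""
        self._tiles[name] = dict(columns)

    def create_table_sql(self, name):
        """Generate DDL from Context. With no Context, there is no code."""
        if name not in self._tiles:
            raise LookupError(f"No Context exists for tile {name!r}")
        columns = ", ".join(
            f"{col} {col_type}" for col, col_type in self._tiles[name].items()
        )
        return f"CREATE TABLE {name} ({columns})"


if __name__ == "__main__":
    plane = ContextPlane()
    try:
        plane.create_table_sql("concept.customer")
    except LookupError as error:
        print(error)

    # Create the Context first, then let the platform generate the code.
    plane.register_tile("concept.customer", {"customer_id": "STRING"})
    print(plane.create_table_sql("concept.customer"))
```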
It's about more than just data and code
As you would expect of any mature data platform, we have added “Data Management” and “Data Governance” capabilities over the years where we found them useful.
One of those is the ability to store the Context needed to render a “Data Dictionary”.
But we are still missing a lot of useful Context
We only build things in our AgileData Platform and AgileData App when they save us time or automate something we hate doing.
The Data Dictionary feature was built so we could find fields we needed to find by searching all the Context of all the data we hold for a customer.
There is some other Context we know will give us value that we have yet to add.
Things like the Context typically associated with Business Glossaries, Metric Layers/Engines and Semantic BI Layers, to name three.
I believe they are all part of the Context that is needed in a Context Plane to make it useful for AI Agents.
That will be the subject of the next article I write to help me with my thinking.
Wood from the Trees
There is still a way to go before I have a coherent set of Patterns for the “Context Plane” and the “AI Data Stack” that I can Coach / Mentor / Teach to somebody else, or present as a robust Architecture map.
But as I have already said, writing down my half-formed ideas helps me think.
An incoherent stream of thought
You can find all the articles with my thoughts combined on this over at: