#06 - DataOps Notifications
Why do your data work with the equivalent of a blindfold on, when you can observe everything that is going on in the AgileData Platform as it happens?
What
Observe everything that is going on in the AgileData Platform, as it happens.
Stay connected and observe every important change within the AgileData Platform with real-time notifications. Designed to keep you informed, this feature ensures you never miss critical updates or changes to your data.
Whether it’s a completed data refresh, a new data Tile becoming available, or an alert about an issue with Trust Rules, notifications provide instant insights into every key data flow in the platform.
Integrated seamlessly into the AgileData App, notifications empower you to act quickly, maintain data trust, and stay constantly informed about your data work.
Feature Requirement
The system must provide real-time notifications to data users, keeping them informed about significant changes, updates, or alerts within the AgileData Platform that directly impact their data tasks.
Requirement Rationale
Notifications ensure data users stay connected and informed about critical events, such as data refreshes, data quality alerts, or the availability of new data assets. This empowers them to act promptly, maintain trust in their data, and complete their data tasks efficiently without missing key updates or changes.
How
Why
This was another "scale the machine, not the Nigel" driven feature.
In the early days of the AgileData App, when something didn’t seem to work properly I had to go into the Google Cloud Console and search the logs to see what had happened.
As you can imagine, I didn’t do this. I just Slacked Nigel and asked him to have a look.
As you can also imagine, Nigel wasn’t a fan of this approach, so he built the Notifications feature.
The key to this feature is that it shows everything that happens in the AgileData Platform.
Again, as you can imagine, this causes a signal vs noise problem, and reducing that noise has been a major focus of iterating this feature over the years.
Real-time notifications are visible at a glance in the top menu of the app. This means when I do a data task, for example uploading a manual CSV file or manually running a data flow, I can quickly observe it happening.
This removes the need for me to go to the Notifications screen to watch things run. I can stay in the screen for the data task I need to do, while still observing the process.
Providing this real-time view of the Notifications was a major piece of platform engineering, as it required the use of a WebSockets component to provide the pub/sub pattern.
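To make the pub/sub pattern concrete, here is a minimal sketch. It is not the AgileData implementation: the `NotificationHub` class and the tenancy names are hypothetical, and in-memory asyncio queues stand in for real WebSocket connections so the example runs on its own.

```python
import asyncio
from collections import defaultdict


class NotificationHub:
    """Hypothetical in-memory pub/sub hub.

    Each subscriber queue stands in for one open WebSocket connection.
    """

    def __init__(self):
        # tenancy id -> list of subscriber queues
        self._subscribers = defaultdict(list)

    def subscribe(self, tenancy):
        # Register a new listener (for example, one open browser tab).
        queue = asyncio.Queue()
        self._subscribers[tenancy].append(queue)
        return queue

    async def publish(self, tenancy, message):
        # Fan the message out to every subscriber of this tenancy only.
        for queue in self._subscribers[tenancy]:
            await queue.put(message)


async def demo():
    hub = NotificationHub()
    browser_a = hub.subscribe("tenancy-1")
    browser_b = hub.subscribe("tenancy-1")
    other = hub.subscribe("tenancy-2")
    await hub.publish("tenancy-1", "Tile load finished")
    # Both tenancy-1 listeners receive the message; tenancy-2 receives nothing.
    return await browser_a.get(), await browser_b.get(), other.empty()


result = asyncio.run(demo())
```

The point of the pattern is that the publisher does not need to know who is listening. Every open session subscribed to a tenancy gets the event as soon as it is published.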
Real-time notifications are great, but you have to have the AgileData App open in a browser to see them.
Often I want to check all the Notifications for a Customer’s Tenancy that I am not currently logged into.
So we built the Notifications screen.
This screen shows all the Notifications for the Tenancy.
The initial versions of this screen hit the signal vs noise problem, in multiple ways.
First was the fact that there are lots of notifications and loading them for all time was slow. So we defaulted it to only show the last 24 hours, and then provided an easy way to extend the time window to show older notifications.
It’s interesting how rarely I need to extend the time window. Most times, a snapshot of notifications over the last 24 hours is enough for me to get the data task done.
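The default time window can be sketched as a simple filter. The field name `created_at` and the function `recent_notifications` are assumptions for illustration, not names taken from the AgileData Platform.

```python
from datetime import datetime, timedelta, timezone


def recent_notifications(notifications, now, window_hours=24):
    """Return notifications created within the last `window_hours` hours."""
    cutoff = now - timedelta(hours=window_hours)
    return [n for n in notifications if n["created_at"] >= cutoff]


# Fixed "now" so the example is reproducible.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
notifications = [
    {"id": 1, "created_at": now - timedelta(hours=2)},
    {"id": 2, "created_at": now - timedelta(hours=30)},
    {"id": 3, "created_at": now - timedelta(days=5)},
]

# Default view: last 24 hours only.
default_view = recent_notifications(notifications, now)
# Extended view: the user widens the window to 48 hours.
extended_view = recent_notifications(notifications, now, window_hours=48)
```

In a real system the cutoff would more likely be applied in the query against the notification store than in application memory, so that older rows are never loaded at all.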
As I mentioned earlier, the Notifications screen shows everything that is happening on the AgileData Platform.
We display when Tiles are created, when new data is loaded, when Change Rules are iterated, and when data fails Trust Rule validation.
Nigel is constantly adding more events to the Notifications as we find more things we want to be able to easily observe and investigate.
We also show every successful event and every event that is treated as a failure.
This amplifies the signal vs noise problem.
The first thing we did was apply a DataOps scorecard pattern at the top of the screen, which allows you to see the number of Notifications within a specific category.
The key categories we found valuable are:
Success vs Failure
System identified Anomalies
Trust Rule identified anomalies
Data Freshness anomalies
This scorecard is interactive: click on the Anomalies indicator and the Notification list will filter to only those Notifications.
This saves a few clicks and, as I have mentioned in previous articles, a few clicks and seconds saved in a data task you do repeatedly all add up to minutes and hours saved over time.
We added search to enable you to filter the Notifications based on any freeform text.
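The scorecard counts, the click-to-filter behaviour, and the freeform search can be sketched in a few lines. The category keys and the `text` field below are assumed names for illustration only.

```python
from collections import Counter

# Hypothetical notification records; field names are assumptions.
notifications = [
    {"category": "success", "text": "Tile customer loaded"},
    {"category": "trust_rule_anomaly", "text": "Trust Rule failed on Tile orders"},
    {"category": "success", "text": "Tile orders loaded"},
    {"category": "freshness_anomaly", "text": "Tile payments is stale"},
]


def scorecard(items):
    """Count notifications per category for the scorecard header."""
    return Counter(n["category"] for n in items)


def filter_by_category(items, category):
    """Mimic clicking a scorecard indicator: keep one category only."""
    return [n for n in items if n["category"] == category]


def search(items, query):
    """Case-insensitive freeform text search over the notification text."""
    needle = query.casefold()
    return [n for n in items if needle in n["text"].casefold()]


counts = scorecard(notifications)
anomalies = filter_by_category(notifications, "trust_rule_anomaly")
orders_hits = search(notifications, "ORDERS")
```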
The next thing we found useful was adding Context to a Notification type.
For example, in the Tile load Notification we show a small bar graph of the number of records loaded, with a red line that shows the average load volume for that Tile. This allows me to quickly confirm at a glance that the load volume is within the expected threshold, without needing to click the Notification to see more details.
These glance widgets are context sensitive to the Notification type.
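The check behind the Tile load widget can be approximated as a comparison of the latest load against the historical average. The 50% tolerance and the function name `load_within_expected` are assumptions made for this sketch. The article does not say how the expected threshold is actually calculated.

```python
from statistics import mean


def load_within_expected(history, latest, tolerance=0.5):
    """Compare the latest load volume with the average of earlier loads.

    Returns the average (the "red line") and whether the latest load
    falls within +/- `tolerance` of it. The tolerance is an assumed value.
    """
    average = mean(history)
    lower = average * (1 - tolerance)
    upper = average * (1 + tolerance)
    return average, lower <= latest <= upper


# Three earlier loads for a Tile, then a typical load and a suspiciously small one.
average, ok = load_within_expected([1000, 1100, 900], latest=1050)
_, ok_low = load_within_expected([1000, 1100, 900], latest=100)
```

Here the typical load passes the check, while the load of 100 records falls well below the band and would stand out against the red line.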
At any time I can click on the Menu Anywhere (3 dots) for a Notification and drill down to more detail.
For example, I can drill down to the Trust Rule that has generated an anomaly, or the Change Rule that was used to load data into the Tile.
I can also see the detail behind any Notification.
Again we provide context sensitive details based on the Notification Type.
For Tile Load Notifications we will show the load stats for the historical loads for that Tile.
For an error Notification we used to show the BigQuery error.
As you can imagine, as soon as I saw one of those I would just message Nigel to have a look at it and tell me what I needed to change.
And again as you can imagine Nigel automated that task, so he doesn’t have to do it for me.
We now pass the error to the Google Gemini LLM and get it to return both a description of what the error is likely to mean, and a suggestion on how to resolve it.
Unsurprisingly this has resulted in me resolving a lot of the issues myself.
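The error-explanation step can be sketched as building a prompt and handing it to a model. This sketch deliberately does not call the Gemini API: `ask_llm` is a hypothetical callable you would replace with a real client, and the prompt wording is an assumption rather than the prompt AgileData uses.

```python
def build_error_prompt(error_message):
    """Build a prompt asking for both an explanation and a suggested fix."""
    return (
        "Explain what this BigQuery error is likely to mean and "
        "suggest how to resolve it.\n\n"
        f"Error: {error_message}"
    )


def explain_error(error_message, ask_llm):
    # ask_llm is any callable that takes a prompt string and returns text.
    # It is a placeholder for a real LLM client, not a real library function.
    return ask_llm(build_error_prompt(error_message))


def fake_llm(prompt):
    # Stand-in model so the example runs without network access.
    return f"(model reply to {len(prompt)} characters of prompt)"


prompt = build_error_prompt("Unrecognized name: custmer_id")
reply = explain_error("Unrecognized name: custmer_id", fake_llm)
```

Passing the model in as a callable keeps the prompt logic testable without network access.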
We also broadcast some of the Notifications to private Slack channels for a Tenancy. This allows us to view Notifications from multiple AgileData tenancies in one place, without the need to log in to each Tenancy and check the Notifications screen.
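One common way to post a message into a Slack channel is an incoming webhook, which accepts a JSON body with a `text` field. The sketch below assumes that approach. The article does not say how AgileData actually delivers these messages, and the webhook URL and function name are placeholders.

```python
import json
import urllib.request

# Placeholder URL; a real incoming webhook URL is issued by Slack per channel.
WEBHOOK_URL = "https://hooks.slack.com/services/EXAMPLE/PLACEHOLDER"


def build_slack_request(tenancy, message, webhook_url=WEBHOOK_URL):
    """Build (but do not send) a POST request carrying one notification."""
    # Prefix the tenancy so messages from several tenancies can share a view.
    payload = {"text": f"[{tenancy}] {message}"}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_slack_request("tenancy-1", "Trust Rule anomaly on Tile orders")
# Sending is omitted here; urllib.request.urlopen(request) would perform the POST.
```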
I have a backlogged feature to create a new cross tenancy capability where we can see Notifications for multiple tenancies in an AgileData App screen, a cross tenancy portal so to speak.
As we have AgileData Network Partners starting to support multiple customers in multiple tenancies, this feature will become important in reducing the time they spend monitoring and managing their customers’ data, and it will also remove the need for us to use Slack for this purpose.