#05 - Data Search
Why use complex query tools to find the data you need when you can just search for your data values, like using Google Search?
What
Search your data as effortlessly as using Google Search, finding the exact data you need in seconds.
Simply type in any data value you want to find in a Tile (table) and the AgileData App will automagically search every row and every column in that Tile and return the rows that have that value.
Data Search enables Data Personas to look beyond metadata and dive into the actual content of datasets, finding the specific data values or patterns directly within the cataloged data.
This makes finding and using relevant data faster, removing the need to write query code or use third party query tools to find the data they need to complete their current data task.
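To make the idea concrete, here is a minimal Python sketch of "search every row and every column for a value". It is an illustration only, not how the AgileData App implements Data Search: the `search_rows` function, the sample customer records and the simple case-insensitive substring match are all assumptions made for the example.

```python
from typing import Any


def search_rows(rows: list[dict[str, Any]], value: str) -> list[dict[str, Any]]:
    """Return every row where at least one column contains `value`.

    Hypothetical helper: matches are case-insensitive substring matches
    against the string form of each cell. Empty (None) cells are skipped.
    """
    needle = value.casefold()
    return [
        row
        for row in rows
        if any(
            needle in str(cell).casefold()
            for cell in row.values()
            if cell is not None
        )
    ]


# Made-up rows standing in for a Customer Tile.
customers = [
    {"customer_id": "C001", "name": "Ada Lovelace", "phone": "021 555 0101"},
    {"customer_id": "C002", "name": "Grace Hopper", "phone": "021 555 0199"},
    {"customer_id": "C003", "name": "Alan Turing", "phone": None},
]

# Search by part of a phone number; no column name is needed.
matches = search_rows(customers, "0199")
```

The point of the sketch is that the caller supplies only a value, never a column name or a query, which is the experience described above.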
Feature Requirement
The system must allow users to perform data search within the catalog, enabling them to search not only metadata but also the actual content of datasets for specific values or patterns.
Requirement Rationale
Catalog data search enhances the ability to find relevant information quickly, supports deeper data exploration, and improves the efficiency of data analysis by allowing users to locate specific data points directly within the cataloged datasets, and minimising the need to write code or use third party query tools.
How
Why
This feature was built out of Nigel’s frustration with me asking him to write and run queries for me all the time.
Like Track N Trace, it is another one of those small micro features that we built to remove a frustration that ended up being one of the surprise features I use a lot.
As I often joke, “I can’t code, won’t code, don’t code”.
This statement is not quite true. While I can’t sit at a blank screen and write code from scratch, I can read and edit other people’s code, I’m OK at googling and reusing code snippets, and with the advent of LLM co-pilots I am even more proficient at cobbling code examples together to get a data job done.
But one of our core principles is if I have to write code in our AgileData App, or heaven forbid I have to write and execute code in BigQuery Studio in the Google Cloud Console, then we have some work to do to make the no/low code experience magical.
To be clear, a magical no code solution is nirvana for us.
The analogy I often use is the “Magic Sorting Hat”, stealing a pattern from Harry Potter. If we could build a magic sorting hat that would allow us to dump any of our customers’ data, of any complexity, into that hat, and it then automagically did all the data work we have to do manually, then we are Done Done. I don’t believe that I will see that in my working lifetime, but then again, I wouldn’t have believed a tool like ChatGPT would exist in my lifetime either.
As we are bootstrapping, we don’t have billions in VC money to build the magic sorting hat, and we are not young enough anymore to couch surf and subsist on 2 minute noodles. So we are taking a softly softly catchy monkey approach to this, implementing AskAI, AssistedAI and AutomatedAI patterns where they add value.
We are iteratively moving towards the magic sorting hat, one small step at a time, knowing we are unlikely to get to the final end we desire, but still getting massive value and reducing the complexity of managing data on every step of the journey.
An interesting point from the three AI patterns lens is that we decided to use a Search pattern to solve this problem, not an AskAI chat pattern.
Back to Data Search.
I use this feature in a few different ways, often in combination with Track N Trace.
One use case is when I want to see the Details that relate to a specific Concept.
For example, I have a Concept of Customer and I have some Details about the Customer, attributes such as name, date of birth, phone number etc. For a specific customer record there is a problem with the data; maybe the phone number isn’t what is expected.
So I will go into the Catalog Detail screen for the relevant Tile and search for that Customer, maybe based on their customer id, maybe on their customer name. I will get back a row (or rows) for any row/column that has that value, and I can then peruse all the values on that row.
Or maybe I want to search for a specific phone number to find which Customer(s) it is stored against.
Nothing that can’t be done using a SQL query tool, but remember I won’t code and I never want to have to leave the AgileData App to get the data work done.
And Nigel doesn’t want to write and run queries for me, hence this feature appearing.
Before this feature it was either Ask Nigel, use the BigQuery Studio Console or push the data to a Consume Tile and use Looker Studio on it.
The Data Search feature reduces the time, effort and cognition I need to find the data I am after, which is what we are all about.
After using this feature for a while, there are a few things we need to iterate on.
It would be good to have more control over which columns are searched, and to be able to search for two different values at the same time. Of course those iterations will start us down the journey of building a query tool in the app, and I am not sure that is something we want to do.
And I would love to be able to search every Tile for a data value, a bit like Track N Trace.
One of the secrets of Data Search is that we are leveraging the BigQuery SEARCH() function for this; we are not writing and running a custom SQL query.
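For readers who do code, a statement that uses SEARCH() across a whole table might be shaped like the sketch below. This is an assumption about the general pattern, not the statement the AgileData App actually runs. The `build_search_query` helper and the table name are hypothetical, and the hand-rolled quoting is for illustration only.

```python
def build_search_query(table: str, value: str) -> str:
    """Build a BigQuery SQL statement that searches all columns of `table`.

    Hypothetical helper for illustration. Passing the table alias `t` to
    SEARCH() asks BigQuery to look across the columns of each row instead
    of naming columns one by one.
    """
    # Escape backslashes and single quotes so the value stays one SQL literal.
    literal = value.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT *\n"
        f"FROM `{table}` AS t\n"
        f"WHERE SEARCH(t, '{literal}')"
    )


# Assumed table name; substitute a real project.dataset.table.
print(build_search_query("my_project.my_dataset.customer", "C002"))
```

Note that SEARCH() matches on tokens produced by a text analyzer, not on arbitrary substrings, so its results can differ from a plain "contains" match.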
For now I will keep saving time when I need to find a data value by using it, and sleep well knowing I haven’t had to Ask Nigel to “write me another quick query”.