Orchestrating Dynamic Data Flows, AgileData Engineering Pattern #1
The Dynamic Data Flow Orchestration pattern dynamically generates and self-heals data flow manifests (DAGs) at runtime from a central context repository
Orchestrating Dynamic Data Flows
AgileData Engineering Pattern
An AgileData Engineering Pattern is a repeatable, proven approach for solving a common data engineering challenge in a simple, consistent, and scalable way, designed to reduce rework, speed up delivery, and embed quality by default.
Orchestrating Dynamic Data Flows
The Dynamic Data Flow Orchestration pattern dynamically generates and self-heals data flow manifests (DAGs) at runtime from a central context repository, enabling adaptive, cost-effective, and low-latency data processing primarily for micro-batching workloads
Pattern Context Diagram
«TBD»
Pattern Template
Pattern Name
Dynamic Data Flow Orchestration
The Problem It Solves
You know that moment when you've updated a data rule or schema, only to find your data pipelines have broken or fallen out of sync, forcing hours of manual fixes? Or perhaps you're struggling with inconsistent DAG deployments, especially when multiple engineers are making changes, leading to "drift" in what should be running? This pattern addresses the recurring challenge of keeping data flows up-to-date, reliable, and cost-effective, without constant manual intervention or the "syncing problem" between rules and physical DAGs.
When to Use It
Use this pattern when:
You need data pipelines to self-heal and dynamically adapt to changes in rules, schemas, or dependencies.
You require your data lineage and manifests to be always up-to-date at runtime.
You are primarily working with microbatching workloads, where refresh latency is typically 15 minutes or more.
Your pipelines are often event-driven, starting when new data or files arrive.
You want to achieve low latency and low operational costs by avoiding persistent orchestration servers.
Maintaining auditability and preventing invalid changes in your data logic is critical.
How It Works
Trigger:
The pattern is initiated when:
The first piece of data arrives, triggering a refresh of a data table.
A scheduled execution occurs (e.g., a daily run at 7:00 AM).
A new row of data or a file arrives in an upstream table the process depends on, if the process is configured for "autosync".
Inputs:
Data transformation and loading rules, stored within a context database.
Configuration details, including attributes like "autosync" or "manual sync".
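As an illustration, a single rule entry in the context store might look something like the sketch below. The field names and values are assumptions made for this example, not the actual AgileData context schema.

```python
# A hypothetical context entry for one rule; every field name here is an
# illustrative assumption rather than the real AgileData context schema.
rule = {
    "rule_id": "consume.customer_orders",
    "depends_on": ["design.customer", "design.orders"],  # upstream tables this rule hangs off
    "transform_sql": "SELECT c.customer_key, o.order_total "
                     "FROM design.customer c JOIN design.orders o USING (customer_key)",
    "sync_mode": "autosync",   # run whenever a dependency receives new data
    "schedule": None,          # e.g. "0 7 * * 1" when sync_mode is "custom"
}
```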
Steps:
Define Rules in Context: Data rules and logic for transformations are created and stored in a central "context database".
Initial Trigger and Lookup: When data arrives or a schedule triggers, a refresh of the relevant data table is initiated. Once loaded, the system pauses and performs a lookup in the context database to identify all downstream objects dependent on this table.
Dynamic Manifest Generation: A manifest (a directed acyclic graph, or DAG) is built on the fly, outlining all necessary steps and their dependencies, so the manifest always reflects the freshest logic (see the sketch after this list).
Parallel Job Execution: All identified jobs in the manifest are "seeded off" simultaneously. Each job is tagged, and they run in parallel.
Pub/Sub Communication: As each job completes, it publishes an "I'm done" message via Pub/Sub. The system monitors these messages to track progress.
Iterative Orchestration: Once a "driving table" (a key dependency for the next stage) is loaded, another config lookup occurs, generating a new manifest to continue the flow. This repeats until all steps in the manifest are completed.
Change Validation and Self-Healing: Any changes to the context layer undergo validation before being deployed to production. If an upstream change (like adding a column) is detected, the system can automatically recreate tables, roll back watermarks, reload data, and revalidate rules downstream, handling 99% of common issues without manual intervention. Alerts are raised for changes that cannot be automatically resolved.
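To make the lookup–generate–dispatch cycle concrete, here is a minimal sketch, using an in-memory dictionary in place of the context database and a thread pool in place of Pub/Sub-triggered serverless jobs. The table names and helper functions are illustrative assumptions, not the actual AgileData implementation.

```python
# Minimal sketch of runtime manifest generation: look up downstream objects,
# seed that stage's jobs in parallel, then look up the config again.
from concurrent.futures import ThreadPoolExecutor

CONTEXT = {  # table -> downstream tables that hang off it (stand-in for the context database)
    "history.orders": ["design.orders"],
    "design.orders": ["consume.customer_orders", "consume.daily_sales"],
}

def downstream_of(table: str) -> list[str]:
    """Config lookup: which objects depend on this table now that it's loaded?"""
    return CONTEXT.get(table, [])

def run_job(table: str) -> str:
    print(f"loading {table}")  # placeholder for the real load/transform job
    return table

def orchestrate(trigger_table: str) -> None:
    """Build a manifest on the fly and seed each stage's jobs in parallel."""
    stage = downstream_of(trigger_table)
    while stage:  # repeat until nothing is left in the manifest
        with ThreadPoolExecutor() as pool:
            loaded = list(pool.map(run_job, stage))
        # once this stage's tables are loaded, do another config lookup
        stage = [t for done in loaded for t in downstream_of(done)]

orchestrate("history.orders")
```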
Outputs:
Always up-to-date and reliable data flows that adapt dynamically to changes.
Significantly reduced manual intervention for pipeline failures.
Low latency and cost-effective data processing due to serverless execution and no persistent orchestration server.
Data catalog and lineage graphs that are inherently accurate and current because they are driven directly from the context layer.
Why It Works
This pattern works because it embraces a context-driven approach. Instead of static, manually managed DAGs that can get out of sync with changing rules, all data logic is stored in a central repository. Data flows are then dynamically generated at runtime from this context, much like a recipe that's always updated with the freshest ingredients, ensuring what's running is always the latest version.
The use of Pub/Sub messaging and serverless infrastructure (like BigQuery and Cloud Run) enables a "fire and forget" execution model. This means there’s no constant, costly orchestration server running; jobs are spun up only when needed, then destroyed, leading to "very little latency" and "low costs".
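For the completion messages, a hedged sketch of publishing an "I'm done" event with the Google Cloud Pub/Sub client library is shown below; the project, topic, and attribute names are assumptions for illustration only.

```python
# Sketch of a fire-and-forget completion message, assuming a Pub/Sub topic
# named "job-completions" exists in the project; names are illustrative.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-gcp-project", "job-completions")

def publish_done(job_tag: str, table: str) -> None:
    """Tell the orchestrator this job finished; attributes carry the job identity."""
    future = publisher.publish(topic_path, data=b"done", job_tag=job_tag, table=table)
    future.result()  # optional: block until Pub/Sub acknowledges the publish

publish_done("manifest-42-step-3", "design.orders")
```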
Built-in validation and "stage gates" ensure that only valid changes are pushed to the production context, acting as a quality control filter that prevents broken pipelines before they even run. Furthermore, its self-healing nature means that common changes, like adding a column, are automatically handled, dramatically reducing manual firefighting and boosting trust in the data. This pattern effectively shifts the burden of dependency management from the data engineer to the automated system.
Real-World Example
Imagine a data engineer adds a new column to a table or renames an existing one. In older systems, this might require manual updates to numerous DAGs, leading to errors and delays. With Dynamic Data Flow Orchestration, the engineer simply updates the rule in the context database. The next time data arrives, or a scheduled execution occurs, the system detects this change via its validation checks. It then automatically recreates the affected table, rolls back its watermark, reloads the data, and revalidates all downstream rules. The data flows seamlessly adapt to the new structure, often handling 99% of such changes without any manual intervention, ensuring the data remains fresh and pipelines keep running smoothly.
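A simplified sketch of that self-healing sequence follows. The helper functions are stubs standing in for the real rebuild, watermark, reload, and validation steps, so every name here is hypothetical.

```python
# Hedged sketch of self-healing after an upstream change (e.g. a new column).
# All functions are illustrative stubs, not the actual AgileData implementation.
def recreate_table(table: str) -> None:
    print(f"recreating {table} with the new schema")

def roll_back_watermark(table: str) -> None:
    print(f"rolling back watermark for {table}")

def reload_data(table: str) -> None:
    print(f"reloading {table}")

def revalidate_rules(table: str) -> bool:
    return True  # assume the downstream rule still validates against the change

def raise_alert(message: str) -> None:
    print(f"ALERT: {message}")

def self_heal(changed_table: str, downstream: dict[str, list[str]]) -> None:
    """Rebuild the changed table, then cascade validation down the manifest."""
    recreate_table(changed_table)
    roll_back_watermark(changed_table)
    reload_data(changed_table)
    for dependent in downstream.get(changed_table, []):
        if revalidate_rules(dependent):
            self_heal(dependent, downstream)  # keep flowing the change downstream
        else:
            # the rare change that cannot be resolved automatically raises an alert
            raise_alert(f"{dependent}: upstream change needs a manual review")

self_heal("design.orders", {"design.orders": ["consume.customer_orders"]})
```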
Anti-Patterns or Gotchas
Streaming Data: This pattern is primarily designed for microbatching (data refresh latency of 15 minutes or more). For true real-time, row-by-row streaming, this pattern's natural breakdown into nodes and links could create bottlenecks. A different approach, like submitting the entire end-to-end transformation as a single code stream, or a "two-speed pipeline" (similar to a Lambda Architecture), might be required for lower latency streaming needs.
Skipping Context Definition: The system relies entirely on the context layer. Trying to create transformation code, tables, or schedules without first defining them in the context will not work, as it forces all work to be based on the latest, validated version.
Tips for Adoption
Prioritise Context Definition: Ensure all data rules, logic, and dependencies are meticulously captured in the context database, as this is the single source of truth for dynamic flow generation and lineage.
Leverage Validation: Implement robust validation mechanisms for any changes to the context layer, so invalid configurations never reach production and break pipelines. This includes integrating practices like peer review, Git-based pull request (PR) review, and data quality tests before context updates become active (a minimal stage-gate sketch follows this list).
Embrace Serverless: Utilise cloud services like Google Cloud's BigQuery, Pub/Sub, Cloud Functions, and Cloud Run to fully benefit from the low-cost, fire-and-forget execution model.
Focus on Supporting Patterns: While the core orchestration pattern is robust, ensure that supporting patterns (like automated rebuilding, deployment, and destroy models for safe self-healing) are also "bulletproof" to avoid issues.
Start with Micro-Batches: Begin by applying this pattern to workloads that fit the micro-batching use case (refresh latency of 15 minutes or more) to build familiarity and confidence.
Scalability Mindset: The pattern offers headroom for scaling (horizontal/vertical scaling of BigQuery instances, chunking data flows) if performance or cost issues arise, so be prepared to iterate.
Trust the Automation: Allow the system to self-heal. It can handle 99% of common changes automatically, reducing the need for manual intervention.
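As a minimal sketch of such a stage gate, the snippet below keeps a change as a draft until it passes validation and only then promotes it to the active production record. The field names, statuses, and validation check are assumptions for illustration.

```python
# Hedged sketch of a context stage gate: drafts are only promoted once valid.
from dataclasses import dataclass

@dataclass
class ContextRecord:
    rule_id: str
    transform_sql: str
    status: str = "draft"  # new changes land as draft, never straight to active

def validate(record: ContextRecord) -> bool:
    """Placeholder for peer review, pull-request checks, and data quality tests."""
    return bool(record.transform_sql.strip())

def promote(record: ContextRecord) -> ContextRecord:
    """Only a valid draft replaces the current production (active) record."""
    if record.status == "draft" and validate(record):
        record.status = "active"
    return record

change = ContextRecord("consume.daily_sales", "SELECT order_date, SUM(total) FROM ...")
print(promote(change).status)  # "active" only if validation passed
```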
Related Patterns
Context Driven Development: This pattern is a direct application of the broader "context driven" principle, where all logic is dynamically generated from a central repository.
Pub/Sub Messaging: Utilised as the core communication mechanism between processing steps, enabling the fire-and-forget execution and decoupled architecture.
Layered Data Architecture: Operates within a typical layered data architecture (e.g., History, Design, Consume layers) common in modern data platforms.
Lambda Architecture: A hybrid approach (streaming + batching) that might be adopted when the anti-pattern of real-time streaming is encountered, augmenting this pattern.
Data Validation and Quality Gates: Integral to ensuring that changes to the context don't break downstream processes, encompassing peer review, Git PRs, and data quality tests.
Deploy and Destroy Infrastructure: The underlying cloud infrastructure model (e.g., Cloud Functions, Cloud Run) that allows resources to be provisioned only for the duration of a job, contributing to cost efficiency.
Press Release Template
Capability Name
Dynamic Data Flow Orchestration
Headline
New Dynamic Data Flow Orchestration Ensures Self-Healing, Cost-Effective, and Always Up-to-Date Data Pipelines for Data Teams
Introduction
The Data Platform team is thrilled to announce the launch of our new Dynamic Data Flow Orchestration capability. This revolutionary approach automates the management of data pipelines, dynamically adapting to changes in data rules and dependencies. It’s designed for data engineers and platform users, ensuring data flows are always current, reliable, and efficiently processed.
Problem
“As a data engineer, I used to dread making changes to data pipelines. Things would constantly get out of sync, and I’d spend hours manually fixing failures just because a dependency changed or a new column was added. It was hard to trust that what was running was the freshest version of our data flows.”
Solution
Our new Dynamic Data Flow Orchestration leverages a context database to store all data rules and logic. When new data arrives, or a scheduled trigger occurs, the system dynamically generates a manifest on-the-fly, identifying all downstream dependencies and launching jobs in parallel. This means data flows self-heal and adapt automatically to changes like new rules or renamed tables, eliminating manual syncing and reducing failures.
By using Pub/Sub messaging and serverless infrastructure like BigQuery and Cloud Run, pipelines run with very little latency and significantly lower costs, as there’s no constant orchestration server running. Validation mechanisms ensure only valid changes update the production context, preventing breaks. The system will even automatically rebuild and refresh tables downstream if upstream changes are detected, generally taking care of 99% of common issues without manual intervention.
Data Platform Product Manager
“With Dynamic Data Flow Orchestration, we’ve fundamentally improved the maintainability and reliability of our data pipelines. The self-healing nature and always-up-to-date lineage significantly boost trust in our data, allowing our engineers to focus on value creation rather than constant firefighting.”
Data Platform User
“I love that I don’t have to worry if the data I’m using is fresh or if a pipeline has broken behind the scenes. This system just works! I can add a new rule or even rename a table, and the data flows seamlessly adapt without me having to lift a finger.”
Get Started
This capability is active across our data platform today, ensuring reliable and fresh data for all users. To learn more about how Dynamic Data Flow Orchestration ensures always-fresh and robust data, please contact your data platform product manager or visit agiledata.io for documentation.
AgileData Podcast Episode
AgileData Podcast Episode Transcript
Shane: Welcome to the Agile Data Podcast. I'm Shane Gibson.
Nigel: And I'm Nigel Vining.
Shane: Hey, Nigel. Today we're trying a new thing. We are going to start a series where we effectively describe one pattern in each episode that we use, and I'm thinking of it as pages in our pattern library. The first one we wanna talk about is the way that we orchestrate the data flows in our way of working.
So why don't you start off by giving me an overview of the patterns that we use to engineer that particular task.
Nigel: So we do it a little bit differently. After many years of background using the likes of Composer and Airflow to deploy DAGs and create run manifests, we thought we would do something a little bit different and use a context-driven, generate-the-manifest-on-the-fly approach. So effectively what we do is we create our data rules for transforming and loading data, we put them into our context database, and then what happens is the first piece of data that turns up effectively triggers a refresh of that data table. And then this is where it changes from the usual.
So as soon as that table's loaded, we then pause and we do another lookup of the config, and we basically say, what are the downstream objects that hang off this table now that it's loaded? From that, we build a manifest on the fly. So we effectively say, great, we need to load this table, it depends on this one.
So we will load it first. So we build that manifest and then we seed off all those jobs. We basically set 'em off all at the same time. We attach a tag to each of them and then we wait for them to finish. They all run in parallel, and as each one finishes, it checks that manifest and goes, am I the last?
Nope. Am I the last? Nope. Until we effectively get the driving table, which would start the next flow. So once that table comes back and says, yep, loaded, we do another config lookup and we start again, we create another manifest and we keep repeating this until we've effectively got nothing left in the manifest.
The beauty of this approach is that in our early days we did a lot of rebuilds and changing context. Shane would introduce new context while jobs were already running. He'd delete tables or he'd recreate them under different names. So the whole time what was happening is we were getting lots of failures, until we went to this pattern of context and manifest on demand, where the whole time it was always checking and saying, oh, a new piece of config has turned up, I've got another table to load, and it would effectively slot it into the mix. It was all quite seamless and painless for me. So that's our pattern in a nutshell.
Shane: And the reason I like it is because if we think about each of our blobs of transformation code as being a rule, that's the term we use, I can go and add a new rule.
So I can say there's a table sitting in history, think raw bronze. I need to do something nasty to the data to bring it through into our design layer. And I can go and add a rule, I can add a small piece of logic that takes that data and moves it into our design layer. And as soon as I do that, the next time there's a scheduled execution or an execution that's triggered for a load, the manifest, the DAG, the directed graph for that flow, will dynamically rebuild itself.
And so it just self-heals. And then, as you said, when I break my own rules for naming conventions, I can go and rename things, and again the data flows self-heal themselves. So effectively I can do the work I wanna do, and I don't need to care about what the directed graph for those data flows is. I don't need to care about the dependencies, because the context engine, this pattern we use for orchestration of these data flows, is doing it for me, right? I just don't care. And importantly, it's doing it for you, 'cause you don't care either, right? Effectively we can make any change we want and the machine just takes care of it. So I think for me, that was the biggest value.
But think about the alternatives. So one of the core patterns we use, which you can talk about in another podcast, is this idea of context driven. So everything we do, every piece of logic is stored in a repository, and everything that needs to execute is hydrated or generated dynamically from that repository.
So what that means is when we make a change, we can dynamically regenerate those data flows, versus the old way of doing it, which is we'd hold that context in one place and we'd have a physical instantiation of it as a set of DAGs, and now I have to sync it afterwards. Like, I wanna change the rules. Okay, now I need to change the DAG.
And those things used to always get outta sync. We're kinda like, that's not the way it's meant to work. Oh shit, there's this piece of code stored as a stored procedure somewhere as part of that DAG. Well, the DAGs were a stored procedure, and that syncing problem used to be a real problem in the old patterns.
Nigel: Yeah. And even today, I occasionally struggle with consistent DAG deployment. If there's multiple people, engineers, developers, making changes in that space, you can effectively get a little bit of drift between what should be running and what's associated with other things, 'cause there's a little bit of change going on as commits are happening. Whereas this way we effectively leave it to runtime, and at runtime, as an object loads, we look at what's next. So it's always the freshest piece of lineage or manifest that you can have for any object. It's always up to date, because it's always doing a query back to our context layer to say, hey, what's next? And then it starts the next one, and that carries on until it's finished.
So I like it for that because it takes care of itself.
Shane: Also, all of the maps that we generate, so our data catalog, our lineage graphs, everything else, are driven off that context, versus the previous generation where we would've treated all the code and all the schemas and all that structure and all those schedules as exhaust, and we would've sucked it into the catalog to try and render it.
We flipped that model, which is you define the things in context, and then everything else is generated off it. It's always up to date because I cannot do a piece of work without the context being entered. I cannot create transformation code. I cannot create the idea of a table. I cannot create the idea of a schedule. I cannot orchestrate dependencies across tables without first creating that piece of context. So again, that forces us to make sure that anything I'm looking at is the latest version, because there is no choice.
Nigel: Yeah. Yep. And I guess the natural question that others in the profession would ask is, what about when a change is made that breaks something?
And this is an interesting one, because effectively when any changes are made, we validate them, and we don't allow updates to the context layer unless it's a valid update. So every change that's made won't update the latest production context until it's valid, so we effectively stop anything getting into the context layer that's gonna cause a problem.
It stays as a draft, effectively, until it's validated. So when the manifest generates itself, it's always looking for the active record, and it's fine for a draft to be floating around. It won't touch it until it's valid, and then it flips over to replace the current production record.
Shane: And then there's a bunch of attributes for that context.
So when I define a table or a piece of code, I can define that it's what we call autosync, that it will execute that manifest, that data flow, whenever a row turns up in one of its dependencies. Salesforce sends us some records, it hits that history tile or table, so raw bronze. Then if it's set to autosync, it will then trigger the hydration of that manifest and run it. But if I flip it and say that for that piece of code or that table it's manual sync or custom sync, and I say it's only gonna run at 7:00 AM on a Monday, then that's when it triggers the hydration of that manifest and running that code. So there's a bunch of attributes that are actually really important when you start moving all your data flow stuff back to a context layer.
Nigel: Yeah, absolutely, and that's a good point. You touch on effectively allowing that orchestration to start based on a record turning up, or a file turning up, or a schedule that's been attached to it that says, run it daily at 7:00 AM, which is quite a common pattern for pipelines. But 99% of our pipelines tend to start when something arrives, a file, some data, which effectively triggers the pipeline to start and pull that data all the way through the layers.
Shane: And then the other pattern that we use a lot in there is this idea of pub/sub. As you said, it's fire and forget. It figures out all the steps, it fires them off into effectively a queue, it waits for the queue to run. So again, as you always do with patterns, you're using other patterns to make this pattern really efficient and effective.
Nigel: Yeah, so the drivers, as I said at the start, the I'm done, are you done, are we ready to run something? Those are all pub/sub messages. So we have a standard pub/sub message which basically says, I'm done, and that's attached to everything in the execution layer. So anytime a table's refreshed, we get an I'm-done message off the end of that. And then we basically look for those messages as we walk through our layers. So that's, I guess, our wrapper pattern, which holds it all together. Has something happened? Yep, cool, trigger something else. And we just use messages.
Shane: And then what's our average runtime? It's always dependent, we use BigQuery under the covers and Google Pub/Sub, and it depends on how much data there is. But from what you are seeing across all the different customers we've got and the volumes of data, what's an actual execution time for a standard data flow to go from, you know, data arriving in history all the way to data being consumable in that final layer?
Nigel: It is generally quite fast. We have very little latency because there's no overhead. Unlike an Airflow example, there's no overhead of starting up, I guess, execution pods, Kubernetes pods, to run a task, shut down, then start up and run another task. Because we are effectively just using a message, we can tell BigQuery to run something, in parallel as much as possible. So our execution times are generally very fast, 'cause there's zero overhead. We're basically just telling BigQuery, here's a list of things I want you to do, and letting it do it.
Shane: And our costs are low because we're not running a container, Kubernetes, EC2, anything like that, permanently to run the orchestration or the scheduling engine, 'cause we're using Pub/Sub and Cloud Functions, and moving to Cloud Run, and it's doing all that magic for us. So it turns itself on, the code's deployed, it's run, it's then destroyed, and we only get charged for the period of time that it's actually running, rather than having a 24-by-7 service sitting there just to schedule and orchestrate our code.
Nigel: Yeah, that's exactly right. So we have no concept of an orchestration server running in the background. We are effectively creating the BigQuery jobs on the fly, letting it run them, and waiting for that I've-finished message to come at the end of it. So there is never a service that is physically waiting for a job to run and costing us money. Basically BigQuery is running jobs, and when it finishes, we get notified and we automatically start the next job. So it makes it very economical to run, and we can run hundreds of jobs, thousands of messages, but it basically costs us nothing because there's nothing running to orchestrate that. It's just messages bouncing around, starting jobs, waiting for jobs to finish. Very effective.
Shane: But there is an anti-pattern for this, because we only use this for what we call micro-batching, which we think about as anything above 15 minutes of refresh latency. This works for that. Anything below that, we start hitting a problem, and it's not a problem with the pattern around this orchestration of data flows. It's a problem of the dependencies of everything else we've built. So if we think about the fact that we have a layered data architecture, history, design, consume, we use tables under the covers in BigQuery, so there's a dependency on who's writing to them. And we know we can orchestrate this stuff very quickly, but if we move to a streaming model, if we wanted to pick up one row and push it all the way through this, we would actually end up slightly rearchitecting the way we deploy the code to pub/sub. We wouldn't actually fire off 10 tasks and wait for them to daisy-chain and orchestrate themselves and have dependencies. What we'd probably do is look at the entire piece of code for the end-to-end orchestration and then submit that to run as a single code stream in BigQuery, end to end as one task, wouldn't we? Because the pattern we've got naturally breaks the task down into nodes and links. It says, run these, and when they're done, run these. And then if we keep firing it with volumes of streaming data, we will actually end up having a bottleneck, won't we? So that's an anti-pattern. We would use the same context approach, we would still hydrate and fire, but we would slightly change the way the pattern is deployed at runtime to handle that streaming volume.
Nigel: So in that case, we would let data stream constantly into the landing area, which is what we do in some projects. And then effectively, I guess, our compromise around that one is that we would micro-batch at 15-minute intervals from that streamed data and move it through in micro-batches. Otherwise, yes, we would architect to stream through the layers, and we would probably run a two-speed pipeline. In that case, we would micro-batch it at, say, maybe 15-minute intervals, but then what we would also do is stream data around the outside directly to the consume layer for the required attributes that need to be consumed in real time. And then we would catch up the rest of the layers with our slower data through the normal pipeline, and then we get the best of both,
Shane: effectively adopting a Lambda architecture, right? Yes. But the majority of our patterns would just survive for this orchestration of data flows. We know that we actually have to change or tailor or augment some of them when we hit that anti-pattern for the way that we're currently running it and currently designing it. So apart from a little bit of tweaking for streaming, is there any reason you wouldn't use this pattern? Any use cases where you think this dynamic hydration of the data flows, this creating a manifest and pushing it off to fire and forget and run, is there any reason that you wouldn't use this pattern?
Nigel: No. We've seen it run day in, day out for coming up five years now. The original architecture and code that was deployed back in the first year is largely exactly the same. We have not changed the core pattern, so it's run happily across multiple customer use cases for five years. I think that's a fairly good testament that it's a pattern that works.
Shane: I think the other thing is, if we needed to, we can scale horizontally or scale vertically. If we ever do get to a situation where we need to reduce the latency of the end-to-end process, we can scale up the BigQuery instances and make them execute faster. We could chunk the data flows down into more domain-orientated buckets and then figure out how to parallelise the runs of them. So again, we've still got lots of headroom for how we can iterate this pattern if we ever strike a cost or a performance or a complexity issue. So we know we're always gonna be iterating it, but like you said, it's been fairly good and bulletproof. I think the other thing is you've spent quite a lot of time on what we call the blast radius. As you started building out that core pattern as code, and as I started doing the things that I would expect to do as a non-data-engineer to data, I should be able to do this, and you sucked your teeth going, oh my God, and you've built those safety nets around that pattern. So if you talk about the fact that I can create a piece of context, but it's not actually treated as part of that production data flow, it's not executable when it runs, until it meets all these checks. We're effectively bringing in the peer review process, the Git review, the pull requests, the data quality tests, and all of that's happening before any of those things allow the data flow to actually execute as a new version.
Nigel: Yeah, and the other thing that's wrapping this is that in the context layer we've got a couple of flags that apply to the context. So if something has changed in the context layer and successfully deployed into production state, we effectively put a flag on it that says, this config has changed, it's gonna require changes to downstream objects to flow this change through. For example, you've added a new column into a table. Now you expect that column to flow through and hydrate all the layers downstream of it. So effectively we tag that object and say, hey, this has changed, you're gonna need to recreate it and you're gonna need to check all the things downstream of it to make sure that they're gonna be okay.
So when we come to run that manifest, we run it as usual, but effectively we hit an object and say, oh, there's a flag on this that says it's changed. So we identify the change, we usually recreate the table and we roll the watermark back, we reload that table, that's cool. We get to the next one and we say, great, something upstream of me has changed, I need to make sure how that's gonna work. So what we actually do is another validation of that context to make sure it's a hundred percent gonna work with that change. And at that point, if we determine, oops, this is gonna have to be rebuilt as well, we do that, and we keep doing this all the way to the end of the manifest. But we also stop the manifest if for some reason we've actually caused something that's not gonna keep working, and we stop and we raise an alert and say, hey, this context is now gonna be invalidated, because something two or three layers upstream has changed enough that I can't fix it for you automatically. You're gonna have to look at the change and then basically decide what you need to do.
So that was quite a breakthrough, 'cause we used to spend a lot of time manually trying to fix pipelines that stopped for small things like a comma being added or removed, whereas 99% of the time we can handle that automatically just by doing a full rebuild and refresh of the table and then revalidating the rules. And generally everything takes care of itself, 'cause we wanted it to be self-healing, 'cause there's nothing worse than pipelines that fall over just because you've added a column, whereas something like that's easy, a column can just be added, populated, and away you go. So that was quite a breakthrough, basically putting those extra flags in the context.
Shane: Yeah. And effectively they're like stage gates. It's basically every time it's validating, am I still OK to run the next step? Yep. And then in the beginning, like you said, we had lots of times where it stopped. I would do something and then it would say, oh, actually you're not meant to be able to do that. But as each one of those happened, we'd go in and we'd either fix the way I created that context so I could no longer create it in an invalid way, or we'd automate the rebuilding, the deploy-and-destroy model for what it was trying to rebuild, so it could rebuild itself safely. So again, we didn't change the core orchestration pattern, we changed all the other things around it to make sure that pattern was bulletproof and ran, and we didn't get called out at two o'clock in the morning because our data flow had stopped, or at seven o'clock to have a look at it and have a whole lot of catch-up work to rebuild it. So I think those supporting patterns were as important as this core orchestration of the data flows, by them being dynamic and using manifests and pub/sub.
Nigel: Yeah, that's exactly right. It makes it all quite robust, and as I said, generally the pipelines run very accurately without a lot of attention.
Shane: Excellent.
Alright, well I think that one's done. So, orchestrating data flows in a dynamic way using manifests and pub/sub, right? Another one next week. But for now, I hope everybody has a simply magical day.