Organized by:

Global AI Jacksonville

Overview

This session explores how to use AI to move data engineering beyond manual implementation. We walk through a process that positions the AI not as a syntax generator but as a cognitive partner in the engineering lifecycle. We examine the architectural shift required to turn raw data lake assets into high-performance, orchestrated systems, focusing on the collaboration between human intent and agentic design.

Live presentation link to YouTube


Agenda

  • Data Lake Discovery: The strategy of deploying discovery agents to autonomously identify patterns and define the foundation of the data grain.
  • Governance & Requirements: Establishing the strategic guardrails and requirements that empower an "Architect" agent to maintain system consistency.
  • Logical Design for the Staging Area: A process dive into using AI to propose and build a logical abstraction layer, separating raw sources from core business logic.
  • Designing and Implementing the Physical Model: How agents navigate the transition to physical storage, building Dimension and Fact tables while maintaining referential integrity.
  • Incremental Update Strategy: Developing a sustainable approach to support continuous data feeds from the data lake using idempotent, self-healing processes.
  • Pipeline Design and Orchestration: The coordination of complex tasks to manage the relationship between dimensions and facts, ensuring strict lineage and integrated observability.
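
To make the incremental update and orchestration items above more concrete, here is a minimal sketch of an idempotent batch load that writes dimensions before facts and rejects fact rows that would break referential integrity. It is an illustration only and is not taken from the session or the repository: the `Warehouse` class, the `load_batch` function, the table names, and the sample rows are all hypothetical, and an in-memory dictionary stands in for real storage.

```python
from dataclasses import dataclass, field


@dataclass
class Warehouse:
    """Hypothetical in-memory stand-in for one dimension and one fact table."""
    dim_station: dict = field(default_factory=dict)     # station_id -> station name
    fact_turnstile: dict = field(default_factory=dict)  # (station_id, ts) -> entries


def load_batch(wh: Warehouse, batch: list[dict]) -> list[dict]:
    """Apply one batch idempotently; return the rows rejected for integrity reasons."""
    rejected = []

    # Step 1: upsert dimensions first so that facts have something to reference.
    for row in batch:
        if row.get("station_name"):
            wh.dim_station[row["station_id"]] = row["station_name"]

    # Step 2: upsert facts keyed by their grain (station_id, ts).
    # Writing by key means a replayed batch overwrites rows instead of duplicating them.
    for row in batch:
        if row["station_id"] not in wh.dim_station:
            rejected.append(row)  # no matching dimension row: keep it out of the fact table
            continue
        wh.fact_turnstile[(row["station_id"], row["ts"])] = row["entries"]

    return rejected


# Hypothetical sample feed: the second row has no known station.
batch = [
    {"station_id": "A01", "station_name": "Canal St", "ts": "2026-04-01T08:00", "entries": 120},
    {"station_id": "B02", "station_name": None, "ts": "2026-04-01T08:00", "entries": 45},
]

wh = Warehouse()
first_rejects = load_batch(wh, batch)
second_rejects = load_batch(wh, batch)  # replaying the same batch changes nothing
```

Because both tables are written by key, running the same batch twice leaves the warehouse in the same state, which is the property that lets a failed run simply be retried. The rejected rows are returned rather than dropped silently so that an orchestration layer could log or reprocess them.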

Why Attend?

  • Elevate Your Role: Learn how to shift your focus from writing repetitive code to defining high-level architectural intent and performing strategic design reviews.
  • Master Systemic Reasoning: Understand how to leverage AI to solve complex engineering challenges like referential integrity and dependency management at scale.
  • Build for Operations: Move toward a model where system health and observability are built-in byproducts of the design process, not afterthoughts.

Who is this for?

  • Data Engineers & Architects: Looking to evolve their workflow from manual scripting to high-level systemic design.
  • Engineering Leaders: Interested in the ROI and reliability of integrating autonomous agents into the development lifecycle.
  • AI Enthusiasts: Wanting to see a practical, "beyond-the-chatbot" application of agentic reasoning in a production environment.
  • Technical Decision Makers: Seeking a strategy for maintaining governance and referential integrity in an AI-augmented organization.

🔗 Having trouble with the video player?

If the embedded livestream does not load, you can join directly on YouTube:

Direct Link: https://www.youtube.com/live/opelf_XJ8Js


Link to the presentation material:

https://www.ozkary.com/2026/04/architecting-an-agentic-data-pipeline-from-data-lake-discovery-to-managed-orchestration.html

Link to the GitHub Repo:

https://github.com/ozkary/data-engineering-mta-turnstile/tree/main/ai-agents

🙌 Support the project

If you enjoy the session, please consider:

  • Joining the YouTube channel to follow future livestreams
  • Starring the GitHub repository to support the open‑source work

Your support helps keep these community sessions going.

Topics

Agent Framework, Data Engineering, AI, Agents, Data, Databricks, Cloud Computing, Big Data