July 30, 2025

Quick Insights to Start Your Week




Welcome to this week’s AI/ML huddle – your go-to source for the latest trends, industry insights, and tools shaping the field. Let’s dive in! 🔥



How Graph Thinking Empowers Agentic AI

Agentic AI systems are designed to adapt to new situations without constant human oversight, offering transformative potential in healthcare, supply chains, robotics, and autonomous vehicles. Neuro-Symbolic Knowledge Graphs (NSKGs) are pivotal in enabling this autonomy by combining structured reasoning, contextual understanding, and long-term memory. These systems blend thinking and action, allowing intelligent agents to reason and respond dynamically.

The Neuro-Symbolic Foundation of NSKGs

NSKGs evolved from traditional knowledge graphs, storing structured data about entities (e.g., patients, aircraft) and their relationships. Their core functions include classification, anomaly detection, and predictive analytics. To achieve this, NSKGs integrate three key methodologies:

  • First-Order Logic (FOL): A reliable framework for explainable inference, essential for understanding entities and behaviors.
  • Machine Learning: Techniques like RNNs and LSTMs analyze sequential data to predict future events, excelling at temporal pattern recognition.
  • Generative AI (GenAI): Handles unstructured data, which accounts for over 70% of the content feeding real-world knowledge graphs, through tasks like entity extraction, retrieval-augmented generation (RAG), and ontology creation.

While GenAI isn’t suited for complex logic tasks, it enhances machine learning pipelines and streamlines data workflows.
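
To make the symbolic half concrete, here is a purely illustrative sketch (the entities and the rule are made up, not drawn from the article) of how facts stored as triples can drive an explainable, FOL-style inference:

```python
# Illustrative only: facts stored as (subject, predicate, object) triples plus
# one first-order-logic-style rule whose conclusion is fully traceable.
triples = {
    ("patient_42", "has_condition", "atrial_fibrillation"),
    ("patient_42", "takes_medication", "warfarin"),
    ("atrial_fibrillation", "is_a", "cardiac_condition"),
}

def holds(s, p, o):
    """True if the fact (s, p, o) is asserted in the graph."""
    return (s, p, o) in triples

def needs_cardiology_review(patient):
    """Rule: has_condition(X, C) AND is_a(C, cardiac_condition) -> review(X)."""
    conditions = [o for (s, p, o) in triples
                  if s == patient and p == "has_condition"]
    return any(holds(c, "is_a", "cardiac_condition") for c in conditions)

print(needs_cardiology_review("patient_42"))  # True, with a traceable chain of facts
```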

Why NSKGs and Agentic AI Work Together

Agentic AI systems—designed to perceive, decide, and act—benefit immensely from NSKGs. The article’s author, an experimental psychologist and cognitive scientist, argues that the two complement each other seamlessly. For instance:

  • Standardized Communication: NSKGs enable agents to share semantic data via RDF, OWL, and knowledge graphs, aligning with Gartner’s vision of AI-driven software implementation.
  • Long-Term Memory: Knowledge graphs act as repositories for historical decisions and outcomes, enabling predictive analytics through RNNs and GenAI.
  • Orchestration: Central knowledge graphs optimize multi-agent collaboration by analyzing decision rationales and outcomes.
  • Learning & Adaptation: Structured data allows agents to refine behaviors via LLM training, parameter updates, or rule refinements.

Critic agents play an equally vital role: because the knowledge graph records each decision’s rationale, they can spot errors and refine agent performance over time.
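
As a rough illustration of the long-term-memory and standardized-communication points above, the sketch below (using the rdflib library; the namespace and predicates are hypothetical) records an agent’s decision, rationale, and outcome as RDF triples that a critic or orchestrator could later query:

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical namespace and predicates; rdflib is used only to show the idea of
# serializing agent decisions as RDF for later review by critics/orchestrators.
EX = Namespace("http://example.org/agents/")

g = Graph()
g.bind("ex", EX)

decision = EX["decision_001"]
g.add((decision, RDF.type, EX.Decision))
g.add((decision, EX.madeBy, EX.routing_agent))
g.add((decision, EX.rationale, Literal("Chose route B to avoid forecast congestion")))
g.add((decision, EX.outcome, Literal("Arrived 12 minutes late")))

# Turtle output that any RDF/OWL-aware agent (or critic) can consume
print(g.serialize(format="turtle"))
```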

Conclusion

NSKGs provide the structured, semantically rich framework needed for Agentic AI to thrive. By integrating logic, machine learning, and GenAI, these systems enable agents to adapt, learn, and act autonomously in complex environments.

Read more


How to Train a Linear Regression Model with dbt and BigFrames

dbt is a framework for transforming data in modern data warehouses, using modular SQL or Python. It empowers data teams to build analytics code collaboratively, leveraging software engineering best practices like version control, modularity, and testing. BigQuery DataFrames (BigFrames), an open-source Python library by Google, scales data processing by translating pandas and scikit-learn APIs into BigQuery SQL. Together, these tools enable seamless integration for large-scale machine learning tasks.

By combining dbt with BigFrames via the dbt-bigquery adapter (dbt-BigFrames), users gain the ability to run Python models in GCP projects using Colab Enterprise notebook executors. This setup executes BigFrames code, which is transpiled into BigQuery SQL, enabling efficient processing of massive datasets.

To demonstrate, we’ll train a linear regression model to predict atmospheric ozone levels using the epa_historical_air_quality dataset from BigQuery Public Data. This example highlights how dbt’s structure and orchestration capabilities streamline model development, ensuring maintainability and scalability.
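
To make that concrete, here is a minimal sketch of what the data-preparation step could look like as a dbt Python model; the source, table, and column names are assumptions about the EPA public dataset, and the config assumes a dbt-bigquery version with BigFrames support:

```python
# models/ozone_features.py -- sketch of a dbt Python model run via the BigFrames
# submission method. Source/table/column names are assumptions, not confirmed.

def model(dbt, session):
    dbt.config(
        materialized="table",
        submission_method="bigframes",
    )
    # Under BigFrames, the DataFrame operations below are transpiled to BigQuery
    # SQL and run in the warehouse rather than in the notebook executor's memory.
    df = dbt.source("epa_historical_air_quality", "o3_daily_summary")
    df = df[["date_local", "state_name", "aqi", "arithmetic_mean"]].dropna()
    return df
```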

Key Stages of the Project

  • Data Preparation: Leverage dbt’s modular Python models to clean and transform data.
  • Model Training: Use BigFrames to scale linear regression training on large datasets.
  • Deployment: Integrate models into production workflows with dbt’s CI/CD and testing features.

This approach demonstrates how dbt and BigFrames can be used together for scalable, production-ready machine learning pipelines.
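
Viewed outside of dbt, the core training step with BigFrames might look roughly like this (a sketch only; the table and column names are assumptions about the public dataset’s schema):

```python
import bigframes.pandas as bpd
from bigframes.ml.linear_model import LinearRegression

# Read the public EPA ozone data; column names are assumed, not verified.
df = bpd.read_gbq("bigquery-public-data.epa_historical_air_quality.o3_daily_summary")
df = df[["aqi", "arithmetic_mean"]].dropna()

X = df[["aqi"]]              # feature: daily air-quality index
y = df["arithmetic_mean"]    # target: daily mean ozone concentration

model = LinearRegression()
model.fit(X, y)              # compiles to BigQuery ML, so training scales with the warehouse
print(model.predict(X).head())
```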

Read more


A Deep Dive into Image Embeddings and Vector Search with BigQuery on Google Cloud

In the fast-paced world of e-commerce, AI is transforming how we shop. Imagine finding that perfect dress you saw on social media—just by uploading a photo. This is where image embeddings and vector search powered by Google Cloud’s BigQuery come into play. By converting images into numerical vectors, businesses can now search for visually similar products, making shopping smarter and more intuitive.

What Are Image Embeddings?

Image embeddings are numerical representations of images in a high-dimensional space. Similar images, like a blue ball gown and a navy blue dress, generate vectors that are “close” to each other. This allows for advanced comparisons beyond simple metadata. For example, a search for “Blue dress” can return visually matching items, even if the exact keyword isn’t present.
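
As a toy illustration of what “close” means here (the vectors below are made up; real multimodal embeddings have hundreds of dimensions), embedding vectors can be compared with cosine distance:

```python
import numpy as np

def cosine_distance(a, b):
    """1 minus cosine similarity: smaller values mean 'closer', i.e. more similar."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional vectors standing in for image embeddings
blue_gown  = [0.8, 0.1, 0.3, 0.0]
navy_dress = [0.7, 0.2, 0.3, 0.1]
red_shoe   = [0.1, 0.9, 0.0, 0.4]

print(cosine_distance(blue_gown, navy_dress))  # small distance -> visually similar
print(cosine_distance(blue_gown, red_shoe))    # larger distance -> dissimilar
```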

How Does This Work?

The process involves three key steps:

  • Model Creation: A model named image_embeddings_model is built using the multimodal_embedding@001 endpoint in the image_embedding dataset.
  • External Table Setup: An external table external_images_table is created in BigQuery to reference images stored in a Google Cloud Storage bucket.
  • Embedding Generation: The ML.GENERATE_EMBEDDING function generates embeddings for dress images, which are stored in the dress_embeddings table.
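
Putting the three steps together, a hedged sketch of the corresponding BigQuery SQL, run from Python, might look like this; the connection ID and Cloud Storage path are placeholders rather than values from the article:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Sketch of the three steps; model, dataset, and table names follow the write-up
# above, while the connection ID and bucket path are hypothetical placeholders.
steps = [
    # 1. Remote embedding model backed by a Vertex AI endpoint
    """
    CREATE OR REPLACE MODEL `image_embedding.image_embeddings_model`
      REMOTE WITH CONNECTION `us.my_vertex_connection`
      OPTIONS (ENDPOINT = 'multimodal_embedding@001');
    """,
    # 2. Object (external) table over images in a Cloud Storage bucket
    """
    CREATE OR REPLACE EXTERNAL TABLE `image_embedding.external_images_table`
      WITH CONNECTION `us.my_vertex_connection`
      OPTIONS (object_metadata = 'SIMPLE', uris = ['gs://my-bucket/dresses/*']);
    """,
    # 3. Generate one embedding per image and persist the results
    """
    CREATE OR REPLACE TABLE `image_embedding.dress_embeddings` AS
    SELECT * FROM ML.GENERATE_EMBEDDING(
      MODEL `image_embedding.image_embeddings_model`,
      TABLE `image_embedding.external_images_table`);
    """,
]

for sql in steps:
    client.query(sql).result()  # block until each statement finishes
```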

Results: Finding the Perfect Match

Vector search enables both text and image-based queries. For instance:

  • A text search for “Blue dress” converts the query into a vector and retrieves similar results.
  • An image search uses the ML.GENERATE_EMBEDDING function to generate a vector from a test image, then performs a vector search against the dress_embeddings table.
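
A sketch of the text-query path might look like the following; the column names assume ML.GENERATE_EMBEDDING’s default output schema, and for an image query the inner SELECT would instead embed a row from the object table:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Embed the query string with the same model, then VECTOR_SEARCH against the
# stored image embeddings. Table and column names follow the write-up above.
sql = """
SELECT base.uri, distance
FROM VECTOR_SEARCH(
  TABLE `image_embedding.dress_embeddings`,
  'ml_generate_embedding_result',
  (SELECT ml_generate_embedding_result
     FROM ML.GENERATE_EMBEDDING(
       MODEL `image_embedding.image_embeddings_model`,
       (SELECT 'Blue dress' AS content))),
  top_k => 3,
  distance_type => 'COSINE')
"""
for row in client.query(sql).result():
    print(row.uri, row.distance)
```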

Query Results:

  • White dress: Distance 0.2243
  • Sky-blue dress: Distance 0.3645
  • Polka-dot dress: Distance 0.3828

These results demonstrate how image embeddings can identify visually similar items, revolutionizing search for e-commerce and content management systems.

Read more


🛠️ Tool of the Week

firebase.studio is a cloud-based, agentic development environment designed for building and prototyping AI-powered full-stack applications. It leverages Google’s generative AI model, Gemini, and Firebase services to streamline the development process. Essentially, it’s a platform that combines a code editor with AI assistance and Firebase’s backend capabilities, enabling developers to build, test, iterate, and deploy AI-driven applications in a single location.


🤯 Fun Fact of the Week

In 1980, philosopher John Searle drew the distinction between “weak” and “strong” AI: weak AI focuses on a single narrow task, while strong AI would be equivalent to full human intelligence.


Huddle Quiz 🧩


⚡ Quick Bites: Headlines You Can’t Miss!




Subscribe to this huddle for more weekly updates on AI/ML! 🚀