May 28, 2025
Quick Insights to Start Your Week
🎧 Listen to the Huddle
This audio is AI-generated. For feedback or suggestions, please click here.
Welcome to this week’s Data Engineering and Analytics huddle – your go-to source for the latest trends, industry insights, and tools shaping the industry. Let’s dive in! 🔥
In this issue:
- 🎧 Listen to the Huddle
- Finally, a Simple, Cloud-Friendly Apache Iceberg Catalog That Just Works
- What Salesforce’s $8B Acquisition of Informatica Means for Enterprise Data and AI 🎯
- Deliver Bi-Directional Integration for Oracle Autonomous Database and Databricks 🔄
- 🛠️ Tool of the Week
- 🤯 Fun Fact of the Week
- Huddle Quiz 🧩
- ⚡ Quick Bites: Headlines You Can’t Miss!
Finally, a Simple, Cloud-Friendly Apache Iceberg Catalog That Just Works
🎉 Exciting news for Apache Iceberg users! Working with Iceberg in production environments has often been a drag, thanks to the mandatory and rigid catalog system. Until now, you’ve faced two unappealing options: overly complex corporate solutions or questionable hacks no one trusts.
🌟 Introducing boring-catalog: A lightweight, open-source Iceberg catalog that works seamlessly in the cloud, without any fuss! Developed by Julien Hurault, this catalog finally delivers what Iceberg has been missing: simplicity.
Why is boring-catalog a game-changer?
- 🔌 No Infrastructure Dependency: It doesn’t need a server and doesn’t assume you’re Google or a massive corporation.
- 🚀 Quick Setup: You can have it up and running in no time! Here’s how:
- Step 1: Install the essentials
- Step 2: Initialize your Iceberg warehouse (Ensure AWS credentials are exported)
- Step 3: Manually create your table (a minor inconvenience for major gains)
- Step 4: Write data using Polars and Arrow libraries
- Step 5: Read it back to confirm everything works
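The steps above hint at why a serverless catalog can be so lightweight: at its core, a catalog is just a durable mapping from table identifiers to the location of each table's current metadata file. Here's a toy, stdlib-only sketch of that idea (this is an illustration, not boring-catalog's actual implementation, and the S3 path is made up):

```python
import json
import tempfile
from pathlib import Path

# Toy sketch: a "catalog" that is nothing more than a JSON file mapping
# table names to metadata-file locations. No server, no infrastructure.
class FileCatalog:
    def __init__(self, path: Path):
        self.path = path
        if not self.path.exists():
            self.path.write_text(json.dumps({"tables": {}}))

    def register_table(self, name: str, metadata_location: str) -> None:
        state = json.loads(self.path.read_text())
        state["tables"][name] = metadata_location
        self.path.write_text(json.dumps(state))

    def load_table(self, name: str) -> str:
        return json.loads(self.path.read_text())["tables"][name]

catalog = FileCatalog(Path(tempfile.mkdtemp()) / "catalog.json")
catalog.register_table("events", "s3://my-bucket/events/metadata/v1.metadata.json")
print(catalog.load_table("events"))  # -> the registered metadata location
```

A real Iceberg catalog also has to handle atomic metadata swaps on commit, but the point stands: the state it manages is small enough to live in a single file, which is exactly what makes a no-infrastructure catalog feasible.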
💡 Why is this important?
Apache Iceberg has lacked a simple, cloud-friendly catalog option that doesn’t require dedicated DevOps support. Existing solutions feel more like enterprise grabs than developer-friendly tools. boring-catalog gives individual engineers and smaller teams the freedom to experiment and deploy Iceberg effortlessly.
🤝 A Call for Developer-First Tooling: The fact that an independent developer managed to create a usable, cloud-native open-source Iceberg catalog highlights the need for Apache Iceberg’s community to adopt more developer-first approaches. Lessons can be learned from the Delta Lake ecosystem, which emphasizes easy onboarding and developer convenience.
Embrace boring-catalog, and let Apache Iceberg work its magic without the production headaches! 😎📈
What Salesforce’s $8B Acquisition of Informatica Means for Enterprise Data and AI 🎯
Salesforce has made a significant move into the enterprise data space with its acquisition of Informatica for an impressive $8 billion 🤑. This strategic bid aims to elevate Salesforce’s data management capabilities, enhancing its foundation for agentic AI 🚀.
Why Informatica? Informatica, founded in 1993, is a veteran in the enterprise data sector and an early pioneer in ETL (Extract, Transform, Load) processes 🔄. Over the years, it has adapted to cloud-based SaaS models and recently embraced generative AI 🔮. Just last week at Informatica World, they unveiled new AI offerings aimed at improving enterprise data management and operations 📈.
The Benefits for Salesforce 🏆 By acquiring Informatica, Salesforce seeks to bolster its trusted data foundation for deploying AI agents safely, responsibly, and at enterprise scale 💪. The integration will create a unified architecture combining:
- Data catalog: Robust metadata management
- Data integration & governance: Ensuring data quality and privacy
- Master Data Management (MDM) services
Forrester analyst Noel Yuhanna notes that this acquisition fills critical gaps in Salesforce’s data management capabilities, positioning them strongly across all aspects of modern data management 👥.
Expert Opinions 🗣️
- Hyoun Park (CEO & Chief Analyst at Amalgam Insights) believes the acquisition aligns with Salesforce’s push for robust IT management capabilities, enhancing its standing against competitors like ServiceNow and Boomi 🤝.
- Kevin Petrie (Vice President of Research at BARC) highlights that Informatica’s data management strengths, including master data management, data catalog, and security, are more advanced than MuleSoft’s offerings within Salesforce’s portfolio 🏅.
Impact on Customers 🌐 This acquisition promises substantial benefits for both Salesforce and Informatica enterprise customers 💡:
- Salesforce customers: Seamless access to all customer data, real-time insights, and accelerated agentic AI deployment for low-code, low-maintenance solutions 🔐.
- Informatica customers: A faster path to agentic AI workloads backed by the Salesforce ecosystem, enabling automated data workflows with minimal human intervention 🔄.
In summary, this acquisition represents a powerful union, promising enhanced data management and accelerated AI capabilities for both companies’ customers 🎯.
Deliver Bi-Directional Integration for Oracle Autonomous Database and Databricks 🔄
Oracle Autonomous Database (ADB) has embraced Delta Sharing, a game-changing open protocol for secure data collaboration across platforms. This integration allows Databricks users to access Oracle ADB data without the hassle of copies or ETL processes—simplifying real-time analytics and AI workflows! 😍
Why Did Oracle Choose Delta Sharing?
ADB is a fully managed, serverless database that automatically handles provisioning, scaling, and tuning for both transactional and analytical workloads. But when it comes to sharing data with partners or analytics platforms, traditional methods like FTP, email, or ETL pipelines often result in duplicated data, increased storage costs, and delayed insights. 😴
Delta Sharing presents a modern, open-protocol solution that breaks down vendor-specific barriers, ensuring secure, cloud-agnostic data sharing without duplicating data manually. 🌐🔒
Key Benefits:
- Bi-directional Data Exchange: ADB can act as both a data provider and consumer, enabling seamless data sharing between Oracle ADB and Databricks users or any platform supporting Delta Sharing. 📈📊
- Real-Time Access: No more waiting for data updates! Real-time visibility into operational and transactional datasets fuels advanced analytics and machine learning workflows. ⚡
- Simplified Data Management: ADB’s serverless nature automates provisioning, scaling, and tuning—delivering high performance and reducing operational overhead. 🌟
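What makes this cloud-agnostic is that Delta Sharing is an open REST protocol: the data provider hands the recipient a small JSON "profile" (an endpoint plus a bearer token), and the recipient lists and reads shared tables with ordinary authorized HTTP calls. Here's a minimal stdlib sketch of how such a request is assembled under that protocol (the endpoint, token, share, and schema names below are made up for illustration):

```python
import urllib.request

# Hypothetical recipient profile -- in practice the provider sends you
# a small JSON file with these fields (values here are invented).
profile = {
    "shareCredentialsVersion": 1,
    "endpoint": "https://sharing.example.com/delta-sharing",
    "bearerToken": "dapi-xxxx",
}

def list_tables_request(profile: dict, share: str, schema: str) -> urllib.request.Request:
    """Build (but don't send) the authorized 'list tables' call
    defined by the open Delta Sharing REST protocol."""
    url = f"{profile['endpoint']}/shares/{share}/schemas/{schema}/tables"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {profile['bearerToken']}"}
    )

req = list_tables_request(profile, share="oracle_adb", schema="sales")
print(req.full_url)
```

In day-to-day use you would not hand-roll these requests; the `delta-sharing` client libraries wrap the protocol so a recipient can load a shared table directly into a DataFrame. The sketch just shows why any platform can participate: it only takes HTTPS and a token.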
Real-World Use Cases:
- Manufacturing companies: Share product lifecycle data with suppliers in real time for better visibility and collaboration.
- Retailers: Feed transactional data from Oracle ADB into Databricks for advanced analytics, machine learning, and purchase-shipping reconciliation. 🎯💰
Partnerships & Success Stories:
KPMG helps clients like large national retailers modernize financial reconciliation processes using Delta Sharing. By securely exposing curated financial datasets from Oracle ADB directly to BI tools and reconciliation platforms, this integration eliminates redundant data movement and legacy patterns—resulting in faster, more consistent insights! 🔄🔝
What’s Next?
Oracle and Databricks continue to push boundaries with exciting developments on the horizon. 🚀
- Watch: Video showcasing Oracle’s adoption of Delta Sharing
- Explore: Oracle LiveLabs for step-by-step guidance on setting up Delta Sharing between Oracle ADB and Databricks 🌟
Ready to harness the power of Delta Sharing? Dive into this transformative data integration today! 🎉
🛠️ Tool of the Week
KNIME is an open-source data analytics, reporting, and integration platform that lets users build data workflows visually, selectively execute some or all analysis steps, and inspect the results, models, and interactive views.
🤯 Fun Fact of the Week
Data quality is crucial for businesses worldwide. Inefficiencies and errors in data lead to significant financial losses, averaging $15 million annually per organization. Data engineers must implement robust data governance and quality control measures to ensure accurate and reliable data systems.
Huddle Quiz 🧩
⚡ Quick Bites: Headlines You Can’t Miss!
- Fast, approximate analytics at scale: Apache DataSketches available in BigQuery.
- ClickHouse vs StarRocks vs Presto vs Trino vs Apache Spark™ — Comparing Analytics Engines.
- How to Keep Your Data Team From Becoming a Money Pit.
- AI may already be shrinking entry-level jobs in tech, new research suggests.
Subscribe to this huddle for more weekly updates on Data Engineering and Analytics! 🚀
