Liberating IIoT Data: Snowflake, Ignition, and Sparkplug

The Streamline Approach

At Streamline Control, we understand that transforming industrial data into actionable insights requires more than traditional historian solutions. Today’s industrial operations generate massive amounts of data that demand scalable, flexible platforms capable of not just storing information but making it truly accessible and valuable across the organization. While conventional historian systems provide important capabilities, they often create isolated data repositories that limit an organization’s ability to fully leverage their operational intelligence.

The Next Evolution in Industrial Data Management

We’ve witnessed how cloud-based data platforms can revolutionize industrial operations. Our recent project with a mid-sized energy company operating multiple pipeline systems and refineries demonstrated this transformation firsthand. By transitioning from their traditional historian architecture to Snowflake, we helped our client break down data silos, enable cross-functional analytics, and drive data-informed decisions throughout their organization. Our approach combined the reliability of industrial protocols with the power of cloud computing to create a unified data ecosystem that serves both operations and enterprise needs.

Why Snowflake?

Traditional historian solutions excel at storing time-series data but often struggle with scale, accessibility, and integration with business systems. Our client’s multiple legacy historians were creating bottlenecks for data access and limiting their ability to perform advanced analytics. Snowflake changed this paradigm by offering a cloud-native platform designed for enterprise-wide data management. As a cloud data historian, Snowflake provided:

  • Unlimited scalability for both storage and compute
  • Seamless integration with analytics tools and business applications
  • Built-in data sharing capabilities across departments and partners
  • Cost-effective storage with separation of compute resources
  • Robust security and compliance features
  • Native support for structured and semi-structured data formats

The Power of MQTT/Sparkplug for Industrial Data

When it comes to real-time data transmission in industrial environments, not all protocols are created equal. For our client’s distributed operations, we implemented MQTT with the Sparkplug specification for several critical reasons:

  • Report-by-exception methodology that transmits data only when values change, reducing network traffic by over 80% compared to their previous polling system
  • Efficient bandwidth utilization through lightweight message formatting
  • Built-in state management for industrial devices and applications
  • Standardized payload format that ensures interoperability
  • Native support for data context through topic namespaces and metadata
  • Resilience against network disruptions with state management and store and forward
  • Enhanced security features including TLS encryption and authentication
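The report-by-exception behavior described above can be modeled in a few lines of Python. This is an illustrative sketch, not the actual Sparkplug implementation; the `deadband` threshold and metric name are hypothetical parameters.

```python
class RbeFilter:
    """Report-by-exception: publish a metric only when it changes
    by more than a deadband from the last reported value."""

    def __init__(self, deadband=0.0):
        self.deadband = deadband
        self.last = {}  # metric name -> last reported value

    def should_publish(self, name, value):
        prev = self.last.get(name)
        if prev is not None and abs(value - prev) <= self.deadband:
            return False  # suppressed: no meaningful change
        self.last[name] = value
        return True

# A polling system would transmit all five samples;
# report-by-exception transmits only the meaningful changes.
f = RbeFilter(deadband=0.5)
samples = [10.0, 10.1, 10.2, 12.0, 12.1]
published = [v for v in samples if f.should_publish("flow_rate", v)]
```

In this sketch only two of the five samples cross the deadband, which is the mechanism behind the bandwidth reduction noted above.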

State Management: Ensuring Data Integrity

One of the most powerful aspects of the Sparkplug specification is its robust state management framework, which we leveraged extensively in our implementation:

  • Device State Awareness. The MQTT/Sparkplug architecture maintains awareness of every node’s connection state through its Birth/Death certificate mechanism. When devices reconnect after a disruption, they publish a “Birth Certificate” containing their complete current state, so consumers are automatically resynchronized, while store-and-forward replays any data generated during the disconnection.
  • Data Integrity through Primary Applications. We configured the Cirrus Link IoT Bridge as a Primary Application in the Sparkplug architecture, allowing it to maintain the state of the entire MQTT infrastructure. This enabled automatic detection of data gaps and triggered state restoration requests when inconsistencies were detected, effectively creating a self-healing data pipeline.
  • Last Known Good Value Retention. The Sparkplug protocol’s state management ensures that the last known good value for any metric is always available, even when devices are offline. This allows analysis and reporting to continue without data gaps during temporary outages.
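The birth/death and last-known-good behaviors above can be sketched with a small state cache. This is a simplified model for illustration, not the Sparkplug specification itself; node and metric names are hypothetical.

```python
class NodeState:
    """Minimal model of Sparkplug-style birth/death handling
    with last-known-good value retention."""

    def __init__(self):
        self.online = {}   # node -> bool
        self.values = {}   # (node, metric) -> last known good value

    def on_birth(self, node, metrics):
        # A birth message carries the node's complete current state.
        self.online[node] = True
        for name, value in metrics.items():
            self.values[(node, name)] = value

    def on_death(self, node):
        # A death message marks the node offline;
        # retained values remain queryable.
        self.online[node] = False

    def on_data(self, node, name, value):
        self.values[(node, name)] = value

    def last_known_good(self, node, name):
        return self.values.get((node, name))

s = NodeState()
s.on_birth("pump_station_7", {"pressure": 420.0})
s.on_data("pump_station_7", "pressure", 415.5)
s.on_death("pump_station_7")
# Even while the node is offline, analytics can still read
# the last known good value.
lkg = s.last_known_good("pump_station_7", "pressure")
```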

MQTT/Sparkplug created the perfect foundation for feeding real-time operational data into Snowflake while maintaining the reliability and context required for industrial systems, which was particularly important for our client’s remote facilities with limited connectivity.

Leveraging Cirrus Link IoT Bridge for Snowflake

A key component of our implementation was the Cirrus Link IoT Bridge for Snowflake, which provides a seamless connector between the MQTT/Sparkplug infrastructure and the Snowflake cloud platform. This specialized bridge offers several critical advantages:

  • Native Sparkplug B payload parsing that preserves all metadata and automatically maps to appropriate Snowflake data types
  • Stateful session management that tracks and utilizes the connection state of every device and application in the MQTT infrastructure
  • Birth/Death certificate processing that ensures proper initialization and handling of unexpected disconnections
  • Store-and-forward capabilities that buffer data during outages or maintenance windows
  • Session state reconstruction that allows newly connected clients to automatically receive the current state of the system
  • Automatic MQTT topic to Snowflake table mapping that simplifies data model implementation
  • Configurable data aggregation that optimizes storage efficiency while maintaining data fidelity
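The automatic topic-to-table mapping can be approximated as below. The table-naming convention here is our own assumption for illustration, not Cirrus Link’s actual scheme; the topic values are hypothetical.

```python
import re

def topic_to_table(topic: str) -> str:
    """Map a Sparkplug B topic to a hypothetical Snowflake table name.
    Topic form: spBv1.0/<group>/<msg_type>/<edge_node>[/<device>]"""
    parts = topic.split("/")
    if len(parts) < 4 or parts[0] != "spBv1.0":
        raise ValueError(f"not a Sparkplug B topic: {topic}")
    group, edge = parts[1], parts[3]

    def ident(s):
        # Snowflake-friendly identifiers: uppercase, underscores only.
        return re.sub(r"[^A-Z0-9]+", "_", s.upper()).strip("_")

    return f"{ident(group)}_{ident(edge)}_DATA"

table = topic_to_table("spBv1.0/Pipeline-North/DDATA/PumpStation07/PLC1")
```

A convention like this lets every edge node land its data in a predictable table without per-site configuration, which is the simplification the bullet above refers to.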

Ignition as the Edge-to-Cloud Bridge

The missing link in many industrial IoT architectures is a robust middleware layer that can connect operational technology to cloud platforms. For our client, we deployed Ignition by Inductive Automation as this critical bridge:

  • Native MQTT/Sparkplug support for efficient data collection
  • Edge computing capabilities for local processing and store-and-forward
  • Broad driver library for connecting to their diverse control systems
  • Scalable architecture that supported their enterprise-wide deployment
  • Robust security features including encryption and authentication
  • Built-in redundancy and high-availability options
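The store-and-forward capability mentioned above can be sketched as a simple ordered buffer. This is an illustrative model of the pattern, not Ignition’s actual implementation; sample payloads and the buffer limit are hypothetical.

```python
from collections import deque

class StoreAndForward:
    """Buffer outgoing samples while the uplink is down,
    then flush them in order on reconnect."""

    def __init__(self, maxlen=100_000):
        self.buffer = deque(maxlen=maxlen)  # oldest dropped if full
        self.sent = []

    def publish(self, sample, connected):
        if connected:
            self._flush()          # backfill buffered samples first
            self.sent.append(sample)
        else:
            self.buffer.append(sample)

    def _flush(self):
        while self.buffer:
            self.sent.append(self.buffer.popleft())

sf = StoreAndForward()
sf.publish({"t": 1, "v": 10}, connected=True)
sf.publish({"t": 2, "v": 11}, connected=False)  # outage begins
sf.publish({"t": 3, "v": 12}, connected=False)
sf.publish({"t": 4, "v": 13}, connected=True)   # reconnect: backfill first
order = [s["t"] for s in sf.sent]
```

Flushing the buffer before the live sample preserves chronological order downstream, which matters for time-series ingestion.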

This architecture created a seamless flow from field devices through Ignition and into their Snowflake data platform, ensuring that operational data maintained its context and reliability while becoming accessible for enterprise analytics.

Architecting the Data Ecosystem

Our implementation created a comprehensive data architecture that spans from field devices to enterprise analytics:

  • Data Collection Layer: We connected 24 pipeline pump stations and 3 processing facilities using local Ignition instances, integrating with existing PLCs, SCADA systems, DCSs, and local databases through a combination of native drivers and OPC UA.
  • Edge Processing Layer: At each site, Ignition instances performed local data processing, buffering, and protocol conversion, ensuring data integrity even during the frequent connectivity disruptions experienced at remote sites.
  • Cloud Storage Layer: Snowflake ingested, stored, and organized both real-time and historical data in optimized formats, with appropriate time-series partitioning for efficient query performance.
  • Analytics Layer: Business intelligence tools, data science platforms, and custom applications connected directly to Snowflake, transforming data into insights for operations, maintenance, and executive teams.
  • Cloud Ignition Gateway: In their AWS cloud environment, an additional Ignition Gateway provides real-time applications connected directly to both the MQTT/Sparkplug data and Snowflake, accessible from the corporate network.

Implementation Strategy: An Incremental Approach

For our client, we took an incremental approach to minimize risk while demonstrating immediate value:

  1. Pilot Project: Using a portion of the system that already leveraged MQTT and Ignition as a SCADA system, we were able to quickly prove the technology and approach.
  2. Data Model Design: Our team developed a Snowflake data model that accommodated both time-series operational data and contextual information about assets and processes.
  3. Analytics Development: We created initial dashboards and reports that demonstrated immediate value to operations, maintenance, and management teams.
  4. Scaling Out: We deployed multiple sites into production, leveraging lessons learned from the pilot to accelerate rollout. Additional facilities and sites continue to join the project.
  5. Advanced Analytics: Once the base system was stabilized, we introduced machine learning models for forecasting and operational optimization.

This phased approach minimized risk while accelerating time-to-value, allowing the business to see immediate benefits while building toward a comprehensive solution.

Key Technical Considerations

Our implementation highlighted several technical considerations essential for success:

Data Modeling in Snowflake

  • We implemented a data mesh approach, ensuring that data for each area of the business was available only to authorized users
  • Designed query patterns to support both real-time operational dashboards and historical analysis
  • Leveraged dbt to create materialized views for commonly accessed aggregations

MQTT/Sparkplug Configuration

  • Established a standardized topic namespace design aligned with their operational hierarchy
  • Configured efficient Sparkplug payload designs to minimize bandwidth consumption
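One concrete payload-efficiency technique is Sparkplug’s metric aliasing: full metric names are sent once in the birth message, and subsequent data messages carry compact integer aliases. The sketch below models the idea with plain dictionaries rather than actual Sparkplug B protobuf payloads; metric names are hypothetical.

```python
class AliasedPublisher:
    """Sparkplug-style bandwidth saving: metric names go out once
    in the birth payload; later data payloads carry integer aliases."""

    def __init__(self, metric_names):
        self.alias = {n: i for i, n in enumerate(metric_names)}

    def birth_payload(self):
        # Birth establishes the name -> alias mapping for subscribers.
        return {"metrics": [{"name": n, "alias": a}
                            for n, a in self.alias.items()]}

    def data_payload(self, values):
        # After birth, only aliases and values travel on the wire.
        return {"metrics": [{"alias": self.alias[n], "value": v}
                            for n, v in values.items()]}

p = AliasedPublisher(["suction_pressure", "discharge_pressure"])
birth = p.birth_payload()
data = p.data_payload({"discharge_pressure": 612.3})
```

Over thousands of metrics and frequent updates, replacing string names with small integers is a meaningful share of the bandwidth savings described above.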

Ignition Deployment Strategy

  • Implemented redundant gateways at sites with automatic failover
  • Configured store-and-forward capabilities at remote sites to handle up to 72 hours of connectivity loss
  • Optimized tag configuration for efficient data collection
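Sizing a store-and-forward buffer for a 72-hour outage window is a simple back-of-envelope calculation. The tag count, change rate, and per-sample size below are hypothetical illustrative figures, not our client’s actual numbers.

```python
def buffer_bytes(tags, changes_per_tag_per_min, bytes_per_sample, hours):
    """Worst-case storage needed to buffer report-by-exception
    traffic for a given outage duration."""
    samples = tags * changes_per_tag_per_min * 60 * hours
    return samples * bytes_per_sample

# Hypothetical remote site: 2,000 tags, ~1 change/tag/minute,
# ~100 bytes per stored sample, 72-hour outage window.
needed = buffer_bytes(2000, 1, 100, 72)
needed_gb = needed / 1e9  # well under 1 GB for this scenario
```

Even generous assumptions yield a modest footprint here, which is why long store-and-forward windows are practical on edge hardware.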

Integration Architecture

  • Implemented proper data governance and security boundaries
  • Established clear segregation between OT and IT systems, with outbound-only connections from OT
  • Created role-based access control aligned with organizational structure

Real-World Impact

This architecture provided our client with unprecedented flexibility, allowing them to implement use cases including:

  • Real-time pipeline optimization based on forecasted receipts and current inventories
  • Live facility production reporting
  • Integration with yield accounting systems at refineries
  • Tracking and reporting against refinery integrity operating windows

Through the platform and these use cases, our client continues to achieve remarkable results:

  • Reduction in data integration costs compared to their previous historian-to-database approaches
  • Near real-time analytics across operational datasets that previously were only available through manual integration processes
  • Cross-functional data utilization with operations, maintenance, and business planning teams all working from the same data source
  • Faster analytics development through standardized access to operational data

Conclusion

The industrial sector is undergoing a profound transformation driven by the convergence of operational technology and information technology. By implementing Snowflake as a cloud data historian with MQTT/Sparkplug and Ignition, organizations like our client can create a foundation for this transformation that maintains the reliability and security of industrial systems while unlocking new possibilities for data-driven decision making.

At Streamline Control, we’re committed to helping industrial organizations navigate this journey. Our approach combines deep industrial expertise with cloud technology knowledge to create solutions that deliver immediate value while building toward a comprehensive digital transformation.

Are you ready to transform how your organization leverages industrial data? Let’s start the conversation.

Contact

For further information on Streamline Control’s cloud historian services, please contact Dan Lozie, Director of Analytics & Machine Learning, at dan.lozie@streamlinecontrol.com.