Industrial DAS with Edge and Data Lake using AWS
The reference architecture below provides a baseline for integrating Bosch's DeviceBridge DAS (Data Acquisition System) and Indus Visualizer with AWS services, yielding a modern industrial IoT solution that addresses the Connect-Collect-Consume layers end to end.
First, let's understand what DeviceBridge is…
DeviceBridge is a one-stop data acquisition software solution from Bosch/RBEI (Robert Bosch Engineering & Business Solutions) that enables industries to connect to a wide range of shop-floor devices and machines and seamlessly collect process/machine data (termed IoT data or sensor data). The acquired data can then be transformed into different data formats using its state-of-the-art user interface and delivered to upstream IT systems such as MES, NoSQL/SQL databases, file systems, message buses like Kafka/Solace, and the cloud IoT platform of your choice, such as AWS IoT, Azure IoT, or Bosch IoT. Upstream IT system connectivity is made simple in DeviceBridge through easily configurable user interfaces. Since every connector and collector in DeviceBridge comes as a plug-and-play software component, it is highly customizable to suit any industry's needs. For example, a Google Cloud connector can be developed as a stand-alone component and plugged into DeviceBridge easily. In simple words, DeviceBridge is an Industry 4.0 enabler that one should consider to succeed in the Industry 4.0 journey.
Data acquisition systems (abbreviated DAS or DAQ) typically convert analog waveforms into digital values that can be processed close to the machine, enabling immediate decisions that benefit the business without depending on the availability of the upstream IT systems layer.
Thanks to the creators of such data acquisition software solutions, who have helped industries acquire data from devices and machines and compute and display live production and/or machine status on real-time dashboards inside the shop floor. Such software also helps us store data on a computer and use it for making better decisions or predictions that benefit the business.
DeviceBridge supports connectivity with 40+ controllers from the most popular brands such as Bosch, Rexroth, Siemens, Fanuc, Mitsubishi, Allen Bradley, Beckhoff, Mazak, etc. It is also capable of extracting data from systems such as MES, ERP, historians, PLM, SCADA, etc., and streaming it to the configured destination IT system.
The edge-related components in DeviceBridge provide standard functionality such as data transformation or computation and data filtering (non-required or confidential production data is eliminated by the edge processor) while data is in transit. Transformed data can be routed directly to real-time dashboards on the shop floor. However, all edge-computing logic (e.g., AI/ML) is fully customizable based on customer needs.
Advantages of software-based DAS over hardware-based DAS:
- You can install and use DeviceBridge on PC servers that already adhere to your IT infrastructure guidelines.
- Anybody can read through the installation manual, install it on a PC server, and have it up and running within a few minutes; there is no need for installation experts.
- It is platform independent, so it does not matter whether you have a Linux/Ubuntu/Windows-based PC server.
- It is portable: it can be installed and used on IoT gateway hardware boxes as well as on normal PC servers.
- It is highly scalable, as one software instance can connect to multiple machines/devices.
- High availability, persistence, and disaster recovery of machine data on the shop floor are all covered by your own IT infrastructure management and guidelines, so there is nothing different to follow or implement here.
- You can develop your own destination connectors and plug them into the DAS software. For example, if you want to stream IoT data to Google Cloud tomorrow, you can do it yourself in DeviceBridge.
There are many reasons why you should have a DAS as part of your Industry 4.0 journey. Data acquisition is very important in diverse fields such as automotive, healthcare, civil engineering, etc. All these sectors rely on a DAS because it brings them many advantages, such as the following:
- Improves the efficiency and reliability of machines/devices on the shop floor.
- Supervision of production processes without human interaction.
- Supervision of machine condition without human interaction (predictive rather than reactive).
- Better quality control and hence reduced scrap & rework.
- Better & automated control over data security.
- Easy data integration and transmission between the required IT systems in your organization. You can also share certain data with third parties if needed.
- Reduced human errors in data (previously introduced through HMIs or other software), and hence more accurate KPIs.
- Real-time dashboards on shop floors to see live production status and machine condition.
- Reduced data redundancy.
- Problems are analyzed and solved faster, thereby increasing human productivity too.
In general, DAS solutions improve an organization's performance and increase its economic benefit; likewise, they provide greater control over the organization's processes and a faster response to failures that may occur.
Below is a brief overview of Indus Visualizer (an Industry 4.0 platform)…
Indus is an Industry 4.0 platform from Bosch/RBEI (Robert Bosch Engineering & Business Solutions): a software bundle containing various modules, including a visualization module. The visualization module helps industries monitor live production status (such as OEE) at any level of the plant hierarchy, from the global enterprise level down to the machine or device level in a particular plant. Indus is a highly customizable web application that can be hosted on premises or in the cloud, based on the customer's preference.
Without the right visualization software to supervise industrial facilities and processes, a company may struggle to truly take advantage of the benefits automation brings. Production status visualization refers to the ability of industrial automation software to generate graphical representations of the equipment, operations, and conditions in a plant. A clear visual display simplifies process control for users and may offer opportunities for further automation of factors like energy consumption.
By making data easy to understand and bringing important processes under control, data visualization software can benefit manufacturing and energy companies in numerous ways. This can include improved production efficiency, increased overall equipment effectiveness (OEE), and enhanced quality management. Therefore, it's important to choose the right manufacturing visualization software with a user-friendly interface, like Indus.
A reference architecture for an industrial DAS with edge and a Data Lake using AWS services is shown below…
Refer to the numbers marked in the solution architecture diagram above while reading the paragraphs below.
1, 2, 3, 4
We are using DeviceBridge to acquire (IoT) data from machines or devices present on the shop floor, (optionally) pre-process it at the edge (edge processor), and stream the raw or pre-processed data to the configured destination. Which controllers or systems (data sources) DeviceBridge can connect to and collect data from, which destination systems it can ingest pre-processed and/or raw data into, which protocols it uses, and so on: all of this has already been explained above.
5
AWS IoT Core – It is a managed cloud service that lets connected devices easily and securely interact with cloud applications and other devices. AWS IoT Core can support billions of devices and trillions of messages, and can process and route those messages to other AWS endpoints and to other devices reliably and securely.
In our solution architecture, AWS IoT Core securely receives machine data (machine condition and/or process data, referred to as IoT data or telemetry data) from the on-premises DeviceBridge, using the protocol of the customer's choice such as HTTPS (one-way), MQTT (bi-directional), or LoRaWAN.
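DeviceBridge ships its own AWS IoT connector, so no custom code is needed on the edge side; still, the following minimal sketch may help clarify what the MQTT ingestion step looks like. It uses the AWS IoT Device SDK v2 for Python, and the endpoint, certificate paths, client ID, and topic are placeholders rather than values from this solution.

```python
# Minimal sketch: publish one telemetry message to AWS IoT Core over MQTT (mutual TLS).
# Endpoint, certificate paths, client ID and topic are placeholders for illustration only.
import json
import time

from awscrt import mqtt
from awsiot import mqtt_connection_builder

connection = mqtt_connection_builder.mtls_from_path(
    endpoint="xxxxxxxxxxxxxx-ats.iot.eu-central-1.amazonaws.com",  # account-specific IoT endpoint
    cert_filepath="device.pem.crt",
    pri_key_filepath="private.pem.key",
    ca_filepath="AmazonRootCA1.pem",
    client_id="machine42-edge-gateway",
    clean_session=False,
    keep_alive_secs=30,
)
connection.connect().result()  # block until the MQTT connection is established

payload = {"machineId": "machine42", "spindleTempC": 71.4, "ts": int(time.time())}
publish_future, _ = connection.publish(
    topic="factory/line1/machine42/telemetry",
    payload=json.dumps(payload),
    qos=mqtt.QoS.AT_LEAST_ONCE,  # at-least-once delivery for telemetry
)
publish_future.result()
connection.disconnect().result()
```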
Refer to the AWS documentation for more details.
6
AWS IoT Events – It continuously watches IoT data coming from the devices, processes, applications, and other AWS services to identify significant events so you can take action. It enables you to monitor your equipment or device fleets for failures or changes in operation, and to trigger actions when such events occur.
We can use AWS IoT Events to build complex event monitoring applications in the AWS Cloud that we can access through the AWS IoT Events console or APIs.
In our solution architecture, AWS IoT Events enables us to trigger an action through Amazon Simple Notification Service (SNS), which in turn sends a notification to the operations engineers.
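For reference, telemetry reaches a detector model in AWS IoT Events through an "input", and the SNS action itself is configured inside the detector model. Below is a hedged sketch of feeding one message into such an input with boto3; the input name and payload fields are assumptions, and in this architecture the messages would normally arrive via an AWS IoT Core rule rather than custom code.

```python
# Minimal sketch: push a telemetry message into an AWS IoT Events input.
# The detector model evaluates it and, on a matching event, fires its configured SNS action.
# The input name and payload fields are hypothetical.
import json
import uuid

import boto3

iotevents_data = boto3.client("iotevents-data")

iotevents_data.batch_put_message(
    messages=[
        {
            "messageId": str(uuid.uuid4()),        # must be unique per message
            "inputName": "MachineTelemetryInput",   # assumed IoT Events input name
            "payload": json.dumps(
                {"machineId": "machine42", "spindleTempC": 93.2}
            ).encode("utf-8"),
        }
    ]
)
```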
Refer to the AWS documentation for more details.
7, 9
We are basically building a Data Lake solution here. We use Amazon Kinesis Data Firehose to collect IoT data (raw data) from AWS IoT Core and save it into an Amazon S3 bucket, which acts as cold storage in our solution. AWS Glue is then used to pre-process the raw data and save the pre-processed data into another S3 bucket. This data, called computed data, can be used by our applications for display on the UI.
Amazon Kinesis Data Firehose captures and loads data in near real time. It loads new data into your destinations within 60 seconds after the data is sent to the service. As a result, you can access new data sooner and react to business and operational events faster. In other words, it is the easiest way to reliably load streaming data into data lakes or data stores. It can capture, transform, and deliver streaming data to S3 bucket. It is a fully managed service that automatically scales to match the throughput of your IoT data and requires no ongoing administration. It can also batch or aggregate, compress, transform, and encrypt your data streams before loading, thereby minimizing the amount of storage used and increasing security.
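For completeness, here is a minimal sketch of writing one record to a Firehose delivery stream with boto3. In this architecture the records would normally be delivered by an AWS IoT Core rule action rather than custom code, and the stream name and payload below are placeholders.

```python
# Minimal sketch: write a single JSON record to a Kinesis Data Firehose delivery stream,
# which buffers and delivers it to the configured S3 (cold storage) bucket.
# Stream name and payload are placeholders.
import json

import boto3

firehose = boto3.client("firehose")

record = {"machineId": "machine42", "spindleTempC": 71.4, "eventTime": "2023-01-01T00:00:00Z"}
firehose.put_record(
    DeliveryStreamName="iot-raw-data-to-s3",  # assumed delivery stream name
    Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},  # newline-delimited JSON
)
```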
Refer to the AWS documentation for more details.
Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance. Customers of all sizes and industries can store and protect any amount of data for virtually any use case, such as data lakes, cloud-native applications, and mobile apps. With cost-effective storage classes and easy-to-use management features, you can optimize costs, organize data, and configure fine-tuned access controls to meet specific business, organizational, and compliance requirements.
Refer to the AWS documentation for more details.
AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months.
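As an illustration of the pre-processing step described above, below is a minimal AWS Glue job sketch (PySpark) that reads raw JSON from the cold-storage bucket, keeps only the fields needed downstream, and writes Parquet to the computed-data bucket. The bucket paths and column names are assumptions, not part of the actual solution.

```python
# Minimal AWS Glue job sketch (PySpark): raw JSON in S3 -> filtered Parquet in S3.
# Bucket paths and column names are placeholders.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw IoT records delivered by Firehose into the cold-storage bucket.
raw = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://iot-raw-data-bucket/"]},   # assumed bucket
    format="json",
)

# Keep only the fields the UI needs (a simple example of "computed data").
computed = raw.select_fields(["machineId", "spindleTempC", "eventTime"])

glue_context.write_dynamic_frame.from_options(
    frame=computed,
    connection_type="s3",
    connection_options={"path": "s3://iot-computed-data-bucket/"},  # assumed bucket
    format="parquet",
)
job.commit()
```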
Refer to the AWS documentation for more details.
8
We are basically building a digital twin solution here. AWS IoT SiteWise enables us to collect, store, organize, and monitor data from industrial equipment at scale to help us make better, data-driven decisions. We can use AWS IoT SiteWise to monitor operations across facilities, quickly compute common industrial performance metrics, and create applications that analyze industrial equipment data to prevent costly equipment issues and reduce gaps in production. This allows us to collect data consistently across devices, identify issues with remote monitoring more quickly, and improve multi-site processes with centralized data.
We can use AWS IoT SiteWise to model our physical assets, processes, and facilities, quickly compute common industrial performance metrics, and create fully managed web applications that help analyze industrial equipment data, reduce costs, and make faster decisions. With AWS IoT SiteWise, we can focus on understanding and optimizing our operations, rather than building costly in-house data collection and management applications.
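To make the SiteWise flow more concrete, here is a hedged sketch of pushing one measurement into AWS IoT SiteWise via a property alias using boto3. The alias, value, and quality flag are placeholders; in practice the asset models and properties are defined in SiteWise first.

```python
# Minimal sketch: push one measurement into AWS IoT SiteWise via a property alias.
# The property alias and value are placeholders; the asset model must already exist.
import time
import uuid

import boto3

sitewise = boto3.client("iotsitewise")

sitewise.batch_put_asset_property_value(
    entries=[
        {
            "entryId": str(uuid.uuid4()),
            "propertyAlias": "/factory/line1/machine42/spindle-temperature",  # assumed alias
            "propertyValues": [
                {
                    "value": {"doubleValue": 71.4},
                    "timestamp": {"timeInSeconds": int(time.time())},
                    "quality": "GOOD",
                }
            ],
        }
    ]
)
```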
Refer to the AWS documentation for more details.
AWS IoT Analytics automates the steps required to analyze IoT data. It filters, transforms, and enriches IoT data before storing it in a time-series data store. We can set up the service to collect only the data we need from our devices, apply mathematical equations to process the data, and enrich the data with device-specific metadata such as device type and location before storing it. While processing IoT data, if any anomaly is detected, the information can be routed to SNS, which triggers notifications to the operations manager. We can also analyze our device data by running queries using the built-in SQL query engine or perform more complex analytics and machine learning inference. In our solution, we visualize this through integration with Amazon QuickSight, as depicted.
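As an illustration, below is a hedged sketch of defining an AWS IoT Analytics SQL dataset over an existing datastore with boto3; the resulting dataset content is what QuickSight would visualize. The datastore name, dataset name, columns, and refresh schedule are assumptions.

```python
# Minimal sketch: define an AWS IoT Analytics SQL dataset over an existing datastore.
# Datastore name, dataset name, columns and schedule are placeholders.
import boto3

iotanalytics = boto3.client("iotanalytics")

iotanalytics.create_dataset(
    datasetName="machine_temperature_hourly",
    actions=[
        {
            "actionName": "hourly_avg_temp",
            "queryAction": {
                "sqlQuery": (
                    "SELECT machineId, AVG(spindleTempC) AS avg_temp "
                    "FROM machine_datastore GROUP BY machineId"
                )
            },
        }
    ],
    triggers=[
        {"schedule": {"expression": "rate(1 hour)"}}  # refresh the dataset every hour
    ],
)
```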
Refer to the AWS documentation for more details.
QuickSight allows us to easily create and publish interactive BI dashboards as well as receive answers in seconds through natural language queries. QuickSight dashboards can be accessed from any device and seamlessly embedded into our applications, portals and websites.
10, 12
DeviceBridge transmits non-IoT data such as ERP, MES, and PLM data (only the data needed by cloud applications) by securely invoking a Lambda-based API via API Gateway, as shown. The Lambda function saves the non-IoT data, which is basically relational data, into an Amazon Aurora database. Based on the customer's use case and performance requirements, Amazon DynamoDB can replace the Aurora database; if we are looking for single-digit-millisecond performance at the database level, then DynamoDB is the perfect choice. Finally, the saved data can be consumed by various other processes or solutions such as the Data Lake or end-user applications.
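A minimal sketch of what the Lambda function behind API Gateway could look like for the DynamoDB variant is shown below. The table name, key attribute, and request fields are assumptions; with Aurora, the handler would instead execute an INSERT through a database driver or the RDS Data API.

```python
# Minimal sketch of the API-Gateway-backed Lambda handler (DynamoDB variant).
# Table name, key attribute and request fields are placeholders.
import json

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("non_iot_data")  # assumed table with partition key "recordId"


def lambda_handler(event, context):
    """Persist one non-IoT record (e.g. an MES/ERP/PLM row) posted by DeviceBridge."""
    body = json.loads(event.get("body") or "{}")

    table.put_item(
        Item={
            "recordId": body["recordId"],               # e.g. work-order or material number
            "sourceSystem": body.get("sourceSystem", "MES"),
            "payload": body.get("payload", {}),
        }
    )
    return {"statusCode": 201, "body": json.dumps({"status": "stored"})}
```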
Amazon Aurora is a MySQL and PostgreSQL-compatible relational database built for the cloud that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. Amazon Aurora is up to five times faster than standard MySQL databases and three times faster than standard PostgreSQL databases. It provides the security, availability, and reliability of commercial databases at 1/10th the cost.
Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed to run high-performance applications at any scale. DynamoDB offers built-in security, continuous backups, automated multi-region replication, in-memory caching, and data export tools.
Refer to the AWS documentation for more details.
11, 14
In our solution, we use AWS Database Migration Service (AWS DMS) to migrate bulk non-IoT or IoT data, such as historian data, into a relational Aurora database. Based on the customer's use case and performance requirements, Amazon DynamoDB can replace the Aurora database. Please note that this could be the same database as the one explained in point #12 above.
Amazon Aurora and DynamoDB are already described above.
14
In our solution, we use Amazon Elastic Container Service (ECS) to host the custom web applications and web APIs accessed by end users in the factory. These applications and APIs can be exposed through Amazon API Gateway if needed, and ECS uses AWS Fargate to enable auto-scaling and multi-AZ capabilities.
Amazon ECS is a fully managed container orchestration service that makes it easy for you to deploy, manage, and scale containerized applications. Using ECS, you can build container-based applications on premises or in the cloud with Amazon ECS Anywhere and enjoy consistent tooling, management, workload scheduling, and monitoring across environments. It can automatically scale and run web applications in multiple Availability Zones with great performance, scalability, reliability, and high availability.
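As a hedged illustration of the ECS-on-Fargate setup, the sketch below creates a long-running service from an existing task definition, spread across subnets in two Availability Zones. The cluster, task definition, subnet, and security group identifiers are placeholders.

```python
# Minimal sketch: run a containerized web app as an ECS service on Fargate across two AZs.
# Cluster, task definition, subnets and security group are placeholders.
import boto3

ecs = boto3.client("ecs")

ecs.create_service(
    cluster="factory-apps",
    serviceName="indus-web",
    taskDefinition="indus-web:1",   # existing task definition (family:revision)
    desiredCount=2,                  # one task per Availability Zone
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-aaaa1111", "subnet-bbbb2222"],  # subnets in different AZs
            "securityGroups": ["sg-0123456789abcdef0"],
            "assignPublicIp": "DISABLED",
        }
    },
)
```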
Refer to the AWS documentation for more details.
15, 16, 17
In our solution, we use an Amazon Redshift data warehouse (Redshift DWH) to store the computed data for near-real-time or offline visualization of IoT data. Keeping the high storage cost of the Redshift DWH in mind (as well as the cost of Redshift cluster setup and management), we should consider storing only the most recent few weeks of data in the Redshift DWH. Older data should be archived into S3 storage periodically. Redshift Spectrum is then used to fetch and merge the S3 data (archived data) with the Redshift DWH data (most recent data) and provide the result to a visualization tool such as QuickSight, as shown. Please note that Redshift Spectrum does not load archived data into the Redshift DWH; rather, it merges archived data and DWH data on the fly through external tables and provides the result to the visualization tool.
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. In our solution, we use Amazon Athena for manually querying and analyzing the data stored in the S3 bucket. Athena provides SQL capabilities to query data in a seamless way, and Athena APIs can be used to fetch and visualize data through tools like QuickSight. With Athena, there is no need for complex ETL jobs to prepare your data for analysis, which makes it easy for anyone with SQL skills to quickly analyze large-scale datasets.
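For the ad-hoc analysis described above, a hedged sketch of running an Athena query over the archived S3 data with boto3 follows; the database, table, and query-result location are assumptions.

```python
# Minimal sketch: run an ad-hoc Athena query over the archived IoT data in S3.
# Database, table and output location are placeholders.
import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString=(
        "SELECT machineId, AVG(spindleTempC) AS avg_temp "
        "FROM iot_archive.machine_telemetry GROUP BY machineId"
    ),
    QueryExecutionContext={"Database": "iot_archive"},
    ResultConfiguration={"OutputLocation": "s3://athena-query-results-bucket/"},
)
print(response["QueryExecutionId"])  # poll get_query_execution / get_query_results with this ID
```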
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers. Amazon Redshift uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and ML to deliver the best price performance at any scale. You can combine structured data from your data warehouse and semi-structured data from your data lake to generate application and system insights. Use Redshift ML to automatically create, train, and deploy Amazon SageMaker ML models on your data with SQL.
Refer to the AWS documentation for more details.
Amazon Redshift Spectrum is a feature within AWS’s Redshift DWH that lets a data analyst conduct fast, complex analysis on objects stored on the AWS cloud. With Redshift Spectrum, an analyst can perform SQL queries on data stored in Amazon S3 buckets.
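The sketch below shows the kind of query Redshift Spectrum enables, issued here through the Redshift Data API: recent rows held in the Redshift DWH are merged on the fly with archived rows exposed through an external (S3-backed) schema. The cluster identifier, database, user, schema, and table names are assumptions; the external schema would be defined beforehand against the Glue Data Catalog.

```python
# Minimal sketch: merge recent DWH data with archived S3 data via Redshift Spectrum.
# Cluster identifier, database, user, schema and table names are placeholders;
# the external schema "spectrum_archive" must already point at the S3 archive.
import boto3

redshift_data = boto3.client("redshift-data")

sql = """
    SELECT machine_id, event_time, spindle_temp_c
    FROM dwh.machine_telemetry_recent          -- last few weeks, stored in Redshift
    UNION ALL
    SELECT machine_id, event_time, spindle_temp_c
    FROM spectrum_archive.machine_telemetry    -- older data, read in place from S3
"""

response = redshift_data.execute_statement(
    ClusterIdentifier="iot-dwh-cluster",
    Database="factory",
    DbUser="dashboard_reader",
    Sql=sql,
)
print(response["Id"])  # use describe_statement / get_statement_result to fetch rows
```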
18
In our solution, AWS Managed Microsoft AD (Active Directory) is used for enabling single sign-on for AWS hosted applications and APIs.
AWS Directory Service for Microsoft Active Directory, also known as AWS Managed Microsoft Active Directory (AD), enables your directory-aware workloads and AWS resources to use managed Active Directory (AD) in AWS. AWS Managed Microsoft AD is built on actual Microsoft AD and does not require you to synchronize or replicate data from your existing Active Directory to the cloud. You can use the standard AD administration tools and take advantage of the built-in AD features, such as Group Policy and single sign-on.
Refer to the AWS documentation for more details.
19
The visualization module of the Indus platform from Bosch/RBEI (already explained above) is used in the solution to visualize various plant KPIs, supporting both real-time and historical KPI displays.
DAS with Data Lake and Data Analytics Solution on AWS - Elaborated
A Data Lake is a storage repository that can store large amounts of structured, semi-structured, and unstructured data. Data lakes aggregate all data, independent of format or source, and put your big data to work. The data warehouse and the data lake can now come together as a modern data warehouse.
Data analytics is the science of analyzing raw data to make conclusions about that information.
Below is the reference architecture for a Data Lake and data analytics based on AWS services. It demonstrates the use of Bosch's DeviceBridge and Indus software suites to address the data collection and consumption layers.
1, 2, 3
A factory typically has various data sources (IoT and non-IoT) such as devices or machines, SCADA systems, HMI systems, media files, data files, NoSQL/SQL databases, ERP, MES, PLM, cloud-based SaaS applications, etc. Bosch-RBEI's DeviceBridge DAS supports connectivity with many of these data sources to extract data and stream it to the AWS cloud, as explained in the previous solution architecture.
Some other data sources can be integrated directly with AWS components to ingest data, as shown in the diagram. This is non-IoT data, which can be ingested into the AWS cloud either periodically or continuously using appropriate AWS tools such as AWS Database Migration Service, AWS DataSync, etc.
4
The capabilities of Bosch-RBEI's DeviceBridge have already been explained in the previous solution architecture.
5
Based on the type of data source DeviceBridge connects to and collects data from, different data ingestion components such as AWS IoT Core, Amazon Kinesis, and Amazon Managed Streaming for Apache Kafka (MSK) are used to ingest data.
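For the sources routed through Amazon Kinesis, a minimal sketch of putting one record onto a data stream with boto3 is shown below; the stream name, partition key, and payload are placeholders.

```python
# Minimal sketch: put one record onto an Amazon Kinesis data stream.
# Stream name, partition key and payload are placeholders.
import json

import boto3

kinesis = boto3.client("kinesis")

kinesis.put_record(
    StreamName="factory-telemetry-stream",  # assumed data stream
    Data=json.dumps({"machineId": "machine42", "spindleTempC": 71.4}).encode("utf-8"),
    PartitionKey="machine42",               # keeps one machine's records on one shard, in order
)
```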
6
Other local data, such as historian data, PLM data, master data and other data from MES, ERP data, etc., is transferred to the cloud using data ingestion components such as AWS DataSync, AWS Database Migration Service, and custom-built APIs (via API Gateway).
Amazon AppFlow is the best option for extracting data from cloud SaaS applications and pushing it into your Data Lake as necessary.
7
AWS Data Exchange is used for integrating third-party data into the data lake.
8
AWS Lake Formation is used to build the scalable data lake, and Amazon Simple Storage Service (Amazon S3) is used for data lake storage. AWS Lake Formation is also used to enable unified governance to centrally manage security, access control, and audit trails.
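As an example of the centralized access control Lake Formation provides, the hedged sketch below grants an analyst IAM role SELECT permission on one cataloged table; the role ARN, database, and table names are assumptions.

```python
# Minimal sketch: grant an analyst IAM role SELECT access to one data-lake table
# through AWS Lake Formation's central permission model.
# Role ARN, database and table names are placeholders.
import boto3

lakeformation = boto3.client("lakeformation")

lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "arn:aws:iam::123456789012:role/plant-analyst"},
    Resource={
        "Table": {
            "DatabaseName": "factory_data_lake",
            "Name": "machine_telemetry",
        }
    },
    Permissions=["SELECT"],
)
```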
9
AWS Glue is used to extract, transform, catalog, and ingest data across multiple data stores. AWS Glue DataBrew could be used for visual data preparation.
10
Indus is a visualization software solution from Bosch/RBEI (Robert Bosch Engineering & Business Solutions), which has already been explained in the previous solution architecture.
Amazon Kinesis Data Analytics is used to transform and analyze streaming data in real time.
Amazon QuickSight provides machine learning (ML) powered business intelligence.
Amazon OpenSearch can be used for operational analytics.
Amazon Redshift is used as the cloud data warehouse, providing near-real-time computation capabilities whose results can be fed to real-time dashboards on the shop floor.
Amazon Redshift Spectrum and Amazon Athena enable interactive querying, analyzing, and processing capabilities.
Amazon EMR provides the cloud big data platform for processing vast amounts of data using open source tools.
Amazon SageMaker and AWS AI services can be used to build, train, and deploy ML models, and add intelligence to your applications.
AWS IoT SiteWise enables us to collect, store, organize, and monitor data from industrial equipment at scale to help us make better, data-driven decisions. We can use AWS IoT SiteWise Monitor to monitor operations across facilities, quickly compute common industrial performance metrics, and create applications that analyze industrial equipment data to prevent costly equipment issues and reduce gaps in production. This allows us to collect data consistently across devices, identify issues with remote monitoring more quickly, and improve multi-site processes with centralized data.
AWS IoT Analytics automates the steps required to analyze IoT data. It filters, transforms and enriches IoT data before storing it into a time-series data store.
Conclusion
With DeviceBridge and AWS together, one can implement an end-to-end industrial IoT solution that covers machine/device data collection, edge processing, data ingestion into the AWS cloud, big data processing in the cloud through Data Lake and big data analytics tools and services (with near-real-time to historical data analysis possibilities, as explained above), data transformation in the cloud (ETL), data warehousing, and data visualization.
With a broad set of managed services to collect, process, and analyze big data, AWS makes it easier to build, deploy, and scale big data applications. This enables industries to focus on business problems instead of updating and managing these tools. AWS provides many solutions to address your big data analytics requirements.
Data analytics is important because it helps businesses optimize their performance. Implementing it in the business model means companies can reduce costs by identifying more efficient ways of doing business and by storing large amounts of data. It helps in understanding the current state of the business or process and provides a solid foundation for predicting future outcomes. In other words, it enables businesses to understand the current market scenario and change the process, or trigger the development of new products that match market needs.