Code with SAP Labs India - Discover, Design, Deliver

6701 Registered Allowed team size: 1 - 4

Winners are announced.

POC Submission
starts on:
Mar 16, 2021, 03:30 AM
ends on:
May 16, 2021, 06:25 PM


Anomaly Detection

Anomaly Detection Using Airborne Sound

Maintenance is a strategic concern in mining, manufacturing and various other industries. For machine operators and factory managers, unplanned failures and avoidable asset repairs consume resources, increase operational costs and cripple daily operations. Predictive Maintenance can help in such cases. It can be implemented by drawing insights from data on temperature, vibration, sound, etc., collected by IoT sensors installed on individual machines. This is difficult, however, in older plants where retrofitting sensors can be costly. A new approach is to use airborne sound: acoustic sensors are placed strategically around the facility to listen for changes in acoustic frequencies and amplitudes that indicate where failures may exist. Another critical aspect of this approach is the ability to learn across systems. In many industrial plants, major failures are rare, but when they do occur they are expensive. Data from a single plant may therefore not be enough to train an effective model, which is why we are considering Federated Learning as an approach to support this technique.

There are numerous paradigms in Distributed Learning, and we must select the one best suited to this problem. Some of the factors to consider with Distributed Learning:

  • Consistency – Aggregation of data, optimization, etc.
  • Fault Tolerance – Device failure, communication failure, etc.
  • Communication – I/O (storage on devices), network, distributed file systems, etc.
  • Security – Privacy, encryption.
  • Data – The audio data should not be persisted on the cloud.
  • Ability to transfer learnings.
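Several of the factors above (consistency of aggregation, tolerance of device failure, and keeping audio off the cloud) can be illustrated with a federated-averaging style aggregation step. The following is a minimal sketch, not a prescribed solution: the function name `federated_average`, the client tuples, and the convention of passing `None` for a device that failed to report are all assumptions made for illustration. Only model weights and sample counts reach the server; no audio does.

```python
import numpy as np

def federated_average(client_updates):
    """Weighted average of client model weights (FedAvg-style aggregation).

    client_updates: list of (weights, n_samples) pairs, where weights is a
    list of NumPy arrays (one per layer). A device that failed to report in
    this round is represented by None and is skipped (hypothetical convention).
    """
    reported = [u for u in client_updates if u is not None]
    if not reported:
        raise ValueError("no client reported an update this round")
    total = sum(n for _, n in reported)
    n_layers = len(reported[0][0])
    # Each layer is averaged separately, weighted by local sample count.
    return [
        sum(w[i] * (n / total) for w, n in reported)
        for i in range(n_layers)
    ]

# Two hypothetical machines that trained locally, plus one that dropped out.
client_a = ([np.array([1.0, 2.0]), np.array([0.5])], 300)
client_b = ([np.array([3.0, 4.0]), np.array([1.5])], 100)
global_weights = federated_average([client_a, None, client_b])
```

Weighting by sample count means a device with more local recordings has more influence on the shared model; other aggregation rules are possible and may suit rare-failure data better.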



Create a library of anomaly datasets or models

  1. Onboarding a new device: if data for a specific machine type exists in the library, the new device should start with the model from the library as its base model.
  2. Signal processing at the edge: an approach to process the sound data and detect anomalies across multiple device types and device IDs.
  3. Distributed Learning: a Federated approach, or alternative methods, for learning from distributed data and building a robust model to detect anomalies using airborne sound.
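The first two items can be sketched together. The example below is an illustrative assumption rather than the expected solution: it computes log power spectra on the device, fits a per-frequency Gaussian baseline from normal clips, and keeps baselines in a hypothetical in-memory `MODEL_LIBRARY` keyed by machine type so that a newly onboarded device can start from an existing model. The synthetic "fan" clips, frame sizes, and all function names are invented for the demo.

```python
import numpy as np

def log_spectrum_frames(signal, frame_len=1024, hop=512):
    """Split a mono signal into Hann-windowed frames; return log power spectra."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10)

def fit_baseline(normal_clips):
    """Per-frequency mean and standard deviation over normal clips."""
    feats = np.concatenate([log_spectrum_frames(c) for c in normal_clips])
    return {"mean": feats.mean(axis=0), "std": feats.std(axis=0) + 1e-6}

def anomaly_score(model, clip):
    """Mean squared z-score of a clip's log spectrum against the baseline."""
    z = (log_spectrum_frames(clip) - model["mean"]) / model["std"]
    return float(np.mean(z ** 2))

MODEL_LIBRARY = {}  # hypothetical library: machine type -> baseline model

def onboard_device(machine_type, normal_clips=None):
    """Reuse the library model if one exists, otherwise fit and store one."""
    if machine_type in MODEL_LIBRARY:
        return MODEL_LIBRARY[machine_type]
    if normal_clips is None:
        raise ValueError("no library model and no local data to fit one")
    MODEL_LIBRARY[machine_type] = fit_baseline(normal_clips)
    return MODEL_LIBRARY[machine_type]

# Synthetic stand-in for fan audio: a 200 Hz hum plus noise, 1 s at 16 kHz.
rng = np.random.default_rng(0)
t = np.arange(16_000) / 16_000

def fake_fan_clip(faulty=False):
    clip = np.sin(2 * np.pi * 200 * t) + 0.1 * rng.standard_normal(t.size)
    if faulty:
        clip = clip + np.sin(2 * np.pi * 3000 * t)  # extra tone as the "fault"
    return clip

model = onboard_device("fan", [fake_fan_clip() for _ in range(5)])
same_model = onboard_device("fan")  # second device of this type reuses it
normal_score = anomaly_score(model, fake_fan_clip())
faulty_score = anomaly_score(model, fake_fan_clip(faulty=True))
```

A clip whose spectrum deviates from the baseline receives a higher score than a normal clip. In a federated setup, the stored baseline would be replaced by whatever model the participating devices train jointly.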


  1. Best approach to solve the problem of distributed learning [Federated Learning or alternatives].
  2. Scalable architecture with a working Demo.


  1. Demo semi-supervised or unsupervised approach.
  2. Additional dataset other than MIMII.
  3. Anomaly detection on unlabelled dataset (approach).
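For the unlabelled case, one simple option is to set the decision threshold from the score distribution itself. The sketch below assumes anomaly scores have already been computed per clip by some model; it flags scores that lie far above the median, measured in units of the median absolute deviation (MAD). The function name, the factor `k=3.0`, and the example scores are illustrative assumptions.

```python
import numpy as np

def mad_threshold(scores, k=3.0):
    """Robust threshold: median + k * scaled MAD of the scores.

    The 1.4826 factor makes the MAD comparable to a standard deviation
    when the scores are roughly normally distributed.
    """
    scores = np.asarray(scores, dtype=float)
    median = np.median(scores)
    mad = np.median(np.abs(scores - median))
    return median + k * 1.4826 * mad

# Hypothetical anomaly scores for eight unlabelled clips.
scores = np.array([1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 4.2, 1.02])
threshold = mad_threshold(scores)
flags = scores > threshold  # True where a clip is flagged as anomalous
```

Because median and MAD are insensitive to a few extreme values, the threshold is not pulled upward by the anomalies it is meant to catch, which matters when no labels are available to remove them first.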

Boundary Conditions:

  • No restriction on: Framework, technology, programming languages, services to be utilised, and security.
  • External datasets may be used.


MIMII Dataset: Sound Dataset with the detailed explanation.


(a) Setup

The recording setup is described in detail in the paper. Recordings were collected with a single type of microphone, the TAMAGO-03 manufactured by System In Frontier Inc, which arranges eight microphones in a circle at equal distances. The left figure shows the direction and distance for each type of machine. The microphones record continuously in a reverberant environment, and the recordings are cut into 10-second segments, so the dataset contains eight separate channels for each segment. In addition to the target machine sound, background noise recorded in real factories is added to the audio track afterwards.
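The segmentation step can be sketched as follows. This is an assumed reconstruction for illustration, not the dataset authors' tooling: it takes a continuous eight-channel recording stored as an array of shape `(n_samples, n_channels)`, uses the 16 kHz sampling rate stated in the dataset analysis, and drops any trailing partial segment.

```python
import numpy as np

SAMPLE_RATE = 16_000     # Hz, as stated for the dataset
SEGMENT_SECONDS = 10     # segment length described above
N_CHANNELS = 8           # one channel per microphone

def segment_recording(recording, sample_rate=SAMPLE_RATE, seconds=SEGMENT_SECONDS):
    """Cut a (n_samples, n_channels) recording into fixed-length segments.

    Returns an array of shape (n_segments, samples_per_segment, n_channels);
    samples that do not fill a whole segment are discarded.
    """
    seg_len = sample_rate * seconds
    n_segments = recording.shape[0] // seg_len
    trimmed = recording[:n_segments * seg_len]
    return trimmed.reshape(n_segments, seg_len, recording.shape[1])

# 35 seconds of placeholder audio yields three full 10-second segments.
recording = np.zeros((SAMPLE_RATE * 35, N_CHANNELS), dtype=np.float32)
segments = segment_recording(recording)
```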

Figure: Schematic experimental setup for dataset recording.


(b) Analysis of the dataset

The MIMII dataset by Hitachi contains sounds generated by four types of industrial machines: valves, pumps, fans and slide rails. Each type includes seven individual machines, which may be entirely different product models. In this Proof of Concept, we used the fan recordings for edge computing. The fans are industrial fans, which provide a continuous flow of gas or air in factories. Of the seven individual fan machines (identified by model ID), we pick two. Each machine trains locally on its own device. Each machine has both normal and abnormal recordings, at a normal:abnormal ratio of about 5:2. The audio is sampled at 16 kHz. Real background noise recorded in multiple factories was mixed with the machine sounds, so each audio track is the sum of the machine sound and additive background noise.
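The additive mixing of machine sound and background noise can be sketched as below. The text above does not say at which level the noise was mixed, so the target signal-to-noise ratio `snr_db` is an assumed parameter, and the sine-wave "machine" and random "factory noise" are placeholders for real recordings.

```python
import numpy as np

def mix_at_snr(machine, noise, snr_db):
    """Add noise to a machine signal, scaled to reach a target SNR in dB."""
    machine_power = np.mean(machine ** 2)
    noise_power = np.mean(noise ** 2)
    # SNR(dB) = 10 * log10(machine_power / scaled_noise_power)
    target_noise_power = machine_power / (10 ** (snr_db / 10))
    gain = np.sqrt(target_noise_power / noise_power)
    return machine + gain * noise

# One second of placeholder audio at 16 kHz.
rng = np.random.default_rng(0)
t = np.arange(16_000) / 16_000
machine = np.sin(2 * np.pi * 120 * t)   # stand-in for fan sound
noise = rng.standard_normal(t.size)     # stand-in for factory noise
mixed = mix_at_snr(machine, noise, snr_db=6.0)
```

Lower values of `snr_db` bury the machine sound deeper in the background, which is a useful way to test how robust a detector is to a noisy plant.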

