A system that hears and thinks like humans.

What we do

Sound cognition in modern AI systems focuses mostly on speech recognition. However, sound carries a vast amount of non-verbal information that humans use to think, decide, and act. For instance, we can easily tell from a voice whether someone has caught a cold, whether a baby is crying or sleeping, or whether someone is approaching from the sound of their footsteps. We create a system that understands the semantics of audio, as humans do. We believe that future AI systems should provide appropriate services without intentional input from the user, so inferring context from audio will play an important role in AI technology in the near future.

Acoustic Scene/Event Identification

Humans understand their environment through various sensory inputs, and hearing is one of the most important of them. We extract information about acoustic scenes and events, which is the most critical information an AI needs to understand its surrounding context.

Music Information Retrieval

Music is a unique form of audio. Extracting high-level information such as tempo, downbeats, instruments, and mood from the raw audio signal opens up new possibilities for creative music applications and context-aware systems.

Speech Analysis

Speech recognition is used in various AI applications to understand commands. However, a voice carries much more hidden information about the speaker, such as age, gender, and emotion, which is highly useful for understanding the user's current state.

Our cutting-edge deep learning research team achieved top rankings in all tasks of the IEEE Audio and Acoustic Signal Processing (AASP) Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2017, one of the largest competitions in its field.

Acoustic scene classification

Detection of rare sound events

Sound event detection in real life audio

Sound event detection for smart cars: 2nd & 1st