When devices hear like people do, but even better
It’s a noisy world out there
Conventional voice user interfaces struggle in real-world environments. Background noise, overlapping conversations, and unpredictable conditions often degrade the quality of captured speech, leading to poor voice recognition accuracy and speaker frustration.
In comes clarity
Spatial Hearing AI is a core technology that powers Kardome’s solutions and products. It delivers breakthrough capabilities that enable voice UIs to hear what users say with unprecedented precision.
Making devices hear like humans
Spatial Hearing AI enables devices to hear and perceive their surroundings with precision. It listens to the 3D acoustic environment and maps the soundscape, isolating each source and distinguishing between multiple speakers, so devices can respond adaptively, guided by the context of their surroundings. And because it runs entirely on-device, it also ensures speed and privacy.
Spatial Hearing AI products
SoundMap
A key product, SoundMap spatially detects where speech and sound come from. It tracks moving sources, enables an interview mode that separates speakers into distinct audio streams for transcription, and captures speech only from designated zones.
ClearZone NS
A two-stage noise suppression engine that filters out ambient sound so speech comes through clearly.
Barge-In
Delivers real-time, multi-channel acoustic echo cancellation that removes the device’s own audio output from the microphone signal, so devices can hear a user even while loud music is playing in the background.
Advanced Voice
A suite of voice processing algorithms that optimizes speech for ASR, hands-free telephony, and other voice-enabled applications.
Get more insights
Have questions? We’ve got answers.
What is Spatial Hearing AI?
Spatial Hearing AI is Kardome’s core acoustic clustering technology. Unlike traditional beamforming that focuses on directional “beams,” Spatial Hearing AI creates a dynamic 3D map of the acoustic scene. It treats sound sources as distinct objects in space, allowing it to separate speech from noise and distinguish between multiple speakers based on their precise location (depth and elevation), not just direction.
What do you mean by "3D Acoustic Analysis"?
Unlike standard voice technologies that only detect direction, Kardome’s Spatial Hearing AI analyzes the entire 3D acoustic scene. It understands depth, distance, and elevation. This allows the system to distinguish between speakers in the environment, something traditional beamforming cannot do effectively.
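For the technically curious: a standard building block behind any spatial analysis is estimating the time-difference-of-arrival (TDOA) of a sound between microphone pairs, commonly done with GCC-PHAT. The sketch below is a generic textbook illustration of that cue, not Kardome’s algorithm:

```python
import numpy as np

def gcc_phat_tdoa(sig_a, sig_b, fs, max_tau=None):
    """Estimate the time-difference-of-arrival between two microphone
    signals using GCC-PHAT, a classic building block of sound-source
    localization. Returns the delay (in seconds) of sig_b relative to
    sig_a: positive when sig_b lags sig_a.
    """
    n = len(sig_a) + len(sig_b)
    A = np.fft.rfft(sig_a, n)
    B = np.fft.rfft(sig_b, n)
    R = B * np.conj(A)
    R /= np.abs(R) + 1e-12                 # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n)                # generalized cross-correlation
    max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
    # Re-center so negative and positive lags sit around the middle.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs
```

With TDOAs from several microphone pairs, a source can be localized in space rather than merely assigned a direction.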
How does Spatial Hearing AI differ from traditional beamforming?
Traditional beamforming focuses on a general direction but struggles with reflections and multiple speakers in the same “beam.” Spatial Hearing AI creates a complete 3D acoustic map of the environment. It spatially separates sound sources in real time, allowing it to isolate a specific speaker from a specific location, even in reverberant or crowded spaces, delivering far higher accuracy than directional beamforming.
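To make the contrast concrete, here is what a traditional delay-and-sum beamformer actually does: it time-aligns the microphone signals toward a single direction and averages them, which is why everything inside that “beam” blends together. A minimal numpy sketch of the generic technique (not Kardome code):

```python
import numpy as np

def delay_and_sum(mic_signals, mic_positions, direction, fs, c=343.0):
    """Classic delay-and-sum beamformer: steer a 'beam' toward one
    direction by time-aligning microphone signals and averaging.

    mic_signals   : (n_mics, n_samples) array of recordings
    mic_positions : (n_mics, 3) microphone coordinates in metres
    direction     : unit vector pointing from the array toward the source
    fs            : sample rate in Hz
    c             : speed of sound in m/s
    """
    n_mics, n_samples = mic_signals.shape
    # A mic farther along `direction` hears the wavefront earlier,
    # so its signal must be delayed the most to line up with the rest.
    delays = mic_positions @ direction / c       # seconds
    delays -= delays.min()                       # make all delays >= 0
    out = np.zeros(n_samples)
    for sig, d in zip(mic_signals, delays):
        shift = int(round(d * fs))               # integer-sample delay
        if shift < n_samples:
            out[shift:] += sig[: n_samples - shift]
    return out / n_mics
```

Note the limitation this code makes visible: the output depends only on one steering direction, so two talkers at different distances along the same bearing cannot be told apart.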
Can this technology distinguish between multiple speakers?
Yes. The core advantage of Spatial Hearing AI is its ability to treat every voice as a distinct object in 3D space. By spatially separating sound sources, the system can isolate the active speaker from background chatter or other people talking nearby, ensuring the voice assistant responds only to the intended command.
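As a toy illustration of the spatial-clustering idea, grouping sound by where it comes from rather than what it contains, here is a one-dimensional k-means over per-frame TDOA estimates. This is a deliberately simplified sketch for intuition only, not Kardome’s production method:

```python
import numpy as np

def cluster_frames_by_tdoa(frame_tdoas, n_speakers=2, iters=20):
    """Group audio frames into speakers using one scalar spatial
    feature per frame (an estimated time-difference-of-arrival).
    A plain 1-D k-means: position, not signal content, is the cue.
    Returns per-frame speaker labels and the cluster centroids.
    """
    tdoas = np.asarray(frame_tdoas, dtype=float)
    # Initialize centroids spread across the observed TDOA range.
    centroids = np.linspace(tdoas.min(), tdoas.max(), n_speakers)
    for _ in range(iters):
        # Assign each frame to the nearest centroid.
        labels = np.argmin(np.abs(tdoas[:, None] - centroids[None, :]), axis=1)
        # Move each centroid to the mean of its assigned frames.
        for k in range(n_speakers):
            if np.any(labels == k):
                centroids[k] = tdoas[labels == k].mean()
    return labels, centroids
```

Frames tagged with the same label belong to the same spatial position, i.e. the same talker, which is the intuition behind separating overlapping voices by location.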
Does the spatial hearing technology work if the speaker is moving around the room?
Absolutely. Unlike static directional microphones, our Spatial Hearing AI algorithms continuously track the acoustic scene in real time. This allows the system to “lock on” to a user and follow their voice as they move through the room, maintaining consistent voice capture without signal drop-offs.
Can the device hear commands while playing loud music?
Yes. The technology includes advanced Acoustic Echo Cancellation (AEC) capabilities. It effectively suppresses the device’s own audio output (like loud music or navigation prompts), allowing the system to clearly hear the user’s wake word or command without needing to lower the volume first.
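Acoustic echo cancellation in general works by adaptively estimating the echo path from the loudspeaker to the microphone and subtracting the predicted echo, leaving only the user’s speech. Below is a textbook normalized-LMS canceller, shown purely as a generic sketch of the principle:

```python
import numpy as np

def nlms_echo_cancel(far_end, mic, filter_len=128, mu=0.5, eps=1e-8):
    """Normalized-LMS acoustic echo canceller (textbook form).

    far_end : reference signal the device is playing (e.g. music)
    mic     : microphone signal = user speech + echoed far_end
    Returns the residual signal: the mic input with the estimated
    echo removed.
    """
    w = np.zeros(filter_len)              # adaptive echo-path estimate
    x = np.zeros(filter_len)              # sliding window of far-end samples
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        x = np.roll(x, 1)
        x[0] = far_end[n]
        echo_est = w @ x                  # predicted echo at the mic
        e = mic[n] - echo_est             # residual = speech + misadjustment
        w += mu * e * x / (x @ x + eps)   # NLMS weight update
        out[n] = e
    return out
```

Because the canceller knows exactly what the device is playing, the music can stay loud: only the echoed copy of it is removed, not the user’s voice.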
Is this technology hardware-dependent?
No. Kardome’s Spatial Hearing AI is hardware-agnostic. It delivers high-performance results using standard, low-cost microphones and integrates seamlessly with various processor architectures (ARM, DSP, etc.), helping OEMs reduce BOM costs while upgrading performance.
Does the processing happen on cloud or on edge?
The technology is designed for on-device processing. This approach ensures ultra-low latency for real-time interaction, reduces data usage, and protects user privacy, since raw audio is processed locally and never leaves the device.