Documentation Index

Fetch the complete documentation index at: https://docs.bragi.com/llms.txt

Use this file to discover all available pages before exploring further.

A wakeword interaction is a hands-free trigger mechanism that activates an AI assistant or feature on an audio device when the user speaks a defined word or phrase. The device continuously monitors audio input at low power for the wakeword — a task handled on the device itself rather than in the cloud — and activates the full AI interaction pipeline only when the wakeword is detected. Wakeword interaction is the primary mechanism through which audio devices transition from passive listening tools to active AI interfaces without requiring the user to reach for a button or a phone.

How wakeword detection works

Wakeword detection operates in two stages on a modern AI-enabled audio device.

The first stage runs continuously on the device at minimal power consumption. A small, efficient model monitors the audio stream for the specific acoustic pattern of the wakeword. This model is optimised for a low false-negative rate — missing the wakeword when it is spoken — while keeping the false-positive rate — activating when the wakeword was not spoken — acceptable. The trade-off between these two error types defines the wakeword experience: a model tuned too sensitively produces intrusive false activations, while one tuned too conservatively misses genuine commands.

The second stage activates when the wakeword is detected. The device signals the companion app and AI interaction layer that a voice session has begun, and the full AI processing pipeline — intent recognition, action execution, response generation — engages. The user's spoken input following the wakeword is processed and responded to, and the session ends when the interaction is complete or a defined timeout is reached.

The entire first stage runs on the device hardware — specifically on the neural processing capability of the SoC — rather than requiring a cloud connection. Wakeword detection therefore works without internet connectivity and without sending audio to external servers until the wakeword is detected and a session begins.
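The two-stage flow described above can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the platform's actual API: the `detector` callable stands in for the on-device stage-1 model, and audio "frames" are simplified to plain values.

```python
import enum


class State(enum.Enum):
    LISTENING = "listening"   # stage 1: low-power on-device monitoring
    SESSION = "session"       # stage 2: full AI pipeline engaged


class WakewordPipeline:
    """Sketch of the two-stage flow: a lightweight detector gates the full session."""

    def __init__(self, detector, session_timeout_s=5.0):
        self.detector = detector              # stage-1 model: audio frame -> bool
        self.session_timeout_s = session_timeout_s
        self.state = State.LISTENING
        self.session_started = 0.0
        self.session_audio = []

    def feed(self, frame, now):
        """Feed one audio frame; returns the captured audio when a session ends."""
        if self.state is State.LISTENING:
            # Stage 1: only the tiny detector sees the frame; nothing is kept.
            if self.detector(frame):
                self.state = State.SESSION
                self.session_started = now
                self.session_audio = []
        else:
            # Stage 2: buffer the user's spoken input for the full AI pipeline.
            self.session_audio.append(frame)
            if now - self.session_started > self.session_timeout_s:
                captured, self.session_audio = self.session_audio, []
                self.state = State.LISTENING
                return captured   # hand off to intent recognition and beyond
        return None
```

A real implementation would also end the session on an explicit completion signal, not only on timeout; the timeout alone keeps the sketch short.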

Wakeword vs button-press activation

Wakeword interaction is one of two primary activation mechanisms for AI features on audio devices. The alternative is button-press activation — holding or pressing a hardware button to begin a voice session.

Button-press activation is simpler to implement, more reliable in noisy environments, and carries no risk of unintended activation. Its limitation is that it requires a physical interaction with the device — reaching for the earbud or headphone to initiate a command. In contexts where the user's hands are occupied — exercise, driving, cooking, working — button-press activation is inconvenient.

Wakeword interaction enables genuinely hands-free AI access. Its limitations are false-positive risk — the device activating when it should not — and the engineering complexity of training a reliable wakeword model.

Products targeting users who need hands-free interaction benefit most from wakeword activation. Products targeting users who are primarily stationary may find button-press activation sufficient. Many AI audio products support both mechanisms, giving users the choice of activation method depending on their context.
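A product that supports both mechanisms typically routes them into a single session entry point, with each mechanism independently toggleable by the user. The sketch below uses hypothetical names; in a real product these handlers would be wired to the wakeword engine and the device's hardware-button events.

```python
class ActivationManager:
    """Sketch: route wakeword and button-press triggers into one entry point."""

    def __init__(self, wakeword_enabled=True, button_enabled=True):
        self.wakeword_enabled = wakeword_enabled  # user may disable hands-free
        self.button_enabled = button_enabled
        self.sessions = []  # record of started voice sessions, by trigger source

    def _start_session(self, source):
        self.sessions.append(source)
        return source

    def on_wakeword_detected(self):
        # Ignored when the user has turned off hands-free activation.
        if self.wakeword_enabled:
            return self._start_session("wakeword")
        return None

    def on_button_press(self):
        if self.button_enabled:
            return self._start_session("button")
        return None
```

Keeping the session logic behind one entry point means the rest of the AI pipeline does not need to know which mechanism started the interaction.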

Custom wakewords vs platform wakewords

Wakewords come in two forms on AI audio products. Platform wakewords — “Hey Siri”, “OK Google”, “Alexa” — activate a third-party AI assistant and route the interaction through that assistant’s infrastructure. The brand has no control over the interaction beyond the activation mechanism.

Custom wakewords activate a brand-defined interaction rather than a third-party assistant. The wakeword — which can carry the brand’s name or any other defined phrase — triggers an experience that the brand controls. The assistant persona, the capability set, and the interaction model are brand-defined rather than inherited from a third party. Custom wakewords require training a specific acoustic model for the defined phrase and testing it across the acoustic conditions the product will be used in. This is more engineering effort than activating a platform wakeword but produces a branded interaction that reinforces product identity rather than redirecting users to a third-party platform.
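The routing distinction between the two forms can be shown with a small helper. This is an illustration, not a real platform API: the phrase set, the return shape, and the example custom phrase “Hey Bragi” are all assumptions made for the sketch.

```python
# Phrases that route to a third-party assistant's infrastructure (illustrative set).
PLATFORM_WAKEWORDS = {"hey siri", "ok google", "alexa"}


def wakeword_routing(phrase):
    """Sketch: classify a configured wakeword phrase and who controls the session."""
    normalized = phrase.strip().lower()
    if normalized in PLATFORM_WAKEWORDS:
        # Platform wakeword: the interaction is handled by the third party's stack;
        # the brand controls nothing beyond activation.
        return {"kind": "platform", "controlled_by_brand": False}
    # Custom wakeword: needs a phrase-specific trained acoustic model, but the
    # brand owns the persona, capability set, and interaction model.
    return {"kind": "custom", "controlled_by_brand": True}
```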

Privacy considerations for wakeword interaction

Because wakeword detection requires continuous audio monitoring, it raises a privacy consideration that is specific to always-on voice features. The first-stage detection model processes audio continuously — raising the question of what happens to audio that does not contain the wakeword. On well-architected AI audio products, first-stage audio processing happens entirely on the device and is not transmitted or stored. Only audio captured after the wakeword is detected — the actual voice session — enters the processing pipeline that may involve cloud infrastructure. The continuous monitoring audio is processed locally and discarded. This architecture is a design requirement for GDPR compliance and equivalent frameworks in other jurisdictions, which require that audio data is not transmitted or stored without explicit user consent. Products that transmit continuous audio to cloud infrastructure for wakeword detection rather than processing on device face significant compliance exposure in regulated markets.
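The privacy architecture described above — pre-detection audio processed locally and discarded, only post-detection session audio eligible for upload — can be sketched with a small fixed-size buffer. A minimal sketch under stated assumptions; the class and buffer size are hypothetical, not a Bragi API.

```python
from collections import deque


class PrivateStageOneMonitor:
    """Sketch: pre-detection audio never leaves a small rolling on-device buffer."""

    def __init__(self, detector, buffer_frames=8):
        self.detector = detector                          # on-device stage-1 model
        self.local_buffer = deque(maxlen=buffer_frames)   # oldest frames overwritten
        self.uploadable = []       # frames a cloud-backed session is allowed to see
        self.session_active = False

    def process(self, frame):
        if self.session_active:
            # Post-wakeword session audio: may enter the cloud pipeline,
            # subject to the user's consent.
            self.uploadable.append(frame)
        else:
            # Continuous-monitoring audio: stays on device and is silently
            # discarded as the bounded buffer rolls over.
            self.local_buffer.append(frame)
            if self.detector(frame):
                self.session_active = True
```

The key property is structural: before detection, no code path appends to `uploadable`, so continuous-monitoring audio cannot be transmitted even by accident.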

How Bragi AI implements wakeword interaction

The Bragi platform supports wakeword interaction as part of its voice interaction layer — including both platform wakeword integration and the infrastructure for custom wakeword implementation. The on-device detection architecture, activation pipeline, and session management are platform components that brands configure rather than build independently. Bragi AI enables brands to build AI-enabled audio products with fast, easy control and a continuously expanding services ecosystem. Wakeword interaction is the primary mechanism through which “fast, easy control” becomes hands-free — removing the device from the interaction and making AI access a natural part of how users engage with their audio environment. For the broader context of what building a branded AI voice experience requires, see What does it take to build a branded AI voice experience?. For the privacy architecture that governs how voice data is handled after activation, see How does Bragi AI handle user voice data and privacy?.