[LAB FILE]

Voice AI Systems

Latency, interruption, and interface trust.

Question

What changes when an AI system is spoken to instead of typed to?

Hypothesis

Voice systems need to treat latency budgets, interruption handling, and repair strategies as core architecture rather than as later add-ons.
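
One way to make a latency budget concrete is to assign each stage a share of the turn and flag overruns. A minimal sketch follows. The stage names mirror the flow in this file, but the millisecond values and the helper names (`STAGE_BUDGET_MS`, `total_budget_ms`, `overruns`) are placeholders assumed for illustration, not measurements or recommendations.

```python
# Hypothetical per-stage budgets in milliseconds.
# The numbers are illustrative placeholders only.
STAGE_BUDGET_MS = {
    "capture": 50,
    "transcription": 200,
    "policy": 100,
    "tool_execution": 300,
    "generation": 250,
    "speech_output": 100,
}


def total_budget_ms(budget=STAGE_BUDGET_MS):
    """Sum of all stage budgets: the end-to-end target for one turn."""
    return sum(budget.values())


def overruns(measured_ms, budget=STAGE_BUDGET_MS):
    """Return {stage: overage_ms} for every measured stage that exceeded its budget.

    Stages missing from measured_ms are treated as not yet measured and skipped.
    """
    return {
        stage: measured_ms[stage] - limit
        for stage, limit in budget.items()
        if measured_ms.get(stage, 0) > limit
    }
```

Calling `overruns` with a measured turn returns only the stages that went over, which keeps the budget visible as a per-stage constraint instead of a single end-to-end number.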

Method

Map the flow across capture, transcription, policy, tool execution, generation, and speech output.
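
The mapping can start as a trace: pass one turn through the six stages in order and record how long each takes. The sketch below assumes each stage is a plain callable that takes the previous stage's output. `STAGES`, `run_turn`, and the handler dictionary are hypothetical names, not part of any existing system.

```python
import time

# Stage order taken from the method statement above.
STAGES = [
    "capture",
    "transcription",
    "policy",
    "tool_execution",
    "generation",
    "speech_output",
]


def run_turn(handlers, payload):
    """Run payload through every stage in order.

    handlers maps a stage name to a callable (an assumption of this sketch).
    Returns the final payload and a {stage: elapsed_ms} dictionary.
    """
    timings = {}
    for stage in STAGES:
        start = time.perf_counter()
        payload = handlers[stage](payload)
        timings[stage] = (time.perf_counter() - start) * 1000.0
    return payload, timings
```

A real pipeline would stream between stages instead of running them strictly in sequence. The sequential version is only meant to show where time goes on a single turn.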

Prototype

Build a turn manager that handles barge-in (the user speaking over system output), confirmation, and partial state repair.
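
A minimal sketch of such a turn manager, under stated assumptions: playback is simulated word by word, barge-in is simulated with an `interrupted_at` index, and repair means keeping confirmed slots while dropping unconfirmed ones so they are asked again. All class and method names are hypothetical, and the repair rule is one possible choice, not a settled design.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()
    SPEAKING = auto()
    CONFIRMING = auto()


@dataclass
class TurnManager:
    state: State = State.LISTENING
    confirmed: dict = field(default_factory=dict)  # slots the user accepted
    pending: dict = field(default_factory=dict)    # slots awaiting confirmation
    heard: list = field(default_factory=list)      # words actually played

    def propose(self, slot, value):
        """Stage a value and return the confirmation prompt for it."""
        self.pending[slot] = value
        self.state = State.CONFIRMING
        return f"Did you say {value} for {slot}?"

    def on_user_reply(self, accepted):
        """Confirmation: commit pending slots on yes, discard them on no."""
        if self.state is not State.CONFIRMING:
            return
        if accepted:
            self.confirmed.update(self.pending)
        self.pending.clear()
        self.state = State.LISTENING

    def speak(self, words, interrupted_at=None):
        """Play words in order; stop before index interrupted_at if given."""
        self.state = State.SPEAKING
        self.heard = []
        for i, word in enumerate(words):
            if interrupted_at is not None and i >= interrupted_at:
                self.on_barge_in()
                return
            self.heard.append(word)
        self.state = State.LISTENING

    def on_barge_in(self):
        """Barge-in: stop output and repair partial state.

        Confirmed slots survive; unconfirmed ones are dropped (an assumption).
        """
        self.pending.clear()
        self.state = State.LISTENING
```

Tracking `heard` separately from what was queued matters for repair: after an interruption the system should reason about what the user actually heard, not what it intended to say.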

Notes

Voice makes uncertainty obvious: a pause or a misheard word is noticed immediately. The system needs to acknowledge delays out loud and recover gracefully.
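
One way to admit delay is to speak a short acknowledgement when the answer is not ready within a patience window, and to fall back to a repair prompt if the work fails. The sketch below uses `asyncio`. The function name, the `speak` callback, the 0.7 second default, and both phrases are assumptions made for illustration.

```python
import asyncio


async def respond_with_acknowledgement(
    work,
    speak,
    patience_s=0.7,
    filler="One moment.",
    failure="Sorry, that did not work. Could you try again?",
):
    """Await work; say filler if it is slow, and failure if it raises.

    work is any awaitable that produces the reply text.
    speak is a callable that outputs one utterance (hypothetical interface).
    """
    task = asyncio.ensure_future(work)
    # asyncio.wait returns on timeout without cancelling the task.
    done, _pending = await asyncio.wait({task}, timeout=patience_s)
    if not done:
        speak(filler)
    try:
        result = await task
    except Exception:
        speak(failure)
        return None
    speak(result)
    return result
```

The acknowledgement is only spoken when the patience window expires, so fast replies are not padded with filler.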

Results / Open Questions

Open question: when should a voice agent ask for confirmation, and when should it proceed with a reversible action?
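
To make the question testable, one candidate rule can be written down and then challenged. The sketch below is a starting point, not an answer: it assumes each action carries a `reversible` flag and that the transcription stage reports a confidence score between 0 and 1. The `Action` type, the function name, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    reversible: bool


def needs_confirmation(action, asr_confidence, threshold=0.8):
    """Candidate rule: always confirm irreversible actions;
    confirm reversible ones only when transcription confidence is low.

    The threshold value is an illustrative assumption.
    """
    if not action.reversible:
        return True
    return asr_confidence < threshold
```

For example, `needs_confirmation(Action("set_timer", True), 0.95)` returns `False`, while any irreversible action returns `True` regardless of confidence.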

References

Placeholder for realtime speech systems, turn-taking, and multimodal interaction research.