Multimodal Models · Skywork AI
Audio Interaction Model: A Streaming Audio LLM That Decides When to Speak
The Audio Interaction Model runs a perceive-decide-respond loop in which an audio LLM listens, decides whether and when to reply, and answers on the fly. It is trained on StreamAudio-2M and is competitive across 8 benchmarks.