Vision-Language-Action · RLWRLD
RLDX-1: A Multi-Stream Vision-Language-Action Model for Dexterous Robots
RLDX-1, from RLWRLD and KAIST, adds motion, memory, and tactile streams to a Qwen3-VL backbone. It catches fast-moving objects 87.5% of the time, versus 29.2% for pi0.5, and beats GR00T N1.6 on LIBERO-Plus, 86.7% to 72.6%.