RobustNav. (a) A navigation agent pretrained in clean environments is asked to navigate to targets in unseen environments in the presence of (b) visual and (c) dynamics-based corruptions. Visual corruptions (e.g., camera crack) affect the agent’s egocentric RGB observations, while dynamics corruptions (e.g., drift in translation) affect the agent’s transition dynamics.
As a step towards assessing the robustness of embodied navigation agents, we propose RobustNav, a framework to quantify the performance of embodied navigation agents when exposed to a wide variety of visual corruptions (affecting RGB inputs) and dynamics corruptions (affecting transition dynamics). Most recent efforts in visual navigation have focused on generalizing to novel target environments with similar appearance and dynamics characteristics. With RobustNav, we find that some standard embodied navigation agents significantly underperform (or fail) in the presence of visual or dynamics corruptions. We systematically analyze the kinds of idiosyncrasies that emerge in the behavior of such agents when operating under corruptions. Finally, for visual corruptions in RobustNav, we show that while standard techniques to improve robustness, such as data augmentation and self-supervised adaptation, offer some zero-shot resistance and improvements in navigation performance, there is still a long way to go in recovering the performance lost relative to clean “non-corrupt” settings, warranting more research in this direction.
Visual Corruptions. Visual corruptions affect the agent's egocentric RGB frame. Except for Camera Crack and Lower FOV, all corruptions are supported at five different levels of severity.
Dynamics Corruptions. Dynamics corruptions affect transition dynamics in the target environment. Motion Bias (Constant and Stochastic) is modeled to mimic friction. Motion Drift models a setting where translation actions have a bias towards rotating right (or left). In Motor Failure, one of the rotation actions fails.
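
As a rough illustration of how the dynamics corruptions described above could act on an agent's transition function, the following is a minimal sketch and not the RobustNav implementation. The `Pose` class, the `step` function, the action names, the step size (0.25) and the rotation angle (30 degrees) are all hypothetical. It assumes that Motion Bias shortens each translation by a fraction, that Motion Drift offsets the direction of travel by a fixed angle, and that Motor Failure turns one rotation action into a no-op; the framework may model each of these differently.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Pose:
    x: float = 0.0
    y: float = 0.0
    heading_deg: float = 0.0  # 0 faces +y; positive angles turn right


def step(pose, action, *, move_dist=0.25, rot_deg=30.0,
         motion_bias=0.0, drift_deg=0.0, failed_action=None):
    """Apply one action under optional (hypothetical) dynamics corruptions."""
    if action == failed_action:
        # Assumed Motor Failure: the broken action leaves the pose unchanged.
        return pose
    if action == "rotate_left":
        return Pose(pose.x, pose.y, (pose.heading_deg - rot_deg) % 360)
    if action == "rotate_right":
        return Pose(pose.x, pose.y, (pose.heading_deg + rot_deg) % 360)
    if action == "move_ahead":
        # Assumed Motion Bias: friction removes a fraction of the translation.
        dist = move_dist * (1.0 - motion_bias)
        # Assumed Motion Drift: travel direction is offset right (+) or left (-).
        direction = math.radians(pose.heading_deg + drift_deg)
        return Pose(pose.x + dist * math.sin(direction),
                    pose.y + dist * math.cos(direction),
                    pose.heading_deg)
    raise ValueError(f"unknown action: {action}")


def sample_stochastic_bias(rng, low=0.0, high=0.3):
    """Assumed stochastic variant: draw a fresh bias for every step."""
    return rng.uniform(low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    pose = Pose()
    for action in ["move_ahead", "rotate_left", "move_ahead"]:
        pose = step(pose, action,
                    motion_bias=sample_stochastic_bias(rng),
                    drift_deg=10.0,
                    failed_action="rotate_left")
    print(pose)
```

In this sketch a constant bias corresponds to passing the same `motion_bias` at every step, while the stochastic variant resamples it per step. Keeping the corruptions as keyword arguments of a single transition function makes it easy to enable them one at a time when comparing against clean dynamics.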