Utonia: One Encoder for All 3D Point Clouds

A single self-supervised 137M-parameter model trained across five 3D domains, from satellite scans to indoor rooms, beats domain-specific encoders and raises robotic grasping success to 82.1%.
artificial-intelligence
Author

Kabui, Charles

Published

2026-03-13

Keywords

point-cloud, 3d-vision, self-supervised-learning, foundation-model, robotics