HALO: Closing Sim-to-Real Gap for Heavy-loaded Humanoid Agile Motion Skills via Differentiable Simulation

Xingyi Wang¹,²*, Chenyun Zhang²*, Weiji Xie²,³*, Chao Yu⁴, Wei Song¹, Chenjia Bai²†, Shiqiang Zhu¹†

¹Zhejiang University
²Institute of Artificial Intelligence (TeleAI), China Telecom
³Shanghai Jiao Tong University
⁴Lumos Robotics

*Equal Contribution †Corresponding Author

Paper · arXiv · Code (coming soon)

Demo

Abstract

Humanoid robots deployed in real-world scenarios often need to carry unknown payloads, which introduce significant mismatch and degrade the effectiveness of simulation-to-reality reinforcement learning methods. To address this challenge, we propose a two-stage gradient-based system identification framework built on the differentiable simulator MuJoCo XLA. The first stage calibrates the nominal robot model using real-world data to reduce intrinsic sim-to-real discrepancies, while the second stage further identifies the mass distribution of the unknown payload. By explicitly reducing structured model bias prior to policy training, our approach enables zero-shot transfer of reinforcement learning policies to hardware under heavy-load conditions. Extensive simulation and real-world experiments demonstrate more precise parameter identification, improved motion tracking accuracy, and substantially enhanced agility and robustness compared to existing baselines.

Method

Overview of HALO:
(a) Data Collection: Trajectories are collected under both loaded and unloaded conditions using an exploration policy trained with wide domain randomization (DR) and then deployed on the real robot with a fixed-foot constraint.
(b) Data Processing: Full-body trajectories are reconstructed from joint-state measurements via forward kinematics and foot-height alignment.
(c) Two-stage Payload-related Parameter Identification: Stage 1 optimizes the full set of model parameters on trajectories collected without payload, yielding a calibrated base model. Starting from this calibrated model, Stage 2 optimizes only the payload-related parameters on trajectories collected under loaded conditions.
(d) Heavy-loaded Motion Skills: The accurately identified model parameters enable zero-shot sim-to-real transfer of the learned skills to the physical heavy-loaded humanoid.
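
The two-stage structure in (c) can be illustrated on a toy problem. The sketch below is not the HALO implementation and does not use MuJoCo XLA: it assumes a 1-DoF point-mass pendulum, fits an inverse-dynamics torque residual instead of simulator rollouts, and uses hand-derived gradients in place of the ones a differentiable simulator would provide. All names (`model_torque`, `fit`, `base_mass`, `damping`, `payload`) and all numeric values are hypothetical. What it shows is the staging: calibrate the base parameters on unloaded data, freeze them, then fit only the payload mass on loaded data.

```python
import math
import random

G, L, DT = 9.81, 0.5, 0.01  # gravity, link length [m], sample period [s] (toy values)

def model_torque(mass, damping, q, v, a):
    """Point-mass pendulum inverse dynamics: tau = m*l^2*a + m*g*l*sin(q) + d*v."""
    return mass * (L * L * a + G * L * math.sin(q)) + damping * v

def make_trajectory(mass, damping, n=600, noise=0.05, seed=0):
    """Synthetic stand-in for logged robot data: (q, v, a, tau) along a sine motion."""
    rng = random.Random(seed)
    data = []
    for k in range(n):
        th = 2.0 * k * DT
        q, v, a = math.sin(th), 2.0 * math.cos(th), -4.0 * math.sin(th)
        tau = model_torque(mass, damping, q, v, a) + rng.gauss(0.0, noise)
        data.append((q, v, a, tau))
    return data

def loss_and_grad(params, data):
    """Mean squared torque residual and its analytic gradient per parameter."""
    total_mass = params["base_mass"] + params["payload"]
    loss = g_mass = g_damp = 0.0
    for q, v, a, tau in data:
        r = model_torque(total_mass, params["damping"], q, v, a) - tau
        phi = L * L * a + G * L * math.sin(q)  # d(tau_model)/d(mass)
        loss += r * r
        g_mass += 2.0 * r * phi
        g_damp += 2.0 * r * v
    n = len(data)
    # Base mass and payload enter as a sum, so they share one sensitivity.
    grad = {"base_mass": g_mass / n, "damping": g_damp / n, "payload": g_mass / n}
    return loss / n, grad

def fit(params, free, data, lr=0.05, steps=300):
    """Plain gradient descent that updates only the parameters named in `free`."""
    params = dict(params)  # leave the caller's dict untouched
    for _ in range(steps):
        _, grad = loss_and_grad(params, data)
        for name in free:
            params[name] -= lr * grad[name]
    return params

# Ground truth used only to synthesize data (assumed values).
TRUE = {"base_mass": 2.0, "damping": 0.3, "payload": 5.0}
unloaded = make_trajectory(TRUE["base_mass"], TRUE["damping"], seed=0)
loaded = make_trajectory(TRUE["base_mass"] + TRUE["payload"], TRUE["damping"], seed=1)

guess = {"base_mass": 1.0, "damping": 0.0, "payload": 0.0}
# Stage 1: calibrate the base model on unloaded data; payload stays at zero.
stage1 = fit(guess, free=("base_mass", "damping"), data=unloaded)
# Stage 2: freeze the calibrated base model and fit only the payload mass.
stage2 = fit(stage1, free=("payload",), data=loaded)
print({k: round(v, 3) for k, v in stage2.items()})
```

Freezing the base parameters in Stage 2 matters even in this toy: on loaded data alone, `base_mass` and `payload` appear only as a sum and cannot be separated, so the unloaded calibration is what makes the payload estimate identifiable.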

BibTeX

@misc{wang2026haloclosingsimtorealgapheavyloaded,
      title={HALO: Closing Sim-to-Real Gap for Heavy-loaded Humanoid Agile Motion Skills via Differentiable Simulation},
      author={Xingyi Wang and Chenyun Zhang and Weiji Xie and Chao Yu and Wei Song and Chenjia Bai and Shiqiang Zhu},
      year={2026},
      eprint={2603.15084},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2603.15084}, 
}
