2026-03-29
Towards Embodied AI with MuscleMimic: Unlocking full-body musculoskeletal motor learning at scale
Chengkun Li, Cheryl Wang, Bianca Ziliotto, Merkourios Simos, Jozsef Kovecses, Guillaume Durandau, Alexander Mathis et al.
problem
learning motor control for muscle-driven musculoskeletal models is hindered by two bottlenecks: the computational cost of biomechanically accurate simulation (seconds per timestep on CPU), and the scarcity of validated open full-body models. most prior work on humanoid control uses simplified rigid-body dynamics with joint torque actuators, ignoring the complexity of real muscles (activation dynamics, force-length-velocity relationships, tendon compliance). this limits transfer to real biomechanics research and prosthetics.
prior approaches:
- DeepMimic (peng et al.): joint-torque-driven characters, no muscle dynamics. can’t study biomechanics.
- AMP (peng et al.): adversarial motion priors for torque-driven humanoids. same limitation.
- MyoSuite (caggiano et al.): musculoskeletal but limited to isolated body parts (hand, arm). no full-body locomotion.
- learning muscle control for biomechanics (various): small-scale, single-task, often hand-crafted reward functions.
architecture
flowchart LR
mocap[SMPL motion capture] --> RT[retargeting pipeline]
RT --> joint_ref[reference joint angles]
joint_ref --> PPO[PPO policy]
obs[joint state, muscle state, task info] --> PPO
PPO --> exc[muscle excitations u_t]
exc --> Sim[MuJoCo GPU simulator]
Sim --> torque[Hill-type muscle forces]
torque --> env[environment physics]
env --> obs
style Sim fill:#c4b8a6,color:#fff
style PPO fill:#b09a84,color:#fff
simulator: uses MuJoCo with custom muscle actuators. each muscle is modeled as a Hill-type actuator with:
- activation dynamics: $\dot{a} = (u - a) / \tau_a$, where $u \in [0, 1]$ is the neural excitation and $\tau_a$ is the activation time constant
- muscle force: $f = a \cdot f_l(l) \cdot f_v(v) \cdot f_{\max}$, where $f_l$ and $f_v$ are the force-length and force-velocity curves
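the two equations above can be sketched in a few lines. this is a minimal illustration, not the paper's implementation: the Gaussian force-length curve, the clipped linear force-velocity curve, the normalized length/velocity inputs, and all parameter values (`tau_a`, `dt`, `width`) are assumptions chosen for readability, and passive fiber force and tendon compliance are left out.

```python
import numpy as np

def activation_step(a, u, tau_a=0.01, dt=0.001):
    """One explicit-Euler step of da/dt = (u - a) / tau_a (tau_a, dt are assumed values)."""
    u = np.clip(u, 0.0, 1.0)  # excitation is bounded to [0, 1]
    return a + dt * (u - a) / tau_a

def force_length(l_norm, width=0.45):
    """Illustrative Gaussian f_l, peaking at the optimal fiber length (l_norm = 1)."""
    return np.exp(-((l_norm - 1.0) / width) ** 2)

def force_velocity(v_norm):
    """Illustrative f_v: 1 at rest, 0 at max shortening speed (v_norm = -1), capped at 1.5."""
    return np.clip(1.0 + v_norm, 0.0, 1.5)

def muscle_force(a, l_norm, v_norm, f_max):
    """Active muscle force f = a * f_l(l) * f_v(v) * f_max."""
    return a * force_length(l_norm) * force_velocity(v_norm) * f_max

# hold a constant excitation and let the activation settle
a = 0.0
for _ in range(1000):
    a = activation_step(a, u=0.8)
print(a, muscle_force(a, l_norm=1.0, v_norm=0.0, f_max=1000.0))
```

because the activation is a first-order low-pass filter of the excitation, the policy cannot change muscle force instantaneously, which is one of the ways muscle control differs from direct torque control.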
two validated embodiments:
- fixed-root upper-body ($N = 126$ muscles, $30$ DOF): for manipulation tasks (reaching, grasping, tool use)
- full-body ($N = 416$ muscles, $76$ DOF): for locomotion (walking, running, jumping)
retargeting pipeline: maps SMPL motion capture data to musculoskeletal joint space via inverse kinematics, then extracts joint angle trajectories as reference for imitation.
policy: RL policy (PPO) that outputs muscle excitations $u_t$ per muscle per timestep. the observation space includes joint positions, velocities, muscle states (length, velocity, activation), and task-specific information (target positions, object states).
\[u_t \sim \pi_\theta(\cdot \mid o_t), \qquad u_t \in [0, 1]^{N_{\text{muscles}}}\]
massively parallel GPU simulation: the custom MuJoCo-based simulator runs $4096+$ environments in parallel on a single GPU, giving an order-of-magnitude speedup over CPU simulation.
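to make the shapes concrete, here is a rough sketch of a batched policy step that produces one excitation vector per environment. the linear "policy", the observation size `OBS_DIM`, the Gaussian exploration noise, and the sigmoid squashing are all assumptions for illustration; the notes do not say how the actual policy bounds its outputs to $[0, 1]$.

```python
import numpy as np

N_ENVS = 4096     # parallel environments, as quoted above
N_MUSCLES = 416   # full-body embodiment
OBS_DIM = 64      # hypothetical observation size, chosen only for this demo

def policy_excitations(obs, W, b, rng, noise_std=0.1):
    """Map a batch of observations (N_ENVS, OBS_DIM) to excitations (N_ENVS, N_MUSCLES)."""
    mean = obs @ W + b                                     # stand-in for a policy network
    pre = mean + noise_std * rng.normal(size=mean.shape)   # Gaussian exploration noise
    return 1.0 / (1.0 + np.exp(-pre))                      # sigmoid keeps every u in (0, 1)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(OBS_DIM, N_MUSCLES))
b = np.zeros(N_MUSCLES)
obs = rng.normal(size=(N_ENVS, OBS_DIM))

u = policy_excitations(obs, W, b, rng)
print(u.shape)
```

a single forward pass yields $4096 \times 416$ excitations, which is why batching both the policy and the simulator on the same device matters for throughput.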
training
- hardware: single NVIDIA GPU (type not specified, likely A100 or similar)
- training time: “days” for a generalist policy across hundreds of motions (vs months on CPU)
- algorithm: PPO with shared generalist policy across diverse motion clips
- dataset: hundreds of diverse motions from CMU Mocap and AMASS, retargeted to both embodiments via SMPL pipeline
- reward: combination of imitation reward (joint position tracking) and biomechanical regularization (muscle effort penalties, joint limit avoidance)
- curriculum: starts with easy motions, progressively adds harder ones
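a reward of the kind described above could look roughly like the following. the exponential tracking kernel, the squared-activation effort term, the soft joint-limit margin, and every weight (`sigma`, `w_effort`, `w_limit`, `margin`) are assumptions for illustration; the notes only name the reward components, not their functional form.

```python
import numpy as np

def imitation_reward(q, q_ref, sigma=0.5):
    """Tracking term in (0, 1]: equals 1 when joint angles match the reference exactly."""
    err = np.mean((q - q_ref) ** 2, axis=-1)
    return np.exp(-err / sigma ** 2)

def effort_penalty(a):
    """Mean squared muscle activation, a common proxy for muscle effort."""
    return np.mean(a ** 2, axis=-1)

def joint_limit_penalty(q, q_min, q_max, margin=0.05):
    """Total intrusion into a soft margin around the joint limits."""
    below = np.clip(q_min + margin - q, 0.0, None)
    above = np.clip(q - (q_max - margin), 0.0, None)
    return np.sum(below + above, axis=-1)

def total_reward(q, q_ref, a, q_min, q_max, w_effort=0.1, w_limit=1.0):
    """Imitation reward minus weighted biomechanical regularizers."""
    return (imitation_reward(q, q_ref)
            - w_effort * effort_penalty(a)
            - w_limit * joint_limit_penalty(q, q_min, q_max))

# 76 joints and 416 muscles, matching the full-body embodiment
q = np.zeros(76)
print(total_reward(q, q, np.full(416, 0.2), -np.ones(76), np.ones(76)))
```

the functions reduce over the last axis, so they also accept batched inputs of shape `(n_envs, n_joints)` and `(n_envs, n_muscles)`.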
evaluation
single generalist policy performance:
- trained on hundreds of diverse motions, achieves robust performance across unseen motion categories
- strong biomechanical validation against experimental walking/running data
- mean correlation $r = 0.90$ for joint kinematics against ground-truth motion capture
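for reference, a mean per-joint Pearson correlation can be computed as below. this assumes trajectories stored as `(T, n_joints)` arrays and a plain average over joints; how the paper aggregates across joints, clips, and subjects is not stated in these notes.

```python
import numpy as np

def mean_joint_correlation(q_sim, q_ref):
    """Mean Pearson r across joints for trajectories of shape (T, n_joints).

    Joints with zero variance give a division by zero and should be filtered out first.
    """
    q_sim = np.asarray(q_sim, dtype=float)
    q_ref = np.asarray(q_ref, dtype=float)
    sim_c = q_sim - q_sim.mean(axis=0)   # center each joint over time
    ref_c = q_ref - q_ref.mean(axis=0)
    num = (sim_c * ref_c).sum(axis=0)
    den = np.sqrt((sim_c ** 2).sum(axis=0) * (ref_c ** 2).sum(axis=0))
    return float(np.mean(num / den))

# synthetic check: a noisy copy of the reference should correlate strongly
t = np.linspace(0.0, 2.0 * np.pi, 200)
q_ref = np.stack([np.sin(t), np.cos(t)], axis=1)
rng = np.random.default_rng(0)
q_sim = q_ref + 0.1 * rng.normal(size=q_ref.shape)
print(mean_joint_correlation(q_sim, q_ref))
```

Pearson $r$ is invariant to offset and positive scaling, so a high value shows that the shape of the trajectory matches but does not by itself rule out a constant bias in joint angle.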
muscle activation analysis:
- key finding: kinematic imitation alone does NOT achieve physiological muscle fidelity
- the policy can match joint trajectories while using physiologically implausible muscle coordination patterns
- this suggests future work needs explicit biomechanical objectives (e.g., EMG matching, metabolic cost minimization), not just motion matching
embodiment-specific results:
- upper-body (126 muscles): successful manipulation across diverse reaching and grasping tasks
- full-body (416 muscles): locomotion (walking, running, turning) with physically plausible ground reaction forces
reproduction guide
- clone the repo:
git clone https://github.com/amathislab/musclemimic
- install dependencies: MuJoCo, PyTorch, CUDA. the README should have exact version pins
- download preprocessed motion datasets (links in repo)
- for a quick test: train on a single locomotion motion first (e.g., walking). expect convergence in hours on a single GPU
- for the full generalist: train across the full motion library. this takes days on a single GPU
- the repo includes pre-trained checkpoints, musculoskeletal model files, and retargeted datasets, which makes for excellent reproducibility
- known gotcha: muscle simulation is sensitive to timestep size. use the default timestep from the paper; smaller timesteps improve stability but slow training
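one way to see the timestep sensitivity is to integrate the activation dynamics $\dot{a} = (u - a)/\tau_a$ with explicit Euler. the error then shrinks by a factor $(1 - \Delta t / \tau_a)$ per step, so the update is only stable for $\Delta t < 2\tau_a$. the sketch below assumes explicit Euler and $\tau_a = 10$ ms purely for illustration; the integrator and time constants actually used in the repo may differ.

```python
def simulate_activation(dt, tau_a=0.01, u=1.0, steps=50):
    """Explicit-Euler rollout of da/dt = (u - a) / tau_a, starting from a = 0."""
    a = 0.0
    for _ in range(steps):
        a = a + dt * (u - a) / tau_a
    return a

stable = simulate_activation(dt=0.005)     # dt / tau_a = 0.5: error halves every step
unstable = simulate_activation(dt=0.025)   # dt / tau_a = 2.5: error grows 1.5x per step

print(f"dt=5 ms  -> a = {stable:.6f}")
print(f"dt=25 ms -> a = {unstable:.3e}")
```

with hundreds of muscles, the fastest time constant in the model sets the largest safe timestep, which is why changing the default is risky.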
notes
this is highly relevant for bopi’s embodied AI interests. the key contribution is making musculoskeletal simulation practical at scale through GPU parallelization. $416$ muscles is a real full-body model, not a toy.
the finding that kinematic imitation alone doesn’t produce physiological muscle fidelity is important and underappreciated. it means if you want to study real biomechanics (for prosthetics, rehabilitation, ergonomics), you need to go beyond motion matching and add explicit biomechanical objectives.
the open-source nature is excellent - code, models, datasets, and checkpoints all available. this makes it a strong foundation for future work.
open questions:
- can you combine muscle-based control with learned tactile feedback for dexterous manipulation?
- what happens if you add explicit EMG matching as a training objective? does it improve physiological fidelity?
- can the retargeting pipeline handle motion capture from different body shapes/sizes?
- how does the sample efficiency compare to torque-driven approaches? is the extra complexity of muscle dynamics worth it for robotics applications that don’t need biomechanical accuracy?