If you read Article 1, you understand why high-fidelity hand data is the bottleneck in robot dexterity training. But dexterous hands are only one part of a humanoid robot. Arms, shoulders, hips, spine, legs — the full kinematic chain needs to be captured if you want a humanoid to learn natural, whole-body movement from human demonstration.
MANUS handles the hands. Xsens handles the rest. Together, they form the complete human motion input layer that leading humanoid robotics teams are deploying today.
This article explains how the two systems work individually, how they integrate, and what the combined output enables for teleoperation and AI training workflows.
Why Full-Body Capture Matters for Humanoids
Teaching a robot to grasp an object is partly a hand problem and mostly a whole-body coordination problem. The trajectory of the arm as it approaches, the shoulder rotation, the hip counterbalance as the robot leans forward — these are all captured in a human demonstration and need to be translated to the robot's kinematic chain.
This is why isolated hand tracking, however precise, is insufficient for humanoid robot training. The policy needs to see the complete body motion, not just finger configuration. Without full-body data, you are training on an incomplete picture of what the human operator actually did.
The problem scales with task complexity. Simple reach-and-grasp can be approximate. But tasks like sorting small objects, operating tools, or performing two-handed manipulation — the kinds of tasks that create real economic value for humanoid robots — require the full kinematic context.
Two Different Tracking Technologies Solving Two Different Problems
Xsens uses inertial measurement units (IMUs) — miniature sensors that measure acceleration and angular rate. IMUs excel at tracking large body segments (torso, limbs) over unrestricted ranges of motion without any external infrastructure.
MANUS uses electromagnetic field (EMF) tracking for fingers — far more precise than IMU for small joint angles, and immune to the occlusion problems that affect optical hand tracking. The two technologies are complementary: neither one covers the other's use case well.
What Xsens Provides: The Body Layer
The Xsens Link is a full-body wearable motion capture system built around 17 IMU sensors distributed across the body — pelvis, sternum, head, both upper arms, forearms, hands, upper legs, lower legs, and feet. Each sensor measures six degrees of freedom, giving Xsens MVN software the raw data to compute a full-body skeletal pose at up to 400Hz.
What makes Xsens well-suited for robotics (versus entertainment mocap) is its engineering for real-world operating conditions:
- Magnetic disturbance compensation — the sensor fusion algorithm is designed to maintain accuracy in environments with metal structures, motors, and magnetic interference common in robotics labs
- No external infrastructure — no cameras, markers, or fixed reference points required; the operator can move freely anywhere in the working space
- 400Hz orientation output — high enough frequency for closed-loop gait control and contact-rich manipulation policy training
- ROS 1 and ROS 2 drivers — data streams directly into standard robotics middleware with minimal integration overhead
- NVIDIA Jetson compatible — cross-platform SDK including embedded computing targets used widely in humanoid robot onboard systems
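To illustrate the ROS 2 path, here is a minimal subscriber sketch. The topic name and message type are assumptions for illustration; the actual Xsens driver defines its own topics and message types, so check its documentation.

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import JointState  # assumed message type for this sketch

class XsensListener(Node):
    def __init__(self):
        super().__init__("xsens_listener")
        # "/xsens/joint_states" is a hypothetical topic name.
        self.create_subscription(JointState, "/xsens/joint_states", self.on_pose, 10)

    def on_pose(self, msg: JointState) -> None:
        # msg.name lists joint/segment names; msg.position holds angles (rad).
        self.get_logger().info(f"{len(msg.name)} joints at t={msg.header.stamp.sec}")

def main():
    rclpy.init()
    rclpy.spin(XsensListener())
    rclpy.shutdown()

if __name__ == "__main__":
    main()
```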
Xsens has been deployed in hundreds of humanoid robots across development, testing, and training phases. The system's track record in demanding real-world conditions is why it remains the reference choice for robotics teams that need reliable body kinematics data.
How MANUS and Xsens Integrate
The integration point is software: MANUS Core and Xsens MVN Animate are designed to work together natively. MANUS Core has a dedicated MVN integration mode that synchronizes finger data from the gloves with the full-body skeleton computed by MVN, merging them into a single unified data stream.
From an operator setup perspective, this means:
- Operator puts on the Xsens Link suit (approximately 5 minutes)
- Operator puts on MANUS Metagloves over the suit's hand sensors
- MANUS Core and Xsens MVN are calibrated in sequence — the glove finger data replaces MVN's estimated hand pose with measured finger articulation
- The combined skeleton — full body with precise 25 DoF hand data — streams in real time
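To make "single unified data stream" concrete, one merged frame might look like the structure below. The field names and shapes are illustrative (MVN computes 23 body segments from the 17 sensors; MANUS reports 25 DoF per hand), not the actual MANUS Core wire format.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class UnifiedFrame:
    """One frame of the merged body + hand stream (illustrative, not the
    real MANUS Core / MVN message layout)."""
    timestamp: float               # shared clock reference, seconds
    body_orientations: np.ndarray  # (23, 4) segment quaternions from MVN
    left_hand_joints: np.ndarray   # (25,) finger joint angles (rad) from MANUS
    right_hand_joints: np.ndarray  # (25,) finger joint angles (rad) from MANUS
```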
"We really enjoy using MANUS products, as they allow us to demonstrate intuitive motion to a variety of clients with high satisfaction."
Sun-Myung Kim, R&D Team Chief, Tesollo Inc.

The key integration detail is that MANUS replaces MVN's estimated hand pose with a measured one. By default, Xsens MVN infers hand and finger position from wrist orientation and general body-model assumptions. That is fine for animation but insufficient for robot manipulation training. MANUS gloves override this inference with actual per-joint measurements, a significant data-quality improvement for any pipeline that depends on hand configuration.
The Unified Skeleton Output
The combined MANUS + Xsens system outputs a single synchronized skeleton with the full kinematic chain from head to fingertip — no post-processing required to merge the two data sources. The output is available via:
- ROS 2: full-body + hand skeleton published as standard ROS messages; retargeting nodes consume the skeleton and map human joint angles to robot joint configurations (see the retargeting sketch after this list)
- MVN Analyze: combined data streams into Xsens MVN Analyze for ergonomic research, labeling, and dataset annotation workflows
- File export: full-body + hand demonstrations exported as structured files, in standard formats compatible with PyTorch, TensorFlow, and most robot learning frameworks
- NVIDIA Isaac: MANUS is natively supported in NVIDIA Isaac Lab 2.3 and Isaac Teleop; Xsens body data feeds into the same pipeline for whole-body humanoid control
- Visualization: real-time visualization and debugging in 3D environments, useful for verifying retargeting quality before committing to large-scale data collection runs
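As a hedged sketch of what a retargeting node does with that skeleton: map each human joint onto a robot joint, scale it, and clamp to the robot's limits. The joint names, scale factors, and limits below are illustrative, not those of any specific robot.

```python
import numpy as np

# Hypothetical human-joint -> (robot-joint, scale) map; real maps are
# built per robot from its URDF and the skeleton's joint naming.
JOINT_MAP = {
    "right_shoulder_pitch": ("r_shoulder_p", 1.0),
    "right_elbow_flex":     ("r_elbow",      1.0),
    "right_wrist_yaw":      ("r_wrist_y",    0.8),
}
# Illustrative joint limits in radians.
ROBOT_LIMITS = {"r_shoulder_p": (-2.9, 2.9), "r_elbow": (0.0, 2.6), "r_wrist_y": (-1.6, 1.6)}

def retarget(human_angles: dict[str, float]) -> dict[str, float]:
    """Map human joint angles (rad) to clamped robot joint commands."""
    commands = {}
    for human_joint, (robot_joint, scale) in JOINT_MAP.items():
        low, high = ROBOT_LIMITS[robot_joint]
        commands[robot_joint] = float(np.clip(human_angles[human_joint] * scale, low, high))
    return commands
```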
Real-World Deployments Using Both Systems
The MANUS + Xsens combination isn't a theoretical stack — it's already in production at several robotics organizations. A few documented deployments:
Kepler demonstrated K2 humanoid teleoperation using MANUS gloves paired with Xsens motion capture suits — full-body + hand control in a single integrated pipeline.
Alisys showcased real-time teleoperation at Mobile World Congress 2026 using MANUS gloves alongside Xsens suits for the complete motion capture layer.
Westlake University's Titan o1 humanoid uses MANUS gloves to mirror a human operator's hand movements with millimeter-level precision in real time.
Xsens powers full-body motion capture for Shanghai Humanoid Robots' manufacturing automation research, covering everything from bean sorting to assembly tasks.
Use Cases: Teleoperation and AI Training
Whole-body teleoperation
The most immediate application is full-body teleoperation: an operator wearing the Xsens suit and MANUS gloves controls a humanoid robot's complete kinematic chain in real time. Every arm movement, torso rotation, and finger position maps directly to the robot's corresponding joints via a retargeting layer (typically ROS 2 nodes with inverse kinematics).
This matters for tasks that require body-level coordination — moving an arm while maintaining balance, using both hands simultaneously, or adapting whole-body posture to reach a specific target. Hands-only control systems cannot capture this coordination.
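The inverse-kinematics part of that retargeting layer is often a damped least-squares update: the node computes the robot's end-effector pose error relative to the operator's wrist and solves for a joint-velocity step. A toy version is below; the Jacobian is a placeholder that a real node would obtain from the robot model (e.g. via KDL or Pinocchio).

```python
import numpy as np

def dls_step(jacobian: np.ndarray, pose_error: np.ndarray, damping: float = 0.05) -> np.ndarray:
    """One damped least-squares IK update: dq = J^T (J J^T + lambda^2 I)^-1 e.

    jacobian: (6, n) end-effector Jacobian from the robot model (placeholder here).
    pose_error: (6,) position + orientation error toward the operator's wrist pose.
    Returns an (n,) joint-velocity step; the damping term keeps the solve
    well-conditioned near singularities.
    """
    jjt = jacobian @ jacobian.T
    regularizer = (damping ** 2) * np.eye(jjt.shape[0])
    return jacobian.T @ np.linalg.solve(jjt + regularizer, pose_error)
```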
Full-body imitation learning datasets
For teams building foundation models for humanoid manipulation, the data collection workflow is straightforward: operators perform tasks while the system records synchronized full-body + hand kinematics. The resulting demonstrations capture both what the hands did and how the whole body was configured during each task — the complete context a policy needs to generalize.
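As a sketch of what those recordings might look like on disk, assuming frames shaped like the UnifiedFrame structure sketched earlier, compressed NumPy archives are one structured format that PyTorch and TensorFlow data loaders read easily. The file layout here is an assumption, not a MANUS or Xsens export format.

```python
import numpy as np

def save_demonstration(frames, path: str = "demo_0001.npz") -> None:
    """Write one task demonstration as stacked, time-aligned arrays."""
    np.savez_compressed(
        path,
        t=np.array([f.timestamp for f in frames]),                  # (T,)
        body=np.stack([f.body_orientations for f in frames]),       # (T, 23, 4)
        left_hand=np.stack([f.left_hand_joints for f in frames]),   # (T, 25)
        right_hand=np.stack([f.right_hand_joints for f in frames])  # (T, 25)
    )
```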
Why synchronized capture matters more than separate capture
Recording body and hand data with separate systems and then time-aligning them in post-processing introduces synchronization error — typically 10–50ms depending on the systems involved. For manipulation policies, this means hand configuration data is mismatched with body pose data at every time step. The MANUS + Xsens native integration avoids this entirely: both systems share a clock reference, and MANUS Core handles the merge in real time.
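A quick way to quantify what post-hoc alignment costs is to measure the worst-case gap between the two systems' timestamps; with the native integration's shared clock, that gap is zero by construction. The helper below is hypothetical, not part of either SDK.

```python
import numpy as np

def max_alignment_gap(body_t: np.ndarray, hand_t: np.ndarray) -> float:
    """Worst-case time offset (s) when pairing each body sample with its
    nearest hand sample. Both arrays must be sorted timestamps."""
    idx = np.clip(np.searchsorted(hand_t, body_t), 1, len(hand_t) - 1)
    gap = np.minimum(np.abs(hand_t[idx] - body_t), np.abs(hand_t[idx - 1] - body_t))
    return float(gap.max())
```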
Hardware: What to Order
The complete MANUS + Xsens stack for humanoid robotics involves three products. All three are available through Knoxlabs on a single purchase order.
- Xsens Link: 17-sensor IMU suit, 400Hz, ROS 1/2, NVIDIA Jetson compatible
- MANUS Metagloves Pro: 25 DoF, EMF tracking, native MVN integration; best for data collection
- MANUS Metagloves Pro Haptic: 25 DoF plus per-finger haptics; best for live robot control

The Metagloves Pro is the right choice for teams running data collection and training workflows; haptic feedback adds nothing when the goal is recording demonstrations. The Metagloves Pro Haptic is the right choice for teams running live teleoperation, especially tasks where contact sensing improves operator performance.
Both gloves integrate natively with Xsens MVN via MANUS Core. Both connect to ROS 2 and NVIDIA Isaac via the MANUS C++ SDK.
Getting Started with Knoxlabs
Knoxlabs is an authorized reseller of both MANUS and Xsens and can source, configure, and ship the complete stack as a single order. For teams deploying both systems together, this matters practically: hardware arrives at the same time, pre-tested for compatibility, with a single point of contact for any support issues.
For teams who want to validate the configuration before committing to full-scale procurement, Knoxlabs can advise on pilot configurations — a common starting point is a single operator kit (one Xsens Link suit, one pair of MANUS Metagloves Pro or Haptic) before scaling to multiple operator stations.
Ready to configure your motion capture stack?
Tell us your use case. We'll spec the right combination and quote within 24 hours.
The complete component catalog — including Xsens, MANUS, VR headsets, and robot hands — is on the Knoxlabs Robotics & Teleoperation hub.