In this paper, we study an important yet underexplored task in robot manipulation: cycle-based manipulation, in which a robot must perform cyclic or repetitive actions and stop at an expected terminal time. Such tasks are common in daily life, for example shaking a bottle or hammering a nail. Few prior works have addressed them, and two main challenges remain: 1) imitation methods often fail to complete these tasks within the expected terminal time because they use history ineffectively; 2) the absence of a benchmark with sufficient data and automatic evaluation tools hinders the development of effective solutions in this area.
To address these challenges, we first propose the CycleManip framework, which achieves cycle-based manipulation through end-to-end imitation without requiring extra models, a hierarchical structure, or significant computational overhead. The core insight is to enhance history perception with a cost-aware sampling strategy and to improve historical understanding through multi-task learning. Second, we introduce a cycle-based manipulation benchmark that provides diverse cycle-based tasks and an automatic evaluation method.
Extensive experiments in both simulation and real-world settings demonstrate that our method achieves high success rates in cycle-based manipulation. The results further show strong adaptation to general manipulation and plug-and-play compatibility with imitation policies such as Vision-Language-Action (VLA) models. Moreover, our approach can be applied across diverse robotic platforms, including bi-arm grippers, dexterous hands, and humanoid robots.
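To make the idea of perceiving a long history under a fixed cost more concrete, here is a minimal sketch of one way to subsample past time steps. It is an illustration only, not the CycleManip sampling strategy, whose details are not given on this page. The function name `sample_history_indices`, the parameters `budget` and `dense`, and the exponentially growing stride are all assumptions made for this example.

```python
from typing import List


def sample_history_indices(t: int, budget: int, dense: int = 4) -> List[int]:
    """Pick at most `budget` time steps from the history 0..t (hypothetical helper).

    Assumed scheme: the `dense` most recent steps are kept consecutively,
    then the stride doubles after every older sample. The number of frames
    passed to the policy therefore stays bounded by `budget`, while the
    covered time span grows roughly exponentially with it.
    """
    indices: List[int] = []
    offset = 0   # how far back from the current step t we are looking
    stride = 1   # gap to the next (older) sample
    while len(indices) < budget and offset <= t:
        indices.append(t - offset)
        if len(indices) >= dense:
            stride *= 2  # go sparser once the dense recent window is filled
        offset += stride
    return sorted(indices)


if __name__ == "__main__":
    # Eight frames cover 34 steps of history instead of only the last 8.
    print(sample_history_indices(t=100, budget=8, dense=4))
```

With a scheme like this, the policy input size stays constant as the episode grows, which is the kind of trade-off between history coverage and compute that a cost-aware sampler has to make.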
[Figure: failure cases. No cycling: Drum Hammering, Dual-Knife Chopping (Sim). Infinite looping: Bottle Shaking (Sim), Bottle Shaking (Real).]
[Figure: task videos. Block Hammering, Bottle Shaking, Carrot Cutting, Chemistry Mixing, Dual-Knife Chopping, Egg Beating, Morse SOS, Roller Rolling.]
[Figure: task videos. Block Hammering, Bottle Shaking, Drum Hammering, Table Cleaning, Carrot Cutting (Dexterous Hand), Tire Pumping (Humanoid).]
| Method | Block Hammering Suc. | Block Hammering Cyc. | Bottle Shaking Suc. | Bottle Shaking Cyc. | Roller Rolling Suc. | Roller Rolling Cyc. | Carrot Cutting Suc. | Carrot Cutting Cyc. | Dual-Knife Chopping Suc. | Dual-Knife Chopping Cyc. | Egg Beating Suc. | Egg Beating Cyc. | Chemical Mixing Suc. | Chemical Mixing Cyc. | Morse Tapping Suc. | Morse Tapping Cyc. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DP | 8 | 8.33 | 8 | 7.91 | 25 | 1.88 | 4 | 5.65 | 8 | 3.79 | 15 | 2.18 | 20 | 1.16 | 0 | - |
| DP3 | 23 | 5.55 | 16 | 4.58 | 33 | 1.44 | 38 | 1.92 | 48 | 0.81 | 19 | 1.95 | 18 | 1.41 | 1 | - |
| RDT | 20 | 2.15 | 15 | 1.53 | 35 | 1.55 | 36 | 1.24 | 42 | 2.13 | 16 | 2.31 | 12 | 2.0 | 0 | - |
| Pi-0 | 13 | 3.44 | 19 | 2.00 | 14 | 3.80 | 8 | 2.54 | 1 | 3.14 | 4 | 2.15 | 2 | 2.37 | 0 | - |
| Ours | 86 | 0.25 | 95 | 0.29 | 97 | 0.03 | 86 | 0.81 | 90 | 0.4 | 74 | 0.61 | 53 | 0.76 | 91 | - |
Suc. = Success Rate (%), Cyc. = Cycle Count Deviation
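As a rough illustration of how the two reported metrics could be computed from evaluation rollouts, consider the sketch below. The page does not define Cycle Count Deviation precisely, so the code assumes it is the mean absolute difference between executed and target cycle counts. The `Rollout` record and the `summarize` function are hypothetical names introduced for this example and are not part of the benchmark's tooling.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Rollout:
    """One evaluation episode (hypothetical record layout)."""
    success: bool
    executed_cycles: int  # cycles the policy actually performed
    target_cycles: int    # cycles the task asked for


def summarize(rollouts: List[Rollout]) -> Tuple[float, float]:
    """Return (success rate in %, cycle count deviation).

    Assumption: deviation is the mean of |executed - target| over all
    rollouts, so 0 means every episode hit the requested cycle count.
    """
    if not rollouts:
        raise ValueError("need at least one rollout")
    n = len(rollouts)
    success_rate = 100.0 * sum(r.success for r in rollouts) / n
    deviation = sum(abs(r.executed_cycles - r.target_cycles) for r in rollouts) / n
    return success_rate, deviation


if __name__ == "__main__":
    demo = [
        Rollout(True, 3, 3),
        Rollout(False, 5, 3),   # looped too long
        Rollout(True, 4, 4),
        Rollout(False, 1, 3),   # stopped too early
    ]
    print(summarize(demo))
```

Under this reading, a policy that loops forever and one that never starts cycling both score a large deviation, which matches the two failure modes illustrated earlier on the page.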
| Task | Setting | DP3 Suc. | DP3 Cyc. | w/o Task Suc. | w/o Task Cyc. | Ours Suc. | Ours Cyc. |
|---|---|---|---|---|---|---|---|
| Block Hammering | Single Gripper | 37.5 | 1.12 | 62.5 | 0.5 | 93.75 | 0.125 |
| Bottle Shaking | Single Gripper | 12.5 | 3.81 | 31.25 | 1.31 | 68.75 | 0.375 |
| Drum Beating | Bi-Gripper | 0 | 2.4 | 60 | 0.8 | 90 | 0.2 |
| Table Cleaning | Bi-Gripper | 20 | 0.9 | 40 | 1.6 | 100 | 0.00 |
| Tire Pumping | Humanoid | 10 | 3.70 | 20 | 2.0 | 50 | 1.5 |
| Knife Cutting | Bi-Dexterous | 0 | 1.75 | 25 | 4.125 | 75 | 0.88 |
w/o Task = Ours without historical understanding
@inproceedings{wei2025cyclemanip,
author = {Yi-Lin Wei and Haoran Liao and Yuhao Lin and Pengyue Wang and Zhizhao Liang and Guiliang Liu and Wei-Shi Zheng},
title = {CycleManip: Enabling Cyclic Task Manipulation via Effective Historical Perception and Understanding},
}