Separate Motion from Appearance: Customizing Motion via Customizing Text-to-Video Diffusion Models

Video Results


Customization on Sports


Reference Videos Edited Videos Edited Videos Edited Videos
"A person is walking." "A cowboy is walking in the wild west." "A robot is walking in a cyberpunk city." "A wizard is walking in a mystical forest."
"A person is riding a bicycle." "A panda is riding bicycle in a garden." "A girl is riding a bicycle in a country road." "Iron Man is riding a bicycle in a city."
"A person is skateboarding." "An astronaut is skateboarding on Mars." "A teddy bear skateboarding in Times Square New York." "A robot is skateboarding in a cyberpunk city."]
"A monkey is playing golf on a field full of flowers." "An astronaut is skateboarding on Mars." "A bear is playing golf in a garden." "Iron Man is playing golf in the square."

Customization on Cinematic Shots


Reference Videos Edited Videos Edited Videos Edited Videos
"A cup in a lab captured with a zoom in." "A wolf standing on a snowy mountain captured with a zoom in." "An eagle standing on the edge of a cliff captured with a zoom in." "A wizard standing in a mysterious forest captured with a zoom in."
"A man standing in room captured with a zoom out." "A wolf standing on a snowy mountain captured with a zoom out." "An eagle standing on the edge of a cliff captured with a zoom out." "A wizard standing in a mysterious forest captured with a zoom out."
"A cup in a lab captured with a dolly zoom." "A wolf standing on a snowy mountain captured with a dolly zoom." "An eagle standing on the edge of a cliff captured with a dolly zoom." "A wizard standing in a mysterious forest captured with a dolly zoom."

Customization on One-Shot


Reference Videos Edited Videos Edited Videos Edited Videos
"A car is running on the road." "A horse is running in a forest." "A tank is running on the road." "A dog is running in a garden."
Reference Videos Edited Videos Reference Videos Edited Videos
"A ship sails on the sea during sunset." "A floating island sails on the sea during sunset." "Two grey sharks swim in the blue ocean on a coral reef." "Two sea dragons swim in the blue ocean on a coral reef."
"A man is surfing inside the barrel of a wave." "A bear is surfing inside the barrel of a wave." "A women is eating a pizza with various toppings." "A woman is eating a sandwich with various toppings."

Comparisons to Baseline


Reference MotionDirector[1] Ours Reference MotionDirector[1] Ours
"A person is playing golf." "A monkey is playing golf on a field full of flowers." "A monkey is playing golf on a field full of flowers." "Two grey sharks swim in the blue ocean on a coral reef." "Two sea dragons swim in the blue ocean on a coral reef." "Two sea dragons swim in the blue ocean on a coral reef."
"A person is skateboarding." "A robot is skateboarding in a cyberpunk city." "A robot is skateboarding in a cyberpunk city." "A ship sails on the sea during sunset." "A floating island sails on the sea during sunset." "A floating island sails on the sea during sunset."
"A person is walking." "A teddy bear is walking in Times Square New York." "A teddy bear is walking in Times Square New York." "A woman is eating a pizza with various toppings" "A woman is eating a sandwich with various toppings" "A woman is eating a sandwich with various toppings"

Reference

[1] Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jia-Wei Liu, Weijia Wu, Jussi Keppo, and Mike Zheng Shou. MotionDirector: Motion customization of text-to-video diffusion models. In European Conference on Computer Vision. Springer, 2025.