[WIP] Add Wan2.2 Animate Pipeline (Continuation of #12442 by tolgacangoz) #12526
Open: dg845 wants to merge 98 commits into huggingface:main from dg845:add-wan2.2-animate-pipeline
3529a0a template1 (tolgacangoz)
4f2ee5e temp2 (tolgacangoz)
778fb54 up (tolgacangoz)
d77b6ba up (tolgacangoz)
2fc6ac2 fix-copies (tolgacangoz)
d667d03 Add support for Wan2.2-Animate-14B model in convert_wan_to_diffusers.py (tolgacangoz)
6182d44 style (tolgacangoz)
8c9fd89 Refactor WanAnimate model components (tolgacangoz)
d01e941 Enhance `WanAnimatePipeline` with new parameters for mode and tempora… (tolgacangoz)
7af953b Update `WanAnimatePipeline` to require additional video inputs and im… (tolgacangoz)
a0372e3 Add Wan 2.2 Animate 14B model support and introduce Wan-Animate frame… (tolgacangoz)
05a01c6 Add unit test template for `WanAnimatePipeline` functionality (tolgacangoz)
22b83ce Add unit tests for `WanAnimateTransformer3DModel` in GGUF format (tolgacangoz)
7fb6732 style (tolgacangoz)
3e6f893 Improve the template of `transformer_wan_animate.py` (tolgacangoz)
624a314 Update `WanAnimatePipeline` (tolgacangoz)
fc0edb5 style (tolgacangoz)
eb7eedd Refactor test for `WanAnimatePipeline` to include new input structure (tolgacangoz)
8968b42 from `einops` to `torch` (tolgacangoz)
dce83a8 Merge branch 'main' into integrations/wan2.2-animate (tolgacangoz)
75b2382 Add padding functionality to `WanAnimatePipeline` for video frames (tolgacangoz)
802896e style (tolgacangoz)
e06098f Enhance `WanAnimatePipeline` with additional input parameters for imp… (tolgacangoz)
84768f6 up (tolgacangoz)
06e6138 Refactor `WanAnimatePipeline` for improved tensor handling and mask g… (tolgacangoz)
5777ce0 Refactor `WanAnimatePipeline` to streamline latent tensor processing … (tolgacangoz)
b8337c6 style (tolgacangoz)
f4eb9a0 Add new layers and functions to `transformer_wan_animate.py` for enha… (tolgacangoz)
4e6651b Merge branch 'main' into integrations/wan2.2-animate (tolgacangoz)
d80ae19 Refactor `transformer_wan_animate.py` to improve modularity and type … (tolgacangoz)
348a945 Refactor `transformer_wan_animate.py` to enhance modularity and updat… (tolgacangoz)
7774421 Update the `ConvLayer` class to conditionally apply bias based on act… (tolgacangoz)
a5536e2 Simplify (tolgacangoz)
6a8662d refactor transformer (tolgacangoz)
96a126a Enhance `convert_wan_to_diffusers.py` for Animate model integration (tolgacangoz)
050b313 Merge branch 'main' into integrations/wan2.2-animate (tolgacangoz)
0566e5d Enhance `convert_wan_to_diffusers.py` and `WanAnimatePipeline` for im… (tolgacangoz)
fe02c25 simplify (tolgacangoz)
04ab262 Refactor `WanAnimatePipeline` to enhance reference image handling and… (tolgacangoz)
7bfbd93 Enhance weight conversion logic in `convert_wan_to_diffusers.py` (tolgacangoz)
7092a28 Enhance documentation and tests for WanAnimatePipeline, adding exampl… (tolgacangoz)
5d01574 Merge branch 'main' into integrations/wan2.2-animate (tolgacangoz)
9c0a65d Clarify contribution of M. Tolga Cangöz (tolgacangoz)
28ac516 Update face_embedder key mappings in `convert_wan_to_diffusers.py` (tolgacangoz)
b71d3a9 up (tolgacangoz)
5818d71 up (tolgacangoz)
bfda25d Fix image embedding extraction in WanAnimatePipeline to return the la… (tolgacangoz)
0ac259c Adjust default parameters in WanAnimatePipeline for num_frames, num_i… (tolgacangoz)
e2e95ed Update example docstring parameters for num_frames and guidance_scale… (tolgacangoz)
7146bb0 Refactor tests in WanAnimatePipeline: remove redundant assertions and… (tolgacangoz)
6ffdb99 Add fused relu for Wan animate activations (dg845)
4556730 Refactor motion encoder to use custom Conv2d and Linear with weight s… (dg845)
c3e69fc Refactor WanAnimateFaceEncoder to make it easier to understand (dg845)
7f4dde9 Refactor Wan Animate transformer to reuse WanTimeTextImageEmbedding (dg845)
4f204ec Refactor Wan Animate face blocks to use an attention processor (dg845)
57e9ea3 Refactor Wan Animate transformer, taking into account previous changes (dg845)
091b7ce Remove unused imports in transformer_wan_animate (dg845)
8216aef Merge branch 'main' into add-wan2.2-animate-pipeline (dg845)
50329d7 Add initial Wan Animate transformer tests (dg845)
275d324 Refactor face block attn into its own Attention class and fix some bugs (dg845)
ac2962d Fix issues (such as device placement issues) to get remaining transfo… (dg845)
0145135 Update Wan Animate conversion script to reflect changes to transformer (dg845)
bdbd141 Add _repeated_blocks to Wan Animate transformer for regional compilation (dg845)
2537133 Refactor Wan Animate pipeline to make latent preparation code more clear (dg845)
332d3c2 Update Wan Animate pipeline tests after transformer an pipeline changes (dg845)
99e56e3 Merge branch 'main' into add-wan2.2-animate-pipeline (dg845)
00ddbb9 Fix some batching and device placement issues in Wan Animate pipeline (dg845)
1e1e706 Remove reference_images tests for Wan Animate (dg845)
a56bee1 Get Wan Animate pipeline fp16 inference tests working (dg845)
6fb5ca8 Skip test_callback_inputs since the Wan Animate pipline is not compat… (dg845)
1e61ed7 Fix mask video shapes for Wan Animate replacement (dg845)
e2846f6 Use a separate VaeImageProcessor for the reference image as it uses d… (dg845)
3a80241 Fix some more Wan Animate pipeline shape errors (dg845)
86be600 Fix more bugs in Wan Animate pipeline (dg845)
6748d25 Ensure that the replacement mask only has one channel (dg845)
f696682 Support Wan Animate image preprocessing, fix bugs, clean up code (dg845)
80d9f8b Add docs for WanAnimateTransformer3DModel (dg845)
d9c6bc6 make style and make quality (dg845)
4e415d3 Merge branch 'main' into add-wan2.2-animate-pipeline (dg845)
cbfc0ad Fix first segment I2V mask for prev segement cond latents (dg845)
b80be86 Use same Open CLIP checkpoint as other Wan2.1-based models (dg845)
d87baa5 Merge branch 'main' into add-wan2.2-animate-pipeline (dg845)
6420f0e Copy Wan blocks for Wan Animate with # Copied from (dg845)
dd680ee Get regional compilation working without recompilation (dg845)
6d92b3e Remove Wan2.2 TI2V timestep logic as Wan Animate is based on Wan 2.1 (dg845)
d0c7750 Move motion encoder batch inference logic to forward and remove the m… (dg845)
c2ec703 Move (de)standardize latents logic into Wan Animate pipeline __call__ (dg845)
2f549ee Move Wan Animate ref image processing logic to its own VaeImageProces… (dg845)
68da86a make style and make quality (dg845)
cb7977e Make motion encoder inference batch size configurable from Wan Animat… (dg845)
847e4a2 Avoid list comprehension for batched motion encoder inference as it u… (dg845)
e4b1db0 Address more review comments (dg845)
f0a0d21 Merge branch 'main' into add-wan2.2-animate-pipeline (dg845)
e96f638 make style, make quality, make fix-copies (dg845)
a6ddd02 Make motion_encode_batch_size configurable in pipeline __call__ (dg845)
6ad82e5 Merge branch 'main' into add-wan2.2-animate-pipeline (dg845)
e74373b Update Wan Animate pipeline example (dg845)
2259ded Have Wan image processor take into account the spatial patch size as … (dg845)
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# WanAnimateTransformer3DModel

A Diffusion Transformer model for 3D video-like data was introduced in [Wan Animate](https://github.com/Wan-Video/Wan2.2) by the Alibaba Wan Team.

The model can be loaded with the following code snippet.

```python
import torch
from diffusers import WanAnimateTransformer3DModel

transformer = WanAnimateTransformer3DModel.from_pretrained("Wan-AI/Wan2.2-Animate-14B-720P-Diffusers", subfolder="transformer", torch_dtype=torch.bfloat16)
```

## WanAnimateTransformer3DModel

[[autodoc]] WanAnimateTransformer3DModel

## Transformer2DModelOutput

[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
This can be done in a follow-up PR, but all of these resize methods, including the ones for Wan2.1 and Wan2.2 5B, can now be added to the Wan image processor.
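For context, here is a minimal sketch of the kind of resize computation this comment is about. It assumes the target size is chosen from a pixel budget and then snapped down to a multiple of the VAE spatial scale factor times the transformer's spatial patch size, as in the resize code shown in the Wan image-to-video examples. `compute_resize_dims` is a hypothetical name, not a method in diffusers, and the defaults of 8 and 2 are assumptions for illustration only.

```python
import math


def compute_resize_dims(height, width, max_area, vae_scale_factor=8, patch_size=2):
    """Hypothetical helper: choose a (height, width) close to `max_area` pixels.

    The input aspect ratio is kept approximately, and both sides are snapped
    down to a multiple of vae_scale_factor * patch_size. The default values
    are assumptions, not values read from any model config.
    """
    mod_value = vae_scale_factor * patch_size
    aspect_ratio = height / width
    new_height = round(math.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
    new_width = round(math.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
    # Guard against very small inputs collapsing to zero after snapping.
    return max(new_height, mod_value), max(new_width, mod_value)


if __name__ == "__main__":
    # A 720x1280 input with a 720*1280 pixel budget is already aligned to 16.
    print(compute_resize_dims(720, 1280, max_area=720 * 1280))
```

Keeping this arithmetic in one place on the image processor would let each Wan pipeline pass in its own scale factor and patch size instead of repeating the computation.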