Description
Git commit
8f6c5c2
version https://github.com/leejet/stable-diffusion.cpp/releases/tag/master-348-8f6c5c2
Operating System & Version
Windows 10 22H2 (build 19045.4717)
GGML backends
Vulkan
Command-line arguments used
./sd.exe -M vid_gen --diffusion-model ./Wan2.2-TI2V-5B-Q8_0.gguf --vae ./wan2.2_vae.safetensors --t5xxl ./umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作 品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG 压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 480 -H 832 --diffusion-fa --offload-to-cpu --video-frames 33 --flow-shift 3.0 -v
Steps to reproduce
- Run the command listed under "Command-line arguments used".
- The run fails with these errors:
[ERROR] ggml_extend.hpp:75 - ggml_gallocr_reserve_n: failed to allocate Vulkan0 buffer of size 21773256964
[ERROR] ggml_extend.hpp:1588 - wan_vae: failed to allocate the compute buffer
What you expected to happen
The video is generated successfully.
What actually happened
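Sampling completes, but the wan_vae stage then fails to allocate its compute buffer on Vulkan0 and the process exits with a segmentation fault.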
Logs / error messages / stack trace
➜ vulkan ./sd.exe -M vid_gen --diffusion-model ./Wan2.2-TI2V-5B-Q8_0.gguf --vae ./wan2.2_vae.safetensors --t5xxl ./umt5-xxl-encoder-Q8_0.gguf -p "a lovely cat" --cfg-scale 6.0 --sampling-method euler -v -n "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作 品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG 压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" -W 480 -H 832 --diffusion-fa --offload-to-cpu --video-frames 33 --flow-shift 3.0 -v
Option:
n_threads: 12
mode: vid_gen
model_path:
wtype: unspecified
clip_l_path:
clip_g_path:
clip_vision_path:
t5xxl_path: ./umt5-xxl-encoder-Q8_0.gguf
qwen2vl_path:
qwen2vl_vision_path:
diffusion_model_path: ./Wan2.2-TI2V-5B-Q8_0.gguf
high_noise_diffusion_model_path:
vae_path: ./wan2.2_vae.safetensors
taesd_path:
esrgan_path:
control_net_path:
embedding_dir:
photo_maker_path:
pm_id_images_dir:
pm_id_embed_path:
pm_style_strength: 20.00
output_path: output.png
init_image_path:
end_image_path:
mask_image_path:
control_image_path:
ref_images_paths:
control_video_path:
auto_resize_ref_image: true
increase_ref_index: false
offload_params_to_cpu: true
clip_on_cpu: false
control_net_cpu: false
vae_on_cpu: false
diffusion flash attention: true
diffusion Conv2d direct: false
vae_conv_direct: false
control_strength: 0.90
prompt: a lovely cat
negative_prompt: 色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作 品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG 压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走
clip_skip: -1
width: 480
height: 832
sample_params: (txt_cfg: 6.00, img_cfg: 6.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: default, sample_method: euler, sample_steps: 20, eta: 0.00, shifted_timestep: 0)
high_noise_sample_params: (txt_cfg: 7.00, img_cfg: 7.00, distilled_guidance: 3.50, slg.layer_count: 3, slg.layer_start: 0.01, slg.layer_end: 0.20, slg.scale: 0.00, scheduler: default, sample_method: default, sample_steps: -1, eta: 0.00, shifted_timestep: 0)
moe_boundary: 0.875
prediction: default
flow_shift: 3.00
strength(img2img): 0.75
rng: cuda
seed: 42
batch_count: 1
vae_tiling: false
force_sdxl_vae_conv_scale: false
upscale_repeats: 1
chroma_use_dit_mask: true
chroma_use_t5_mask: false
chroma_t5_mask_pad: 1
video_frames: 33
vace_strength: 1.00
fps: 16
System Info:
SSE3 = 1
AVX = 1
AVX2 = 1
AVX512 = 0
AVX512_VBMI = 0
AVX512_VNNI = 0
FMA = 1
NEON = 0
ARM_FMA = 0
F16C = 1
FP16_VA = 0
WASM_SIMD = 0
VSX = 0
[DEBUG] stable-diffusion.cpp:147 - Using Vulkan backend
[DEBUG] ggml_extend.hpp:66 - ggml_vulkan: Found 1 Vulkan devices:
[DEBUG] ggml_extend.hpp:66 - ggml_vulkan: 0 = AMD Radeon RX 7900 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | bf16: 1 | warp size: 64 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
[INFO ] stable-diffusion.cpp:203 - loading diffusion model from './Wan2.2-TI2V-5B-Q8_0.gguf'
[INFO ] model.cpp:1001 - load ./Wan2.2-TI2V-5B-Q8_0.gguf using gguf format
[DEBUG] model.cpp:1018 - init from './Wan2.2-TI2V-5B-Q8_0.gguf'
[ERROR] ggml_extend.hpp:75 - gguf_init_from_file_impl: tensor 'patch_embedding.weight' has invalid number of dimensions: 5 > 4
[ERROR] ggml_extend.hpp:75 - gguf_init_from_file_impl: failed to read tensor info
[ERROR] model.cpp:1027 - failed to open './Wan2.2-TI2V-5B-Q8_0.gguf' with gguf_init_from_file. Try to open it with GGUFReader.
[DEBUG] gguf_reader.hpp:198 - GGUF v3, tensor_count=825, metadata_kv_count=3
[DEBUG] model.cpp:1739 - patch_embedding_channels 147456
[INFO ] stable-diffusion.cpp:243 - loading t5xxl from './umt5-xxl-encoder-Q8_0.gguf'
[INFO ] model.cpp:1001 - load ./umt5-xxl-encoder-Q8_0.gguf using gguf format
[DEBUG] model.cpp:1018 - init from './umt5-xxl-encoder-Q8_0.gguf'
[INFO ] stable-diffusion.cpp:264 - loading vae from './wan2.2_vae.safetensors'
[INFO ] model.cpp:1004 - load ./wan2.2_vae.safetensors using safetensors format
[DEBUG] model.cpp:1109 - init from './wan2.2_vae.safetensors', prefix = 'vae.'
[DEBUG] model.cpp:1739 - patch_embedding_channels 147456
[INFO ] stable-diffusion.cpp:285 - Version: Wan 2.2 TI2V
[INFO ] stable-diffusion.cpp:312 - Weight type stat: f32: 74 | f16: 720 | q8_0: 469
[INFO ] stable-diffusion.cpp:313 - Conditioner weight type stat: f32: 73 | q8_0: 169
[INFO ] stable-diffusion.cpp:314 - Diffusion model weight type stat: f32: 1 | f16: 524 | q8_0: 300
[INFO ] stable-diffusion.cpp:315 - VAE weight type stat: f16: 196
[DEBUG] stable-diffusion.cpp:317 - ggml tensor size = 400 bytes
[INFO ] wan.hpp:2123 - Wan2.2-TI2V-5B
[INFO ] stable-diffusion.cpp:451 - Using flash attention in the diffusion model
[DEBUG] ggml_extend.hpp:1783 - t5 params backend buffer size = 5757.05 MB(RAM) (242 tensors)
[DEBUG] ggml_extend.hpp:1783 - Wan2.2-TI2V-5B params backend buffer size = 5153.43 MB(RAM) (825 tensors)
[DEBUG] ggml_extend.hpp:1783 - wan_vae params backend buffer size = 1344.24 MB(RAM) (196 tensors)
[DEBUG] stable-diffusion.cpp:592 - loading weights
[DEBUG] model.cpp:1920 - using 12 threads for model loading
[DEBUG] model.cpp:1942 - loading tensors from ./Wan2.2-TI2V-5B-Q8_0.gguf
|================================> | 825/1263 - 428.57it/s
[DEBUG] model.cpp:1942 - loading tensors from ./umt5-xxl-encoder-Q8_0.gguf
|==========================================> | 1067/1263 - 219.91it/s
[DEBUG] model.cpp:1942 - loading tensors from ./wan2.2_vae.safetensors
|==================================================| 1263/1263 - 230.52it/s
[INFO ] model.cpp:2151 - loading tensors completed, taking 5.48s (process: 0.00s, read: 4.12s, memcpy: 0.00s, convert: 0.00s, copy_to_backend: 0.00s)
[INFO ] stable-diffusion.cpp:690 - total params memory size = 12254.72MB (VRAM 12254.72MB, RAM 0.00MB): text_encoders 5757.05MB(VRAM), diffusion_model 5153.43MB(VRAM), vae 1344.24MB(VRAM), controlnet 0.00MB(VRAM), pmid 0.00MB(VRAM)
[INFO ] stable-diffusion.cpp:769 - running in FLOW mode
[DEBUG] stable-diffusion.cpp:801 - finished loaded file
[INFO ] stable-diffusion.cpp:2745 - generate_video 480x832x33
[INFO ] stable-diffusion.cpp:947 - attempting to apply 0 LoRAs
[INFO ] stable-diffusion.cpp:967 - apply_loras completed, taking 0.00s
[DEBUG] stable-diffusion.cpp:968 - prompt after extract and remove lora: "a lovely cat"
[DEBUG] conditioner.hpp:1415 - parse 'a lovely cat' to [['a lovely cat', 1], ]
[DEBUG] t5.hpp:402 - token length: 512
[INFO ] ggml_extend.hpp:1698 - t5 offload params (5757.05 MB, 242 tensors) to runtime backend (Vulkan0), taking 1.35s
[DEBUG] ggml_extend.hpp:1598 - t5 compute buffer size: 297.00 MB(VRAM)
[DEBUG] conditioner.hpp:1515 - computing condition graph completed, taking 1728 ms
[DEBUG] conditioner.hpp:1415 - parse '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作 品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG 压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走' to [['色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作 品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG 压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走', 1], ]
[DEBUG] t5.hpp:402 - token length: 512
[INFO ] ggml_extend.hpp:1698 - t5 offload params (5757.05 MB, 242 tensors) to runtime backend (Vulkan0), taking 1.28s
[DEBUG] ggml_extend.hpp:1598 - t5 compute buffer size: 297.00 MB(VRAM)
[DEBUG] conditioner.hpp:1515 - computing condition graph completed, taking 1667 ms
[INFO ] stable-diffusion.cpp:2999 - get_learned_condition completed, taking 3412 ms
[DEBUG] stable-diffusion.cpp:3055 - sample 30x52x9
[INFO ] ggml_extend.hpp:1698 - Wan2.2-TI2V-5B offload params (5153.43 MB, 825 tensors) to runtime backend (Vulkan0), taking 1.69s
[DEBUG] ggml_extend.hpp:1598 - Wan2.2-TI2V-5B compute buffer size: 335.35 MB(VRAM)
|==================================================| 20/20 - 4.17s/it
[INFO ] stable-diffusion.cpp:3082 - sampling completed, taking 83.52s
[INFO ] stable-diffusion.cpp:3103 - generating latent video completed, taking 84.18s
[INFO ] ggml_extend.hpp:1698 - wan_vae offload params (1344.24 MB, 196 tensors) to runtime backend (Vulkan0), taking 0.23s
ggml_vulkan: Device memory allocation of size 2760376320 failed.
ggml_vulkan: Requested buffer size exceeds device buffer size limit: ErrorOutOfDeviceMemory
[ERROR] ggml_extend.hpp:75 - ggml_gallocr_reserve_n: failed to allocate Vulkan0 buffer of size 21773256964
[ERROR] ggml_extend.hpp:1588 - wan_vae: failed to allocate the compute buffer
[1] 1506 segmentation fault ./sd.exe -M vid_gen --diffusion-model ./Wan2.2-TI2V-5B-Q8_0.gguf --vae -p
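The log reports "generate_video 480x832x33" and later "sample 30x52x9". As a rough sketch of how those numbers relate, the snippet below assumes a 16x spatial and 4x temporal compression in the VAE; these factors and the function name are assumptions chosen because they reproduce the logged latent size, not values read from the stable-diffusion.cpp source.

```python
def latent_shape(width: int, height: int, frames: int,
                 spatial: int = 16, temporal: int = 4) -> tuple[int, int, int]:
    """Estimate the latent size for a video request.

    spatial and temporal are assumed compression factors (hypothetical
    defaults picked to match the "sample 30x52x9" line in the log).
    """
    latent_w = width // spatial
    latent_h = height // spatial
    # The first frame is kept on its own, the rest are grouped by `temporal`.
    latent_t = (frames - 1) // temporal + 1
    return latent_w, latent_h, latent_t


if __name__ == "__main__":
    # Values taken from the command line: -W 480 -H 832 --video-frames 33
    print(latent_shape(480, 832, 33))
```

If the assumption holds, the VAE has to decode a 30x52x9 latent back to 33 full-resolution frames in one pass, which is the stage where the allocation fails.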
Additional context / environment details
CPU: AMD Ryzen 9 5900X
GPU: AMD Radeon RX 7900 XT (20 GB VRAM, 32 GB shared memory)
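To put the failing allocations in context, here is a small sketch that converts the byte counts from the error messages into GiB and compares them with the card's memory. It assumes that "20G" means 20 GiB of VRAM; the constant and function names are illustrative only.

```python
GIB = 1024 ** 3

# Byte counts copied from the log above.
REQUESTED_COMPUTE_BUFFER = 21_773_256_964  # ggml_gallocr_reserve_n error
FAILED_DEVICE_ALLOCATION = 2_760_376_320   # ggml_vulkan error

# Assumption: the reported "20G" of GPU memory is 20 GiB.
VRAM_BYTES = 20 * GIB


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


if __name__ == "__main__":
    print(f"wan_vae compute buffer request: {to_gib(REQUESTED_COMPUTE_BUFFER):.2f} GiB")
    print(f"single failed device allocation: {to_gib(FAILED_DEVICE_ALLOCATION):.2f} GiB")
    print(f"request exceeds VRAM: {REQUESTED_COMPUTE_BUFFER > VRAM_BYTES}")
```

Under that assumption, the wan_vae compute buffer request alone is larger than the whole VRAM, before counting the 1344.24 MB of VAE weights that were already offloaded to Vulkan0.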