[Feature] Allow dropping models from memory altogether between steps. #908

@ClearStaff

Feature Summary

Add a toggle that keeps each model in memory only while it is actually needed.

Detailed Description

Hello, as you may know, Apple Silicon MacBooks have a unified memory architecture, which basically means that kicking models out of VRAM doesn't do anything: when they "return to RAM" on these devices, the models still occupy the same unified memory.

Considering how fast the SSDs are on these devices, it would be great to be able to load the CLIP text encoder, encode the text, unload CLIP, load for example WAN, create the latent image, unload WAN, and then finally load the VAE and generate the output (see the sketch below).
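
To illustrate the idea, here is a minimal C++ sketch of that flow. It is purely hypothetical: `load_model`, `free_model`, `encode_text`, `run_diffusion`, `decode_latent` and the `.gguf` file names are placeholders, not the project's actual API. The point is just that at most one large model is resident in unified memory at a time, and only the small intermediate results (embedding, latent) survive between steps.

```cpp
// Hypothetical sketch of "load -> use -> fully free" per pipeline stage.
// None of these names come from the real codebase.

#include <cstdint>
#include <string>
#include <vector>

struct Model     { std::vector<float> weights; };   // weights + runtime context
struct Embedding { std::vector<float> data; };      // text conditioning
struct Latent    { std::vector<float> data; };      // diffusion output
struct Image     { std::vector<uint8_t> pixels; };  // decoded result

// Placeholder helpers: load weights from disk, free them completely when done.
Model*    load_model(const std::string& /*path*/)        { return new Model{}; }
void      free_model(Model* m)                           { delete m; }
Embedding encode_text(Model*, const std::string&)        { return {}; }
Latent    run_diffusion(Model*, const Embedding&)        { return {}; }
Image     decode_latent(Model*, const Latent&)           { return {}; }

Image generate(const std::string& prompt) {
    // 1. Text encoder: load, encode, unload. Only the small embedding survives.
    Model* clip = load_model("clip.gguf");
    Embedding cond = encode_text(clip, prompt);
    free_model(clip);

    // 2. Diffusion model (e.g. WAN): load, sample, unload. Only the latent survives.
    Model* diffusion = load_model("wan.gguf");
    Latent latent = run_diffusion(diffusion, cond);
    free_model(diffusion);

    // 3. VAE: load, decode, unload.
    Model* vae = load_model("vae.gguf");
    Image out = decode_latent(vae, latent);
    free_model(vae);

    return out;
}
```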

Unloading models completely would be a game changer for MacBooks: instead of worrying that all of your models combined plus their contexts must fit in the available memory, your only limit would be the current model plus its context plus the input from the previous model.
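
Roughly, and only as an illustration of the assumed memory accounting (not a measurement):

$$\text{peak}_{\text{current}} \approx \sum_i \left(\text{model}_i + \text{context}_i\right), \qquad \text{peak}_{\text{proposed}} \approx \max_i \left(\text{model}_i + \text{context}_i + \text{output}_{i-1}\right)$$

so the peak would be set by the single largest step (typically the diffusion model) rather than by the sum of every model in the pipeline.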

I would tackle this myself, but I'm afraid I don't really use or know C++. You could still point me in the right direction and I can try to see whether I can do something on my own if nobody else has time for this.

Alternatives you considered

No response

Additional context

No response
