[Feature] Allow dropping models from memory altogether between steps.

### Feature Summary

Add a toggle that keeps each model in memory only while it is needed.

### Detailed Description

Hello, as you may know, Apple Silicon MacBooks have a unified memory architecture. This basically means that kicking models out of VRAM doesn't do anything: when they "return to RAM" on these devices, they still occupy the same unified memory.

Considering how fast the SSDs are on these devices, it would be great to be able to load the CLIP models, encode the text, unload CLIP, then load, for example, WAN, create the latent image, unload WAN, and finally load the VAE and generate the output.
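To illustrate the order of operations I have in mind, here is a rough sketch. It is written in Python only because I don't know C++, and it is not based on the project's actual code: `loaded`, `run_pipeline`, and the stage names are hypothetical, and the "loading" and "inference" steps are stand-ins.

```python
from contextlib import contextmanager

resident = []  # names of models currently held in memory
trace = []     # snapshot of `resident` taken while each stage runs


@contextmanager
def loaded(name):
    """Hypothetical scoped loader: the model is resident only inside the `with` block."""
    resident.append(name)      # stand-in for reading the weights from SSD
    try:
        yield name
    finally:
        resident.remove(name)  # stand-in for freeing the weights entirely


def run_pipeline(prompt):
    with loaded("clip") as model:
        trace.append(list(resident))
        cond = f"{model}({prompt})"   # placeholder for text encoding
    with loaded("wan") as model:
        trace.append(list(resident))
        latent = f"{model}({cond})"   # placeholder for sampling the latent
    with loaded("vae") as model:
        trace.append(list(resident))
        return f"{model}({latent})"   # placeholder for decoding the output


result = run_pipeline("a cat")
print(result)  # vae(wan(clip(a cat)))
print(trace)   # [['clip'], ['wan'], ['vae']]
```

At no point is more than one model resident, and only the small intermediate result (conditioning, then latent) is carried from one stage to the next.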

Unloading models completely would be a game changer for MacBooks. Instead of worrying that all your models combined, plus their contexts, must fit in the available memory, your only limit would be the current model + its context + the input from the previous model.
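As a rough illustration of the difference, the two limits can be compared like this. The sizes are made-up numbers in GB, not measurements of any real model, and the function names are hypothetical.

```python
def peak_all_resident(stages):
    """Every model and its context stay in memory for the whole run."""
    return sum(s["weights"] + s["context"] for s in stages)


def peak_one_at_a_time(stages):
    """Only the current model, its context, and the previous stage's output are resident."""
    return max(s["weights"] + s["context"] + s["input"] for s in stages)


# Made-up sizes in GB, purely illustrative
stages = [
    {"name": "text encoder", "weights": 6, "context": 1, "input": 0},
    {"name": "diffusion model", "weights": 14, "context": 4, "input": 1},
    {"name": "VAE", "weights": 1, "context": 3, "input": 1},
]

print(peak_all_resident(stages))   # 29
print(peak_one_at_a_time(stages))  # 19
```

With everything resident the requirement is the sum over all stages, while with full unloading it is only the largest single stage.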

I would tackle it myself, but I'm afraid I don't really use or know C++. Still, if nobody else has time for this, you could point me in the right direction and I can try to see whether I can do something on my own.

### Alternatives you considered

_No response_

### Additional context

_No response_

[Feature] Allow dropping models from memory altogether between steps. #908