LTX-2 Inflate It!
- Prompt
- The video showcases a figurine. The camera zooms out. Then infl4t3 inflates it, the figurine expanding into a giant, inflated balloon against the landscape.
- Prompt
- The video showcases a tank, then infl4t3 inflates it, the tank expanding into a giant, inflated balloon against the landscape.
- Prompt
- The video shows an anime girl on a table. Then infl4t3 inflates it, the anime girl transforming into a perfect, inflated balloon. She then flies upwards.
- Prompt
- The video shows three game dice located at the center of the scene. Then infl4t3 inflates it, the dice transforming into perfect, inflated balloons.
- Prompt
- The video showcases an anime girl, then infl4t3 inflates it, the anime girl expanding into a giant, inflated balloon against the landscape.
- Prompt
- A rabbit rests against a beige background, then infl4t3 inflates it, its form expanding, and its skin and feathers ballooning around the rabbit in a chic, air-filled design.
Model description
LTX-2 Inflate it!
We all know the potential of a model reveals itself only when LoRAs are trained. This is my second LTX-2 LoRA, after the Hydraulic Press one. It continues my journey of porting some of the classic VFX LoRAs (the ones I have the most affection for) to LTX-2.
In contrast to the press LoRA, this one was harder to train. Even with CREPA, Prodigy, and TREAD activated, my first attempt was disastrous: the characters and objects weren't inflating; instead, balls and spheres spawned near them, and only the balls flew upwards.
My hypothesis is that LTX-2 has been RL-maxxed (heavily optimized with reinforcement learning) on subject integrity, which is why, in contrast to static transformations like the hydraulic press or "being cut as a cake", it's harder to produce surrealism or alterations to living subjects.
In the previous LoRAs, only the attention part of the transformer was unfrozen. The attention mechanism is responsible for "mixing" concepts, and training it alone is usually enough for a LoRA. However, if a model doesn't know a concept (LTX-2 was trained on far fewer concepts than Wan), or refuses to know it (because of RL's integrity enforcement), then the FFN part of the transformer, which is thought to hold a transformer's concept space, needs to be unfrozen as well. The downside is that most of a model's parameters reside in the FFN, so the LoRA is correspondingly larger (1 GB vs 600 MB).
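The size difference is easy to reason about with a back-of-the-envelope count. A LoRA on a `Linear(out, in)` layer adds `rank * (in + out)` parameters, so the wide FFN projections dominate. The hidden size and FFN expansion factor below are illustrative assumptions, not LTX-2's actual dimensions:

```python
def lora_params(shapes, rank):
    # Each LoRA adapter on a Linear(out, in) layer adds rank * (in + out) params
    return sum(rank * (i + o) for (o, i) in shapes)

d = 4096      # assumed hidden size, for illustration only
ffn_mult = 4  # assumed FFN expansion factor
rank = 32

attn = [(d, d)] * 4                            # q, k, v, out projections
ffn = [(ffn_mult * d, d), (d, ffn_mult * d)]   # up and down projections

per_block_attn = lora_params(attn, rank)             # 8 * rank * d
per_block_full = lora_params(attn + ffn, rank)       # 18 * rank * d
print(per_block_full / per_block_attn)               # 2.25
```

With these assumed shapes, adding the FFN more than doubles the per-block LoRA parameter count, which is consistent with the jump from roughly 600 MB to 1 GB.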
The runtime for this experiment totalled 3.6 hours (2000 steps) on a single 5090 (with a musubi block swap of 4 to compensate for the larger LoRA size).
I'd say I'm satisfied with the result: the objects, even living or cartoonish ones, inflate realistically, don't merge with each other, don't inflate objects not mentioned in the prompt, and don't spawn orbs (at least, not that often).
In the end, if you are training a LoRA on concepts involving serious transformations of living subjects and don't want to waste GPU time, I strongly advise you to unfreeze the FFN ("ff.net.0.proj", "ff.net.2").
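As a sketch, a target-module list that unfreezes the FFN alongside attention might look like the following. The attention names ("to_q", etc.) follow common diffusers conventions and are assumptions on my part; "ff.net.0.proj" and "ff.net.2" are the FFN projections named above:

```python
# Hypothetical LoRA config fragment; module names for attention are assumed.
lora_config = {
    "rank": 32,  # illustrative rank
    "target_modules": [
        "to_q", "to_k", "to_v", "to_out.0",  # attention: concept "mixing"
        "ff.net.0.proj", "ff.net.2",         # FFN: the model's concept space
    ],
}
```

Check your trainer's docs for the exact key names it expects; the point is simply to include the two FFN projections in the target list.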
The SimpleTuner training and dataset configs are under config.json and ltx2-multiresolution-inflate-t2v.json respectively.
Trigger words
You should use infl4t3 to trigger the video generation.
The video examples have sound! Check them out by clicking on them!
Download model
Download them in the Files & versions tab.
Model tree for kabachuha/ltx2-inflate-it
Base model
Lightricks/LTX-2