
Fix: inject LoRA adapters before loading LoRA weights in from_pretrained #28

Open

MarkovChain-why wants to merge 1 commit into dreamzero0:main from MarkovChain-why:fix/lora-deferred-injection-inference

Conversation

@MarkovChain-why

Summary

  • Bug: When defer_lora_injection=True, VLA.from_pretrained() (inference path) calls load_lora_weight() without first injecting LoRA adapters. The LoRA weight keys (lora_A, lora_B) have no matching parameters, so loaded weights are silently dropped.
  • Root cause: The else branch correctly calls inject_lora_after_loading(), but the if lora_weights_path is not None branch does not.
  • Fix: Call inject_lora_after_loading() before load_lora_weight() when a LoRA weights path is provided.

Note: The training path in base.py (create_model) is not affected — it already calls inject_lora_after_loading() unconditionally after loading pretrained weights.

Affected code

groot/vla/model/dreamzero/base_vla.py, VLA.from_pretrained() method.

Before (buggy):
```python
if lora_weights_path is not None:
    model.load_lora_weight(lora_weights_path)  # LoRA adapters don't exist yet!
else:
    if ... defer_lora_injection:
        model.action_head.inject_lora_after_loading()
```

After (fixed):
```python
if lora_weights_path is not None:
    if hasattr(model.action_head, 'inject_lora_after_loading'):
        model.action_head.inject_lora_after_loading()  # Create adapters first
    model.load_lora_weight(lora_weights_path)  # Then load weights
else:
    if ... defer_lora_injection:
        model.action_head.inject_lora_after_loading()
```

Test plan

  • Load a LoRA-finetuned checkpoint with defer_lora_injection=True via VLA.from_pretrained() and verify LoRA weights are correctly applied
  • Verify inference outputs differ from base model (confirming LoRA weights took effect)

When `defer_lora_injection=True`, LoRA adapters are not created during
model init — they are deferred until after base weights are loaded so
that the pretrained weight key paths match the model's state dict.

However, `VLA.from_pretrained` (the inference/deployment path) skips
adapter injection when `lora_weights_path` is provided: it jumps
straight to `load_lora_weight()` without first calling
`inject_lora_after_loading()`. This means the LoRA weight keys
(lora_A, lora_B) have no matching parameters in the model, so the
loaded weights are silently dropped.
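The failure mode above can be reproduced with a minimal sketch. The names (`load_weights`, `inject_lora`, the key paths) are hypothetical stand-ins, not the repo's actual API; the point is that a `strict=False`-style load copies only checkpoint keys that already exist in the model and silently discards the rest, which is exactly what happens to `lora_A`/`lora_B` when injection is skipped:

```python
def load_weights(model_params, checkpoint):
    """Copy checkpoint entries whose keys exist in the model (strict=False
    semantics); return the keys that were silently dropped."""
    dropped = []
    for key, value in checkpoint.items():
        if key in model_params:
            model_params[key] = value
        else:
            dropped.append(key)
    return dropped

def inject_lora(model_params):
    """Stand-in for deferred adapter injection: create the LoRA parameter slots."""
    model_params["action_head.lora_A"] = 0.0
    model_params["action_head.lora_B"] = 0.0

lora_ckpt = {"action_head.lora_A": 1.5, "action_head.lora_B": -0.5}

# Buggy order: adapters don't exist yet, so both LoRA keys are dropped.
params = {"action_head.weight": 1.0}
dropped = load_weights(params, lora_ckpt)
print(dropped)  # ['action_head.lora_A', 'action_head.lora_B']

# Fixed order: inject adapters first, then load -- nothing is dropped.
params = {"action_head.weight": 1.0}
inject_lora(params)
dropped = load_weights(params, lora_ckpt)
print(dropped)  # []
print(params["action_head.lora_A"])  # 1.5
```

This is also why the bug is easy to miss: neither order raises an error, and only the dropped-key list (or a behavioral check against the base model, as in the test plan) reveals the difference.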

The training path in `base.py` does not have this bug — it correctly
calls `inject_lora_after_loading()` after loading pretrained weights
(line 723-726).

Fix: call `inject_lora_after_loading()` before `load_lora_weight()`
when a LoRA weights path is provided.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
