fix(inference): use config fields instead of hardcoded max_seq_length/load_in_4bit #28

Merged

sacredvoid merged 1 commit into main from fix/inference-config-hardcoded-params on Mar 26, 2026

fix(inference): use config fields instead of hardcoded max_seq_length/load_in_4bit#28
sacredvoid merged 1 commit intomainfrom
fix/inference-config-hardcoded-params

Conversation

@sacredvoid
Owner

Summary

  • Adds max_seq_length and load_in_4bit fields to InferenceConfig (defaults: 2048, True)
  • _load_unsloth now reads from config instead of hardcoding
  • Fixes silent truncation when training used max_seq_length=4096 but inference was hardcoded to 2048

Fixes #27

Test plan

  • All 152 tests pass
  • Defaults match previous hardcoded values (no breaking change)
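The change described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the field names `max_seq_length`, `load_in_4bit`, and the identifiers `InferenceConfig` / `_load_unsloth` come from the PR, while `model_path` and the `unsloth_kwargs` helper are hypothetical stand-ins for however the model path reaches the loader.

```python
from dataclasses import dataclass


@dataclass
class InferenceConfig:
    model_path: str = "outputs/model"  # hypothetical field for illustration
    max_seq_length: int = 2048         # new field; default matches old hardcoded value
    load_in_4bit: bool = True          # new field; default matches old hardcoded value


def unsloth_kwargs(config: InferenceConfig) -> dict:
    # Previously _load_unsloth hardcoded max_seq_length=2048 and
    # load_in_4bit=True; now both values flow from the config, so they
    # can match whatever was used during training.
    return {
        "model_name": config.model_path,
        "max_seq_length": config.max_seq_length,
        "load_in_4bit": config.load_in_4bit,
    }


# A user who trained at 4096 tokens can now run inference at 4096 too:
cfg = InferenceConfig(max_seq_length=4096)
print(unsloth_kwargs(cfg)["max_seq_length"])  # → 4096
```

Because the defaults equal the previously hardcoded values, existing configs that omit these fields behave exactly as before.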

…ead of hardcoding

_load_unsloth hardcoded max_seq_length=2048 and load_in_4bit=True
instead of reading from InferenceConfig. Added both fields to
InferenceConfig with matching defaults. Users training with
max_seq_length=4096 can now set it for inference too.

Fixes #27
@sacredvoid sacredvoid merged commit 5a167c3 into main Mar 26, 2026
@sacredvoid sacredvoid deleted the fix/inference-config-hardcoded-params branch March 26, 2026 00:44


Development

Successfully merging this pull request may close these issues.

Bug: ModelServer._load_unsloth hardcodes max_seq_length=2048, ignoring config
