Add LBFGS checkpointing and warm-start via LBFGSState #29

Open
XingyuZhang2018 wants to merge 2 commits into Jutho:master from XingyuZhang2018:feature/lbfgs-checkpoint

@XingyuZhang2018

Closes #22.

Summary

  • LBFGSState struct — public snapshot of the complete optimizer state after each iteration: x, f, g, the full LBFGSInverseHessian H, cumulative counters numfg/numiter, and history vectors fhistory/normgradhistory.

  • checkpoint keyword on optimize(fg, x, alg::LBFGS; ..., checkpoint=nothing) — any callable f(state::LBFGSState) is invoked at the end of every iteration (including the last), after the L-BFGS curvature update, so the saved state is always consistent.

  • optimize(fg, state::LBFGSState, alg::LBFGS; ...) — new dispatch to resume from a checkpoint. History vectors are continued, so the returned history matrix spans the full run.

  • _lbfgs_loop! private helper — shared iteration loop to avoid code duplication between fresh-start and resume paths.

  • examples/jld2_checkpoint.jl — runnable demo showing JLD2-based save/load.
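The checkpoint-and-resume pattern summarized above can be illustrated with a toy, self-contained sketch. It does not use OptimKit: `ToyState`, `toy_loop`, and `toy_optimize` are hypothetical stand-ins, plain gradient descent replaces L-BFGS, and the `Serialization` standard library plays the role of the user-chosen backend. It only mirrors the structure described in this PR: one shared loop, a callback invoked at the end of every iteration, and a second method that resumes from a saved state.

```julia
using Serialization  # stdlib; JLD2 or BSON could be swapped in

# Hypothetical stand-in for LBFGSState (no inverse-Hessian memory here).
struct ToyState
    x::Vector{Float64}
    f::Float64
    g::Vector{Float64}
    numfg::Int
    numiter::Int
    fhistory::Vector{Float64}
end

# Shared loop used by both the fresh-start and the resume entry points.
function toy_loop(fg, state::ToyState; maxiter::Int, step::Float64=0.1, checkpoint=nothing)
    x, f, g = copy(state.x), state.f, copy(state.g)
    numfg, numiter = state.numfg, state.numiter
    fhistory = copy(state.fhistory)
    while numiter < maxiter              # compared against the cumulative count
        x = x .- step .* g               # plain gradient step (not L-BFGS)
        f, g = fg(x)
        numfg += 1
        numiter += 1
        push!(fhistory, f)
        if checkpoint !== nothing        # called at the end of every iteration
            checkpoint(ToyState(copy(x), f, copy(g), numfg, numiter, copy(fhistory)))
        end
    end
    return ToyState(x, f, g, numfg, numiter, fhistory)
end

# Fresh start: evaluate once, then enter the shared loop.
function toy_optimize(fg, x0::Vector{Float64}; kwargs...)
    f, g = fg(x0)
    return toy_loop(fg, ToyState(copy(x0), f, copy(g), 1, 0, [f]); kwargs...)
end
# Resume: continue from a saved state, keeping counters and history.
toy_optimize(fg, state::ToyState; kwargs...) = toy_loop(fg, state; kwargs...)

quadratic(x) = (sum(abs2, x) / 2, copy(x))   # f(x) = |x|^2 / 2, gradient x

path = tempname()
save_checkpoint(state) = serialize(path, state)

full = toy_optimize(quadratic, [1.0, -2.0]; maxiter=10)
toy_optimize(quadratic, [1.0, -2.0]; maxiter=6, checkpoint=save_checkpoint)
resumed = toy_optimize(quadratic, deserialize(path); maxiter=10)
```

Because the resumed run replays exactly the same arithmetic, `resumed.fhistory` matches `full.fhistory` and the counters are cumulative, which is the property the new tests check for the real optimizer.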

Design notes

  • No new package dependencies. Users choose their own serialization backend (JLD2, BSON, Serialization, etc.).
  • GPU / custom array backends are supported as long as the user's serializer handles those array types.
  • shouldstop and hasconverged receive the cumulative numfg/numiter, so maxiter counts iterations across all runs. Users who need a fixed number of additional iterations after resuming should pass a custom shouldstop.
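A custom stopping rule for "N more iterations after resuming" could look roughly like the sketch below. The argument order `(x, f, g, numfg, numiter, t)` is an assumption about OptimKit's `shouldstop` callback and should be checked against the installed version; `stop_after_extra` is a hypothetical helper name.

```julia
# Build a stopping rule that fires once `extra` iterations have run past
# `start_iter`. Assumed callback signature: (x, f, g, numfg, numiter, t).
function stop_after_extra(start_iter::Integer, extra::Integer)
    return (x, f, g, numfg, numiter, t) -> numiter >= start_iter + extra
end

# Resuming from a state saved at iteration 6 and asking for 4 more iterations:
shouldstop = stop_after_extra(6, 4)

still_running = shouldstop(nothing, 0.0, nothing, 11, 9, 0.0)   # false: only 3 extra so far
finished = shouldstop(nothing, 0.0, nothing, 12, 10, 0.0)       # true: 4 extra reached
```

With the resume method from this PR, such a closure would presumably be passed as `shouldstop = stop_after_extra(state.numiter, 4)`.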

Test plan

  • 14 new tests in test/runtests.jl: callback invocation count, state-field correctness (x, f, numfg, history length), history row continuity across resume, multi-resume cumulative numiter counting.
  • All existing Linesearch / GradientDescent / ConjugateGradient / LBFGS tests still pass.
  • examples/jld2_checkpoint.jl runs end-to-end and all assertions pass.

🤖 Generated with Claude Code

Closes Jutho#22.

## What's added

- New public struct `LBFGSState` that captures the complete optimizer
  state after each iteration: `x`, `f`, `g`, the full
  `LBFGSInverseHessian` approximation `H`, cumulative counters
  `numfg`/`numiter`, and the full `fhistory`/`normgradhistory` vectors.

- New `checkpoint` keyword argument on `optimize(fg, x, alg::LBFGS; ...)`
  that accepts any callable `f(state::LBFGSState)`.  The callback is
  invoked at the end of **every** iteration (including the last), after
  the L-BFGS curvature update, so the saved state is always complete and
  consistent.

- New `optimize(fg, state::LBFGSState, alg::LBFGS; ...)` dispatch to
  resume from a checkpoint.  History vectors are continued, so the
  returned `history` matrix spans the entire run.

- Internal `_lbfgs_loop!` helper to avoid code duplication between the
  fresh-start and resume code paths.

- 14 new tests in `test/runtests.jl` covering callback count,
  state-field correctness, history continuity, and multi-resume
  cumulative counting.

- `examples/jld2_checkpoint.jl`: runnable demo showing JLD2-based
  save/load.

## Design notes

No new package dependencies are introduced.  The library exposes the
state struct; users choose their own serialization backend (JLD2, BSON,
`Serialization`, etc.).  GPU / custom array backends are supported as
long as the user's serializer handles those array types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov-commenter

codecov-commenter commented Mar 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.15%. Comparing base (db5e8b7) to head (c922a4d).

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #29      +/-   ##
==========================================
+ Coverage   82.35%   83.15%   +0.79%     
==========================================
  Files           5        5              
  Lines         527      546      +19     
==========================================
+ Hits          434      454      +20     
+ Misses         93       92       -1     


# Resumed history prepends the prior run's history
@test size(history_resumed, 1) == size(history_full, 1)
@test history_resumed[1:6, :] ≈ history_part # first 6 rows identical to partial run
Owner

If it is really identical, I guess this could be:

Suggested change
@test history_resumed[1:6, :] ≈ history_part # first 6 rows identical to partial run
@test history_resumed[1:6, :] == history_part # first 6 rows identical to partial run

@Jutho
Owner

Jutho commented Mar 12, 2026

Thanks. From a first quick look, this looks very good. It would be useful if the code that has not actually changed (I think most of the LBFGSInverseHessian definition and methods) would still be in the same place in the source code, so that it doesn't show up as a major git diff edit. Could you try to restore this to its original location?

Remove the H<:LBFGSInverseHessian type constraint from LBFGSState so
that the struct no longer depends on LBFGSInverseHessian being defined
first. This allows LBFGSInverseHessian and all its methods to remain in
their original position in the file, minimising the diff.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@XingyuZhang2018
Author

Fixed. Removed the H<:LBFGSInverseHessian type constraint from LBFGSState so the struct no longer depends on LBFGSInverseHessian being defined first — this lets LBFGSInverseHessian and all its methods stay in their original location. The struct is unchanged in content and behaviour; H remains a concrete type parameter inferred from the actual LBFGSInverseHessian instance passed at construction.
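The effect of dropping the supertype bound can be seen in a minimal sketch with hypothetical stand-in types (`DemoState` and `DemoInverseHessian` are not the PR's definitions). An unconstrained type parameter lets the state struct be defined before the Hessian type exists, while the field type of any constructed instance is still concrete.

```julia
# Defined first: no `<:DemoInverseHessian` bound, so this compiles even though
# DemoInverseHessian does not exist yet.
struct DemoState{T,HT}
    f::T
    hessian::HT
end

# Defined later, in its "original" position in the file.
struct DemoInverseHessian
    maxlength::Int
end

state = DemoState(1.0, DemoInverseHessian(8))

# The parameter is inferred from the instance, so the field stays concretely typed.
state_type = typeof(state)
hessian_is_concrete = isconcretetype(fieldtype(state_type, :hessian))
```

The trade-off is that the struct no longer rejects a non-Hessian object at construction time; any such validation would have to live in a constructor or in the methods that consume the state.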

Development

Successfully merging this pull request may close these issues.

Feature Request: Save/Load State for LBFGS Optimizer
