Add LBFGS checkpointing and warm-start via LBFGSState #29
XingyuZhang2018 wants to merge 2 commits into Jutho:master
Closes Jutho#22.

## What's added

- New public struct `LBFGSState` that captures the complete optimizer state after each iteration: `x`, `f`, `g`, the full `LBFGSInverseHessian` approximation `H`, cumulative counters `numfg`/`numiter`, and the full `fhistory`/`normgradhistory` vectors.
- New `checkpoint` keyword argument on `optimize(fg, x, alg::LBFGS; ...)` that accepts any callable `f(state::LBFGSState)`. The callback is invoked at the end of **every** iteration (including the last), after the L-BFGS curvature update, so the saved state is always complete and consistent.
- New `optimize(fg, state::LBFGSState, alg::LBFGS; ...)` dispatch to resume from a checkpoint. History vectors are continued, so the returned `history` matrix spans the entire run.
- Internal `_lbfgs_loop!` helper to avoid code duplication between the fresh-start and resume code paths.
- 14 new tests in `test/runtests.jl` covering callback count, state-field correctness, history continuity, and multi-resume cumulative counting.
- `examples/jld2_checkpoint.jl`: runnable demo showing JLD2-based save/load.

## Design notes

No new package dependencies are introduced. The library exposes the state struct; users choose their own serialization backend (JLD2, BSON, `Serialization`, etc.). GPU / custom array backends are supported as long as the user's serializer handles those array types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
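To make the checkpoint/resume pattern described above concrete, here is a minimal self-contained sketch. It does not use OptimKit or the actual `LBFGSState`: `ToyState`, `toy_loop`, and `toy_optimize` are hypothetical names, and plain gradient descent stands in for L-BFGS. It only illustrates the shape of the design: one shared loop, a callback invoked at the end of every iteration, and a second method that resumes from a saved state with cumulative counters and continued history.

```julia
# Hypothetical stand-in for the state described above; NOT OptimKit's LBFGSState.
struct ToyState
    x::Vector{Float64}
    f::Float64
    g::Vector{Float64}
    numfg::Int                        # cumulative function/gradient evaluations
    numiter::Int                      # cumulative iterations
    fhistory::Vector{Float64}
    normgradhistory::Vector{Float64}
end

gradnorm(g) = sqrt(sum(abs2, g))

# Shared loop used by both entry points (plain gradient descent, not L-BFGS).
function toy_loop(fg, state::ToyState; maxiter, stepsize=0.1, checkpoint=nothing)
    while state.numiter < maxiter
        x = state.x .- stepsize .* state.g
        f, g = fg(x)
        state = ToyState(x, f, g, state.numfg + 1, state.numiter + 1,
                         vcat(state.fhistory, f),
                         vcat(state.normgradhistory, gradnorm(g)))
        # Called at the end of every iteration, once the state is complete.
        checkpoint === nothing || checkpoint(state)
    end
    return state
end

# Fresh start: evaluate once, build the initial state, enter the shared loop.
function toy_optimize(fg, x0::Vector{Float64}; kwargs...)
    f, g = fg(x0)
    return toy_loop(fg, ToyState(x0, f, g, 1, 0, [f], [gradnorm(g)]); kwargs...)
end

# Resume: continue from a saved state; counters and histories carry over.
toy_optimize(fg, state::ToyState; kwargs...) = toy_loop(fg, state; kwargs...)

fg(x) = (sum(abs2, x) / 2, copy(x))   # f(x) = |x|^2 / 2, gradient = x

checkpoints = ToyState[]              # a real run would serialize to disk instead
full = toy_optimize(fg, [1.0, -2.0]; maxiter=10)
part = toy_optimize(fg, [1.0, -2.0]; maxiter=6,
                    checkpoint=(s -> push!(checkpoints, s)))
resumed = toy_optimize(fg, last(checkpoints); maxiter=10)
```

In this toy, the resumed run reproduces the uninterrupted run exactly because the saved state contains everything the loop reads. For L-BFGS that additionally means the inverse-Hessian approximation, which is why the PR stores `H` in the state.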
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage diff:

| | master | #29 | +/- |
|---|---|---|---|
| Coverage | 82.35% | 83.15% | +0.79% |
| Files | 5 | 5 | |
| Lines | 527 | 546 | +19 |
| Hits | 434 | 454 | +20 |
| Misses | 93 | 92 | -1 |

    # Resumed history prepends the prior run's history
    @test size(history_resumed, 1) == size(history_full, 1)
    @test history_resumed[1:6, :] ≈ history_part  # first 6 rows identical to partial run

If it is really identical, I guess this could be:

    - @test history_resumed[1:6, :] ≈ history_part  # first 6 rows identical to partial run
    + @test history_resumed[1:6, :] == history_part  # first 6 rows identical to partial run

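For context on the suggestion, a small sketch of the difference between the two comparisons. `≈` (`isapprox`) tolerates rounding differences, whereas `==` demands identical values. If the resumed history is built by carrying over the stored numbers rather than recomputing them (an assumption about the implementation, illustrated here with made-up matrices), the stricter test should pass.

```julia
# Rounding makes exact comparison fail where approximate comparison succeeds.
a = 0.1 + 0.2
exact_equal  = (a == 0.3)   # false: a is 0.30000000000000004
approx_equal = (a ≈ 0.3)    # true: within isapprox's default tolerance

# Illustrative data only: rows carried over verbatim compare equal under ==.
history_part = [1.0 0.5; 0.25 0.125]
history_resumed = vcat(history_part, [0.0625 0.03125])
rows_identical = (history_resumed[1:2, :] == history_part)   # true
```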
Thanks. From a first quick look, this looks very good. It would be useful if the code that has not actually changed (I think most of the |
Remove the H<:LBFGSInverseHessian type constraint from LBFGSState so that the struct no longer depends on LBFGSInverseHessian being defined first. This allows LBFGSInverseHessian and all its methods to remain in their original position in the file, minimising the diff. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
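A minimal sketch of why dropping the bound removes the ordering dependency, using hypothetical toy types rather than the PR's actual definitions. A struct with an unconstrained type parameter can be declared before the type it will later hold, whereas a bound such as `H<:ToyInverseHessian` would require `ToyInverseHessian` to exist at the point of declaration.

```julia
# Unconstrained parameter H: nothing about the Hessian type is needed yet.
struct SnapshotState{T,H}
    x::T
    hess::H
end

# Defined afterwards; with `H<:ToyInverseHessian` in the struct above,
# this definition would have had to come first.
struct ToyInverseHessian
    scale::Float64
end

s = SnapshotState([1.0, 2.0], ToyInverseHessian(0.5))
```

The trade-off is that the struct no longer documents or enforces what `H` may be; any value is accepted for that field.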
Fixed. Removed the |
Closes #22.

## Summary

- `LBFGSState` struct: a public snapshot of the complete optimizer state after each iteration, comprising `x`, `f`, `g`, the full `LBFGSInverseHessian` `H`, cumulative counters `numfg`/`numiter`, and history vectors `fhistory`/`normgradhistory`.
- `checkpoint` keyword on `optimize(fg, x, alg::LBFGS; ..., checkpoint=nothing)`: any callable `f(state::LBFGSState)` is invoked at the end of every iteration (including the last), after the L-BFGS curvature update, so the saved state is always consistent.
- `optimize(fg, state::LBFGSState, alg::LBFGS; ...)`: new dispatch to resume from a checkpoint. History vectors are continued, so the returned `history` matrix spans the full run.
- `_lbfgs_loop!` private helper: shared iteration loop to avoid code duplication between fresh-start and resume paths.
- `examples/jld2_checkpoint.jl`: runnable demo showing JLD2-based save/load.

## Design notes

- … `Serialization`, etc.).
- `shouldstop` and `hasconverged` receive the cumulative `numfg`/`numiter`; users pass a custom `shouldstop` if they need a fixed number of additional iterations after resuming.

## Test plan

- `test/runtests.jl`: callback invocation count, state-field correctness (`x`, `f`, `numfg`, history length), history row continuity across resume, multi-resume cumulative `numiter` counting.
- `examples/jld2_checkpoint.jl` runs end-to-end and all assertions pass.

🤖 Generated with Claude Code
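Because the counters are cumulative, "run N more iterations after resuming" has to be expressed relative to the iteration count stored in the checkpoint. A sketch of such a stopping criterion follows. The helper `extra_iterations` is hypothetical, and the argument list `(x, f, g, numfg, numiter, t)` is an assumption about what the library passes to `shouldstop`; check it against the OptimKit version in use.

```julia
# Hypothetical helper: returns a predicate that fires once `n` iterations
# have elapsed beyond `start`, the iteration count stored in the checkpoint.
# ASSUMPTION: the callback is called as shouldstop(x, f, g, numfg, numiter, t).
extra_iterations(start::Int, n::Int) =
    (x, f, g, numfg, numiter, t) -> numiter >= start + n

# Resuming from a checkpoint taken at iteration 6, allow 4 more iterations.
stop = extra_iterations(6, 4)
keeps_going = stop(nothing, 0.0, nothing, 12, 9, 0.0)    # false: only 3 extra so far
stops_now   = stop(nothing, 0.0, nothing, 14, 10, 0.0)   # true: 4 extra reached
```

With the resume dispatch from this PR, such a predicate would presumably be passed as the `shouldstop` keyword, with `start` taken from the saved state's `numiter`.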