EFEasVI: Tmaze - very simple, standard example for Active Inference #72
skoghoern wants to merge 28 commits into ReactiveBayes:main from
Conversation
Thanks @skoghoern! I'll check the PR in the coming weeks after some conference deadlines. I assigned myself so that it shows up on my to-do list. I appreciate the effort and the interest, and I'll definitely get back to you!
Sure, sounds good! Good luck with the deadlines - I'll keep an eye out for your comments later on :)
Add a tutorial recreation of the paper "A Message Passing Realization of Expected Free Energy Minimization" by Wouter Nuijten, Mykola Lukashchuk, Thijs van de Laar, and Bert de Vries
I've checked the example locally, nice work! The CI failure is unrelated and can be ignored.
Hey @skoghoern, thanks for putting this together! It's great to see an accessible example of the paper. However, there are some significant deviations from the paper's core logic. Specifically, replacing dynamic factor nodes with static pre-computations is a move I'm not sure is 100% correct. I've noted the key areas that need to be reverted/fixed below:

1. **Notation Consistency.** To stay consistent with the paper and RxInfer standards, let's standardize:
2. **The "Invariance" Shortcut & Static Priors.** In your
3. **Ambiguity Logic.** Technically, you could precompute the ambiguity term as long as the observation model (

Overall, this is a massive contribution to the
… and examples.rxinfer.com
Changes:
- Polished tone in the first cell of the notebook
- Changed the mathematical notation for random variables from = to ~
- Created a new version of the graph, with all the proper equality nodes and attention to the data that is given in the first iteration.
…rch truncation specification
…e_RxInfer_Examples_B Cross reference rx infer examples b
the in-memory check alone should be enough
Hi @wouterwln,
**Mathematical Proofs for Static Precomputation** (using bold for random variables/distributions and lowercase for specific samples/states)

**Theorem:** If the transition matrix is invariant to the previous state, i.e. $p(s_t \mid s_{t-1}, u_t) = p(s_t \mid u_t)$, then the posterior-weighted entropy of the transition distribution is a constant that can be precomputed before inference.

**Proof:** The term in question is $\sum_{s_{t-1}} q(s_{t-1}) \, H[p(s_t \mid s_{t-1}, u_t)]$. By definition, if the transition matrix is invariant to the previous state, the inner Shannon entropy evaluates to a constant $H[p(s_t \mid u_t)]$ that does not depend on $s_{t-1}$. Substituting this constant and pulling it out of the sum leaves $H[p(s_t \mid u_t)] \sum_{s_{t-1}} q(s_{t-1})$. By the law of total probability, $\sum_{s_{t-1}} q(s_{t-1}) = 1$. Because the result no longer depends on the posterior $q(s_{t-1})$, it can be computed once, before inference. ∎
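The collapse of the expectation can be checked numerically. The sketch below is a toy illustration (not RxInfer code): the transition matrix `B`, the state dimensions, and the posteriors `q1`/`q2` are all made-up example values; the only claim is the one from the proof, that when every row of `B` is identical the expected entropy is independent of the posterior over the previous state.

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

# Hypothetical 3-state transition matrix that is invariant to the
# previous state: every row (one per previous state) is identical.
B = [[0.7, 0.2, 0.1],
     [0.7, 0.2, 0.1],
     [0.7, 0.2, 0.1]]

def expected_entropy(q_prev, B):
    """sum_{s_{t-1}} q(s_{t-1}) * H[p(s_t | s_{t-1})]."""
    return sum(q * entropy(row) for q, row in zip(q_prev, B))

# Two arbitrary posteriors over the previous state.
q1 = [1.0, 0.0, 0.0]
q2 = [0.2, 0.5, 0.3]

# Because every row of B is the same, the expectation collapses to the
# constant H[p(s_t | u_t)] regardless of q, so it can be precomputed once.
assert abs(expected_entropy(q1, B) - expected_entropy(q2, B)) < 1e-12
assert abs(expected_entropy(q1, B) - entropy(B[0])) < 1e-12
print(round(expected_entropy(q1, B), 6))  # → 0.801819
```

If the rows of `B` differed, the two assertions would fail and the term would have to be recomputed inside every VMP iteration, which is exactly the distinction the decision tree below draws.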
A similar logic applies to the epistemic prior over states. In the T-maze, we factorize the hidden states into a time-varying agent location $s_t^{\text{loc}}$ and a time-invariant reward location $s^{\text{rew}}$, so that $q(s_t) = q(s_t^{\text{loc}})\, q(s^{\text{rew}})$. This resolves into a contribution from the invariant factor that can again be precomputed. Given the agent knows the observation matrix
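The factorization argument can also be sketched numerically. Again a toy illustration with made-up dimensions (4 locations, 2 reward arms) and made-up belief vectors, not the notebook's actual model: it only shows that when the joint belief is an outer product of the two factors, marginalizing the joint recovers the time-invariant reward belief unchanged.

```python
# Hypothetical T-maze factorization: 4 agent locations x 2 reward arms.
# The joint hidden-state belief factorizes as q(s) = q(loc) * q(rew),
# and only the location factor changes over time.

q_loc = [0.1, 0.2, 0.3, 0.4]   # time-varying belief over agent location
q_rew = [0.5, 0.5]             # time-invariant belief over reward arm

# Joint belief as the outer product of the two factors.
q_joint = [[l * r for r in q_rew] for l in q_loc]

# Marginalizing the joint over locations recovers q(rew) unchanged, so
# any epistemic-prior term that depends only on q(rew) can be computed
# once instead of being recomputed at every time step.
marg_rew = [sum(q_joint[i][j] for i in range(len(q_loc)))
            for j in range(len(q_rew))]
assert all(abs(a - b) < 1e-12 for a, b in zip(marg_rew, q_rew))
```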
```mermaid
graph TD
    Root[Knowledge of parameters] --> Known[Agent knows parameters]
    Root --> Learn[Agent needs to learn parameters]
    Known --> Invariant[Matrices are invariant to previous state/other state <br><i>e.g., T-maze</i>]
    Known --> Dependent[Dependent on previous state <br><i>e.g., stochastic maze</i>]
```
E.g., we could position this example as a "simplified starting point" and then link to @meditans's #73 as the more comprehensive, general tutorial for advanced use cases? Obviously this is just a suggestion given my limited understanding, and either way I'm happy to update the tutorial in whatever direction you feel serves the project best.

P.S.: on a related note, have you considered adapting GraphPPL/RxInfer/RMP to allow passing beliefs from previous VMP iterations? Technically, since these are already stored (when including `keep_each()`), it could be a powerful addition.
…_ReloadBug Fix: on `make docs-serve`, the page was constantly refreshing; now it doesn't.
Add EFE Minimization via Message Passing notebook
Hi @wouterwln, I've now submitted a major overhaul of the tutorial! I included two versions of the T-maze: one as you requested and another where I explored a different implementation. While looking into generalizing the implementation, I noticed your marginal-storage approach works perfectly for the current scenarios. However, I'm not 100% sure whether there could be edge cases/models (e.g. hierarchical loopy graphs) in which the outgoing messages from the custom factor nodes are triggered several times within a single VMP iteration, with marginals updated in between the rule computations, making convergence even more difficult. To explore this, I implemented your original version in the main file and added a second file that handles epistemic prior updates via the callback system.

Key Changes in the Callback Approach

Observations on Convergence

I ran some tests to see where the agent breaks (fails to explore the cue field) and found some interesting quirks depending on the
My hypothesis is that the shorter the loop, the higher the risk of VMP convergence issues disrupting exploration. Let me know what you think and we can decide on one version to upload.
[EDIT]: I just saw that the difference was only because I hadn't updated the local EFEasVFE repo with the correct state-transition tensor (corrected in this commit). Using the same state-transition tensor, both implementations produce equal results. Adding a free-energy analysis, we can see a pattern: for H = [3, 6], free energy plateaus when VMP_iter == planning horizon, and the posterior for the first action u[1] of the planning horizon changes from South = 3 ("correct") to North = 1 ("wrong") beginning at VMP_iteration == H. E.g., here again for planning horizon H = 6, the posteriors of u[1] for several VMP iterations:
I will try to analyze the reason further in the upcoming days. It would be very interesting to find out why this happens, whether it is a general pattern, and if it might generalize to other loopy models as well.
Hey @skoghoern, thanks for the revision! The example now looks really cool and really well done, many thanks! One question: there seem to be two notebooks in your repo:
Hi Wouter, thanks for the feedback! It was also much inspired by @meditans :)
No, only one of them is necessary.
…Inference_-_Planning_as_Message_Passing_callback.ipynb
Decided to use the original implementation, as it is more intuitive and doesn't need future adaptation since callbacks are currently being changed (deleted
Adapted the code from @wouterwln in EFEasVFE to make a first example of the standard T-maze. (The epistemic priors are currently overengineered with helper functions, but this may let users test whether they can reuse them for their own transition matrices. Interestingly, even though the helper functions return fixed vectors to the Categorical, using the p-vector directly in the standard definition doesn't lead to the same results; there seems to be interference with the message-passing optimization.)
Running `make.jl` for the new folder returned: