Skip to content

Store: add persistent dependency registry and notify hook#2

Open
sejalpunwatkar wants to merge 2 commits intoBigDataAnalyticsGroup:mainfrom
sejalpunwatkar:feature/persistent-observer-store
Open

Store: add persistent dependency registry and notify hook#2
sejalpunwatkar wants to merge 2 commits intoBigDataAnalyticsGroup:mainfrom
sejalpunwatkar:feature/persistent-observer-store

Conversation

@sejalpunwatkar
Copy link
Copy Markdown

Summary

This PR introduces a persistent dependency mechanism for AttributeFunctions within the Store, enabling a foundation for observer-style updates across sessions.

What’s implemented

  • Added a dependency registry stored inside the existing SqliteDict using a reserved key
  • Implemented register_dependency(key, dependent_id) to track relationships between AttributeFunctions
  • Hooked into Store.put() to trigger notifications for dependent AttributeFunctions
  • Ensured registry persistence across Store reloads

Design decisions

  • Used a single-store approach (same SqliteDict) for both data and metadata to avoid cross-store consistency issues
  • Kept the implementation minimal and non-intrusive, without modifying existing AttributeFunction or observer semantics
  • Did not assume a fixed signature for update() methods to maintain compatibility with current implementations

Tests

  • Added test to verify dependency registration and persistence
  • Added test to ensure notification hook executes without breaking existing behavior

Scope

This PR focuses on establishing a persistent observer foundation. It does not enforce or redefine observer/update behavior, leaving room for future API design and refinement.

store.put(af1)
store.put(af2)

store.register_dependency(af1.uuid, af2.uuid)
Copy link
Copy Markdown
Member

@jensdittrich jensdittrich Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is basically a dependency mechanism independent from observable/observers, this means it is up to the user to additonally register those dependencies? That sounds error-prone to me.

In general, this unit test is very short.

store/store.py Outdated
def _get_registry(self):
return self.sqlite_dict[self._registry_key]

def register_dependency(self, key: int, af_id: int):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make sure to follow the contribution guideline

)

# dependency registry (persistent)
self._registry_key = "__dependency_registry__"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the purpose of this string?

try:
store.put(af1)
except TypeError:
pass
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are suppressing a possible exception here? Why?

- Implement __dependency_registry__ in Store for persistent AF relationships.
- Integrate _notify into Store.put for automated, cross-session updates.
- Standardize on string keys for SqliteDict to ensure ID consistency.
- Add lifecycle unit test with class-level mock for notification verification.
@sejalpunwatkar
Copy link
Copy Markdown
Author

Hi @jensdittrich ,
Thank you for the feedback. I have updated the code to address your concerns:

  • Automated Notifications: I moved the notification logic into Store.put(). This means dependencies are updated automatically when an object is saved, so the user doesn't have to do it manually.
  • Purpose of the String: The dependency_registry string is used as a private key to save the dependency map inside the SqliteDict. This allows the relationships to be persistent and survive a store restart.
  • Exception Handling: I removed the try/except block. The test now uses a proper mock function that handles the expected arguments, so it no longer suppresses errors.
  • Improved Testing: I expanded the unit test to verify the full cycle: registering a dependency, triggering an automatic update, and checking that the data persists after closing and reopening the store.
  • Guidelines: I updated the variable names to use uuid.UUID and added docstrings to match the project's contribution guidelines.
    Please let me know if you have further suggestions.

@sejalpunwatkar
Copy link
Copy Markdown
Author

@jensdittrich I've updated the code to address your feedback, including the automated notifications and expanded tests. Please take a look whenever you have a moment!
Since you mentioned you're currently focusing on the FQL operators and the query push-down challenge, I’d love to help out there as well. Would you like me to look into the group_by rewrites or start exploring how to improve the where() API for the store?

@jensdittrich
Copy link
Copy Markdown
Member

I am currently returnign from EDBT, I will take a look at your code next week.

@sejalpunwatkar
Copy link
Copy Markdown
Author

Thank you for the update. I hope the conference was successful and look forward to your feedback next week. In the meantime, I’ll continue reviewing the FQL operators and the group_by logic to see where I can best assist with the next steps.

@jensdittrich
Copy link
Copy Markdown
Member

sorry for the delay, I was travelling

please use the inline comments for discussions where possible, like that we can discuss each issue in a separate thred, I will follow up there now...

"""Store an AttributeFunction in the persistent store.
@param af: The AttributeFunction to store.
"""
uuid_str = str(af.uuid)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still no typehint, I would expect
uuid_str: str = ...
or
uuid: str = ...

return self.sqlite_dict.get(self._registry_key, {})

def register_dependency(self, key: int, af_id: int):
def register_dependency(self, parent_uuid: uuid.UUID, child_uuid: uuid.UUID):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the idea here is to have a separte dependency registry, i.e. all subscriptions are additionally subscribed here, correct?
Is this supposed to be redundant to the data keptin the AFs? Why not store it with the AFs directly?

registry = self._get_registry()

key = str(key)
p_uuid_str = str(parent_uuid)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same typhint comment as above, fix globally

dependent_af.update()
parent_af = self.get(parent_uuid)

for dependent_id in registry[p_uuid_str]:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typehint missing

# trigger recomputation / update
if hasattr(dependent_af, "update"):
dependent_af.update()
parent_af = self.get(parent_uuid)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typehint missing

WAS_UPDATED = False

def global_update_mock(self, other=None, *args, **kwargs):
"""Picklable global mock that accepts the 'other' argument."""
Copy link
Copy Markdown
Member

@jensdittrich jensdittrich Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from this comment I do not understand the purpose of this, extend the explanation

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need a global variable here?

store.put(child_af)
store.put(parent_af)

store.put(parent_af)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you putting it twice?, see 249

Test persistent dependency mechanism in Store.

This test covers:
1. Registering a dependency between two AttributeFunctions (AFs)
Copy link
Copy Markdown
Member

@jensdittrich jensdittrich Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is again phrased separate fro mthe subscription mechanism

I would expect the test to create AFs with subscriptions and then check through the store whether notificaitons work through dependencies even in the present of data being not available in main memory

So dependencies can be a way to enable subscriptions to work through the store. But the dependency mechanism should be invisible to the user.

store.register_dependency(parent_af.uuid, child_af.uuid)

store.register_dependency(af1.uuid, af2.uuid)
store.put(child_af)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, every time you save, you put the element into the store again? I guess this leads to the confusion whether the AF needs to be put to the store for every update. I guess the latter is very errorprone.
So, I guess this method should be renamed to register(), i.e. you register once, if updates occur, you don not have to call put every time.
I will change it in the store's API accordingly.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the store API accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants