feature: add icebug-format implementations #331
Open: aheev wants to merge 1 commit into LadybugDB:main from aheev:dev
+21
−0
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| # Icebug Format Implementation | ||
|
|
||
| *Note: The term `structure` is used here to mean either a file or an in-memory structure.* | ||
|
|
||
| ## Syntax | ||
| We might need to extend `ATTACH` to support attaching an Icebug-formatted graph. This would create node/rel tables in the database and point to the CSR structures. | ||
|
|
||
| Alternatively, we can create a new syntax: `ATTACH GRAPH`. | ||
aheev marked this conversation as resolved.
|
||
|
|
||
| ## Metadata File | ||
| The metadata file (passed to `ATTACH`) describes the structure of the graph, including node labels, node table structures, and relationship structures, which will be used to create the corresponding tables in the database (no `schema.cypher` is needed). It also includes information about the CSR structures and how they map to the node/rel tables in the database. | ||
aheev marked this conversation as resolved.
|
||
|
|
||
| We will not use the `CREATE NODE TABLE` and `CREATE REL TABLE` syntax to create these tables. Instead, we will call the internal APIs directly to create the tables from the metadata file. This approach avoids exposing public APIs that users might accidentally use to corrupt the Icebug-formatted tables. | ||
|
|
||
| We won't need to enforce any naming conventions on file names. | ||
|
|
||
| ## New Icebug Classes | ||
| We cannot directly use the existing Arrow/Parquet node and rel table classes because they are specific to their respective table storage formats (although we will copy the relevant code into the new classes). In the future, we might use a different storage format (or a mix) for these tables. Therefore, we will create new classes, `icebug_node_table` and `icebug_rel_table`, which implement the `node_table` and `rel_table` classes respectively, to represent the node and relationship tables in the Icebug format. CSR structures will be linked in these classes during initialization, similar to how `parquet_rel_table` currently operates. | ||
|
|
||
| ## `WITH storage = 'arrow'` and `WITH storage = 'parquet'` | ||
| We will need to delegate these queries to DuckDB. The current Arrow/Parquet node and rel table classes will be revamped to support this delegation. | ||
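
To make the idea of linking CSR structures into `icebug_rel_table` during initialization more concrete, here is a minimal C++ sketch. It is not taken from the LadybugDB code base: the `rel_table` base class shown here is a one-method stand-in for the real interface, and `csr_structure`, `neighbors`, and `name` are hypothetical names chosen for illustration. The only point it demonstrates is that the table receives its CSR structure in the constructor and answers adjacency lookups from it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the engine's relationship table interface.
// The real base class is assumed to be much larger than this.
class rel_table {
public:
    virtual ~rel_table() = default;
    virtual std::vector<std::uint64_t> neighbors(std::uint64_t src) const = 0;
};

// Compressed sparse row adjacency (illustrative layout): the neighbors of
// node i are targets[offsets[i]] up to, but excluding, targets[offsets[i + 1]].
struct csr_structure {
    std::vector<std::uint64_t> offsets;  // size = node count + 1
    std::vector<std::uint64_t> targets;  // size = edge count
};

// Sketch of an Icebug rel table that is linked to its CSR structure at
// construction time, independent of how the CSR was loaded (file or memory).
class icebug_rel_table final : public rel_table {
public:
    icebug_rel_table(std::string name, std::shared_ptr<const csr_structure> csr)
        : name_(std::move(name)), csr_(std::move(csr)) {
        // Basic CSR invariants: at least one offset, and the last offset
        // equals the number of stored edges.
        assert(csr_ && !csr_->offsets.empty());
        assert(csr_->offsets.back() == csr_->targets.size());
    }

    std::vector<std::uint64_t> neighbors(std::uint64_t src) const override {
        // Unknown source nodes simply have no neighbors.
        if (csr_->offsets.size() < 2 || src >= csr_->offsets.size() - 1) {
            return {};
        }
        const auto first = csr_->targets.begin() +
                           static_cast<std::ptrdiff_t>(csr_->offsets[src]);
        const auto last = csr_->targets.begin() +
                          static_cast<std::ptrdiff_t>(csr_->offsets[src + 1]);
        return std::vector<std::uint64_t>(first, last);
    }

    const std::string& name() const { return name_; }

private:
    std::string name_;
    std::shared_ptr<const csr_structure> csr_;
};
```

Holding the CSR structure behind a shared pointer to a const object is one possible design choice: it lets forward and backward tables, or several attached graphs, reference the same read-only structure without copying it.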
The fundamental idea behind DuckLake is that YAML/JSON metadata files are brittle and should be replaced with the system catalog of a SQL database. It implements DuckDB, PostgreSQL, and MySQL as system catalogs.
Similarly, the main idea behind Icebug Disk is to avoid the YAML/JSON that Apache GraphAr uses to represent the graph and replace it with LadybugDB's system catalog. Perhaps people can implement it with other Cypher implementations such as ArcadeDB or Grafeo.
Then a compliant implementation would do:
where `mygraph.lbdb` is a small LadybugDB file with only the catalog, but no data.
Such a `mygraph.lbdb` would be created via `lbug -i schema.cypher` and shared with many users, who could use it to connect to the graph lake backed by Icebug Disk.