Expanded tags and packets to accept any Python data type and use Arrow-based datastore #22
eywalker commented on Jun 18, 2025
- Major refactor of pod logic to accept tags and packets containing arbitrary Python data types
- New set of data stores (ArrowDataStore) that store Arrow tables and use "semantic_type" metadata to handle special fields such as Path and UUID through registered handlers
- New "semantic_type"-aware hasher for Arrow tables
- New logic for saving computation results and their associated tags into result_store and tag_store using CachedFunctionPod
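The "semantic_type" idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it uses plain dicts and the standard library instead of Arrow tables, and `SEMANTIC_HANDLERS`, `encode_row`, and `semantic_hash` are hypothetical names. The point it shows is that a registered handler converts a special value (a path, a UUID) to a plain representation, and that the hash covers the recorded semantic type as well as the value, so a path and a string with the same text do not collide.

```python
import hashlib
import json
import uuid
from pathlib import PurePosixPath

# Hypothetical registry: semantic type name -> (Python type, encoder to a stable string).
# The real handlers registered in this PR may look quite different.
SEMANTIC_HANDLERS = {
    "path": (PurePosixPath, lambda p: p.as_posix()),
    "uuid": (uuid.UUID, lambda u: u.hex),
}


def encode_row(row):
    """Split one tag or packet into plain values plus per-field semantic types."""
    values, semantic_types = {}, {}
    for key, value in row.items():
        for name, (py_type, encode) in SEMANTIC_HANDLERS.items():
            if isinstance(value, py_type):
                values[key] = encode(value)
                semantic_types[key] = name
                break
        else:
            # No handler matched: keep the value as-is (assumed JSON-serializable).
            values[key] = value
    return values, semantic_types


def semantic_hash(row):
    """Hash the encoded values together with their semantic types."""
    values, semantic_types = encode_row(row)
    payload = json.dumps(
        {"values": values, "semantic_types": semantic_types}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


as_path = semantic_hash({"file": PurePosixPath("data/a.txt"), "n": 1})
as_text = semantic_hash({"file": "data/a.txt", "n": 1})
print(as_path != as_text)
```

Sorting the keys before hashing makes the digest independent of field order, which is one reasonable choice for tags and packets; whether the PR's Arrow hasher treats column order the same way is not stated here.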
…unction info extractor support
Pull Request Overview
This PR refactors multiple components to support arbitrary Python data types and integrates a new Arrow-based datastore. Key changes include renaming parameters (e.g. replacing “store_name” with “function_name” and “content_hash” with “function_hash”), updating hasher factory methods to use the new API, and modifying various import paths and error messages to reflect the architectural changes.
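To make the renamed parameters concrete, here is a rough sketch of how a cache keyed by `function_name` and `function_hash` could save results and their tags in two separate stores, as the PR description outlines for result_store and tag_store. Everything in it is an assumption for illustration: `InMemoryStore`, `CachingPod`, `packet_hash`, and the `memoize`/`retrieve` method names are hypothetical stand-ins and are not taken from the orcabridge code.

```python
import hashlib
import json

_MISSING = object()  # sentinel so that a cached None is still a cache hit


def packet_hash(packet):
    """Order-independent hash of a JSON-serializable packet (assumption)."""
    payload = json.dumps(packet, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class InMemoryStore:
    """Hypothetical store keyed by function identity plus a packet key."""

    def __init__(self):
        self._records = {}

    def memoize(self, function_name, function_hash, key, value):
        self._records[(function_name, function_hash, key)] = value
        return value

    def retrieve(self, function_name, function_hash, key, default=None):
        return self._records.get((function_name, function_hash, key), default)


class CachingPod:
    """Hypothetical pod that reuses stored results for repeated packets."""

    def __init__(self, function, function_hash, result_store, tag_store):
        self.function = function
        self.function_name = function.__name__
        self.function_hash = function_hash
        self.result_store = result_store
        self.tag_store = tag_store
        self.calls = 0  # number of times the wrapped function actually ran

    def __call__(self, tag, packet):
        key = packet_hash(packet)
        ident = (self.function_name, self.function_hash)
        result = self.result_store.retrieve(*ident, key, default=_MISSING)
        if result is _MISSING:
            self.calls += 1
            result = self.result_store.memoize(*ident, key, self.function(**packet))
        # Record the latest tag seen for this result in the separate tag store.
        self.tag_store.memoize(*ident, key, tag)
        return tag, result


def add(x, y):
    return {"sum": x + y}


pod = CachingPod(add, "example-hash", InMemoryStore(), InMemoryStore())
pod({"run": 1}, {"x": 1, "y": 2})
pod({"run": 2}, {"x": 1, "y": 2})
print(pod.calls)
```

Keying on the function's identity rather than on a store name means that changing the function (and therefore its hash) naturally invalidates old entries, which is a plausible motivation for the rename, though the PR text does not spell this out.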
Reviewed Changes
Copilot reviewed 71 out of 83 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/* | Updates to test cases with revised parameter names and updated import paths |
| src/orcabridge/store/* | Refactor of DirDataStore and SafeDirDataStore to use new function-based naming conventions |
| src/orcabridge/hashing/* | Adjustments to hasher implementations and defaults to align with new API, including renaming in factory methods |
| notebooks/* | Notebook examples updated with new import paths and parameter names |