-
Notifications
You must be signed in to change notification settings - Fork 54
feat: RAI Semantic Map Memory (rai_semap) #727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b79d56e to
4618982
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #727 +/- ##
=======================================
Coverage 65.32% 65.32%
=======================================
Files 78 78
Lines 3386 3386
=======================================
Hits 2212 2212
Misses 1174 1174 ☔ View full report in Codecov by Sentry. |
maciejmajek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After pyproject.toml changes I was able to run the semap feature. Amazing work!
The design.md mentions various algorithms for spatial/temporal deduplication. Before I reflect on that, I need to read some literature on this topic.
Since this PR is huge, I will continue the review tomorrow.
Co-authored-by: Maciej Majek <46171033+maciejmajek@users.noreply.github.com>
|
|
||
| @abstractmethod | ||
| def spatial_query( | ||
| self, center: Point, radius: float, filters: Optional[Dict[str, Any]] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, I feel like interwining non ROS 2 code with ROS 2 types creates confusion and limits future integrations. Consider using src/rai_core/rai/types. Such separation validates the ros2 directory you've created. Let me know what you think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such separation validates the ros2 directory you've created.
Indeed, time to walk the talk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice adapter pattern: rai/types is an adapter between ROS 2 and the Python system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turned out to be a bit more work than I anticipated! I ended up using both the Pose from rai.types and the ROS2 message type. See the struggle here
The issue is that ROS2 transform functions (tf2_geometry_msgs) require ROS2 message types, but we also need rai.types.Pose for the Pydantic models. Using PoseType = Any with arbitrary_types_allowed=True lets Pydantic accept both, which works but isn't ideal. If you have suggestions for a cleaner approach, I'm all ears!
| self.wait_tool = WaitForSecondsTool() | ||
| self._nav_action_ready = False | ||
|
|
||
| def wait_for_nav_action_server(self, timeout_sec: float = 60.0) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from rai.communication.ros2 import wait_for_ros2_actions - I think we need to add support for timeout in ros2 waiters. The boilerplate below is what we want to eliminate with ros2connectors & their tooling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But since there is no timeout in waiters, its ok as it is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked as #728
src/rai_semap/rai_semap/ros2/node.py
Outdated
| def _initialize_tf(self): | ||
| """Initialize TF buffer and listener. | ||
|
|
||
| Note: ROS2Connector already creates _tf_buffer and _tf_listener internally. | ||
| We access the connector's TF buffer to avoid duplication. | ||
| """ | ||
| # Use the connector's internal TF buffer (already created in __init__) | ||
| self.tf_buffer = self._tf_buffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was left over code, removed.
src/rai_semap/rai_semap/ros2/node.py
Outdated
| map_frame_id = self._get_string_parameter("map_frame_id") | ||
| map_resolution = self._get_double_parameter("map_resolution") | ||
|
|
||
| backend = SQLiteBackend(database_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should most likely be configurable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a tool that can retrieve object types previously seen in a location.
Example:
- The model is asked to bring an object that can appear in different shapes such as a bottle, can, or cup.
- Finding the correct item might take three turns when using QuerySemanticMapTool, but only two turns if it can refer to a SeenObjects record.
At some point this may require context compression, but that can be handled by a future general module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this do ?
class GetSeenObjectsTool(BaseTool):
"""Tool for retrieving object types previously seen in a location."""
name: str = "get_seen_objects"
description: str = (
"Get a list of object types (e.g., 'bottle', 'cup', 'table') that have "
"been previously seen in a location. Useful for discovering what objects "
"exist before querying for specific instances."
)
args_schema: Type[GetSeenObjectsToolInput] = GetSeenObjectsToolInput
memory: SemanticMapMemory
def _run(self, location_id: Optional[str] = None) -> str:
"""Get list of distinct object classes seen in a location."""
pass
I've added skeleton code to support this ask and actual implementation will come in subsequent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, cool
src/rai_semap/rai_semap/ros2/node.py
Outdated
| vision_detection_id=vision_detection_id, | ||
| metadata=metadata if metadata else existing_ann.metadata, | ||
| ) | ||
| self.memory.backend.update_annotation(updated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From architectural (LoD) standpoint, I think that the node should not reach that far (memory.backend). Would it make sense to expose update annotation or similar in memory?
src/rai_semap/rai_semap/ros2/node.py
Outdated
|
|
||
| def _process_detection( | ||
| self, | ||
| detection, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
important type missing
| # It performs object detection and 3D pose computation, which are general perception | ||
| # tasks not specific to semantic mapping. Consider moving to rai_perception when | ||
| # that package has ROS2 node infrastructure in place. | ||
| class DetectionPublisher(ROS2Connector): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually use ROS2Connector through dependency injection. I'm not opposed to inheritance, just would love to know your reasoning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, they're in the ros2 folder and conceptually are ROS2 connectors. Time to walk the talk of loose coupling. I went ahead and refactored it. With dependency injection, the call chain is a bit longer than direct calling.
|
@Juliaj I've finished the review. Huge feature! |
Huge thanks again for all the detailed review feedback! I've deferred a few items for follow-up—one is tracked as an issue, and the other involves moving code to rai_perception. I'll work on these in a subsequent PR to keep this one more focused. |
maciejmajek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM

Purpose
Supports #225, Design SLAM RAI features
rai_semap, a semantic map memory system that stores object detections with their 3D locations in the SLAM map frame. The system enables spatial-semantic queries like "where did I see a cup?" and provides persistent memory for robot exploration and task planning.Proposed Changes
Deduplication: One of the main challenges is handling multiple detections of the same physical object. The system uses spatial merging with class-specific thresholds, point cloud-based matching when available, and confidence filtering to merge duplicate detections. However, distinguishing between a moved object and a new object instance remains a challenge that needs further refinement.
Object Detection Accuracy: The accuracy of object detection directly impacts the quality of the semantic map. The system includes confidence thresholds and bounding box size filtering, but detection accuracy depends on GroundingDINO.
Testing