Along with RTAB-Map, I am using a classifier that operates on RGB images from another camera on the robot. I of course know the transform between that camera's frame and the frame of the stereo camera I use for RTAB-Map.
The classifier is trained to detect certain objects. Once an object is detected, I would like to transform the points corresponding to it from the camera's frame to the stereo camera's frame, for inclusion in the map.
Once I compute those transforms, how do you think I should encode the class name (or ID, or whatever) into the global map constructed by RTAB-Map?
The intention is to manipulate those objects with a gripper in an outdoor setting.
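For the frame-to-frame step, a minimal sketch of what I have in mind (the transform values here are hypothetical, just for illustration):

```python
import numpy as np

def transform_points(T_stereo_from_cam, points_cam):
    """Apply a 4x4 homogeneous transform to an (N, 3) array of points."""
    n = points_cam.shape[0]
    homo = np.hstack([points_cam, np.ones((n, 1))])   # make points homogeneous, (N, 4)
    return (T_stereo_from_cam @ homo.T).T[:, :3]      # transform and drop the w column

# Hypothetical extrinsics: detection camera offset 0.1 m along x from the
# stereo camera, with the same orientation (identity rotation).
T = np.eye(4)
T[0, 3] = 0.1

pts = np.array([[1.0, 0.0, 2.0]])
print(transform_points(T, pts))  # [[1.1 0.  2. ]]
```

In practice the transform would come from the TF tree rather than being hard-coded.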
You will then have a depth image registered to your mono camera. You can find the object's position relative to the robot from the depth values of the pixels belonging to the object, similarly to the 3D-position example of the find_object_2d package.
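The pixel-plus-depth step is the standard pinhole back-projection. A sketch, assuming hypothetical camera intrinsics (fx, fy, cx, cy) and depth in meters:

```python
import numpy as np

def pixel_to_3d(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth (m) to a 3D point in the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Hypothetical intrinsics for a 640x480 camera.
fx = fy = 500.0
cx, cy = 320.0, 240.0

# Pixel 100 px right of the principal point, 2 m away.
p = pixel_to_3d(420.0, 240.0, 2.0, fx, fy, cx, cy)
print(p)  # [0.4 0.  2. ]
```

In ROS you would read the real intrinsics from the camera_info topic instead of hard-coding them.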
There is no built-in way to label the point cloud of the map. However, you could save a segmented point cloud the same way the objects are saved in the map in this example (see this for more info about that example).
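One workaround for carrying the class name along with the segmented cloud is to store a class id as an extra per-point scalar field (many tools will read it as an "intensity" or "label" channel). A minimal sketch, with a hypothetical class table:

```python
import numpy as np

# Per-point layout: position plus an integer class label.
labeled_dtype = np.dtype([("x", np.float32), ("y", np.float32),
                          ("z", np.float32), ("label", np.uint32)])

CLASS_IDS = {"background": 0, "bottle": 1, "can": 2}  # hypothetical class table

def label_points(points, class_name):
    """Pack an (N, 3) array of points into a structured cloud tagged with one class id."""
    cloud = np.zeros(len(points), dtype=labeled_dtype)
    cloud["x"], cloud["y"], cloud["z"] = points[:, 0], points[:, 1], points[:, 2]
    cloud["label"] = CLASS_IDS[class_name]
    return cloud

pts = np.array([[1.0, 0.0, 2.0], [1.1, 0.1, 2.0]], dtype=np.float32)
cloud = label_points(pts, "bottle")
print(cloud["label"])  # [1 1]
```

The structured array maps cleanly onto a PointCloud2 message with an extra field, so the labels survive export alongside the geometry.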