Waitset not triggering timer when initialized in lifecycle node #2652
Comments
This definitely sounds like a bug.
This is going to force the waitset to always be rebuilt, which is not good for performance. It's likely that the bug is somewhere else (in how timers are added?). I probably won't get a chance to look at this until after ROSCon, but thank you for the full repro example and detailed steps.
Yes, I agree that this sounds like a bug.
I wouldn't rule out that this is "timer-specific", since timers are handled in a slightly different way from other entities.
My assumption as well is that something about the execution order from lifecycle nodes is surfacing this, but it's not something that would be unique to lifecycle nodes.
Hi, I discovered this bug five days ago at fmrico/cascade_lifecycle#14, and I am now solving a directly related problem in PlanSys2 (ros/rosdistro#43212 (comment)). For now I am working around it by changing the SingleThreadedExecutors to EventsExecutors in some cases, and by removing the node from the executor before destroying it in other cases. Basically, if I have two nodes in an executor, each with a timer, and I destroy one of the nodes, the timer callback in the other node is no longer called. I can investigate more if you want.
This issue has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/next-client-library-wg-meeting-friday-6th-december-2024-8am-pt/40954/2
I found the bug: rclcpp/rclcpp/include/rclcpp/wait_result.hpp Lines 176 to 178 in 8c0161a
We have a mismatch here between the size of the rcl waitset and the rclcpp waitset. Therefore the ready timer in the rcl waitset is never checked. The fix is simple, but it is in exposed template code... @clalancette Are we breaking ABI if we patch the templates?
My initial analysis was not correct. The mismatch is between the index in the rcl_waitset and the index in the rclcpp waitset. That's the bug.
It depends on what we are changing, I think. What I'm going to say here is that we open the PR to fix things against |
@wjwwood @mjcarroll This bug was introduced by the rewrite of the executor. It looks like there is a prune missing here:
There is a variable that indicates that we need pruning, but it is never used. Also, the interface is not available at this place...
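The index mismatch and the missing prune described in the comments above can be pictured with a small toy model. This is only a sketch under stated assumptions, not the rclcpp or rcl implementation: `ToyTimer`, `ToyStorage`, `build_low_level`, `next_ready_timer` and `prune_expired` are hypothetical names that use nothing but the C++ standard library. The assumption is that the high-level storage still holds an expired entry while the low-level array was built only from live entries, so the same index refers to different timers in the two containers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a timer; this is not an rclcpp type.
struct ToyTimer
{
  std::string name;
  bool ready;
};

// High-level storage: weak references that may contain expired entries.
using ToyStorage = std::vector<std::weak_ptr<ToyTimer>>;

// Low-level array: built compactly from live entries only.
// A non-null slot means "this timer is ready", nullptr means "not ready".
std::vector<const ToyTimer *> build_low_level(const ToyStorage & storage)
{
  std::vector<const ToyTimer *> slots;
  for (const auto & weak : storage) {
    if (auto timer = weak.lock()) {
      slots.push_back(timer->ready ? timer.get() : nullptr);
    }
  }
  return slots;
}

// Looks up the first ready timer by reusing the low-level index in the
// high-level storage. This is only valid if both containers line up.
std::shared_ptr<ToyTimer> next_ready_timer(
  const ToyStorage & storage, const std::vector<const ToyTimer *> & slots)
{
  for (std::size_t i = 0; i < slots.size() && i < storage.size(); ++i) {
    if (slots[i] != nullptr) {
      if (auto timer = storage[i].lock()) {
        return timer;
      }
    }
  }
  return nullptr;
}

// Drops expired entries so that high-level and low-level indices agree again.
void prune_expired(ToyStorage & storage)
{
  storage.erase(
    std::remove_if(
      storage.begin(), storage.end(),
      [](const std::weak_ptr<ToyTimer> & weak) {return weak.expired();}),
    storage.end());
}
```

In this model, if the storage holds two ready timers and the first one is destroyed, `build_low_level` yields a single ready slot at index 0, but index 0 of the storage is the expired entry, so `next_ready_timer` finds nothing and the surviving timer is never serviced. Calling `prune_expired` before rebuilding realigns the indices and the surviving timer is found again.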
This issue has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/preparing-for-jazzy-sync-and-patch-release-2024-12-23/41213/5
We have also stumbled upon this issue, so I will add our setup and findings in case they are helpful.
Required Info:
DDS implementation:
Minimal example

    #include <chrono>
    #include <memory>

    #include "rclcpp/rclcpp.hpp"

    class TestNode : public rclcpp::Node
    {
    public:
      TestNode() : Node("test_node")
      {
        // Timer 1 fires every 110 ms and is never destroyed.
        m_test_timer = create_timer(std::chrono::milliseconds(110), [this]() {
          RCLCPP_INFO(get_logger(), "Test timer 1 callback.");
        });
        // Timer 2 fires after 500 ms and releases itself from inside its own callback.
        m_test_timer2 = create_timer(std::chrono::milliseconds(500), [this]() {
          RCLCPP_INFO(get_logger(), "Test 2 timer callback.");
          m_test_timer2.reset();
        });
      }

      rclcpp::TimerBase::SharedPtr m_test_timer;
      rclcpp::TimerBase::SharedPtr m_test_timer2;
    };

    int main(int argc, char** argv)
    {
      rclcpp::init(argc, argv);
      auto test_node = std::make_shared<TestNode>();
      // Both timers are serviced by the default executor used by rclcpp::spin.
      rclcpp::spin(test_node);
      rclcpp::shutdown();
      return 0;
    }

Expected behavior:
Results:
Workarounds:
Bug report
Required Info:
Steps to reproduce issue
Expected behavior
Since the service callback destroys only timer_, timer2_ should keep running and Hello, world! should keep printing.
Actual behavior
All timers are stopped and no Hello, world! is printed.
Additional information
The following might shed some light on this.
timer2_ is only created after timer_ is destroyed) the issue does not arise.
|| true here making this condition always evaluate to true, all timers are executed, however I added some printing when timers are cleaned up / added and I can see that every iteration a timer gets added and removed, so something odd is going on there.
So I suspect there is something about how timers (and maybe other entities?) are added to wait sets inside lifecycle node transitions that creates this issue.