Frequent memory errors in global_manager thread using my own 4 robots bag #30

Open
Whitneyswu opened this issue Dec 8, 2024 · 1 comment


Whitneyswu commented Dec 8, 2024

Hi @MaverickPeter, thanks again for your great work!
When I run the code on my own bag (4 robots in a small house with several rooms), I often get memory-related errors such as Segmentation fault, double free or corruption, and free(): invalid pointer. An example follows. The error type and the line where the error occurs vary from run to run, and because I added my own comments, the line numbers differ from the source code on GitHub.

[ INFO] [1733693194.869503052, 3160.499400000]: merge map: 0.129016s with 3838 points
[ INFO] [1733693195.193198580, 3160.789400000]: merge map: 0.112809s with 3838 points

Thread 8 "global_manager_" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff57cac700 (LWP 86655)]
0x00005555555a3f1b in boost::detail::sp_counted_base::release (this=0x7ffe00019460) at /usr/include/boost/smart_ptr/detail/sp_counted_base_std_atomic.hpp:112
112	            dispose();
(gdb) bt
#0  0x00005555555a3f1b in boost::detail::sp_counted_base::release (this=0x7ffe00019460)
    at /usr/include/boost/smart_ptr/detail/sp_counted_base_std_atomic.hpp:112
#1  0x00005555555a4023 in boost::detail::shared_count::~shared_count (this=0x7fff57ca3f18, 
    __in_chrg=<optimized out>)
    at /usr/include/boost/smart_ptr/detail/shared_count.hpp:427
#2  0x00007fffef8cbf86 in boost::shared_ptr<gtsam::NonlinearFactor>::~shared_ptr (
    this=0x7fff57ca3f10, __in_chrg=<optimized out>) at /usr/include/boost/smart_ptr/shared_ptr.hpp:341
#3  0x00007fffedace5a7 in distributed_mapper::DistributedMapper::createSubgraphInnerAndSepEdges (this=0x7ffdf4004e80, subgraph=...) at /home/nyf/Code/MR-SLAM/MR_SLAM/Mapping/src/global_manager/src/distributed_mapper/distributed_mapper.cpp:43
#4  0x00007fffedacee3d in distributed_mapper::DistributedMapper::loadSubgraphAndCreateSubgraphEdge (this=0x7ffdf4004e80, graph_and_values={...}) at /home/nyf/Code/MR-SLAM/MR_SLAM/Mapping/src/global_manager/src/distributed_mapper/distributed_mapper.cpp:95
#5  0x00007fffef8ad7dc in global_manager::GlobalManager::constructOptimizer (this=0x7fffffffc950, savingMode=false) at /home/nyf/Code/MR-SLAM/MR_SLAM/Mapping/src/global_manager/src/global_manager.cpp:1247
#6  0x00007fffef8adcba in global_manager::GlobalManager::correctPoses (this=0x7fffffffc950) at /home/nyf/Code/MR-SLAM/MR_SLAM/Mapping/src/global_manager/src/global_manager.cpp:1285
#7  0x00007fffef8a6b84 in global_manager::GlobalManager::loopClosingThread (this=0x7fffffffc950) at /home/nyf/Code/MR-SLAM/MR_SLAM/Mapping/src/global_manager/src/global_manager.cpp:582
#8  0x00005555555b7c87 in boost::_mfi::mf0<void, global_manager::GlobalManager>::operator() (this=0x55555563da38, p=0x7fffffffc950) at /usr/include/boost/bind/mem_fn_template.hpp:49
#9  0x00005555555b7ad9 in boost::_bi::list1<boost::_bi::value<global_manager::GlobalManager*> >::operator()<boost::_mfi::mf0<void, global_manager::GlobalManager>, boost::_bi::list0> (this=0x55555563da48, f=..., a=...) at /usr/include/boost/bind/bind.hpp:259
#10 0x00005555555b783c in boost::_bi::bind_t<void, boost::_mfi::mf0<void, global_manager::GlobalManager>, boost::_bi::list1<boost::_bi::value<global_manager::GlobalManager*> > >::operator() (this=0x55555563da38) at /usr/include/boost/bind/bind.hpp:1294
#11 0x00005555555b7498 in boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf0<void, global_manager::GlobalManager>, boost::_bi::list1<boost::_bi::value<global_manager::GlobalManager*> > > >::run (this=0x55555563d900) at /usr/include/boost/thread/detail/thread.hpp:120
#12 0x00007fffefcc343b in ?? () from /lib/x86_64-linux-gnu/libboost_thread.so.1.71.0
#13 0x00007fffefc97609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#14 0x00007fffee577353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

I am sure the config for the 4th robot is correct. Raising the stack limit with ulimit -s 163840, and even doubling that value, did not fix the problem.
I also checked the mutexes, but I did not notice any obvious shared variables that should be locked but are not.
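
For reference, the kind of pattern such a mutex check looks for can be sketched as below. This is only a minimal illustration under assumptions, not code from MR_SLAM or GTSAM: `Factor`, `SharedGraph`, `addFactor`, `snapshot`, and `countValid` are hypothetical names, and `std::shared_ptr` stands in for the `boost::shared_ptr` seen in the backtrace. The idea is that a thread which iterates over a shared list of factor pointers first copies the list while holding the lock, so another thread appending to the list cannot invalidate the iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a graph factor (not gtsam::NonlinearFactor).
struct Factor {
    int id;
};

// Hypothetical container shared between a producer thread (e.g. a callback)
// and a consumer thread (e.g. a loop-closing thread).
class SharedGraph {
public:
    // Called by the producer: append a factor while holding the lock.
    void addFactor(int id) {
        std::lock_guard<std::mutex> lock(mutex_);
        factors_.push_back(std::make_shared<Factor>(Factor{id}));
    }

    // Called by the consumer: copy the pointer list while holding the lock.
    // The caller then works on its private copy, so a concurrent push_back
    // (which may reallocate the vector) cannot invalidate its iteration.
    std::vector<std::shared_ptr<Factor>> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return factors_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::shared_ptr<Factor>> factors_;
};

// Walk a snapshot and count its entries, checking that none is null.
std::size_t countValid(const std::vector<std::shared_ptr<Factor>>& factors) {
    std::size_t n = 0;
    for (const auto& f : factors) {
        assert(f != nullptr);
        ++n;
    }
    return n;
}
```

A snapshot taken before a later `addFactor` call keeps its original size, which is what makes it safe to iterate while another thread keeps adding factors. A data race on an unprotected container like this is only one possible cause of the errors above; it is not confirmed to be the cause here.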

Most errors of this type seem to occur just before the submaps are merged. If one robot repeats another robot's path at the beginning, the error reliably occurs sooner. With different, random paths, however, it sometimes works well and the submaps are merged completely, and sometimes it fails. It's really strange!

I've been struggling with this for days. Please help me, thanks!

Owner

MaverickPeter commented Dec 18, 2024

@Whitneyswu
Hi, thanks for using our project. I have no idea what is happening; could you share your bag with me and I'll try to figure it out?
Currently, I guess it could be something inside the GTSAM library.
