I managed to get a backtrace for the segmentation fault using GDB.

It seems like the crash is happening in sword::FileMgr::open( ...

The starting point is sword::InstallMgr::refreshRemoteSource, as I wrote before.

Best regards,
Tobias

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f1af3fff700 (LWP 220833)]
0x00007f1b027045a4 in sword::FileMgr::open(char const*, int, int, bool) () from /home/tobi/dev/ezra_project/node-sword-interface-git/build/Release/node_sword_interface.node
(gdb) backtrace
#0  0x00007f1b027045a4 in sword::FileMgr::open(char const*, int, int, bool) () from /home/tobi/dev/ezra_project/node-sword-interface-git/build/Release/node_sword_interface.node
#1  0x00007f1b0276ad7b in sword::(anonymous namespace)::my_fwrite(void*, unsigned long, unsigned long, void*) () from /home/tobi/dev/ezra_project/node-sword-interface-git/build/Release/node_sword_interface.node
#2  0x00007f1b180626bf in ?? () from /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4
#3  0x00007f1b18074a2b in ?? () from /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4
#4  0x00007f1b1807e2e4 in ?? () from /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4
#5  0x00007f1b1807f6f9 in curl_multi_perform () from /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4
#6  0x00007f1b18075d13 in curl_easy_perform () from /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4
#7  0x00007f1b0276b683 in sword::CURLFTPTransport::getURL(char const*, char const*, sword::SWBuf*) () from /home/tobi/dev/ezra_project/node-sword-interface-git/build/Release/node_sword_interface.node
#8  0x00007f1b0271d5d2 in sword::InstallMgr::remoteCopy(sword::InstallSource*, char const*, char const*, bool, char const*) () from /home/tobi/dev/ezra_project/node-sword-interface-git/build/Release/node_sword_interface.node
#9  0x00007f1b0271edc7 in sword::InstallMgr::refreshRemoteSource(sword::InstallSource*) () from /home/tobi/dev/ezra_project/node-sword-interface-git/build/Release/node_sword_interface.node
#10 0x00007f1b026ad734 in RepositoryInterface::refreshIndividualRemoteSource(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::function<void (unsigned int)>*) () from /home/tobi/dev/ezra_project/node-sword-interface-git/build/Release/node_sword_interface.node
#11 0x00007f1b026b17dd in std::thread::_State_impl<std::thread::_Invoker<std::tuple<int (RepositoryInterface::*)(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::function<void (unsigned int)>*), RepositoryInterface*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::function<void (unsigned int)>*> > >::_M_run() () from /home/tobi/dev/ezra_project/node-sword-interface-git/build/Release/node_sword_interface.node
#12 0x00007f1b1d622cb4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#13 0x00007f1b1e20a609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#14 0x00007f1b1e131103 in clone () from /lib/x86_64-linux-gnu/libc.so.6

On 10/13/20 1:07 PM, Tobias Klein wrote:

Hi Troy,

I tested more SVN revisions of SWORD trunk (starting from my stable version and moving forward until I hit the bug) and I can now say the following:

SVN Rev. 3759 is the last SVN revision that works without hanging in the scenario described below (20 out of 20 tests successful).

SVN Rev. 3760 is the first SVN revision where the hanging occurs. The commit message is "First cut at better isolation of FileIO to FileMgr and providing a WIN32 impl with works with wchar_t".

Modified files:
include/filemgr.h
include/swbuf.h
lib/bcppmake/libsword.bpr
src/mgr/curlftpt.cpp
src/mgr/curlhttpt.cpp
src/mgr/filemgr.cpp
src/mgr/installmgr.cpp
src/mgr/swmgr.cpp
src/utilfuns/utilstr.cpp

Maybe this helps to find the root cause.

Best regards,
Tobias

On 10/12/20 9:20 PM, Tobias Klein wrote:

I'll see whether I can collect a stack trace. It may take some time until I have it.

The multi-threaded "remote source refreshing" worked without issues until recently.

Here is the code of the function that does the actual work in a thread.
See https://github.com/tobias-klein/node-sword-interface/blob/787160ccb4b3bab2a762d22f74031c7237edc803/src/sword_backend/repository_interface.cpp#L105.

int RepositoryInterface::refreshIndividualRemoteSource(string remoteSourceName, std::function<void(unsigned int progress)>* progressCallback)
{
    //cout << "Refreshing source " << remoteSourceName << endl << flush;
    InstallSource* source = this->getRemoteSource(remoteSourceName);
    int result = this->_installMgr->refreshRemoteSource(source);
    if (result != 0) {
        cerr << "Failed to refresh source " << remoteSourceName << endl << flush;
    }
    remoteSourceUpdateMutex.lock();
    this->_remoteSourceUpdateCount++;
    unsigned int totalPercent = (unsigned int)calculateIntPercentage<double>(this->_remoteSourceUpdateCount,
                                                                             this->_remoteSourceCount);
    if (progressCallback != 0) {
        (*progressCallback)(totalPercent);
    }
    remoteSourceUpdateMutex.unlock();
    return result;
}
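
As a side note on the locking part of the function above, here is a minimal sketch of the same progress bookkeeping written with std::lock_guard instead of manual lock()/unlock() calls. ProgressTracker and sourceFinished are hypothetical names used only for illustration; they are not part of node-sword-interface or SWORD, and the plain integer percentage stands in for calculateIntPercentage.

```cpp
#include <functional>
#include <mutex>

// Hypothetical stand-in for the shared progress state (not the real RepositoryInterface).
class ProgressTracker {
public:
    explicit ProgressTracker(unsigned int totalSources) : _total(totalSources) {}

    // Records one finished source and reports the overall percentage.
    // The lock_guard releases the mutex on every exit path, including
    // an exception thrown by the callback.
    unsigned int sourceFinished(const std::function<void(unsigned int)>* progressCallback) {
        std::lock_guard<std::mutex> lock(_mutex);
        ++_done;
        // Integer percentage; assumed to approximate what calculateIntPercentage returns.
        unsigned int percent = (_total == 0) ? 100u : (_done * 100u) / _total;
        if (progressCallback != nullptr && *progressCallback) {
            (*progressCallback)(percent);
        }
        return percent;
    }

private:
    std::mutex _mutex;
    unsigned int _done = 0;
    unsigned int _total;
};
```

Each worker thread would call sourceFinished(progressCallback) once after its refresh returns. This only protects the counter and the callback; it does not make the refresh itself thread-safe.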

Best regards,
Tobias

On 10/12/20 9:01 PM, Troy A. Griffitts wrote:
Any luck getting a stack trace on crash?

Regarding the "multithreaded mode", I'd need a bit more information about exactly how you are sharing SWORD objects across your threads. As a general rule, you shouldn't. We recommend a separate instance of SWMgr per thread, and that probably goes for InstallMgr as well.
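
To illustrate the "one instance per thread" rule, here is a minimal sketch in which every worker thread constructs its own manager object instead of sharing one. FakeInstallMgr is a hypothetical stand-in, not the SWORD API; how a real InstallMgr would be constructed per thread is not shown in this thread and is left out on purpose.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a manager class with unsynchronized internal
// state. It is NOT the SWORD InstallMgr.
class FakeInstallMgr {
public:
    int refreshRemoteSource(const std::string& sourceName) {
        _lastSource = sourceName;  // state that would race if the object were shared
        return sourceName.empty() ? -1 : 0;
    }

private:
    std::string _lastSource;
};

// Refreshes every source on its own thread, each with its own manager instance.
std::vector<int> refreshAll(const std::vector<std::string>& sourceNames) {
    std::vector<int> results(sourceNames.size(), -1);
    std::vector<std::thread> workers;
    workers.reserve(sourceNames.size());
    for (std::size_t i = 0; i < sourceNames.size(); ++i) {
        workers.emplace_back([&sourceNames, &results, i]() {
            FakeInstallMgr mgr;  // created inside the thread, never shared
            // Each thread writes only its own slot, so no lock is needed here.
            results[i] = mgr.refreshRemoteSource(sourceNames[i]);
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return results;
}
```

Calling refreshAll with a list of source names returns one result code per source, in the same order. The only data shared between threads is the read-only name list and the pre-sized result vector.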

Troy

On October 12, 2020 8:29:31 PM GMT+02:00, Tobias Klein <cont...@tklein.info> wrote:

    Hi Troy,

    I'm using curl on all three platforms.

    Regarding the timeout configuration, I have not changed anything
    yet. Making this configurable in Ezra Project is still on my
    todo list.

    I just checked on Linux.
    With the old version (May 18th 2020): no hang or crash in 10
    out of 10 runs.
    With the new version (latest trunk / SWORD 1.9 RC3): 1 x
    crash, 2 x hanging, 7 x working.

    I'm running the InstallMgr::refreshRemoteSource "in a
    multi-threaded mode".

    Best regards,
    Tobias

    On 10/12/20 6:59 PM, Troy A. Griffitts wrote:
    Hi Tobias,

    What transport library are you building with? ftplib or curl?

    Have you changed the value of our new timeout from the default,
    I believe we decided on, 10 seconds?

    Troy

    On October 12, 2020 6:46:54 PM GMT+02:00, Tobias Klein
    <cont...@tklein.info> wrote:

        Hi Troy,

        In my latest Ezra Project builds using SWORD trunk I’ve been noticing
        random "hangs" and crashes related to "updating remote sources". I
        suppose it must be around InstallMgr::refreshRemoteSource.

        This was still rock solid when using SWORD trunk from May 18th 2020,
        but not so any more with the recent SWORD trunk.

        Unfortunately I cannot pinpoint this more specifically. I just wanted
        to first share this observation, because it’s worrying me.

        I’ve been noticing this regression both on Windows and macOS. Need to
        check later whether this also happens on Linux, cannot recall it
        right now.

        Best regards,
        Tobias
        ------------------------------------------------------------------------
        sword-devel mailing list:sword-devel@crosswire.org
        http://crosswire.org/mailman/listinfo/sword-devel
        Instructions to unsubscribe/change your settings at above page


-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

