Hi Jan, here you go: https://github.com/emlix/couchdb-yocto
The patch I mentioned is here:
https://github.com/emlix/couchdb-yocto/blob/main/meta-couchdb/recipes-core/couchdb/files/0001-swap-fds.patch

When you run the compaction test (see the README for how to get there)

/usr/lib/test-couchdb/test-compaction.sh

you will find this as the last line of the log (/var/log/couchdb/couch.log):

[debug] [<0.173.0>] before gen_server:call

Thanks,
Stefan

On 02.03.23 at 13:45, Jan Lehnardt wrote:
> Hi Stefan,
>
> Thanks for the additional info. I’m happy to try a yocto build here.
>
> Best
> Jan
> --
>
>> On 2. Mar 2023, at 12:24, Stefan Kral <stefan.k...@emlix.com> wrote:
>>
>> Hi,
>>
>> I can give you some background context: our CouchDB instance is running
>> on an embedded device (with a minimal attack surface, so we have no
>> pressure to mitigate CVEs). CouchDB was chosen because of its append-only
>> writes and power-fail-safe behavior (and because of the easily scriptable
>> curl/JSON interface).
>>
>> Currently there is a production system running on an SMB1 share (mounted
>> on a Linux host) which works well (at least for our use cases). SMB1 is
>> no longer the default on the Windows remote side, and SMB2/3 has an
>> issue with opening a renamed but not yet closed file descriptor. The
>> question is whether we can solve this issue with minimal changes.
>>
>>> 1. How did you verify that the gen_server:call/3 call never returns?
>>> 2. Do you get any pertinent lines (especially crashes) in your
>>> couch.log?
>>
>> By adding:
>>
>>> + ?LOG_DEBUG("before gen_server:call", []),
>>>   ok = gen_server:call(Db#db.main_pid, {db_updated, NewDb3},
>>>   infinity),
>>> + ?LOG_DEBUG("after gen_server:call", []),
>>
>> the log gives:
>>
>>> [Thu, 02 Mar 2023 10:36:24 GMT] [debug] [<0.391.0>] Compaction process
>>> spawned for db "asdf"
>>> [Thu, 02 Mar 2023 10:36:24 GMT] [debug] [<0.84.0>] New task status for
>>> <0.391.0>: [{changes_done,1},
>>> {database,<<"asdf">>},
>>> {progress,100},
>>> {started_on,1677753384},
>>> {total_changes,1},
>>>
>>> {type,database_compaction},
>>> {updated_on,1677753384}]
>>> [Thu, 02 Mar 2023 10:36:24 GMT] [debug] [<0.366.0>] CouchDB swapping files
>>> .../asdf.couch and .../asdf.couch.compact.
>>> [Thu, 02 Mar 2023 10:36:24 GMT] [debug] [<0.366.0>] before gen_server:call
>>
>> Then nothing happens for a long time...
>>
>> Refreshing the db in the Futon web GUI gives no response,
>>
>> and the log continues with:
>>
>>> [Thu, 02 Mar 2023 11:02:54 GMT] [error] [<0.144.0>] ** Generic server
>>> couch_compaction_daemon terminating
>>> ** Last message in was {'EXIT',<0.145.0>,
>>> {timeout,
>>> {gen_server,call,[couch_server,get_server]}}}
>>> ** When Server state == {state,<0.145.0>}
>>> ** Reason for termination ==
>>> ** {compaction_loop_died,
>>> {timeout,{gen_server,call,[couch_server,get_server]}}}
>>>
>>> [Thu, 02 Mar 2023 11:02:54 GMT] [error] [<0.144.0>] {error_report,<0.31.0>,
>>> {<0.144.0>,crash_report,
>>> [[{initial_call,
>>> {couch_compaction_daemon,init,['Argument__1']}},
>>> {pid,<0.144.0>},
>>> {registered_name,couch_compaction_daemon},
>>> {error_info,
>>> {exit,
>>> {compaction_loop_died,
>>> {timeout,
>>> {gen_server,call,[couch_server,get_server]}}},
>>> [{gen_server,terminate,7,
>>> [{file,"gen_server.erl"},{line,804}]},
>>> {proc_lib,init_p_do_apply,3,
>>> [{file,"proc_lib.erl"},{line,237}]}]}},
>> ...
>>
>>
>>> 3. Can you share your environment where you get to compile 1.6.1
>>> successfully, so we can try and reproduce this?
>>
>> I could prepare a Yocto setup for you that builds a toolchain and
>> packages for a QEMU/Docker image, if you are familiar with that build
>> system...
>>
>>> 4. Could it be that your SMB implementation doesn’t allow for opening
>>> and closing files in this quick succession (with or without a rename
>>> in the mix)?
>>
>> For testing, it doesn't need to run on an SMB share: the timeout issue
>> occurs with the given fd-swap patch on a default (Linux) setup.
>>
>> And an strace log does not show any underlying FS issues.
>>
>>
>> Best,
>> Stefan
>>
>> On 28.02.23 at 16:47, Jan Lehnardt wrote:
>>> First off, CouchDB 1.6.1 is no longer supported by this project AND it
>>> has a long list of CVEs[1] against it. You REALLY should be operating
>>> on a newer version.
>>>
>>> Secondly, just to understand your motivation: you think closing and
>>> opening the fds after the file:rename/2 call will make things work
>>> for your SMB operation?
>>>
>>> If yes, the only thing I could spot that is substantially different is
>>> that the NewFd position is advanced implicitly by the underlying
>>> file:pread/3 in [2] and your SwappedFd doesn’t get the same treatment,
>>> but I don’t know why that should block the gen_server call, as that only
>>> does some refcounting updates[3]. While this includes stopping the
>>> gen_server[4], I don’t see how the Pid this operates on should be any
>>> different under your patch.
>>>
>>> So:
>>>
>>> 1. How did you verify that the gen_server:call/3 call never returns?
>>> 2. Do you get any pertinent lines (especially crashes) in your couch.log?
>>> 3. Can you share your environment where you get to compile 1.6.1
>>> successfully, so we can try and reproduce this?
>>> 4. Could it be that your SMB implementation doesn’t allow for opening and
>>> closing files in this quick succession (with or without a rename in
>>> the mix)?
>>>
>>>
>>> [1]: https://docs.couchdb.org/en/stable/cve/index.html
>>> [2]: https://github.com/apache/couchdb/blob/1.6.x/src/couchdb/couch_db_updater.erl#L179
>>> [3]: https://github.com/apache/couchdb/blob/1.6.x/src/couchdb/couch_db.erl#L1122-L1130
>>> [4]: https://github.com/apache/couchdb/blob/1.6.x/src/couchdb/couch_ref_counter.erl#L84
>>>
>>>
>>> Best
>>> Jan
>>> --
>>> Professional Support for Apache CouchDB:
>>> https://neighbourhood.ie/couchdb-support/
>>>
>>> 24/7 Observation for your CouchDB Instances:
>>> https://opservatory.app
>>>
>>>
>>>> On 28. Feb 2023, at 10:19, Stefan Kral <stefan.k...@emlix.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm experimenting with a CouchDB setup on an SMB mount point. I know this
>>>> is not supported, but I ran into a (maybe simple) problem I don't
>>>> understand. Maybe one of you can easily give me a hint (that would be
>>>> amazing).
>>>>
>>>> Given the following patch (I need to close/reopen the file descriptors
>>>> after renaming) for the function
>>>> https://github.com/apache/couchdb/blob/1.6.x/src/couchdb/couch_db_updater.erl#L176
>>>>
>>>>>  1 --- a/src/couchdb/couch_db_updater.erl
>>>>>  2 +++ b/src/couchdb/couch_db_updater.erl
>>>>>  3 @@ -202,8 +202,18 @@ handle_call({compact_done, CompactFilepath}, _From, #db{filepath=Path}=Db) ->
>>>>>  4      RootDir = couch_config:get("couchdb", "database_dir", "."),
>>>>>  5      couch_file:delete(RootDir, Filepath),
>>>>>  6      ok = file:rename(CompactFilepath, Filepath),
>>>>>  7 +
>>>>>  8 +    ok = couch_file:close(NewDb#db.updater_fd),
>>>>>  9 +    ok = couch_file:close(NewDb#db.fd),
>>>>> 10 +    {ok, SwappedFd} = couch_file:open(Filepath),
>>>>> 11 +    SwappedReaderFd = open_reader_fd(Filepath, Db#db.options),
>>>>> 12 +    SwappedDb = NewDb2#db{
>>>>> 13 +        fd = SwappedReaderFd,
>>>>> 14 +        updater_fd = SwappedFd
>>>>> 15 +    },
>>>>> 16 +    unlink(SwappedFd),
>>>>> 17      close_db(Db),
>>>>> 18 -    NewDb3 = refresh_validate_doc_funs(NewDb2),
>>>>> 19 +    NewDb3 = refresh_validate_doc_funs(SwappedDb),
>>>>> 20      ok = gen_server:call(Db#db.main_pid, {db_updated, NewDb3}, infinity),
>>>>> 21      couch_db_update_notifier:notify({compacted, NewDb3#db.name}),
>>>>> 22      ?LOG_INFO("Compaction for db \"~s\" completed.", [Db#db.name]),
>>>>
>>>> then the gen_server:call() on line 20 never returns.
>>>>
>>>> Is there a major issue with this approach or just a minor mistake in my
>>>> implementation?
>>>>
>>>>
>>>> Thank you for having a look,
>>>> Stefan
>>>
>>>
>
--
Visit us at Embedded World 2023
14 to 16 March 2023 | Messe Nürnberg
You will find us in Hall 4, Stand 336

Dipl.-Ing. Stefan Kral, emlix GmbH, http://www.emlix.com
Phone +49 30 275911-00, Fax -33
Panoramastraße 1, 10178 Berlin, Germany
Registered office: Göttingen, Amtsgericht Göttingen HR B 3160
Management: Heike Jordan, Dr. Uwe Kracke
VAT ID: DE 205 198 055

emlix - smart embedded open source
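
For anyone reproducing this, the check described at the top of the thread (a "before gen_server:call" debug line with no "after gen_server:call" line following it) could be scripted roughly as below. This is only a sketch under assumptions: the function name check_compaction_hang is made up for illustration, and it presumes that both ?LOG_DEBUG lines quoted in the thread are compiled in and that debug logging is enabled.

```shell
#!/bin/sh
# Hypothetical helper (not part of CouchDB or the couchdb-yocto repo):
# reports whether a couch.log shows the hang signature from this thread,
# i.e. a "before gen_server:call" line that is not followed by an
# "after gen_server:call" line.
check_compaction_hang() {
    log_file="$1"

    # Line numbers of the last "before" and last "after" marker.
    # grep -n prefixes each match with "<line number>:".
    before=$(grep -n 'before gen_server:call' "$log_file" | tail -n 1 | cut -d: -f1)
    after=$(grep -n 'after gen_server:call' "$log_file" | tail -n 1 | cut -d: -f1)

    # Treat a missing marker as line 0.
    before=${before:-0}
    after=${after:-0}

    if [ "$before" -eq 0 ]; then
        echo "no-marker"   # the debug lines never showed up in the log
    elif [ "$before" -gt "$after" ]; then
        echo "hung"        # the call was entered but did not return
    else
        echo "returned"    # the call completed at least as recently as it was entered
    fi
}
```

After running /usr/lib/test-couchdb/test-compaction.sh, calling `check_compaction_hang /var/log/couchdb/couch.log` would be expected to print "hung" in the situation described in this thread.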