> Admittedly, I would like to know why this is being done in this fashion, but that is tangential to my issue.
IIRC, this is a limitation given to us by the AWS C++ SDK. See [1]. The AWS C++ SDK has static state and they do not manage it with static local variables. As a result, the initialization and finalization order is (IIRC) undefined (or at least not very well defined).

> Now for my question: this is all fine and well in the context of developing your own stand-alone program and such.
> However, what happens when you live in an embedded world in which your code lies many layers below main() and
> you don't have access to main(), even if you wanted to follow the prescribed pattern? I mean, we are expected to wind
> up and then down in an on-demand fashion, allocating and then freeing all resources respectively. I pulled the init/finalize
> out to the outermost layer that I have any involvement with, yet I see the following error messages

I'm not familiar with embedded programming models. Is there a main somewhere? If so, can you pass the responsibility on to your caller (whoever has the main)? Or does some kind of component-level initialization exist?

If not, then you can try and play games with static variables, but I think that would violate "freeing all resources respectively". However, Arrow-C++ itself has static state (e.g. CPU & I/O thread pools), so unless you are unloading the library, it's not clear that you will be freeing all resources anyways.

[1] https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/basic-use.html

On Sun, Dec 1, 2024 at 1:02 PM Jerry Adair <jerry.ad...@sas.com.invalid> wrote:

> Hi,
>
> I have a question regarding the initialization/finalization of the S3
> filesystem within the Arrow filesystem library. Apologies if this question
> has been raised in the past; I did perform a search but that search didn't
> turn up anything. I did read the thread that discussed the issue of
> init/finalize, though nothing I found made it clear when the addition of
> the finalize method surfaced.
> I thought I read mention that it occurred
> around version 12.0.0, but not certain. That's just a side note really, I
> am curious to know when it came about, because we had been using an old
> version of the libraries (8.0.0) and it didn't exist within that version.
> But I digress.
>
> So my issue and the question I have surrounds this notion of timing. The
> aforementioned thread that I read made it clear that the init/finalize
> should take place at the beginning and the end of main():
>
>
> // Snipped for brevity reasons
> int main()
> {
>     // More snipping
>     arrow::Status initializeStatus = arrow::fs::InitializeS3( globalOptions );
>     ...
>     arrow::Status finalizeStatus = arrow::fs::FinalizeS3();
> } /* end of your main() entry point*/
>
>
> The thread also made it clear that this bookended init/finalize should not
> occur within a class definition, most likely in the constructor/destructor
> respectively.
>
> So OK. While I am not familiar with the reason that this structure became
> "a thing" within the Arrow filesystem library, it is indeed that way now.
> Admittedly, I would like to know why this is being done in this fashion,
> but that is tangential to my issue. Now for my question: this is all fine
> and well in the context of developing your own stand-alone program and
> such. However, what happens when you live in an embedded world in which
> your code lies many layers below main() and you don't have access to
> main(), even if you wanted to follow the prescribed pattern? I mean, we
> are expected to wind up and then down in an on-demand fashion, allocating
> and then freeing all resources respectively.
> I pulled the init/finalize
> out to the outermost layer that I have any involvement with, yet I see the
> following error messages:
>
>
> 2024-11-26T04:55:10,917 DEBUG [00000007] () App.parquet - Could not create a AWS filesystem object
> 2024-11-26T04:55:10,917 DEBUG [00000007] () App.parquet - parquetFileReader): Exception exit, reason = Unable to create a file system object on AWS server: Invalid: S3 subsystem is finalized
>
>
> This occurs because the first spool-up/spool-down worked successfully, but
> then when we are called sometime thereafter, the finalize method has
> already done its thing, thus we can't initialize again. Obviously, I know
> why this is occurring, that is straightforward, I don't need an explanation
> for that. The question is what can I do about this in my environment where
> no access to main() is available and we must exist/not-exist on-demand?
> Surely I am not the only one in this development scenario who has been
> faced with this issue. So what is the solution here? Anyone else faced
> this? Help?
>
> Thanks,
> Jerry
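
For illustration only, here is a rough sketch of the "play games with static variables" option mentioned in the reply: initialize lazily, at most once per process, and never finalize on wind-down. Everything in it is hypothetical. The `fake_s3` namespace is a stand-in that only mimics the behaviour described in the thread (no re-initialization after finalize); it is not the Arrow API, and `EnsureS3Ready`, `ComponentWindDown` and `SimulateCycles` are made-up names. In real code the stand-in calls would presumably be replaced by `arrow::fs::InitializeS3` and friends.

```cpp
#include <cassert>

// Hypothetical stand-in for the S3 subsystem. It only models the behaviour
// reported in the thread: once finalized, it cannot be initialized again.
namespace fake_s3 {
enum class State { kUninitialized, kInitialized, kFinalized };

inline State& state() {
  static State s = State::kUninitialized;
  return s;
}

inline bool Initialize() {
  if (state() == State::kFinalized) return false;  // "S3 subsystem is finalized"
  state() = State::kInitialized;
  return true;
}

inline void Finalize() { state() = State::kFinalized; }
}  // namespace fake_s3

// Called every time the component winds up. The function-local static is
// initialized exactly once (thread-safe since C++11), so the subsystem is
// initialized on first use and the result is reused afterwards.
bool EnsureS3Ready() {
  static const bool ok = fake_s3::Initialize();
  return ok;
}

// Called every time the component winds down. It releases per-session
// resources only and deliberately does NOT finalize the subsystem.
void ComponentWindDown() {
  assert(fake_s3::state() != fake_s3::State::kFinalized);
}

// Simulates n on-demand wind-up/wind-down cycles and returns how many
// wind-ups found the subsystem usable.
int SimulateCycles(int n) {
  int successes = 0;
  for (int i = 0; i < n; ++i) {
    if (EnsureS3Ready()) ++successes;
    ComponentWindDown();
  }
  return successes;
}
```

The trade-off is the one already noted in the reply: the subsystem stays initialized for the life of the process (or until the library is unloaded), so this does not satisfy "freeing all resources" on each wind-down, and the final cleanup still has to be owned by someone with a well-defined shutdown point.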