On Jan 9, 2019, at 15:45, Noel O'Boyle <baoille...@gmail.com> wrote: > Making such an API addition adds a maintainence would commit us to correctly > maintaining the underlying information, and that's a maintainence task I'm > not willing to take on.
Certainly, and I appreciate that maintenance is an important factor. Let me try this again. Hi! Open Babel supports the FPS output format. What can I do to help maintain that plugin? When Open Babel decided to support the FPS format, I realized they took on a maintenance task. For example, the FPS format at that time, at https://code.google.com/archive/p/chem-fingerprints/wikis/FPS.wiki , says "The version MUST change whenever the underlying fingerprint algorithm changes." I noticed that the implementations in version control have changed. I want to help update the version numbers so they will be ready for the next release. I have a lot of Python experience but haven't really used C++ in 20 years - and even then without shared library experience - so I'm afraid I don't really know what I'm doing with that language. But I'll try! My proposals to simply update the version number to "/3", and to change the plugin API to allow version numbers, don't seem like they were acceptable. What about changing the plugin so it is structured like this: const char *plugin_version = "1"; const char *plugin_id = _pFP->GetID(); if (!strcmp(plugin_id, "MACCS"))) { plugin_version = "3"; } ... << "#type=OpenBabel-" << plugin_id << "/" << plugin_version << '\n' so that it is possible to have different version numbers for each fingerprint type, embedded as logic inside of the plugin? If this third alternative seems like it might be acceptable, then I can work on it further and submit a pull request for a more detailed examination. > In addition, I don't personally think that this information adds anything > beyond reporting the version number. And I personally think it does. I know I've liked it when chemfp reported mismatched versions when I accidentally used an outdated file. But that's okay. Open Babel doesn't need to follow my feelings. I can totally see that if the overall fingerprint generation process in Open Babel effectively changes for every release, then the type version adds nothing. In that case, here's a fourth proposal. Change the FPS plugin so the output "#type" includes the Babel version, that is: << "#type=OpenBabel-" << _pFP->GetID() << "/" << BABEL_VERSION << '\n' The output would look like something like "OpenBabel-MACCS/2.4.1", which is a valid FPS fingerprint type string. This type string provides a hint to all downstream users that they need to re-build their fingerprints after every Open Babel upgrade. If this seems like it might be acceptable, then I can submit a pull request for it. For some background on why there is a "#type" line with a fingerprint type version, and a "#software" line with a version of the software, that's because other tools have different needs than Open Babel. I might have my own tool built on Open Babel which generates fingerprints. The releases are: 1.0 - initial release, v1 of the fingerprint 1.1 - performance improvements, fingerprints don't change 1.2 - fix a bug in my implementation, call the new version v2, but allow v1 to be selected as a command-line option 1.3 - port to MS Windows, otherwise unchanged 1.4 - change to require C++14 instead of C++03, otherwise unchanged 1.5 - switch to faster PRNG as the default, call it v3, and allow v1 and v2 to be selected as a command-line option. That is, version 1.5 of my program can generate versions 1, 2, and 3 of the fingerprint type I created, selectable at run-time. These fingerprint types might be available for reproducibility or contractual reasons, among others. In this case, the program version cannot be used to distinguish between the fingerprint types, which is why there is a distinct type. As another example, RDKit's and OpenEye's fingerprints have not changed for many years, though they both have had multiple releases during that time. Releases for those toolkits typically add features elsewhere. There is no reason for users of those toolkits to re-build their fingerprint data sets after each release. My hope is to simplify and standardize the mechanism used to inform users that there may be a problem. For example, a web server might use a component, installed with 'pip', which has a hard-coded set of fingerprints. Users might not even be aware that that FPS file exists. To help reduce silent errors, that component might generate a log message about a potential problem if it sees a version skew. That's why chemfp includes a function to compare two FPS headers and generate a list of errors (like comparing 1024 vs. 2048 bit fingerprints), warnings (e.g., different fingerprint types) and "info"-level messages (e.g., different program versions) for things that typically aren't important - and lets the calling software decide what to do with each severity level and message type. Best regards, Andrew da...@dalkescientific.com _______________________________________________ OpenBabel-discuss mailing list OpenBabel-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openbabel-discuss