Hi JP,
yes, as Greg said, I built that .fdef file years ago based on some
trial-and-error runs on common benchmark data sets. In retrospect, I
should add that the data sets themselves were quite biased (kinases are
your friend these days), so I can't really tell whether this is a
globally better file to use. A good .fdef file for the "uncharged" case
is really tricky when it comes to things like piperazines, so make sure
to test your .fdefs on these cases!
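
A minimal sketch of such a check, assuming a Python RDKit install. The one-feature fdef string is a hypothetical stand-in that reuses only the NDonor pattern quoted further down in this thread (it is not the contributed file), and `donor_atoms` is an illustrative helper name. It lists which atoms a feature factory flags as donors, so you can see what a set of definitions does with piperazines.

```python
from rdkit import Chem
from rdkit.Chem import ChemicalFeatures

# Hypothetical one-feature fdef, for illustration only. It contains just the
# "uncharged" tertiary-amine donor pattern quoted in this thread.
# The pattern matches a single atom, so exactly one weight is given.
FDEF = "\n".join([
    "DefineFeature TertiaryAmineDonor [$([Nv3](-C)(-C)-C)]",
    "  Family Donor",
    "  Weights 1.0",
    "EndFeature",
    "",
])

factory = ChemicalFeatures.BuildFeatureFactoryFromString(FDEF)


def donor_atoms(factory, smiles):
    """Return the sorted atom indices that the factory assigns to the Donor family."""
    mol = Chem.MolFromSmiles(smiles)
    feats = factory.GetFeaturesForMol(mol)
    return sorted(
        idx
        for feat in feats
        if feat.GetFamily() == "Donor"
        for idx in feat.GetAtomIds()
    )


# A few test molecules; extend this list with your own problem cases.
TEST_CASES = [
    ("trimethylamine", "CN(C)C"),
    ("piperazine", "C1CNCCN1"),
    ("1,4-dimethylpiperazine", "CN1CCN(C)CC1"),
]

for name, smi in TEST_CASES:
    print(name, smi, donor_atoms(factory, smi))
```

To check a real file, build the factory with `ChemicalFeatures.BuildFeatureFactory(path)` instead and compare the flagged atoms against what you expect for each test molecule.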

Best,
Markus



On 16/04/2013 06:08, Greg Landrum wrote:
> On Mon, Apr 15, 2013 at 7:53 PM, JP <[email protected]> wrote:
>> As a followup to this question - further questions/problems :)
>>
>> Why is AmideN (and SulfonamideN) defined in the BaseFeatures.fdef ?  (I
>> cannot understand how/where these two definitions are used).
>
> They aren't currently being used. I guess they were probably used
> earlier as part of the Donor definition (i.e. to define atoms that are
> not donors).
>
>> One of the H Bond Donor definitions is AtomType NDonor [$([Nv3](-C)(-C)-C)]
>> :- but if a Nv3 is connected to 3 C - then there are no hydrogens.  How is
>> this a donor?  According to Daylight, v3 means "atom with bond orders
>> totaling 3 (includes implicit H's)".
>
> This is an "uncharged" feature definition: it implicitly assumes
> that a v3 N which is connected to three Cs would be protonated.
>
> I should emphasize that the feature definitions in BaseFeatures.fdef
> are really not very good. They are primarily provided as an example of
> what an fdef file should look like.
>
>> I had a look around and thought the
>> Contrib/M_Kossner/BaseFeatures_DIP2_NoMicrospecies.fdef file looked more
>> complete in terms of definition.  Unfortunately this file does not load with
>> BuildFeatureFactory (ValueError).  Does anyone know the history of that
>> file, or why it came into being?
>>
>> FACTORY = ChemicalFeatures.BuildFeatureFactory("/opt/RDKit_2012_12_1/Contrib/M_Kossner/BaseFeatures_DIP2_NoMicrospecies.fdef")
>> ValueError:  pattern->getNumAtoms() != len(feature weight vector)
>
> That was indeed a bug in the file, which is now fixed. Thanks for reporting 
> it.
>
> The file itself is an attempt at building a useable (=reasonably
> complete) fdef file that Markus Kossner contributed a few years ago.
> It's probably a better starting point than BaseFeatures.fdef is.
>
>>
>> Many thanks and sorry for the repeated emails,
>
> No worries.
>
> -greg
>
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss