Ben Woodcroft <b.woodcr...@uq.edu.au> writes:

> On 03/05/16 00:49, Ricardo Wurmus wrote:
>> * gnu/packages/bioinformatics.scm (r-centipede): New variable.
>> ---
>>  gnu/packages/bioinformatics.scm | 21 +++++++++++++++++++++
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
>> index 7d025ef..d7957cf 100644
>> --- a/gnu/packages/bioinformatics.scm
>> +++ b/gnu/packages/bioinformatics.scm
>> @@ -441,6 +441,27 @@ pybedtools extends BEDTools by offering feature-level manipulations from with
>>  Python.")
>>      (license license:gpl2+)))
>>
>> +(define-public r-centipede
>> +  (package
>> +    (name "r-centipede")
>> +    (version "1.2")
>> +    (source (origin
>> +              (method url-fetch)
>> +              (uri (string-append "http://download.r-forge.r-project.org/"
>> +                                  "src/contrib/CENTIPEDE_" version ".tar.gz"))
>> +              (sha256
>> +               (base32
>> +                "1hsx6qgwr0i67fhy9257zj7s0ppncph2hjgbia5nn6nfmj0ax6l9"))))
>> +    (build-system r-build-system)
>> +    (home-page "http://centipede.uchicago.edu/")
>> +    (synopsis "Predict transcription factor binding sites")
>> +    (description
>> +     "Centipede fits a bayesian hierarchical mixture model to learn
>> +transcription-factor-specific distribution of experimental data on a
>> +particular cell-type for a set of candidate binding sites described by a
>> +genetic motif.")
> Perhaps this is just personal opinion but I prefer not to make the
> suggestion that experiments can only be done in the lab.
>
> Also I don't think that sentence makes sense grammatically -
> s/distribution/distributions/ but even then, it doesn't learn the
> experimental data.
> Maybe steal from the website, cut down a bit from this?
> >CENTIPEDE applies a hierarchical Bayesian mixture model to infer
> regions of the genome that are bound by particular transcription
> factors. It starts by identifying a set of candidate binding sites
> (e.g., sites that match a certain position weight matrix (PWM)), and
> then aims to classify the sites according to whether each site is bound
> or not bound by a TF. CENTIPEDE is an unsupervised learning algorithm
> that discriminates between two different types of motif instances using
> as much relevant information as possible.
>
> Thanks,
> ben

Not even a year later I pushed it to master as b91cfa22e with the
suggested change to the description. I had totally forgotten about this
patch!

-- 
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net