An example for GNU parallel:

$ parallel -L1000 -k obabel -:{} -osdf --gen2D < gdb11_size09.smi \
    > gdb11_size09.sdf

I took the gdb11 molecules of size 9, which is 444,313 molecules.
With -L1000, parallel reads 1000 lines at a time and passes them to
Open Babel for conversion to SDF.
The "-k" option keeps the output SDF records in the same order as the
molecules in the input smi file.
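
To see what batching by lines looks like without running Open Babel, here
is a small sketch. It uses echo as a stand-in for obabel and xargs as a
stand-in for parallel (both are my substitutions, so that it runs without
GNU parallel installed). xargs -L groups input lines per command in a
similar way, and since only one job runs at a time here the order is
trivially preserved.

```shell
# Feed 7 numbered lines to xargs, using at most 3 input lines
# per invocation of echo (echo stands in for the real converter).
batches=$(seq 1 7 | xargs -L 3 echo)
printf '%s\n' "$batches"
# prints:
# 1 2 3
# 4 5 6
# 7
```

Each output line corresponds to one command invocation, so with -L1000
each obabel process would receive up to 1000 molecules.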

Timing this, it takes 1m54s on my computer.

The serial version
$ obabel gdb11_size09.smi -O gdb11_size09.sdf --gen2D
takes 13m11s.
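
For reference, the ratio of the two timings can be worked out directly in
the shell. This is just a quick sketch of the arithmetic; the variable
names are mine.

```shell
# Timings from the two runs, converted to seconds.
serial=$(( 13 * 60 + 11 ))      # 13m11s -> 791 s
parallel_t=$(( 1 * 60 + 54 ))   # 1m54s  -> 114 s

# Shell arithmetic is integer-only, so scale by 100 to keep two
# decimal places (the result is truncated, not rounded).
speedup_x100=$(( serial * 100 / parallel_t ))
printf 'speedup: %d.%02dx\n' $(( speedup_x100 / 100 )) $(( speedup_x100 % 100 ))
# prints: speedup: 6.93x
```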

These timings are from a six-core processor with hyperthreading (12
virtual cores), so hyperthreading may account for the speedup being a
little over 6x (about 6.9x). All of this works without splitting the
input into multiple files or writing a script with multiple commands.

-David




On Tue, Feb 4, 2014 at 4:28 AM, Noel O'Boyle <[email protected]> wrote:

> It would be nice to see some explicit examples of how Open Babel might
> be used in this way, using one or all of these tools.
>
> - Noel
>
> On 4 February 2014 00:52, Francois Berenger <[email protected]> wrote:
> > On 02/04/2014 12:14 AM, Maciek Wójcikowski wrote:
> >> You can also use xargs.
> >
> > Yes, xargs with the -P option, but the command lines are not trivial
> > then.
> >
> >> ----
> >> Pozdrawiam,  |  Best regards,
> >> Maciek Wójcikowski
> >> [email protected] <mailto:[email protected]>
> >>
> >>
> >> 2014-02-03 16:10 GMT+01:00 Igor Filippov <[email protected]
> >> <mailto:[email protected]>>:
> >>
> >>     How is it different from GNU parallel?
> >>     http://www.gnu.org/software/bash/manual/html_node/GNU-Parallel.html
> >
> > It should be quite similar in functionality.
> >
> >>     Igor
> >>
> >>
> >>     On Mon, Feb 3, 2014 at 1:37 AM, Francois Berenger <[email protected]
> >>     <mailto:[email protected]>> wrote:
> >>
> >>         Hello,
> >>
> >>         I do this almost every day so I think I should share it with this
> >>         list.
> >>
> >>         In case you need to execute many Open Babel commands
> >>         and don't want to wait, you can execute them in parallel
> >>         on a multi-core computer.
> >>         Of course, the commands should be independent, for example
> >>         processing different datasets.
> >>
> >>         Let's say the commands are in a file called for_par.sh.
> >>         I developed a tool called PAR years ago that can do this:
> >>
> >>         par -i for_par.sh -v -o log
> >>
> >>         It will use all cores of the computer, display a completion
> >>         percentage and store all output messages in the file log.
> >>
> >>         If your user can connect to several computers e.g. via
> >>         SSH then you can even run commands in a distributed manner.
> >>         I use it daily on Linux but know some people used it on Mac OS X
> >>         as well.
> >>
> >>         The project is there:
> >>
> >>         https://savannah.nongnu.org/projects/par
> >>
> >>         The paper is freely available there:
> >>
> >>         http://bioinformatics.oxfordjournals.org/content/26/22/2918.long
> >>
> >>         --
> >>         Best regards,
> >>         Francois Berenger.
> >>
> >
> >
> > --
> > Best regards,
> > Francois Berenger.
>
_______________________________________________
OpenBabel-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
