Hi All,

I'm trying to use pml_v (pessimist) together with the FT components, but
during loading pml_v is closed, and vprotocol_pessimist is closed along
with it, as the following output shows:

(log from only one of the processes...)

$ mpirun -np 2 -hostfile ../hostfile -am ../ft-enable-cr -v -d ./ping 10 1

opal_cr: init: Verbose Level: 128
opal_cr: init: FT Enabled: 1
opal_cr: init: OPAL CR Allow OPAL Only: 0
opal_cr: init: Is a tool program: 0
opal_cr: init: Checkpoint Signal: 10
opal_cr: init: Temp Directory: /tmp
proc_info: hnp_uri 1251737600.0;tcp://172.20.5.128:46169;tcp://158.109.65.178:46169;tcp://10.8.0.1:46169
      daemon uri 1251737600.1;tcp://172.20.5.1:39991
App) Named Pipes (/tmp/opal_cr_prog_read.17352) (/tmp/opal_cr_prog_write.17352)
orte_cr: init: orte_cr_init()
mca: base: components_open: Looking for pml components
mca: base: components_open: opening pml components
mca: base: components_open: found loaded component cm
mca: base: components_open: component cm open function successful
mca: base: components_open: found loaded component crcpw
pml:crcpw: open()
pml:crcpw: open: priority   = -128
pml:crcpw: open: verbosity  = 128
mca: base: components_open: component crcpw open function successful
mca: base: components_open: found loaded component dr
mca: base: components_open: component dr open function successful
mca: base: components_open: found loaded component ob1
mca: base: components_open: component ob1 open function successful
mca: base: components_open: found loaded component v
pml_v: loaded
pml_v: vprotocol_pessimist: component_open: read priority 120
mca: base: components_open: component v open function successful
select: initializing pml component cm
select: init returned failure for component cm
select: initializing pml component crcpw
pml:crcpw: component_init: Priority -128
select: init returned priority -128
pml:select: Wrapper Component: Component crcpw was determined to be a Wrapper PML with priority -128
select: component dr not in the include list
select: initializing pml component ob1
select: init returned priority 20
select: component v not in the include list
selected ob1 best priority 20
select: component ob1 selected
mca: base: close: component cm closed
mca: base: close: unloading component cm
mca: base: close: component dr closed
mca: base: close: unloading component dr
pml_v: parasite_close: Ok, I accept to die and let ob1 component finish
pml_v: vprotocol_pessimist: component_close
pml_v: mca: base: close: component pessimist closed
pml_v: mca: base: close: unloading component pessimist
mca: base: close: component v closed
mca: base: close: unloading component v
pml:select: Wrapping: Component ob1 [20] is being wrapped by component crcpw [-128]
pml:crcpw: component_init: Wrap the selected component ob1
pml:crcpw: component_init: Initalize Wrapper
ompi_cr: init: ompi_cr_init()
ompi_cr: finalize: ompi_cr_finalize()
pml:crcpw: component_finalize: Finalize
mca: base: close: component ob1 closed
mca: base: close: unloading component ob1
orte_cr: finalize: orte_cr_finalize()
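
To make the selection outcome in the output above easier to see, here is a minimal sketch that summarizes the "select:" lines. It assumes the log wording is exactly as shown above; `summarize_selection` is a hypothetical helper name of mine, not part of Open MPI, and the embedded text is only an excerpt of the log.

```python
import re

# Excerpt of the "select:" lines copied from the output above.
LOG = """\
select: initializing pml component cm
select: init returned failure for component cm
select: initializing pml component crcpw
select: init returned priority -128
select: component dr not in the include list
select: initializing pml component ob1
select: init returned priority 20
select: component v not in the include list
selected ob1 best priority 20
select: component ob1 selected
"""


def summarize_selection(log_text):
    """Return (skipped, failed, priorities, selected) for the pml components."""
    skipped, failed, priorities = [], [], {}
    selected = None
    current = None  # component whose init result is expected next
    for line in log_text.splitlines():
        line = line.strip()
        m = re.match(r"select: component (\S+) not in the include list", line)
        if m:
            skipped.append(m.group(1))
            continue
        m = re.match(r"select: initializing pml component (\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"select: init returned priority (-?\d+)", line)
        if m and current is not None:
            priorities[current] = int(m.group(1))
            continue
        m = re.match(r"select: init returned failure for component (\S+)", line)
        if m:
            failed.append(m.group(1))
            continue
        m = re.match(r"select: component (\S+) selected", line)
        if m:
            selected = m.group(1)
    return skipped, failed, priorities, selected


if __name__ == "__main__":
    skipped, failed, priorities, selected = summarize_selection(LOG)
    print("skipped (not in the include list):", skipped)
    print("init failed:", failed)
    print("priorities returned:", priorities)
    print("selected:", selected)
```

For this excerpt the summary lists dr and v as skipped because they are not in the include list, which is the step right before pml_v accepts to close.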

The MCA parameters (apart from the verbosity ones) are:

vprotocol_pessimist_priority=120 (very, very big...?)
snapc_base_global_snapshot_dir=/tmp/checkpoints
snapc_base_store_in_place=0
opal_cr_allow_opal_only=0
mca_base_component_distill_checkpoint_ready=0
ft_cr_enabled=1
crs=
rml_wrapper=ftrm
snapc=single (similar to full, but checkpoints only one process)
filem=rsh
pml_wrapper=crcpw
crcp=uncoord (similar to coord, but it needs to checkpoint only one process)
btl=tcp,self
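
In case it helps to compare this list against what a run actually reports, the listing can be loaded into a dictionary. This is only a sketch, assuming one name=value pair per line with an optional trailing remark in parentheses, as written above; `parse_params` is a hypothetical helper, not an Open MPI tool.

```python
import re

# The parameter listing from this message, remarks included.
PARAMS = """\
vprotocol_pessimist_priority=120 (very, very big...?)
snapc_base_global_snapshot_dir=/tmp/checkpoints
snapc_base_store_in_place=0
opal_cr_allow_opal_only=0
mca_base_component_distill_checkpoint_ready=0
ft_cr_enabled=1
crs=
rml_wrapper=ftrm
snapc=single (similar to full, but checkpoints only one process)
filem=rsh
pml_wrapper=crcpw
crcp=uncoord (similar to coord, but it needs to checkpoint only one process)
btl=tcp,self
"""


def parse_params(text):
    """Map each parameter name to its value, dropping parenthesised remarks."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        name, _, rest = line.partition("=")
        # Remove a trailing "(...)" remark, which is not part of the value.
        value = re.sub(r"\s*\(.*\)\s*$", "", rest).strip()
        params[name.strip()] = value
    return params


if __name__ == "__main__":
    for name, value in parse_params(PARAMS).items():
        print(f"{name} = {value!r}")
```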

Thanks,
Leonardo Fialho

--
Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edificio Q, QC/3088
http://www.caos.uab.es
Phone: +34-93-581-2888
Fax: +34-93-581-2478
