On Fri, Aug 17, 2018 at 10:04 PM, Dr. David Alan Gilbert <dgilb...@redhat.com> wrote: > * Lidong Chen (jemmy858...@gmail.com) wrote: >> It's not necessary to send RDMA_CONTROL_ERROR when clean up rdma resource. >> If rdma->error_state is ture, the message may not send successfully. >> and the cm event can also notify the peer qemu. >> >> Signed-off-by: Lidong Chen <lidongc...@tencent.com> > > How does this keep 'cancel' working; I added 32bce196344 last year to > make that code also send the RDMA_CONTROL_ERROR in 'cancelling'.
I guess send the RDMA_CONTROL_ERROR is to notify peer qemu close rdma connection, and to make sure the receive rdma_disconnect event. But the two sides qemu should cleanup rdma independently. maybe the destination qemu is hang. 1.the current qemu version already not wait for RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect, 2.for peer qemu, it's already poll rdma->channel->fd, compare to send RDMA_CONTROL_ERROR, maybe use cm event to notify peer qemu to quit is better. maybe the rdma is already in error_state, and RDMA_CONTROL_ERROR cannot send successfully. when cancel migraiton, the destination will receive RDMA_CM_EVENT_DISCONNECTED. 3. I prefer use RDMA_CONTROL_ERROR to notify some code logic error, not rdma connection error. > > Dave > >> --- >> migration/rdma.c | 11 ----------- >> 1 file changed, 11 deletions(-) >> >> diff --git a/migration/rdma.c b/migration/rdma.c >> index ae07515..e1498f2 100644 >> --- a/migration/rdma.c >> +++ b/migration/rdma.c >> @@ -2305,17 +2305,6 @@ static void qemu_rdma_cleanup(RDMAContext *rdma) >> int idx; >> >> if (rdma->cm_id && rdma->connected) { >> - if ((rdma->error_state || >> - migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) && >> - !rdma->received_error) { >> - RDMAControlHeader head = { .len = 0, >> - .type = RDMA_CONTROL_ERROR, >> - .repeat = 1, >> - }; >> - error_report("Early error. Sending error."); >> - qemu_rdma_post_send_control(rdma, NULL, &head); >> - } >> - >> rdma_disconnect(rdma->cm_id); >> trace_qemu_rdma_cleanup_disconnect(); >> rdma->connected = false; >> -- >> 1.8.3.1 >> > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK