Hi Santosh,

On 01/07/2019 13.41, santosh.shilim...@oracle.com wrote:
>> @@ -144,7 +146,29 @@ static int rds_ib_post_reg_frmr(struct rds_ib_mr *ibmr)
>>           if (printk_ratelimit())
>>               pr_warn("RDS/IB: %s returned error(%d)\n",
>>                   __func__, ret);
>> +        goto out;
>> +    }
>> +
>> +    if (!frmr->fr_reg)
>> +        goto out;
>> +
>> +    /* Wait for the registration to complete in order to prevent an invalid
>> +     * access error resulting from a race between the memory region already
>> +     * being accessed while registration is still pending.
>> +     */
>> +    wait_event_timeout(frmr->fr_reg_done, !frmr->fr_reg,
>> +               msecs_to_jiffies(100));
>> +
> Is there any logic behind the arbitrary timeout in this patch, as well
> as the one in patch 1/7 which Dave pointed out?
> 

It's empirical (see my response to David's question):
Memory registrations took longer than invalidations, hence 100 msec instead
of 10 msec.

> An MR registration command issued to hardware can at times take as
> long as the command timeout (e.g. 60 seconds in CX3), and up to that
> point it is still a legitimate operation and not necessarily a failure.
> We shouldn't add arbitrary timeouts in ULPs.

Where did you find the 60 seconds for CX3 you are referring to?
Is there a "generic" upper-bound that is not tied to a specific vendor / HCA?
Can you provide a pointer?

Thanks,

  Gerd
