Hello Lixiaokeng,

Thanks for the report and the analysis.

On Mon, 2020-07-13 at 10:22 +0800, lixiaokeng wrote:
> When two iscsi ips log in and out alternately and  the following
> scripts run
> at the same time,
> 
> #!/bin/bash
> interval=5
> while true
> do
>         iscsiadm -m node -p 9.41.147.171 &> /dev/null
>         iscsiadm -m node -p 9.41.148.172 &> /dev/null
>         iscsiadm -m session &> /dev/null
>         rescan-scsi-bus.sh &> /dev/null
>         multipath -v2 &> /dev/null
>         multipath -ll &> /dev/null
>         sleep $interval
> done
> 
> multipathd will have a segfault after about 30 mins.
> 
> The reason is that mpp->hwe is accessed after hwe is already freed.
> In
> extract_hwe_from_path func, mpp->hwe is set to pp->hwe, so they
> points to the
> same hwe. For some reasons, pp->mpp will be set to NULL in
> orphan_path func.
> Then, pp and hwe will be freed with (pp->mpp == NULL) in free_path
> func
> called by ev_remove_path func. However, mpp->hwe is not set to NULL
> while hwe
> is already freed. So, when iscsi device logs in and new path is added
> to mpp,
> mpp->hwe will be accessed in select_pgfailback func. Finally, use-
> after-free
> problem occurs.
> 
> The procedure details given as follows,
> 1.wait_dmevents thread
> wait_dmevents
>       ->dmevent_loop
>               ->update_multipath
>                       ->__setup_multipath
>                               ->update_multipath_strings
>                                        -> sync_paths
>                                               ->orphan_path
> 2.uevqloop thread  (iscsi log out, remove path)
> uevqloop
> ->uevent_dispatch
>       ->service_uevq
>               ->uev_remove_path
>                       ->ev_remove_path  //pp->mpp is NULL
>                               ->free_path(pp)
>                       //pp->hew are freed but mpp->hwe is not set to
> NULL
> 3.ev_remove_path  (iscsi log in, add path)
> uevqloop
> ->uevent_dispatch
>       ->service_uevq
>               ->ev_add_path
>                       ->select_pgfailback //mpp->hwe is accessed
> 
> Here, we will set mpp->hwe to NULL before setting pp->map to NULL in
> orphan_path
> func.
> 
> Signed-off-by: Tianxiong Lu <[email protected]>
> Signed-off-by: lixiaokeng <[email protected]>
> Signed-off-by: Zhiqiang Liu <[email protected]>
> ---
>  libmultipath/structs_vec.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> index 3dbbaa0..9bbe5d1 100644
> --- a/libmultipath/structs_vec.c
> +++ b/libmultipath/structs_vec.c
> @@ -93,6 +93,8 @@ int adopt_paths(vector pathvec, struct multipath
> *mpp)
>  void orphan_path(struct path *pp, const char *reason)
>  {
>       condlog(3, "%s: orphan path, %s", pp->dev, reason);
> +     if (pp->mpp && pp->mpp->hwe == pp->hwe)
> +             pp->mpp->hwe = NULL;
>       pp->mpp = NULL;
>       pp->dmstate = PSTATE_UNDEF;
>       pp->uid_attribute = NULL;

This isn't wrong, but I think sync_paths() has to be fixed as well.
I'll send a small series soon, based on your patch.

Martin


--
dm-devel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/dm-devel

Reply via email to