On 9.4.2022. 1:51, Alexandr Nedvedicky wrote:
> Hello,
>
> thank you for taking a look at it.
>
> </snip>
>> I think there is a use after free in you diff. After you return
>> from pfsync_delete_tdb() you must not access the TDB again.
>>
>> Comments inline.
>>
> </snip>
>>> TAILQ_INIT(&sn->sn_upd_req_list);
>>> - TAILQ_CONCAT(&sn->sn_upd_req_list, &sc->sc_upd_req_list, ur_entry);
>>> + while (!TAILQ_EMPTY(&sc->sc_upd_req_list)) {
>>> + ur = TAILQ_FIRST(&sc->sc_upd_req_list);
>> Other loops have this idiom
>> while ((ur = TAILQ_FIRST(&sc->sc_upd_req_list) != NULL) {
>>
> new diff version uses 'TAILQ_FIRST()'
>
> </snip>
>>> @@ -1827,18 +1853,20 @@ pfsync_sendout(void)
>>> subh = (struct pfsync_subheader *)(m->m_data + offset);
>>> offset += sizeof(*subh);
>>>
>>> - mtx_enter(&sc->sc_tdb_mtx);
>>> count = 0;
>>> while ((t = TAILQ_FIRST(&sn.sn_tdb_q)) != NULL) {
>>> - TAILQ_REMOVE(&sn.sn_tdb_q, t, tdb_sync_entry);
>>> + TAILQ_REMOVE(&sn.sn_tdb_q, t, tdb_sync_snap);
>> I think the TDB in tdb_sync_snap list may be freed. Maybe
>> you should grab a refcount if you store them into a list.
>>
> I see. pfsync must grab the reference count in pfsync_update_tdb(),
> where tdb entry is inserted into queue. new diff fixes that.
>
>>> pfsync_out_tdb(t, m->m_data + offset);
>>> offset += sizeof(struct pfsync_tdb);
>>> +#ifdef PFSYNC_DEBUG
>>> + KASSERT(t->tdb_snapped == 1);
>>> +#endif
>>> + t->tdb_snapped = 0;
>> This may also be use after free.
> new diffs drops the reference as soon as pfsync(4) dispatches
> the tdb into pfsync packet.
>
> </snip>
>>> @@ -2525,7 +2565,17 @@ pfsync_delete_tdb(struct tdb *t)
>>>
>>> mtx_enter(&sc->sc_tdb_mtx);
>>>
>>> + /*
>>> + * if tdb entry is just being processed (found in snapshot),
>>> + * then it can not be deleted. we just came too late
>>> + */
>>> + if (t->tdb_snapped) {
>>> + mtx_leave(&sc->sc_tdb_mtx);
>> You must not access the TDB after this point. I thnik you cannot
>> guarantee that. The memory will freed after return.
> new diff fixes that
>
> </snip>
>>> diff --git a/sys/netinet/ip_ipsp.h b/sys/netinet/ip_ipsp.h
>>> index c697994047b..ebdb6ada1ae 100644
>>> --- a/sys/netinet/ip_ipsp.h
>>> +++ b/sys/netinet/ip_ipsp.h
>>> @@ -405,6 +405,7 @@ struct tdb { /* tunnel
>>> descriptor block */
>>> u_int8_t tdb_wnd; /* Replay window */
>>> u_int8_t tdb_satype; /* SA type (RFC2367, PF_KEY) */
>>> u_int8_t tdb_updates; /* pfsync update counter */
>>> + u_int8_t tdb_snapped; /* dispatched by pfsync(4) */
>> u_int8_t is not atomic. I you want to change tdb_snapped, you need
>> a mutex that also protects the othere fields in the same 32 bit
>> word everywhere. I think a new flag TDBF_PFSYNC_SNAPPED in tdb_flags
>> would be easier. The tdb_flags are protected by tdb_mtx.
>>
> I like your idea.
>
> updated diff is below.
>
>
> thanks and
> regards
> sashan
Hi,
I'm running this diff with bluhm@ pfsync mpfloor diff for 4 days on
production firewalls where panics were found and for now everything
seems normal ..