On Wed, Jan 18, 2023 at 6:35 PM Michael Peddemors via mailop <mailop@mailop.org> wrote:

> Thanks Brandon,
>
> for the quick response, and of course can confirm in those cases there
> are no To or Cc recipients in that email, however we have a hard time
> telling if this is a broken script kiddie job, or legitimate.
>
> Tackling differentiating the spammers using that method (only Bcc, no To
> or Cc) vs any potential legitimate cases. Really only need to do that
> with the Gmail stuff right now, as it is the most prevalent, and a good
> bellwether for real world cases.
>
> The spammers do make it 'partly' easy, eg when the Reply-To is different
> than the From, and one or more are throwaway style.
>
> Even though Bcc was re-addressed in later RFCs, there is no express
> consideration for this.
>
> We 'might' think it is time to address this in another way, eg rewrite
> it to the To, and address it in another way, but not here to move
> mountains.
>

Rewriting inbound mail is not generally a good idea since it will break
DKIM, but if the mail doesn't get relayed, sure.

> It sure would be nice if other tokens could make it more apparent, that
> this was a legitimate case, and not a leak of unintended PPI.
>

Do you mean PII? As I've mentioned before, there are limited cases where
it's a surprise to senders and may reveal something (imagine sending to an
internal alias not-those-antispam-bozos-again and that would be exposed),
but generally speaking one shouldn't expect forwarding, aliases, mailing
lists, etc. in the path to a recipient to be invisible to that recipient.

The general answer is that the Bcc doesn't actually need to match the
RCPT TO and that its value can't be trusted, but if it is not being
deliberately forged, it should let the receiver know why they received a
message, which can be helpful to them.

Anyways, we only strip Bcc on MSA messages, otherwise we just pass it
through.

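As an illustration of the header signals discussed above (no To or Cc, a Reply-To that differs from the From, a Bcc header passed through), here is a minimal Python sketch using only the standard library. The function name `bcc_only_signals` and the returned keys are hypothetical, not anything Gmail or any filter actually uses, and since the Bcc value can't be trusted, these are weak hints for scoring at best, not a verdict.

```python
from email import message_from_string
from email.utils import parseaddr


def bcc_only_signals(raw_message: str) -> dict:
    """Return a few header-level hints for a raw RFC 5322 message.

    Hypothetical helper: the keys below are illustrative signal names.
    """
    msg = message_from_string(raw_message)

    # A header that is missing or empty counts as "no visible recipient".
    has_to = bool((msg.get("To") or "").strip())
    has_cc = bool((msg.get("Cc") or "").strip())

    # Compare only the address part, ignoring display names and case.
    from_addr = parseaddr(msg.get("From", ""))[1].lower()
    reply_to = parseaddr(msg.get("Reply-To", ""))[1].lower()

    return {
        "no_visible_recipients": not (has_to or has_cc),
        "reply_to_differs": bool(reply_to) and reply_to != from_addr,
        # Only present when the sender's path did not strip it.
        "has_bcc_header": msg.get("Bcc") is not None,
    }
```

The envelope recipient (RCPT TO) is not visible in the message itself, so comparing it against the Bcc header would have to happen in the receiving MTA, which this sketch does not attempt.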
> I mean, all you really need is the RCPT TO anyways ;)
>
> But MAYBE there is something that Gmail can do to more transparently
> indicate how these messages are traversing their systems.
>
> Always seem to be..
>
> X-Received: by 2002:a05:6402:1013:b0:49c:78d7:ea61 with SMTP id
>         c19-20020a056402101300b0049c78d7ea61mr622007edu.269.1674026000819;
>         Tue, 17 Jan 2023 23:13:20 -0800 (PST)
>

That's just an internal hop for the message queue, basically every
message that goes through us should have one of those.

> Sometimes via HTTP..
>
> Wish we could have some insight into what is in..
>
> X-Gm-Message-State:
>         AFqh2kq79brjIh8zUpGfXkU6IMrI1HRjwS3V4t4etnkzPfzmGp43h+xe
>         Hr7BK3rxAF0s5BU1car2XZE50tm4mOBBfCd5ycI=
> X-Google-Smtp-Source:
>         AMrXdXsRdxGXsEvCByw8J3u8NhKOtp9dPLs5N1kXEfsdc7oHSA/lbel1UttXcDmbqsjEU1jodxzQ5CV81cH7kFDLt4g=
>

The first is just some random state that needs to survive an SMTP hop,
information for loop detection and bounce handling. The second is just
where Google first saw the message and when, so we can calculate
end-to-end latencies. Neither is particularly useful for anyone else.

> There are just not enough tokens available, to clearly identify the
> methodology, so at least we can reduce chances of a false positive.
>
> And playing whack-a-mole with content isn't going to cut it..
> (Oh, yes, we do have a filtering team that has to work on that)
>
> Over 90% of the Gmail spam comes in with no To or Cc headers.
>
> Heck, with a little more transparency, we could even report to Gmail
> spam accounts, not that we are sure they care, want, or need to remove
> them, maybe they get some advantage to leaving them in place.
>

I no longer have access to look at the state of these, but typically
they've been disabled by the time we receive reports... though sometimes
they still manage to stay under the radar, more typical for the drop
boxes (Reply-To) which are harder to catch and manage.

Oh, and DKIM replay attacks, where a single bad email is sent from other
hosts and not back to us, so we have no real view of the spamming run.

> Received: from 52669349336 named unknown by gmailapi.google.com with
>         HTTPREST; Wed, 18 Jan 2023 04:11:48 -0800
> Received: from 52669349336 named unknown by gmailapi.google.com with
>         HTTPREST; Wed, 18 Jan 2023 04:11:45 -0800
>

That should be the Gmail API. Not sure what would cause two, but also,
being Received headers, they could be spoofed. The number, in this case
52669349336, is a unique id associated with the registration for access
to the API, so it could theoretically be used for, say, reputation
analysis.

Ugh, the code is a bit obtuse, but yeah, on send it looks like it gets
added twice.

Brandon

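For anyone wanting to experiment with the reputation idea above, here is a rough Python sketch that pulls the numeric id out of Received headers shaped like the two quoted in this message. The regular expression is an assumption based only on those two samples, the function name `gmail_api_ids` is hypothetical, and because Received headers can be spoofed the result should be treated as a hint, not proof of origin.

```python
import re
from email import message_from_string

# Pattern inferred from the sample headers in this thread only; the real
# format is not documented here and may vary.
_GMAIL_API_RECEIVED = re.compile(
    r"from\s+(\d+)\s+named\s+\S+\s+by\s+gmailapi\.google\.com\s+with\s+HTTPREST",
    re.IGNORECASE,
)


def gmail_api_ids(raw_message: str) -> list[str]:
    """Return the distinct numeric ids seen in Gmail API Received headers.

    Hypothetical helper. Ids are returned in order of first appearance,
    and duplicates are collapsed because the header can appear twice.
    """
    msg = message_from_string(raw_message)
    seen: list[str] = []
    for value in msg.get_all("Received", []):
        # \s+ in the pattern also matches the whitespace of folded lines.
        match = _GMAIL_API_RECEIVED.search(value)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

A receiver could key a local counter on the returned id alongside other signals, keeping in mind that only a Received header added by a trusted hop says anything reliable.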
_______________________________________________
mailop mailing list
mailop@mailop.org
https://list.mailop.org/listinfo/mailop