> I think you should fix this explanation:

Thanks! I would like to copy the context you provided into the PIP motivation. Your description is more detailed, so developers won't have to go through the code.
> Today the quota is checked periodically, right? So that's how the operator knows the cost in terms of I/O is limited.
> Now you are adding one additional I/O per collection, every 1 min by default. That's a lot perhaps. How long is the check interval today?

Actually, I don't want to introduce additional costs. My idea was to cache the result of the periodic check, so that collecting the metric adds no extra I/O. I may not have made that clear in the PIP, which caused this misunderstanding; sorry.

> The user today can calculate quota used for size based limit, since there are two metrics that are exposed today on a topic level: "pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size". You can just divide the two to get a percentage.
> For the time-based limit, the only metric exposed today is quota itself, "pulsar_storage_backlog_quota_limit_time".

I had only noticed `pulsar_storage_backlog_size` and missed `pulsar_storage_backlog_quota_limit` and `pulsar_storage_backlog_quota_limit_time`. Many thanks for the reminder.

So we already have the following topic-level metrics:

`pulsar_storage_backlog_size`: The total backlog size of this topic owned by this broker (in bytes).
`pulsar_storage_backlog_quota_limit`: The total amount of the data in this topic that limits the backlog quota (in bytes).
`pulsar_storage_backlog_quota_limit_time`: The backlog quota limit in time (in seconds). (This metric does not exist in the docs; they need to be updated.)

We just need to add a new topic-level metric named `pulsar_storage_earliest_msg_publish_time_in_backlog` that indicates the publish time of the earliest message in the backlog.
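For illustration only, here is a minimal Python sketch of how the two "quota used" ratios could be derived from these four topic-level values: backlog size divided by the size limit, and the age of the earliest backlog message divided by the time limit. It assumes the proposed `pulsar_storage_earliest_msg_publish_time_in_backlog` is reported as epoch seconds (the unit is not settled in this thread) and that a non-positive limit means no quota is configured. The function names are hypothetical and not part of Pulsar.

```python
import time
from typing import Optional


def size_quota_used(backlog_size_bytes: float,
                    quota_limit_bytes: float) -> Optional[float]:
    """Fraction of the size-based backlog quota in use (1.0 == at the limit)."""
    # Assumption: a non-positive limit means the size quota is not configured.
    if quota_limit_bytes <= 0:
        return None
    return backlog_size_bytes / quota_limit_bytes


def time_quota_used(earliest_publish_time_s: float,
                    quota_limit_time_s: float,
                    now_s: Optional[float] = None) -> Optional[float]:
    """Fraction of the time-based backlog quota in use (1.0 == at the limit)."""
    # Assumption: a non-positive limit means the time quota is not configured.
    if quota_limit_time_s <= 0:
        return None
    if now_s is None:
        now_s = time.time()
    # Age of the earliest message still in the backlog; clamp clock skew to 0.
    age_s = max(0.0, now_s - earliest_publish_time_s)
    return age_s / quota_limit_time_s
```

An alert rule could then fire when either ratio crosses a chosen threshold, for example 0.8, and report which of the two triggered it.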
Users could then get `pulsar_backlog_size_quota_used_percentage` by dividing `pulsar_storage_backlog_size` by `pulsar_storage_backlog_quota_limit` (`pulsar_storage_backlog_size / pulsar_storage_backlog_quota_limit`), and `pulsar_backlog_time_quota_used_percentage` by dividing `now - pulsar_storage_earliest_msg_publish_time_in_backlog` by `pulsar_storage_backlog_quota_limit_time` (`(now - pulsar_storage_earliest_msg_publish_time_in_backlog) / pulsar_storage_backlog_quota_limit_time`).

The backlog quota time checker runs periodically, so we can cache its result and avoid adding much cost.

Pulsar also exposes the subscription-level `backlogSize` and `earliestMsgPublishTimeInBacklog` in Pulsar-Admin <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139> if `subscriptionBacklogSize` and `getEarliestTimeInBacklog` are true. We can also expose the topic's `backlogQuotaLimiteSize` and `backlogQuotaLimitTime` to PulsarAdmin.

After users receive a backlog alert from their metrics alerting system, they have the topic name and can then call Topics#getStats <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/client/admin/Topics.java#L1139> to find out which subscriptions have a huge backlog.

Thanks,
Tao Jiuming

Asaf Mesika <asaf.mes...@gmail.com> wrote on Wed, Mar 1, 2023 at 23:42:

> > Pulsar has 2 configurations for the backlog eviction <https://pulsar.apache.org/docs/2.11.x/cookbooks-retention-expiry/#backlog-quotas>: backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond.
> > By default, backlog eviction is disabled, and also, there is a field named backlogQuotaMap in TopicPolicies <https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/policies/data/HierarchyTopicPolicies.java#L45> /NamespaceSpacePolicies <https://github.com/apache/pulsar/blob/master/pulsar-client-admin-api/src/main/java/org/apache/pulsar/common/policies/data/Policies.java#L41> assists in controlling Topic/Namespace level backlog quota.
> >
> > If topic backlog reaches the threshold of any item, backlog eviction will be triggered, Pulsar will move subscription's cursor to skip unacknowledged messages.
> >
> > Before backlog eviction happens, we don't have a metric to monitor how long that it can reaches the threshold.
>
> I think you should fix this explanation:
>
> In Pulsar, a subscription maintains a state of message acknowledged. A subscription backlog is the set of messages which are unacknowledged.
> A subscription backlog size is the sum of size of unacknowledged messages (in bytes).
> A topic can have many subscriptions.
> A topic backlog is defined as the backlog size of the subscription which has the oldest unacknowledged message. Since acknowledged messages can be interleaved with unacknowledged messages, calculating the exact size of that subscription can be expensive as it requires I/O operations to read from the messages from the ledgers.
> For that reason, the topic backlog is actually defined to be the estimated backlog size of that subscription. It does so by summarizing the size of all the ledgers, starting from the current active one, up to the ledger which contains the oldest unacknowledged message (There is actually a faster way to calculate it, but this is the definition of the estimation).
>
> A topic backlog age is the age of the oldest unacknowledged message (in any subscription).
> If that message was written 30 minutes ago, its age is 30 minutes.
>
> Pulsar has a feature called backlog quota (place link). It allows the user to define a quota - in effect, a limit - which limits the topic backlog.
> There are two types of quotas:
> * Size based: The limit is for the topic backlog size (as we defined above).
> * Time based: The limit is for the topic's backlog age (as we defined above).
>
> Once a topic backlog exceeds either one of those limits, an action is taken upon messages written to the topic:
> * The producer write is placed on hold for a certain amount of time before failing.
> * The producer write is failed
> * The subscriptions oldest unacknowledged messages will be acknowledged in order until both the topic backlog size or age will fall inside the limit (quota). The process is called backlog eviction (happens every interval)
>
> The quotas can be defined as a default value for any topic, by using the following broker configuration keys: backlogQuotaDefaultLimitBytes, backlogQuotaDefaultLimitSecond. It can also be specified directly for all topics in a given namespace using the namespace policy, or a specific topic using a topic policy.
>
> The user today can calculate quota used for size based limit, since there are two metrics that are exposed today on a topic level: "pulsar_storage_backlog_quota_limit" and "pulsar_storage_backlog_size". You can just divide the two to get a percentage.
> For the time-based limit, the only metric exposed today is quota itself, "pulsar_storage_backlog_quota_limit_time".
>
> ------------
>
> I would create two metrics:
>
> `pulsar_backlog_size_quota_used_percentage`
> `pulsar_backlog_time_quota_used_percentage`
>
> You would like to know what triggered the alert, hence two.
> It's not the quota percentage, it's the quota used percentage.
>
> ----------
>
> > It checks if the backlog size exceeds the threshold(backlogQuotaDefaultLimitBytes), and it gets the current backlog size by calculating LedgerInfo <https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54>, it will not lead to I/O.
>
> This is not correct.
> It checks against the topic / namespace policy, and if it doesn't exist, it falls back on the default configuration key mentioned above.
>
> > It checks if the backlog time exceeds the threshold(backlogQuotaDefaultLimitSecond). If preciseTimeBasedBacklogQuotaCheck is set to be true, it will read an entry from Bookkeeper, but the default value is false, which means it gets the backlog time by calculating LedgerInfo <https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto#L54>.
> > So in general, we don't need to worry about it will lead to I/O.
>
> I'm afraid of that.
> Today the quota is checked periodically, right? So that's how the operator knows the cost in terms of I/O is limited.
> Now you are adding one additional I/O per collection, every 1 min by default. That's a lot perhaps. How long is the check interval today?
>
> Perhaps in the backlog quota check, you can persist the check result, and use it? Persist the age that is.
>
> ------
>
> Regarding "slowest_subscription"
> I think the cost is too high, because the subscriptions will keep alternating, which can generate so many unique time series. Since Prometheus flush only every 2 hours, or any there TSDB, it will cost you too much.
>
> I suggest exposing the name via the topic stats. This way they can issue a REST call to grab that subscription name only when the alert fires.
>
> Thanks,
>
> Asaf
>
> On Tue, Feb 28, 2023 at 9:29 AM 太上玄元道君 <dao...@apache.org> wrote:
>
> > Hi Asaf,
> > I've updated the PIP, PTAL
> >
> > Thank,
> > Tao Jiuming
> >
> > Asaf Mesika <asaf.mes...@gmail.com> wrote on Sun, Feb 26, 2023 at 23:03:
> >
> > > Hi,
> > >
> > > > Pulsar has 2 configurations for the backlog eviction: backlogQuotaDefaultLimitBytes and backlogQuotaDefaultLimitSecond, if topic backlog reaches the threshold of any item, backlog eviction will be triggered.
> > >
> > > This seems like default values, not the actual values. Can you please provide an explanation in the PIP and link to read more:
> > > 1. Where do you define the backlog quota exactly? What is the granularity (subscription?)
> > > 2. Is the backlog quota on by default? If so, what are the default values?
> > >
> > > *Notes*
> > > 1. When the backlog quota limit is defined in Bytes, and you wish to know how close a subscription is to its bytes limit, you need to calculate the backlog size in bytes. From my understanding, there is an accurate calculation (which is costly in terms of I/O) and there is an estimate of it. I presume you would want to use the estimated one, is that correct?
> > > The backlog quota itself, uses the accurate or the estimated when it starts evicting entries (i.e. marking them as acknowledged)?
> > >
> > > 2. For the backlog limit specifying in time units, there is no estimate, as it must be calculated all the time (earliest unacknowledged message distance from now). How do you plan to calculate the current age of the earliest message without bearing that I/O cost on each metric calculation?
> > >
> > > 3. In the Goal section, you specify that your goal is to add a "proximity" metric.
> > > a) You must define that - what is proximity metric exactly? What are its units? How are you planning to calculate it?
> > > b) Proximity is not a good term IMO. I personally have never seen this term used in software systems, unless it's in the aviation/space industry. Once you explain (a) I hope I can help provide alternative names.
> > >
> > > 4. Maybe we should provide the used quota percentage for both limits, instead of one per both, since it's easier to act upon the alert when you need which one triggered it.
> > >
> > > 5. I didn't understand the "slowest_subscription" label used when describing the metric label. Can you please provide an explanation?
> > >
> > > 6. I suggest writing a "High Level Design" section, and add everything you need to know for this proposal, so I don't need to read the implementation details below (code).
> > >
> > > Thanks,
> > >
> > > Asaf
> > >
> > > On Wed, Feb 22, 2023 at 4:52 PM 太上玄元道君 <dao...@apache.org> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I've started a PIP to discuss: PIP-248 Add backlog eviction metric
> > > >
> > > > ### Motivation:
> > > >
> > > > Pulsar has 2 configurations for the backlog eviction: `backlogQuotaDefaultLimitBytes` and `backlogQuotaDefaultLimitSecond`, if topic backlog reaches the threshold of any item, backlog eviction will be triggered.
> > > >
> > > > Before backlog eviction happens, we don't have a metric to monitor how long that it can reaches the threshold.
> > > >
> > > > We can provide a progress bar metric to tell users some topics is about to trigger backlog eviction. And users can subscribe the alert to schedule consumers.
> > > >
> > > > For more details, please read the PIP at
> > > > https://github.com/apache/pulsar/issues/19601
> > > >
> > > > Thanks,
> > > > Tao Jiuming