[ https://issues.apache.org/jira/browse/COMDEV-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijing Lu updated COMDEV-510:
------------------------------
    Description: 
*Apache Doris*
Apache Doris is a real-time analytical database based on MPP architecture. As a unified platform that supports multiple data processing scenarios, it ensures high performance for low-latency and high-throughput queries, allows for easy federated queries on data lakes, and supports various data ingestion methods.
{*}Page{*}: [https://doris.apache.org|https://doris.apache.org/]
{*}Github{*}: [https://github.com/apache/doris]

h3. *Background*
Apache Doris accelerates high-concurrency queries utilizing page cache, where the decompressed data is stored.
Currently, the page cache in Apache Doris uses a simple LRU algorithm, which reveals a few problems:
 * Hot data will be phased out in large queries
 * The page cache configuration is immutable and does not support GC.

h3. Task
 * {*}Phase One{*}: Identify the impacts on queries when the decompressed data is stored in memory and SSD, respectively, and then determine whether full page cache is required.
 * {*}Phase Two{*}: Improve the cache strategy for Apache Doris based on the results from Phase One.

h3. Learning Material
{*}Page{*}: [https://doris.apache.org|https://doris.apache.org/]
{*}Github{*}: [https://github.com/apache/doris]

h3. Mentor
 * Mentor: Yongqiang Yang, Apache Doris PMC member & Committer, [yangyongqi...@apache.org |mailto:yangyongqi...@apache.org]
 * Mentor: Haopeng Li, Apache Doris PMC member & Committer, [lihaop...@apache.org|mailto:lihaop...@apache.org]
 * Mailing List: d...@doris.apache.org

  was:
*Apache Doris*
Apache Doris is a real-time analytical database based on MPP architecture. As a unified platform that supports multiple data processing scenarios, it ensures high performance for low-latency and high-throughput queries, allows for easy federated queries on data lakes, and supports various data ingestion methods.
{*}Page{*}: [https://doris.apache.org|https://doris.apache.org/]
{*}Github{*}: [https://github.com/apache/doris]

h3. *Background*
Apache Doris accelerates high-concurrency queries utilizing page cache, where the decompressed data is stored.
Currently, the page cache in Apache Doris uses a simple LRU algorithm, which reveals a few problems:
 # Hot data will be phased out in large queries
 # The page cache configuration is immutable and does not support GC.

h3. Task
 # {*}Phase One{*}: Identify the impacts on queries when the decompressed data is stored in memory and SSD, respectively, and then determine whether full page cache is required.
 # {*}Phase Two{*}: Improve the cache strategy for Apache Doris based on the results from Phase One.

h3. Learning Material
{*}Page{*}: https://doris.apache.org
{*}Github{*}: [https://github.com/apache/doris]

h3. Mentor
 * Mentor: Yongqiang Yang, Apache Doris PMC member & Committer, [yangyongqi...@apache.org |mailto:yangyongqi...@apache.org]
 * Mentor: Haopeng Li, Apache Doris PMC member & Committer, [lihaop...@apache.org|mailto:lihaop...@apache.org]
 * Mailing List: d...@doris.apache.org


> [GSoC][Doris]Page Cache Improvement
> -----------------------------------
>
>                 Key: COMDEV-510
>                 URL: https://issues.apache.org/jira/browse/COMDEV-510
>             Project: Community Development
>          Issue Type: Task
>          Components: GSoC/Mentoring ideas
>            Reporter: Zhijing Lu
>            Priority: Major
>              Labels: ApacheDoris, Mentor, full-time, gsoc2023