(datasketches-website) branch asf-site updated: Automatic Site Publish by Buildbot

git-site-role Fri, 25 Oct 2024 11:57:33 -0700

This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datasketches-website.git



The following commit(s) were added to refs/heads/asf-site by this push:
     new 02f2675d Automatic Site Publish by Buildbot
02f2675d is described below

commit 02f2675d1a46f6aed8ab20fc17b0dbbae034e472
Author: buildbot <[email protected]>
AuthorDate: Fri Oct 25 18:55:54 2024 +0000

    Automatic Site Publish by Buildbot
---
 output/docs/Architecture/KeyFeatures.html         | 6 +++---
 output/docs/Architecture/MajorSketchFamilies.html | 8 ++++----
 output/docs/HLL/Hll_vs_CS_Hllpp.html              | 2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/output/docs/Architecture/KeyFeatures.html 
b/output/docs/Architecture/KeyFeatures.html
index 79a675b4..75d861d8 100644
--- a/output/docs/Architecture/KeyFeatures.html
+++ b/output/docs/Architecture/KeyFeatures.html
@@ -414,9 +414,9 @@ and Difference) on sets of unique identifiers</li>
 <h4 id="four-families-of-count-unique-algorithms">Four families of Count 
Unique algorithms:</h4>
 
 <ul>
-  <li><a href="/docs/HLL/HLL.html">The HLL Sketch</a>. The famous HyperLogLog 
algorithm when stored sketch size is of utmost concern.</li>
-  <li><a href="/docs/CPC/CPC.html">The CPC Sketch</a>. The Compressed 
Probabilistic Counting algorithm when maximizing accuracy per stored sketch 
size is of utmost concern.</li>
-  <li><a href="/docs/Theta/ThetaSketchFramework.html">The Theta Sketch 
Framework</a>. Theta sketches enable real-time set-expression computations and 
can operate on or off the java heap.</li>
+  <li><a href="/docs/HLL/HllSketches.html">The HLL Sketch</a>. The famous 
HyperLogLog algorithm when stored sketch size is of utmost concern.</li>
+  <li><a href="/docs/CPC/CpcSketches.html">The CPC Sketch</a>. The Compressed 
Probabilistic Counting algorithm when maximizing accuracy per stored sketch 
size is of utmost concern.</li>
+  <li><a href="/docs/Theta/ThetaSketches.html">The Theta Sketch Framework</a>. 
Theta sketches enable real-time set-expression computations and can operate on 
or off the java heap.</li>
   <li><a href="/docs/Tuple/TupleOverview.html">The Tuple Sketch</a>. Tuple 
sketches are associative sketches that are useful for performing approximate 
join operations and extracting other kinds of statistical behavior associated 
with unique identifiers.</li>
 </ul>
 
diff --git a/output/docs/Architecture/MajorSketchFamilies.html 
b/output/docs/Architecture/MajorSketchFamilies.html
index 3a6c3771..70ef83c5 100644
--- a/output/docs/Architecture/MajorSketchFamilies.html
+++ b/output/docs/Architecture/MajorSketchFamilies.html
@@ -327,10 +327,10 @@
 
 <h2 id="cardinality-sketches">Cardinality Sketches</h2>
 
-<h3 
id="cpc-sketch-estimating-stream-cardinalities-more-efficiently-than-the-famous-hll-sketch"><a
 href="/docs/CPC/CPC.html">CPC Sketch</a>: Estimating Stream Cardinalities more 
efficiently than the famous HLL sketch!</h3>
+<h3 
id="cpc-sketch-estimating-stream-cardinalities-more-efficiently-than-the-famous-hll-sketch"><a
 href="/docs/CPC/CpcSketches.html">CPC Sketch</a>: Estimating Stream 
Cardinalities more efficiently than the famous HLL sketch!</h3>
 <p>This sketch was developed by the late Keven J. Lang, our chief scientist at 
the time. It is an amazing <em>tour de force</em> of scientific design and 
engineering and has substantially better accuracy / per stored size than the 
famous HLL sketch. The theory and demonstration of its performance is detailed 
in Lang’s paper <a href="https://arxiv.org/abs/1708.06839">Back to the Future: 
an Even More Nearly Optimal Cardinality Estimation Algorithm</a>.</p>
 
-<h3 id="theta-sketches-estimating-stream-expression-cardinalities"><a 
href="/docs/Theta/ThetaSketchFramework.html">Theta Sketches</a>: Estimating 
Stream Expression Cardinalities</h3>
+<h3 id="theta-sketches-estimating-stream-expression-cardinalities"><a 
href="/docs/Theta/ThetaSketches.html">Theta Sketches</a>: Estimating Stream 
Expression Cardinalities</h3>
 <p>Internet content, search and media companies like Yahoo, Google, Facebook, 
etc., collect many tens of billions of event records from the many millions of 
users to their web sites each day.  These events can be classified by many 
different dimensions, such as the page visited and user location and profile 
information.  Each event also contains some unique identifiers associated with 
the user, specific device (cell phone, tablet, or computer) and the web browser 
used.</p>
 
 <p><img class="doc-img-full" src="/docs/img/PeopleCloud.png" alt="PeopleCloud" 
/></p>
@@ -338,7 +338,7 @@
 <p>These same unique identifiers will appear on every page that the user 
visits.  In order to measure the number of unique identifiers on a page or 
across a number of different pages, it is necessary to discount the identifier 
duplicates.  Obtaining an exact answer to a <em>COUNT DISTINCT</em> query with 
massive data is a difficult computational challenge. It is even more 
challenging if it is necessary to compute arbitrary expressions across sets of 
unique identifiers. For example, if se [...]
 
 <p>Computing cardinalities with massive data requires lots of computer 
resources and time.
-However, if an approximate answer to these problems is acceptable, <a 
href="/docs/Theta/ThetaSketchFramework.html">Theta Sketches</a> can provide 
reasonable estimates, in a single pass, orders of magnitude faster, even fast 
enough for analysis in near-real time.</p>
+However, if an approximate answer to these problems is acceptable, <a 
href="/docs/Theta/ThetaSketches.html">Theta Sketches</a> can provide reasonable 
estimates, in a single pass, orders of magnitude faster, even fast enough for 
analysis in near-real time.</p>
 
 <p>The <a 
href="https://github.com/apache/datasketches-java/blob/master/src/main/java/org/apache/datasketches/theta/Sketch.java">theta/Sketch</a>
 can operate both on-heap and off-heap, has powerful Union, Intersection, AnotB 
and Jaccard operators, has a high-performance concurrent form for 
multi-threaded environments, has both immutable compact, and updatable 
representations, and is quite fast. Because of its flexibility, it is one of 
the most popular sketches in our library.</p>
 
@@ -354,7 +354,7 @@ However, if an approximate answer to these problems is 
acceptable, <a href="/doc
   <li><a 
href="https://github.com/apache/datasketches-java/blob/master/src/main/java/org/apache/datasketches/tuple/arrayofdoubles/ArrayOfDoublesSketch.java">tuple/ArrayOfDoublesSketch</a>,
 which enables the user to specify the number of columns of double values as 
the <em>summary</em>. This variant also provides both on-heap and off-heap 
operation.</li>
 </ul>
 
-<h3 id="hyperloglog-sketches-estimating-stream-cardinalities"><a 
href="/docs/HLL/HLL.html">HyperLogLog Sketches</a>: Estimating Stream 
Cardinalities</h3>
+<h3 id="hyperloglog-sketches-estimating-stream-cardinalities"><a 
href="/docs/HLL/HllSketches.html">HyperLogLog Sketches</a>: Estimating Stream 
Cardinalities</h3>
 <p>The HyperLogLog (HLL) is a cardinality sketch similar to the above Theta 
sketches except they are anywhere from 2 to 16 times smaller in size.  The HLL 
sketches can be merged via the Union operator, but set intersection and 
difference operations are not provided intrinsically, because the resulting 
error would be quite poor.  If your application only requires cardinality 
estimation and merging and space is at a premium, the HLL or CPC sketches would 
be your best choice.</p>
 
 <p>The <a 
href="https://github.com/apache/datasketches-java/blob/master/src/main/java/org/apache/datasketches/hll/HllSketch.java">hll/HllSketch</a>
 can operate both on-heap and off-heap, provides the Union operators, and has 
both immutable compact and updatable representations.</p>
diff --git a/output/docs/HLL/Hll_vs_CS_Hllpp.html 
b/output/docs/HLL/Hll_vs_CS_Hllpp.html
index 30d5c6d8..015e4874 100644
--- a/output/docs/HLL/Hll_vs_CS_Hllpp.html
+++ b/output/docs/HLL/Hll_vs_CS_Hllpp.html
@@ -559,7 +559,7 @@ Note that the Y-axis scale is now 100 nanoseconds. Some of 
the peaks in these pl
 
 <ul>
   <li>[1] <a 
href="https://github.com/apache/datasketches-java/tree/master/src/main/java/org/apache/datasketches/hll">DataSketches
 HllSketch GitHub</a></li>
-  <li>[2] <a 
href="https://datasketches.apache.org/docs/HLL/HLL.html">DataSketches HllSketch 
JavaDocs (top of page)</a></li>
+  <li>[2] <a 
href="https://datasketches.apache.org/docs/HLL/HllSketches.html">DataSketches 
HllSketch JavaDocs (top of page)</a></li>
   <li>[3] <a 
href="https://github.com/addthis/stream-lib/blob/master/src/main/java/com/clearspring/analytics/stream/cardinality/HyperLogLogPlus.java">HyperLogLogPlus
 GitHub</a></li>
   <li>[4] <a 
href="https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40671.pdf">Google:
 HyperLogLog in Practice: Algorithmic Engineering of a State of The Art 
Cardinality Estimation Algorithm</a></li>
   <li>[5] The Root-Mean-Square of the Relative Error (RMS-RE) is sensitive to 
bias of the mean if there is any. However, if the bias is zero RMS-RE will 
produce the same results as the theoretical Relative Standard Error (RSE) of 
the stochastic process.</li>


