This is an automated email from the ASF dual-hosted git repository.

alsay pushed a commit to branch req_sketch_float
in repository https://gitbox.apache.org/repos/asf/datasketches-bigquery.git


The following commit(s) were added to refs/heads/req_sketch_float by this push:
     new 67f882b  req_sketch_float
67f882b is described below

commit 67f882b67fc030bf570feaa137e3fca689dfae7e
Author: AlexanderSaydakov <[email protected]>
AuthorDate: Thu Nov 21 15:48:38 2024 -0800

    req_sketch_float
---
 Makefile               |   2 +-
 readme_generator.py    |   2 +-
 req/README.md          | 113 +++++++++++++++++++++++++++++++++++++++++++++++++
 req/README_template.md |  38 +++++++++++++++++
 4 files changed, 153 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 8c7eafe..dcf5791 100644
--- a/Makefile
+++ b/Makefile
@@ -15,7 +15,7 @@
 # specific language governing permissions and limitations
 # under the License.
 
-MODULES := theta tuple cpc hll kll fi tdigest
+MODULES := theta tuple cpc hll kll fi tdigest req
 
 $(MODULES):
        $(MAKE) -C $@
diff --git a/readme_generator.py b/readme_generator.py
index f294c65..bf3bb80 100644
--- a/readme_generator.py
+++ b/readme_generator.py
@@ -162,7 +162,7 @@ def generate_readme(template_path: str, function_index: dict, examples_path: str
   return output_content
 
 if __name__ == "__main__":
-  sketch_types = ["cpc", "fi", "hll", "kll", "tdigest", "theta", "tuple"]
+  sketch_types = ["cpc", "fi", "hll", "kll", "tdigest", "theta", "tuple", "req"]
   template_name = "README_template.md"
   readme_name = "README.md"
   for sketch_type in sketch_types:
diff --git a/req/README.md b/req/README.md
new file mode 100644
index 0000000..63a0b9c
--- /dev/null
+++ b/req/README.md
@@ -0,0 +1,113 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+
+# Apache DataSketches REQ Sketches for Google BigQuery
+
+The Relative Error Quantiles (REQ) sketch provides extremely high accuracy
+at a chosen end of the rank domain: high rank accuracy (HRA) or low
+rank accuracy (LRA).
+REQ sketches are quantile sketches that provide approximate quantiles
+and ranks for a dataset.
+
+Please visit 
+[REQ Sketches](https://datasketches.apache.org/docs/REQ/ReqSketch.html) 
+for more information about this sketch family.
+
+Please visit the main 
+[Apache DataSketches website](https://datasketches.apache.org) 
+for more information about the DataSketches library.
+
+If you are interested in making contributions to this project please see our 
+[Community](https://datasketches.apache.org/docs/Community/) 
+page for how to contact us.
+
+| Function Name | Function Type | Signature | Description |
+|---|---|---|---|
+| [req_sketch_float_build](../definitions/req/req_sketch_float_build.sqlx) | AGGREGATE | (value FLOAT64) -> BYTES | Creates a sketch that represents the distribution of the given column.\<br\>\<br\>Param value: the column of FLOAT64 values.\<br\>Defaults: k = 12, hra = true.\<br\>Returns: a serialized REQ Sketch as BYTES. |
+| [req_sketch_float_merge](../definitions/req/req_sketch_float_merge.sqlx) | AGGREGATE | (sketch BYTES) -> BYTES | Merges sketches from the given column.\<br\>\<br\>Param sketch: the column of sketches.\<br\>Defaults: k = 12, hra = true.\<br\>Returns: a serialized REQ sketch as BYTES. |
+| [req_sketch_float_build_k_hra](../definitions/req/req_sketch_float_build_k_hra.sqlx) | AGGREGATE | (value FLOAT64, params STRUCT<k INT, hra BOOL> NOT AGGREGATE) -> BYTES | Creates a sketch that represents the distribution of the given column.\<br\>\<br\>Param value: the column of FLOAT64 values.\<br\>Param k: the sketch accuracy/size parameter as an even INT in the range \[4, 65534\].\<br\>Param hra: if true, the high ranks are prioritized for better accuracy. Otherwise the low ranks a [...]
+| [req_sketch_float_merge_k_hra](../definitions/req/req_sketch_float_merge_k_hra.sqlx) | AGGREGATE | (sketch BYTES, params STRUCT<k INT, hra BOOL> NOT AGGREGATE) -> BYTES | Merges sketches from the given column.\<br\>\<br\>Param sketch: the column of values.\<br\>Param k: the sketch accuracy/size parameter as an even INT in the range \[4, 65534\].\<br\>Param hra: if true, the high ranks are prioritized for better accuracy. Otherwise the low ranks are prioritized for better accuracy.\<br\ [...]
+| [req_sketch_float_get_n](../definitions/req/req_sketch_float_get_n.sqlx) | SCALAR | (sketch BYTES) -> INT64 | Returns the length of the input stream.\<br\>\<br\>Param sketch: the given sketch as BYTES.\<br\>Returns: stream length as INT64 |
+| [req_sketch_float_get_num_retained](../definitions/req/req_sketch_float_get_num_retained.sqlx) | SCALAR | (sketch BYTES) -> INT64 | Returns the number of retained items \(samples\) in the sketch.\<br\>\<br\>Param sketch: the given sketch as BYTES.\<br\>Returns: number of retained items as INT64 |
+| [req_sketch_float_get_min_value](../definitions/req/req_sketch_float_get_min_value.sqlx) | SCALAR | (sketch BYTES) -> FLOAT64 | Returns the minimum value of the input stream.\<br\>\<br\>Param sketch: the given sketch as BYTES.\<br\>Returns: min value as FLOAT64 |
+| [req_sketch_float_to_string](../definitions/req/req_sketch_float_to_string.sqlx) | SCALAR | (sketch BYTES) -> STRING | Returns a summary string that represents the state of the given sketch.\<br\>\<br\>Param sketch: the given sketch as BYTES.\<br\>Returns: a string that represents the state of the given sketch. |
+| [req_sketch_float_get_max_value](../definitions/req/req_sketch_float_get_max_value.sqlx) | SCALAR | (sketch BYTES) -> FLOAT64 | Returns the maximum value of the input stream.\<br\>\<br\>Param sketch: the given sketch as BYTES.\<br\>Returns: max value as FLOAT64 |
+| [req_sketch_float_get_cdf](../definitions/req/req_sketch_float_get_cdf.sqlx) | SCALAR | (sketch BYTES, split_points ARRAY<FLOAT64>, inclusive BOOL) -> ARRAY<FLOAT64> | Returns an approximation to the Cumulative Distribution Function \(CDF\) \<br\>of the input stream as an array of cumulative probabilities defined by the given split\_points.\<br\>\<br\>Param sketch: the given sketch as BYTES.\<br\>\<br\>Param split\_points: an array of M unique, monotonically increasing values\<br\>  \( [...]
+| [req_sketch_float_get_rank_lower_bound](../definitions/req/req_sketch_float_get_rank_lower_bound.sqlx) | SCALAR | (sketch BYTES, rank FLOAT64, num_std_dev BYTEINT) -> FLOAT64 | Returns an approximate lower bound of the given normalized rank.\<br\>Param sketch: the given sketch as BYTES.\<br\>Param rank: the given rank, a value between 0 and 1.0.\<br\>Param num\_std\_dev: The returned bounds will be based on the statistical confidence interval determined by the given number of standard  [...]
+| [req_sketch_float_get_pmf](../definitions/req/req_sketch_float_get_pmf.sqlx) | SCALAR | (sketch BYTES, split_points ARRAY<FLOAT64>, inclusive BOOL) -> ARRAY<FLOAT64> | Returns an approximation to the Probability Mass Function \(PMF\)\<br\>of the input stream as an array of probability masses defined by the given split\_points.\<br\>\<br\>Param sketch: the given sketch as BYTES.\<br\>\<br\>Param split\_points: an array of M unique, monotonically increasing values \<br\>  \(of the same t [...]
+| [req_sketch_float_get_quantile](../definitions/req/req_sketch_float_get_quantile.sqlx) | SCALAR | (sketch BYTES, rank FLOAT64, inclusive BOOL) -> FLOAT64 | Returns a value from the sketch that is the best approximation to a value from the original stream with the given rank.\<br\>\<br\>Param sketch: the given sketch in serialized form.\<br\>Param rank: rank of a value in the hypothetical sorted stream.\<br\>Param inclusive: if true, the given rank is considered inclusive \(includes wei [...]
+| [req_sketch_float_get_rank_upper_bound](../definitions/req/req_sketch_float_get_rank_upper_bound.sqlx) | SCALAR | (sketch BYTES, rank FLOAT64, num_std_dev BYTEINT) -> FLOAT64 | Returns an approximate upper bound of the given normalized rank.\<br\>Param sketch: the given sketch as BYTES.\<br\>Param rank: the given rank, a value between 0 and 1.0.\<br\>Param num\_std\_dev: The returned bounds will be based on the statistical confidence interval determined by the given number of standard  [...]
+| [req_sketch_float_get_rank](../definitions/req/req_sketch_float_get_rank.sqlx) | SCALAR | (sketch BYTES, value FLOAT64, inclusive BOOL) -> FLOAT64 | Returns an approximation to the normalized rank, on the interval \[0.0, 1.0\], of the given value.\<br\>\<br\>Param sketch: the given sketch in serialized form.\<br\>Param value: value to be ranked.\<br\>Param inclusive: if true the weight of the given value is included into the rank.\<br\>Returns: an approximate rank of the given value. |
+
+**Examples:**
+
+```sql
+
+# using defaults
+
+create or replace table `$BQ_DATASET`.req_sketch(sketch bytes);
+
+insert into `$BQ_DATASET`.req_sketch
+(select `$BQ_DATASET`.req_sketch_float_build(value) from unnest([1,2,3,4,5,6,7,8,9,10]) as value);
+
+insert into `$BQ_DATASET`.req_sketch
+(select `$BQ_DATASET`.req_sketch_float_build(value) from unnest([11,12,13,14,15,16,17,18,19,20]) as value);
+
+select `$BQ_DATASET`.req_sketch_float_to_string(`$BQ_DATASET`.req_sketch_float_merge(sketch)) from `$BQ_DATASET`.req_sketch;
+
+# expected 0.5
+select `$BQ_DATASET`.req_sketch_float_get_rank(`$BQ_DATASET`.req_sketch_float_merge(sketch), 10, true) from `$BQ_DATASET`.req_sketch;
+
+# expected 10
+select `$BQ_DATASET`.req_sketch_float_get_quantile(`$BQ_DATASET`.req_sketch_float_merge(sketch), 0.5, true) from `$BQ_DATASET`.req_sketch;
+
+# expected 0.5, 0.5
+select `$BQ_DATASET`.req_sketch_float_get_pmf(`$BQ_DATASET`.req_sketch_float_merge(sketch), [10.0], true) from `$BQ_DATASET`.req_sketch;
+
+# expected 0.5, 1
+select `$BQ_DATASET`.req_sketch_float_get_cdf(`$BQ_DATASET`.req_sketch_float_merge(sketch), [10.0], true) from `$BQ_DATASET`.req_sketch;
+
+# expected 1
+select `$BQ_DATASET`.req_sketch_float_get_min_value(`$BQ_DATASET`.req_sketch_float_merge(sketch)) from `$BQ_DATASET`.req_sketch;
+
+# expected 20
+select `$BQ_DATASET`.req_sketch_float_get_max_value(`$BQ_DATASET`.req_sketch_float_merge(sketch)) from `$BQ_DATASET`.req_sketch;
+
+# expected 20
+select `$BQ_DATASET`.req_sketch_float_get_n(`$BQ_DATASET`.req_sketch_float_merge(sketch)) from `$BQ_DATASET`.req_sketch;
+
+# expected 20
+select `$BQ_DATASET`.req_sketch_float_get_num_retained(`$BQ_DATASET`.req_sketch_float_merge(sketch)) from `$BQ_DATASET`.req_sketch;
+
+drop table `$BQ_DATASET`.req_sketch;
+
+# using full signatures
+
+create or replace table `$BQ_DATASET`.req_sketch(sketch bytes);
+
+insert into `$BQ_DATASET`.req_sketch
+(select `$BQ_DATASET`.req_sketch_float_build_k_hra(value, struct<int, bool>(10, false)) from unnest([1,2,3,4,5,6,7,8,9,10]) as value);
+
+insert into `$BQ_DATASET`.req_sketch
+(select `$BQ_DATASET`.req_sketch_float_build_k_hra(value, struct<int, bool>(10, false)) from unnest([11,12,13,14,15,16,17,18,19,20]) as value);
+
+select `$BQ_DATASET`.req_sketch_float_to_string(`$BQ_DATASET`.req_sketch_float_merge_k_hra(sketch, struct<int, bool>(10, false))) from `$BQ_DATASET`.req_sketch;
+
+drop table `$BQ_DATASET`.req_sketch;
+```
diff --git a/req/README_template.md b/req/README_template.md
new file mode 100644
index 0000000..06d7108
--- /dev/null
+++ b/req/README_template.md
@@ -0,0 +1,38 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+
+# Apache DataSketches REQ Sketches for Google BigQuery
+
+The Relative Error Quantiles (REQ) sketch provides extremely high accuracy
+at a chosen end of the rank domain: high rank accuracy (HRA) or low
+rank accuracy (LRA).
+REQ sketches are quantile sketches that provide approximate quantiles
+and ranks for a dataset.
+
+Please visit 
+[REQ Sketches](https://datasketches.apache.org/docs/REQ/ReqSketch.html) 
+for more information about this sketch family.
+
+Please visit the main 
+[Apache DataSketches website](https://datasketches.apache.org) 
+for more information about the DataSketches library.
+
+If you are interested in making contributions to this project please see our 
+[Community](https://datasketches.apache.org/docs/Community/) 
+page for how to contact us.
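
The "expected" comments in the req/README.md examples can be sanity-checked without BigQuery. The sketch below is not the REQ algorithm and not part of this commit. It is a plain exact computation in Python, assuming the usual definitions: inclusive rank is the fraction of items less than or equal to the value, and the inclusive quantile is the smallest item whose inclusive rank reaches the requested rank. All function names here are hypothetical. The README's own expected values (n = 20, 20 retained items) suggest the sketch holds every input in that example, so exact results should coincide with the sketch results there.

```python
from bisect import bisect_left, bisect_right
import math


def rank(sorted_values, value, inclusive=True):
    """Exact normalized rank of value, in [0.0, 1.0] (hypothetical helper)."""
    if inclusive:
        count = bisect_right(sorted_values, value)  # items <= value
    else:
        count = bisect_left(sorted_values, value)   # items < value
    return count / len(sorted_values)


def quantile_inclusive(sorted_values, r):
    """Smallest item whose inclusive rank is >= r (assumed definition)."""
    index = max(math.ceil(r * len(sorted_values)) - 1, 0)
    return sorted_values[index]


def cdf(sorted_values, split_points, inclusive=True):
    """Cumulative probabilities at each split point, plus a final 1.0."""
    return [rank(sorted_values, p, inclusive) for p in split_points] + [1.0]


def pmf(sorted_values, split_points, inclusive=True):
    """Probability mass of each interval defined by the split points."""
    cumulative = cdf(sorted_values, split_points, inclusive)
    return [c - prev for prev, c in zip([0.0] + cumulative, cumulative)]


# Same input as the README example: the values 1..20.
values = [float(v) for v in range(1, 21)]

print(rank(values, 10.0, True))         # 0.5
print(quantile_inclusive(values, 0.5))  # 10.0
print(pmf(values, [10.0], True))        # [0.5, 0.5]
print(cdf(values, [10.0], True))        # [0.5, 1.0]
print(min(values), max(values), len(values))  # 1.0 20.0 20
```

On larger inputs the sketch results are approximate and would be expected to differ from these exact values within the sketch's error bounds.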

