> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1460 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990388#file1990388line1471>
> >
> >     This will update all MV before they are considered for rewrite. Looks 
> > like we are doing extra work which could be avoided by delaying augmenting 
> > MV until it qualifies for rewrite.
> 
> Jesús Camacho Rodríguez wrote:
>     Rewriting algorithm needs those filters to produce correct rewriting, 
> hence though we are indeed doing extra work, I do not think we can avoid it. 
> This part is also useful when MVs are outdated (not only for rebuild), as 
> they may still be used to produce correct rewritings.
>     
>     On the other hand, we could avoid this work by implementing the 
> pre-filtering techniques that we were talking about at some point. The idea 
> would be that only views relevant for a given query will be returned by the 
> method, then they would be enriched.
>     
>     Another improvement is that for a rebuild operation, we only consider in 
> the rewriting the MV that we are rebuilding? However, this is a kind of 
> heuristic, since other MVs that are up-to-date may be useful to rewrite the 
> rebuild query.
> 
> Ashutosh Chauhan wrote:
>     Yeah, this makes pre-filtering even more important. I thought we had 
> prefiltering based on table and col names. If not, we should add that in 
> follow-up.
>     Heuristic for rebuidling makes lot of sense. Since I dont expect 
> overlapping MVs to be common. Heuristic will also improve compile time.

Prefiltering based on tables in present in Calcite, but we need to add the 
materializations to the planner first. Here logic is a bit different, because 
we are enriching the materializations before adding them to the planner. In any 
case, for the rebuild, which is the most common case of incremental rewriting, 
adding the heuristic will give us a boost. I just did it and added comments 
explaining it.

FWIW, in the creation metadata we have information about the tables used by a 
materialization, so we could do a simple prefilter before loading the 
materializations in the planner (follow-up).


> On April 3, 2018, 11:34 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateIncrementalRewritingRule.java
> > Lines 57-60 (patched)
> > <https://reviews.apache.org/r/66369/diff/1/?file=1990390#file1990390line57>
> >
> >     Is this rewrite always better?
> >     
> >     Imagine tabA and tabB to be in millions rows but their join generates 
> > cross product of billions of rows. In prev case we were doing union of 
> > large MV with join of two tables (with which filter may get very small) as 
> > compared to here where we do ROJ of MV with one source table.
> 
> Jesús Camacho Rodríguez wrote:
>     Comment may be misleading. In the last MERGE, _source_ refers to _source_ 
> block, not to a single query:
>     _(SELECT a, b, SUM(x) AS s, COUNT(*) AS c --NEW DATA
>      FROM TAB_A
>      JOIN TAB_B ON (TAB_A.a = TAB_B.z)
>      WHERE TAB_A.ROW_ID > 5
>      GROUP BY a, b) source_
>     
>     In any case, I think the point about performance is relevant. I expect if 
> we do a rebuild with not many new rows, incremental maintenance will get the 
> edge. However, if there are many changes in source tables, union may be 
> benefitial over ROJ? But in the latter case, we also need to take into 
> account that the UNION operation will invalidate the LLAP cache of the MV, 
> while the ROJ variant will not. I can add a flag to control the incremental 
> maintenance piece, even if it is always true, in case we want to disable it 
> to do some tests in the future? What do you think?
> 
> Ashutosh Chauhan wrote:
>     yeah.. lets add a flag with default to true.

Added hive.materializedview.rebuild.incremental.


- Jesús


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66369/#review200335
-----------------------------------------------------------


On April 5, 2018, 2:23 a.m., Jesús Camacho Rodríguez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66369/
> -----------------------------------------------------------
> 
> (Updated April 5, 2018, 2:23 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-18839
>     https://issues.apache.org/jira/browse/HIVE-18839
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-18839
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
> 5f07312651dc64fe23dcb21d185d6676f33f3bb0 
>   itests/src/test/resources/testconfiguration.properties 
> d2e077b5096cbf82f7b8a98e8f66a5ef98e72bc9 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 
> fb1efe01dcddb4fd77f05918e46d742922dd313d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MaterializedViewTask.java 
> 50fc4e0c63a060781c93f0e65c5da32dd9116d70 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 
> 5a95649f5bccbdd44cb4c4d12c65d39a5a733a56 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 490f3b8ffe7564d10d86293b402d3c0dad2f7ee7 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManagerImpl.java 
> c8cafa2a68f897b1034e5bf61e872e044e01c22a 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> eed37a19376c398cacd9f8e86551591eed89c07c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
>  53dc8ec1974dfc095bd5c7601c8d486b4712319e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateIncrementalRewritingRule.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveNoAggregateIncrementalRewritingRule.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/MaterializedViewRewritingRelVisitor.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 
> 41de17fd4679009ef6a4fb5a6d976cbc794ce791 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/MaterializedViewRebuildSemanticAnalyzer.java
>  75eb50c5797b312f66e81b5cec23849684e641fc 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 99e2c72d212cfaefe463b1fa32239d6b63aa9228 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
> 4c86fb89376041b30bc5c90a4e84f2e685d8ff82 
>   ql/src/test/queries/clientpositive/materialized_view_create_rewrite_4.q 
> c7f050be8da71f13463c22df2ccdab21c0e40b6a 
>   ql/src/test/queries/clientpositive/materialized_view_create_rewrite_5.q 
> PRE-CREATION 
>   
> ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_3.q.out
>  1ef7b876d82136e3a7ad9021db8c293a3cc4808a 
>   
> ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_4.q.out
>  PRE-CREATION 
>   
> ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_5.q.out
>  PRE-CREATION 
>   
> ql/src/test/results/clientpositive/llap/materialized_view_create_rewrite_rebuild_dummy.q.out
>  1ef7b876d82136e3a7ad9021db8c293a3cc4808a 
>   ql/src/test/results/clientpositive/llap/materialized_view_rewrite_7.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/materialized_view_create_rewrite.q.out 
> 18eb1d1daa483d4635db514beb96104b558ee1bd 
>   ql/src/test/results/clientpositive/materialized_view_create_rewrite_2.q.out 
> 144f9d84208ab32528a20bc6151c9be58571291e 
>   ql/src/test/results/clientpositive/materialized_view_create_rewrite_3.q.out 
> 65614566c9b738ea0354dc6d206a0ed7e635174e 
>   ql/src/test/results/clientpositive/materialized_view_create_rewrite_4.q.out 
> 48c0ecb23f1657c99ef30ed6963a3af1737f0514 
>   
> ql/src/test/results/clientpositive/materialized_view_create_rewrite_multi_db.q.out
>  32b408f5e4d389f9bd143695f8dd570db234a876 
>   
> ql/src/test/results/clientpositive/materialized_view_create_rewrite_time_window.q.out
>  bfbac3f3492fdd15b55475cd423cf262793cff84 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_1.q.out 
> 75e828fd4296d5f9333356916ba785ca664bfcb6 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_2.q.out 
> 6e8b2e3e6593b9e921517cb349755a7b668dc900 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_3.q.out 
> e3bd233177a9058cb202e641175a5fbb4aa7ca89 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_4.q.out 
> 7301571cbf06b874fa027aa8e7a9bc1192a5c948 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_5.q.out 
> bab3cacc2ce420578732037fb2cefc3d375c4544 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_6.q.out 
> 017d793283904ca83fc1074eb01ca9879805a61a 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_7.q.out 
> 486a50d55a346b7c2dd1e4e30a70ac7663da1d5d 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_8.q.out 
> 1ca06d3cc4207818f059650b0bc250433d41c783 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_9.q.out 
> 3120e0db7a19017d9edecc541f5bf1a1d444e66d 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_ssb.q.out 
> bc42df022734c76c18dc32c64509dd16addd1d3f 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_ssb_2.q.out 
> d561208fd8345bb338c0020ce6449fffd6067507 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 
> b9e8e24b6356feea2abd0777a1bd41907b6a4afc 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 
> c24055af98878a9422099e9e786252680243fcd5 
>   
> standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
>  cfec64f96a005dc6a342d267563f384108bfed22 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
> 086cfe9db6cd64aa6720f63925ee501192e6008f 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
> 383cb9bc9c1cb06cea4ed4577bcfe1bd81f69c54 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Materialization.java
>  69607280bb6fc92596db90c01ea2508fc2fe4cee 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
>  0bca4ebafd827bc1dd59b05f926bbeab97efb87f 
>   
> standalone-metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php 
> 5e3dff1a12a03bb5b362ba684e199f5afab589e4 
>   standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php 
> d4fcc888ed0ef679fa70ccc30c7d38fa932d5d8a 
>   
> standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
>  d39690f31ca62181cb535d95cc90203cfaec33d2 
>   
> standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py
>  f8ffeac6053a894fb039c0fad3663f52249406d6 
>   standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
> 972db1f0a39c8178b604e4d30447ee687243b5b3 
>   standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 
> 94454a1499e8354ef121074d1038a2f074a14cf5 
>   standalone-metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 
> c1036755b4421bfbcd4c21a036cc8bad5a8e32d8 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  8539fea42fa2743381833ab3137579caeac64672 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  ebbf465244885f8648800b0f98ec3abad9194fe1 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
>  b2c40c25f25c8069be9da198cc8f8c2dce09c14d 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MaterializationInvalidationInfo.java
>  3d774071c2ff00e892540364a7ec63e49b8089df 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MaterializationsInvalidationCache.java
>  80cb1de75e44efebd23e300c991b6715ad15faff 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MaterializationsRebuildLockCleanerTask.java
>  PRE-CREATION 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MaterializationsRebuildLockHandler.java
>  PRE-CREATION 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  995137f9675cd34b7d7c6f79bb78e206a65d99f7 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  2b815fc2e199914dbbefd884f78e97b23fd4c880 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  e72d3278928e43f479de38f424c36298bb53e4da 
>   standalone-metastore/src/main/thrift/hive_metastore.thrift 
> 7450439885f2437359db591802e8c37aae488226 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClientPreCatalog.java
>  7d372627a4a0eb96c739ba7ecf30e2ab5799efb5 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreMaterializationsCacheCleaner.java
>  7a871e14585b0cecd2e101609390e0b9c3672c8e 
> 
> 
> Diff: https://reviews.apache.org/r/66369/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jesús Camacho Rodríguez
> 
>

Reply via email to