Hi Aljoscha,

I would like to know how Apache Flink performs when running the same SQL query as the one below. Do you have any Apache Flink benchmark information, such as the results published here: https://amplab.cs.berkeley.edu/benchmark/

Thanks.
SELECT pageURL, pageRank FROM rankings WHERE pageRank > X

Median Response Time (s)

System                     Query 1A           Query 1B              Query 1C
                           (32,888 results)   (3,331,851 results)   (89,974,976 results)
Redshift (HDD) - Current   2.49               2.61                  9.46
Impala - Disk - 1.2.3      12.015             12.015                37.085
Impala - Mem - 1.2.3       2.17               3.01                  36.04
Shark - Disk - 0.8.1       6.6                7                     22.4
Shark - Mem - 0.8.1        1.7                1.8                   3.6
Hive - 0.12 YARN           50.49              59.93                 43.34
Tez - 0.2.0                28.22              36.35                 26.44

On Mon, Jun 8, 2015 at 2:03 AM, Aljoscha Krettek <aljos...@apache.org> wrote:
> Hi,
> actually, what do you want to know about Flink SQL?
>
> Aljoscha
>
> On Sat, Jun 6, 2015 at 2:22 AM, Hawin Jiang <hawin.ji...@gmail.com> wrote:
> > Thanks all
> >
> > Actually, I want to know more info about Flink SQL and Flink performance
> > Here is the Spark benchmark. Maybe you already saw it before.
> > https://amplab.cs.berkeley.edu/benchmark/
> >
> > Thanks.
> >
> > Best regards
> > Hawin
> >
> > On Fri, Jun 5, 2015 at 1:35 AM, Fabian Hueske <fhue...@gmail.com> wrote:
> >>
> >> If you want to append data to a data set that is store as files (e.g., on
> >> HDFS), you can go for a directory structure as follows:
> >>
> >> dataSetRootFolder
> >>   - part1
> >>     - 1
> >>     - 2
> >>     - ...
> >>   - part2
> >>     - 1
> >>     - ...
> >>   - partX
> >>
> >> Flink's file format supports recursive directory scans such that you can
> >> add new subfolders to dataSetRootFolder and read the full data set.
> >>
> >> 2015-06-05 9:58 GMT+02:00 Aljoscha Krettek <aljos...@apache.org>:
> >>>
> >>> Hi,
> >>> I think the example could be made more concise by using the Table API.
> >>> http://ci.apache.org/projects/flink/flink-docs-master/libs/table.html
> >>>
> >>> Please let us know if you have questions about that, it is still quite
> >>> new.
> >>>
> >>> On Fri, Jun 5, 2015 at 9:03 AM, hawin <hawin.ji...@gmail.com> wrote:
> >>> > Hi Aljoscha
> >>> >
> >>> > Thanks for your reply.
> >>> > Do you have any tips for Flink SQL.
> >>> > I know that Spark support ORC format. How about Flink SQL?
> >>> > BTW, for TPCHQuery10 example, you have implemented it by 231 lines of
> >>> > code.
> >>> > How to make that as simple as possible by flink.
> >>> > I am going to use Flink in my future project. Sorry for so many
> >>> > questions.
> >>> > I believe that you guys will make a world difference.
> >>> >
> >>> > @Chiwan
> >>> > You made a very good example for me.
> >>> > Thanks a lot
> >>> >
> >>> > --
> >>> > View this message in context:
> >>> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Re-Apache-Flink-transactions-tp1457p1494.html
> >>> > Sent from the Apache Flink User Mailing List archive. mailing list
> >>> > archive at Nabble.com.