Re: Adding abstraction in MLlib

2014-09-14 Thread Egor Pahomov
It's good that Databricks is working on this issue! However, the current process of working on it is not very clear to an outsider. - The last update on this ticket was August 5. If there was active development all this time, I have concerns that without feedback from the community for such a long time developme

Source code for mining big data with Spark

2014-09-14 Thread David Tung
Hi all, I watched an impressive Spark demo video by Reynold Xin and Aaron Davidson on YouTube ( https://www.youtube.com/watch?v=FjhRkfAuU7I ). Can someone let me know where I can find the source code for the demo? I can't see the source code clearly in the video. Thanks in advance

Support for Hive buckets

2014-09-14 Thread Cody Koeninger
I noticed that the release notes for 1.1.0 said that Spark doesn't support Hive buckets "yet". I didn't notice any JIRA issues related to adding support. Broadly speaking, what would be involved in supporting buckets, especially the bucketmapjoin and sortedmerge optimizations?

Re: Tests and Test Infrastructure

2014-09-14 Thread Nicholas Chammas
I fully support this. A smoothly running test infrastructure helps everybody’s work just flow better. The Jenkins Pull Request Builder is mostly functioning again. However, we are working on a simpler technical pipeline for testing patches, as this plug-in has been a constant source of downtime an