I continue to see no point in engaging in this as a debate. User acceptance speaks for itself. (To give just one example, the only person who hasn't gotten the display system working in 208 is Eric.) So does the rate of change: there has been a series of pushes to 702 in the past month or two, either fixing problems related to (1) or adding functionality that Eric originally said wasn't required or was a bad idea, but put in after I pointed it out or users complained. The reason that process has slowed is that I've stopped highlighting the gaps.
If anyone has a question about any of this, I'll address it.

> On Feb 23, 2016, at 2:06 PM, Eric Charles <[email protected]> wrote:
>
>> On 23/02/16 19:52, Amos B. Elberg wrote:
>> Eric, they're not equivalent. 208 continues to have functionality 702
>> doesn't, including the display system.
>>
>> I'm not going to tell you what you're doing wrong in your implementation and
>> "test" of 208, because the users don't seem to have the same confusion, and
>> I've essentially been guiding your development process by pointing out the
>> issues.
>>
>> All three of the issues you raise were addressed already in other threads:
>>
>> 1. The proposed approach to rscala actually introduces maintenance issues
>> that have already broken 702. 702 was then revised to work around that, by
>> distributing part of rscala in binary form. But the workaround doesn't deal
>> with the issue of R users updating their own installations, and it
>> eliminates the purported benefit of the approach.
>
> Using binary form with a specific version at build time is the classical way
> to deploy on machines. Upgrading machines with a new rscala library implies
> rebuilding and redeploying.
>
> This flexibility is only possible with binaries and not with forked fixed
> source code. With 702, you can choose to build with scala 2.xx and rscala
> 1.0.8 or the version you want to align with the library available on your
> machines.
>
>> 2. This is purely cosmetic. 208 is outside the spark module because it made
>> development, testing and merging cleaner.
>
> Sure, this is cosmetic, but I have tried to stick to the existing pyspark
> implementation to avoid additional maven modules. Btw, having two magic
> keywords as 208 offers is also something I have avoided to align with current
> practices and make it simple for the end user.
>
>>
>> 3. 208 has supported the HTML, TABLE and IMG display system all along, in an
>> R-consistent manner. 702 originally did not support any of it.
>> After I pointed out the gap and users complained, 702 was revised to
>> implement it partially. 702 still does not. That's why the user questions
>> about this all get asked on 702 - the people using 208 don't need to ask
>> about it, because it works as expected.
>
> I quickly pulled and tested your branch today, but running
> print("%html <h1>hello</h1>") didn't work. Will try again tomorrow.
>
>>> On Feb 23, 2016, at 1:20 PM, Eric Charles <[email protected]> wrote:
>>>
>>> It would make no sense merging both.
>>>
>>> From an end-user perspective, I guess both are equivalent, although with
>>> the last commit I made, the Zeppelin Display system is supported in 702 (I
>>> had no luck when testing this functionality with 208). As I said, feel free
>>> to test both and send feature requests.
>>>
>>> From a developer perspective, I will reiterate the points I sent on [1]
>>> which are addressed in 702 (these points make sense to me but didn't
>>> receive echo so far - would like to get feedback on these):
>>>
>>> 1.- Use rscala jar instead of forking -> allows to support the platform
>>> version (scala version...) and benefit from the rscala project new versions
>>> with patches without having to maintain in the zeppelin source tree fork.
>>>
>>> 2.- Just like Python, develop R in the Spark module
>>>
>>> 3.- Support the same behavior as the rest (no TABLE when output is a
>>> dataframe, support the HTML, TABLE and IMG display system, support the
>>> Dynamic Form system).
>>>
>>> I still have the Dynamic Form system operational.
>>>
>>> [1]
>>> http://mail-archives.apache.org/mod_mbox/incubator-zeppelin-dev/201512.mbox/%3C5683E471.9010001%40apache.org%3E
>>>
>>>> On 23/02/16 19:09, Jeff Steinmetz wrote:
>>>> Thank you Amos Elberg & Eric Charles:
>>>> Is the goal of the community to merge both 208 and 702 at some point as
>>>> two “different” R interpreters?
>>>>
>>>> One that is
>>>> %r
>>>> And another that is
>>>> %spark.r
>>>>
>>>> Still trying to wrap my head around the difference.
>>>>
>>>>> On 2/23/16, 9:34 AM, "Amos B. Elberg" <[email protected]> wrote:
>>>>>
>>>>> Jeff - 702 isn't a fork, it's an alternative based on 208 that has a
>>>>> subset of 208's features. 208 is the superset. 208 is also what the
>>>>> community is now attempting to integrate.
>>>>>
>>>>> R does support serialization of functions.
>>>>>
>>>>> 208 does support passing a spark table back and forth between R and
>>>>> scala. Passing a data.frame through the Zeppelin context will fail in
>>>>> spark up to 1.5. It may now be working for some data frames in 1.6.
>>>>>
>>>>> There are examples that do all these things in the documentation for 208
>>>>> on my repo at github.com/elbamos/Zeppelin-With-R
>>>>>
>>>>>> On Feb 23, 2016, at 12:03 PM, Jeff Steinmetz
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Hello zeppelin dev group,
>>>>>>
>>>>>> Regarding the R Interpreter Pull requests 208 and 702. I am trying to
>>>>>> figure out if the functionality between these is overlapping, or one
>>>>>> supports something different than the other. Is 702 a super set of 208
>>>>>> (702 is a fork of 208)?
>>>>>>
>>>>>> Can you pass the reference of a distributed (parallelized) dataframe
>>>>>> built in %spark (scala) to the R interpreter? Similar to
>>>>>> z.put("myDF", myDF)?
>>>>>>
>>>>>> Similarly, since R doesn’t support serialization of functions (unless
>>>>>> you use something from the SparkR library) is there an example of
>>>>>> collecting the parallel DF to a local DF (which I realize means the
>>>>>> dataset needs to fit in local memory on the zeppelin server)?
>>>>>>
>>>>>> I can dig into this a bit and help out where appropriate, however it's
>>>>>> unclear which PR to focus my efforts on.
>>>>>>
>>>>>> Best,
>>>>>> Jeff Steinmetz
>>>>>> Principal Architect
>>>>>> Akili Interactive Labs
>>>>>>
>>>>>>> On 2/23/16, 8:01 AM, "elbamos" <[email protected]> wrote:
>>>>>>>
>>>>>>> Github user elbamos commented on the pull request:
>>>>>>>
>>>>>>> https://github.com/apache/incubator-zeppelin/pull/702#issuecomment-187764059
>>>>>>>
>>>>>>> @btiernay support for that has been in 208 all along...
>>>>>>>
>>>>>>>> On Feb 23, 2016, at 9:27 AM, Bob Tiernay <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> @echarles This is great! Thanks for all your hard work. Very much
>>>>>>>> appreciated!
