Hi Michael,
thanks for your reply. The current problem with the Data Science ecosystem 
(from data analysis all the way to GPU-based ML) is that it employs a whole 
stack of languages, from low-level ones like C (and sometimes assembler) all 
the way to scripting languages like Python or R. In parallel, there are Big 
Data tools, Spark/Scala being the most popular, that can process massive 
data sets, but only provided the computation fits their computation model 
(Map-Reduce and friends). Working with pandas or scipy or R does not feel 
like programming any longer :-) You are calling into massive APIs written 
by other people in other languages. There is a realization in the Deep 
Learning community that Python does not quite cut it. A proverbial saying 
is that "the worst thing about PyTorch is Python". Hence the attempts to 
create monolingual stacks like Julia or, more recently, Swift for 
TensorFlow, which is not quite monolingual but has an ambition to gradually 
eat into the territory of the C++-based TF kernel. The same goes for 
Scala/Spark: the JVM, with its high memory pressure, is not the best choice 
for near-bare-metal calculations (a possible opening for Go).

I am curious whether Go will toss its hat in the ring or will leave the 
field to other players.

--Leo

On Tuesday, July 16, 2019 at 3:31:12 PM UTC-4, Michael Jones wrote:
>
> Leo,
>
> R is implemented in C and FORTRAN plus R on top of that. SAS is in C (and 
> some Go here and there) plus the SAS language on top of that. Mathematica 
> is implemented in C/C++ with the "Wolfram Language" on top of that. PARI/GP 
> is implemented in C plus some GP-language code. Macsyma, Maple, Octave, 
> Python,... follow this pattern too:
>
> 3 [glue-like meta-tools that combine various "full stack" tools]: Sage
>   :
> 2 [interactive exploration environment with scripting]: many and various, 
> including R, SAS, MMA, GP, Macsyma, Axiom, Maple, Python, ...
>   :
> 1 [performant heavy duty computation in compiled language]: C/C++
>   :
> 0 [ultra-performant kernels in C/Assembler/..]: GMP, LAPACK, BLAS, ATLAS, 
> ...
>
> You say Data Science is an application domain where Level 2 features make 
> sense, where they facilitate understanding by providing an interactive 
> environment. The evidence supports you, though understand that none of your 
> examples (or those in my expanded set) actually do much at that level: this is 
> where the "convolve a with b" is specified, but the actual doing is lower, 
> in Level 0 and 1, where Go-like compiled software in C, C++, or FORTRAN 
> does the heavy lifting. (I make this point only to clarify what some people 
> seem not to understand in blogs where they write "my Python giant matrix 
> solver is just as fast as C/C++/Go, I don't see why C/C++/Go is not faster" 
> or "I don't see advantage in compiled languages.")
>
> If Go has a place in interactive, interpretive data science it seems to me 
> that it would be as the substrate language (Levels 0 and 1). Go certainly 
> has a place in statistics, applied mathematics, and other realms related to 
> data science if you want to include apps that do work and act on 
> results--control systems, analysis tools, etc. But to create an interactive 
> "play" space I'd (again, just me) be inclined to follow the PARI/GP model 
> with a Go kind of PARI and a domain-friendly GP. 
>
> The high-level GP (Mathematica, Maple, GP, SAS, ...) in the existing 
> systems often seems to me to be weak, not designed as a first-class 
> programming language but more like an endless accretion of script-enabling 
> fixes and patches. I feel this especially in the way local variables are 
> defined which often feels brutish and awkward, but that extends to many 
> subtleties. It is natural that it tends this way--developers were focused 
> on the core and just needed "a way" to bind it all together. The successful 
> projects span decades and unanticipated new application domains so have 
> accumulated the most duct tape.
>
> Another goodness of this two-level scheme is that the top language can be 
> "faulty" in ways that are comfortable. For example, think how many scalar 
> variables you see in C/C++/FORTRAN/Go: "i := 3" is the bulk of variables. 
> But in R, there are (at least when I last looked) no scalar variables(!), 
> but you can get by with vectors of length 1. This would not do, generally, 
> but for R, it may be perfect. The two-level strata design, of which PARI/GP 
> is one of the best implementations, makes this kind of field-of-use 
> tailoring work fine in practice. That's important: it matches the 
> language's exposed concepts to the problem domain.
>
> I don't see any of this as a weakness or strength of Go, or as something 
> to address in the case of a REPL, because it's not Go that you'd want a 
> REPL for, instead something that knows about data, or Diophantine 
> equations, or moon rocks, or whatever the domain may be and its natural 
> forms of notation.
>
> Michael
>
> On Tue, Jul 16, 2019 at 10:18 AM Slonik Az <slon...@gmail.com> wrote:
>
>> Hi Gophers!
>> I was thinking of starting a Go project in the area of Data Science that 
>> would allow for convenient and easy concurrent data processing, but in the 
>> end decided against it, mainly for two reasons:
>>
>> (1) Almost all data science projects start with an exploratory data 
>> analysis of some sort. Unfortunately, Go does not have a REPL. The Go 
>> Playground is not a substitute, for it does not preserve state. On every 
>> iteration the Playground recompiles and relaunches the entire program, 
>> reads all the data anew, and performs all the calculations. Not good for 
>> an interactive "rapid fire".
>> A REPL in a static, AOT-compiled language is hard, yet Swift somehow 
>> managed to implement it.
>>
>> (2) Even if somebody implements an incremental Go compiler and provides a 
>> proper REPL, people will be longing for data analysis "at your fingertips", 
>> missing a rich pandas-like API, overloaded operators (Python style) and 
>> dynamic scoping (like in R). The minimalistic design of Go is unlikely to 
>> accommodate all of these "convenience" constructs, and for good reason.
>>
>> I think Go has a place in highly performant concurrent data pipelines and 
>> transformations, but I am less optimistic it would ever play in the field 
>> currently dominated by Python and R, and possibly by Julia in the future. 
>> I am curious what I am missing in this line of thinking.
>>
>> Thanks,
>> --Leo
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to golan...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/golang-nuts/6009a15f-d944-449e-8bd7-e167b5e7d84d%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
> -- 
>
> *Michael T. Jones*
> *michae...@gmail.com*
>

