Hi Michael, thanks for your reply.

The current problem with the Data Science ecosystem (from data analysis all the way to GPU-based ML) is that it employs a whole stack of languages, from low-level ones like C (and sometimes assembler) all the way to scripting languages like Python or R. In parallel, there are Big Data tools, Spark/Scala being the most popular, that can process massive data sets, but only when the computation fits their computation model (MapReduce and friends). Working with pandas or scipy or R does not feel like programming any longer :-) You are calling into massive APIs written by other people in other languages.

There is a realization in the Deep Learning community that Python does not quite cut it. A proverbial saying is that "the worst thing about PyTorch is Python". Hence the attempts to create monolingual stacks like Julia or, more recently, Swift for TensorFlow, which is not quite monolingual but has the ambition to gradually eat into the territory of the C++-based TF kernels. The same goes for Scala/Spark: the JVM, with its high memory pressure, is not the best choice for near-bare-metal calculations (a possible opening for Go).
I am curious whether Go will toss its hat in the ring or will leave the field to other players.

--Leo

On Tuesday, July 16, 2019 at 3:31:12 PM UTC-4, Michael Jones wrote:
>
> Leo,
>
> R is implemented in C and FORTRAN plus R on top of that. SAS is in C (and some Go here and there) plus the SAS language on top of that. Mathematica is implemented in C/C++ with the "Wolfram Language" on top of that. PARI/GP is implemented in C plus some GP-language code. Macsyma, Maple, Octave, Python, ... follow this pattern too:
>
> 3 [glue-like meta-tools that combine various "full stack" tools]: Sage
> :
> 2 [interactive exploration environment with scripting]: many and various, including R, SAS, MMA, GP, Macsyma, Axiom, Maple, Python, ...
> :
> 1 [performant heavy duty computation in compiled language]: C/C++
> :
> 0 [ultra-performant kernels in C/Assembler/..]: GMP, LAPACK, BLAS, ATLAS, ...
>
> You say Data Science is an application domain where Level 2 features make sense, where they facilitate understanding by providing an interactive environment. The evidence supports you, though understand that none of your examples (or those in my expanded set) actually do much at that level: this is where the "convolve a with b" is specified, but the actual doing is lower, in Level 0 and 1, where Go-like compiled software in C, C++, or FORTRAN does the heavy lifting. (I make this point only to clarify what some people seem not to understand in blogs where they write "my Python giant matrix solver is just as fast as C/C++/Go, I don't see why C/C++/Go is not faster" or "I don't see advantage in compiled languages.")
>
> If Go has a place in interactive, interpretive data science it seems to me that it would be as the substrate language (Levels 0 and 1). Go certainly has a place in statistics, applied mathematics, and other realms related to data science if you want to include apps that do work and act on results--control systems, analysis tools, etc.
> But to create an interactive "play" space I'd (again, just me) be inclined to follow the PARI/GP model with a Go kind of PARI and a domain-friendly GP.
>
> The high-level GP (Mathematica, Maple, GP, SAS, ...) in the existing systems often seems to me to be weak, not designed as a first-class programming language but more like an endless accretion of script-enabling fixes and patches. I feel this especially in the way local variables are defined, which often feels brutish and awkward, but that extends to many subtleties. It is natural that it tends this way--developers were focused on the core and just needed "a way" to bind it all together. The successful projects span decades and unanticipated new application domains, so they have accumulated the most duct tape.
>
> Another goodness of this two-level scheme is that the top language can be "faulty" in ways that are comfortable. For example, think how many scalar variables you see in C/C++/FORTRAN/Go: "i := 3" is the bulk of variables. But in R, there are (at least when I last looked) no scalar variables(!), but you can get by with vectors of length 1. This would not do, generally, but for R, it may be perfect. The two-level strata design, of which PARI/GP is one of the best implementations, makes this kind of field-of-use tailoring work fine in practice. That's important: it is matching the language's exposed concepts to the problem domain.
>
> I don't see any of this as a weakness or strength of Go, or as something to address in the case of a REPL, because it's not Go that you'd want a REPL for, but instead something that knows about data, or Diophantine equations, or moon rocks, or whatever the domain may be and its natural forms of notation.
>
> Michael
>
> On Tue, Jul 16, 2019 at 10:18 AM Slonik Az <slon...@gmail.com> wrote:
>
>> Hi Gophers!
>> I was thinking of starting a Go project in the area of Data Science that would allow for convenient and easy concurrent data processing, but in the end decided against it, mainly for two reasons:
>>
>> (1) Almost all data science projects start with an exploratory data analysis of some sort. Unfortunately, Go does not have a REPL. The Go Playground is not a substitute, for it does not preserve state. On every iteration the Playground recompiles and relaunches the entire program, reads all the data anew, and performs all the calculations again. Not good for an interactive "rapid fire".
>> A REPL in a static AOT-compiled language is hard, yet Swift somehow managed to implement it.
>>
>> (2) Even if somebody implements an incremental Go compiler and provides a proper REPL, people will be longing for data analysis "at your fingertips", missing a rich pandas-like API, overloaded operators (Python style) and dynamic scoping (like in R). The minimalistic design of Go is unlikely to accommodate all of these "convenience" constructs, and for a good reason.
>>
>> I think Go has a place in highly performant concurrent data pipelines and transformations, but I am less optimistic it would ever play in the field dominated currently by Python and R and possibly by Julia in the future. I am curious what I am missing in this line of thinking.
>>
>> Thanks,
>> --Leo
>>
>> --
>> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/6009a15f-d944-449e-8bd7-e167b5e7d84d%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> *Michael T. jonesmichae...@gmail.com*