Re: [Rcpp-devel] `checkUserInterrupt()` might not be safe if Rcout is used.

2018-07-29 Thread Wush Wu
also consider in Rcpp wrapping our calls to Rprintf() in > R_ToplevelExec() and 'catching' and 'rethrowing' interrupts seen, but this > might have some unintended side-effects) > > Best, > Kevin > > On Thu, Jul 26, 2018 at 8:18 AM Wush Wu wrote: > >>

Re: [Rcpp-devel] `checkUserInterrupt()` might not be safe if Rcout is used.

2018-07-26 Thread Wush Wu
e > describing its use. > > (We might also consider in Rcpp wrapping our calls to Rprintf() in > R_ToplevelExec() and 'catching' and 'rethrowing' interrupts seen, but this > might have some unintended side-effects) > > Best, > Kevin > > On Thu, Jul 26,

[Rcpp-devel] `checkUserInterrupt()` might not be safe if Rcout is used.

2018-07-26 Thread Wush Wu
Hi all, I just learned about the function `checkUserInterrupt` and played with it in my package today. At first, everything was good. However, I sensed something was wrong when I interrupted my function and relaunched it. In my case, the number of OpenMP threads decreased to 1 after a user interruption. Ac

Re: [R] Failed to read UTF-16LE file on Windows

2015-10-28 Thread Wush Wu
n, header = TRUE, sep = ",") ``` Hope it will help someone in the future. Wush 2015-10-27 21:24 GMT+08:00 Wush Wu : > Dear all, > > I tried to run the following code on 3 different OS: > > ``` > download.file(" > https://raw.githubusercontent.com/wush978/DataSci

[R] Failed to read UTF-16LE file on Windows

2015-10-27 Thread Wush Wu
Dear all, I tried to run the following code on 3 different OS: ``` download.file("https://raw.githubusercontent.com/wush978/DataScienceAndR/course/RBasic-07-Loading-Dataset/orglist-100.CSV", destfile = "orglist-100.CSV") con <- file("orglist-100.CSV", encoding = "UTF-16LE") src <- readLines(con

Re: Frequency of RECEIVE << PING

2015-09-09 Thread Wush Wu
github.com/apache/incubator-zeppelin/commit/a9d686b40f0a5a37cdc775403866f711df70cf56 > > However, if you got 211 ping a second, it would mean you had 211 notebooks > opened, or that there is a bug (ping still sent after you move from > notebook for example) > > On Thu, Sep

Frequency of RECEIVE << PING

2015-09-09 Thread Wush Wu
Dear all, I saw about 211 log lines of "INFO NotebookServer: RECEIVE << PING" in 1 second. Is there a parameter to configure the frequency of PING? --- Currently we use Zeppelin at commit: b4b4f5521a57fd3b0902b5e3ab0e228c10b8bac5 Best, Wush

[jira] [Created] (ZEPPELIN-264) Viewing Task Queue of Zeppelin

2015-08-25 Thread Wush Wu (JIRA)
Wush Wu created ZEPPELIN-264: Summary: Viewing Task Queue of Zeppelin Key: ZEPPELIN-264 URL: https://issues.apache.org/jira/browse/ZEPPELIN-264 Project: Zeppelin Issue Type: Wish

Re: Checking the reason of pending

2015-08-25 Thread Wush Wu
g-mailing-list.75479.x6.nabble.com/why-zeppelin-SparkInterpreter-use-FIFOScheduler-td579.html > > Best, > moon > > > On Tue, Aug 25, 2015 at 3:48 AM Victor Manuel Garcia < > victor.gar...@beeva.com> wrote: > >> Hi Wush, >> >> by the moment you can n

Checking the reason of pending

2015-08-25 Thread Wush Wu
Dear all, Our team is using Zeppelin to submit ad hoc queries to our Spark cluster. Many people use Zeppelin at the same time. Sometimes we have to wait for each other, and a task stays pending for a long time. Is there a place to see the task queue in Zeppelin? Thanks, Wush

Re: Efficiency of leftOuterJoin a cassandra rdd

2015-07-15 Thread Wush Wu
nd join apis are super slow in spark. 100x slower than >> hadoop >> >> Sent from my iPhone >> >> > On 14-Jul-2015, at 10:59 PM, Wush Wu wrote: >> > >> > I don't understand. >> > >> > By the way, the `joinWithCassandraTable` does

Re: Efficiency of leftOuterJoin a cassandra rdd

2015-07-14 Thread Wush Wu
14, 2015 at 9:35 PM, Wush Wu wrote: >> >> Dear all, >> >> I have found a post discussing the same thing: >> >> https://groups.google.com/a/lists.datastax.com/forum/#!searchin/spark-connector-user/join/spark-connector-user/q3GotS-n0Wk/g-LPTteCEg0J >>

Re: Efficiency of leftOuterJoin a cassandra rdd

2015-07-14 Thread Wush Wu
github.com/datastax/spark-cassandra-connector/blob/v1.3.0-M2/doc/2_loading.md Wush 2015-07-15 12:15 GMT+08:00 Wush Wu : > Dear all, > > I am trying to join two RDDs, named rdd1 and rdd2. > > rdd1 is loaded from a textfile with about 33000 records. > > rdd2 is loaded from a table

Efficiency of leftOuterJoin a cassandra rdd

2015-07-14 Thread Wush Wu
Dear all, I am trying to join two RDDs, named rdd1 and rdd2. rdd1 is loaded from a text file with about 33000 records. rdd2 is loaded from a table in Cassandra which has about 3 billion records. I tried the following code: ```scala val rdd1 : (String, XXX) = sc.textFile(...).map(...) import o

Re: sqlContext fails to discover parquet partition

2015-06-23 Thread Wush Wu
rquetFile("hdfs:///bwlogs/beta/archive/EC.RTB/_year=2015/_month=06/_day=22") ``` Hope this might help others in the future. Best, Wush 2015-06-23 10:00 GMT+08:00 Wush Wu : > Dear all, > > Today we try to load parquet file with partition as instructed in < > https://spark.apa

sqlContext fails to discover parquet partition

2015-06-22 Thread Wush Wu
Dear all, Today we tried to load a partitioned Parquet file as instructed in < https://spark.apache.org/docs/1.3.1/sql-programming-guide.html#partition-discovery> : ``` sqlContext.parquetFile("hdfs:///bwlogs/beta/archive/EC.Buy/_year=2015/_month=06/_day=11") ``` but we got `java.lang.IllegalArgum

Re: matplotlib in pyspark

2015-06-10 Thread Wush Wu
; p.savefig(img, format='svg') > img.seek(0) > print "%html " + img.buf + "" > > Then it displays plot, like > %pyspark > show(plt) > > attaching a screenshot > > > Thanks, > moon > > On Tue, Jun 9, 2015 at 7:08 PM Wush

matplotlib in pyspark

2015-06-09 Thread Wush Wu
Dear all, I am trying the materials of an online spark course in edX ( https://www.edx.org/course/introduction-big-data-apache-spark-uc-berkeleyx-cs100-1x). They use the ipython notebook to teach, and I want to test if zeppelin is as good as ipython notebook when using pyspark. However, the matpl

Difference behaviour of DateType in SparkSQL between 1.2 and 1.3

2015-03-26 Thread Wush Wu
Dear all, I am trying to upgrade Spark from 1.2 to 1.3 and switch the existing API for creating SchemaRDD to DataFrame. After testing, I noticed that the following behavior has changed: ``` import java.sql.Date import com.bridgewell.SparkTestUtils import org.apache.spark.rdd.RDD import org.apach

Construct model matrix from SchemaRDD automatically

2015-03-05 Thread Wush Wu
Dear all, I am a new Spark user coming from R. After exploring SchemaRDD, I noticed that it is similar to data.frame. Is there a feature like `model.matrix` in R that converts a SchemaRDD to a model matrix automatically according to the column types, without explicitly converting the columns one by one? Thanks, Wush

Re: Global sequential access of elements in RDD

2015-02-27 Thread Wush Wu
",")) > > > > 2015-02-27 15:38 GMT+08:00 Wush Wu : > >> Dear all, >> >> I want to implement some sequential algorithm on RDD. >> >> For example: >> >> val conf = new SparkConf() >> conf.setMaster("local[2]"). >&

Global sequential access of elements in RDD

2015-02-26 Thread Wush Wu
Dear all, I want to implement a sequential algorithm on an RDD. For example: val conf = new SparkConf() conf.setMaster("local[2]"). setAppName("SequentialSuite") val sc = new SparkContext(conf) val rdd = sc. parallelize(Array(1, 3, 2, 7, 1, 4, 2, 5, 1, 8, 9), 2). sortBy(x => x, true) r

Re: Extract hour from Timestamp in Spark SQL

2015-02-16 Thread Wush Wu
Dear Cheng Hao, You are right! After using the HiveContext, the issue is solved. Thanks, Wush 2015-02-15 10:42 GMT+08:00 Cheng, Hao : > Are you using the SQLContext? I think the HiveContext is recommended. > > > > Cheng Hao > > > > *From:* Wush Wu [mailto:w

Extract hour from Timestamp in Spark SQL

2015-02-11 Thread Wush Wu
Dear all, I am new to Spark SQL and have no experience with Hive. I tried to use the built-in Hive function to extract the hour from a timestamp in Spark SQL, but got: "java.util.NoSuchElementException: key not found: hour" How should I extract the hour from a timestamp? And I am very confused abou

Re: [Rcpp-devel] Does uncaught c++ exception cause memory leak?

2014-12-29 Thread Wush Wu
Dear Romain, Dirk, and Qiang, Thanks for Romain's reminder of the automatically appended catch block, Dirk's explanation of leaks and exceptions, and Qiang's simple example. Now I understand that an uncaught exception in a C++ program might cause a memory leak. A SO Q&A ( http://stackoverflow.com/questions/2767789

Re: [Rcpp-devel] Does uncaught c++ exception cause memory leak?

2014-12-29 Thread Wush Wu
However, if I manually use a try/catch to enclose the exception, then valgrind detects no error. It seems that raising an R error causes the memory error; is that right? 2014-12-30 1:42 GMT+08:00 Romain Francois : > includes ___ Rcpp-devel mailing list Rcpp-d

[Rcpp-devel] Does uncaught c++ exception cause memory leak?

2014-12-29 Thread Wush Wu
Dear all, I have a question, which is described in the subject. I discovered it when I used valgrind to check my package. Because I am not familiar with valgrind, please help me verify whether this is a bug in Rcpp or not. Here is a minimal reproducible example with docker and the log of val

Re: [R] using inspect with a TermDocumentMatrix to convert to a data frame

2014-12-04 Thread Wush Wu
Dear Erin, For the issue of printing a big data.frame, you could define a customized `print.data.frame` in the user environment to prevent R from printing all the data. For example: ```r print.data.frame <- function(df) { base::print.data.frame(head(df)) cat("===\n") base::print.data.frame(tail(df)

Re: [R] Compressing code help in a loop

2014-11-10 Thread Wush Wu
Dear Francesca, Is this what you want? ```r index <- c(406, 107, 207, 307, 407, 108, 208, 308, 408, 109, 209, 309, 409, 110, 210, 310, 410, 111, 211) p_m <- match( p_int$p_made, index) dim(p_m) <- c(dim(p_int)[1],1) ``` Best, Wush PhD Student Graduate Institute of Electrical Engineering, Nationa

[R] Compression variable in memory with gzcon

2014-11-09 Thread Wush Wu
Hello, I cannot figure out how to compress a raw vector in memory in a gzip format that is compatible with `gzfile`. (`memCompress` is not compatible with `gzfile`.) I modified the example in `rawConnection`, but it does not work: ``` > zz <- gzcon(rawConnection(raw(0), "r+b")) # start with em

[R] Matrix::sparse.model.matrix produces inconsistent result

2014-10-26 Thread Wush Wu
Hi, I noticed that Matrix::sparse.model.matrix produces an inconsistent matrix when the contrasts are given as a sparse matrix. Here is a minimal example: ```r library(Matrix) class(CO2[[1]]) <- class(CO2[[2]]) <- "factor" get_contr <- function(is_sparse) { contr <- list() contr[["Plant"]] <- contrast

[R] Looking for Feature Hashing

2014-10-26 Thread Wush Wu
x quickly without the overhead of constructing the index. The probability of "collision" might be small if we pick a good hashing function. For details, please see http://en.wikipedia.org/wiki/Feature_hashing Best, Wush Wu PhD Student Graduate Institute of Electrical Engineering, National Ta

Re: Using String Dataset for Logistic Regression

2014-06-02 Thread Wush Wu
Dear all, Does Spark support sparse matrices/vectors for LR now? Best, Wush On 2014/6/2 at 3:19 PM, "praveshjain1991" wrote: > Thank you for your replies. I've now been using integer datasets but ran > into > another issue. > > > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-not-proce

Recommended way to develop spark application with both java and python

2014-04-07 Thread Wush Wu
Dear all, We have a spark 0.8.1 cluster on mesos 0.15. Some of my colleagues are familiar with Python, but some features are developed in Java. I am looking for a way to integrate Java and Python on Spark. I noticed that the initialization of pyspark does not include a field to distribute ja

Re: Run spark on mesos remotely

2014-03-28 Thread Wush Wu
the problem of hostname resolution. On the other hand, is there a way to set the system property `spark.hostPort` so that I could export `192.168.xx.xx:45678` to `spark.hostPort`? Thanks. 2014-03-28 9:32 GMT+08:00 Wush Wu : > Dear Rustagi, > > Thanks for your response. > > As far as

Re: Run spark on mesos remotely

2014-03-27 Thread Wush Wu
to the master > cluster as the driver with DAG scheduler runs on that machine. > Regards > Mayur > > Mayur Rustagi > Ph: +1 (760) 203 3257 > http://www.sigmoidanalytics.com > @mayur_rustagi <https://twitter.com/mayur_rustagi> > > > > On Thu, Mar 27, 2014 at

Run spark on mesos remotely

2014-03-27 Thread Wush Wu
Dear all, We have a spark 0.8.1 cluster on mesos 0.15. It works if I submit the job from the master of mesos. That is to say, I spawn the spark shell or launch the scala application on the master of mesos. However, when I submit the job from another machine, the job is lost. The logs show that

Re: [Rcpp-devel] Copyright Issue of using 3rd party C/C++ library in R package

2013-09-08 Thread Wush Wu
Dear Dirk, About hiredis, I am trying to improve the efficiency of rredis with Bryan, the maintainer of rredis now. Even though we might not need hiredis, the copyright issue will still bother me in the future. Note that your example is interesting, because it shows that the performance of rredis

[Rcpp-devel] Copyright Issue of using 3rd party C/C++ library in R package

2013-09-08 Thread Wush Wu
Hello, I am working on the R wrapper of hiredis, which is a C library released under the BSD license. Since the development package of libhiredis is not available on many OS such as Mac OS X, Solaris and Windows, I need to embed the source code of hiredis in the package to make it work. I submitted my

Re: [Rcpp-devel] Returning an element from a List with its attributes?

2013-08-20 Thread Wush Wu
Hi Tal, ``` cppFunction(' std::vector<RObject> temp2_fail(RObject x){ std::vector<RObject> lower; List x_list(wrap(x)); int n = x_list.size(); lower.push_back(x); return(lower) ; } ') ``` Is this what you want? 2013/8/20 Tal Galili > Dear list members, > > I would like to be able to

Re: [Rcpp-devel] Rcpp build trouble with strings

2013-08-19 Thread Wush Wu
"Rcpp:::LdFlags()"` OBJECTS = RcppExports.o rcpp_hello_world.o /home/wush/Test/TestLinking/inst/linkHello.so ``` Is that correct? Thanks, Wush 2013/8/20 Dirk Eddelbuettel > > On 20 August 2013 at 08:34, Wush Wu wrote: > | Maybe you need to set some flag in NAMESPACES to

Re: [Rcpp-devel] Rcpp build trouble with strings

2013-08-19 Thread Wush Wu
Maybe you need to set some flag in NAMESPACES to ask the package to load required libraries. Hope that helps. Hey again, I built the default Rcpp.package.skeleton rcpp_hello_world to take in strings and it worked fine. My problems seem to be from the libraries themselves. I recompiled the librarie

Re: [Rcpp-devel] Invitation to connect on LinkedIn

2013-08-08 Thread Wush Wu
Hi all, I am very sorry about this spam message. Please ignore it. Best, Wush On 2013/8/7 at 10:07 PM, "Wush Wu" wrote: > > [image: LinkedIn] > > > > > * From Wush Wu * > > PhD Student at Institute of Electrical Engineering, National Taiwan > Universi

[Rcpp-devel] Invitation to connect on LinkedIn

2013-08-07 Thread Wush Wu
LinkedIn rcpp-devel, I'd like to add you to my professional network on LinkedIn. - Wush Wush Wu PhD Student at Institute of Electrical Engineering, National Taiwan University Taiwan Confirm that you know Wush Wu: https://www.linkedin.com/e/1p7qzd-hk2m2aez-18/isd/156067

Re: [Rcpp-devel] Favourite Rcpp examples for newbies ?

2013-08-01 Thread Wush Wu
For me, there are two reasons to learn Rcpp: 1. Rcpp provides a straightforward way to speed up existing R scripts. I answered 2 questions about speeding up R code on StackOverflow with Rcpp: http://stackoverflow.com/questions/14495697/speeding-up-a-repeated-function-call/14495967#14495967 For this q

Re: [Rcpp-devel] What is the best practice to expose a C structure from 3rd party library into R?

2013-07-29 Thread Wush Wu
Thanks for your suggestion, Dirk. I'll try to rewrite Rhiredis based on your approach and provide a similar rredis interface. If everything is good, I'll try to make it on CRAN if it is possible. 2013/7/24 Dirk Eddelbuettel > > Hi Wush, > > After some more cleanup and consolidation, the packag

Re: [Rcpp-devel] What is the best practice to expose a C structure from 3rd party library into R?

2013-07-23 Thread Wush Wu
By the way, maybe you should look at two files in `inst`. The `gen_function.R` crawls http://redis.io/commands and generates the command according to the official redis manual. You could modify `template.R` to generate the helper functions dynamically based on the exposed Rcpp function. Hope it help

Re: [Rcpp-devel] What is the best practice to expose a C structure from 3rd party library into R?

2013-07-23 Thread Wush Wu
> | On 22 July 2013 at 13:19, rom...@r-enthusiasts.com wrote: > | | Le 2013-07-22 10:12, Wush Wu a écrit : > | | > I wrote a wrapper of hiredis, which is a minimalistic C client for > | | > the Redis database. Its name is `Rhiredis` and is much faster than > | | > rredis, an e

[Rcpp-devel] What is the best practice to expose a C structure from 3rd party library into R?

2013-07-22 Thread Wush Wu
Hi all, I wrote a wrapper of hiredis, which is a minimalistic C client for the Redis database. Its name is `Rhiredis` and it is much faster than rredis, an existing Redis client for R. Please see http://rpubs.com/wush978/rhiredis for details. Thanks to Rcpp, it is much easier to wrap a C library

Re: [Rcpp-devel] Help with accessing and manipulating List objects

2013-07-20 Thread Wush Wu
Hi Tal, > That is, say that we have x as a List, how do we go about fetching its attr? > I tried: > x[1].attr("type") > x.attr[1]("type") > But none seemed to have worked. Any suggestions there? Here is a script to show you how to access them: ```r x <- list(a = 1, b = 2, c = list(ca = 3, cb = 4

[Rpy] Embedded python in R

2013-03-13 Thread Wush Wu
Hi all, Recently I needed to do some work in R and Python simultaneously. Since I am an R and Rcpp user and new to Python, I prefer to embed Python into R. Unfortunately, RSPython does not work for me. On Ubuntu, I had some linking issues and resolved them according to http://stackoverflow.com/questi

[Rcpp-devel] RCMake

2013-02-09 Thread Wush Wu
alling cmake. Please read https://github.com/wush978/RCMake/blob/master/README.md for a further introduction. The package is still under testing, so please report bugs or give me some feedback at https://github.com/wush978/RCMake/issues Thanks Wush Wu