Re: [R] assignment by value or reference

2010-09-15 Thread Xiaobo Gu
Thanks.



Xiaobo.Gu

>>-Original Message-
>>From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
>>Sent: Wednesday, September 15, 2010 5:06 PM
>>To: Xiaobo Gu
>>Cc: r-help@r-project.org; '顾小波'
>>Subject: Re: [R] assignment by value or reference
>>
>>See the R Language Definition manual. Since R knows about lazy
>>evaluation, it is sometimes neither by reference nor by value.
>>If you want to think binary, then "by value" fits better than "by
>>reference".
>>
>>Uwe Ligges
>>
>>
>>
>>On 05.09.2010 17:19, Xiaobo Gu wrote:
>>> Hi Team,
>>>
>>>   Can you please tell me the rules of assignment in R: is it by value or
>>> by reference?
>>>
>>> From my roughly 3 months of part-time experience with R, it seems that most
>>> of the time it is by value, especially for function parameters and return
>>> values; that it is by reference when referencing sub-objects of container
>>> objects, such as elements of list objects and rows/columns of data frame
>>> objects; but that it is by value when referencing the smallest unit of a
>>> container object, such as a cell of a data frame.
>>>
>>>
>>>
>>>
>>>
>>> Xiaobo.Gu
>>>
>>>
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
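
The rule of thumb in the reply above ("by value" fits better) can be checked
directly. The following is a minimal sketch in base R; the names add_one, lst
and bump are made up for illustration. It suggests that function arguments and
extracted list elements behave like copies once they are modified, while
environments are the usual exception with reference-like behaviour.

```r
# A function that modifies its argument works on a local copy.
add_one <- function(v) {
  v[1] <- v[1] + 1
  v
}
x <- c(1, 2, 3)
y <- add_one(x)        # x is unchanged, y holds the modified copy

# Extracting a list element and changing it leaves the list untouched.
lst <- list(a = 1:3)
el <- lst$a
el[1] <- 100L          # lst$a[1] is still 1

# Environments are not copied on modification.
e <- new.env()
e$count <- 0
bump <- function(env) env$count <- env$count + 1
bump(e)                # e$count is now 1 in the caller as well
```

Internally R delays the copy until a modification happens, which is probably
why it can look like "by reference" as long as the object is only read.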



[R] ff objects saving problem

2010-11-10 Thread Xiaobo Gu
Hi,
I am running the examples on page 70 of the ff package documentation, but they 
fail with the following errors:
> cat("let's create some ff objects\n")
let's create some ff objects
> n <- 8e3
> a <- ff(sample(n, n, TRUE), vmode="integer", length=n, filename="d:/tmp/a.ff")
> b <- ff(sample(255, n, TRUE), vmode="ubyte", length=n, filename="d:/tmp/b.ff")
> x <- ff(sample(255, n, TRUE), vmode="ubyte", length=n, filename="d:/tmp/x.ff")
> y <- ff(sample(255, n, TRUE), vmode="ubyte", length=n, filename="d:/tmp/y.ff")
> z <- ff(sample(255, n, TRUE), vmode="ubyte", length=n, filename="d:/tmp/z.ff")
> df <- ffdf(x=x, y=y, z=z)
> rm(x,y,z)
> cat("save some of them with shorter relative pathnames ...\n")
save some of them with shorter relative pathnames ...
> ffsave(a, b, file="d:/tmp/y", rootpath="d:/tmp")
Error in system(cmd, input = list, intern = TRUE) : 'zip' not found
> str(ffinfo("d:/tmp/y"))
Error in system(cmd, intern = TRUE) : 'unzip' not found

Does this mean I have to install special software for the zip and unzip 
operations?
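
The error messages suggest that ffsave calls external 'zip' and 'unzip'
programs. A quick way to check whether R can find them is Sys.which() from
base R. This is only a diagnostic sketch; the variable names are illustrative.

```r
# Look up the external utilities on the search path of the R process.
tools <- Sys.which(c("zip", "unzip"))

# Sys.which() returns "" for programs it cannot find.
missing_tools <- names(tools)[!nzchar(tools)]

if (length(missing_tools) > 0) {
  message("Not found on the PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("Found: ", paste(tools, collapse = ", "))
}
```

If either one is reported as missing, installing the utility and adding its
directory to the PATH before starting R would be the first thing to try.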


Another question is whether ff always creates physical files for vectors. I 
found that when an ffdf object is created, the number of physical files created 
is equal to the number of data frame columns.


Xiaobo.Gu



Re: [R] ff objects saving problem

2010-11-10 Thread Xiaobo Gu
Hi Jens,
I have installed Rtools on my Win XP SP3 notebook, but there are still
errors when running the examples:


> cat("let's create some ff objects\n")
let's create some ff objects
> n <- 8e3
> a <- ff(sample(n, n, TRUE), vmode="integer", length=n, filename="d:/tmp/a.ff")
> b <- ff(sample(255, n, TRUE), vmode="ubyte", length=n, filename="d:/tmp/b.ff")
> x <- ff(sample(255, n, TRUE), vmode="ubyte", length=n, filename="d:/tmp/x.ff")
> y <- ff(sample(255, n, TRUE), vmode="ubyte", length=n, filename="d:/tmp/y.ff")
> z <- ff(sample(255, n, TRUE), vmode="ubyte", length=n, filename="d:/tmp/z.ff")
> df <- ffdf(x=x, y=y, z=z)
> rm(x,y,z)
> cat("save all of them\n")
save all of them
> ffsave.image("d:/tmp/x")
[1] "d:/tmp/a.ff" "d:/tmp/b.ff" "d:/tmp/x.ff" "d:/tmp/y.ff" "d:/tmp/z.ff"
Error in ffsave(list = ls(envir = .GlobalEnv, all.names = TRUE), file
= outfile,  :
  the previous files do not match the rootpath (case sensitive)
> str(ffinfo("d:/tmp/x"))
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
In addition: Warning messages:
1: In file.remove(c(imgfile, zipfile)) :
  cannot remove file 'd:/tmp/xTmp.ffData', reason 'No such file or directory'
2: In readChar(con, 5L, useBytes = TRUE) :
  cannot open compressed file 'd:/tmp/x.RData', probable reason 'No
such file or directory'
> cat("save some of them with shorter relative pathnames ...\n")
save some of them with shorter relative pathnames ...
> ffsave(a, b, file="d:/tmp/y", rootpath="d:/tmp")
[1] """zip error: Nothing
to do! (d:/tmp/y.ffData)"
> str(ffinfo("d:/tmp/y"))
List of 3
 $ RData   :List of 2
  ..$ a: chr "d:/tmp/a.ff"
  ..$ b: chr "d:/tmp/b.ff"
 $ ffData  :List of 1
  ..$ zipinfo:  cannot find either d:/tmp/y.ffData or d:/tmp: chr
"y.ffData.zip."
 $ rootpath: chr "d:/tmp/"
>


2010/11/11 "Jens Oehlschlägel" :
> Xiaobo,
>
> You do indeed need external 'zip' and 'unzip' utilities in the path. Citing from 
> ffsave's help: "using an external zip utility, e.g. for windows in Rtools on 
> [http://www.murdoch-sutherland.com/Rtools/]".
>
> Please note that the mentioned utilities have a 4 GB limit for the zip file, 
> AFAIK. For the next release I will look for a way to get rid of this limit 
> and also of the inconsistencies in upper/lower-case spelling of drive 
> letters, which can cause ffsave to fail. Note that, even without ffsave, ff 
> objects can be made permanent simply by creating them with 'filename' or 
> 'pattern' outside of fftempdir and saving the R-side ff object with the usual 
> 'save' or 'save.image' function.
>
> In a new R session, after 'library(ff)' and 'load', you have access again, 
> provided your ff files are still in the same location.
> And yes, each column of a ffdf dataframe is stored as a separate ff file.
>
> Jens Oehlschlägel
>



[R] How to remove directory?

2010-11-12 Thread Xiaobo Gu
Hi,
It seems file.remove can't remove directories, regardless of whether
they are empty. How can I delete a directory in R?

> file.remove("D:/ffdata")
[1] FALSE
Warning message:
In file.remove("D:/ffdata") :
  cannot remove file 'D:/ffdata', reason 'Permission denied'
Thanks.
Xiaobo Gu
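
One base R option for this is unlink() with recursive = TRUE. The sketch
below uses a throwaway directory under tempdir() (the name ffdata_demo is
made up); for the case above the call would presumably be
unlink("D:/ffdata", recursive = TRUE).

```r
# Create a temporary directory with one file in it.
d <- file.path(tempdir(), "ffdata_demo")
dir.create(d)
file.create(file.path(d, "a.ff"))

# unlink() only removes directories when recursive = TRUE.
status <- unlink(d, recursive = TRUE)   # 0 for success, 1 for failure
still_there <- file.exists(d)
```

If the directory still cannot be removed, one thing to check is whether files
in it are still open, for example ff files that have not been closed.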



[R] R encoding question

2010-11-24 Thread Xiaobo Gu
Hi,
 I am using RpgSQL to retrieve data from a PostgreSQL database which
uses UTF8 encoding, and I have some Chinese characters in one of the
columns. Unfortunately R can't display them correctly.

> df <- dbGetQuery(con, "select * from test")
> df
  a    b
1 1 椤惧皬娉\xa2
2 2   瑕冩€\xa1

I see the following option. Do I need to change the encoding option to
display the text correctly? If so, how should I set it in my case?

$encoding
[1] "native.enc"
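
The garbled output looks like UTF-8 bytes displayed in code page 936 (GBK):
the trailing \xa2 matches the last byte of a three-byte UTF-8 sequence. If
that is what happens (an assumption, since it depends on how the driver
returns strings), declaring the encoding of the column may be enough. The
sketch below builds the bytes by hand so that it does not depend on the
encoding of this message.

```r
# UTF-8 bytes of the three characters U+987E, U+5C0F, U+6CE2.
b <- rawToChar(as.raw(c(0xe9, 0xa1, 0xbe, 0xe5, 0xb0, 0x8f, 0xe6, 0xb3, 0xa2)))

# Declare (not convert) the encoding; the bytes themselves stay the same.
Encoding(b) <- "UTF-8"

n_chars <- nchar(b, type = "chars")   # number of characters, not bytes
code_points <- utf8ToInt(b)           # Unicode code points of the string

# For a data frame column the same idea would be:
# Encoding(df$b) <- "UTF-8"
```

Whether the console can then display the characters still depends on the
fonts and the locale of the session.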

Thanks,
Xiaobo Gu



Re: [R] FW: R encoding question

2010-11-29 Thread Xiaobo Gu
But Sys.setlocale tries to change the setting for the whole OS. I want only 
R to use a specified encoding. How can I do this?


Xiaobo.Gu


>>-Original Message-
>>From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
>>Sent: Monday, November 29, 2010 8:57 PM
>>To: Xiaobo Gu
>>Subject: Re: FW: R encoding question
>>
>>I have never played with encodings myself.  Suggest you read the postgresql
>>documentation and try different arguments to Sys.setlocale in R.  You
>>probably have to do that before you initiate the database since it might not
>>have any effect afterwards. I am not sure this is the problem but it's worth a 
>>try.
>>Here are some examples.
>>
>>Sys.setlocale(locale="C")
>>Sys.setlocale(locale="en_NZ.iso88591")
>>Sys.setlocale("LC_ALL", "en_US")
>>Sys.setlocale("LC_TIME", "English")
>>Sys.setlocale('LC_ALL','fr_FR')
>>Sys.setenv(LANGUAGE="EN");Sys.setlocale("LC_ALL","EN")
>>Sys.setenv(LANGUAGE="FR");Sys.setlocale("LC_ALL","FR")
>>
>>
>>2010/11/29 Xiaobo Gu :
>>> Hi,
>>>Can you help with this.
>>>
>>> Regards,
>>>
>>> Xiaobo Gu
>>>
>>>
>>> -Original Message-
>>> From: Xiaobo Gu [mailto:guxiaobo1...@gmail.com]
>>> Sent: Wednesday, November 24, 2010 10:19 PM
>>> To: r-help@r-project.org
>>> Subject: R encoding question
>>>
>>> Hi,
>>>  I am using RpgSQL to retrieve data from a PostgreSQL database which
>>> uses UTF8 encoding, and I have some Chinese characters in one of the
>>> columns. Unfortunately R can't display them correctly.
>>>
>>>> df <- dbGetQuery(con, "select * from test")
>>>> df
>>>   a    b
>>> 1 1 椤惧皬娉\xa2
>>> 2 2   瑕冩€\xa1
>>>
>>> I see the following option, do I need to change the encoding option to
>>> show the corresponding texts? In my case how to set?
>>>
>>> $encoding
>>> [1] "native.enc"
>>>
>>> Thanks,
>>> Xiaobo Gu
>>>
>>>
>>
>>
>>
>>--
>>Statistics & Software Consulting
>>GKX Group, GKX Associates Inc.
>>tel: 1-877-GKX-GROUP
>>email: ggrothendieck at gmail.com



Re: [R] FW: R encoding question

2010-11-30 Thread Xiaobo Gu
Do you know what values I should pass for the category and locale parameters in 
order to use UTF-8 encoding in a Chinese Windows XP SP3 environment?

Sys.setlocale(category = "LC_ALL", locale = "")
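
For reference, a small sketch of changing one locale category for the
current R process only and restoring it afterwards. The "C" locale is used
because it exists on every platform; names such as "Chinese" or "zh_CN.UTF-8"
are OS dependent and may be rejected. The variable names are illustrative.

```r
# Remember the current setting so it can be restored.
old_time <- Sys.getlocale("LC_TIME")

# Switch only the date/time category, and only inside this R session.
invisible(Sys.setlocale("LC_TIME", "C"))
in_c <- format(as.Date("2010-11-30"), "%B")   # month name in the C locale

# Restore the previous setting.
invisible(Sys.setlocale("LC_TIME", old_time))
restored <- identical(Sys.getlocale("LC_TIME"), old_time)
```

An empty string, as in the call above, asks for the default locale of the
user's environment rather than a specific encoding.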



Xiaobo Gu

>>-Original Message-
>>From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
>>Sent: Monday, November 29, 2010 9:27 PM
>>To: Xiaobo Gu
>>Subject: Re: FW: R encoding question
>>
>>I believe the R Sys.setlocale function only changes it in R, not the entire
>>OS. For example, here we set it to German but the messages from the OS still
>>come out as English:
>>
>>> Sys.getlocale()
>>[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>>States.1252;LC_MONETARY=English_United
>>States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
>>> Sys.setlocale(locale="German")
>>[1]
>>"LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC
>>_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Ge
>>rmany.1252"
>>> shell("date")
>>The current date is: 29/11/2010
>>Enter the new date: (dd-mm-yy) Warning message:
>>In shell("date") : 'date' execution failed with error code 1
>>
>>
>>
>>On Mon, Nov 29, 2010 at 8:18 AM, Xiaobo Gu 
>>wrote:
>>> But Sys.setlocale tries to change the option of the whole OS, I just want
>>> only R to use a specified encoding, how can I do this.
>>>
>>>
>>> Xiaobo.Gu
>>>
>>>
>>>
>>>
>>
>>
>>
>>--
>>Statistics & Software Consulting
>>GKX Group, GKX Associates Inc.
>>tel: 1-877-GKX-GROUP
>>email: ggrothendieck at gmail.com



Re: [R] FW: R encoding question

2010-11-30 Thread Xiaobo Gu
But the locale "Chinese" uses GBK encoding by default. How can I use UTF-8
encoding instead?

I have tried the following, but neither call works.

Sys.setlocale(locale = "zh_CN.UTF-8") 

Sys.setlocale(category = "LC_CTYPE", locale= "zh_CN.UTF-8")


Xiaobo Gu


>>-Original Message-
>>From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
>>Sent: Tuesday, November 30, 2010 8:57 PM
>>To: Xiaobo Gu
>>Cc: r-help@r-project.org
>>Subject: Re: FW: R encoding question
>>
>>On Tue, Nov 30, 2010 at 7:30 AM, Xiaobo Gu wrote:
>>> Do you know what values should I set to the category and locale parameters
>>> in order to use UTF-8 encoding in a Chinese Windows XP SP3 environment?
>>>
>>> Sys.setlocale(category = "LC_ALL", locale = "")
>>>
>>
>>It's OS dependent but you could try:
>>
>>Sys.setlocale(locale = "Chinese")
>>
>>and
>>
>>Sys.setlocale(locale = "")
>>
>>to set it back.
>>
>>
>>--
>>Statistics & Software Consulting
>>GKX Group, GKX Associates Inc.
>>tel: 1-877-GKX-GROUP
>>email: ggrothendieck at gmail.com



Re: [R] FW: R encoding question

2010-11-30 Thread Xiaobo Gu
It does not work,

> Sys.setlocale(locale = "Chinese_China")
[1] "LC_COLLATE=Chinese_People's Republic of
China.936;LC_CTYPE=Chinese_People's Republic of
China.936;LC_MONETARY=Chinese_People's Republic of
China.936;LC_NUMERIC=C;LC_TIME=Chinese_People's Republic of China.936"
> library(RpgSQL)
..
> driver <- pgSQL(classPath="D:/rtemp/postgresql-8.4-702.jdbc4.zip")
> con <- dbConnect(driver, dbname="demo", host="localhost", user="postgres",
+                  password="postgres", port=5432)
> df <- dbGetQuery(con, "select * from test")
> df
  a    b
1 1 椤惧皬娉\xa2
2 2   瑕冩€\xa1
>
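
The ".936" suffix in the Sys.setlocale output is the Windows code page for
GBK rather than UTF-8, which would explain why the display does not change.
To see what the running session actually uses, l10n_info() from base R can
help. This is only a diagnostic sketch; as far as I know, R on Windows XP
cannot run in a UTF-8 locale, so it does not fix the display by itself.

```r
# Report the character-set properties of the current R session.
info <- l10n_info()
is_utf8 <- info[["UTF-8"]]   # TRUE only in a UTF-8 locale
is_mbcs <- info[["MBCS"]]    # TRUE for multibyte locales such as GBK or UTF-8

cat("UTF-8 locale:", is_utf8, " multibyte:", is_mbcs, "\n")

# On Windows the list also carries the active code page (e.g. 936 for GBK).
if (!is.null(info$codepage)) cat("Windows code page:", info$codepage, "\n")
```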

On Tue, Nov 30, 2010 at 10:25 PM, Gabor Grothendieck
 wrote:
> On Tue, Nov 30, 2010 at 9:09 AM, Xiaobo Gu  wrote:
>> But locale "Chinese" will use GBK encoding by default, how to use UTF-8
>> encoding
>>
>> I have tried the following, neither of them works.
>>
>> Sys.setlocale(locale = "zh_CN.UTF-8")
>>
>> Sys.setlocale(category = "LC_CTYPE", locale= "zh_CN.UTF-8")
>>
>>
>
> Try this:
>
>> Sys.setlocale(locale = "Chinese_China")
> [1] "LC_COLLATE=Chinese (Simplified)_People's Republic of
> China.936;LC_CTYPE=Chinese (Simplified)_People's Republic of
> China.936;LC_MONETARY=Chinese (Simplified)_People's Republic of
> China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_People's Republic
> of China.936"
>
> Although it does not specifically indicate UTF in the output if we now
> search for Chinese_China.936 we do find the link below which suggests
> that it is a UTF locale:
>
> http://docs.moodle.org/en/Table_of_locales
>
>
>
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>



[R] How to specify ff object filepaths when reading a CSV file into a ff data frame.

2010-12-24 Thread Xiaobo Gu
Hi,
The read.csv.ffdf function in package ff creates the physical files of
the ff objects in the default directory. I am trying to have the files
created in paths that the user specifies, and I think the key is to use
the asffdf_args parameter.
I have a test CSV file named D:\rtemp\fftest.csv with the following
content:

col1,col2,col3
1,"amber",2.4
2,"linda",4.5

I tried the following code, hoping ff would create the physical files
for col1, col2 and col3 as D:/a.f, D:/b.f and D:/c.f respectively:

> fdf <- read.csv.ffdf(file="D:/rtemp/fftest.csv", asffdf_args = list(
+   col_args = c(list(filename="D:/a.f"), list(filename="D:/b.f"),
+                list(filename="D:/c.f"))))

The error message is:
Error in as.ff.default(1:2, vmode = NULL, filename = "D:/a.f",
filename = "D:/b.f",  :
  formal argument "filename" matched by multiple actual arguments
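
This first error can be reproduced without ff. c() on several lists returns
one flat list, so col_args ends up with three elements that are all named
"filename", and they are then passed to a single call. The sketch below
imitates that with a made-up function f; how ff forwards col_args internally
is an assumption based on the error message.

```r
# Combining the lists gives ONE list with repeated names, not one list per column.
col_args <- c(list(filename = "D:/a.f"), list(filename = "D:/b.f"))
arg_names <- names(col_args)          # "filename" "filename"

# A stand-in for a function that accepts a single 'filename' argument.
f <- function(x, filename = NULL) filename

# Passing the flattened list triggers the same kind of error.
res <- tryCatch(
  do.call(f, c(list(1:2), col_args)),
  error = function(e) e
)
```

This would also explain why the same syntax works for a CSV file with a
single column: with only one list there is no duplicated argument name.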

I also tried the following:

> fdf <- read.csv.ffdf(file="D:/rtemp/fftest.csv", asffdf_args = list(
+   col_args = list(filename=c("D:/a.f","D:/b.f","D:/c.f"))))
Error in ff(initdata = initdata, length = length, levels = levels,
ordered = ordered,  :
  bad argument initdata for existing file; initializing existing file is invalid
In addition: Warning messages:
1: In if (file.exists(filename)) { :
  the condition has length > 1 and only the first element will be used
2: In if (file.exists(filename)) { :
  the condition has length > 1 and only the first element will be used
3: In if (file.access(filename, 4) == -1) { :
  the condition has length > 1 and only the first element will be used
4: In if (file.access(filename, 2) == -1) { :
  the condition has length > 1 and only the first element will be used
5: In if (is.na(filesize)) stop("unable to open file") :
  the condition has length > 1 and only the first element will be used

My questions are:
1. What's the datatype of the col_args parameter of the as.ffdf function
2. If I can make layout of the asffdf_args parameter correct, how can
I set the exact filenames for each column of the ff data frame.

Regards,

Xiaobo Gu



Re: [R] How to specify ff object filepaths when reading a CSV file into a ff data frame.

2010-12-26 Thread Xiaobo Gu
Hi, I have done another simple test: I tried both syntaxes against a
CSV file with only one column, and both succeed.

> fdf <- read.csv.ffdf(file="D:/rtemp/fftest2.csv", asffdf_args = list(
+   col_args = list(filename=c("F:/a.f"))))
> fdf
ffdf (all open) dim=c(2,1), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
     PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol PhysicalIsOpen
col1         col1      integer       integer FALSE           FALSE            FALSE                 1                1               1           TRUE
ffdf data
  col1
1    1
2    2


> fdf <- read.csv.ffdf(file="D:/rtemp/fftest2.csv", asffdf_args = list(
+   col_args = c(list(filename="D:/a2.f"))))
> fdf
ffdf (all open) dim=c(2,1), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
     PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol PhysicalIsOpen
col1         col1      integer       integer FALSE           FALSE            FALSE                 1                1               1           TRUE
ffdf data
  col1
1    1
2    2
>

Regards,

Xiaobo Gu






Re: [R] How to create ff objects from database connection

2010-08-04 Thread Xiaobo Gu
Hi,

 I am sorry, but I can't determine which R-sig list is the right place to post 
questions about package ff.

 

Now I have made a little progress with this:

 

read.dbres.ffdf <- function(res){
  data1 <- fetch(res, 0)
  if (nrow(data1) == 0){
    return (NULL)
  }
  ffd <- as.ffdf(data1)
  N <- nrow(ffd)

  while(!dbHasCompleted(res)){
    data1 <- fetch(res, 0)
    n <- nrow(data1)
    nrow(ffd) <- N+n
    ffd[hi(N+1, N+n),] <- data1
    N <- N+n
  }
  return (ffd)
}

 

This function works well with result sets that contain only integer and numeric 
columns, but it has problems with result sets that contain character, date, time 
and timestamp columns. The error message is:

>  rs <- dbSendQuery(con, "select a, b from rtest")
> ffd <- read.dbres.ffdf(rs)
Error in ff(initdata = initdata, length = length, levels = levels, ordered = 
ordered,  : 
  unknown ffmode
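
A likely cause, stated as an assumption because the table definition is not
shown: ff has no storage mode for character vectors, so character columns
would have to be converted to factors before as.ffdf() sees them. The sketch
below shows only that conversion step in base R, on a made-up data frame
called chunk.

```r
# A stand-in for one fetched chunk with a character column.
chunk <- data.frame(a = 1:2, b = c("amber", "linda"),
                    stringsAsFactors = FALSE)

# Find the character columns and turn each of them into a factor.
is_chr <- vapply(chunk, is.character, logical(1))
chunk[is_chr] <- lapply(chunk[is_chr], factor)

col_classes <- vapply(chunk, function(col) class(col)[1], character(1))
```

When data arrives in several chunks, later chunks can contain values that
were not among the factor levels of the first chunk, so the levels would need
to be handled consistently across chunks.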

 

 

Xiaobo.Gu

 

From: 顾小波 [mailto:guxiaobo1...@gmail.com] 
Sent: Sunday, August 01, 2010 10:04 PM
To: 'r-help@r-project.org'
Subject: How to create ff objects from database connection 

 

Hi

Does anybody know how to create ff objects from data read from stream 
objects, such as data read from a PostgreSQL database through RPostgreSQL? We 
could save the data to a CSV file with external tools and then read it with the 
CSV readers, but that requires one more read and write of the data, which has 
a high I/O cost for large datasets.

 

 

Xiaobo.Gu

 




[R] assignment by value or reference

2010-09-05 Thread Xiaobo Gu
Hi Team,

 Can you please tell me the rules of assignment in R: is it by value or by 
reference?

From my roughly 3 months of part-time experience with R, it seems that most 
of the time it is by value, especially for function parameters and return 
values; that it is by reference when referencing sub-objects of container 
objects, such as elements of list objects and rows/columns of data frame 
objects; but that it is by value when referencing the smallest unit of a 
container object, such as a cell of a data frame.

 

 

Xiaobo.Gu

 




[R] Does anyone running the latest version of R on IBM AIX 5.3?

2011-06-03 Thread Xiaobo Gu
Because most of R's processing is single-threaded, I think
computers with a higher CPU clock speed, such as IBM P servers, will
achieve better performance.

Regards,

Xiaobo Gu



[R] How to assess the accuracy of fitted logistic regression using glm

2011-06-06 Thread Xiaobo Gu
Hi,

I am using glm with family = binomial to do binary logistic
regression, but how can I assess the accuracy of the fitted model? The
summary method prints a lot of information about the returned
object, such as the coefficients. Statistics is not my speciality,
so can you share some rules of thumb for examining the fitted model from
a practical perspective?
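
One practical starting point is to compare predicted classes with the
observed ones. The sketch below uses simulated data (all names and numbers
are made up) and a cut-off of 0.5, which is itself an assumption that may
not suit every application.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.5 * x))   # simulated binary outcome
d <- data.frame(x = x, y = y)

fit <- glm(y ~ x, family = binomial, data = d)

# Predicted probabilities, then classes with a 0.5 threshold.
p_hat <- predict(fit, type = "response")
pred <- as.integer(p_hat > 0.5)

# Confusion table and the share of correctly classified cases.
conf <- table(observed = d$y, predicted = pred)
accuracy <- mean(pred == d$y)
```

Accuracy computed on the same data that was used for fitting tends to be
optimistic, so a held-out sample or cross-validation gives a fairer picture.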

Regards,

Xiaobo Gu



Re: [R] How to assess the accuracy of fitted logistic regression using glm

2011-06-09 Thread Xiaobo Gu
Hi Professor Brian,

Thanks for your reply.

I think there are many statisticians here, and the question is somewhat R
related, so I hope someone can help me.

I have done a simple test using a sample CSV data set, which I can post if needed.

donut <- read.csv(file="D:/donut.csv", header = TRUE);
donut[["color"]] <- as.factor(donut[["color"]])
donut[["shape"]] <- as.factor(donut[["shape"]])
donut[["k"]] <- as.factor(donut[["k"]])
donut[["k0"]] <- as.factor(donut[["k0"]])
donut[["bias"]] <- as.factor(donut[["bias"]])

lr <- glm(color ~ shape + x + y, family = binomial, data = donut);
summary(lr)

Call:
glm(formula = color ~ shape + x + y, family = binomial, data = donut)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.1079  -0.9476   0.5086   0.7518   1.4079

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.53010    1.65500   1.529   0.1263
shape22      0.05628    1.54990   0.036   0.9710
shape23     -0.74568    1.44813  -0.515   0.6066
shape24     -2.61896    1.38016  -1.898   0.0578 .
shape25     -2.07648    1.32818  -1.563   0.1180
x           -0.45885    1.52863  -0.300   0.7640
y           -0.59311    1.46999  -0.403   0.6866
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 50.446  on 39  degrees of freedom
Residual deviance: 42.473  on 33  degrees of freedom
AIC: 56.473

Number of Fisher Scoring iterations: 4

In the Coefficients section, is Pr(>|z|) the p-value for that
variable? I also have a few other questions:
1. How do I determine the predictive power of each variable?
2. How do I determine the overall performance of the fitted model, and what is the
difference between "Deviance Residuals" and "Residual deviance"?
3. How do I compare "Null deviance" and "Residual deviance"?
4. What does AIC mean, and how is this measure used?
5. What does the Signif. codes section mean?
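
Some of these can be worked through with the numbers printed above. A common
way to compare the two deviances is a likelihood ratio test: their difference
is compared with a chi-squared distribution whose degrees of freedom are the
difference of the two df values. The sketch below does that arithmetic and
then uses a small simulated fit (made-up data) to show how the deviance
residuals relate to the residual deviance and to AIC for 0/1 responses.

```r
# Numbers copied from the summary output above.
null_dev  <- 50.446; null_df  <- 39
resid_dev <- 42.473; resid_df <- 33

# Likelihood ratio test of the fitted model against the intercept-only model.
lr_stat <- null_dev - resid_dev
lr_df   <- null_df - resid_df
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# For a 0/1 response, AIC = residual deviance + 2 * number of coefficients.
n_coef <- 7
aic <- resid_dev + 2 * n_coef

# Check the same relations on a small simulated model.
set.seed(42)
d <- data.frame(x = rnorm(40))
d$y <- rbinom(40, 1, plogis(d$x))
fit <- glm(y ~ x, family = binomial, data = d)

# Residual deviance is the sum of the squared deviance residuals.
same_dev <- isTRUE(all.equal(sum(residuals(fit, type = "deviance")^2),
                             deviance(fit)))
same_aic <- isTRUE(all.equal(AIC(fit),
                             deviance(fit) + 2 * length(coef(fit))))
```

The p-value here comes out at about 0.24, so on this test the three
predictors together do not clearly improve on the intercept-only model. AIC
is mainly useful for comparing candidate models fitted to the same data,
where a lower value is preferred.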

Regards,

Xiaobo Gu



On Mon, Jun 6, 2011 at 9:59 PM, Prof Brian Ripley  wrote:
> On Mon, 6 Jun 2011, Xiaobo Gu wrote:
>
>> Hi,
>>
>> I am trying glm with family = binomial to do binary logistic
>> regression, but how can I assess the accuracy of the fitted model, the
>> summary method can print a lot of information about the returned
>> object, such as coefficients, because statistics is not my speciality,
>> so can you share some rule of thumb to exam the  fitted model from the
>> practical perspective.
>
> It depends entirely on why you did the fit.  People have written whole books
> on assessing the performance of classification procedures such as binary
> logistic regression.  For example, the residual deviance is closely related
> to log-probability scoring: for some purposes that is a good performance
> measure, for others (e.g. when you are going to threshold the predicted
> probabilities) it can be very misleading.
>
> In short, you need statistical advice, not R advice (the purpose of this
> list).
>
>>
>> Regards,
>>
>> Xiaobo Gu
>>
>>
>
> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>



Re: [R] assignment by value or reference

2011-03-08 Thread Xiaobo Gu
On Wed, Sep 15, 2010 at 5:05 PM, Uwe Ligges
 wrote:
> See the R Language Definition manual. Since R knows about lazy evaluation,
> it is sometimes neither by reference nor by value.
> If you want to think binary, then "by value" fits better than "by
> reference".
Hi,
Can we think of it as eventually being by value?

For simple calls such as:
is(df[[1]], "logical")
used to test whether the first column of data frame df is of type
logical, will a new vector be created and used inside the is function?

Another example:

dbWriteTable(con, "tablename", df) writes the content of data
frame df into a database table. Will a new data frame object be created
and used inside the dbWriteTable function?
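
One way to look into this is tracemem(), which reports when R duplicates an
object. It is only available when R was built with memory profiling, so the
sketch below checks capabilities("profmem") first. The functions read_only
and modify are made up for illustration; what is() or dbWriteTable() do
internally is not shown here.

```r
x <- runif(10)
x_before <- x + 0      # independent copy, kept for comparison

read_only <- function(v) sum(v)                # only reads its argument
modify    <- function(v) { v[1] <- 0; v }      # writes to its argument

traced <- capabilities("profmem")
if (traced) invisible(tracemem(x))

s <- read_only(x)      # no duplication is expected to be reported here
z <- modify(x)         # tracemem, if available, reports a copy here

if (traced) untracemem(x)
```

In both cases x in the calling environment is unchanged afterwards, which is
why "by value" is the safer way to think about it; the copy is simply
postponed until something is modified.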

Thanks.


>
> Uwe Ligges
>
>
>
> On 05.09.2010 17:19, Xiaobo Gu wrote:
>>
>> Hi Team,
>>
>>          Can you please tell me the rules of assignment in R: is it by value or
>> by reference?
>>
>>> From my roughly 3 months of part-time experience with R, it seems that most
>>> of the time it is by value, especially for function parameters and return
>>> values; that it is by reference when referencing sub-objects of container
>>> objects, such as elements of list objects and rows/columns of data frame
>>> objects; but that it is by value when referencing the smallest unit of a
>>> container object, such as a cell of a data frame.
>>
>>
>>
>>
>>
>> Xiaobo.Gu
>>
>>
>>
>>
>



[R] No response after click the "show Rules" button on Tab "Associate".

2011-03-09 Thread Xiaobo Gu
Hi,
I am using Rattle 2.6.4 with R 2.12.2 on win64. Nothing happens after I
click the "Show Rules" button on the Associate tab. Is this a bug?

The following is the output after running the association analysis:

Summary of the Apriori Association Rules:

Number of Rules: 23351

Summary of the Measures of Interestingness:

    support         confidence      lift
 Min.   :0.1250   Min.   :1    Min.   :2.667
 1st Qu.:0.1250   1st Qu.:1    1st Qu.:2.667
 Median :0.1250   Median :1    Median :4.000
 Mean   :0.1314   Mean   :1    Mean   :3.983
 3rd Qu.:0.1250   3rd Qu.:1    3rd Qu.:4.000
 Max.   :0.3750   Max.   :1    Max.   :8.000

Summary of the Execution of the Apriori Command:

parameter specification:
 confidence minval smax arem  aval originalSupport support minlen maxlen target   ext
        0.8    0.1    1 none FALSE            TRUE     0.1      1     10  rules FALSE

algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

apriori - find association rules with the apriori algorithm
version 4.21 (2004.05.09)        (c) 1996-2004   Christian Borgelt
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[35 item(s), 8 transaction(s)] done [0.00s].
sorting and recoding items ... [35 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 5 6 7 8 9 10 done [0.00s].
writing ... [23351 rule(s)] done [0.00s].
creating S4 object  ... done [0.01s].

Time taken: 0.01 secs

Rattle timestamp: 2011-03-09 23:00:14 dell
==


Xiaobo Gu



Re: [R] No response after click the "show Rules" button on Tab "Associate".

2011-03-11 Thread Xiaobo Gu
On Fri, Mar 11, 2011 at 2:55 AM, Graham Williams
 wrote:
> Did you scroll down the window to see the rules?
OK, it takes a long time for Rattle to show the rules, about 30
seconds. But why does the status bar say "the decision tree
model has been built. Time taken:0.01 secs"? I have attached the
small dataset used for the test.

Another question: why is the confidence of every rule 1?

>
> Regards,
> Graham
>
>
>
>
> On 10 March 2011 02:07, Xiaobo Gu  wrote:
>> Hi,
>> I am using Rattle 2.6.4 with R 2.12.2 on win64, is this a bug ?
>>
>> Following is the content after execute the associate analysis process:
>>
>> Summary of the Apriori Association Rules:
>>
>> Number of Rules: 23351
>>
>> Summary of the Measures of Interestingness:
>>
>>    support         confidence      lift
>>  Min.   :0.1250   Min.   :1    Min.   :2.667
>>  1st Qu.:0.1250   1st Qu.:1    1st Qu.:2.667
>>  Median :0.1250   Median :1    Median :4.000
>>  Mean   :0.1314   Mean   :1    Mean   :3.983
>>  3rd Qu.:0.1250   3rd Qu.:1    3rd Qu.:4.000
>>  Max.   :0.3750   Max.   :1    Max.   :8.000
>>
>> Summary of the Execution of the Apriori Command:
>>
>> parameter specification:
>>  confidence minval smax arem  aval originalSupport support minlen
>> maxlen target   ext
>>        0.8    0.1    1 none FALSE            TRUE     0.1      1
>> 10  rules FALSE
>>
>> algorithmic control:
>>  filter tree heap memopt load sort verbose
>>    0.1 TRUE TRUE  FALSE TRUE    2    TRUE
>>
>> apriori - find association rules with the apriori algorithm
>> version 4.21 (2004.05.09)        (c) 1996-2004   Christian Borgelt
>> set item appearances ...[0 item(s)] done [0.00s].
>> set transactions ...[35 item(s), 8 transaction(s)] done [0.00s].
>> sorting and recoding items ... [35 item(s)] done [0.00s].
>> creating transaction tree ... done [0.00s].
>> checking subsets of size 1 2 3 4 5 6 7 8 9 10 done [0.00s].
>> writing ... [23351 rule(s)] done [0.00s].
>> creating S4 object  ... done [0.01s].
>>
>> Time taken: 0.01 secs
>>
>> Rattle timestamp: 2011-03-09 23:00:14 dell
>> ==
>>
>>
>> Xiaobo Gu
>>
>


Re: [R] No response after click the "show Rules" button on Tab "Associate".

2011-03-11 Thread Xiaobo Gu
On Sat, Mar 12, 2011 at 10:02 AM, Graham Williams
 wrote:
> On 12 March 2011 00:07, Xiaobo Gu  wrote:
>> On Fri, Mar 11, 2011 at 2:55 AM, Graham Williams
>>  wrote:
>>> Did you scroll down the window to see the rules?
>> OK, it takes a long time for rattle to show the rules, about 30
>> seconds, and why the message on the status bar is "the decision tree
>> model has been built. Time taken:0.01 secs", I have attached the
>> little dataset used to do the test.
>
> I've fixed the incorrect "decision tree" text. Thanks.
>
> For the other issue see below. And thanks for supplying a repeatable
> sample and dataset.
>
>> Another question, why the confidence of the rules are all 1 ?
>
>>> On 10 March 2011 02:07, Xiaobo Gu  wrote:
>
>>>> set transactions ...[35 item(s), 8 transaction(s)] done [0.00s].
>
> That does not look right?

I think it's because there are too few sample records, so all the rules
have 100% confidence.
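
To illustrate how this can happen, here is a minimal sketch in base R. The transactions are made up for illustration and are not the attached dataset; support() and confidence() are hypothetical helpers, not functions from Rattle or arules. With so few transactions, an item often occurs in only one of them, and then every rule built from that transaction has confidence 1:

```r
# Toy transactions, invented for this example.
tx <- list(c("a", "b", "c"), c("d", "e"), c("f", "g"))

# Fraction of transactions that contain all of the given items.
support <- function(items) {
  mean(vapply(tx, function(t) all(items %in% t), logical(1)))
}

# confidence(lhs => rhs) = support(lhs and rhs together) / support(lhs)
confidence <- function(lhs, rhs) {
  support(c(lhs, rhs)) / support(lhs)
}

support("a")            # "a" occurs in one of three transactions
confidence("a", "b")    # that same transaction also contains "b"
```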

> Be sure to make "items" your Target and
> "tx_no" your Ident rather than the other way around.

How can I save the rules and use them to make product recommendations
with Rattle?
>
> Regards,
> Graham
>



[R] Does RHIPE support running complex mining models on top of Hadoop?

2011-03-25 Thread Xiaobo Gu
For example, logistic regression or decision trees.

Or does RHIPE only support MapReduce-style algorithms?


Xiaobo Gu



[R] write.arff function in package foreign can't handle Date time

2011-11-19 Thread Xiaobo Gu
Hi,

library(foreign)

x1 <- c(as.Date("20110101", "%Y%m%d"), as.Date("2012-01-01", "%Y-%m-%d"))
x2 <- c("1", "2")

ddf <- data.frame(x = x1, y = x2)
ddf[["y"]] <- as.factor(ddf[["y"]])

write.arff(ddf, file = "D:/ddf.arff")

The content of ddf.arff is:

@relation ddf
@attribute x numeric
@attribute y {'1','2'}
@data
2011-01-01,'1'
2012-01-01,'2'


Here x is of type Date, but write.arff declares it as numeric while the
actual values are written as date strings, so the resulting file is not
valid ARFF (Weka can't read it).
I think write.arff should declare it as @attribute x DATE "yyyy-MM-dd".
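
A minimal sketch of the header being proposed, in base R only. arff_type() is a hypothetical helper and is not part of the foreign package; it only shows which attribute declaration each column type would get if Date columns were recognised:

```r
# Hypothetical helper (not in package foreign): map a column to an
# ARFF attribute type string.
arff_type <- function(col) {
  if (inherits(col, "Date")) {
    "DATE \"yyyy-MM-dd\""
  } else if (is.factor(col)) {
    paste0("{", paste0("'", levels(col), "'", collapse = ","), "}")
  } else if (is.numeric(col)) {
    "numeric"
  } else {
    "string"
  }
}

ddf <- data.frame(x = as.Date(c("2011-01-01", "2012-01-01")),
                  y = factor(c("1", "2")))

# One @attribute line per column.
header <- sprintf("@attribute %s %s", names(ddf),
                  vapply(ddf, arff_type, character(1)))
writeLines(header)
```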


Regards





Xiaobo Gu


[R] as.factor does not work inside function

2011-12-10 Thread Xiaobo Gu
Hi,

I am trying to write a function that casts columns of a data frame to
factor in a loop. The source is:


as.factor.loop <- function(df, cols){

  if (!is.null(df) && !is.null(cols) && length(cols) > 0)
  {
    for(col in cols)
    {
      df[[col]] <- as.factor(df[[col]])
    }
  }
}


source('D:/ambertuil.r')
x <- 1:5
y <- 2:6
df <- data.frame(x=x, y=y)
as.factor.loop(df, c("x"))

But after the function call, the df data frame has not changed,
because

is.factor(df[["x"]])
FALSE

But if I run the loop in the R console directly, it works:

for(col in c("x","y")){df[[col]] <- as.factor(df[[col]])}



is.factor(df[["x"]])
FALSE


Regards,

Xiaobo Gu



Re: [R] as.factor does not work inside function

2011-12-10 Thread Xiaobo Gu
I am sorry, it is

 for(col in c("x","y")){df[[col]] <- as.factor(df[[col]])}
 is.factor(df[["x"]])
TRUE


On Sun, Dec 11, 2011 at 10:06 AM, Xiaobo Gu  wrote:
> Hi,
>
> I am trying to write a function do cast columns of data frame as
> factor in a loop, the source is :
>
>
> as.factor.loop <- function(df, cols){
>
>        if (!is.null(df) && !is.null(cols) && length(cols) > 0)
>        {
>                for(col in cols)
>    {
>                        df[[col]] <- as.factor(df[[col]])
>                }
>        }
> }
>
>
> source('D:/ambertuil.r')
> x <- 1:5
> y <- 2:6
> df <- data.frame(x=x, y=y)
> as.factor.loop(df, c("x"))
>
> But after the function call, the df data frame does not change,
> because
>
> is.factor(df[["x]])
> FALSE
>
> But if I call this in R console directlly, it works
>
> for(col in c("x","y")){df[[col]] <- as.factor(df[[col]])}
>
>
>
> is.factor(df[["x]])
> FALSE
>
>
> Regards,
>
> Xiaobo Gu



Re: [R] as.factor does not work inside function

2011-12-10 Thread Xiaobo Gu
Hi Josh,

Your suggestion works, and the following also works:
as.factor.loop <- function(df, cols){

  if (!is.null(df) && !is.null(cols) && length(cols) > 0)
  {
    for(col in cols)
    {
      df[[col]] <- as.factor(df[[col]])
    }
  }
  df
}

> df <- as.factor.loop(df, c("x","y"))
> is.factor(df[["y"]])
[1] TRUE
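
The two approaches can also be combined. A minimal sketch, assuming a hypothetical helper name factorize_cols() (not from any package): it converts the requested columns with lapply() and returns the modified copy, which the caller then assigns back:

```r
# Hypothetical helper: convert the named columns to factors and
# return the modified data frame.
factorize_cols <- function(df, cols) {
  df[cols] <- lapply(df[cols], as.factor)
  df
}

df <- data.frame(x = 1:5, y = 2:6)
df <- factorize_cols(df, c("x", "y"))   # assign the result back
sapply(df, is.factor)
```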

Thanks.

Xiaobo Gu

On Sun, Dec 11, 2011 at 10:23 AM, Joshua Wiley  wrote:
> Hi Xiaobo,
>
> The problem is that your function does not assign the results back to your
> data frame: df is an internal copy made by the function.  This is
> done to prevent function calls from having unexpected side effects such as
> overwriting objects in the global environment.  Anyway, I think you
> can accomplish what you want using lapply():
>
> ## your data
> df <- data.frame(x=1:5, y=2:6)
> ## apply the function, as.factor() to all the elements in the first argument
> ## and save the results in the relevant columns of df
> df["x"] <- lapply(df["x"], as.factor)
>
> ## check results
> is.factor(df[, "x"])
>
> Hope this helps,
>
> Josh
>
>
> On Sat, Dec 10, 2011 at 6:06 PM, Xiaobo Gu  wrote:
>> Hi,
>>
>> I am trying to write a function do cast columns of data frame as
>> factor in a loop, the source is :
>>
>>
>> as.factor.loop <- function(df, cols){
>>
>>        if (!is.null(df) && !is.null(cols) && length(cols) > 0)
>>        {
>>                for(col in cols)
>>    {
>>                        df[[col]] <- as.factor(df[[col]])
>>                }
>>        }
>> }
>>
>>
>> source('D:/ambertuil.r')
>> x <- 1:5
>> y <- 2:6
>> df <- data.frame(x=x, y=y)
>> as.factor.loop(df, c("x"))
>>
>> But after the function call, the df data frame does not change,
>> because
>>
>> is.factor(df[["x]])
>> FALSE
>>
>> But if I call this in R console directlly, it works
>>
>> for(col in c("x","y")){df[[col]] <- as.factor(df[[col]])}
>>
>>
>>
>> is.factor(df[["x]])
>> FALSE
>>
>>
>> Regards,
>>
>> Xiaobo Gu
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/



[R] incomplete final line found warning

2011-12-10 Thread Xiaobo Gu
Hi,

I saved the following as a UTF-8 encoded file named amberutil.r

as.factor.loop <- function(df, cols){

  if (!is.null(df) && !is.null(cols) && length(cols) > 0)
  {
    for(col in cols)
    {
      df[[col]] <- as.factor(df[[col]])
    }
  }
  df
}

And got this warning message,

> source('D:/ambertuil.r')
Warning message:
In readLines(file) : incomplete final line found on 'D:/ambertuil.r'

Can you help with this?
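
For context, readLines() issues this warning when the last line of a file is not terminated by a newline. The sketch below reproduces that general case with a temporary file; whether it is the cause for this particular file is an assumption, since the editor settings are not shown:

```r
f <- tempfile(fileext = ".r")

# writeChar() with eos = NULL writes the text without a trailing newline.
writeChar("x <- 1", f, eos = NULL)

# Record whether reading the file raises a warning.
before <- tryCatch({ readLines(f); "no warning" },
                   warning = function(w) "warning")

# Terminate the last line, then read again.
cat("\n", file = f, append = TRUE)
after <- tryCatch({ readLines(f); "no warning" },
                  warning = function(w) "warning")

c(before = before, after = after)
```

If this is the cause, adding an empty line at the end of the source file in the editor should make the warning go away.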

Regards,

Xiaobo Gu



Re: [R] incomplete final line found warning

2011-12-11 Thread Xiaobo Gu
On Sun, Dec 11, 2011 at 5:03 PM, Prof Brian Ripley
 wrote:
> On Sun, 11 Dec 2011, David Winsemius wrote:
>
>>
>> On Dec 10, 2011, at 10:01 PM, Xiaobo Gu wrote:
>
>
> without following the posting guide in several respects and hence leaving us
> guessing 
>
>
>>> Hi,
>>>
>>> I saved the following as a UTF-8 encoded file named amberutil.r
>
>
> BTW, it is hard to know how you know that ASCII is encoded as UTF-8, and on
> Windows (which from the file path it appears to be) it would not have worked
> had it been UTF-8 encoded.  Let's hope this did not mean what Windows calls
> 'Unicode', that is UTF-16LE.

I use RStudio to edit the source file. There is an option to choose the
encoding when saving, and I chose UTF-8.

>
>>>
>>> as.factor.loop <- function(df, cols){
>>>
>>>        if (!is.null(df) && !is.null(cols) && length(cols) > 0)
>>>        {
>>>                for(col in cols)
>>>  {
>>>                        df[[col]] <- as.factor(df[[col]])
>>>                }
>>>        }
>>> df
>>> }
>>>
>>> And got this warning message,
>>>
>>>> source('D:/ambertuil.r')
>>>
>>> Warning message:
>>> In readLines(file) : incomplete final line found on 'D:/ambertuil.r'
>>>
>>> Can you help with this?
>>
>>
>> Help with what? You got a warning. And it had information that should tell
>> you how to edit the file if the warning bothers you.

Can you help me find the reason for this warning?

>
>
> Also, we were not told the version of R.  Updating (as requested by the
> posting guide prior to posting) would most likely remove the harmless
> warning (AFAIK it occurs only in 2.14.0 and not in R-patched) if this were
> an ASCII file.

I am using R 2.14.0 64 bit on Windows.

>
> --
> Brian D. Ripley,                  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[R] Question about escapes character in args parameter of the system2 function.

2011-12-14 Thread Xiaobo Gu
Hi,

I am trying to use the system2 function to run an external
application (actually psql) from R 2.14.0 on x64 Windows, but
the psql command line contains escape characters.
Can you help me figure out the correct parameters for the system2
function?

The working command is :

psql -h 192.168.72.7 -U gpadmin -w -d miner_demo -c"\copy demo.store
to 'd:\store.csv' with csv header"

and my R code is:
  psql <- "psql.exe"
  gphost <- "192.168.72.7"
  gpuser <- "gpadmin"
  gpdb <- "miner_demo"

  copycmd <- "-c \"\\copy demo.store to 'd:\\store.csv' with csv header\""
  args <- c("-h", gphost, "-U", gpuser, "-w", "-d", gpdb, copycmd )
  system2(psql, args)

And the return value of system2 function is 127.
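
A hedged sketch of one way to build the arguments, not a confirmed fix. It assumes that "-c" and its value should be passed as separate elements, that shQuote(type = "cmd") is the appropriate quoting on Windows, and that status 127 here means the command could not be run at all (for example because psql is not on the PATH). The actual call is left commented out because it needs a reachable database:

```r
psql   <- "psql"            # looked up on the PATH
gphost <- "192.168.72.7"
gpuser <- "gpadmin"
gpdb   <- "miner_demo"

# The meta-command exactly as psql should receive it; the doubled
# backslashes are only R string escapes.
copycmd <- "\\copy demo.store to 'd:\\store.csv' with csv header"

# Pass "-c" and its value separately; quote the value so that the
# embedded spaces survive as a single argument.
args <- c("-h", gphost, "-U", gpuser, "-w", "-d", gpdb,
          "-c", shQuote(copycmd, type = "cmd"))

# An empty string here means psql was not found on the PATH.
found <- Sys.which(psql)

cat(psql, args, "\n")       # inspect the assembled command line
# status <- system2(psql, args)   # run only where psql is installed
```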

Regards,

Xiaobo Gu



[R] Where to download the splines package.

2012-01-08 Thread Xiaobo Gu
Hi,

install.packages("splines")

Warning in install.packages :
  package ‘splines’ is not available (for R version 2.14.1)


Regards,

Xiaobo Gu



Re: [R] Where to download the splines package.

2012-01-09 Thread Xiaobo Gu
Thanks, reinstalling R 2.14.1 fixes this problem.
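
For anyone hitting the same message: as explained in the reply quoted below, splines ships with R, so it only needs to be loaded, not installed. A minimal check, assuming a standard R installation:

```r
library(splines)   # no install.packages() needed

# splines is listed among the base-priority packages.
is_base <- "splines" %in% rownames(installed.packages(priority = "base"))

# Quick use: a cubic B-spline basis with 4 degrees of freedom.
b <- bs(1:10, df = 4)
dim(b)
```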

On Mon, Jan 9, 2012 at 8:31 PM, Duncan Murdoch  wrote:
>
> On 12-01-08 9:56 PM, Xiaobo Gu wrote:
>>
>> Hi,
>>
>> install.packages("splines")
>>
>> Warning in install.packages :
>>   package ‘splines’ is not available (for R version 2.14.1)
>
>
> splines is a base package, i.e. it is part of R, so you already have it in 
> the latest version.  It won't be updated other than when R is updated.
>
> Duncan Murdoch
>
>>
>>
>> Regards,
>>
>> Xiaobo Gu
>>
>
>



[R] How to adjust the stack size of R

2012-03-29 Thread Xiaobo Gu
Hi,

I got a stack overflow error when training a glm model with a very long
formula.


Regards,

Xiaobo Gu



Re: [R] How to adjust the stack size of R

2012-03-31 Thread Xiaobo Gu
2012/3/31 Uwe Ligges 

>
>
> On 30.03.2012 03:16, Xiaobo Gu wrote:
>
>> Hi,
>>
>> I got a stack overflow error when training a glm model with a very long
>> formula.
>>
>
> I just tried with a formula of length 1000. How long was yours?
> Which version of R? Where is the repdroducible example?
>
>
We have a glm formula with 76251 terms; the text of the formula is about
150K. We are using R 2.14.2.

Xiaobo Gu



Re: [R] How to adjust the stack size of R

2012-04-01 Thread Xiaobo Gu


>>You would need to compile your own copy of R and increase the stack 
>>size, 
Can we do this at runtime?


>>You need at least 76252 obs and that means the design matrix needs > 46 
>>Gbyte! Hence a sensible calculation is not really possible unless you 
>>have really big machines around.
We do have big machines, but we don't have more observations for this data, so we will
revise the algorithm to try fewer combinations, in line with the data set size.
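
The "> 46 Gbyte" figure quoted above can be reproduced with a back-of-the-envelope calculation. This sketch assumes one design-matrix column per term plus an intercept, the minimum of one observation per column, and 8 bytes per double; real models with factor terms would need more columns:

```r
n_terms <- 76251
n_cols  <- n_terms + 1      # one column per term plus the intercept
n_obs   <- n_cols           # at least as many rows as columns

bytes <- n_obs * n_cols * 8 # dense matrix of doubles
gb    <- bytes / 1e9
gb                          # size in gigabytes (10^9 bytes)
```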


Regards,

Xiaobo Gu


Re: [R] How to adjust the stack size of R

2012-04-02 Thread Xiaobo Gu

>>> You would need do compile your own copy of R and increase the stack
>>> size,
> Can we do this at runtime?

>That depends on your unstated OS.  Uwe gave you an answer for Windows. 
>On many other OSes, the stack size is set up by the OS when the 
>process is started, and your OS documentation will tell you how to 
>increase it.

We are running R on Windows 7 64bit Home basic and Windows Server 2003 64bit
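
For reference, base R can report the C stack limit of the running session with Cstack_info(). The sketch below only inspects the values; it does not change the stack size. The expression nesting limit shown at the end is a separate setting and is included only for contrast:

```r
info <- Cstack_info()

info[["size"]]      # C stack size in bytes (NA if it cannot be determined)
info[["current"]]   # current stack usage in bytes

# Separate limit on nested expression evaluation, adjustable at runtime
# with options(expressions = ...); it does not enlarge the C stack.
getOption("expressions")
```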


Xiaobo Gu


[R] What does the : operator mean in glm formulas

2012-01-19 Thread Xiaobo Gu
Hi,

I saw the following in the Credit Scoring in R guide:

m2<-glm(formula = good_bad ~ checking + duration + history + purpose + amount +
savings + employed + installp + marital +
coapp + age + other + depends + telephon + foreign + checking:amount

What does checking:amount mean?

Regards,

Xiaobo Gu


Re: [R] What does the : operator mean in glm formulas

2012-01-19 Thread Xiaobo Gu
>Such a model consists of a series of terms separated by + operators.
In the above, "term" means an individual variable.

>The terms themselves consist of variable and factor names separated by :
>operators.
What does "term" mean in this sentence?

>Such a term is interpreted as the interaction of all the variables and factors
>appearing in the term.

What does "interaction" mean, and what does "term" mean here?
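
A small sketch of what the quoted text describes, using made-up numeric variables a and b (not the credit data). Each piece separated by + is a term; a:b is a single term, and for two numeric variables its design-matrix column is their elementwise product:

```r
# Invented data for illustration.
d <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))

# Three terms: a, b, and the interaction a:b.
mm <- model.matrix(~ a + b + a:b, data = d)

colnames(mm)                    # one column per term plus the intercept
mm[, "a:b"] == d$a * d$b        # the interaction column is a * b
```

For factors, the interaction term expands to products of the dummy columns instead. In the model above, checking:amount lets the effect of amount differ across the levels of checking.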






Xiaobo Gu

From: David Winsemius
Date: 2012-01-19 21:46
To: guxiaobo1982
CC: r-help; ds5j
Subject: Re: [R] What does the : operator mean in glm formulas

On Jan 19, 2012, at 8:02 AM, Xiaobo Gu wrote:

> Hi,
>
> I see the following is the credit scoreing in R guide :
>
> m2<-glm(formula = good_bad ~ checking + duration + history+ purpose  
> +amount + savings + employed + installp + marital +
> coapp +age + other + depends + telephon + foreign +checking:amount
>
> What does checking:amount mean?

?formula

-- 

David Winsemius, MD
West Hartford, CT