date:20130313

Re: [R] Feature selection package for text mining

2013-03-13 Thread C.H.

Try the FSelector package.

Its chi-squared filter (chi.squared) may be a good starting point.

On Wed, Mar 13, 2013 at 2:02 PM, Venkata Satish Basva
 wrote:
> Hi,
> I am doing a project on authorship attribution, where my term-document
> matrix has around 10450 features.
> Can you please suggest a package with a feature selection
> function to reduce the dimensions?
>
> Regards,
> Venkata Satish Basva
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[R] string split at xth position

2013-03-13 Thread Johannes Radinger

Hi,

I have a vector of strings like
c("a1b1","a2b2","a1b2"), which I want to split into two parts:
c("a1","a2","a1") and c("b1","b2","b2"). There is
always a first part with a+number and a second part with b+number.
Unfortunately there is no separator I could use to split
the strings directly. Any idea how to handle such cases?

/Johannes


Re: [R] string split at xth position

2013-03-13 Thread Jorge I Velez

Dear Johannes,

May not be the best way, but this looks like what you described:

> x <- c("a1b1","a2b2","a1b2")
> x
[1] "a1b1" "a2b2" "a1b2"
> substr(x, 1, 2)
[1] "a1" "a2" "a1"
> substr(x, 3, 4)
[1] "b1" "b2" "b2"
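
If the numbers can have more than one digit, fixed positions no longer work. A minimal sketch of a pattern-based alternative, assuming every string really has the form a+digits followed by b+digits (the extra element "a10b3" and the names pattern, first and second are made up for illustration):

```r
x <- c("a1b1", "a2b2", "a1b2", "a10b3")

# Two capture groups: "a" plus digits, then "b" plus digits.
pattern <- "^(a[0-9]+)(b[0-9]+)$"

# sub() replaces the whole match with one of the captured groups.
first  <- sub(pattern, "\\1", x)
second <- sub(pattern, "\\2", x)

first
second
```

Strings that do not match the pattern are returned unchanged by sub(), so it may be worth checking grepl(pattern, x) first.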

HTH,
Jorge.-



Re: [R] different color indicates difference magnitude

2013-03-13 Thread Marc Girondot




On 13/03/13 10:20, meng wrote:

Hi all:
Is there a plot tool that uses different colors to indicate differences in the
magnitude of data?

The plot is in the attachment.

Many thanks.

Not sure what you want exactly, as there is no attachment here.
Here is an example of what can be done:

x <- 1:128
y <- rnorm(128, 10, 2)

z <- x+y

nbcol <- heat.colors(128)

# standardize z to integer colour indices from 1 to 128
zcol <- round((z-min(z))/(max(z)-min(z))*127)+1

# different examples
plot(x, y, col=nbcol[zcol], pch=".", cex=10, bty="n")
plot(x, y, col=nbcol[zcol], pch=".", cex=zcol/10, bty="n")
# reversed palette (129-zcol keeps the index between 1 and 128)
plot(x, y, col=nbcol[129-zcol], pch=".", cex=10, bty="n")
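
The same idea can be wrapped in a small function so that it works with any palette length. This is only a sketch: value_to_colour is a hypothetical helper name, and binning with findInterval() is one possible choice among several.

```r
# Hypothetical helper: assign each value of z to one colour of a palette
# by cutting the range of z into length(palette) equal-width bins.
value_to_colour <- function(z, palette) {
  breaks <- seq(min(z), max(z), length.out = length(palette) + 1)
  # all.inside = TRUE keeps the largest value in the last bin
  idx <- findInterval(z, breaks, all.inside = TRUE)
  palette[idx]
}

set.seed(1)
x <- 1:128
y <- rnorm(128, 10, 2)
z <- x + y

cols <- value_to_colour(z, heat.colors(128))
plot(x, y, col = cols, pch = 19, bty = "n")
```

Passing rev(heat.colors(128)) instead reverses the colour order without any index arithmetic.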

Sincerely

Marc


--
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot


Re: [R] Add a continuous color ramp legend to a 3d scatter plot

2013-03-13 Thread Marc Girondot

On 12/03/13 23:43, Zhuoting Wu wrote:
> I have a 3 column dataset x,y,z, and I plotted a 3d scatter plot using:
>
> cols <- myColorRamp(c(topo.colors(10)),z)
> plot3d(x=x, y=y, z=z, col=cols)
>
> I wanted to add a legend to the 3d plot showing the color ramp. Any help
> will be greatly appreciated!
>
>
Look at the package fields:
?colorbar.plot

Marc


Re: [R] Change the Chinese/English fonts in the Lattice graphic package

2013-03-13 Thread Prof Brian Ripley

This has little to do with lattice (and 'font=2' is for base graphics: 
lattice is based on grid).  You may well not have bold or italic Chinese 
fonts.


Font families are set for the device.  You have not followed the posting 
guide: it asked for 'at a minimum' information about your setup.  So we 
cannot even guess what graphics device you used.


On 13/03/2013 03:14, jpm miao wrote:

Hi,

I am graphing with the following command in the Lattice and LatticeExtra
package

xyplot(xts,lty=c(1,2),col=c("blue","red"),type=c("l","g"),par.settings =
list(layout.heights = list(panel = c(2, 2))), aspect="xy",xlab="",ylab="%",
key=key1,screen=list(a,a,b,b,c,c,d,d), layout=c(2,2),
scales=list(x="same",y="same"))

where a, b, c, d contains English and Chinese characters

How can I modify the preceding line to change the fonts?

(1) I find from the manual that font=2 represents boldface, while 3
represents italics. Where should I add it?

(2) The default Chinese character seems to be 細明體 (Shi Ming Tee). How
can I change it to Kai-Shu (楷書)?

Thanks,

Miao





--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] string split at xth position

2013-03-13 Thread Johannes Radinger

Thank you Jorge!

That's working perfectly.

/johannes




Re: [R] rugarch: GARCH with Johnson Su innovations

2013-03-13 Thread Patrick Burns


You want to give returns rather than prices to the
garch fitting function.  Log returns are more
appropriate than simple returns.
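
As a rough sketch of that conversion (the price values below are made up; prices and spec are the names used in the question further down):

```r
# Made-up closing prices standing in for the real series.
prices <- c(100, 102, 101, 105, 104)

# Log returns: differences of the log prices.
log_returns <- diff(log(prices))

# Simple returns, for comparison.
simple_returns <- diff(prices) / head(prices, -1)

log_returns
simple_returns

# The fit would then use the returns rather than the prices, e.g.
# fit <- ugarchfit(data = log_returns, spec = spec)
```

Log returns add up over time, so their sum equals the log of the last price divided by the first.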

Actually a negative lambda is what I would expect.
Higher volatility (across time) is usually associated
with lower returns.  The risk premium is more likely
a cross-sectional phenomenon than a time series one.

Some people do not really believe in risk premia in
the first place -- see, for instance, Eric Falkenstein.

This would have been better sent to the r-sig-finance
list.

Pat


On 12/03/2013 12:15, Wyss Patrick wrote:

Hey,

I'm trying to implement a GARCH model with Johnson-Su innovations in order to 
simulate returns of a financial asset. The model should look like this:

r_t = alpha + lambda*sqrt(h_t) + sqrt(h_t)*epsilon_t
h_t = alpha0 + alpha1*epsilon_(t-1)^2 + beta1 * h_(t-1).

Alpha refers to a risk-free return, lambda to the risk-premium.

I've implemented it like this:

#specification of the model
spec = ugarchspec(variance.model = list(model = "sGARCH",
garchOrder = c(1,1), submodel = NULL, external.regressors =
NULL, variance.targeting = FALSE), mean.model = list(
armaOrder = c(0,0), include.mean = TRUE, archm = TRUE, archpow = 1,
arfima = FALSE, external.regressors = NULL, archex = FALSE),
distribution.model = "jsu", start.pars = list(), fixed.pars = list())

#fit the model to historical closing price (prices)
fit = ugarchfit(data = prices, spec = spec)

#save coefficients of the fitted model into 'par'
par <- coef(fit)
m = coef(fit)["mu"]
lambda = coef(fit)["archm"]
gamma = coef(fit)["skew"]
delta = coef(fit)["shape"]
#GARCH parameter
a0 = coef(fit)["omega"]
a1 = coef(fit)["alpha1"]
b1 = coef(fit)["beta1"]

My problem is that I often get negative values for lambda, i.e. for the 
intended risk-premium. So I'm wondering if I've made a mistake in the 
implementation, as one would usually expect a positive lambda.
A second question concerns the Johnson-Su distribution: am I right to extract the Johnson-Su 
parameters gamma and delta with the keywords "skew" and "shape"?

Many thanks in advance,

Patrick





--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @burnsstat @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of:
 'Impatient R'
 'The R Inferno'
 'Tao Te Programming')


[R] merge datas

2013-03-13 Thread catalin roibu

Hello all!
I have a problem with R. I am trying to merge data like this:
structure(c(2.1785, 1.868, 2.1855, 2.5175, 2.025, 2.435, 1.809,
1.628, 1.327, 1.3485, 1.4335, 2.052, 2.2465, 2.151, 1.7945, 1.79,
1.6055, 1.616, 1.633, 1.665, 2.002, 2.152, 1.736, 1.7985, 1.9155,
1.7135, 1.548, 1.568, 1.713, 2.079, 1.875, 2.12, 2.072, 1.906,
1.4645, 1.3025, 1.407, 1.5445, 1.437, 1.463, 1.5235, 1.609, 1.738,
1.478, 1.573, 1.0465, 1.429, 1.632, 1.814, 1.933, 1.63, 1.482,
1.466, 1.4025, 1.6055, 1.279, 1.827, 1.201, 1.425, 1.678, 1.5535,
1.599, 1.826, 1.964, 1.68, 1.492, 1.509, 1.666, 1.5665, 1.666,
1.4885, 1.8205, 1.5965, 1.84, 1.551, 1.4835, 1.805, 1.7145, 1.902,
1.2085, 0.9095, 0.9325, 1.34, 1.6135, 1.5825, 1.757, 1.7105,
1.3115, 1.288, 1.567, 1.7795, 1.642, 1.4375, 1.4495, 1.4225,
1.4885, 1.251, 1.179, 1.188, 1.3605, 1.373, 1.2185, 1.405, 1.016,
0.979, 1.018, 1.0335, 1.39, 1.3005, 1.3955, 1.301, 1.6475, 1.1945,
1.3215, 1.0535, 1.1645, 1.0895, 1.041, 1.155, 1.322, 1.1615,
0.933, 1.1215, 1.022, 0.922, 0.8465, 1.103, 1.1375, 1.23, 1.289,
1.222, 1.4865, 1.4025, 1.4295, 1.156, 0.9085, 0.8755, 0.9135,
0.982, 1.145, 1.1295, 1.3475, 1.2415, 1.2505), .Names = c("1868",
"1869", "1870", "1871", "1872", "1873", "1874", "1875", "1876",
"1877", "1878", "1879", "1880", "1881", "1882", "1883", "1884",
"1885", "1886", "1887", "1888", "1889", "1890", "1891", "1892",
"1893", "1894", "1895", "1896", "1897", "1898", "1899", "1900",
"1901", "1902", "1903", "1904", "1905", "1906", "1907", "1908",
"1909", "1910", "1911", "1912", "1913", "1914", "1915", "1916",
"1917", "1918", "1919", "1920", "1921", "1922", "1923", "1924",
"1925", "1926", "1927", "1928", "1929", "1930", "1931", "1932",
"1933", "1934", "1935", "1936", "1937", "1938", "1939", "1940",
"1941", "1942", "1943", "1944", "1945", "1946", "1947", "1948",
"1949", "1950", "1951", "1952", "1953", "1954", "1955", "1956",
"1957", "1958", "1959", "1960", "1961", "1962", "1963", "1964",
"1965", "1966", "1967", "1968", "1969", "1970", "1971", "1972",
"1973", "1974", "1975", "1976", "1977", "1978", "1979", "1980",
"1981", "1982", "1983", "1984", "1985", "1986", "1987", "1988",
"1989", "1990", "1991", "1992", "1993", "1994", "1995", "1996",
"1997", "1998", "1999", "2000", "2001", "2002", "2003", "2004",
"2005", "2006", "2007", "2008", "2009", "2010", "2011"))

with a vector like this: extr<-c(1834,1876,1901,1928,2006)
The result should have these columns:
row.names MCG3 extr

and contain only the rows for the extr values.

Is it possible to do this with R?

Thank you!
-- 
---
Catalin-Constantin ROIBU
Forestry engineer, PhD
Forestry Faculty of Suceava
Str. Universitatii no. 13, Suceava, 720229, Romania
office phone +4 0230 52 29 78, ext. 531
mobile phone   +4 0745 53 18 01
   +4 0766 71 76 58
FAX:+4 0230 52 16 64
silvic.usv.ro


Re: [R] merge datas

2013-03-13 Thread Jorge I Velez

Dear Catalin,

If I understood your description, please see ?"%in%" and try

subset(x, names(x) %in% c(1834,1876,1901,1928,2006) )

where "x" is your data.

HTH,
Jorge.-



Re: [R] Feature selection package for text mining

2013-03-13 Thread mxkuhn

caret has recursive feature elimination and simple feature filters. I've also got some genetic 
algorithm code (using the GA package). 

CORElearn also has the relief algorithm and a lot of different measures of 
feature importance. 

Max


Re: [R] merge datas

2013-03-13 Thread Berend Hasselman


On 13-03-2013, at 11:31, Jorge I Velez  wrote:

> Dear Catalin,
> 
> If I understood your description, please see ?"%in%" and try
> 
> subset(x, names(x) %in% c(1834,1876,1901,1928,2006) )
> 
> where "x" is your data.
> 

Or something like this

x[names(x) %in% extr]

where extr is the vector you mentioned.
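
If a table with one row per requested year is wanted, match() keeps the years that are missing from the data (such as 1834, which is before the series starts) as NA. A sketch with a shortened, illustrative stand-in for the named vector; the column names year and MCG3 are assumptions based on the question:

```r
# Shortened stand-in for the named vector from the question.
x <- c("1875" = 1.628, "1876" = 1.327, "1901" = 1.906,
       "1928" = 1.678, "2006" = 1.145)
extr <- c(1834, 1876, 1901, 1928, 2006)

# match() returns the position of each year in names(x), or NA if absent.
res <- data.frame(year = extr,
                  MCG3 = unname(x[match(extr, names(x))]))
res

# Keep only the years that are actually present in the data.
res[!is.na(res$MCG3), ]
```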

Berend


Re: [R] loading data frames and rbind them

2013-03-13 Thread A M Lavezzi

Dear Ivan and Greg, thanks a lot!

Sorry for the late reply. Both ways work fine! Greg's may be
a little faster, but I am working with a relatively small amount of
data (approx. 130 xls files), so I do not notice remarkable differences.

The only suggestion I have is to add to the read.xlsx(...) part either:

stringsAsFactors=FALSE

or:

colClasses= c("numeric","character", ...)

to make the rbind operation robust (otherwise NAs could be created).
Thanks!
Mario
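
The read-then-rbind pattern with explicit column types can be sketched in a self-contained way as follows. This is only an illustration: it uses read.csv on two made-up temporary files as a stand-in for read.xlsx, and the names tmp_dir, ids and files are hypothetical.

```r
# Write two small made-up files so the sketch runs on its own.
tmp_dir <- tempfile("rea")
dir.create(tmp_dir)
ids <- c("12345", "67890")
files <- file.path(tmp_dir, paste0(ids, ".csv"))
write.csv(data.frame(id = 1:2, label = c("a", "b")), files[1], row.names = FALSE)
write.csv(data.frame(id = 3L, label = "c"), files[2], row.names = FALSE)

# Read every file with the same explicit column types, then stack the rows.
list_df <- lapply(files, read.csv,
                  stringsAsFactors = FALSE,
                  colClasses = c("integer", "character"))
my_df <- do.call(rbind, list_df)
str(my_df)
```

Fixing the column types at read time means every data frame in the list has the same classes before rbind is called.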


On Tue, Mar 12, 2013 at 10:17 PM, Greg Snow <538...@gmail.com> wrote:

> The only real improvement I can see over Ivan's solution is to use lapply
> instead of the loop (this may just be personal preference though).
>
> Something like:
>
> list_df <- lapply( lista_rea_c, function(x) read.xlsx( file=
> paste0(path,x,"/",x,".xls"),1,header=TRUE,as.data.frame=TRUE))
> my_df <- do.call(rbind, list_df)
>
> You could even do that as a single line if you really wanted to, but it
> would be less readable.  You could also make it a little more readable by
> putting the paste on its own line to create all the path/filenames in a
> variable to pass to lapply.
>
>
>
> On Tue, Mar 12, 2013 at 9:06 AM, Ivan Calandra <
> ivan.calan...@u-bourgogne.fr
> > wrote:
>
> > Hi Mario!
> >
> > I'm not really familiar with this kind of manipulations, but I think you
> > can do it more or less like this (some people on this list might give a
> > more detailed answer):
> >
> > #Create an empty named list
> > list_df <- vector(mode="list", length=length(lista_rea_c))
> > names(list_df) <- lista_rea_c  ## or some part of it using gsub or
> > ## something similar
> >
> > #Import
> > for (i in lista_rea_c) {
> > list_df[[i]] <- read.xlsx(...)
> > }
> >
> > #rbind
> > do.call(rbind, list_df)
> >
> > This probably won't work like this exactly but you should be able to make
> > the modifications.
> >
> > HTH,
> > Ivan
> >
> > --
> > Ivan CALANDRA
> > Université de Bourgogne
> > UMR CNRS/uB 6282 Biogéosciences
> > 6 Boulevard Gabriel
> > 21000 Dijon, FRANCE
> > +33(0)3.80.39.63.06
> > ivan.calan...@u-bourgogne.fr
> > http://biogeosciences.u-bourgogne.fr/calandra
> >
> > On 12/03/13 15:52, A M Lavezzi wrote:
> >
> >> Hello everybody
> >>
> >> I have the following problem. I have to load a number of xls files from
> >> different folders (each xls file has the same number of columns, and
> >> different numbers of rows). Each xls file is named with a number, i.e.
> >> 12345.xls and is contained in a folder with same name, say 12345)
> >>
> >> Once loaded, I want to rbind all of them to obtain a single database.
> >>
> >> I think I successfully did the first part, using "assign":
> >>
> >> for (i in lista_rea_c){
> >>
> >> name=paste("reaNum",i,sep="",collapse=NULL)
> >>
> >> assign(name,read.xlsx(file=paste(path,i,"/",i,".xls",sep="",
> >> collapse=NULL),1,header=TRUE,as.data.frame=TRUE))
> >>
> >> }
> >>
> >> where lista_rea_c contains the "numbers" and is obtained
> >> as: lista_rea_c = list.files(path = "/Users/mario/Dropbox/..., and path is
> >> defined elsewhere.
> >> At this point I have a number of variables, named "reaNum12345" etc.
> >>
> >> I would like to rbind all of them, but what I get is that I rbind
> >> "reaNum12345", i.e. the variable name and not the data it contains.
> >>
> >> Can anyone help?
> >>
> >> thanks!
> >> Mario
> >>
> >>
> >>
> >>
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
>


-- 
-- 
PLEASE NOTICE NEW EMAIL ADDRESS AND HOME PAGE URL

Andrea Mario Lavezzi
Dipartimento di Studi su Politica, Diritto e Società
Università di Palermo
Piazza Bologni 8
90134 Palermo, Italy
tel. ++39 091 23892208
fax ++39 091 6111268
skype: lavezzimario
email: mario.lavezzi (at) unipa.it
web: http://www.unipa.it/~mario.lavezzi


[R] expression exponent labeling

2013-03-13 Thread Berry Boessenkool



Hi all,

I want to label an axis with exponents, but can't get it done with expression.
Any hints would be very welcome!

# simulated data, somewhat similarly distributed to my real data:
set.seed(12); d <- rbeta(1e6, 0.2,2)*150 ; d <- d[d>1e-8]
hist( d  , breaks=100)
# now on a logarithmically scaled axis:
hist(log10(d), breaks=100, xaxt="n")
abline(v= log10(1:10*10^rep(-9:3, each=10)), col="darkgrey" ); box()
hist(log10(d), breaks=100, col="forestgreen", add=T)
axis(1, log10(1:10*10^rep(-9:3, each=10)), labels=F)
axis(1, -2:2, format(10^(-2:2), scient=3, drop0trailing=T) )
# the labels with lower values should be in the form of 10^x:
# doesn't work, because expression returns only one output:
axis(1, -8:-3, expression( 10^(-8:-3)) )
# writes i at all locations:
for(i in -8:-3) axis(1, i, expression(10^i)  )

expression does exactly what it should, but I want something different here...
I've tried I(10^i) instead, but that's not right either.

Thanks ahead,
Berry
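
One possible approach, sketched here rather than taken from the thread, is to build one expression per tick with bquote(), which substitutes the current value of i into 10^i before the label is drawn. The names exps and labs are made up, and the sample size is reduced to keep the example quick:

```r
set.seed(12)
d <- rbeta(1e5, 0.2, 2) * 150
d <- d[d > 1e-8]

hist(log10(d), breaks = 100, xaxt = "n")

# bquote() inserts the value of i in place of .(i);
# as.expression() turns the list of calls into one expression vector.
exps <- -8:-3
labs <- as.expression(lapply(exps, function(i) bquote(10^.(i))))

axis(1, at = exps, labels = labs)
```

Because labs has one element per tick, axis() receives a label for each location instead of a single recycled expression.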


  

Re: [R] string split at xth position

2013-03-13 Thread arun



Hi,

You could use:
library(stringr)
?str_sub()
lapply(2:3,function(i) if(i==2) str_sub(x,end=i) else str_sub(x,i))
#[[1]]
#[1] "a1" "a2" "a1"

#[[2]]
#[1] "b1" "b2" "b2"
A.K.





Re: [R] Export R generated tables and figures to MS Word

2013-03-13 Thread Robert Baer

R2wd (http://cran.r-project.org/web/packages/R2wd/R2wd.pdf) might do what
you want.

Rob

On 3/12/2013 7:02 PM, Santosh wrote:
> Dear Rxperts,
> I am aware of Sweave, which generates reports as PDF, but I do not know of any
> tools that export to an MS Word document...
>
> Is there  a way to use R to generate and export report/publication quality
> tables and figures and export them to MS word (for reporting purposes)?
>
> Thanks so much,
> Santosh
>

-- 

Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
Kirksville, MO 63501 USA


[R] merging a dataframe or vectors

2013-03-13 Thread Al Ehan

Hi,

I would like to know the easiest way to combine two or more sets of
vectors or data frames according to their index. They are related to
one another by their assigned index. For example:

#data set 1
abc
#output:
  X403   X408   X410   X415   X418   X419   X420   X423   X424   X425
X426   X427
549.58 541.91 544.18 549.37 555.54 540.83 543.26 544.26 546.85 548.98
553.10 556.49
  X428
543.57

#data set2
def
#output:
 X401   X402   X404   X405   X406   X407   X409   X411   X412   X413   X414
  X416
528.46 524.15 527.18 526.04 533.71 537.79 536.80 532.38 517.14 529.32
523.29 539.58
 X417   X421   X422   X429
535.38 532.68 515.28 523.10

Both are numeric, and the names above the numbers are indices running
from X401 to X429. I would like to combine the two and sort them by index:
X401, X402, X403 and so on. Could somebody please help me before I waste my
time using Excel to do this? Thanks!
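
One way this could be done, as a minimal sketch that assumes abc and def are named numeric vectors (only the first few values from each data set are reproduced here):

```r
# Shortened stand-ins for the two data sets shown above
abc <- c(X403 = 549.58, X408 = 541.91, X410 = 544.18, X415 = 549.37)
def <- c(X401 = 528.46, X402 = 524.15, X404 = 527.18, X405 = 526.04)

# Concatenate, then order by the numeric part of the names
combined <- c(abc, def)
idx <- as.integer(sub("^X", "", names(combined)))
combined <- combined[order(idx)]

combined
```

Sorting on the extracted number rather than on the names themselves keeps the order right even if the indices do not all have the same number of digits. If abc and def are one-row data frames instead, cbind(abc, def) followed by the same reordering of the columns should work in a similar way.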


Re: [R] Export R generated tables and figures to MS Word

2013-03-13 Thread Gergely Daróczi

Just to second Jeff's answer about pandoc [1] with a minimal reproducible
example: you might also try my "pander" package [2]:

> library(pander)
> Pandoc.brew(system.file('examples/minimal.brew', package='pander'),
output = tempfile(), convert = 'docx')

The content of the "minimal.brew" file resembles what you may be used to
from Sweave, although it uses "brew" syntax instead. See the examples of
pander [3] for more details. Please note that pandoc must be installed
first, which is pretty easy on Windows.

Best,
Gergely

  [1] http://johnmacfarlane.net/pandoc/
  [2] http://rapporter.github.com/pander/
  [3] http://rapporter.github.com/pander/#examples

On 13 March 2013 03:28, Jeff Newmiller  wrote:

> knitr markdown+pandoc gives serviceable results, for low enough
> expectations
>
> Santosh  wrote:
>
> >Dear Rxperts,
> >I am aware of Sweave that generates reports into a pdf, but do know of
> >any
> >tools to generate to export to a MS Word document...
> >
> >Is there  a way to use R to generate and export report/publication
> >quality
> >tables and figures and export them to MS word (for reporting purposes)?
> >
> >Thanks so much,
> >Santosh
> >

Re: [R] Export R generated tables and figures to MS Word

2013-03-13 Thread Frank Harrell

The best rendering of advanced tables is done by converting from pdf to Word. 
See http://biostat.mc.vanderbilt.edu/SweaveConvert

Frank

Robert Baer wrote
> R2wd (http://cran.r-project.org/web/packages/R2wd/R2wd.pdf) might do what
> you want.
> 
> Rob
> 
> On 3/12/2013 7:02 PM, Santosh wrote:
>> Dear Rxperts,
>> I am aware of Sweave that generates reports into a pdf, but do know of
>> any
>> tools to generate to export to a MS Word document...
>>
>> Is there  a way to use R to generate and export report/publication
>> quality
>> tables and figures and export them to MS word (for reporting purposes)?
>>
>> Thanks so much,
>> Santosh
>>
> 
> 
> -- 
> 
> Robert W. Baer, Ph.D.
> Professor of Physiology
> Kirksville College of Osteopathic Medicine
> A. T. Still University of Health Sciences
> Kirksville, MO 63501 USA
> 
> 





-
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: [R] Accuracy some classifiers

2013-03-13 Thread Frank Harrell

Which accuracy score are you using?  If it is proportion correctly
classified, that is a misleading improper scoring rule.
Frank

rkok wrote
> I am using machine learning for a research project.  I am using some
> classifiers with 5-fold CV. I would like to know how to
> extract the accuracy of, for example, KNN, neural networks and J48 for
> each of the 5 folds, because when I apply CV to my classifier I obtain the
> "mean accuracy" over the 5 folds, but the accuracy/error of each fold is not
> returned.
> 
> Any help is welcome and appreciated.  Thanks in advance!
> 
> Regards!!
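
On the per-fold part of the question, a hand-rolled loop makes the value for each fold available directly. The following is only a sketch under assumptions that are not from the thread: it uses the built-in iris data, knn() from the recommended "class" package, k = 3, and a made-up fold assignment (fold_id). It records proportion correct simply because that is what was asked about; any other score could be computed inside the same loop.

```r
library(class)  # recommended package that ships with R

set.seed(42)
k_folds <- 5

# Randomly assign each row of iris to one of the five folds (hypothetical setup)
fold_id <- sample(rep(seq_len(k_folds), length.out = nrow(iris)))

# Fit on four folds, predict the held-out fold, keep one score per fold
fold_accuracy <- sapply(seq_len(k_folds), function(f) {
  test_rows <- fold_id == f
  pred <- knn(train = iris[!test_rows, 1:4],
              test  = iris[test_rows, 1:4],
              cl    = iris$Species[!test_rows],
              k     = 3)
  mean(pred == iris$Species[test_rows])
})

fold_accuracy        # one value per fold
mean(fold_accuracy)  # the usual "mean accuracy"
```

The same structure should carry over to other classifiers by swapping the knn() call for the corresponding fit and predict steps.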





-
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: [R] image color analysis

2013-03-13 Thread Robert Baer


On 3/13/2013 12:05 AM, ishi soichi wrote:

> I am not sure if I should ask this question in this list. But I'll try.
>
> Currently I am trying to analyze images using EBImage and biOps.
> One of the features that I need to extract from various images is the color
> spectrum, namely, which colors each image consists of.
>
> So, each image hopefully can be converted into some sort of color histogram
> so that color ingredients are easily comparable with each other.
>
> There are so many functionalities that these packages and others provide,
> and I am hoping that someone would give me some guideline for the analysis.
>
> Any suggestion?

Your question is quite general, so I'll make a couple of general 
comments, returning us to R at the end.


You need to read about spectral color systems, for example 
http://www.fourmilab.ch/documents/specrend/. You undoubtedly have used 
filters, whether a Bayer filter typically built into a color camera or 
more specific filters if you have used a monochrome camera to collect 
non-RGB channels.  You need to know what type of transforms might have 
been performed during the storage process.  For example, has the image 
already been transformed to RGB space before storage? In fluorescent 
spectroscopy, for example, it is common to use pseudo-coloring so the 
channels of the stored image may not be directly convertible into 
spectral color without additional information.


R can do all the appropriate matrix algebra once you define the 
specifics of your individual conversion.
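
As a rough illustration of that last point, here is a sketch in base R. It assumes the image is already available as a height x width x 3 array of RGB values in [0, 1]; the array below is random stand-in data, and the choice of 12 hue bins is arbitrary. It says nothing about the filter and storage questions raised above.

```r
set.seed(1)

# Stand-in "image": 4 x 5 pixels, 3 channels (R, G, B), values in [0, 1]
img <- array(runif(4 * 5 * 3), dim = c(4, 5, 3))

# Flatten to a 3-row matrix, one column per pixel
rgb_mat <- rbind(r = as.vector(img[, , 1]),
                 g = as.vector(img[, , 2]),
                 b = as.vector(img[, , 3]))

# Convert to hue/saturation/value with grDevices::rgb2hsv
hsv_mat <- rgb2hsv(rgb_mat, maxColorValue = 1)

# Bin the hue channel into 12 equal-width bins and normalise to proportions
hue_bin  <- cut(hsv_mat["h", ], breaks = seq(0, 1, length.out = 13),
                include.lowest = TRUE)
hue_hist <- table(hue_bin) / ncol(hsv_mat)

hue_hist
```

Normalised histograms like this can be compared across images of different sizes, which is one simple reading of "color ingredients" in the question.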


Hope this helps,

Rob




> Thanks.
>
> ishida




--

Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
Kirksville, MO 63501 USA


Re: [R] Export R generated tables and figures to MS Word

2013-03-13 Thread Liviu Andronic

On Wed, Mar 13, 2013 at 1:02 AM, Santosh  wrote:
> Dear Rxperts,
> I am aware of Sweave that generates reports into a pdf, but do know of any
> tools to generate to export to a MS Word document...
>
> Is there  a way to use R to generate and export report/publication quality
> tables and figures and export them to MS word (for reporting purposes)?
>
Instead of pure LaTeX, you may use LyX to generate Sweave/knitr reports.

Liviu


> Thanks so much,
> Santosh
>



-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail


Re: [R] expression exponent labeling

2013-03-13 Thread Gerrit Eichner


Hi, Berry,

I think


for(i in -8:-3) axis(1, i, substitute(10^j, list( j = i)))


achieves what you want.

 Regards --  Gerrit
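
A possible variant without the loop, offered only as a sketch: build all the labels first as an expression vector with bquote() and hand them to a single axis() call. The small plot() call is a made-up stand-in for the histogram in the question.

```r
exps <- -8:-3

# bquote() substitutes the current value of i into 10^i;
# sapply() collects the pieces into one expression vector
labs <- sapply(exps, function(i) as.expression(bquote(10^.(i))))

# Stand-in plot with the x axis suppressed, then one axis() call for all labels
plot(exps, seq_along(exps), xaxt = "n", xlab = "log10 scale", ylab = "index")
axis(1, at = exps, labels = labs)
```

Having the labels in one vector lets axis() handle overlapping labels itself instead of drawing each one in a separate call.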


On Wed, 13 Mar 2013, Berry Boessenkool wrote:




> Hi all,
>
> I want to label an axis with exponents, but can't get it done with expression.
> Any hints would be very welcome!
>
> # simulated data, somewhat similarly distributed to my real data:
> set.seed(12); d <- rbeta(1e6, 0.2,2)*150 ; d <- d[d>1e-8]
> hist( d  , breaks=100)
> # now on a logarithmically scaled axis:
> hist(log10(d), breaks=100, xaxt="n")
> abline(v= log10(1:10*10^rep(-9:3, each=10)), col="darkgrey" ); box()
> hist(log10(d), breaks=100, col="forestgreen", add=T)
> axis(1, log10(1:10*10^rep(-9:3, each=10)), labels=F)
> axis(1, -2:2, format(10^(-2:2), scient=3, drop0trailing=T) )
> # the labels with lower values should be in the form of 10^x:
> axis(1, -8:-3, expression( 10^(-8:-3)) ) # doesn't work, because expression returns only one output
> for(i in -8:-3) axis(1, i, expression(10^i)  ) # writes i at all locations
>
> expression does exactly what it should, but I want something different here...
> I've tried I(10^i) instead, but that's not right either.
>
> Thanks ahead,
> Berry




[R] Extract letters from a column

2013-03-13 Thread SH

Dear list:

I would like to extract three letters from first and second elements
in one column and make a new column.

For example below,

> tempdf = read.table("clipboard", header=T, sep='\t')
> tempdf
              name var1 var2    abb
1      Tom Cruiser    1    6 TomCru
2       Bread Pett    2    5 BrePet
3 Arnold Schwiezer    3    7 ArnSch
> (p1 = substr(tempdf$name, 1, 3))
[1] "Tom" "Bre" "Arn"

I was able to extract three letters from first name, however, I don't
know how to extract three letters from last name (i.e., 'Cru', 'Pet',
and 'Sch').  Can anyone give me a suggestion?  Many thanks in advance.

Best,

Steve


Re: [R] Extract letters from a column

2013-03-13 Thread SH

Dear Jorge,

It gave me this result (below), since it starts at the fourth
letter and ends at the 6th letter of the whole string.

> substr(tempdf$name, 4, 6)
[1] " Cr" "ad " "old"

I would like to have letters from first and second elements if possible.

Thanks for replying,

Steve


On Wed, Mar 13, 2013 at 10:10 AM, Jorge I Velez
 wrote:
> Dear SH,
>
> Hmmm... what about
>
> substr(tempdf$name, 4, 6)
>
> ?
>
> HTH,
> Jorge.-
>

Re: [R] Extract letters from a column

2013-03-13 Thread Jorge I Velez

Dear SH,

Hmmm... what about

substr(tempdf$name, 4, 6)

?

HTH,
Jorge.-


On Thu, Mar 14, 2013 at 1:06 AM, SH  wrote:

> Dear list:
>
> I would like to extract three letters from first and second elements
> in one column and make a new column.
>
> For example below,
>
> > tempdf = read.table("clipboard", header=T, sep='\t')
> > tempdf
>               name var1 var2    abb
> 1      Tom Cruiser    1    6 TomCru
> 2       Bread Pett    2    5 BrePet
> 3 Arnold Schwiezer    3    7 ArnSch
> > (p1 = substr(tempdf$name, 1, 3))
> [1] "Tom" "Bre" "Arn"
>
> I was able to extract three letters from first name, however, I don't
> know how to extract three letters from last name (i.e., 'Cru', 'Pet',
> and 'Sch').  Can anyone give me a suggestion?  Many thanks in advance.
>
> Best,
>
> Steve
>

Re: [R] Extract letters from a column

2013-03-13 Thread Jorge I Velez

Try

substr(tempdf$abb, 4, 6)

--JIV


On Thu, Mar 14, 2013 at 1:15 AM, SH  wrote:

> Dear Jorge,
>
> I gave me this result (below) since it defines starting from the forth
> letter and ending 6th letter from the first element.
>
> > substr(tempdf$name, 4, 6)
> [1] " Cr" "ad " "old"
>
> I would like to have letters from first and second elements if possible.
>
> Thanks for replying,
>
> Steve
>

Re: [R] Extract letters from a column

2013-03-13 Thread SH

What I want to do is to extract three letters from the first and last name
and to combine them to make another column 'abb'.  The column 'abb' is
to be my final product.  I can make column 'abb' using the 'paste'
function once I have the two parts from the first column 'name'.

Thanks,

Steve

On Wed, Mar 13, 2013 at 10:17 AM, Jorge I Velez
 wrote:
> Try
>
> substr(tempdf$abb, 4, 6)
>
> --JIV

Re: [R] Extract letters from a column

2013-03-13 Thread arun



HI,


tempdf<-read.table(text="
name,var1,var2,abb
Tom Cruiser,1,6,TomCru
Bread Pett,2,5,BrePet
Arnold Schwiezer,3,7,ArnSch
",sep=",",header=TRUE,stringsAsFactors=FALSE)
# Fixed positions do not work, as the first names differ in the number of characters:
substr(tempdf$name, 4, 6)
#[1] " Cr" "ad " "old"

# Drop everything up to the last whitespace, then take the first three letters:
substr(gsub(".*\\s+","",tempdf$name),1,3)
#[1] "Cru" "Pet" "Sch"
A.K.



- Original Message -
From: Jorge I Velez 
To: SH 
Cc: r-help@r-project.org
Sent: Wednesday, March 13, 2013 10:10 AM
Subject: Re: [R] Extract letters from a column

Dear SH,

Hmmm... what about

substr(tempdf$name, 4, 6)

?

HTH,
Jorge.-


On Thu, Mar 14, 2013 at 1:06 AM, SH  wrote:

> Dear list:
>
> I would like to extract three letters from first and second elements
> in one column and make a new column.
>
> For example below,
>
> > tempdf = read.table("clipboard", header=T, sep='\t')
> > tempdf
>               name var1 var2    abb
> 1      Tom Cruiser    1    6 TomCru
> 2       Bread Pett    2    5 BrePet
> 3 Arnold Schwiezer    3    7 ArnSch
> > (p1 = substr(tempdf$name, 1, 3))
> [1] "Tom" "Bre" "Arn"
>
> I was able to extract three letters from first name, however, I don't
> know how to extract three letters from last name (i.e., 'Cru', 'Pet',
> and 'Sch').  Can anyone give me a suggestion?  Many thanks in advance.
>
> Best,
>
> Steve
>

Re: [R] Extract letters from a column

2013-03-13 Thread Jorge I Velez

Try

x <- c("Tom Cruiser", "Bread Pett", "Arnold Schwiezer")
sapply(strsplit(x, " "), function(r) paste0(substr(r[1], 1, 3),
substr(r[2], 1, 3)))
[1] "TomCru" "BrePet" "ArnSch"

HTH,
Jorge.-


On Thu, Mar 14, 2013 at 1:21 AM, SH  wrote:

> What I want to do is to extrac three letters from first and last name
> and to combine them to make another column 'abb'.  The column 'abb' is
> to be a my final product.  I can make column 'abb' using 'paste'
> function once I have two parts from the first column 'name'.
>
> Thanks,
>
> Steve

Re: [R] Extract letters from a column

2013-03-13 Thread Marc Schwartz

This could be done in a single step using gsub() with back references in the 
regex. 

> gsub("^(.{3}).* (.{3}).*$", "\\1\\2", "Tom Cruise")
[1] "TomCru"

Regards,

Marc Schwartz
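
Since the pattern functions are vectorised, the same regex can presumably be applied to the whole column at once. A small sketch, using a cut-down stand-in for the tempdf data frame from the question (only the name column is reproduced) and sub() in place of gsub(), because the anchored pattern can match at most once per string:

```r
# Stand-in for the data frame in the question
tempdf <- data.frame(name = c("Tom Cruiser", "Bread Pett", "Arnold Schwiezer"),
                     stringsAsFactors = FALSE)

# First three characters of the string, plus the first three characters
# after the last space that is followed by at least three characters
tempdf$abb <- sub("^(.{3}).* (.{3}).*$", "\\1\\2", tempdf$name)

tempdf
```

Names that do not match the pattern (for example a first or last name shorter than three letters) are returned unchanged, so it may be worth checking the result for such cases.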


On Mar 13, 2013, at 9:21 AM, SH  wrote:

> What I want to do is to extrac three letters from first and last name
> and to combine them to make another column 'abb'.  The column 'abb' is
> to be a my final product.  I can make column 'abb' using 'paste'
> function once I have two parts from the first column 'name'.
> 
> Thanks,
> 
> Steve

Re: [R] Extract letters from a column

2013-03-13 Thread SH

Thank you so much, Jorge and arun!!!  Both work well.

Steve

On Wed, Mar 13, 2013 at 10:26 AM, Jorge I Velez
 wrote:
> Try
>
> x <- c("Tom Cruiser", "Bread Pett", "Arnold Schwiezer")
> sapply(strsplit(x, " "), function(r) paste0(substr(r[1], 1, 3), substr(r[2],
> 1, 3)))
> [1] "TomCru" "BrePet" "ArnSch"
>
> HTH,
> Jorge.-

Re: [R] Extract letters from a column

2013-03-13 Thread SH

mmm... great!  Thanks a lot to all of you for the help!!!

Steve

On Wed, Mar 13, 2013 at 10:44 AM, Marc Schwartz  wrote:
> This could be done in a single step using gsub() with back references in the 
> regex.
>
>> gsub("^(.{3}).* (.{3}).*$", "\\1\\2", "Tom Cruise")
> [1] "TomCru"
>
> Regards,
>
> Marc Schwartz
>
>
> On Mar 13, 2013, at 9:21 AM, SH  wrote:
>
>> What I want to do is to extrac three letters from first and last name
>> and to combine them to make another column 'abb'.  The column 'abb' is
>> to be a my final product.  I can make column 'abb' using 'paste'
>> function once I have two parts from the first column 'name'.
>>
>> Thanks,
>>
>> Steve
>>
>> On Wed, Mar 13, 2013 at 10:17 AM, Jorge I Velez
>>  wrote:
>>> Try
>>>
>>> substr(tempdf$abb 4, 6)
>>>
>>> --JIV
>>>
>>>
>>>
>>> On Thu, Mar 14, 2013 at 1:15 AM, SH  wrote:

 Dear Jorge,

 I gave me this result (below) since it defines starting from the forth
 letter and ending 6th letter from the first element.

> substr(tempdf$name, 4, 6)
 [1] " Cr" "ad " "old"

 I would like to have letters from first and second elements if possible.

 Thanks for replying,

 Steve


 On Wed, Mar 13, 2013 at 10:10 AM, Jorge I Velez
  wrote:
> Dear SH,
>
> Hmmm... what about
>
> substr(tempdf$name, 4, 6))
>
> ?
>
> HTH,
> Jorge.-
>
>
> On Thu, Mar 14, 2013 at 1:06 AM, SH  wrote:
>>
>> Dear list:
>>
>> I would like to extract three letters from first and second elements
>> in one column and make a new column.
>>
>> For example below,
>>
>>> tempdf = read.table("clipboard", header=T, sep='\t')
>>> tempdf
>>               name var1 var2    abb
>> 1      Tom Cruiser    1    6 TomCru
>> 2       Bread Pett    2    5 BrePet
>> 3 Arnold Schwiezer    3    7 ArnSch
>>> (p1 = substr(tempdf$name, 1, 3))
>> [1] "Tom" "Bre" "Arn"
>>
>> I was able to extract three letters from first name, however, I don't
>> know how to extract three letters from last name (i.e., 'Cru', 'Pet',
>> and 'Sch').  Can anyone give me a suggestion?  Many thanks in advance.
>>
>> Best,
>>
>> Steve
>
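
For reference, a minimal sketch of two ways to build the 'abb' column discussed in this thread: the single gsub() call with back references suggested above, and a split-then-substr variant. The vector `name` just re-types the three names from the example; `abb_regex` and `abb_split` are illustrative names, not from the thread, and both approaches assume each name contains exactly one space.

```r
# Names re-typed from the example data frame in the thread
name <- c("Tom Cruiser", "Bread Pett", "Arnold Schwiezer")

# Approach 1: one regular expression with two back references.
# \\1 captures the first three characters, \\2 the three after the space.
abb_regex <- gsub("^(.{3}).* (.{3}).*$", "\\1\\2", name)

# Approach 2: split on the space, then take three letters from the
# first and the last piece and paste them together.
parts <- strsplit(name, " ", fixed = TRUE)
abb_split <- vapply(
  parts,
  function(p) paste0(substr(p[1], 1, 3), substr(p[length(p)], 1, 3)),
  character(1)
)

print(abb_regex)
print(abb_split)
```

The regex version is shorter; the split version is probably easier to adapt if names can have middle parts.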


Re: [R] How to transpose it in a fast way?

2013-03-13 Thread Yao He

Thanks for everybody's help!

I learned a lot from this discussion!



2013/3/10 jim holtman :
> Did you check out the 'colbycol' package.
>
> On Fri, Mar 8, 2013 at 5:46 PM, Martin Morgan  wrote:
>
>> On 03/08/2013 06:01 AM, Jan van der Laan wrote:
>>
>>>
>>> You could use the fact that scan reads the data rowwise, and the fact that
>>> arrays are stored columnwise:
>>>
>>> # generate a small example dataset
>>> exampl <- array(letters[1:25], dim=c(5,5))
>>> write.table(exampl, file="example.dat", row.names=FALSE, col.names=FALSE,
>>>  sep="\t", quote=FALSE)
>>>
>>> # and read...
>>> d <- scan("example.dat", what=character())
>>> d <- array(d, dim=c(5,5))
>>>
>>> t(exampl) == d
>>>
>>>
>>> Although this is probably faster, it doesn't help with the large size.
>>> You could use the n option of scan to read chunks/blocks and feed those
>>> to, for example, an ff array (which you ideally have preallocated).
>>>
>>
>> I think it's worth asking what the overall goal is; all we get from this
>> exercise is another large file that we can't easily manipulate in R!
>>
>> But nothing like a little challenge. The idea I think would be to
>> transpose in chunks of rows by scanning in some number of rows and writing
>> to a temporary file
>>
>> tpose1 <- function(fin, nrowPerChunk, ncol) {
>>     v <- scan(fin, character(), nmax=ncol * nrowPerChunk)
>>     m <- matrix(v, ncol=ncol, byrow=TRUE)
>>     fout <- tempfile()
>>     write(m, fout, nrow(m), append=TRUE)
>>     fout
>> }
>>
>> Apparently the data is 60k x 60k, so we could maybe easily read 60k x 10k
>> at a time from some file fl <- "big.txt"
>>
>> ncol <- 60000L
>> nrowPerChunk <- 10000L
>> nChunks <- ncol / nrowPerChunk
>>
>> fin <- file(fl); open(fin)
>> fls <- replicate(nChunks, tpose1(fin, nrowPerChunk, ncol))
>> close(fin)
>>
>> 'fls' is now a vector of file paths, each containing a transposed slice of
>> the matrix. The next task is to splice these together. We could do this by
>> taking a slice of rows from each file, cbind'ing them together, and writing
>> to an output
>>
>> splice <- function(fout, cons, nrowPerChunk, ncol) {
>>     slices <- lapply(cons, function(con) {
>>         v <- scan(con, character(), nmax=nrowPerChunk * ncol)
>>         matrix(v, nrowPerChunk, byrow=TRUE)
>>     })
>>     m <- do.call(cbind, slices)
>>     write(t(m), fout, ncol(m), append=TRUE)
>> }
>>
>> We'd need to use open connections as inputs and output
>>
>> cons <- lapply(fls, file); for (con in cons) open(con)
>> fout <- file("big_transposed.txt"); open(fout, "w")
>> xx <- replicate(nChunks, splice(fout, cons, nrowPerChunk,
>> nrowPerChunk))
>> for (con in cons) close(con)
>> close(fout)
>>
>> As another approach, it looks like the data are from genotypes. If they
>> really only consist of pairs of A, C, G, T, then two pairs e.g., 'AA' 'CT'
>> could be encoded as a single byte
>>
>> alf <- c("A", "C", "G", "T")
>> nms <- outer(alf, alf, paste0)
>> map <- outer(setNames(as.raw(0:15), nms),
>>  setNames(as.raw(bitwShiftL(0:15, 4)), nms),
>>  "|")
>>
>> with e.g.,
>>
>> > map[matrix(c("AA", "CT"), ncol=2)]
>> [1] d0
>>
>> This translates the problem of representing the 60k x 60k array as a 3.6
>> billion element vector of 60k * 60k * 8 bytes (approx. 30 Gbytes) to one of
>> 60k x 30k = 1.8 billion elements (fits in R-2.15 vectors) of approx 1.8
>> Gbyte (probably usable in an 8 Gbyte laptop).
>>
>> Personally, I would probably put this data in a netcdf / rhdf5 file.
>> Perhaps I'd use snpStats or GWASTools in Bioconductor
>> http://bioconductor.org.
>>
>> Martin
>>
>>
>>> HTH,
>>>
>>> Jan
>>>
>>>
>>>
>>>
>>> peter dalgaard  schreef:
>>>
>>>  On Mar 7, 2013, at 01:18 , Yao He wrote:

  Dear all:
>
> I have a big data file of 60000 columns and 60000 rows like that:
>
> AA AC AA AA ...AT
> CC CC CT CT...TC
> ..
> .
>
> I want to transpose it and the output is a new like that
> AA CC 
> AC CC
> AA CT.
> AA CT.
> 
> 
> AT TC.
>
> The keypoint is  I can't read it into R by read.table() because the
> data is too large,so I try that:
> c<-file("silygenotype.txt","r")
> geno_t<-list()
> repeat{
>  line<-readLines(c,n=1)
>  if (length(line)==0)break  #end of file
>  line<-unlist(strsplit(line,"\t"))
>  geno_t<-cbind(geno_t,line)
> }
> write.table(geno_t,"xxx.txt")
>
> It works but it is too slow ,how to optimize it???
>


 As others have pointed out, that's a lot of data!

 You seem to have the right idea: If you read the columns line by line
 there is
 nothing to transpose. A couple of points, though:

 -
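
As a small illustration of the scan() idea quoted above: scan() returns the file's values row by row, while R fills a matrix column by column, so refilling the scanned vector into a matrix of the same shape yields the transpose with no explicit t() call. This is only a toy sketch on a 5 x 5 example written to a temporary file; the names `path` and `d` are illustrative, and it does nothing about the memory problem of the real 60k x 60k file.

```r
# Small example matrix, written out tab-separated without row/column names
exampl <- matrix(letters[1:25], nrow = 5, ncol = 5)
path <- tempfile(fileext = ".dat")
write.table(exampl, file = path, row.names = FALSE, col.names = FALSE,
            sep = "\t", quote = FALSE)

# scan() reads the file row-wise into one long character vector ...
d <- scan(path, what = character(), quiet = TRUE)

# ... and matrix() fills column-wise, so each file row becomes a column
d <- matrix(d, nrow = 5, ncol = 5)

# TRUE if d is the transpose of the original matrix
print(identical(d, t(exampl)))

unlink(path)
```

For a non-square file the dimensions passed to matrix() would be swapped relative to the file (nrow = number of file columns, ncol = number of file rows).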

[R] How to read a *.csv file in R?

2013-03-13 Thread Maximus

Hey guys,

I am dealing with this kind of data. To read the file in R I have nulled all
empty fields and tried:

date      BRENT   BRENTchg   HWWI   HWWIchg
Jan. 86   22,5    NULL       68,1   -15,6
Feb.86    17      NULL       64,9   -21,6
Mar. 86   13,7    NULL       66,6   -19,5
Apr.86    12,3    NULL       63,6   -19,1
May 86    14      NULL       61,5   -20,9
June 86   11,8    NULL       59,8   -20,7
July 86   9,4     NULL       57,2   -19,3
Aug.86    13,2    NULL       55,5   -18,3
Sep.86    14,2    NULL       57,5   -15,1
Oct. 86   13,7    NULL       55,5   -14,1
Nov.86    14,4    NULL       54,9   -14,9
Dec. 86   15,7    NULL       52,9   -26,4
Jan. 87   18,3    -18,67     49,8   -26,87
Feb.87    17,3    1,76       49,9   -23,11
Mar. 87   17,8    29,93      49,7   -25,38
Apr.87    18      46,34      50,5   -20,6
May 87    18,6    32,86      52,3   -14,96
June 87   18,8    59,32      53,5   -10,54
July 87   19,8    110,64     54,5   -4,72
Aug.87    18,9    43,18      55,3   -0,36
Sep.87    18,2    28,17      55,1   -4,17
Oct. 87   18,6    35,77      57,8   4,14
Nov.87    17,7    22,92      55,5   1,09
Dec. 87   16,8    7,01       56,5   6,81
Jan. 88   16,7    -8,74      58,4   17,27
Feb.88    15,7    -9,25      59,5   19,24

> heisenberg <- read.csv(file="comprice.csv",head=TRUE,sep="")
Error in read.table(file = file, header = header, sep = sep, quote = quote, 
: 
  duplicate 'row.names' are not allowed

However, my row names are not duplicated. When I try:

> heisenberg <- read.csv(file="comprice.csv",head=TRUE,sep=",")
Error in read.table(file = file, header = header, sep = sep, quote = quote, 
: 
  more columns than column names

I have saved the file with Excel as *.csv (MS-DOS).

How can I read this file?

Thank you in advance for your help.






[R] Accuracy of some classifiers

2013-03-13 Thread Nicolás Sánchez

I am using machine learning for a research project.  I am using several
classifiers with 5-fold CV.  I would like to know how to extract the
accuracy of each of the 5 folds, for example for KNN, neural networks and
J48.  When I apply CV to a classifier, I obtain the mean accuracy over the
5 folds, but the accuracy/error of each individual fold is not returned.

Any help is welcome and appreciated.  Thanks in advance!

Regards!


Re: [R] different color indicates difference magnitude

2013-03-13 Thread meng

Strange, the attachment seems to have disappeared.
Resending it.

At 2013-03-13 13:01:01,"Pascal Oettli"  wrote:
>Hi,
>
>The attachment has been deleted. Please be more specific.
>
>Regards,
>Pascal
>
>On 13/03/13 10:20, meng wrote:
>> Hi all:
>> Is there a plotting tool that uses different colors to indicate the
>> magnitude of differences in the data?
>> The plot is in the attachment.
>>
>> Many thanks.
>>

[R] multi-comparison of means

2013-03-13 Thread meng

Hi all:
I have a question about multi-comparison.
 
The data is in the attachment.
 
My purpose:
Compare the predicted means of the 3 methods (a, b, c) pairwise.
 
I have 3 ideas:
 
#idea1
result_aov<-aov(y~ method + x1 + x2)
TukeyHSD(result_aov)
      diff        lwr       upr p adj
b-a  0.845  0.5861098 1.1038902 0.001
c-a  0.790  0.5311098 1.0488902 0.002
c-b -0.055 -0.3138902 0.2038902 0.8578386

#idea2
library(multcomp)
summary(glht(result_aov,linfct=mcp(method="Tukey")))
           Estimate Std. Error t value Pr(>|t|)
b - a == 0   0.3239     0.1402   2.309   0.0683 .
c - a == 0  -0.3332     0.1937  -1.720   0.2069
c - b == 0  -0.6570     0.1325  -4.960   <0.001 ***
 
#idea3
#ref=a
dat$method <- relevel(dat$method, ref="a")
lm_ref_a<-lm(y~method + x1 + x2)
summary(lm_ref_a)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.92202    0.64418   1.431   0.1647
methodb      0.32389    0.14025   2.309   0.0295 *
methodc     -0.33316    0.19372  -1.720   0.0978 .
x1           0.57935    0.09356   6.192 1.78e-06 ***
x2           0.13596    0.11563   1.176   0.2507
 
#ref=b
dat$method <- relevel(dat$method, ref="b")
lm_ref_b<-lm(y~method + x1 + x2)
summary(lm_ref_b)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.24591    0.73770   1.689   0.1037
methoda     -0.32389    0.14025  -2.309   0.0295 *
methodc     -0.65705    0.13248  -4.960 4.14e-05 ***
 

In summary:
idea1:
a vs b:pvalue=0.001
a vs c:pvalue=0.002
b vs c:pvalue=0.8578386
idea2:
a vs b:pvalue=0.0683
a vs c:pvalue=0.2069
b vs c:pvalue<0.001
idea3:
a vs b:pvalue=0.0295
a vs c:pvalue=0.0978
b vs c:pvalue=4.14e-05

So the results of the 3 ideas are different, and I don't know which one is correct.
Many thanks for your help.

My best
 method  x1  x2  y
a   4.1 5.3 3.95
a   4.6 5   4.35
a   4.8 6   4.55
a   5.4 6.2 4.7
a   5.2 6.1 4.9
a   5.7 5.9 4.9
a   6   6   5
a   5.9 6.1 5.5
a   4.2 5.2 4.05
a   4.6 5   4.3
b   5.5 6.2 5.1
b   5   7.1 5.3
b   6   7   5.5
b   6.2 6.1 5.65
b   5.9 6.5 5.7
b   5.2 6.8 5.4
b   6.4 7.1 5.95
b   5.4 6.1 4.95
b   6.1 6   5.55
b   5.8 6.4 5.55
c   6.1 7.1 5.05
c   6.3 7   5.2
c   6.5 6.2 5.45
c   6.7 6.8 5.55
c   7   7.1 5.85
c   6.5 6.9 4.9
c   7.1 6.7 6
c   6.9 7   4.9
c   6.7 6.9 5.35
c   7.2 7.4 5.85

[R] how to set the pdflatex path

2013-03-13 Thread annoporci


Dear all,

This is my first post to the mailing list. I asked this question a little
while ago on stackoverflow but did not get an answer. Please allow me to
ask again.

I have R set up on both Windows7 and kUbuntu12 machines. On Windows, I
happen to have both MikTeX and TeXlive available (and both work).

How could I instruct R to call TeXlive instead of MikTeX?

Both MikTeX and TeXlive are on my Windows PATH.

This is what I get now:

  Sys.which("pdflatex")
   pdflatex
  "C:\\PROGRA~2\\MIKTEX~1.9\\miktex\\bin\\pdflatex.exe"

I tried to set pdflatex via Sys.setenv

  Sys.setenv(pdflatex="C:/texlive/2012/bin/win32")

but it doesn't seem to work: Sys.which("pdflatex") still returns MikTeX.
(I also tried PDFLATEX and "PDFLATEX" as the variable name above.)

The closest I've got to setting the path to pdflatex is by clearing the
PATH and including TeXlive thus:

  Sys.setenv("PATH" = "C:/texlive/2012/bin/win32")

This radical move apparently gives me the desired path:
  Sys.which("pdflatex")
   pdflatex
  "C:\\texlive\\2012\\bin\\win32\\pdflatex.exe"

but R still cannot find the TeXlive executables.

For instance, running texi2pdf (which calls texi2dvi):

  tools::texi2pdf(Out)
  Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet,  :
    pdflatex is not available

For information, my PATH shows (edited):

  Sys.getenv("PATH")
  [1] "C:\\ ... ;C:\\Program Files (x86)\\MiKTeX 2.9\\miktex\\bin\\;C:\\texlive\\2012\\bin\\win32;..."

Changing the order of MikTeX and TeXlive in the PATH did not help R pick up
TeXlive.

Suggestions will be appreciated,

Patrick.

For reference, my question on stackoverflow:
http://stackoverflow.com/questions/15033615/setting-up-r-to-pick-up-texlive-rather-than-miktex-on-windows


[R] Empty cluster / segfault using vanilla kmeans with version 2.15.2

2013-03-13 Thread Dr. Detlef Groth


Hello,

here is a working reproducible example which crashes R using kmeans or
gives empty clusters using the nstart option with R 2.15.2.



library(cluster)
kmeans(ruspini,4)
kmeans(ruspini,4,nstart=2)
kmeans(ruspini,4,nstart=4)
kmeans(ruspini,4,nstart=10)
?kmeans

Either we always get empty clusters or, after some further commands,
a segfault.


regards,
Detlef Groth




[R] Empty cluster / segfault using vanilla kmeans with version 2.15.2
Uwe Ligges ligges at statistik.tu-dortmund.de
Sat Feb 9 20:52:19 CET 2013

We need a reproducible example.

Uwe Ligges


On 03.02.2013 15:03, Luca Nanetti wrote:

Dear experts,
I am encountering a version-dependent issue.

My laptop runs Ubuntu 12.04 LTS 64-bit, R 2.14.1; the issue explained below
never occurred with this version of R
My desktop runs Ubuntu 11.10 64-bit, R 2.13.2; what follows applies to this
setup.

The data I'm clustering is constituted by the rows of a 320 x 6 matrix
containing integers ranging from 1 to 7, no missing data.
I applied kmeans() to this matrix, literally, 256 x 10^6 times using R
version 2.13.2 or 2.14.1, without ever experiencing the slightest problem.
My usual setup is with k=5, nstart=256, iter.max=50.

Upgrading to R 2.15.2, I experienced either a warning message ('Empty
cluster. Choose a better set of initial centers') or a catastrophic
segfault. The only way I can get any solution at all is to set nstart to
its default value, i.e. 1. However, when I simply repeat the clustering, the
same issue still happens. Moreover, this is vastly suboptimal because of the
risk of local minima.

Something similar was reported many years ago, see
https://stat.ethz.ch/pipermail/r-help/2003-November/041784.html. It was
then suggested that R's behaviour was correct. I'm not familiar with such
an early R version, but the up-to-date documentation of kmeans clearly
states that "Except for the Lloyd-Forgy method, k clusters will always be
returned if a number is specified.".
I am using the default Hartigan-Wong, and I specify an exact number k:
thus, k clusters should be returned. They aren't, and the empty cluster is
then more likely the symptom of a bug rather than the outcome of a 'true'
local minimum.

Using synaptic, I managed to downgrade R to version 2.13.2. The problem
disappeared, i.e. the previous message/segfault didn't occur anymore.

Summarizing: given the same dataset, either an unreasonable message or a
segfault regularly occurs in version 2.15.2 when invoking kmeans() on an
Ubuntu 11.10 64bit machine. This does not happen at all in previous
versions of R, on the same machine and operating system.

I respectfully suggest that the behaviour shown in the aforementioned
versions 2.13.2 and 2.14.1 should be considered 'normal', and that version
2.15.2 should revert to that.

Kind regards,
Luca Nanetti.


Re: [R] different color indicates difference magnitude

2013-03-13 Thread Sarah Goslee

On Wed, Mar 13, 2013 at 4:13 AM, meng  wrote:
> Strange, the attachment seems to have disappeared.
> Resending it.

Not strange at all. This list does not accept binary attachments, as
detailed in the posting guide linked at the bottom of this and every
list message.

Sarah

>
> At 2013-03-13 13:01:01,"Pascal Oettli"  wrote:
>>Hi,
>>
>>The attachment has been deleted. Please be more specific.
>>
>>Regards,
>>Pascal
>>
>>On 13/03/13 10:20, meng wrote:
>>> Hi all:
>>> Is there a plotting tool that uses different colors to indicate the
>>> magnitude of differences in the data?
>>> The plot is in the attachment.
>>>
>>> Many thanks.
>>>
>>>

-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] How to read a *.csv file in R?

2013-03-13 Thread Daniel Nordlund

The data appear to be tab delimited with the decimal point being a comma (','). 
 So, try read.csv2()

heisenberg <- read.csv2(file="comprice.csv", header=TRUE, sep="\t")


Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
 


Re: [R] ADCP data processing in R

2013-03-13 Thread Charles Berry

Janesh Devkota writes:

> 
> Hello R Users, 
> 
> I have ADCP (Acoustic Doppler Current Profiler) data measurements for a
> river and I want to process these data using R. Is there a R package to
> handle ADCP data ? Any suggestions are highly appreciated. 
> 

Google 'acoustic data current profiler CRAN' then click 

  Search only for acoustic current profiler CRAN

Package oce is the top hit.


Re: [R] How to read a *.csv file in R?

2013-03-13 Thread peter dalgaard


On Mar 13, 2013, at 16:54 , Daniel Nordlund wrote:

> The data appear to be tab delimited with the decimal point being a comma 
> (',').  So, try read.csv2()
> 
> heisenberg <- read.csv2(file="comprice.csv", header=TRUE, sep="\t")
> 
> 

read.delim2() would be more to the point. (M: Are you _sure_ Excel calls this a 
csv file? Those are usually semicolon-separated in German locales.)

It's usually easier to leave blank fields blank in delimited formats. If you 
code NULL for missing, you'll need the na.strings= argument.

-pd


> Hope this is helpful,
> 
> Dan
> 
> Daniel Nordlund
> Bothell, WA USA
> 
> 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
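
A minimal sketch of the two suggestions in this message, read.delim2() plus na.strings=, on a two-row stand-in for the poster's file. It assumes the file really is tab-delimited with comma decimals and the literal string NULL for missing values, which is what the thread guesses; the temporary file and the names `path` and `prices` are illustrative only.

```r
# Two sample rows in the assumed layout: tab-separated, decimal comma,
# and the literal text NULL where a value is missing
path <- tempfile(fileext = ".csv")
writeLines(c(
  "date\tBRENT\tBRENTchg\tHWWI\tHWWIchg",
  "Jan. 86\t22,5\tNULL\t68,1\t-15,6",
  "Jan. 87\t18,3\t-18,67\t49,8\t-26,87"
), path)

# read.delim2() defaults to sep = "\t" and dec = ",";
# na.strings turns the NULL placeholders into NA
prices <- read.delim2(path, na.strings = "NULL")

str(prices)

unlink(path)
```

If the blank fields had been left blank instead of being filled with NULL, the na.strings argument should not be needed for the numeric columns.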


Re: [R] different color indicates difference magnitude

2013-03-13 Thread John Kane

The R-help list strips most attachments other than text (and perhaps PNGs?)
to reduce the risk of viruses or malware being received.

You could try parking the file on something like MediaFire and providing a link
here.

John Kane
Kingston ON Canada


Re: [R] How to read a *.csv file in R?

2013-03-13 Thread John Kane

If that is a text file, I'd guess that the separator is a tab.  I don't see why
you used NULL, as R should just read in the file with NAs for empty values.

You might want to try :
heisenberg <- read.csv(file="comprice.csv",head=TRUE,sep="\t")

John Kane
Kingston ON Canada



Re: [R] holding argument(s) fixed within lapply

2013-03-13 Thread Blaser Nello

One way is to use the do.call function. For example:

ret2 <- lapply(X=mylist2, 
   FUN=do.call, 
   what=function(...) f2(y=Y, ...))

Best, 
Nello

-Original Message-
Date: Tue, 12 Mar 2013 22:37:52 -0400
From: Benjamin Tyner 
To: r-help@r-project.org
Subject: Re: [R] holding argument(s) fixed within lapply
Message-ID: <513fe680.2070...@gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Apologies; resending in plain text...

Given a function with several arguments, I would like to perform an
lapply (or equivalent) while holding one or more arguments fixed to some
common value, and I would like to do it in as elegant a fashion as
possible, without resorting to writing a separate wrapper for the
function if possible. Moreover I would also like it to work in cases
where one or more arguments to the original function has a default
binding.

# Here is an example; the original function
f <- function(w, y, z){ w + y + z }

# common value I would like y to take
Y <- 5

# I have a list of arguments for the lapply()
mylist <- list(one = list(w = 1, z = 2),
   two = list(w = 3, z = 4)
   )

# one way to do it involves a custom wrapper; I do not like this method
ret <- lapply(FUN = function(x,...) f(w = x$w, z = x$z, ...),
  X   = mylist,
  y   = Y
  )

# another way
ret <- lapply(FUN  = with.default,
  X= mylist,
  expr = f(w, y = Y, z)
  )

# yet another way
ret <- lapply(FUN  = eval,
  X= mylist,
  expr = substitute(f(w, y = Y, z))
  )

# now, the part I'm stuck on is for a version of f where z has a default binding
f2 <- function(w, y, z = 0){ w + y + z }

# the same as mylist, but now z is optional
mylist2 <- list(one = list(w = 1),
two = list(w = 3, z = 4)
)

# undesired result (first element has length 0)
ret2 <- lapply(FUN = function(x,...) f2(w = x$w, z = x$z, ...),
   X   = mylist2,
   y   = Y
   )

# errors out ('z' not found)
ret2 <- lapply(FUN  = with.default,
   X= mylist2,
   expr = f2(w, y = Y, z)
   )

# errors out again
ret2 <- lapply(FUN  = eval,
   X= mylist2,
   expr = substitute(f2(w, y = Y, z))
   )

# not quite...
ret2 <- lapply(FUN = gtools::defmacro(y = Y, expr = f2(w, y = Y,
z)),
   X   = mylist2
   )

It seems like there are many ways to skin this cat; open to any and all
guidance others care to offer.

Regards,
Ben
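
A self-contained sketch of the do.call() idea from the reply at the top of this message, using the f2 / mylist2 / Y definitions from the original question. The first form is the one given in the reply; the second is an equivalent spelling that appends the fixed argument to each argument list (the names `ret2_reply` and `ret2_alt` are illustrative). Because missing list entries are simply not passed, the default z = 0 is used for the first element.

```r
# Definitions copied from the question
f2 <- function(w, y, z = 0) { w + y + z }
Y <- 5
mylist2 <- list(one = list(w = 1),
                two = list(w = 3, z = 4))

# Form from the reply: lapply passes each element as the 'args' of do.call,
# and the anonymous function fixes y = Y
ret2_reply <- lapply(X = mylist2,
                     FUN = do.call,
                     what = function(...) f2(y = Y, ...))

# Equivalent spelling: append the fixed argument to each argument list
ret2_alt <- lapply(mylist2, function(args) do.call(f2, c(args, list(y = Y))))

str(ret2_reply)
```

Either way the per-element lists only need to carry the arguments that vary.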


Re: [R] multi-comparison of means

2013-03-13 Thread Richard M. Heiberger

Meng,

What seems to be going on is that the covariates are handled very
differently in TukeyHSD and in glht.

Please see the interaction_average and covariate_average arguments to glht.

I ran your example twice, first as you did, with the covariates after the
factor.
x2 is not significant if the factor comes first.

The second time I placed the covariates before the factor.


> result_aov <- aov(y ~ method + x1 + x2, data=meng)
> anova(result_aov)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)
method     2 4.4705 2.23525 41.3822 1.170e-08 ***
x1         1 2.8352 2.83519 52.4892 1.363e-07 ***
x2         1 0.0747 0.07469  1.3827    0.2507
Residuals 25 1.3504 0.05401
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> result2_aov <- aov(y ~ x1 + x2 + method, data=meng)
> anova(result2_aov)
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq  F value    Pr(>F)
x1         1 5.4399  5.4399 100.7113 2.985e-10 ***
x2         1 0.3134  0.3134   5.8017   0.02371 *
method     2 1.6271  0.8135  15.0616 5.100e-05 ***
Residuals 25 1.3504  0.0540
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
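
The order dependence in the two tables comes from the sequential sums of squares that anova() reports for an aov fit: each term is adjusted only for the terms listed before it. A small sketch with simulated data (not meng's data; the variable names and effect sizes are made up for illustration) shows the same behaviour:

```r
set.seed(1)
n_per_group <- 10
method <- factor(rep(c("a", "b", "c"), each = n_per_group))

# Covariate deliberately correlated with the factor (its mean depends on method)
x1 <- rnorm(3 * n_per_group, mean = as.integer(method))
y  <- 1 + 0.6 * x1 + 0.3 * (method == "b") + rnorm(3 * n_per_group, sd = 0.2)
sim <- data.frame(y, method, x1)

# Same model, two term orders
tab_factor_first    <- anova(aov(y ~ method + x1, data = sim))
tab_covariate_first <- anova(aov(y ~ x1 + method, data = sim))

# Sum of squares credited to method: unadjusted vs. adjusted for x1
ss_method_first <- tab_factor_first["method", "Sum Sq"]
ss_method_last  <- tab_covariate_first["method", "Sum Sq"]

# The residual sum of squares does not depend on the order
rss_first <- tab_factor_first["Residuals", "Sum Sq"]
rss_last  <- tab_covariate_first["Residuals", "Sum Sq"]
```

Only the split between the terms changes with the order; the fitted model and its residuals are the same in both cases.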


Looking at the plot of your data shows a very interesting pattern.

> xyplot(y ~ x1 + x2 | method, outer=TRUE, data=meng)

b and c both give higher values of y than does a.  You will also see
that in the table of means.

Although I leave the interpretation of x1 and x2 to you, my inclination is
to drop x2 and look at the ancova of method and x1.

ancova is in the HH package.
## install.packages("HH")  ## if you don't have it yet.
library(HH)

> result3_aov <- ancova(y ~ method + x1, data=meng)
> result3_aov
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)
method     2 4.4705 2.23525  40.782 9.616e-09 ***
x1         1 2.8352 2.83519  51.728 1.226e-07 ***
Residuals 26 1.4251 0.05481
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> update(attr(result3_aov, "trellis"), ylim=c(3.5, 6.7))
>

Now it looks like conditional on x1, all three methods differ.

Rich

On Wed, Mar 13, 2013 at 7:15 AM, meng  wrote:

> Hi all:
> I have a question about multi-comparison.
>
> The data is in the attachment.
>
> My purpose:
> Compare the predicted means of the 3 methods(a,b,c) pairwisely.
>
> I have 3 ideas:
>
> #idea1
> result_aov<-aov(y~ method + x1 + x2)
> TukeyHSD(result_aov)
>       diff        lwr       upr     p adj
> b-a  0.845  0.5861098 1.1038902 0.001
> c-a  0.790  0.5311098 1.0488902 0.002
> c-b -0.055 -0.3138902 0.2038902 0.8578386
>
> #idea2
> library(multcomp)
> summary(glht(result_aov,linfct=mcp(method="Tukey")))
>            Estimate Std. Error t value Pr(>|t|)
> b - a == 0   0.3239     0.1402   2.309   0.0683 .
> c - a == 0  -0.3332     0.1937  -1.720   0.2069
> c - b == 0  -0.6570     0.1325  -4.960   <0.001 ***
>
> #idea3
> #ref=a
> dat$method <- relevel(dat$method, ref="a")
> lm_ref_a<-lm(y~method + x1 + x2)
> summary(lm_ref_a)
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.92202    0.64418   1.431   0.1647
> methodb      0.32389    0.14025   2.309   0.0295 *
> methodc     -0.33316    0.19372  -1.720   0.0978 .
> x1           0.57935    0.09356   6.192 1.78e-06 ***
> x2           0.13596    0.11563   1.176   0.2507
>
> #ref=b
> dat$method <- relevel(dat$method, ref="b")
> lm_ref_b<-lm(y~method + x1 + x2)
> summary(lm_ref_b)
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  1.24591    0.73770   1.689   0.1037
> methoda     -0.32389    0.14025  -2.309   0.0295 *
> methodc     -0.65705    0.13248  -4.960 4.14e-05 ***
>
>
> In summary:
> idea1:
> a vs b:pvalue=0.001
> a vs c:pvalue=0.002
> b vs c:pvalue=0.8578386
> idea2:
> a vs b:pvalue=0.0683
> a vs c:pvalue=0.2069
> b vs c:pvalue<0.001
> idea3:
> a vs b:pvalue=0.0295
> a vs c:pvalue=0.0978
> b vs c:pvalue=4.14e-05
>
> So the result of 3 ideas are different,and I don't know which one is
> correct.
> Many thanks for your help.
>
> My best
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>


[R] Numeric Class to Nominal class

2013-03-13 Thread Nicolás Sánchez

Hello everybody!

I have a question. I am working with R and I am loading a .txt file from my
computer. This file contains a set of numerical features from patients and a
score that gives the severity of the disease (from 0 to 3). Because I want to
classify my patients using SVM, neural networks or C4.5, I think I need to
"convert" the numeric class to a nominal class, right?
I have thought of an easy modification. The score is from 0 to 3, so I have
converted "0" to the nominal class "g0", and so on for the other values. But I
am not sure if this is the correct way to do it. Any information is welcome.

Thanks!


Re: [R] Numeric Class to Nominal class

2013-03-13 Thread Nordlund, Dan (DSHS/RDA)

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Nicolás Sánchez
> Sent: Wednesday, March 13, 2013 9:30 AM
> To: r-help@r-project.org
> Subject: [R] Numeric Class to Nominal class
> 
> Hello everybody!
> 
> I have a question. I am working with R and I am loading a .txt from my
> computer. This file contains a set of features from patients with
> numerical
> values and the score, that determines the gravity of the disease( from
> 0 to
> 3). Due to I want to obtain a classification of my patients using
> SVM,neural networks or C.4.5, I think I need to "convert" the numeric
> class
> to nominal class, not?
> I have thought to make a easy modification. The score is from 0 to 3.
> So, I
> have converted "0" to nominal class "g0", and successively. But I am
> not
> sure if this is the correct way to do it. Any information is welcome.
> 
> Thanks!
> 

I would convert it to an ordered factor.  See ?factor
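
A minimal sketch of that conversion, assuming the score column has already been read in as a numeric vector (the example values and the names score and severity are made up):

```r
# Hypothetical scores for six patients, each between 0 and 3
score <- c(0, 3, 1, 2, 0, 2)

# Ordered factor with the labels g0..g3; listing all four levels explicitly
# keeps a level even if it happens not to occur in the data
severity <- factor(score,
                   levels  = 0:3,
                   labels  = paste0("g", 0:3),
                   ordered = TRUE)
```

Some classifiers may treat ordered and unordered factors differently; dropping ordered = TRUE gives a plain nominal class if that turns out to be what the method expects.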


Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204



Re: [R] Export R generated tables and figures to MS Word

2013-03-13 Thread MacQueen, Don

There's the package
  rtf   Rich Text Format (RTF) Output

I've not tried it, but the name is suggestive.

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 3/12/13 5:02 PM, "Santosh"  wrote:

>Dear Rxperts,
>I am aware of Sweave that generates reports into a pdf, but do know of any
>tools to generate to export to a MS Word document...
>
>Is there  a way to use R to generate and export report/publication quality
>tables and figures and export them to MS word (for reporting purposes)?
>
>Thanks so much,
>Santosh
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.


Re: [R] different color indicates difference magnitude

2013-03-13 Thread David L Carlson

If you are just looking for a range of colors that communicate low to high
values, try package RColorBrewer and look at the sequential palettes.
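
As a rough sketch of the idea, the following maps each value to one colour of a light-to-dark ramp. It uses base R's colorRampPalette() so that it runs without extra packages; with RColorBrewer installed, brewer.pal(9, "Blues") would supply a sequential palette in the same role. The data values, bin count and colour names are made up.

```r
# Hypothetical magnitudes to be shown in colour
values <- c(0.2, 1.5, 3.7, 8.1, 9.9)

# Sequential palette from light (low) to dark (high)
n_bins <- 5
palette_seq <- colorRampPalette(c("lightyellow", "darkred"))(n_bins)

# Assign each value to one of n_bins equal-width bins, then look up its colour
bins <- cut(values, breaks = n_bins, labels = FALSE)
cols <- palette_seq[bins]
```

The vector cols can then be passed to the col argument of a plotting function, for example barplot(values, col = cols).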

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352



> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of John Kane
> Sent: Wednesday, March 13, 2013 11:19 AM
> To: meng; Pascal Oettli
> Cc: R help
> Subject: Re: [R] different color indicates difference magnitude
> 
> The R-help list strips most attachments other than text (and perhaps
> pngs? ) to reduce the risk of virus or malware being received.
> 
> You could try parking the file on something like MediaFire and providing
> a link here.
> 
> John Kane
> Kingston ON Canada
> 
> 
> > -Original Message-
> > From: laomen...@163.com
> > Sent: Wed, 13 Mar 2013 16:13:33 +0800 (CST)
> > To: kri...@ymail.com
> > Subject: Re: [R] different color indicates difference magnitude
> >
> > So strange to find the attachment is disappear.
> > Resent again.
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > At 2013-03-13 13:01:01,"Pascal Oettli"  wrote:
> > >Hi,
> >>
> > >The attachment has been deleted. Please be more specific.
> >>
> > >Regards,
> > >Pascal
> >>
> > >On 13/03/13 10:20, meng wrote:
> >>> Hi all:
> >>> Is there a plot tool to use different color indicates difference
> >>> magnitude of data?
> >>> The plot is in the attachment.
> >>>
> >>> Many thanks.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> __
> >>> R-help@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Export R generated tables and figures to MS Word

2013-03-13 Thread David L Carlson

Package xtable will produce html output. If you save the file and then open
it with Word, you will get serviceable results. I've had better luck copying
the output from xtable and pasting it into Excel. Make necessary changes and
then paste the table into Word. Obviously very tedious if you are making
more than a few tables. There is also an R2wd package, but I haven't tried
it.
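
The xtable route could look roughly like this. It is a sketch that assumes xtable is installed (it is skipped otherwise); the data frame contents and the file name are invented for illustration.

```r
# Made-up summary table to export
df <- data.frame(method = c("a", "b", "c"), mean_y = c(4.6, 5.4, 5.4))
out_file <- file.path(tempdir(), "table.html")

if (requireNamespace("xtable", quietly = TRUE)) {
  # print() on an xtable object with type = "html" writes an HTML table
  print(xtable::xtable(df), type = "html", file = out_file)
} else {
  message("xtable is not installed; skipping the HTML export")
}
```

The resulting file can then be opened in Word, or its table pasted into Excel first as described above.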

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of MacQueen, Don
> Sent: Wednesday, March 13, 2013 11:56 AM
> To: Santosh; r-help
> Subject: Re: [R] Export R generated tables and figures to MS Word
> 
> There's the package
>   rtf Rich Text Format (RTF) Output
> 
> I've not tried it, but the name is suggestive.
> 
> --
> Don MacQueen
> 
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
> 
> 
> 
> 
> 
> On 3/12/13 5:02 PM, "Santosh"  wrote:
> 
> >Dear Rxperts,
> >I am aware of Sweave that generates reports into a pdf, but do know of
> any
> >tools to generate to export to a MS Word document...
> >
> >Is there  a way to use R to generate and export report/publication
> quality
> >tables and figures and export them to MS word (for reporting
> purposes)?
> >
> >Thanks so much,
> >Santosh
> >
> > [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.


[R] R Advanced Programming Courses by XLSolutions Corp

2013-03-13 Thread Sue Turner

Please check out XLSolutions March-May 2013 R Advanced Programming
courses schedule
Washington DC, Boston, San Francisco, Las Vegas, etc.

More on website

http://www.xlsolutions-corp.com/courselistlisting.aspx

Ask for group discount and reserve your seat Now - Earlybird Rates.
Payment due after the class! Email Sue Turner:  sue at
xlsolutions-corp.com

Phone: 206-686-1578


Please let us know if you and your colleagues are interested in this
class to take advantage of group discount. Register now to secure your
seat.

Cheers,
Elvis Miller, PhD
Manager Training.
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
elvis at xlsolutions-corp.com
http://www.xlsolutions-corp.com/courselistlisting.aspx


Re: [R] merging a dataframe or vectors

2013-03-13 Thread David Winsemius


On Mar 13, 2013, at 6:11 AM, Al Ehan wrote:

> Hi,
> 
> I would like to know what is the easiest way to compile two or more set of
> vectors or data frame, according to their index. They are interrelated to
> one another by their assigned index. for example:
> 
> #data set 1
> abc
> #output:
>  X403   X408   X410   X415   X418   X419   X420   X423   X424   X425
> X426   X427
> 549.58 541.91 544.18 549.37 555.54 540.83 543.26 544.26 546.85 548.98
> 553.10 556.49
>  X428
> 543.57
> 
> #data set2
> def
> #output:
> X401   X402   X404   X405   X406   X407   X409   X411   X412   X413   X414
>  X416
> 528.46 524.15 527.18 526.04 533.71 537.79 536.80 532.38 517.14 529.32
> 523.29 539.58
> X417   X421   X422   X429
> 535.38 532.68 515.28 523.10

Why wouldn't you just use:

abcdef <- c(abc, def)
abcdef[order(names(abcdef))]
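
A self-contained version of the above, using only the first three values of each of your vectors:

```r
# First three named values from each data set in the question
abc <- c(X403 = 549.58, X408 = 541.91, X410 = 544.18)
def <- c(X401 = 528.46, X402 = 524.15, X404 = 527.18)

# Concatenate, then reorder by name
abcdef <- c(abc, def)
sorted <- abcdef[order(names(abcdef))]

# Name ordering is alphabetical; that matches numeric order here only because
# every index has three digits. Ordering on the numeric part is safer otherwise.
sorted_numeric <- abcdef[order(as.integer(sub("^X", "", names(abcdef))))]
```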

> 
> Both are numeric values and have indeces above each of the numbers which
> referring to X401 to X429 indices. I would like to combine both by sorting
> X401, X402,X403 and so on. Could somebody please help me before I waste my
> time using excel to do this. Thanks!
> 

>   [[alternative HTML version deleted]]
> 

> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html


David Winsemius
Alameda, CA, USA


Re: [R] different color indicates difference magnitude

2013-03-13 Thread David Winsemius

On Mar 13, 2013, at 9:18 AM, John Kane wrote:

> The R-help list strips most attachments other than text (and perhaps pngs? ) 
> to reduce the risk of virus or malware being received.  

More accurately the server strips all files of type that are not in the set:  
MIME-TEXT, pdf, png.

Most mail clients will not label files ending in .dat, .csv or .fil as 
mime-text and so many files which would otherwise be helpful  and are ASCII 
files do get discarded. I do not think that the server applies a test to the 
extension but rather that the mail clients are causing the problem by labeling 
them something else.

I am attaching an ascii file with an extension `.dat`. I expect it to be 
stripped.

--- 
David

> 
> You could try parking the file on something like MediaFire and providing a 
> link here.
> 
> John Kane
> Kingston ON Canada
> 
> 
>> -Original Message-
>> From: laomen...@163.com
>> Sent: Wed, 13 Mar 2013 16:13:33 +0800 (CST)
>> To: kri...@ymail.com
>> Subject: Re: [R] different color indicates difference magnitude
>> 
>> So strange to find the attachment is disappear.
>> Resent again.
>> 
>> 
>> At 2013-03-13 13:01:01,"Pascal Oettli"  wrote:
>>> Hi,
>>> 
>>> The attachment has been deleted. Please be more specific.
>>> 
>>> Regards,
>>> Pascal
>>> 
>>> On 13/03/13 10:20, meng wrote:
>>>> Hi all:
>>>> Is there a plot tool to use different color indicates difference
>>>> magnitude of data?
>>>> The plot is in the attachment.
>>>>
>>>> Many thanks.

David Winsemius
Alameda, CA, USA


[R] string split at xth position

2013-03-13 Thread Keith Weintraub


Here is another way

require(stringr)

aaa<-paste0("a", 1:20)
bbb<-paste0("b", 101:120)
ab<-paste0(aaa,bbb)
ab


ptrn<-"([ab][[:digit:]]*)"
unlist(str_extract_all(ab, ptrn))
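
The unlist() call returns the a and b parts interleaved in one vector. If two separate columns are wanted, a base R sketch along the same lines (using regmatches() and gregexpr() rather than stringr; the object names are illustrative) could be:

```r
# Three example strings of the same shape as above
ab <- paste0(paste0("a", 1:3), paste0("b", 101:103))

# One character vector of matches per input string
parts <- regmatches(ab, gregexpr("[ab][[:digit:]]+", ab))

# Stack them into a matrix: column 1 holds the a parts, column 2 the b parts
parts_mat <- do.call(rbind, parts)
```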



> Hi,
> 
> I have a vector of strings like:
> c("a1b1","a2b2","a1b2") which I want to split into two parts like:
> c("a1","a2","a1") and c("b1","b2","b2"). So there is
> always a first part with a+number and a second part with b+number.
> Unfortunately there is no separator I could use to directly split
> the vectors.. Any idea how to handle such cases?
> 
> /Johannes


--


Re: [R] different color indicates difference magnitude

2013-03-13 Thread John Kane

Thanks David. The clarification helps.  I had forgotten that pdfs would get 
through.

However, the file got to me, I suspect because I am listed in the To line under 
my actual email address.  I suspect that only meng and I received it and the 
rest of the list did not.

John Kane
Kingston ON Canada


> -Original Message-
> From: dwinsem...@comcast.net
> Sent: Wed, 13 Mar 2013 10:45:46 -0700
> To: jrkrid...@inbox.com
> Subject: Re: [R] different color indicates difference magnitude
> 
> 
> On Mar 13, 2013, at 9:18 AM, John Kane wrote:
> 
>> The R-help list strips most attachments other than text (and perhaps
>> pngs? ) to reduce the risk of virus or malware being received.
> 
> More accurately the server strips all files of type that are not in the
> set:  MIME-TEXT, pdf, png.
> 
> Most mail clients will not label files ending in .dat, .csv or .fil as
> mime-text and so many files which would otherwise be helpful  and are
> ASCII files do get discarded. I do not think that the server applies a
> test to the extension but rather that the mail clients are causing the
> problem by labeling them something else.
> 
> I am attaching an ascii file with an extension `.dat`. I expect it to be
> stripped.
> 
> 
> ---
> David
> 
>> 
>> You could try parking the file on something like MediaFire and providing
>> a link here.
>> 
>> John Kane
>> Kingston ON Canada
>> 
>> 
>>> -Original Message-
>>> From: laomen...@163.com
>>> Sent: Wed, 13 Mar 2013 16:13:33 +0800 (CST)
>>> To: kri...@ymail.com
>>> Subject: Re: [R] different color indicates difference magnitude
>>> 
>>> So strange to find the attachment is disappear.
>>> Resent again.
>>> 
>>> 
>>> At 2013-03-13 13:01:01,"Pascal Oettli"  wrote:
>>>> Hi,
>>>>
>>>> The attachment has been deleted. Please be more specific.
>>>>
>>>> Regards,
>>>> Pascal
>>>>
>>>> On 13/03/13 10:20, meng wrote:
>>>>> Hi all:
>>>>> Is there a plot tool to use different color indicates difference
>>>>> magnitude of data?
>>>>> The plot is in the attachment.
>>>>>
>>>>> Many thanks.
> 
> 
> 
> David Winsemius
> Alameda, CA, USA



Re: [R] different color indicates difference magnitude

2013-03-13 Thread David Winsemius


On Mar 13, 2013, at 10:45 AM, David Winsemius wrote:

> 
> On Mar 13, 2013, at 9:18 AM, John Kane wrote:
> 
>> The R-help list strips most attachments other than text (and perhaps pngs? 
>> ) to reduce the risk of virus or malware being received.  
> 
> More accurately the server strips all files of type that are not in the set:  
> MIME-TEXT, pdf, png.
> 
> Most mail clients will not label files ending in .dat, .csv or .fil as 
> mime-text and so many files which would otherwise be helpful  and are ASCII 
> files do get discarded. I do not think that the server applies a test to the 
> extension but rather that the mail clients are causing the problem by 
> labeling them something else.
> 
> I am attaching an ascii file with an extension `.dat`. I expect it to be 
> stripped.

My mail client labeled that attachment as:
--

--Apple-Mail-83-489973374
Content-Disposition: attachment;
filename=junk.dat
Content-Type: application/octet-stream;
name="junk.dat"
Content-Transfer-Encoding: 7bit
-

... and the server stripped it, since it was expecting something like this:

--Apple-Mail-83-489973374
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
charset=us-ascii

And I am here attaching a file I expect to remain attached:

a b c d e
f g h i j
k l m n o
p q r s t
u v w x y


-- 
David

> 
> --- 
> David
> 
>> 
>> You could try parking the file on something like MediaFire and providing a 
>> link here.
>> 
>> John Kane
>> Kingston ON Canada
>> 
>> 
>>> -Original Message-
>>> From: laomen...@163.com
>>> Sent: Wed, 13 Mar 2013 16:13:33 +0800 (CST)
>>> To: kri...@ymail.com
>>> Subject: Re: [R] different color indicates difference magnitude
>>> 
>>> So strange to find the attachment is disappear.
>>> Resent again.
>>> 
>>> 
>>> At 2013-03-13 13:01:01,"Pascal Oettli"  wrote:
>>>> Hi,
>>>>
>>>> The attachment has been deleted. Please be more specific.
>>>>
>>>> Regards,
>>>> Pascal
>>>>
>>>> On 13/03/13 10:20, meng wrote:
>>>>> Hi all:
>>>>> Is there a plot tool to use different color indicates difference
>>>>> magnitude of data?
>>>>> The plot is in the attachment.
>>>>>
>>>>> Many thanks.
> 
> 
> 


David Winsemius
Alameda, CA, USA


Re: [R] Export R generated tables and figures to MS Word

2013-03-13 Thread Santosh

Dear Rxperts..
Awesome responses! Thank you so much for your responses! I think I have a
50-course meal to gobble! If you get more ideas.. Please do continue to
share.

Santosh


On Wed, Mar 13, 2013 at 10:13 AM, David L Carlson  wrote:

> Package xtable will produce html output. If you save the file and then open
> it with Word, you will get serviceable results. I've had better luck
> copying
> the output from xtable and pasting it into Excel. Make necessary changes
> and
> then paste the table into Word. Obviously very tedious if you are making
> more than a few tables. There is also an R2wd package, but I haven't tried
> it.
>
> --
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> > project.org] On Behalf Of MacQueen, Don
> > Sent: Wednesday, March 13, 2013 11:56 AM
> > To: Santosh; r-help
> > Subject: Re: [R] Export R generated tables and figures to MS Word
> >
> > There's the package
> >   rtf Rich Text Format (RTF) Output
> >
> > I've not tried it, but the name is suggestive.
> >
> > --
> > Don MacQueen
> >
> > Lawrence Livermore National Laboratory
> > 7000 East Ave., L-627
> > Livermore, CA 94550
> > 925-423-1062
> >
> >
> >
> >
> >
> > On 3/12/13 5:02 PM, "Santosh"  wrote:
> >
> > >Dear Rxperts,
> > >I am aware of Sweave that generates reports into a pdf, but do know of
> > any
> > >tools to generate to export to a MS Word document...
> > >
> > >Is there  a way to use R to generate and export report/publication
> > quality
> > >tables and figures and export them to MS word (for reporting
> > purposes)?
> > >
> > >Thanks so much,
> > >Santosh
> > >
> > > [[alternative HTML version deleted]]
> > >
> > >__
> > >R-help@r-project.org mailing list
> > >https://stat.ethz.ch/mailman/listinfo/r-help
> > >PLEASE do read the posting guide
> > >http://www.R-project.org/posting-guide.html
> > >and provide commented, minimal, self-contained, reproducible code.
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>


Re: [R] string split at xth position

2013-03-13 Thread arun

Hi,

If you have cases like these:
 x1<- c("a1b11","a10b2","a2b2","a140b31")
 lapply(list(c(1,2),c(3,4)),function(i) substr(x1,i[1],i[2])) #it will not work
#[[1]]
#[1] "a1" "a1" "a2" "a1"

#[[2]]
#[1] "b1" "0b" "b2" "40"


let1<-unique(unlist(strsplit(gsub("\\d+","",x1),"")))
 split(unlist(strsplit(gsub("(\\w\\d+)(\\w\\d+)","\\1 \\2",x1)," ")),let1)
#$a
#[1] "a1"   "a10"  "a2"   "a140"

#$b
#[1] "b11" "b2"  "b2"  "b31"
A.K.
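
A variant of the same idea, sketched with two sub() calls on capture groups; it assumes every string has exactly the form a+digits followed by b+digits (the names first and second are illustrative):

```r
x1 <- c("a1b11", "a10b2", "a2b2", "a140b31")

# One pattern with two capture groups; keep group 1 or group 2
pattern <- "^(a[0-9]+)(b[0-9]+)$"
first  <- sub(pattern, "\\1", x1)
second <- sub(pattern, "\\2", x1)
```

Strings that do not match the pattern are returned unchanged by sub(), so it may be worth checking with grepl(pattern, x1) first.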




- Original Message -
From: arun 
To: Johannes Radinger 
Cc: 
Sent: Wednesday, March 13, 2013 8:32 AM
Subject: Re: [R] string split at xth position

Also,
You could try:
 lapply(list(c(1,2),c(3,4)),function(i) substr(x,i[1],i[2]))
#[[1]]
#[1] "a1" "a2" "a1"

#[[2]]
#[1] "b1" "b2" "b2"
A.K.



- Original Message -
From: Johannes Radinger 
To: r-help@r-project.org
Cc: 
Sent: Wednesday, March 13, 2013 4:37 AM
Subject: [R] string split at xth position

Hi,

I have a vector of strings like:
c("a1b1","a2b2","a1b2") which I want to split into two parts like:
c("a1","a2","a1") and c("b1","b2","b2"). So there is
always a first part with a+number and a second part with b+number.
Unfortunately there is no separator I could use to directly split
the vectors.. Any idea how to handle such cases?

/Johannes


Re: [R] Export R generated tables and figures to MS Word

2013-03-13 Thread Greg Snow

I don't see any mention of odfWeave yet.  It works with OpenOffice files.
 OpenOffice is a free equivalent to MS Office and can read and write Word
documents.  So you can create your template file in OpenOffice (or use MS
word and convert to OpenOffice format, OO will read word docs, and recent
versions of Word will write OpenOffice docs), process it with R and
odfWeave, then convert the result to Word (or open it with recent versions
of Word that read odf files).

Note that Sword and R2wd have a part of their toolchain that is not free
software, so if you use either, check to make sure that you have fulfilled the
license conditions.

These days I personally use either the pander package or knitr (usually
with Markdown, sometimes LaTeX) and use pandoc to convert to Word format.
 Most of my tables are simple enough that this works great, with something
more complicated I would follow Frank's advice and first create a pdf from
LaTeX, then convert from there if needed.

On Wed, Mar 13, 2013 at 12:20 PM, Santosh  wrote:

> Dear Rxperts..
> Awesome responses! Thank you so much for your responses! I think I have a
> 50-course meal to gobble! If you get more ideas.. Please do continue to
> share.
>
> Santosh
>
>
> On Wed, Mar 13, 2013 at 10:13 AM, David L Carlson 
> wrote:
>
> > Package xtable will produce html output. If you save the file and then
> open
> > it with Word, you will get serviceable results. I've had better luck
> > copying
> > the output from xtable and pasting it into Excel. Make necessary changes
> > and
> > then paste the table into Word. Obviously very tedious if you are making
> > more than a few tables. There is also an R2wd package, but I haven't
> tried
> > it.
> >
> > --
> > David L Carlson
> > Associate Professor of Anthropology
> > Texas A&M University
> > College Station, TX 77843-4352
> >
> >
> > > -Original Message-
> > > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> > > project.org] On Behalf Of MacQueen, Don
> > > Sent: Wednesday, March 13, 2013 11:56 AM
> > > To: Santosh; r-help
> > > Subject: Re: [R] Export R generated tables and figures to MS Word
> > >
> > > There's the package
> > >   rtf Rich Text Format (RTF) Output
> > >
> > > I've not tried it, but the name is suggestive.
> > >
> > > --
> > > Don MacQueen
> > >
> > > Lawrence Livermore National Laboratory
> > > 7000 East Ave., L-627
> > > Livermore, CA 94550
> > > 925-423-1062
> > >
> > >
> > >
> > >
> > >
> > > On 3/12/13 5:02 PM, "Santosh"  wrote:
> > >
> > > >Dear Rxperts,
> > > >I am aware of Sweave that generates reports into a pdf, but do know of
> > > any
> > > >tools to generate to export to a MS Word document...
> > > >
> > > >Is there  a way to use R to generate and export report/publication
> > > quality
> > > >tables and figures and export them to MS word (for reporting
> > > purposes)?
> > > >
> > > >Thanks so much,
> > > >Santosh
> > > >
> > > > [[alternative HTML version deleted]]
> > > >
> > > >__
> > > >R-help@r-project.org mailing list
> > > >https://stat.ethz.ch/mailman/listinfo/r-help
> > > >PLEASE do read the posting guide
> > > >http://www.R-project.org/posting-guide.html
> > > >and provide commented, minimal, self-contained, reproducible code.
> > >
> > > __
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-
> > > guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com


[R] Assign the number to each group of multiple rows

2013-03-13 Thread Lilia Dmitrieva

Dear R users,



My data have repeating "beh" parameter : 1 or 2 - type of animal behavior
in subsequent locations. I need to assign unique number to each sequence of
locations.

My data is:


>data=data.frame(row=seq(1:10),beh=c(1,1,1,2,2,2,1,1,2,2))
>attach(data)
>data

   row beh
1    1   1
2    2   1
3    3   1
4    4   2
5    5   2
6    6   2
7    7   1
8    8   1
9    9   2
10  10   2


I need the output like this:

   row beh seq trip.id
1    1   1   1       1
2    2   1   2       1
3    3   1   3       1
4    4   2   1       2
5    5   2   2       2
6    6   2   3       2
7    7   1   1       3
8    8   1   2       3
9    9   2   1       4
10  10   2   2       4

I managed to assign sequence numbers inside of each group:


> seq<-sequence(rle(beh)$length)

> new<-cbind(data,seq)
> new


   row beh seq
1    1   1   1
2    2   1   2
3    3   1   3
4    4   2   1
5    5   2   2
6    6   2   3
7    7   1   1
8    8   1   2
9    9   2   1
10  10   2   2



but I can't assign the numbers to the groups (the parameter "trip.id")...  I
would appreciate any help.



Regards,



Lilia


Re: [R] Assign the number to each group of multiple rows

2013-03-13 Thread arun

Hi,
Try this:
data1<-data.frame(row=seq(1:10),beh=c(1,1,1,2,2,2,1,1,2,2))
data1<-within(data1, {trip.id<- cumsum(c(1,abs(diff(beh))));
Seq<-ave(row,trip.id,FUN=seq)})
 data1
#   row beh Seq trip.id
#1    1   1   1   1
#2    2   1   2   1
#3    3   1   3   1
#4    4   2   1   2
#5    5   2   2   2
#6    6   2   3   2
#7    7   1   1   3
#8    8   1   2   3
#9    9   2   1   4
#10  10   2   2   4
A.K.





Re: [R] Empty cluster / segfault using vanilla kmeans with version 2.15.2

2013-03-13 Thread Uwe Ligges




On 13.03.2013 13:45, Dr. Detlef Groth wrote:

Hello,

here is a working reproducible example which crashes R using kmeans or
gives empty clusters using the nstart option with R 2.15.2.


library(cluster)
kmeans(ruspini,4)
kmeans(ruspini,4,nstart=2)
kmeans(ruspini,4,nstart=4)
kmeans(ruspini,4,nstart=10)
?kmeans

Either we always get empty clusters or, after some further commands,
a segfault.


Yes, thanks, I can reproduce it in 2.15.3, but not in R-prerelease.

Maybe this is a side effect of a bug already fixed in R-prerelease.
Since R-2.15.3 is frozen now, please upgrade to R-prerelease, which will
become R-3.0.0 in April.


Best,
Uwe Ligges



regards,
Detlef Groth




[R] Empty cluster / segfault using vanilla kmeans with version 2.15.2
Uwe Ligges ligges at statistik.tu-dortmund.de
Sat Feb 9 20:52:19 CET 2013


We need a reproducible example.

Uwe Ligges


On 03.02.2013 15:03, Luca Nanetti wrote:

Dear experts,
I am encountering a version-dependent issue.

My laptop runs Ubuntu 12.04 LTS 64-bit, R 2.14.1; the issue explained below
never occurred with this version of R.
My desktop runs Ubuntu 11.10 64-bit, R 2.13.2; what follows applies to this
setup.

The data I'm clustering consists of the rows of a 320 x 6 matrix
containing integers ranging from 1 to 7, no missing data.
I applied kmeans() to this matrix, literally, 256 x 10⁶ times using R
version 2.13.2 or 2.14.1, without ever experiencing the slightest problem.
My usual setup is with k=5, nstart=256, iter.max=50.

Upgrading to R 2.15.2, I experienced either a warning message ('Empty
cluster. Choose a better set of initial centers') or a catastrophic
segfault. The only way I can get any solution at all is by setting nstart to
its default value, i.e. 1. However, just repeating the clustering, the same
issue still happens. Moreover, this is vastly suboptimal, because of the risk
of local minima.

Something similar was reported many years ago, see
https://stat.ethz.ch/pipermail/r-help/2003-November/041784.html. It was
then suggested that R's behaviour was correct. I'm not familiar with such
an early R version, but the up-to-date documentation of kmeans clearly
states that "Except for the Lloyd-Forgy method, k clusters will always be
returned if a number is specified.".
I am using the default Hartigan-Wong, and I specify an exact number k:
thus, k clusters should be returned. They aren't, and the empty cluster is
then more likely the symptom of a bug rather than the outcome of a 'true'
local minimum.

Using synaptic, I managed to downgrade R to version 2.13.2. The problem
disappeared, i.e. the previous message/segfault didn't occur anymore.

Summarizing: given the same dataset, either an unreasonable message or a
segfault regularly occurs in version 2.15.2 when invoking kmeans() on an
Ubuntu 11.10 64bit machine. This does not happen at all in previous
versions of R, on the same machine and operating system.

I respectfully suggest that the behaviour shown in the aforementioned
versions 2.13.2 and 2.14.1 should be considered 'normal', and that version
2.15.2 should revert to that.

Kind regards,
Luca Nanetti.


Re: [R] Assign the number to each group of multiple rows

2013-03-13 Thread jim holtman

try this:

> data=data.frame(row=seq(1:10),beh=c(1,1,1,2,2,2,1,1,2,2))
> data$tripid <- cumsum(c(TRUE, diff(data$beh) != 0))
> data
   row beh tripid
1    1   1      1
2    2   1      1
3    3   1      1
4    4   2      2
5    5   2      2
6    6   2      2
7    7   1      3
8    8   1      3
9    9   2      4
10  10   2      4
>




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] Assign the number to each group of multiple rows

2013-03-13 Thread jim holtman

I forgot the 'seq':

> data=data.frame(row=seq(1:10),beh=c(1,1,1,2,2,2,1,1,2,2))
> data$tripid <- cumsum(c(TRUE, diff(data$beh) != 0))
> data$seq <- ave(data$beh, data$tripid, FUN = function(x) seq_along(x))
> data
   row beh tripid seq
1    1   1      1   1
2    2   1      1   2
3    3   1      1   3
4    4   2      2   1
5    5   2      2   2
6    6   2      2   3
7    7   1      3   1
8    8   1      3   2
9    9   2      4   1
10  10   2      4   2
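
For what it's worth, the same two columns can also be built directly from the rle() result that the original attempt already used. This is only a minimal sketch in base R; the names runs, trip.id and seq.in.trip are mine, not from the thread. Note that the component is called "lengths" (the "$length" in the original post works through partial matching).

```r
beh <- c(1, 1, 1, 2, 2, 2, 1, 1, 2, 2)

# rle() reports the length of each run of identical consecutive values
runs <- rle(beh)

# one id per run, repeated for as many rows as the run is long
trip.id <- rep(seq_along(runs$lengths), runs$lengths)

# position of each row inside its own run
seq.in.trip <- sequence(runs$lengths)

result <- data.frame(row = seq_along(beh), beh, seq = seq.in.trip, trip.id)
result
```

Since everything comes from a single rle() call, this should also behave sensibly when "beh" takes more than two values.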




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] how to convert a data.frame to tree structure object such as dendrogram

2013-03-13 Thread Bert Gunter

Here is a simpler, less clumsy version of my previous recursive R
solution that I sent you privately, which I'll also cc to the list
this time. It's now almost a one-liner.

To avoid problems with unused factor levels, I still prefer to have
character vectors not factors, as the data frame columns so:

df <- data.frame(a=c('A','A', 'A', 'B','B','C','C','C'), b=c('Aa',
'Ab','Ab','Ba','Bd', 'C1','C2','C3'), c=c('Aa1', 'Ab1', 'Ab2', 'Ba1',
'Bd2', 'C11','C12','C13'), stringsAsFactors=FALSE)

makeTree2 <- function(x, i, n)
{
  if (i == n) df[x, i]
  else {
    spl <- split(x, df[x, i])
    lapply(spl, function(x) makeTree2(x, i + 1, n))   ## Can't use Recall()
  }
}

This is now called as

> makeTree2(seq_len(nrow(df)),1,ncol(df))  ## no list structure needed for x
## yielding (with the root implicit now)

$A
$A$Aa
[1] "Aa1"

$A$Ab
[1] "Ab1" "Ab2"


$B
$B$Ba
[1] "Ba1"

$B$Bd
[1] "Bd2"


$C
$C$C1
[1] "C11"

$C$C2
[1] "C12"

$C$C3
[1] "C13"
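
makeTree2() above looks up "df" in the global environment. If that is a concern, one possible variant passes the data frame along as an argument. This is only a sketch of the same recursion; the name make_tree and its default arguments are my own choices, not part of the solution above.

```r
df <- data.frame(a = c('A','A','A','B','B','C','C','C'),
                 b = c('Aa','Ab','Ab','Ba','Bd','C1','C2','C3'),
                 c = c('Aa1','Ab1','Ab2','Ba1','Bd2','C11','C12','C13'),
                 stringsAsFactors = FALSE)

# dat:  data frame whose columns run from the root level to the leaves
# rows: row indices belonging to the current branch
# i:    column (tree level) currently being split on
make_tree <- function(dat, rows = seq_len(nrow(dat)), i = 1) {
  if (i == ncol(dat)) return(dat[rows, i])     # last column: return the leaves
  lapply(split(rows, dat[rows, i]),            # otherwise split rows by level i
         function(r) make_tree(dat, r, i + 1)) # and recurse into each branch
}

tree <- make_tree(df)
str(tree)
```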



On Wed, Mar 13, 2013 at 10:25 AM, Not To Miss  wrote:
> The ideal solution, I think, is probably recursive. At the last minute I
> decided to write a Python script to do this (using Python instead of Perl or
> R because of Python's mutable dict data structure), although I would have
> preferred to keep all my code in one R piece. I post the code here just in
> case you are interested. It generates a dict of dict of dict ...
>
> Hopefully I will not get beaten up for posting Python code on the R mailing
> list. :-)
>
> import sys
> tree = {}
> ## input file is a table with columns TAB delimited
> for line in open(sys.argv[1]):
>     if line.startswith('#'): continue
>     items = line.strip().split('\t')
>     tmp = tree
>     for item in items:
>         if not item in tmp:
>             tmp[item] = {}
>         tmp = tmp[item]
>
> The tree looks like this for the example:
> {'A': {'Aa': {'Aa1': {}}, 'Ab': {'Ab1': {}, 'Ab2': {}}}, 'C': {'C3': {'C13':
> {}}, 'C2': {'C12': {}}, 'C1': {'C11': {}}}, 'B': {'Bd': {'Bd2': {}}, 'Ba':
> {'Ba1': {}}}}
>
> On Wed, Mar 13, 2013 at 10:35 AM, David Winsemius 
> wrote:
>>
>>
>> On Mar 12, 2013, at 9:22 PM, Not To Miss wrote:
>>
>> Nope, Bert, you miss me? :-D
>>
>> I apologize that I didn't provide a more realistic example and describe
>> the problem more clearly. The real data are just too complicated to post in
>> emails, so I made up a simple example, which perhaps seems a little
>> oversimplified now, but the basic structure is the same. Here is a more
>> appropriate one:
>> >data.frame(a=c('A','A', 'A', 'B','B','C','C','C'), b=c('Aa',
>> > 'Ab','Ab','Ba','Bd', 'C1','C2','C3'), c=c('Aa1', 'Ab1', 'Ab2', 'Ba1', 
>> > 'Bd2',
>> > 'C11','C12','C13'))
>>   a  b   c
>> 1 A Aa Aa1
>> 2 A Ab Ab1
>> 3 A Ab Ab2
>> 4 B Ba Ba1
>> 5 B Bd Bd2
>> 6 C C1 C11
>> 7 C C2 C12
>> 8 C C3 C13
>>
>> The data structure to convert to:
>>      |---Aa--Aa1
>>  A---|       /--Ab1
>>  |   |---Ab--|
>>  |           \--Ab2
>>  |   |---Ba--Ba1
>>  B---|
>>  |   |---Bd--Bd2
>>  |
>>  |   /---C1-C11
>>  C---|---C2-C12
>>      \---C3-C13
>>
>> It's multi-level nested and I won't know how many rows and columns of the
>> data.frame ahead of time. I plan to write a perl script to do the
>> conversion, just more familiar, if it's not easy to do in R. Thanks Don and
>> Greg for suggesting solutions.
>>
>>
>> After a bit of coding I am going to say your proposed answer is wrong (or
>> at least improperly specified). The first level can be recovered as you
>> suggest :
>>
>> > sapply(unique(dfrm[[1]]), function(x) dfrm[[2]][grep(x, dfrm[[2]]) ])
>> $A
>> [1] "Aa" "Ab" "Ab"
>>
>> $B
>> [1] "Ba" "Bd"
>>
>> $C
>> [1] "C1" "C2" "C3"
>>
>>
>> But the second level cannot be as you imagined. The third level items
>> beginning with "C1" all get associated together and there are no terminal
>> nodes for C2 or C3 at the third level.
>>
>> > sapply(unique(dfrm[[2]]), function(x) dfrm[[3]][grep(x, dfrm[[3]]) ])
>> $Aa
>> [1] "Aa1"
>>
>> $Ab
>> [1] "Ab1" "Ab2"
>>
>> $Ba
>> [1] "Ba1"
>>
>> $Bd
>> [1] "Bd2"
>>
>> $C1
>> [1] "C11" "C12" "C13"
>>
>> $C2
>> character(0)
>>
>> $C3
>> character(0)
>>
>> lev1 <- sapply(unique(dfrm[[1]]), function(x) dfrm[[2]][grep(x, dfrm[[2]])
>> ])
>>  lapply(lev1, function(ll) lapply(ll, function(lll) dfrm[[3]][grep(lll,
>> dfrm[[3]]) ])  )
>>
>> $A
>> $A[[1]]
>> [1] "Aa1"
>>
>> $A[[2]]
>> [1] "Ab1" "Ab2"
>>
>> $A[[3]]
>> [1] "Ab1" "Ab2"
>>
>>
>> $B
>> $B[[1]]
>> [1] "Ba1"
>>
>> $B[[2]]
>> [1] "Bd2"
>>
>>
>> $C
>> $C[[1]]
>> [1] "C11" "C12" "C13"
>>
>> $C[[2]]
>> character(0)
>>
>> $C[[3]]
>> character(0)
>>
>> --
>> David.
>>
>>
>>
>> On Tue, Mar 12, 2013 at 2:18 PM, Bert Gunter 
>> wrote:
>>>
>>> So Mr. "not.tomiss" missed?
>>>
>>> :(
>>>
>>> -- Bert
>>>
>>> On Tue, Mar 12, 2013 at 1:08 PM, David Winsemius 
>>> wrote:
>>> >
>>> > On Mar 12, 2013, at 9:37 AM, Not To Miss wrote:
>>> >
>>> >> Thanks. Is there any more elegant solution? What if I don't know how
>>> >> many
>>> >> levels of nesting ahead of time?
>>> >
>>> >

[R] 2 questions about svg output

2013-03-13 Thread Ivan Zaigralin

Hi everybody :)

I use R to plot things in svg format. One of the things is text, of course. 
I noticed that text() in svg() gets saved as path, which is unacceptable
for my purposes. (Interestingly, text() in cairo_pdf() gets saved as text.)
Is there a way to save text as text in svg?

I also plot a lot of paths. I know there is segments(), which plots
disconnected segments, and things like polypath(), which create closed paths
(and subpaths). These are all very useful, but is there a function to draw
a multi-segment path without closing it? That is, without connecting the
last vertex to the first one?

Thanks!




Re: [R] Assign the number to each group of multiple rows

2013-03-13 Thread Lilia Dmitrieva

Fantastic! Thank you so much




Re: [R] Assign the number to each group of multiple rows

2013-03-13 Thread Lilia Dmitrieva

This works too! Thank you.
Such a relief after few days spent on trying to solve it.

Lilia


Re: [R] 2 questions about svg output

2013-03-13 Thread Paul Murrell


Hi

On 14/03/13 09:52, Ivan Zaigralin wrote:

Hi everybody :)

I use R to plot things in svg format. One of the things is text, of course. 
I noticed that text() in svg() gets saved as path, which is unacceptable
for my purposes. (Interestingly, text() in cairo_pdf() gets saved as text.)
Is there a way to save text as text in svg?


You could try the 'gridSVG' package, but that will only work if your 
graphics are grid/lattice/ggplot2.



I also plot a lot of paths. I know there is segments(), which plots
disconnected segments, and things like polypath(), which create closed paths
(and subpaths). These are all very useful, but is there a function to draw
a multi-segment path without closing it? That is, without connecting the
last vertex to the first one?


Try lines() (or grid.lines())

Paul


Thanks!






--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/


Re: [R] expression exponent labeling

2013-03-13 Thread Peter Ehlers


On 2013-03-13 06:37, Gerrit Eichner wrote:

Hi, Berry,

I think


for(i in -8:-3) axis(1, i, substitute(10^j, list( j = i)))


achieves what you want.

   Regards --  Gerrit


Here's another (really no different) solution:

 for(i in -8:-3) axis(1, i, bquote(10^.(i)))

For more flexibility, if you need to do this a lot, you could
create an "expression" vector:

 x <- -8:-3
 z <- vector("expression", 6)
 for(i in 1:6) z[[i]] <- bquote(10^.(x[i]))
 axis(1, x, z)
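
The same expression vector can also be built without preallocating and looping. A small sketch in base R (the name labs is mine); the axis() call is left as a comment here because it needs an open plot:

```r
x <- -8:-3

# bquote() substitutes the value of e for .(e); as.expression() then turns
# the resulting list of calls into a single expression vector
labs <- as.expression(lapply(x, function(e) bquote(10^.(e))))

length(labs)   # one label per tick position

# once a plot is open:
# axis(1, at = x, labels = labs)
```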

Peter Ehlers



On Wed, 13 Mar 2013, Berry Boessenkool wrote:




Hi all,

I want to label an axis with exponents, but can't get it done with expression.
Any hints would be very welcome!

# simulated data, somewhat similarly distributed to my real data:
set.seed(12); d <- rbeta(1e6, 0.2,2)*150 ; d <- d[d>1e-8]
hist( d  , breaks=100)
# now on a logarithmically scaled axis:
hist(log10(d), breaks=100, xaxt="n")
abline(v= log10(1:10*10^rep(-9:3, each=10)), col="darkgrey" ); box()
hist(log10(d), breaks=100, col="forestgreen", add=T)
axis(1, log10(1:10*10^rep(-9:3, each=10)), labels=F)
axis(1, -2:2, format(10^(-2:2), scient=3, drop0trailing=T) )
# the labels with lower values should be in the form of 10^x:
# doesn't work, because expression returns only one output:
axis(1, -8:-3, expression( 10^(-8:-3)) )
for(i in -8:-3) axis(1, i, expression(10^i)  ) # writes i at all locations

expression does exactly what it should, but I want something different here...
I've tried I(10^i) instead, but that's not right either.

Thanks ahead,
Berry




[R] calculating column difference in a matrix

2013-03-13 Thread Pedro Mardones

Dear R users;

Consider the following toy example:

a <- matrix(c(2,3,4,NA,NA,5,8,NA,8,NA), 5, 2)
b <- cbind(a,apply(a, 1, diff, na.rm = TRUE))

What I would like to be able to get is:

c <- matrix(c(2,3,4,NA,NA,5,8,NA,8,NA,3,5,-4,8,NA), 5, 3)

i.e., for each row, if both values (columns 1 and 2) are NA then the
difference must return NA, but if either of those two values is not NA
(!is.na), I would like to actually perform the difference
(something like assigning 0 instead of NA to the cell).

I'd appreciate any hint or comment

Best,

Pedro

BTW: the reason of this is that I'm trying to build an error checking
procedure for a big dataset I have


Re: [R] calculating column difference in a matrix

2013-03-13 Thread arun

Hi,
Try this:
a1<-a

a1[rowSums(is.na(a1))==1][ is.na(a1[rowSums(is.na(a1))==1])]<-0
library(matrixStats)
c1<- cbind(a,rowDiffs(a1))
 c1
# [,1] [,2] [,3]
#[1,]    2    5    3
#[2,]    3    8    5
#[3,]    4   NA   -4
#[4,]   NA    8    8
#[5,]   NA   NA   NA

identical(c,c1)
#[1] TRUE

A.K.
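
If installing matrixStats is not an option, the same result can probably be had in base R. A minimal sketch, assuming a two-column matrix as in the example; the names a0, d and res are mine:

```r
a <- matrix(c(2, 3, 4, NA, NA, 5, 8, NA, 8, NA), 5, 2)

# treat every NA as 0 for the purpose of the subtraction
a0 <- a
a0[is.na(a0)] <- 0
d <- a0[, 2] - a0[, 1]

# rows in which both values were NA should stay NA
d[rowSums(!is.na(a)) == 0] <- NA

res <- cbind(a, d)
res
```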




[R] loop in a data.table

2013-03-13 Thread Camilo Mora


Hi everyone,

I have a data.table called "data" with many columns which I want to  
group by column1 using data.table, given how fast it is.


The problem with looping over a data.table is that data.table does not like
quoted column names (e.g. "col2" instead of col2).
I found a way around this, which is to use get("col2"); that works fine, but
the processing time is multiplied by 20.


So if I use:

data[,sum(col2),by=(key)]

entering the column names by hand, the operation is done in 1 sec. But
if instead I use:


data[,sum(get("col2")),by=(key)]

using a loop to fill in the column names, the same operation takes 20 sec.
I cannot use the former code because I have 10 files to process,
but the latter will simply take months to complete. Is there any
alternative to the function "get", or any other way in which data.table
can recognize the names of the columns?


Thanks,
Camilo
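
One thing that might be worth trying is to hand the column names to data.table as a character vector through .SDcols and aggregate .SD, so that no get() is needed. This is only a sketch on made-up data (the table dt and the columns key1, col2, col3 are hypothetical), and I have not timed it against the get() version:

```r
library(data.table)

# hypothetical stand-in for the real table
dt <- data.table(key1 = c("a", "a", "b"),
                 col2 = 1:3,
                 col3 = 4:6)

# column names held as strings, e.g. produced inside a loop
cols <- c("col2", "col3")

# .SD contains only the columns listed in .SDcols; sum each of them per group
res <- dt[, lapply(.SD, sum), by = key1, .SDcols = cols]
res
```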




Camilo Mora, Ph.D.
Department of Geography, University of Hawaii
Currently available in Colombia
Phone:   Country code: 57
 Provider code: 313
 Phone 776 2282
 From the USA or Canada you have to dial 011 57 313 776 2282
http://www.soc.hawaii.edu/mora/


Re: [R] Determining maximum hourly slope per day

2013-03-13 Thread Peter Ehlers


On 2013-03-12 17:10, Nathan Miller wrote:

Hello,

I have a challenge!

I have a large dataset with three columns, "date","temp", "location".
"date" is in the format %m/%d/%y %H:%M, with a "temp" recorded every 10
minutes. These temperatures are surface temperatures and so fluctuate during
the day, heating up and then cooling down, so the data is a series of peaks
and troughs. I would like to develop a function that would go through a
dataset consisting of many sequential dates and determine for each day the
maximum hourly slope of temp~date for each site (the fastest hourly rate of
heating). The output would be the date, the maximum hourly slope for that
date, and the location. It would also be great if I could extract when
during the day the maximum hourly slope occurred.

I have been playing around with using the package lubridate to identify
each hour of the day using something like this to create a separate column
grouping the data into hours

library(lubridate)
data$date2 <- floor_date(data$date, "hour")

I was then imagining something like this though this code doesn't work as
written.

ddply(data, .(location, date2), function(d)
max(rollapply(slope(d$temp~d$date, data=d)))

Essentially what I'm imagining is calculating the slope (though I'd have to
write a quick slope function) of the date/temp relationship, use rollapply
to apply this function across the dataset, and determine the maximum slope,
grouped by location and hour (using date2). Hmm... and per day!

This seems complicated. Can others think of a simpler, more elegant means
of extracting this type of data? I struggled to put together a working
example with a set of data, but if this doesn't make sense let me know and
I'll see what I can do.


Thanks,
Nate


First, let's ignore location; if you can do it for one location,
you can surely do it for others.

Second, let's ignore date; if you can do it for one date, you
can surely do it for others.

That leaves us with the question of what you want to do for one
given date. If you want the maximum slope for any 60-minute interval
on that date (which I take your question to mean), then rollapply
should do the job. But I'm not very familiar with zoo, so here's a
crude approach:

  d <- data.frame(time = 1:72, temp = rnorm(72))
  slope <- rep(NA, 72)
  for(i in 6:72) {
    slope[i] <- coef(lm(temp ~ time, data = d, subset = (i-5):i))[2]
  }
  maxslope <- max(slope, na.rm = TRUE)
  idx <- which.max(slope)

Obviously, this can be extended to cover more than a 24-hour period.

Now, let's wait for Gabor to show us the trivial way with zoo::rollapply.
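
In the meantime, the loop above could be written with zoo::rollapply roughly
as follows. This is only a sketch: it assumes the zoo package is installed,
reuses the simulated data from the loop (a seed is added purely for
reproducibility), and keeps the same window of 6 observations. The helper
name win_slope is made up for this illustration.

```r
library(zoo)

set.seed(1)  # added only so that the sketch is reproducible
d <- data.frame(time = 1:72, temp = rnorm(72))

# Slope of temp ~ time within one window (w holds 6 rows of time/temp)
win_slope <- function(w) {
  unname(coef(lm(temp ~ time, data = as.data.frame(w)))[2])
}

# Right-aligned windows of 6 rows: the value at row i uses rows (i-5):i
z <- zoo(as.matrix(d))
slope <- rollapply(z, width = 6, FUN = win_slope,
                   by.column = FALSE, align = "right")

s <- coredata(slope)
maxslope <- max(s)
when <- index(slope)[which.max(s)]  # row at which the steepest window ends
```

With real data, the same call could be run once per location and date, for
example inside split() and lapply().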

Peter Ehlers

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Modifying a data frame based on a vector that contains column numbers

2013-03-13 Thread Dimitri Liakhovitski

Hello!

# I have a data frame:
mydf<-data.frame(c1=rep(NA,5),c2=rep(NA,5),c3=rep(NA,5))

# I have an index whose length is always the same as nrow(mydf):
myindex<-c(1,2,3,2,1)

# I need c1 to have 1s in rows 1 and 5 (based on the information in myindex)
# I need c2 to have 1s in rows 2 and 4 (also based on myindex)
# I need c3 to have 1 in row 3
# In other words, I am trying to achieve this result:
mygoal<-data.frame(c1=c(1,NA,NA,NA,1),c2=c(NA,1,NA,1,NA),c3=c(NA,NA,1,NA,NA))

I know how to do it with a loop that runs through rows of mydf.
However, in real life I have a huge data frame with tons of rows, dozens of
columns (instead of 3 in this example) - I am afraid it'll take forever.
Any hint on how to do it faster, maybe using subindexing somehow?

Thank you very much!

-- 
Dimitri Liakhovitski


Re: [R] image color analysis

2013-03-13 Thread ishi soichi

thanks for the tips! your answer will certainly help me a lot!

ishida


2013/3/13 Robert Baer 

> On 3/13/2013 12:05 AM, ishi soichi wrote:
>
>> I am not sure if I should ask this question in this list. But I'll try.
>>
>> Currently I am trying to analyze images using EBImage and biOps.
>> One of the features that I need to extract from various images is the
>> color
>> spectrum, namely, which colors each image consists of.
>>
>> So, each image hopefully can be converted into some sort of color
>> histogram
>> so that color ingredients are easily comparable with each other.
>>
>> There are so many functionalities that these packages and others provide,
>> and I am hoping that someone would give me some guideline for the
>> analysis.
>>
>> Any suggestion?
>>
> Your question is quite general, so I'll make a couple of general comments,
> returning us to R at the end.
>
> You need to read about spectral color systems, for example
> http://www.fourmilab.ch/documents/specrend/.
> You undoubtedly have used filters, whether a Bayer filter typically built
> into a color camera or more specific filters if you have used a monochrome
> camera to collect non-RGB channels.  You need to know what type of
> transforms might have been performed during the storage process.  For
> example, has the image already been transformed to RGB space before
> storage? In fluorescent spectroscopy, for example, it is common to use
> pseudo-coloring so the channels of the stored image may not be directly
> convertible into spectral color without additional information.
>
> R can do all the appropriate matrix algebra once you define the specifics
> of your individual conversion.
>
> Hope this helps,
>
> Rob
>
>
>
>> Thanks.
>>
>> ishida
>>
>>
>
>
> --
>
> Robert W. Baer, Ph.D.
> Professor of Physiology
> Kirksville College of Osteopathic Medicine
> A. T. Still University of Health Sciences
> Kirksville, MO 63501 USA
>
>


Re: [R] glm and lm can't find weights

2013-03-13 Thread Dimitri Liakhovitski

Thanks a lot, Marc!
Dimitri

On Mon, Mar 11, 2013 at 4:28 PM, Marc Schwartz  wrote:

>
> On Mar 11, 2013, at 1:46 PM, Dimitri Liakhovitski <
> dimitri.liakhovit...@gmail.com> wrote:
>
> > Hello, and apologies for not providing an example. However, my question
> > is more general.
> >
> > I have a lengthy function. This function is using another internal
> > function that modifies the data frame I am reading in. This internal
> > function is using the command model.frame (with data and weights inside)
> > and returns a data frame I am using for further analyses.
> > However, when I try to run my function (which has an lm or a glm
> > command) using that new data frame and separately defined weights, I get
> > an error: can't find your weights object.
> > This happens despite the fact that I actually print the weights object
> > right before the glm command. The object is there - I can see it using
> > ls(). I checked and rechecked - there are no typos.
> > Interestingly, this happens only when I run it as a function.
> >
> > When I rename my arguments, go inside the function and run it line by
> > line - I don't get this problem. Clearly, something is happening with my
> > weights in the function environment. I was thinking - can it be that once
> > I've used model.frame - everything else - like glm and lm - is confused
> > as to what the weights are and doesn't want to take the weights I hand
> > over to it but is looking for them elsewhere?
> >
> > Thank you!
> > Dimitri
>
>
> Many discussions on this over the years, along with the 'subset' argument.
> From the Details section of ?lm:
>
>   All of weights, subset and offset are evaluated in the same way as
>   variables in formula, that is first in data and then in the
>   environment of formula.
>
> You might also review:
>
>   http://developer.r-project.org/nonstandard-eval.pdf
>
> This post by Thomas from 2006 may be helpful:
>
>   http://tolstoy.newcastle.edu.au/R/devel/06/06/5869.html
>
> and the reply from Peter also:
>
>   http://tolstoy.newcastle.edu.au/R/devel/06/06/5868.html
>
> Regards,
>
> Marc Schwartz
>
>
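
The evaluation rule quoted from ?lm can be seen in a small sketch. Everything
below is made up for illustration (the data, the function names and the
weights); it only assumes base R. The first function keeps the weights in a
local variable and fails when the formula was created outside the function;
the second stores the weights as a column of the data, which is searched
first.

```r
set.seed(42)
d <- data.frame(x = 1:20, y = rnorm(20))

# The formula is created by the caller, so its environment is the caller's.
# The weights vector exists only inside the function, where lm() never looks:
# weights are looked up in `data`, then in the environment of the formula.
fit_local_weights <- function(fml, data) {
  w_inside <- rep(c(1, 2), length.out = nrow(data))
  lm(fml, data = data, weights = w_inside)
}
res <- tryCatch(fit_local_weights(y ~ x, d),
                error = function(e) conditionMessage(e))
print(res)  # an error message: the weights object was not found

# One workaround: make the weights a column of `data`.
fit_column_weights <- function(fml, data) {
  data$w_col <- rep(c(1, 2), length.out = nrow(data))
  lm(fml, data = data, weights = w_col)
}
fit <- fit_column_weights(y ~ x, d)
```

Another option with the same effect would be to build the formula inside the
function, so that its environment is the one holding the weights.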


-- 
Dimitri Liakhovitski


Re: [R] loop in a data.table

2013-03-13 Thread Steve Lianoglou

Hi,

On Wed, Mar 13, 2013 at 7:25 PM, Camilo Mora  wrote:
> Hi everyone,
>
> I have a data.table called "data" with many columns which I want to group by
> column1 using data.table, given how fast it is.
>
> The problem with looping a data.table is that data.table does not like
> quotations to define the column names (e.g. "col2" instead of col2). I
> found a way around which is to use get("col2"), which works fine but the
> processing time multiplies by 20.
>
> So if I use:
>
> data[,sum(col2),by=(key)]
>
> entering the column names by hand, the operation is done in 1 sec. But if,
> on the contrary, I use:
>
> data[,sum(get("col2")),by=(key)]
>
> using a loop to put the column names, the same operation takes 20 sec. I
> cannot use the former code because I have 10 files to process but the
> latter will simply take months to complete. Is there any alternative to the
> function "get" or any other way in which data.table can recognize the names
> of the columns?

I'm still not sure what you're trying to do. Could you maybe create an
example that's a bit closer to your real data and the stuff you want to
do on it?

Are all the columns of the same type?
Are you just summing columns?

If you post code into an email that reconstructs a small version of
your data.table (maybe 5-10 columns and one or two groups) it'd be
more clear for me.

Thanks,
-steve
-- 
Steve Lianoglou
Defender of The Thesis
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] Modifying a data frame based on a vector that contains column numbers

2013-03-13 Thread William Dunlap

Try looping over columns, as in

fDF <- function (x, column)
{
    stopifnot(length(dim(x))==2, all(column > 0), all(column <= ncol(x)),
              length(column) == nrow(x))
    u <- unique(column)
    tmp <- split(seq_along(column), factor(column, levels = u))
    for (i in seq_along(tmp)) {
        x[ tmp[[i]],  u[i] ] <- 1
    }
    x
}

> fDF(mydf, myindex)
  c1 c2 c3
1  1 NA NA
2 NA  1 NA
3 NA NA  1
4 NA  1 NA
5  1 NA NA

If you use a matrix instead of a data.frame then the following works and is
probably much quicker.

fMat <- function (x, column)
{
    stopifnot(is.matrix(x), all(column > 0), all(column <= ncol(x)),
              length(column) == nrow(x))
    x[cbind(seq_len(nrow(x)), column)] <- 1
    x
}

Your problem may be better represented with sparse matrices (see the Matrix 
package).
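
As a small illustration of the matrix approach on the data from the original
question, here is a sketch in base R. The names m, pos and result are
introduced only for this example; converting back to a data frame at the end
is optional.

```r
mydf <- data.frame(c1 = rep(NA, 5), c2 = rep(NA, 5), c3 = rep(NA, 5))
myindex <- c(1, 2, 3, 2, 1)

# Work on a numeric matrix with the same shape and column names as mydf
m <- matrix(NA_real_, nrow = nrow(mydf), ncol = ncol(mydf),
            dimnames = list(NULL, names(mydf)))

# Each row of `pos` is one (row, column) pair; indexing a matrix by such a
# two-column matrix assigns all of those cells in a single vectorised step
pos <- cbind(seq_len(nrow(m)), myindex)
m[pos] <- 1

result <- as.data.frame(m)
result
```

No loop over rows or columns is needed, which is why this form tends to scale
well to many rows.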

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: [R] Modifying a data frame based on a vector that contains column numbers

2013-03-13 Thread arun

HI,
Try this:
 mydf1<- mydf
 mydf1[]<-lapply(1:3,function(i) {mydf[which(i== myindex),i]<-1; mydf[,i]})
 mydf1
#  c1 c2 c3
#1  1 NA NA
#2 NA  1 NA
#3 NA NA  1
#4 NA  1 NA
#5  1 NA NA


 identical(mydf1,mygoal)
#[1] TRUE
A.K.




Re: [R] 2 questions about svg output

2013-03-13 Thread Ivan Zaigralin

On 03/13/2013 05:16 PM, Paul Murrell wrote:
> On 14/03/13 09:52, Ivan Zaigralin wrote:
>> I use R to plot things in svg format. One of the things is text, of course.  
>>  
>> I noticed that text() in svg() gets saved as path, which is unacceptable
>> for my purposes. (Interestingly, text() in cairo_pdf() gets saved as text.)
>> Is there a way to save text as text in svg?
> 
> You could try the 'gridSVG' package, but that will only work if your graphics
> are grid/lattice/ggplot2.

Thanks, it works. But I have to give up drawing functions like barplot()
and pie()?

>> And paths also is what I plot a lot. I know there is segments(), which plots
>> disconnected segments, and things like polypath(), which create closed paths
>> (and subpaths). These are all very useful, but is there a function to draw
>> a multi-segment path without closing it? That is, without connecting the
>> last vertex to the first one?
> 
> Try lines() (or grid.lines())

Perfect, thanks.

P.S.: Paul, sorry for extra copy.


Re: [R] reshape

2013-03-13 Thread arun

Hi Elisa,

You need to check your data.  For some 'st', the data is repeated/duplicated (I 
am assuming, didn't check it) especially for a particular year.

dat1<-read.csv("elisa.csv",sep="\t")
dat1$st<- as.character(dat1$st)
 str(dat1)
#'data.frame':    506953 obs. of  5 variables:
# $ st   : chr  "AGOMO" "AGOMO" "AGOMO" "AGOMO" ...
# $ year : int  2004 2004 2004 2004 2004 2004 2004 2004 2004 2004 ...
# $ month    : int  1 1 1 1 1 1 1 1 1 1 ...
# $ day  : int  1 2 3 4 5 6 7 8 9 10 ...
# $ discharge: num  6.75 7.6 5.58 5.02 4.8 ...

library(reshape2)
dat2<- do.call(rbind,lapply(split(dat1,list(dat1$st,dat1$year)),function(x) 
{x$day<-seq_along(x$day);x}))
row.names(dat2)<-1:nrow(dat2)
 str(dat2)
#'data.frame':    506953 obs. of  5 variables:
# $ st   : chr  "TANNU" "TANNU" "TANNU" "TANNU" ...
# $ year : int  1935 1935 1935 1935 1935 1935 1935 1935 1935 1935 ...
# $ month    : int  1 1 1 1 1 1 1 1 1 1 ...
# $ day  : int  1 2 3 4 5 6 7 8 9 10 ...
# $ discharge: num  8.06 7.65 7.3 7.3 6.95 6.95 6.6 6.25 5.9 5.61 ...
res1<-lapply(split(dat2,dat2$st),function(x) 
dcast(x,day~year,mean,value.var="discharge"))
 length(res1)
#[1] 124
 which(lapply(res1,nrow)>366) #this is what I mentioned
#CURVO GHIST READO TERCA UZZCO 
#   34    52    82   115   118 
nrow(res1$CURVO)
#[1] 730

 nrow(res1$CURVO)
#[1] 730
 tail(res1$CURVO)
#    day 2005 2006 2007 2008  2010
#725 725  NaN  NaN  NaN  NaN 1.2552028
#726 726  NaN  NaN  NaN  NaN 1.1714796
#727 727  NaN  NaN  NaN  NaN 0.8065988
#728 728  NaN  NaN  NaN  NaN 0.9375889
#729 729  NaN  NaN  NaN  NaN 1.1693797
#730 730  NaN  NaN  NaN  NaN 1.2124010


If the data is duplicated, then you can try this to sort out:
res1[lapply(res1,nrow)>366]<-lapply(res1[lapply(res1,nrow)>366],function(x) 
x[1:366,])
which(lapply(res1,nrow)>366)
#named integer(0)

tail(res1$TERCA)
#    day 2004  2005  2006  2010
#361 361 4.451136 0.1457738 0.7695212 1.2963822
#362 362 4.539952 0.1371580 0.7409566 1.3431055
#363 363 2.133320 0.1179060 0.7325064 1.3905679
#364 364 1.499514 0.1238591 0.7223855 1.3318688
#365 365 1.136719 0.1312868 0.7449733 1.3561590
#366 366 1.004139   NaN   NaN 0.5610225
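
On a small made-up data set, the same idea (number the days within each
station and year, then spread the years across columns) can be sketched in
base R. The station name "TOY" and the discharge values are invented for the
illustration; ave() and tapply() are used here in place of the split/dcast
calls above.

```r
# One station, one normal year (365 days) and one leap year (366 days)
dat <- data.frame(
  st = "TOY",
  year = rep(c(2007, 2008), times = c(365, 366)),
  discharge = seq_len(731) / 100
)

# Day-of-year counter restarted within every station/year combination
dat$day <- ave(seq_len(nrow(dat)), dat$st, dat$year, FUN = seq_along)

# Rows = day 1..366, columns = years; cells with no data become NA
wide <- tapply(dat$discharge, list(day = dat$day, year = dat$year), mean)

dim(wide)
tail(wide, 2)
```

Row 366 is NA for the normal year, which is the layout asked for in the
original request.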


A.K.




From: eliza botto 
To: "smartpink...@yahoo.com"  
Sent: Wednesday, March 13, 2013 7:04 PM
Subject: reshape



Dear Arun,
[A text file is attached in case the file format changes; the Excel
file contains the data.]
A few days ago, I asked you a question about reshaping the data in the file
attached to this email.
Using the following command

library(reshape2)
 res<-lapply(split(dat1,dat1$st),function(x) 
dcast(x,month~year,mean,value.var="discharge"))

The data was converted to following shape

$VERRO
   month       2008      2009      2010
1      1  1.4737028  2.314878  2.672661
2      2  1.6700918  2.609722  2.112421
3      3  3.2387775  7.305766  6.939536
4      4  6.7063592 18.745256 13.278218
5      5 13.4666085 21.061198 12.185597
6      6  9.7578872  8.642275  6.973769
7      7  2.2772154  3.436705  3.357440
8      8  1.1911175  2.300386  1.994471
9      9  0.8528279  2.653862  1.242462
10    10  0.7581712  1.956000  1.753606
11    11  3.6661192  3.406465 10.984185
12    12  2.2877993  3.736377  2.312527

$VOBIC
   month      2005      2006      2008      2009
1      1 1.7360776 0.8095275 1.6369044 0.8195241
2      2 0.6962079 3.8510720 0.4319758 2.3304495
3      3 1.0423625 2.7687266 0.2904245 0.7015527
4      4 2.4158326 1.2315324 1.4287387 1.5701019
5      5 1.6852624 0.8981553 0.2336609 2.4217805
6      6 0.7311802 0.5769895 0.5988320 1.0135255
7      7 0.4366588 0.5744698 0.2393858 0.6352815
8      8 0.2259192 1.0504790 0.1727715 0.4938100
9      9 0.3061264 0.7786805 0.1607744 0.5704136
10    10 0.1007935 0.1449439 1.3299596 0.6592695
11    11 0.4050701 0.2634931 3.4369699 2.8625800
12    12 2.5665737 3.5129255 3.7440045 6.7098572

I now want to reshape the data so that, instead of the month column, there is a
"day" column with the discharge values for each day listed against it.
Note that there can be leap years in the data. Therefore, the day column should
have 366 rows, and the discharge value in the 366th row should be NA for
normal years.
The output should look roughly like the following

day       2007       2008       2009       2012
1    1.0648145 0.73960716 0.70947675  1.0910787
2    1.0388346 0.72153378 0.64213905 0.98537076
3   0.99432135 0.72491018 0.60417222 0.92753464
4   0.96418346 0.71316488 0.62654696 0.93985833
5    1.0119546 0.72153378 0.60165515  1.0182268
6   0.99026702  0.7068305 0.61823896    1.01867
7    0.9676358 0.62778384 0.62427922  1.0029522
8   0.98476026 0.58940372 0.67505334  1.0979852
9   0.99294228 0.59652318  0.6378591  1.1139438
10  0.96086643 0.62360928 0.63106504  1.1213348
11  0.96133675 0.68554346 0.63205974  1.0227273
12  0.91080223 0.69229627 0.67980513 0.99130809
13  0.89821033 0.69543257 0.62501388 0.98488831
14   0.9045145 0.69728876 0.65290623  1.0115222
15  0.97147614 0.72153378 0.63131418 0.97627949
16  0.9

[R] loop in a data.table

2013-03-13 Thread Camilo Mora


I would like to clarify my previous email about using data.table.

imagine the following data.frame called "data":

a   b   c   d  e
1  12  15  65  6
1  65  85  36  5
2  69  84  35  8
2  45  78  65  8

I want to aggregate the rows of columns b:d by the rows of column a.
The aggregation is sum(col[b:d])/sum(col[e]).

For this I am using a data.table with a loop of the form:

##

ColNames<-colnames(data)   #gets the names of the columns

x=ncol(data)-1#number of columns to process minus the last column.

data<-data.table(data) #converts to data.table


for (z in 2:x)  #I start the loop in the second column and finish in column d
{
outputdata<-data[, sum(get(ColNames[z]))/sum(e), by="a"]
}



This works fine, but the function "get" slows down the aggregation of the
rows by about 20 times. I wonder if there is an alternative function
to "get" or an alternative way to aggregate all columns at once. I am
reading about .SD but have not yet figured out how to put
more than one operation in the function.


right now I have:
###
outputdata=data[, lapply(.SD, sum), by="a", .SDcols=2:x]

##
This latter code aggregates all columns at once, but only by summing.
Eventually I need to divide the sum of each column by the sum of
column e as well.


Any help will be greatly appreciated.

Thanks,

Camilo






Camilo Mora, Ph.D.
Department of Geography, University of Hawaii
Currently available in Colombia
Phone:   Country code: 57
 Provider code: 313
 Phone 776 2282
 From the USA or Canada you have to dial 011 57 313 776 2282
http://www.soc.hawaii.edu/mora/


Re: [R] Add a continuous color ramp legend to a 3d scatter plot

2013-03-13 Thread Marc Girondot

Hi,

Try this.
Sincerely,

Marc

x <- rnorm(128, 10, 2)
y <- rnorm(128, 10, 2)

z <- x+y

nbcol <- heat.colors(128)

# standardize z to be from 1 to 128
zcol <-  ((z-min(z))/(max(z)-min(z)))*127+1

library(scatterplot3d)
library(fields)

scatterplot3d(x,y,z, pch=16,color=nbcol[zcol], grid=FALSE, box=FALSE, 
mar=c(5, 3, 5, 7)+0.1)

par(mar=c(5, 4, 4, 2) + 0.1)

image.plot( legend.only=TRUE, zlim= c(min(z), max(z)), nlevel=128, 
col=heat.colors(128))




On 13/03/13 17:36, Zhuoting Wu wrote:
> Thanks Marc! I tried the colorbar.plot and image.plot.
>
> The colorbar.plot gives color bars within the plot, but I want a color 
> bar legend on the side of the plot.
>
> The image.plot gives a legend that overlaps the plot, and the scale 
> doesn't match the 3d scatterplot at all (see attached).
>
> Here's my R script:
>
> cols <- myColorRamp(c(topo.colors(10)),z)
> scatterplot3d(x,y,z, pch=16,color=cols, grid=FALSE, box=FALSE)
> zr<- range(c(z))
> image.plot(legend.only=TRUE,col=cols, zlim=zr)
>
> I wanted to have a color ramp legend based on z on the side of the plot.
>
> I'll greatly appreciate any help!
>
> thanks,
> Z
>
> On Wed, Mar 13, 2013 at 2:11 AM, Marc Girondot  > wrote:
>
> On 12/03/13 23:43, Zhuoting Wu wrote:
> > I have a 3 column dataset x,y,z, and I plotted a 3d scatter plot
> using:
> >
> > cols <- myColorRamp(c(topo.colors(10)),z)
> > plot3d(x=x, y=y, z=z, col=cols)
> >
> > I wanted to add a legend to the 3d plot showing the color ramp.
> Any help
> > will be greatly appreciated!
> >
> >
> Look at the package fields:
> ?colorbar.plot
>
> Marc
>
> --
> __
> Marc Girondot, Pr
>
> Laboratoire Ecologie, Systématique et Evolution
> Equipe de Conservation des Populations et des Communautés
> CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
> Bâtiment 362
> 91405 Orsay Cedex, France
>
> Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
> e-mail: marc.giron...@u-psud.fr 
> Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
> Skype: girondot
>
>


-- 
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot



Re: [R] loop in a data.table

2013-03-13 Thread arun

Hi,

May be this helps:

dat1<- read.table(text="
a    b  c  d    e
1    12    15    65    6
1    65    85    36    5
2    69    84    35    8
2    45    78    65    8
",sep="",header=TRUE)
library(data.table)
 dat2<- data.table(dat1)
 dat2[,head(sapply(.SD,sum)/sapply(.SD,sum)[4],-1),by="a"]
#   a    V1
#1: 1  7.00
#2: 1  9.090909
#3: 1  9.181818
#4: 2  7.125000
#5: 2 10.125000
#6: 2  6.25


outputdat<-list()
 ColNames<-colnames(dat2)
 x<- ncol(dat2)-1
 for(z in 2:x)
 {
 outputdat[[z]]<-dat2[,sum(get(ColNames[z]))/sum(e),by="a"]
 }


do.call(rbind,outputdat)
#   a    V1
#1: 1  7.00
#2: 2  7.125000
#3: 1  9.090909
#4: 2 10.125000
#5: 1  9.181818
#6: 2  6.25
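
If one column per variable is preferred over the stacked V1 output, a variant
with .SDcols might look like the sketch below. It assumes the data.table
package is installed and rebuilds the example table by hand; the names dt,
cols and out are introduced only for this illustration. It avoids get()
altogether, since the column names are passed as a character vector.

```r
library(data.table)

dt <- data.table(a = c(1, 1, 2, 2),
                 b = c(12, 65, 69, 45),
                 c = c(15, 85, 84, 78),
                 d = c(65, 36, 35, 65),
                 e = c(6, 5, 8, 8))

cols <- c("b", "c", "d")   # columns to aggregate, given as strings

# For each group of `a`: the sum of every column in `cols`,
# divided by that group's sum of column e
out <- dt[, lapply(.SD, function(v) sum(v) / sum(e)), by = a, .SDcols = cols]
out
```

The character vector cols could equally be built from colnames(dt), which is
what makes this form convenient when the same step has to be repeated over
many files.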
A.K.

