[R] Skip error in downloading file in loop

2017-10-21 Thread Miluji Sb
I am trying to download data from NASA web-service.

I am using the following code;

for (i in 1:8) {
  target1 <- paste0(
    "https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi",
    "?variable=NLDAS:NLDAS_FORA0125_H.002:TMP2m",
    "&location=GEOM:POINT(", cities[i, "lon_nldas"],
    ",%20", cities[i, "lat_nldas"], ")",
    "&startDate=1980-01-01T00&endDate=2016-12-31T23&type=asc2")
  # change for whatever destination directory you may prefer
  target2 <- paste0("/Users/dasgupta/Dropbox (FEEM)/Flu Paper/climate data/temperature/nldas/",
                    cities[i, "city"], "_", cities[i, "state"], ".csv")
  download.file(url = target1, destfile = target2, method = "libcurl")
}

Any time the coordinates provided fall outside the data coverage, the
download fails and the loop stops. Is there any way to force R to skip the
error and continue with the rest of the downloads?
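A standard way to keep a loop going past a failing iteration is to wrap the failing call in tryCatch(). A minimal sketch of the loop body above (with target1 and target2 built exactly as in the original code):

```r
for (i in 1:8) {
  # build target1 and target2 as above, then:
  tryCatch(
    download.file(url = target1, destfile = target2, method = "libcurl"),
    error = function(e) {
      # report which city failed and move on instead of stopping the loop
      message("Skipping ", cities[i, "city"], ": ", conditionMessage(e))
    }
  )
}
```

Note that download.file() sometimes signals a warning rather than an error for a failed request; adding a warning handler to the tryCatch() (or using try()) covers that case as well.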

My data looks like this:

dput(droplevels(head(cities, 8)))
structure(list(city = structure(1:8, .Label = c("Boston", "Bridgeport",
"Cambridge", "Fall River", "Hartford", "Lowell", "Lynn", "New Bedford"
), class = "factor"), state = structure(c(2L, 1L, 2L, 2L, 1L,
2L, 2L, 2L), .Label = c(" CT ", " MA "), class = "factor"), lon_nldas =
c(-71.05673836,
-73.19126922, -71.1060061, -71.14364822, -72.67401574, -71.31283992,
-70.82521832, -70.80586599), lat_nldas = c(42.35866236, 41.18188268,
42.36679363, 41.69735342, 41.76349276, 42.64588413, 42.46370197,
41.63767375)), .Names = c("city", "state", "lon_nldas", "lat_nldas"
), row.names = c(NA, 8L), class = "data.frame")

Any help will be appreciated. Thank you very much!

Sincerely,

Milu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error in Zero inflated model (ziP) with bam

2017-11-06 Thread Miluji Sb
Dear all,

I am trying to use 'bam' to run the following generalized additive model:

###
m <- bam(result ~ factor(city) + factor(year) + lnpopulation + s(lnincome_pc),
         data = full_df, na.action = na.omit,
         family = ziP(theta = NULL, link = "identity", b = 0))

But I get the following error:

###
Error in bam(result :
  extended families not supported by bam

The documentation for 'bam' mentions the following "This is a family object
specifying the distribution and link to use in fitting etc. See glm and
family for more details. The extended families listed in family.mgcv can
also be used."

The family.mgcv does include ziP. What am I doing wrong? Any guidance will
be appreciated. Thank you!
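For what it may be worth, the error text itself says extended families are not supported by bam(), and ziP() is one of the extended families listed in family.mgcv. One hedged workaround (slower on large data) is to fit the same model with gam(), which does accept extended families:

```r
library(mgcv)

# same formula and family, but with gam() instead of bam()
m <- gam(result ~ factor(city) + factor(year) + lnpopulation + s(lnincome_pc),
         data = full_df, na.action = na.omit,
         family = ziP(theta = NULL, link = "identity", b = 0))
```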

Sincerely,

Milu



Re: [R] Error in Zero inflated model (ziP) with bam

2017-11-06 Thread Miluji Sb
Thanks for your reply. Yes, installed (reinstalled) and loaded. I am a bit
baffled...

Sincerely,

Milu

On Mon, Nov 6, 2017 at 6:30 PM, Bert Gunter  wrote:

> Do you have the mgcv package installed (I think it's part of the standard
> distro, though) /loaded? ziP is there, not in BAM.
>
> Other than that, sorry, no clue.
>
> Cheers,
> Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>


[R] GAM Poisson

2017-12-14 Thread Miluji Sb
Dear all,

I apologize as this may not be a strictly R question. I am running GAM
models using the mgcv package.

I was wondering if the interpretation of the smooth splines of the 'x'
variable is the same in the following two cases:

# Gaussian (identity link) model
m1 <- gam(count ~ factor(city) + factor(year) + s(x),
          data = data, na.action = na.omit)

# Poisson
m2 <- gam(count ~ factor(city) + factor(year) + s(x),
          data = data, family = poisson, na.action = na.omit)

Thank you!
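One point worth keeping in mind (a sketch of an answer, not a definitive one): with family = poisson the default link is log, so the smooth of x in m2 is estimated on the log-mean scale, while in m1 it is on the scale of the response, so the fitted shapes are not directly comparable. mgcv's plot.gam() can back-transform a smooth for visual comparison:

```r
library(mgcv)

# m1's smooth is on the response scale; m2's is on log E[count].
# trans = exp re-expresses the m2 smooth on the response scale.
plot(m2, trans = exp, shade = TRUE)
```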

Sincerely,

Milu



[R] Take the maximum of every 12 columns

2018-02-20 Thread Miluji Sb
 Dear all,

I have monthly data in wide format; I am only providing data (at the bottom
of the email) for the first 24 columns, but I have 2880 columns in total.

I would like to take the max of every 12 columns. I have taken the mean of
every 12 columns with the following code:

byapply <- function(x, by, fun, ...)
{
  # Create index list
  if (length(by) == 1)
  {
nc <- ncol(x)
split.index <- rep(1:ceiling(nc / by), each = by, length.out = nc)
  } else # 'by' is a vector of groups
  {
nc <- length(by)
split.index <- by
  }
  index.list <- split(seq(from = 1, to = nc), split.index)

  # Pass index list to fun using sapply() and return object
  sapply(index.list, function(i)
  {
do.call(fun, list(x[, i], ...))
  })
}

## Compute annual means
y <- byapply(df, 12, rowMeans)

How can I replace rowMeans with a function that takes the row-wise maximum?
I am a bit baffled. Any help will be appreciated. Thank you.

Sincerely,

Milu

###
dput(droplevels(head(x, 5)))
structure(list(X0 = c(295.812103271484, 297.672424316406, 299.006805419922,
297.631500244141, 298.372741699219), X1 = c(295.361328125,
297.345092773438,
298.067504882812, 297.285339355469, 298.275268554688), X2 =
c(294.279602050781,
296.401550292969, 296.777984619141, 296.089111328125, 297.540374755859
), X3 = c(292.103118896484, 294.253601074219, 293.773803710938,
293.916229248047, 296.129943847656), X4 = c(288.525024414062,
291.274505615234, 289.502777099609, 290.723388671875, 293.615112304688
), X5 = c(286.018371582031, 288.748565673828, 286.463134765625,
288.393951416016, 291.710266113281), X6 = c(285.550537109375,
288.159149169922, 285.976501464844, 287.999816894531, 291.228271484375
), X7 = c(289.136962890625, 290.751159667969, 290.170257568359,
291.796203613281, 293.423248291016), X8 = c(292.410003662109,
292.701263427734, 294.25244140625, 295.320404052734, 295.248199462891
), X9 = c(293.821655273438, 294.139068603516, 296.630157470703,
296.963531494141, 296.036224365234), X10 = c(294.532531738281,
295.366607666016, 297.677551269531, 296.715911865234, 296.564178466797
), X11 = c(295.869476318359, 297.010070800781, 299.330169677734,
297.627593994141, 297.964935302734), X12 = c(295.986236572266,
297.675567626953, 299.056671142578, 297.598907470703, 298.481842041016
), X13 = c(295.947601318359, 297.934448242188, 298.745391845703,
297.704925537109, 298.819091796875), X14 = c(294.654327392578,
296.722717285156, 297.0986328125, 296.508239746094, 297.822021484375
), X15 = c(292.176361083984, 294.49658203125, 293.888305664062,
294.172149658203, 296.117095947266), X16 = c(288.400726318359,
291.029113769531, 289.361907958984, 290.566772460938, 293.554016113281
), X17 = c(285.665222167969, 288.293029785156, 286.118957519531,
288.105285644531, 291.429382324219), X18 = c(285.971252441406,
288.3798828125, 286.444580078125, 288.495880126953, 291.447326660156
), X19 = c(288.79296875, 290.357543945312, 289.657928466797,
291.449066162109, 293.095275878906), X20 = c(291.999877929688,
292.838348388672, 293.840362548828, 294.412322998047, 294.941253662109
), X21 = c(293.615447998047, 294.028106689453, 296.072296142578,
296.447387695312, 295.824615478516), X22 = c(294.719848632812,
295.392028808594, 297.453216552734, 297.114288330078, 296.883209228516
), X23 = c(295.634429931641, 296.783294677734, 298.592346191406,
297.469512939453, 297.832122802734)), .Names = c("X0", "X1",
"X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10", "X11",
"X12", "X13", "X14", "X15", "X16", "X17", "X18", "X19", "X20",
"X21", "X22", "X23"), row.names = c(NA, 5L), class = "data.frame")




Re: [R] Take the maximum of every 12 columns

2018-02-20 Thread Miluji Sb
Thank you for your kind replies. Maybe I was not clear with my question (I
apologize) or I did not understand...

I would like to take the max for X0...X11, X12...X23, and so on in my
dataset. When I use pmax with the function byapply, as in

byapply(df, 12, pmax)

I get back a list which I cannot convert to a dataframe. Am I missing
something? Thanks again!
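One possible explanation, building on the byapply() code above: byapply() passes each 12-column block to fun as a single data-frame argument, and pmax() called on one argument just returns that argument, so sapply() gets back a list of data frames it cannot simplify. Splicing the columns in with do.call() makes each call return a vector of row-wise maxima, and the result simplifies to a matrix:

```r
# each block becomes pmax(col1, col2, ..., col12): a vector of row maxima,
# so sapply() inside byapply() simplifies the result to a matrix
y <- byapply(df, 12, function(block) do.call(pmax, block))
y <- as.data.frame(y)  # if a data frame is preferred
```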

Sincerely,

Milu


On Tue, Feb 20, 2018 at 4:38 PM, Bert Gunter  wrote:

> Don't do this (sorry Thierry)! max() already does this -- see ?max
>
> > x <- data.frame(a =rnorm(10), b = rnorm(10))
> > max(x)
> [1] 1.799644
>
> > max(sapply(x,max))
> [1] 1.799644
>
> Cheers,
> Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Tue, Feb 20, 2018 at 7:05 AM, Thierry Onkelinx <
> thierry.onkel...@inbo.be> wrote:
>
>> The maximum over twelve columns is the maximum of the twelve maxima of
>> each of the columns.
>>
>> single_col_max <- apply(x, 2, max)
>> twelve_col_max <- apply(
>>   matrix(single_col_max, nrow = 12),
>>   2,
>>   max
>> )
>>
>> ir. Thierry Onkelinx
>> Statisticus / Statistician
>>
>> Vlaamse Overheid / Government of Flanders
>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
>> AND FOREST
>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>> thierry.onkel...@inbo.be
>> Havenlaan 88
>> <https://maps.google.com/?q=Havenlaan+88&entry=gmail&source=g> bus 73,
>> 1000 Brussel
>> www.inbo.be
>>
>> 
>> ///
>> To call in the statistician after the experiment is done may be no
>> more than asking him to perform a post-mortem examination: he may be
>> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
>> The plural of anecdote is not data. ~ Roger Brinner
>> The combination of some data and an aching desire for an answer does
>> not ensure that a reasonable answer can be extracted from a given body
>> of data. ~ John Tukey
>> 
>> ///
>>
>>
>>
>>

Re: [R] Take the maximum of every 12 columns

2018-02-20 Thread Miluji Sb
This is what I was looking for. Thank you everyone!

Sincerely,

Milu


On Tue, Feb 20, 2018 at 5:10 PM, Ista Zahn  wrote:

> Hi Milu,
>
> byapply(df, 12, function(x) apply(x, 1, max))
>
> You might also be interested in the matrixStats package.
>
> Best,
> Ista
>

Re: [R] Take the maximum of every 12 columns

2018-02-21 Thread Miluji Sb
Dear Henrik,

This is great - thank you so much!

Sincerely,

Milu

On Tue, Feb 20, 2018 at 10:12 PM, Henrik Bengtsson <
henrik.bengts...@gmail.com> wrote:

> It looks like OP uses a data.frame, so in order to use matrixStats
> (I'm the author) one would have to pay the price to coerce to a matrix
> before using matrixStats::rowMaxs().  However, if it is that the
> original data could equally well live in a matrix, then matrixStats
> should be computational efficient for this task.  (I've seen cases
> where an original matrix was turned into a data.frame just because
> that is what is commonly used elsewhere and because the user may not
> pay attention to the differences between matrices and data.frame.)
>
> If the original data would be a matrix 'X', then one can do the
> following with matrixStats:
>
> Y <- sapply(seq(from = 0, to = 2868, by = 12), FUN = function(offset) {
>rowMaxs(X, cols = offset + 1:12)
> })
>
> which avoids internal temporary copies required when using regular
> subsetting, e.g.
>
> Y <- sapply(seq(from = 0, to = 2868, by = 12), FUN = function(offset) {
>rowMaxs(X[, offset + 1:12])
> })
>
> Subsetting data frames by columns is already efficient, so the same
> argument does not apply there.
>
> /Henrik
>
> On Tue, Feb 20, 2018 at 10:00 AM, Ista Zahn  wrote:
> > On Tue, Feb 20, 2018 at 11:58 AM, Bert Gunter 
> > wrote:
> >
> >> Ista, et. al: efficiency?
> >> (Note: I needed to correct my previous post: do.call() is required for
> >> pmax() over the data frame)
> >>
> >> > x <- data.frame(matrix(runif(12e6), ncol=12))
> >>
> >> > system.time(r1 <- do.call(pmax,x))
> >>user  system elapsed
> >>   0.049   0.000   0.049
> >>
> >> > identical(r1,r2)
> >> [1] FALSE
> >> > system.time(r2 <- apply(x,1,max))
> >>user  system elapsed
> >>   2.162   0.045   2.207
> >>
> >> ## 150 times slower!
> >>
> >> > identical(r1,r2)
> >> [1] TRUE
> >>
> >> pmax() is there for a reason.
> >> Or is there something I am missing?
> >>
> >
> >
> > Personal preference I think. I prefer the consistency of apply. If speed
> > is an issue matrixStats is both consistent and fast:
> >
> > library(matrixStats)
> > x <- matrix(runif(12e6), ncol=12)
> >
> > system.time(r1 <- do.call(pmax,as.data.frame(x)))
> >   ##  user  system elapsed
> >   ## 0.109   0.000   0.109
> > system.time(r2 <- apply(x,1,max))
> >   ##  user  system elapsed
> >   ## 1.292   0.024   1.321
> > system.time(r3 <- rowMaxs(x))
> >   ##  user  system elapsed
> >   ## 0.044   0.000   0.044
> >
> > pmax is a fine alternative for max special case.
> >
> > Best,
> > Ista
> >
> >
> >
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> and
> >> sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>

[R] Take average of previous weeks

2018-03-25 Thread Miluji Sb
Dear all,

I have weekly data by city (variable citycode). I would like to take the
average of the previous two, three, or four weeks (excluding the current
week) of the variable called value.

This is what I have tried to compute the average of the two previous weeks:

df <- df %>%
  mutate(value.lag1 = lag(value, n = 1)) %>%
  mutate(value.2.previous = rollapply(data = value.lag1,
                                      width = 2,
                                      FUN = mean,
                                      align = "right",
                                      fill = NA,
                                      na.rm = TRUE))

I created the lag of the variable first and then attempted to compute the
rolling average, but this does not seem to do what I want. What am I doing
wrong? Any help will be appreciated. The data is below. Thank you.

Sincerely,

Milu

dput(droplevels(head(df, 10)))
structure(list(year = c(1970L, 1970L, 1970L, 1970L, 1970L, 1970L,
1970L, 1970L, 1970L, 1970L), citycode = c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), month = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 3L), week = c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 7L, 8L, 9L), date =
structure(c(1L,
2L, 3L, 4L, 5L, 5L, 6L, 7L, 8L, 9L), .Label = c("1970-01-10",
"1970-01-17", "1970-01-24", "1970-01-31", "1970-02-07", "1970-02-14",
"1970-02-21", "1970-02-28", "1970-03-07"), class = "factor"),
value = c(-15.035, -20.478, -22.245, -23.576, -8.840995,
-18.497, -13.892, -18.974, -15.919, -13.576)), .Names = c("year",
"citycode", "month", "week", "date", "tmin"), row.names = c(NA,
10L), class = "data.frame")
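As a side note, two things in the attempt above are worth checking: the rolling mean is computed across city boundaries unless the data are grouped first, and the dput output names the value column tmin rather than value. A hedged dplyr/zoo sketch grouping by citycode (mean.prev2 is a hypothetical column name):

```r
library(dplyr)
library(zoo)

df <- df %>%
  group_by(citycode) %>%
  arrange(week, .by_group = TRUE) %>%
  # lag() drops the current week; rollapplyr() is right-aligned rollapply()
  mutate(mean.prev2 = rollapplyr(lag(value), width = 2, FUN = mean,
                                 fill = NA, na.rm = TRUE)) %>%
  ungroup()
```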



Re: [R] Take average of previous weeks

2018-03-26 Thread Miluji Sb
Dear Bert,

Thank you very much. This works. I was wondering whether the fact that I
want to create new variables (sorry for not stating that) makes any
difference? Thank you again.

Sincerely,

Milu

On Sun, Mar 25, 2018 at 10:05 PM, Bert Gunter 
wrote:

> I am sure that this sort of thing has been asked and answered before,
> so in case my suggestions don't work for you, just search the archives
> a bit more.
> I am also sure that it can be handled directly by numerous functions
> in numerous packages, e.g. via time series methods or by calculating
> running means of suitably shifted series.
>
> However, as it seems to be a straightforward task, I'll provide what I
> think is a simple solution in base R. Adjust to your situation.
>
> ## First I need a little utility function to offset rows. Lots of ways
> to do this,many nicer than this I'm sure.
>
> > shift <- function(x,k)
> +## x is a vector of values -- e.g. of a column in your df
> + {
> +sapply(seq_len(k),function(i)c(rep(NA,i),head(x,-i)))
> + }
> >
> >
> > ## Testit
> > x <- c(1,3,5,7,8:11)
> > m <- shift(x,3) ## matrix of prior values up to lag 3
> > m ## note rows have been omitted where lags don't exist
>  [,1] [,2] [,3]
> [1,]   NA   NA   NA
> [2,]1   NA   NA
> [3,]31   NA
> [4,]531
> [5,]753
> [6,]875
> [7,]987
> [8,]   1098
> > rowMeans(m) ## means of previous 3
> [1]   NA   NA   NA 3.00 5.00 6.67 8.00 9.00
> > rowMeans(m[,1:2]) ## means of previous 2
> [1]  NA  NA 2.0 4.0 6.0 7.5 8.5 9.5
>
>
> Cheers,
> Bert
>
>
>
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>



[R] Overlay line on a bar plot - multiple axis

2018-04-29 Thread Miluji Sb
Dear all,

I am trying to make a plot similar to this one:
https://peltiertech.com/images/2013-09/BarLineSampleChart4.png.

I have data for two variables, count and z, by city and week. I would like
a horizontal bar plot of *count* by city and a line plot of the weekly
average of the variable *z*.

I have tried the following:

ggplot() +
  geom_bar(data = dat, aes(x = city, y = count),
           stat = "identity", position = "dodge") +
  coord_flip() +
  geom_line(data = dat, aes(x = week, y = mean_tmin))

However, this flips the axes for both the bar plot and the line plot. I
would like the y-axis of the line plot to be on the left vertical axis. I
read that it is difficult to have a secondary axis with ggplot. Any help
will be highly appreciated. Thank you!
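For what it is worth, ggplot2 (2.2.0 and later) does support a transformed secondary axis via sec_axis(). A hedged sketch along the lines of the plot above, where scale_factor is an ad-hoc rescaling so the line is visible against the bars:

```r
library(ggplot2)

# rescale mean_tmin onto the count axis, then label a secondary axis back
scale_factor <- max(dat$count) / max(dat$mean_tmin)

ggplot(dat) +
  geom_col(aes(x = week, y = count, fill = city), position = "dodge") +
  geom_line(aes(x = week, y = mean_tmin * scale_factor)) +
  scale_y_continuous(sec.axis = sec_axis(~ . / scale_factor,
                                         name = "weekly mean of z")) +
  coord_flip()
```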

dput(dat)
structure(list(city = structure(c(4L, 5L, 1L, 3L, 2L, 4L, 3L,
1L, 2L, 5L, 3L, 1L, 2L, 4L, 5L), .Label = c("Akron", "Boston",
"Houston", "NYC", "OKC"), class = "factor"), week = c(1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), count = c(2.8,
2.7, 3.1, 2.5, 3.5, 5.3, 4.5, 9.5, 2.7, 2.1, 4.5, 9.5, 2.7, 5.3,
2.1), z = c(-4.1, 1.7, 1.5, 12.8, 1.9, 4, 11.2, 1.4, 2, 4, 10.9,
1.4, 1.7, 3.9, 4.3), City = structure(c(4L, 5L, 1L, 3L, 2L, 4L,
3L, 1L, 2L, 5L, 3L, 1L, 2L, 4L, 5L), .Label = c("Akron", "Boston",
"Houston", "NYC", "OKC"), class = "factor"), Count = c(4.47,
2.3, 7.37, 3.83, 2.97, 4.47, 3.83, 7.37,
2.97, 2.3, 3.83, 7.37, 2.97, 4.47, 2.3),
mean_tmin = c(2.76, 2.76, 2.76, 2.76, 2.76, 4.52, 4.52, 4.52,
4.52, 4.52, 4.44, 4.44, 4.44, 4.44, 4.44)), .Names = c("city",
"week", "count", "z", "City", "Count", "mean_tmin"), class = "data.frame",
row.names = c(NA,
-15L))



Re: [R] Overlay line on a bar plot - multiple axis

2018-04-30 Thread Miluji Sb
Dear Jim and Ron,

Thank you very much. Both the solutions are very neat and working.
Appreciate all your help.

Sincerely,

Milu

On Mon, Apr 30, 2018 at 12:28 PM, Ron Crump  wrote:

> Hi Miluji,
>
> Using Jim's interpretation of your desired graph,
> you could do it in ggplot2 using your dat DF by:
>
> ggplot() +
>  geom_bar(data=dat,
>aes(x=week,y=count,fill=city),stat="identity",position="dodge") +
>  coord_flip() +
>  geom_line(data=dat, aes(x=week, y=mean_tmin))
>
> There would still need some work to be done to get the weekly mean
> into a legend, but it is achievable if required.
>
> Ron.
>



[R] Bilateral matrix

2018-05-08 Thread Miluji Sb
I have data on the current and previous locations of individuals. I would
like a matrix of bilateral movements between locations; the final output
should look like the second table below.

I have tried crosstab() from the ecodist package, but I do not have a
separate variable measuring the flow. Ultimately I would like to compute
the probability of movement between cities (movement to city_i / total
movement from city_j).

Is it possible to aggregate the data in this way? Any guidance would be
highly appreciated. Thank you!

# Original data
structure(list(id = 101:115, current_location = structure(c(2L,
8L, 8L, 3L, 6L, 5L, 1L, 2L, 7L, 4L, 2L, 8L, 8L, 3L, 6L), .Label =
c("Austin",
"Boston", "Cambridge", "Durham", "Houston", "Lynn", "New Orleans",
"New York"), class = "factor"), previous_location = structure(c(6L,
2L, 4L, 6L, 7L, 5L, 1L, 3L, 6L, 2L, 6L, 2L, 4L, 6L, 7L), .Label =
c("Atlanta",
"Austin", "Cleveland", "Houston", "New Orleans", "OKC", "Tulsa"
), class = "factor")), class = "data.frame", row.names = c(NA,
-15L))

# Expected output
structure(list(X = structure(c(3L, 1L, 2L), .Label = c("Austin",
"Houston", "OKC"), class = "factor"), Boston = c(2L, NA, NA),
New.York = c(NA, 2L, 2L), Cambridge = c(2L, NA, NA)), class =
"data.frame", row.names = c(NA,
-3L))
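For reference, a minimal base-R sketch of the probability step described above, using the dput data and assuming row-wise shares (moves out of each previous location) are what is wanted:

```r
# Rebuild the example data from the dput above
dat <- structure(list(id = 101:115, current_location = structure(c(2L,
  8L, 8L, 3L, 6L, 5L, 1L, 2L, 7L, 4L, 2L, 8L, 8L, 3L, 6L), .Label = c("Austin",
  "Boston", "Cambridge", "Durham", "Houston", "Lynn", "New Orleans",
  "New York"), class = "factor"), previous_location = structure(c(6L,
  2L, 4L, 6L, 7L, 5L, 1L, 3L, 6L, 2L, 6L, 2L, 4L, 6L, 7L), .Label = c("Atlanta",
  "Austin", "Cleveland", "Houston", "New Orleans", "OKC", "Tulsa"
  ), class = "factor")), class = "data.frame", row.names = c(NA, -15L))

# Cross-tabulate moves, then turn each row into probabilities
moves <- table(dat$previous_location, dat$current_location)
probs <- prop.table(moves, margin = 1)   # each row sums to 1
round(probs, 2)
```

prop.table(moves, margin = 1) divides each row by its row total, so entry (j, i) is the share of movers from city_j who went to city_i.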

Sincerely,

Milu



Re: [R] Bilateral matrix

2018-05-16 Thread Miluji Sb
Dear Bert and Huzefa,

Apologies for the late reply; my account got hacked and I have just managed
to recover it.

Thank you very much for your replies and the solutions. Both work well.

I was wondering if there was any way to ensure (force) that all possible
combinations show up in the output. The full dataset has 25 cities but of
course people have not moved from Boston to all the other 24 cities. I
would like all the combinations if possible.

Thank you again!

Sincerely,

Milu

On Tue, May 8, 2018 at 6:28 PM, Bert Gunter  wrote:

> or in base R : ?xtabs??
>
> as in:
> xtabs(~previous_location + current_location,data=x)
>
> (You can convert the 0s to NA's if you like)
>
>
> Cheers,
> Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Tue, May 8, 2018 at 9:21 AM, Huzefa Khalil 
> wrote:
>
>> Dear Miluji,
>>
>> If I understand correctly, this should get you what you need.
>>
>> temp1 <-
>> structure(list(id = 101:115, current_location = structure(c(2L,
>> 8L, 8L, 3L, 6L, 5L, 1L, 2L, 7L, 4L, 2L, 8L, 8L, 3L, 6L), .Label =
>> c("Austin",
>> "Boston", "Cambridge", "Durham", "Houston", "Lynn", "New Orleans",
>> "New York"), class = "factor"), previous_location = structure(c(6L,
>> 2L, 4L, 6L, 7L, 5L, 1L, 3L, 6L, 2L, 6L, 2L, 4L, 6L, 7L), .Label =
>> c("Atlanta",
>> "Austin", "Cleveland", "Houston", "New Orleans", "OKC", "Tulsa"
>> ), class = "factor")), class = "data.frame", row.names = c(NA,
>> -15L))
>>
>> dcast(temp1, previous_location ~ current_location)
>>
>> On Tue, May 8, 2018 at 12:10 PM, Miluji Sb  wrote:
>> > I have data on current and previous location of individuals. I would
>> like
>> > to have a matrix with bilateral movement between locations. I would like
>> > the final output to look like the second table below.
>> >
>> > I have tried using crosstab() from the ecodist but I do not have another
>> > variable to measure the flow. Ultimately I would like to compute the
>> > probability of movement between cities (movement to city_i/total
>> movement
>> > from city_j).
>> >
>> > Is it possible to aggregate the data in this way? Any guidance would be
>> > highly appreciated. Thank you!
>> >
>> > # Original data
>> > structure(list(id = 101:115, current_location = structure(c(2L,
>> > 8L, 8L, 3L, 6L, 5L, 1L, 2L, 7L, 4L, 2L, 8L, 8L, 3L, 6L), .Label =
>> > c("Austin",
>> > "Boston", "Cambridge", "Durham", "Houston", "Lynn", "New Orleans",
>> > "New York"), class = "factor"), previous_location = structure(c(6L,
>> > 2L, 4L, 6L, 7L, 5L, 1L, 3L, 6L, 2L, 6L, 2L, 4L, 6L, 7L), .Label =
>> > c("Atlanta",
>> > "Austin", "Cleveland", "Houston", "New Orleans", "OKC", "Tulsa"
>> > ), class = "factor")), class = "data.frame", row.names = c(NA,
>> > -15L))
>> >
>> > # Expected output
>> > structure(list(X = structure(c(3L, 1L, 2L), .Label = c("Austin",
>> > "Houston", "OKC"), class = "factor"), Boston = c(2L, NA, NA),
>> > New.York = c(NA, 2L, 2L), Cambridge = c(2L, NA, NA)), class =
>> > "data.frame", row.names = c(NA,
>> > -3L))
>> >
>> > Sincerely,
>> >
>> > Milu
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>



Re: [R] Bilateral matrix

2018-05-17 Thread Miluji Sb
Dear William and Ben,

Thank you for your replies and elegant solutions. I am having trouble with
the fact that two of the previous locations do not appear among the current
locations (that is, no one moved to OKC or Dallas from other cities), so
these two cities are not included in the output.

I have provided a better sample of the data and the ideal output (wide form
- a 10x10 bilateral matrix) but haven't been able to produce it. Would it
be easier if I created a variable for each ID, equal to 1 if the person
moved? I am a bit lost - thank you again!

### data
structure(list(ID = 1:12, previous_location. = structure(c(3L,
9L, 8L, 10L, 2L, 5L, 1L, 7L, 4L, 6L, 10L, 5L), .Label = c("Atlanta",
"Austin", "Boston", "Cambridge", "Dallas", "Durham", "Lynn",
"New Orleans", "New York", "OKC"), class = "factor"), current_location. =
structure(c(8L,
3L, 3L, 8L, 4L, 1L, 4L, 5L, 6L, 4L, 7L, 2L), .Label = c("Atlanta",
"Austin", "Boston", "Cambridge", "Durham", "Lynn", "New Orleans",
"New York"), class = "factor")), class = "data.frame", row.names = c(NA,
-12L))

### ideal output
structure(list(previous_location. = structure(c(3L, 9L, 8L, 10L,
2L, 5L, 1L, 7L, 4L, 6L), .Label = c("Atlanta", "Austin", "Boston",
"Cambridge", "Dallas", "Durham", "Lynn", "New Orleans", "New York",
"OKC"), class = "factor"), Boston = c(0L, 1L, 1L, 0L, 0L, 0L,
0L, 0L, 0L, 0L), New.York = c(1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L,
0L, 0L), New.Orleans = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L,
0L), OKC = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Austin = c(0L,
0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L), Dallas = c(0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L), Atlanta = c(0L, 0L, 0L, 0L, 0L, 1L,
0L, 0L, 0L, 0L), Lynn = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
0L), Cambridge = c(0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L), Durham = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L)), class = "data.frame", row.names =
c(NA,
-10L))
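One way to force the full 10x10 matrix, sketched on the assumption that `dat` is the data frame from the dput above (note the trailing dots in its column names): put both columns on a common set of factor levels before tabulating, so unused cities still appear as zero rows/columns.

```r
# Hedged sketch: union the level sets, re-factor both columns, tabulate
lvls <- sort(union(levels(dat$previous_location.),
                   levels(dat$current_location.)))
prev <- factor(dat$previous_location., levels = lvls)
curr <- factor(dat$current_location., levels = lvls)
out  <- table(prev, curr)   # 10 x 10, zero-filled for OKC and Dallas columns
```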

Sincerely,

Milu

On Wed, May 16, 2018 at 5:12 PM, Bert Gunter  wrote:

> xtabs does this automatically if your cross classifying variables are
> factors with levels all the cities (sorted, if you like):
>
>  > x <- sample(letters[1:5],8, rep=TRUE)
> > y <- sample(letters[1:5],8,rep=TRUE)
>
> > xtabs(~ x + y)
>y
> x   c d e
>   a 1 0 0
>   b 0 0 1
>   c 1 0 0
>   d 1 1 1
>   e 1 1 0
>
> > lvls <- sort(union(x,y))
> > x <- factor(x, levels = lvls)
> > y <- factor(y, levels = lvls)
>
> > xtabs( ~ x + y)
>y
> x   a b c d e
>   a 0 0 1 0 0
>   b 0 0 0 0 1
>   c 0 0 1 0 0
>   d 0 0 1 1 1
>   e 0 0 1 1 0
>
> Cheers,
> Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Wed, May 16, 2018 at 7:49 AM, Miluji Sb  wrote:
>
>> Dear Bert and Huzefa,
>>
>> Apologies for the late reply, my account got hacked and I have just
>> managed to recover it.
>>
>> Thank you very much for your replies and the solutions. Both work well.
>>
>> I was wondering if there was any way to ensure (force) that all possible
>> combinations show up in the output. The full dataset has 25 cities but of
>> course people have not moved from Boston to all the other 24 cities. I
>> would like all the combinations if possible.
>>
>> Thank you again!
>>
>> Sincerely,
>>
>> Milu
>>
>> On Tue, May 8, 2018 at 6:28 PM, Bert Gunter 
>> wrote:
>>
>>> or in base R : ?xtabs??
>>>
>>> as in:
>>> xtabs(~previous_location + current_location,data=x)
>>>
>>> (You can convert the 0s to NA's if you like)
>>>
>>>
>>> Cheers,
>>> Bert
>>>
>>>
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>> On Tue, May 8, 2018 at 9:21 AM, Huzefa Khalil 
>>> wrote:
>>>
>>>> Dear Miluji,
>>>>
>>>> If I understand correctly, this should get you what you need.
>>>>
>>>> temp1 <-
>>>> structure(list(id = 101:115, current_location = structure(c(2L,
>>>> 8L, 8L, 3L, 6L, 5L, 1L

[R] Convert daily data to weekly data

2018-05-29 Thread Miluji Sb
Dear all,

I have daily data in wide-format from 01/01/1986 to 31/12/2016 by ID. I
would like to convert this to weekly average data. The data has been
generated by an algorithm.

I know that I can use the lubridate package, but that would require me to
first convert the data to long form (which is what I want anyway). I am at
a bit of a loss about how to extract the date from the variable names and
then convert the data to weekly averages. Any help will be highly
appreciated.

### data
structure(list(X1986.01.01.10.30.00 = c(16.8181762695312, 16.8181762695312,
18.8294372558594, 16.8181762695312, 18.8294372558594, 16.835693359375
), X1986.01.02.10.30.00 = c(16.2272033691406, 16.2272033691406,
18.0772094726562, 16.2272033691406, 18.0772094726562, 16.2159423828125
), X1986.01.03.10.30.00 = c(15.8944396972656, 15.8944396972656,
17.7444152832031, 15.8944396972656, 17.7444152832031, 15.91943359375
), X1986.01.04.10.30.00 = c(16.2752380371094, 16.2752380371094,
18.125244140625, 16.2752380371094, 18.125244140625, 16.3352355957031
), X1986.01.05.10.30.00 = c(15.3706359863281, 15.3706359863281,
17.2806396484375, 15.3706359863281, 17.2806396484375, 15.3556213378906
), X1986.01.06.10.30.00 = c(15.3798828125, 15.3798828125, 17.3136291503906,
15.3798828125, 17.3136291503906, 15.4974060058594), ID = 1:6), .Names =
c("X1986.01.01.10.30.00",
"X1986.01.02.10.30.00", "X1986.01.03.10.30.00", "X1986.01.04.10.30.00",
"X1986.01.05.10.30.00", "X1986.01.06.10.30.00", "ID"), row.names = c(NA,
6L), class = "data.frame")
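A hedged base-R sketch of the two steps asked about (parse dates out of the column names, then average by ID and week), assuming `dat` is the data frame from the dput above and that the "%Y-%U" week convention is acceptable:

```r
# Recover Date values from the "X1986.01.01.10.30.00"-style column names
date_cols <- setdiff(names(dat), "ID")
dates <- as.Date(substring(date_cols, 2, 11), format = "%Y.%m.%d")

# Reshape to long form (columns stack in column-major order, so each date
# repeats once per row of the wide data)
long <- data.frame(
  ID    = rep(dat$ID, times = length(date_cols)),
  date  = rep(dates, each = nrow(dat)),
  value = unlist(dat[date_cols], use.names = FALSE)
)

# Average by ID and year-week label
long$week <- format(long$date, "%Y-%U")
aggregate(value ~ ID + week, data = long, FUN = mean)
```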

Sincerely,

Milu



Re: [R] Convert daily data to weekly data

2018-05-29 Thread Miluji Sb
Dear Petr,

Thanks for your reply and the solution. The example dataset contains data
for the first six days of the year 1986. "X1986.01.01.10.30.00" is 01
January 1986; the rest of the variable name is redundant information. The
last date is given as "X2016.12.31.10.30.00".

I realized that I left out one piece of information in my previous email: I
would like to compute the weekly average by the variable ID. Thanks again!

Sincerely,

Shouro

On Tue, May 29, 2018 at 3:24 PM, PIKAL Petr  wrote:

> Hi
>
> Based on your explanation I would advice to use
>
> ?cut.POSIXt
>
> with breaks "week". However your data are rather strange, you have data
> frame with names which looks like dates
>
> names(test)
> [1] "X1986.01.01.10.30.00" "X1986.01.02.10.30.00" "X1986.01.03.10.30.00"
> [4] "X1986.01.04.10.30.00" "X1986.01.05.10.30.00" "X1986.01.06.10.30.00"
> [7] "ID"
>
> and under each name you have 6 numeric values
> test[,1]
> [1] 16.81818 16.81818 18.82944 16.81818 18.82944 16.83569
>
> You (probably) can get dates by
> as.Date(substring(names(test),2,11), format="%Y.%m.%d")
> [1] "1986-01-01" "1986-01-02" "1986-01-03" "1986-01-04" "1986-01-05"
> [6] "1986-01-06" NA
>
> but if you want just average those 6 values below each date you could do
>
> colMeans(test)
>
> and/or bind it together.
>
> > ddd<-as.Date(substring(names(test),2,11), format="%Y.%m.%d")
> > data.frame(ddd, aver=colMeans(test))
> ddd aver
> X1986.01.01.10.30.00 1986-01-01 17.49152
> X1986.01.02.10.30.00 1986-01-02 16.84200
> X1986.01.03.10.30.00 1986-01-03 16.51526
> X1986.01.04.10.30.00 1986-01-04 16.90191
> X1986.01.05.10.30.00 1986-01-05 16.00480
> X1986.01.06.10.30.00 1986-01-06 16.04405
> ID   3.5
>
> Cheers
> Petr
>
> Tento e-mail a jakékoliv k němu připojené dokumenty jsou důvěrné a
> podléhají tomuto právně závaznému prohlášení o vyloučení odpovědnosti:
> https://www.precheza.cz/01-dovetek/ | This email and any documents
> attached to it may be confidential and are subject to the legally binding
> disclaimer: https://www.precheza.cz/en/01-disclaimer/
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Miluji
> Sb
> > Sent: Tuesday, May 29, 2018 2:59 PM
> > To: r-help mailing list 
> > Subject: [R] Convert daily data to weekly data
> >
> > Dear all,
> >
> > I have daily data in wide-format from 01/01/1986 to 31/12/2016 by ID. I
> would
> > like to convert this to weekly average data. The data has been generated
> by an
> > algorithm.
> >
> > I know that I can use the lubridate package but that would require me to
> first
> > convert the data to long-form (which is what I want). I am at a bit of
> loss of
> > how to extract the date from the variable names and then converting the
> data
> > to weekly average. Any help will be high appreciated.
> >
> > ### data
> > structure(list(X1986.01.01.10.30.00 = c(16.8181762695312,
> > 16.8181762695312, 18.8294372558594, 16.8181762695312,
> > 18.8294372558594, 16.835693359375 ), X1986.01.02.10.30.00 =
> > c(16.2272033691406, 16.2272033691406, 18.0772094726562,
> > 16.2272033691406, 18.0772094726562, 16.2159423828125 ),
> > X1986.01.03.10.30.00 = c(15.8944396972656, 15.8944396972656,
> > 17.7444152832031, 15.8944396972656, 17.7444152832031, 15.91943359375
> > ), X1986.01.04.10.30.00 = c(16.2752380371094, 16.2752380371094,
> > 18.125244140625, 16.2752380371094, 18.125244140625, 16.3352355957031
> > ), X1986.01.05.10.30.00 = c(15.3706359863281, 15.3706359863281,
> > 17.2806396484375, 15.3706359863281, 17.2806396484375,
> > 15.3556213378906 ), X1986.01.06.10.30.00 = c(15.3798828125,
> > 15.3798828125, 17.3136291503906, 15.3798828125, 17.3136291503906,
> > 15.4974060058594), ID = 1:6), .Names = c("X1986.01.01.10.30.00",
> > "X1986.01.02.10.30.00", "X1986.01.03.10.30.00", "X1986.01.04.10.30.00",
> > "X1986.01.05.10.30.00", "X1986.01.06.10.30.00", "ID"), row.names = c(NA,
> 6L),
> > class = "data.frame")
> >
> > Sincerely,
> >
> > Milu
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.or

Re: [R] Convert daily data to weekly data

2018-05-30 Thread Miluji Sb
Dear Jim,

Thank you so much. The code works perfectly and is very fast for the large
amounts of data that I am processing!

Thanks again!

Sincerely,

Shouro

On Tue, May 29, 2018 at 11:50 PM, jim holtman  wrote:

> Forgot the year if you also want to summarise by that.
>
>
> > x <- structure(list(X1986.01.01.10.30.00 = c(16.8181762695312,
> 16.8181762695312,
> +  18.8294372558594, 16 
> [TRUNCATED]
>
> > library(tidyverse)
>
> > library(lubridate)
>
> > # convert to long form
> > x_long <- gather(x, key = 'date', value = "value", -ID)
>
> > # change the date to POSIXct
> > x_long$date <- ymd_hms(substring(x_long$date, 2, 19))
>
> > # add the week of the year
> > x_long$week <- week(x_long$date)
>
> > x_long$year <- year(x_long$date)
>
> > # average by ID/week
> > avg <- x_long %>%
> +   group_by(ID, year, week) %>%
> +   summarise(avg = mean(value))
> > avg
> # A tibble: 6 x 4
> # Groups:   ID, year [?]
>  ID  year  week   avg
>  
> 1 1 1986.1.  16.0
> 2 2 1986.1.  16.0
> 3 3 1986.1.  17.9
> 4 4 1986.1.  16.0
> 5 5 1986.1.  17.9
> 6 6 1986.1.  16.0
>
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Tue, May 29, 2018 at 7:02 AM, Miluji Sb  wrote:
>
>> Dear Petr,
>>
>> Thanks for your reply and the solution. The example dataset contains data
>> for the first six days of the year 1986. "X1986.01.01.10.30.00" is 01
>> January 1986 and the rest of the variable is redundant information. The
>> last date is given as "X2016.12.31.10.30.00".
>>
>> I realized that I missed one information in my previous email, I would
>> like
>> to compute the weekly average by the variable ID. Thanks again!
>>
>> Sincerely,
>>
>> Shouro
>>
>> On Tue, May 29, 2018 at 3:24 PM, PIKAL Petr 
>> wrote:
>>
>> > Hi
>> >
>> > Based on your explanation I would advice to use
>> >
>> > ?cut.POSIXt
>> >
>> > with breaks "week". However your data are rather strange, you have data
>> > frame with names which looks like dates
>> >
>> > names(test)
>> > [1] "X1986.01.01.10.30.00" "X1986.01.02.10.30.00" "X1986.01.03.10.30.00"
>> > [4] "X1986.01.04.10.30.00" "X1986.01.05.10.30.00" "X1986.01.06.10.30.00"
>> > [7] "ID"
>> >
>> > and under each name you have 6 numeric values
>> > test[,1]
>> > [1] 16.81818 16.81818 18.82944 16.81818 18.82944 16.83569
>> >
>> > You (probably) can get dates by
>> > as.Date(substring(names(test),2,11), format="%Y.%m.%d")
>> > [1] "1986-01-01" "1986-01-02" "1986-01-03" "1986-01-04" "1986-01-05"
>> > [6] "1986-01-06" NA
>> >
>> > but if you want just average those 6 values below each date you could do
>> >
>> > colMeans(test)
>> >
>> > and/or bind it together.
>> >
>> > > ddd<-as.Date(substring(names(test),2,11), format="%Y.%m.%d")
>> > > data.frame(ddd, aver=colMeans(test))
>> > ddd aver
>> > X1986.01.01.10.30.00 1986-01-01 17.49152
>> > X1986.01.02.10.30.00 1986-01-02 16.84200
>> > X1986.01.03.10.30.00 1986-01-03 16.51526
>> > X1986.01.04.10.30.00 1986-01-04 16.90191
>> > X1986.01.05.10.30.00 1986-01-05 16.00480
>> > X1986.01.06.10.30.00 1986-01-06 16.04405
>> > ID   3.5
>> >
>> > Cheers
>> > Petr
>> >
>> > Tento e-mail a jakékoliv k němu připojené dokumenty jsou důvěrné a
>> > podléhají tomuto právně závaznému prohlášení o vyloučení odpovědnosti:
>> > https://www.precheza.cz/01-dovetek/ | This email and any documents
>> > attached to it may be confidential and are subject to the legally
>> binding
>> > disclaimer: https://www.precheza.cz/en/01-disclaimer/
>> >
>> > > -Original Message-
>> > > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
>> Miluji
>> > Sb
>> > > Sent: Tuesday, May 29, 2018 2:59 PM
>> > > To: r-help mailing list 
>> > > Subject: [R] Convert daily data to weekly data
>> > >

[R] Subset Rasterbrick by time

2018-06-18 Thread Miluji Sb
Dear all,

I have a RasterBrick with date/time information, which I would like to
subset by year.

However, when I use the following code for sub-setting;

new_brick <- subset(original, which(getZ( original ) >= as.Date("2000-01-01
10:30:00") & getZ(original ) <= as.Date("2014-12-31 10:30:00")))

The date/time information seems to be lost.

Furthermore, the class of the date/time seems to be character;

##
class(getZ( original ))
[1] "character"

Is it possible to convert this string to date before sub-setting or retain
the date/time information after sub-setting?

### original RasterBrick ###
class   : RasterBrick
dimensions  : 600, 1440, 864000, 11320  (nrow, ncol, ncell, nlayers)
resolution  : 0.25, 0.25  (x, y)
extent  : -180, 180, -60, 90  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source :
/work/mm01117/GLDAS_025_deg/daily/gldas_tavg_tmin_tmax_precip_windspd_sphum_daily_1986_2016.nc4
names   : X1986.01.01.10.30.00, X1986.01.02.10.30.00,
X1986.01.03.10.30.00, X1986.01.04.10.30.00, X1986.01.05.10.30.00,
X1986.01.06.10.30.00, X1986.01.07.10.30.00, X1986.01.08.10.30.00,
X1986.01.09.10.30.00, X1986.01.10.10.30.00, X1986.01.11.10.30.00,
X1986.01.12.10.30.00, X1986.01.13.10.30.00, X1986.01.14.10.30.00,
X1986.01.15.10.30.00, ...
Date/time   : 1986-01-01 10:30:00, 2016-12-31 10:30:00 (min, max)
varname : v1

### new RasterBrick ###
class   : RasterStack
dimensions  : 600, 1440, 864000, 5477  (nrow, ncol, ncell, nlayers)
resolution  : 0.25, 0.25  (x, y)
extent  : -180, 180, -60, 90  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
names   : X2000.01.01.10.30.00, X2000.01.02.10.30.00,
X2000.01.03.10.30.00, X2000.01.04.10.30.00, X2000.01.05.10.30.00,
X2000.01.06.10.30.00, X2000.01.07.10.30.00, X2000.01.08.10.30.00,
X2000.01.09.10.30.00, X2000.01.10.10.30.00, X2000.01.11.10.30.00,
X2000.01.12.10.30.00, X2000.01.13.10.30.00, X2000.01.14.10.30.00,
X2000.01.15.10.30.00, ...

Any help will be greatly appreciated.

Sincerely,

Milu



Re: [R] Subset Rasterbrick by time

2018-06-19 Thread Miluji Sb
Dear David,

Subsetting works but the 'date' information is lost in the new file.

Thanks, Mike. I was not aware of the bug but will work on learning
about (getZ) and (setZ). Thanks again!
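For the archive, a sketch of the getZ()/setZ() workaround Mike describes (untested here; `original` is the poster's RasterBrick, and the datetime format string is an assumption based on the printout below):

```r
library(raster)

# Parse the character Z slot to POSIXct, subset the brick and the dates
# with the same index, then restore the Z values on the result
z   <- as.POSIXct(getZ(original), format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
idx <- which(z >= as.POSIXct("2000-01-01", tz = "UTC") &
             z <  as.POSIXct("2015-01-01", tz = "UTC"))
new_brick <- setZ(subset(original, idx), z[idx], name = "Date/time")
```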

Sincerely,

Milu

On Tue, Jun 19, 2018 at 7:32 AM, Michael Sumner  wrote:

>
>
> On Mon, 18 Jun 2018, 22:09 David Winsemius, 
> wrote:
>
>>
>>
>> > On Jun 18, 2018, at 7:21 AM, Miluji Sb  wrote:
>> >
>> > Dear all,
>> >
>> > I have a rasterbrick with the date/time information provided which I
>> would
>> > like to subset by year.
>> >
>> > However, when I use the following code for sub-setting;
>> >
>> > new_brick <- subset(original, which(getZ( original ) >=
>> as.Date("2000-01-01
>> > 10:30:00") & getZ(original ) <= as.Date("2014-12-31 10:30:00")))
>> >
>> > The date/time information seems to be lost.
>> >
>>
>
> This is a bug, I tend to extract (getZ) the dates, do the subset logic on
> both and restore (setZ).
>
> It takes a bit of learning and practice, good luck. I can't expand more at
> the moment. See R-Sig-Geo for more specific discussion forum, and #rstats
> on twitter is really good.
>
> Cheers, Mike
>
>> > Furthermore, the class of the date/time seems to be character;
>> >
>> > ##
>> > class(getZ( original ))
>> > [1] "character"
>> >
>> > Is it possible to convert this string to date before sub-setting or
>> retain
>> > the date/time information after sub-setting?
>>
>> Yes, it is certainly possible, but why bother? R's Comparison operators
>> work on character values so you should be able to do this (if the
>> subsetting is syntactically correct:
>>
>>  new_brick <- subset(original, which(getZ( original ) >= "2000-01-01
>> 10:30:00" & getZ(original ) <= "2014-12-31 10:30:00") )
>>
>>
>> As always if you had presented the output of dput(head(original))
>> assuming that head is a meaningful operation on such an object, the
>> demonstration would have been possible. An alternate would be to offer a
>> library call to a package and then load a relevant example.
>>
>>
>> Best;
>> David
>> >
>> > ### original RasterBrick ###
>> > class   : RasterBrick
>> > dimensions  : 600, 1440, 864000, 11320  (nrow, ncol, ncell, nlayers)
>> > resolution  : 0.25, 0.25  (x, y)
>> > extent  : -180, 180, -60, 90  (xmin, xmax, ymin, ymax)
>> > coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
>> > data source :
>> > /work/mm01117/GLDAS_025_deg/daily/gldas_tavg_tmin_tmax_
>> precip_windspd_sphum_daily_1986_2016.nc4
>> > names   : X1986.01.01.10.30.00, X1986.01.02.10.30.00,
>> > X1986.01.03.10.30.00, X1986.01.04.10.30.00, X1986.01.05.10.30.00,
>> > X1986.01.06.10.30.00, X1986.01.07.10.30.00, X1986.01.08.10.30.00,
>> > X1986.01.09.10.30.00, X1986.01.10.10.30.00, X1986.01.11.10.30.00,
>> > X1986.01.12.10.30.00, X1986.01.13.10.30.00, X1986.01.14.10.30.00,
>> > X1986.01.15.10.30.00, ...
>> > Date/time   : 1986-01-01 10:30:00, 2016-12-31 10:30:00 (min, max)
>> > varname : v1
>> >
>> > ### new RasterBrick ###
>> > class   : RasterStack
>> > dimensions  : 600, 1440, 864000, 5477  (nrow, ncol, ncell, nlayers)
>> > resolution  : 0.25, 0.25  (x, y)
>> > extent  : -180, 180, -60, 90  (xmin, xmax, ymin, ymax)
>> > coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
>> > names   : X2000.01.01.10.30.00, X2000.01.02.10.30.00,
>> > X2000.01.03.10.30.00, X2000.01.04.10.30.00, X2000.01.05.10.30.00,
>> > X2000.01.06.10.30.00, X2000.01.07.10.30.00, X2000.01.08.10.30.00,
>> > X2000.01.09.10.30.00, X2000.01.10.10.30.00, X2000.01.11.10.30.00,
>> > X2000.01.12.10.30.00, X2000.01.13.10.30.00, X2000.01.14.10.30.00,
>> > X2000.01.15.10.30.00, ...
>> >
>> > Any help will be greatly appreciated.
>> >
>> > Sincerely,
>> >
>> > Milu
>> >
>> >   [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/
>> posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>

[R] Prediction with two fixed-effects - large number of IDs

2017-06-17 Thread Miluji Sb
Dear all,

I am running a panel regression with time and location fixed effects:

###

reg1 <- lm(lny ~ factor(id) + factor(year) + x1+ I(x1)^2 + x2+ I(x2)^2 ,
 data=mydata, na.action="na.omit")
###

My goal is to use the estimation for prediction. However, I have 8,500 IDs,
which is resulting in very slow computation. Ideally, I would like to do
the following:

###
reg2 <- felm(lny ~ x1+ I(x1)^2 + x2+ I(x2)^2 | id + year , data=mydata,
na.action="na.omit")
###

However, predict does not work with felm. Is there a way to either make lm
faster or use predict with felm? Is parallelizing an option?

Any help will be appreciated. Thank you!
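A hedged sketch of one manual route (illustrative only; `mydata` and `newdata` are the poster's objects, and the lookup assumes all id/year levels in `newdata` were seen in estimation): fit with lfe::felm, which sweeps out the 8,500 id effects quickly, then rebuild predictions by hand because predict() has no felm method.

```r
library(lfe)

reg2 <- felm(lny ~ x1 + I(x1^2) + x2 + I(x2^2) | id + year, data = mydata)

b  <- coef(reg2)    # slope coefficients only
fe <- getfe(reg2)   # recovered id and year effects

# Look up each observation's fixed effects by level name
fe_id   <- with(subset(fe, fe == "id"),   setNames(effect, idx))
fe_year <- with(subset(fe, fe == "year"), setNames(effect, idx))

# Linear part from the covariates (drop the intercept column), plus effects
X    <- model.matrix(~ x1 + I(x1^2) + x2 + I(x2^2),
                     data = newdata)[, -1, drop = FALSE]
pred <- drop(X %*% b) +
  fe_id[as.character(newdata$id)] +
  fe_year[as.character(newdata$year)]
```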

Sincerely,

Milu



Re: [R] Prediction with two fixed-effects - large number of IDs

2017-06-17 Thread Miluji Sb
Dear Jeff,

Thank you so much and apologies for the typo in I() - it was silly.

I will try the biglm package - thanks!

Sincerely,

Milu

On Sat, Jun 17, 2017 at 9:01 PM, Jeff Newmiller 
wrote:

> I have no direct experience with such horrific models, but your formula is
> a mess and Google suggests the biglm package with ffdf.
>
> Specifically, you should convert your discrete variables to factors before
> you build the model, particularly since you want to use predict after the
> fact, for which you will need a new data set with the exact same levels in
> the factors.
>
> Also, your use of I() is broken and redundant.  I think formulas
>
> lny ~ id + year + x1 + I(x1^2) + x2 + I(x2^2)
>
> or
>
> lny ~ id + year + x1^2 + x2^2
>
> would obtain the intended prediction results.
>
> --
> Sent from my phone. Please excuse my brevity.
>
> On June 17, 2017 11:24:05 AM PDT, Miluji Sb  wrote:
> >Dear all,
> >
> >I am running a panel regression with time and location fixed effects:
> >
> >###
> >
> >reg1 <- lm(lny ~ factor(id) + factor(year) + x1+ I(x1)^2 + x2+ I(x2)^2
> >,
> > data=mydata, na.action="na.omit")
> >###
> >
> >My goal is to use the estimation for prediction. However, I have 8,500
> >IDs,
> >which is resulting in very slow computation. Ideally, I would like to
> >do
> >the following:
> >
> >###
> >reg2 <- felm(lny ~ x1+ I(x1)^2 + x2+ I(x2)^2 | id + year , data=mydata,
> >na.action="na.omit")
> >###
> >
> >However, predict does not work with felm. Is there a way to either make
> >lm
> >faster or use predict with felm? Is parallelizing an option?
> >
> >Any help will be appreciated. Thank you!
> >
> >Sincerely,
> >
> >Milu
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>



[R] ISO3 code to 7 continents names

2017-09-07 Thread Miluji Sb
Dear all.

Is it possible to convert/identify iso3 country codes to the seven
continent names?

# Asia, Africa, Antarctica, Australia, Europe, South America, and North
America.

I have tried the following:

###
region <- merge(countryExData,df,by.x='ISO3V10',by.y='iso3')

where df is the name of my dataset, with iso3 as the identification
variable, but there seem to be a lot of missing values.

Thank you!

Sincerely,

Milu



Re: [R] ISO3 code to 7 continents names

2017-09-07 Thread Miluji Sb
df is a data frame consisting of one variable (iso3 codes) such as

USA
RUS
ARG
BGD
ITA
FRA


Some of these iso3 codes are repeated, and I would like the corresponding
continent name; the countrycode package does not seem to distinguish
between North and South America. Thanks.
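One hedged workaround sketch: since countrycode's "continent" destination lumps the Americas together, take that mapping and then split the Americas by hand. The `north_america` vector below is illustrative and deliberately partial; it would need to be extended to cover a full dataset.

```r
library(countrycode)

iso3 <- c("USA", "RUS", "ARG", "BGD", "ITA", "FRA")
cont <- countrycode(iso3, origin = "iso3c", destination = "continent")

# Split "Americas" manually (illustrative, not exhaustive)
north_america <- c("USA", "CAN", "MEX")
cont[cont == "Americas"] <- ifelse(iso3[cont == "Americas"] %in% north_america,
                                   "North America", "South America")
cont
```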

Sincerely,

Milu

On Thu, Sep 7, 2017 at 9:00 PM, David Winsemius 
wrote:

>
> > On Sep 7, 2017, at 11:36 AM, Miluji Sb  wrote:
> >
> > Dear all.
> >
> > Is it possible to convert.identify iso3 country names to the seven
> > continent names?
> >
> > # Asia, Africa, Antarctica, Australia, Europe, South America, and North
> > America,
> >
> > I have tried the following:
> >
> > ###
> > region <- merge(countryExData,df,by.x='ISO3V10',by.y='iso3')
> >
> > where df is the name of my dataset with iso3 the identification variable
> > but there seems to be a a lot of missing values.
>
> Please provide a sufficient amount of the dataframe named `df` to allow a
> properly tested response.
>
> >
> >   [[alternative HTML version deleted]]
>
> And do read the Posting Guide. This is a plain text mailing list.
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
> 'Any technology distinguishable from magic is insufficiently advanced.'
>  -Gehm's Corollary to Clarke's Third Law
>
>
>
>
>
>



[R] Download data from NASA for multiple locations - RCurl

2017-10-15 Thread Miluji Sb
Dear all,

I am trying to download time-series climatic data from the GES DISC (NASA)
Hydrology Data Rods web service. Unfortunately, no wget method is
available.

Five parameters are needed for data retrieval: variable, location,
startDate, endDate, and type. For example:

###
https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(-71.06,%2042.36)&type=asc2
###

In this case, variable: Tair_f_inst (temperature), location: (-71.06,
42.36), startDate: 01 January 1970; endDate: 31 December 1979; type:  asc2
(output 2-column ASCII).

I am trying to download data for 100 US cities, data for which I have in
the following data.frame:

###
cities <-  dput(droplevels(head(cities, 5)))
structure(list(city = structure(1:5, .Label = c("Boston", "Bridgeport",
"Cambridge", "Fall River", "Hartford"), class = "factor"), state =
structure(c(2L,
1L, 2L, 2L, 1L), .Label = c(" CT ", " MA "), class = "factor"),
lon = c(-71.06, -73.19, -71.11, -71.16, -72.67), lat = c(42.36,
41.18, 42.37, 41.7, 41.77)), .Names = c("city", "state",
"lon", "lat"), row.names = c(NA, 5L), class = "data.frame")
###

Is it possible to download the data for the multiple locations
automatically (e.g. RCurl) and save them as csv? Essentially, reading
coordinates from the data.frame and entering it in the URL.

I would also like to add identifying information to each of the data files
from the cities data.frame. I have been doing the following for a single
file:

###
x <- readLines(con=url("
https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&location=GEOM:POINT(-71.06,%2042.36)&type=asc2
"))
x <- x[-(1:13)]

mydata <- data.frame(year = substr(x,1,4),
 month = substr(x, 6,7),
 day = substr(x, 9, 10),
 hour = substr(x, 12, 13),
 temp = substr(x, 21, 27))

mydata$city <- rep(cities[1,1], nrow(mydata))
mydata$state <- rep(cities[1,2], nrow(mydata))
mydata$lon <- rep(cities[1,3], nrow(mydata))
mydata$lat <- rep(cities[1,4], nrow(mydata))
###
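A note on the fixed-width substr() parsing above: the data lines are tab-separated, so splitting on the tab is less fragile than hard-coded character positions. A self-contained sketch on one sample line (the line is copied from the 2-column ASCII format shown in the thread; no download involved):

```r
# One data line exactly as returned by the service (timestamp <TAB> value)
line <- "1970-01-01T00:00:00\t267.769"

parts <- strsplit(line, "\t")[[1]]   # split timestamp from value
stamp <- parts[1]
temp  <- as.numeric(parts[2])

year  <- substr(stamp, 1, 4)    # "1970"
month <- substr(stamp, 6, 7)    # "01"
day   <- substr(stamp, 9, 10)   # "01"
hour  <- substr(stamp, 12, 13)  # "00"
```

The same parsing can then run inside a per-city loop over URLs built with paste0() or sprintf().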

Help and advice would be greatly appreciated. Thank you!

Sincerely,

Milu



Re: [R] Download data from NASA for multiple locations - RCurl

2017-10-15 Thread Miluji Sb
Dear David,

This is amazing, thank you so much. If I may ask another question:

The output looks like the following:

###
dput(head(x,15))
c("Metadata for Requested Time Series:", "",
"prod_name=GLDAS_NOAH025_3H_v2.0",
"param_short_name=Tair_f_inst", "param_name=Near surface air temperature",
"unit=K", "begin_time=1970-01-01T00", "end_time=1979-12-31T21",
"lat= 42.36", "lon=-71.06", "Request_time=2017-10-15 22:20:03 GMT",
"", "Date&Time   Data", "1970-01-01T00:00:00\t267.769",
"1970-01-01T03:00:00\t264.595")
###

Thus I need to drop the first 13 rows and do the following to add
identifying information:

###
mydata <- data.frame(year = substr(x,1,4),
 month = substr(x, 6,7),
 day = substr(x, 9, 10),
 hour = substr(x, 12, 13),
 temp = substr(x, 21, 27))

mydata$city <- rep(cities[1,1], nrow(mydata))
mydata$state <- rep(cities[1,2], nrow(mydata))
mydata$lon <- rep(cities[1,3], nrow(mydata))
mydata$lat <- rep(cities[1,4], nrow(mydata))
###

Is it possible to incorporate these into your code so the data looks like
this:

dput(droplevels(head(mydata)))
structure(list(year = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "1970",
class = "factor"),
month = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "01", class =
"factor"),
day = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "01", class =
"factor"),
hour = structure(1:6, .Label = c("00", "03", "06", "09",
"12", "15"), class = "factor"), temp = structure(c(6L, 4L,
2L, 1L, 3L, 5L), .Label = c("261.559", "262.525", "262.648",
"264.595", "265.812", "267.769"), class = "factor"), city =
structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = "Boston", class = "factor"),
state = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = " MA ", class =
"factor"),
lon = c(-71.06, -71.06, -71.06, -71.06, -71.06, -71.06),
lat = c(42.36, 42.36, 42.36, 42.36, 42.36, 42.36)), .Names = c("year",
"month", "day", "hour", "temp", "city", "state", "lon", "lat"
), row.names = c(NA, 6L), class = "data.frame")

Apologies for asking repeated questions and thank you again!

Sincerely,

Milu

On Sun, Oct 15, 2017 at 11:45 PM, David Winsemius 
wrote:

>
> > On Oct 15, 2017, at 2:02 PM, Miluji Sb  wrote:
> >
> > Dear all,
> >
> > i am trying to download time-series climatic data from GES DISC (NASA)
> > Hydrology Data Rods web-service. Unfortunately, no wget method is
> > available.
> >
> > Five parameters are needed for data retrieval: variable, location,
> > startDate, endDate, and type. For example:
> >
> > ###
> > https://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/access/
> timeseries.cgi?variable=GLDAS2:GLDAS_NOAH025_3H_v2.0:
> Tair_f_inst&startDate=1970-01-01T00&endDate=1979-12-31T00&
> location=GEOM:POINT(-71.06,%2042.36)&type=asc2
> > ###
> >
> > In this case, variable: Tair_f_inst (temperature), location: (-71.06,
> > 42.36), startDate: 01 January 1970; endDate: 31 December 1979; type:
> asc2
> > (output 2-column ASCII).
> >
> > I am trying to download data for 100 US cities, data for which I have in
> > the following data.frame:
> >
> > ###
> > cities <-  dput(droplevels(head(cities, 5)))
> > structure(list(city = structure(1:5, .Label = c("Boston", "Bridgeport",
> > "Cambridge", "Fall River", "Hartford"), class = "factor"), state =
> > structure(c(2L,
> > 1L, 2L, 2L, 1L), .Label = c(" CT ", " MA "), class = "factor"),
> >lon = c(-71.06, -73.19, -71.11, -71.16, -72.67), lat = c(42.36,
> >41.18, 42.37, 41.7, 41.77)), .Names = c("city", "state",
> > "lon", "lat"), row.names = c(NA, 5L), class = "data.frame")
> > ###
> >
> > Is it possible to download the data for the multiple locations
> > automatically (e.g. RCurl) and save them as csv? Essentially, reading
> > coordinates from the data.frame and entering it in the URL.
> >
> > I would also like to add identifying information to each of the data
> files
> > from the cities data.frame. I have been doing the following for a single
> > file:
>
> Didn't seem that difficult:
>
> library(do

Re: [R] Download data from NASA for multiple locations - RCurl

2017-10-16 Thread Miluji Sb
I have done the following using readLines

directory <- "~/"
files <- list.files(directory)
data_frames <- vector("list", length(files))
for (i in seq_along(files)) {
  df <- readLines(file.path(directory, files[i]))
  df <- df[-(1:13)]
  df <- data.frame(year = substr(df,1,4),
   month = substr(df, 6,7),
   day = substr(df, 9, 10),
   hour = substr(df, 12, 13),
   temp = substr(df, 21, 27))
  data_frames[[i]] <- df
}

What I have been have been having trouble is adding the following
information from the cities file (100 cities) for each of the downloaded
data files. I would like to do the following but automatically:

###
mydata$city <- rep(cities[1,1], nrow(mydata))
mydata$state <- rep(cities[1,2], nrow(mydata))
mydata$lon <- rep(cities[1,3], nrow(mydata))
mydata$lat <- rep(cities[1,4], nrow(mydata))
###

The information for cities looks like this:

###
cities <-  dput(droplevels(head(cities, 5)))
structure(list(city = structure(1:5, .Label = c("Boston", "Bridgeport",
"Cambridge", "Fall River", "Hartford"), class = "factor"), state =
structure(c(2L,
1L, 2L, 2L, 1L), .Label = c(" CT ", " MA "), class = "factor"),
lon = c(-71.06, -73.19, -71.11, -71.16, -72.67), lat = c(42.36,
41.18, 42.37, 41.7, 41.77)), .Names = c("city", "state",
"lon", "lat"), row.names = c(NA, 5L), class = "data.frame")
###
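On the rep() calls above: data.frame() recycles length-one arguments to the length of the longer columns, so the per-city metadata can be attached in the same call that builds the frame. A self-contained sketch, with two fake data lines standing in for one downloaded file (column positions follow the substr() calls above):

```r
# One row of the cities table (values copied from the dput above)
cities <- data.frame(city = "Boston", state = " MA ",
                     lon = -71.06, lat = 42.36)

# Two fake data lines standing in for one downloaded file
x <- c("1970-01-01T00:00:00\t267.769",
       "1970-01-01T03:00:00\t264.595")

i <- 1  # index into `cities`, as in the download loop
df <- data.frame(year  = substr(x, 1, 4),
                 month = substr(x, 6, 7),
                 day   = substr(x, 9, 10),
                 hour  = substr(x, 12, 13),
                 temp  = as.numeric(substr(x, 21, 27)),
                 city  = cities[i, "city"],   # length-one values are
                 state = cities[i, "state"],  # recycled to nrow(df),
                 lon   = cities[i, "lon"],    # so no rep() is needed
                 lat   = cities[i, "lat"])
```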

Apologies if this seems trivial but I have been having a hard time. Thank
you again.

Sincerely,

Milu

On Mon, Oct 16, 2017 at 7:13 PM, David Winsemius 
wrote:

>
> > On Oct 15, 2017, at 3:35 PM, Miluji Sb  wrote:
> >
> > Dear David,
> >
> > This is amazing, thank you so much. If I may ask another question:
> >
> > The output looks like the following:
> >
> > ###
> > dput(head(x,15))
> > c("Metadata for Requested Time Series:", "", "prod_name=GLDAS_NOAH025_3H_
> v2.0",
> > "param_short_name=Tair_f_inst", "param_name=Near surface air
> temperature",
> > "unit=K", "begin_time=1970-01-01T00", "end_time=1979-12-31T21",
> > "lat= 42.36", "lon=-71.06", "Request_time=2017-10-15 22:20:03 GMT",
> > "", "Date&Time   Data", "1970-01-01T00:00:00\t267.769",
> > "1970-01-01T03:00:00\t264.595")
> > ###
> >
> > Thus I need to drop the first 13 rows and do the following to add
> identifying information:
>
> Are you having difficulty reading in the data from disk? The `read.table`
> function has a "skip" parameter.
> >
> > ###
> > mydata <- data.frame(year = substr(x,1,4),
>
> That would not appear to do anything useful with x. The `x` object is not
> a long string. The items you want are in separate elements of x.
>
> substr(x,1,4)   # now returns
>  [1] "Meta" "" "prod" "para" "para" "unit" "begi" "end_" "lat=" "lon="
> "Requ" "" "Date"
> [14] "1970" "1970"
>
> You need to learn basic R indexing. The year might be extracted from the
> 7th element of x x via code like this:
>
> year <- substr( x[7], 1,4)
>
> >  month = substr(x, 6,7),
> >  day = substr(x, 9, 10),
> >  hour = substr(x, 12, 13),
> >  temp = substr(x, 21, 27))
>
> The time and temp items would naturally be read in with read.table (or in
> the case of tab-delimited data with read.delim) after skipping the first 14
> lines.
>
>
> >
> > mydata$city <- rep(cities[1,1], nrow(mydata))
>
> There's no need to use `rep` with data.frame. If one argument to
> data.frame is length n then all single elelment arguments will be
> "recycled" to fill in the needed number of rows. Please take the time to
> work through all the pages of "Introduction to R" (shipped with all
> distributions of R) or pick another introductory text. We cannot provide
> tutoring to all students. You need to put in the needed self-study first.
>
> --
> David.
>
>
> > mydata$state <- rep(cities[1,2], nrow(mydata))
> > mydata$lon <- rep(cities[1,3], nrow(mydata))
> > mydata$lat <- rep(cities[1,4], nrow(mydata))
> > ###
> >
> > Is it possible to incorporate these into your code so the data looks
> like this:
> >
> > dput(droplevels(head(mydata

[R] Match Coordinates to NUTS 2 ID

2016-05-26 Thread Miluji Sb
Dear all,


I have downloaded the NUTS 2 level data from
library("rgdal")
library("RColorBrewer")
library("classInt")
#library("SmarterPoland")
library(fields)

# Download Administrative Level data from EuroStat
temp <- tempfile(fileext = ".zip")
download.file("
http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2010_60M_SH.zip
",
  temp)
unzip(temp)

# Read data
EU_NUTS <- readOGR(dsn = "./NUTS_2010_60M_SH/data", layer =
"NUTS_RG_60M_2010")

# Subset NUTS 2 level data
map_nuts2 <- subset(EU_NUTS, STAT_LEVL_ == 2)

I also have data for a variable by coordinates, which looks like this:

structure(list(LON = c(-125.25, -124.75, -124.25, -124.25, -124.25,
-124.25), LAT = c(49.75, 49.25, 42.75, 43.25, 48.75, 49.25),
yr = c(2.91457704560515, 9.94774197180345, -2.71956412885765,
-0.466213169185147, -36.6645659563374, 10.5168056769535)), .Names =
c("LON",
"LAT", "yr"), row.names = c(NA, 6L), class = "data.frame")

I would like to match the coordinates to their corresponding NUTS 2 region
- is this possible? Any help will be highly appreciated. Thank you!
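A toy sketch of the point-in-polygon step with sp::over(), using one hand-made rectangle in place of a NUTS-2 polygon (with the real shapefile, both objects must share the same CRS, e.g. via spTransform(), before over() is called):

```r
library(sp)

# A 2 x 1 degree rectangle standing in for one NUTS-2 region
ring <- cbind(c(-126, -124, -124, -126, -126),
              c(  49,   49,   50,   50,   49))
region <- SpatialPolygons(list(Polygons(list(Polygon(ring)), ID = "XX00")))

# Two of the points from the data above: one inside, one outside the rectangle
pts <- SpatialPoints(cbind(c(-125.25, -124.25), c(49.75, 42.75)))

idx <- over(pts, region)   # polygon index per point, NA when no match
```

With map_nuts2 in place of `region`, over() returns the attribute rows (including the NUTS ID) instead of bare indices.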

Sincerely,

Milu


Re: [R] Match Coordinates to NUTS 2 ID

2016-05-27 Thread Miluji Sb
Thank you for your reply. I am trying to use the over function - but having
trouble. Asked at R-sig-geo. Thanks again!

Sincerely,

Milu

On Fri, May 27, 2016 at 12:49 AM, MacQueen, Don  wrote:

> Perhaps the
>   over()
> function in the sp package.
>
> (in which case, R-sig-geo might be a better place to ask).
>
> -Don
>
> --
> Don MacQueen
>
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
>
>
>
>
>
> On 5/26/16, 2:30 PM, "R-help on behalf of Miluji Sb"
>  wrote:
>
> >Dear all,
> >
> >
> >I have downloaded the NUTS 2 level data from
> >library("rgdal")
> >library("RColorBrewer")
> >library("classInt")
> >#library("SmarterPoland")
> >library(fields)
> >
> ># Download Administrative Level data from EuroStat
> >temp <- tempfile(fileext = ".zip")
> >download.file("
> >
> http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2010_60M_SH.zip
> >",
> >  temp)
> >unzip(temp)
> >
> ># Read data
> >EU_NUTS <- readOGR(dsn = "./NUTS_2010_60M_SH/data", layer =
> >"NUTS_RG_60M_2010")
> >
> ># Subset NUTS 2 level data
> >map_nuts2 <- subset(EU_NUTS, STAT_LEVL_ == 2)
> >
> >I also have data for a variable by coordinates, which looks like this:
> >
> >structure(list(LON = c(-125.25, -124.75, -124.25, -124.25, -124.25,
> >-124.25), LAT = c(49.75, 49.25, 42.75, 43.25, 48.75, 49.25),
> >yr = c(2.91457704560515, 9.94774197180345, -2.71956412885765,
> >-0.466213169185147, -36.6645659563374, 10.5168056769535)), .Names =
> >c("LON",
> >"LAT", "yr"), row.names = c(NA, 6L), class = "data.frame")
> >
> >I would like to match the coordinates to their corresponding NUTS 2 region
> >- is this possible? Any help will be highly appreciated. Thank you!
> >
> >Sincerely,
> >
> >Milu
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
>


[R] Convert ncdf data to dataframe

2016-06-02 Thread Miluji Sb
Dear all,

I have used the following code to read in a ncdf file

library(chron)
library(lattice)
library(ncdf4)
library(data.table)

ncname <- ("/file_path")
ncfname <- paste(ncname, ".nc", sep = "")
dname <- "ssl"  # ssl: storm surge level

ncin <- nc_open(ncfname)
print(ncin)

The attributes of the file are:

 4 variables (excluding dimension variables):
double longitude[row]
long_name: longitude
units: degrees_east
standard_name: longitude
double latitude[row]
long_name: latitude
units: degrees_north
standard_name: latitude
double ssl[col,row]
long_name: storm surge level
units: m
_FillValue: -9
scale_factor: 1
add_offset: 1
float RP[col]
long_name: return period
units: yr
Contents: The RPs have been estimated following the Peak Over
Threshold Method (see reference below)
Starting date: 01-Dec-2009
End date: 30-Nov-2099 21:00:00
 2 dimensions:
col  Size:8
row  Size:2242

I would like to convert the data into a dataframe by longitude and
latitude. Is that possible? Thank you!
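Yes. A self-contained sketch of the flattening step: it first writes a tiny NetCDF file with the same layout as described above (ssl[col,row] plus per-row longitude/latitude), then reads it back and builds one data.frame row per (col, row) pair. All values here are made up:

```r
library(ncdf4)

# Write a tiny file with the same layout: ssl[col,row] + per-row lon/lat
tf      <- tempfile(fileext = ".nc")
col_dim <- ncdim_def("col", "", 1:2)
row_dim <- ncdim_def("row", "", 1:3)
v_lon <- ncvar_def("longitude", "degrees_east",  list(row_dim), prec = "double")
v_lat <- ncvar_def("latitude",  "degrees_north", list(row_dim), prec = "double")
v_ssl <- ncvar_def("ssl", "m", list(col_dim, row_dim), prec = "double")
nc <- nc_create(tf, list(v_lon, v_lat, v_ssl))
ncvar_put(nc, v_lon, c(-71.1, -70.9, -70.7))
ncvar_put(nc, v_lat, c(42.3, 42.4, 42.5))
ncvar_put(nc, v_ssl, matrix(1:6, nrow = 2))   # 2 return periods x 3 points
nc_close(nc)

# Read back and flatten: one data.frame row per (col, row) combination
nc  <- nc_open(tf)
lon <- ncvar_get(nc, "longitude")
lat <- ncvar_get(nc, "latitude")
ssl <- ncvar_get(nc, "ssl")                   # matrix, dim = c(col, row)
nc_close(nc)

df <- data.frame(longitude = rep(lon, each = nrow(ssl)),
                 latitude  = rep(lat, each = nrow(ssl)),
                 col       = rep(seq_len(nrow(ssl)), times = length(lon)),
                 ssl       = as.vector(ssl))
```

With the real file, only the reading half is needed; the `col` dimension corresponds to the eight return periods in the RP variable.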

Sincerely,

Milu



[R] Aggregate data to lower resolution

2016-07-21 Thread Miluji Sb
Dear all,

I have the following GDP data by latitude and longitude at 0.5 degree by
0.5 degree.

temp <- dput(head(ptsDF,10))
structure(list(longitude = c(-68.25, -67.75, -67.25, -68.25,
-67.75, -67.25, -71.25, -70.75, -69.25, -68.75), latitude = c(-54.75,
-54.75, -54.75, -54.25, -54.25, -54.25, -53.75, -53.75, -53.75,
-53.75), GDP = c(1.683046, 0.3212307, 0.0486207, 0.1223268, 0.0171909,
0.0062104, 0.22379, 0.1406729, 0.0030038, 0.0057422)), .Names =
c("longitude",
"latitude", "GDP"), row.names = c(4L, 17L, 30L, 43L, 56L, 69L,
82L, 95L, 108L, 121L), class = "data.frame")

I would like to aggregate the data to 1 degree by 1 degree. I understand that
the first step is to convert to raster. I have tried:

rasterDF <- rasterFromXYZ(temp)
r <- aggregate(rasterDF,fact=2, fun=sum)

But this does not seem to work. Could anyone help me out please? Thank you
in advance.

Sincerely,

Milu



Re: [R] Aggregate data to lower resolution

2016-07-22 Thread Miluji Sb
Dear Jean,

Thank you so much for your reply and the solution, This does work. I was
wondering is this similar to 'rasterFromXYZ'? Thanks again!

Sincerely,

Milu

On Fri, Jul 22, 2016 at 3:06 PM, Adams, Jean  wrote:

> Milu,
>
> Perhaps an approach like this would work.  In the example below, I
> calculate the mean GDP for each 1 degree by 1 degree.
>
> temp$long1 <- floor(temp$longitude)
> temp$lat1 <- floor(temp$latitude)
> temp1 <- aggregate(GDP ~ long1 + lat1, temp, mean)
>
>   long1 lat1GDP
> 1   -69  -55 0.90268640
> 2   -68  -55 0.09831317
> 3   -72  -54 0.22379000
> 4   -71  -54 0.14067290
> 5   -70  -54 0.00300380
> 6   -69  -54 0.00574220
>
> Jean
>
> On Thu, Jul 21, 2016 at 3:57 PM, Miluji Sb  wrote:
>
>> Dear all,
>>
>> I have the following GDP data by latitude and longitude at 0.5 degree by
>> 0.5 degree.
>>
>> temp <- dput(head(ptsDF,10))
>> structure(list(longitude = c(-68.25, -67.75, -67.25, -68.25,
>> -67.75, -67.25, -71.25, -70.75, -69.25, -68.75), latitude = c(-54.75,
>> -54.75, -54.75, -54.25, -54.25, -54.25, -53.75, -53.75, -53.75,
>> -53.75), GDP = c(1.683046, 0.3212307, 0.0486207, 0.1223268, 0.0171909,
>> 0.0062104, 0.22379, 0.1406729, 0.0030038, 0.0057422)), .Names =
>> c("longitude",
>> "latitude", "GDP"), row.names = c(4L, 17L, 30L, 43L, 56L, 69L,
>> 82L, 95L, 108L, 121L), class = "data.frame")
>>
>> I would like to aggregate the data 1 degree by 1 degree. I understand that
>> the first step is to convert to raster. I have tried:
>>
>> rasterDF <- rasterFromXYZ(temp)
>> r <- aggregate(rasterDF,fact=2, fun=sum)
>>
>> But this does not seem to work. Could anyone help me out please? Thank you
>> in advance.
>>
>> Sincerely,
>>
>> Milu
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>



[R] Country names from latitude and longitude

2016-07-25 Thread Miluji Sb
I have the following data at 0.5 degree by 0.5 degree.

temp <- dput(head(ptsDF,10))
structure(list(longitude = c(-68.25, -67.75, -67.25, -68.25,
-67.75, -67.25, -71.25, -70.75, -69.25, -68.75), latitude = c(-54.75,
-54.75, -54.75, -54.25, -54.25, -54.25, -53.75, -53.75, -53.75,
-53.75), GDP = c(1.683046, 0.3212307, 0.0486207, 0.1223268, 0.0171909,
0.0062104, 0.22379, 0.1406729, 0.0030038, 0.0057422)), .Names =
c("longitude",
"latitude", "GDP"), row.names = c(4L, 17L, 30L, 43L, 56L, 69L,
82L, 95L, 108L, 121L), class = "data.frame")

I would like to add the corresponding country names to each of the
coordinates. This is what I have done:

library(data.table)
library(rgdal)
library(reshape2)
library(dplyr)
library(tidyr)
library(lubridate)
library(maps)
library(countrycode)
library(ggplot2)
library(raster)

coord_cells <-temp [,c(1,2)]
pts_cells <-
SpatialPoints(coord_cells,proj4string=CRS(proj4string(worldmap)))
indices_cells <- over(pts_cells, worldmap,na.rm=TRUE)
foo_cells<-indices_cells[,c(3,5)] # Keep ISO3 and country names only
new_data <- cbind(foo_cells, temp)

However, I get a large number of NAs for coordinates which should have
corresponding countries. What am I doing wrong? Any suggestions? Thank you.

Sincerely,

Milu



[R] Climate data in R

2016-08-01 Thread Miluji Sb
Dear all,

I have a set of coordinates. Is it possible to extract climate data
(temperature and precipitation) by coordinates using R packages such as
rnoaa?

For example;

out <- ncdc(datasetid='ANNUAL', stationid='GHCND:USW00014895',
datatypeid='TEMP')

But instead of stationid can I pass a list of coordinates through it?
Thanks a lot!

Sincerely,

Milu



[R] Marginal effects with plm

2018-09-05 Thread Miluji Sb
Dear all,

I am running the following panel regression;

plm1 <- plm(formula = log(y) ~ x1 + I(x1^2) + heat*debt_dummy + tt, data =
df, index=c("region","year"))

where 'df' is a pdata.frame. I would like to obtain marginal effects of 'y'
for the variable 'x1'. I have tried the packages 'prediction' and 'margins'
without luck.

Is it possible to obtain marginal effects with 'plm'? Any help will be
highly appreciated. Thank you.

Error in UseMethod("predict") :
  no applicable method for 'predict' applied to an object of class
"c('plm', 'panelmodel')"
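Even without a predict() method, the marginal effect implied by a quadratic term can be computed directly from the coefficients: with log(y) = b1*x1 + b2*x1^2 + ..., d log(y)/d x1 = b1 + 2*b2*x1, evaluated e.g. at the sample mean of x1. A sketch with made-up coefficient values (with a fitted model they would come from coef(plm1)):

```r
# Made-up coefficients standing in for coef(plm1)["x1"] and coef(plm1)["I(x1^2)"]
b1 <- 0.05     # coefficient on x1
b2 <- -0.002   # coefficient on I(x1^2)

# d log(y) / d x1 = b1 + 2 * b2 * x1
me_x1 <- function(x1) b1 + 2 * b2 * x1

me_x1(0)    # 0.05: marginal effect at x1 = 0
me_x1(10)   # about 0.01: the effect shrinks as the quadratic bends
```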

Sincerely,

Milu



Re: [R] Marginal effects with plm

2018-09-06 Thread Miluji Sb
Dear John,

Thank you very much for the solution and the suggestion. I have tried the
following;

plm1 <- plm(formula = log(gva_ind) ~  poly(x1, 2, raw=TRUE) +
heat*debt_dummy + tt, data = df, index=c("region","year"))

Ef.hd <- Effect(c("heat", "debt_dummy"), plm1)

But get the following error;  - Error in UseMethod("droplevels") : no
applicable method for 'droplevels' applied to an object of class "NULL"

Is this something to do with the way the plm object? Thanks again!

Sincerely,

Milu

On Thu, Sep 6, 2018 at 1:12 AM Fox, John  wrote:

> Dear Milu,
>
> Depending upon what you mean by "marginal effects," you might try the
> effects package. For example, for your model, try
>
> (Ef.hd <- Effect(c("heat", "debt_dummy"), plm1))
> plot(Ef.hd)
>
> A couple of comments about the model: I'd prefer to specify the formula as
> log(y) ~ poly(x1, 2) + heat*debt + tt or log(y) ~ poly(x1, 2, raw=TRUE) +
> heat*debt + tt (assuming that debt_dummy is a precoded dummy regressor for
> a factor debt).
>
> I hope this helps,
>  John
>
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> Web: socialsciences.mcmaster.ca/jfox/
>
>
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Miluji
> > Sb
> > Sent: Wednesday, September 5, 2018 6:30 PM
> > To: r-help mailing list 
> > Subject: [R] Marginal effects with plm
> >
> > Dear all,
> >
> > I am running the following panel regression;
> >
> > plm1 <- plm(formula = log(y) ~ x1 + I(x1^2) + heat*debt_dummy + tt, data
> > = df, index=c("region","year"))
> >
> > where 'df' is a pdata.frame. I would like to obtain marginal effects of
> > 'y'
> > for the variable 'x1'. I have tried the packages 'prediction' and
> > 'margins'
> > without luck.
> >
> > Is it possible to obtain marginal effects with 'plm'? Any help will be
> > highly appreciated. Thank you.
> >
> > Error in UseMethod("predict") :
> >   no applicable method for 'predict' applied to an object of class
> > "c('plm', 'panelmodel')"
> >
> > Sincerely,
> >
> > Milu
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Marginal effects with plm

2018-09-06 Thread Miluji Sb
Dear John,

Apologies for not providing reproducible example. I just tried with a plm
example but ran into the same issue;

library(plm)
data("Produc", package = "plm")
zz <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, data = Produc,
index = c("state","year"))

Ef.hd <- Effect(c("pc", "emp", "unemp"), zz)

Error in UseMethod("droplevels") :
  no applicable method for 'droplevels' applied to an object of class "NULL"

What am I doing wrong? Thanks again.

Sincerely,

Milu



Re: [R] Marginal effects with plm

2018-09-06 Thread Miluji Sb
Dear Ista,

Thanks for your reply. I tried both "prediction" and "margins", but neither
of them seems to work with plm.

Sincerely,

Milu

On Thu, Sep 6, 2018 at 3:04 PM Ista Zahn  wrote:

> You might be interested in the "prediction" and "margins" packages.
>
> --Ista
>
> On Wed, Sep 5, 2018 at 6:30 PM Miluji Sb  wrote:
> >
> > Dear all,
> >
> > I am running the following panel regression;
> >
> > plm1 <- plm(formula = log(y) ~ x1 + I(x1^2) + heat*debt_dummy + tt, data
> =
> > df, index=c("region","year"))
> >
> > where 'df' is a pdata.frame. I would like to obtain marginal effects of
> 'y'
> > for the variable 'x1'. I have tried the packages 'prediction' and
> 'margins'
> > without luck.
> >
> > Is it possible to obtain marginal effects with 'plm'? Any help will be
> > highly appreciated. Thank you.
> >
> > Error in UseMethod("predict") :
> >   no applicable method for 'predict' applied to an object of class
> > "c('plm', 'panelmodel')"
> >
> > Sincerely,
> >
> > Milu
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>



[R] Smoothing by group - Panel data - exponential/loess

2018-10-07 Thread Miluji Sb
Dear all,

I have panel data for a series (g) for three time periods. The variable is
likely autocorrelated. I would like to generate a new variable using
exponential/loess smoothing by group (gid).

For time series, I could have done something like this;

smoothdf <- data.frame(
  x = 1:n,
  y = as.vector(smooth(dat$g)),
  method = "smooth()"
)

But I am confused about the panel data setting. Apologies if this is a stat
question along with an R query. Any help will be greatly appreciated.

### data
structure(list(gid = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L,
4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L,
9L), year = c(2030L, 2050L, 2100L, 2030L, 2050L, 2100L, 2030L,
2050L, 2100L, 2030L, 2050L, 2100L, 2030L, 2050L, 2100L, 2030L,
2050L, 2100L, 2030L, 2050L, 2100L, 2030L, 2050L, 2100L, 2030L,
2050L, 2100L), g = c(9.24e-05, 0.0001133, 6.3e-05, 7.72e-10,
1.41e-09, 4.68e-09, 0.0001736, 0.0002286, 0.0001541, 1.87e-15,
3.76e-15, 8.52e-15, 0.0001822, 0.0002391, 0.0001348, 3.69e-06,
8.11e-06, 8.63e-06, 3.41e-06, 7.32e-06, 7.18e-06, 8.47e-07, 1.83e-06,
1.84e-06, 1.13e-06, 2.37e-06, 2.15e-06)), class = "data.frame", row.names =
c(NA,
-27L))
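One way to apply a smoother within each panel is ave(), which splits a vector by group, applies a function, and returns the results in the original row order (this assumes rows are sorted by year within gid, as in the data above; stats::smooth() is the running-median smoother from the time-series snippet):

```r
# First two panels of the data above (rows sorted by year within gid)
dat <- data.frame(gid  = rep(1:2, each = 3),
                  year = rep(c(2030L, 2050L, 2100L), times = 2),
                  g    = c(9.24e-05, 1.133e-04, 6.30e-05,
                           7.72e-10, 1.41e-09, 4.68e-09))

# ave() applies smooth() separately within each gid and keeps row order
dat$g_smooth <- ave(dat$g, dat$gid,
                    FUN = function(v) as.vector(smooth(v)))
```

The same pattern works with any per-group smoother, e.g. a loess fit returned via predict() inside the FUN.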


Sincerely,

Milu



[R] Obtain coordinates for city names

2018-11-06 Thread Miluji Sb
I have a dataframe (more than 50,000 observations) of cities in the EU.

My goal is to assign NUTS-2 code to each of these cities. However, I am not
aware of any direct way of achieving this, so I wanted to first assign
coordinates to the cities and then use the 'over' function to match with
NUTS regions from EU shapefile.

I have tried to use 'geocode' from the ggmap package but there is a 2,500
per day limit. Is there any other solution? Any help will be greatly
appreciated.

Cross-posted yesterday on r-sig-geo.

Sincerely,

Milu



[R] Plot coordinates on world map with Robinson CRS - ggplot2

2019-02-18 Thread Miluji Sb
Dear all,

I am trying to plot coordinates on a world map with Robinson CRS. While the
world map is generated without any issues, when I try to plot the points -
I only get a single point.

The code I am using and the coordinates data is below. What am I doing
wrong? Any help/suggestions will be highly appreciated.

library(data.table)
library(foreign)
library(readstata13)
library(rgdal)
library(maptools)
library(ggplot2)
library(dplyr)

load(url("
https://github.com/valentinitnelav/RandomScripts/blob/master/NaturalEarth.RData?raw=true
"))

PROJ <- "+proj=robin +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84
+units=m +no_defs"

NE_countries_rob  <- spTransform(NE_countries, CRSobj = PROJ)
NE_graticules_rob <- spTransform(NE_graticules, CRSobj = PROJ)
NE_box_rob<- spTransform(NE_box, CRSobj = PROJ)

# project long-lat coordinates for graticule label data frames
# (two extra columns with projected XY are created)
prj.coord <- project(cbind(lbl.Y$lon, lbl.Y$lat), proj=PROJ)
lbl.Y.prj <- cbind(prj.coord, lbl.Y)
names(lbl.Y.prj)[1:2] <- c("X.prj","Y.prj")

prj.coord <- project(cbind(lbl.X$lon, lbl.X$lat), proj=PROJ)
lbl.X.prj <- cbind(prj.coord, lbl.X)
names(lbl.X.prj)[1:2] <- c("X.prj","Y.prj")

m <- ggplot() +
  # add Natural Earth countries projected to Robinson; black border, gray fill
  geom_polygon(data=NE_countries_rob, aes(long, lat, group=group),
               colour="black", fill="gray80", size = 0.25) +
  # Note: the "Regions defined for each Polygons" warning has to do with the
  # fortify transformation, which might get deprecated in the future!
  # Alternatively, use map_data(NE_countries) to transform to a data frame
  # and then use project() to change to the desired projection.
  # add Natural Earth box projected to Robinson
  geom_polygon(data=NE_box_rob, aes(x=long, y=lat), colour="black",
               fill="transparent", size = 0.25) +
  # add graticules projected to Robinson
  geom_path(data=NE_graticules_rob, aes(long, lat, group=group),
            linetype="dotted", color="grey50", size = 0.25) +
  # add graticule labels - latitude and longitude
  geom_text(data = lbl.Y.prj, aes(x = X.prj, y = Y.prj, label = lbl),
            color="grey50", size=2) +
  geom_text(data = lbl.X.prj, aes(x = X.prj, y = Y.prj, label = lbl),
            color="grey50", size=2) +
  # the default ratio = 1 in coord_fixed ensures that one unit on the x-axis
  # is the same length as one unit on the y-axis
  coord_fixed(ratio = 1) +
  geom_point(data=df,
             aes(x=lon, y=lat), colour="Deep Pink",
             fill="Pink", pch=21, size=2, alpha=I(0.7)) +
  # remove the background and default gridlines
  theme_void()

## coordinates dataframe
dput(df)
structure(list(lon = c(2.67569724621467, 17.5766416259819,
28.4126232192772,
23.8147674538232, 29.8917589327105), lat = c(28.1503115976162,
-12.3388787380201, 9.78891068739477, -22.1873831176644, -3.36546931479253
)), class = "data.frame", row.names = c(NA, -5L))
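
For what it's worth, a likely cause (my assumption; not confirmed above) is that df is still in long-lat degrees while every other layer has been transformed to Robinson metres, so all five points collapse near the map's origin. A minimal sketch of projecting the points first, reusing the PROJ string defined above:

```r
library(rgdal)  # for project(), as already used for the graticule labels

PROJ <- "+proj=robin +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"

# the five points from dput(df) above, rounded for brevity
df <- data.frame(lon = c(2.676, 17.577, 28.413, 23.815, 29.892),
                 lat = c(28.150, -12.339, 9.789, -22.187, -3.365))

# project the point coordinates exactly as lbl.X / lbl.Y were projected
prj.pts <- rgdal::project(cbind(df$lon, df$lat), proj = PROJ)
df.prj <- cbind(as.data.frame(prj.pts), df)
names(df.prj)[1:2] <- c("X.prj", "Y.prj")

# then plot the projected columns instead of the raw long-lat ones:
# geom_point(data = df.prj, aes(x = X.prj, y = Y.prj), ...)
```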

Sincerely,

Milu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Two Time Fixed Effects - LFE package

2015-11-14 Thread Miluji Sb
I have weekly panel data for more than a hundred cities. The independent
variables are temperature and precipitation. The time dimensions are year
and week; the cities likely have time-invariant characteristics, and all of
these are important to control for in proper estimation.

Could I use the LFE (or plm) package to estimate something like this by
including the location and two time fixed-effects?

felm(outcome ~ temperature + precipitation | city + year + week)
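
For what it's worth, a base-R sketch (simulated data, made-up names) of what this two-way time fixed-effects specification amounts to: with year and week coded as factors, felm's projected-out effects give the same slope estimates as the explicit dummy regression below.

```r
set.seed(42)
n <- 500
d <- data.frame(
  city = factor(sample(paste0("city", 1:20), n, replace = TRUE)),
  year = factor(sample(2010:2014, n, replace = TRUE)),
  week = factor(sample(1:52, n, replace = TRUE)),
  temperature = rnorm(n, 15, 8),
  precipitation = rexp(n)
)
d$outcome <- 0.5 * d$temperature - 0.3 * d$precipitation + rnorm(n)

# felm(outcome ~ temperature + precipitation | city + year + week, data = d)
# from the lfe package estimates the same slopes as dummying out all three
# factors explicitly with lm():
fit <- lm(outcome ~ temperature + precipitation + city + year + week, data = d)
coef(fit)[c("temperature", "precipitation")]
```

So the specification is feasible; lfe just sweeps out the dummies more efficiently when there are many cities.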

Thanks!

MS



Re: [R] Two Time Fixed Effects - LFE package

2015-11-15 Thread Miluji Sb
Dear Andrew,

Thank you for your reply. It's an R question. The weeks are coded as 1-53
for each year and I would like to control for weeks and years as time fixed
effects.

Will this be an issue if I estimate this type of regression using the LFE
package?

felm(outcome ~ temperature + precipitation | city + year + week)

Thanks again!

Sincerely,

MS

On Sun, Nov 15, 2015 at 12:13 AM, Andrew Crane-Droesch 
wrote:

> Is this an R question or an econometrics question?  I'll assume that it is
> an R question.  If your weeks are coded sequentially (i.e.: weeks since a
> particular date), then they'll be strictly determined by year.  If however
> you're interested in the effect of a particular week of the year (week 7,
> for example), then you'll need to recode your week variable as a factor
> with 52 levels.  For that you'd likely need the "%%" operator.  For example:
>
> 1> 1:10%%3
>  [1] 1 2 0 1 2 0 1 2 0 1
>
>
>
>
> On 11/14/2015 05:18 PM, Miluji Sb wrote:
>
>> I have weekly panel data for more than a hundred cities. The independent
>> variables are temperature and precipitation. The time dimensions are year
>> and week and likely have time invariant characteristics and are all
>> important for proper estimation.
>>
>> Could I use the LFE (or plm) package to estimate something like this by
>> including the location and two time fixed-effects?
>>
>> felm(outcome ~ temperature + precipitation | city + year + week
>>
>> Thanks!
>>
>> MS
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

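
Picking up Andrew's "%%" hint, a base-R sketch of turning a sequential week counter into week-of-year and year indices (assuming 52-week years for simplicity; real calendars have 53 ISO weeks in some years):

```r
seq_week <- 1:106                            # e.g. two 52-week years, numbered sequentially

week_of_year <- ((seq_week - 1) %% 52) + 1   # maps to 1..52 rather than 0..51
year_index   <- ((seq_week - 1) %/% 52) + 1  # 1 for the first year, 2 for the second

# as factors, ready to be used as fixed effects
week_factor <- factor(week_of_year, levels = 1:52)
year_factor <- factor(year_index)
```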


[R] Split Strings

2016-01-17 Thread Miluji Sb
I have a list of strings of different lengths and would like to split each
string by underscore "_"

pc_m2_45_ssp3_wheat
pc_m2_45_ssp3_wheat
ssp3_maize
m2_wheat

I would like to separate each part of the string into different columns
such as

pc m2 45 ssp3 wheat

But because of the different lengths, I would like NA in the columns for
the variables that have fewer parts, such as

NA NA NA m2 wheat

I have tried unlist(strsplit(x, "_")) to split; it works for one string
but not for the list, where it gives a "non-character argument" error. I
would highly appreciate any help. Thank you!
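
For reference (not from the thread), a base-R sketch of the split-and-left-pad idea; the "non-character argument" error typically means strsplit() was given the list itself rather than a character vector, hence the unlist():

```r
strings <- list("pc_m2_45_ssp3_wheat", "pc_m2_45_ssp3_wheat",
                "ssp3_maize", "m2_wheat")

# strsplit() wants a character vector, not a list
parts <- strsplit(unlist(strings), "_")
max_len <- max(lengths(parts))

# left-pad shorter splits with NA so the trailing elements line up
padded <- t(vapply(parts,
                   function(p) c(rep(NA_character_, max_len - length(p)), p),
                   character(max_len)))
as.data.frame(padded, stringsAsFactors = FALSE)
```

Leading NAs match the example above, where the shorter strings are assumed to be missing their first elements.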



Re: [R] Split Strings

2016-01-18 Thread Miluji Sb
Thank you everyone for the codes and the link. They work well!

Mr. Lemon, thank you for the detailed code and the explanations. I
appreciate it. One thing though, in the last line

sapply(split_strings,fill_strings,list(max_length,element_sets))

should it be unlist instead of list? With list() I get this error: "Error in
FUN(X[[i]], ...) : (list) object cannot be coerced to type 'integer'".
Thanks again!
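
An aside on the error (my reading, with toy names): sapply() passes extra arguments to the function one by one, so wrapping them in list() hands the entire list to the first extra parameter. Dropping the wrapper, e.g. sapply(split_strings, fill_strings, max_length, element_sets), matches the arguments as intended.

```r
times_n <- function(x, n) x * n

# extra arguments after the function are matched positionally/by name, one each
sapply(1:3, times_n, n = 2)        # returns c(2, 4, 6)

# wrapping them in list() passes the list itself as `n`,
# which triggers "cannot be coerced" style errors:
res <- try(sapply(1:3, times_n, list(2)), silent = TRUE)
inherits(res, "try-error")         # TRUE
```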



On Mon, Jan 18, 2016 at 9:19 AM, Jim Lemon  wrote:

> Hi Miluji,
> While the other answers are correct in general, I noticed that your
> request was for the elements of an incomplete string to be placed in the
> same positions as in the complete strings. Perhaps this will help:
>
> strings<-list("pc_m2_45_ssp3_wheat","pc_m2_45_ssp3_wheat",
>  "ssp3_maize","m2_wheat","pc_m2_45_ssp3_maize")
> split_strings<-strsplit(unlist(strings),"_")
> max_length <- max(sapply(split_strings,length))
> complete_sets<-split_strings[sapply(split_strings,length)==max_length]
> element_sets<-list()
>
> # build a list with the unique elements of each complete string
> for(i in 1:max_length)
>  element_sets[[i]]<-unique(sapply(complete_sets,"[",i))
>
> # function to guess the position of the elements in a partial string
> # and return them in the hopefully correct positions
> fill_strings<-function(split_string,max_length,element_sets) {
>  if(length(split_string) < max_length) {
>   new_split_string<-rep(NA,max_length)
>   for(i in 1:length(split_string)) {
>for(j in 1:length(complete_sets)) {
> if(grep(split_string[i],element_sets[j]))
>  new_split_string[j]<-split_string[i]
>}
>   }
>   return(new_split_string)
>  }
>  return(split_string)
> }
>
> # however, if you know that the incomplete strings will always
> # be composed of the last elements in the complete strings
> fill_strings<-function(split_string,max_length) {
>  lenstring<-length(split_string)
>  if(lenstring < max_length)
>   split_string<-c(rep(NA,max_length-lenstring),split_string)
>  return(split_string)
> }
>
> sapply(split_strings,fill_strings,list(max_length,element_sets))
>
> Jim
>
> On Mon, Jan 18, 2016 at 7:56 AM, Miluji Sb  wrote:
>
>> I have a list of strings of different lengths and would like to split each
>> string by underscore "_"
>>
>> pc_m2_45_ssp3_wheat
>> pc_m2_45_ssp3_wheat
>> ssp3_maize
>> m2_wheat
>>
>> I would like to separate each part of the string into different columns
>> such as
>>
>> pc m2 45 ssp3 wheat
>>
>> But because of the different lengths - I would like NA in the columns for
>> the variables have fewer parts such as
>>
>> NA NA NA m2 wheat
>>
>> I have tried unlist(strsplit(x, "_")) to split, it works for one variable
>> but not for the list - gives me "non-character argument" error. I would
>> highly appreciate any help. Thank you!
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>



[R] Map of Europe at NUTS 2 Level

2016-03-10 Thread Miluji Sb
Dear all.

I would like to draw a map of France, Italy, Spain, and Portugal at NUTS 2
level. I used the following code:

library("rgdal")
library("RColorBrewer")
library("classInt")
#library("SmarterPoland")
library(fields)

# Download Administrative Level data from EuroStat
temp <- tempfile(fileext = ".zip")
download.file("http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2010_60M_SH.zip",
              temp)
unzip(temp)

# Read data
EU_NUTS <- readOGR(dsn = "./NUTS_2010_60M_SH/data", layer = "NUTS_RG_60M_2010")

# Subset NUTS 2 level data
map_nuts2 <- subset(EU_NUTS, STAT_LEVL_ == 2)

# Draw basic plot
plot(map_nuts2)

This does produce a plot, but it's rather 'ugly'. Is there any way I can
subset the data further and draw a map for France, Italy, Spain, and
Portugal only? Thank you very much!
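
For what it's worth, one way to subset before plotting; this sketch assumes the Eurostat layer carries a CNTR_CODE attribute holding two-letter country codes (check with names(map_nuts2@data) first):

```r
# keep only the NUTS 2 regions of the four countries of interest
keep <- c("FR", "IT", "ES", "PT")              # assumed country codes
map_sel <- map_nuts2[map_nuts2$CNTR_CODE %in% keep, ]

# a slightly less 'ugly' base plot of the subset
plot(map_sel, col = "grey90", border = "grey40")
```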

Sincerely,

Milu


[R] Adding Two-Headed Arrow in map legend

2016-04-08 Thread Miluji Sb
I am trying to draw maps for the world using:

library(rworldmap)
library(maptools)
library(RColorBrewer)


tmp2<- dput(head(pece,10))
structure(list(iso3 = c("AUS", "AUT", "BEL", "CAN", "CHE", "CHL",
"CZE", "DEU", "DNK", "ESP"), eps_score = c(0.877343773841858,
2.68984365463257, 1.31406247615814, 1.98046875, 2.6155540466,
NA, 1.44414067268372, 2.34257817268372, 2.89687490463257, 2.15937495231628
), gov_eff = c(1.7649562899, 1.8567421659, 1.7450476837,
1.8841785876, 1.99181815710935, 1.2147377396, 0.865833342075348,
1.6499602636, 2.15416664878527, 1.3682975705), sh_va_enint =
c(13.4375638961792,
8.90904521942139, 10.368335723877, 14.0469560623169, NA, NA,
13.5679216384888, 9.67090892791748, 10.5978908538818, 8.34146690368652
), rd_in_va = c(2.17547988891602, 2.47147130966187, 2.53955459594727,
2.01138758659363, NA, NA, 1.49587619304657, 2.72330951690674,
2.5316367149353, 1.48551619052887)), datalabel = "", time.stamp = " 9 Mar
2016 17:43", .Names = c("iso3",
"eps_score", "gov_eff", "sh_va_enint", "rd_in_va"), formats = c("%9s",
"%8.0g", "%10.0g", "%9.0g", "%9.0g"), types = c(6L, 254L, 255L,
254L, 254L), val.labels = c("", "", "", "", ""), var.labels = c("",
"(mean) eps_score", "(mean) gov_eff", "(mean) sh_va_enint", "(mean)
rd_in_va"
), expansion.fields = list(c("_dta", "ReS_i", "countrycode"),
c("_dta", "ReS_ver", "v.2"), c("_dta", "ReS_j", "year"),
c("_dta", "ReS_str", "0"), c("_dta", "ReS_Xij", "a_"), c("_dta",
"__JVarLab", "ACT"), c("_dta", "__XijVarLabrdd_", "(sum) rdd"
), c("_dta", "__XijVarLabp", "Value"), c("_dta", "__XijVarLabpop",
"Population"), c("_dta", "__XijVarLabest_lu_f", "Source of lu"
), c("_dta", "__XijVarLablu", "Percentage of No Schooling"
), c("_dta", "__XijVarLabest_lp_f", "Source of lp"), c("_dta",
"__XijVarLablp", "Percentage of Primary"), c("_dta", "__XijVarLablh",
"Percentage of Tertiary"), c("_dta", "__XijVarLabest_lh_f",
"Source of lh"), c("_dta", "__XijVarLabls", "Percentage of Secondary"
), c("_dta", "__XijVarLabest_ls_f", "Source of ls"), c("_dta",
"__XijVarLabvalue", "Value"), c("_dta", "_TStvar", "year"
), c("_dta", "_TSpanel", "id2"), c("_dta", "_TSdelta",
"+1.0X+000"
), c("_dta", "_TSitrvl", "1"), c("_dta", "tis", "year"),
c("_dta", "iis", "id2")), version = 12L, row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")
n <- joinCountryData2Map(pece, joinCode="ISO3", nameJoinColumn="iso3")
n <- n[-which(row.names(n)=='Antarctica'),]

# EPS
colourPalette <- rev(brewer.pal(7, "RdYlGn"))

eps <- mapCountryData(n, nameColumnToPlot="eps_score", mapTitle="EPS Score",
                      colourPalette=colourPalette, catMethod="fixedWidth",
                      missingCountryCol = "white", addLegend=FALSE)
do.call(addMapLegend, c(eps, legendLabels="all", legendWidth=0.5))

Instead of adding numeric based legend, I would like to add a two-headed
arrow with some text. I would be grateful for any help. Thank you!

Sincerely,

Milu



Re: [R] Adding Two-Headed Arrow in map legend

2016-04-09 Thread Miluji Sb
Forgot to copy the list

Dear Jim,

Thank you for your reply. I must be doing something wrong. This is my
command to plot a map of Europe:

eps_europe <- mapCountryData(n, nameColumnToPlot="eps_score",
                             mapTitle="EPS Score - Europe",
                             colourPalette=colourPalette, catMethod="fixedWidth",
                             missingCountryCol = "white", mapRegion="Europe",
                             addLegend=FALSE)

The following command does not seem to add the arrow. What am I doing wrong?

do.call(addMapLegend, c(eps_europe, legendLabels="none",
arrows(-100,-140,100,-140,code=3)))

Thank you again. I really appreciate it.

Sincerely,

Milu

On Sat, Apr 9, 2016 at 12:20 PM, Jim Lemon  wrote:

> Hi Miluji,
> Try this:
>
> arrows(-100,-140,100,-140,code=3)
>
> Jim
>
>
> On Fri, Apr 8, 2016 at 10:24 PM, Miluji Sb  wrote:
> > I am trying to draw maps for the world using:
> >
> > library(rworldmap)
> > library(maptools)
> > library(RColorBrewer)
> >
> >
> > tmp2<- dput(head(pece,10))
> > structure(list(iso3 = c("AUS", "AUT", "BEL", "CAN", "CHE", "CHL",
> > "CZE", "DEU", "DNK", "ESP"), eps_score = c(0.877343773841858,
> > 2.68984365463257, 1.31406247615814, 1.98046875, 2.6155540466,
> > NA, 1.44414067268372, 2.34257817268372, 2.89687490463257,
> 2.15937495231628
> > ), gov_eff = c(1.7649562899, 1.8567421659, 1.7450476837,
> > 1.8841785876, 1.99181815710935, 1.2147377396, 0.865833342075348,
> > 1.6499602636, 2.15416664878527, 1.3682975705), sh_va_enint =
> > c(13.4375638961792,
> > 8.90904521942139, 10.368335723877, 14.0469560623169, NA, NA,
> > 13.5679216384888, 9.67090892791748, 10.5978908538818, 8.34146690368652
> > ), rd_in_va = c(2.17547988891602, 2.47147130966187, 2.53955459594727,
> > 2.01138758659363, NA, NA, 1.49587619304657, 2.72330951690674,
> > 2.5316367149353, 1.48551619052887)), datalabel = "", time.stamp = " 9 Mar
> > 2016 17:43", .Names = c("iso3",
> > "eps_score", "gov_eff", "sh_va_enint", "rd_in_va"), formats = c("%9s",
> > "%8.0g", "%10.0g", "%9.0g", "%9.0g"), types = c(6L, 254L, 255L,
> > 254L, 254L), val.labels = c("", "", "", "", ""), var.labels = c("",
> > "(mean) eps_score", "(mean) gov_eff", "(mean) sh_va_enint", "(mean)
> > rd_in_va"
> > ), expansion.fields = list(c("_dta", "ReS_i", "countrycode"),
> > c("_dta", "ReS_ver", "v.2"), c("_dta", "ReS_j", "year"),
> > c("_dta", "ReS_str", "0"), c("_dta", "ReS_Xij", "a_"), c("_dta",
> > "__JVarLab", "ACT"), c("_dta", "__XijVarLabrdd_", "(sum) rdd"
> > ), c("_dta", "__XijVarLabp", "Value"), c("_dta", "__XijVarLabpop",
> > "Population"), c("_dta", "__XijVarLabest_lu_f", "Source of lu"
> > ), c("_dta", "__XijVarLablu", "Percentage of No Schooling"
> > ), c("_dta", "__XijVarLabest_lp_f", "Source of lp"), c("_dta",
> > "__XijVarLablp", "Percentage of Primary"), c("_dta", "__XijVarLablh",
> > "Percentage of Tertiary"), c("_dta", "__XijVarLabest_lh_f",
> > "Source of lh"), c("_dta", "__XijVarLabls", "Percentage of Secondary"
> > ), c("_dta", "__XijVarLabest_ls_f", "Source of ls"), c("_dta",
> > "__XijVarLabvalue", "Value"), c("_dta", "_TStvar", "year"
> > ), c("_dta", "_TSpanel", "id2"), c("_dta", "_TSdelta",
> > "+1.0X+000"
> > ), c("_dta", "_TSitrvl", "1"), c("_dta", "tis", "year"),
> > c("_dta", "iis", "id2")), version = 12L, row.names = c("1",
> > "2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")
> > n <- joinCountryData2Map(pece, joinCode="ISO3", nameJoinColumn="iso3")
> > n <- n[-which(row.names(n)=='Antarctica'),]
> >
> > # EPS
> > colourPalette <- rev(brewer.pal(7, "RdYlGn"))
> >
> > eps <- mapCountryData(n, nameColumnToPlot="eps_score", mapTitle="EPS
> > Score",colourPalette=colourPalette,
> >   catMethod="fixedWidth", missingCountryCol =
> "white",
> > addLegend=FALSE)
> > do.call(addMapLegend, c(eps, legendLabels="all", legendWidth=0.5))
> >
> > Instead of adding numeric based legend, I would like to add a two-headed
> > arrow with some text. I would be grateful for any help. Thank you!
> >
> > Sincerely,
> >
> > Milu
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Adding Two-Headed Arrow in map legend

2016-04-09 Thread Miluji Sb
Dear David,

Thank you for your answer. Sorry for the embarrassing mistake. However, even
when I generate a map for the whole world using:

eps <- mapCountryData(n, nameColumnToPlot="eps_score", mapTitle="EPS Score",
                      colourPalette=colourPalette, catMethod="fixedWidth",
                      missingCountryCol = "white", addLegend=FALSE)

And then use:

do.call(addMapLegend, c(eps, legendLabels="none",
arrows(-100,-140,100,-140,code=3)))

Only a legend with the colours is generated, no arrows. My session info is
below. Thanks again!

R version 3.2.4 (2016-03-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.2 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] countrycode_0.18   ggplot2_2.1.0  RColorBrewer_1.1-2 foreign_0.8-66
maptools_0.8-39rworldmap_1.3-6
[7] sp_1.2-0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.4  lattice_0.20-33  grid_3.2.4   plyr_1.8.3
gtable_0.2.0 spam_1.3-0
 [7] scales_0.4.0 tools_3.2.4  munsell_0.4.3maps_3.1.0
fields_8.3-6 colorspace_1.2-6

On Sat, Apr 9, 2016 at 7:34 PM, David Winsemius 
wrote:

>
> > On Apr 9, 2016, at 8:13 AM, Miluji Sb  wrote:
> >
> > Forgot to copy the list
> >
> > Dear Jim,
> >
> > Thank you for your reply. I must be doing something wrong, If this is my
> > command to plot a map of Europe:
> >
> > eps_europe <- mapCountryData(n, nameColumnToPlot="eps_score",
> mapTitle="EPS
> > Score - Europe",colourPalette=colourPalette,
> > catMethod="fixedWidth", missingCountryCol =
> > "white", mapRegion="Europe", addLegend=FALSE)
> >
> > The following command does not seem to add the arrow. What am I doing
> wrong?
> >
> > do.call(addMapLegend, c(eps_europe, legendLabels="none",
> > arrows(-100,-140,100,-140,code=3)))
> >
>
> Your earlier question had a full world map. That was the context for Jim's
> reply, which did plot a two headed arrow above the legend in your earlier
> question. Now you have restricted the plot region to Europe so the
> coordinates of -100,-140,100,-140 no longer are on the visible plot area.
> You need to decide where you want the arrows using sensible coordinates.
>
> --
> David.
>
>
> > Thank you again. I really appreciate it.
> >
> > Sincerely,
> >
> > Milu
> >
> > On Sat, Apr 9, 2016 at 12:20 PM, Jim Lemon  wrote:
> >
> >> Hi Miluji,
> >> Try this:
> >>
> >> arrows(-100,-140,100,-140,code=3)
> >>
> >> Jim
> >>
> >>
> >> On Fri, Apr 8, 2016 at 10:24 PM, Miluji Sb  wrote:
> >>> I am trying to draw maps for the world using:
> >>>
> >>> library(rworldmap)
> >>> library(maptools)
> >>> library(RColorBrewer)
> >>>
> >>>
> >>> tmp2<- dput(head(pece,10))
> >>> structure(list(iso3 = c("AUS", "AUT", "BEL", "CAN", "CHE", "CHL",
> >>> "CZE", "DEU", "DNK", "ESP"), eps_score = c(0.877343773841858,
> >>> 2.68984365463257, 1.31406247615814, 1.98046875, 2.6155540466,
> >>> NA, 1.44414067268372, 2.34257817268372, 2.89687490463257,
> >> 2.15937495231628
> >>> ), gov_eff = c(1.7649562899, 1.8567421659, 1.7450476837,
> >>> 1.8841785876, 1.99181815710935, 1.2147377396,
> 0.865833342075348,
> >>> 1.6499602636, 2.15416664878527, 1.3682975705), sh_va_enint =
> >>> c(13.4375638961792,
> >>> 8.90904521942139, 10.368335723877, 14.0469560623169, NA, NA,
> >>> 13.5679216384888, 9.67090892791748, 10.5978908538818, 8.34146690368652
> >>> ), rd_in_va = c(2.17547988891602, 2.47147130966187, 2.53955459594727,
> >>> 2.01138758659363, NA, NA, 1.49587619304657, 2.72330951690674,
> >>> 2.5316367149353, 1.48551619052887)), datalabel = "", time.stamp = " 9
> Mar
> >>> 2016 17:43", .Names = c("iso3",
> >>> "eps_score", "gov_eff", "sh_va_enint", "rd_in_va"), formats = c("%9s",
> >>> "%8.0g", "%10.0g", "%9.0g", "%9.0g"), types = c(6L, 254L, 255L,
> >>> 254L, 254L), val.labels = c("", "", "", "", ""), var.labels 

Re: [R] Adding Two-Headed Arrow in map legend

2016-04-10 Thread Miluji Sb
Hello David,

This is exactly what I want, but I still can't get the arrows. R and RStudio
are up to date. Thanks again!

Sincerely,

Milu

On Sat, Apr 9, 2016 at 10:29 PM, David Winsemius 
wrote:

>
> > On Apr 9, 2016, at 1:27 PM, David Winsemius 
> wrote:
> >
> >
> >> On Apr 9, 2016, at 11:18 AM, David Winsemius 
> wrote:
> >>
> >>
> >>> On Apr 9, 2016, at 10:46 AM, Miluji Sb  wrote:
> >>>
> >>> Dear David,
> >>>
> >>> Thank you for your answer. Sorry for the embarrassing mistake.
> However, even with when I generate a map for the whole world using:
> >>>
> >>> eps <- mapCountryData(n, nameColumnToPlot="eps_score", mapTitle="EPS
> Score",colourPalette=colourPalette,
> >>> catMethod="fixedWidth", missingCountryCol =
> "white", addLegend=FALSE)
> >>>
> >>> And then use:
> >>>
> >>> do.call(addMapLegend, c(eps, legendLabels="none",
> arrows(-100,-140,100,-140,code=3)))
> >>
> >> I do get an arrow using same version of R and OSX. See attached. (I
> think that png images will be accepted by the mailserver.)
> >
> > Nope I was wrong, but the copy to Milugi did arrive.
> >
> > Here's a (somewhat larger) pdf:
>
>
>
>
> >
> >
> > --
> > David
> >>
> >>
> >>
> >> --
> >> David.
> >>>
> >>> Only a legend with the colours is generated, no arrows. My session
> info is below. Thanks again!
> >>>
> >>> R version 3.2.4 (2016-03-10)
> >>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> >>> Running under: OS X 10.11.2 (El Capitan)
> >>>
> >>> locale:
> >>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >>>
> >>> attached base packages:
> >>> [1] stats graphics  grDevices utils datasets  methods   base
> >>>
> >>> other attached packages:
> >>> [1] countrycode_0.18   ggplot2_2.1.0  RColorBrewer_1.1-2
> foreign_0.8-66 maptools_0.8-39rworldmap_1.3-6
> >>> [7] sp_1.2-0
> >>>
> >>> loaded via a namespace (and not attached):
> >>> [1] Rcpp_0.12.4  lattice_0.20-33  grid_3.2.4   plyr_1.8.3
>  gtable_0.2.0 spam_1.3-0
> >>> [7] scales_0.4.0 tools_3.2.4  munsell_0.4.3maps_3.1.0
>  fields_8.3-6 colorspace_1.2-6
> >>>
> >>> On Sat, Apr 9, 2016 at 7:34 PM, David Winsemius <
> dwinsem...@comcast.net> wrote:
> >>>
> >>>> On Apr 9, 2016, at 8:13 AM, Miluji Sb  wrote:
> >>>>
> >>>> Forgot to copy the list
> >>>>
> >>>> Dear Jim,
> >>>>
> >>>> Thank you for your reply. I must be doing something wrong, If this is
> my
> >>>> command to plot a map of Europe:
> >>>>
> >>>> eps_europe <- mapCountryData(n, nameColumnToPlot="eps_score",
> mapTitle="EPS
> >>>> Score - Europe",colourPalette=colourPalette,
> >>>>   catMethod="fixedWidth", missingCountryCol =
> >>>> "white", mapRegion="Europe", addLegend=FALSE)
> >>>>
> >>>> The following command does not seem to add the arrow. What am I doing
> wrong?
> >>>>
> >>>> do.call(addMapLegend, c(eps_europe, legendLabels="none",
> >>>> arrows(-100,-140,100,-140,code=3)))
> >>>>
> >>>
> >>> Your earlier question had a full world map. That was the context for
> Jim's reply, which did plot a two headed arrow above the legend in your
> earlier question. Now you have restricted the plot region to Europe so the
> coordinates of -100,-140,100,-140 no longer are on the visible plot area.
> You need to decide where you want the arrows using sensible coordinates.
> >>>
> >>> --
> >>> David.
> >>>
> >>>
> >>>> Thank you again. I really appreciate it.
> >>>>
> >>>> Sincerely,
> >>>>
> >>>> Milu
> >>>>
> >>>> On Sat, Apr 9, 2016 at 12:20 PM, Jim Lemon 
> wrote:
> >>>>
> >>>>> Hi Miluji,
> >>>>> Try this:
> >>>>>
> >>>>> arrows(-100,-140,100,-140,code=3)
> >>>>>
> &

Re: [R] Adding Two-Headed Arrow in map legend

2016-04-10 Thread Miluji Sb
Dear David,

The device was the issue. The quartz() device works fine but pdf() does
not. Now I just need to figure out the limits for map for Europe. Thanks
for all your help and patience.

Sincerely,

Milu

On Sun, Apr 10, 2016 at 7:10 PM, David Winsemius 
wrote:

>
> > On Apr 10, 2016, at 4:12 AM, Miluji Sb  wrote:
> >
> > Hello David,
> >
> > This is exactly what I want but I still can't get the arrows. R and R
> studio is updated. Thanks again!
>
> I didn't try it in Rstudio until just now (and I don't remember that you
> ever mentioned RStudio as a possible issue.) The plotting I see in the
> default graphics Rstudio window is rather different than what I see in the
> default plotting window for the R.app GUI. I'm guessing you are reporting
> results from viewing plots in that IDE's viewing window.
>
> I would try plotting with the png() or pdf() devices and see if the
> results are more predictable. I just tried with pdf() from RStudio and the
> results were much closer to what I was seeing with saving from R.app. The
> `quartz()` device seems to deliver consistent results for me. The RStudio
> device is something they call RStudioGD, and I don't have sufficient
> experience to explain its quirks.
>
> --
> David.
>
> >
> > Sincerely,
> >
> > Milu
> >
> > On Sat, Apr 9, 2016 at 10:29 PM, David Winsemius 
> wrote:
> >
> > > On Apr 9, 2016, at 1:27 PM, David Winsemius 
> wrote:
> > >
> > >
> > >> On Apr 9, 2016, at 11:18 AM, David Winsemius 
> wrote:
> > >>
> > >>
> > >>> On Apr 9, 2016, at 10:46 AM, Miluji Sb  wrote:
> > >>>
> > >>> Dear David,
> > >>>
> > >>> Thank you for your answer. Sorry for the embarrassing mistake.
> However, even with when I generate a map for the whole world using:
> > >>>
> > >>> eps <- mapCountryData(n, nameColumnToPlot="eps_score", mapTitle="EPS
> Score",colourPalette=colourPalette,
> > >>> catMethod="fixedWidth", missingCountryCol =
> "white", addLegend=FALSE)
> > >>>
> > >>> And then use:
> > >>>
> > >>> do.call(addMapLegend, c(eps, legendLabels="none",
> arrows(-100,-140,100,-140,code=3)))
> > >>
> > >> I do get an arrow using same version of R and OSX. See attached. (I
> think that png images will be accepted by the mailserver.)
> > >
> > > Nope I was wrong, but the copy to Milugi did arrive.
> > >
> > > Here's a (somewhat larger) pdf:
> >
> >
> >
> >
> > >
> > >
> > > --
> > > David
> > >>
> > >>
> > >>
> > >> --
> > >> David.
> > >>>
> > >>> Only a legend with the colours is generated, no arrows. My session
> info is below. Thanks again!
> > >>>
> > >>> R version 3.2.4 (2016-03-10)
> > >>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> > >>> Running under: OS X 10.11.2 (El Capitan)
> > >>>
> > >>> locale:
> > >>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> > >>>
> > >>> attached base packages:
> > >>> [1] stats graphics  grDevices utils datasets  methods   base
> > >>>
> > >>> other attached packages:
> > >>> [1] countrycode_0.18   ggplot2_2.1.0  RColorBrewer_1.1-2
> foreign_0.8-66 maptools_0.8-39rworldmap_1.3-6
> > >>> [7] sp_1.2-0
> > >>>
> > >>> loaded via a namespace (and not attached):
> > >>> [1] Rcpp_0.12.4  lattice_0.20-33  grid_3.2.4   plyr_1.8.3
>gtable_0.2.0 spam_1.3-0
> > >>> [7] scales_0.4.0 tools_3.2.4  munsell_0.4.3maps_3.1.0
>fields_8.3-6 colorspace_1.2-6
> > >>>
> > >>> On Sat, Apr 9, 2016 at 7:34 PM, David Winsemius <
> dwinsem...@comcast.net> wrote:
> > >>>
> > >>>> On Apr 9, 2016, at 8:13 AM, Miluji Sb  wrote:
> > >>>>
> > >>>> Forgot to copy the list
> > >>>>
> > >>>> Dear Jim,
> > >>>>
> > >>>> Thank you for your reply. I must be doing something wrong, If this
> is my
> > >>>> command to plot a map of Europe:
> > >>>>
> > >>>> eps_europe <- m

Re: [R] Adding Two-Headed Arrow in map legend

2016-04-11 Thread Miluji Sb
Dear David,

Thank you very much for your replies! I didn't know about par('usr').

I get different coordinates though:

[1] -19.75966  54.75966  33.6  71.4

But the arrow is not at the bottom of the map. I will keep playing with
this. Thanks again!

Sincerely,

Milu

On Mon, Apr 11, 2016 at 12:00 AM, David Winsemius 
wrote:

>
>
> > On Apr 10, 2016, at 1:45 PM, Miluji Sb  wrote:
> >
> > Dear David,
> >
> > The device was the issue. The quartz() device works fine but pdf() does
> not. Now I just need to figure out the limits for map for Europe. Thanks
> for all your help and patience.
>
>
> After plotting a map of Europe with base graphics the coordinates of the
> lower-left and upper-right corners are obtained by par('usr')
>
> > par('usr')
> [1] -12.2  47.2  25.89375  79.10625
>
> --
> David.
>
>
> >
> > Sincerely,
> >
> > Milu
> >
> > On Sun, Apr 10, 2016 at 7:10 PM, David Winsemius 
> wrote:
> >
> > > On Apr 10, 2016, at 4:12 AM, Miluji Sb  wrote:
> > >
> > > Hello David,
> > >
> > > This is exactly what I want but I still can't get the arrows. R and R
> studio is updated. Thanks again!
> >
> > I didn't try it in Rstudio until just now (and I don't remember that you
> ever mentioned RStudio as a possible issue.) The plotting I see in the
> default graphics Rstudio window is rather different than what I see in the
> default plotting window for the R.app GUI. I'm guessing you are reporting
> results from viewing plots in that IDE's viewing window.
> >
> > I would try plotting with the png() or pdf() devices and see if the
> results are more predictable. I just tried with pdf() from RStudio and the
> results were much closer to what I was seeing with saving from R.app. The
> `quartz()` device seems to deliver consistent results for me. The RStudio
> device is something they call RStudioGD, and I don't have sufficient
> experience to explain its quirks.
> >
> > --
> > David.
> >
> > >
> > > Sincerely,
> > >
> > > Milu
> > >
> > > On Sat, Apr 9, 2016 at 10:29 PM, David Winsemius <
> dwinsem...@comcast.net> wrote:
> > >
> > > > On Apr 9, 2016, at 1:27 PM, David Winsemius 
> wrote:
> > > >
> > > >
> > > >> On Apr 9, 2016, at 11:18 AM, David Winsemius <
> dwinsem...@comcast.net> wrote:
> > > >>
> > > >>
> > > >>> On Apr 9, 2016, at 10:46 AM, Miluji Sb  wrote:
> > > >>>
> > > >>> Dear David,
> > > >>>
> > > >>> Thank you for your answer. Sorry for the embarrassing mistake.
> However, even with when I generate a map for the whole world using:
> > > >>>
> > > >>> eps <- mapCountryData(n, nameColumnToPlot="eps_score",
> mapTitle="EPS Score",colourPalette=colourPalette,
> > > >>> catMethod="fixedWidth", missingCountryCol =
> "white", addLegend=FALSE)
> > > >>>
> > > >>> And then use:
> > > >>>
> > > >>> do.call(addMapLegend, c(eps, legendLabels="none",
> arrows(-100,-140,100,-140,code=3)))
> > > >>
> > > >> I do get an arrow using same version of R and OSX. See attached. (I
> think that png images will be accepted by the mailserver.)
> > > >
> > > > Nope I was wrong, but the copy to Milugi did arrive.
> > > >
> > > > Here's a (somewhat larger) pdf:
> > >
> > >
> > >
> > >
> > > >
> > > >
> > > > --
> > > > David
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> David.
> > > >>>
> > > >>> Only a legend with the colours is generated, no arrows. My session
> info is below. Thanks again!
> > > >>>
> > > >>> R version 3.2.4 (2016-03-10)
> > > >>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> > > >>> Running under: OS X 10.11.2 (El Capitan)
> > > >>>
> > > >>> locale:
> > > >>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> > > >>>
> > > >>> attached base packages:
> > > >>> [1] stats graphics  grDevices utils datasets  methods
>  base
> > > >

Re: [R] Adding Two-Headed Arrow in map legend

2016-04-14 Thread Miluji Sb
Hello Jim,

You're amazing. This is what finally worked:

arrows(-1,19,35.6,19,code=3, xpd=T).

Don't know why the coordinates were giving so much trouble. Maybe something
to do with the maps in rworldmap. Thanks again!
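The mechanism behind the fix, for reference: base R clips drawing to the plot region by default, and `xpd = TRUE` (set via `par()` or passed directly to `arrows()`) relaxes the clipping to the figure region, while `par("usr")` returns the user-coordinate limits to position against. A minimal, self-contained sketch — the plot and coordinates here are illustrative stand-ins, not the rworldmap map:

```r
pdf(tempfile(fileext = ".pdf"))   # throwaway device; any open device works

plot(1:10, 1:10)

usr <- par("usr")                 # c(xmin, xmax, ymin, ymax) in user coordinates
y_below <- usr[3] - 0.15 * (usr[4] - usr[3])  # a little below the plot region

# xpd = TRUE relaxes clipping from the plot region to the figure region,
# so the arrow sits under the plot instead of being clipped away
arrows(usr[1], y_below, usr[2], y_below, code = 3, xpd = TRUE)

dev.off()
```

Passing `xpd = TRUE` to `arrows()` directly, as in the post above, avoids having to reset `par(xpd = FALSE)` afterwards.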

Sincerely,

Milu

On Wed, Apr 13, 2016 at 6:51 AM, Jim Lemon  wrote:

> Hi Milu,
> My fault here. As I don't have the data to make the map and try out my
> suggestions I mixed up the x and y coordinates. Try this:
>
> par(xpd=TRUE)
> arrows(-19.75966,53,33.6,53,code=3)
> par(xpd=FALSE)
>
> Jim
>
> On Tue, Apr 12, 2016 at 10:11 PM, Miluji Sb  wrote:
> > Hello Jim,
> >
> > Thanks again. I am getting the two-headed arrow but I cannot seem to get
> the
> > coordinates right for the arrow to appear beneath the map. These
> coordinates
> > puts the arrow on the left hand side. Thanks again!
> >
> > Sincerely,
> >
> > Milu
> >
> > On Tue, Apr 12, 2016 at 1:15 PM, Jim Lemon  wrote:
> >>
> >> Hi Milu,
> >> There is a two-headed arrow on the image you sent, and it seems to be
> >> where you specified. Did you want it beneath the map, as:
> >>
> >> par(xpd=TRUE)
> >> arrows(-22,54.75,-22,74,code=3)
> >> par(xpd=FALSE)
> >>
> >> Jim
> >>
> >> On Tue, Apr 12, 2016 at 7:58 PM, Miluji Sb  wrote:
> >> > Dear Jim,
> >> >
> >> > Thanks again! I do want the arrows at the bottom (beneath the map).
> This
> >> > is
> >> > what I am doing:
> >> >
> >> > # Draw the map
> >> > eps_europe <- mapCountryData(n, nameColumnToPlot="eps_score",
> >> > mapTitle="EPS
> >> > Score - Europe",colourPalette=colourPalette,
> >> > catMethod="fixedWidth", missingCountryCol = "white",
> mapRegion="Europe",
> >> > addLegend=FALSE)
> >> >
> >> > # ISO3 codes on the map
> >> > text(n, labels="ISO3", cex=0.30)
> >> >
> >> > # Obtain coordinates for the arrow
> >> > par('usr')
> >> >
> >> > # -19.75966  54.75966  33.6  71.4
> >> >
> >> > # Arrows
> >> > par(xpd=TRUE)
> >> > arrows(-19.75966,  54.75966,  33.6,  71.4,code=3)
> >> > par(xpd=FALSE)
> >> >
> >> > As the output shows I cannot seem to get the correct coordinates for
> the
> >> > arrows. Thanks again.
> >> >
> >> > Sincerely,
> >> >
> >> > Milu
> >
> >
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Use multiple cores on Linux

2016-04-20 Thread Miluji Sb
I am trying to run the following code in R on a Linux cluster. I would like
to use the full processing power (specifying cores/nodes/memory). The code
essentially runs predictions based on a GAM regression and saves the
results as a CSV file for multiple sets of data (here I only show two).

Is it possible to run this code using HPC packages such as
Rmpi/snow/doParallel? Thank you!
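One low-setup option is the base `parallel` package: factor the block that the script repeats per scenario (read the two CSVs, merge, predict, aggregate, write) into a function of one scenario, then fan the scenarios out with `mclapply`. A sketch under stated assumptions — `run_scenario` is a hypothetical wrapper for the repeated block below, and the humidity file names are assumptions (the original script reads the `_temp` file twice, which may itself be a typo):

```r
library(parallel)

# Hypothetical wrapper for the block the script repeats per scenario:
# read temp/humidity CSVs, merge, predict.gam, aggregate by FIPS, write.csv.
run_scenario <- function(sc) {
  # temp_sim  <- read.csv(sc$temp)
  # humid_sim <- read.csv(sc$humid)
  # ... merge, predict.gam(gam_max_count_wk, ...), aggregate, write.csv ...
  sc$out                          # return something small; the CSV is the real output
}

scenarios <- list(
  list(temp  = "/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv",
       humid = "/work/sd00815/giss_future/giss_4.5_2021_2050_hum.csv",   # assumed name
       out   = "/work/sd00815/projections_data/year_wk_fe/giss/max/giss_4.5_2021_2050.csv"),
  list(temp  = "/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv",
       humid = "/work/sd00815/giss_future/giss_4.5_2081_2100_hum.csv",   # assumed name
       out   = "/work/sd00815/projections_data/year_wk_fe/giss/max/giss_4.5_2081_2100.csv"))

# Forked workers -- Linux/macOS only; on Windows use makeCluster + parLapply.
# Match mc.cores to the cores requested from the cluster scheduler.
res <- mclapply(scenarios, run_scenario, mc.cores = 2)
```

For MPI-based clusters spanning several nodes, `Rmpi`/`snow` would be needed; for a single multi-core node, `mclapply` is usually enough.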

#
library(data.table)
library(mgcv)
library(reshape2)
library(dplyr)
library(tidyr)
library(lubridate)
library(DataCombine)
#
gam_max_count_wk <- gam(count_pop ~ factor(citycode) + factor(year) +
factor(week) + s(lnincome) + s(tmax) +
s(hmax),data=cont,na.action="na.omit", method="ML")

#
# Historic
temp_hist <- read.csv("/work/sd00815/giss_historic/giss_temp_hist.csv")
humid_hist <- read.csv("/work/sd00815/giss_historic/giss_hum_hist.csv")
#
temp_hist <- as.data.table(temp_hist)
humid_hist <- as.data.table(humid_hist)
#
# Merge
mykey<- c("FIPS", "year","month", "week")
setkeyv(temp_hist, mykey)
setkeyv(humid_hist, mykey)
#
hist<- merge(temp_hist, humid_hist, by=mykey)
#
hist$X.x <- NULL
hist$X.y <- NULL
#
# Max
hist_max <- hist
hist_max$FIPS <- hist_max$year <- hist_max$month <- hist_max$tmin <-
hist_max$tmean <- hist_max$hmin <- hist_max$hmean <- NULL
#
# Adding Factors
hist_max$citycode <- rep(101,nrow(hist_max))
hist_max$year <- rep(2010,nrow(hist_max))
hist_max$lnincome <- rep(10.262,nrow(hist_max))
#
# Predictions
pred_hist_max <- predict.gam(gam_max_count_wk,hist_max)
#
pred_hist_max <- as.data.table(pred_hist_max)
pred_hist_max <- cbind(hist, pred_hist_max)
pred_hist_max$tmax <- pred_hist_max$tmean <- pred_hist_max$tmin <-
pred_hist_max$hmean <- pred_hist_max$hmax <- pred_hist_max$hmin <- NULL
#
# Aggregate by FIPS
max_hist <- pred_hist_max %>%
  group_by(FIPS) %>%
  summarise(pred_hist = mean(pred_hist_max))
#
### Future
## 4.5
# 4.5_2021_2050
temp_sim <-
read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv")
humid_sim <-
read.csv("/work/sd00815/giss_future/giss_4.5_2021_2050_temp.csv")
#
# Max
temp_sim <- as.data.table(temp_sim)
setnames(temp_sim, "max", "tmax")
setnames(temp_sim, "min", "tmin")
setnames(temp_sim, "avg", "tmean")
#
humid_sim <- as.data.table(humid_sim)
setnames(humid_sim, "max", "hmax")
setnames(humid_sim, "min", "hmin")
setnames(humid_sim, "avg", "hmean")
#
temp_sim$X <- NULL
humid_sim$X <- NULL
#
# Merge
mykey<- c("FIPS", "year","month", "week")
setkeyv(temp_sim, mykey)
setkeyv(humid_sim, mykey)
#
sim <- merge(temp_sim, humid_sim, by=mykey)
#
sim_max <- sim
#
sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <-
sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL
#
# Adding Factors
sim_max$citycode <- rep(101,nrow(sim_max))
sim_max$year <- rep(2010,nrow(sim_max))
sim_max$week <- rep(1,nrow(sim_max))
sim_max$lnincome <- rep(10.262,nrow(sim_max))
#
# Predictions
pred_sim_max <- predict.gam(gam_max_count_wk,sim_max)
#
pred_sim_max <- as.data.table(pred_sim_max)
pred_sim_max <- cbind(sim, pred_sim_max)
pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <-
pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL
#
# Aggregate by FIPS
max_sim <- pred_sim_max %>%
  group_by(FIPS) %>%
  summarise(pred_sim = mean(pred_sim_max))
#
# Merge with Historical Data
max_hist$FIPS <- as.factor(max_hist$FIPS)
max_sim$FIPS <- as.factor(max_sim$FIPS)
#
mykey1<- c("FIPS")
setkeyv(max_hist, mykey1)
setkeyv(max_sim, mykey1)
max_change <- merge(max_hist, max_sim, by=mykey1)
max_change$change <-
((max_change$pred_sim-max_change$pred_hist)/max_change$pred_hist)*100
#
write.csv(max_change, file =
"/work/sd00815/projections_data/year_wk_fe/giss/max/giss_4.5_2021_2050.csv")



# 4.5_2081_2100
temp_sim <-
read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv")
humid_sim <-
read.csv("/work/sd00815/giss_future/giss_4.5_2081_2100_temp.csv")
#
# Max
temp_sim <- as.data.table(temp_sim)
setnames(temp_sim, "max", "tmax")
setnames(temp_sim, "min", "tmin")
setnames(temp_sim, "avg", "tmean")
#
humid_sim <- as.data.table(humid_sim)
setnames(humid_sim, "max", "hmax")
setnames(humid_sim, "min", "hmin")
setnames(humid_sim, "avg", "hmean")
#
temp_sim$X <- NULL
humid_sim$X <- NULL
#
# Merge
mykey<- c("FIPS", "year","month", "week")
setkeyv(temp_sim, mykey)
setkeyv(humid_sim, mykey)
#
sim <- merge(temp_sim, humid_sim, by=mykey)
#
sim_max <- sim
#
sim_max$FIPS <- sim_max$year <- sim_max$month <- sim_max$tmin <-
sim_max$tmean <- sim_max$hmin <- sim_max$hmean <- NULL
#
# Adding Factors
sim_max$citycode <- rep(101,nrow(sim_max))
sim_max$year <- rep(2010,nrow(sim_max))
sim_max$week <- rep(1,nrow(sim_max))
sim_max$lnincome <- rep(10.262,nrow(sim_max))
#
# Predictions
pred_sim_max <- predict.gam(gam_max_count_wk,sim_max)
#
pred_sim_max <- as.data.table(pred_sim_max)
pred_sim_max <- cbind(sim, pred_sim_max)
pred_sim_max$tmax <- pred_sim_max$tmean <- pred_sim_max$tmin <-
pred_sim_max$hmean <- pred_sim_max$hmax <- pred_sim_max$hmin <- NULL
#
# Aggregate by FIPS
max_sim <- pred_sim

[R] Aggregate FIPS data to State and Census divisions

2016-05-01 Thread Miluji Sb
Dear all,

I have the following data by US FIPS code. Is there a package to aggregate
the data by State and Census divisions?

temp <- dput(head(pop1,5))
structure(list(FIPS = c("01001", "01003", "01005", "01007", "01009"
), death_2050A1 = c(18.19158, 101.63088, 13.18896, 10.30068,
131.91798), death_2050A2 = c(22.16349, 116.58387, 15.85324, 12.78564,
155.20506), death_2050B1 = c(21.38906, 76.23018, 21.38218, 17.14269,
151.64466), death_2050B2 = c(23.43543, 81.39378, 22.96802, 18.76926,
161.86404), death_2050BC = c(21.89947, 93.88002, 18.60352, 15.1032,
152.43414)), .Names = c("FIPS", "death_2050A1", "death_2050A2",
"death_2050B1", "death_2050B2", "death_2050BC"), row.names = c(NA,
5L), class = "data.frame")

Thank you!

Sincerely,

Milu



Re: [R] Aggregate FIPS data to State and Census divisions

2016-05-01 Thread Miluji Sb
Dear Dennis,

Thank you for your reply. I can use the dplyr/data.table packages to
aggregate - it's matching the FIPS codes to their states that I am having
trouble with. Thanks again.
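For the matching step: a 5-digit county FIPS embeds the state as its first two digits, so no special package is needed — substring the code and aggregate. (Mapping the two-digit state FIPS on to a state name or Census division still needs a lookup table; `maps::state.fips` or a hand-built table can serve for that part.) A sketch on the posted data, with the death columns truncated to two for brevity:

```r
# First rows of the posted data (two of the five death_* columns)
pop1 <- data.frame(
  FIPS         = c("01001", "01003", "01005", "01007", "01009"),
  death_2050A1 = c(18.19158, 101.63088, 13.18896, 10.30068, 131.91798),
  death_2050A2 = c(22.16349, 116.58387, 15.85324, 12.78564, 155.20506),
  stringsAsFactors = FALSE)

# The first two digits of a 5-digit county FIPS are the state FIPS
pop1$state_fips <- substr(pop1$FIPS, 1, 2)

# Sum every death_* column within each state
num_cols <- grep("^death_", names(pop1), value = TRUE)
state_totals <- aggregate(pop1[num_cols],
                          by = list(state_fips = pop1$state_fips), FUN = sum)
# All five example counties are in Alabama ("01"), so this yields one row
```

The same grouping vector works with `data.table` or `dplyr::group_by` if those are already in use.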

Sincerely,

Milu

On Sun, May 1, 2016 at 6:20 PM, Dennis Murphy  wrote:

> Hi:
>
> Several such packages exist. Given the size of your data, it's likely
> that the dplyr and data.table packages would be worth investigating.
> Both are well documented.
>
> Dennis
>
> On Sun, May 1, 2016 at 8:30 AM, Miluji Sb  wrote:
> > Dear all,
> >
> > I have the following data by US FIPS code. Is there a package to
> aggregate
> > the data by State and Census divisions?
> >
> > temp <- dput(head(pop1,5))
> > structure(list(FIPS = c("01001", "01003", "01005", "01007", "01009"
> > ), death_2050A1 = c(18.19158, 101.63088, 13.18896, 10.30068,
> > 131.91798), death_2050A2 = c(22.16349, 116.58387, 15.85324, 12.78564,
> > 155.20506), death_2050B1 = c(21.38906, 76.23018, 21.38218, 17.14269,
> > 151.64466), death_2050B2 = c(23.43543, 81.39378, 22.96802, 18.76926,
> > 161.86404), death_2050BC = c(21.89947, 93.88002, 18.60352, 15.1032,
> > 152.43414)), .Names = c("FIPS", "death_2050A1", "death_2050A2",
> > "death_2050B1", "death_2050B2", "death_2050BC"), row.names = c(NA,
> > 5L), class = "data.frame")
> >
> > Thank you!
> >
> > Sincerely,
> >
> > Milu
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>



[R] Convert Coordinates to Address - Wikimapia

2016-05-17 Thread Miluji Sb
Is it possible to convert latitude and longitude into addresses in R using
Wikimapia instead of Google? The reason is that Wikimapia addresses have
more detail.

This is the code I have been using but this obtains the information from
Google.

library(data.table)
library(ggmap)

coords<- structure(list(Lattitude = c(23.8642394, 23.8643628, 23.8645843,
23.8632958, 23.8632859), Longitude = c(90.3981852, 90.3981042,
90.3954784, 90.3940043, 90.3940025)), .Names = c("Lattitude",
"Longitude"), row.names = c(NA, 5L), class = "data.frame")

coords$textAddress <- mapply(FUN = function(lon, lat) revgeocode(c(lon,
lat)), coords$Longitude, coords$Lattitude)

Thank you!

Sincerely,

Milu



[R] Merge data by coordinates

2016-10-16 Thread Miluji Sb
Dear all,

I have two data frames keyed by latitude and longitude, but the coordinates
do not always match. Is it possible to merge them (e.g. by nearest distance)?

# Dataframe 1
structure(list(lat = c(54L, 55L, 51L, 54L, 53L, 50L, 47L, 51L,
49L, 54L), lon = c(14L, 8L, 15L, 7L, 6L, 5L, 13L, 5L, 13L, 11L
), PPP2000_40 = c(4606, 6575, 6593, 7431, 9393, 10773, 11716,
12226, 13544, 14526)), .Names = c("lat", "lon", "PPP2000_40"), row.names =
c(6764L,
8796L, 8901L, 9611L, 11649L, 12819L, 13763L, 14389L, 15641L,
16571L), class = "data.frame")

# Dataframe 2
structure(list(lat = c(47, 47, 47, 47, 47, 47, 48, 48, 48, 48
), lon = c(7, 8, 9, 10, 11, 12, 7, 8, 9, 10), GDP = c(19.09982,
13.31977, 14.95925, 6.8575635, 23.334565, 6.485748, 24.01197,
14.30393075, 21.33759675, 9.71803675)), .Names = c("lat", "lon",
"GDP"), row.names = c(NA, 10L), class = "data.frame")

Thank you so much!

Sincerely,

Milu



[R] Average every 4 columns

2016-11-09 Thread Miluji Sb
Dear all,

I have a dataset with hundreds of columns; I am providing only 12 columns
here. Is it possible to take the mean of every four (or 12) columns
(value601, value602, value603, value604, etc. in this case) and repeat for
the hundreds of columns? Thank you.

Sincerely,

Milu

structure(list(value601 = c(10.1738710403442, 3.54112911224365,
12.9192342758179, 3.17447590827942, 11.7332258224487, 7.68282270431519,
-7.11564493179321, 0.987620949745178, 13.0476207733154, 6.36939525604248
), value602 = c(13.0642414093018, 5.53129482269287, 16.0519638061523,
2.88946437835693, 14.9204912185669, 9.42428588867188, -6.80674123764038,
-0.614241063594818, 16.7947769165039, 7.9541072845459), value603 =
c(22.0399188995361,
14.398024559021, 24.9523792266846, 12.0878629684448, 23.6459674835205,
18.3277816772461, -2.54092741012573, 10.5550804138184, 25.1016540527344,
16.2166938781738), value604 = c(27.7165412902832, 20.3255825042725,
30.8430004119873, 16.6856250762939, 29.2485408782959, 24.3775005340576,
6.47758340835571, 15.5897912979126, 30.7387924194336, 22.3637084960938
), value605 = c(31.6644763946533, 23.4093952178955, 35.1488723754883,
19.7132263183594, 33.3924179077148, 29.5846366882324, 10.2083873748779,
19.3551616668701, 35.3076629638672, 27.4299201965332), value606 =
c(33.9698333740234,
26.8574161529541, 36.8900833129883, 22.8604583740234, 34.8642921447754,
33.8158760070801, 14.7055835723877, 22.1144580841064, 37.0545425415039,
32.1087913513184), value607 = c(36.0279846191406, 26.9297180175781,
38.2701225280762, 23.2643146514893, 36.7398796081543, 34.1216125488281,
17.1387901306152, 24.0419750213623, 37.8542327880859, 32.7677421569824
), value608 = c(34.0242347717285, 25.7720966339111, 36.4897193908691,
22.0332260131836, 34.8011703491211, 32.6856842041016, 16.6232261657715,
21.5571365356445, 36.1491546630859, 31.1716938018799), value609 =
c(27.5402088165283,
21.7590408325195, 30.5214176177979, 18.4252090454102, 29.1156253814697,
26.9878330230713, 12.4962501525879, 17.7259578704834, 30.9099159240723,
25.4832077026367), value610 = c(23.4706859588623, 17.0126209259033,
26.8166942596436, 15.297459602356, 25.1733055114746, 23.5616931915283,
8.86995983123779, 13.5793552398682, 27.5732250213623, 22.1691932678223
), value611 = c(14.5820417404175, 9.08279132843018, 17.8419170379639,
8.36016654968262, 16.5633754730225, 14.8123331069946, 1.32095837593079,
5.73408317565918, 18.9752082824707, 13.572542236), value612 =
c(9.12979793548584,
2.79943537712097, 11.6504030227661, 2.21584677696228, 10.5404834747314,
7.55471754074097, -5.58141136169434, -0.566209673881531, 12.3264112472534,
6.65576601028442)), .Names = c("value601", "value602", "value603",
"value604", "value605", "value606", "value607", "value608", "value609",
"value610", "value611", "value612"), row.names = c("1", "2",
"3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")



Re: [R] Average every 4 columns

2016-11-09 Thread Miluji Sb
Thanks a lot for your quick reply. I made a mistake in the question: I
meant to ask about every 4 (or 12) rows, not columns. Apologies. Thanks again!
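With the question now about rows, one base-R route is `rowsum()`, which sums rows by a grouping vector; dividing by the block size turns the sums into means. A sketch — `block_means` is a name made up here, and the row count must divide evenly by the block size (on the posted 10-row frame, a block size of 5 would average rows 1-5 and 6-10):

```r
# Average every k consecutive rows of a numeric data frame or matrix
block_means <- function(x, k) {
  stopifnot(nrow(x) %% k == 0)              # row count must be a multiple of k
  g <- rep(seq_len(nrow(x) / k), each = k)  # block labels 1,1,...,2,2,...
  rowsum(as.matrix(x), g) / k               # column-wise block sums -> means
}

m <- as.data.frame(matrix(1:8, nrow = 4))   # tiny example: 4 rows, 2 columns
block_means(m, 2)                           # block 1: 1.5, 5.5; block 2: 3.5, 7.5
```

`rowsum()` is compiled, so this stays fast even on data frames with many thousands of rows.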

Sincerely,

Milu

On Wed, Nov 9, 2016 at 5:45 PM, Dalthorp, Daniel  wrote:

> Hi Milu,
> The following should work for an array x (provided dim(x)[2] is divisible
> by 4):
>
> colMeans(x[,0:(dim(x)[2]/4-1)*4+1])
>
> -Dan
>
> On Wed, Nov 9, 2016 at 8:29 AM, Miluji Sb  wrote:
>
>> Dear all,
>>
>> I have a dataset with hundreds of columns, I am only providing only 12
>> columns. Is it possible to take the mean of every four (or 12) columns
>>  (value601, value602, value603, value604 etc.in this case) and repeat for
>> the hundreds of columns? Thank you.
>>
>> Sincerely,
>>
>> Milu
>>
>> structure(list(value601 = c(10.1738710403442, 3.54112911224365,
>> 12.9192342758179, 3.17447590827942, 11.7332258224487, 7.68282270431519,
>> -7.11564493179321, 0.987620949745178, 13.0476207733154, 6.36939525604248
>> ), value602 = c(13.0642414093018, 5.53129482269287, 16.0519638061523,
>> 2.88946437835693, 14.9204912185669, 9.42428588867188, -6.80674123764038,
>> -0.614241063594818, 16.7947769165039, 7.9541072845459), value603 =
>> c(22.0399188995361,
>> 14.398024559021, 24.9523792266846, 12.0878629684448, 23.6459674835205,
>> 18.3277816772461, -2.54092741012573, 10.5550804138184, 25.1016540527344,
>> 16.2166938781738), value604 = c(27.7165412902832, 20.3255825042725,
>> 30.8430004119873, 16.6856250762939, 29.2485408782959, 24.3775005340576,
>> 6.47758340835571, 15.5897912979126, 30.7387924194336, 22.3637084960938
>> ), value605 = c(31.6644763946533, 23.4093952178955, 35.1488723754883,
>> 19.7132263183594, 33.3924179077148, 29.5846366882324, 10.2083873748779,
>> 19.3551616668701, 35.3076629638672, 27.4299201965332), value606 =
>> c(33.9698333740234,
>> 26.8574161529541, 36.8900833129883, 22.8604583740234, 34.8642921447754,
>> 33.8158760070801, 14.7055835723877, 22.1144580841064, 37.0545425415039,
>> 32.1087913513184), value607 = c(36.0279846191406, 26.9297180175781,
>> 38.2701225280762, 23.2643146514893, 36.7398796081543, 34.1216125488281,
>> 17.1387901306152, 24.0419750213623, 37.8542327880859, 32.7677421569824
>> ), value608 = c(34.0242347717285, 25.7720966339111, 36.4897193908691,
>> 22.0332260131836, 34.8011703491211, 32.6856842041016, 16.6232261657715,
>> 21.5571365356445, 36.1491546630859, 31.1716938018799), value609 =
>> c(27.5402088165283,
>> 21.7590408325195, 30.5214176177979, 18.4252090454102, 29.1156253814697,
>> 26.9878330230713, 12.4962501525879, 17.7259578704834, 30.9099159240723,
>> 25.4832077026367), value610 = c(23.4706859588623, 17.0126209259033,
>> 26.8166942596436, 15.297459602356, 25.1733055114746, 23.5616931915283,
>> 8.86995983123779, 13.5793552398682, 27.5732250213623, 22.1691932678223
>> ), value611 = c(14.5820417404175, 9.08279132843018, 17.8419170379639,
>> 8.36016654968262, 16.5633754730225, 14.8123331069946, 1.32095837593079,
>> 5.73408317565918, 18.9752082824707, 13.572542236), value612 =
>> c(9.12979793548584,
>> 2.79943537712097, 11.6504030227661, 2.21584677696228, 10.5404834747314,
>> 7.55471754074097, -5.58141136169434, -0.566209673881531, 12.3264112472534,
>> 6.65576601028442)), .Names = c("value601", "value602", "value603",
>> "value604", "value605", "value606", "value607", "value608", "value609",
>> "value610", "value611", "value612"), row.names = c("1", "2",
>> "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posti
>> ng-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>
> --
> Dan Dalthorp, PhD
> USGS Forest and Rangeland Ecosystem Science Center
> Forest Sciences Lab, Rm 189
> 3200 SW Jefferson Way
> Corvallis, OR 97331
> ph: 541-750-0953
> ddalth...@usgs.gov
>
>



[R] Melt and compute Max, Mean, Min

2016-11-18 Thread Miluji Sb
Dear all,

I have 51 years of data (1960 - 2010) in csv format, where each file
represents one year of data. Below is what each file looks like.

These are temperature data by coordinates; my goal is to compute the max,
min, and mean by year for each of the coordinates and construct a panel
dataset. Any help will be appreciated, thank you!
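A sketch of the whole pipeline, under stated assumptions: one CSV per year with the year somewhere in the file name (the helper name `year_stats` and the file-name pattern are assumptions). Compute the three row-wise statistics over the `day_*` columns for each file, tag on the year, and `rbind` the results into a panel. Note that rows which are entirely NA (like the first two posted rows) will come back as NaN/Inf, with warnings from `min`/`max`:

```r
# Row-wise yearly stats over the day_1 ... day_365 columns of one file
year_stats <- function(f) {
  d <- read.csv(f)
  days <- d[, grep("^day_", names(d)), drop = FALSE]
  data.frame(lon   = d$lon,
             lat   = d$lat,
             year  = as.numeric(gsub("\\D", "", basename(f))),  # digits in name
             tmin  = apply(days, 1, min, na.rm = TRUE),
             tmax  = apply(days, 1, max, na.rm = TRUE),
             tmean = rowMeans(days, na.rm = TRUE))
}

# Stack all 51 years into one long (panel) data frame:
# panel <- do.call(rbind, lapply(list.files(pattern = "\\.csv$"), year_stats))
```

The commented last line is the driver; run it from the directory holding the yearly files, or pass full paths from `list.files(dir, full.names = TRUE)`.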

Sincerely,

Milu

temp <- dput(head(df,5))
structure(list(ISO3 = structure(c(28L, 28L, 28L, NA, 28L), .Label =
c("AFG",
"AGO", "ALB", "ARE", "ARG", "ARM", "AUS", "AUT", "AZE", "BDI",
"BEL", "BEN", "BFA", "BGD", "BGR", "BHS", "BIH", "BLR", "BLZ",
"BOL", "BRA", "BRN", "BTN", "BWA", "CAF", "CAN", "CHE", "CHL",
"CHN", "CIV", "CMR", "COD", "COG", "COL", "CRI", "CUB", "CYP",
"CZE", "DEU", "DJI", "DNK", "DOM", "DZA", "ECU", "EGY", "ERI",
"ESH", "ESP", "EST", "ETH", "FIN", "FJI", "FLK", "FRA", "GAB",
"GBR", "GEO", "GHA", "GIN", "GNB", "GNQ", "GRC", "GRL", "GTM",
"GUF", "GUY", "HND", "HRV", "HTI", "HUN", "IDN", "IND", "IRL",
"IRN", "IRQ", "ISL", "ISR", "ITA", "JAM", "JOR", "JPN", "KAZ",
"KEN", "KGZ", "KHM", "KIR", "KOR", "KWT", "LAO", "LBN", "LBR",
"LBY", "LCA", "LKA", "LSO", "LTU", "LUX", "LVA", "MAR", "MDA",
"MDG", "MEX", "MKD", "MLI", "MMR", "MNE", "MNG", "MOZ", "MRT",
"MWI", "MYS", "NAM", "NCL", "NER", "NGA", "NIC", "NLD", "NOR",
"NPL", "NZL", "OMN", "PAK", "PAN", "PER", "PHL", "PNG", "POL",
"PRI", "PRK", "PRT", "PRY", "QAT", "ROU", "RUS", "RWA", "SAU",
"SDN", "SEN", "SJM", "SLB", "SLE", "SLV", "SOM", "SRB", "SUR",
"SVK", "SVN", "SWE", "SWZ", "SYR", "TCD", "TGO", "THA", "TJK",
"TKM", "TLS", "TUN", "TUR", "TWN", "TZA", "UGA", "UKR", "URY",
"USA", "UZB", "VEN", "VNM", "VUT", "YEM", "ZAF", "ZMB", "ZWE"
), class = "factor"), lon = c(-69L, -68L, -72L, -71L, -70L),
lat = c(-55L, -55L, -54L, -54L, -54L), day_1 = c(NA, NA,
0, 0, 0), day_2 = c(NA, NA, 0, 0, 0), day_3 = c(NA, NA, 0,
23.37984, 0), day_4 = c(NA, NA, 0, 0, 0), day_5 = c(NA, NA,
0, 0, 0), day_6 = c(NA, NA, 0, 0, 0), day_7 = c(NA, NA, 2.83824,
11.80116, 1.24956), day_8 = c(NA, NA, 0, 1.68588, 14.69448
), day_9 = c(NA, NA, 0, 0, 1.09296), day_10 = c(NA, NA, 0,
0, 0), day_11 = c(NA, NA, 3.78, 3.7422, 0), day_12 = c(NA,
NA, 0.54, 0, 0), day_13 = c(NA, NA, 0, 0, 0), day_14 = c(NA,
NA, 0, 0, 0.39204), day_15 = c(NA, NA, 0, 0, 11.58732), day_16 = c(NA,
NA, 0, 0, 0), day_17 = c(NA, NA, 0, 1.14048, 12.26448), day_18 = c(NA,
NA, 0, 1.1934, 7.59024), day_19 = c(NA, NA, 9.74268, 0, 0
), day_20 = c(NA, NA, 0, 0, 0), day_21 = c(NA, NA, 1.96776,
0, 0), day_22 = c(NA, NA, 0, 0, 0), day_23 = c(NA, NA, 0,
0, 0), day_24 = c(NA, NA, 6.21756, 2.74752, 0), day_25 = c(NA,
NA, 0, 0, 3.37932), day_26 = c(NA, NA, 4.8384, 0, 0), day_27 = c(NA,
NA, 0, 0, 0), day_28 = c(NA, NA, 0, 0, 0), day_29 = c(NA,
NA, 22.37328, 0, 0), day_30 = c(NA, NA, 28.97424, 11.25468,
0), day_31 = c(NA, NA, 0, 0, 0), day_32 = c(NA, NA, 0, 0,
2.00448), day_33 = c(NA, NA, 0, 0, 0), day_34 = c(NA, NA,
0, 0, 0), day_35 = c(NA, NA, 0, 0, 0), day_36 = c(NA, NA,
0, 0, 0), day_37 = c(NA, NA, 0, 0, 0), day_38 = c(NA, NA,
32.7132, 31.71852, 0), day_39 = c(NA, NA, 0, 0, 5.84604),
day_40 = c(NA, NA, 0, 0, 0), day_41 = c(NA, NA, 0, 0, 0),
day_42 = c(NA, NA, 0, 0, 0), day_43 = c(NA, NA, 0, 0, 0),
day_44 = c(NA, NA, 0, 0, 1.78416), day_45 = c(NA, NA, 0,
0, 0), day_46 = c(NA, NA, 33.84504, 0, 0), day_47 = c(NA,
NA, 0, 0, 0), day_48 = c(NA, NA, 0, 0, 0), day_49 = c(NA,
NA, 0, 0, 0), day_50 = c(NA, NA, 0, 0.4752, 0), day_51 = c(NA,
NA, 0, 0, 22.02012), day_52 = c(NA, NA, 0, 0, 0), day_53 = c(NA,
NA, 0, 0, 3.48084), day_54 = c(NA, NA, 0, 0, 0), day_55 = c(NA,
NA, 0.58212, 0, 0), day_56 = c(NA, NA, 0.35316, 0, 0), day_57 = c(NA,
NA, 0, 0, 12.65436), day_58 = c(NA, NA, 0, 0, 0), day_59 = c(NA,
NA, 0, 0, 0), day_60 = c(NA, NA, 3.03372, 22.05576, 0), day_61 = c(NA,
NA, 2.5758, 0, 0), day_62 = c(NA, NA, 0, 0, 0), day_63 = c(NA,
NA, 3.67416, 25.22016, 4.21524), day_64 = c(NA, NA, 0.52488,
3.60288, 0), day_65 = c(NA, NA, 12.82608, 0, 0), day_66 = c(NA,
NA, 0, 0, 0), day_67 = c(NA, NA, 0, 0, 0), day_68 = c(NA,
NA, 0, 0, 0), day_69 = c(NA, NA, 0, 0, 0), day_70 = c(NA,
NA, 1.11564, 5.17536, 0), day_71 = c(NA, NA, 1.18584, 0,
0), day_72 = c(NA, NA, 0, 0, 0.10584), day_73 = c(NA, NA,
0.62748, 14.39748, 7.50708), day_74 = c(NA, NA, 7.20252,
20.02644, 1.07244), day_75 = c(NA, NA, 1.87488, 0, 0), day_76 = c(NA,
NA, 0.26784, 0, 0), day_77 = c(NA, NA, 0, 0, 0), day_78 = c(NA,
NA, 0, 0, 2.81664), day_79 = c(NA, NA, 0, 0, 0), day_80 = c(NA,
NA, 0, 0, 0), day_81 = c(NA, NA, 0, 0, 0), day_82 = c(NA,
NA, 1.29276, 0, 0), day_83 = c(NA, NA, 0.18468, 0, 1.46124
), day_84 = c(NA, NA, 0, 0, 0), day_85 = c(NA, NA, 0, 0,
0), day_86 = c(NA, NA, 57.36528, 0, 0), day_87 = c(NA, NA,
8.19504, 0, 0), day_88 = c(NA, NA, 0, 0, 0), day_89 = c(NA,
NA, 6.45732, 0, 0), day_90 = c(NA, NA, 0, 0, 0), day_91 = c(NA,
NA, 0, 0, 44.20332), 

Re: [R] Melt and compute Max, Mean, Min

2016-11-18 Thread Miluji Sb
Dear Petr,

Thank you for the code; apologies, though, as I pasted the wrong data. This
is precipitation data, not temperature.

For the loop, could I do something like this?

filelist <- list.files(pattern = "\\.csv$")

myDTs <- lapply(filelist, function(.file) {
  temp <- read.csv(.file)
  apply(temp[, -(1:3)], 1, mean, na.rm = TRUE)
})

Thanks again!

Sincerely,

Milu


On Fri, Nov 18, 2016 at 2:46 PM, PIKAL Petr  wrote:

> Hi
>
> I am not completely sure what you want to do but
>
> > apply(temp[,-(1:3)],1, mean, na.rm=T)
>12345
>  NaN  NaN 2.159516 1.519914 1.514007
> > apply(temp[,-(1:3)],1, max, na.rm=T)
>12345
> -Inf -Inf 57.36528 39.45348 45.23904
> Warning messages:
> 1: In FUN(newX[, i], ...) :
>   no non-missing arguments to max; returning -Inf
> 2: In FUN(newX[, i], ...) :
>   no non-missing arguments to max; returning -Inf
> > apply(temp[,-(1:3)],1, min, na.rm=T)
>   1   2   3   4   5
> Inf Inf   0   0   0
>
> gives you mentioned summary for each row. If you have duplicate rows you
> shall first aggregate them. However, it seems to me that your data are not
> correct. It is quite strange that for given lat/lon you have one day value
> 23 and the next day 0.
>
> temp[1:5, 1:10]
>   ISO3 lon lat day_1 day_2day_3 day_4 day_5 day_6day_7
> 1  CHL -69 -55NANA   NANANANA   NA
> 2  CHL -68 -55NANA   NANANANA   NA
> 3  CHL -72 -54 0 0  0.0 0 0 0  2.83824
> 4  -71 -54 0 0 23.37984 0 0 0 11.80116
> 5  CHL -70 -54 0 0  0.0 0 0 0  1.24956
>
> If you want to process all your files you can do it in cycle. The function
>
> list.files()
>
> can be handy for that task.
>
> Cheers
> Petr
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Miluji
> Sb
> > Sent: Friday, November 18, 2016 1:49 PM
> > To: r-help mailing list 
> > Subject: [R] Melt and compute Max, Mean, Min
> >
> > Dear all,
> >
> > I have 51 years of data (1960 - 2010) in csv format, where each file
> represents
> > one year of data. Below is what each file looks like.
> >
> > These are temperature data by coordinates, my goal is to to compute max,
> > min, and mean by year for each of the coordinates and construct a panel
> > dataset. Any help will be appreciated, thank you!
> >
> > Sincerely,
> >
> > Milu
> >
> > temp <- dput(head(df,5))
> > structure(list(ISO3 = structure(c(28L, 28L, 28L, NA, 28L), .Label =
> c("AFG",
> > "AGO", "ALB", "ARE", "ARG", "ARM", "AUS", "AUT", "AZE", "BDI", "BEL",
> > "BEN", "BFA", "BGD", "BGR", "BHS", "BIH", "BLR", "BLZ", "BOL", "BRA",
> > "BRN", "BTN", "BWA", "CAF", "CAN", "CHE", "CHL", "CHN", "CIV", "CMR",
> > "COD", "COG", "COL", "CRI", "CUB", "CYP", "CZE", "DEU", "DJI", "DNK",
> > "DOM", "DZA", "ECU", "EGY", "ERI", "ESH", "ESP", "EST", "ETH", "FIN",
> "FJI",
> > "FLK", "FRA", "GAB", "GBR", "GEO", "GHA", "GIN", "GNB", "GNQ", "GRC",
> > "GRL", "GTM", "GUF", "GUY", "HND", "HRV", "HTI", "HUN", "IDN", "IND",
> > "IRL", "IRN", "IRQ", "ISL", "ISR", "ITA", "JAM", "JOR", "JPN", "KAZ",
> "KEN",
> > "KGZ", "KHM", "KIR", "KOR", "KWT", "LAO", "LBN", "LBR", "LBY", "LCA",
> > "LKA", "LSO", "LTU", "LUX", "LVA", "MAR", "MDA", "MDG", "MEX", "MKD",
> > "MLI", "MMR", "MNE", "MNG", "MOZ", "MRT", "MWI", "MYS", "NAM",
> > "NCL", "NER", "NGA", "NIC", "NLD", "NOR", "NPL", &q

Re: [R] Melt and compute Max, Mean, Min

2016-11-18 Thread Miluji Sb
If I do:

as.data.frame(apply(df[,-(1:3)],1, mean, na.rm=T))

is it possible to sequentially name the variables as "mean_1960",
"max_1960", "min_1960", "mean_1961", "max_1961", "min_1961", ...?

On Fri, Nov 18, 2016 at 3:10 PM, Miluji Sb  wrote:

> Dear Petr,
>
> Thank you for the code, apologies though as I copied the wrong data, This
> is precipitation data and not temperature.
>
> For the loop, could I do something like this?
>
> filelist <- list.files(pattern=".csv")
>
> myDTs <- lapply(filelist, function(.file) {
>
> apply(temp[,-(1:3)],1, mean, na.rm=T)
>
> }
>
> Thanks again!
>
> Sincerely,
>
> Milu
>
>
> On Fri, Nov 18, 2016 at 2:46 PM, PIKAL Petr 
> wrote:
>
>> Hi
>>
>> I am not completely sure what you want to do but
>>
>> > apply(temp[,-(1:3)],1, mean, na.rm=T)
>>12345
>>  NaN  NaN 2.159516 1.519914 1.514007
>> > apply(temp[,-(1:3)],1, max, na.rm=T)
>>12345
>> -Inf -Inf 57.36528 39.45348 45.23904
>> Warning messages:
>> 1: In FUN(newX[, i], ...) :
>>   no non-missing arguments to max; returning -Inf
>> 2: In FUN(newX[, i], ...) :
>>   no non-missing arguments to max; returning -Inf
>> > apply(temp[,-(1:3)],1, min, na.rm=T)
>>   1   2   3   4   5
>> Inf Inf   0   0   0
>>
>> gives you mentioned summary for each row. If you have duplicate rows you
>> shall first aggregate them. However, it seems to me that your data are not
>> correct. It is quite strange that for given lat/lon you have one day value
>> 23 and the next day 0.
>>
>> temp[1:5, 1:10]
>>   ISO3 lon lat day_1 day_2day_3 day_4 day_5 day_6day_7
>> 1  CHL -69 -55NANA   NANANANA   NA
>> 2  CHL -68 -55NANA   NANANANA   NA
>> 3  CHL -72 -54 0 0  0.0 0 0 0  2.83824
>> 4  -71 -54 0 0 23.37984 0 0 0 11.80116
>> 5  CHL -70 -54 0 0  0.0 0 0 0  1.24956
>>
>> If you want to process all your files you can do it in cycle. The function
>>
>> list.files()
>>
>> can be handy for that task.
>>
>> Cheers
>> Petr
>>
>> > -Original Message-
>> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Miluji
>> Sb
>> > Sent: Friday, November 18, 2016 1:49 PM
>> > To: r-help mailing list 
>> > Subject: [R] Melt and compute Max, Mean, Min
>> >
>> > Dear all,
>> >
>> > I have 51 years of data (1960 - 2010) in csv format, where each file
>> represents
>> > one year of data. Below is what each file looks like.
>> >
>> > These are temperature data by coordinates, my goal is to to compute max,
>> > min, and mean by year for each of the coordinates and construct a panel
>> > dataset. Any help will be appreciated, thank you!
>> >
>> > Sincerely,
>> >
>> > Milu
>> >
>> > temp <- dput(head(df,5))
>> > structure(list(ISO3 = structure(c(28L, 28L, 28L, NA, 28L), .Label =
>> c("AFG",
>> > "AGO", "ALB", "ARE", "ARG", "ARM", "AUS", "AUT", "AZE", "BDI", "BEL",
>> > "BEN", "BFA", "BGD", "BGR", "BHS", "BIH", "BLR", "BLZ", "BOL", "BRA",
>> > "BRN", "BTN", "BWA", "CAF", "CAN", "CHE", "CHL", "CHN", "CIV", "CMR",
>> > "COD", "COG", "COL", "CRI", "CUB", "CYP", "CZE", "DEU", "DJI", "DNK",
>> > "DOM", "DZA", "ECU", "EGY", "ERI", "ESH", "ESP", "EST", "ETH", "FIN",
>> "FJI",
>> > "FLK", "FRA", "GAB", "GBR", "GEO", "GHA", "GIN", "GNB", "GNQ", "GRC",
>> > "GRL", "GTM", "GUF", "GUY", "HND", "HRV", "HTI", "HUN", "IDN", "IND",
>> > "IRL", "IRN", "IRQ", "ISL", "ISR", "ITA", "JAM", "JOR&q

[R] Convert arc-second to degree

2016-11-29 Thread Miluji Sb
Dear all,

I am using the Gridded Population of the World (v4) for the year 2010. The
data is in GeoTiFF format.

Source:
http://sedac.ciesin.columbia.edu/data/set/gpw-v4-population-count-adjusted-to-2015-unwpp-country-totals/data-download

I imported the data using:

library(raster)
library(maptools)
library(ncdf4)
library(rgdal)

population <-
raster("gpw-v4-population-count-adjusted-to-2015-unwpp-country-totals_2010.tif")
population1 <- stack(population )
extent(population1 ) <- c(-180, 180,-58,85)

### Information
class   : RasterStack
dimensions  : 17400, 43200, 75168, 1  (nrow, ncol, ncell, nlayers)
resolution  : 0.00833, 0.00833  (x, y)
extent  : -180, 180, -60, 85  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84
+towgs84=0,0,0
names       : gpw.v4.population.count.adjusted.to.2015.unwpp.country.totals_2010
min values  : 0
max values  : 141715.3
###

I need to extract population by a set of coordinates which are at 1° x 1°,
how can I convert from arc-second to degree in R? The information also
shows that resolution is 0.00833° x 0.00833°, is it enough to do
something like this?

pop_agg <- aggregate(population1 , fact=120, fun=sum)

Thank you very much.
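For reference, the GPW v4 grid is 30 arc-seconds, and 30/3600 = 1/120 of a degree, so aggregating by a factor of 120 does produce 1° x 1° cells. A small self-contained sketch with a toy raster (not the GPW file itself) illustrating the resolution change:

```r
library(raster)

# 30 arc-seconds = 30/3600 degree = 1/120 degree per cell,
# so a 240-column raster spanning 2 degrees has 30-arc-second cells
r <- raster(nrows = 240, ncols = 240, xmn = 0, xmx = 2, ymn = 0, ymx = 2)
values(r) <- 1

# fact = 120 collapses 120 x 120 fine cells into one coarse cell;
# fun = sum preserves population counts (the default, mean, would not)
r1deg <- aggregate(r, fact = 120, fun = sum)
res(r1deg)  # 1 x 1 degree
```

For counts, `fun = sum` is the right aggregator; for densities you would use `mean` instead.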

Sincerely,

Milu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Reshape to wide format

2016-12-12 Thread Miluji Sb
Dear all,

I have the following monthly data by coordinates:

I would like to reshape this data to wide format so that each column is a
coordinate and each row is a month,

coordinate1 coordinate2 coordinate3...
Month 1
Month 2

Is the best option to concatenate the iso3, lon, and lat variables to
create an ID variable? I realize that this question might be very basic but
I'm slightly baffled. Thank you.

temp <- dput(head(precip_2000,20))
structure(list(iso3 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("AFG",
"AGO", "ALB", "ARE", "ARG", "ARM", "AUS", "AUT", "AZE", "BDI",
"BEL", "BEN", "BFA", "BGD", "BGR", "BHS", "BIH", "BLR", "BLZ",
"BOL", "BRA", "BRN", "BTN", "BWA", "CAF", "CAN", "CHE", "CHL",
"CHN", "CIV", "CMR", "COD", "COG", "COL", "CRI", "CUB", "CYP",
"CZE", "DEU", "DJI", "DNK", "DOM", "DZA", "ECU", "EGY", "ERI",
"ESH", "ESP", "EST", "ETH", "FIN", "FJI", "FLK", "FRA", "GAB",
"GBR", "GEO", "GHA", "GIN", "GNB", "GNQ", "GRC", "GRL", "GTM",
"GUF", "GUY", "HND", "HRV", "HTI", "HUN", "IDN", "IND", "IRL",
"IRN", "IRQ", "ISL", "ISR", "ITA", "JAM", "JOR", "JPN", "KAZ",
"KEN", "KGZ", "KHM", "KIR", "KOR", "KWT", "LAO", "LBN", "LBR",
"LBY", "LCA", "LKA", "LSO", "LTU", "LUX", "LVA", "MAR", "MDA",
"MDG", "MEX", "MKD", "MLI", "MMR", "MNE", "MNG", "MOZ", "MRT",
"MWI", "MYS", "NAM", "NCL", "NER", "NGA", "NIC", "NLD", "NOR",
"NPL", "NZL", "OMN", "PAK", "PAN", "PER", "PHL", "PNG", "POL",
"PRI", "PRK", "PRT", "PRY", "QAT", "ROU", "RUS", "RWA", "SAU",
"SDN", "SEN", "SJM", "SLB", "SLE", "SLV", "SOM", "SRB", "SUR",
"SVK", "SVN", "SWE", "SWZ", "SYR", "TCD", "TGO", "THA", "TJK",
"TKM", "TLS", "TUN", "TUR", "TWN", "TZA", "UGA", "UKR", "URY",
"USA", "UZB", "VEN", "VNM", "VUT", "YEM", "ZAF", "ZMB", "ZWE"
), class = "factor"), lon = c(61L, 61L, 61L, 61L, 61L, 61L, 61L,
61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L, 61L
), lat = c(32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L, 32L,
32L, 32L, 33L, 33L, 33L, 33L, 33L, 33L, 33L, 33L), dm = structure(c(1L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 2L, 3L, 4L, 1L, 5L, 6L, 7L,
8L, 9L, 10L, 11L), .Label = c("2000m1", "2000m10", "2000m11",
"2000m12", "2000m2", "2000m3", "2000m4", "2000m5", "2000m6",
"2000m7", "2000m8", "2000m9"), class = "factor"), month = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L), precip = c(0.996665806451613, 0.156711724137931,
0.242477419354839, 0, 0, 0, 0, 0, 0, 0, 0.121536, 0.38866064516129,
1.13312903225806, 0.355208275862069, 0.307277419354839, 0.008316,
0, 0, 0, 0.0008361290322581)), .Names = c("iso3", "lon", "lat",
"dm", "month", "precip"), row.names = c(NA, 20L), class = "data.frame")

Sincerely,

Milu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reshape to wide format

2016-12-22 Thread Miluji Sb
Apologies for the late reply. Thank you very much!

I get the following warnings. If I modify the code to add both month and
year as part of the ID, will it still be correct?

df$ID<-paste(df$iso3,df$lon,df$lat,df$year,df$month,sep="")

wide <- reshape(df, v.names="precip", timevar="ID", idvar="month",
direction="wide",  drop=c("iso3", "lon", "lat", "year"))

Sincerely,

Milu

warnings()
Warning messages:
1: In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :
  some constant variables (year) are really varying
2: In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :
  multiple rows match for ID=AFG6132: first taken
3: In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :
  multiple rows match for ID=AFG6133: first taken
4: In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :
  multiple rows match for ID=AFG6134: first taken
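The warnings arise because, once the data spans more than one year, `month` alone no longer identifies a row, so `reshape()` silently keeps only the first match. Rather than pasting year into the column ID, one fix is to keep year out of the ID and use a composite row identifier. A minimal sketch on toy data (not the full dataset):

```r
# toy panel: two coordinates x two years x two months
df <- expand.grid(month = 1:2, year = 2000:2001, lat = c(32, 33))
df$iso3 <- "AFG"
df$lon <- 61
df$precip <- seq_len(nrow(df)) / 10

# column ID identifies the coordinate only; year stays a row identifier
df$ID <- paste(df$iso3, df$lon, df$lat, sep = "_")
wide <- reshape(df, v.names = "precip", timevar = "ID",
                idvar = c("year", "month"), direction = "wide",
                drop = c("iso3", "lon", "lat"))
wide  # one row per year-month, one precip column per coordinate
```

With `idvar = c("year", "month")` each row is uniquely identified, so the "multiple rows match" warnings disappear.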

On Tue, Dec 13, 2016 at 6:21 PM, David L Carlson  wrote:

> You can also use function reshape() in stats:
>
> temp$ID<-paste(temp$iso3,temp$lon,temp$lat,sep="")
> wide <- reshape(temp, v.names="precip", timevar="ID", idvar="month",
>  direction="wide",  drop=c("iso3", "lon", "lat", "dm"))
> wide
>
>month precip.AFG6132 precip.AFG6133
> 1      1      0.9966658    1.133129032
> 2      2      0.1567117    0.355208276
> 3      3      0.2424774    0.307277419
> 4      4      0.0000000    0.008316000
> 5      5      0.0000000    0.000000000
> 6      6      0.0000000    0.000000000
> 7      7      0.0000000    0.000000000
> 8      8      0.0000000    0.000836129
> 9      9      0.0000000             NA
> 10    10      0.0000000             NA
> 11    11      0.1215360             NA
> 12    12      0.3886606             NA
>
> -
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jim Lemon
> Sent: Tuesday, December 13, 2016 2:59 AM
> To: Miluji Sb; r-help mailing list
> Subject: Re: [R] Reshape to wide format
>
> Hi Milu,
> I may have the wrong idea, but is this what you want?
>
> temp$ID<-paste(temp$iso3,temp$lon,temp$lat,sep="")
> library(prettyR)
> newtemp<-stretch_df(temp,"month","precip")[,c(5,7,8)]
> names(newtemp)<-c("month",unique(temp$ID))
>
> Jim
>
>
> On Tue, Dec 13, 2016 at 4:10 AM, Miluji Sb  wrote:
> > Dear all,
> >
> > I have the following monthly data by coordinates:
> >
> > I would like to reshape this data to wide format so that each column is a
> > coordinate and each row is a month,
> >
> > coordinate1 coordinate2 coordinate3...
> > Month 1
> > Month 2
> >
> > Is the best option to concatenate the iso3, lon, and lat variables to
> > create an ID variable? I realize that this question might be very basic
> but
> > I'm slightly baffled. Thank you.
> >
> > temp <- dput(head(precip_2000,20))
> > structure(list(iso3 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("AFG",
> > "AGO", "ALB", "ARE", "ARG", "ARM", "AUS", "AUT", "AZE", "BDI",
> > "BEL", "BEN", "BFA", "BGD", "BGR", "BHS", "BIH", "BLR", "BLZ",
> > "BOL", "BRA", "BRN", "BTN", "BWA", "CAF", "CAN", "CHE", "CHL",
> > "CHN", "CIV", "CMR", "COD", "COG", "COL", "CRI", "CUB", "CYP",
> > "CZE", "DEU", "DJI", "DNK", "DOM", "DZA", "ECU", "EGY", "ERI",
> > "ESH", "ESP", "EST", "ETH", "FIN", "FJI", "FLK", "FRA", "GAB",
> > "GBR", "GEO", "GHA", "GIN", "GNB", "GNQ", "GRC", "GRL", "GTM",
> > "GUF", "GUY", "HND", "HRV", "HTI", "HUN", "IDN", "IND", "IRL",
> > "IRN", "IRQ", "ISL", "ISR", "ITA", "JAM", "J

[R] Problem with Zelig - gam.poisson

2016-12-23 Thread Miluji Sb
I am getting the following error when trying to run a gam.possion model
with Zelig.

Error in eval(expr, envir, enclos) : attempt to apply non-function

Even the example won't run.

library(mgcv)
library(Zelig)

n <- 400
sig <- 2
x0 <- runif(n, 0, 1); x1 <- runif(n, 0, 1)
x2 <- runif(n, 0, 1); x3 <- runif(n, 0, 1)
f0 <- function(x) 2 * sin(pi * x)
f1 <- function(x) exp(2 * x)
f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 * (1 -
x)^10
f3 <- function(x) 0 * x
f <- f0(x0) + f1(x1) + f2(x2)
g <- exp(f/4)
y <- rpois(rep(1, n), g)
my.data <- as.data.frame(cbind(y, x0, x1, x2, x3))

z.out <- zelig(y ~ s(x0) + s(x1) + s(x2) + s(x3), model = "gam.poisson",
data = my.data)

What am I doing wrong? Session info is below. Thanks a lot!
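As a cross-check while the Zelig wrapper fails, a Poisson GAM with the same smooths can be fitted directly with mgcv (this bypasses Zelig entirely, so it is a workaround, not a fix for the `zelig()` error):

```r
library(mgcv)

# same simulated data as the Zelig example
set.seed(1)
n <- 400
x0 <- runif(n); x1 <- runif(n); x2 <- runif(n); x3 <- runif(n)
f <- 2 * sin(pi * x0) + exp(2 * x1) +
  0.2 * x2^11 * (10 * (1 - x2))^6 + 10 * (10 * x2)^3 * (1 - x2)^10
y <- rpois(n, exp(f / 4))
my.data <- data.frame(y, x0, x1, x2, x3)

# Poisson GAM with thin-plate smooths on all four covariates
fit <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3),
           family = poisson, data = my.data)
summary(fit)
```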

Sincerely,

Milu


sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
States.1252LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C   LC_TIME=English_United
States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] Zelig_5.0-13 mgcv_1.8-15  nlme_3.1-128

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8nloptr_1.0.4   plyr_1.8.4
miscTools_0.6-22   tools_3.3.2lme4_1.1-12
 [7] MatchIt_2.4-21 jsonlite_1.1   tibble_1.2
lattice_0.20-34Matrix_1.2-7.1 DBI_0.5-1
[13] maxLik_1.3-4   parallel_3.3.2 SparseM_1.74   coda_0.18-1
 AER_1.2-4  dplyr_0.5.0
[19] MatrixModels_0.4-1 nnet_7.3-12stats4_3.3.2   lmtest_0.9-34
 grid_3.3.2 R6_2.2.0
[25] geepack_1.2-1  survival_2.40-1VGAM_1.0-2
foreign_0.8-67 minqa_1.2.4Amelia_1.7.4
[31] Formula_1.2-1  car_2.1-3  magrittr_1.5   mcmc_0.9-4
  MASS_7.3-45splines_3.3.2
[37] pbkrtest_0.4-6 assertthat_0.1 quantreg_5.29
 sandwich_2.3-4 survey_3.31-5  MCMCpack_1.3-8
[43] lazyeval_0.2.0 zoo_1.7-13

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rbind common header to file list

2021-01-19 Thread Miluji Sb
Dear all,

I have more than 200 text files in a folder without header - example below.
I would like to read, add a common date header to all the files, and write
(replace) the files.

## Read files
filelist = list.files(pattern = ".*.txt")
datalist = lapply(filelist, function(x)read.table(x, header=F))

## What I want to add
date <- 2101
datalist_new <- lapply(datalist, function(x) rbind(x, date))

How do I add this date as a common header and replace the files? Any help
will be highly appreciated. Thank you.

Best,

Milu

## Sample data
xy <- dput(head(x,6))
structure(list(V1 = c("-5.28082885742185,-0.509039307",
"-6.09873046874998,-0.349584961",
"-2.07150878906248,6.264276123", "-1.11102905273435,6.365716553",
"2.37749633789065,14.57106934", "4.9619079589844,18.91350708"
)), row.names = c(NA, 6L), class = "data.frame")

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rbind common header to file list

2021-01-20 Thread Miluji Sb
Thank you for your reply and the solution. Yes, I would like the date to be
the column header for all the files in the list.

This is what tried following your suggestion;

filelist = list.files(pattern = ".*.txt")
date <- 2101

for (file %in% filelist){
  datalist <- read.table(file)
  write.table(datalist, file= file, col.names= date)
  }

However, I get the following error

Error: unexpected SPECIAL in "for (file %in%"


Is it something silly I am missing? Thank you again!

Best,

Milu
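For reference, `%in%` is R's vectorized membership operator and is not valid in a `for` header; the loop keyword is `in`. A corrected sketch (note it overwrites the files in place, as in the original; writing to a copy first is safer):

```r
filelist <- list.files(pattern = "\\.txt$")  # anchored pattern: ".txt" must be the extension
date <- 2101

for (file in filelist) {  # `in`, not `%in%`
  datalist <- read.table(file, header = FALSE)
  # col.names = date writes "2101" as the single column header
  write.table(datalist, file = file, col.names = date, row.names = FALSE)
}
```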

On Wed, Jan 20, 2021 at 1:29 AM  wrote:

> I'd use a for loop. But I may be misunderstanding the q!
>
>
> filelist = list.files(pattern = ".*.txt")
> date <- 2101
>
> for (file %in% filelist) {
>
> datalist <- read.table(file)
>
> write.table( datalist, file= file, col.names= date)
>
> }
>
> Does what I think you want.
> I would actually write to a new filename (sub folder?) To avoid disaster!
>
>
>
>
>
> On 19 Jan 2021 23:45, Miluji Sb  wrote:
>
> Dear all,
>
> I have more than 200 text files in a folder without header - example
> below.
> I would like to read, add a common date header to all the files, and write
> (replace) the files.
>
> ## Read files
> filelist = list.files(pattern = ".*.txt")
> datalist = lapply(filelist, function(x)read.table(x, header=F))
>
> ## What I want to add
> date <- 2101
> datalist _new <- lapply(datalist, function(x) rbind(x, date))
>
> How do I add this date as a common header and replace the files? Any help
> will be highly appreciated. Thank you.
>
> Best,
>
> Milu
>
> ## Sample data
> xy <- dput(head(x,6))
> structure(list(V1 = c("-5.28082885742185,-0.509039307",
> "-6.09873046874998,-0.349584961",
> "-2.07150878906248,6.264276123", "-1.11102905273435,6.365716553",
> "2.37749633789065,14.57106934", "4.9619079589844,18.91350708"
> )), row.names = c(NA, 6L), class = "data.frame")
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rbind common header to file list

2021-01-21 Thread Miluji Sb
Thank you, that was it.

Best,

Milu

On Wed, Jan 20, 2021 at 1:33 PM Eric Berger  wrote:

> for ( file in filelist )
>
>
>
> On Wed, Jan 20, 2021 at 2:21 PM Miluji Sb  wrote:
>
>> Thank you for your reply and the solution. Yes, I would like the date to
>> be
>> the column header for all the files in the list.
>>
>> This is what tried following your suggestion;
>>
>> filelist = list.files(pattern = ".*.txt")
>> date <- 2101
>>
>> for (file %in% filelist){
>>   datalist <- read.table(file)
>>   write.table(datalist, file= file, col.names= date)
>>   }
>>
>> However, I get the following error
>>
>> Error: unexpected SPECIAL in "for (file %in%"
>>
>>
>> Is it something silly I am missing? Thank you again!
>>
>> Best,
>>
>> Milu
>>
>> On Wed, Jan 20, 2021 at 1:29 AM  wrote:
>>
>> > I'd use a for loop. But I may be misunderstanding the q!
>> >
>> >
>> > filelist = list.files(pattern = ".*.txt")
>> > date <- 2101
>> >
>> > for (file %in% filelist) {
>> >
>> > datalist <- read.table(file)
>> >
>> > write.table( datalist, file= file, col.names= date)
>> >
>> > }
>> >
>> > Does what I think you want.
>> > I would actually write to a new filename (sub folder?) To avoid
>> disaster!
>> >
>> >
>> >
>> >
>> >
>> > On 19 Jan 2021 23:45, Miluji Sb  wrote:
>> >
>> > Dear all,
>> >
>> > I have more than 200 text files in a folder without header - example
>> > below.
>> > I would like to read, add a common date header to all the files, and
>> write
>> > (replace) the files.
>> >
>> > ## Read files
>> > filelist = list.files(pattern = ".*.txt")
>> > datalist = lapply(filelist, function(x)read.table(x, header=F))
>> >
>> > ## What I want to add
>> > date <- 2101
>> > datalist _new <- lapply(datalist, function(x) rbind(x, date))
>> >
>> > How do I add this date as a common header and replace the files? Any
>> help
>> > will be highly appreciated. Thank you.
>> >
>> > Best,
>> >
>> > Milu
>> >
>> > ## Sample data
>> > xy <- dput(head(x,6))
>> > structure(list(V1 = c("-5.28082885742185,-0.509039307",
>> > "-6.09873046874998,-0.349584961",
>> > "-2.07150878906248,6.264276123", "-1.11102905273435,6.365716553",
>> > "2.37749633789065,14.57106934", "4.9619079589844,18.91350708"
>> > )), row.names = c(NA, 6L), class = "data.frame")
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>> >
>> >
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sum every n (4) observations by group

2021-12-19 Thread Miluji Sb
Dear all,

I have a dataset (below) by ID and time sequence. I would like to sum every
four observations by ID.

I am confused how to combine the two conditions. Any help will be highly
appreciated. Thank you!

Best.

Milu

## Dataset
structure(list(ID = c("A", "A", "A", "A", "A", "A", "A", "A",
"B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C",
"C", "C", "C"), Date = c(4140L, 4141L, 4142L, 4143L, 4144L, 4145L,
4146L, 4147L, 4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L,
4147L, 4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L, 4147L
), Value = c(0.000207232, 0.000240141, 0.000271414, 0.000258384,
0.00024364, 0.00027148, 0.000280585, 0.000289691, 0.000298797,
0.000307903, 0.000317008, 0.000326114, 0.00033522, 0.000344326,
0.000353431, 0.000362537, 0.000371643, 0.000380749, 0.000389854,
0.00039896, 0.000408066, 0.000417172, 0.000426277, 0.000435383
)), class = "data.frame", row.names = c(NA, -24L))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sum every n (4) observations by group

2021-12-19 Thread Miluji Sb
Dear Peter,

Thanks so much for your reply and the code! This is helpful.

What I would like is the data.frame below - sum values for *4140, 4141,
4142, 4143 *and then for *4144, 4145, 4146, 4147 *for IDs A, B, and C. Does
that make sense? Thanks again!

Best.

Milu

results <- structure(list(ID = c("A", "A", "A", "A", "A", "A", "A", "A",
"B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C",
"C", "C", "C"), Date = c(4140L, 4141L, 4142L, 4143L, 4144L, 4145L,
4146L, 4147L, 4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L,
4147L, 4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L, 4147L
), Value = c(0.000207232, 0.000240141, 0.000271414, 0.000258384,
0.00024364, 0.00027148, 0.000280585, 0.000289691, 0.000298797,
0.000307903, 0.000317008, 0.000326114, 0.00033522, 0.000344326,
0.000353431, 0.000362537, 0.000371643, 0.000380749, 0.000389854,
0.00039896, 0.000408066, 0.000417172, 0.000426277, 0.000435383
), sum = c(NA, NA, NA, 0.000977171, NA, NA, NA, 0.001054089,
NA, NA, NA, 0.001213399, NA, NA, NA, 0.001395514, NA, NA, NA,
0.001541206, NA, NA, NA, 0.001686898)), class = "data.frame", row.names =
c(NA,
-24L))
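A base-R sketch producing one row per ID and four-observation block (rather than NA-padded rows), assuming every ID contributes a multiple of four rows in `Date` order:

```r
# the dataset from the first message
df <- structure(list(ID = c("A", "A", "A", "A", "A", "A", "A", "A",
  "B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C",
  "C", "C", "C"), Date = c(4140L, 4141L, 4142L, 4143L, 4144L, 4145L,
  4146L, 4147L, 4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L,
  4147L, 4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L, 4147L),
  Value = c(0.000207232, 0.000240141, 0.000271414, 0.000258384,
  0.00024364, 0.00027148, 0.000280585, 0.000289691, 0.000298797,
  0.000307903, 0.000317008, 0.000326114, 0.00033522, 0.000344326,
  0.000353431, 0.000362537, 0.000371643, 0.000380749, 0.000389854,
  0.00039896, 0.000408066, 0.000417172, 0.000426277, 0.000435383)),
  class = "data.frame", row.names = c(NA, -24L))

# label each consecutive run of four rows within an ID: 0 0 0 0 1 1 1 1
df$block <- ave(df$Date, df$ID, FUN = function(d) (seq_along(d) - 1) %/% 4)

# one row per ID x block, Value summed over the four observations
sums <- aggregate(Value ~ ID + block, data = df, FUN = sum)
sums
```

Block 0 covers dates 4140-4143 and block 1 covers 4144-4147 for each ID.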

On Sun, Dec 19, 2021 at 7:50 PM Peter Langfelder 
wrote:

> I'm not sure I understand the task, but if I do, assuming your data
> frame is assigned to a variable named df, I would do something like
>
> sumNs = function(x, n)
> {
>if (length(x) %%n !=0) stop("Length of 'x' must be a multiple of 'n'.")
>n1 = length(x)/n
>ind = rep(1:n1, each = n)
>tapply(x, ind, sum)
> }
> sums = tapply(df$Value, df$ID, sumNs, 4)
>
> Peter
>
> On Sun, Dec 19, 2021 at 10:32 AM Miluji Sb  wrote:
> >
> > Dear all,
> >
> > I have a dataset (below) by ID and time sequence. I would like to sum
> every
> > four observations by ID.
> >
> > I am confused how to combine the two conditions. Any help will be highly
> > appreciated. Thank you!
> >
> > Best.
> >
> > Milu
> >
> > ## Dataset
> > structure(list(ID = c("A", "A", "A", "A", "A", "A", "A", "A",
> > "B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C",
> > "C", "C", "C"), Date = c(4140L, 4141L, 4142L, 4143L, 4144L, 4145L,
> > 4146L, 4147L, 4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L,
> > 4147L, 4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L, 4147L
> > ), Value = c(0.000207232, 0.000240141, 0.000271414, 0.000258384,
> > 0.00024364, 0.00027148, 0.000280585, 0.000289691, 0.000298797,
> > 0.000307903, 0.000317008, 0.000326114, 0.00033522, 0.000344326,
> > 0.000353431, 0.000362537, 0.000371643, 0.000380749, 0.000389854,
> > 0.00039896, 0.000408066, 0.000417172, 0.000426277, 0.000435383
> > )), class = "data.frame", row.names = c(NA, -24L))
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sequential date by group

2019-09-02 Thread Miluji Sb
Dear all,

I have panel data with a large number of groups and the cumulative number
of months (1 - 372) for January 1975 to December 2005. My goal is to
extract the corresponding month and year for each observation.

I tried the following;

###
x %>%
  group_by(id) %>%
  do( data.frame(., Date= seq(.$startdate,
  as.Date('1975-01-01'), by = '1 month')))

However, I get the following error;

###
Error in seq.default(.$startdate, as.Date("1975-01-01"), by = "1 month") :
  'from' must be of length 1

Essentially, I want to convert the month number (1-372) to January 1975 to
December 2005 by ID. Is that possible? Any help will be highly appreciated.

### Data ###
x <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), month =
c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), startdate = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1975-01-01", class = "factor"),
enddate = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label =
"2005-12-31", class = "factor")), class = "data.frame", row.names = c(NA,
-9L))
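Given the start date in the data (1975-01-01), the cumulative month number maps to calendar year and month with integer arithmetic alone; a minimal sketch:

```r
month_no <- c(1L, 2L, 13L, 372L)           # cumulative month, 1 = January 1975
Year  <- 1975L + (month_no - 1L) %/% 12L   # whole years elapsed
Month <- (month_no - 1L) %% 12L + 1L       # position within the year
data.frame(month_no, Year, Month)          # 372 -> December 2005
```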

Cross-posted in Statalist - however, Stata seems to have a very inflexible
date-time structure.

Best regards,

Milu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sequential date by group

2019-09-03 Thread Miluji Sb
Thanks, Jeff. I am trying your solutions - appreciate your help!

Best,

Milu

On Tue, Sep 3, 2019 at 12:51 AM Jeff Newmiller 
wrote:

> Your sample data has startdate and enddate columns as though these values
> might vary throughout the data set, but your description suggests that
> those values are fixed for the whole dataset.
>
> If those values are really constant, then
>
> x$Year <- 1975L + ( x$month - 1L ) %/% 12L
> x$Month <- ( x$month - 1L ) %% 12L + 1
>
> should be enough.
>
> If in fact your startdate can be different for different IDs, then you
> would need to wrap this up a bit:
>
> ## base R
> x$startdate <- as.Date( as.character( x$startdate ) )
> x$Y <- as.integer( format( x$startdate, format = "%Y" ) )
> xlist <- split( x, x$id )
> xlist1 <- lapply( xlist
>  , FUN = function( DF ) {
>      DF$Year <- DF$Y + ( DF$month - 1L ) %/% 12L
>      DF$Month <- ( DF$month - 1L ) %% 12L + 1L
>      DF
>    }
>  )
> x1 <- unsplit( xlist1, x$id )
>
> ## dplyr
> library(dplyr)
> x2 <- (   x
>%>% mutate( Y = as.integer( format( startdate, format = "%Y" ) ) )
>%>% group_by( id )
>%>% mutate( Year = Y + ( month - 1L ) %/% 12L
>  , Month = ( month - 1L ) %% 12L + 1L
>  )
>%>% ungroup
>)
> #
>
> On Mon, 2 Sep 2019, Miluji Sb wrote:
>
> > Dear all,
> >
> > I have a panel data with a large number of groups and the cumulative
> number
> > of months (1 - 372) for January 1995 to December 2005. My goal is to
> > extract the corresponding month and year for each observation.
> >
> > I tried the following;
> >
> > ###
> > x %>%
> >  group_by(id) %>%
> >  do( data.frame(., Date= seq(.$startdate,
> >  as.Date('1975-01-01'), by = '1 month')))
> >
> > However, I get the following error;
> >
> > ###
> > Error in seq.default(.$startdate, as.Date("1975-01-01"), by = "1 month")
> :
> >  'from' must be of length 1
> >
> > Essentially, I want to convert the month number (1-372) to January 1975
> to
> > December 2005 by ID. Is that possible? Any help will be highly
> appreciated.
> >
> > ### Data ###
> > x <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), month =
> > c(1L,
> > 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), startdate = structure(c(1L,
> > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1975-01-01", class =
> "factor"),
> >enddate = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label =
> > "2005-12-31", class = "factor")), class = "data.frame", row.names = c(NA,
> > -9L))
> >
> > Cross-posted in Statalist - however, Stata seems to have a very
> inflexible
> > date-time structure.
> >
> > Best regards,
> >
> > Milu
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live
> Go...
>Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> ---
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Match coordinates to regional polygon

2020-03-09 Thread Miluji Sb
Dear all,

I am trying to match a large number of coordinates (attached) sub-national
regions using GADM shapefile.

Coordinates:
https://drive.google.com/file/d/1PUsi4d0wP7hB6Aps6UmpXsnIPSD3I1sT/view?usp=sharing

Shapefile:
https://drive.google.com/drive/folders/1DxyygjQNeg2GIS9doaFM8yM2LvBYZi8E?usp=sharing


This is the code I am using;

###
library(data.table)
library(ncdf4)
library(rgdal)
library(raster)
library(tidyverse)

## Read coordinates
coords <- read.csv("coords.csv")

## Read shapefile
world <-readOGR(dsn = "/gadm36_levels_shp", layer = "gadm36_2")

## Convert to spatial points
pts <- coords
coordinates(pts) <- ~lon+lat

## Provide the same CRS to the coordinates as the shapefile
proj4string(pts) <- proj4string(world)

## Use over function to match coordinates to polygons
pts_district <- over(pts, world)$NAME_1
###

However, less than 20% of the coordinates are matched to a polygon. What am
I doing wrong? Any help will be greatly appreciated.
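One thing to check first is whether the unmatched points are genuinely outside every polygon (GADM clips at the coastline, so coastal grid points often fall in the sea) and whether lon and lat are in that order in the file. A hedged sketch with the sf package that flags the misses and snaps them to the nearest polygon (file paths as in the code above):

```r
library(sf)

coords <- read.csv("coords.csv")  # columns lon, lat as in the question
pts <- st_as_sf(coords, coords = c("lon", "lat"), crs = 4326)
world <- st_read("gadm36_levels_shp", layer = "gadm36_2")

# exact point-in-polygon join; NAME_1 is NA where a point hits no polygon
hit <- st_join(pts, world["NAME_1"])

# snap the remaining (typically coastal) points to the nearest polygon
miss <- is.na(hit$NAME_1)
hit$NAME_1[miss] <- world$NAME_1[st_nearest_feature(pts[miss, ], world)]
```

If nearly all points match after the nearest-feature step, the originals were simply offshore; if not, inspect a few misses by plotting them over the polygons.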

Regards,

Milu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extract NASA VIIRS data in R

2020-03-18 Thread Miluji Sb
Dear all,

Hope everyone is keeping safe.

I am trying to extract/read VIIRS nighttime lights data but the output
seems rather strange.

I have uploaded the file here so as
not to exceed the size limit of the email.

## Code ##
library(ncdf4)
library(rgdal)
library(raster)  # provides brick()

netcdf_file <- "VNP02DNB_NRT.A2020069.1048.001.nc"
nl <- brick(netcdf_file, lvar=0, values=TRUE,
varname="observation_data/DNB_observations")

## Detail ##
class  : RasterLayer
dimensions : 3232, 4064, 13134848  (nrow, ncol, ncell)
resolution : 1, 1  (x, y)
extent : 0.5, 4064.5, 0.5, 3232.5  (xmin, xmax, ymin, ymax)
crs: NA
source : C:/Users/shour/Desktop/VNP02DNB_NRT.A2020069.1048.001.nc

names  : DNB.observations.at.pixel.locations
zvar   : observation_data/DNB_observations

The spatial resolution is supposed to be 750m but this shows 1°. What am I
doing wrong? Any help will be greatly appreciated. Thank you.
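The raster package reports a unitless 1 x 1 resolution and no CRS because VNP02DNB is swath (orbit) data: the file stores only the DNB radiance array, with no embedded grid. The per-pixel coordinates live in the matching VNP03DNB geolocation granule, which has to be read separately; a sketch (the companion filename and the group/variable names are assumptions; inspect them with `names(nc$var)` after opening):

```r
library(ncdf4)

# hypothetical matching geolocation granule (same date/time as the VNP02DNB file)
nc_geo <- nc_open("VNP03DNB_NRT.A2020069.1048.001.nc")
lat <- ncvar_get(nc_geo, "geolocation_data/latitude")   # assumed variable name
lon <- ncvar_get(nc_geo, "geolocation_data/longitude")  # assumed variable name
nc_close(nc_geo)

nc_obs <- nc_open("VNP02DNB_NRT.A2020069.1048.001.nc")
dnb <- ncvar_get(nc_obs, "observation_data/DNB_observations")
nc_close(nc_obs)

# dnb, lat, lon are now arrays of the same 3232 x 4064 shape; the ~750 m
# pixel size is implicit in the lat/lon spacing, not in the array indices
```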

Cross-posted in R sig-geo but no reply - apologies.

Sincerely,

Millu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Year and month from a sequence

2020-07-14 Thread Miluji Sb
Dear all,
I have a panel dataset with a large number of groups (200K+) with a
sequence representing the month of the observations (2400 - 4139). The data
is from January 1861 to December 2005.
My goal is to extract the corresponding month and year for each observation.

I tried the following;
x$dt <- seq(from=as.Date("1861-01-01"),by='1 month',length=209000)

However, the end date is clearly out of the range. I would like to convert
the month number (2400 - 4139) to January 1861 to December 2005 by ID. How
can I achieve this? Any help will be highly appreciated.
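Assuming month number 2400 corresponds to January 1861 (so that 4139 lands on December 2005, matching the stated range), the conversion is an offset plus the same integer arithmetic as before; a minimal sketch:

```r
month_no <- c(2400L, 2412L, 4139L)  # 2400 = January 1861 (assumption from the stated range)
idx <- month_no - 2400L             # whole months elapsed since January 1861
Year  <- 1861L + idx %/% 12L
Month <- idx %% 12L + 1L
data.frame(month_no, Year, Month)   # 4139 -> December 2005
```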

Best,

Milu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Year and month from a sequence

2020-07-16 Thread Miluji Sb
Dear Jim, Eric, and Rui,

Great solutions! Thank you so much.

Best,

Milu

On Wed, Jul 15, 2020 at 9:20 AM Rui Barradas  wrote:

> Hello,
>
> Yet another way, with package lubridate.
> When creating the "Date" objects, I set day = 1 to compare to TRUE below.
>
>
> library(lubridate)
>
> start <- as.Date(ISOdate(1861, 1, 1))
> end <- as.Date(ISOdate(2005, 12, 1))
>
> start + months(4139 - 2400)
> #[1] "2005-12-01"
>
> end == start + months(4139 - 2400)
> #[1] TRUE
>
>
> The argument to function months can be a vector of integers.
>
>
> start + months(1:(4139 - 2400))
>
>
>
> Then extract the months.
>
> month(start)
> month(end)
> month(start + months())
> month(start + months(4139 - 2400))
>
>
> Hope this helps,
>
> Rui Barradas
>
> Às 23:26 de 14/07/20, Miluji Sb escreveu:
> > Dear all,
> > I have a panel dataset with a large number of groups (200K+) with a
> > sequence representing the month of the observations (2400 - 4139). The
> data
> > is from January 1861 to December 2005.
> > My goal is to extract the corresponding month and year for each
> observation.
> >
> > I tried the following;
> > x$dt <- seq(from=as.Date("1861-01-01"),by='1 month',length=209000)
> >
> > However, the end date is clearly out of the range. I would like to
> convert
> > the month number (2400 - 4139) to January 1861 to December 2005 by ID.
> How
> > can I achieve this? Any help will be highly appreciated.
> >
> > Best,
> >
> > Milu
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.