It is better to understand the requirements before suggesting a way to do it.
My GUESS is that the questioner wants the circles of different sizes based on
a factor but the natural choices start too high for their needs/taste. So they
want a base size of 0.8 that is either the minimum or maximu
Always hard to tell if THIS is a homework project. As with most things in R,
if you can not find at least a dozen ways to do it, it is not worth doing.
The question (way below) was how to take two vectors of length two and make
a longer result based on using the ":" operator to generate a range b
Nice alternative for some cases but I do not get the desired result as one
long vector. I would change the last line to this:
unlist(as.vector(sapply(1:length(a),
FUN=function(x,
a,
b) a[x]:b[x],
Many problems can often be solved with some thought by using the right tools,
such as the ones from the tidyverse.
Without giving a specific answer, you might want to think about using the
group_by() functionality in a pipeline that would lump together all rows
matching say having the same valu
TOPIC: Why some returned values do not automatically print.
Again, not seeing the internals, my guess is the function returned not the
expected but "invisible(expected)" which just marks it as not to be
automatically printed.
So if you want it printed, ask for it explicitly as in:
print(me.pro
Steven,
You need to mention what you actually did to get proper advice. Your problem is
at the source.
Simply put, the R interpreter does have somewhat different behavior when the
program is directly typed in (or slightly indirectly as in R STUDIO) than when
you ask it to open another file as
There are endless ways to do what you want, Seyit. If you wish to remain in
base R, using the names function on the left-hand side changes the names as in:
names(something) <- c("new", "names")
And in general, you may want to learn how to use an alternate set of methods
that work well with ggpl
Martin,
You did not say your two starting objects were already sets. You said they
were vectors of strings. It may well be that your strings included
duplicates. For example, If I read in lots of text with a blank line between
paragraphs, I would have lots of seemingly empty and identical parts. J
There are many ways, Rolf. You need to look into the syntax of regular
expressions. It depends on how sure you are that the formats are exactly as
needed. Escaping the period with one or more backslashes is one way. Using
string functions is another.
Suggestion. See if you can make a regular expre
Rolf,
Try:
xxx[[2,3]]
The double brackets return an item, not a list containing the item.
> xxx[2,3]
[[1]]
[[1]]$a
[1] "m"
[[1]]$b
[1] 95
> xxx[[2,3]]
$a
[1] "m"
$b
[1] 95
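A minimal sketch of the same difference, assuming xxx is a list that carries dimensions (a list-matrix). The construction below is a made-up stand-in, not the original object:

```r
# Hypothetical stand-in for xxx: a 2 x 3 matrix whose cells are lists.
xxx <- vector("list", 6)
dim(xxx) <- c(2, 3)
xxx[[2, 3]] <- list(a = "m", b = 95)

single <- xxx[2, 3]     # a list of length 1 that wraps the cell
double <- xxx[[2, 3]]   # the cell itself

length(single)          # 1
names(double)           # "a" "b"
double$b                # 95
```

Single brackets keep the container, so the item has to be unwrapped with single[[1]] before its parts can be reached.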
-Original Message-
From: R-help On Behalf Of Rolf Turner
Sent: Sunday, February 14, 2021 10:35 PM
To: "r-help@
This discussion is a bit weird so can we step back.
Someone wants help on how to read in a file that apparently was not written
following one of several consistent sets of rules.
If it was fixed width, R has functions that can read that.
If it was separated by commas, tabs, single spaces, arbitr
I am sure you can get more done with a caret than a stick. I need a stick for
another problem, though.
A serious question. I somehow upset my R/RSTUDIO setup while trying to see why
a markdown only allowed me to save an HTML version, not PDF and DOC as it used
to. It now fails on any such docum
idyverse") line in your Rmd rather
than library (tidyverse)
Does this happen on a Rmd example created by doing File/New/R Markdown
With no changes?
On 28 Feb 2021 22:04, Avi Gross via R-help <r-help@r-project.org> wrote:
I am sure you can get more done with a
It sounds to me like you want to take your data and extract one column for
JUST the date and another column for just some measure of the time, such as
the number of seconds since midnight or hours in a decimal format where
12:45 PM might be 12.75.
You now can graph date along the X axis and time a
a horizontal line connecting from
day 1 to day 2 to day 3 to day 4.
Similarly for events 3, and 4. Is that convenient to do?
Greg Coats
On Mar 16, 2021, at 8:01 PM, Avi Gross via R-help
<mailto:r-help@r-project.org> wrote:
Here is an example that worked for me doing roughly what I mentioned
This may not be the right place to ask about ggplot which is part of
packages but are you aware how ggplot works additively?
You can say something like:
p <- ggplot(...) ... + ...
Then later say:
p <- p + geom_...()
And so on.
So if you set all the layers you want first into a variable like p,
There are rather straightforward ways to manipulate your data step by step to
make harder things possible, or you can use creative ways harder for people to
understand.
So adding columns to your data that take existing times/dates and record them
with names like Q1Y2021 can give you abilities b
Just some thoughts I am considering about the issue of how to make giant
objects in memory without making them giant or all in memory.
As stupid as this sounds, when things get really big, it can mean not only
processing your data in smaller amounts but using other techniques than asking
expand
Tuhin,
What do you mean by a 2-D dataset? You say some columns contain strings so
it does not sound like you are using a matrix as then ALL columns would be
of the same type.
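A small sketch of that point, using made-up data rather than yours: a matrix coerces everything to one type, while a data.frame keeps a type per column.

```r
# Illustrative data only: mixing numbers and strings in a matrix
m <- cbind(id = 1:2, name = c("a", "b"))
typeof(m)            # "character": the numbers were coerced to strings

# A data.frame stores each column with its own type
df <- data.frame(id = 1:2, name = c("a", "b"))
is.integer(df$id)    # TRUE: the numeric column stays numeric
```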
So are you using a data.frame or tibble or something you made on your own?
Can you address one column at a time and woul
Just FYI, Jeremie, you can do what you want fairly easily if you look at the
options available to print() and sprintf().
You can ask NA conversion to be done here directly at print time:
print(mat, na.print="")
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[,10]
JL,
There are many ways to do what you want. If you need to do it by yourself
using standard R, there are ways but if you are allowed to use packages,
like the forcats package in the tidyverse, it can be fairly simple. Here for
example is a way to convert a factor with the four levels you mention
This is unfortunately a bad habit many of us got from earlier languages like
the C group of languages where 0 is FALSE and 1 (and anything non-zero) is
TRUE. A language like Python is arguably even worse in that all kinds of
things can be TRUE or FALSE in odd ways, like a non-empty string or even a
Just a caution. There IS an operator of `!!` in the tidyverse called "bang
bang" that does a kind of substitution and you can look up the help page for it
as:
?`!!`
I just tried it on an example and it definitely will in some cases do this
other evaluation.
I doubt this will clash, but of cou
Why should that work in a dplyr function?
medal_data <- medal_counts_ctry %>% filter(medal_counts_ctry$.rows > 100)
Generally in dplyr you do not use the dollar sign notation. And is there a
column starting with a period called ".rows" ??
Without seeing what your data looks like, and assuming yo
Micha,
Others have provided ways in standard R so I will contribute a somewhat odd
solution using the dplyr and related packages in the tidyverse including a
sample data.frame/tibble I made. It requires newer versions of R and other
packages as it uses some fairly esoteric features including "
Bill,
A Matrix can only contain one kind of data. I ran your code after modifying
it to be proper and took a transpose to get it right-side up:
t(wanted)
date netIncome grossProfit
2020-09-30 "2020-09-30" "5741100.00" "10495600.00"
2019-09-30 "2019-09-30" "552560
as it may I think the requester has enough info and we can move on.
-Original Message-
From: Jeff Newmiller
Sent: Friday, July 2, 2021 1:03 AM
To: Avi Gross ; Avi Gross via R-help
; R-help@r-project.org
Subject: Re: [R] concatenating columns in data.frame
I use parts of the tidyverse f
Ding,
Just to get you to stop asking, here is a solution I hope works.
In English, if you are asking that ONE instance of a duplicate be marked in
a new column with TRUE or 1 while all remaining ones are marked as FALSE or
2 or whatever, that is easy enough. The method is to use the assistive
fun
Rolf,
Your example shows two plots with one above the other.
If that is what you want, then a solution like the one Jeff provided, using
facet_grid() to separate data based on the parameter value, works. It also scales
up if you add additional sets of data for gamma and delta up to a point.
An alterna
Rolf,
Your questions probably should go to a group focused on the ggplot package, not
a general R group where many do not use it.
A little judicious searching like "R ggplot use greek letters in text" gets you
some pointers that show how to do much more than Greek letters but more complex
Luigi,
Duncan answered part of your question. My feedback is to consider looking at
your data using other tools besides str().
There are ways in base R to get lists of row or column names or count them
or ask what types they are and so forth.
Printing an entire large object is hard but printing
It is not too clear to me what you want to do and why that package is the way
to do it. Is the package a required part of your assignment? If so, maybe
someone else can help you find how to properly install it on your machine,
assuming you have permissions to replace the other package it seems t
Kai,
It is easier to want to help someone if they generally know what they are doing
and are stuck on something. Less so when they do not know enough to explain to
us what they want, show what they did, and so on.
I modified the data you showed and hopefully it can be recreated this way:
libra
The code supplied is not proper for several reasons including not being on
multiple lines properly and use of variables not defined.
"percentage" is a field in data.frame "email" not in "graph_text" and of course
you need to load libraries properly to use the functions.
I rewrote and fixed a fe
Kai,
The answer is fairly probable to find if you examine your variable "eth" as
that is the only time you are being asked to provide the argument as in
"ggplot(data=eth, ...) ..."
As the message states, it expects that argument to be a data frame or something
it can change into a data.frame.
.frame
So, it seems the data.frame can not do this data convert? Do you know which
statement/function can do this?
thank you for your help.
On Thursday, August 26, 2021, 09:33:51 AM PDT, Avi Gross via R-help
<r-help@r-project.org> wrote:
Kai,
The answer is fairly probable
Am I seeing an odd aspect to this discussion?
There are many ways to solve problems and some may be favored by some more
than others.
All require some examination of the data so it can be massaged into shape
for the processes that follow.
If you insist on using the matrix method to arrange that
Seems trivial enough Elizabeth, either using a matrix or data.frame.
R is vectorized mostly so A[,1] notation selects a column all at once. Your
condition is thus:
A[,1] == B[,1]
After using your sample data to initialize an A and a B, I get this:
> A[,1] == B[,1]
[1] FALSE FALSE FALSE TRUE T
Why would you ask your question without mentioning that the two vectors may
be of unequal length when your abbreviated example was not like that!
You have two CASES here. In one A is longer and in one B is longer. When
they are the same, it does not matter.
So in your scenario, consider loo
Just for the hell of it I looked at the huge amount of data to see the
lengths:
> nrow(A)
[1] 8760
> nrow(B)
[1] 734
> sum(is.na(A[, 2]))
[1] 8760
> sum(is.na(B[, 2]))
[1] 0
So it seems your first huge matrix has 8,760 rows where the second entry is
always NA.
B seems to have 733
Luigi,
If you are sure you are looking at something like a data.frame, and all you
want to know is how many rows and how many columns are in it, then str() is
perhaps too detailed a tool.
The functions nrow() and ncol() tell you what you want and you can get both
together with dim(). You can, of c
Thanks for the interesting method Rui. So that is a way to do a redirect of
output not to a sinkfile but to an in-memory variable as a textConnection.
Of course, one has to wonder why the makers of str thought it would be too
inefficient to have an option that returns the output in a form that c
What is stopping you Abou?
Some of us here start wondering if we have better things to do than homework
for others. Help is supposed to be after they try and encounter issues that we
may help with.
So think about your problem. You supplied data in a file that is NOT in CSV
format but is in Tab
sor, Statistics and Data Science
Graduate Coordinator
Department of Mathematics and Statistics
University of Southern Maine
On Thu, Sep 2, 2021 at 9:42 PM Avi Gross via R-help <r-help@r-project.org> wrote:
What is stopping you Abou?
Some of us here start wondering if we ha
Thomas,
There are many approaches tried over the years to do partitioning along the
lines you mentioned and others. R already has many built-in or in packages
including some that are quite optimized. So anyone doing serious work can
often avoid doing this the hard way and build on earlier work.
N
Abou,
I believe I addressed this issue in a private message the other day.
As a general rule, truncating can leave a remainder. If
M = length(whatever)/3
Then M is no longer an integer. It can be a number ending in .333... or .666...
as well as 0.
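A quick sketch of the arithmetic, with an illustrative vector whose length is not a multiple of 3. The integer-division and remainder operators make the split explicit instead of relying on silent truncation:

```r
x <- 1:10                   # illustrative vector; 10 is not divisible by 3

M <- length(x) / 3          # 3.333..., not a whole number
whole <- length(x) %/% 3    # integer division gives 3 complete groups
left  <- length(x) %% 3     # remainder: 1 element left over

M == floor(M)               # FALSE, so M cannot be used as an exact count
```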
Now R may silently truncate somethi
I am sure there are many good ways to do the task including taking the
data.frame out into a list of data.frames and making the change to each by
taking the nth row that matches nrow(it) and changing it and then recombining.
What follows are several attempts leading up to one at the end I find i
Excellent function to use, Terry.
I note when I used it on a vector (in this case the first column of a
data.frame), it accepted last=TRUE as well as fromLast=TRUE, which I did not see
documented. Used on a data.frame, that change fails as function
duplicated.data.frame only passes along the
Rich,
Did I miss something? The summarise() command is telling you that you had not
implicitly grouped the data and it made a guess. The canonical way is:
... %>% group_by(year, month, day, hour) %>% summarise(...)
You decide which fields to group by, sometimes including others so they are in
As Eric has pointed out, perhaps Rich is not thinking pipelined. Summarize()
takes a first argument as:
summarise(.data=whatever, ...)
But in a pipeline, you OMIT the first argument and let the pipeline supply an
argument silently.
What I think summarize saw was something like:
summari
I think we wandered away into a package rather than base R, but the request
seems easy enough.
Just FYI, Rich, as you seem not to have incorporated the advice we gave yet
about the first argument, your use of group_by() is a tad odd.
disc %>%
group_by(hour) %>%
group_by(day) %>%
d thing.
-Original Message-
From: R-help On Behalf Of Rich Shepard
Sent: Monday, September 13, 2021 7:04 PM
To: r-help@r-project.org
Subject: Re: [R] tidyverse: grouped summaries (with summarize) [RESOLVED]
On Mon, 13 Sep 2021, Avi Gross via R-help wrote:
> As Eric has pointed out, p
Rich,
I reproduced your problem after re-arranging the code the mailer mangled. I
tried variations like not using pipes or changing what it is grouped by and
they all show your results on the abbreviated data with the error:
`summarise()` has grouped output by 'year'. You can override using the
Rich,
I have to wonder about how your data was placed in the CSV file based on
what you report.
Functions like read.table() (which is called by read.csv()) ultimately make
guesses about what number of columns to expect and what the contents are
likely to be. They may just examine the first N entr
Rich,
You have helped us understand and at this point, suppose we now are sure
about the way missing info is supplied. What you show is not the same as the
CSV sample earlier but assuming you know that "Eqp" is the one and only way
they signaled bad data.
One choice is to fix the original data be
Calling something a data.frame does not make it a data.frame.
The abbreviated object shown below is a list of singletons. If it is a column
in a larger object that is a data.frame, then it is a list column which is
valid but can be ticklish to handle within base R but less so in the tidyverse.
n
I'd like to point out that base R can handle a list as a data frame column,
it's just that you have to make the list of class "AsIs". So in your example
temp <- list("Hello", 1, 1.1, "bye")
data.frame(alpha = 1:4, beta = I(temp))
means
handle them become better or at least better understood.
-Original Message-
From: R-help On Behalf Of Avi Gross via R-help
Sent: Wednesday, September 15, 2021 1:23 AM
To: R-help@r-project.org
Subject: Re: [R] How to remove all rows that have a numeric in the first (or
any) column
You are
Glad we have solutions BUT I note that the more abstract question is how to
convert any columns that are factors to their base type and that may well NOT
be character. They can be integers or doubles or complex or Boolean and maybe
even raw.
So undoing factorization may require using something
academic exercise for me.
-Original Message-
From: Bert Gunter
Sent: Sunday, September 19, 2021 7:19 PM
To: Avi Gross
Cc: Luigi Marongiu ; Rui Barradas
; r-help
Subject: Re: [R] how to remove factors from whole dataframe?
You do not understand factors. There is no "base type" that c
Ravi,
I have no idea what motivated the people who made ifelse() but there is no
reason they felt the need to program it to meet your precise need. As others
have noted, it probably was built to handle simple cases well and it expects
to return a result that is the same length as the input. If som
This is supposed to be a forum for help so general and philosophical
discussions belong elsewhere, or nowhere.
Having said that, I want to make a brief point. Both new and experienced
people make implicit assumptions about the code they use. Often nobody looks
at how the sausage is made. The re
Hi Avi,
Definitely a learning moment. I may consider writing an ifElse() for
my own use and sharing it if anyone wants it.
Jim
On Sun, Oct 10, 2021 at 6:36 AM Avi Gross via R-help
wrote:
>
> This is supposed
intersect() is a generic function so the question is which one does someone
want to know if it remains in the same order?
But a deeper question is what ORDER?
intersect(A, B)
intersect(B, A)
Note the results have to be the same but not the order unless they start
sorted the same way.
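A small sketch with made-up vectors, assuming base R's intersect() rather than a package method: the two calls return the same elements, but each follows the order of its first argument.

```r
A <- c(3, 1, 2)      # illustrative vectors
B <- c(2, 3, 5)

ab <- intersect(A, B)   # common elements in the order they appear in A
ba <- intersect(B, A)   # common elements in the order they appear in B

setequal(ab, ba)        # TRUE: same members
identical(ab, ba)       # FALSE here: different order
```

Sorting both results, or both inputs, first is one way to make the comparison independent of argument order.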
-Orig
I wonder why it is not as simple as:
Call mutate on the data and have a condition that looks like:
data %>% mutate(cases = ifelse(multiple_cond, NA, cases)) -> output
-Original Message-
From: R-help On Behalf Of Dr Eberhard W Lisse
Sent: Monday, October 25, 2021 11:49 AM
To: r-help@r-pro
There can be people doing homework for a course and as noted, the normal
expectation is to use the resources provided including classroom instruction
(or the often ZOOM or recordings) as well as textbooks.
Forums like this are not a substitute and some nice people will sometimes
volunteer not to d
The error below was fairly clear. The R 'if' statement is not vectorized and
takes a single logical argument. It is not normally used in a pipeline
unless at that point the data has been reduced to a vector of length 1.
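A minimal sketch of the distinction, using illustrative values and not the poster's data: ifelse() works element by element, while if needs the condition collapsed to a single TRUE or FALSE, for example with any() or all().

```r
x <- c(1, 5, 10)   # illustrative values

# Vectorized: one result per element of x
labels <- ifelse(x > 4, "big", "small")

# Scalar: reduce the logical vector to length 1 before handing it to if
msg <- "no value exceeds 4"
if (any(x > 4)) {
  msg <- "at least one value exceeds 4"
}
```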
I do not want to look at your code further without the data behind it but
sugg
I am not sure your overall question fits into this forum but a brief
internet search can find plenty of info.
But in brief, R is a language in which much of what numpy does was built in
from the start and many things are vectorized. Much of what the python
pandas language does is also part of nati
As others have replied, the customary way is to use the seq() function that
takes additional arguments besides a from= and a to= such as by= to specify
the step size and two others sometimes handy of length.out= and along.with=
In your case seq(from=1.5, to=3.5, by=0.5) works as well as the shorte
Bert,
R is used all over the place, sometimes not visibly.
A search shows the NY Times using it in 2011, 2009, ...:
https://www.nytimes.com/2009/01/07/technology/business-computing/07program.html
https://blog.revolutionanalytics.com/2011/03/how-the-new-york-times-uses-r-for-data-visualization
This is a fairly simple request and well covered by introductory reading
material.
A decent example was given and I see Andrew provided a base R reply that
should be sufficient. But I do not think he realized you wanted something
different so his answer is not in the format you wanted:
> tapply(d
Jim,
Your code gives the output in quite a different format and as an object of
class "by" that is not easily convertible to a data.frame. So, yes, it is an
answer that produces the right numbers but not in the places or data
structures I think they (or if it is HW ...) wanted.
Trivial standard c
to get the
desired output in that particular format. That file will be saved
and used as an input file for another external process.
val
On Mon, Nov 1, 2021 at 6:08 PM Avi Gross via R-help
wrote:
>
> Jim,
>
> Your code gives the output in quite a different format and as
bject: Re: [R] by group
>
> Thank you all!
> I can assure you that this is not HW. This is a sample of my large data set
> and I want a simple and efficient approach to get the
> desired output in that particular format. That file will be saved
> and used as an in
an input file for another external process.
val
On Mon, Nov 1, 2021 at 6:08 PM Avi Gross via R-help
<r-help@r-project.org> wrote:
>
> Jim,
>
> Your code gives the output in quite a different format and as an object of
> class "by" that is not easily conv
I have several things I considered about this topic.
It is, in general, not possible to do some things in one language or another
even if you find a bridge. Python lets you place all kinds of things into a
dictionary including many static objects like tuples or even other
dictionaries. What is all
:47 AM
To: r-help@r-project.org
Subject: Re: [R] Is there a hash data structure for R
On 03-11-2021 00:42, Avi Gross via R-help wrote:
>
> Finally, someone mentioned how creating a data.frame with duplicate
> names for columns is not a problem as it can automagically CHANGE them
>
Gabrielle,
Why would you expect that to work?
rbind() binds rows of internal R data structures that are some variety of
data.frame with exactly the same columns in the same order into a larger object
of that type.
You are not providing rbind() with the names of variables holding the info but
I am not clear why Python came up on this forum. Yes, you can do all sorts of
stuff in Python (or other languages) in ways similar or not to doing them in R.
The topic here was reading in data from multiple CSV files and I saw no mention
about whether some columns are supposed to be of type char
Rich,
I think many here may not quite have enough info to help you.
But the subject of multiple plots has come up. There are a slew of ways,
especially in the ggplot paradigm, to make multiple smaller plots into a
larger display showing them in some number of rows and columns, or other
ways. Some
Behalf Of Rich Shepard
Sent: Thursday, November 11, 2021 8:50 AM
To: r-help@r-project.org
Subject: Re: [R] ggplot2: multiple box plots, different tibbles/dataframes
On Wed, 10 Nov 2021, Avi Gross via R-help wrote:
> I think many here may not quite have enough info to help you.
Avi,
Actua
Rich,
I think we have suggested something several times that you ignore as you are
focused on your way of thinking.
If you read the last part of the letter I wrote in public, I suggest
combining your multiple dataframes into one if they are compatible and
including a new column called something l
As I replied to Rich privately for another message, I suggest that you may
well be able to fit what you need in memory, if careful. But my main point
is that when you have so much data, you do not need all of it to make a
representative graph. A boxplot made using 100,000 data points may well have
d
Sent: Thursday, November 11, 2021 1:25 PM
To: r-help@r-project.org
Subject: Re: [R] ggplot2: multiple box plots, different tibbles/dataframes
On Thu, 11 Nov 2021, Avi Gross via R-help wrote:
> Say I have a data.frame with columns called PLACE and MEASURE others.
> The one I call PLACE would
As noted, this is not the place to ask about dplyr but the answer you may
want is perhaps straight R.
If you have a list called weekdays and you know you do not want to take the
fifth, then indexing with -5 removes it:
> weekdays <- list("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
> weekdays[
This seems to be a topic that comes up periodically. The various ways in R
and other packages for reading in data often come with methods that simply
guess wrong or encounter one or more data items in a column that really do
not fit so fields may just by default become a more common denominator of
Several recent questions and answers have made me look at some code and I
realized that some functions may not be great to use when you are dealing
with very large amounts of data that may already be getting close to limits
of your memory. Does the function you call to do one thing to your object
On Sun, 28 Nov 2021 at 06:57, Avi Gross via R-help <r-help@r-project.org> wrote:
Several recent questions and answers have made me look at some code and I
realized that some functions may not be great to use when you are dealing
with very large amounts of data that may alread
Stephen,
Although what is in the STANDARD R distribution can vary several ways, in
general, if you need to add a line like:
library(something)
or
require(something)
and your code does not work unless you have done that, then you can imagine
it is not sort of built in to R as it starts.
Having s
The right answer obviously depends on the REQUIREMENTS and they may not have
been fully stated.
This is a bit like finding the mode of a set of numbers. The most frequent
value may not be as representative of the data as the mean or even the median
for some purposes, as well as other measures o
Jeff,
I am wondering what it even means to do what you say. In a compiled
language, I can imagine wrapping up an executable along with some kind of
run-time image (which may actually contain the parts of the executable that
includes what has not run yet) and revive it elsewhere.
But even there, h
I think Rich has shared aspects of the data before and may have forgotten we
want something here and now.
Besides a small sample of what the relevant columns look like and a
suggestion of what he wants some new column to look like, we probably need
more to understand what he wants.
The issue coul
Milu,
Your data seems to be very consistent in that each value of ID has eight
rows. You seem to want to just sum every four so that fits:
ID Date Value
1 A 4140 0.000207232
2 A 4141 0.000240141
3 A 4142 0.000271414
4 A 4143 0.000258384
5 A 4144 0.000243640
6 A 4145 0.0002714
Stephen,
You can sort using sort() either before or after doing a unique. Unique
removes all duplicates in any order so sorting before may be wasteful. In
your data shown below, do this:
sort(unique(Data[1]))
sort(unique(Data[2]))
sort(unique(Data[3]))
sort(unique(Data[4]))
Even simpler is to de
Strategy Consultant
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com
On 12/20/21 12:15 PM, Avi Gross via R-help wrote:
Stephen,
You can sort using sort() either before or after doing a unique. Unique
removes all duplicates in any order so sorting before may be wasteful. In
Duncan and Martin and ...,
There are multiple groups where people discuss R and this seems to be a help
group. The topic keeps coming up as to whether you should teach anything other
than base R and I claim it depends.
Many packages are indeed written mostly using base R or using other package
Stephen,
Languages have their own philosophies and are often focused initially on doing
specific things well. Later, they tend to accumulate additional functionality
both in the base language and extensions.
I am wondering if you have explained your need precisely enough to get the
answers you
Duncan,
Let's not go there discussing the trouble with tibbles when the topic asked how
to do things in more native R.
The reality is that tibbles when used in the tidyverse often use somewhat
different ways to select what columns you want including some quite
sophisticated ones like:
se
I wonder if the package Adrian Dușa created might be helpful or point you along
the way.
It was eventually named "declared"
https://cran.r-project.org/web/packages/declared/index.html
With a vignette here:
https://cran.r-project.org/web/packages/declared/vignettes/declared.pdf
I do not know