Re: [R] Mailinglist

Rui Barradas Sun, 06 Jan 2019 09:28:43 -0800

Hello,

In many continental European countries, such as mine, the function to use is


read.csv2

It defaults to

sep = ";", dec = ","

Note that these functions are in fact calls to read.table with special default arguments. Another default that changes is header = TRUE. You might also want to set stringsAsFactors = FALSE since the default value TRUE is a common source of errors.
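
A minimal sketch of the point above, assuming a made-up three-column sample rather than the real file (the column names and values are invented for illustration):

```r
# Made-up sample in the continental European CSV style:
# fields separated by ";" and decimals written with ",".
csv_text <- "participant;probe;value
1;ActivityProbe;0,5
2;ScreenProbe;1,25"

# read.csv2() defaults to sep = ";", dec = "," and header = TRUE.
dat <- read.csv2(text = csv_text, stringsAsFactors = FALSE)

# The same import spelled out with read.table(), which read.csv2() wraps.
dat2 <- read.table(text = csv_text, header = TRUE, sep = ";", dec = ",",
                   stringsAsFactors = FALSE)

# Inspect the column types: 'value' should come back numeric, not character.
str(dat)
```

For a file on disk, the first argument would be the file path instead of text = csv_text.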


Hope this helps,

Rui Barradas

At 16:45 on 06/01/2019, Michael Dewey wrote:

Dear Rachel

Not sure if this is going to help, but if it is a csv file then read.csv() is your friend. Read the help first in case you need to specify what is being used for the decimal point and the separator; if the file is from the Netherlands, they may not be the default settings.


michael

On 06/01/2019 16:37, Rachel Thompson wrote:

Hi Jeff,

Thanks for your email.

I am an intern from Amsterdam and I have to do an analysis in R. I
spoke to my professor in Amsterdam and my supervisors here in Boston,
but they are too busy to help. I informed them from the start that I am
not familiar with R (RStudio) and they told me that I would receive
guidance. So since they cannot help me, I decided to share my problem
online.
(It is a CSV file imported into R)

Please understand that I am new to this. I will unsubscribe from the
mailing list if my question does not belong here.

Thanks,

Rachel

On Sun, Jan 6, 2019 at 11:01 AM Jeff Newmiller <jdnew...@dcn.davis.ca.us>
wrote:

I would not want to leave the impression that I think the task at hand is
merely tedious... my point is that there are numerous steps involved and
each step depends on information that has not been communicated to the
list, and there is a learning curve even in knowing what to include in an
email question. What I do think is that knowing enough basic R syntax to
express small bits of the problem in R will be a vast improvement over
attempting to use only English descriptions, and Rachel has to bridge that
initial gap.

For example, some images of data were apparently sent to Jim only, yet he
still does not know in what format the data file is stored, so that
technique was not very effective. One way for the question to become more
focused is for Rachel to study up on her own how to import data and provide
us with a "dput" (see the StackOverflow discussion I referenced before) of
a small sample of data. Another is for Rachel to use basic R syntax to
create an anonymous data set from scratch (also outlined in the SO
discussion). These approaches allow us to keep the focus of our mailing
list discussion on manipulating the data into summaries. Another approach
is to re-focus the question on importing data by supplying a download link
to the data so we can make suggestions as to what R commands will handle
this data in its raw form. In any case, we cannot leapfrog over the data to
the analysis as the question stands.
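
A minimal sketch of the two approaches described above, assuming invented column names and values (nothing here comes from the real data set):

```r
# Anonymous data set built from scratch; names and values are hypothetical.
sample_data <- data.frame(
  ParticipantID = c(1L, 1L, 2L, 2L, 3L, 3L),
  Probe = c("ActivityProbe", "ScreenProbe", "SMSProbe",
            "CallLogProbe", "ActivityProbe", "ScreenProbe"),
  Value = c("high", "TRUE", "received", "incoming", "none", "FALSE"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; that printed text is
# what gets pasted into a plain-text email.
dput(head(sample_data, 4))

# Round trip: evaluating the dput() text rebuilds the same data frame,
# which is what a reader on the list would do with the pasted output.
dput_text <- paste(capture.output(dput(sample_data)), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))
```

With real data, head() on the imported data frame keeps the sample small enough to post.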

Given the above, I have to wonder why Rachel hasn't simply used the tool
she is familiar with... SPSS... to do this? If it is because this is an
academic assignment to learn R then she should be talking to her
institutional support (instructor/teaching assistant/tutoring staff) anyway
since there is a no-homework policy on this list (and that avenue would
have the benefit of being conducted orally and most likely in her native
language).

On January 6, 2019 1:12:46 AM PST, Jim Lemon <drjimle...@gmail.com> wrote:

Hi Rachel,
It looks to me as though the first thing you want to do is to get your
data, which you attach as images, into a data frame. If these are flat
files like CSV or TAB, you should be able to read them in with some
variant of the read.table function. If Excel, look at the various
Excel import packages. Then you can operate on the data frame by doing
things like tabulating Participant ID against the code for SMS or call
(which I assume are those 3000+ numbers). You can take the differences
in what look like POSIX time values between successive TRUE and FALSE
screen values to get the duration of screen activity and it looks like
participant activity is recorded at regular intervals. As Jeff
suggested, this is really just boring work figuring out how to extract
the events:

call_indices <- which(Probetype == "xxxxxxCallLogProbe" &
                      ValueSpecified == "_id" & Valuedetailed == 3271)

using suitable logical statements and then tabulating them by
ParticipantID. If you know how to do that in SPSS, it won't be too
hard to translate the logical statements into R syntax as above. I may
have misunderstood the variable names, but I think the logic is clear.
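
A minimal sketch of that select-then-tabulate logic on an invented data frame. The column names follow the guesses above and the values are made up:

```r
# Hypothetical event log; one row per recorded probe event.
dat <- data.frame(
  ParticipantID = c(1, 1, 2, 2, 2, 3),
  Probetype = c("CallLogProbe", "SMSProbe", "CallLogProbe",
                "CallLogProbe", "ScreenProbe", "SMSProbe"),
  stringsAsFactors = FALSE
)

# Row numbers of the call events, selected with a logical condition.
call_indices <- which(dat$Probetype == "CallLogProbe")

# Tabulate those rows by participant. Turning ParticipantID into a factor
# with all levels keeps participants with zero calls in the table.
participants <- factor(dat$ParticipantID[call_indices],
                       levels = unique(dat$ParticipantID))
calls_per_participant <- table(participants)

print(calls_per_participant)
```

The same pattern applies to the SMS and screen events by changing the condition inside which().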

Jim

On Sun, Jan 6, 2019 at 4:07 PM Rachel Thompson
<rachel.thomp...@student.uva.nl> wrote:


Hi Jim,

Thank you for the clarification. Since I only work in SPSS and I am
from Amsterdam, I have had problems with specifying what I am trying to
do in this specific program and also in clear English.


I think I do indeed want to aggregate these events for each subject over
the observation. But in this case several observations.

1. I want to have a summary of how many times a specific subject got
   called (CallLogProbe)
2. I want to have a summary of how many times a specific subject got
   a text message (SMS probe)
3. I want to have a summary of how many times a specific subject
   - Turned their screen on - True (ScreenProbe)
   - Or did not turn their screen on - False (ScreenProbe)
4. I want to have a summary of the activity level of a specific
   subject
   - Activity level - none (ActivityProbe)
   - Activity level - low (ActivityProbe)
   - Activity level - high (ActivityProbe)

I want to do this for all the 36 subjects(Participants).

In the end, I have to define percentages, so I am able to say...
Subject 36 has low social interactions (because they only got
called and texted 500 times in total, while the average of all the
participants is 10000 or something). I have to come up with the
percentages myself and define cutoff points of what is considered
low-medium-high, based on what the results of all the subjects are.
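
A minimal sketch of one way such percentages and cutoffs could be computed, assuming invented totals and an assumed tercile rule. Neither the numbers nor the rule comes from the real study:

```r
# Invented totals of calls plus texts per participant (not real data).
totals <- c(P01 = 500, P02 = 9000, P03 = 12000,
            P04 = 15000, P05 = 10500, P06 = 13000)

# Each participant's total as a percentage of the group mean.
pct_of_mean <- 100 * totals / mean(totals)

# Assumed cutoffs: terciles of the observed totals. Any other rule
# (fixed thresholds, for example) could be substituted here.
breaks <- quantile(totals, probs = c(0, 1/3, 2/3, 1))
level <- cut(totals, breaks = breaks,
             labels = c("low", "medium", "high"),
             include.lowest = TRUE)

# One row per participant with the total, the percentage and the label.
result <- data.frame(total = totals,
                     pct_of_mean = round(pct_of_mean, 1),
                     level = level)
print(result)
```

Here include.lowest = TRUE keeps the participant with the smallest total inside the first interval.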


I hope that I am being as clear as possible.


I feel as if I am on my way to understanding it, but since I do not
clearly know, I am trying out a lot of different code and I do
not know if I am doing the right thing. I did indeed make a new data
frame, but I still feel a bit lost. Do I need to make one per subject or
per probe?



Thanks for your help. I hope that you can help me resolve this issue.


Best,


Rachel






On Sat, Jan 5, 2019 at 9:03 PM Jim Lemon <drjimle...@gmail.com> wrote:


Hi Rachel,
I'll take a guess and assume that you are monitoring the mobile phones
of 36 people, adding an observation every time some specified change
of state is sensed on each device. I'll also assume that you are only
recording four types of measurement. It seems that you want to
aggregate these events for each subject over the interval of
observation (or over each day or something). I think you are going to
create a new data frame of these summaries from the one you have of
individual observations. Creating each summary doesn't look too hard,
but you will have to define more precisely what you want those
summaries to be. For instance, "I want the mean activity level for
each subject during the overall time that their mobile phone is
switched on". Once you have clearly defined your goals, it probably
won't be too hard to get to them.
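
A minimal sketch of building such a summary data frame from individual observations, assuming an invented event log with hypothetical column names:

```r
# Invented event log: one row per observation (not real data).
events <- data.frame(
  ParticipantID = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  Probe = c("ActivityProbe", "ActivityProbe", "ScreenProbe",
            "ActivityProbe", "SMSProbe",
            "ActivityProbe", "ActivityProbe", "ActivityProbe",
            "CallLogProbe"),
  Value = c("high", "none", "TRUE", "low", "received",
            "low", "low", "high", "incoming"),
  stringsAsFactors = FALSE
)

# Keep only the activity rows and fix the order of the levels.
activity <- events[events$Probe == "ActivityProbe", ]
activity$Value <- factor(activity$Value, levels = c("none", "low", "high"))

# Counts of each activity level per participant.
counts <- table(activity$ParticipantID, activity$Value)

# Row-wise percentages: each participant's row sums to 100.
percent <- round(100 * prop.table(counts, margin = 1), 1)

# The new data frame of summaries, one row per participant.
summary_df <- as.data.frame.matrix(counts)
print(summary_df)
print(percent)
```

The same three steps (subset, table, prop.table) would apply to the screen, SMS and call probes.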

Jim

On Sun, Jan 6, 2019 at 5:39 AM Rachel Thompson
<rachel.thomp...@student.uva.nl> wrote:


Dear Mr/Mrs,

This is my first time working in R studio.
I have a database of 36 participants but it has 150600 entries.
Column -         Column - Column            - Column
Participant       Activityprobe - Activity Level  - High/low/none
Participant       Screenprobe - screenon/off     -
Participant       SMSprobe etc
Participant       CallLogProbe etc.

I need code that helps me count the activity level of all the
participants: high activity level, no activity level and low activity
level. And to help me find out for every participant what the
percentages are of all their high/no/low activity.

For screenprobe I need to count how many times the participant turned
their screen on and how many times they turned it off and the
percentage of screen on/off.

For callLog I need to count how many times each participant got called
and the percentage.

For SMS I need to count the number of SMS for each participant and
their percentage.

I also need to categorize the probes. So that my database shows all the
activity levels first, organized by none/high/low and then all the
screenprobes, organized by on and off etc...

I hope that my description is clear and that you can maybe help me.


Best,

Rachel

         [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




--
Sent from my phone. Please excuse my brevity.




