Re: Pspp-users Digest, Vol 128, Issue 3

Dr. Oliver Walter Thu, 19 Jan 2017 14:55:47 -0800

I also use the syntax in the way you described, Alan. Hence, it is an argument in favour of implementing some important commands in the GUI. This morning I wanted to merge some files, but because the command is not implemented in the GUI and not easy to understand in the manual, I stopped using PSPP and used R instead, because its "R Commander" has some options for merging files. This is the reason why I do not like comments such as "learn syntax, please", which is how some other comments could be understood. If we start to make such comments we could also say: "Use R instead of PSPP, please".


Oliver Walter



On 19.01.2017 at 21:40, Alan Mead wrote:

That's a good question. When I learned the syntax, it was the only way to do it. There are some good resources available online to address specific questions. I usually google something like "spss syntax compute" and that produces a lot of hits for most questions.
I should say that these days, I do use the GUI to generate the syntax for virtually all analyses. What I do is to navigate to the dialog for the analysis that I want, add some random variables (for most analyses, you have to find numeric variables), select the options I want and then paste the syntax. I then edit it to include the variables I want. If the analysis is just a variable or two and they are easily found, then it's just as easy to simply pick them in the dialog. But if there are many variables and especially if they are arranged in the dataset contiguously, then it's far easier to edit the syntax to change the randomly-selected variables to something like "X to Y" (which includes all columns between variable X and variable Y in the dataset, including X and Y).
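As a rough sketch of what that edited paste might look like: the variable names q1 and q10 are made up for illustration, and the exact subcommands a dialog pastes may differ from this.

```spss
* Pasted from the dialog with a placeholder variable, then edited by hand.
* q1 and q10 are hypothetical names; TO takes every column from q1 through q10.
DESCRIPTIVES
  /VARIABLES = q1 TO q10
  /STATISTICS = MEAN STDDEV MIN MAX.
```

Because TO follows the column order of the dataset, this only picks up the intended items if they really are contiguous.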
I personally also use the GUI exclusively to generate syntax for reading in raw data (reading SAV files with GET is pretty trivial). I try hard to analyze tab-delimited files with the variable names on top and I've found that those are usually read very well into PSPP/SPSS. I paste the syntax generated by the GUI so I can edit it if needed. For example, sometimes it will guess wrong about a variable type. Or I might want to manually change a variable name so it matches another file.
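For illustration, syntax for reading a tab-delimited file with a header row could look roughly like the following. The file name, variable names, and formats are all assumptions, not something the GUI is guaranteed to paste in this form.

```spss
* Hypothetical file survey.tsv, tab-delimited, variable names on line 1.
* FIRSTCASE=2 skips the header line; names and formats are given by hand.
GET DATA
  /TYPE=TXT
  /FILE='survey.tsv'
  /ARRANGEMENT=DELIMITED
  /DELIMITERS="\t"
  /FIRSTCASE=2
  /VARIABLES=
    id F8.0
    region F2.0
    comment A40.
```

The hand edits mentioned above would go in the VARIABLES list: changing a format such as A40 to F8.0 when the type was guessed wrong, or renaming a variable so it matches another file.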
So, I write syntax mainly for data manipulation (finding bad data, scoring, creating new variables from input data, etc.) and that includes a fairly small number of statements:
execute.
count
compute
recode
value label
variable label
temporary
select if
do if ... else ... end if.
sort cases
get
save
write
format
merge files
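The merging step in that list is handled by MATCH FILES (to add variables for the same cases) or ADD FILES (to stack cases). A minimal sketch of the first, assuming two hypothetical files that share a key variable named id and are both already sorted by it:

```spss
* scores.sav, demographics.sav and id are hypothetical names.
* Both input files must already be sorted by id.
MATCH FILES
  /FILE='scores.sav'
  /FILE='demographics.sav'
  /BY id.
SAVE OUTFILE='merged.sav'.
```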
Maybe I'm digressing. And maybe these comments are mainly for people who will be doing long, complex, or very important analyses. But I highly recommend using syntax (whether generated by GUI or by hand) for all analyses. For one thing, it's self-documenting (if you save it)... You can go back later and see exactly what you did (e.g., what variables were included in "INDEX"? How was that Likert scale scored? What regions went into that market segment?) and if you find a problem, you already have all the syntax you need to re-run the analysis. In fact, PSPP produces readable output, but with SPSS, if you don't have a copy of SPSS then you won't be able to read the output file or the SAV file. So, the syntax is the only file that will be readable (syntax files are just text files; you can open them with your favorite text editor or Notepad/Wordpad on Windows). If you did an SPSS analysis and saved just the output and SAV data, you might not be able to read either file years (or months) from now when you no longer have SPSS.
I also think you should avoid ever modifying existing variables, so that you can re-run your syntax to reproduce an analysis. (You could also never over-write a SAV file, so that the modified variables become part of a new SAV file, but this is fraught with peril and tends to lead to a series of undocumented but indistinguishable datasets, DATA1.SAV, DATA2.SAV, etc.... Far better to document your analysis in syntax and avoid modifying existing variables by creating new ones.)
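A small sketch of scoring into new variables rather than over-writing the originals. The item names, the 1-5 scale, and the choice of which item to reverse are assumptions made up for the example.

```spss
* item1 to item3 are hypothetical 1-5 Likert items; item3 is negatively worded.
* The reversed copy goes into a new variable, so item3 itself is never changed.
RECODE item3 (1=5) (2=4) (3=3) (4=2) (5=1) INTO item3_r.
VARIABLE LABELS item3_r 'Item 3, reverse scored'.
COMPUTE sat_total = item1 + item2 + item3_r.
EXECUTE.
```

If the RECODE fails for any reason, item3_r does not exist and the COMPUTE stops with an error instead of quietly summing unreversed data.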
Sometimes, you can also re-use old syntax (if you analyze similar datasets frequently).
Also, I recommend that when feasible (and sometimes it simply isn't), you should avoid using SAV files. Or only use them as temporary files, not as permanent storage of data. Instead, your analysis should begin by reading a "raw" data source and then do the whole analysis. The reason is that you cannot tell what data transformations have been applied to a SAV dataset. Whereas if you read the data from a raw source, you always know that the raw source data is in its known original state. This might not be an issue if your analyses do not require data transformations; but I find that most of my analyses do require a lot. In those cases, this isn't a trivial issue.
Once I had an NSF grant which entailed creating an ethics measure and we used SPSS to score it. It would have been the work of centuries to re-create the scoring (and verify it) through a GUI for each dataset. Instead, I copied a fairly complex chunk of syntax and adapted it to the names of the variables in the current dataset. I had a syntax error in one statement because the number of items had changed. Because it didn't execute, my data were half scored (half unscored) and I compounded the problem by not noticing the error and using the score in a later analysis. If I'd written the syntax to create new scored variables, it wouldn't be possible for my scored variables to be "half scored" ... some of them wouldn't exist. And in that case, the missing variables would have stopped the analysis, instead of allowing erroneous results (from half-scored data) to be produced.
This is definitely a problem in complex analyses or when manipulating data, but I'd argue it's a potential source of error in any analysis that involves any degree of data transformation. IIRC, PSPP distributes a small example dataset with some kind of Likert data (customer satisfaction ratings?) and in some version of that example dataset, one of the items had been reversed (i.e., the Likert responses had been swapped to 1->5, 2->4, 4->2, 5->1) and saved. You cannot tell this from the SAV file (at all). In fact, I'm inferring it from a data analysis, but it's the only possible way that one Likert item could be so different. Garbage in, garbage out and you often cannot verify that a SAV file is not "garbage" unless you've just created it.
You should also be generous in adding comments to your syntax. A comment is a note to the reader/analyst about the syntax and looks like this:
* data cleaning code .
* removed item 12 on 17-jan-2017 because it had a poor ITC .
* this is the composite that worked best out of the four we tried; its R2 was 0.56.
* scoring for the customer satisfaction Likert responses.
I will admit that syntax requires adhering to the rules of PSPP/SPSS syntax. You leave off a period at the end or a quote (or use the wrong quote) and PSPP/SPSS gives you a cryptic error message. I think this is one of the reasons novice PSPP/SPSS users avoid syntax, but I think they are handicapping themselves as a result.
One final thing: one of the main advantages of PSPP is that it's free (i.e., user-editable) software, which includes the manual. So if you have modifications to make the manual clearer or to add examples, I'm sure the developers will be delighted to see your changes/additions.
-Alan



On 1/19/2017 1:11 PM, Aj Hollenbach wrote:
Thanks Alan. What is the best approach, in your opinion, to learning the syntax for these types of expressions? Again, I wholly relied upon the GUI in SPSS. I did take a look at the PSPP manual, but did not immediately see examples of the structure of the syntax.
Thanks,
Allen
On Thu, Jan 19, 2017 at 12:00 PM, <pspp-users-requ...@gnu.org<mailto:pspp-users-requ...@gnu.org>> wrote:
    Today's Topics:

       1. Re: Selecting cases using the "IF" Function (Alan Mead)


    ---------- Forwarded message ----------
    From: Alan Mead <ame...@alanmead.org <mailto:ame...@alanmead.org>>
    To: pspp-users@gnu.org <mailto:pspp-users@gnu.org>
    Cc:
    Date: Wed, 18 Jan 2017 11:10:03 -0600
    Subject: Re: Selecting cases using the "IF" Function
    As Dr. Walter says, syntax is a solution.  The steps would be to
    (1) paste the desired analysis and then (2) edit the syntax to
    insert the "IF" statement above it.

    You also need to decide if you want to "permanently" delete the
    non-selected cases or not.  If I have a long series of analyses,
    I might select cases (say valid cases) and save them (or use a
    filter).  But Hollenbach describes analyzing subsets of the
    dataset and in that case I often find the temporary command to be
    helpful.  The syntax for _each analysis_ would look like this:

    temporary.
    select if( region = 1 or (region=1 and id=3)).
    freq ...

    You would highlight all three statements and run them.  The
    "temporary" command causes the selection to be in effect only for
    the next analysis. You repeat the "temporary" and "select if" for
    each analysis (or, again, use a filter).
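
    The filter alternative mentioned above might look roughly like
    this. The variable in_region and the analysis variable income are
    hypothetical names; region comes from the example above.

    ```spss
    * in_region is 1 for cases to keep and 0 otherwise.
    COMPUTE in_region = (region = 1).
    FILTER BY in_region.
    FREQUENCIES /VARIABLES = income.
    DESCRIPTIVES /VARIABLES = income.
    FILTER OFF.
    ```

    Unlike "temporary", the filter stays in effect for every analysis
    until FILTER OFF, and no cases are deleted from the dataset.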

    BTW, I honestly think just typing the syntax of the "select if"
    is easier than using the GUI.

    -Alan


    On 1/18/2017 9:53 AM, Aj Hollenbach wrote:
    Hi PSPP Users,

    I am transitioning from SPSS to PSPP and am having some troubles
    with case selection. Specifically, under SPSS, I used to be able
    to select cases using a radio button in the Data / Select Cases
    dialogue box that stated "Select if condition is satisfied...".
    However, under PSPP, I have found that this option is not
    available, and that you can only select cases based upon (1) a
    random sample, (2) case range, or (3) a filter variable. In
    other words, there is no option for using the IF function for
    selection purposes. I am attaching screenshots from both programs.

    I greatly appreciate any advice that others might have on how to
    best make a selection of cases using a conditional IF statement.
    In short, I am running analysis of household survey data, but
    only want to use data from a handful of the administrative
    jurisdictions (provinces) within the larger data set.

    Regards,
    Allen

    PS: I am running GNU pspp 0.10.1-g1082b8
--
    Alan D. Mead, Ph.D.
    President, Talent Algorithms Inc.

    science + technology = better workers

    http://www.alanmead.org


    _______________________________________________
    Pspp-users mailing list
    Pspp-users@gnu.org <mailto:Pspp-users@gnu.org>
    https://lists.gnu.org/mailman/listinfo/pspp-users
    <https://lists.gnu.org/mailman/listinfo/pspp-users>




--

Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.

science + technology = better workers

http://www.alanmead.org

I've... seen things you people wouldn't believe...
functions on fire in a copy of Orion.
I watched C-Sharp glitter in the dark near a programmable gate.
All those moments will be lost in time, like Ruby... on... Rails... Time for Pi.

           --"The Register" user Alister, applying the famous
             "Blade Runner" speech to software development

