Alternatively, you could hack up a small Perl program, using the pspp.pm module to read the file and write it out however you wish. That would also be far more efficient than using pspp to do the task.

On Wed, Dec 04, 2019 at 09:09:34AM -0800, Ben Pfaff wrote:
> That *is* higher than I would expect. Do you see less disk activity if
> you use the "pspp-convert" program? It does not have the exact feature
> you want (in particular the /CELLS=LABELS part) but it is better
> optimized in general for that particular task.
>
> On Wed, Dec 4, 2019 at 4:42 AM Dave Trollope <d...@knowledgehound.com> wrote:
> >
> > We just moved Pspp to Kubernetes containers where we use it to extract
> > csvs from sav files. The sav files are about 1gb and each csv is about
> > 150mb.
> >
> > We've watched the file system as it does it and over 7gb of the file
> > system is used while writing 150mb. I assume the SAVE command is doing
> > lots of seeks and insertions in the file magnifying the file system
> > usage. Any options to limit this behavior?
> >
> > Here is the script we are using
> > GET FILE = "{}"
> >
> > SAVE TRANSLATE
> > /OUTFILE="{}"
> > /TYPE=CSV
> > /FIELDNAMES
> > /REPLACE
> > /KEEP={}
> > /MISSING=RECODE
> > /CELLS=LABELS.
> > Cheers
> > Dave
> >

-- 
Avoid eavesdropping. Send strong encrypted email.
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3