In our cluster testing, we allocate at least 16G to each recon-all command (see below). We run about 150 of those commands in parallel on the cluster during off-peak usage and it takes about 12 hours for them to complete.
$ jobsubmit -A fsm -p basic -t 1-00:00:00 -m 16G "recon-all -all -s 0042 -I /path_to_subject_0042/001.mgz"

The 8G figure for recon-all in the system requirements (https://surfer.nmr.mgh.harvard.edu/fswiki/SystemRequirements) is intended for a user's standalone machine, where memory is limited and there are no time or job resource limits, so a single recon-all might take 15 hours to run.

- R.

On Mar 28, 2024, at 15:37, Dong, Yilei <yid...@health.ucsd.edu> wrote:

Hi Yujing,

For context, the way I have recon-all set up is that we have a job submission script that runs recon-all for one image. Given a folder of 100 MRI images, another script calls the recon-all script for each image in the folder via a for loop. The result is 100 jobs running in parallel on the cluster. I have attached a screenshot of the parameters we set whenever we submit a recon-all job for an image through SLURM to UCSD's cluster.

The maximum time we are allowed for each job is 48 hours. However, my recon-all for all 100 subjects takes on average 4.5 to 6 hours to run. Each image is also allocated 8 GB of memory. Our jobs are shared-node jobs, which means we run more than one job on a single node.

Should we allocate more than 8 GB of memory to each job, or just reduce the number of jobs we submit at a time if 100 is a lot? We were able to successfully run the entire recon-all for 5 subjects in parallel in the past.
Sincerely,
Yilei

________________________________
From: freesurfer-boun...@nmr.mgh.harvard.edu on behalf of Huang, Yujing <yhuan...@mgh.harvard.edu>
Sent: Thursday, March 28, 2024 11:53 AM
To: Freesurfer support list <freesurfer@nmr.mgh.harvard.edu>
Subject: Re: [Freesurfer] recon-all nu_correct error

Hi Yilei,

I think "nu_correct: Command not found. FREESURFER: Undefined variable." is unrelated to your recon-all not finishing. That error happens at the beginning of recon-all, and recon-all uses ANTS N4BiasFieldCorrection for nu correction instead. We have addressed this error in our dev version.

I'm wondering if your cluster enforces some time, memory, or other limit on your job.

Best,
Yujing

From: freesurfer-boun...@nmr.mgh.harvard.edu On Behalf Of Dong, Yilei
Sent: Thursday, March 28, 2024 2:18 PM
To: Freesurfer support list <freesurfer@nmr.mgh.harvard.edu>
Subject: [Freesurfer] recon-all nu_correct error

Hi Freesurfer support,

Recently, I submitted a script to run recon-all on 100 MRI images on UCSD's computing cluster. I get an error file that says "nu_correct: Command not found. FREESURFER: Undefined variable." I previously emailed about this problem and was told to send my recon-all.log if running recon-all did not work a second time. I was also told previously that my cluster's Freesurfer can't locate the MINC toolkit binaries on my system. How can this error be troubleshot? Please let me know if you need any other information.
Freesurfer version: 7.2.0, but already available as module
Platform: Rocky Linux release 8.8 (Green Obsidian)
uname -a: Linux login01 4.18.0-477.15.1.el8_8.x86_64 #1 SMP Wed Jun 28 15:04:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Recon-all log: see attached

Sincerely,
Yilei

<Screen Shot 2024-03-28 at 12.01.54 PM.png>
_______________________________________________
Freesurfer mailing list
Freesurfer@nmr.mgh.harvard.edu
https://mail.nmr.mgh.harvard.edu/mailman/listinfo/freesurfer
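For anyone reproducing the per-image submission loop discussed in this thread, the following is a minimal Python sketch, not the script either poster actually uses. It only builds the `sbatch` argument lists and submits nothing. The directory layout (`<root>/<subject>/001.mgz`), the function names, and the default limits are assumptions for illustration; the 16G and 1-day values mirror the `jobsubmit` example above. It uses recon-all's documented `-i` input flag and standard Slurm options (`--mem`, `--time`, `--job-name`, `--wrap`).

```python
import shlex
from pathlib import Path


def build_sbatch_command(image: Path, mem: str = "16G",
                         time_limit: str = "1-00:00:00") -> list[str]:
    """Build (but do not run) the sbatch call for one recon-all job."""
    # Assumed layout: the parent directory name is the subject ID.
    subject = image.parent.name
    recon = (f"recon-all -all -s {shlex.quote(subject)} "
             f"-i {shlex.quote(str(image))}")
    return [
        "sbatch",
        f"--job-name=recon_{subject}",
        f"--mem={mem}",          # memory requested per job
        f"--time={time_limit}",  # wall-clock limit, D-HH:MM:SS
        f"--wrap={recon}",       # let sbatch wrap the command in a script
    ]


def build_all(root: Path, mem: str = "16G") -> list[list[str]]:
    """One sbatch command per subject image found under root."""
    return [build_sbatch_command(img, mem=mem)
            for img in sorted(root.glob("*/001.mgz"))]
```

Printing each command with `shlex.join(cmd)` before passing it to `subprocess.run(cmd, check=True)` makes it easy to confirm the memory and time requests first. If 100 simultaneous jobs strain shared nodes, a Slurm job array with a concurrency cap (for example `--array=0-99%20`) is one alternative to a plain loop.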