Hello all, I'm afraid I am new to all this so bear with me...
I am looking to find the statistical significance of the relationship between two large netCDF data sets. First, I've loaded the two files into Python:

    swh = netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/averages/swh_control_concat.nc', 'r')
    swh_2050s = netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/averages/swh_2050s_concat.nc', 'r')

I have then isolated the variables I want to perform the Pearson correlation on:

    hs = swh.variables['hs']
    hs_2050s = swh_2050s.variables['hs']

Here is the metadata for those variables:

    >>> print hs
    <type 'netCDF4.Variable'>
    int16 hs(time, latitude, longitude)
        standard_name: significant_height_of_wind_and_swell_waves
        long_name: significant_wave_height
        units: m
        add_offset: 0.0
        scale_factor: 0.002
        _FillValue: -32767
        missing_value: -32767
    unlimited dimensions: time
    current shape = (86400, 350, 227)

    >>> print hs_2050s
    <type 'netCDF4.Variable'>
    int16 hs(time, latitude, longitude)
        standard_name: significant_height_of_wind_and_swell_waves
        long_name: significant_wave_height
        units: m
        add_offset: 0.0
        scale_factor: 0.002
        _FillValue: -32767
        missing_value: -32767
    unlimited dimensions: time
    current shape = (86400, 350, 227)

Then, to perform the Pearson correlation:

    from scipy.stats.stats import pearsonr
    pearsonr(hs, hs_2050s)

I then get a memory error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
        x = np.asarray(x)
      File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
        return array(a, dtype, copy=False, order=order)
    MemoryError

The same thing happens when I try to create numpy arrays directly from the data.

Does anyone know how I can alleviate these memory errors?

Cheers,
Jamie
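P.S. In case it helps clarify what I'm after: since pearsonr works on 1-D series anyway, I've been wondering whether I could correlate the two variables grid point by grid point, reading only one latitude row of the time series at a time, so that the full arrays (tens of GB once unpacked to floats) never have to sit in memory at once. Below is a rough, untested sketch of what I mean; the helper name is just made up, and I'm assuming netCDF4 applies scale_factor/add_offset automatically when slicing:

    import numpy as np
    from scipy.stats.stats import pearsonr

    def gridwise_pearson(var_a, var_b):
        # var_a, var_b: netCDF4 variables shaped (time, latitude, longitude)
        ntime, nlat, nlon = var_a.shape
        r = np.empty((nlat, nlon))
        p = np.empty((nlat, nlon))
        for i in range(nlat):
            # slicing reads just this latitude row, not the whole variable
            a = var_a[:, i, :]
            b = var_b[:, i, :]
            for j in range(nlon):
                x, y = a[:, j], b[:, j]
                # drop time steps masked as _FillValue before correlating
                ok = ~(np.ma.getmaskarray(x) | np.ma.getmaskarray(y))
                # pearsonr returns (correlation, two-tailed p-value);
                # entirely-masked points (land?) would still need special-casing
                r[i, j], p[i, j] = pearsonr(np.asarray(x[ok]), np.asarray(y[ok]))
        return r, p

    r_map, p_map = gridwise_pearson(hs, hs_2050s)

Would something like that be a sensible direction, or is there a better way?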