Hi,
Yes, working with binary formats is the way to go when you have large data.
But for future reference, Dask [1] fits your use case perfectly; see below how
I process a ~7 GB text file in under 17 seconds (on a laptop: MBP, quad-core,
SSD).
# Create roughly ~7 GB worth of text data.
In [40]: import ...
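The rest of that IPython session is cut off in the archive; as a rough sketch
only (the file name, column names and whitespace delimiter below are
assumptions, not the original demo), a Dask read of such a file could look
like:

import dask.dataframe as dd

# Lazily partition the big whitespace-delimited text file; each partition is
# parsed in parallel by pandas' C parser under the hood.
df = dd.read_csv("huge_file.txt", sep=r"\s+",
                 header=None, names=["id", "x", "y", "z"])

# compute() triggers the actual parallel parsing and returns an ordinary
# pandas DataFrame, from which plain NumPy arrays can be pulled out.
data = df.compute()
ids = data["id"].values
x, y, z = data["x"].values, data["y"].values, data["z"].values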
On 30/11/2016 16:16, Heli wrote:
Hi all,
Writing my ASCII file once to either pickle, npy or hdf format and then
working afterwards on the resulting binary file reduced the read time from
80 minutes to 2 seconds.
240,000% faster? Something doesn't sound quite right! How big is the file?
Hi all,
Writing my ASCII file once to either pickle, npy or hdf format and then
working afterwards on the resulting binary file reduced the read time from
80 minutes to 2 seconds.
Thanks everyone for your help.
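As a rough sketch of that one-time conversion via the npy route (file names
are placeholders):

import numpy as np

# One-time slow pass: parse the ASCII file (this is the ~80 minute step).
data = np.loadtxt("huge_file.txt")

# Store the parsed array in NumPy's binary .npy format.
np.save("huge_file.npy", data)

# Every later run loads the binary file directly -- the ~2 second step.
# (np.load("huge_file.npy", mmap_mode="r") would memory-map it instead.)
data = np.load("huge_file.npy")
ids, x, y, z = data[:, 0], data[:, 1], data[:, 2], data[:, 3]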
On Wed, 30 Nov 2016 01:17 am, Heli wrote:
> The following line, which reads the entire 7.4 GB file, increments the
> memory usage by 3206.898 MiB (3.36 GB). The first question is: why does it
> not increment the memory usage by 7.4 GB?
>
> f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0)
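A quick way to see where a figure like 3.36 GB comes from is to compare the
size of the text file on disk with the size of the parsed array in memory;
each parsed value is a fixed 8-byte float64, however many characters it
occupied in the file. A small check (file name is a placeholder):

import os
import numpy as np

f = np.loadtxt("myfile.txt")   # whitespace-delimited text, as with delimiter=None

print("file on disk :", os.path.getsize("myfile.txt"), "bytes")
print("array in RAM :", f.nbytes, "bytes  (= rows * cols * 8)")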
Hi all,
Let me update my question: I have an ASCII file (7 GB) which has around 100M
lines. I read this file using:
f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0)
x=f[:,1]
y=f[:,2]
z=f[:,3]
id=f[:,0]
I will need the x, y, z and id arrays later for interpolations. The problem
is that reading the file takes around 80 minutes, while the interpolation
only takes 15 minutes.
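One commonly used alternative to np.loadtxt for files of this size is
pandas' read_csv, whose C parser is typically much faster on plain text; a
rough sketch (file name is a placeholder, whitespace-delimited columns
assumed):

import pandas as pd

# read_csv's C parser handles whitespace-delimited numeric text and is
# usually far faster than np.loadtxt on large files.
df = pd.read_csv("myfile.txt", sep=r"\s+", header=None)

data = df.values          # plain 2-D NumPy array, same layout as loadtxt's
ids = data[:, 0]
x, y, z = data[:, 1], data[:, 2], data[:, 3]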
Hi,
I have a huge ASCII file (40 GB) with around 100M lines. I read this file
using:
f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0)
x=f[:,1]
y=f[:,2]
z=f[:,3]
id=f[:,0]
I will need the x, y, z and id arrays later for interpolations. The problem
is that reading the file takes around 80 min while the interpolation only
takes 15 mins. I was wondering if there is a more optimized way to read the
file that would reduce the read time.
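If the file happens to contain more columns than the four being used, a
small variation of the same np.loadtxt call parses only those columns and
returns them directly as separate arrays (only a sketch of the existing
options, not a large speed-up by itself):

import numpy as np

# usecols limits parsing to the listed columns; unpack=True returns them as
# separate 1-D arrays instead of one big 2-D array to slice afterwards.
ids, x, y, z = np.loadtxt("myfile.txt", usecols=(0, 1, 2, 3), unpack=True)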