On Tue, Jun 28, 2016 at 10:08 AM Hedieh Ebrahimi <heml...@gmail.com> wrote:
> File 1 has :
> x1,y1,z1
> x2,y2,z2
> ....
>
> and file2 has :
> x1,y1,z1,value1
> x2,y2,z2,value2
> x3,y3,z3,value3
> ...
>
> I need to read the coordinates from file 1 and then interpolate a value
> for these coordinates on file 2 to the closest coordinate possible. The
> problem is file 2 has around 5M lines. So I was wondering what would be
> the fastest approach?

Is this a one-time task, or something you'll need to repeat frequently?
How many points need to be interpolated? How do you define distance?
Euclidean 3d distance? K-nearest?

5 million rows can probably fit into memory (roughly 160 MB as float64
for four columns), so it's not so bad.

NumPy is a good option for broadcasting the distance function across all
5 million labeled points for each unlabeled point. Given that file format,
NumPy can probably read from file directly into an array.

http://stackoverflow.com/questions/3518778/how-to-read-csv-into-record-array-in-numpy
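
For what it's worth, here is a minimal sketch of the broadcasting idea. It
assumes plain Euclidean 3D distance and a nearest-neighbour lookup (taking
the value of the single closest point) rather than true interpolation, since
the distance definition is still an open question. The function name
nearest_values and the tiny in-memory "files" are made up for illustration;
with real data you would pass the two file paths to np.loadtxt instead.

```python
import io

import numpy as np


def nearest_values(query_xyz, labeled_xyz, labeled_values):
    """Return, for each query point, the value of the closest labeled point.

    Hypothetical helper: assumes Euclidean distance in 3D.
    """
    out = np.empty(len(query_xyz), dtype=labeled_values.dtype)
    for i, point in enumerate(query_xyz):
        # Broadcasting: one (3,) point is subtracted from the whole (N, 3) array.
        diff = labeled_xyz - point
        # Squared distance is enough to find the minimum; no sqrt needed.
        sq_dist = np.einsum("ij,ij->i", diff, diff)
        out[i] = labeled_values[np.argmin(sq_dist)]
    return out


# Stand-ins for file 1 (x,y,z) and file 2 (x,y,z,value); the data is invented.
file1 = io.StringIO("0.1,0.0,0.0\n4.9,5.2,5.0\n")
file2 = io.StringIO("0,0,0,10.0\n5,5,5,20.0\n9,9,9,30.0\n")

queries = np.loadtxt(file1, delimiter=",", ndmin=2)
labeled = np.loadtxt(file2, delimiter=",", ndmin=2)

result = nearest_values(queries, labeled[:, :3], labeled[:, 3])
print(result)
```

Each pass through the loop scans all labeled points, so the cost grows with
(points in file 1) x (points in file 2). That is probably fine for a one-time
task with a modest number of query points; whether it is fast enough for
repeated runs depends on the answers to the questions above.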