submit jobs on multi-core
Dear all, I have a Python script that takes a list of input files, processes them one at a time, and returns a number as output for each file. I currently submit the files to the script one by one in a for loop. My computer has 8 cores. Is there any way I could run 8 jobs at a time and get all the output faster? In other words, how can I modify my script so that 8 jobs run together on 8 different processors? I am a bit new to this, so please suggest some directions. Thank you. -- Joshi
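A common pattern for this is a worker pool: multiprocessing.Pool starts one worker process per core and farms the files out to them. A minimal sketch (Python 3 syntax; one_file and the file names are hypothetical stand-ins for the real per-file script):

import glob
from multiprocessing import Pool

def one_file(path):
    # stand-in for the real per-file computation; returns one number per file
    with open(path) as fh:
        return len(fh.read())

if __name__ == '__main__':
    files = glob.glob('*.dat')          # the list of input files
    with Pool(processes=8) as pool:     # one worker process per core
        results = pool.map(one_file, files)   # 8 files in flight at a time
    print(results)

pool.map returns the outputs in the same order as the inputs, so the results line up with the file list just as they would in the original for loop.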
How to queue functions
Dear all, I am trying to use the multiprocessing module. I have 5 functions and 2000 input files. First, I want to make sure that these 5 functions execute one after the other for each file. Is there any way to queue these 5 functions within the same script? Next, since there are 2000 input files, I can queue them with queue.put() and pull them back one by one with queue.get(), as follows:

for file in files:
    if '.dat.gz' in file:
        q.put(file)

while True:
    item = q.get()
    x1 = f1(item)
    x2 = f2(x1)
    x3 = f3(x2)
    x4 = f4(x3)
    final_output = f5(x4)

However, how can I feed these to my 8-core machine so that 8 files are processed at a time (each going through the set of 5 functions, one after the other)? I am a bit confused by multiprocessing; please suggest some directions. Thank you -- Joshi
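One way to get both behaviours at once is to wrap the five stages in a single pipeline function, which forces them to run strictly in order for each file, and then hand the files to a pool of 8 workers. A minimal sketch, with trivial placeholders standing in for the real f1 through f5:

import glob
from multiprocessing import Pool

# placeholders; substitute the real f1..f5 from the script
def f1(path): return path
def f2(x): return x
def f3(x): return x
def f4(x): return x
def f5(x): return x

def pipeline(path):
    # the five functions run one after the other for a single file
    return f5(f4(f3(f2(f1(path)))))

if __name__ == '__main__':
    inputs = glob.glob('*.dat.gz')        # the 2000 input files
    with Pool(processes=8) as pool:       # 8 files processed at a time
        outputs = pool.map(pipeline, inputs)
    print(len(outputs), 'files processed')

No explicit Queue is needed here: the Pool maintains its own internal task queue, so sequencing within a file and parallelism across files come for free.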
Distance between point and a line passing through other two points
Hello all, I have 3 points with coordinates (x0,y0,z0), (x1,y1,z1) and (x2,y2,z2), and a line joining points (x1,y1,z1) and (x2,y2,z2). For example:

p0 = [5.0, 5.0, 5.0]
p1 = [3.0, 3.0, 3.0]
p2 = [4.0, 4.0, 4.0]

a = np.array(p0)
b = np.array(p1)
c = np.array(p2)

I want to write a script that calculates the shortest distance d between point (x0,y0,z0) and the line through (x1,y1,z1) and (x2,y2,z2); in other words, d = distance(a, line(b,c)). Since I only have the coordinates of these points, I am not sure how to express this in a Python script to get the distance d. Searching the Internet turns up solutions for 2D coordinates (i.e. for (x0,y0), (x1,y1) and (x2,y2)), but I need a solution for 3D coordinates. Any direction or suggestion would be a great help. Thanking you in advance, -- Joshi
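In 3D the standard trick is the cross product: |(a-b) x (c-b)| is the area of the parallelogram spanned by those two vectors, and dividing by the base length |c-b| gives the height, i.e. the distance from a to the line through b and c. A minimal numpy sketch (note that the three sample points above happen to be collinear, so d comes out as 0):

import numpy as np

def point_line_distance(p0, p1, p2):
    """Shortest distance from p0 to the infinite line through p1 and p2."""
    a, b, c = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    # parallelogram area spanned by (a - b) and (c - b), over the base length
    return np.linalg.norm(np.cross(a - b, c - b)) / np.linalg.norm(c - b)

print(point_line_distance([5.0, 5.0, 5.0], [3.0, 3.0, 3.0], [4.0, 4.0, 4.0]))
# prints 0.0 -- this p0 lies on the line through p1 and p2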
how to write list in a file
Hello everyone, I am trying hard to write a list to a file as follows:

def average_ELECT(pwd):
    os.chdir(pwd)
    files = filter(os.path.isfile, os.listdir('./'))
    folders = filter(os.path.isdir, os.listdir('./'))
    eelec = 0.0; evdw = 0.0
    EELEC = []; elecutoff = []
    g = Gnuplot.Gnuplot()
    for f1 in files:
        # if f1[21:23] == '12':
        if f1[27:29] == sys.argv[1]:   # vdw cutoff remains constant; see 2nd column of output
            fl1 = open(f1, 'r').readlines()
            # print len(fl1)
            for i in range(1, len(fl1)):
                fl1[i] = fl1[i].split()
                eelec = eelec + float(fl1[i][1])
                evdw = evdw + float(fl1[i][2])
                # print fl1[i][1], fl1[i][2]
            avg_eelec = eelec/40
            avg_evdw = evdw/40
            # print eelec, evdw
            # print f1[21:23], f1[27:29], avg_eelec, avg_evdw
            print f1[21:23], f1[27:29], avg_eelec
            # EELEC.append(avg_eelec); elecutoff.append(float(f1[21:23]))
            eelec = 0.0; evdw = 0.0
            a = f1[21:23] + ' ' + f1[27:29] + ' ' + str(avg_eelec)
            EELEC.append(a)
    print sorted(EELEC)
    with open('EElect_elec12-40_vdwxxx.dat', 'w') as wr:
        for i in EELEC:
            print i
            wr.write("%s\n" % i)

The script prints both sorted(EELEC) and the per-file line f1[21:23], f1[27:29], avg_eelec just fine. However, for some reason I neither see the expected output file (EElect_elec12-40_vdwxxx.dat) nor any error message. Could anyone suggest a correction here? Thanking you in advance. -- DJ
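One hedged guess, given that no error appears: os.chdir(pwd) changes the working directory, so the relative filename is created inside pwd, not in the directory the script was launched from. Printing the absolute path shows exactly where the file went; a minimal sketch:

import os

EELEC = ['12 40 -1.234']   # sample row; the real list comes from the loop above
out_path = os.path.join(os.getcwd(), 'EElect_elec12-40_vdwxxx.dat')
with open(out_path, 'w') as wr:
    for item in EELEC:
        wr.write('%s\n' % item)
print('wrote', out_path)   # look for the file at this exact location

(The with statement closes the file on exit, so a separate wr.close() inside the block is redundant and has been dropped from the listing above; note also the original reset line said evde=0.0 where evdw=0.0 was clearly intended.)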
2d color-bar map plot
Dear all, I am a bit new to Python/pyplot. This might be simple, but I guess I am missing something here. I have a data file as follows:

2.1576318858 -1.8651195165 4.2333428278
2.1681875208 -1.9229968780 4.1989176884
2.3387636157 -2.0376253255 2.4460899122
2.1696565965 -2.6186941271 4.4172007912
2.0848862071 -2.1708981985 3.3404520962
2.0824347942 -1.9142798955 3.3629290206
2.0281685821 -1.8103363482 2.5446721669
2.3309993378 -1.8721153619 2.7006893016
2.0957461483 -1.5379071451 4.5228264441
2.2761376261 -2.5935979811 3.9231744717
. . . (200 lines in total)

Columns 1, 2 and 3 correspond to the x, y and z data points. The data is not continuous. I wish to make a 2D plot with the third dimension (the z data) as a colour map, with a colour bar on the right-hand side. As a beginner, I tried to follow this tutorial with some modification: http://matplotlib.org/examples/pylab_examples/tricontour_vs_griddata.html

# Read data from file:
fl1 = open('flooding-psiphi.dat', 'r').readlines()
xs = ys = zs = []
for line in fl1:
    line = line.split()
    xs.append(float(line[0]))
    ys.append(float(line[1]))
    zs.append(float(line[2]))
print xs[0], ys[0], zs[0]

xi = np.mgrid[-5.0:5.0:200j]
yi = np.mgrid[-5.0:5.0:200j]
zi = griddata((x, y), z, (xi, yi), method='cubic')

plt.subplot(221)
plt.contour(xi, yi, zi, 15, linewidths=0.5, colors='k')
plt.contourf(xi, yi, zi, 15, cmap=plt.cm.rainbow,
             norm=plt.Normalize(vmax=abs(zi).max(), vmin=-abs(zi).max()))
plt.colorbar()  # draw colorbar
plt.plot(x, y, 'ko', ms=3)
plt.xlim(-5, 5)
plt.ylim(-5, 5)
plt.title('griddata and contour (%d points, %d grid points)' % (npts, ngridx*ngridy))
#print ('griddata and contour seconds: %f' % (time.clock() - start))
plt.gcf().set_size_inches(6, 6)
plt.show()

However, it fails with a long error:

QH6154 qhull precision error: initial facet 1 is coplanar with the interior point
ERRONEOUS FACET:
- f1
  - flags: bottom simplicial upperDelaunay flipped
  - normal: 0.7071 -0.7071 0
  - offset: -0
  - vertices: p600(v2) p452(v1) p304(v0)
  - neighboring facets: f2 f3 f4

While executing: | qhull d Qz Qbb Qt
Options selected for Qhull 2010.1 2010/01/14:
  run-id 1531309415  delaunay  Qz-infinity-point  Qbbound-last  Qtriangulate
  _pre-merge  _zero-centrum  Pgood  _max-width 8.8  Error-roundoff 1.2e-14
  _one-merge 8.6e-14  _near-inside 4.3e-13  Visible-distance 2.5e-14
  U-coplanar-distance 2.5e-14  Width-outside 4.9e-14  _wide-facet 1.5e-13

precision problems (corrected unless 'Q0' or an error): 2 flipped facets

The input to qhull appears to be less than 3 dimensional, or a computation has overflowed.
Qhull could not construct a clearly convex simplex from points:
- p228(v3):  2.4  2.4  1.4
- p600(v2):  1.4  1.4  8.8
- p452(v1):  5.7  5.7  8
- p304(v0): -3.1 -3.1  2.4

The center point is coplanar with a facet, or a vertex is coplanar with a neighboring facet. The maximum round off error for computing distances is 1.2e-14. The center point, facets and distances to the center point are as follows:

center point  1.595  1.595  5.173
facet p600 p452 p304  distance= 0
facet p228 p452 p304  distance= 0
facet p228 p600 p304  distance= 0
facet p228 p600 p452  distance= 0

These points either have a maximum or minimum x-coordinate, or they maximize the determinant for k coordinates. Trial points are first selected from points that maximize a coordinate.

The min and max coordinates for each dimension are:
  0: -3.134      5.701  difference= 8.835
  1: -3.134      5.701  difference= 8.835
  2: -2.118e-22  8.835  difference= 8.835

If the input should be full dimensional, you have several options that may determine an initial simplex:
  - use 'QJ' to joggle the input and make it full dimensional
  - use 'QbB' to scale the points to the unit cube
  - use 'QR0' to randomly rotate the input for different maximum points
  - use 'Qs' to search all points for the initial simplex
  - use 'En' to specify a maximum roundoff error less than 1.2e-14.
  - trace execution with 'T3' to see the determinant for each point.

If the input is lower dimensional:
  - use 'QJ' to joggle the input and make it full dimensional
  - use 'Qbk:0Bk:0' to delete coordinate k from the input. You should pick the
    coordinate with the least range. The hull will have the correct topology.
  - determine the flat containing the points, rotate the points into a
    coordinate plane, and delete the other coordinates.
  - add one or more points to make the input full dimensional.

Traceback (most recent call last):
  File "./scatter.py", line 43, in <module>
    zi = griddata((x, y), z, (xi, yi), method='linear')
  File "/usr/lib/python2.7/dist-packages/scipy/interpolate/ndgriddata.py", line 183, in griddata
    ip = LinearNDInterpolator(points, values, fill_value=fill_value)
  File "interpnd.pyx", line 192, in scipy.interpolate.interpnd.LinearNDInterpolator.__init__ (scipy/interpolate/interpnd.c:2598)
  File "qhull.pyx", line
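A likely cause, strongly suggested by the qhull report itself: dimensions 0 and 1 have identical ranges (-3.134 to 5.701), i.e. qhull received the same numbers for x and y. The line xs = ys = zs = [] binds all three names to one shared list, so every append lands in the same place (and the plotting code then refers to x, y, z, which were never defined). A corrected read-and-grid sketch under those assumptions:

import numpy as np
from scipy.interpolate import griddata
import matplotlib.pyplot as plt

xs, ys, zs = [], [], []        # three independent lists, not one shared list
with open('flooding-psiphi.dat') as fh:
    for line in fh:
        fx, fy, fz = map(float, line.split())
        xs.append(fx); ys.append(fy); zs.append(fz)

x, y, z = np.array(xs), np.array(ys), np.array(zs)

# a true 2-D grid (note the comma form of mgrid), as griddata/contour expect
xi, yi = np.mgrid[-5.0:5.0:200j, -5.0:5.0:200j]
zi = griddata((x, y), z, (xi, yi), method='cubic')

plt.contourf(xi, yi, zi, 15, cmap=plt.cm.rainbow)
plt.colorbar()                 # the colour bar on the right-hand side
plt.plot(x, y, 'ko', ms=3)     # the raw sample points
plt.show()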
How to read columns in python
I am a bit new to Python and programming, and this might be a basic question: I have a file containing 3 columns. The first two columns are the x and y axes, and the third column holds their corresponding values in the graph. I want to read this into a matrix as well as plot it in 2D. Could anyone tell me how to do these things? Thanking you in advance -- Dhananjay
Re: How to read columns in python
Well, the three columns are tab separated, and there are 200 such rows of 3 columns in the file. The first two columns are x and y coordinates, and the third column is the corresponding value. I want to read the file as a matrix in which column 1 corresponds to the row, column 2 to the column (in the matrix), and column 3 to the value at that position. -- Dhananjay

On Tue, Feb 24, 2009 at 12:18 PM, Chris Rebert wrote:
> On Mon, Feb 23, 2009 at 10:41 PM, Dhananjay wrote:
> > I am bit new to python and programming and this might be a basic question:
> >
> > I have a file containing 3 columns.
>
> Your question is much too vague to answer. What defines a "column" for
> you? Tab-separated, comma-separated, or something else altogether?
>
> - Chris
> --
> Follow the path of the Iguana...
> http://rebertia.com
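For tab-separated numeric columns, numpy can do both jobs in a few lines. A minimal sketch, with a hypothetical filename, assuming columns 1 and 2 hold integer row/column indices (if they are arbitrary coordinates instead, the scatter plot at the end is the simpler 2D view):

import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('values.dat')     # 200 rows x 3 columns; tabs handled by default
rows = data[:, 0].astype(int)       # column 1 -> matrix row
cols = data[:, 1].astype(int)       # column 2 -> matrix column
vals = data[:, 2]                   # column 3 -> value

m = np.zeros((rows.max() + 1, cols.max() + 1))
m[rows, cols] = vals                # the matrix form

plt.scatter(data[:, 0], data[:, 1], c=vals)   # 2-D plot, coloured by value
plt.colorbar()
plt.show()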
Regarding sort()
Hello All, I have a data set as follows (tab separated):

24 GLU3  47 LYS  6 3.909233 1
42 PRO5 785 VAL 74 4.145114 1
54 LYS6 785 VAL 74 4.305017 1
55 LYS6 785 VAL 74 4.291098 1
56 LYS7 785 VAL 74 3.968647 1
58 LYS7 772 MET 73 4.385121 1
58 LYS7 778 MET 73 4.422980 1
58 LYS7 779 MET 73 3.954990 1
58 LYS7 785 VAL 74 3.420554 1
59 LYS7 763 GLN 72 4.431955 1
59 LYS7 767 GLN 72 3.844037 1
59 LYS7 785 VAL 74 3.725048 1

I want to sort the data on the basis of the 3rd column first, and later sort the sorted data on the basis of the 6th column. I tried the sort() function but could not work out how to use it. I am new to programming; please tell me how I can do this sort. Thanking you in advance, regards -- Dhananjay
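Python's list sort is stable, so two passes give exactly this "sort, then re-sort" behaviour: rows with equal keys in the second pass keep the order established by the first. A minimal sketch, assuming whitespace-separated rows in a hypothetically named file:

# split() handles both tabs and spaces
rows = [line.split() for line in open('contacts.dat')]

rows.sort(key=lambda r: float(r[2]))   # first pass: 3rd column
rows.sort(key=lambda r: float(r[5]))   # second pass: 6th column (stable sort
                                       # keeps the 3rd-column order on ties)

# equivalently, one pass with a tuple key (primary: 6th, secondary: 3rd):
rows.sort(key=lambda r: (float(r[5]), float(r[2])))

for r in rows:
    print('\t'.join(r))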
Re: benchmark
On Aug 7, 6:12 pm, alex23 <[EMAIL PROTECTED]> wrote:
> On Aug 7, 8:08 pm, [EMAIL PROTECTED] wrote:
> > Really how silly can it be when you suggest someone is taking a
> > position and tweaking the benchmarks to prove a point [...]
>
> I certainly didn't intend to suggest that you had tweaked -anything-
> to prove your point.

While that is not how I read it at first, I shall assume mine was a misjudged reading.

> I do, however, think there is little value in slavishly implementing
> the same algorithm in different languages. To constrain a dynamic
> language by what can be achieved in a static language seemed like such
> an -amazingly- artificial constraint to me. That you're a fan of
> Python makes such a decision even more confusing.

It is a sufficiently well understood maxim that any comparison between two factors should attempt to keep all other factors as equal as possible (ceteris paribus: everything else being equal), slavishly if you will. My perception is that had I changed the algorithms, I would have drawn far more criticism for comparing apples and oranges. I simply could not understand your point with regard to dynamic vs. static languages. If you are by any chance suggesting I make the code a little less OO, I believe the entire exercise could be redone using a procedural algorithm, and all the languages would run much, much faster than they currently do. But that would essentially be moving from an OO design to a procedural design. Is that what you are referring to? (I suspect not; I suspect it is something else.) If not, I would certainly appreciate you spending 5 minutes describing it. I am a fan of Python on its own merits; there is little relationship between that and this exercise.

> It's great that you saw value in Python enough to choose it for actual
> project work. It's a shame you didn't endeavour to understand it well
> enough before including it in your benchmark.

I have endeavoured hard, and maybe there is a shortcoming in the results of that endeavour. But I haven't quite understood what it is I haven't understood (hope that makes sense :) ).

> As for it being "disappointing", the real question is: has it been
> disappointing for you in actual real-world code?

I am extremely happy with it. But there definitely are some projects I worked on earlier for which I would simply not choose any dynamic language (not Ruby, not Python, not Groovy). These languages simply cannot meet the performance demands of some projects.

> Honestly, performance benchmarks seem to be the dick size comparison
> of programming languages.

Not sure there is a real-life equivalent use case if I were to extend this analogy. But there are some days (mind you, not most days) when one needs a really big dick. Always helpful to know the size.
Re: benchmark
On Aug 7, 11:58 pm, Terry Reedy <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > Are there any implications of using psyco ?
>
> It compiles statements to machine code for each set of types used in the
> statement or code block over the history of the run. So code used
> polymorphically with several combinations of types can end up with
> several compiled versions (same as with C++ templates). (But a few
> extra megabytes in the running image is less of an issue than it was
> even 5 or so years ago.) And time spent compiling for a combination
> used just once gains little. So it works best with numeric code used
> just for ints or floats.
>
> Terry J. Reedy

This sounds very much like polymorphic inline caching / call-site caching, which I have seen being worked on and introduced in recent versions of Groovy, JRuby and Ruby 1.9 (and I read it is being looked at for the Microsoft CLR as well, though I could be wrong there). I am no expert in this, so please correct me if I deserve to be. But if site caching is indeed being adopted by so many dynamic language runtimes, I wonder what holds Python back from bringing it into its core. Is it a question of time and effort, or is there something that makes it inappropriate for Python? Cheers, Dhananjay
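For context, psyco's public API was deliberately small; typical usage was a one- or two-line addition. A sketch of that usage (psyco was Python 2 only, so treat the details as approximate):

import psyco

def dot(xs, ys):
    # a tight numeric loop: the int/float-only case psyco specialises best
    total = 0.0
    for a, b in zip(xs, ys):
        total += a * b
    return total

psyco.bind(dot)    # compile type-specialised machine code for this function
# psyco.full()     # or, bluntly: specialise every function as it runs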
count
Dear all, I have a file as follows; it is tab separated, though the tabs are not shown below:

6 3 4.309726
7 65 93.377388
8 47 50.111952
9 270 253.045923
10 184 182.684670
11 76 121.853455
12 85 136.283470
13 114 145.910662
14 45 80.703013
15 44 47.154646
16 41 66.461339
17 16 33.819488
18 127 136.105455
19 70 88.798681
20 29 61.297823

I wanted to sort column 2 in ascending order, so I read the whole file into an array "data" and did the following:

data.sort(key = lambda fields: (fields[2]))

This sorts column 2, but now I also want to count the numbers in that column: for example, how many repeats of, say, '3' (first row, 2nd column in the data above) there are in column 2. I could write a separate programme to get the results; however, is there any way to count the numbers right there while sorting column 2? Thanking you in advance, -- Dhananjay
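collections.Counter can tally the column in the same pass, so no separate programme is needed. A minimal sketch, assuming a hypothetical filename (note that with tab-split rows the second column is index 1):

from collections import Counter

rows = [line.split() for line in open('counts.dat')]

rows.sort(key=lambda r: int(r[1]))      # sort on column 2, ascending
counts = Counter(r[1] for r in rows)    # tally column 2 at the same time

print(counts['3'])                      # how many times '3' appears in column 2
print(counts.most_common(3))            # the three most frequent values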
Re: [BangPypers] extracting unicode text from pdfs
You may want to try out pdfminer. It is very similar to xpdf in structure and should give you the parsed data directly as Unicode.

On Mon, May 24, 2010 at 7:13 PM, Eknath Venkataramani wrote:
> I have around 45 pdfs to convert into raw text containing text in _HINDI_.
> When I use the xpdf package, the generated text is very weird, so I'd like
> to write a program which would convert the pdf text into Unicode text as it
> is.
>
> The fonts used in the pdfs:
>
> name                  type     emb  sub  uni  object ID
> --------------------  -------  ---  ---  ---  ---------
> APKAPP+Usha-Bold      Type 1C  yes  yes  yes  72  0
> APKBBB+Agenda-Light   Type 1C  yes  yes  yes  77  0
> APKBGF+Usha           Type 1C  yes  yes  yes  41  0
> APKBKJ+Agenda-Medium  Type 1C  yes  yes  yes  46  0
> APKBON+Agenda-Bold    Type 1C  yes  yes  yes  49  0
>
> For example, in the pdf: आदमी मुसाफिर है
> When I use pdftotext, I get some very weird symbols: '... ...'
> while I'd like 'आदमी मुसाफिर है' to be the output.
>
> --
> Eknath Venkataramani

-- blog: http://blog.dhananjaynene.com twitter: http://twitter.com/dnene
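A sketch of what that looks like in code; this uses the high-level helper from the pdfminer.six fork (an assumption on my part: the original 2010 pdfminer exposed the same functionality through its bundled pdf2txt.py script rather than this exact function), and the filenames are hypothetical:

from pdfminer.high_level import extract_text

text = extract_text('hindi_input.pdf')        # returns a Unicode string
with open('hindi_output.txt', 'w', encoding='utf-8') as out:
    out.write(text)                           # Devanagari preserved as-is

Whether the Hindi comes out intact still depends on the PDF's fonts carrying usable ToUnicode maps; the "uni yes" column in the font listing above is an encouraging sign.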