Salaam Mahmood,
On 08/01/2022 12.07, Mahmood Naderan via Python-list wrote: > I have a csv file like this > V0,V1,V2,V3 > 4,1,1,1 > 6,4,5,2 > 2,3,6,7 > > And I want to search two rows for a match and find the column. For > example, I want to search row[0] for 1 and row[1] for 5. The corresponding > column is V2 (which is the third column). Then I want to return the value > at row[2] and the found column. The result should be 6 then. Not quite: isn't the "found column" also required? > I can manually extract the specified rows (with index 0 and 1 which are > fixed) and manually iterate over them like arrays to find a match. Then I Perhaps this idea has been influenced by a similar solution in another programming language. May I suggest that the better-answer you seek lies in using Python idioms (as well as Python's tools)... > key1 = 1 > key2 = 5 Fine, so far - excepting that this 'problem' is likely to be a small part of some larger system. Accordingly, consider writing it as a function. In which case, these two "keys" will become function-parameters (and the two 'results' become return-values). > row1 = df.iloc[0] # row=[4,1,1,1] > row2 = df.iloc[1] # row=[6,4,5,2] This is likely not native-Python. Let's create lists for 'everything', just-because: >>> headings = [ "V0","V1","V2","V3" ] >>> row1 = [4,1,1,1] >>> row2 = [6,4,5,2] >>> results = [ 2,3,6,7 ] Note how I'm using the Python REPL (in a "terminal", type "python" (as appropriate to your OpSys) at the command-line). IMHO the REPL is a grossly under-rated tool, and is a very good means towards trial-and-error, and learning by example. Highly recommended! > for i in range(len(row1)): This construction is very much a "code smell" for thinking that it is not "pythonic". (and perhaps the motivation for this post) In Python (compared with many other languages) the "for" loop should actually be pronounced "for-each". In other words when we pair the code-construct with a list (for example): for each item in the list the computer should perform some suite of commands. (the "suite" is everything 'inside' the for-each-loop - NB my 'Python-betters' will quickly point-out that this feature is not limited to Python-lists, but will work with any :iterable" - ref: https://docs.python.org/3/tutorial/controlflow.html#for-statements) Thus: > for item in headings: print( item ) ... V0 V1 V2 V3 The problem is that when working with matrices/matrixes, a math background equips one with the idea of indices/indexes, eg the ubiquitous subscript-i. Accordingly, when reading 'math' where a formula uses the upper-case Greek "sigma" character, remember that it means "for all" or "for each"! So, if Python doesn't use indexing or "pointers", how do we deal with the problem? Unfortunately, at first glance, the pythonic approach may seem more-complicated or even somewhat convoluted, but once the concepts (and/or the Python idioms) are learned, it is quite manageable (and applicable to many more applications than matrices/matrixes!)... > if row1[i] == key1: > for j in range(len(row2)): > if row2[j] == key2: > res = df.iloc[:,j] > print(res) # 6 > > Is there any way to use built-in function for a more efficient code? This is where your idea bears fruit! There is a Python "built-in function": zip(), which will 'join' lists. NB do not become confused between zip() and zip archive/compressed files! Most of the time reference book and web-page examples show zip() being used to zip-together two lists into a single data-construct (which is an iterable(!)). However, zip() will actually zip-together multiple (more than two) "iterables". As the manual says: «zip() returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument iterables.» Ah, so that's where the math-idea of subscript-i went! It has become 'hidden' in Python's workings - or putting that another way: Python looks after the subscripting for us (and given that 'out by one' errors in pointers is a major source of coding-error in other languages, thank-you very much Python!) First re-state the source-data as Python lists, (per above) - except that I recommend the names be better-chosen to be more meaningful (to your application)! Now, (in the REPL) try using zip(): >>> zip( headings, row1, row2, results ) <zip object at 0x7f655cca6bc0> Does that seem a very good illustration? Not really, but re-read the quotation from the manual (above) where it says that zip returns an iterator. If we want to see the values an iterator will produce, then turn it into an iterable data-structure, eg: >>> list( zip( headings, row1, row2, results ) ) [('V0', 4, 6, 2), ('V1', 1, 4, 3), ('V2', 1, 5, 6), ('V3', 1, 2, 7)] or, to see things more clearly, let me re-type it as: [ ('V0', 4, 6, 2), ('V1', 1, 4, 3), ('V2', 1, 5, 6), ('V3', 1, 2, 7) ] What we now see is actually a "transpose" of the original 'matrix' presented in the post/question! (NB Python will perform this layout for us - read about the pprint library) Another method which can also be employed (and which will illustrate the loop required to code the eventual-solution(!)) is that Python's next() will extract the first row of the transpose: >>> row = next( zip( headings, row1, row2, results ) ) >>> row ('V0', 4, 6, 2) This is all-well-and-good, but that result is a tuple of four items (corresponding to one column in the way the source-data was explained). If we need to consider the four individual data-items, that can be improved using a Python feature called "tuple unpacking". Instead of the above delivering a tuple which is then assigned to "row", the tuple can be assigned to four "identifiers", eg >>> heading, row1_item, row2_item, result= next( zip( headings, row1, row2, results ) ) (apologies about email word-wrapping - this is a single line of Python-code) Which, to prove the case, could be printed: >>> heading, row1_item, row2_item, result ('V0', 4, 6, 2) (ref: https://docs.python.org/3/tutorial/datastructures.html?highlight=tuple%20unpacking#tuples-and-sequences) Thus, if we repeatedly ask for the next() row from the zip-ped transpose, eventually it will respond with the row starting 'V2' - which is the desired-result, ie the row containing the 1, the 5, and the 6 - and if you follow-through using the REPL, will be clearly visible. Finally, 'all' that is required, is a for-each-loop which will iterate across/down the zip object, one tuple (row of the transpose) at a time, AND perform the "tuple-unpacking" all in one command, with an if-statement to detect the correct row/column: >>> for *tuple-unpacking* in *zip() etc*: ... if row1_item == *what?* and row2_item == *what?* ... print( *which* and *which identifier* ) ... V2 6 Yes, three lines. It's as easy as that! (when you know how) Worse: when you become more expert, you'll be able to compress all of that down into a single-line solution - but it won't be as "readable" as is this! NB this question has a 'question-smell' of 'homework', so I'll not complete the code for you - this is something *you* asked to learn and the best way to learn is by 'doing' (not by 'reading'). However, please respond with your solution, or any further question (with the next version of the code so-far, per this first-post - which we appreciate!) Regardless, you asked 'the right question' (curiosity is the key to learning) and in the right way/manner. Well done! NBB the above code-outline does not consider the situation where the search fails/the keys are not found! For further information, please review: https://docs.python.org/3/library/functions.html?highlight=zip#zip Also, further to the above discussion of combining lists and loops: https://docs.python.org/3/tutorial/datastructures.html?highlight=zip#looping-techniques and with a similar application (to this post): https://docs.python.org/3/faq/programming.html?highlight=zip#how-can-i-sort-one-list-by-values-from-another-list -- Regards, =dn -- https://mail.python.org/mailman/listinfo/python-list