yadin wrote:
On May 20, 6:53 pm, norseman <norse...@hughes.net> wrote:
bearophileh...@lycos.com wrote:
yadin:
How can I build up a program that tells me that this sequence
1000028706
1000028707
1000028708
is repeated somewhere in the column, and how can I know where?
Can such patterns nest? That is, can you have a repeated pattern made
of an already seen pattern plus something else?
If you don't want a complex program, then you may need to specify the
problem better.
You may want something like LZ77 or related (LZ78, etc):
http://en.wikipedia.org/wiki/LZ77
This may have a bug:
http://code.activestate.com/recipes/117226/
Bye,
bearophile
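For the simplest reading of the question (a known run of values, and you
want to know where it occurs), a plain sliding-window scan may be enough.
This is only a sketch: the function name find_sequence and the sample
column are made up for illustration, and it assumes the column fits in
memory as a Python list. It does not handle nested patterns or do
LZ77-style discovery of unknown patterns.

```python
def find_sequence(column, pattern):
    """Return the 0-based start index of every place pattern occurs in column."""
    n = len(pattern)
    # Compare each window of length n against the pattern.
    return [i for i in range(len(column) - n + 1) if column[i:i + n] == pattern]


# Hypothetical sample data: the three values from the question, repeated once.
column = [1000028706, 1000028707, 1000028708,
          1000030000,
          1000028706, 1000028707, 1000028708]
pattern = [1000028706, 1000028707, 1000028708]

starts = find_sequence(column, pattern)
print(starts)
```

More than one entry in starts means the sequence is repeated, and the
entries themselves say where.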
============================================
index on column
Ndx1 is set to index #1
Ndx2 is set to index #2
test Ndx1 against Ndx2
if equal write line number and column content to a file
(that's two things on one line: 15 1000028706
283 1000028706 )
Ndx1 is set to Ndx2
Ndx2 is set to index #next
loop to test writing out each duplicate set
Then use the outfile and index on line number.
In a similar manner, check whether the current line and the next line
have sequential line numbers. If so, scan forward to match the column
content of the lower line number, and check whether the first matched
column's line number and the next one are sequential. Print them out if so.
everything in outfile has 1 or more duplicates
  4 aa          |--
  5 bb   |--    |      thus 4/5 match 100/101
  6 cc   |      |
  .      |      |
100 aa   |      |--
101 bb   |--
102 ddd
103 cc   there is a duplicate but not a sequence
200 ff
mark duplicate sequences as tested and proceed on through
seq1 may have more than one other seq in file.
the progress is from start to finish without looking back
thus each step forward has fewer lines to test.
marking the already-known sequences eliminates redundant sequence testing.
By subsetting on pass 1 the expensive testing is greatly reduced.
If you know your subset data won't exceed memory then the "outfile"
can be held in memory to speed things up considerably.
Today is: 20090520
no code
Steve
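The post above gives no code, so here is one possible sketch of the two
passes it describes, held entirely in memory (which the post says is fine
if the data fits). Pass 1 indexes each value to the positions where it
appears; pass 2 walks forward once and extends each duplicate into a run
for as long as the following lines also match. The name repeated_runs and
the min_len parameter are assumptions for illustration, positions are
0-based list indexes rather than file line numbers, and overlapping runs
are deliberately not reported.

```python
from collections import defaultdict


def repeated_runs(column, min_len=2):
    """Return (first_start, later_start, length) for each repeated run."""
    # Pass 1: index every value to the positions where it appears.
    positions = defaultdict(list)
    for i, value in enumerate(column):
        positions[value].append(i)

    runs = []
    seen = set()  # (i, j) pairs already covered by a run started earlier
    # Pass 2: move forward only, never looking back.
    for i, value in enumerate(column):
        for j in positions[value]:
            if j <= i or (i, j) in seen:
                continue
            length = 0
            # Extend while the next lines match and the two runs do not overlap.
            while (j + length < len(column) and i + length < j
                   and column[i + length] == column[j + length]):
                seen.add((i + length, j + length))
                length += 1
            if length >= min_len:
                runs.append((i, j, length))
    return runs


# The small example from the post; 'x' stands in for the elided lines.
lines = ['aa', 'bb', 'cc', 'x', 'aa', 'bb', 'ddd', 'cc', 'ff']
print(repeated_runs(lines))
```

On this data the aa/bb pair is reported as a run of length 2, while the
lone second cc is a duplicate but not a sequence, so it is left out.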
this is the program I wrote, but it is not working
I have a list of valves, and another of pressures;
I am asked to find out which valves are using all of a given
set of pressures (the wanted best pressures).
this is the program I wrote, but it is not working properly. It is supposed
to return, in this case,
all the valves that are using pressures 1 "and" 2 "and" 3.
It returns me A, A2, A35....
looking at the data that seems correct.
there are 3 '1's in the list, 1-A, 1-A2, 1-A35
there are 2 '2's in the list, 2-A, 2-A2
there are 2 '3's in the list, 3-A, 3-A2
and so on
after the two sets are paired
indexing on the right yields 1-A,2-A,3-A,1-A2,2-A2,3-A2,7-A4...
indexing on the left yields 1-A,1-A2,1-A35,2-A,2-A2,3-A,3-A2,7-A4...
and the two 78s would pair with a G and with a G2 (78-G, 78-G2)
beyond that I'm a bit lost.
20090521 Steve
The correct answer is supposed to be A and A2...
if I were asked for pressures 56 and 78, the correct answer is supposed to
be valves G and G2...
Valves = ['A','A','A','G', 'G', 'G',
          'C','A2','A2','A2','F','G2','G2','G2','A35','A345','A4']  ## valve names
pressures = [1,2,3,4235,56,78,12, 1, 2, 3, 445, 45,56,78,1, 23,7]  ## valve pressures
result = []
bestpress = [1,2,3]  ## wanted base pressures
print bestpress,'len bestpress is' , len(bestpress)
print len(Valves)
print len(Valves)
for j in range(len(Valves)):
    #for i in range(len(bestpress)):
    #for j in range(len(Valves)):
    for i in range(len(bestpress)-2):
        if pressures [j]== bestpress[i] and bestpress [i+1] ==pressures [j+1] and bestpress [i+2]==pressures [j+2]:
            result.append(Valves[j])
            #i = i+1
            #j = j+1
            # print i, j, bestpress[i]
print "common PSVs are", result
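One alternative to matching adjacent list positions, offered only as a
sketch: collect the pressures of each valve into a set, then keep the
valves whose set contains every wanted pressure. This is a different
technique from the loop above, not a repair of it, and it does not care
about the order in which the pressures are listed. The function name
valves_using_all is made up for illustration; the data is copied from the
post.

```python
from collections import OrderedDict


def valves_using_all(valves, pressures, wanted):
    """Return the valves whose pressures include every value in wanted."""
    by_valve = OrderedDict()  # keeps valves in first-seen order
    for valve, pressure in zip(valves, pressures):
        by_valve.setdefault(valve, set()).add(pressure)
    wanted = set(wanted)
    # wanted <= seen is the subset test: all wanted pressures are present.
    return [valve for valve, seen in by_valve.items() if wanted <= seen]


Valves = ['A', 'A', 'A', 'G', 'G', 'G',
          'C', 'A2', 'A2', 'A2', 'F', 'G2', 'G2', 'G2', 'A35', 'A345', 'A4']
pressures = [1, 2, 3, 4235, 56, 78, 12, 1, 2, 3, 445, 45, 56, 78, 1, 23, 7]

print(valves_using_all(Valves, pressures, [1, 2, 3]))   # ['A', 'A2']
print(valves_using_all(Valves, pressures, [56, 78]))    # ['G', 'G2']
```

A35 is dropped because it only ever appears with pressure 1, which matches
the answers the poster expects for both queries.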
--
http://mail.python.org/mailman/listinfo/python-list