Sequence and/or pattern matching

2005-10-19 Thread Séb
Hi everyone,

I'm relatively new to Python and I want to write a piece of code that does
the following for data-mining purposes:

1) I have a list of connections between some computers. The list has
this format:

Ip A         Date        Ip B
...          ...         ...
192.168.0.1  19.10.2005  192.168.0.2
192.168.0.3  19.10.2005  192.168.0.1
192.168.0.4  19.10.2005  192.168.0.6
...          ...         ...

2) I want to find out whether there are previously unknown sequences of
connections in my data, and whether these sequences are repeated
throughout the file:

For example:

Computer A connects to Computer B then
Computer B connects to Computer C then
Computer C connects to Computer A

3) Then the software reports the sequences it has found and how many
times each one appears.
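
As a starting point for step 1, here is a minimal sketch of reading the log into structured records. It assumes the three columns are separated by whitespace and that the date is written as DD.MM.YYYY; the names `Connection` and `parse_log` are hypothetical, chosen only for illustration.

```python
from datetime import datetime
from typing import NamedTuple


class Connection(NamedTuple):
    src: str        # "Ip A" column
    when: datetime  # "Date" column, assumed to be DD.MM.YYYY
    dst: str        # "Ip B" column


def parse_log(lines):
    """Turn whitespace-separated log lines into Connection records."""
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip the header, blank lines and anything malformed
        src, date_text, dst = fields
        try:
            when = datetime.strptime(date_text, "%d.%m.%Y")
        except ValueError:
            continue  # three fields, but not a data row (e.g. "..." filler)
        records.append(Connection(src, when, dst))
    return records


sample = [
    "Ip A         Date        Ip B",
    "192.168.0.1  19.10.2005  192.168.0.2",
    "192.168.0.3  19.10.2005  192.168.0.1",
    "192.168.0.4  19.10.2005  192.168.0.6",
]
connections = parse_log(sample)
```

Each record can then be reduced to a `(src, dst)` pair when only the order of connections matters.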

I hope this is clear. Point 2) is where I have my main problem. Does
anyone have an idea where to start, or know whether something like this
has already been written?

Thanks

Séb

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Sequence and/or pattern matching

2005-10-19 Thread Séb

> Essentially, if I understand correctly, you want to detect LOOPS given a
> sequence of directed connections A->B.  "loop detection" and "graph"
> would then be the keywords to search for, in this case.

Exactly, but the sequence has to be discovered by the code itself!

> Does this "then" imply you're only interested in loops occurring in this
> *sequence*, i.e., is order of connections important?  If the sequence of
> directed connections was, say, in the different order:
>
> B->C
> A->B
> C->A
>
> would you want this detected as a loop, or not?

Yes, it would be nice to detect it as a loop, for example subject to a
threshold. Btw, it would also be nice to ignore additional connections,
like this:

B->C # Normal connection
D->E # Additional connection to ignore
A->B # Normal connection
C->A # Normal connection

Would that be possible?
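
One possible way to get both properties (order does not matter, unrelated connections are ignored) is to treat a block of log lines as a directed graph and search it for cycles. The sketch below is only an illustration under that assumption; `find_loops` is a hypothetical name, and each loop is reported once, starting from its smallest node.

```python
def find_loops(edges):
    """Return the simple cycles formed by a collection of directed edges.

    The order of the edges is irrelevant, and edges that belong to no
    cycle (such as D->E below) simply never show up in the result.
    """
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)

    loops = set()

    def walk(start, node, path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                loops.add(tuple(path))  # closed the loop back to its start
            elif nxt > start and nxt not in path:
                # Only visit nodes larger than the start, so every cycle
                # is found exactly once, rooted at its smallest node.
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return loops


block = [
    ("B", "C"),  # normal connection
    ("D", "E"),  # additional connection to ignore
    ("A", "B"),  # normal connection
    ("C", "A"),  # normal connection
]
loops = find_loops(block)
```

Applied to a sliding block of consecutive log lines, the block size could play the role of the threshold: connections further apart than that would not be joined into one loop. That reading of "threshold" is an assumption.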

Thank you very much



Re: Sequence and/or pattern matching

2005-10-20 Thread Séb
Hi everybody,

Thanks for taking the time to answer my question. Unfortunately, it seems
that there's a little confusion about what I want to do.

In fact, I don't want to search for a particular path between
computers. What I really want is to detect sequences of connections that
are repeated throughout the log. Is that clearer? If not, I will give
another example ;-)

Thank you !

P.S. The Python community is very nice; I'm glad I'm learning this language!



Re: Sequence and/or pattern matching

2005-10-20 Thread Séb
Sorry for the confusion, I think my example was unclear. Thank you, Mike,
for this piece of code, which solves part of my problem. In fact, the
sequences are unknown at the beginning, so the first part of the code
has to find candidate sequences and, if those sequences are repeated,
count how many times they appear (as your code does).
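
A simple way to sketch that "find first, then count" idea is to treat every run of consecutive connections as a candidate sequence and keep the ones seen more than once. This is only an illustration, not the code referred to above; the function name `repeated_sequences` and its parameters are assumptions.

```python
from collections import Counter


def repeated_sequences(edges, min_len=2, max_len=4, min_count=2):
    """Count every run of consecutive connections; keep the repeated ones.

    edges is the log reduced to (source, destination) pairs, in file order.
    """
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(edges) - n + 1):
            counts[tuple(edges[i:i + n])] += 1
    return {seq: c for seq, c in counts.items() if c >= min_count}


edges = [
    ("A", "B"), ("B", "C"), ("C", "A"),
    ("D", "E"),
    ("A", "B"), ("B", "C"), ("C", "A"),
]
found = repeated_sequences(edges)
for seq, count in sorted(found.items(), key=lambda item: -len(item[0])):
    print(count, " then ".join(f"{s}->{d}" for s, d in seq))
```

This only matches runs that are strictly consecutive, and the work grows with the log length times the number of lengths tried, so a large log would need something more careful.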

I found this morning that i2 produces a piece of software that does
this kind of job, but for telephone call analysis. Maybe its description
will help make my goal clearer:

http://www.i2.co.uk/products/Pattern_Tracer/default.asp 

Séb
