subject:"Catogorising strings into random versus non-random"

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Rick Johnson

On Sunday, December 20, 2015 at 10:22:57 PM UTC-6, Chris Angelico wrote: > DuckDuckGo doesn't give a result count, so I skipped it. Yahoo search yielded: So why bother to mention it then? Is this another one of your "pikeish" propaganda campaigns? -- https://mail.python.org/mailman/listinfo/pyth

Re: Catogorising strings into random versus non-random

2015-12-21 Thread duncan smith

On 21/12/15 16:49, Ian Kelly wrote: > On Mon, Dec 21, 2015 at 9:40 AM, duncan smith wrote: >> Finite state machine / transition matrix. Learn from some English text >> source. Then process your strings by lower casing, replacing underscores >> with spaces, removing trailing numeric characters etc.
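The preprocessing steps named in the quoted text (lower casing, replacing underscores with spaces, removing trailing numeric characters) could be sketched roughly as below. The function name `normalise` is hypothetical and the regular expression is only one way to read "trailing numeric characters"; nothing here is taken from code posted in the thread.

```python
import re


def normalise(name):
    """Lower-case, turn underscores into spaces, drop trailing digits.

    Hypothetical helper: the name and exact rules are assumptions.
    """
    name = name.lower().replace("_", " ")
    # Remove a run of digits at the very end of the string, if any.
    name = re.sub(r"\d+$", "", name)
    return name.strip()


for raw in ["saturday_morning12", "baby lions at play", "ImpossibleFork"]:
    print(normalise(raw))
```

Note that CamelCase names are only lower-cased here; splitting them into words would need an extra step that the quoted text does not describe.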

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Paul Rubin

Steven D'Aprano writes: > Does anyone have any suggestions for how to do this? Preferably something > already existing. I have some thoughts and/or questions: I think I'd just look at the set of digraphs or trigraphs in each name and see if there are a lot that aren't found in English. > - I thi
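One possible reading of the digraph/trigraph idea above, as a minimal sketch: collect the trigraphs seen in some English reference text, then report what fraction of a name's trigraphs never occur there. The helper names, the tiny `REFERENCE` string, and the choice to ignore non-letters are all assumptions made for illustration; a real run would need a far larger corpus.

```python
def trigraphs(text):
    """Return the set of 3-letter sequences in text (letters only, lower-cased)."""
    letters = "".join(ch for ch in text.lower() if ch.isalpha())
    return {letters[i:i + 3] for i in range(len(letters) - 2)}


def unseen_fraction(name, known):
    """Fraction of the name's trigraphs that are absent from the reference set."""
    grams = trigraphs(name)
    if not grams:
        return 0.0  # too short to judge; treated as "nothing unusual"
    return len(grams - known) / len(grams)


# Tiny stand-in corpus (an assumption, far too small for real use).
REFERENCE = ("the impossible fork was seen on a saturday morning "
             "while baby lions were at play")
known = trigraphs(REFERENCE)

for name in ["ImpossibleFork", "xy39mGWbosjY"]:
    print(name, unseen_fraction(name, known))
```

A threshold on this fraction would then separate the two groups; where to put it depends on the corpus and is not something the snippet above settles.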

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Mark Lawrence

On 21/12/2015 16:49, Ian Kelly wrote: On Mon, Dec 21, 2015 at 9:40 AM, duncan smith wrote: Finite state machine / transition matrix. Learn from some English text source. Then process your strings by lower casing, replacing underscores with spaces, removing trailing numeric characters etc. Base

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Ian Kelly

On Mon, Dec 21, 2015 at 9:40 AM, duncan smith wrote: > Finite state machine / transition matrix. Learn from some English text > source. Then process your strings by lower casing, replacing underscores > with spaces, removing trailing numeric characters etc. Base your score > on something like the
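A rough sketch of the transition-matrix idea quoted above: count character-to-character transitions in English text, then score a string by how probable its own transitions are under those counts. The quoted message is cut off before it says what the score should be based on, so the mean log-probability with add-one smoothing used here is an assumption, as are the names `train`, `score`, `ALPHABET` and the toy `CORPUS`.

```python
import math
from collections import Counter

# Characters the model knows about; everything else is dropped (an assumption).
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def train(text):
    """Count character-to-character transitions in the training text."""
    text = "".join(ch for ch in text.lower() if ch in ALPHABET)
    pair_counts = Counter(zip(text, text[1:]))
    first_counts = Counter(text[:-1])  # how often each character starts a pair
    return pair_counts, first_counts


def score(name, model):
    """Mean log-probability per transition; higher means more like the training text."""
    pair_counts, first_counts = model
    name = "".join(ch for ch in name.lower() if ch in ALPHABET)
    pairs = list(zip(name, name[1:]))
    if not pairs:
        return float("-inf")
    total = 0.0
    for a, b in pairs:
        # Add-one smoothing so unseen transitions get a small non-zero probability.
        p = (pair_counts[(a, b)] + 1) / (first_counts[a] + len(ALPHABET))
        total += math.log(p)
    return total / len(pairs)


# Toy training text (an assumption); real use would need a long English corpus.
CORPUS = "the quick brown fox jumps over the lazy dog and the cat sat on the mat"
model = train(CORPUS)

for name in ["the cat", "xqzjvk"]:
    print(name, round(score(name, model), 2))
```

Dividing by the number of transitions keeps long and short names comparable; whether that is the right normalisation for file names is a judgment call.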

Re: Catogorising strings into random versus non-random

2015-12-21 Thread duncan smith

On 21/12/15 03:01, Steven D'Aprano wrote: > I have a large number of strings (originally file names) which tend to fall > into two groups. Some are human-meaningful, but not necessarily dictionary > words e.g.: > > > baby lions at play > saturday_morning12 > Fukushima > ImpossibleFork > > > (no

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Vincent Davis

On Mon, Dec 21, 2015 at 7:25 AM, Vlastimil Brom wrote: > > baby lions at play > > saturday_morning12 > > Fukushima > > ImpossibleFork > > > > > > (note that some use underscores, others spaces, and some CamelCase) while > > others are completely meaningless (or mostly so): > > > > > > xy39mGWbosj

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Vlastimil Brom

2015-12-21 4:01 GMT+01:00 Steven D'Aprano : > I have a large number of strings (originally file names) which tend to fall > into two groups. Some are human-meaningful, but not necessarily dictionary > words e.g.: > > > baby lions at play > saturday_morning12 > Fukushima > ImpossibleFork > > > (note

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Christian Gollwitzer

On 21.12.15 at 11:53, Christian Gollwitzer wrote: So for the spaces, either use proper training material (some long corpus from Wikipedia or such), with punctuation removed. Then it will catch the correct probabilities at word boundaries. Or preprocess by removing the spaces. Christian

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Christian Gollwitzer

On 21.12.15 at 11:36, Steven D'Aprano wrote:
On Mon, 21 Dec 2015 08:56 pm, Christian Gollwitzer wrote:
Apfelkiste:Tests chris$ python score_my.py
-8.74 baby lions at play
-7.63 saturday_morning12
-6.38 Fukushima
-5.72 ImpossibleFork
-10.6 xy39mGWbosjY
-12.9 9sjz7s8198ghwt
-12.1 rz4sdko-

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Steven D'Aprano

On Mon, 21 Dec 2015 08:56 pm, Christian Gollwitzer wrote:
> Apfelkiste:Tests chris$ python score_my.py
> -8.74 baby lions at play
> -7.63 saturday_morning12
> -6.38 Fukushima
> -5.72 ImpossibleFork
> -10.6 xy39mGWbosjY
> -12.9 9sjz7s8198ghwt
> -12.1 rz4sdko-28dbRW00u
> Apfelkiste:Tests chri

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Christian Gollwitzer

On 21.12.15 at 09:24, Peter Otten wrote: Steven D'Aprano wrote: I have a large number of strings (originally file names) which tend to fall into two groups. Some are human-meaningful, but not necessarily dictionary words e.g.: baby lions at play saturday_morning12 Fukushima ImpossibleFork

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Steven D'Aprano

On Monday 21 December 2015 15:22, Chris Angelico wrote: > On Mon, Dec 21, 2015 at 2:01 PM, Steven D'Aprano > wrote: >> I have a large number of strings (originally file names) which tend to >> fall into two groups. Some are human-meaningful, but not necessarily >> dictionary words e.g.: [...] >

Re: Categorising strings on meaningful–meaningless spectrum (was: Catogorising strings into random versus non-random)

2015-12-21 Thread Steven D'Aprano

On Monday 21 December 2015 14:45, Ben Finney wrote: > Steven D'Aprano writes: > >> Let's call the second group "random" and the first "non-random", >> without getting bogged down into arguments about whether they are >> really random or not. > > I think we should discuss it, even at risk of get

Re: Catogorising strings into random versus non-random

2015-12-21 Thread Peter Otten

Steven D'Aprano wrote: > I have a large number of strings (originally file names) which tend to > fall into two groups. Some are human-meaningful, but not necessarily > dictionary words e.g.: > > > baby lions at play > saturday_morning12 > Fukushima > ImpossibleFork > > > (note that some use u

Re: Catogorising strings into random versus non-random

2015-12-20 Thread Chris Angelico

On Mon, Dec 21, 2015 at 2:01 PM, Steven D'Aprano wrote: > I have a large number of strings (originally file names) which tend to fall > into two groups. Some are human-meaningful, but not necessarily dictionary > words e.g.: > > > baby lions at play > saturday_morning12 > Fukushima > ImpossibleFor

Categorising strings on meaningful–meaningless spectrum (was: Catogorising strings into random versus non-random)

2015-12-20 Thread Ben Finney

Steven D'Aprano writes: > Let's call the second group "random" and the first "non-random", > without getting bogged down into arguments about whether they are > really random or not. I think we should discuss it, even at risk of getting bogged down. As you know better than I, “random” is not an

Catogorising strings into random versus non-random

2015-12-20 Thread Steven D'Aprano

I have a large number of strings (originally file names) which tend to fall into two groups. Some are human-meaningful, but not necessarily dictionary words e.g.: baby lions at play saturday_morning12 Fukushima ImpossibleFork (note that some use underscores, others spaces, and some CamelCase) w

18 matches
