On Sunday, December 20, 2015 at 10:22:57 PM UTC-6, Chris Angelico wrote:
> DuckDuckGo doesn't give a result count, so I skipped it. Yahoo search yielded:
So why bother to mention it then? Is this another one of your "pikeish"
propaganda campaigns?
--
https://mail.python.org/mailman/listinfo/pyth
On 21/12/15 16:49, Ian Kelly wrote:
> On Mon, Dec 21, 2015 at 9:40 AM, duncan smith wrote:
>> Finite state machine / transition matrix. Learn from some English text
>> source. Then process your strings by lower casing, replacing underscores
>> with spaces, removing trailing numeric characters etc.
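For concreteness, the preprocessing steps named above (lower-casing, replacing underscores with spaces, removing trailing numeric characters) might look roughly like the following. This is only a sketch; the function name normalize is made up for illustration and is not from anyone's posted code.

```python
import re

def normalize(name):
    """Lower-case, turn underscores into spaces, drop trailing digits."""
    name = name.lower().replace("_", " ")
    # Strip digits (and any whitespace left dangling) from the end only;
    # digits in the middle of the string are kept.
    return re.sub(r"[\d\s]+$", "", name)

for raw in ["baby lions at play", "saturday_morning12", "ImpossibleFork"]:
    print(repr(normalize(raw)))
```

Note that lower-casing discards the word boundary in CamelCase names, so splitting on case changes would have to happen before this step if it matters.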
Steven D'Aprano writes:
> Does anyone have any suggestions for how to do this? Preferably something
> already existing. I have some thoughts and/or questions:
I think I'd just look at the set of digraphs or trigraphs in each name
and see if there are a lot that aren't found in English.
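A minimal sketch of that trigraph check, assuming a reference sample of English text is available. The names trigrams, unseen_fraction and SAMPLE are invented for this example, and the one-line SAMPLE is a toy stand-in; a real run would need a far larger text.

```python
def trigrams(text):
    """Return the set of 3-character substrings of the lower-cased text."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}

def unseen_fraction(name, known):
    """Fraction of the name's trigrams that never occur in the reference text."""
    grams = trigrams(name)
    if not grams:
        return 0.0  # names shorter than three characters give no evidence
    return len(grams - known) / len(grams)

# Toy reference text (an assumption for illustration only).
SAMPLE = "the baby lions play on saturday morning while the impossible fork waits"
known = trigrams(SAMPLE)

for name in ["baby lions", "saturday", "xy39mGWbosjY"]:
    print(f"{unseen_fraction(name, known):.2f}  {name}")
```

A high fraction suggests a "random" name. Where to put the cut-off is a judgment call that would have to be tuned on real data.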
> - I thi
On 21/12/2015 16:49, Ian Kelly wrote:
On Mon, Dec 21, 2015 at 9:40 AM, duncan smith wrote:
Finite state machine / transition matrix. Learn from some English text
source. Then process your strings by lower casing, replacing underscores
with spaces, removing trailing numeric characters etc. Base
On Mon, Dec 21, 2015 at 9:40 AM, duncan smith wrote:
> Finite state machine / transition matrix. Learn from some English text
> source. Then process your strings by lower casing, replacing underscores
> with spaces, removing trailing numeric characters etc. Base your score
> on something like the
On 21/12/15 03:01, Steven D'Aprano wrote:
> I have a large number of strings (originally file names) which tend to fall
> into two groups. Some are human-meaningful, but not necessarily dictionary
> words e.g.:
>
>
> baby lions at play
> saturday_morning12
> Fukushima
> ImpossibleFork
>
>
> (no
On Mon, Dec 21, 2015 at 7:25 AM, Vlastimil Brom wrote:
> > baby lions at play
> > saturday_morning12
> > Fukushima
> > ImpossibleFork
> >
> >
> > (note that some use underscores, others spaces, and some CamelCase) while
> > others are completely meaningless (or mostly so):
> >
> >
> > xy39mGWbosj
2015-12-21 4:01 GMT+01:00 Steven D'Aprano:
> I have a large number of strings (originally file names) which tend to fall
> into two groups. Some are human-meaningful, but not necessarily dictionary
> words e.g.:
>
>
> baby lions at play
> saturday_morning12
> Fukushima
> ImpossibleFork
>
>
> (note
On 21.12.15 at 11:53, Christian Gollwitzer wrote:
So for the spaces, either use proper training material (some long
corpus from Wikipedia or such), with punctuation removed. Then it will
catch the correct probabilities at word boundaries. Or preprocess by
removing the spaces.
Christian
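To illustrate the first option (training on text with punctuation removed so that transitions into and out of spaces are counted), here is a rough sketch of a character-pair model. It is not the score_my.py script from this thread; the names clean, train, score and CORPUS, the add-one smoothing, and the tiny corpus are all assumptions made for the example.

```python
import math
import re
from collections import Counter

ALPHABET_SIZE = 27  # a-z plus the space character

def clean(text):
    """Lower-case, drop everything but letters and whitespace, collapse spaces."""
    text = re.sub(r"[^a-z\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def train(corpus):
    """Count character-pair transitions and how often each first character occurs."""
    text = clean(corpus)
    pairs = Counter(zip(text, text[1:]))
    totals = Counter(text[:-1])
    return pairs, totals

def score(name, model):
    """Mean log-probability per transition, using add-one smoothing."""
    pairs, totals = model
    text = clean(name)
    if len(text) < 2:
        return float("-inf")  # nothing to score
    logp = 0.0
    for a, b in zip(text, text[1:]):
        logp += math.log((pairs[(a, b)] + 1) / (totals[a] + ALPHABET_SIZE))
    return logp / (len(text) - 1)

# Toy training text (illustrative only; a long corpus is what is suggested above).
CORPUS = ("The baby lions were at play on Saturday morning. "
          "It seemed impossible, but the fork was gone.")
model = train(CORPUS)

for name in ["baby lions at play", "xy39mGWbosjY"]:
    print(f"{score(name, model):6.2f}  {name}")
```

Because clean() also strips digits and underscores from the names being scored, this variant judges only the letters and spaces. The second option, removing the spaces, would amount to deleting them in clean() for both the corpus and the names.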
On 21.12.15 at 11:36, Steven D'Aprano wrote:
On Mon, 21 Dec 2015 08:56 pm, Christian Gollwitzer wrote:
Apfelkiste:Tests chris$ python score_my.py
-8.74 baby lions at play
-7.63 saturday_morning12
-6.38 Fukushima
-5.72 ImpossibleFork
-10.6 xy39mGWbosjY
-12.9 9sjz7s8198ghwt
-12.1 rz4sdko-
On Mon, 21 Dec 2015 08:56 pm, Christian Gollwitzer wrote:
> Apfelkiste:Tests chris$ python score_my.py
> -8.74 baby lions at play
> -7.63 saturday_morning12
> -6.38 Fukushima
> -5.72 ImpossibleFork
> -10.6 xy39mGWbosjY
> -12.9 9sjz7s8198ghwt
> -12.1 rz4sdko-28dbRW00u
> Apfelkiste:Tests chri
On 21.12.15 at 09:24, Peter Otten wrote:
Steven D'Aprano wrote:
I have a large number of strings (originally file names) which tend to
fall into two groups. Some are human-meaningful, but not necessarily
dictionary words e.g.:
baby lions at play
saturday_morning12
Fukushima
ImpossibleFork
On Monday 21 December 2015 15:22, Chris Angelico wrote:
> On Mon, Dec 21, 2015 at 2:01 PM, Steven D'Aprano wrote:
>> I have a large number of strings (originally file names) which tend to
>> fall into two groups. Some are human-meaningful, but not necessarily
>> dictionary words e.g.:
[...]
>
On Monday 21 December 2015 14:45, Ben Finney wrote:
> Steven D'Aprano writes:
>
>> Let's call the second group "random" and the first "non-random",
>> without getting bogged down into arguments about whether they are
>> really random or not.
>
> I think we should discuss it, even at risk of get
Steven D'Aprano wrote:
> I have a large number of strings (originally file names) which tend to
> fall into two groups. Some are human-meaningful, but not necessarily
> dictionary words e.g.:
>
>
> baby lions at play
> saturday_morning12
> Fukushima
> ImpossibleFork
>
>
> (note that some use u
On Mon, Dec 21, 2015 at 2:01 PM, Steven D'Aprano wrote:
> I have a large number of strings (originally file names) which tend to fall
> into two groups. Some are human-meaningful, but not necessarily dictionary
> words e.g.:
>
>
> baby lions at play
> saturday_morning12
> Fukushima
> ImpossibleFor
Steven D'Aprano writes:
> Let's call the second group "random" and the first "non-random",
> without getting bogged down into arguments about whether they are
> really random or not.
I think we should discuss it, even at risk of getting bogged down. As
you know better than I, “random” is not an
I have a large number of strings (originally file names) which tend to fall
into two groups. Some are human-meaningful, but not necessarily dictionary
words e.g.:
baby lions at play
saturday_morning12
Fukushima
ImpossibleFork
(note that some use underscores, others spaces, and some CamelCase) w