Re: Crack-proofing the Scratch Lottery Code [long]

labwork07 Wed, 09 Feb 2011 06:02:34 -0800

But that doesn't help me win the lottery :-)

On Feb 9, 2011 6:21am, Ken Wesson <kwess...@gmail.com> wrote:

It's likely that many of you have read, or at least encountered links

to, this Wired article recently:

http://www.wired.com/magazine/2011/01/ff_lottery/all/1

In it, a particular tic-tac-toe themed scratch lottery is examined in

detail. A mathematician, Mohan Srivastava, discovered weaknesses in it

and many similar lotteries. The article claims:

> Fundamentally, he believes that creating impregnable tickets is

> extremely difficult, if not impossible. “There is nothing random about

> the lottery,” he says. “In reality, everything about the game has been

> carefully designed to control payouts and entice the consumer.” Of

> course, these elaborate design elements mean that the ticket can be

> undesigned, that the algorithm can be reverse-engineered. The veneer of

> chance can be peeled away.

I had my doubts whether it was impossible to make it secure while

still controlling payouts so I took a stab at writing my own algorithm

to generate tic-tac-toe cards that follow the same rules as the ones

depicted in the Wired article.

Particularly, the player gets 24 hidden numbers, selected from the

numbers between 1 and 39 inclusive, and eight tic-tac-toe boards with

exposed numbers. No board has two of the same number; no board has

more than one winning row, column, or diagonal; and all of the rows,

columns, and diagonals on the board have the same prize values.

My thought was to use a two stage process.

First, generate the part of the ticket visible without scratchinganything.

Then, generate the 24 hidden numbers, in such a way that the same

visible-part can be made a winner or a loser.

Of course it's a bit more complex than just that. The 8 visible

tic-tac-toe grids cannot be truly random without it being impossible

to get a desired statistical distribution of prizes. So the grid

generation has to be constrained somewhat. Eventually I came up with a

program that produces visible-parts that the user can, with some

difficulty, assort into five distinguishable species with three

subspecies each, but I tuned the hidden-number algorithm so that the

average payout for all combinations is roughly a dollar (with the

ticket price being three, and also the minimum non-zero prize, so the

house should make about two bucks a ticket on average, coming out in

the black). Therefore the user cannot game the distinguishable species

and subspecies to cherry-pick any subset of tickets with a

statistically significantly higher expected payout. In particular they

can expect to lose a couple of bucks on each ticket no matter what

numbers they see in the uncovered part of that ticket.

Here is a sample ticket generated with this program by calling(create-ticket):

[(3 4 6 11 13 14 15 16 17 18 19 20 ; hidden numbers

21 24 27 28 29 30 32 33 34 35 37 39)

(((37 33 10)

(35 25 17) ; upper left tic-tac-toe grid

(26 15 3))

((10 3 37)

(17 25 33) ; upper right grid

(34 15 26))

((23 18 39)

(14 24 5) ; second-to-top left

(22 31 27))

((37 3 10)

(33 25 17)

(26 15 27))

((6 13 7)

(20 1 2)

(8 11 30))

((37 33 10)

(29 25 17)

(26 15 35))

((28 34 38)

(36 19 30) ; lower left

(13 9 12))

((6 18 7)

(4 1 2) ; lower right

(8 11 34)))]

There are a few patterns that might leap out at more mathematically

inclined readers; certain of the number positions, such as the center

and upper corners, have only four possible values with two of them

more common than the other two. This is unavoidable but it is also the

case that these patterns cannot be used to guess the prize amount

without knowing the hidden numbers. In fact the most that can be

determined from them is whether it has about a 1 in 3 chance of being

a $3 winner and a 2 in 3 chance of losing, an almost certain loser

with a 1% chance of winning $100, or able to generate a variety of

prizes other than $3 and averaging $1 per ticket. The most serious

weakness here is that a thrill-seeking player can, in theory, zero in

on roughly half of the tickets as having no chance to generate the

big-money prizes ($250 and up) and avoid those. They can't, however,

game these tickets to alter substantially their expected loss of $2

per ticket bought.

The above has three discernible groups of grids: four with 25 in the

center, two with mostly smaller numbers, and two more. This indicates

that it's a species likely to produce 0, 1, or 2 winning grids and

very unlikely to produce more; the bottom left corner is one of the

odd ones and the top right isn't so it probably belongs to the

varied-prizes subspecies capable of generating the big $50,000 maximum

prize. The odds of a ticket like this one winning $50,000 are

minuscule, however.

Virtually "scratching" this ticket and matching up the numbers with

the hit-grids function reveals:

([[true true false] ; XX .

[true false true ] ; X . X

[false true true ]] ; . XX

[[false true true ] ; . XX

[true false true ] ; X . X

[true true false]] ; XX .

[[false true true ] ; . XX

[true true false] ; XX .

[false false true ]] ; . . X

[[true true false] ; XX .

[true false true ] ; X . X

[false true true ]] ; . XX

[[true true false] ; XX .

[true false false] ; X . .

[false true true ]] ; . XX

[[true true false] ; XX .

[true false true ] ; X . X

[false true true ]] ; . XX

[[true true false] ; XX .

[false true true ] ; . XX

[true false false]] ; X . .

[[true true false] ; XX .

[true false false] ; X . .

[false true true ]]); . XX

Like about 8 out of 9 of the tickets this code generates, ours is not awinner.

Below is the code that generated this, plus a small suite of simple

tests, and a detailed discussion of it all. To understand the code and

its discussion, it is recommended that you visit the link above and

familiarize yourself with the appearance of the tic-tac-toe tickets,

particularly the geometry of the potential winning lines and

corresponding prize amounts.

This section is fairly technical so you may want to skip to the

concluding remarks at the bottom of this post with control-end and

page-up now. You can also control-F for

sample test result

to get there quickly. That phrase occurs at the start of the

conclusion and nowhere between here and there. In Firefox and

Thunderbird, triple-clicking the isolated line above will select it

for quick copy/paste.

The first part of the code is a set of utility functions centered

around randomizing various things: making shuffled vectors from seqs,

uniformly picking a random element from a coll, picking randomly from

a set of alternatives with specified probability weightings, and

generating a boolean that has a 1 in N chance of being true. That last

is mainly used in ifs: (if (one-in N) (do-rare-thing)

(do-common-thing)); (one-in 2) generates a coin flip boolean and

(one-in nil) is constantly false, so nil can be used as "infinity"

here, eg

(if (one-in

(cond

(some-test) 1 ; always

(another-test) 2 ; half the time

(test-no-3) 100 ; rarely

:otherwise)) nil ; never

(do-something))

All of these use the rand-int function and none of the rest of the

code directly uses the RNG. These functions are thus the one area that

needs modifying to drop in another RNG implementation, which could be

done with a replacement for rand-int that calls a custom

implementation that, say, stores its state in an atom held in a global

var. This allows replacement of the default RNG with something like,

say, a Mersenne Twister or an entropy pool or even, hardware

permitting, a quantum source of genuine randomness.

(defn shuffle [ss]

(let [s (count ss)]

(loop [out (transient (vec ss)) n (dec s)]

(if (zero? n)

(persistent! out)

(let [i (rand-int (inc n))

t (out n)]

(recur

(assoc!

(assoc! out n (out i))

it)

(dec n)))))))

(defn rand-elt [s]

(nth s (rand-int (count s))))

(defn rand-elt-fmap [fmap]

(let [ftot (reduce + (vals fmap))

n (rand-int ftot)]

(loop [i 0 k nil fmap (seq fmap)]

(if (
k

(let [f (first fmap)]

(recur (+ i (val f)) (key f) (next fmap)))))))

(defn one-in [n]

(and n (= (rand-int n) 0)))

The next part of the code generates "templates", which are

three-element vectors whose final element is the visible eight

tic-tac-toe grids. That's a list of eight lists of three lists of

three integers. The other two parts are used by the hidden number

generator. Importantly, the latter can make a winner or a loser out of

any ticket. The only thing different across the different ticket

"species" is the number of possibly-winning grids: one species will

fairly often have one possibly-winning grid and rarely have 7, another

sometimes 2 and rarely 6, another occasionally 3 and rarely 5, another

infrequently 4, and another rarely 8. These are successively rarer, as

well as having increasing probabilities of losing. Each has three

subspecies which are subtler to spot but produce more distinct prize

distribution differences -- one produces rare $100 wins, one common $3

wins, and one wins of various amounts except $3 at various

frequencies. All species and subspecies have about the same expected

total prize amount, though -- roughly $1.

(def t1

[[:za :ca :xc]

[:cb :xx :ys]

[:yc :zb :cc]])

(def t2

[[:xc :ca :za]

[:ys :xx :cb]

[:cc :zb :yc]])

(def rarities [t1 t2 t1 t1 t1 t1 t2 t1])

(defn create-template []

(let [nums (shuffle (range 1 40))

cs (subvec nums 0 15)

xs (subvec nums 15 23)

ys (subvec nums 23 31)

zs (subvec nums 31)

xx1s (subvec xs 0 2)

xx2s (subvec xs 2 4)

xc1s (subvec xs 4 6)

xc2s (subvec xs 6)

yc1s (subvec ys 0 2)

yc2s (subvec ys 2 4)

ys1s (subvec ys 4 6)

ys2s (subvec ys 6)

za1s (subvec zs 0 2)

zb1s (subvec zs 2 4)

za2s (subvec zs 4 6)

zb2s (subvec zs 6)

grid-nums (map #(+ (* 2 %1) %2)

(shuffle

(rand-elt-fmap

{[0 0 0 0 0 0 0 0] 1

[0 0 0 0 0 0 0 1] 100

[0 0 0 0 0 0 1 1] 50

[0 0 0 0 0 1 1 1] 20

[0 0 0 0 1 1 1 1] 5

[0 0 0 1 1 1 1 1] 20

[0 0 1 1 1 1 1 1] 50

[0 1 1 1 1 1 1 1] 100

[1 1 1 1 1 1 1 1] 1}))

(repeatedly 8 #(rand-int 2)))

card (map

(fn [grid num]

(let [twos (> num 1)

m (mod num 2)

xxs (if twos xx2s xx1s)

xcs (if twos xc2s xc1s)

ycs (if twos yc2s yc1s)

yss (if twos ys2s ys1s)

zas (if twos za2s za1s)

zbs (if twos zb2s zb1s)

cs (shuffle cs)

ca (cs 0)

cb (cs 1)

cc (cs 2)]

(map

(fn [row]

(map

(fn [cell]

(condp = cell

:ca ca

:cb cb

:cc cc

:za (zas m)

:zb (zbs m)

:xx (xxs m)

:xc (xcs m)

:yc (ycs m)

:ys (yss m)))

row))

grid)))

rarities grid-nums)]

[nums grid-nums card]))

The template generator takes a random permutation of 1 to 39 and

partitions it into 15 common numbers, which will be among the hidden

numbers on all tickets generated from that template, and 24 controlled

numbers, only 9 of which will appear on each ticket and which are

subdivided into three sets of 8, called xs, ys, and zs, which are each

further categorized in three ways: "ones" versus "twos" and "parity",

about which see below, and two possible positions within grids: for

the xs, center or corner, for the ys, side or corner, and for the zs,

a or b.

The t1 and t2 templates are grid templates with the numbers replaced

by categories: za and zb for the zs, xx (center) and xc (corner) for

the xs, and yc (corner) and ys (side) for the ys, as well as three

cells marked ca, cb, and cc which can get arbitrary common numbers.

These templates both have the property that every winning line must

cross either an x and ay, an x and az, or ay and a z. Further, one

of the diagonals (which generate the most valuable prizes) requires

both xs and a y. This feature is used to control the prize

distribution at the coarsest level and to make it possible to make

both losing and winning tickets based on any template this function

can emit.

The rarities template is mostly t1s with two t2s. The t2 template is

simply a mirror image of the t1 template and the rarities template

arranges things so the four biggest prizes, the two $50,000s and the

two $1000s, are xxy diagonals. These diagonals will be made very

rare.

The template generator first shuffles 1 to 39 and partitions the

result; these numbers, in this sequence, will be passed to the

hidden-number generator as the first element of the template vector.

Then it generates the second part: a seq of 8 numbers between 0 and 3.

Each grid is assigned one of these numbers. It's actually a bitfield

of sorts: the high bit divides the grids into "ones" and "twos" and

the low bit gives some of them "even" parity and some "odd". Note that

the various types of xs, ys, and zs come in pairs; the parity is used

sometimes to prevent certain winning lines from occurring in

hidden-number generation. Depending on a grid's parity, only the

zeroth or only the first elements of those vectors make it into that

grid. Similarly, there are xs that only occur in "ones" and xs that

only occur in "twos". So, the 0 1 2 3 number of a grid determines

which of four sets of controlled numbers occur on that grid.

Further, the distribution of "ones" and "twos" is skewed so most cards

will have one of one type and seven of the other, fewer will have two

of the rarer type, fewer still will have three, fewer still will have

four of each, and only rarely will a card have all of the same type.

(These produce the five distinguishable species). As you might have

guessed, the hidden-number generator, by choosing from among the xs

and ys and zs appropriately, will make only the "ones" or only the

"twos" eligible to win, and will almost always choose the rarer of the

two. This means most winning cards have a single win, fewer have two,

and so on. The user might try to game that by picking a card with a

less skewed number distribution (or, if he's lucky enough, one with a

lot of non-occurring numbers) hoping for four or even eight winning

grids, but the hidden-number generator compensates by making more of

those cards pure losers than of cards that have the more skewed

distributions.

The card itself just uses random common numbers along with the

grid-number (0 to 3) determined xs, ys, and zs to fill in each of the

t1 or t2 templates.

The output vector is the shuffling of the range 1 to 39, the

grid-nums, and the exposed part of the card.

(defn ifilt [coll idx]

(if idx [(nth coll idx)] coll))

This is a utility function used when generating the hidden numbers. It

takes a coll and an index and if index is nil, returns the coll,

otherwise returns a one-element vector of the coll's element at that

index. So, it can be used to selectively narrow a coll down to one

element or not. In practice, it gets used on two-element collections

(specific pairs of xs, ys, or zs) with either 0, 1 or nil (so, a

parity or nil) as idx to force parity 0, force parity 1, or allow both

parities. It plays a role in dealing with a complex set of possible

winning cards.

(defn pick-player-nums [[nums grid-nums _]]

(let [cs (subvec nums 0 15)

xs (subvec nums 15 23)

ys (subvec nums 23 31)

zs (subvec nums 31)

xx1s (subvec xs 0 2)

xx2s (subvec x


--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Crack-proofing the Scratch Lottery Code [long]

Reply via email to