optimization of rule-based model on discrete variables

2021-06-14 Thread Elena via Python-list
Hi, I have, say 10 variables (x1 ... x10) which can assume discrete finite 
values, for instance [0,1 or 2].
I need to build a set of rules, such as:

1) if x1==0 and x2==1 and x10==2 then y = 1
2) if x2==1 and x3==1 and x4==2 and x6==0 then y = 0
3) if x2==0 and x3==1 then y = 2
4) if x6==0 and x7==2 then y = 0
...
...
(actually it can be seen as a decision tree classifier).

y can assume the same discrete value [0,1 or 2]
I don't know a-priori anything about the number of rules and the 
combinations of the tested inputs.

Given a dataset of X={(x1... x10)} I can calculate Y=f(X) where f is this 
rule-based function.

I know an operator g that can calculate a real value from Y: e = g(Y)
g is too complex to be written analytically.

I would like to find a set of rules f able to minimize e on X.

I know the problem can become NP-hard, but I would be fine also with a 
suboptimal solution.

What's the best way to approach the problem?
In case, does something already exist in python?


thank you
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: optimization of rule-based model on discrete variables

2021-06-14 Thread Elena via Python-list
Il Mon, 14 Jun 2021 19:39:17 +1200, Greg Ewing ha scritto:

> On 14/06/21 4:15 am, Elena wrote:
>> Given a dataset of X={(x1... x10)} I can calculate Y=f(X) where f is
>> this rule-based function.
>> 
>> I know an operator g that can calculate a real value from Y: e = g(Y)
>> g is too complex to be written analytically.
>> 
>> I would like to find a set of rules f able to minimize e on X.
> 
> There must be something missing from the problem description.
>  From what you've said here, it seems like you could simply find
> a value k for Y that minimises g, regardless of X, and then f would
> consist of a single rule: y = k.
> 
> Can you tell us in more concrete terms what X and g represent?

I see what you mean, so I try to explain it better: Y is a vector say [y1, 
y2, ... yn], with large (n>>10), where yi = f(Xi) with Xi = [x1i, x2i, ... 
x10i] 1<=i<=n. All yi and xji assume discrete values.

I already have a dataset of X={Xi} and would like to find the rules f able 
to minimize a complicated-undifferenciable Real function g(f(X)).
Hope this makes more sense.

x1...x10 are 10 chemical components that can be absent (0), present (1), 
modified (2). yi represent a quality index of the mixtures and g is a 
global quality of the whole process.

Thank you in advance

ele  
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: optimization of rule-based model on discrete variables

2021-06-15 Thread Elena via Python-list
Il Tue, 15 Jun 2021 10:40:05 +1200, Greg Ewing ha scritto:

> On 15/06/21 12:51 am, Elena wrote:
> Hmmm, so the problem breaks down into two parts:
> (1) find a vector Y that minimises g (2) find a set of rules that will
> allow you to predict each component of Y from its corresponding X values
> 
> Is that right?

Correct, the split can be an interesting approach.

> 
> I ztill don't really understand. What are you going to do with this
> function f once you have it?
> 
> I would have thought the idea was that if someone gives you a new
> mixture X[n+1] you can use f to predict how well it will work.
> But that will just give you a y[n+1], and it's not clear what to do with
> that. Do you append it to Y and feed an n+1 component vector into g?
> 
> I think I still need more information about the underlying problem
> before I can help you much.

After the optimization, I will use f just to predict new Xi.

Thanks

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: optimization of rule-based model on discrete variables

2021-06-15 Thread Elena via Python-list
Il Tue, 15 Jun 2021 01:53:09 +, Martin Di Paola ha scritto:

> From what I'm understanding it is an "optimization problem" like the
> ones that you find in "linear programming".
> 
> But in your case the variables are not Real (they are Integers) and the
> function to minimize g() is not linear.
> 
> You could try/explore CVXPY (https://www.cvxpy.org/) which it's a solver
> for different kinds of "convex programming". I don't have experience
> with it however.
> 
> The other weapon in my arsenal would be Z3
> (https://theory.stanford.edu/~nikolaj/programmingz3.html) which it's a
> SMT/SAT solver with a built-in extension for optimization problems.
> 
> I've more experience with this so here is a "draft" of what you may be
> looking for.
> 
> 
> from z3 import Integers, Optimize, And, If
> 
> # create a Python array X with 3 Z3 Integer variables named x0, x1, x2 X
> = Integers('x0 x1 x2')
> Y = Integers('y0 y1')
> 
> # create the solver solver = Optimize()
> 
> # add some restrictions like lower and upper bounds for x in X:
>solver.add(And(0 <= x, x <= 2)) # each x is between 0 and 2
> for y in Y:
>solver.add(And(0 <= y, y <= 2))
> 
> def f(X):
># Conditional expression can be modeled too with "If"
># These are *not* evaluated like a normal Python "if" but # modeled
>as a whole. It'll be the solver which will "run it"
>return If(
>  And(x[0] == 0, x[1] == 0),  # the condition Y[0] == 0,  # Y[0] will
>  *must* be 0 *if* the condition holds Y[0] == 2   # Y[0] will *must*
>  be 2 *if* the condition doesn't hold )
> 
> solver.add(f(X))
> 
> # let's define the function to optimize g = Y[0]**2 solver.maximize(g)
> 
> # check if we have a solution solver.check() # this should return 'sat'
> 
> # get one of the many optimum solutions solver.model()
> 
> 
> I would recommend you to write a very tiny problem with 2 or 3 variables
> and a very simple f() and g() functions, make it work (manually and with
> Z3) and only then build a more complex program.
> 
> You may find useful (or not) these two posts that I wrote a month ago
> about Z3. These are not tutorials, just personal experience with a
> concrete example.
> 
> Combine Real, Integer and Bool variables:
> https://book-of-gehn.github.io/articles/2021/05/02/Planning-Space-
Missions.html
> 
> Lookup Tables (this may be useful for programming a f() "variable"
> function where the code of f() (the decision tree) is set by Z3 and not
> by you such f() leads to the optimum of g())
> https://book-of-gehn.github.io/articles/2021/05/26/Casting-Broadcasting-
LUT-and-Bitwise-Ops.html
> 
> 
> Happy hacking.
> Martin.
> 
> 

Interesting, I completely didn't know about this Z3 tool, I'll try to go  
into that.
Thank you for hint.
BTW the first two links I think are broken.

Ele
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: optimization of rule-based model on discrete variables

2021-06-16 Thread Elena via Python-list
Il Wed, 16 Jun 2021 11:37:42 +1200, Greg Ewing ha scritto:

> On 15/06/21 10:07 pm, Elena wrote:
>> After the optimization, I will use f just to predict new Xi.
> 
> So you're going to use f backwards?
> 
> I don't see how that will work. Where are you going to find a new yi to
> feed into the inverse of f?
> 
> I think I don't understand what role g plays in all of this. If the
> ultimate goal is to find a better mixture,
> you need some kind of figure of merit for an individual mixture. But you
> don't have that, you only have this thing g that somehow depends on all
> of your mixtures at once.
> 
> I'm still not seeing the big picture.

sorry I wrote it wrongly, my bad, I will use f just to predict yi from new 
coming Xi.

-- 
https://mail.python.org/mailman/listinfo/python-list