optimization of rule-based model on discrete variables
Hi, I have, say 10 variables (x1 ... x10) which can assume discrete finite values, for instance [0,1 or 2]. I need to build a set of rules, such as: 1) if x1==0 and x2==1 and x10==2 then y = 1 2) if x2==1 and x3==1 and x4==2 and x6==0 then y = 0 3) if x2==0 and x3==1 then y = 2 4) if x6==0 and x7==2 then y = 0 ... ... (actually it can be seen as a decision tree classifier). y can assume the same discrete value [0,1 or 2] I don't know a-priori anything about the number of rules and the combinations of the tested inputs. Given a dataset of X={(x1... x10)} I can calculate Y=f(X) where f is this rule-based function. I know an operator g that can calculate a real value from Y: e = g(Y) g is too complex to be written analytically. I would like to find a set of rules f able to minimize e on X. I know the problem can become NP-hard, but I would be fine also with a suboptimal solution. What's the best way to approach the problem? In case, does something already exist in python? thank you -- https://mail.python.org/mailman/listinfo/python-list
Re: optimization of rule-based model on discrete variables
Il Mon, 14 Jun 2021 19:39:17 +1200, Greg Ewing ha scritto: > On 14/06/21 4:15 am, Elena wrote: >> Given a dataset of X={(x1... x10)} I can calculate Y=f(X) where f is >> this rule-based function. >> >> I know an operator g that can calculate a real value from Y: e = g(Y) >> g is too complex to be written analytically. >> >> I would like to find a set of rules f able to minimize e on X. > > There must be something missing from the problem description. > From what you've said here, it seems like you could simply find > a value k for Y that minimises g, regardless of X, and then f would > consist of a single rule: y = k. > > Can you tell us in more concrete terms what X and g represent? I see what you mean, so I try to explain it better: Y is a vector say [y1, y2, ... yn], with large (n>>10), where yi = f(Xi) with Xi = [x1i, x2i, ... x10i] 1<=i<=n. All yi and xji assume discrete values. I already have a dataset of X={Xi} and would like to find the rules f able to minimize a complicated-undifferenciable Real function g(f(X)). Hope this makes more sense. x1...x10 are 10 chemical components that can be absent (0), present (1), modified (2). yi represent a quality index of the mixtures and g is a global quality of the whole process. Thank you in advance ele -- https://mail.python.org/mailman/listinfo/python-list
Re: optimization of rule-based model on discrete variables
Il Tue, 15 Jun 2021 10:40:05 +1200, Greg Ewing ha scritto: > On 15/06/21 12:51 am, Elena wrote: > Hmmm, so the problem breaks down into two parts: > (1) find a vector Y that minimises g (2) find a set of rules that will > allow you to predict each component of Y from its corresponding X values > > Is that right? Correct, the split can be an interesting approach. > > I ztill don't really understand. What are you going to do with this > function f once you have it? > > I would have thought the idea was that if someone gives you a new > mixture X[n+1] you can use f to predict how well it will work. > But that will just give you a y[n+1], and it's not clear what to do with > that. Do you append it to Y and feed an n+1 component vector into g? > > I think I still need more information about the underlying problem > before I can help you much. After the optimization, I will use f just to predict new Xi. Thanks -- https://mail.python.org/mailman/listinfo/python-list
Re: optimization of rule-based model on discrete variables
Il Tue, 15 Jun 2021 01:53:09 +, Martin Di Paola ha scritto: > From what I'm understanding it is an "optimization problem" like the > ones that you find in "linear programming". > > But in your case the variables are not Real (they are Integers) and the > function to minimize g() is not linear. > > You could try/explore CVXPY (https://www.cvxpy.org/) which it's a solver > for different kinds of "convex programming". I don't have experience > with it however. > > The other weapon in my arsenal would be Z3 > (https://theory.stanford.edu/~nikolaj/programmingz3.html) which it's a > SMT/SAT solver with a built-in extension for optimization problems. > > I've more experience with this so here is a "draft" of what you may be > looking for. > > > from z3 import Integers, Optimize, And, If > > # create a Python array X with 3 Z3 Integer variables named x0, x1, x2 X > = Integers('x0 x1 x2') > Y = Integers('y0 y1') > > # create the solver solver = Optimize() > > # add some restrictions like lower and upper bounds for x in X: >solver.add(And(0 <= x, x <= 2)) # each x is between 0 and 2 > for y in Y: >solver.add(And(0 <= y, y <= 2)) > > def f(X): ># Conditional expression can be modeled too with "If" ># These are *not* evaluated like a normal Python "if" but # modeled >as a whole. It'll be the solver which will "run it" >return If( > And(x[0] == 0, x[1] == 0), # the condition Y[0] == 0, # Y[0] will > *must* be 0 *if* the condition holds Y[0] == 2 # Y[0] will *must* > be 2 *if* the condition doesn't hold ) > > solver.add(f(X)) > > # let's define the function to optimize g = Y[0]**2 solver.maximize(g) > > # check if we have a solution solver.check() # this should return 'sat' > > # get one of the many optimum solutions solver.model() > > > I would recommend you to write a very tiny problem with 2 or 3 variables > and a very simple f() and g() functions, make it work (manually and with > Z3) and only then build a more complex program. > > You may find useful (or not) these two posts that I wrote a month ago > about Z3. These are not tutorials, just personal experience with a > concrete example. > > Combine Real, Integer and Bool variables: > https://book-of-gehn.github.io/articles/2021/05/02/Planning-Space- Missions.html > > Lookup Tables (this may be useful for programming a f() "variable" > function where the code of f() (the decision tree) is set by Z3 and not > by you such f() leads to the optimum of g()) > https://book-of-gehn.github.io/articles/2021/05/26/Casting-Broadcasting- LUT-and-Bitwise-Ops.html > > > Happy hacking. > Martin. > > Interesting, I completely didn't know about this Z3 tool, I'll try to go into that. Thank you for hint. BTW the first two links I think are broken. Ele -- https://mail.python.org/mailman/listinfo/python-list
Re: optimization of rule-based model on discrete variables
Il Wed, 16 Jun 2021 11:37:42 +1200, Greg Ewing ha scritto: > On 15/06/21 10:07 pm, Elena wrote: >> After the optimization, I will use f just to predict new Xi. > > So you're going to use f backwards? > > I don't see how that will work. Where are you going to find a new yi to > feed into the inverse of f? > > I think I don't understand what role g plays in all of this. If the > ultimate goal is to find a better mixture, > you need some kind of figure of merit for an individual mixture. But you > don't have that, you only have this thing g that somehow depends on all > of your mixtures at once. > > I'm still not seeing the big picture. sorry I wrote it wrongly, my bad, I will use f just to predict yi from new coming Xi. -- https://mail.python.org/mailman/listinfo/python-list