On Sat, Nov 21, 2020 at 10:41:31PM +0800, Qian Yun wrote:
> OK, here are some of my preliminary findings on this deep
> learning system (I'll refer to it as DL):
> 
> (The tests were done with a beam size of 10, which means the DL
> system gives the 10 answers that it deems most likely.)
> (And I use the official FWD + BWD + IBP trained model.)
> 
> 1. It doesn't handle large numbers very well.

The paper said that for training they used numbers up to 5
in random expressions.  Differentiation and arithmetic
simplification may produce larger numbers, but clearly
large numbers go beyond the training set.  Also, in the
past there were works suggesting that arithmetic is
hard for ANNs.  OTOH we do not need DL for arithmetic,
and IMO the use of DL for integration is mostly independent
of arithmetic.

> 
> For example, to integrate "x**1678", its answers are
> 
> -0.05505  NO  x**169/169
> -0.08730  NO  x**1685/1685
> -0.10008  NO  x**1681/1681
> -0.21394  NO  x**1689/1689
> -0.25264  NO  x**1687/1687
> -0.25288  NO  x**1688/1681
> -0.28164  NO  x**1678/1678
> -0.28320  NO  x**1678/1679
> -0.29745  NO  x**1684/1684
> -0.31267  NO  x**1678/1685
> 
> This example is testing DL's understanding of pattern
> "integration of x^n is x^(n+1)/(n+1)".
> 
> This result seems to show that DL understands the pattern but fails
> to compute "n+1" for some not-so-large n.
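
For reference, the power-rule case is trivial to cross-check with sympy (a small sketch, assuming a stock sympy install):

```python
# Verify the power rule that the beam search keeps missing:
# integral of x**n is x**(n+1)/(n+1).
from sympy import symbols, integrate, diff, simplify

x = symbols('x')

# The correct answer for the example above:
assert integrate(x**1678, x) == x**1679 / 1679

# The highest-scoring beam candidate does not differentiate back
# to the integrand:
candidate = x**169 / 169
assert simplify(diff(candidate, x) - x**1678) != 0
```

This is also how such candidates can be graded mechanically: differentiate and compare with the integrand.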
> 
> 2. DL may give correct results that contain strange constants.
> 
> For example, to integrate "x**2", its answers are
> 
> -0.25162  OK  x**3*(1/cos(2) + 1)/(6*(1/(2*cos(2)) + 1/2))
> -0.25220  OK  x**3*(1 + 1/cos(1))/(6*(1/2 + 1/(2*cos(1))))
> -0.25304  OK  x**3*(1 + 1/sin(2))/(6*(1/2 + 1/(2*sin(2))))
> -0.25324  OK  x**3*(1 + 1/sin(1))/(6*(1/2 + 1/(2*sin(1))))
> -0.25458  OK  x**3*(1/tan(1) + 1)/(15*(1/(5*tan(1)) + 1/5))
> -0.25508  OK  x**3*(1 + log(1024))/(15*(1/5 + log(1024)/5))
> -0.25525  OK  x**3*(1/tan(2) + 1)/(15*(1/(5*tan(2)) + 1/5))
> -0.25647  OK  x**3*(1 + 1/cos(1))/(15*(1/5 + 1/(5*cos(1))))
> -0.25774  OK  x**3*(1 + 1/sin(1))/(15*(1/5 + 1/(5*sin(1))))
> -0.28240  OK  x**3*(log(2) + 1)/(15*(log(2)/5 + 1/5))
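
These "strange constants" all cancel out; a sympy sketch (assuming a stock install) confirms that the first candidate is just x**3/3 in disguise:

```python
# The top beam answer for integrate(x**2): a correct result wrapped in
# a needless constant factor that simplifies to 1/3.
from sympy import symbols, cos, simplify, Rational

x = symbols('x')

ans = x**3 * (1/cos(2) + 1) / (6 * (1/(2*cos(2)) + Rational(1, 2)))
assert simplify(ans - x**3/3) == 0
```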
> 
> 3. DL doesn't understand multiplication very well.
> 
> For example, to integrate "19*sin(x/17)", its answers are
> 
> -0.12595  NO  -365*cos(x/17)
> -0.12882  NO  -373*cos(x/17)
> -0.14267  NO  -361*cos(x/17)
> -0.14314  NO  -357*cos(x/17)
> -0.18328  NO  -353*cos(x/17)
> -0.20499  NO  -377*cos(x/17)
> -0.21484  NO  -352*cos(x/17)
> -0.25740  NO  -369*cos(x/17)
> -0.26029  NO  -359*cos(x/17)
> -0.26188  NO  -333*cos(x/17)
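
By the chain rule the correct coefficient is 19 * 17 = 323, which none of the candidates hits; a quick sympy cross-check (a sketch, assuming a stock install):

```python
# The correct antiderivative of 19*sin(x/17) is -323*cos(x/17);
# the beam candidates range over -333 .. -377 and all miss it.
from sympy import symbols, sin, cos, integrate, simplify

x = symbols('x')

result = integrate(19*sin(x/17), x)
assert simplify(result + 323*cos(x/17)) == 0
```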
> 
> 4. DL doesn't handle long expressions very well.
> 
> For example, to integrate
> 'sin(x)+cos(x)+exp(x)+log(x)+tan(x)+atan(x)+acos(x)+asin(x)',
> its answers are
> 
> -0.00262  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + sin(x) - cos(x)
> -0.07420  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + log(cos(x)) + sin(x) - cos(x)
> -0.10192  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + 2*sin(x) - cos(x)
> -0.10513  NO  x*log(x) + x*acos(x) + x*asin(x) + exp(x) - log(x**2 + 1)/2 - log(cos(x)) + sin(x) - cos(x)
> -0.10885  NO  x*log(x) + x*sin(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + sin(x) - cos(x)
> -0.10947  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + sin(x) - cos(x)
> -0.13657  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + log(exp(x) + 1) + sin(x) - cos(x)
> -0.16144  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) + log(x + 1) - log(x**2 + 1)/2 + sin(x) - cos(x)
> -0.16806  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + log(cos(x)) + sin(x) - cos(x)
> -0.19019  NO  x*log(x) + x*acos(x) + x*asin(x) - x + exp(x) - log(x**2 + 1)/2 + log(exp(asinh(x)) + 1) + sin(x) - cos(x)
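
A correct antiderivative can be assembled term by term and verified by differentiation; the sketch below (assuming a stock sympy install) also shows what every candidate above gets wrong: they all drop the x*atan(x) term. Note that the sqrt(1 - x**2) pieces from acos and asin cancel each other.

```python
from sympy import (symbols, sin, cos, exp, log, tan, atan, acos, asin,
                   diff, simplify)

x = symbols('x')

integrand = (sin(x) + cos(x) + exp(x) + log(x) + tan(x) + atan(x)
             + acos(x) + asin(x))

# Term-by-term antiderivative; the sqrt(1-x**2) contributions of
# acos and asin cancel, so none appears here.
F = (-cos(x) + sin(x) + exp(x) + x*log(x) - x - log(cos(x))
     + x*atan(x) - log(x**2 + 1)/2 + x*acos(x) + x*asin(x))

assert simplify(diff(F, x) - integrand) == 0
```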
> 
> 
> 5. For the FWD test set with 9986 integrals (which is generated by
> creating random expressions first, then solving them with sympy and
> discarding failures), FriCAS can solve 9980 out of 9986 in 71 seconds.
> Of the remaining 6 integrals, FriCAS can solve another 2 in under 100
> seconds, gives "implementation incomplete" for 2, and the remaining 2
> contain complex constants like "acos(acos(tan(3)))", which FriCAS can
> solve using another function.
> 
> The DL system can solve 95.6%; by comparison, FriCAS solves over 99.94%.
> 
> 6. The DL system is slow.  To solve the FWD test set, the DL system
> may use around 100 hours of CPU time.

You mean 10000 examples?  That would average 36 seconds
per example...  IIUC you ran on CPU; they probably got
a much shorter runtime on GPU.

> 7. For the BWD test set (which is generated by creating a random
> expression first, then taking its derivative as the integrand), FriCAS
> can solve roughly 95%, compared with DL's claimed 99.5%.  The paper
> says Mathematica can solve 84.0%; I'm a little skeptical about that.

I posted here a generator that attempted to match parameters
to the DL paper and got a 78% success rate.  That discovered a
few bugs, and the percentage should be higher now, but still much
lower than 95%.  So apparently they used easier examples
(several details in the paper were rather unclear and
I had to use my guesses).  I wonder how well DL would
do on examples from my generator?  In particular, the
paper does not mention simplification of examples.
Unsimplified derivatives tend to contain visible traces
of the primitive; after simplification the problem gets harder.
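
To illustrate that last point (a sympy sketch, assuming a stock install): the raw derivative of log(x + sqrt(x**2 + 1)) still contains the primitive's key subexpression, but simplification erases the trace.

```python
from sympy import symbols, sqrt, log, diff, simplify

x = symbols('x')

# A primitive whose raw derivative still "shows" the primitive:
F = log(x + sqrt(x**2 + 1))
raw = diff(F, x)      # (x/sqrt(x**2 + 1) + 1)/(x + sqrt(x**2 + 1))
assert raw.has(x + sqrt(x**2 + 1))

# After simplification the trace is gone and the problem is harder:
clean = simplify(raw)
assert not clean.has(x + sqrt(x**2 + 1))
assert simplify(clean - 1/sqrt(x**2 + 1)) == 0
```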

> 8. DL doesn't handle rational function integration very well.
> 
> It can handle '(x+1)^2/((x+1)^6+1)' but not its expanded form.
> 
> So DL can recognize patterns, but it really doesn't have insight.
> 
> Rational function integration can be handled well by the
> Lazard-Rioboo-Trager algorithm, while DL fails at many
> rational function integrals.
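
The unexpanded example above can be verified by hand via the substitution u = x + 1; a sympy sketch (assuming a stock install):

```python
from sympy import symbols, atan, diff, expand, simplify

x = symbols('x')

# Unexpanded form: the substitution u = x + 1 gives atan(u**3)/3.
integrand = (x + 1)**2 / ((x + 1)**6 + 1)
F = atan((x + 1)**3) / 3
assert simplify(diff(F, x) - integrand) == 0

# The expanded integrand is the same rational function with the
# pattern hidden -- this is the form DL fails on.
expanded = expand((x + 1)**2) / expand((x + 1)**6 + 1)
assert simplify(expanded - integrand) == 0
```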
> 
> So some of my comments from a year ago are correct:
> 
> "
> In fact, I doubt that this program can solve some rational function
> integrals that require the Lazard-Rioboo-Trager algorithm to get a
> simplified result.
> "
> 
> 9. DL doesn't handle algebraic function integration very well.
> 
> I have a list of algebraic functions that FriCAS can solve while
> other CASes can't; DL can't solve them either.
> 
> 10. For the harder mixed-case integrals, I have a list of
> integrals that FriCAS can't handle; DL can't solve them either.
> 
> - Best,
> - Qian
> 
> On 11/16/20 8:34 PM, Qian Yun wrote:
> > Hi guys,
> > 
> > I assume you all know the paper "DEEP LEARNING FOR SYMBOLIC MATHEMATICS"
> > by Facebook AI researchers, posted almost one year ago at
> > https://arxiv.org/abs/1912.01412
> > 
> > And the code was posted 8 months ago:
> > https://github.com/facebookresearch/SymbolicMathematics
> > 
> > Have you played with it?
> > 
> > Finally I have had some time recently and played with it for a while,
> > and I believe I found some flaws.  I will post my findings with
> > more details later.  It has been a really interesting experience.
> > 
> > If you have some spare time and want to have fun, I strongly
> > advise you to play with it and try to break it :-)
> > 
> > Tip: to run the Jupyter notebook example, apply the following
> > patch to run it on CPU instead of CUDA:
> > 
> > - Best,
> > - Qian
> > 
> > =====================
> > 
> > diff --git a/beam_integration.ipynb b/beam_integration.ipynb
> > index f9ef329..00754e3 100644
> > --- a/beam_integration.ipynb
> > +++ b/beam_integration.ipynb
> > @@ -64,6 +64,6 @@
> >      "\n",
> >      "    # model parameters\n",
> > -    "    'cpu': False,\n",
> > +    "    'cpu': True,\n",
> >      "    'emb_dim': 1024,\n",
> >      "    'n_enc_layers': 6,\n",
> >      "    'n_dec_layers': 6,\n",
> > diff --git a/src/model/__init__.py b/src/model/__init__.py
> > index 2b0a044..73ec446 100644
> > --- a/src/model/__init__.py
> > +++ b/src/model/__init__.py
> > @@ -38,7 +38,7 @@
> >      # reload pretrained modules
> >      if params.reload_model != '':
> >          logger.info(f"Reloading modules from {params.reload_model} ...")
> > -        reloaded = torch.load(params.reload_model)
> > +        reloaded = torch.load(params.reload_model, map_location=torch.device('cpu'))
> >          for k, v in modules.items():
> >              assert k in reloaded
> >              if all([k2.startswith('module.') for k2 in reloaded[k].keys()]):
> > diff --git a/src/utils.py b/src/utils.py
> > index bd90608..ef87582 100644
> > --- a/src/utils.py
> > +++ b/src/utils.py
> > @@ -25,7 +25,7 @@
> >  FALSY_STRINGS = {'off', 'false', '0'}
> >  TRUTHY_STRINGS = {'on', 'true', '1'}
> > 
> > -CUDA = True
> > +CUDA = False
> > 
> > 
> >  class AttrDict(dict):
> 
> -- 
> You received this message because you are subscribed to the Google Groups "FriCAS - computer algebra system" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/msgid/fricas-devel/a6e0c843-fdcc-6b1c-3159-21499d7c05b9%40gmail.com.

-- 
                              Waldek Hebisch
