date:20240117

[sympy] Contribution to SymPy Projects + Polynomials representation ideas

2024-01-17 Thread Spiros Maggioros

Hi everyone! My name is Spiros Maggioros and I'm a 3rd-year undergraduate 
electrical & computer engineering student at the National Technical University 
of Athens. I've worked as a machine learning engineering intern at OTE (HTO), 
I'm the lead of IEEEXtreme for the Greek section, and I'm a Computer Lab 
Assistant at my university. I have my own computer science research team 
working on prediction optimizations and algorithms. I would love to start 
contributing to SymPy and start solving some issues.

One year ago I wrote a paper on representing polynomials (in the sparse 
representation) using AVL trees, and I would love to share my idea. I would 
also love to help with the implementation of polynomial GCD. Any help 
and details on the best way to contribute would be appreciated!

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sympy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sympy/bc5a3a35-d3ce-4dc0-bbd6-2bfe12a51a88n%40googlegroups.com.

Re: [sympy] Contribution to SymPy Projects + Polynomials representation ideas

2024-01-17 Thread Oscar Benjamin

On Wed, 17 Jan 2024 at 15:16, Spiros Maggioros  wrote:
>
> Hi everyone!

Hi Spiros,

> My name is Spiros Maggioros and I'm a 3rd-year undergraduate electrical & 
> computer engineering student at the National Technical University of Athens. 
> I've worked as a machine learning engineering intern at OTE (HTO), I'm the 
> lead of IEEEXtreme for the Greek section, and I'm a Computer Lab Assistant 
> at my university. I have my own computer science research team working on 
> prediction optimizations and algorithms. I would love to start contributing 
> to SymPy and start solving some issues.
>
> One year ago I wrote a paper on representing polynomials (in the sparse 
> representation) using AVL trees, and I would love to share my idea.

That sounds interesting. Do you have a link to it or can you explain
the idea briefly here?

> I would also love to help with the implementation of polynomial GCD. 
> Any help and details on the best way to contribute would be 
> appreciated!

That's great. There was a GSoC project last summer looking at
polynomial GCD which made some good progress, but there is still plenty
more to do.

Polynomials in particular are a high priority item for SymPy. Right
now the two top priority items for polys are (any progress on either
would be good):

1. Improve SymPy's existing implementation and algorithms for polynomial gcd.
2. Expose Flint's sparse polynomials in python-flint and add the
wrapper code in SymPy so that SymPy can use them when python-flint is
available.

There is a start on the python-flint part here but it seems to be stalled:
https://github.com/flintlib/python-flint/pull/59

Working on python-flint requires working with C and Cython as well as
Python which might not be suitable.

As for SymPy's existing polynomial GCD algorithms, there is a pull
request from GSoC 2023 that never got finished:
https://github.com/sympy/sympy/pull/25442

I think that PR needs to be broken down. Some parts were extracted to
other PRs that got merged but the final piece didn't get finished.
Right now the problem with it is that it mixes up some things that
should be part of the general GCD preprocessing steps (like removing
unneeded variables) in with the PRS algorithm which should just be
implemented in a more direct way. The basic code there seems
reasonable but I don't think that it is organised in the right way.
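
For readers unfamiliar with the term, the simplest polynomial remainder
sequence (PRS) is the Euclidean algorithm over a field. The block below is a
minimal sketch for univariate polynomials over the rationals, not the code
from the PR and not SymPy's implementation. The helper names (strip,
poly_rem, poly_gcd) and the dense coefficient-list layout (highest degree
first) are assumptions made for illustration.

```python
from fractions import Fraction


def strip(p):
    """Drop leading zeros from a dense coefficient list (highest degree first)."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]


def poly_rem(f, g):
    """Remainder of f divided by g, with coefficients in the rationals."""
    f = strip([Fraction(c) for c in f])
    g = strip([Fraction(c) for c in g])
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    while len(f) >= len(g):
        q = f[0] / g[0]
        # Subtract q * x**(deg f - deg g) * g, which cancels the leading term.
        for i, c in enumerate(g):
            f[i] -= q * c
        f = strip(f)
    return f


def poly_gcd(f, g):
    """Monic GCD obtained from the Euclidean remainder sequence."""
    f = strip([Fraction(c) for c in f])
    g = strip([Fraction(c) for c in g])
    while g:
        f, g = g, poly_rem(f, g)
    if not f:
        return []
    lead = f[0]
    return [c / lead for c in f]


# gcd(x**2 - 1, x**2 - 2*x + 1) is x - 1
print(poly_gcd([1, 0, -1], [1, -2, 1]))
```

Over the integers this naive sequence produces fractions that grow quickly,
which is one reason more careful PRS variants and modular methods exist.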

Another thing that needs looking at is the code in
sympy/polys/modulargcd. It looks like good code but isn't used
anywhere and is not well tested.

For an initial contribution you might want to look at something a bit
easier than working on the polys GCD code though. There are some 300
open issues with the polys tag if you are interested specifically in
polynomials:
https://github.com/sympy/sympy/issues?q=is%3Aissue+is%3Aopen+label%3Apolys+

--
Oscar


Re: [sympy] Contribution to SymPy Projects + Polynomials representation ideas

2024-01-17 Thread Spiros Maggioros

I made a mistake while explaining: in the image with the AVL tree, the 
polynomial represented is P(x) = Cx^3 + Bx^2 + Ax, and the {+0, +1, +2} tags 
are the heights of each node, used for the rotation. Sorry about that.

On Wednesday, 17 January 2024 at 5:54:55 PM UTC+2, Spiros Maggioros wrote:

> Sure. Using AVL trees to represent polynomials follows the same approach 
> as ordinary AVL trees for insertion and deletion, so we have O(log n) 
> insertions and deletions, the tree stays balanced, and the polynomial 
> stays ordered at all times. Here's an image for better understanding:
> Here is a simple Right-Right rotation for the insertion process.
>
> [image: Screenshot 2024-01-17 at 17-48-49 polynomial_trees.png]
> Where A, B, C are the coefficients, i.e. the polynomial that this tree 
> represents is P(x) = Ax^2 + Bx + C.
> [image: Screenshot 2024-01-17 at 17-51-59 polynomial_trees.png]
> So we showed that using AVL trees instead of arrays is much better (note 
> that even linked lists are slower, because their insertion time complexity 
> is O(n)).
> I have not yet looked at the data structure that SymPy uses, but I'm 
> planning to check it later, because right now I'm in an exam period and 
> have no time at all.
> I will also look at the issues in the polys section and see if I can help 
> with something easy for now, because I'm planning to apply for GSoC this 
> year as well.
>
> Thanks for the information Oscar, I really appreciate it.


Re: [sympy] Contribution to SymPy Projects + Polynomials representation ideas

2024-01-17 Thread Oscar Benjamin

On Wed, 17 Jan 2024 at 15:54, Spiros Maggioros wrote:

> So we showed that using AVL trees instead of arrays is much better (note
> that even linked lists are slower, because their insertion time complexity
> is O(n)).
>

Interesting. Did you compare the AVL tree with other sparse data structures?

> I have not yet looked at the data structure that SymPy uses, but I'm
> planning to check it later, because right now I'm in an exam period and
> have no time at all.
>

No problem. If you want to learn more about how these things are
implemented in SymPy then I recommend starting by learning how to use the
lower-level data structures. This doc page is a little out of date since
(as of current master) SymPy can make use of python-flint in some places
but it shows how to access things:

https://docs.sympy.org/latest/modules/polys/domainsintro.html

The DUP representation is what you describe as an "array" (a "list" in
Python terminology). The DMP representation uses this recursively for
multivariate polynomials. Sparse polynomials are implemented using
hash-tables (dicts). The doc page I just linked explains how to create and
introspect these data structures and how they are used within SymPy.
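
The three layouts described above can be mimicked in plain Python. This is
only an illustrative sketch, not SymPy's own code: the helper name
dense_to_sparse is hypothetical, and the nested lists assume coefficients
ordered from highest to lowest degree.

```python
def dense_to_sparse(coeffs, nvars):
    """Convert a nested dense list into a {exponent tuple: coefficient} dict."""
    terms = {}
    degree = len(coeffs) - 1
    for i, c in enumerate(coeffs):
        exp = degree - i
        if nvars == 1:
            if c != 0:  # zero coefficients are simply not stored
                terms[(exp,)] = c
        else:
            # Each coefficient is itself a dense polynomial in the remaining variables.
            for sub_exp, sub_c in dense_to_sparse(c, nvars - 1).items():
                terms[(exp,) + sub_exp] = sub_c
    return terms


# Univariate dense list for x**2 + 1
dup = [1, 0, 1]
# Recursive dense list for x**2 + 2*x*y + 3, viewed as a polynomial in x
# whose coefficients are dense polynomials in y
dmp = [[1], [2, 0], [3]]

print(dense_to_sparse(dup, 1))  # {(2,): 1, (0,): 1}
print(dense_to_sparse(dmp, 2))  # {(2, 0): 1, (1, 1): 2, (0, 0): 3}
```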

The situation in Python is a bit different from C or other languages with
lower interpreter overhead, because the downsides of using, say, a hash-table
vs an array are much lower in a relative sense. This is a quick and dirty
measurement of the time to look up an item in a dict vs a list using ipython:

In [28]: hashtable = dict(zip(range(10000), range(1, 10000+1)))

In [29]: array = list(range(10000))

In [30]: %timeit hashtable[1000]
56.2 ns ± 1.03 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [31]: %timeit array[1000]
22.2 ns ± 0.12 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In C the difference in lookup time between a hash-table and an array would
be much more than 2.5x (an array lookup would be more like 1ns). The reason
they are not so different in Python is because there is so much interpreter
overhead in both cases that the real underlying operation does not
really take a majority of the runtime. I think that probably tends to shift
what data structures seem fastest in the context of SymPy when compared to
implementations of the same operations in other languages.
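
A comparable measurement can be run outside ipython with the standard timeit
module. This is a rough sketch: it assumes both containers hold 10,000
entries so that index 1000 exists, the helper name time_lookup is
hypothetical, and the lambda adds call overhead to both measurements, so the
absolute numbers will differ from the %timeit figures above.

```python
import timeit

n = 10_000
hashtable = dict(zip(range(n), range(1, n + 1)))
array = list(range(n))


def time_lookup(container, key=1000, number=200_000):
    """Return the average time in nanoseconds for one container[key] lookup."""
    total = timeit.timeit(lambda: container[key], number=number)
    return total / number * 1e9


print(f"dict lookup: {time_lookup(hashtable):.1f} ns")
print(f"list lookup: {time_lookup(array):.1f} ns")
```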

--
Oscar


Re: [sympy] Contribution to SymPy Projects + Polynomials representation ideas

2024-01-17 Thread Spiros Maggioros

I understand. A hash table (unordered_map in C++) is the only data structure 
that beats the tree representation in C++. There are drawbacks though, as you 
mentioned, and one more drawback is that you can't really keep the 
polynomial sorted using this data structure, because it is just a "1-1" 
mapping; the only way to do that is to sort the pairs from scratch, which 
requires O(n log n) time. Note that in the tree representation I can 
traverse the tree in in-order fashion and get the terms already sorted, so 
there's no need for further functions.
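
The tree idea can be sketched in Python as follows. This is a minimal
illustration under stated assumptions, not the code from the paper: the
names Node, insert and terms_descending are hypothetical, nodes are keyed by
exponent, and terms whose coefficient cancels to zero are not removed.

```python
class Node:
    def __init__(self, exp, coeff):
        self.exp, self.coeff = exp, coeff
        self.left = self.right = None
        self.height = 1


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))
    return node


def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    return update(x)


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    return update(y)


def insert(node, exp, coeff):
    """Add coeff*X**exp to the tree rooted at node and return the new root."""
    if node is None:
        return Node(exp, coeff)
    if exp == node.exp:
        node.coeff += coeff  # combine like terms
        return node
    if exp < node.exp:
        node.left = insert(node.left, exp, coeff)
    else:
        node.right = insert(node.right, exp, coeff)
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:  # left-heavy
        if exp > node.left.exp:  # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:  # right-heavy
        if exp < node.right.exp:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def terms_descending(node):
    """Yield (coeff, exp) pairs from highest to lowest exponent."""
    if node:
        yield from terms_descending(node.right)
        yield (node.coeff, node.exp)
        yield from terms_descending(node.left)


root = None
for coeff, exp in [(1, 3), (3, 2), (1, 1), (2, 5)]:
    root = insert(root, exp, coeff)
print(list(terms_descending(root)))  # [(2, 5), (1, 3), (3, 2), (1, 1)]
```

The traversal visits every node once, so the sorted term list comes out in
O(n) time without a separate sorting step.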

Yes, I compared it to other data structures like arrays (lists in Python) and 
linked lists, which are the most common ways to represent polynomials.
If the user gives the coefficients and exponents already ordered, let's say 
[ {1,3}, {3,2}, {1,1} ], which is P(x) = X^3 + 3X^2 + X, then the list wins, 
as we just have to push_back the pairs (O(1) time complexity). But if I want 
to keep adding pairs as I go, let's say I want to add {2, 5}, which is 2X^5, 
to my polynomial, then I can't just push back the pair, because I would lose 
the order. So in this case the AVL tree wins. To keep a sorted polynomial in 
a linked list, each insertion takes O(n) time.
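
The same trade-off shows up with a sorted Python list. In the sketch below
(the helper add_term is hypothetical, and the pairs are stored as
(exponent, coefficient) in ascending exponent order, which is an assumption
made so that the standard bisect module applies), finding the position takes
O(log n) but the insertion itself still shifts later items, so it is O(n).

```python
import bisect


def add_term(terms, exp, coeff):
    """Insert coeff*X**exp into a list of (exp, coeff) pairs sorted by exponent."""
    # (exp,) sorts before any (exp, c), so this finds the first pair
    # whose exponent is >= exp.
    i = bisect.bisect_left(terms, (exp,))
    if i < len(terms) and terms[i][0] == exp:
        terms[i] = (exp, terms[i][1] + coeff)  # combine like terms
    else:
        terms.insert(i, (exp, coeff))  # O(n): later items shift right


terms = [(1, 1), (2, 3), (3, 1)]  # X^3 + 3X^2 + X
add_term(terms, 5, 2)  # lands at the end, like an append
add_term(terms, 4, 7)  # lands in the middle and shifts one item
print(terms)  # [(1, 1), (2, 3), (3, 1), (4, 7), (5, 2)]
```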

Now if you use lists in Python, and I want to represent the polynomial 
P(x) = X^1000 + X, then I'll need max(exponent(P)) + 1 slots in my list. But 
with an AVL tree I'll need just 2 nodes.
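
The storage difference is easy to check in Python. This sketch compares a
dense list with a dict standing in for any sparse structure; the helper
dense_from_terms is hypothetical and indexes the list by exponent.

```python
def dense_from_terms(terms):
    """Build a dense coefficient list (index = exponent) from {exp: coeff}."""
    dense = [0] * (max(terms) + 1)
    for exp, coeff in terms.items():
        dense[exp] = coeff
    return dense


sparse = {1000: 1, 1: 1}  # X^1000 + X
dense = dense_from_terms(sparse)
print(len(dense), len(sparse))  # 1001 2
```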

I understand what is happening in Python; that's why intensive testing is 
needed. Just because something seems faster in theory does not mean that's 
always the case.

Spiros.

