On Wed, 17 Jan 2024 at 15:54, Spiros Maggioros <maggioroslo...@gmail.com>
wrote:

> So we showed that using AVL trees instead of arrays is much better (note
> that even linked lists are slower because their insertion time complexity
> is O(n)).
>

Interesting. Did you compare the AVL tree with other sparse data structures?


> I have not seen the data structure that is used in SymPy, but I'm planning
> to check what I need to see, because right now I'm in an exam period and I
> have no time at all.
>

No problem. If you want to learn more about how these things are
implemented in SymPy then I recommend starting by learning how to use the
lower-level data structures. This doc page is a little out of date (as of
current master, SymPy can make use of python-flint in some places), but it
shows how to access things:

https://docs.sympy.org/latest/modules/polys/domainsintro.html

The DUP representation is what you describe as an "array" (a "list" in
Python terminology). The DMP representation uses this recursively for
multivariate polynomials. Sparse polynomials are implemented using
hash-tables (dicts). The doc page I just linked explains how to create and
introspect these data structures and how they are used within SymPy.
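
For concreteness, here is a small sketch of what those representations
look like (my own illustration, assuming a recent SymPy; the exact output
and internals can vary, especially when python-flint is installed):

from sympy import symbols, Poly, ZZ
from sympy.polys.rings import ring

x, y = symbols('x y')

# Dense (DMP) representation behind Poly: nested lists of coefficients,
# one level of nesting per variable, highest degree first.
p = Poly(x**2*y + 3, x, y)
print(p.rep)  # e.g. DMP([[1, 0], [], [3]], ZZ)

# Sparse representation: a dict mapping exponent tuples to coefficients.
R, xs, ys = ring('x, y', ZZ)
q = xs**2*ys + 3
print(dict(q))  # e.g. {(2, 1): 1, (0, 0): 3}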

The situation in Python is a bit different from C or other languages with
lower interpreter overhead, because the relative downsides of using, say, a
hash-table vs an array are much smaller. This is a quick and dirty
measurement of the time to look up an item in a dict vs a list using
ipython:

In [28]: hashtable = dict(zip(range(100000), range(1, 100000+1)))

In [29]: array = list(range(100000))

In [30]: %timeit hashtable[1000]
56.2 ns ± 1.03 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [31]: %timeit array[1000]
22.2 ns ± 0.12 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In C the difference in lookup time between a hash-table and an array would
be much more than 2.5x (an array lookup would be more like 1 ns). The
reason they are not so different in Python is that there is so much
interpreter overhead in both cases that the real underlying operation does
not account for the majority of the runtime. I think that probably shifts
which data structures seem fastest in the context of SymPy compared to
implementations of the same operations in other languages.
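
The insertion cost mentioned in the first message can be checked in the
same spirit. A quick sketch of my own (not a measurement I am reporting
here; numbers will vary by machine):

import timeit

d = dict.fromkeys(range(100000))
a = list(range(100000))

# dict update: amortised O(1), independent of the number of entries
t_dict = timeit.timeit(lambda: d.__setitem__(50000, 1), number=100_000)

# list insert at the middle: O(n), every later element is shifted
t_list = timeit.timeit(lambda: a.insert(50000, 1), number=1_000)

print(f"dict update: {t_dict / 100_000 * 1e9:.1f} ns per op")
print(f"list insert: {t_list / 1_000 * 1e9:.1f} ns per op")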

--
Oscar
