I have a question for the MySQL/MariaDB experts.

*Short and sweet*

How is that Query 1 <https://quarry.wmflabs.org/query/38237> runs in
seconds and so does Query 2 <https://quarry.wmflabs.org/query/38243>, but Query
3 <https://quarry.wmflabs.org/query/38244> -- which is essentially the same
thing except it tries to bring columns from both sides of the join -- takes
forever to run?

*Details*

I have a bot that uses this script
<https://github.com/PersianWikipedia/fawikibot/blob/master/HujiBot/findproxy.py>
to identify and block IPs associated with open proxies. To be parsimonious
with the blocks, it only blocks the said IP and not its associated range.
However, many proxy IPs belong to a web hosting range and it would be
better to block the entire range. The goal of the query is to find all
active blocks made by my bot, sort them in order of IP address, and emulate
the LEAD() function -- which we still don't have because we have not
upgraded to MariaDB 10.2 on Labs servers -- to make it easy to find cases
where two consecutive IPs start with the same two octets (like 100.24.X.Y
and 100.24.C.D) so that I can manually investigate those in more detail.

The nested SELECT that is repeated twice simply generates a list of all
active blocks by my bot. Query 1 shows all of them (88 rows) and Query 2
shows only the rows in which the LEAD subquery row has a rownumber that is
equal to 1 + that of a row in the original 88-row data. Of note, this also
has 88 rows, though one of the rows is all NULLs because for the last row
of the data we should not find a match in the LEAD subqeury.

Anyhow, Query 3 simply aims to put the actual data and the LEAD subquery
data side by side and that is where things fall apart somehow. I cannot run
an EXPLAIN on this through Quarry, because Quarry does not like SQL
variables :/ and I have no idea how else to diagnose this problem.

Thanks!

Huji
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly lab...@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud

Reply via email to