I've finally written a script that downloads attachments from bugzilla and checks whether they are spam. The detection is quite crude: it first checks whether 'file' reports the attachment as an HTML file, and then whether the file contains '<script'. Anything that matches both is flagged as possible spam, which I then have to look at manually.
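
For illustration, the two checks could be sketched in Python roughly as follows. This is not the actual script, only a minimal sketch under some assumptions: the function names are made up, 'file' is called with its '-b' (brief) option, any description containing "html" counts as an HTML file, and the '<script' match is done case-insensitively.

```python
import subprocess


def describe_with_file(path):
    """Return the one-line description that `file -b` prints for path."""
    result = subprocess.run(
        ["file", "-b", path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def is_possible_spam(description, content):
    """Apply both crude checks: 'file' says HTML, and the data has '<script'.

    Assumption: the '<script' match ignores case.
    """
    looks_like_html = "html" in description.lower()
    has_script_tag = b"<script" in content.lower()
    return looks_like_html and has_script_tag


def check_attachment(path):
    """Check a downloaded attachment; True means 'look at it manually'."""
    with open(path, "rb") as handle:
        content = handle.read()
    return is_possible_spam(describe_with_file(path), content)
```

A True result only marks a candidate; an ordinary HTML attachment with a script in it would be flagged too, which is why the manual look is still needed.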

It was quite a bit of work, as I found more than 110 spam attachments btw
:-(

As for the spam attachments, I don't know how to delete them. Instead, I've changed their MIME type to 'application/spam'. This seems to at least prevent the browser from automatically rendering the HTML.


There's a list of the spam attachments on this page:
        http://wiki.lyx.org/Devel/BugzillaSpam

I'll try to run my script now and then, but some of you are likely to detect them before me. If you do, then please change the MIME type to
        application/spam
and add the number of the attachment to the wiki page:
        http://wiki.lyx.org/Devel/BugzillaSpam
(That way I know you've already done it, so I don't have to do it).


I would like to be able to run this script from aussie.lyx.org, but 'wget' on that machine is not able to resolve bugzilla.lyx.org, e.g.

        wget 'http://bugzilla.lyx.org/attachment.cgi?id=1615&action=view'

The result is as follows:

         wget 'http://bugzilla.lyx.org/attachment.cgi?id=1615&action=view'
--09:42:07--  http://bugzilla.lyx.org/attachment.cgi?id=1615&action=view
           => `attachment.cgi?id=1615&action=view'
Resolving bugzilla.lyx.org... failed: No such file or directory.


Lars, do you think you could help here? A manual entry in /etc/hosts perhaps?

Or perhaps someone else can tell me how to access the attachments from aussie? (It should be possible, as aussie is the machine that runs bugzilla...)
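
One thing that might work without touching DNS or /etc/hosts is to connect to the web server by address and name the bugzilla site in the Host header. The Python sketch below is untested against aussie and rests on guesses: that the web server answers on 127.0.0.1, and that it picks the bugzilla site by Host header. The function names are made up for the example.

```python
import urllib.request

BUGZILLA_HOST = "bugzilla.lyx.org"


def build_attachment_request(attachment_id, server="127.0.0.1"):
    """Build a request sent to `server` that names the bugzilla site in Host.

    Assumption: `server` is an address where the bugzilla web server listens.
    """
    url = "http://%s/attachment.cgi?id=%d&action=view" % (server, attachment_id)
    return urllib.request.Request(url, headers={"Host": BUGZILLA_HOST})


def fetch_attachment(attachment_id, server="127.0.0.1"):
    """Download one attachment and return its raw bytes."""
    request = build_attachment_request(attachment_id, server)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

For example, fetch_attachment(1615) would ask for the same attachment as the wget command above, but without needing to resolve the hostname.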

/Christian

PS. Here's a list of the attachments that I've tagged as spam (some were already tagged of course).

1600-1649
1616
1617
1618
1619
1632
1633
1635
1636
1637
1638
1642

1650-1699
1657
1658
1659
1660
1661
1662
1663
1664
1665
1666
1667
1668
1669
1670
1672
1698

1700-1749
1723
1724
1730
1731
1732
1733
1734
1735
1736
1737
1738
1739
1740
1741
1742
1743
1744
1745
1746
1747
1748
1749

1750-1799
1750
1751
1752
1753
1754
1755
1756
1757
1758
1759
1760
1761
1762
1763
1764
1765
1766
1767
1768
1769
1770
1771
1772
1773
1774
1775
1776
1777
1778
1779
1780
1783
1784
1785
1786
1787
1788
1789
1790
1791
1792
1793
1794
1795
1796
1797
1798
1799

1800-1849
1800
1801
1802
1803
1804
1805
1806
1807
1808
1809
1810
1811
1812
1813
1823
1824
1825
1826
1827
1828
1829
1830
1831
1832

--
Christian Ridderström, +46-8-768 39 44               http://www.md.kth.se/~chr
