The country information is interesting. I’ve found that bots also skew the 
counts.

In my bin dir on the CW server I have a Perl program, moduleScrape.pl 
(~/bin/moduleScrape.pl), that slogs through the logs to figure out module 
downloads, counting each download once rather than once for each of its parts. 
It first goes through the conf files to find each module in the repository and 
picks a single file to represent it. Then it goes through the log files (ftp 
and http) looking for downloads of modules (including zip files). It tosses 
hits by bots. The output format is normalized to:
Date      Module      Format  Transport  IP               Country        Simplified agent
20150628  Easton      prt     FTP        xxx.xxx.xxx.xxx  United States  w4....@xiphos.org
20150628  PolGdanska  zip     HTTP       xxx.xxx.xxx.xxx  Poland         Apache-HttpClient/UNAVAILABLE (java 1.4)
Note: the IP is obscured here.
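
To make the "count each download once" idea concrete, here is a minimal sketch in Python rather than Perl. It is not the code in moduleScrape.pl. The marker paths, the bot pattern, and the function name are all hypothetical, and it assumes the log lines have already been parsed into (path, agent) pairs.

```python
import re
from collections import Counter

# Hypothetical data: one representative file per module. The real program
# derives this from the repository's conf files; these paths are made up.
MARKER_FILE = {
    "/example/modules/easton/easton.idx": "Easton",
    "/example/modules/polgdanska/nt": "PolGdanska",
}

# Assumed heuristic for spotting bots by user agent; not the real rule set.
BOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def count_downloads(hits):
    """Count one download per marker-file hit, skipping bot agents.

    `hits` is an iterable of (path, agent) pairs already parsed from a log.
    """
    counts = Counter()
    for path, agent in hits:
        if BOT_PATTERN.search(agent):
            continue  # toss hits by bots
        module = MARKER_FILE.get(path)
        if module is not None:
            counts[module] += 1  # other files of the same module are ignored
    return counts
```

Because only the one representative file is counted, a module made up of dozens of files still registers as a single download per fetch.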

The program needs tweaking for each server, as it “knows” CrossWire’s 
repositories and its logs.

There are a bunch of flags that let you specify a date range, and the program 
is geared to find the last full month.
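
For illustration only, this is one way "the last full month" could be computed and applied to the YYYYMMDD dates in the output. It is a Python sketch, not the Perl logic in the program, and the function names are hypothetical.

```python
from datetime import date, timedelta


def last_full_month(today):
    """Return (first_day, last_day) of the calendar month before `today`."""
    first_of_this_month = today.replace(day=1)
    last_day = first_of_this_month - timedelta(days=1)
    first_day = last_day.replace(day=1)
    return first_day, last_day


def in_range(stamp, start, end):
    """True if a YYYYMMDD string falls within [start, end], inclusive."""
    d = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
    return start <= d <= end
```

Stepping back one day from the first of the current month avoids any special handling for month lengths, leap years, or the January to December rollover.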

The program was started by J Ansorg and improved by N Carter.

I also have a program, moduleStats, that runs this program and analyzes the 
output to produce statistics about the modules.
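
As a rough idea of what analyzing the normalized output can look like, here is a small Python sketch. It is not moduleStats itself. It assumes the seven columns are tab-separated, and the choice of per-module and per-country tallies is only an example of statistics one might want. The field and function names are hypothetical.

```python
from collections import Counter

# Column names follow the normalized output header; tab separation is assumed.
FIELDS = ["date", "module", "format", "transport", "ip", "country", "agent"]


def parse_line(line):
    """Split one normalized line into a dict, or return None if malformed."""
    parts = [p.strip() for p in line.rstrip("\n").split("\t")]
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def summarize(lines):
    """Tally downloads per module and per country."""
    by_module = Counter()
    by_country = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip lines that do not have all seven columns
        by_module[rec["module"]] += 1
        by_country[rec["country"]] += 1
    return by_module, by_country
```

Calling by_module.most_common(10) on the result would give a quick top-ten list of modules.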

Troy and I have been talking about tossing the data into a database.

DM


> On Sep 10, 2017, at 5:38 PM, Karl Kleinpaste <k...@kleinpaste.org> wrote:
> 
> Now and then I get curious about where all the accesses to ftp.xiphos.org 
> <ftp://ftp.xiphos.org/> come from.  This is a crude summary from my 
> /var/log/xferlog since early August.  Counts of accesses can be gotten by 
> substituting the last "uniq" stage of the pipeline with "uniq -c | sort -nr" 
> but such counts are registering individual files accessed, which is not very 
> informative, especially for modules that include dozens of image files.
> 
> cat xferlog* | cut -f7 -d' ' | sed -e s/::ffff:// | sort | uniq -c | sort -nr 
> | awk '{ print $2 }' | fgrep . | while read ip ; do geoiplookup $ip ; done | 
> grep 'GeoIP Country Edition' | sed -e 's/GeoIP Country Edition: //' | sort | 
> uniq

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
