Can I ask that you provide comments for each attribute and method? See: https://whimsy.apache.org/docs/api/Cache.html
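As a rough illustration of the comment style being asked for, assuming the API pages are generated by RDoc, a comment placed directly above each attribute and method is what ends up on the generated page. The class name `CacheSketch` and the wording of every comment below are illustrative assumptions, not text from the commit:

```ruby
# Sketch only: a trimmed stand-in for the Cache class, showing where
# RDoc-style comments would go. The comment wording is an assumption.
class CacheSketch
  # Minimum age in seconds. Cache entries younger than this are
  # returned without contacting the server.
  attr_accessor :minage

  # True if the cache is in use, false if every request goes
  # straight to the server.
  attr_reader :enabled

  # Creates a cache.
  #
  # dir::     directory that holds the cache files
  # minage::  minimum age in seconds before an entry is rechecked
  # enabled:: whether the cache should be used at all
  def initialize(dir: '/tmp/cache', minage: 600, enabled: true)
    @dir = dir
    @minage = minage
    @enabled = enabled
  end
end
```

The same pattern would apply to `get` and to the private helpers, with the return values listed in the comment.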
- Sam Ruby

On Sun, Jun 18, 2017 at 8:27 PM, <s...@apache.org> wrote:
> This is an automated email from the ASF dual-hosted git repository.
>
> sebb pushed a commit to branch master
> in repository https://gitbox.apache.org/repos/asf/whimsy.git
>
>
> The following commit(s) were added to refs/heads/master by this push:
>      new 2d696ea  Simple HTTP(S) cache for text files
> 2d696ea is described below
>
> commit 2d696ea14961415862aded0e80525aaef23cbe96
> Author: Sebb <s...@apache.org>
> AuthorDate: Mon Jun 19 01:27:57 2017 +0100
>
>     Simple HTTP(S) cache for text files
>
>     Initial implementation
> ---
>  lib/whimsy/cache.rb | 148 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 148 insertions(+)
>
> diff --git a/lib/whimsy/cache.rb b/lib/whimsy/cache.rb
> new file mode 100644
> index 0000000..40a400f
> --- /dev/null
> +++ b/lib/whimsy/cache.rb
> @@ -0,0 +1,148 @@
> +require 'fileutils'
> +require 'digest'
> +require 'net/http'
> +require 'wunderbar'
> +
> +# Simple cache for HTTP(S) text files
> +class Cache
> +  # Don't bother checking cache entries that are younger (seconds)
> +  attr_accessor :minage
> +  attr_reader :enabled
> +
> +  def initialize(dir: '/tmp/cache',
> +        minage: 600, # 10 mins
> +        enabled: true)
> +    @dir = dir
> +    @enabled = enabled
> +    @minage = minage
> +    init_cache(dir) if enabled
> +  end
> +
> +  def enabled=(enabled)
> +    @enabled = enabled
> +    init_cache(dir) if enabled
> +  end
> +
> +  # gets the URL content
> +  # Caches the response and returns that if unchanged or recent
> +  # Returns:
> +  # - uri (after redirects)
> +  # - content
> +  # - status: nocache, recent, updated, missing or no last mod/etag
> +  def get(url)
> +    if not @enabled
> +      uri, res = fetch(url)
> +      return uri, res.body, 'nocache'
> +    end
> +
> +    # Check the cache
> +    age, lastmod, uri, etag, data = read_cache(url)
> +    Wunderbar.debug "#{uri} #{age} LM=#{lastmod} ET=#{etag}"
> +    if age < minage
> +      return uri, data, 'recent' # we have a recent cache entry
> +    end
> +
> +    # Try to do a conditional get
> +    if data and (lastmod or etag)
> +      cond = {}
> +      cond['If-Modified-Since'] = lastmod if lastmod
> +      # Allow for Apache Bug 45023
> +      cond['If-None-Match'] = etag.gsub(/-gzip"$/,'"') if etag
> +      uri, res = fetch(url, cond)
> +      if res.is_a?(Net::HTTPSuccess)
> +        write_cache(url, res)
> +        return uri, res.body, 'updated'
> +      elsif res.is_a?(Net::HTTPNotModified)
> +        path = makepath(url)
> +        mtime = Time.now
> +        File.utime(mtime, mtime, path) # show we checked the page
> +        return uri, data, 'unchanged'
> +      else
> +        return nil, res, 'error'
> +      end
> +    else
> +      uri, res = fetch(url)
> +      write_cache(url, res)
> +      return uri, res.body, data ? 'no last mod/etag' : 'missing'
> +    end
> +  end
> +
> +  private
> +
> +  def init_cache(path)
> +    return if File.directory?(path) and File.writable?(path)
> +    begin
> +      FileUtils.mkdir_p path
> +      Wunderbar.info "Created the cache #{path}"
> +      raise Exception.new("Not writable") unless File.writable?(path)
> +    rescue Exception => e
> +      Wunderbar.warn "Could not create the cache #{path} - #{e}"
> +      @enabled = false
> +    end
> +  end
> +
> +  # fetch uri, following redirects
> +  def fetch(uri, options={}, depth=1)
> +    if depth > 5
> +      raise IOError.new("Too many redirects (#{depth}) detected at #{uri}")
> +    end
> +    uri = URI.parse(uri)
> +    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
> +      request = Net::HTTP::Get.new(uri.request_uri)
> +      options.each do |k,v|
> +        request[k] = v
> +      end
> +      response = http.request(request)
> +      Wunderbar.debug "Headers: #{response.to_hash.inspect}"
> +      Wunderbar.debug response.code
> +      if response.code == '304' # Not modified
> +        return uri, response
> +      elsif response.code =~ /^3\d\d/ # assume redirect
> +        fetch response['location'], options, depth+1
> +      else
> +        return uri, response
> +      end
> +    end
> +  end
> +
> +  # File cache contains last modified followed by the data
> +  # The file mod time can be used to skip any checks for recently updated files
> +  def write_cache(uri, res)
> +    path = makepath(uri)
> +    open path, 'wb' do |io|
> +      io.puts res['Last-Modified']
> +      io.puts uri
> +      io.puts res['Etag']
> +      io.write res.body
> +    end
> +  end
> +
> +  # return age, last-modified, uri, data
> +  def read_cache(uri)
> +    path = makepath(uri)
> +    mtime = File.stat(path).mtime rescue nil
> +    last = nil
> +    data = nil
> +    uri = nil
> +    etag = nil
> +    if mtime
> +      open path, 'rb' do |io|
> +        last = io.gets.chomp
> +        uri = URI.parse(io.gets.chomp)
> +        etag = io.gets.chomp
> +        data = io.read
> +# Fri, 12 May 2017 14:10:23 GMT
> +# 123456789012345678901234567890
> +        last = nil unless last.length > 25
> +      end
> +    end
> +
> +    return Time.now - (mtime ? mtime : Time.new(0)), last, uri, etag, data
> +  end
> +
> +  def makepath(uri)
> +    name = Digest::MD5.hexdigest uri.to_s
> +    File.join @dir, "#{name}"
> +  end
> +
> +end
>
> --
> To stop receiving notification emails like this one, please contact
> ['"comm...@whimsical.apache.org" <comm...@whimsical.apache.org>'].
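
The on-disk layout that write_cache and read_cache use in the quoted commit (a Last-Modified line, the URL, an ETag line, then the body, in a file named after the MD5 of the URL) can be sketched in isolation as below. The helper names `write_entry` and `read_entry` and the example URL are hypothetical, and this is not the committed code. Unlike the quoted read_cache, the sketch maps empty header lines to nil.

```ruby
require 'tmpdir'
require 'digest'

# Hypothetical helper: writes one cache entry using the same layout as
# the quoted write_cache (three header lines, then the raw body).
def write_entry(dir, url, last_modified, etag, body)
  path = File.join(dir, Digest::MD5.hexdigest(url))
  File.open(path, 'wb') do |io|
    io.puts last_modified.to_s # line 1: Last-Modified, blank if absent
    io.puts url                # line 2: the URL
    io.puts etag.to_s          # line 3: ETag, blank if absent
    io.write body              # remainder: response body
  end
  path
end

# Hypothetical helper: reads an entry back as
# [last_modified, url, etag, body]. Blank header lines become nil
# (an assumption of this sketch, not the behaviour of the commit).
def read_entry(path)
  File.open(path, 'rb') do |io|
    last = io.gets.chomp
    url  = io.gets.chomp
    etag = io.gets.chomp
    body = io.read
    [last.empty? ? nil : last, url, etag.empty? ? nil : etag, body]
  end
end

# Round trip in a throwaway directory.
Dir.mktmpdir do |dir|
  path = write_entry(dir, 'https://example.org/a.txt',
                     'Fri, 12 May 2017 14:10:23 GMT', '"abc"',
                     "hello\nworld\n")
  last, url, etag, body = read_entry(path)
  puts "url=#{url} last=#{last} etag=#{etag} bytes=#{body.bytesize}"
end
```

Because the headers sit on their own lines ahead of the body, a reader can recover them with three `gets` calls and treat everything after that as content.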