Eis-D-Z commented on code in PR #1715:
URL: https://github.com/apache/libcloud/pull/1715#discussion_r912810371

##########
contrib/scrape-ec2-prices.py:
##########
@@ -161,82 +48,155 @@
     "extra-large",
 ]
 
-RE_NUMERIC_OTHER = re.compile(r"(?:([0-9]+)|([-A-Z_a-z]+)|([^-0-9A-Z_a-z]+))")
-
-BASE_PATH = os.path.dirname(os.path.abspath(__file__))
-PRICING_FILE_PATH = os.path.join(BASE_PATH, "../libcloud/data/pricing.json")
-PRICING_FILE_PATH = os.path.abspath(PRICING_FILE_PATH)
-
+def download_json():
+    response = requests.get(URL, stream=True)
+    try:
+        return open(TEMPFILE, "r")
+    except IOError:
+        with open(TEMPFILE, "wb") as fo:
+            for chunk in response.iter_content(chunk_size=2**20):
+                if chunk:
+                    fo.write(chunk)
+    return open(TEMPFILE, "r")
+
+
+def get_json():
+    try:
+        return open(TEMPFILE, "r")
+    except IOError:
+        return download_json()
+
+
+# Prices and sizes are in different dicts and categorized by sku
+def get_all_prices():
+    # return variable
+    # prices = {sku : {price: int, unit: string}}
+    prices = {}
+    current_sku = ""
+    current_rate_code = ""
+    amazonEC2_offer_code = "JRTCKXETXF"

Review Comment:
The file has two big top-level keys, "products" and "terms". "products" holds the technical details of each sku; "terms" holds the pricing data. A sample entry looks like this:

    {
      "terms": {
        "OnDemand": {
          sku: {
            sku: {"JRTCKXETXF": {
              "priceDimensions": {
                rateCode: {"unit": "$", "value": "0.17"}
              }
            }}
          }
        }
      }
    }

It is fortunate that the ijson parser is used here, since it makes it easy to find the "rateCode" key: ijson walks the file item by item and tells you whether what you encounter is a key or a value. Without it, this would be more difficult. I assume Amazon has a way of deriving the rateCode from the sku or some other data, but I haven't found the relation. JRTCKXETXF is the same for all EC2 items, so I named it amazonEC2_offer_code.

So the code parses the "terms" dict item by item; when it encounters a dict key under terms["OnDemand"], it sets that key as current_sku.
It then searches the following items for the item's rateCode and, once that is found, reads the price and the unit of the sku. Those two values are saved under the sku in the dict to be returned: {sku1: {"price": "xx", "unit": "$"}, sku2: {...}}
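
To make the walk described above concrete, here is a minimal sketch. It is not the PR's code: it loads a tiny in-memory sample with the standard `json` module instead of streaming the real file with ijson, and the layout is a simplified assumption based on the sample entry in this comment. The SKU names, rate code names, and the `collect_prices` helper are hypothetical; only the "unit" and "value" keys and the JRTCKXETXF offer code come from the comment.

```python
import json

# Hypothetical miniature of the "terms" section. SKU and rate code
# names are made up; the nesting is a simplified assumption.
SAMPLE = """
{
  "terms": {
    "OnDemand": {
      "SKU1": {
        "JRTCKXETXF": {
          "priceDimensions": {
            "RATE1": {"unit": "$", "value": "0.17"}
          }
        }
      },
      "SKU2": {
        "JRTCKXETXF": {
          "priceDimensions": {
            "RATE2": {"unit": "$", "value": "0.34"}
          }
        }
      }
    }
  }
}
"""


def collect_prices(document):
    """Return {sku: {"price": ..., "unit": ...}} for every on-demand sku."""
    prices = {}
    # Each key under terms["OnDemand"] is treated as the current sku.
    for sku, offers in document["terms"]["OnDemand"].items():
        # Iterate over the offer entries without relying on their key names.
        for offer in offers.values():
            # The rate code key is not known in advance, so iterate over it.
            for dimension in offer["priceDimensions"].values():
                prices[sku] = {
                    "price": dimension["value"],
                    "unit": dimension["unit"],
                }
    return prices


if __name__ == "__main__":
    print(collect_prices(json.loads(SAMPLE)))
```

Loading the whole document this way only works for a small sample. The real pricing file is large, which is why a streaming parser that reports keys and values one at a time is the better fit there.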