(This isn't operational, but it relates to the DNS-OARC workshop held last week, which raised a side question: ought there to be a dns-resea...@lists.dns-oarc.net?)
As a follow-up to comments that JSON is necessary: I've added a JSON version of each CSV file on the DNS Core Census website, including a JSON version of the catalog. The catalog.csv lists only CSV files; the catalog.json lists only JSON files. I figure that is appropriate.

For those who didn't attend, the slides for the talk are here:

https://indico.dns-oarc.net/event/42/contributions/903/attachments/872/1594/Beta%20Availability%20of%20two%20TLD%20Data%20Products.pdf

On slide 8, where it mentions CSV, there is now also JSON (including csv.gz -> json.gz).

Uncompressed, JSON is 2-3 times the size of CSV; compressed, JSON is about half the size of CSV. I'd never expected that.

In addition, the code directory now contains two scripts demonstrating how to download the census:

get_dns_core_census_from_web_via_csv.py
get_dns_core_census_from_web_via_json.py

The diff between the two is below, showing how "easy" pandas makes this in Python (;)) and why I was wondering why JSON was preferred.

47c47   (change the catalog URL)
< catalog_url = 'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.csv'
---
> catalog_url = 'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.json'
51c51   (read the catalog in the right format)
< catalog = pd.read_csv (catalog_url,dtype=str,na_filter=False)
---
> catalog = pd.read_json (catalog_url,dtype=str)#,na_filter=False)
83c83   (read each table in the right format)
< dataframes[row['TABLE_TOPIC']] = pd.read_csv (read_file,dtype=str,na_filter=False)
---
> dataframes[row['TABLE_TOPIC']] = pd.read_json (read_file,dtype=str)#,na_filter=False)
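For anyone who wants the gist without fetching the scripts, here is a minimal, self-contained sketch of the JSON download flow the diff implies. The catalog URL and the TABLE_TOPIC column come straight from the diff; the TABLE_URL column name is my assumption about how the catalog points at each table's file, so treat the actual scripts in the code directory as authoritative:

import pandas as pd

# Catalog URL from the diff above.
catalog_url = 'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.json'

# Read the catalog; dtype=str keeps every column as strings, as in the scripts.
catalog = pd.read_json(catalog_url, dtype=str)

# Fetch each table. pandas handles the https:// download itself and, for
# *.json.gz files, infers the gzip decompression from the extension.
# NOTE: 'TABLE_URL' is a hypothetical column name; check the real catalog.
dataframes = {}
for _, row in catalog.iterrows():
    dataframes[row['TABLE_TOPIC']] = pd.read_json(row['TABLE_URL'], dtype=str)

for topic, df in dataframes.items():
    print(topic, df.shape)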
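One likely explanation for the size flip: JSON repeats the column names in every record, which bloats the uncompressed file but is exactly the kind of redundancy gzip removes almost for free. If you want to check the numbers yourself, a quick sketch using only the two catalog URLs from the diff (any other CSV/JSON pair could be compared the same way):

import gzip
import urllib.request

for url in (
    'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.csv',
    'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.json',
):
    # Download the uncompressed file, then gzip it locally to compare.
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    name = url.rsplit('/', 1)[-1]
    print(name, len(data), 'bytes raw,', len(gzip.compress(data)), 'bytes gzipped')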