(This isn't operational, but it relates to the DNS-OARC workshop held last week, which raised a side question: ought there to be a dns-resea...@lists.dns-oarc.net?)
As a follow-up to comments that JSON is necessary: I've added a JSON version of each CSV file on the DNS Core Census website, including a JSON version of the catalog. The catalog.csv lists only CSV files; the catalog.json lists only JSON files. I figure that is appropriate.

For those who didn't attend, the slides for the talk are here:

https://indico.dns-oarc.net/event/42/contributions/903/attachments/872/1594/Beta%20Availability%20of%20two%20TLD%20Data%20Products.pdf

On slide 8, where it mentions CSV, there is now also JSON (including csv.gz -> json.gz).

Uncompressed, JSON is 2-3 times the size of CSV; compressed, JSON is about half the size of CSV. I'd never expected that.

In addition, the code directory now contains two scripts demonstrating how to download the census:

get_dns_core_census_from_web_via_csv.py
get_dns_core_census_from_web_via_json.py

The diff between the two is below, showing how "easy" pandas makes this in Python (;)) and why I was wondering why JSON was preferred.

47c47   (change the catalog URL)
< catalog_url = 'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.csv'
---
> catalog_url = 'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.json'
51c51   (read the catalog in the right format)
< catalog = pd.read_csv (catalog_url,dtype=str,na_filter=False)
---
> catalog = pd.read_json (catalog_url,dtype=str)#,na_filter=False)
83c83   (read each table in the right format)
< dataframes[row['TABLE_TOPIC']] = pd.read_csv (read_file,dtype=str,na_filter=False)
---
> dataframes[row['TABLE_TOPIC']] = pd.read_json (read_file,dtype=str)#,na_filter=False)
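For anyone who wants the gist without fetching the scripts, here is a minimal, self-contained sketch of the JSON download flow the diff implies. The catalog URL and the TABLE_TOPIC column come straight from the diff; the TABLE_URL column name is my assumption about how the catalog points at each table's file, so treat the actual scripts in the code directory as authoritative:

import pandas as pd

# Catalog URL from the diff above.
catalog_url = 'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.json'

# Read the catalog; dtype=str keeps every column as strings, as in the scripts.
catalog = pd.read_json(catalog_url, dtype=str)

# Fetch each table. pandas handles the https:// download itself and, for
# *.json.gz files, infers the gzip decompression from the extension.
# NOTE: 'TABLE_URL' is a hypothetical column name; check the real catalog.
dataframes = {}
for _, row in catalog.iterrows():
    dataframes[row['TABLE_TOPIC']] = pd.read_json(row['TABLE_URL'], dtype=str)

for topic, df in dataframes.items():
    print(topic, df.shape)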
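One likely explanation for the size flip: JSON repeats the column names in every record, which bloats the uncompressed file but is exactly the kind of redundancy gzip removes almost for free. If you want to check the numbers yourself, a quick sketch using only the two catalog URLs from the diff (any other CSV/JSON pair could be compared the same way):

import gzip
import urllib.request

for url in (
    'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.csv',
    'https://observatory.research.icann.org/dns-core-census/v010/table/catalog.json',
):
    # Download the uncompressed file, then gzip it locally to compare.
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    name = url.rsplit('/', 1)[-1]
    print(name, len(data), 'bytes raw,', len(gzip.compress(data)), 'bytes gzipped')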