On Dec 10, 2013, at 6:25 AM, Dan Stromberg <drsali...@gmail.com> wrote:

> The IMDB flat text file probably came the closest, but it appears to have 
> encoding issues; it's apparently nearly windows-1255, but not quite.

It's ISO-8859-1.

Both certificates.list.gz and mpaa-ratings-reasons.list.gz are rather 
straightforward to parse.

For the US, you will get something along these lines out of 
certificates.list.gz:

USA:(Banned)
USA:12
USA:AO
USA:Approved
USA:C
USA:E
USA:E10+
USA:G
USA:GP
USA:K-A
USA:M
USA:M/PG
USA:NC-17
USA:Not Rated
USA:Open
USA:PG
USA:PG-13
USA:Passed
USA:R
USA:T
USA:TV-14
USA:TV-G
USA:TV-MA
USA:TV-PG
USA:TV-Y
USA:TV-Y7
USA:Unrated
USA:X
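
If you do want to do it yourself, here is a minimal sketch of how a list like the one above could be pulled out. It assumes each data line looks like "title, one or more tabs, COUNTRY:RATING, optionally a tab and a note", which is my reading of the file rather than a documented format. The function names are made up for illustration.

```python
import gzip
from collections import Counter


def usa_certificates(lines):
    """Count the distinct 'USA:...' values in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # Assumed layout: title <TAB(s)> COUNTRY:RATING [<TAB> note].
        # Header and separator lines have no tab-separated second field,
        # so they are skipped by the length check below.
        fields = [f for f in line.rstrip("\n").split("\t") if f]
        if len(fields) >= 2 and fields[1].startswith("USA:"):
            counts[fields[1]] += 1
    return counts


def read_certificates(path):
    """Read a gzipped list file, decoding it as ISO-8859-1."""
    with gzip.open(path, "rt", encoding="iso-8859-1") as handle:
        return usa_certificates(handle)
```

Calling sorted(read_certificates("certificates.list.gz")) should then give the distinct USA values, provided the layout assumption holds for your copy of the file.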

And as mentioned, IMDbPY handles all of this out of the box if you don’t feel 
like doing it yourself.




-- 
https://mail.python.org/mailman/listinfo/python-list
