"Richard O'Keefe" <rao...@gmail.com> writes:

> My difficulty is that  from a statistics/data science perspective,
> it doesn't seem terribly *useful*.

There are two common use cases in my experience:

1) Error checking, most frequently right after reading in a dataset.
   A quick look at the data types of all columns shows whether the
   dataset matches your expectations. If you have a column called "date"
   of data type "Object", then most probably something went wrong with
   parsing some date format.

2) Type checking for specific operations. For example, you might want to
   compute an average over all rows for each numerical column in your
   dataset.  That's easiest to do by selecting columns of the right data
   type.
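
To make both cases concrete, here is a minimal sketch in plain Python.
It is not tied to any particular data frame library; the sample data,
the column names, and the helpers parse_cell and column_type are all
made up for illustration.

```python
import csv
import io
from datetime import date
from statistics import mean

# Hypothetical sample data; the last "date" entry is deliberately malformed.
RAW = """name,weight,height,date
a,1.5,10,2020-01-01
b,2.5,20,2020-01-02
c,3.5,30,not a date
"""


def parse_cell(text):
    """Try int, then float, then ISO date; fall back to the raw string."""
    for convert in (int, float, date.fromisoformat):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def column_type(values):
    """Return the single type shared by all values, or object if mixed."""
    types = {type(v) for v in values}
    return types.pop() if len(types) == 1 else object


rows = list(csv.DictReader(io.StringIO(RAW)))
columns = {name: [parse_cell(row[name]) for row in rows] for name in rows[0]}
dtypes = {name: column_type(values) for name, values in columns.items()}

# Use case 1: a column that fell back to the generic object type
# probably contains something the parser could not handle.
suspicious = [name for name, t in dtypes.items() if t is object]

# Use case 2: select the numerical columns by type and average each one.
means = {
    name: mean(columns[name])
    for name, t in dtypes.items()
    if t in (int, float)
}

print(dtypes)
print(suspicious)
print(means)
```

Here the "date" column ends up with the generic type because one entry
could not be parsed, which is exactly the kind of signal I mean in 1).
The averages in 2) are computed without naming the numerical columns
explicitly.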

You are completely right that data type information is not sufficient
for checking for all possible problems, such as unit mismatch. But it
remains a useful tool.

Cheers,
  Konrad.
