Someone who works in Hadoop asked me: if our data is in terabytes, can we do statistical (i.e. numpy, pandas, etc.) analysis on it?
I said: no (I don't think so, at least!), i.e. I expect numpy (and pandas, etc.) not to work if the data does not fit in memory. Well, sure, *python* can handle (streams of) terabyte data, I guess, but *numpy* cannot. Is there a more sophisticated answer? ["Terabyte" is just a figure of speech for "too large for main memory".] -- https://mail.python.org/mailman/listinfo/python-list
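One way around the in-memory limit is to stream the data in chunks and aggregate statistics incrementally, so only one chunk is ever resident at a time. Here is a minimal sketch using pandas' read_csv with its chunksize argument; the file name "data.csv" and the column name "value" are placeholder assumptions, not anything from the original question.

    import pandas as pd

    # Compute the mean of one numeric column over a file too large for
    # memory, by reading it in fixed-size chunks of rows.
    total = 0.0
    count = 0
    for chunk in pd.read_csv("data.csv", chunksize=1_000_000):
        total += chunk["value"].sum()   # partial sum for this chunk
        count += len(chunk)             # number of rows seen so far

    print("mean:", total / count)

The same idea (keep only a window of the data in memory and fold results together) is what numpy.memmap and out-of-core libraries such as dask build on.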