A TAQ dataset can be 50GB. Can Python do data manipulation, analysis, and modeling on it without crashing the computer? My laptop has a 4th-generation Intel Core i5, 8GB of RAM, and a 500GB hard drive.
Can Python handle a common TAQ dataset?
-
You are going to need an on-disk store with a front-end that provides an expressive DSL into that store.
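A minimal sketch of that pattern with pandas on top of an HDF5 store (backed by PyTables): stream the raw file into the store in chunks so nothing ever has to fit in 8GB of RAM, then filter on disk. The file names, chunk size, and column names (taq.csv, symbol, price) are placeholder assumptions, not anything TAQ-specific:

    import pandas as pd

    # Build the on-disk store once, streaming the raw file in chunks
    # so peak memory stays bounded regardless of the file's size.
    with pd.HDFStore('taq.h5') as store:
        for chunk in pd.read_csv('taq.csv', chunksize=1000000):
            store.append('trades', chunk, data_columns=['symbol', 'price'])

    # Later: pull only the rows you need; the filter is applied on disk.
    with pd.HDFStore('taq.h5') as store:
        aapl = store.select('trades', where='symbol == "AAPL"')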
For example, in R, dplyr provides a single, unified interface over multiple backends: database engines, data.frame, and data.table.
pandas plays a similar role for Python, but it works in memory; blaze is a better fit here, since it provides a pandas-like expression language that plugs into PyTables, MongoDB, and a bunch of other backends.
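As a hedged sketch of what that looks like, here is a blaze-style group-by over a table inside an HDF5 file; the URI ('taq.h5::/trades') and the columns are assumptions for illustration:

    from blaze import data, by

    # Point blaze at a table inside an HDF5 file; nothing loads into RAM yet.
    trades = data('taq.h5::/trades')

    # A group-by written in blaze's DSL. This builds a lazy expression;
    # blaze pushes the work down to the PyTables backend, and only the
    # small aggregated result is materialized.
    per_symbol = by(trades.symbol,
                    avg_price=trades.price.mean(),
                    total_size=trades.size.sum())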
All of this is naturally going to be slower than accessing the data store directly with its native query language.
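For comparison, here is roughly what the same filter looks like issued straight at PyTables' in-kernel query engine, which evaluates the condition with numexpr while scanning the table on disk. This assumes a native PyTables table at /trades with a string-typed symbol column, which is an assumption about the file layout:

    import tables

    with tables.open_file('taq.h5') as h5:
        trades = h5.root.trades
        # Only rows matching the condition ever cross into Python.
        prices = [row['price'] for row in trades.where('symbol == b"AAPL"')]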