On Thu, Jan 8, 2015 at 11:23 AM, John Ladasky
<john_lada...@sbcglobal.net> wrote:
>> P.S. don't use pickle, it is a security vulnerability equivalent in
>> severity to using exec in your code, and an unversioned opaque
>> schemaless blob that is very difficult to work with when circumstances
>> change.
>
> For all of its shortcomings, I can't live without pickle.  In this case, I am 
> doing data mining.  My TrainingSession class commandeers seven CPU cores via 
> Multiprocessing.Pool.  Still, even my "toy" TrainingSessions take several 
> minutes to run.  I can't afford to re-run TrainingSession every time I need 
> my models.  I need a persistent object.
>
> Besides, the opportunity for mischief is low.  My code is for my own personal 
> use.  And I trust the third-party libraries that I am using.  My SVRModel 
> object wraps the NuSVR object from scikit-learn, which in turn wraps the 
> libsvm binary.

There are several issues, not all of which are easily dodged. Devin cited two:

* Security: unpickling can run arbitrary code, so loading a pickle
  you don't trust is fundamentally equivalent to using 'exec' on it
* Unversioned: it's hard to make updates to your code and then load
  old data
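
To make the first point concrete, here is a harmless sketch. The
Sneaky class is made up for illustration; the only thing it relies on
is that pickle calls whatever callable __reduce__ names when the data
is loaded. Here that callable is eval with a trivial expression, but
it could just as easily be something destructive.

```python
import pickle


class Sneaky:
    """Hypothetical class, for illustration only."""

    def __reduce__(self):
        # Tell pickle: "to rebuild me, call eval('6 * 7')".
        # pickle.loads will make that call while loading.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Sneaky())

# Loading the bytes runs eval; no Sneaky instance is ever created.
result = pickle.loads(payload)
print(result)
```

Whoever produced the bytes chose what ran, which is why this only
stays safe as long as every pickle you load is one you wrote yourself.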

"For your own personal use" dodges the first one, but makes the second
one even more of a concern. You can get much better persistence using
a textual format like JSON, and adding a simple 'version' member
makes it easier still: when you change the code, the loader can check
the version and cope with old data fairly readily.
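
A minimal sketch of the versioned-JSON idea, under assumptions: the
field names ("params", "c", "C", "nu") and the version 1 to version 2
change are invented for illustration and are not your actual model
layout. How you get the parameters in and out of the real model object
is a separate question this doesn't try to answer.

```python
import json

CURRENT_VERSION = 2  # bump whenever the saved layout changes


def dump_model(params):
    """Serialise a dict of model parameters with a version tag."""
    return json.dumps({"version": CURRENT_VERSION, "params": params})


def load_model(text):
    """Load saved parameters, upgrading older layouts as needed."""
    data = json.loads(text)
    version = data.get("version", 1)  # files with no tag count as v1
    params = data["params"]

    if version == 1:
        # Hypothetical change: v1 called the field "c" and had no "nu".
        params = {"C": params["c"], "nu": params.get("nu", 0.5)}
        version = 2

    if version != CURRENT_VERSION:
        raise ValueError("unsupported data version: %r" % version)
    return params


saved = dump_model({"C": 1.0, "nu": 0.5})
print(load_model(saved))
```

Each time the layout changes, you add one more upgrade step instead of
throwing away the old files.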

Pickle is still there if you want it, but you do have to be aware of
its limitations. If you edit the TrainingSession class, you may well
have to rerun the training... but maybe that's not a bad thing.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
