I want to save a dict or an array.
I tried both np.save and pickle, and found that the former always takes less time.
My actual data is much larger, but here I show only a small piece for demonstration purposes:
import pickle
import time

import numpy as np

b = {
    0: [np.array([0, 0, 0, 0])],
    1: [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0]), np.array([0, 0, 1, 0]), np.array([0, 0, 0, 1]),
        np.array([-1, 0, 0, 0]), np.array([0, -1, 0, 0]), np.array([0, 0, -1, 0]), np.array([0, 0, 0, -1])],
    2: [np.array([2, 0, 0, 0]), np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]), np.array([1, 0, 0, 1]),
        np.array([1, -1, 0, 0]), np.array([1, 0, -1, 0]), np.array([1, 0, 0, -1])],
    3: [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0]), np.array([0, 0, 1, 0]), np.array([0, 0, 0, 1]),
        np.array([-1, 0, 0, 0]), np.array([0, -1, 0, 0]), np.array([0, 0, -1, 0]), np.array([0, 0, 0, -1])],
    4: [np.array([2, 0, 0, 0]), np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]), np.array([1, 0, 0, 1]),
        np.array([1, -1, 0, 0]), np.array([1, 0, -1, 0]), np.array([1, 0, 0, -1])],
    5: [np.array([0, 0, 0, 0])],
    6: [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0]), np.array([0, 0, 1, 0]), np.array([0, 0, 0, 1]),
        np.array([-1, 0, 0, 0]), np.array([0, -1, 0, 0]), np.array([0, 0, -1, 0]), np.array([0, 0, 0, -1])],
    7: [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0]), np.array([0, 0, 1, 0]), np.array([0, 0, 0, 1]),
        np.array([-1, 0, 0, 0]), np.array([0, -1, 0, 0]), np.array([0, 0, -1, 0]), np.array([0, 0, 0, -1])],
    8: [np.array([2, 0, 0, 0]), np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0]), np.array([1, 0, 0, 1]),
        np.array([1, -1, 0, 0]), np.array([1, 0, -1, 0]), np.array([1, 0, 0, -1])],
}

# Save with pickle.
start_time = time.time()
with open('testpickle', 'wb') as myfile:
    pickle.dump(b, myfile)
print("--- Time to save with pickle: %s milliseconds ---" % (1000*time.time() - 1000*start_time))

# Save with numpy (the dict is wrapped in a 0-d object array; writes numpy.npy).
start_time = time.time()
np.save('numpy', b)
print("--- Time to save with numpy: %s milliseconds ---" % (1000*time.time() - 1000*start_time))

# Load with pickle.
start_time = time.time()
with open('testpickle', 'rb') as myfile:
    g1 = pickle.load(myfile)
print("--- Time to load with pickle: %s milliseconds ---" % (1000*time.time() - 1000*start_time))

# Load with numpy. Recent NumPy versions need allow_pickle=True to read an object array.
start_time = time.time()
g2 = np.load('numpy.npy', allow_pickle=True)
print("--- Time to load with numpy: %s milliseconds ---" % (1000*time.time() - 1000*start_time))
Output:
--- Time to save with pickle: 4.0 milliseconds ---
--- Time to save with numpy: 1.0 milliseconds ---
--- Time to load with pickle: 2.0 milliseconds ---
--- Time to load with numpy: 1.0 milliseconds ---
With my actual data size (about 100,000 keys in the dictionary), the time difference is even more pronounced.
Why does pickle take longer than np.save, both for saving and for loading?
When should I use pickle?
Because as long as the object being written contains no Python data,
Meanwhile,
Note that if a numpy array does contain Python objects, then numpy just pickles the array, and all the gains go out the window.
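One way to see which path np.save takes is to write the object to an in-memory buffer and then try to read it back with pickle disabled. The sketch below is only an illustration: the helper name `needs_pickle` and the sample values are made up for this example, and it assumes a NumPy version in which np.load accepts the `allow_pickle` argument and raises ValueError for object arrays when it is False.

```python
import io

import numpy as np


def needs_pickle(obj) -> bool:
    """Return True if the .npy bytes written for obj can only be read back via pickle."""
    buf = io.BytesIO()
    np.save(buf, obj)  # np.save falls back to pickle for object arrays by default
    buf.seek(0)
    try:
        np.load(buf, allow_pickle=False)
    except ValueError:
        # NumPy refuses to load object arrays when pickle is disabled.
        return True
    return False


# Hypothetical sample data, shaped like one entry of the dict in the question.
vectors = [np.array([1, 0, 0, 0]), np.array([0, 1, 0, 0])]

stacked = np.stack(vectors)   # plain 2-D integer array, shape (2, 4)
as_dict = {1: vectors}        # a dict becomes a 0-d object array inside np.save

print(needs_pickle(stacked))
print(needs_pickle(as_dict))
```

If the second call reports that pickle is needed, the dict from the question is being pickled inside the .npy file, so stacking each list into a purely numeric array before saving may be worth measuring.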