I previously asked a question here about the following lines of code:
parameters = [{'weights': ['uniform'], 'n_neighbors': [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]}]
clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=4)
clf.fit(features, rewards)
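However, when I run this code I hit another problem, unrelated to the one in my earlier question. Python ends up with the following operating system error message: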
Process: Python [1327]
Path: /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: 2.7.2.5 (2.7.2.5.r64662-trunk)
Code Type: X86-64 (Native)
Parent Process: Python [1316]
Responsible: Sublime Text 2 [308]
User ID: 501

Date/Time: 2014-08-12 10:27:24.640 +0200
OS Version: Mac OS X 10.9.4 (13E28)
Report Version: 11
Anonymous UUID: D10CD8B7-221F-B121-98D4-4574A1F2189F

Sleep/Wake UUID: 0B9C4AE0-26E6-4DE8-B751-665791968115

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000110

VM Regions Near 0x110:
--> __TEXT 0000000100000000-0000000100001000 [ 4K] r-x/rwx SM=COW /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python

Application Specific Information:
*** multi-threaded process forked ***
crashed on child side of fork pre-exec

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libdispatch.dylib 0x00007fff91534c90 dispatch_group_async_f + 141
1 libBLAS.dylib 0x00007fff9413f791 APL_sgemm + 1061
2 libBLAS.dylib 0x00007fff9413cb3f cblas_sgemm + 1267
3 _dotblas.so 0x0000000102b0236e dotblas_matrixproduct + 5934
4 org.activestate.ActivePython27 0x00000001000c552d PyEval_EvalFrameEx + 23949
5 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
6 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
7 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
8 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
9 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
10 org.activestate.ActivePython27 0x00000001000c098a PyEval_EvalFrameEx + 4586
11 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
12 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
13 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
14 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
15 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
16 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
17 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
18 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
19 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
20 org.activestate.ActivePython27 0x00000001000c098a PyEval_EvalFrameEx + 4586
21 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
22 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
23 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
24 org.activestate.ActivePython27 0x000000010001d36d instancemethod_call + 365
25 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
26 org.activestate.ActivePython27 0x0000000100077dfa slot_tp_call + 74
27 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
28 org.activestate.ActivePython27 0x00000001000c098a PyEval_EvalFrameEx + 4586
29 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
30 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
31 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
32 org.activestate.ActivePython27 0x00000001000c098a PyEval_EvalFrameEx + 4586
33 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
34 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
35 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
36 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
37 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
38 org.activestate.ActivePython27 0x000000010001d36d instancemethod_call + 365
39 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
40 org.activestate.ActivePython27 0x0000000100077a28 slot_tp_init + 88
41 org.activestate.ActivePython27 0x0000000100074e25 type_call + 245
42 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
43 org.activestate.ActivePython27 0x00000001000c267d PyEval_EvalFrameEx + 11997
44 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
45 org.activestate.ActivePython27 0x00000001000c7137 PyEval_EvalFrameEx + 31127
46 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
47 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
48 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
49 org.activestate.ActivePython27 0x000000010001d36d instancemethod_call + 365
50 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
51 org.activestate.ActivePython27 0x0000000100077a28 slot_tp_init + 88
52 org.activestate.ActivePython27 0x0000000100074e25 type_call + 245
53 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
54 org.activestate.ActivePython27 0x00000001000c267d PyEval_EvalFrameEx + 11997
55 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
56 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
57 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
58 org.activestate.ActivePython27 0x000000010003d390 function_call + 176
59 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
60 org.activestate.ActivePython27 0x000000010001d36d instancemethod_call + 365
61 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
62 org.activestate.ActivePython27 0x0000000100077dfa slot_tp_call + 74
63 org.activestate.ActivePython27 0x000000010000be12 PyObject_Call + 98
64 org.activestate.ActivePython27 0x00000001000c267d PyEval_EvalFrameEx + 11997
65 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
66 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
67 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
68 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
69 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
70 org.activestate.ActivePython27 0x00000001000c5d10 PyEval_EvalFrameEx + 25968
71 org.activestate.ActivePython27 0x00000001000c7ad6 PyEval_EvalCodeEx + 2118
72 org.activestate.ActivePython27 0x00000001000c7bf6 PyEval_EvalCode + 54
73 org.activestate.ActivePython27 0x00000001000ed31e PyRun_FileExFlags + 174
74 org.activestate.ActivePython27 0x00000001000ed5d9 PyRun_SimpleFileExFlags + 489
75 org.activestate.ActivePython27 0x00000001001041dc Py_Main + 2940
76 org.activestate.ActivePython27.app 0x0000000100000ed4 0x100000000 + 3796

Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000100 rbx: 0x00007fff7cd43640 rcx: 0x0000000000000000 rdx: 0x0000000105e00000
rdi: 0x0000000000000008 rsi: 0x0000000105e01000 rbp: 0x00007fff5fbfa370 rsp: 0x00007fff5fbfa350
r8: 0x0000000000000001 r9: 0x0000000105e00000 r10: 0x0000000105e01000 r11: 0x0000000000000000
r12: 0x000000010ba10530 r13: 0x000000010b000000 r14: 0x00000001066d1970 r15: 0x00007fff915311af
rip: 0x00007fff91534c90 rfl: 0x0000000000010206 cr2: 0x0000000000000110

Logical CPU: 2
Error Code: 0x00000006
Trap Number: 14

.........

VM Region Summary:
ReadOnly portion of Libraries: Total=183.7M resident=97.0M(53%) swapped_out_or_unallocated=86.7M(47%)
Writable regions: Total=1.3G written=142.8M(11%) resident=503.6M(39%) swapped_out=0K(0%) unallocated=791.7M(61%)
When I replace the second line of the code with:
clf = GridSearchCV(neighbors.KNeighborsRegressor(), parameters, n_jobs=1)
then everything works fine, except that I am no longer using multiple threads.
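My operating system is OS X 10.9.4.
My Python version is 2.7.8 |Anaconda 2.0.1 (x86_64)| (default, Jul 2 2014, 15:36:00) [GCC 4.2.1 (Apple Inc. build 5577)].
My scikit-learn version is 0.14.1.
My numpy version is 1.8.1.
My scipy version is 0.14.0.
My question is: does anyone know how to get GridSearchCV to run on multiple threads?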
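Edit:
I have realized that this error actually occurs only with some of my input datasets. Unfortunately, the problematic datasets (their X) are too large to reproduce here. The input feature data is basically tf-idf vectors, and the y vector consists of non-negative floats, specifically: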
[60.0, 7.0, 12.0, 21.0, 5.5, 3.0, 0.0, 2.5, 11.0, 3.0, 16.0, 2.0, 0.0, 4.5, 2.5, 6.0, 9.5, 2.5, 15.0, 7.0, 8.0, 13.0, 14.0, 8.0, 3.5, 6.0, 22.5, 7.0, 4.0, 3.5, 4.5, 6.0, 5.5, 7.0, 2.0, 0.0, 0.0, 0.0, 14.5, 8.0, 7.5, 2.5, 11.5, 1.0, 3.0, 14.5, 10.0, 14.5, 8.0, 8.0, 7.0, 2.5, 3.5, 3.0, 13.5, 7.0, 6.5, 2.5, 9.0, 8.0, 11.0, 17.5, 12.5, 4.5, 5.5, 8.0, 2.0, 7.0, 4.0, 1.5, 3.0, 21.5, 4.5, 4.0, 7.0, 9.0, 13.5, 8.0, 10.5, 4.5, 1.5, 11.5, 7.5, 11.5, 4.5, 5.0, 7.0, 9.5, 4.0, 4.0, 6.0, 3.5, 4.5, 7.5, 3.5, 3.5, 3.5, 6.0, 5.0, 5.5, 25.0, 6.5, 5.0, 2.0, 2.0, 10.5, 0.0, 6.5, 19.0, 9.0, 1.0, 1.5, 1.0, 0.0, 1.0, 4.5, 2.5, 17.5, 39.5, 7.5, 5.5, 8.0, 1.0, 6.0, 12.0, 10.0, 5.5, 19.0, 4.5, 1.5, 25.5, 4.0, 10.0, 18.5, 9.5, 10.5, 2.5, 6.0, 1.0, 10.0, 8.5, 12.5, 13.5, 5.0, 6.5, 11.0, 4.5, 8.0, 7.5, 11.5, 14.5, 9.0, 3.0, 1.5, 3.5, 5.5, 2.5, 12.5, 6.5, 5.5, 5.0, 0.0, 8.0, 3.0, 14.5, 5.0, 14.0, 7.0, 13.5, 12.5, 4.0, 1.5, 6.5, 10.5, 9.0, 16.5, 4.0, 4.0, 15.0, 11.5, 2.5, 8.5, 3.0, 5.0, 4.0, 8.5, 6.0, 5.0, 5.0, 5.0, 5.5, 8.0, 11.0, 4.0, 0.0, 5.5, 0.0, 4.5, 1.5, 0.0, 6.5, 11.0, 2.5, 8.0, 15.5, 5.5, 4.5, 5.0, 4.0, 5.5, 10.5, 7.5, 6.5, 8.5, 2.5, 1.5, 1.5, 18.0, 15.0, 14.0, 9.5, 5.5, 7.5, 14.5, 2.5, 5.0, 60.0, 6.5, 14.5, 6.5, 4.0, 1.5, 2.0, 4.0, 27.0, 3.0, 5.0, 4.0, 2.5, 1.0, 1.5, 1.5, 9.0, 4.0, 8.5, 4.0, 4.0, 0.0, 1.5, 7.5, 1.5, 7.5, 1.0, 28.5, 15.5, 7.5, 1.0, 2.5, 2.5, 2.5, 16.0, 5.5, 8.5, 4.0, 2.5, 5.0, 2.5, 6.0, 11.0, 10.0, 4.5, 6.5, 8.0, 6.0, 4.5, 15.5, 4.0, 5.0]
The version with 1 job works for all of my input datasets, including this one.
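The problem is that OS X's built-in BLAS implementation (called Accelerate) internally uses Grand Central Dispatch (libdispatch.dylib) when you make a numpy.dot call. The GCD runtime does not work when a program calls the POSIX fork syscall without a subsequent exec syscall, which makes every Python program that uses the multiprocessing module prone to crashing. sklearn's GridSearchCV uses the Python multiprocessing module for parallelization.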
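To make "fork without exec" concrete, here is a minimal sketch of that pattern using only the standard library. The function name run_in_forked_child is made up for illustration, and the sketch assumes a POSIX system where os.fork is available. It does not reproduce the crash; it only shows the situation the child process is in.

```python
import os


def run_in_forked_child(exit_code=0):
    """Fork, let the child exit at once, and return the child's exit status.

    Hypothetical helper for illustration only; requires a POSIX system.
    """
    pid = os.fork()
    if pid == 0:
        # Child side: it runs on a copy of the parent's memory image and no
        # exec() follows. Only the thread that called fork() exists here, so
        # runtime state that depends on the parent's other threads (such as
        # GCD's) cannot be relied on. The fork start method of the
        # multiprocessing module leaves its workers in this same situation.
        os._exit(exit_code)
    # Parent side: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(run_in_forked_child(3))
```

In the crash report above, the line "crashed on child side of fork pre-exec" corresponds to the child branch of this pattern.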
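In Python 3.4 and later, you can work around this by forcing Python multiprocessing to use the forkserver start method instead of the default fork mode, for example at the beginning of your program's main file: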
if __name__ == "__main__":
    import multiprocessing as mp
    mp.set_start_method('forkserver')
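As a slightly fuller sketch of the same idea, the snippet below requests the forkserver method through multiprocessing.get_context, which scopes the choice to one pool instead of changing the global setting. The helper names pick_start_method and parallel_sqrt are made up for illustration, and math.sqrt is only a placeholder for real work such as a function that calls numpy.dot. Whether a given library picks up the start method this way depends on how that library creates its workers, so treat this as a demonstration of the mechanism rather than a guaranteed fix.

```python
import math
import multiprocessing as mp


def pick_start_method(preferred="forkserver"):
    """Return `preferred` if this platform supports it, else the platform default."""
    available = mp.get_all_start_methods()  # the first entry is the default
    return preferred if preferred in available else available[0]


def parallel_sqrt(values, processes=2):
    """Map math.sqrt over `values` in a pool created with the chosen start method."""
    ctx = mp.get_context(pick_start_method())
    with ctx.Pool(processes=processes) as pool:
        return pool.map(math.sqrt, values)


if __name__ == "__main__":
    # The guard matters: with forkserver, worker processes import the main
    # module, and unguarded pool creation would run again during that import.
    print(parallel_sqrt([1.0, 4.0, 9.0]))
```

Unlike set_start_method, which may only be called once per program, get_context can be used repeatedly and leaves the rest of the program on its default start method.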
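Alternatively, you can rebuild numpy from source and link it against ATLAS or OpenBLAS instead of OS X Accelerate. The numpy developers are working on binary distributions that include ATLAS or OpenBLAS by default.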
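Before rebuilding, it can help to check which BLAS your numpy build reports. The sketch below captures the text printed by numpy.show_config and searches it for Accelerate. The helper names numpy_build_report and mentions_accelerate are made up for illustration, and the substring check is a rough heuristic, because the exact wording of the report varies between numpy versions.

```python
import contextlib
import io


def numpy_build_report():
    """Return numpy's build configuration as text, or a note if numpy is missing."""
    try:
        import numpy as np
    except ImportError:
        return "numpy is not installed"
    buffer = io.StringIO()
    # show_config() prints its report, so capture stdout while it runs.
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


def mentions_accelerate(report):
    """Heuristic: does the report name Apple's Accelerate/vecLib framework?"""
    text = report.lower()
    return "accelerate" in text or "veclib" in text


if __name__ == "__main__":
    report = numpy_build_report()
    print(report)
    print("Accelerate mentioned:", mentions_accelerate(report))
```

If the report names OpenBLAS or ATLAS instead, the Accelerate-specific problem described above should not apply to that build.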