Running PyTorch in Docker fails with: RuntimeError: DataLoader worker (pid 493) is killed by signal: Bus error. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.

When running PyTorch inside a Docker container, you may hit the following error:

RuntimeError: DataLoader worker (pid 493) is killed by signal: Bus error. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.


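This is most likely caused by the Docker container running out of shared memory. Save any state you need from the current container, leave it with exit, then start the image again with docker run and add the --shm-size parameter. For example, --shm-size 10G gives the container 10 GB of shared memory.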
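To confirm that shared memory is the bottleneck before restarting the container, it can help to check how large /dev/shm is from inside the container. The snippet below is a minimal sketch and not part of the original fix: the helper name shm_usage_gib is hypothetical, and it assumes a Linux container where shared memory is mounted at /dev/shm. Docker's default for this mount is 64 MB unless --shm-size is given.

```python
import os
import shutil


def shm_usage_gib(path="/dev/shm"):
    """Return (total, free) space of the given mount in GiB.

    Hypothetical helper for illustration; `path` defaults to the usual
    Linux shared-memory mount point.
    """
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.free / gib


if __name__ == "__main__":
    if os.path.isdir("/dev/shm"):
        total, free = shm_usage_gib()
        print(f"/dev/shm: total {total:.2f} GiB, free {free:.2f} GiB")
    else:
        # Assumption: without /dev/shm this check does not apply.
        print("/dev/shm not found; this check only applies on Linux")
```

If the reported total is close to the 64 MB default, the container was probably started without --shm-size. Running the same check after restarting with a larger value should show the new size.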

References

https://github.com/pytorch/pytorch/issues/2244