Error: gradient computation has been modified by an inplace operation

When experimenting with changes to a model, few errors leave you as stuck as "gradient computation has been modified by an inplace operation".
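For reference, here is a minimal sketch of how this error can arise in plain PyTorch, independent of any distributed setup. The function names are hypothetical, chosen only for this illustration. `torch.exp` keeps its output for the backward pass, so editing that output in place invalidates it, while the out-of-place version works.

```python
import torch


def inplace_error_message() -> str:
    """Trigger the error on purpose and return its message (hypothetical helper)."""
    w = torch.ones(3, requires_grad=True)
    y = torch.exp(w)   # exp() saves its output for use in backward
    y.add_(1.0)        # in-place edit changes the tensor autograd saved
    try:
        y.sum().backward()
    except RuntimeError as err:
        return str(err)
    return ""


def out_of_place_grad() -> torch.Tensor:
    """Same computation without the in-place edit (hypothetical helper)."""
    w = torch.ones(3, requires_grad=True)
    y = torch.exp(w)
    y = y + 1.0        # creates a new tensor; the saved output is untouched
    y.sum().backward()
    return w.grad


if __name__ == "__main__":
    print(inplace_error_message())
    print(out_of_place_grad())
```

The error message names the offending tensor's shape and the backward node that expected it unchanged, which is usually the best starting point for tracking down the in-place write.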

In some earlier cases I could resolve it with detach(), but this time it appeared while using "nn.parallel.DistributedDataParallel".

model = nn.parallel.DistributedDataParallel(model, device_ids=[local_rank], broadcast_buffers=True, find_unused_parameters=False)

In this situation, changing the broadcast_buffers option from True to False resolved the error.
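A minimal sketch of the changed setting, assuming a single-process CPU run with the "gloo" backend so it can be tried without GPUs. The toy network, the function names, and the port lookup are all assumptions for illustration; a real multi-GPU job would keep `device_ids=[local_rank]` and be started by a launcher. According to the PyTorch documentation, `broadcast_buffers` controls whether the module's buffers are synced at the beginning of the forward function, so with `False` buffers such as BatchNorm running statistics are presumably no longer kept identical across ranks.

```python
import socket

import torch
import torch.distributed as dist
import torch.nn as nn


def _free_port() -> int:
    """Ask the OS for an unused TCP port (hypothetical helper)."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def ddp_step_without_buffer_broadcast() -> bool:
    """Run one forward/backward pass through DDP with broadcast_buffers=False."""
    # Assumption: one process, CPU only. device_ids is omitted because it
    # must stay None for CPU modules.
    dist.init_process_group(
        backend="gloo",
        init_method=f"tcp://127.0.0.1:{_free_port()}",
        rank=0,
        world_size=1,
    )
    try:
        # Toy model with a BatchNorm layer, i.e. a module that owns buffers.
        net = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8), nn.Linear(8, 1))
        model = nn.parallel.DistributedDataParallel(
            net,
            broadcast_buffers=False,       # the change described above
            find_unused_parameters=False,
        )
        loss = model(torch.randn(6, 4)).mean()
        loss.backward()
        # True when every parameter received a gradient.
        return all(p.grad is not None for p in model.parameters())
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    print(ddp_step_without_buffer_broadcast())
```

This only shows the configuration running cleanly. It does not reproduce the original failure, which depends on the model and training loop in question.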

https://github.com/pytorch/pytorch/issues/62474 https://pytorch.org/docs/stable/generated/torch.nn.parallel.DistributedDataParallel.html

