PyTorch tensors have many operations that look similar but behave differently. Here is a more detailed explanation of these operations.
tensor.clone(), tensor.detach(), tensor.data
tensor.clone(), tensor.detach(), and tensor.data all appear to copy a tensor, but the copies they produce behave quite differently.
import torch

# clone(): a real copy with newly allocated storage
x = torch.tensor([1.0], requires_grad=True)
y = x.clone()
print("Id_x:{} Id_y:{}".format(id(x), id(y)))
y += 1
print("x:{} y:{}".format(x, y))
print('-----------------------------------')

# detach(): shares storage with the original tensor; the result has requires_grad=False
x = torch.tensor([1.0], requires_grad=True)
y = x.detach()
print("Id_x:{} Id_y:{}".format(id(x), id(y)))
y += 1
print("x:{} y:{}".format(x, y))
print('-----------------------------------')

# .data does the same as detach(); the official answer is that there was no time
# to update old code, so it is still around
x = torch.tensor([1.0], requires_grad=True)
y = x.data
print("Id_x:{} Id_y:{}".format(id(x), id(y)))
y += 1
print("x:{} y:{}".format(x, y))
Id_x:140684285215008 Id_y:140684285217384
x:tensor([1.], requires_grad=True) y:tensor([2.], grad_fn=<AddBackward0>)
-----------------------------------
Id_x:140684285216808 Id_y:140684285215008
x:tensor([2.], requires_grad=True) y:tensor([2.])
-----------------------------------
Id_x:140684285216088 Id_y:140684285216808
x:tensor([2.], requires_grad=True) y:tensor([2.])
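Note that id() only compares the Python wrapper objects, so the values above do not directly prove whether storage is shared. A more direct check is data_ptr(), which returns the address of the underlying data. Here is a minimal sketch of my own (not from the original post) showing that clone() allocates new storage while detach() and .data reuse the original:

import torch

x = torch.tensor([1.0], requires_grad=True)

# clone() copies the data into newly allocated storage
print(x.data_ptr() == x.clone().data_ptr())    # False

# detach() and .data both point at the same underlying storage as x
print(x.data_ptr() == x.detach().data_ptr())   # True
print(x.data_ptr() == x.data.data_ptr())       # True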
In the official documentation, the explanation of detach() is: "Returns a new Tensor, detached from the current graph."
To understand what that means, look at the code below.
As it shows, the p obtained from z.detach() is no longer connected to the earlier computation graph, so the gradient from pq's backward pass never reaches x.
x = torch.tensor([1.0], requires_grad=True)
y = x ** 2
z = 2 * y
w = z ** 3

# detach z, so the gradient w.r.t. `p` does not affect `z`!
p = z.detach()
print(p)
q = torch.tensor([2.0], requires_grad=True)
pq = p * q
pq.backward(retain_graph=True)
w.backward()
print(x.grad)

x = torch.tensor([1.0], requires_grad=True)
y = x ** 2
z = 2 * y
w = z ** 3

# create a sub-path from z; gradients still flow back through the clone
p = z.clone()
print(p)
q = torch.tensor([2.0], requires_grad=True)
pq = p * q
pq.backward(retain_graph=True)
w.backward()
print(x.grad)
tensor([2.])
tensor([48.])
tensor([2.], grad_fn=<CloneBackward>)
tensor([56.])
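Although .data and detach() behave the same in the examples above, PyTorch's migration notes describe one safety difference: in-place changes made through a detach()-ed tensor are noticed by autograd at backward time, while the same changes made through .data go untracked and can silently corrupt gradients. A rough sketch of that behaviour (my addition, not part of the original post):

import torch

# in-place change through detach(): autograd notices and raises an error
a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = a.sigmoid()        # sigmoid saves its output for the backward pass
out.detach().zero_()     # bumps out's version counter
try:
    out.sum().backward()
except RuntimeError as e:
    print("detach():", e)

# the same change through .data: backward runs, but the gradients are wrong
a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = a.sigmoid()
out.data.zero_()         # not tracked by autograd
out.sum().backward()
print(".data:", a.grad)  # tensor([0., 0., 0.]) instead of the true gradient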
tensor.reshape(), tensor.view()
For everyday use they behave almost the same. The difference is that view() only works on a tensor whose memory layout is contiguous and always returns a view sharing the original storage, while reshape() returns a view when it can and otherwise makes a copy, as the sketch below shows.
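A small sketch of that difference (my addition; the exact error message varies across versions): view() fails on a non-contiguous tensor, while reshape() silently falls back to copying.

import torch

a = torch.arange(6.0).reshape(2, 3)
b = a.t()                        # transposing makes b non-contiguous
print(b.is_contiguous())         # False

try:
    b.view(6)                    # view() needs contiguous memory
except RuntimeError as e:
    print("view() failed:", e)

print(b.reshape(6))              # reshape() copies when a view is impossible
print(b.contiguous().view(6))    # equivalent workaround with view()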