D2L - 2.1. Data Manipulation

2023. 10. 9. 11:12 | Posted by 솔웅

https://d2l.ai/chapter_preliminaries/ndarray.html

2.1. Data Manipulation — Dive into Deep Learning 1.0.3 documentation

2.1. Data Manipulation

In order to get anything done, we need some way to store and manipulate data. Generally, there are two important things we need to do with data: (i) acquire them; and (ii) process them once they are inside the computer. There is no point in acquiring data without some way to store it, so to start, let’s get our hands dirty with n-dimensional arrays, which we also call tensors. If you already know the NumPy scientific computing package, this will be a breeze. For all modern deep learning frameworks, the tensor class (ndarray in MXNet, Tensor in PyTorch and TensorFlow) resembles NumPy’s ndarray, with a few killer features added. First, the tensor class supports automatic differentiation. Second, it leverages GPUs to accelerate numerical computation, whereas NumPy only runs on CPUs. These properties make neural networks both easy to code and fast to run.

2.1.1. Getting Started

To start, we import the PyTorch library. Note that the package name is torch.

import torch

A tensor represents a (possibly multidimensional) array of numerical values. In the one-dimensional case, i.e., when only one axis is needed for the data, a tensor is called a vector. With two axes, a tensor is called a matrix. With k > 2 axes, we drop the specialized names and just refer to the object as a kth-order tensor.

PyTorch provides a variety of functions for creating new tensors prepopulated with values. For example, by invoking arange(n), we can create a vector of evenly spaced values, starting at 0 (included) and ending at n (not included). By default, the interval size is 1. Unless otherwise specified, new tensors are stored in main memory and designated for CPU-based computation.

x = torch.arange(12, dtype=torch.float32)
x

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])
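
As a small sketch of the defaults described above, arange also accepts explicit start, stop, and step arguments. The variable name evens is illustrative only, and the device check assumes the default device has not been changed.

```python
import torch

# Default form: start at 0, stop before 12, step of 1.
x = torch.arange(12, dtype=torch.float32)

# Explicit start, stop (excluded), and step.
evens = torch.arange(0, 12, 2)

print(evens)        # tensor([ 0,  2,  4,  6,  8, 10])
print(evens.dtype)  # torch.int64 (integer arguments give an integer tensor)
print(x.device)     # cpu, unless a different default device was configured
```

Passing dtype=torch.float32, as the text does, is what turns the integer range into floating-point values.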

Each of these values is called an element of the tensor. The tensor x contains 12 elements. We can inspect the total number of elements in a tensor via its numel method.

x.numel()

12

We can access a tensor’s shape (the length along each axis) by inspecting its shape attribute. Because we are dealing with a vector here, the shape contains just a single element and is identical to the size.

x.shape

torch.Size([12])
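
For tensors with more than one axis, the shape and the size are no longer the same thing: the number of elements is the product of the lengths along each axis. A minimal sketch, using a hypothetical tensor T that is not part of the original text:

```python
import math
import torch

T = torch.zeros(2, 3, 4)

print(T.shape)    # torch.Size([2, 3, 4])
print(T.ndim)     # 3 axes
print(T.numel())  # 24 elements in total

# numel() equals the product of the entries of shape.
assert T.numel() == math.prod(T.shape)
```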

We can change the shape of a tensor without altering its size or values, by invoking reshape. For example, we can transform our vector x whose shape is (12,) to a matrix X with shape (3, 4). This new tensor retains all elements but reconfigures them into a matrix. Notice that the elements of our vector are laid out one row at a time and thus x[3] == X[0, 3].

X = x.reshape(3, 4)
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])
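
The row-by-row layout can be checked directly: element X[i, j] of the (3, 4) matrix should correspond to element i * 4 + j of the original vector. The sketch below verifies this for every position; n_cols is an illustrative name.

```python
import torch

x = torch.arange(12, dtype=torch.float32)
X = x.reshape(3, 4)
n_cols = X.shape[1]

# Row-major layout: X[i, j] is the (i * n_cols + j)-th element of x.
for i in range(X.shape[0]):
    for j in range(n_cols):
        assert X[i, j] == x[i * n_cols + j]

print(x[3] == X[0, 3])  # tensor(True)
```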

Note that specifying every shape component to reshape is redundant. Because we already know our tensor’s size, we can work out one component of the shape given the rest. For example, given a tensor of size n and target shape (h, w), we know that w = n/h. To automatically infer one component of the shape, we can place a -1 for the shape component that should be inferred automatically. In our case, instead of calling x.reshape(3, 4), we could have equivalently called x.reshape(-1, 4) or x.reshape(3, -1).
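
A short sketch of the -1 inference described above. It also shows what happens when the size is not divisible by the given components; to the best of my knowledge PyTorch raises a RuntimeError in that case.

```python
import torch

x = torch.arange(12, dtype=torch.float32)

print(x.reshape(-1, 4).shape)     # torch.Size([3, 4])
print(x.reshape(3, -1).shape)     # torch.Size([3, 4])
print(x.reshape(2, -1, 3).shape)  # torch.Size([2, 2, 3])

# 12 elements cannot be arranged into rows of length 5.
try:
    x.reshape(-1, 5)
except RuntimeError as err:
    print("cannot reshape:", err)
```

Only one component may be -1 in a single call, since two unknowns could not be inferred from the size alone.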

Practitioners often need to work with tensors initialized to contain all 0s or 1s. We can construct a tensor with all elements set to 0 and a shape of (2, 3, 4) via the zeros function.

torch.zeros((2, 3, 4))

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

Similarly, we can create a tensor with all 1s by invoking ones.

torch.ones((2, 3, 4))

tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])

We often wish to sample each element randomly (and independently) from a given probability distribution. For example, the parameters of neural networks are often initialized randomly. The following snippet creates a tensor with elements drawn from a standard Gaussian (normal) distribution with mean 0 and standard deviation 1.

torch.randn(3, 4)

tensor([[-0.6921, -1.7850, -0.0397,  0.3334],
        [-0.6288, -0.7518, -0.4018, -0.9821],
        [-1.3914,  1.5492, -0.3178, -0.9031]])
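
Because the values are random, the numbers printed above will differ from run to run. A minimal sketch, assuming reproducibility is wanted, fixes the seed with torch.manual_seed and checks on a larger sample that the mean and standard deviation are close to 0 and 1. The seed value and the sample size are arbitrary choices.

```python
import torch

# Same seed, same draw.
torch.manual_seed(0)
a = torch.randn(3, 4)
torch.manual_seed(0)
b = torch.randn(3, 4)
print(torch.equal(a, b))  # True

# With many samples the empirical statistics approach mean 0 and std 1.
big = torch.randn(100_000)
print(big.mean().item(), big.std().item())
```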

Finally, we can construct tensors with exact values for each element by supplying (possibly nested) Python list(s) containing numerical literals. Here, we construct a matrix with a list of lists, where the outermost list corresponds to axis 0, and the inner list corresponds to axis 1.

torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

tensor([[2, 1, 4, 3],
        [1, 2, 3, 4],
        [4, 3, 2, 1]])
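
Note that the output above has no decimal points: the element type is inferred from the literals. The sketch below contrasts an all-integer list with one containing a float, assuming the default floating-point dtype has not been changed from float32.

```python
import torch

ints = torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
floats = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

print(ints.dtype, ints.shape)  # torch.int64 torch.Size([3, 4])
print(floats.dtype)            # torch.float32: one float literal is enough
```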

2.1.2. Indexing and Slicing

As with Python lists, we can access tensor elements by indexing (starting with 0). To access an element based on its position relative to the end of the list, we can use negative indexing. We can also access whole ranges of indices via slicing (e.g., X[start:stop]), where the returned value includes the first index (start) but not the last (stop). Finally, when only one index (or slice) is specified for a kth-order tensor, it is applied along axis 0. Thus, in the following code, [-1] selects the last row and [1:3] selects the second and third rows.

X, X[0], X[1], X[2], X[-1], X[0:3], X[0:2], X[0:1], X[1:3], X[1:2], X[1:1], X[2:3], X[2:2], X[-1:3], X[-1:0], X[-1:1], X[-1:2], X[-2], X[-2:3]

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]),
 tensor([0., 1., 2., 3.]),
 tensor([4., 5., 6., 7.]),
 tensor([ 8.,  9., 10., 11.]),
 tensor([ 8.,  9., 10., 11.]),
 tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]),
 tensor([[0., 1., 2., 3.],
         [4., 5., 6., 7.]]),
 tensor([[0., 1., 2., 3.]]),
 tensor([[ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]),
 tensor([[4., 5., 6., 7.]]),
 tensor([], size=(0, 4)),
 tensor([[ 8.,  9., 10., 11.]]),
 tensor([], size=(0, 4)),
 tensor([[ 8.,  9., 10., 11.]]),
 tensor([], size=(0, 4)),
 tensor([], size=(0, 4)),
 tensor([], size=(0, 4)),
 tensor([4., 5., 6., 7.]),
 tensor([[ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]))
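
All of the expressions above index along axis 0 only. As a complementary sketch, indices or slices can be given for several axes at once, separated by commas; the tensor is rebuilt here so the snippet runs on its own.

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)

print(X[1, 2])      # tensor(6.): row 1, column 2
print(X[:, 1])      # tensor([1., 5., 9.]): column 1 of every row
print(X[-1, -1])    # tensor(11.): last row, last column
print(X[::2])       # rows 0 and 2 (slice with a step of 2)
print(X[1:3, 0:2])  # rows 1-2, columns 0-1
```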

Beyond reading them, we can also write elements of a matrix by specifying indices.

X[1, 2] = 17
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5., 17.,  7.],
        [ 8.,  9., 10., 11.]])

If we want to assign multiple elements the same value, we apply the indexing on the left-hand side of the assignment operation. For instance, [:2, :] accesses the first and second rows, where : takes all the elements along axis 1 (column). While we discussed indexing for matrices, this also works for vectors and for tensors of more than two dimensions.

X[:2, :] = 12
X

tensor([[12., 12., 12., 12.],
        [12., 12., 12., 12.],
        [ 8.,  9., 10., 11.]])
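
To illustrate the remark that the same indexing works beyond matrices, here is a minimal sketch on a third-order tensor. The tensor T and the assigned values are made up for the example.

```python
import torch

T = torch.zeros(2, 3, 4)

T[0, :, :] = 1   # every element of the first (3, 4) block
T[:, -1, :] = 5  # the last row of every block
T[1, 0, 2] = 9   # a single element

print(T)
```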

2.1.3. Operations

Now that we know how to construct tensors and how to read from and write to their elements, we can begin to manipulate them with various mathematical operations. Among the most useful of these are the elementwise operations. These apply a standard scalar operation to each element of a tensor. For functions that take two tensors as inputs, elementwise operations apply some standard binary operator on each pair of corresponding elements. We can create an elementwise function from any function that maps from a scalar to a scalar.

In mathematical notation, we denote such unary scalar operators (taking one input) by the signature ƒ : ℝ → ℝ. This just means that the function maps from any real number onto some other real number. Most standard operators, including unary ones like e^x, can be applied elementwise.

torch.exp(x)

tensor([162754.7969, 162754.7969, 162754.7969, 162754.7969, 162754.7969,
        162754.7969, 162754.7969, 162754.7969,   2980.9580,   8103.0840,
         22026.4648,  59874.1406])
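
The first eight entries of the output above all equal e^12 ≈ 162754.8 rather than e^0, e^1, and so on. A likely explanation is that X was created with x.reshape(3, 4), which for a contiguous tensor returns a view sharing memory with x, so the earlier assignment X[:2, :] = 12 also overwrote the first eight elements of x. The sketch below reproduces that effect on fresh tensors.

```python
import torch

x = torch.arange(12, dtype=torch.float32)
X = x.reshape(3, 4)  # a view for contiguous input: no data is copied

X[:2, :] = 12        # writes through to x
print(x)             # the first eight elements are now 12
print(x.data_ptr() == X.data_ptr())  # True: same underlying buffer

# On an unmodified vector, exp gives the expected e^0, e^1, e^2, ...
fresh = torch.arange(12, dtype=torch.float32)
print(torch.exp(fresh)[:3])  # tensor([1.0000, 2.7183, 7.3891])
```

Calling x.clone() before reshaping is one way to get an independent copy when this sharing is not wanted.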

Likewise, we denote binary scalar operators, which map pairs of real numbers to a (single) real number, via the signature ƒ : ℝ, ℝ → ℝ. Given any two vectors u and v of the same shape, and a binary operator ƒ, we can produce a vector c = F(u, v) by setting c_i ← ƒ(u_i, v_i) for all i, where c_i, u_i, and v_i are the ith elements of vectors c, u, and v. Here, we produced the vector-valued F : ℝ^d, ℝ^d → ℝ^d by lifting the scalar function to an elementwise vector operation. The common standard arithmetic operators for addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (**) have all been lifted to elementwise operations for identically-shaped tensors of arbitrary shape.

x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y

(tensor([ 3.,  4.,  6., 10.]),
 tensor([-1.,  0.,  2.,  6.]),
 tensor([ 2.,  4.,  8., 16.]),
 tensor([0.5000, 1.0000, 2.0000, 4.0000]),
 tensor([ 1.,  4., 16., 64.]))
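
The lifting described above can be spelled out by hand. In this sketch the scalar function f (an illustrative name) is applied to each pair of elements in a Python loop, and the result is compared with the built-in elementwise operator.

```python
import torch

u = torch.tensor([1.0, 2.0, 4.0, 8.0])
v = torch.tensor([2.0, 2.0, 2.0, 2.0])

def f(a, b):
    """Scalar binary operator: exponentiation."""
    return a ** b

# c_i <- f(u_i, v_i) for all i, written as an explicit loop.
c_loop = torch.tensor([f(ui.item(), vi.item()) for ui, vi in zip(u, v)])

# The lifted, elementwise version provided by the tensor class.
c_vec = u ** v

print(c_loop)                      # tensor([ 1.,  4., 16., 64.])
print(torch.equal(c_loop, c_vec))  # True
```

The loop is only for illustration; the elementwise operator computes the same values without a Python-level loop.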

In addition to elementwise computations, we can also perform linear algebraic operations, such as dot products and matrix multiplications. We will elaborate on these in Section 2.3.

We can also concatenate multiple tensors, stacking them end-to-end to form a larger one. We just need to provide a list of tensors and tell the system along which axis to concatenate. The example below shows what happens when we concatenate two matrices along rows (axis 0) instead of columns (axis 1). We can see that the first output’s axis-0 length (6) is the sum of the two input tensors’ axis-0 lengths (3+3); while the second output’s axis-1 length (8) is the sum of the two input tensors’ axis-1 lengths (4+4).

X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
torch.cat((X, Y), dim=0), torch.cat((X, Y), dim=1)

Explanation

import torch

# Create a 1-D tensor with 12 elements and reshape it into a 2-D tensor of shape (3, 4).
X = torch.arange(12, dtype=torch.float32).reshape((3, 4))

# Create a 3x4 tensor Y from the given values.
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

# Concatenate X and Y with torch.cat: dim=0 joins them along rows, dim=1 along columns.
result1 = torch.cat((X, Y), dim=0)  # concatenate along rows
result2 = torch.cat((X, Y), dim=1)  # concatenate along columns

# Print the results.
print(result1)
print(result2)

This code concatenates the two tensors X and Y with PyTorch and prints the results. The dim argument of torch.cat selects the axis along which to concatenate. result1 holds the row-wise concatenation, with shape (6, 4), and result2 holds the column-wise concatenation, with shape (3, 8).

Result

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.],
         [ 2.,  1.,  4.,  3.],
         [ 1.,  2.,  3.,  4.],
         [ 4.,  3.,  2.,  1.]]),
 tensor([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
         [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
         [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]]))

The first result, result1, is X and Y concatenated along rows; the second, result2, is X and Y concatenated along columns.

Sometimes, we want to construct a binary tensor via logical statements. Take X == Y as an example. For each position i, j, if X[i, j] and Y[i, j] are equal, then the corresponding entry in the result is True (1); otherwise it is False (0).

X == Y

tensor([[False,  True, False,  True],
        [False, False, False, False],
        [False, False, False, False]])
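
The result is a Boolean tensor, which can be counted, converted to numbers, or used as a mask. A small sketch with the same X and Y, rebuilt here so it runs on its own; the name mask is illustrative.

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

mask = X == Y
print(mask.dtype)    # torch.bool
print(mask.sum())    # tensor(2): number of matching positions
print(mask.float())  # the same tensor as 0.0 / 1.0 values
print(X[mask])       # tensor([1., 3.]): the matching elements of X
```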

Summing all the elements in the tensor yields a tensor with only one element.

X.sum()

tensor(66.)

2.1.4. Broadcasting

By now, you know how to perform elementwise binary operations on two tensors of the same shape. Under certain conditions, even when shapes differ, we can still perform elementwise binary operations by invoking the broadcasting mechanism. Broadcasting works according to the following two-step procedure: (i) expand one or both arrays by copying elements along axes with length 1 so that after this transformation, the two tensors have the same shape; (ii) perform an elementwise operation on the resulting arrays.

a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))
a, b

Explanation

import torch

# Use torch.arange to create a 1-D tensor with the values 0 through 2, then reshape it to (3, 1).
a = torch.arange(3).reshape((3, 1))

# Use torch.arange to create a 1-D tensor with the values 0 and 1, then reshape it to (1, 2).
b = torch.arange(2).reshape((1, 2))

# Print the results.
print(a)
print(b)

This code creates two tensors, a and b, with PyTorch and prints them.

First, a is built as a 1-D tensor holding 0, 1, 2 and reshaped with .reshape((3, 1)) into a matrix with 3 rows and 1 column.
Second, b is built as a 1-D tensor holding 0, 1 and reshaped with .reshape((1, 2)) into a matrix with 1 row and 2 columns.

The printed values confirm that both a and b are 2-D tensors.

Result

tensor([[0],
        [1],
        [2]])
tensor([[0, 1]])

The first result is tensor a, a 2-D tensor of shape (3, 1): three rows and one column, holding the values 0, 1, and 2.

The second result is tensor b, a 2-D tensor of shape (1, 2): one row and two columns, holding the values 0 and 1.

Since a and b are 3×1 and 1×2 matrices, respectively, their shapes do not match up. Broadcasting produces a larger 3×2 matrix by replicating matrix a along the columns and matrix b along the rows before adding them elementwise.

a + b

tensor([[0, 1],
        [1, 2],
        [2, 3]])
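
The two-step procedure can be made explicit. The sketch below first expands both operands to the common shape with expand and then adds them, which should match a + b. It assumes a PyTorch version that provides torch.broadcast_shapes, and it also shows a pair of shapes that cannot be broadcast.

```python
import torch

a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))

# The shape both operands are broadcast to.
print(torch.broadcast_shapes(a.shape, b.shape))  # torch.Size([3, 2])

# Step (i): replicate along the length-1 axes. Step (ii): add elementwise.
a_big = a.expand(3, 2)
b_big = b.expand(3, 2)
print(torch.equal(a_big + b_big, a + b))  # True

# Lengths 3 and 2 on the same axis are incompatible (neither is 1).
try:
    torch.arange(3) + torch.arange(2)
except RuntimeError as err:
    print("not broadcastable:", err)
```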

2.1.5. Saving Memory

Running operations can cause new memory to be allocated to host results. For example, if we write Y = X + Y, we dereference the tensor that Y used to point to and instead point Y at the newly allocated memory. We can demonstrate this issue with Python’s id() function, which gives us the exact address of the referenced object in memory. Note that after we run Y = Y + X, id(Y) points to a different location. That is because Python first evaluates Y + X, allocating new memory for the result and then points Y to this new location in memory.

before = id(Y)
Y = Y + X
id(Y) == before

False

This might be undesirable for two reasons. First, we do not want to run around allocating memory unnecessarily all the time. In machine learning, we often have hundreds of megabytes of parameters and update all of them multiple times per second. Whenever possible, we want to perform these updates in place. Second, we might point at the same parameters from multiple variables. If we do not update in place, we must be careful to update all of these references, lest we spring a memory leak or inadvertently refer to stale parameters.

Fortunately, performing in-place operations is easy. We can assign the result of an operation to a previously allocated array Y by using slice notation: Y[:] = <expression>. To illustrate this concept, we overwrite the values of tensor Z, after initializing it, using zeros_like, to have the same shape as Y.

Z = torch.zeros_like(Y)
print('id(Z):', id(Z))
Z[:] = X + Y
print('id(Z):', id(Z))

Explanation

import torch

# X and Y are the tensors defined earlier in this section.
# Create a tensor Z of zeros with the same shape and data type as Y.
Z = torch.zeros_like(Y)

# Print the current id of Z.
print('id(Z):', id(Z))

# Write the sum of X and Y into every element of Z.
Z[:] = X + Y

# Print the id of Z again.
print('id(Z):', id(Z))

This code creates a tensor Z and then assigns the sum of X and Y to it. Step by step:

Z = torch.zeros_like(Y) : creates a tensor Z with the same shape and data type as Y, with every element initialized to 0.
print('id(Z):', id(Z)) : prints the current id of Z.
Z[:] = X + Y : writes the elementwise sum of X and Y into every element of Z, updating its values.
print('id(Z):', id(Z)) : prints the id of Z again. It should be the same as in the first print.

The code shows that the id of Z does not change after the assignment. The slice assignment Z[:] = ... writes into the existing tensor instead of binding the name Z to a new object, so the memory already allocated for Z is reused.

Result

id(Z): 140381179266448
id(Z): 140381179266448
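
Python's id() identifies the Python object, not the tensor's data buffer. As a complementary sketch, data_ptr() reports the address of the underlying storage, and the out= argument of torch.add writes the result straight into an existing tensor. The tensors are rebuilt here with made-up values so the snippet is self-contained.

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)
Y = torch.ones(3, 4)
Z = torch.zeros_like(Y)
ptr_before = Z.data_ptr()

# X + Y is computed into a temporary, then copied into Z's buffer.
Z[:] = X + Y

# Same result, written directly into Z's buffer.
torch.add(X, Y, out=Z)

print(Z.data_ptr() == ptr_before)  # True: Z still uses the same buffer
```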

If the value of X is not reused in subsequent computations, we can also use X[:] = X + Y or X += Y to reduce the memory overhead of the operation.

before = id(X)
X += Y
id(X) == before

Explanation

# Save the current id of X.
before = id(X)

# Add Y to X in place.
X += Y

# Check whether the id of X is the same as before.
id(X) == before

True
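
The second reason given earlier, stale references, can be seen with two names bound to the same tensor. In this sketch params and alias are hypothetical names: an in-place update is visible through both, while rebinding leaves the alias pointing at the old values.

```python
import torch

params = torch.zeros(3)
alias = params           # a second name for the same tensor

params += 1              # in place: the alias sees the update
print(alias)             # tensor([1., 1., 1.])

params = params + 1      # allocates a new tensor and rebinds params
print(params)            # tensor([2., 2., 2.])
print(alias)             # tensor([1., 1., 1.]): now stale
print(alias is params)   # False
```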

2.1.6. Conversion to Other Python Objects

Converting to a NumPy tensor (ndarray), or vice versa, is easy. The torch tensor and NumPy array will share their underlying memory, and changing one through an in-place operation will also change the other.

A = X.numpy()
B = torch.from_numpy(A)
type(A), type(B)

(numpy.ndarray, torch.Tensor)
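
A minimal sketch of the memory sharing described above, assuming a CPU tensor and an installed NumPy. Changes made through any of the three names show up in the others, while clone() gives an independent copy. The values are made up for the example.

```python
import torch

X = torch.zeros(3)
A = X.numpy()            # NumPy array sharing X's memory
B = torch.from_numpy(A)  # tensor sharing the same memory again

X += 1                   # in-place change on the tensor
print(A)                 # [1. 1. 1.]

A[0] = 5                 # in-place change on the array
print(X, B)              # both show 5 in the first position

C = X.clone().numpy()    # independent copy
X += 1
print(C[0])              # 5.0: unaffected by the last update
```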

To convert a size-1 tensor to a Python scalar, we can invoke the item function or Python’s built-in functions.

a = torch.tensor([3.5])
a, a.item(), float(a), int(a)

(tensor([3.5000]), 3.5, 3.5, 3)
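
These conversions only apply to tensors with exactly one element. For larger tensors, tolist() returns a (nested) Python list instead. In the sketch below, the exception type raised by item() on a multi-element tensor is caught broadly, since it may differ between PyTorch versions.

```python
import torch

a = torch.tensor([3.5])
v = torch.tensor([1.5, 2.5])

print(a.item())    # 3.5
print(v.tolist())  # [1.5, 2.5]

# item() is only defined for tensors holding a single element.
try:
    v.item()
except (RuntimeError, ValueError) as err:
    print("item() needs exactly one element:", err)
```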

2.1.7. Summary

The tensor class is the main interface for storing and manipulating data in deep learning libraries. Tensors provide a variety of functionalities including construction routines; indexing and slicing; basic mathematics operations; broadcasting; memory-efficient assignment; and conversion to and from other Python objects.

2.1.8. Exercises

Run the code in this section. Change the conditional statement X == Y to X < Y or X > Y, and then see what kind of tensor you can get.

Replace the two tensors that operate by element in the broadcasting mechanism with other shapes, e.g., 3-dimensional tensors. Is the result the same as expected?
