= 178 weight
1.1 Scalars, Vectors, Matrices
This subsection defines the mathematical concepts behind vectors, matrices, and tensors and their codding counterparts.
Scalars
A scalar is a single (real) value, denoted by ordinary lower-cased letters (e.g., x, y, and z) and the space of all (continuous) real-valued scalars by \mathbb{R}. Consequently, if x is a real-valued scalar, we write x\in\mathbb{R}.
Scalars are represented in Python by an int
or float
. For example, let x represent my weight in pounds.
Python Scalars
Python offers four basic data types: int
, float
, bool
, and str
.
isinstance(3.141, int) # Checks if an integer
False
Numpy Scalars
As vectors and matrices are formed of scalars, numpy
arrays also comprise of array scalars
—they are the elements of an array. These building blocks for numpy
arrays
import numpy as np
Vectors
In real-world applications, it oftentimes make more sense to group scalars together as a vector than letting them float around as seemingly unrelated variables. For example, in a clinical study of BMI, scalar measurements—like weight, height, and age—of an individual can be put into to a feature vector for easer data manipulation.
A vector is a fixed-length array of scalars, called the elements of the vector. We denote vectors by bold lowercase letters, e.g., \bold{x}, \bold{y}, \bold{z}, etc.
The standard way to visualize vectors is by vertically stacking their elements:
\bold{x} =\begin{bmatrix}x_{1} \\ \vdots \\x_{n}\end{bmatrix}, \tag{1.1}
We can refer to an element of a vector by using a subscript. For example, x_2 denotes the second element of \mathbf{x}. Since x_2 is a scalar, we do not bold it. The vector contains n elements, we write \bold{x}\in\mathbb{R}^n. Here, n is called the dimension of the vector.
In linear algebra, we sometimes distinguish between such column vectors and row vectors whose elements are stacked horizontally. A row vector is written as follows: \bold{x} =\begin{bmatrix}x_{1}, \ldots, x_{n}\end{bmatrix}.
Vectors in Python
In Python, the code counterpart of a vector is simply a list
.
# For example, x represents here a feature vector containing weight (lb), height (in), age (yrs).
= [123, 69, 46] x
In Python, as in most programming languages, vector indices start at 0, also known as zero-based indexing, whereas in linear algebra subscripts begin at 1 (one-based indexing).
Matrices
A matrix is a collection of rows and columns, much like a spreadsheet. Typically, rows correspond to individual records and columns correspond to distinct attributes or features.
Dragging our example BMI study a little further, let us now imagine sampling the feature vector for each of the 25 students in our class. We can in this case spare 25 variables each defining a vector containing the measurements from a student. To save variables and facilitate easy data manipulation, a better data organization will stack these vectors side-by-side to form a rectangular data object: a matrix. A matrix can be illustrate as a table:
student | weight | height | age |
---|---|---|---|
student 1 | 12 | 12 | 12 |
student 2 | 123 | 123 | 123 |
student 3 | 1 | 1 | 1 |
We denote matrices by bold capital letters (e.g., \bold{X}, \bold{Y}, and \bold{Z}). The expression \bold{A} \in \mathbb{R}^{m \times n} indicates that a matrix \bold{A} contains m \times n real-valued scalars, arranged as m rows and n columns. When m = n, we say that a matrix is square. To refer to an individual element, we subscript both the row and column indices, e.g., a_{ij} is the value that belongs to \bold{A}’s i^{\textrm{th}} row and j^{\textrm{th}} column:
\bold{A}=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ \end{bmatrix}. \tag{1.2}
= np.arange(30).reshape(10, 3); x x
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26],
[27, 28, 29]])