1.3 Matrices
: 15 minutes
A matrix is a collection of rows and columns, much like a spreadsheet. Typically, rows correspond to individual records and columns correspond to distinct attributes or features.
Dragging our hypothetical BMI study a little further, let us now imagine sampling the feature vector (weight, height, age) over and over again for each of the 25 students in our class. We could in this case spare 25 variables (student1, student2, …) each defining a vector containing the measurements from a student. However, to save variables and facilitate easy data manipulation, a better data organization will stack the features side-by-side to form a rectangular data object: a matrix. A matrix can be illustrate as a table:
student | weight | height | age |
---|---|---|---|
student 1 | 153 | 68 | 46 |
student 2 | 196 | 55 | 30 |
… | … | … | … |
student 3 | 163 | 58 | 26 |
We denote matrices by bold capital letters (e.g., \bold{X}, \bold{Y}, and \bold{Z}). The expression \bold{A} \in \mathbb{R}^{m \times n} indicates that a matrix \bold{A} contains m \times n real-valued scalars, arranged as m rows and n columns. Alternatively, we say the size of \bold{A} is (m,n).
When m = n, we say that a matrix is square. To refer to an individual element, we subscript both the row and column indices, e.g., a_{ij} is the value that belongs to \bold{A}’s i^{\textrm{th}} row and j^{\textrm{th}} column:
\bold{A}=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ \end{bmatrix}. \tag{3.1}
Matrices in Python
Although not very convenient, one can use a list
of lists to represent a data matrix in Python.