Data Representation
Introduction:
In the realms of signal processing and machine learning, the representation of data in a numerical format is crucial for analysis, processing, and modeling. Depending on the dimensionality and complexity of the data, it can be represented in different numerical forms. The most common forms of data representation include scalar values, vectors, matrices, and tensors.
Definitions:
Scalar: A scalar is a single numerical value. It has magnitude but no direction. For instance, in mathematical terms, the number 5 or -3.2 are scalar values.
Vector: A vector is an ordered list of numbers. It can be visualized as a line segment in space that has both magnitude and direction. Mathematically, a vector is typically represented as a column (or sometimes row) of numbers (aka 1-D data)
Matrix: A matrix is a two-dimensional array of numbers. It can be visualized as a rectangular grid of numbers. A matrix has rows and columns, and its shape is often described by the number of rows by the number of columns, e.g., a 3x2 matrix has 3 rows and 2 columns.
Tensor: A tensor is a multi-dimensional array of numbers. While a scalar is 0-dimensional, a vector is 1-dimensional, and a matrix is 2-dimensional, a tensor can be 3-dimensional or more. For instance, a 3-dimensional tensor can be visualized as a cube of numbers.
Examples:
Scalar: An example of a scalar is the current temperature. If it's 22°C right now, that's a single numerical value.
Vector: An example of a vector could be the average high temperatures forecasted for the next week. For instance, if the forecasted high temperatures for the next seven days are 22°C, 23°C, 24°C, 25°C, 26°C, 27°C, and 28°C, then the vector representation might be: [22, 23, 24, 25, 26, 27, 28]
Matrix: An example of a matrix is a grayscale image. The pixels of the image are represented as values between 0 (black) and 255 (white). The image's resolution, say 100x100 pixels, will determine the size of the matrix. Each entry in the matrix corresponds to the grayscale value of a pixel.
Tensor: A tensor example is a colored image. In the most common format, an image has three color channels: Red, Green, and Blue (RGB). Each channel can be thought of as a matrix (like the grayscale image), and the three matrices combined form a 3-dimensional tensor. If the image is of resolution 100x100 pixels, the tensor's shape would be 100x100x3, with each slice of size 100x100 representing one of the RGB channels.
Summary & Conclusion
Scalar: A single numerical value.
Vector: A 1D array of values, often representing a series or sequence of numbers.
Matrix: A 2D array of values, typically visualized as a grid or table of numbers.
Tensor: An array of values with 3 or more dimensions, commonly used in applications requiring multi-channel or multi-modal data.
Data representation is crucial in domains like signal processing and machine learning. Gaining a practical understanding of how these data structures manifest in real-world engineering scenarios enables one to look beyond the terminology. Rather than being daunted by the jargon, professionals can leverage these data formats as powerful tools for both analysis and synthesis in their respective fields.