Tensor vs Matrix: an example with computer vision

Today's post will try to answer the question: What is the difference between a tensor and a matrix?

The Matrix

If you have a basic understanding of algebra, you already know that a matrix is a rectangular arrangement of values, organized in rows and columns, that describes something. For example, I can create a matrix of 5 rows and 3 columns to represent the movie preferences of five different people. The following table represents a matrix M of movie preferences: one means the person likes the movie, and zero means they do not.

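| Name  | Kung-Fu Panda | Terminator 2 | White Chicks |
|-------|---------------|--------------|--------------|
| Juan  | 1             | 1            | 0            |
| Peter | 1             | 0            | 1            |
| Stacy | 0             | 1            | 0            |
| Ann   | 1             | 1            | 1            |
| Greg  | 1             | 1            | 0            |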

Mathematically speaking, we can represent this matrix like:

$$M = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \\ a_{51} & a_{52} & a_{53} \end{pmatrix}$$

Here $a_{11}$ is the first cell, which records whether Juan likes Kung-Fu Panda. If we replace each cell with the table values, our matrix will look like this:

$$M = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}$$

With this, we get to a more concrete definition of a matrix: a rectangular array of numbers, symbols, or expressions arranged in rows and columns. It is a two-dimensional object; a matrix with a single row or a single column behaves like a one-dimensional vector.
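As a minimal sketch, the same preference table can be written as a NumPy array. The variable names (`names`, `movies`, `likes_per_movie`) are chosen here for illustration and are not part of the original example.

```python
import numpy as np

names = ["Juan", "Peter", "Stacy", "Ann", "Greg"]
movies = ["Kung-Fu Panda", "Terminator 2", "White Chicks"]

# Rows are people, columns are movies (1 = likes, 0 = does not)
M = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
])

print(M.shape)  # (5, 3): 5 rows, 3 columns
print(M.ndim)   # 2: a matrix has two dimensions
print(M[0, 0])  # 1: Juan likes Kung-Fu Panda (NumPy indices start at 0)

# Summing down each column counts how many people like each movie
likes_per_movie = M.sum(axis=0)
print(dict(zip(movies, likes_per_movie.tolist())))
```

Note that the math notation starts counting at 1 (so the first cell is $a_{11}$), while NumPy starts at 0 (so the same cell is `M[0, 0]`).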

The Tensor

A tensor is like a matrix but with more dimensions. Yes, we can now build a 3D or 4D array! The best way to explain a tensor is by describing an image in its three channels: Red, Green, and Blue.

Let's assume we have a full-color image that is just 5 x 5 pixels in size. The image is described by three matrices together. The first one holds the Red intensity of every pixel, the second one the Green intensity, and the third one the Blue intensity. The combination of all three matrices is the final image.

A tensor is then an object that can hold three matrices of size 5 x 5 pixels. We usually describe this tensor as T with the shape T(5,5,3), where the 3 is the number of color channels.
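To make the shape convention concrete, here is a small sketch that builds an empty 5 x 5 x 3 array and slices one channel back out. It assumes the (height, width, channels) layout used in this post; other libraries may place the channel axis first.

```python
import numpy as np

# An all-black 5x5 RGB image: (height, width, channels)
T = np.zeros((5, 5, 3), dtype=np.uint8)

# Make the top-left pixel pure red by writing to channel 0
T[0, 0, 0] = 255

# Slicing out one channel gives back an ordinary 5x5 matrix
red = T[:, :, 0]

print(T.shape)            # (5, 5, 3)
print(red.shape)          # (5, 5)
print(T[0, 0].tolist())   # [255, 0, 0]
```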

The mathematical description of this tensor looks like this:

$$\mathcal{T} = \begin{pmatrix} \text{Red Channel} & \begin{pmatrix} r_{11} & r_{12} & r_{13} & r_{14} & r_{15} \\ r_{21} & r_{22} & r_{23} & r_{24} & r_{25} \\ r_{31} & r_{32} & r_{33} & r_{34} & r_{35} \\ r_{41} & r_{42} & r_{43} & r_{44} & r_{45} \\ r_{51} & r_{52} & r_{53} & r_{54} & r_{55} \end{pmatrix}, \\ \text{Green Channel} & \begin{pmatrix} g_{11} & g_{12} & g_{13} & g_{14} & g_{15} \\ g_{21} & g_{22} & g_{23} & g_{24} & g_{25} \\ g_{31} & g_{32} & g_{33} & g_{34} & g_{35} \\ g_{41} & g_{42} & g_{43} & g_{44} & g_{45} \\ g_{51} & g_{52} & g_{53} & g_{54} & g_{55} \end{pmatrix}, \\ \text{Blue Channel} & \begin{pmatrix} b_{11} & b_{12} & b_{13} & b_{14} & b_{15} \\ b_{21} & b_{22} & b_{23} & b_{24} & b_{25} \\ b_{31} & b_{32} & b_{33} & b_{34} & b_{35} \\ b_{41} & b_{42} & b_{43} & b_{44} & b_{45} \\ b_{51} & b_{52} & b_{53} & b_{54} & b_{55} \end{pmatrix} \end{pmatrix}$$

The tensor T is this 5x5x3 object. If we like, we can add a fourth channel called Alpha for opacity, which makes the shape 5x5x4. You can also keep adding as many dimensions as your heart desires.
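As a rough illustration of growing a tensor, the sketch below appends an alpha channel to an RGB array and then stacks several images into a batch. The batch size of 10 and the names `rgb`, `rgba`, and `batch` are arbitrary choices for this example.

```python
import numpy as np

rgb = np.zeros((5, 5, 3), dtype=np.uint8)

# One extra channel for opacity, fully opaque everywhere
alpha = np.full((5, 5, 1), 255, dtype=np.uint8)

# Appending along the last axis grows the channel count from 3 to 4
rgba = np.concatenate([rgb, alpha], axis=-1)
print(rgba.shape)   # (5, 5, 4)

# Stacking 10 images adds a whole new dimension in front
batch = np.stack([rgba] * 10, axis=0)
print(batch.shape)  # (10, 5, 5, 4)
print(batch.ndim)   # 4
```

Adding a channel changes the size of an existing dimension, while stacking images adds a new dimension, which is how a 3D image tensor becomes a 4D batch tensor.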

Sample Tensor in Python with No Deep Learning Library

The following code builds a tensor by hand from three color matrices and displays it as an image. It uses only NumPy and Matplotlib (no PyTorch or TensorFlow).

import numpy as np
import matplotlib.pyplot as plt

# Create a 5x5 matrix for the red channel
red_channel = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0]
])

# Create a 5x5 matrix for the green channel
green_channel = np.array([
    [0, 0, 0, 0, 0],
    [0, 255, 255, 255, 0],
    [0, 255, 0, 255, 0],
    [0, 255, 255, 255, 0],
    [0, 0, 0, 0, 0]
])

# Create a 5x5 matrix for the blue channel
blue_channel = np.array([
    [255, 255, 255, 255, 255],
    [255, 0, 0, 0, 255],
    [255, 0, 0, 0, 255],
    [255, 0, 0, 0, 255],
    [255, 255, 255, 255, 255]
])

# Stack the color channels to create a 5x5x3 tensor
image_tensor = np.stack([red_channel, green_channel, blue_channel], axis=-1)

# Plot the image
plt.imshow(image_tensor)
plt.title('5x5 Image Pattern')
plt.axis('off')  # Hide axes
plt.show()

The code above generated the following image:

This shows how tensor T, with pixels switched on and off in each channel, can represent an image. Matplotlib reads the channels stacked along the last axis and combines them to display the image.
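To see what the stacked tensor holds without plotting it, here is a short sketch that rebuilds the same three channels with slicing and prints a few pixels. The shorter names `red`, `green`, and `blue` are used only in this example.

```python
import numpy as np

red = np.zeros((5, 5), dtype=np.uint8)
red[2, 2] = 255                 # single red pixel in the center

green = np.zeros((5, 5), dtype=np.uint8)
green[1:4, 1:4] = 255           # 3x3 block...
green[2, 2] = 0                 # ...with a hole in the center

blue = np.full((5, 5), 255, dtype=np.uint8)
blue[1:4, 1:4] = 0              # only the border is blue

image_tensor = np.stack([red, green, blue], axis=-1)

print(image_tensor.shape)           # (5, 5, 3)
print(image_tensor[0, 0].tolist())  # [0, 0, 255]: border pixel is blue
print(image_tensor[1, 1].tolist())  # [0, 255, 0]: ring pixel is green
print(image_tensor[2, 2].tolist())  # [255, 0, 0]: center pixel is red
```

Each pixel is a vector of three values (R, G, B), so indexing with a row and a column returns the color at that position.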

New Tensor for Alpha Channel

We will slightly change the code to fill the entire Blue channel with blue, not just the boundary pixels. We also add a new matrix, Alpha, which is responsible for opacity.

import numpy as np
import matplotlib.pyplot as plt

# Create the 5x5 matrices for each color channel
red_channel = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0]
])

green_channel = np.array([
    [0, 0, 0, 0, 0],
    [0, 255, 255, 255, 0],
    [0, 255, 0, 255, 0],
    [0, 255, 255, 255, 0],
    [0, 0, 0, 0, 0]
])

# Create a 5x5 matrix for the blue channel with all pixels set to blue (255)
blue_channel = np.full((5, 5), 255, dtype=np.uint8)

# Create a 5x5 matrix for the alpha channel
alpha_channel = np.array([
    [255, 255, 255, 255, 255],
    [255, 128, 128, 128, 255],
    [255, 128, 128, 128, 255],
    [255, 128, 128, 128, 255],
    [255, 255, 255, 255, 255]
])

# Normalize alpha values to range [0, 1]
alpha_normalized = alpha_channel / 255.0

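# Scale the red channel by alpha (about 50% opacity where alpha is 128)
red_blended = red_channel * alpha_normalized

# Stack the color channels to create a 5x5x3 tensor.
# Cast back to uint8: imshow expects floats in [0, 1] or integers in [0, 255]
image_tensor = np.stack([red_blended, green_channel, blue_channel], axis=-1).astype(np.uint8)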

# Plot the image
plt.imshow(image_tensor)
plt.title('5x5 Image with Blue Channel Fully Set to 255')
plt.axis('off')  # Hide axes
plt.show()

This code displays the red pixel at about 50% opacity. Note that the alpha matrix is used as a filter: it is multiplied element-wise by one or more channels to produce the desired effect.
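The filter idea can also be applied to every channel at once. The sketch below is one possible way to do it with NumPy broadcasting; it is a simplified illustration with made-up names (`weights`, `filtered`), not the exact code from this post and not a full alpha-compositing routine.

```python
import numpy as np

rgb = np.zeros((5, 5, 3), dtype=np.uint8)
rgb[2, 2, 0] = 255        # red center pixel
rgb[:, :, 2] = 255        # blue everywhere

alpha = np.full((5, 5), 255, dtype=np.uint8)
alpha[1:4, 1:4] = 128     # roughly 50% opacity in the 3x3 center

# Broadcasting: (5, 5, 3) * (5, 5, 1) scales all channels of each pixel
weights = alpha[:, :, np.newaxis] / 255.0
filtered = (rgb * weights).round().astype(np.uint8)

print(filtered[0, 0].tolist())  # [0, 0, 255]: border pixel unchanged
print(filtered[2, 2].tolist())  # [128, 0, 128]: center pixel dimmed
```

The extra axis added with `np.newaxis` is what lets a single 5 x 5 alpha matrix act on all three color channels.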

Remember that there is a blue and green layer, so the colors with 50% opacity will be shown as follows:

Summary

Tensors are multi-dimensional arrays that can hold much more information than a simple 1D vector or 2D matrix. Tensors are used today in computer vision, deep learning, image and video representations, natural language processing, robotics, control systems, reinforcement learning, and recommender systems, among many others!

We usually need to do many calculations with tensors. Libraries such as TensorFlow and PyTorch can speed up those calculations and provide many ready-made operations.

If you want to see tensors used with TensorFlow to build a movie recommender system, please review my post on Movie Recommendations.