For our maths assignment we’ve been asked to research an application of linear algebra in the field of computing.

The maths this semester has been tough, as I’ve said before. We’ve done a ton of linear algebra, particularly matrices, and over the weeks I’ve been waiting for it to all start to come together, to help me understand what all this stuff has got to do with data science and big data in particular.

So when the assignment came out I was pleased to have the chance to try and understand a bit more about the context, about why we’re learning linear algebra and about how it all fits into a larger picture. When I started doing a bit of research I quickly realised the sheer number of applications linear algebra has. I had no idea.

The particular aspect of linear algebra I decided to focus on was Singular Value Decomposition (SVD), a technique for reducing the rank of a matrix through factorisation. SVD can be applied in a number of areas including:

- Computer vision,
- data mining (e.g. to remove redundant data that isn’t required for analysis, from a dataset),
- data compression,
- machine learning (e.g. in face recognition), and
- data science (e.g. recommender systems).

The application I decided to focus on was using SVD to compress a digital image. To help me understand it better I wrote a short Python program.

The program generates a series of 30 compressed images using the svd function in numpy, stamps each with the singular value from the SVD that’s been used to generate it and finally combines all of these images are combined into an animated GIF, so that the viewer can see the efect that the choice of singular values have on the quality of the compressed image.

Here’s the resultant animated GIF it produces.

Here’s the main part of the code.

Let’s break it down into steps. The program takes the name of an image and converts the image into a numpy array. It then decomposes it into three matrices.

It then takes these matrices, and uses the product of parts of them to compress the image. The image is stamped with the singular value used to generate it and written to the disk.

Finally, it takes the series of images and combines them into a single animated GIF.