Monday, April 22, 2019

Sparse Matrices

Hello everyone,

This week I am bringing you another discussion of a recent lab I was doing for CS class. This week I will be discussing about Sparse Matrices. As always, feel free to leave a comment on any questions you may have and and conceptual mistakes I may have made while writing this post.

What is a Sparse Matrix?

A sparse matrix is essentially a matrix with most of its elements being zero. An example would be that I have a 500 by 500 matrix to store integer values, but I only use about 100 spots out of the possible 250000. A sparse matrix in a computer program is looked down upon because we are wasting a lot of memory on a few values. There are some ways to lower the amount of memory used on sparse matrices, with one involving the use of arrays and linked lists.

An Ideal Implementation for Matrices

Since the lab I was doing was for a C++ course, I will be referring to the vector data type in the C++ Standard Template Library (STL). I will also be referring to the list data type in C++ STL as it is a linked list implementation.

Starting off, a typical matrix would be stored in a two-dimensional array. What is beneficial about this implementation is that it allows for quick, random access to elements within the matrix. The downside is that if the matrix is sparse, we are wasting memory.

In the implementation I will be discussing about, we will be utilizing an array and a linked list. The array will be initialized to be an array of linked lists, meaning each element within the array represents a row in the matrix. The linked lists will represent the columns, whose nodes will contain the column number and the data value being stored.

The idea behind this implementation is that we are only spending memory on a one-dimensional array. From there, our linked list will contain nodes for data values that are nonzero (or if you are doing a template approach, not a default value). If the linked list for a given row does not contain a node with a given column number, it is the equivalent of saying there is a zero at the given (row, column) coordinates.

The downside of this implementation is that access speed to elements is a bit slow, because even if we are given coordinates, we need to traverse the linked list at a given row searching for the given column value. But this implementation does save memory in the end.

Questions Given in the Lab

Just like the last post, this lab includes a few questions regarding to the kinds of things we can do in software to save memory:
Q: What is the trade-off of using software to give save resources instead of using better hardware?
A: If we utilize algorithms and software to save on the amount of resources used computers, I would say that we will be trading off speed for resource efficiency. I believe this is being done with zip files, where the software needs time to compress and package given files and uncompressed them when unzipped.

Q: vector::at() returns a reference that you can use as both an rvalue (RHS of assignment) as well as an lvalue. Is it possible to implement such an accessor for SparseMatrices (the class I was implementing)?
A: I would say it is possible to implement such an accessor for SparseMatrices. If vector::at() is returning the reference of an object and it is being called on the right-hand side of an assignment, then we are just assigning that object to another one. If we want our SparseMatrices accessor be used on the left-hand side of an assignment, the objects SparseMatrices contains need to have an overloaded assignment operator.

Q: In C++ two different functions cannot differ in their return types only. Discuss why you think this restriction exists in the language standard.
A: I believe this restriction exists in the language standard because of ambiguity. If we have two functions with the same name, amount of parameters, but different return type, how can it tell which return type to use.

Q: Does it make sense to keep your lists sorted?
A: If the nodes within a linked list were sorted such that columns on the left-most side appear first, then it would provide faster access time in the ideal case, since it will not have to search to far for a certain column number. However, in the worst case it may still require a traversal through the entire list.

Conclusion

This lab is a pretty interesting lab since we get to see how we can use simple techniques to effectively save on that sweet precious memory. Although there are probably more methods of implementing an efficient matrix, the implementation is considered to be a pretty simple one. I hope you enjoy this post. As mentioned before, feel free to leave a comment on anything relevant to the material presented here, questions, and mistakes that I have made when writing this post so I can fix them.

Thank you so much for reading!

1 comment:

  1. Awesome. Can you post these discussion points in the Canvas forum also?

    &

    ReplyDelete