Sunday, April 14, 2019

The Subset Sum Problem






Hello visitors,

This blog was started as a way to discuss some interesting things about programmings done in my CS class. Meaning that I will probably make mistakes in my discussion, so consider this as an early warning. Nevertheless, we will start this blog off with the Subset Sum Problem.

After spending some time reading about Subset Sum, this problem is seem to be a common dynamic programming problem. However for my class, we will be tackling it using mainly logic and other optimizations.

Our Task

Given a list of numbers (which we will call the "master list"), we are tasked to find a subset whose sum is as large as possible, but not larger than a given target value.

Brute-Force Solution

The brute-force solution would simply be to create the power set of the given list of numbers and then go through each subset in the power set until we find the best possible subset. However, this would be pretty slow since we first have to make all the possible subsets (which is equal to 2^N, where N is the number of terms in the list), some of which may have a sum greater than the target.

Optimizing our Brute-Force Solution

The Subset Sum Problem already is inherently slow, but there are a few simple things we can do to optimize the brute force solution to achieve slightly better performance:
  1. Use loops to create the power set instead of recursion
  2. When creating the subsets, prune out subsets whose sums are bigger than the target value (since adding more terms to them will result in a bigger sum) and check for subsets whose sum is actually equal to the target value (since that is essentially the best solution available).
  3. Skip individual items in the master list whose value is greater than the target value (same as the region mentioned in point 2).
  4. Check to see if the master list is empty (because an empty set would then be the solution)
  5. Check if the sum of all the elements in the master list is less than the target (if it is, then the master list itself is the solution and we are done). 

Interesting Optimization Questions

I already mentioned that even with the kinds of optimizations above, the algorithm in the worst-case would still be slow. Along with the lab, we were given three questions pertaining to what other factors affect the performance of the algorithm:
  1. Does a particular order of processing the master set elements work better than others (e.g. Does sorting the list of items help?)
    • Typically sorting the list would not necessarily do much for performance.  Depending on how the list is sorted, it can knock out viable sets earlier in the creation process, but will still take around the same amount of time for an unsorted list.
  2. Does the value of the target matter (i.e. are certain targets likely to be reached faster?)
    • Low target values would generally cause the search for the best subset to be faster, since it does not require subsets with really big sums. High target values causes the search to become longer because it requires subsets with big sums.
  3. Does the nature of the probability distribution from which the master set elements are drawn matter? (e.g. Can the algorithm take advantage of the fact if you knew that the
    master set elements were drawn from a particular probability distribution (e.g. a
    Gaussian, uniform or exponential distribution, etc.)
    • If the algorithm knows what kinds of values would appear most often, then it could use them to make different combinations of subsets until it finds the best solution. Though, it might still take some significant amount of time when doing so. 

Conclusion

Overall, the Subset Sum Problem is inherently slow. The optimizations mentioned would provide a modest performance boost, but not a significant one.  Going through the optimization questions, the value of the target will be what causes the algorithm to take more or less time in searching for the best subset. And yeah, this is my take on the Subset Sum problem. I hope this was an interesting read, and please leave comments down below on related topics and/or mistakes that need correction.

Thank you for reading and have a good day!

1 comment:

  1. This is a wonderful post and great discussion. Thanks Eric.

    &

    ReplyDelete