Can you do third-grade math? [part 2]

Tue 25 August 2015 by Jeff Fischer

In part 1, we looked at a math problem found in my 3rd grade son's homework. We can formally state the stamps problem as:

Given a goal postage value V and a set of available stamp values S, find a multiset of stamps C = {c:sub:1, c₂, ... c_n} such that (1) all members of C are in S, (2) the sum of their values c₁ + c₂ ... + c_n = V, and (3) no smaller cardinality multiset C′ also satisfies properties 1 and 2.

A first solution

We saw in part one that a simple greedy algorithm does not always give the correct answer. We now consider an algorithm that always finds the right answer, without worrying too much about efficiency. To do this, we first notice that, given a stamp value s from our set of available values, there can be no more than V/s instances of s in the solution multiset C. If there were more, the sum of all members of C would be greater than V.

To find all the possible solutions, we look at all combinations of counts within the range [0, V/s:sub:`i`] for all s:sub:`i` in S. For example, given V = 42 and S = {50, 22, 10}, we get: [0] × [0, 1] × [0, 1, 2, 3, 4] = [0, 0, 0], [0, 1, 0], [0, 1, 1], [0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 0, 1], ...

If we take the product of each count and its associated stamp value and sum these up, and the total is V, then we have a solution. We keep track of the shortest solution so far. Once we have tried all possible combinations, we return the best solution we found. To easily iterate through all combinations of stamp counts, we use the Python function itertools.product(), which iterates through the cartesian product of a list of vectors. Here is the Python code:

from itertools import product
STAMP_SIZES = (50, 22, 10)

def product_solve(value, stamp_sizes=STAMP_SIZES):
    iterators = [range((value/s)+1) for s in stamp_sizes]
    best_result = None
    for p in product(*iterators):
        # Each p is an array with the length equal to len(stamp_sizes),
        # where p[i] is the count to be used for stamp_sizes[i]
        total = 0
        result_len = 0
        for (i, cnt) in enumerate(p):
            stamp_value = cnt * stamp_sizes[i]
            total += stamp_value
            result_len += cnt
            if total>value:
                break
        if total==value:
            # we have a solution
            if best_result is None or result_len<len(best_result):
                # new best result - build up the array
                best_result = []
                for (i, cnt) in enumerate(p):
                    best_result.extend([stamp_sizes[i],]*cnt)
    return best_result

This gives us the expected results for the cases we've looked at so far:

>>> product_solve(42)
product_solve(42)
[22, 10, 10]
>>> product_solve(76)
product_solve(76)
[22, 22, 22, 10]
>>> product_solve(55, stamp_sizes=[50, 11, 1])
product_solve(55, stamp_sizes=[50, 11, 1])
[11, 11, 11, 11, 11]

Let's try a harder test case. We'll use the first eleven primes as our stamp sizes and a bigger value:

>>> primes = [1, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
primes = [1, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
>>> product_solve(392, stamp_sizes=primes)
product_solve(392, stamp_sizes=primes)

When running this, the program seems to hang. Think about how many iterations must be processed for this case: the product of (1+392/1) × (1+392/2) × ... (1+392/29). If each iteration takes a tenth of a millisecond to process, it will take over 20 million years to get the answer!

The complexity of the Stamps Problem

Our stamps problem is a actually a variation of the Subset Sum Problem:

Given a set (or multiset) of integers and a target value V, is there a non-empty subset whose sum is V?

Usually the problem is stated, without loss of generality, with V = 0. This problem is NP Complete, a complexity class for which there are no known polynomial time algorithms. Thus, the amount of computation required to solve the problem increases very quickly with the problem's size, as we saw with the above example.

A faster solution

We will use a Dynamic Programming approach to solve this problem more efficiently. In dynamic programming, the problem is broken down into smaller subproblems and the result of each subproblem is saved ("memoized") rather than recomputing it each time it is needed. In our solution, we will memoize the best solution found for each subproblem encountered, where a subproblem is the stamps problem for a goal value less than V.

There will be three nested loops: first over the length of the solution, second over the stamp sizes, and the third over the memoized solutions. In the innermost loop, we add each of the stamp values to all the optimal subproblem solutions found so far and see if any new optimal solutions are found. Note that this increases the length of the solution by one and also the maximum subproblem value.

We will make use of two properties of the problem's structure:

Due to the associative property of addition, we only need to memoize one optimal solution to each subproblem. For example, given a subproblem goal value of 12, the solution multisets [6, 6], [8, 4], and [4, 8] are all equivalent.
Assume we have found all optimal solutions of length at most N for any subproblems with goal value at most V. If extending the solution length to N + 1 does not yield any more optimal solutions, then no optimal solutions for subproblems with goal value at most V exist for any length > N. This lets us know when to stop our search. To see this is true, consider that, if an iteration does not add any new solutions, further iterations will run with the exact same inputs.

Here is the Python code for our dynamic programming solution:

STAMP_SIZES = (50, 22, 10)

def dynamic_solve(value, stamp_sizes=STAMP_SIZES):
    """Dynamic programming solution to stamps problem:
    """
    # map from value to best solution so far for that value
    best_so_far = {s:[s,] for s in stamp_sizes}
    # base case
    if value in best_so_far:
        return [value,]
    new_cases_added = True
    # We start with the base cases and keep increasing the length of the
    # solutions by one. If we increase the length and find no new optimal
    # solutions below the value, then no solution exists.
    while new_cases_added:
        new_cases_added = False
        for s in stamp_sizes:
            for v in best_so_far.keys():
                subtotal = v+s
                if subtotal==value:
                    return best_so_far[v] + [s,]
                elif subtotal<value and (subtotal not in best_so_far):
                    best_so_far[subtotal] = best_so_far[v] + [s,]
                    new_cases_added = True
    return None # no solution found

Here are the results we get when we run this solver on our test cases:

>>> dynamic_solve(42)
[10, 22, 10]
>>> dynamic_solve(76)
[22, 22, 10, 22]
>>> dynamic_solve(55, stamp_sizes=[50, 11, 1])
[11, 11, 11, 11, 11]
>>> primes = [1, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
>>> dynamic_solve(392, stamp_sizes=primes)
[29, 1, 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 1, 2, 3, 5, 7, 11, 13, 17, 19,
 23, 29, 2, 5, 7, 11, 13, 17, 19, 29]

The dynamic_solve() function found the correct answer for our test cases, including the pathological primes test case. In fact, it solved the problem in only 1.37 milliseconds, instead of taking millions of years! Thus, the design of the algorithm makes a huge difference for these types of problems.

Conclusion

This clearly is not a third-grade-level math problem. At UCLA, I think they might cover this type of problem in CS180, the upper-division algorithms class. Of course, I did not go into all these details with my son — I just helped him try to find solutions intuitively. Personally, I'm not sure what value this type of problem has for a third grader. I would rather have him focus on solving problems based on the theory he has actually learned to this point, instead of guessing his way through problems he does not fully understand. I could see discrete math, algorithms, and complexity theory taught at the High School level — I think it should be part of the standard AP/honors curriculum, it really is no harder than Calculus.

You can find the example code from this series of blog posts on Github at https://github.com/jfischer/blog-examples/tree/master/2015-08-stamps-problem.

If you enjoyed my explaination of this problem, I strongly recommend that you look at Peter Norvig's exposition of the Traveling Salesman Problem.