Cyclic Tower of Hanoi – analysis and implementation

The Tower of Hanoi problem consists of moving a size-ordered stack of n discs from one tower to another tower, out of three towers {A, B, C}, one disc at a time without putting a larger disc on top of a smaller one. The cyclic version of this problem adds the further constraint that a disc can only move through the towers in cycles, eg B -> C -> A.

Regardless of the details of the different versions, the problem has one useful invariant: the nth disc can only ever move to an empty tower because there are no discs larger than it. So wherever the nth disc moves, the top n-1 discs can move to any tower regardless of where the nth disc is without breaking the size-order rule. So this implies a recursive process in which we get the nth disc to it’s final position, but also apply that process to the top n-1 discs BEFORE applying it to the nth disc. But due to the cyclic constraint, we’ll need two versions of each process – one that moves m discs to an adjacent tower, and one that moves m discs to the post-adjacent tower. For moving one disc, there’s no difference than with applying the same function twice, but for n > 1 discs, moving to the post-adjacent tower can be done more efficiently than moving all the discs to the adjacent tower twice.

So first, we’ll represent our three towers using:

constexpr int A = 0, B = 1, C = 2, TOWERS = 3;

You could use enum classes, but ints are easier for now. Since we are doing this in cyclic order, we may as well define a function to get the next tower:

constexpr auto next_tower(auto _t)
{
  return (_t + 1) % TOWERS;
}

constexpr ensures compilation time evaluation to speed things up a bit but also is a requirement if we want to use template metaprogramming. The modulo wraps the tower addition around back to A(0). Next is to represent our towers and discs. The discs are represented by the natural numbers from 1 to n, thereby also representing their relative sizes and thus ordering constraint. An empty position is represented with a 0-disc. While we’re at it, we’ll also print out the towers for demonstration purposes.

int tower[3][12] = {{1,2,3,4,5,6,7,8,9,10,11,12}, {0}, {0}};
constexpr int DISCS = std::extent<decltype(tower), 1>::value;

const auto disc_width = std::to_string(DISCS).length();
std::string padding(disc_width + 1, '-');

/*
The towers are printed horizontally to avoid having to align columns.
--------------------------------2--7-11-14-15-16|A
--------------------------------1--3--5--8-10-13|B
--------------------------------------4--6--9-12|C
*/
void print_tower()
{
  for(auto t = 0; t < TOWERS; ++t)
  {
    for(auto d = 0; d < DISCS; ++d)
    {
      if(tower[t][d]) std::cout << "-" << std::setw(disc_width) << std::setfill('-') << tower[t][d];
      else std::cout << padding;
    }
    std::cout << "|" << static_cast<char>('A' + t);
    std::cout << std::endl;
  }
  std::cout << std::endl;
}

For verification purposes, you can use the algorithm std::is_sorted(…) on each tower to assert that the towers are do not violate size-ordering. For the disc transfer functions, I use template metaprogramming to illustrate compile-time generation of the move sequence. We must define the template definition before we define the specializations. The definition of the general function is to move n discs from one tower to another tower (doesn’t matter which). We’ll also include a conditional parameter for choosing moving to either an adjacent or post-adjacent tower.

auto rfind_zero(auto _first, auto _last)
{
  typedef std::reverse_iterator<decltype(_first)> riter;
  return std::find(riter(_last), riter(_first), 0).base();
}

template<int Discs, int Src, int Dst, bool direct = next_tower(Src) == Dst>
struct transfer;

template<int Src, int Dst>
struct transfer<1, Src, Dst, true>
{
  static inline void disc()
  {
    auto top_src = rfind_zero(tower[Src], tower[Src] + DISCS);
    auto top_dst = rfind_zero(tower[Dst], tower[Dst] + DISCS) - 1;
    std::iter_swap(top_src, top_dst);
    print_tower();
  }
};

The templated structure definition does not actually define a complete structure. This is okay as we should expect that we will cover all cases. Leaving it undefined like this means that if we made a mistake in our implementation, the compiler will tell us with a compiler error saying that there is no definition. The direct boolean is a default parameter calculated from the source and destination tower arguments. This means that in the specialization, the compiler will calculate it for us and we would not need to specify whether it is direct or not. This shows how the power of the strong static typing in C++ allows us to check algorithm validity without having to run it.

To move a disc from one tower to another, we find the top disc of the source tower, remove it, and place it on top of the destination tower. The top disc of the first tower is the first non-zero element in our source tower array. The top of the destination tower which we place our disc is the element before the first non-zero element in the destination tower array. I use a reverse linear search (std::find with reverse iterators) because the array is small and the first non-zero disc will regularly be closer to the bottom than the top (beginning of the array), but using binary search (std::upper_bound) will work equally well. I use std::iter_swap to do the actual transfer of the disc from one tower to the next since, by definition, it will always swap with a 0-disc, so it will look like we actually moved something.

Next we’ll need to specialize for moving one disc to a post-adjacent tower – the indirect specialization:

template<int Src, int Dst>
struct transfer<1, Src, Dst, false>
{
  static inline void disc()
  {
    transfer<1, Src, next_tower(Src)>::disc();
    transfer<1, next_tower(Src), Dst>::disc();
  }
};

This is the only case where moving to a post-adjacent tower is equal to moving to the adjacent tower twice. We can cheat a bit here where we can simulate two transfers by calling the direct specialization with the Src and Dst towers, as long as any metadata function (such as move counts, or move listing) is also simulated in the output.

These two specializations are the base cases and as you can see they will never fall down further to the undefined structure definition above. There is also another opportunity for optimization by keeping track of the top disc for each tower so that we can avoid the search. Then we’ll need to generalize these two functions for cases with more than one disc.

template<int Discs, int Src, int Dst>
struct transfer<Discs, Src, Dst, true>
{
  static inline void disc()
  {
    transfer<Discs-1, Src, next_tower(Dst)>::disc();
    transfer<1, Src, Dst>::disc();
    transfer<Discs-1, next_tower(Dst), Dst>::disc();
  }
};

To move an n-tower of discs to an adjacent tower, if we remember our invariant, we first have to make room so that we can move the nth disc into the final position. So first, we’ll need to move n-1 discs from the source tower to the tower after next. Then we move the nth disc to the adjacent tower. Now the destination tower is the post-adjacent tower to where our n-1 discs are temporarily stacked, so we’ll do another indirect transfer as before. Now you may wonder why we can call the indirect transfer for n > 1 discs before we specialize it. The answer is that we’ve only defined the templates, but we haven’t used them yet – we haven’t yet referred to a class where we’ve nailed down all of the template parameters, so they still only exist “in the ether”. But we have to specialize the remaining function before we make the actual call:

template<int Discs, int Src, int Dst>
struct transfer<Discs, Src, Dst, false>
{
  static inline void disc()
  {
    transfer<Discs-1, Src, Dst>::disc();
    transfer<1, Src, next_tower(Src)>::disc();
    transfer<Discs-1, Dst, Src>::disc();
    transfer<1, next_tower(Src), Dst>::disc();
    transfer<Discs-1, Src, Dst>::disc();
  }
};

To move an n-tower of discs to the post-adjacent tower, we must do a little bit more. In essence, we’ll have to make space twice to allow the nth disc to move to its final position. So we make space the first time by transferring the top n-1 discs to the post-adjacent tower. Then the nth disc moves to the adjacent tower. Then we move n-1 discs to its adjacent tower. The nth disc is moved again to its adjacent tower and has reached its destination tower. So all that’s left is the move the n-1 discs to the destination tower, which is an indirect transfer. Finally, we define our main function:

int main()
{
  print_tower();
  transfer<DISCS, A, B>()();

  return 0;
}

You can add a counter and check that the number of single disc transfers equals the numbers given by the following table:

http://www.wolframalpha.com/input/?i=%28%281+%2B+sqrt%283%29%29%5E%28n%2B1%29+-+%281+-+sqrt%283%29%29%5E%28n%2B1%29%29%2F%282+*+sqrt%283%29%29+-+1%2C+n+%3D+1%2C+16

Advertisement

Object Orientation is unsound

http://www.stlport.org/resources/StepanovUSA.html

I find OOP technically unsound. It attempts to decompose the world in terms of interfaces that vary on a single type. To deal with the real problems you need multisorted algebras – families of interfaces that span multiple types. I find OOP philosophically unsound. It claims that everything is an object. Even if it is true it is not very interesting – saying that everything is an object is saying nothing at all. I find OOP methodologically wrong. It starts with classes. It is as if mathematicians would start with axioms. You do not start with axioms – you start with proofs. Only when you have found a bunch of related proofs, can you come up with axioms. You end with axioms. The same thing is true in programming: you have to start with interesting algorithms. Only when you understand them well, can you come up with an interface that will let them work.

It took a while for me to understand generic programming aka templates in C++ and this is the insight that got me there.

Literally nothing fits into clean hierarchies and not everything is an object.