Cyclic Tower of Hanoi – analysis and implementation

The Tower of Hanoi problem consists of moving a size-ordered stack of n discs from one tower to another tower, out of three towers {A, B, C}, one disc at a time without putting a larger disc on top of a smaller one. The cyclic version of this problem adds the further constraint that a disc can only move through the towers in cycles, eg B -> C -> A.

Regardless of the details of the different versions, the problem has one useful invariant: the nth disc can only ever move to an empty tower because there are no discs larger than it. So wherever the nth disc moves, the top n-1 discs can move to any tower regardless of where the nth disc is without breaking the size-order rule. So this implies a recursive process in which we get the nth disc to it’s final position, but also apply that process to the top n-1 discs BEFORE applying it to the nth disc. But due to the cyclic constraint, we’ll need two versions of each process – one that moves m discs to an adjacent tower, and one that moves m discs to the post-adjacent tower. For moving one disc, there’s no difference than with applying the same function twice, but for n > 1 discs, moving to the post-adjacent tower can be done more efficiently than moving all the discs to the adjacent tower twice.

So first, we’ll represent our three towers using:

constexpr int A = 0, B = 1, C = 2, TOWERS = 3;

You could use enum classes, but ints are easier for now. Since we are doing this in cyclic order, we may as well define a function to get the next tower:

constexpr auto next_tower(auto _t)
{
  return (_t + 1) % TOWERS;
}

constexpr ensures compilation time evaluation to speed things up a bit but also is a requirement if we want to use template metaprogramming. The modulo wraps the tower addition around back to A(0). Next is to represent our towers and discs. The discs are represented by the natural numbers from 1 to n, thereby also representing their relative sizes and thus ordering constraint. An empty position is represented with a 0-disc. While we’re at it, we’ll also print out the towers for demonstration purposes.

int tower[3][12] = {{1,2,3,4,5,6,7,8,9,10,11,12}, {0}, {0}};
constexpr int DISCS = std::extent<decltype(tower), 1>::value;

const auto disc_width = std::to_string(DISCS).length();
std::string padding(disc_width + 1, '-');

/*
The towers are printed horizontally to avoid having to align columns.
--------------------------------2--7-11-14-15-16|A
--------------------------------1--3--5--8-10-13|B
--------------------------------------4--6--9-12|C
*/
void print_tower()
{
  for(auto t = 0; t < TOWERS; ++t)
  {
    for(auto d = 0; d < DISCS; ++d)
    {
      if(tower[t][d]) std::cout << "-" << std::setw(disc_width) << std::setfill('-') << tower[t][d];
      else std::cout << padding;
    }
    std::cout << "|" << static_cast<char>('A' + t);
    std::cout << std::endl;
  }
  std::cout << std::endl;
}

For verification purposes, you can use the algorithm std::is_sorted(…) on each tower to assert that the towers are do not violate size-ordering. For the disc transfer functions, I use template metaprogramming to illustrate compile-time generation of the move sequence. We must define the template definition before we define the specializations. The definition of the general function is to move n discs from one tower to another tower (doesn’t matter which). We’ll also include a conditional parameter for choosing moving to either an adjacent or post-adjacent tower.

auto rfind_zero(auto _first, auto _last)
{
  typedef std::reverse_iterator<decltype(_first)> riter;
  return std::find(riter(_last), riter(_first), 0).base();
}

template<int Discs, int Src, int Dst, bool direct = next_tower(Src) == Dst>
struct transfer;

template<int Src, int Dst>
struct transfer<1, Src, Dst, true>
{
  static inline void disc()
  {
    auto top_src = rfind_zero(tower[Src], tower[Src] + DISCS);
    auto top_dst = rfind_zero(tower[Dst], tower[Dst] + DISCS) - 1;
    std::iter_swap(top_src, top_dst);
    print_tower();
  }
};

The templated structure definition does not actually define a complete structure. This is okay as we should expect that we will cover all cases. Leaving it undefined like this means that if we made a mistake in our implementation, the compiler will tell us with a compiler error saying that there is no definition. The direct boolean is a default parameter calculated from the source and destination tower arguments. This means that in the specialization, the compiler will calculate it for us and we would not need to specify whether it is direct or not. This shows how the power of the strong static typing in C++ allows us to check algorithm validity without having to run it.

To move a disc from one tower to another, we find the top disc of the source tower, remove it, and place it on top of the destination tower. The top disc of the first tower is the first non-zero element in our source tower array. The top of the destination tower which we place our disc is the element before the first non-zero element in the destination tower array. I use a reverse linear search (std::find with reverse iterators) because the array is small and the first non-zero disc will regularly be closer to the bottom than the top (beginning of the array), but using binary search (std::upper_bound) will work equally well. I use std::iter_swap to do the actual transfer of the disc from one tower to the next since, by definition, it will always swap with a 0-disc, so it will look like we actually moved something.

Next we’ll need to specialize for moving one disc to a post-adjacent tower – the indirect specialization:

template<int Src, int Dst>
struct transfer<1, Src, Dst, false>
{
  static inline void disc()
  {
    transfer<1, Src, next_tower(Src)>::disc();
    transfer<1, next_tower(Src), Dst>::disc();
  }
};

This is the only case where moving to a post-adjacent tower is equal to moving to the adjacent tower twice. We can cheat a bit here where we can simulate two transfers by calling the direct specialization with the Src and Dst towers, as long as any metadata function (such as move counts, or move listing) is also simulated in the output.

These two specializations are the base cases and as you can see they will never fall down further to the undefined structure definition above. There is also another opportunity for optimization by keeping track of the top disc for each tower so that we can avoid the search. Then we’ll need to generalize these two functions for cases with more than one disc.

template<int Discs, int Src, int Dst>
struct transfer<Discs, Src, Dst, true>
{
  static inline void disc()
  {
    transfer<Discs-1, Src, next_tower(Dst)>::disc();
    transfer<1, Src, Dst>::disc();
    transfer<Discs-1, next_tower(Dst), Dst>::disc();
  }
};

To move an n-tower of discs to an adjacent tower, if we remember our invariant, we first have to make room so that we can move the nth disc into the final position. So first, we’ll need to move n-1 discs from the source tower to the tower after next. Then we move the nth disc to the adjacent tower. Now the destination tower is the post-adjacent tower to where our n-1 discs are temporarily stacked, so we’ll do another indirect transfer as before. Now you may wonder why we can call the indirect transfer for n > 1 discs before we specialize it. The answer is that we’ve only defined the templates, but we haven’t used them yet – we haven’t yet referred to a class where we’ve nailed down all of the template parameters, so they still only exist “in the ether”. But we have to specialize the remaining function before we make the actual call:

template<int Discs, int Src, int Dst>
struct transfer<Discs, Src, Dst, false>
{
  static inline void disc()
  {
    transfer<Discs-1, Src, Dst>::disc();
    transfer<1, Src, next_tower(Src)>::disc();
    transfer<Discs-1, Dst, Src>::disc();
    transfer<1, next_tower(Src), Dst>::disc();
    transfer<Discs-1, Src, Dst>::disc();
  }
};

To move an n-tower of discs to the post-adjacent tower, we must do a little bit more. In essence, we’ll have to make space twice to allow the nth disc to move to its final position. So we make space the first time by transferring the top n-1 discs to the post-adjacent tower. Then the nth disc moves to the adjacent tower. Then we move n-1 discs to its adjacent tower. The nth disc is moved again to its adjacent tower and has reached its destination tower. So all that’s left is the move the n-1 discs to the destination tower, which is an indirect transfer. Finally, we define our main function:

int main()
{
  print_tower();
  transfer<DISCS, A, B>()();

  return 0;
}

You can add a counter and check that the number of single disc transfers equals the numbers given by the following table:

http://www.wolframalpha.com/input/?i=%28%281+%2B+sqrt%283%29%29%5E%28n%2B1%29+-+%281+-+sqrt%283%29%29%5E%28n%2B1%29%29%2F%282+*+sqrt%283%29%29+-+1%2C+n+%3D+1%2C+16

Object Orientation is unsound

http://www.stlport.org/resources/StepanovUSA.html

I find OOP technically unsound. It attempts to decompose the world in terms of interfaces that vary on a single type. To deal with the real problems you need multisorted algebras – families of interfaces that span multiple types. I find OOP philosophically unsound. It claims that everything is an object. Even if it is true it is not very interesting – saying that everything is an object is saying nothing at all. I find OOP methodologically wrong. It starts with classes. It is as if mathematicians would start with axioms. You do not start with axioms – you start with proofs. Only when you have found a bunch of related proofs, can you come up with axioms. You end with axioms. The same thing is true in programming: you have to start with interesting algorithms. Only when you understand them well, can you come up with an interface that will let them work.

It took a while for me to understand generic programming aka templates in C++ and this is the insight that got me there.

Literally nothing fits into clean hierarchies and not everything is an object.

Easy adapter pattern with object map

Say you want to have an a class that allows dynamic adaptation. You can’t create it as a template class to provide “policies” in policy based design because different policies will create distinct classes. This will prevent different adapter classes from attaching to the same object.

You also wouldn’t want to require adapter classes to have to inherit from some interface and introduce an unneeded dependency through virtual inheritance.

Ideally, you’d want to have the flexibility of templates but in a dynamic context. This is the kind of container you’d need to achieve this:

#include <unordered_map>
#include <typeindex>
#include <memory>
.
.
.
std::unordered_map<std::type_index, std::shared_ptr<void>> object_map;

type_index is a C++11 feature that allows types to have hash and weak ordering semantics. To get rid of the boilerplate code to use this container, you’d have interface functions like these:

template<typename T, typename... Args>
void add_object(Args... args)
{
  object_map[std::type_index{typeid(T)}] = std::static_pointer_cast<void>(std::make_shared<T>(args...));
}

template<typename T>
T& get_object()
{
  return *static_cast<T*>(object_map[std::type_index{typeid(T)}].get());
}

shared_ptr is used because it is the only smart pointer that supports a void type as well as having nice helper functions to cast to shared_ptrs of other types. It also uses variadic templates to allow construction via make_shared, which is the exception safe way to create a shared_ptr. What’s also nice about it is that the casting to a void shared_ptr does not throw away custom deleters, as proven by this code:

struct A
{
  static int gen;

  A()
  {
    ++gen;
  }

  void operator()()
  {
    std::clog <<  "Run A: " << gen <<  std::endl;
  }

  ~A()
  {
    std::cerr <<  "Destructing A" <<  std::endl;
  }
};

int A::gen = 0;

int main()
{
  auto aptr = std::shared_ptr<A>(new A(), [](A* _p)
    {
      std::cerr << "Custom delete A" <<  std::endl;
      delete _p;
    });
  auto voidptr = std::static_pointer_cast(aptr);
  aptr.reset();
  (*std::static_pointer_cast<A>(voidptr).get())();
  std::clog << "Number of owners: " <<  voidptr.use_count() << " " << aptr.use_count() << std::endl;

  return 0;
}

A has destructor to signal that it is in fact being deleted. aptr has a custom deleter. After being cast to a shared_ptr<void>, aptr is reset to show that it isn't the custom deleter in aptr that is destroying the A object but the one that is casted into voidptr.

The output of that test is:

Run A: 1
Number of owners: 1 0
Custom delete A
Destructing A

So back to the object map. This simple test demonstrates it working:

std::clog << "Object map test" << std::endl;
add_object<A>();
add_object<int>(42);
add_object<double>(2.71828);
add_object<std::string>("Hello world!");

std::cout << get_object<std::string>() << " " << get_object<int>() << " " << get_object<double>() << std::endl;
get_object<A>()();

The output of this test is:

Object map test
Hello world! 42 2.71828
Run A: 2
Destructing A

Radix sort – another language comparison exercise

I was reading the Wikipedia page for Radix sort[1] and I noticed a pretty terrible example for C++, so I set about writing my own, which I’ve included in the wiki as a modernized C++14 version.

#include <iostream>
#include <algorithm>
#include <array>
#include <vector>
#include <random>

int main()
{
  using INT = std::mt19937_64::result_type;
  std::array<INT, 200000> randoms;
  std::generate(randoms.begin(), randoms.end(), [engine = std::mt19937_64{}]() mutable {return engine();});
  std::array<std::vector<INT>, 256> buckets;

  for (auto r : randoms) std::cout << r << std::endl;
  std::cout << std::endl;

  for (int i = 0; i < sizeof(INT); ++i)
  {
    for (auto r : randoms) buckets[(r >> (i << 3)) & 0xFF].push_back(r);

    auto j = randoms.begin();
    for (auto& bucket : buckets)
    {
      for (auto b : bucket) *(j++) = b;
      bucket.clear();
    }
  }

  for (auto r : randoms) std::cout << r << std::endl;

  return 0;
}

1: http://en.wikipedia.org/wiki/Radix_sort

More on brevity and clarity

Continuing on from the previous post, there was another side challenge to implement the POSIX utility wc. Someone claimed C++ makes things unnecessarily hard and the challenge was supposed to prove it. Well, it was simple and I threw in a simple (incomplete) SLOC counter as well. The challenger couldn’t argue that C++ made it hard to implement wc, and so decided to nitpick on small things that do not even relate to the challenge at hand, mostly around coding style preferences that have nothing to do with the ease of implementing the core functionality of wc.

I’m by no means the best C++ coder in terms of complexity or style. Judge for yourself whether or not this was impossible to do cleanly in C++:

#include <iostream>
#include <fstream>
#include <sstream>
#include <algorithm>

enum class char_opts
{
  BYTES, 
  CHARS, 
  NUM_OPTS
};

void count(std::istream& _in, unsigned& sloc_count, unsigned &line_count, unsigned &word_count, unsigned &char_count, unsigned &byte_count, unsigned &max_line_length, unsigned &find_count, const std::string &str)
{
  std::string line;
  std::getline(_in,  line);
  bool in_block_comment = false;
  for (unsigned lc = 0; _in; std::getline(_in, line), ++lc)
  {
    byte_count += line.length();
    char_count += line.length();
    if (!_in.eof())
    {
      ++line_count;
      ++byte_count;
      ++char_count;
    }
    max_line_length = std::max<unsigned>(max_line_length, line.length());
    if (!str.empty()) for (auto s = line.find(str); s != std::string::npos; s = line.find(str, s+1), ++find_count);

    std::istringstream line_str{line};
    std::skipws(line_str);
    std::string word;
    line_str >> word;
    for (; line_str; line_str >> word) ++word_count;

    auto trimmed = line;
    trimmed.erase(0, trimmed.find_first_not_of(" t"));
    auto trailing = trimmed.find_last_not_of(" t");
    if (trailing != std::string::npos) trimmed.erase(trailing);
    if (!trimmed.empty() && trimmed != "{" && trimmed != "}" && trimmed.find("//") != 0) ++sloc_count;
  }
}

int main(int _c, char** _v)
{
  char_opts copts = char_opts::NUM_OPTS;
  bool sloc = false;
  bool lines = false;
  bool words = false;
  bool line_length = false;
  std::string str;

  bool opts_supplied = false;

  auto args = _v + 1;
  const auto end = _v + _c;
  for (; args < end; ++args)
  {
    std::string arg{*args};
    if (arg == "-" || arg[0] != '-') break;

    if (arg == "-c" || arg == "-bytes") copts = char_opts::BYTES;
    else if (arg == "-m" || arg == "-chars") copts = char_opts::CHARS;
    else if (arg == "-L" || arg == "-max-line-length") line_length = true;
    else if (arg == "-sloc") sloc = true;
    else if (arg == "-l" || arg == "-lines") lines = true;
    else if (arg == "-w" || arg == "-words") words = true;
    else if (arg == "-o") str = *++args;
    else
    {
      std::cerr << "Invalid argument '" <<  arg << ''' << std::endl;
      return -1;
    }

    opts_supplied = true;
  }

  if (!opts_supplied)
  {
    copts = char_opts::BYTES;
    lines = true;
    words = true;
    line_length = true;
  }

  unsigned file_count = 0;
  unsigned total_sloc_count = 0;
  unsigned total_line_count = 0;
  unsigned total_word_count = 0;
  unsigned total_char_count = 0;
  unsigned total_byte_count = 0;
  unsigned total_max_line_length = 0;
  unsigned total_find_count = 0;
  for (bool no_file = args == end; no_file || args < end; ++args, ++file_count, no_file = false)
  {
    std::string filename{no_file ? "" : *args};
    unsigned sloc_count = 0;
    unsigned line_count = 0;
    unsigned word_count = 0;
    unsigned char_count = 0;
    unsigned byte_count = 0;
    unsigned max_line_length = 0;
    unsigned find_count = 0;
    if (no_file || filename == "-") std::cin.clear();
    count(no_file || filename == "-" ? std::cin : std::move(std::ifstream{filename}), sloc_count, line_count, word_count, char_count, byte_count, max_line_length, find_count, str);
    std::cout << (sloc ? std::to_string(sloc_count) + " " : "")
          << (lines ? std::to_string(line_count) + " " : "")
          << (words ? std::to_string(word_count) + " " : "")
          << (copts != char_opts::NUM_OPTS ? std::to_string(copts == char_opts::BYTES ? byte_count : char_count) + " " : "")
          << (line_length ? std::to_string(max_line_length) + " " : "")
          << (!str.empty() ? std::to_string(find_count) + " " : "")
          << filename <<  std::endl;

    total_sloc_count += sloc_count;
    total_line_count += line_count;
    total_word_count += word_count;
    total_char_count += char_count;
    total_byte_count += byte_count;
    total_max_line_length = std::max(total_max_line_length, max_line_length);
  }

  if (file_count > 1) std::cout << (sloc ? std::to_string(total_sloc_count) + " " : "")
        << (lines ? std::to_string(total_line_count) + " " : "")
        << (words ? std::to_string(total_word_count) + " " : "")
        << (copts != char_opts::NUM_OPTS ? std::to_string(copts == char_opts::BYTES ? total_byte_count : total_char_count) + " " : "")
        << (line_length ? std::to_string(total_max_line_length) + " " : "")
        << (!str.empty() ? std::to_string(total_find_count) + " " : "")
        << "total" <<  std::endl;

  return 0;
}

Interesting exercise in brevity and clarity

Recently had a discussion and challenge in comparing two languages, C++ and Python. I think modern C++ is holds up really well to so-called scripting languages to do quick and dirty utility programs. This is a reasonably short implementation of a prime number finder:

#include <cstdio>
inline bool prime(const auto _candidate, const auto *_first, const auto *_last) {
  for (auto p = _first; p != _last && *p * *p <= _candidate; ++p)
    if (_candidate % *p == 0)
      return false;
  return true;
}
int main(int _c, char** _v) {
  const unsigned num_primes = 10000;
  static unsigned primes[num_primes] = {2, 3};
  for (unsigned i = 2; i < num_primes; ++i)
    for (primes[i] = primes[i-1] + 2; !prime(primes[i], primes + 1, primes + i); primes[i] += 2);
  printf("The %uth prime is: %u.n", num_primes, primes[num_primes - 1]);
  return 0;
}

Several things. Mostly I’ve learned to re-embrace the spirit of C/C++ for brevity, such as single statement if and for blocks. But you really have to think about readability when you code in that style. The brief C style is only bad if it’s done without consideration about code aesthetics. The brief style shouldn’t be about reducing line count, but about increasing readability. It is a bit counter-intuitive coming from a university education that told you to put every if block in braces over multiple lines.

When coded in such a manner, modern C++ can approach the ease of writing that languages like Python enjoy.

Template meta-programming rule of thumb

“Use template meta-programming to express design, not to express computation.”

Various explanations of template meta-programming uses examples like a compile-time Fibonnacci sequence. What those tutorials should be focusing on is how to use template meta-programming to hide incidental requirements of interfaces.

The ultimate goal of template meta-programming is to enable code like this:

int main()
{
    do_what_i_expect(/* args */);
    return 0;
}

Making templates easier with named-arguments

Say you have a template that requires a large number of arguments. ie, more than three. This is not a very clear or concise interface and not self-documenting. Any thing with a large number of arguments suffers from the same problem. You can give default arguments but if you just want to provide one argument that happens to be after one or more other default arguments, you have to provide those as well and you have to know the defaults if you want to keep the default behaviour bar the one you want to modify. My solution to provide named template arguments is this:

template<typename NumType,
         NumType TLow = std::numeric_limits<NumType>::lowest(),
         NumType TMin = std::numeric_limits<NumType>::min(),
         NumType TMax = std::numeric_limits<NumType>::max(),
         NumType TDef = 0,
         NumType TInv = -1,
         typename Specials = TestList<NumType>,
         typename Excludes = TestList<NumType> >
struct NumWrapper
{
    static constexpr NumType low = TLow;
    static constexpr NumType min = TMin;
    static constexpr NumType max = TMax;
    static constexpr NumType def = TDef;
    static constexpr NumType inv = TInv;
    typedef Specials inc_type;
    typedef Excludes exc_type;

    template<NumType Special>
    struct Min
    {
        typedef NumWrapper<NumType, low, Special, max, def, inv, Specials, Excludes> type;
    };

    template<NumType Special>
    struct Max
    {
        typedef NumWrapper<NumType, low, min, Special, def, inv, Specials, Excludes> type;
    };

    template<NumType Special>
    struct Low
    {
        typedef NumWrapper<NumType, Special, min, max, def, inv, Specials, Excludes> type;
    };

    template<NumType Special>
    struct Def
    {
        typedef NumWrapper<NumType, low, min, max, Special, inv, Specials, Excludes> type;
    };

    template<NumType Special>
    struct Inv
    {
        typedef NumWrapper<NumType, low, min, max, def, Special, Specials, Excludes> type;
    };

    template<NumType... List>
    struct Inc
    {
        typedef NumWrapper<NumType, low, min, max, def, inv, TestList<NumType, List...>, exc_type> type;
    };

    template<NumType... List>
    struct Exc
    {
        typedef NumWrapper<NumType, low, min, max, def, inv, inc_type, TestList<NumType, List...> > type;
    };

    NumType val = def;
    NumWrapper() = default;
    NumWrapper(NumType _v) : val(_v) {}

    operator NumType ()
    {
        return val;
    }
};

This class is something I needed to quickly create data ranges really easily in order to generate values for testing. I may want to provide a different minimum that is different from the underlying type but use the std::numeric_limits for the other values, or I may want to provide extra values that have a special meaning within the context of its use.

The named argument effect is achieved by declaring nested classes in the NumWrapper classes that have an internal typedef that creates a new NumWrapper type from the enclosing template instantiation. The internal typedef only instantiates on the template argument they “name”, and use the rest of the values from the enclosing template instantiation. The use of default template arguments in the main definition, and the inheritance of those arguments as you continue the typedef chain means the user does not then have to provide those values if they don’t want to.

Take special note that, as the library developer, you will of course need to know the order of the template arguments. You just have to make it so that the user of your library does not have to know the order.

Declaring a new integer type becomes as simple as this:

typedef NumWrapper<short>::Max<9999>::type::Def<1>::type::Min<-10>::type::Inc<2,3,5,7,11,13,17,19>::type MyIntegralType;

It also makes it easy to figure out what the expected range and values of this integral type should be. You can even automate its specialization for std::numeric_limits:

namespace std
{
    template<typename NumType, NumType TLow, NumType TMin, NumType TMax, NumType TDef, NumType TInv, typename Specials, typename Excludes>
    struct numeric_limits<NumWrapper<NumType, TLow, TMin, TMax, TDef, TInv, Specials, Excludes> > : numeric_limits<NumType>
    {
        static constexpr bool is_specialized = true;
        static constexpr NumType min()
        {
            return NumWrapper<NumType, TLow, TMin, TMax, TDef, TInv, Specials, Excludes>::min;
        }

        static constexpr NumType max()
        {
            return NumWrapper<NumType, TLow, TMin, TMax, TDef, TInv, Specials, Excludes>::max;
        }

        static constexpr NumType lowest()
        {
            return NumWrapper<NumType, TLow, TMin, TMax, TDef, TInv, Specials, Excludes>::low;
        }
    };

    template<typename NumType, NumType TLow, NumType TMin, NumType TMax, NumType TDef, NumType TInv, typename Specials, typename Excludes>
    struct is_arithmetic<NumWrapper<NumType, TLow, TMin, TMax, TDef, TInv, Specials, Excludes>> : is_arithmetic<NumType> {};
}

This way, the user will never have to specialize std::numeric_limits ever again.

One final note, you can use preprocessor macros to make this even more easier to write:

#define NUMTYPE(type) NumWrapper<type>
#define TLOW(low) ::Low<low>::type
#define TMIN(min) ::Min<min>::type
#define TMAX(max) ::Max<max>::type
#define TDEF(def) ::Def<def>::type
#define TINV(inv) ::Inv<inv>::type
#define TINC(...) ::Inc<__VA_ARGS__>::type
#define TEXC(...) ::Exc<__VA_ARGS__>::type

typedef NUMTYPE(short)TMAX(9999)TDEF(1)TMIN(-10)TINC(2,3,5,7,11,13,17,19) MyIntegralType;

Almost lambdas in C99 – faking local variable capture

Function objects, and therefore lambdas, are not possible in C. However, if we think about one of the main benefits of lambdas – local variable capture – we can write both functions that can “capture” local variables, and functions that use those functions. Say for example you need to read from a socket. You can write the same socket reading routine each time, and handle all the errors and signals each time, handle the buffer management each time, or you can write a “generic” function that hides all the details:

int socket_read(int sock, bool (*func)(char*, ssize_t, va_list), ...)
{
    ssize_t rc = 1024;
    char buffer[rc];
    va_list args;
    va_start(args, func);
    while(!func(buffer, rc, args))
    {
        va_end(args);
        va_start(args, func);
        errno = 0;
        rc = recv(sock, buffer, sizeof(buffer), MSG_PEEK);
        if(rc > 0)
        {
            rc = recv(sock, buffer, rc, 0);
        }
        else if(rc == 0)
        {
            break;
        }
        else if(rc == -1 && errno != EAGAIN && errno != EWOULDBLOCK)
        {
            break;
        }
    }
    va_end(args);
    return rc;
}

Note that this function also takes care of the va_args initialization and cleanup, making client code safer.

An example usage would be to have separate functions that reads, for example, HTTP headers, and one that reads HTTP payload:

bool read_header(char* buf, ssize_t len, va_list args)
{
    ssize_t* hblen = va_arg(args, ssize_t*);
    ssize_t* hlen = va_arg(args, ssize_t*);
    char** headers = va_arg(args, char**);
    if(*hlen + len > *hblen)
    {
        if(!(*headers = (char*) realloc(*headers, *hlen + len)))  // Replace naive memory management algorithm with more efficient one
        {
            return true;
        }
        *hblen = *hlen + len;
    }
    memcpy(*headers + *hlen, buf, len);
    *hlen += len;
    return strstr(*headers, "rnrn") != NULL;
}

bool read_payload(char* buf, ssize_t len, va_list args)
{
    ssize_t* bytes_left = va_arg(args, ssize_t*);
    char** data = va_arg(args, char**);
    memcpy(*data, buf, len);
    *data += len;
    *bytes_left -= len;
    return bytes_left;
}

Note that the read_header function has its own buffer management due to the variable nature of HTTP headers, but the read_payload function doesn’t need to because it would be used to read fixed size data which would be provided by outside code.

Note also how each of the functions can virtually communicate with itself through the pointers passed into it via the va_list, even though it has no direct control of the socket_read loop.

Note that you should also be mindful of the order of the varargs. The rule of thumb is the put the most important and/or independent variables first. eg, sizes should always come before the buffer they limit.

Now you can read from a socket much more elegantly:

int main()
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    // Set up socket
    // ...

    ssize_t hblen = 0;
    ssize_t hlen = 0;
    char* headers = NULL;  // For the purposes of this example, rely on realloc(NULL,size) behaviour. In real use, preallocate, and set hblen to size
    socket_read(sock, read_header, &hblen, &hlen, &headers);

    // Note that read_header may read more than just the header
    // so find rnrn, and copy the rest of the valid contents
    // into the following payload buffer
    // and adjust begin accordingly

    ssize_t clen /* = Find and interpret Content-Length header */;
    char payload[clen];
    char* begin = payload;
    socket_read(sock, read_payload, &clen, &begin);
    return 0;
}

Note how you’ve effectively locally captured variables from the calling function to be passed on to the delegate functions. Note that I also don’t clutter the calling function with memory management.

  • DISCLAIMER: I can vouch that the technique works, but since I rewrote code from scratch for this example and have not tested it due to time constraints, I cannot guarantee that the code works as is. It will need tweaking to get rid of bugs.

Responses to the Invalid value concept, and responses to those

Link 1
Link 2

You could argue that an end iterator is the archetypal example of an “invalid value”. And end iterator does not represent any valid part of a vector. It’s one-past-the-end, which serves as a marker. And not even that, in set and map, where the one-past-the-end end iterator doesn’t even point to the physical end of the container’s range.

Or what if you’re writing a test generation library?1 You need to test invalid values, and you need a way to signal “I want you to generate an invalid value of a class to test this function that uses it”. For example, a function may only take a certain integer range. Any value outside of that range is invalid, and you should test for it.

If you should never create an invalid value, you may as well say you should never test how invalid values are handled.

I did consider calling it a null value, but a null value can be a valid representation of a concept. For a contrived example, what if you’re writing a simple wrapper around BSD sockets? If a socket cannot be created, you get a -1. That is an invalid value. If the divine decree is that a constructor should never create an invalid value, then such a socket wrapper class can never be a legal construct.

You could argue that you should throw an exception if an invalid value is about to be created, but exceptions aren’t always the desired behaviour.


  1. Which is how I found a potential need for it. 

The Invalid Value constructor – a proposal

C++ has the concept of the default constructor. One is provided for you by the compiler under most common circumstances. What isn’t provided is the concept of a default constructor for an invalid value of the class. I propose the following convention for an invalid value constructor:

struct Value
{
    Value(std::nullptr_t) {std::cout << "Null constructor" << std::endl;}
    Value(void* v) {std::cout << "Null pointer constructor" << std::endl;}
};

int main()
{
    Value v{nullptr};
    return 0;
}

The output of this program is:

Null constructor

This works because nullptr on its own has the type of std::nullptr, and so overload resolution will choose the narrowest possible match, thus the constructor that takes the single argument of nullptr.

The Internals Pattern, or emulating Ada’s discriminated variant records

In C, unions are not typesafe and requires that the programmer be completely responsible with how the unions are used. Ada’s discriminated variant records are a typesafe way of doing unions and using the Internals pattern1 we can get the same functionality in C++. This is all the code that’s needed:

template<typename...>
struct Internals
{
    // Empty struct for the default case
};

Due to C++’s template instantiation rules, you can have different members of a class for any specialization of the template. In essence:

template<>
struct Internals<int>
{
    int int_data;  // Note that there is no requirement that the members are the same
};

template<>
struct Internals<std::string>
{
    std::string string_data;
};

template<>
struct Internals<int, int>
{
    std::tuple<int, int> int_tuple_data;
};

All of these structs are completely different types, even though they all come from the Internals templated type. They bear no relation to each other in the way inheritance would have done, and this is the main benefit. You can have compile-time polymorphism that is checked before you even run the code. This is now possible:

template<typename T, typename U, typename... Vs>
struct Scaffold
{
    Internals<Scaffold> internals;
    void print()
    {
        std::cout << "Default dummy data" << std::endl;
    }
};

template<>
struct Internals<Scaffold<int, float, char, double>>
{
    std::string data{"Specialized dummy data"};
};

template<>
void Scaffold<int, float, char, double>::print()
{
    std::cout << this->internals.data << std::endl;
}

int main()
{
    Scaffold<long, long>{}.print();
    Scaffold<int, float, char, double>{}.print();

    return 0;
}

The output of this program is:

Default dummy data
Specialized dummy data

This is a useful behaviour when you want something akin to aspects, or crosscutting concerns. Clients of your framework can specialize the internals without having to derive from any framework class, decoupling the framework scaffolding from the client.


  1. I have not found this pattern discussed anywhere and so do not know the correct name even if there is one 

DRAKON

DRAKON is a really nifty language and this tool is really easy to learn and use. It is a very expressive graphical notation, and much like C++, it doesn’t actually force you into a paradigm. The constraint rules of DRAKON make for really easy to read diagrams, but it also makes it quite enjoyable to use because you’re not focused on layout out things to avoid line collisions. They also force you to make your designs more consistent with good principles and each other.