Mike's Programming Blog
Monday, October 14, 2013
Moving Blog
Monday, October 7, 2013
Subscribing Member Functions to C Callbacks
Today I'm going to talk about another complaint I have with C/C++, regarding the incompatibility between member functions and standard functions.
The synopsis goes like this. For compatibility with C, I have some C++ callback functions that look like this:
void foo(void* state, int eventCode);
That is, they have a state argument which represents any persistent state that the subscriber wishes to have passed back to it later when the event is fired. For illustration, the subscription function might look something like this:
typedef void (*CallbackFunc)(void* state, int eventCode);
void subscribe(CallbackFunc callback, void* state)
{
// add the callback to a list of functions to be called
}
And we can subscribe to the event like this:
void* myState = ...; // Some state we want passed to the callback
subscribe(&foo, myState);
I won't even get into the problem of state "ownership" in this post - that's a whole different problem that drives me up the wall in C and C++. When it comes to function pointers, there is an incompatibility between the above function and this member function:
class State
{
public:
void foo(int eventCode);
};
In other words we can't subscribe the member function to the same event:
// Let's say we have some state here
State* myState = new State();
// We can't subscribe like this:
subscribe(State::foo, myState);
// Or like this:
subscribe(myState->foo);
If you're a C++ programmer, which you probably are (at least a little I'm sure), you're probably thinking I'm crazy right now - "Of course you can't - why would you even try?"
The reason it could work is simply because of the underlying mechanism of member functions. Everyone knows that there is a "hidden" argument passed to a (non-static) member function invocation - the value for the this parameter. Exactly how this invisible argument gets passed is not part of the standard, so it doesn't have to use the same mechanism as a normal parameter, but for the sake of allowing this kind of compatibility I would have suggested that it be standardized to be passed the same way as a normal parameter.
Another way of approaching the problem is to look at C# for inspiration - a "function pointer" (a delegate in C#) can have any state attached to it. So a member function is the same as a static function, and the same as a closure or multicast delegate, etc., provided that the parameter types fit. To me this is the opposite way of looking at it: instead of saying that a member function has an additional invisible this parameter, we now say that actually every function has a this parameter which provides the attached state, but in the case of a non-member function it is just ignored by the callee.
This is probably not strictly what happens in C#, but it is a reasonable way of looking at it. If we were to apply this to C++, then the compiler could add a hidden this parameter to all functions (including non-member functions). The optimizer would remove it again for all functions where it isn't needed for whatever reason (such as for those functions whose addresses aren't taken in the program). This would make it possible for the expression myState->foo to actually be a valid expression - it would evaluate to a function pointer (along with its implicit state myState).
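To make that concrete, here is a minimal C# sketch (reusing the State/foo names from above in C# casing; the rest of the names are mine) showing that an instance method and a static function subscribe to the same event in exactly the same way, because the delegate carries the target object along with the function:
using System;

class State
{
    public void Foo(int eventCode) { Console.WriteLine("Member handler: " + eventCode); }
}

class Program
{
    static void StaticFoo(int eventCode) { Console.WriteLine("Static handler: " + eventCode); }

    static event Action<int> OnEvent;

    static void Main()
    {
        var myState = new State();
        OnEvent += myState.Foo;   // instance method: 'this' (myState) travels with the delegate
        OnEvent += StaticFoo;     // static method: no attached state needed
        OnEvent(42);              // both handlers run
    }
}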
Of course this second option adds overhead to every function pointer, because it now has to point to both the function itself and to the attached state. I think that if I were to redesign C++ from scratch, I would probably accept this overhead and do it anyway. Perhaps I would include a back-door option to create a "no-state" function and its corresponding pointer, which would be compatible with C functions. And I would specify that the calling convention used to pass the this parameter is the same as that used for the first parameter of a "no-state" function. This would make C and C++ function pointers seamlessly interoperable.
Why does it bug me?
It bugs me because stateful events, callbacks, and continuations are part of any good design, and C++ makes them incredibly difficult to implement. C++ claims to be object oriented, and yet its member function pointers don't refer to the object they're a member of. So C++ member function pointers are really just C function pointers - except that they're type-incompatible with C function pointers. So really they aren't useful on their own in either context! It seems to me this is a massive oversight of the original language.
Monday, September 23, 2013
Simple Dependent Types
I'd like to tell you about an idea I have for a potential language feature that uses dependent typing, but doesn't require a degree in mathematics to use.
What are Dependent Types?
So, you might be wondering what dependent types are in the first place; before I go any further, let me try to explain what I understand them to be. Consider the following structure in C++:
struct Foo
{
int values[10];
};
The structure Foo has a single field, which is an array of 10 integers. This is simple enough. But the value 10 is arbitrary, and so the structure is not very useful. What if we want a Foo that can hold 11 integers? C++ provides a feature called templates which allows us to do just that:
template <int NUM_VALUES>
struct Foo
{
int values[NUM_VALUES];
};
Now we can use Foo<10> when we want a Foo that holds 10 values like it did before, but we can also write Foo<11> when we want a Foo that holds 11 values. For example:
Foo<2> test()
{
Foo<2> myTwoValues;
myTwoValues.values[0] = 25;
myTwoValues.values[1] = 26;
return myTwoValues;
}
Note that because the array of values is a fixed length, many compilers will actually warn you if you explicitly access value 3 in a Foo<2>. This is a really nice feature, but it doesn't go very far in practical applications.
What happens if we don't know at compile-time how many values we want?
Foo<???> test(int numValues)
{
Foo<???> myFoo;
for (int i = 0; i < numValues; i++)
myFoo.values[i] = i + 25;
return myFoo;
}
This is a common problem. In fact, I would say you almost never use a fixed-length array in C++, because in most cases you don't know how long it needs to be ahead of time. Of course in C++ the solution to this is simply to allocate the values dynamically:
struct Foo
{
Foo(int n) : numValues(n), values(new int[n]) { }
~Foo() { delete[] values; }
int numValues;
int* values;
};
This is a horrible solution in many ways. For one thing, you're performing an extra heap allocation (what if numValues is just 3, or what if the Foo is already on the heap - what a waste!). For another thing, you need to remember to free the extra heap allocation (or assign it to a smart pointer). And now you've lost all compile-time safety when it comes to indexing into the array.
This is where dependent types would be useful. What if C++ could do the following:
template <int n>
struct Foo
{
int values[n];
};
Foo<numValues> test(int numValues)
{
Foo<numValues> myFoo;
for (int i = 0; i < numValues; i++)
myFoo.values[i] = i + 25;
return myFoo;
}
This Foo is a dependent type: the static type depends on a runtime value. We can't say if it's a Foo<2>, a Foo<3>, a Foo<27>, etc., without looking first at the value of numValues. There is also another dependent type hiding in this example: the type of Foo::values, which depends on a value (n) that is now possibly a variable. As a side note, the term template is really not applicable any more, since the structure is no longer specialized only by the compiler, but also by the runtime (conceptually).
How would this work?
In the above case, it seems simple enough for the compiler to understand what's going on and generate the appropriate code. The binary size of the type values[n] is no longer calculated at compile-time by substituting in the constant n; it's now a function of the runtime type parameter n. If we recursively apply this logic, then the size of the stack frame itself is a function of the local variable numValues (which happens to also be a parameter). This really doesn't seem that difficult. Computations that the compiler would have executed anyway are now simply executed at runtime instead of at compile time (at least conceptually). Sizes, field offsets, and variable offsets, which were previously static constants, now become runtime functions.
The Problem
I previously mentioned that some of the compiler's work would now occur at runtime - calculating binary layouts and such. But the compiler also does something else that's really useful: static type checking. This helps remove a whole class of errors. Consider the following:
void test(int count1, int count2)
{
Foo<count1> foo1;
Foo<count2> foo2 = foo1; // Is this valid?
}
The above example seems pretty simple in concept. We have two instances of Foo that are dependent on two different variables. The question is, can we assign the one to the other? At first glance, of course not! They have different counts, so they must be incompatible, right? But what if count1 and count2 are actually the same value? So deciding whether or not the assignment is valid requires information about the values of the variables being depended on. This is quite a conundrum. The compiler essentially has three choices here:
- it could delay the static analysis to runtime, like we have been doing for the binary layout calculations, and throw an AssignmentNotValid exception or something if the runtime type checking fails (a rough sketch of this option follows after this list)
- or it could require that all type safety is proved at compile-time only, and require that we persuade it why it should work
- or we could just relax the type soundness in this case, and simply require the programmer to do a cast from one type to the other, as a signed declaration that he accepts all responsibility if the assignment turns out to be invalid
- performing the "static analysis" at runtime seems like something a .NET language would do, similar to the way that it can throw an InvalidCastException
- requiring that the programmer prove to the compiler that the assignment can never fail seems like something that other dependently typed languages do, requiring a proof-assisting language - and this, I think, is why dependent types seem to only be for the elite of theoretical computer science
- ignoring the possible conflict and simply requiring the user do an explicit type-cast seems like something that C and C++ would do - you can generally cast anything to anything if you believe you know more than the compiler and don't have the patience or tools to prove it through the type system.
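To make the first option a bit more concrete, here is a rough C# sketch (entirely my own invention - the class layout and the exception message are hypothetical) of what a runtime-checked assignment between two value-dependent Foos might boil down to:
using System;

class Foo
{
    public readonly int Count;
    public readonly int[] Values;

    public Foo(int count)       // the "type parameter" is now a runtime value
    {
        Count = count;
        Values = new int[count];
    }

    // The compatibility check that a dependently typed compiler might defer to runtime:
    public void AssignFrom(Foo other)
    {
        if (other.Count != Count)
            throw new InvalidOperationException("AssignmentNotValid: counts differ");
        Array.Copy(other.Values, Values, Count);
    }
}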
C already has this, kinda
To some extent C already has this (as of C99), in the form of variable length arrays (VLAs). For example:
void test(int numValues)
{
int foo[numValues];
for (int i = 0; i < numValues; i++)
foo[i] = i + 25;
}
Unfortunately VLAs are extremely limited and you can't use them in very many contexts. Although foo has a dependent type, there is no static analysis that I know of that actually uses this information to warn the user if he's obviously stepping out of bounds. And the concept of VLAs doesn't generalize to user-defined types. It shows us a taste of another world, but it stops short.
Conclusion
I think dependent types get a bad reputation with anyone who encounters them, as being only of theoretical interest and only understandable to advanced mathematicians and computer science professors. The reason, I think, is that the descriptions of dependent types are very mathematical, and the implementations of dependently typed languages are quite academic. I hope I've shown you that this may be an unnecessary restriction, and that we can hopefully build dependent types easily into popular languages, or at least into the next generation of popular languages. It may not be a complete solution, or a sound solution, but this is no different from most of the other features of today's popular languages.
Monday, September 9, 2013
Expressiveness vs Efficiency in C#
I do a lot of my work in C#. I love C#, especially after drowning in C for a while, because of its strong, expressive type system, managed memory, and its mixed-paradigm nature.
There are two big problems I find in C# though. One is that it's often verbose, but I won't get into that today. The other is that although the language provides nice abstractions for some things (like IEnumerable<T> which I use all the time), I find there is almost always a conflict between a succinct solution and a performant solution. A succinct solution is one that uses the high-level abstractions, while a performant solution is one that doesn't use them because of their cost. I find often that a succinct solution uses some of the functional features of C#, while a performant solution uses the imperative features.
Let's say we want to sum all the numbers between 1 and 100 which are not multiples of 3. A direct representation of this in C#, using a functional notation, is this:
x = Enumerable.Range(1, 100).Where(i => i % 3 != 0).Sum();
This nice line of code reads almost exactly like the English version. Unfortunately this method is slow. We're using the interface IEnumerable<int> to iterate through the numbers, it's calling a lambda function for the predicate to filter out the multiples of 3, and then it's using the interface again to iterate through the numbers to sum them. I don't know exactly how much of this the optimizer removes, but on my machine it averages about 1500ns to execute this line.
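A simple harness along the following lines (a sketch, not a rigorous benchmark - the iteration count is arbitrary and it ignores warm-up) is enough to see the relative difference between the versions:
using System;
using System.Diagnostics;
using System.Linq;

class Benchmark
{
    static void Main()
    {
        const int iterations = 1000000;
        long checksum = 0;

        var sw = Stopwatch.StartNew();
        for (int n = 0; n < iterations; n++)
            checksum += Enumerable.Range(1, 100).Where(i => i % 3 != 0).Sum();
        sw.Stop();

        // nanoseconds per iteration = elapsed ms * 1,000,000 / iterations
        Console.WriteLine("{0} ns/iteration", sw.Elapsed.TotalMilliseconds * 1000000.0 / iterations);
        Console.WriteLine(checksum); // keep the result live so it can't be optimized away
    }
}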
On the other hand, here is the equivalent imperative code:
int sum = 0;
for (int i = 1; i < 101; i++)
if (i % 3 != 0)
sum += i;
x = sum;
This essentially means exactly the same thing: for the numbers between 1 and 100, for numbers that aren't multiples of 3, sum them. I've explicitly not done anything clever here, like stepping in increments of 3 instead of 1. This code is meant to be as close to the functional code as possible. It executes in about 140ns on my computer - more than 10 times faster than the functional equivalent.
Of course there is another implementation that does the same thing:
x = 3367;
By the same timing methodology, this last implementation takes about 0.28ns on my machine, although at this level the result is pretty meaningless except to say that it is blindingly fast.
Now, I'm not an expert on optimization by any means, but I would say that both of the latter implementations are almost trivially apparent from the first implementation. The compiler or JIT optimizer could generate them just by "inlining" things that are known to hold constant for a given call.
In fact I think it's so easy that I'm going to demonstrate it (partially). First, let me rewrite the nice concise code into what the C# compiler actually sees, by writing out the extension-method calls explicitly:
x = Enumerable.Sum(
Enumerable.Where(
Enumerable.Range(1, 100),
i => i % 3 != 0));
I don't know how the Sum function looks, but let's say that this is a reasonable implementation:
static int Sum(IEnumerable<int> nums)
{
int sum = 0;
foreach (int n in nums)
sum += n;
return sum;
}
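Where and Range can be pictured the same way. The real library versions are generic and more careful, but a simplified sketch (assuming the usual System and System.Collections.Generic namespaces) might look like this:
static IEnumerable<int> Range(int start, int count)
{
    for (int i = 0; i < count; i++)
        yield return start + i;   // lazily yields start, start+1, ...
}

static IEnumerable<int> Where(IEnumerable<int> source, Func<int, bool> predicate)
{
    foreach (int item in source)
        if (predicate(item))
            yield return item;    // only pass through items matching the predicate
}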
(You see where I'm going with this?) Now we "inline" the Sum function:
IEnumerable<int> nums =
Enumerable.Where(Enumerable.Range(1, 100), i => i % 3 != 0);
int sum = 0;
foreach (int n in nums)
sum += n;
x = sum;
By continuing to recursively substitute constant (unchanging, not const) arguments into functions, we end up with something very close to our original imperative code.
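Just to make the end point explicit: one more round of substitution (inlining the sketched Range and Where above, and then collapsing the iterator plumbing - admittedly a bigger step than plain inlining) would leave something essentially like this:
int sum = 0;
for (int i = 0; i < 100; i++)   // Range(1, 100)
{
    int n = 1 + i;
    if (n % 3 != 0)             // Where(..., i => i % 3 != 0)
        sum += n;               // Sum
}
x = sum;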
To go even further, since the code doesn't access any outside state, the compiler could also just execute it at compile-time to arrive at the final answer straight away. This could even be done without any of the previous optimizations - the original expression only calls referentially transparent functions with constant arguments, and so the result must be constant and can be precomputed to just be 3367.
In my mind this is just a case of constant propagation and function inlining, two things which are already done, but obviously not to the extent they could be done.
Premature Optimization?
Who cares that it takes 1500ns to execute the line of code? It's still blindingly fast on a modern computer, so why does it matter?
Well, this isn't a hypothetical scenario. Yes, it's true that I've never had to sum up these numbers like that. But on the other hand I have had to work with streams of millions of bytes, and with arrays of thousands or millions of numbers. And in those situations I feel forced by C# to write low-level imperative code, rather than using the layers of abstractions that are supposed to make my life easier.
Why doesn't it?
So why doesn't the C# compiler do any of these optimizations? Well I don't pretend to know, but I put forward a few theories:
1. The specification is too restrictive
Perhaps the C# specification is too restrictive when it comes to letting the optimizer produce the most efficient code. For example, IEnumerable is a reference type, and reference types are always allocated on the heap. It may be a breach of the C# contract to optimize out a heap allocation. This, of course, is pure speculation. It may not be about heap allocation at all; it may be about functions maintaining reference equality, or that the optimizer can't inline across assembly boundaries, etc. All of these boil down to the fact that something about the language model includes details that couple it to the inflated machine code that it currently outputs.
2. The analysis is too complicated
It might be that the performance cost analysis is too complicated. In the examples I pointed out how the second version is 10 times faster than the first, but what I didn't say was that the second version is slightly longer than the first in terms of code space. (It may not actually be longer, but let's say it is.) Many optimizations have this property: they sacrifice one resource for another, rather than saving all over. And of course resources are not independent of each other. A longer program may run slower because of poorer caching.
When you start looking into the details of how to calculate whether a particular optimization is actually beneficial, you may come out with such a heavy, complicated and error-prone optimizer that it's just not worth it to have it run.
3. There isn't a demand
Although I've run into a few situations where these sorts of optimizations would be useful, perhaps Microsoft's market analysis came up with different answers. Perhaps the development time involved in implementing it exceeds the business value and demand. I can't say anything further about this because I only know about my own demand; I can't speak for anyone else.
Conclusion
Whatever the reason, I think there's a gap in C# that needs to be filled by the next generation of popular languages, or perhaps the next generation of C# or .NET JIT optimizer. The C# compiler or JIT compiler doesn't optimize some pretty obvious cases.
Languages in general shouldn't force you to choose between a nice implementation and a fast implementation. I'm sure there are already languages that address this, but none that checks all the other boxes that C# does (including that of being popular). If anyone is thinking about creating the next big language, they should seriously consider building it from the ground up to support this kind of optimization.
Tuesday, August 20, 2013
Why Cast?
Last week I was writing some C code, and as is often the case in C, I started getting frustrated with the language and thinking about how it could be better. This time my thoughts were on casting.
The problem started when I encountered some code where a function needed to be passed a pointer to a const value, but the data available was in the form const volatile. Declaring a variable as const volatile simply means that it can change (volatile), but you cannot change it (const). It is useful on embedded microcontrollers, where there are cases in which values stored in program flash ROM (const) can actually be changed, and so shouldn't really be taken as "never changing", but instead should be reread from program flash every time they are accessed.
But that is beside the point. The point is that I have to cast something like this:
const MyAwesomeType* myAwesomeVariable =
(const MyAwesomeType*)myOtherVariable;
myFunction(myAwesomeVariable);
One of my coworkers would never have used const for the function, and would just write it like this:
myFunc((void*)myVar);
Certainly the latter is more readable and a bit more compact, which is valuable. On the other hand, the former is more explicit and more type-safe, which is also valuable. In this example I also tried to show a difference in naming convention - I prefer more explicit names, which make lines of code harder to read but individual variables easier to understand, while my coworker prefers shorter variable names, which make the syntax of a line more readable. There is value in both approaches and the choice is purely a choice of style and philosophy, but I'm sure you can see there's a similarity in the reasoning behind the choices of variable names and more explicit types.
But why?
Initially I was debating about which approach is better - to be more explicit or more compact - but then my thoughts turned to "why do I have to cast at all?". I would say there are generally two reasons to explicitly cast:
- Either I know the type beyond what the compiler knows and therefore need to persuade it that the type is correct,
- Or I intend to convert the value to a type that it isn't currently
These are really very different things, and I'm only really going to talk about the first case. In the latter case of converting the type, the type-casting syntax is really just a syntactic convenience for what could have been done with intrinsic functions.
Casting, in the former case, is used to persuade the compiler to treat some data as a type that the compiler doesn't otherwise think the data is. In the particular example of casting volatile const to const, we are persuading the compiler that for the purposes of myAwesomeVariable, we don't need to think about the data as volatile. Of course this is a complete lie! Either the data is volatile or it isn't. If it isn't volatile, then why was it declared that way? And if it is volatile, then why is it legal to say it isn't? The code could introduce a subtle bug: the developer of myFunction assumes that the data doesn't change, while clearly there is some good reason that we need to be able to change it. In short:
There is a conflict of expectations between the caller and callee, for which the compiler nicely produces a type error, but which we suppress with a type cast and thus introduce a subtle bug
When I say "introduces a subtle bug", I'm not being strictly truthful. There is no bug introduced in the code, because I've seen the functions myself and I know that the value won't change for the duration of the function call. But this does emphasize my point: I needed to look at the code on both sides of the function call to know that it's safe. This is bad. I need to know that myFunction doesn't cache the value across multiple calls (thinking it's constant and persistent), and I need to know that the caller doesn't change the value during the call (a threading race-condition), or during repeated calls (if myFunction also expects the value to be persistent across a longer time). The cast introduces one more piece of information that needs to be manually tracked throughout the program, coupling one part to another. In an ideal world, however, the function declaration should be enough to define the contract between the caller and the callee, when it comes to type consistency. Casting is an explicit breach of this contract, and so should never be allowed by the referee (the compiler).
Note that this situation isn't unique to const or volatile, of course. The same thing happens in C all the time with other type casts. It happens where we know the type of something that the compiler doesn't, or we want to keep track of the "type" of something ourselves and not give it to the compiler because it's too stupid to let us do what we want with it.
Another simple example might be a C container type, for which one implementation would be for it to contain void*-typed items, because C doesn't have templates or generics. Once again we know more about the data than the compiler: we know what type of objects the container holds. And once again we have to perform casts to persuade the compiler of what we know. And once again, different parts of the program are coupled together by our special knowledge. That is, we have to keep track in our minds (or in the comments), so that the part that puts items into the container only puts in items that the other part of the program can take out. And once again this opens up a huge gaping hole for bugs to creep in.
What I'm saying is this: casting is not a feature of the language; it's a code smell that says that either your code is badly designed, or that the language is conceding defeat and needs to give you a kludge because its type system has failed you.
So, what's the solution?
I think the obvious solution is to add more power to the type system. In the end, the compiler should know almost as much as we do about the code, so that if a cast is actually valid then it isn't necessary. That is, if it's valid to cast from volatile const to const, then the compiler already knew that it wasn't volatile at that point of the program, so we don't need to cast it.
This is what happens with generics in C# - if we read an item from a List<int>, then the compiler already knows that it's valid to cast the item to int, so a cast becomes redundant. That is, the item already is an int, so there's no point in casting it. Indeed, in C# I find that I never need casts (except for integer conversions, which aren't the same thing). When I find myself needing a cast, it's generally because I've done something wrong with the design - I've coded something that needs to act generically, but without writing generic code.
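As a small illustration of that point (a sketch of my own, assuming the System.Collections and System.Collections.Generic namespaces): the pre-generics ArrayList forces a cast on every read, while List<int> makes the cast redundant because the compiler already knows the element type.
// Pre-generics: the compiler only knows "object", so a cast is needed,
// and a wrong assumption only fails at runtime.
ArrayList oldList = new ArrayList();
oldList.Add(42);
int a = (int)oldList[0];   // cast (and unbox) required

// With generics: the element type is part of the static type, so no cast.
List<int> newList = new List<int>();
newList.Add(42);
int b = newList[0];        // it already is an int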
You might be wondering how generics could possibly help in the example of const volatile above. Well, it wouldn't be sufficient for C to have generics like C# (although it would certainly help). C++ templates are half a solution, but they have many shortfalls, which I won't go into right now.
What would such a type system look like? Well if there was an obvious answer to that then C would probably already have it. Computer science research is constantly improving the situation at a theoretical level, and I think soon we'll have the perfect low-level type system, just in time to not need it anymore.
Friday, July 26, 2013
Aren't Subtype and Parametric Polymorphism the Same?
There are apparently two common types of polymorphism in software: subtype polymorphism and parametric polymorphism. I think that these are really the same thing, and I can't see any good reason why modern languages separate the two. I'm going to explain my reasoning in this post.
Firstly, what are the claimed differences between the two? Well, I'm going to demonstrate by example. Let's say we have an abstract type, IPet, which has two implementations, Cat and Mouse, each of which implements IPet's Stroke function (I'm using C# for the examples):
interface IPet
{
void Stroke();
}
class Cat : IPet
{
void Stroke() { Console.WriteLine("Purr"); }
}
class Mouse : IPet
{
void Stroke() { Console.WriteLine("Squeak"); }
}
Now let's say we want a function to repeatedly stroke a pet a given number of times, but abstracted from what type of pet it actually is. We can implement this using subtype polymorphism:
void StrokePet1(IPet pet, int count)
{
for (int i = 0; i < count; i++)
pet.Stroke();
}
And here is the implementation using parametric polymorphism:
void StrokePet2<T>(T pet, int count) where T: IPet
{
for (int i = 0; i < count; i++)
pet.Stroke();
}
So what's the difference? Well in terms of the implementation of StrokePet1 and StrokePet2 I see no difference. Both function implementations read like this:
Given a pet of any type:
Loop count times:
Stroke the pet
Well, in C# there is a nontrivial difference, particularly in the way you call the function. With parametric polymorphism, you need to know the type T upfront when you call StrokePet2, otherwise it degenerates into subtype polymorphism. The same thing applies generally: for example, when you declare a List<T>, T needs to be known upfront when the declaration is made (or it needs to be a type parameter itself).
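A couple of call sites make the difference visible (using the Cat and IPet types from earlier):
Cat cat = new Cat();
IPet pet = cat;

StrokePet2(cat, 3);       // T is inferred as Cat - fully static
StrokePet2<IPet>(pet, 3); // T is just IPet - effectively the same as...
StrokePet1(pet, 3);       // ...plain subtype polymorphism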
There is another difference in C#: the parametric type T can be used in multiple locations to ensure that each declaration has the same type. Suppose we have a function that evaluates two pets of the same species and returns the "winner" (by some unknown criteria):
T ChooseWinner<T>(T left, T right) where T: IPet;
And now lets try implement this using subtype polymorphism:
IPet ChooseWinner(IPet left, IPet right);
Oh dear... there's no way to specify that left is the same pet type as right or the return type. This is the same problem we would have when using non-generic container classes: there's no way to specify at compile time that some abstract queue should only contain elements of a certain type:
// This queue is supposed to only contain IPets, but how do we specify it?
Queue myQueue = new Queue();
myQueue.Enqueue(new Mouse());
myQueue.Enqueue(new Cat());
myQueue.Enqueue(new object()); // Oops. Why can't the compiler catch this?
IPet first = myQueue.Dequeue(); // And why won't it allow this?
So wait, I'm saying they're actually different? No. I'm saying that in C#, subtype polymorphism is just parametric polymorphism but with less static information.
I don't think this is a shortcoming of subtype polymorphism - it's a shortcoming of the language that it doesn't allow us to fully express the relationships between the types used in a class or function. There's no way to specify that a C# Queue container has to be homogeneous without using generics, but that just means that C#'s subtype polymorphism isn't as expressive as its parametric polymorphism. Conversely, parametric polymorphism in C# is more strict because we need to supply more static information. Consider the following:
Queue<IPet> myQueue = new Queue<IPet>();
// Here myPet is a mouse, but we haven't told the compiler it must be
object myPet = new Mouse();
myQueue.Enqueue(myPet); // Why won't it allow this?
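The fix, of course, is to hand that knowledge to the compiler - either in the declaration, or by asserting what we know with a cast:
Mouse myMouse = new Mouse();
myQueue.Enqueue(myMouse);      // fine: the compiler knows a Mouse is an IPet

myQueue.Enqueue((IPet)myPet);  // or persuade the compiler about myPet with a cast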
So really, it's all about the balance between information we keep in our heads vs information we give to the compiler for static type checking.
So What?
What's the point? Does anyone care that there may be some airy-fairy overarching concept that ties parametric and subtype polymorphism? I would say yes, and the reason is about code re-usability and performance.
Regarding performance, in C# there is a difference in the performance of each type of polymorphism. In the case of parametric polymorphism, the underlying code can be specialized for the provided type argument, making it more efficient. For example, a List<int> will perform much better than a List<object> whose elements are ints. Let's say we have an INumber interface (like Java's Number class). Should we really have to decide between the following two implementations:
INumber Add(INumber first, INumber second)
{
return first + second;
}
TResult Add<TFirst,TSecond,TResult>(TFirst first, TSecond second)
where TFirst: INumber
where TSecond: INumber
where TResult: INumber
{
return first + second;
}
This seems like the sort of thing the compiler should sort out: akin to inline optimization, there should be specialization optimization. Inlining a function involves taking its parameters and substituting concrete arguments to specialize it for a specific call site - sound familiar?
Regarding reusability, note that functions with generic parameters are not interchangeable with functions that have virtual parameters, and so to some extent the designer of a function or class needs to think ahead about how much of it to make generic. But why? Why can't the language make generics seamless: the coder should specify exactly the required details for the code, and no more. Everything else should just be "generic" by default. Need a number variable...? Just use INumber, not int or short. The compiler can "inline" (specialize) specific types as they become known. Whether the compiler generates an actual abstract object with dynamic dispatching at runtime, or just specializes the code to the concrete type, should just be an optimization issue and not the concern of the programmer.
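The List<int> versus List<object> comparison mentioned earlier is the easiest place to see what specialization buys you; here is a rough self-contained sketch of the two shapes of code (timings left out, names my own):
using System;
using System.Collections.Generic;

class SpecializationDemo
{
    static void Main()
    {
        // Specialized: the backing store is a real int[], no boxing, no casts.
        var ints = new List<int>();
        for (int i = 0; i < 1000; i++)
            ints.Add(i);
        long sumInts = 0;
        foreach (int i in ints)
            sumInts += i;

        // Unspecialized: every int is boxed going in and unboxed coming out.
        var objects = new List<object>();
        for (int i = 0; i < 1000; i++)
            objects.Add(i);            // boxes
        long sumObjects = 0;
        foreach (object o in objects)
            sumObjects += (int)o;      // unboxes

        Console.WriteLine(sumInts + " " + sumObjects);
    }
}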
As a side note, there are languages which make generics much less invasive. For example, F# is a language where I think it really is feasible to make all code generic (except at the root). But I think we need this capability in more popular OOP languages like C# and Java.
In conclusion, I think that parametric and subtype polymorphism are actually the same thing in a statically typed language. In terms of program operation, subtype polymorphism is just like parametric polymorphism that's missing a few static features. Popular modern languages should use this to make it easier to write reusable, high-performance code.