Templates generate multiple classes and multiple functions, so any template code that does not depend on a template parameter (whether a non-type parameter or a type parameter) causes bloat. Eliminate bloat due to non-type template parameters by replacing the template parameters with function parameters or class data members; reduce bloat due to type parameters by sharing implementations for instantiation types with identical binary representations.
When writing templates, the replication is implicit: the template source code exists in only one copy, so we have to analyze it carefully to spot the duplication that may arise when the template is instantiated multiple times.
Bloat due to non-type template parameters
For example, suppose we’d like to write a template for fixed-size square matrices that support matrix inversion:
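A minimal sketch of what such a template might look like. The storage layout (a flat array of n * n values) is an assumption made for illustration, and the inversion algorithm itself is left out:

```cpp
#include <cstddef>

// Template for n x n matrices of objects of type T.
template <typename T, std::size_t n>
class SquareMatrix {
public:
    // Inverts the matrix in place. The algorithm is omitted in this sketch;
    // what matters here is that the body depends on n.
    void invert() { /* ... uses n and data ... */ }

private:
    T data[n * n] = {};   // assumed storage: values kept inside the object
};
```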
This template takes a type parameter T as well as a non-type parameter n of type size_t. This example is a classic way for template-induced code bloat to arise:
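The statements in question could look like the following sketch. The template is repeated in compact form so the snippet compiles on its own, and the function name instantiateBoth is hypothetical:

```cpp
#include <cstddef>
#include <type_traits>

// Same size-parameterized template as before (algorithm omitted).
template <typename T, std::size_t n>
class SquareMatrix {
public:
    void invert() { /* ... uses n and data ... */ }
private:
    T data[n * n] = {};
};

void instantiateBoth() {
    SquareMatrix<double, 5> sm1;
    sm1.invert();    // instantiates SquareMatrix<double, 5>::invert

    SquareMatrix<double, 10> sm2;
    sm2.invert();    // instantiates SquareMatrix<double, 10>::invert

    // The two matrices are unrelated types, so each gets its own member functions.
    static_assert(!std::is_same<SquareMatrix<double, 5>,
                                SquareMatrix<double, 10>>::value,
                  "different sizes are different types");
}
```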
In the statements above, two copies of invert will be instantiated, and these two versions of invert are character-for-character identical except for the use of 5 in one version and 10 in the other. To reduce the code bloat, we could redesign the template so that it calls a function that takes the size as a parameter instead.
Replace with function parameters
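One way to sketch this redesign: the size-dependent class forwards to a base class that is templatized only on T, and passes the size as an ordinary function argument. The body of the base class function is omitted, and private inheritance is an assumption chosen to signal that the base class exists only to help implement the derived class:

```cpp
#include <cstddef>
#include <type_traits>

// Size-independent base class: one instantiation per element type T.
template <typename T>
class SquareMatrixBase {
protected:
    void invert(std::size_t matrixSize) {
        (void)matrixSize;   // algorithm omitted in this sketch
    }
};

template <typename T, std::size_t n>
class SquareMatrix : private SquareMatrixBase<T> {
private:
    using SquareMatrixBase<T>::invert;    // keep the base version visible

public:
    void invert() { this->invert(n); }    // inline call to the shared base version
};

// Matrices of every size share the same base class for a given T.
static_assert(std::is_base_of<SquareMatrixBase<double>, SquareMatrix<double, 5>>::value,
              "5x5 uses the shared base");
static_assert(std::is_base_of<SquareMatrixBase<double>, SquareMatrix<double, 10>>::value,
              "10x10 uses the shared base");
```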
This design addresses the issue of code bloat, but it also introduces a problem: SquareMatrixBase::invert knows only the size of the data. It does not know where the data for a particular matrix is, because only the derived class knows that. We could add another parameter to SquareMatrixBase::invert, such as a pointer to the beginning of the chunk of memory that stores the matrix’s data.
However, an alternative and possibly better solution is to have SquareMatrixBase store both a pointer to the memory for the matrix values and the matrix size, so that any other function that needs the matrix memory address or the matrix size can be written in an address-independent and size-independent manner and moved into the base class.
Replace with class data members
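A sketch of this design under a few assumptions: the derived class keeps the values inside the object itself, the accessor at and the function invertSample are hypothetical names, and the inversion is a simple Gauss-Jordan pass without pivoting, used only so that the example does something observable:

```cpp
#include <cassert>
#include <cstddef>

// Size-independent base class: knows the matrix size and where the values live.
template <typename T>
class SquareMatrixBase {
protected:
    SquareMatrixBase(std::size_t n, T* pMem) : size(n), pData(pMem) {}
    void setDataPtr(T* ptr) { pData = ptr; }

    T& at(std::size_t row, std::size_t col) { return pData[row * size + col]; }

    // In-place Gauss-Jordan inversion without pivoting (illustrative only:
    // it assumes every pivot it meets is nonzero).
    void invert() {
        for (std::size_t p = 0; p < size; ++p) {
            const T pivot = at(p, p);
            at(p, p) = T(1);
            for (std::size_t c = 0; c < size; ++c) at(p, c) /= pivot;
            for (std::size_t r = 0; r < size; ++r) {
                if (r == p) continue;
                const T factor = at(r, p);
                at(r, p) = T(0);
                for (std::size_t c = 0; c < size; ++c) at(r, c) -= factor * at(p, c);
            }
        }
    }

private:
    std::size_t size;   // matrix dimension, stored once per object
    T* pData;           // points at the values owned by the derived class
};

template <typename T, std::size_t n>
class SquareMatrix : private SquareMatrixBase<T> {
public:
    SquareMatrix() : SquareMatrixBase<T>(n, nullptr) { this->setDataPtr(data); }

    // Copying is disabled in this sketch: a default copy would leave the
    // base class pointer aimed at the source object's array.
    SquareMatrix(const SquareMatrix&) = delete;
    SquareMatrix& operator=(const SquareMatrix&) = delete;

    // Simple inline calls to the shared base class versions.
    T& at(std::size_t row, std::size_t col) { return SquareMatrixBase<T>::at(row, col); }
    void invert() { SquareMatrixBase<T>::invert(); }

private:
    T data[n * n] = {};   // values stored inside the object
};

void invertSample() {
    SquareMatrix<double, 2> m;
    m.at(0, 0) = 1; m.at(0, 1) = 2;
    m.at(1, 0) = 3; m.at(1, 1) = 4;
    m.invert();
    assert(m.at(0, 0) == -2.0 && m.at(0, 1) == 1.0);
    assert(m.at(1, 0) == 1.5 && m.at(1, 1) == -0.5);
}
```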
Objects of such types need no dynamic memory allocation, but the objects themselves could be very large. An alternative would be to put the data for each matrix on the heap:
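A sketch of the heap-based variant. Using std::unique_ptr<T[]> to own the buffer is an assumption of this example, as are the names heapData, at, and heapSample; the base class is reduced to an element accessor to keep the snippet short:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Size-independent base class, reduced to one accessor for brevity.
template <typename T>
class SquareMatrixBase {
protected:
    SquareMatrixBase(std::size_t n, T* pMem) : size(n), pData(pMem) {}
    void setDataPtr(T* ptr) { pData = ptr; }
    T& at(std::size_t row, std::size_t col) { return pData[row * size + col]; }

private:
    std::size_t size;
    T* pData;   // non-owning: the derived class manages the memory
};

template <typename T, std::size_t n>
class SquareMatrix : private SquareMatrixBase<T> {
public:
    SquareMatrix()
        : SquareMatrixBase<T>(n, nullptr),
          heapData(new T[n * n]()) {          // value-initialized heap buffer
        this->setDataPtr(heapData.get());     // tell the base class where it is
    }

    T& at(std::size_t row, std::size_t col) { return SquareMatrixBase<T>::at(row, col); }

private:
    std::unique_ptr<T[]> heapData;   // owns the values; also makes the class non-copyable
};

void heapSample() {
    SquareMatrix<double, 100> m;
    m.at(1, 2) = 7.0;
    assert(m.at(1, 2) == 7.0);
    assert(m.at(0, 0) == 0.0);
    // The object itself stays small; the 100 * 100 values live on the heap.
    assert(sizeof(m) < 100 * 100 * sizeof(double));
}
```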
No matter where the matrix data is stored, the result from a bloat point of view is that many of SquareMatrix's member functions can be simple inline calls to base class versions that are shared with all other matrices holding the same type of data, regardless of their size.
In terms of efficiency, it is possible that the versions of invert with the matrix size hardwired into them generate better code than the shared version, whose size is passed as a function parameter or stored in the object. In the size-specific versions, the size is a compile-time constant and hence eligible for optimizations such as constant propagation (including being folded into the generated instructions as an immediate operand), which can’t be done in the size-independent version.
On the other hand, the size-independent version decreases the size of the executable by having only one version of invert for multiple matrix sizes. This could reduce the program’s working set size and improve locality of reference in the instruction cache, which may in turn compensate for any lost optimizations in the size-specific versions of invert. The only way to tell which version is better is to try them both and observe the behavior on the particular platform and on representative data sets.
Speaking of the size of objects, we should observe that the base class pointer adds at least the size of a pointer to each SquareMatrix object, even though SquareMatrix, as the derived class, already has a way to get to the data. Alternative designs could make this pointer unnecessary or change how it is exposed, such as having the base class store a protected pointer to the matrix data. However, such designs also have disadvantages:
- it may lead to the loss of encapsulation described in Item 22
- it may also lead to resource management complications: since the derived class may either dynamically allocate the matrix data or physically store the data inside the derived class object, if the base class stores only a pointer to that data, it is hard for the base class to determine whether the pointer should be deleted or not.
At some point, a little code replication begins to look like a mercy compared with these complications.
Bloat due to type parameters
On many platforms, int and long have the same binary representation, so the member functions for, say, vector<int> and vector<long> would likely be identical. Some linkers will merge identical function implementations, but some will not, and that means that some templates instantiated on both int and long could cause code bloat in some environments.
Similarly, on most platforms, all pointer types have the same binary representation, so templates holding pointer types (e.g., list<SquareMatrix<long, 3>*>) should be able to use a single underlying implementation for each member function. Typically, this is achieved by implementing member functions that work with strongly typed pointers (i.e., T* pointers) by having them call functions that work with untyped pointers (i.e., void* pointers), which is what some implementations of the standard C++ library do for templates like