A version of block matrix-matrix multiplication in Matlab

Matlab makes it easy to state algorithms where arrays are partitioned into subblocks. The most useful feature is the ability to use index arrays like the variables I, J, and K highlighted below.


function C = matmul(A, B, blocksize)

   %------------------------------------------------------------------
   % Compute C = A*B, where A is a n x n array. Do this by cutting
   % the arrays into subblocks of size blocksize x blocksize. This
   % allows enhancing cache re-use. Warning: this function is only
   % for when mod(n, blocksize) = 0. 
   %------------------------------------------------------------------

   [m, n] = size(A);
   C = zeros(n, n);

   for ii = 1:blocksize:n
      % --------------------------------------------------
      % Indices in overall A array of current block row:
      % --------------------------------------------------
      I = ii:ii+blocksize-1;

      for jj = 1:blocksize:n
        % ----------------------------------------------------
        % Indices in overall A array of current block column:
        % ----------------------------------------------------
        J = jj:jj+blocksize-1;

        %--------------------------------------------------------------
        % Copy over A's submatrix of size (ii/blocksize, jj/blocksize) 
        % to the array T, transposing it as it is copied.
        %--------------------------------------------------------------
        T = A(I, J)';

        %----------------------------------------
        % For each block column of C and B ...
        %----------------------------------------
        for kk = 1:blocksize:n

            %-----------------------------------------------------------
            % Define K as the array of indices in C and B corresponding 
            % to the current block column being updated.
            %-----------------------------------------------------------
            K = kk:kk+blocksize-1;

            %-----------------------------------------------------------------
            % Compute product of block (I, J) of A with the block col K of B
            % Try replacing the following line with loops that access the 
            % matrices T and B with stride one in a 1-d array layout of 
            % the two arrays containing T and B, and the expressive power 
            % of block indexing will become obvious.
            %------------------------------------------------------------------
            C(I, K) = C(I, K) + T'*B(J, K);

        end % kk block

      end % jj loop

   end % ii loop


BTW, for codes that use block indexing of arrays, I usually adhere to the two taxonomic conventions shown above: If the above description of the naming convention seems opaque, ignore it. Those conventions are not part of some widely-accepted standard; they are just personal quirks.