# Ikaros Math Functions

This document documents all functions in the Ikaros math library. An introduction to the library is also available.

The functions in the Ikaros math library replace the functions in <math.h> or <cmath> with functions that are optimized for use with scalars, arrays, and matrices of floats. The library interfaces with BLAS functions and other vector optimized libraries, but also includes scalar implementations for use on systems that does not have Altivec or SSE support.

To use the matrix functions, the matrices should be allocated by the kernel or by using one of the utility functions. The matrix functions may not work if memory is allocated in some other way.

Many of the functions have names that are identical to their MATLAB equivalents.

All the math functions are listed below divided into the following categories:

## Constants

```extern const float	pi;
extern const float	sqrt2pi;
extern const float	maxfloat;
```

## Scalar Functions

```float	exp(float x);
float	pow(float x, float y);

float	log(float x);
float	log10(float x);
```

## Trigonometric Functions

The trigonometric functions are identical to the standard onces expect tah they operate on floats by default. The function atan2 also operates on arrays and matrices.

```float    sin(float x);
float    cos(float x);
float    tan(float x);
float	 asin(float x);
float	 acos(float x);
float	 atan(float x);
float    atan2(float x, float y);
float *  atan2(float * r, float * a, float * b, int size);
float ** atan2(float ** r, float ** a, float ** b, int sizex, int sizey);
```

## Gaussians

The scalar function gaussian calculates the gaussian of x with width set by sigma. The array and matrix versions do the same but for each element of the array or matrix. They all have the same sigma.

The set of functions gausian1 work in the same way but does not normalize the result. That is, the maximum element will be 1.

```float    gaussian(float x, float sigma);
float *	 gaussian(float * r, float * a, float sigma, int size);
float ** gaussian(float ** r, float ** m, float sigma,
int sizex, int sizey);

float    gaussian1(float x, float sigma);
float *	 gaussian1(float * r, float * a, float sigma, int size);
float ** gaussian1(float ** r, float ** m, float sigma,
int sizex, int sizey);
```

There are also a number of functions to fill an array or matrix with a gaussian. The lcoation of the center of the gaussian is given by the argument x or x and y for the matrix case. The gaussian is clipped at the edges of the array or matrix and is subject to dithering. It can thus not be assumed that the sum of the elements will be exactly 1. As above, the gaussian1 functions do not normalize the gaussian.

```float *  gaussian(float * r, float x, float sigma, int size);
float ** gaussian(float ** r, float x, float y, float sigma,
int sizex, int sizey);

float *  gaussian1(float * r, float x, float sigma, int size);
float ** gaussian1(float ** r, float x, float y, float sigma,
int sizex, int sizey);
```

## Random Values

The function random is used to get a pseudorandom number between low an high. The array and matrix versions set every element to a random value.

```float    random(float low, float high);
float *  random(float * r, float low, float high, int size);
float ** random(float ** r, float low, float high, int sizex, int sizey);

int random(int high);
```

## Tests

The functions zero and non_zero tests if an array or matrix contains only zero elements.

```bool zero(float * a, int size);
bool zero(float ** m, int sizex, int sizey);
bool non_zero(float * a, int size);
bool non_zero(float ** m, int sizex, int sizey);
```

## Squares and Square Roots

```int      sqr(int x);
float    sqr(float x);
float *  sqr(float * a, int size);
float ** sqr(float ** r, int sizex, int sizey);
float *  sqr(float * r, float * a, int size);
float ** sqr(float ** r, float ** a, int sizex, int sizey);

float    sqrt(int x);
float    sqrt(float x);
float *  sqrt(float * a, int size);
float ** sqrt(float ** m, int sizex, int sizey);
float *  sqrt(float * r, float * a, int size);
float ** sqrt(float ** r, float ** a, int sizex, int sizey);

float    hypot(float x, float y);
float *  hypot(float * r, float * a, float * b, int size);
float ** hypot(float ** r, float ** a, float ** b, int sizex, int sizey);
```

## Distances and Norms

The function dist calculates the euclidean distance between two arrays. In the matrix case, the whole matrix is treated as a single array. The function norm calculates the euclidean (L2) norm of its vector argument (or the Frobenius norm in the case of a matrix). The functions dist1 and norm1 operate in the same way, but use the city-block distance and the L1 norm.

```float dist(float * a, float * b, int size);
float dist(float ** a, float ** b, int sizex, int sizey);

float norm(float * a, int size);
float norm(float ** a, int sizex, int sizey);

float dist1(float * a, float * b, int size);
float dist1(float ** a, float ** b, int sizex, int sizey);

float norm1(float * a, int size);
float norm1(float ** a, int sizex, int sizey);
```

Three functions are available to normalize an array or matrix. The function normalize uses the euclidean norm and the resulting vector will have the length 1; normalize1 uses the city-block norm which results in a vector where the absolute sum of the elemnts is 1; and normalize_max uses the max-norm which will result in an array where the maximal element is 1.

```float *     normalize(float * a, int size);
float **    normalize(float ** m, int sizex, int sizey);

float *     normalize1(float * a, int size);
float **    normalize1(float ** m, int sizex, int sizey);

float *     normalize_max(float * a, int size);
float **    normalize_max(float ** m, int sizex, int sizey);
```

## Absolute Value

```int      abs(int x);
float    abs(float x);
float *  abs(float * a, int size);
float ** abs(float ** m, int sizex, int sizey);
float *  abs(float * r, float * a, int size);
float ** abs(float ** r, float ** m, int sizex, int sizey);
```

## Min and Max

```int      min(int x, int y);
float    min(float x, float y);
float    min(float * a, int size);
float    min(float ** a, int sizex, int sizey);
float *  min(float * r, float * a, int size);
float ** min(float ** r, float ** a, int sizex, int sizey);
float *  min(float * r, float * a, float * b, int size);
float ** min(float ** r, float ** a, float ** b, int sizex, int sizey);

int      arg_min(float * a, int size);
void     arg_min(int & x, int & y, float ** a, int sizex, int sizey);

int      max(int x, int y);
float    max(float x, float y);
float    max(float * a, int size);
float    max(float ** a, int sizex, int sizey);
float *	 max(float * r, float * a, int size);
float ** max(float ** r, float ** a, int sizex, int sizey);
float *	 max(float * r, float * a, float * b, int size);
float ** max(float ** r, float ** a, float ** b, int sizex, int sizey);

int      arg_max(float * a, int size);
void     arg_max(int & x, int & y, float ** a, int sizex, int sizey);

float *  minmax(float & min, float & max, float * a, int size);
float ** minmax(float & min, float & max, float ** a, int sizex, int sizey);
```

## Mean

The mean function calculated the mean value of an array of matrix.

```float mean(float * a, int size);
float mean(float ** a, int sizex, int sizey);
```

## Clipping

The clip functions limit the value of a scalar, or the elements of an array or matix to the range specified.

```float    clip(float x, float low, float high);
float *  clip(float * a, float low, float high, int size);
float ** clip(float ** a, float low, float high, int sizex, int sizey);
float *  clip(float * r, float * a, float low, float high, int size);
float ** clip(float ** r, float ** a, float low, float high, int sizex,
int sizey);
```

Calculating the sum of the elements in an array or matrix

```float    add(float * a, int size); // sum a
float    add(float ** a, int sizex, int sizey); // sum a
```

Accumulation into an array or matrix

```// r = r + alpha
float *  add(float * r, float alpha, int size);

// r = r + alpha
float ** add(float ** r, float alpha, int sizex, int sizey);

// r = r + a
float *  add(float * r, float * a, int size);

// r = r + a
float ** add(float ** r, float ** a, int sizex, int sizey);
```

Adding two values, possibly with scaling.

```// r = a + b
float *  add(float * r, float * a, float * b, int size);

// r = a + b
float ** add(float ** r, float ** a, float ** b, int sizex, int sizey);

// r = r + alpha * a
float *  add(float * r, float alpha, float * a, int size);

// r = r + alpha * a
float ** add(float ** r, float alpha, float ** a, int sizex, int sizey);

// r = alpha * a + beta * b
float *  add(float * r, float alpha, float * a, float beta,
float * b, int size);

// r = alpha * a + beta * b
float ** add(float ** r, float alpha, float ** a, float beta,
float * b, int sizex, int sizey);

// r = alpha * a + beta
float *  add(float * r, float alpha, float * a, float beta, int size);

// r = alpha * a + beta
float ** add(float ** r, float alpha, float ** a, float beta,
int sizex, int sizey);
```

## Subtraction

There are four functions to subtract a constant from an array or matrix, or to subtract each element from a constant:

```float *     subtract(float * r, float alpha, int size);
float **    subtract(float ** r, float alpha, int sizex, int sizey);
float *     subtract(float alpha, float * r, int size);
float **    subtract(float alpha, float ** r, int sizex, int sizey);
```

The other set of suntraction functions subtract two arrays or matrices:

```float *  subtract(float * r, float * a, int size);
float ** subtract(float ** r, float ** a, int sizex, int sizey);
float *  subtract(float * r, float * a, float * b, int size);
float ** subtract(float ** r, float ** a, float ** b, int sizex, int sizey);
```

## Multiplication

A number of multiplication functions perform element-wise multiplication:

```float *	 multiply(float * a, float alpha, int size);
float ** multiply(float ** a, float alpha, int size);
float *  multiply(float * r, float * a, float alpha, int size);
float ** multiply(float ** r, float ** a, float alpha, int size);

float *  multiply(float * r, float * a, int size);
float ** multiply(float ** r, float ** a, int sizex, int sizey);
float *	 multiply(float * r, float * a, float * b, int size);
float ** multiply(float ** r, float ** a, float ** b, int sizex, int sizey);
```

The function multiply can also be used to perform a multiplcation between a matrix and a vector. The matrix a has sizey rows and sizex columns. The array b has sizex elements, and the result array r has sizey elements. The following syntax is used:

```float *  multiply(float * r, float ** a, float * b, int sizex, int sizey);
```

It is also possible to do full matrix multiplication between two matrices. The arguments sizex and sizey are the size of resulting matrix. The argument n is the columns of a which must be equal to the number of rows of b.

```float ** multiply(float ** r, float ** a, float ** b, int sizex,
int sizey, int n);
```

The scalar product is calculated with the function dot. For a matrix, dot treats the matrix as one large vector (not as a set of vectors).

```float dot(float * a, float * b, int size);
float dot(float ** a, float ** b, int sizex, int sizey);
```

The math library also contains a function to calculate the outer product of two array. The outer product is a matrix where each element is the product of two elements in the the two arrays: r[j][i] = a[i]*b[j]; sizex = sizeof(a), sizey = sizeof(b).

```float ** outer(float ** r, float * a, float * b, int sizex, int sizey);
```

## Division

```float *	 divide(float * r, float * a, int size);
float ** divide(float ** r, float ** a, int sizex, int sizey);
float *	 divide(float * r, float * a, float * b, int size);
float ** divide(float ** r, float ** a, float ** b, int sizex, int sizey);
```

## Linear Algebra

The function eye sets a square matrix to the identity matrix adn diag sets the diagonal to the values of the array a. The function det calculates the determinant and trace calculates the tarce of a matrix, that is the sum of its diagonal elements. To get the rank of a matrix, the function rank can be used. The argument tol sets the tolerance when the function determines if a singular value is larger than zero. If no argument is given or tol is set to 0, the function calculates a tolerance based on the numerical resolution of the machine.

```float ** eye(float ** a, int size);
float ** diag(float ** m, float * a, int sizex, int sizey);
float    det(float ** m, int size);
float    trace(float ** m, int size);
float    rank(float ** m, int sizex, int sizey, float tol=0);
```

### Transpose

The transpose of a matrix is calculated with the function transpose. A square matrix can be transposed in place using the second version of the function below.

```float ** transpose(float ** a_T, float ** a, int sizex, int sizey);
float ** transpose(float ** a, int size);
```

### Matrix Inversion

Matrix Inversion is performed by the function inv. If the matrix is singular and cannot be inverted, the function returns false. Alternatively, the pseudoinverse of a matrix m can be computed with the function pinv. The arguments sizex and sizey are the size of the resulting matrix in r. The matrix m should thus have sizex rows and sizey columns.

```bool inv(float ** r, float ** m, int size);
void pinv(float ** r, float ** m, int sizex, int sizey);
```

### Left Division

The function mldivide performs left-division. It solves the equation mr = a. It is similar to the MATLAB function mldivide. For the case of a square matrix m and a vector a it finds the exact solution using lu-decomposition. If the system is overdetermined, the solution is obtained by minimizing ||mr-a||.

```float *  mldivide(float * r, float ** m, float * a, int size);
float *  mldivide(float * r, float ** m, float * a, int sizex, int sizey);
float ** mldivide(float ** r, float ** m, float ** a, int sizex, int sizey, int n);
```

The example below show how mldivide is used with three matrices:

```    int A_sizex, A_sizey;
int B_sizex, B_sizey;
int sizey;

float ** A = create_matrix("1 2 3; 4 5 9; 8 7 6; 5 4 3; 6 5 7", A_sizex, A_sizey);
float ** B = create_matrix("5 4; 3 2; 1 9; 8 7; 6 5", B_sizex, B_sizey);
float ** X = create_matrix(B_sizex, A_sizex);

printf("\nLeast-square solution to AX=B \n\n");

if(A_sizey == B_sizey)
sizey=A_sizey;
else
{
printf("A and B must have the same number of rows\n");
return;
}

print_matrix("A", A, A_sizex, A_sizey, 0);
print_matrix("B", B, B_sizex, B_sizey, 0);

mldivide(X, A, B, B_sizex, A_sizex, sizey);

print_matrix("X", X, B_sizex, A_sizex, 4);

float ** AX = create_matrix(B_sizex, B_sizey);
multiply(AX, A, X, B_sizex, sizey, A_sizex);

print_matrix("A*X", AX, B_sizex, B_sizey, 4);

destroy_matrix(AX);
destroy_matrix(A);
destroy_matrix(B);
destroy_matrix(X);
```

### LU-Factorization

A number of matrix factorization methods are available. The LU-decomposition of a matrix a can be calculated using the different forms of the lu function. With a single matrix argument a, it calculates the LU-decomposition in compact form in a single matrix of the same size as a. With two arguments l and u, the function returns the decomposition in two matrices, the product of which equals a, that is l*u = a. Finally, with three arguments l, u and p, the function also returns the permutation matrix. In this case l*u = p*a and l is returned in a lower triangular form without permutations.

The three different lu-functions are identical to the MATLAB functions m = lu(a), [l u] = lu(a), and [l u p] = lu(a).

```void lu(float ** m, float ** a, int sizex, int sizey);
void lu(float ** l, float ** u, float ** a, int sizex, int sizey);
void lu(float ** l, float ** u, float ** p, float ** a, int sizex, int sizey);
```

The example below show how the lu function is used;

```    int sizex, sizey;
float ** A = create_matrix("1 2 3 4 5; 9 8 7 6 5; 4 3 6 5 7", sizex, sizey);

printf("\nLU-decomposition of A \n\n");

print_matrix("A", A, sizex, sizey, 0);

printf("Compact form:\n\n");

float ** LU = create_matrix(sizex, sizey);
lu(LU, A, sizex, sizey);
print_matrix("LU", LU, sizex, sizey, 4);

printf("L*U form:\n\n");

int sizemin = min(sizex, sizey);
float ** L = create_matrix(sizemin, sizey);
float ** U = create_matrix(sizex, sizemin);

lu(L, U, A, sizex, sizey);

print_matrix("L", L, sizemin, sizey, 4);
print_matrix("U", U, sizex, sizemin, 4);

printf("\nP*L*U form\n\n");

float ** P = create_matrix(sizey, sizey);

lu(L, U, P, A, sizex, sizey);

print_matrix("L", L, sizemin, sizey, 4);
print_matrix("U", U, sizex, sizemin, 4);
print_matrix("P", P, sizey, sizey, 0);

destroy_matrix(LU);
destroy_matrix(L);
destroy_matrix(U);
destroy_matrix(P);
```

### QR-Factorization

The function qr returns the QR-decomposition of a. It currently only works for square matrices. If the matrix is singular, the factorization cannot be computed and the function returns false.

```bool qr(float ** q, float ** r, float ** a, int size);
```

The Cholesky decomposition of a square matrix m can be computed using the function chol. The result of the factorization is stored in r and a pointer to it is also returned,

### Cholesky-Factorization

```float ** chol(float ** r, float ** m, int size);
```

The used of cholesky decomposition is shown in the following code example:

```    printf("\nCholesky decomposition of A \n\n");

int size; // matrix must be square (and symmetric, positive definite)

float ** A = create_matrix("1 1 1; 1 2 3; 1 3 6", size, size);
float ** B = create_matrix(size, size);
float ** Bt = create_matrix(size, size);
float ** C = create_matrix(size, size);

bool positive_definite = chol(B, A, size);
transpose(Bt, B, size, size);

if(positive_definite)
printf("A is positive definite.\n\n");
else
printf("A is not positive definite.\nResult of chol(A) is not valid.\n\n");

print_matrix("A", A, size, size, 0);
print_matrix("chol(A)", B, size, size, 0);

multiply(C, Bt, B, size, size, size);

print_matrix("chol(A)' * chol(A)", C, size, size, 0);
```

### Singular Value Decomposition

The singular value decomposition of a matrix m is computed with the function svd. The singular values are returned in an array s rather than as a matrix and the matrices u and v contain the left and right singular vectors respectively.

If S is a matrix with the elements of s the diagonal, then u*S*v' = m.

```int svd(float ** u, float * s, float ** v, float ** m, int sizex, int sizey);
```

The use of the svd function is shown in the example below.

```    printf("\nSingular value decomposition: A = U*S*V'\n\n");

int sizex, sizey;

float ** A = create_matrix("1 2; 3 4; 5 6; 7 8", sizex, sizey);
float ** U = create_matrix(sizey, sizey);
float ** V = create_matrix(sizex, sizex);
float ** S = create_matrix(sizex, sizey);
float *  s = create_array(min(sizex, sizey));

svd(U, s, V, A, sizex, sizey);

for(int i=0; i<min(sizex, sizey); i++)
S[i][i] = s[i];

print_matrix("A", A, sizex, sizey, 0);
print_matrix("U", U, sizey, sizey);
print_matrix("S", S, sizex, sizey);
print_matrix("V", V, sizex, sizex);

// Transpose U and V

float ** Ut = transpose(create_matrix(sizey, sizey), U, sizey, sizey);
float ** Vt = transpose(create_matrix(sizex, sizex), V, sizex, sizex);

print_matrix("U'*U", multiply(Ut, Ut, U, sizey, sizey, sizey), sizey, sizey, 0);
print_matrix("V'*V", multiply(Vt, Vt, V, sizex, sizex, sizex), sizex, sizex, 0);

// Calculate and print U*S*V'

multiply(S, U, S, sizex, sizey, sizey);
multiply(S, S, transpose(V, sizex), sizex, sizey, sizex);

print_matrix("U*S*V'", S, sizex, sizey, 0);
```

### Principal Component Analysis

The function pca performs principal component analysis on the data in the matrix m. Each row is one data point and the function returns the prinicpal axes in c (sizex*sizex). The array v (sizex) contains the variance along each axis.

```float **    pca(float ** c, float * v, float ** m, int sizex, int sizey);
```

The function first centers the data in m by subtracting the mean of each column from each data point and the uses the singular value decomposition on the covariance matrix of m.

The data in m can be mapped to the new coordinate system using by calculating m*c.

## Homogenous Matrices

Ikaros math library has a number of functions for working with homogenous matrices. The functions are prefixed with an h to distinguish them from the other matrix functions.

A homogenous matrix is internally represented as a float and is not directly compatible with the other matrix functions in Ikaros. There is also a h_vector for homogenous vectors, which is compatible with the array function if the size 4 is used.

```typedef float h_matrix;
typedef float h_vector;
```

The different axes are referred to by X, Y and Z.

```enum axis { X, Y, Z };
```

The following functions initialize an h_matrix.

```float *     h_reset(h_matrix r);
float *     h_eye(h_matrix r);
float *     h_rotation_matrix(h_matrix r, axis a, float alpha);
float *     h_translatation_matrix(h_matrix r, float tx, float ty, float tz);
float *     h_reflection_matrix(h_matrix r, axis a);
float *     h_scaling_matrix(h_matrix r, float sx, float sy, float sz);
```

The following functions get data from an h_matrix.

```void        h_get_translation(const h_matrix m, float & x, float & y, float &z);
void        h_get_euler_angles(const h_matrix m, float & x, float & y, float &z);
float       h_get_euler_angle(const h_matrix m, axis a);
```

The standard mathematical operations are available and have a similar interface to the regular matrix functions in Ikaros but without the size parameters. Note that the function h_multiply_v that multiplies a matrix with a vector also normalizes the result into a homogenous vector with last element equal to 1.

```float *     h_add(h_matrix r, h_matrix a); // r = r + a
float *     h_multiply(h_matrix r, h_matrix a, h_matrix b);  // matrix x matrix
float *     h_multiply_v(h_vector r, h_matrix m, h_vector v);  // matrix x vector + scaling to make v == 1.
float *     h_multiply(h_matrix r, float c); // matrix x scalar
float *     h_transpose(h_matrix r, h_matrix a);
float *     h_inv(h_matrix r, h_matrix a);
```

The rotation part of a homogenous matrix can be orthogonalized using h_normalize_rotation(). This function is based on singular value decomposition.

```float *     h_normalize_rotation(h_vector m);
```

There are a number of function to copy and print homogenous matrices.

```float *     h_copy(h_matrix r, h_matrix m);
float *     h_copy_v(h_vector r, h_vector a);
void        h_print_matrix(const char * name, h_matrix m, int decimals=2);

float **    h_create_matrix(h_matrix m); // create an ordinary 4x4 matrix; data is copied
float **    h_set_matrix(float ** m, h_matrix h); // set top left 4x4 elements of a regular matrix from a h_matrix
```

To use the regular matrix operation on a h_matrix, it is necessary to first construct an array that contains the indices to each row. This is simplified using the function h_temp_matrix. Unlike h_create_matrix, no memory is allocated and the matrix is only valid in the scope where the argument p is declared.

```    float **    h_temp_matrix(h_matrix r, float * (&p));
```

The following example show how the function can be used. The variable m is use as a handle to the data in h and p is used for temporary pointer storage. Here, the determinant of the homogenous matrix is calculated using the function det() that expects a regular matrix.

```h_matrix h;
float * p;
float ** m;

// initialize h

h_eye(h);

// construct the temporary matrix m
// as a handle to h

m = h_temp_matrix(h, p);

// calculate the determinant of h

float d = det(m, 4);
```

## Conversion

```float trunc(float x);
int   lround(float x);

// min, max of float; byte is always 0-255

void  float_to_byte(unsigned char * r, float * a, float min, float max,
int size);

// min, max of float; byte is always 0-255

void  byte_to_float(float * r, unsigned char * a, float min, float max,
int size);

// convert to int; use d if char is NULL

int	  string_to_int(char * s, int d=0);

// convert to float; use d if char is NULL

float  string_to_float(char * s, float d=0.0);
```

## Miscellaneous Functions

The functions sort() can be used to sort the values of an array or matrix.

```    float *     sort(float * a, long size);

// Sort values in a matrix as if it were an array

float **    sort(float ** a, long sizex, long sizey);

// Sort rows in a matrix according to the value in ome of the columns

float **    sort(float ** a, int column, int sizex, int sizey);
```

The function select_boltzmann selects an element of the matrix m accroing to the Bolzmann distributed generated by the lements in m. The result is returned in the arguments x an y, and T is the temperature of the distribution.

```void  select_boltzmann(int & x, int & y, float ** m, int sizex, int sizey,
float T);
```

To ascend or descend a gradient of the values represnted in a matyrix there are two functions ascend_gradient and descend_gradient. The starting point for the search for the local optimum is set in the arguments x and y and on return these values contain the coordinates for the local maximum (for ascend_gradient) or minium (for descend_gradient).

In the current implementation, these functions will not find an optimum along the border of the matrix.

```void  ascend_gradient(int & x, int & y, float ** m, int sizex, int sizey);
void  descend_gradient(int & x, int & y, float ** m, int sizex, int sizey);
```

## Image Processing

### im2row

The function im2row rearranges the values in an image to allow convolution using matrix multiplication. (The function is similar to im2col in MATLAB but because Ikaros uses a row major representation, a different rearrangement of the elements is necessary).

```float ** im2row(
float ** result,
float ** source,
int result_size_x,
int result_size_y,
int source_size_x,
int source_size_y,
int kernel_size_x,
int kernel_size_y,
int stride_x=1,
int stride_y=1
);
```

The following example shows how this function can be used for convolution. Note that both the kernel and results are accessed as arrays in the call to multiply.

```    // Input variables

int source_size_x, source_size_y;
int kernel_size_x, kernel_size_y;

float ** source = create_matrix("1 2 3 4; 5 6 7 8; 9 10 11 12;
13 14 15 16", source_size_x, source_size_y);
float ** kernel = create_matrix("1 2; 3 4", kernel_size_x, kernel_size_y);

int stride_x = 1;
int stride_y = 1;

// Computed sizes

int result_size_x = (source_size_x - kernel_size_x)/stride_x + 1;
int result_size_y = (source_size_y - kernel_size_y)/stride_y + 1;

int buffer_size_x = kernel_size_x * kernel_size_y;
int buffer_size_y = result_size_x * result_size_y;

// Internal buffer

float ** buffer = create_matrix(buffer_size_x, buffer_size_y);

// Results

float ** result = create_matrix(result_size_x, result_size_y);

print_matrix("source", source, source_size_x, source_size_y);

im2row(buffer, source, result_size_x, result_size_y, source_size_x,
source_size_y, kernel_size_x, kernel_size_y, stride_x, stride_y);
multiply(*result, buffer, *kernel, buffer_size_x, buffer_size_y);

print_matrix("buffer", buffer, buffer_size_x, buffer_size_y);
print_matrix("kernel", kernel, kernel_size_x, kernel_size_y);
print_matrix("result", result, result_size_x, result_size_y);
```

### Convolution

The functions convolve performs two-dimensional convolution of a source matrix (or image) with a kernel. The constants rsizex and rsizey give the size of the result matrix and ksizex and ksizey give the size of the kernel. The source matrix must have size (rsizex+ksizex-1) x (rsizey+ksizey-1). If ksizex and ksizey are odd, the function will call harware-optimized convolution functions when available. The argument bias is a constant which is added to each element of the result.

```float ** convolve(
float ** result,
float ** source,
float ** kernel,
int rsizex,
int rsizey,
int ksizex,
int ksizey,
float bias = 0.0
);
```

The function box_filter is a reasonably fast implementation of the box filter that sums all the matrix elements within each box of size boxsize x boxsize. After the call, the matrix r of size sizex x sizey will contain all the sums. The matrix a contains the source data and should be of size [sizex + boxsize - 1] x [sizey + boxsize - 1]. Finally, t is an optional temporary storage m,atrix of size [sizex] x [sizey + boxsize - 1]. If t==NULL, this matrix will be allocated internally for each call.

```float ** box_filter(
float ** r,
float ** a,
int sizex,
int sizey,
int boxsize,
bool scale = false,
float ** t = 0
);
```

The function integral_image calculates the integral image.

```float **    integral_image(float ** r, float ** a, int sizex, int sizey);
```