4. Operations on symbolic expressions
4.1. Arithmetic operations
TensCalc supports most of the basic arithmetic operations in MATLAB©, in most cases using the same or a very similar syntax.
Usage | Description | Notes |
---|---|---|
`X+Y` | Entry-by-entry addition of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `+`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X-Y` | Entry-by-entry subtraction of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `-`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X.*Y` | Entry-by-entry multiplication of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `.*`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X*Y` | Matrix multiplication of tensors, which adapts to the size of the operands. | Depending on the sizes of the operands, this operation may behave quite differently from MATLAB©'s matrix multiplication `*`. |
`X./Y` | Entry-by-entry right division of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `./`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X.\Y` | Entry-by-entry left division of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `.\`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`sum(X)`, `sum(X,dim)` | Sum of the entries of the tensor `X`, optionally along the dimension `dim`. | Similar syntax to MATLAB©'s `sum()`. |
`max(X)` | Maximum entry of the tensor `X`. | Similar syntax to MATLAB©, with the exception that it does not return a second output with indices. |
`max(X,Y)` | For two tensors `X` and `Y` with the same size, returns a tensor `M` also with the same size, but with entries taken from `X` or `Y`, depending on which entry is largest. If `X` is a scalar, then `M` has the same size as `Y` and its entries are the largest of the corresponding entry of `Y` and the (only) entry of `X`; similarly if `Y` is a scalar. | Similar syntax to MATLAB©. |
`min(X)` | Minimum entry of the tensor `X`. | Similar syntax to MATLAB©, with the exception that it does not return a second output with indices. |
`min(X,Y)` | For two tensors `X` and `Y` with the same size, returns a tensor `M` also with the same size, but with entries taken from `X` or `Y`, depending on which entry is smallest. If `X` is a scalar, then `M` has the same size as `Y` and its entries are the smallest of the corresponding entry of `Y` and the (only) entry of `X`; similarly if `Y` is a scalar. | Similar syntax to MATLAB©. |
`full(X)` | Converts a sparse tensor to a full one, in the sense that subsequent computations will not take advantage of the information that some entries are known to be zero. | The main use of this function is to force the result of a computation to be returned in a linear memory structure with all the zeros filled in appropriately. This is generally required for values returned as outputs of the generated code. |
`tprod(...)` | Tensor product. | See Tensor product below. |
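As a quick illustration (a minimal sketch; the variable names and sizes are of our choosing), entry-wise arithmetic combines symbolic tensors much as in plain MATLAB©:
Tvariable X [3,2];
Tvariable Y [3,2];
Z1=X+Y;        % entry-by-entry addition (same sizes)
Z2=2.*X;       % scalar with a tensor of arbitrary size
Z3=X./Y;       % entry-by-entry right division
M=max(X,Y);    % entry-wise maximum
s=sum(X,1);    % sum along the 1st dimension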
4.1.1. Tensor product
The function tprod() provides a very general and flexible multiplication operation between tensors, which includes as special cases the usual matrix multiplication *, the summation over rows and/or columns sum(), the trace of a matrix trace(), the extraction of a matrix's main diagonal diag(), and the Euclidean norm of a vector norm(), as well as generalizations of all these operations to n-dimensional tensors:
- tprod(A, a, B, b, C, c, ...)
- Parameters
  - A (Tcalculus symbolic tensor) – 1st tensor
  - a (vector of integers) – indices for A, with length ndims(A)
  - B (Tcalculus symbolic tensor) – 2nd tensor
  - b (vector of integers) – indices for B, with length ndims(B)
  - C (Tcalculus symbolic tensor) – 3rd tensor
  - c (vector of integers) – indices for C, with length ndims(C)
tprod() returns a tensor Y obtained using a summation-product operation of the form
\[Y(y_1,y_2,\dots)=\sum_{s_1}\sum_{s_2}\cdots\; A(a_1,a_2,\dots)\,B(b_1,b_2,\dots)\,C(c_1,c_2,\dots)\cdots\]
where the matching between the result tensor indices \(y_1, y_2,...\), the summation indices \(s_1, s_2,...\), and the input tensor indices \(a_1, a_2,..., b_1,b_2,...,c_1,c_2,...\) is determined as follows:
- A negative value of -1 in one or several of the input tensor indices \(a_1, a_2,..., b_1,b_2,...,c_1,c_2,...\) means that those particular indices should be summed under the 1st summation operation.
- A negative value of -2 in one or several of the input tensor indices \(a_1, a_2,..., b_1,b_2,...,c_1,c_2,...\) means that those particular indices should be summed under the 2nd summation operation.
- …
- The absence of a negative index means that there are no summations.
- A positive value of +1 in one or several of the input tensor indices means that those indices should match the 1st index \(y_1\) of the result tensor \(Y\).
- A positive value of +2 in one or several of the input tensor indices means that those indices should match the 2nd index \(y_2\) of the result tensor \(Y\).
- …
- The absence of a positive index means that the result has an empty size (i.e., it is a scalar).
A few examples are helpful to clarify the syntax and highlight the flexibility of tprod():
Tvariable A [m,n];
Tvariable x [n];
Tvariable B [n,k];
tprod(A,[1,-1],x,[-1]);     % same as the matrix-vector product A*x
tprod(A,[1,-1],B,[-1,2]);   % same as the matrix-matrix product A*B
tprod(A,[2,1]);             % same as the transpose A'
tprod(A,[1,-1]);            % same as sum(A,2)
tprod(x,[-1],x,[-1]);       % same as the square of the Euclidean norm x'*x
tprod(A,[-1,-2],A,[-1,-2]); % same as the square of the Frobenius norm of A
Tvariable C [n,n];
tprod(C,[1,1]);             % vector with the main diagonal of C
tprod(C,[-1,-1]);           % same as trace(C)
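The same syntax extends beyond matrices; for example (a sketch, with sizes of our choosing), a batch of matrix-vector products over a 3-dimensional tensor:
Tvariable T [p,m,n];        % p matrices of size m-by-n
Tvariable z [n];
tprod(T,[1,2,-1],z,[-1]);   % size [p,m]: entry (i,j) is sum_k T(i,j,k)*z(k)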
4.2. Logical-valued operations
TensCalc uses logical-valued operations mostly to specify optimization constraints.
Usage | Description | Notes |
---|---|---|
`X==Y` | Entry-by-entry equality comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `==`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X>=Y` | Entry-by-entry greater than or equal to comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `>=`, implicit expansion of singleton dimensions is not performed. From the perspective of a constrained optimization numerical solver, due to finite numerical precision, `X>=Y` and `X>Y` represent the same constraint. |
`X>Y` | Entry-by-entry greater than comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `>`, implicit expansion of singleton dimensions is not performed. |
`X<=Y` | Entry-by-entry smaller than or equal to comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `<=`, implicit expansion of singleton dimensions is not performed. From the perspective of a constrained optimization numerical solver, due to finite numerical precision, `X<=Y` and `X<Y` represent the same constraint. |
`X<Y` | Entry-by-entry smaller than comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `<`, implicit expansion of singleton dimensions is not performed. |
… | Checks if the entries of the tensor … | Similar syntax to MATLAB©. |
… | Checks if the entries of the tensor … | Similar syntax to MATLAB©. |
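For instance (a minimal sketch; the variable names are ours, and the solver-generation call that would consume these expressions is omitted), comparisons are typically collected into a set of optimization constraints:
Tvariable x [3];
positivity=(x>=0);      % entry-wise inequality constraint
budget=(sum(x,1)==1);   % scalar equality constraint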
4.3. Entry-wise operations
The following functions are applied to every entry of a tensor.
Usage | Description | Notes |
---|---|---|
`exp(X)` | Exponential of tensor entries. | Similar syntax to MATLAB©. |
`log(X)` | Natural logarithm of tensor entries. | Similar syntax to MATLAB©. |
`sin(X)` | Sine of tensor entries in radians. | Similar syntax to MATLAB©. |
`cos(X)` | Cosine of tensor entries in radians. | Similar syntax to MATLAB©. |
`tan(X)` | Tangent of tensor entries in radians. | Similar syntax to MATLAB©. |
`atan(X)` | Inverse tangent in radians of tensor entries. | Similar syntax to MATLAB©. |
`sqr(X)` | Square of tensor entries. | Similar to MATLAB©'s `X.^2`. |
`cube(X)` | Cube of tensor entries. | Similar to MATLAB©'s `X.^3`. |
`X.^y` | Element-wise power of tensor entries. | Similar syntax to MATLAB©, but TensCalc requires the power … |
`sqrt(X)` | Square root of tensor entries. | Similar syntax to MATLAB©. |
`round(X)` | Round to the nearest integer. | Similar syntax to MATLAB©, except that it does not support a second argument specifying a desired number of digits for rounding to a decimal. |
`ceil(X)` | Round to the nearest integer towards +infinity. | Similar syntax to MATLAB©. |
`floor(X)` | Round to the nearest integer towards -infinity. | Similar syntax to MATLAB©. |
`sign(X)` | Signum function applied to the entries of the tensor. | Similar syntax to MATLAB©. |
`heaviside(X)` | Step or Heaviside function applied to the entries of the tensor. | Similar syntax to MATLAB©. |
`abs(X)` | Absolute value of tensor entries. | Similar syntax to MATLAB©. |
`relu(X)` | Rectified linear unit activation function applied to the entries of the tensor. | Similar to `max(X,0)`, applied entry by entry. |
`srelu(X)` | Smooth rectified linear unit activation function applied to the entries of the tensor. | Similar to `log(1+exp(X))`, applied entry by entry. |
`normpdf(X)` | Returns the pdf of the standard normal distribution evaluated at the entries of `X`. | Similar syntax to MATLAB©, except that it does not support second and third arguments specifying a mean and standard deviation different than 0 and 1, respectively. |
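As a brief sketch (tensor name and size are of our choosing), these functions apply entry by entry and can be composed freely:
Tvariable X [2,3];
Y=exp(-(X.*X));     % Gaussian-like expression, computed entry by entry
Z=abs(sin(X));      % composition of entry-wise functions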
4.4. Linear algebra
TensCalc supports several basic linear algebra operations, but for operations that require some form of factorization to be performed efficiently, such as:
det() logdet() inv() mldivide() \ traceinv()
TensCalc requires the user to explicitly select the factorization desired:
ldl() lu()
This is accomplished by passing to the function the factorized matrix, as in:
det(lu(A)) det(ldl(A))
logdet(lu(A)) logdet(ldl(A))
inv(lu(A)) inv(ldl(A))
traceinv(lu(A)) traceinv(ldl(A))
mldivide(lu(A),Y) mldivide(ldl(A),Y)
lu(A)\B ldl(A)\B
This syntax works because in TensCalc the factorization functions lu() and ldl() return the whole factorization as a single entity, which can then be passed to any function that takes a factorization as input (such as the functions listed above). This behavior is distinct from regular MATLAB©, for which lu() and ldl() return factorizations through multiple outputs.
TensCalc’s code generation makes sure that redundant computations are not executed; therefore, the following two snippets of code result in the same computation:
Tvariable A [10,10]
Tvariable b [10]
facA=lu(A);
y=det(facA);
x=facA\b;
or:
Tvariable A [10,10]
Tvariable b [10]
y=det(lu(A));
x=lu(A)\b;
Specifically note that, even though lu(A)
appears twice in the
bottom snippet, the matrix A
is only factored once.
4.4.1. Which factorization to use?
The LDL factorization is faster and requires less memory. However, it has some limitations:
- LDL should only be used for symmetric matrices. When used on a matrix that is not symmetric, all the entries above the main diagonal are ignored.
- TensCalc's LDL factorization only works for matrices that do not have zeros in the main diagonal. Structural zeros in the main diagonal will result in an error at code-generation time. Non-structural zeros (i.e., zeros that cannot be determined at code-generation time) will lead to divisions by zero at run time.
The LU factorization is a little slower and requires twice as much memory to store both the L and U factors. However, it can be applied to non-symmetric matrices and to matrices with zeros in the main diagonal.
Both the LU and the LDL factorizations use “psychologically lower/upper-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for lu() and ldl() with sparse matrices).
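For example (a minimal sketch with names of our choosing), a non-symmetric matrix calls for lu(), whereas a matrix that is symmetric by construction can use ldl():
Tvariable A [4,4];   % generic, possibly non-symmetric
Tvariable b [4];
x1=lu(A)\b;          % LU works for non-symmetric A
S=A+A';              % symmetric by construction
x2=ldl(S)\b;         % LDL is faster, but requires symmetry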
Usage | Description | Notes |
---|---|---|
`diag(x)`, `diag(X)` | When `x` is an n-vector, returns an n-by-n matrix with the entries of `x` on the main diagonal. When `X` is an n-by-n matrix, returns an n-vector with the main diagonal of `X`. | Similar syntax to MATLAB©, except that it does not support a second argument specifying a diagonal other than the main diagonal. |
`trace(X)` | Trace of a matrix, i.e., sum of the diagonal elements of `X`. | Similar syntax to MATLAB©. |
`X'` | Transpose of a real-valued matrix. | Similar syntax to MATLAB©, except that TensCalc does not support complex-valued variables, and therefore `'` always corresponds to the real transpose. |
`lu(X)` | LU factorization using “psychologically lower/upper-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for lu with sparse matrices). | The output of this function includes the whole factorization as a single entity, in a format that can be passed to functions that require factorizations (such as mldivide, inv, det, logdet, traceinv), but it should not be used by functions that are not expecting a factorized matrix as an input. |
`ldl(X)` | LDL factorization using “psychologically lower-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for ldl with sparse matrices). All entries of `X` above the main diagonal are ignored. | The output of this function includes the whole factorization as a single entity, in a format that can be passed to functions that require factorizations (such as mldivide, inv, det, logdet, traceinv), but it should not be used by functions that are not expecting a factorized matrix as an input. |
`lu(A)\B`, `ldl(A)\B`, `mldivide(lu(A),B)`, `mldivide(ldl(A),B)` | Left matrix division, i.e., the solution of the system of equations `A*X=B`. | MATLAB© allows for non-square and possibly singular matrices `A`, which TensCalc does not support. When the LDL factorization is used, all entries of `A` above the main diagonal are ignored. |
`inv(lu(X))`, `inv(ldl(X))` | Inverse of a square matrix. | Similar syntax to MATLAB©. When the LDL factorization is used, all entries of `X` above the main diagonal are ignored. |
`det(lu(X))`, `det(ldl(X))` | Determinant of a square matrix. | Similar syntax to MATLAB©. When the LDL factorization is used, all entries of `X` above the main diagonal are ignored. |
`logdet(lu(X))`, `logdet(ldl(X))` | Natural logarithm of the determinant of a square matrix with positive determinant. | Results in the same value as `log(det(X))`. When the LDL factorization is used, all entries of `X` above the main diagonal are ignored. TensCalc assumes that the determinant is positive. |
… | Natural logarithm of the determinant of a square matrix. | It results in the same value … When the LDL factorization is used, all entries of `X` above the main diagonal are ignored. |
`ldl_l(X)`, `ldl_d(X)` | Returns the lower-triangular matrix or the diagonal matrix, respectively, in an LDL factorization. The LDL factorization uses “psychologically lower-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for ldl with sparse matrices). | These functions can only be applied to a matrix that has been factorized with ldl(); all entries above the main diagonal are ignored. |
`lu_l(X)`, `lu_u(X)`, `lu_d(X)` | Returns the lower-triangular matrix in an LU factorization (which is guaranteed to be nonsingular), the upper-triangular matrix, or just the main diagonal of the upper-triangular matrix (as a vector). The LU factorization uses “psychologically upper/lower-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for lu with sparse matrices). | These functions can only be applied to a matrix that has been factorized with lu(). |
`norm(X)`, `norm(X,2)` | 2-norm of a scalar or a vector `X`. | Similar syntax to MATLAB©, but restricted to vectors. The 2-norm should be avoided in optimization criteria because it is not differentiable at the origin. See Avoiding lack of smoothness on how to overcome this issue. |
`norm(X,1)` | 1-norm of a scalar, a vector, or a matrix `X`. | The 1-norm should be avoided in optimization criteria because it is not differentiable at points where the optimum often lies. See Avoiding lack of smoothness on how to overcome this issue. |
`norm(X,inf)` | Infinity-norm of a scalar, a vector, or a matrix `X`. | The infinity-norm should be avoided in optimization criteria because it is not differentiable at points where the optimum often lies. See Avoiding lack of smoothness on how to overcome this issue. |
`norm2(X)` | Returns the sum of the squares of all entries of the tensor `X` (the square of the 2-norm for vectors). | Because no square root is taken, this expression is smooth; it appears in the reformulations in Avoiding lack of smoothness. |
`norm1(X)` | Returns the sum of the absolute values of all entries of the tensor `X`. | |
`norminf(X)` | Returns the largest absolute value of all entries of the tensor `X`. | |
Warning
Calling \, mldivide(), inv(), det(), logdet(), traceinv(), ldl_d(), or lu_d() with a matrix that has not been factorized will lead to syntax errors, often not easy to directly relate to the missing factorization.
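As another sketch (with names of our choosing, and assuming det(A)>0 where noted), the determinant-related functions compose in the same way:
Tvariable A [6,6];
facA=lu(A);
v1=logdet(facA);     % natural log of det(A); assumes det(A)>0
v2=traceinv(facA);   % trace of the inverse of A
d=lu_d(facA);        % main diagonal of the upper-triangular factor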
4.4.2. Avoiding lack of smoothness
In general, norms are not smooth:
- The 2-norm is not smooth at the origin, because of the sqrt().
- The 1-norm is not smooth at any point where one coordinate is equal to zero, because of the abs().
- The infinity-norm is not smooth at any point where two or more entries have equal largest absolute values, because of the max().
For optimizations, this is particularly problematic if the optimum occurs precisely at points where the norm is not differentiable, which is almost always the case for the 1-norm and the infinity-norm.
However, it is generally possible to avoid this problem by introducing auxiliary slack variables and constraints that make the optimization smooth, without losing convexity.
A minimization involving a 2-norm of the form:
\(\min\Big\{ \text{norm}(x,2)+f(x) : x\in\mathbb{R}^n, F(x)\ge 0, G(x)= 0 \Big\}\)
can be reformulated as
\(\min\Big\{ v+f(x) : v\in\mathbb{R}, x\in\mathbb{R}^n, v>0, v^2 \ge \text{norm2}(x), F(x)\ge 0, G(x)= 0 \Big\}\)
A minimization involving a 1-norm of the form:
\(\min\Big\{ \text{norm}(x,1)+f(x) : x\in\mathbb{R}^n, F(x)\ge 0, G(x)= 0 \Big\}\)
can be reformulated as
\(\min\Big\{ \text{sum}(v)+f(x) : x,v\in\mathbb{R}^n, -v\le x \le v, F(x)\ge 0, G(x)= 0 \Big\}\)
A minimization involving an infinity-norm of the form:
\(\small\min\Big\{ \text{norm}(x,\text{inf})+f(x) : x\in\mathbb{R}^n, F(x)\ge 0, G(x)= 0 \Big\}\)
can be reformulated as
\(\small\min\Big\{ v+f(x) : v\in\mathbb{R},\; x\in\mathbb{R}^n, -v\le x \le v, F(x)\ge 0, G(x)= 0 \Big\}\)
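As a concrete sketch of the 1-norm case (the names x, v and the size n are of our choosing, and the surrounding solver-generation call is omitted), one slack variable is introduced per entry of x:
n=5;
Tvariable x [n];
Tvariable v [n];            % slack variables, intended to satisfy v>=abs(x)
criterion=sum(v,1);         % replaces norm(x,1) in the criterion
constraints={-v<=x,x<=v};   % enforce -v <= x <= v entry-wise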
4.5. Calculus - differentiation
The following function computes the partial derivatives of a symbolic expression with respect to the entries of a tensor-valued symbolic variable:
- gradient(f, x)
- Parameters
  - f (Tcalculus symbolic tensor) – tensor-valued expression to be differentiated
  - x (Tcalculus tensor-valued symbolic variable) – variable (created with Tvariable()) with respect to which the derivatives will be taken
When
- f is a tensor with size [n1,n2,...,nN]
- x is a tensor-valued variable (created with Tvariable()) with size [m1,m2,...,mM]
then g=gradient(f,x) results in a tensor with size [n1,n2,...,nN,m1,m2,...,mM] with
\[g(i1,i2,...,iN,j1,j2,...,jM)=\frac{d\,f(i1,i2,...,iN)}{d x(j1,j2,...,jM)}\]
For example, if f is a scalar (with size []) and x an n-vector (with size [n]), then g=gradient(f,x) results in an n-vector (with size [n]) with the usual gradient:
\[g(i)=\frac{d\,f}{d x(i)}\]
If we then compute h=gradient(g,x), we obtain an n-by-n matrix (with size [n,n]) with the Hessian matrix:
\[h(i,j)=\frac{d^2\,f}{d x(i)\,d x(j)}\]
The computation of first and second derivatives can be streamlined using the following function:
- hessian(f, x[, y])
- Parameters
  - f (Tcalculus symbolic tensor) – tensor-valued expression to be differentiated
  - x (Tcalculus tensor-valued symbolic variable) – variable (created with Tvariable()) with respect to which the 1st derivatives will be taken
  - y (Tcalculus tensor-valued symbolic variable) – variable (created with Tvariable()) with respect to which the 2nd derivatives will be taken (optional; when omitted, the 2nd derivatives will also be taken with respect to x)
When
- f is a tensor with size [n1,n2,...,nN]
- x is a tensor-valued variable (created with Tvariable()) with size [m1,m2,...,mM]
- y is a tensor-valued variable (created with Tvariable()) with size [l1,l2,...,lL]
then [h,g]=hessian(f,x,y) results in
- a tensor g with size [n1,n2,...,nN,m1,m2,...,mM] with
\[g(i1,i2,...,iN,j1,j2,...,jM)=\frac{d\,f(i1,i2,...,iN)}{d x(j1,j2,...,jM)}\]
- a tensor h with size [n1,n2,...,nN,m1,m2,...,mM,l1,l2,...,lL] with
\[h(i1,i2,...,iN,j1,j2,...,jM,k1,k2,...,kL)=\frac{d\,f(i1,i2,...,iN)}{d x(j1,j2,...,jM)\,d y(k1,k2,...,kL)}\]
In practice, [h,g]=hessian(f,x,y)
is equivalent to:
g=gradient(f,x);
h=gradient(g,y);
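For instance (a minimal sketch with names of our choosing), for a quadratic form the gradient and Hessian come out with the expected sizes:
n=4;
Tvariable x [n];
Tvariable Q [n,n];
f=tprod(x,[-1],Q,[-1,-2],x,[-2]);   % scalar x'*Q*x (size [])
g=gradient(f,x);                    % gradient, size [n]
[h,g]=hessian(f,x);                 % Hessian h, size [n,n]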