4. Operations on symbolic expressions
4.1. Arithmetic operations
TensCalc supports most of the basic arithmetic operations in MATLAB©, in most cases using the same or a very similar syntax.
Usage | Description | Notes |
---|---|---|
`X+Y` | Entry-by-entry addition of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `+`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X-Y` | Entry-by-entry subtraction of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `-`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X.*Y` | Entry-by-entry multiplication of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `.*`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X*Y` | Matrix multiplication of tensors, which adapts to the size of the operands. | Depending on the sizes of the operands, this operation may behave quite differently from MATLAB©'s matrix multiplication `*`. |
`X./Y` | Entry-by-entry right division of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `./`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X.\Y` | Entry-by-entry left division of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `.\`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`sum(X)`, `sum(X,dim)` | Sum of the entries of the tensor `X`, optionally along the dimension `dim`. | Similar syntax to MATLAB©'s `sum()`. |
`max(X)` | Maximum entry of the tensor `X`. | Similar syntax to MATLAB©, with the exception that it does not return a second output with indices. |
`max(X,Y)` | For two tensors `X` and `Y` with the same size, returns a tensor `M` also with the same size, but with entries taken from `X` or `Y`, depending on which entry is largest. If `X` is a scalar, then `M` has the same size as `Y` and its entries are the largest of the corresponding entry of `Y` and the (only) entry of `X`; similarly if `Y` is a scalar. | Similar syntax to MATLAB©. |
`min(X)` | Minimum entry of the tensor `X`. | Similar syntax to MATLAB©, with the exception that it does not return a second output with indices. |
`min(X,Y)` | For two tensors `X` and `Y` with the same size, returns a tensor `M` also with the same size, but with entries taken from `X` or `Y`, depending on which entry is smallest. If `X` is a scalar, then `M` has the same size as `Y` and its entries are the smallest of the corresponding entry of `Y` and the (only) entry of `X`; similarly if `Y` is a scalar. | Similar syntax to MATLAB©. |
`full(X)` | Converts a sparse tensor to a full one, in the sense that subsequent computations will not take advantage of the information that some entries are known to be zero. | The main use of this function is to force the result of a computation to be returned in a linear memory structure with all the zeros filled in appropriately. This is generally required for values returned as outputs of the generated code. |
`tprod(...)` | Tensor product. | See Tensor product below. |
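As a quick illustration (a minimal sketch; the variable names and sizes are of our choosing), entry-wise arithmetic combines symbolic tensors much as in plain MATLAB©:
Tvariable X [3,2];
Tvariable Y [3,2];
Z1=X+Y;        % entry-by-entry addition (same sizes)
Z2=2.*X;       % scalar with a tensor of arbitrary size
Z3=X./Y;       % entry-by-entry right division
M=max(X,Y);    % entry-wise maximum
s=sum(X,1);    % sum along the 1st dimension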
4.1.1. Tensor product
The function tprod() provides a very general and flexible multiplication operation between tensors, which includes as special cases the usual matrix multiplication *, the summation over rows and/or columns sum(), the trace of a matrix trace(), the extraction of a matrix's main diagonal diag(), and the Euclidean norm of a vector norm(), as well as generalizations of all these operations to n-dimensional tensors:
- tprod(A, a, B, b, C, c, ...)
- Parameters
  - A (Tcalculus symbolic tensor) – 1st tensor
  - a (vector of integers) – indices for A, with length ndims(A)
  - B (Tcalculus symbolic tensor) – 2nd tensor
  - b (vector of integers) – indices for B, with length ndims(B)
  - C (Tcalculus symbolic tensor) – 3rd tensor
  - c (vector of integers) – indices for C, with length ndims(C)
tprod() returns a tensor Y obtained using a summation-product operation of the form
\[Y(y_1,y_2,\dots)=\sum_{s_1}\sum_{s_2}\cdots\; A(a_1,a_2,\dots)\,B(b_1,b_2,\dots)\,C(c_1,c_2,\dots)\cdots\]
where the matching between the result tensor indices \(y_1, y_2,...\), the summation indices \(s_1, s_2,...\), and the input tensor indices \(a_1, a_2,..., b_1,b_2,...,c_1,c_2,...\) is determined as follows:
- A negative value of -1 in one or several of the input tensor indices \(a_1, a_2,..., b_1,b_2,...,c_1,c_2,...\) means that those particular indices should be summed under the 1st summation operation.
- A negative value of -2 in one or several of the input tensor indices \(a_1, a_2,..., b_1,b_2,...,c_1,c_2,...\) means that those particular indices should be summed under the 2nd summation operation.
- …
- The absence of a negative index means that there are no summations.
- A positive value of +1 in one or several of the input tensor indices means that those indices should match the 1st index \(y_1\) of the result tensor \(Y\).
- A positive value of +2 in one or several of the input tensor indices means that those indices should match the 2nd index \(y_2\) of the result tensor \(Y\).
- …
- The absence of a positive index means that the result has an empty size (i.e., it is a scalar).
A few examples are helpful to clarify the syntax and highlight the flexibility of tprod():
Tvariable A [m,n];
Tvariable x [n];
Tvariable B [n,k];
tprod(A,[1,-1],x,[-1]);     % same as the matrix-vector product A*x
tprod(A,[1,-1],B,[-1,2]);   % same as the matrix-matrix product A*B
tprod(A,[2,1]);             % same as the transpose A'
tprod(A,[1,-1]);            % same as sum(A,2)
tprod(x,[-1],x,[-1]);       % same as the square of the Euclidean norm x'*x
tprod(A,[-1,-2],A,[-1,-2]); % same as the square of the Frobenius norm of A
Tvariable C [n,n];
tprod(C,[1,1]);             % vector with the main diagonal of C
tprod(C,[-1,-1]);           % same as trace(C)
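The same syntax extends beyond matrices; for example (a sketch, with sizes of our choosing), a batch of matrix-vector products over a 3-dimensional tensor:
Tvariable T [p,m,n];        % p matrices of size m-by-n
Tvariable z [n];
tprod(T,[1,2,-1],z,[-1]);   % size [p,m]: entry (i,j) is sum_k T(i,j,k)*z(k)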
4.2. Logical-valued operations
TensCalc uses logical-valued operations mostly to specify optimization constraints.
Usage | Description | Notes |
---|---|---|
`X==Y` | Entry-by-entry equality comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `==`, implicit expansion of singleton dimensions is not performed; aside from scalars, the operands must have exactly the same size. |
`X>=Y` | Entry-by-entry greater than or equal to comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `>=`, implicit expansion of singleton dimensions is not performed. From the perspective of a constrained optimization numerical solver, due to finite numerical precision, `X>=Y` and `X>Y` represent the same constraint. |
`X>Y` | Entry-by-entry greater than comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `>`, implicit expansion of singleton dimensions is not performed. |
`X<=Y` | Entry-by-entry smaller than or equal to comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `<=`, implicit expansion of singleton dimensions is not performed. From the perspective of a constrained optimization numerical solver, due to finite numerical precision, `X<=Y` and `X<Y` represent the same constraint. |
`X<Y` | Entry-by-entry smaller than comparison of tensors of the same size or of a scalar with a tensor of arbitrary size. | Unlike MATLAB©'s regular `<`, implicit expansion of singleton dimensions is not performed. |
… | Checks if the entries of the tensor … | Similar syntax to MATLAB©. |
… | Checks if the entries of the tensor … | Similar syntax to MATLAB©. |
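For instance (a minimal sketch; the variable names are ours, and the solver-generation call that would consume these expressions is omitted), comparisons are typically collected into a set of optimization constraints:
Tvariable x [3];
positivity=(x>=0);      % entry-wise inequality constraint
budget=(sum(x,1)==1);   % scalar equality constraint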
4.3. Entry-wise operations
The following functions are applied to every entry of a tensor.
Usage | Description | Notes |
---|---|---|
`exp(X)` | Exponential of tensor entries. | Similar syntax to MATLAB©. |
`log(X)` | Natural logarithm of tensor entries. | Similar syntax to MATLAB©. |
`sin(X)` | Sine of tensor entries in radians. | Similar syntax to MATLAB©. |
`cos(X)` | Cosine of tensor entries in radians. | Similar syntax to MATLAB©. |
`tan(X)` | Tangent of tensor entries in radians. | Similar syntax to MATLAB©. |
`atan(X)` | Inverse tangent in radians of tensor entries. | Similar syntax to MATLAB©. |
`sqr(X)` | Square of tensor entries. | Similar to MATLAB©'s `X.^2`. |
`cube(X)` | Cube of tensor entries. | Similar to MATLAB©'s `X.^3`. |
`X.^y` | Element-wise power of tensor entries. | Similar syntax to MATLAB©, but TensCalc requires the power … |
`sqrt(X)` | Square root of tensor entries. | Similar syntax to MATLAB©. |
`round(X)` | Round to the nearest integer. | Similar syntax to MATLAB©, except that it does not support a second argument specifying a desired number of digits for rounding to a decimal. |
`ceil(X)` | Round to the nearest integer towards +infinity. | Similar syntax to MATLAB©. |
`floor(X)` | Round to the nearest integer towards -infinity. | Similar syntax to MATLAB©. |
`sign(X)` | Signum function applied to the entries of the tensor. | Similar syntax to MATLAB©. |
`heaviside(X)` | Step or Heaviside function applied to the entries of the tensor. | Similar syntax to MATLAB©. |
`abs(X)` | Absolute value of tensor entries. | Similar syntax to MATLAB©. |
`relu(X)` | Rectified linear unit activation function applied to the entries of the tensor. | Similar to `max(X,0)`, applied entry by entry. |
`srelu(X)` | Smooth rectified linear unit activation function applied to the entries of the tensor. | Similar to `log(1+exp(X))`, applied entry by entry. |
`normpdf(X)` | Returns the pdf of the standard normal distribution evaluated at the entries of `X`. | Similar syntax to MATLAB©, except that it does not support second and third arguments specifying a mean and standard deviation different than 0 and 1, respectively. |
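As a brief sketch (tensor name and size are of our choosing), these functions apply entry by entry and can be composed freely:
Tvariable X [2,3];
Y=exp(-(X.*X));     % Gaussian-like expression, computed entry by entry
Z=abs(sin(X));      % composition of entry-wise functions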
4.4. Linear algebra
TensCalc supports several basic linear algebra operations, but for operations that require some form of factorization to be performed efficiently, such as:
det() logdet() inv() mldivide() \ traceinv()
TensCalc requires the user to explicitly select the factorization desired:
ldl() lu()
This is accomplished by passing to the function the factorized matrix, as in:
det(lu(A)) det(ldl(A))
logdet(lu(A)) logdet(ldl(A))
inv(lu(A)) inv(ldl(A))
traceinv(lu(A)) traceinv(ldl(A))
mldivide(lu(A),Y) mldivide(ldl(A),Y)
lu(A)\B ldl(A)\B
This syntax works because in TensCalc the factorization functions lu() and ldl() return the whole factorization as a single entity, which can then be passed to any function that takes a factorization as input (such as the functions listed above). This behavior is distinct from regular MATLAB©, for which lu() and ldl() return factorizations through multiple outputs.
TensCalc’s code generation makes sure that redundant computations are not executed; therefore, the following two snippets of code result in the same computation:
Tvariable A [10,10]
Tvariable b [10]
facA=lu(A);
y=det(facA);
x=facA\b;
or:
Tvariable A [10,10]
Tvariable b [10]
y=det(lu(A));
x=lu(A)\b;
Specifically note that, even though lu(A)
appears twice in the
bottom snippet, the matrix A
is only factored once.
4.4.1. Which factorization to use?
The LDL factorization is faster and requires less memory. However, it has some limitations:
- LDL should only be used for symmetric matrices. When used on a matrix that is not symmetric, all the entries above the main diagonal are ignored.
- TensCalc's LDL factorization only works for matrices that do not have zeros in the main diagonal. Structural zeros in the main diagonal will result in an error at code-generation time. Non-structural zeros (i.e., zeros that cannot be determined at code-generation time) will lead to divisions by zero at run time.
The LU factorization is a little slower and requires twice as much memory to store both the L and U factors. However, it can be applied to non-symmetric matrices and to matrices with zeros in the main diagonal.
Both the LU and the LDL factorizations use “psychologically lower/upper-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for lu() and ldl() with sparse matrices).
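For example (a minimal sketch with names of our choosing), a non-symmetric matrix calls for lu(), whereas a matrix that is symmetric by construction can use ldl():
Tvariable A [4,4];   % generic, possibly non-symmetric
Tvariable b [4];
x1=lu(A)\b;          % LU works for non-symmetric A
S=A+A';              % symmetric by construction
x2=ldl(S)\b;         % LDL is faster, but requires symmetry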
Usage | Description | Notes |
---|---|---|
`diag(x)`, `diag(X)` | When `x` is an n-vector, returns an n-by-n matrix with the entries of `x` on the main diagonal. When `X` is an n-by-n matrix, returns an n-vector with the main diagonal of `X`. | Similar syntax to MATLAB©, except that it does not support a second argument specifying a diagonal other than the main diagonal. |
`trace(X)` | Trace of a matrix, i.e., sum of the diagonal elements of `X`. | Similar syntax to MATLAB©. |
`X'` | Transpose of a real-valued matrix. | Similar syntax to MATLAB©, except that TensCalc does not support complex-valued variables, and therefore `'` always corresponds to the real transpose. |
`lu(X)` | LU factorization using “psychologically lower/upper-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for lu with sparse matrices). | The output of this function includes the whole factorization as a single entity, in a format that can be passed to functions that require factorizations (such as mldivide, inv, det, logdet, traceinv), but it should not be used by functions that are not expecting a factorized matrix as an input. |
`ldl(X)` | LDL factorization using “psychologically lower-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for ldl with sparse matrices). All entries of `X` above the main diagonal are ignored. | The output of this function includes the whole factorization as a single entity, in a format that can be passed to functions that require factorizations (such as mldivide, inv, det, logdet, traceinv), but it should not be used by functions that are not expecting a factorized matrix as an input. |
`lu(A)\B`, `ldl(A)\B`, `mldivide(lu(A),B)`, `mldivide(ldl(A),B)` | Left matrix division, i.e., the solution of the system of equations `A*X=B`. | MATLAB© allows for non-square and possibly singular matrices `A`, which TensCalc does not support. When the LDL factorization is used, all entries of `A` above the main diagonal are ignored. |
`inv(lu(X))`, `inv(ldl(X))` | Inverse of a square matrix. | Similar syntax to MATLAB©. When the LDL factorization is used, all entries of `X` above the main diagonal are ignored. |
`det(lu(X))`, `det(ldl(X))` | Determinant of a square matrix. | Similar syntax to MATLAB©. When the LDL factorization is used, all entries of `X` above the main diagonal are ignored. |
`logdet(lu(X))`, `logdet(ldl(X))` | Natural logarithm of the determinant of a square matrix with positive determinant. | Results in the same value as `log(det(X))`. When the LDL factorization is used, all entries of `X` above the main diagonal are ignored. TensCalc assumes that the determinant is positive. |
… | Natural logarithm of the determinant of a square matrix. | It results in the same value … When the LDL factorization is used, all entries of `X` above the main diagonal are ignored. |
`ldl_l(X)`, `ldl_d(X)` | Returns the lower-triangular matrix or the diagonal matrix, respectively, in an LDL factorization. The LDL factorization uses “psychologically lower-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for ldl with sparse matrices). | These functions can only be applied to a matrix that has been factorized with ldl(); all entries above the main diagonal are ignored. |
`lu_l(X)`, `lu_u(X)`, `lu_d(X)` | Returns the lower-triangular matrix in an LU factorization (which is guaranteed to be nonsingular), the upper-triangular matrix, or just the main diagonal of the upper-triangular matrix (as a vector). The LU factorization uses “psychologically upper/lower-triangular matrices”, i.e., matrices that are triangular up to a permutation, with permutations selected to minimize the fill-in for sparse matrices and reduce computation time (see MATLAB©’s documentation for lu with sparse matrices). | These functions can only be applied to a matrix that has been factorized with lu(). |
`norm(X)`, `norm(X,2)` | 2-norm of a scalar or a vector `X`. | Similar syntax to MATLAB©, but restricted to vectors. The 2-norm should be avoided in optimization criteria because it is not differentiable at the origin. See Avoiding lack of smoothness on how to overcome this issue. |
`norm(X,1)` | 1-norm of a scalar, a vector, or a matrix `X`. | The 1-norm should be avoided in optimization criteria because it is not differentiable at points where the optimum often lies. See Avoiding lack of smoothness on how to overcome this issue. |
`norm(X,inf)` | Infinity-norm of a scalar, a vector, or a matrix `X`. | The infinity-norm should be avoided in optimization criteria because it is not differentiable at points where the optimum often lies. See Avoiding lack of smoothness on how to overcome this issue. |
`norm2(X)` | Returns the sum of the squares of all entries of the tensor `X` (the square of the 2-norm for vectors). | Because no square root is taken, this expression is smooth; it appears in the reformulations in Avoiding lack of smoothness. |
`norm1(X)` | Returns the sum of the absolute values of all entries of the tensor `X`. | |
`norminf(X)` | Returns the largest absolute value of all entries of the tensor `X`. | |
Warning
Calling \, mldivide(), inv(), det(), logdet(), traceinv(), ldl_d(), or lu_d() with a matrix that has not been factorized will lead to syntax errors, often not easy to directly relate to the missing factorization.
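As another sketch (with names of our choosing, and assuming det(A)>0 where noted), the determinant-related functions compose in the same way:
Tvariable A [6,6];
facA=lu(A);
v1=logdet(facA);     % natural log of det(A); assumes det(A)>0
v2=traceinv(facA);   % trace of the inverse of A
d=lu_d(facA);        % main diagonal of the upper-triangular factor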
4.4.2. Avoiding lack of smoothness
In general, norms are not smooth:
- The 2-norm is not smooth at the origin, because of the sqrt().
- The 1-norm is not smooth at any point where one coordinate is equal to zero, because of the abs().
- The infinity-norm is not smooth at any point where two or more entries have equal largest absolute values, because of the max().
For optimizations, this is particularly problematic if the optimum occurs precisely at points where the norm is not differentiable, which is almost always the case for the 1-norm and the infinity-norm.
However, it is generally possible to avoid this problem by introducing auxiliary slack variables and constraints that make the optimization smooth, without losing convexity.
A minimization involving a 2-norm of the form:
\(\min\Big\{ \text{norm}(x,2)+f(x) : x\in\mathbb{R}^n, F(x)\ge 0, G(x)= 0 \Big\}\)
can be reformulated as
\(\min\Big\{ v+f(x) : v\in\mathbb{R}, x\in\mathbb{R}^n, v>0, v^2 \ge \text{norm2}(x), F(x)\ge 0, G(x)= 0 \Big\}\)
A minimization involving a 1-norm of the form:
\(\min\Big\{ \text{norm}(x,1)+f(x) : x\in\mathbb{R}^n, F(x)\ge 0, G(x)= 0 \Big\}\)
can be reformulated as
\(\min\Big\{ \text{sum}(v)+f(x) : x,v\in\mathbb{R}^n, -v\le x \le v, F(x)\ge 0, G(x)= 0 \Big\}\)
A minimization involving an infinity-norm of the form:
\(\small\min\Big\{ \text{norm}(x,\text{inf})+f(x) : x\in\mathbb{R}^n, F(x)\ge 0, G(x)= 0 \Big\}\)
can be reformulated as
\(\small\min\Big\{ v+f(x) : v\in\mathbb{R},\; x\in\mathbb{R}^n, -v\le x \le v, F(x)\ge 0, G(x)= 0 \Big\}\)
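As a concrete sketch of the 1-norm case (the names x, v and the size n are of our choosing, and the surrounding solver-generation call is omitted), one slack variable is introduced per entry of x:
n=5;
Tvariable x [n];
Tvariable v [n];            % slack variables, intended to satisfy v>=abs(x)
criterion=sum(v,1);         % replaces norm(x,1) in the criterion
constraints={-v<=x,x<=v};   % enforce -v <= x <= v entry-wise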
4.5. Calculus - differentiation
The following function computes the partial derivatives of a symbolic expression with respect to the entries of a tensor-valued symbolic variable:
- gradient(f, x)
- Parameters
  - f (Tcalculus symbolic tensor) – tensor-valued expression to be differentiated
  - x (Tcalculus tensor-valued symbolic variable) – variable (created with Tvariable()) with respect to which the derivatives will be taken
When
- f is a tensor with size [n1,n2,...,nN]
- x is a tensor-valued variable (created with Tvariable()) with size [m1,m2,...,mM]
then g=gradient(f,x) results in a tensor with size [n1,n2,...,nN,m1,m2,...,mM] with
\[g(i1,i2,...,iN,j1,j2,...,jM)=\frac{d\,f(i1,i2,...,iN)}{d x(j1,j2,...,jM)}\]
For example, if f is a scalar (with size []) and x an n-vector (with size [n]), then g=gradient(f,x) results in an n-vector (with size [n]) with the usual gradient:
\[g(i)=\frac{d\,f}{d x(i)}\]
If we then compute h=gradient(g,x), we obtain an n-by-n matrix (with size [n,n]) with the Hessian matrix:
\[h(i,j)=\frac{d^2\,f}{d x(i)\,d x(j)}\]
The computation of first and second derivatives can be streamlined using the following function:
- hessian(f, x[, y])
- Parameters
  - f (Tcalculus symbolic tensor) – tensor-valued expression to be differentiated
  - x (Tcalculus tensor-valued symbolic variable) – variable (created with Tvariable()) with respect to which the 1st derivatives will be taken
  - y (Tcalculus tensor-valued symbolic variable) – variable (created with Tvariable()) with respect to which the 2nd derivatives will be taken (optional; when omitted, the 2nd derivatives will also be taken with respect to x)
When
- f is a tensor with size [n1,n2,...,nN]
- x is a tensor-valued variable (created with Tvariable()) with size [m1,m2,...,mM]
- y is a tensor-valued variable (created with Tvariable()) with size [l1,l2,...,lL]
then [h,g]=hessian(f,x,y) results in
- a tensor g with size [n1,n2,...,nN,m1,m2,...,mM] with
\[g(i1,i2,...,iN,j1,j2,...,jM)=\frac{d\,f(i1,i2,...,iN)}{d x(j1,j2,...,jM)}\]
- a tensor h with size [n1,n2,...,nN,m1,m2,...,mM,l1,l2,...,lL] with
\[h(i1,i2,...,iN,j1,j2,...,jM,k1,k2,...,kL)=\frac{d\,f(i1,i2,...,iN)}{d x(j1,j2,...,jM)\,d y(k1,k2,...,kL)}\]
In practice, [h,g]=hessian(f,x,y)
is equivalent to:
g=gradient(f,x);
h=gradient(g,y);
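For instance (a minimal sketch with names of our choosing), for a quadratic form the gradient and Hessian come out with the expected sizes:
n=4;
Tvariable x [n];
Tvariable Q [n,n];
f=tprod(x,[-1],Q,[-1,-2],x,[-2]);   % scalar x'*Q*x (size [])
g=gradient(f,x);                    % gradient, size [n]
[h,g]=hessian(f,x);                 % Hessian h, size [n,n]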