launhr_col_getrfnp2#

claunhr_col_getrfnp2

Functions

void claunhr_col_getrfnp2(
    const INT           m,
    const INT           n,
          c64* restrict A,
    const INT           lda,
          c64* restrict D,
          INT*          info
);

void claunhr_col_getrfnp2(const INT m, const INT n, c64 *restrict A, const INT lda, c64 *restrict D, INT *info)#

CLAUNHR_COL_GETRFNP2 computes the modified LU factorization without pivoting of a complex general M-by-N matrix A.

The factorization has the form:

A - S = L * U,

where: S is a m-by-n diagonal sign matrix with the diagonal D, so that D(i) = S(i,i), 0 <= i < min(M,N). The diagonal D is constructed as D(i)=-SIGN(A(i,i)), where A(i,i) is the value after performing i steps of Gaussian elimination. This means that the diagonal element at each step of “modified” Gaussian elimination is at least one in absolute value (so that division-by-zero not possible during the division by the diagonal element);

L is a M-by-N lower triangular matrix with unit diagonal elements (lower trapezoidal if M > N);

and U is a M-by-N upper triangular matrix (upper trapezoidal if M < N).

This routine is an auxiliary routine used in the Householder reconstruction routine CUNHR_COL. In CUNHR_COL, this routine is applied to an M-by-N matrix A with orthonormal columns, where each element is bounded by one in absolute value. With the choice of the matrix S above, one can show that the diagonal element at each step of Gaussian elimination is the largest (in absolute value) in the column on or below the diagonal, so that no pivoting is required for numerical stability [1].

For more details on the Householder reconstruction algorithm, including the modified LU factorization, see [1].

This is the recursive version of the LU factorization algorithm. Denote A - S by B. The algorithm divides the matrix B into four submatrices:

   [  B11 | B12  ]  where B11 is n1 by n1,

B = [ –—|–— ] B21 is (m-n1) by n1, [ B21 | B22 ] B12 is n1 by n2, B22 is (m-n1) by n2, with n1 = min(m,n)/2, n2 = n-n1.

The subroutine calls itself to factor B11, solves for B21, solves for B12, updates B22, then calls itself to factor B22.

For more details on the recursive LU algorithm, see [2].

CLAUNHR_COL_GETRFNP2 is called to factorize a block by the blocked routine CLAUNHR_COL_GETRFNP, which uses blocked code calling Level 3 BLAS to update the submatrix. However, CLAUNHR_COL_GETRFNP2 is self-sufficient and can be used without CLAUNHR_COL_GETRFNP.

[1] “Reconstructing Householder vectors from tall-skinny QR”, G. Ballard, J. Demmel, L. Grigori, M. Jacquelin, H.D. Nguyen, E. Solomonik, J. Parallel Distrib. Comput., vol. 85, pp. 3-31, 2015.

[2] “Recursion leads to automatic variable blocking for dense linear

algebra algorithms”, F. Gustavson, IBM J. of Res. and Dev., vol. 41, no. 6, pp. 737-755, 1997.

Parameters

in

m

The number of rows of the matrix A. m >= 0.

in

n

The number of columns of the matrix A. n >= 0.

inout

A

Complex array, dimension (lda, n). On entry, the M-by-N matrix to be factored. On exit, the factors L and U from the factorization A-S=L*U; the unit diagonal elements of L are not stored.

in

lda

The leading dimension of the array A. lda >= max(1, m).

out

D

Complex array, dimension min(m, n). The diagonal elements of the diagonal M-by-N sign matrix S, D(i) = S(i,i), where 0 <= i < min(M,N). The elements can be only ( +1.0, 0.0 ) or (-1.0, 0.0 ).

out

info

= 0: successful exit
< 0: if info = -i, the i-th argument had an illegal value.

zlaunhr_col_getrfnp2

Functions

void zlaunhr_col_getrfnp2(
    const INT            m,
    const INT            n,
          c128* restrict A,
    const INT            lda,
          c128* restrict D,
          INT*           info
);

void zlaunhr_col_getrfnp2(const INT m, const INT n, c128 *restrict A, const INT lda, c128 *restrict D, INT *info)#

ZLAUNHR_COL_GETRFNP2 computes the modified LU factorization without pivoting of a complex general M-by-N matrix A.

The factorization has the form:

A - S = L * U,

where: S is a m-by-n diagonal sign matrix with the diagonal D, so that D(i) = S(i,i), 0 <= i < min(M,N). The diagonal D is constructed as D(i)=-SIGN(A(i,i)), where A(i,i) is the value after performing i steps of Gaussian elimination. This means that the diagonal element at each step of “modified” Gaussian elimination is at least one in absolute value (so that division-by-zero not possible during the division by the diagonal element);

L is a M-by-N lower triangular matrix with unit diagonal elements (lower trapezoidal if M > N);

and U is a M-by-N upper triangular matrix (upper trapezoidal if M < N).

This routine is an auxiliary routine used in the Householder reconstruction routine ZUNHR_COL. In ZUNHR_COL, this routine is applied to an M-by-N matrix A with orthonormal columns, where each element is bounded by one in absolute value. With the choice of the matrix S above, one can show that the diagonal element at each step of Gaussian elimination is the largest (in absolute value) in the column on or below the diagonal, so that no pivoting is required for numerical stability [1].

For more details on the Householder reconstruction algorithm, including the modified LU factorization, see [1].

This is the recursive version of the LU factorization algorithm. Denote A - S by B. The algorithm divides the matrix B into four submatrices:

   [  B11 | B12  ]  where B11 is n1 by n1,

B = [ –—|–— ] B21 is (m-n1) by n1, [ B21 | B22 ] B12 is n1 by n2, B22 is (m-n1) by n2, with n1 = min(m,n)/2, n2 = n-n1.

The subroutine calls itself to factor B11, solves for B21, solves for B12, updates B22, then calls itself to factor B22.

For more details on the recursive LU algorithm, see [2].

ZLAUNHR_COL_GETRFNP2 is called to factorize a block by the blocked routine ZLAUNHR_COL_GETRFNP, which uses blocked code calling Level 3 BLAS to update the submatrix. However, ZLAUNHR_COL_GETRFNP2 is self-sufficient and can be used without ZLAUNHR_COL_GETRFNP.

[1] “Reconstructing Householder vectors from tall-skinny QR”, G. Ballard, J. Demmel, L. Grigori, M. Jacquelin, H.D. Nguyen, E. Solomonik, J. Parallel Distrib. Comput., vol. 85, pp. 3-31, 2015.

[2] “Recursion leads to automatic variable blocking for dense linear

algebra algorithms”, F. Gustavson, IBM J. of Res. and Dev., vol. 41, no. 6, pp. 737-755, 1997.

Parameters

in

m

The number of rows of the matrix A. m >= 0.

in

n

The number of columns of the matrix A. n >= 0.

inout

A

Complex array, dimension (lda, n). On entry, the M-by-N matrix to be factored. On exit, the factors L and U from the factorization A-S=L*U; the unit diagonal elements of L are not stored.

in

lda

The leading dimension of the array A. lda >= max(1, m).

out

D

Complex array, dimension min(m, n). The diagonal elements of the diagonal M-by-N sign matrix S, D(i) = S(i,i), where 0 <= i < min(M,N). The elements can be only ( +1.0, 0.0 ) or (-1.0, 0.0 ).

out

info

= 0: successful exit
< 0: if info = -i, the i-th argument had an illegal value.