Equivalent formulations of semidefinite programs using linear matrix inequality and trace
Published:
A semidefinite program written using the linear matrix inequality can be equivalently written using the trace of a matrix.
Acknowledgement: the content of this post originates from a homework question in Prof. Stark Draper’s course, ECE1505 Convex Optimization, at the University of Toronto.
Semidefinite program and linear matrix inequality
Semidefinite programs (SDPs) are a specific type of conic form programs, which is a subclass of convex optimization problems with respect to the Lowener order, that is, the cone of (symmetric) positive semidefinite matrices denoted by \(\mathbb{S}_{+}^{n}\). SDPs are typically formulated as
\[\begin{equation}\label{eq:sdp-lmi} \begin{array}{cl} \minimize{x \in \mathbb{R}^n} & \transpose{c}x \\ \st & F_0 + x_1F_1 + \cdots + x_n F_n \preceq \zero \\ & Gx = h \end{array} \end{equation}\]where \(c \inR^n, G \inR^{p\times n}, h \inR^p\), and each \(F_i\) is an \(m \times m\) real symmetric matrix, denoted by \(F_i \inS^m\). The inequality constraint is known as a linear matrix inequality (LMI), which is a generalized inequality with respect to the partial order defined by the convex cone of symmetric positive definite matrices. In addition, while the decision variable in the SDP above is the vector \(x \inR^n\), they can often appear as matrices (not even necessarily symmetric matrices) in an SDP. For example, for \(i \in \{1, \dots, N\}\), let \(A_i \inR^{n \times n}, B_i \inR^{n \times n}, C_i \inR^{n \times m}\) be constant coefficient matrices, and let \(X \inS^n_{++}\) and \(Y \inR^{m \times n}\) be unknown decision variables, then
\[F(X,Y) \define F_0 + \sum_{i = 1}^N A_i X B_i + \transpose{B}_i X \transpose{A}_i + C_i Y + \transpose{Y} \transpose{C}_i \succeq \zero\]is a perfectly valid LMI in \(X\) and \(Y\). This form of LMIs often appears in SDPs used in, e.g., systems control theory.
Equivalent SDP formulation using trace of a square matrix
The goal of this post is to show that, another SDP formulation, now written with the trace of a matrix, denoted by \(\trace(\cdot)\),
\[\begin{equation}\label{eq:sdp-trace} \begin{array}{cl} \minimize{X \in \mathbb{S}^{\bullet}_+} & \trace(CX) \\ \st & \trace(A_k X) = b_k, \quad k \in \{1, \dots, q\}\\ & X \succeq \zero \end{array} \end{equation}\]is in fact equivalent to \eqref{eq:sdp-lmi}, and we show how to establish this equivalence by appropriately constructing the decision variable \(X\), and the coefficients \(C, A_k\), and \(b_k\). We leave the dimension of the new decision variable \(X\) vague on purpose for now, as the construction below will clarify it. Now, let us recall that the trace of a square matrix is defined as the sum of its diagonal elements, and is connected to the standard Frobenius inner product on \(\mathbb{R}^{m \times n}\) via
\[\trace(\transpose{X}Y) = \trace(X\transpose{Y}) = \langle X, Y \rangle = \sum_{i=1}^{m}\sum_{j=1}^n X_{ij}Y_{ij}.\]There are many additional properties of the matrix trace and its generalization that we will not discuss here, and we include some relevant references at the end of this post.
We are now ready to convert \eqref{eq:sdp-lmi} into \eqref{eq:sdp-trace}. First, from the left-hand side of the LMI in \eqref{eq:sdp-lmi}, define a slack variable from the left-hand side of the LMI in \eqref{eq:sdp-lmi}:
\[\begin{equation}\label{eq:X-tilde} \tilde{X} \define -F_0 - \sum_{i=1}^n x_i F_i. \end{equation}\]We further decompose the decision variable \(x\) element-wise by its nonnegative part and its nonpositive part, i.e., \(x = x^+ - x^-\), in which both \(x^+\) and \(x^-\) are non-negative element-wise. (Of course, such decomposition is not unique, and we just pick one). We also denote the diagonal matrix formed by placing the elements of a vector \(v\) on its main diagonal as \(\diag(v)\). We claim that the block diagonal matrix
\[X \define \begin{bmatrix} -\tilde{X} & \\ & \diag(x^+) \\ & & \diag(x^-) \end{bmatrix} \inR^{(m+2n) \times (m+2n)}\]is exactly the decision variable in \eqref{eq:sdp-trace}. As a sanity check, since \(F(x) \preceq \zero\) in \eqref{eq:sdp-lmi}, the new decision variable \(X\) above is positive semidefinite, and should be a sparse matrix provided that \(m\) is not significantly larger than \(n\). Analogous to the decomposition of \(x\), we can decompose the cost vector \(c\) in \eqref{eq:sdp-lmi} as \(c = c^+ - c^-\), in which both \(c^+\) and \(c^-\) are non-negative element-wise, and construct the diagonal cost matrix
\[C \define \begin{bmatrix} \zero_{m\times m} & \\ & \diag(c^+) \\ & & \diag(c^-) \end{bmatrix} \inR^{(m+2n) \times (m+2n)}\]which satisfies \(\trace(CX) = \transpose{c}x\). We can now compare against \eqref{eq:sdp-lmi}, and see that we need to handle the \(q\) equality constraints in \eqref{eq:sdp-trace}, which has two sources. The first one is due to the definition of the slack variable \(\tilde{X}\) in \eqref{eq:X-tilde}, which must be satisfied element-wise. Thus, given the coefficient matrices \(F_i \inS^{m}\), we need to introduce (at most) \(\frac{m(m+1)}{2}\) equality constraints as follows: define the constant matrices
\[\begin{equation} \label{eq:F-tilde} \tilde{F}_{j\ell} \define \begin{bmatrix} \Delta_{\ell j} & \\ & \diag(F_{i,j \ell}) \\ & & -\diag(F_{i,j \ell}) \end{bmatrix} \inR^{(m+2n)\times (m+2n)}, \end{equation}\]where \(\Delta_{\ell j} \inR^{m \times m}\) is the indicator matrix with \(1\) at the \(\ell, j\)-th element and zero everywhere else, and
\[\diag(F_{i,j \ell}) \define \begin{bmatrix} (F_1)_{j\ell} & \\ & (F_2)_{j\ell} & \\ & & \ddots & \\ & & & (F_n)_{j\ell} & \\ \end{bmatrix} \inR^{n \times n}\]is a diagonal matrix with the \(j,\ell\)-th elements of matrices \(F_1, \dots, F_n\) on its main diagonal. Thus, we can verify that
\[\begin{align} \trace(\tilde{F}_{j\ell} X) &= \trace(\Delta_{\ell j} \tilde{X}) + \sum_{i=1}^n x_i^+ (F_i)_{j\ell} - \sum_{i=1}^n x_i^- (F_i)_{j\ell} \nonumber \\ &= \tilde{X}_{j\ell} + \sum_{i=1}^n(x_i^+ - x_i^-) (F_i)_{j\ell} \nonumber \\ &= -(F_0)_{j\ell} - \sum_{i=1}^n x_i (F_i)_{j\ell} + \sum_{i=1}^n x_i (F_i)_{j\ell} \nonumber \\ &= -(F_0)_{j\ell} \label{eq:eq-constraint-lmi} \end{align}\]There are a total of \(\frac{m(m+1)}{2}\) constraints like \eqref{eq:eq-constraint-lmi}, and they form the first part of the equality trace constraints in \eqref{eq:sdp-trace}. The second and the last part comes from \(Gx = h\) in \eqref{eq:sdp-lmi}, which can be transformed by constructing
\[\begin{equation} \label{eq:G-tilde} \tilde{G}_{i} \define \begin{bmatrix} \zero_{m \times m} & \\ & \diag(\transpose{\e}_i G) \\ & & -\diag(\transpose{\e}_i G) \end{bmatrix} \inR^{(m+2n)\times (m+2n)} \end{equation}\]in which \(e_i \inR^p\) is the \(i\)-th unit vector, thus \(\transpose{\e}_i G \inR^n\) selects the \(i\)-th row of \(G\). For \(i \in \{1, \dots, p\}\), \(\transpose{\e}_i G x = h_i\), so \(Gx = h\) in \eqref{eq:sdp-lmi} is equivalent to the \(p\) trace constraints
\[\tilde{G}_i X = h_i.\]Thus, from \eqref{eq:sdp-trace}, we have a total of
\[q = \frac{m(m+1)}{2} + p\]trace equality constraints of the form \(\trace(A_kX) = b_k\), in which the new coefficient matrices \(A_k\) correspond to \(\tilde{F}_{j\ell}\) and \(\tilde{G}_i\) constructed in \eqref{eq:F-tilde} and \eqref{eq:G-tilde}, respectively, while the coefficients \(b_k\) correspond to \(-(F_0)_{j\ell}\) and \(h_i\), respectively.
Further readings and references
- General introduction and basic examples of SDPs: Chapter 4.6.2 of Convex Optimization
- SIAM survey of SDPs and numerical solution methods
- LMIs in systems and control theory book
- Further trace properties and generalizations: Chapter 8.D of Linear Algebra Done Right
Last updated: 2025-07-22 15:32 EST