In statistics, the matrix variate Dirichlet distribution is a generalization of the matrix variate beta distribution and of the Dirichlet distribution.
Suppose $U_1,\ldots,U_r$ are $p\times p$ positive definite matrices with $I_p-\sum_{i=1}^{r}U_i$ also positive definite, where $I_p$ is the $p\times p$ identity matrix. Then we say that the $U_i$ have a matrix variate Dirichlet distribution, $\left(U_1,\ldots,U_r\right)\sim D_p\left(a_1,\ldots,a_r;a_{r+1}\right)$, if their joint probability density function is
$$\left\{\beta_p\left(a_1,\ldots,a_r,a_{r+1}\right)\right\}^{-1}\prod_{i=1}^{r}\det\left(U_i\right)^{a_i-(p+1)/2}\det\left(I_p-\sum_{i=1}^{r}U_i\right)^{a_{r+1}-(p+1)/2}$$
where $a_i>(p-1)/2$, $i=1,\ldots,r+1$, and $\beta_p\left(\cdots\right)$ is the multivariate beta function.
If we write $U_{r+1}=I_p-\sum_{i=1}^{r}U_i$ then the PDF takes the simpler form
$$\left\{\beta_p\left(a_1,\ldots,a_{r+1}\right)\right\}^{-1}\prod_{i=1}^{r+1}\det\left(U_i\right)^{a_i-(p+1)/2},$$
on the understanding that $\sum_{i=1}^{r+1}U_i=I_p$.
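The density can be evaluated numerically on the log scale. The following is a minimal sketch using only the Python standard library, with matrices given as nested lists. It assumes the multivariate beta function is the usual ratio of multivariate gamma functions, $\beta_p(a_1,\ldots,a_{r+1})=\prod_i\Gamma_p(a_i)/\Gamma_p\left(\sum_i a_i\right)$; the function names (`cholesky`, `logdet`, `log_multigamma`, `log_multibeta`, `matrix_dirichlet_logpdf`) are illustrative and do not come from any library.

```python
import math


def cholesky(a):
    """Lower-triangular L with a = L L^T; raises ValueError if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def logdet(a):
    """log det(a) for a symmetric positive definite matrix, via the Cholesky diagonal."""
    low = cholesky(a)
    return 2.0 * sum(math.log(low[i][i]) for i in range(len(a)))


def log_multigamma(a, p):
    """log Gamma_p(a) = p(p-1)/4 * log(pi) + sum_{j=0}^{p-1} log Gamma(a - j/2)."""
    return p * (p - 1) / 4.0 * math.log(math.pi) + sum(
        math.lgamma(a - j / 2.0) for j in range(p)
    )


def log_multibeta(alphas, p):
    """log beta_p(a_1, ..., a_{r+1}), assumed to be prod Gamma_p(a_i) / Gamma_p(sum a_i)."""
    return sum(log_multigamma(a, p) for a in alphas) - log_multigamma(sum(alphas), p)


def matrix_dirichlet_logpdf(us, alphas):
    """Log-density at (U_1, ..., U_r) with parameters (a_1, ..., a_{r+1})."""
    p = len(us[0])
    if len(alphas) != len(us) + 1:
        raise ValueError("need exactly r + 1 parameters for r matrices")
    if any(a <= (p - 1) / 2.0 for a in alphas):
        raise ValueError("each a_i must exceed (p - 1) / 2")
    # U_{r+1} = I_p - sum_i U_i; logdet raises if it is not positive definite.
    rest = [
        [(1.0 if i == j else 0.0) - sum(u[i][j] for u in us) for j in range(p)]
        for i in range(p)
    ]
    exponent_shift = (p + 1) / 2.0
    log_kernel = sum(
        (a - exponent_shift) * logdet(u) for u, a in zip(list(us) + [rest], alphas)
    )
    return log_kernel - log_multibeta(alphas, p)
```

For $p=1$ and $r=1$ the function reduces to the ordinary beta log-density, which gives a quick sanity check: `matrix_dirichlet_logpdf([[[0.3]]], [2.0, 3.0])` should agree with the logarithm of the Beta(2, 3) density at 0.3.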
Theorems
Generalization of chi square-Dirichlet result
Suppose $S_i\sim W_p\left(n_i,\Sigma\right)$, $i=1,\ldots,r+1$, are independently distributed Wishart $p\times p$ positive definite matrices. Then, defining $U_i=S^{-1/2}S_i\left(S^{-1/2}\right)^{T}$ (where $S=\sum_{i=1}^{r+1}S_i$ is the sum of the matrices and $S^{1/2}\left(S^{1/2}\right)^{T}$ is any reasonable factorization of $S$), we have
$$\left(U_1,\ldots,U_r\right)\sim D_p\left(n_1/2,\ldots,n_{r+1}/2\right).$$
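This construction suggests a simple way to draw samples. The sketch below assumes NumPy is available, integer degrees of freedom $n_i\geq p$ (so that each Wishart matrix is positive definite with probability one), and the Cholesky factor of $S$ as the chosen factorization $S^{1/2}$. The function name `dirichlet_from_wisharts` is hypothetical.

```python
import numpy as np


def dirichlet_from_wisharts(dofs, sigma, rng):
    """Return [U_1, ..., U_{r+1}] built from independent Wishart draws.

    dofs  : integer degrees of freedom n_1, ..., n_{r+1}, each >= p
    sigma : p x p positive definite scale matrix
    rng   : numpy.random.Generator
    """
    p = sigma.shape[0]
    chol_sigma = np.linalg.cholesky(sigma)
    wisharts = []
    for n in dofs:
        # Rows of x are independent N(0, sigma) vectors, so x^T x ~ W_p(n, sigma).
        x = rng.standard_normal((n, p)) @ chol_sigma.T
        wisharts.append(x.T @ x)
    total = sum(wisharts)
    # Take S^{1/2} to be the lower Cholesky factor L, so S = L L^T.
    low = np.linalg.cholesky(total)
    low_inv = np.linalg.inv(low)
    # U_i = S^{-1/2} S_i (S^{-1/2})^T; by construction the U_i sum to I_p.
    return [low_inv @ s @ low_inv.T for s in wisharts]


rng = np.random.default_rng(0)
us = dirichlet_from_wisharts([5, 6, 7], np.eye(3), rng)
print(np.allclose(sum(us), np.eye(3)))
```

The first $r$ matrices of the returned list play the role of $\left(U_1,\ldots,U_r\right)$, and the last one is $U_{r+1}=I_p-\sum_{i=1}^{r}U_i$. A symmetric square root of $S$ could be used in place of the Cholesky factor, since the statement allows any reasonable factorization.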
Marginal distribution
If $\left(U_1,\ldots,U_r\right)\sim D_p\left(a_1,\ldots,a_{r+1}\right)$, and if $s\leq r$, then:
$$\left(U_1,\ldots,U_s\right)\sim D_p\left(a_1,\ldots,a_s,\sum_{i=s+1}^{r+1}a_i\right)$$
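As a worked special case, taking $s=1$ leaves a single matrix, and the marginal density follows from the general density with $r=1$. The symbol $b$ is introduced here only for the pooled parameter:

```latex
U_1 \sim D_p\left(a_1; b\right), \qquad b = \sum_{i=2}^{r+1} a_i,
\qquad
f(U_1) = \frac{\det\left(U_1\right)^{a_1-(p+1)/2}\,
               \det\left(I_p-U_1\right)^{b-(p+1)/2}}{\beta_p\left(a_1,b\right)} .
```

Here $U_1$ and $I_p-U_1$ are both positive definite. This is the form of the matrix variate beta density, consistent with the Dirichlet distribution being a generalization of it.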
Conditional distribution
Also, with the same notation as above, the density of $\left(U_{s+1},\ldots,U_r\right)\mid\left(U_1,\ldots,U_s\right)$ is given by
$$\frac{\prod_{i=s+1}^{r+1}\det\left(U_i\right)^{a_i-(p+1)/2}}{\beta_p\left(a_{s+1},\ldots,a_{r+1}\right)\det\left(I_p-\sum_{i=1}^{s}U_i\right)^{\sum_{i=s+1}^{r+1}a_i-(p+1)/2}}$$
where we write $U_{r+1}=I_p-\sum_{i=1}^{r}U_i$.
Partitioned distribution
Suppose $\left(U_1,\ldots,U_r\right)\sim D_p\left(a_1,\ldots,a_{r+1}\right)$ and suppose that $S_1,\ldots,S_t$ is a partition of $\left[r+1\right]=\left\{1,\ldots,r+1\right\}$ (that is, $\cup_{i=1}^{t}S_i=\left[r+1\right]$ and $S_i\cap S_j=\emptyset$ if $i\neq j$). Then, writing $U_{(j)}=\sum_{i\in S_j}U_i$ and $a_{(j)}=\sum_{i\in S_j}a_i$ (with $U_{r+1}=I_p-\sum_{i=1}^{r}U_i$), we have:
$$\left(U_{(1)},\ldots,U_{(t)}\right)\sim D_p\left(a_{(1)},\ldots,a_{(t)}\right).$$
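A small illustration of the statement, with the choice of $r$ and of the partition made here purely as an example:

```latex
r = 3,\quad S_1=\{1,2\},\quad S_2=\{3,4\}:
\qquad
U_{(1)} = U_1+U_2,\quad U_{(2)} = U_3+U_4 = I_p-U_{(1)},
\qquad
\left(U_{(1)},U_{(2)}\right)\sim D_p\left(a_1+a_2,\;a_3+a_4\right).
```

Since $U_{(2)}$ is determined by $U_{(1)}$, this says that the sum $U_1+U_2$ alone follows the two-parameter case of the distribution with parameters $a_1+a_2$ and $a_3+a_4$.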
Partitions
Suppose $\left(U_1,\ldots,U_r\right)\sim D_p\left(a_1,\ldots,a_{r+1}\right)$. Define
$$U_i=\begin{pmatrix}U_{11(i)}&U_{12(i)}\\U_{21(i)}&U_{22(i)}\end{pmatrix}\qquad i=1,\ldots,r$$
where $U_{11(i)}$ is $p_1\times p_1$ and $U_{22(i)}$ is $p_2\times p_2$. Writing the Schur complement $U_{22\cdot 1(i)}=U_{22(i)}-U_{21(i)}U_{11(i)}^{-1}U_{12(i)}$ we have
$$\left(U_{11(1)},\ldots,U_{11(r)}\right)\sim D_{p_1}\left(a_1,\ldots,a_{r+1}\right)$$
and
$$\left(U_{22\cdot 1(1)},\ldots,U_{22\cdot 1(r)}\right)\sim D_{p_2}\left(a_1-p_1/2,\ldots,a_r-p_1/2,a_{r+1}-p_1/2+p_1r/2\right).$$
References
A. K. Gupta and D. K. Nagar (1999). "Matrix Variate Distributions". Chapman and Hall.