Misplaced Pages

Scaled inverse chi-squared distribution

Article snapshot taken from[REDACTED] with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
Probability distribution
Scaled inverse chi-squared
Probability density function
Cumulative distribution function
Parameters ν > 0 {\displaystyle \nu >0\,}
τ 2 > 0 {\displaystyle \tau ^{2}>0\,}
Support x ( 0 , ) {\displaystyle x\in (0,\infty )}
PDF ( τ 2 ν / 2 ) ν / 2 Γ ( ν / 2 )   exp [ ν τ 2 2 x ] x 1 + ν / 2 {\displaystyle {\frac {(\tau ^{2}\nu /2)^{\nu /2}}{\Gamma (\nu /2)}}~{\frac {\exp \left}{x^{1+\nu /2}}}}
CDF Γ ( ν 2 , τ 2 ν 2 x ) / Γ ( ν 2 ) {\displaystyle \Gamma \left({\frac {\nu }{2}},{\frac {\tau ^{2}\nu }{2x}}\right)\left/\Gamma \left({\frac {\nu }{2}}\right)\right.}
Mean ν τ 2 ν 2 {\displaystyle {\frac {\nu \tau ^{2}}{\nu -2}}} for ν > 2 {\displaystyle \nu >2\,}
Mode ν τ 2 ν + 2 {\displaystyle {\frac {\nu \tau ^{2}}{\nu +2}}}
Variance 2 ν 2 τ 4 ( ν 2 ) 2 ( ν 4 ) {\displaystyle {\frac {2\nu ^{2}\tau ^{4}}{(\nu -2)^{2}(\nu -4)}}} for ν > 4 {\displaystyle \nu >4\,}
Skewness 4 ν 6 2 ( ν 4 ) {\displaystyle {\frac {4}{\nu -6}}{\sqrt {2(\nu -4)}}} for ν > 6 {\displaystyle \nu >6\,}
Excess kurtosis 12 ( 5 ν 22 ) ( ν 6 ) ( ν 8 ) {\displaystyle {\frac {12(5\nu -22)}{(\nu -6)(\nu -8)}}} for ν > 8 {\displaystyle \nu >8\,}
Entropy

ν 2 + ln ( τ 2 ν 2 Γ ( ν 2 ) ) {\displaystyle {\frac {\nu }{2}}\!+\!\ln \left({\frac {\tau ^{2}\nu }{2}}\Gamma \left({\frac {\nu }{2}}\right)\right)}

( 1 + ν 2 ) ψ ( ν 2 ) {\displaystyle \!-\!\left(1\!+\!{\frac {\nu }{2}}\right)\psi \left({\frac {\nu }{2}}\right)}
MGF 2 Γ ( ν 2 ) ( τ 2 ν t 2 ) ν 4 K ν 2 ( 2 τ 2 ν t ) {\displaystyle {\frac {2}{\Gamma ({\frac {\nu }{2}})}}\left({\frac {-\tau ^{2}\nu t}{2}}\right)^{\!\!{\frac {\nu }{4}}}\!\!K_{\frac {\nu }{2}}\left({\sqrt {-2\tau ^{2}\nu t}}\right)}
CF 2 Γ ( ν 2 ) ( i τ 2 ν t 2 ) ν 4 K ν 2 ( 2 i τ 2 ν t ) {\displaystyle {\frac {2}{\Gamma ({\frac {\nu }{2}})}}\left({\frac {-i\tau ^{2}\nu t}{2}}\right)^{\!\!{\frac {\nu }{4}}}\!\!K_{\frac {\nu }{2}}\left({\sqrt {-2i\tau ^{2}\nu t}}\right)}

The scaled inverse chi-squared distribution ψ inv- χ 2 ( ν ) {\displaystyle \psi \,{\mbox{inv-}}\chi ^{2}(\nu )} , where ψ {\displaystyle \psi } is the scale parameter, equals the univariate inverse Wishart distribution W 1 ( ψ , ν ) {\displaystyle {\mathcal {W}}^{-1}(\psi ,\nu )} with degrees of freedom ν {\displaystyle \nu } .

This family of scaled inverse chi-squared distributions is linked to the inverse-chi-squared distribution and to the chi-squared distribution:

If X ψ inv- χ 2 ( ν ) {\displaystyle X\sim \psi \,{\mbox{inv-}}\chi ^{2}(\nu )} then X / ψ inv- χ 2 ( ν ) {\displaystyle X/\psi \sim {\mbox{inv-}}\chi ^{2}(\nu )} as well as ψ / X χ 2 ( ν ) {\displaystyle \psi /X\sim \chi ^{2}(\nu )} and 1 / X ψ 1 χ 2 ( ν ) {\displaystyle 1/X\sim \psi ^{-1}\chi ^{2}(\nu )} .

Instead of ψ {\displaystyle \psi } , the scaled inverse chi-squared distribution is however most frequently parametrized by the scale parameter τ 2 = ψ / ν {\displaystyle \tau ^{2}=\psi /\nu } and the distribution ν τ 2 inv- χ 2 ( ν ) {\displaystyle \nu \tau ^{2}\,{\mbox{inv-}}\chi ^{2}(\nu )} is denoted by Scale-inv- χ 2 ( ν , τ 2 ) {\displaystyle {\mbox{Scale-inv-}}\chi ^{2}(\nu ,\tau ^{2})} .


In terms of τ 2 {\displaystyle \tau ^{2}} the above relations can be written as follows:

If X Scale-inv- χ 2 ( ν , τ 2 ) {\displaystyle X\sim {\mbox{Scale-inv-}}\chi ^{2}(\nu ,\tau ^{2})} then X ν τ 2 inv- χ 2 ( ν ) {\displaystyle {\frac {X}{\nu \tau ^{2}}}\sim {\mbox{inv-}}\chi ^{2}(\nu )} as well as ν τ 2 X χ 2 ( ν ) {\displaystyle {\frac {\nu \tau ^{2}}{X}}\sim \chi ^{2}(\nu )} and 1 / X 1 ν τ 2 χ 2 ( ν ) {\displaystyle 1/X\sim {\frac {1}{\nu \tau ^{2}}}\chi ^{2}(\nu )} .


This family of scaled inverse chi-squared distributions is a reparametrization of the inverse-gamma distribution.

Specifically, if

X ψ inv- χ 2 ( ν ) = Scale-inv- χ 2 ( ν , τ 2 ) {\displaystyle X\sim \psi \,{\mbox{inv-}}\chi ^{2}(\nu )={\mbox{Scale-inv-}}\chi ^{2}(\nu ,\tau ^{2})}   then   X Inv-Gamma ( ν 2 , ψ 2 ) = Inv-Gamma ( ν 2 , ν τ 2 2 ) {\displaystyle X\sim {\textrm {Inv-Gamma}}\left({\frac {\nu }{2}},{\frac {\psi }{2}}\right)={\textrm {Inv-Gamma}}\left({\frac {\nu }{2}},{\frac {\nu \tau ^{2}}{2}}\right)}


Either form may be used to represent the maximum entropy distribution for a fixed first inverse moment ( E ( 1 / X ) ) {\displaystyle (E(1/X))} and first logarithmic moment ( E ( ln ( X ) ) {\displaystyle (E(\ln(X))} .

The scaled inverse chi-squared distribution also has a particular use in Bayesian statistics. Specifically, the scaled inverse chi-squared distribution can be used as a conjugate prior for the variance parameter of a normal distribution. The same prior in alternative parametrization is given by the inverse-gamma distribution.

Characterization

The probability density function of the scaled inverse chi-squared distribution extends over the domain x > 0 {\displaystyle x>0} and is

f ( x ; ν , τ 2 ) = ( τ 2 ν / 2 ) ν / 2 Γ ( ν / 2 )   exp [ ν τ 2 2 x ] x 1 + ν / 2 {\displaystyle f(x;\nu ,\tau ^{2})={\frac {(\tau ^{2}\nu /2)^{\nu /2}}{\Gamma (\nu /2)}}~{\frac {\exp \left}{x^{1+\nu /2}}}}

where ν {\displaystyle \nu } is the degrees of freedom parameter and τ 2 {\displaystyle \tau ^{2}} is the scale parameter. The cumulative distribution function is

F ( x ; ν , τ 2 ) = Γ ( ν 2 , τ 2 ν 2 x ) / Γ ( ν 2 ) {\displaystyle F(x;\nu ,\tau ^{2})=\Gamma \left({\frac {\nu }{2}},{\frac {\tau ^{2}\nu }{2x}}\right)\left/\Gamma \left({\frac {\nu }{2}}\right)\right.}
= Q ( ν 2 , τ 2 ν 2 x ) {\displaystyle =Q\left({\frac {\nu }{2}},{\frac {\tau ^{2}\nu }{2x}}\right)}

where Γ ( a , x ) {\displaystyle \Gamma (a,x)} is the incomplete gamma function, Γ ( x ) {\displaystyle \Gamma (x)} is the gamma function and Q ( a , x ) {\displaystyle Q(a,x)} is a regularized gamma function. The characteristic function is

φ ( t ; ν , τ 2 ) = {\displaystyle \varphi (t;\nu ,\tau ^{2})=}
2 Γ ( ν 2 ) ( i τ 2 ν t 2 ) ν 4 K ν 2 ( 2 i τ 2 ν t ) , {\displaystyle {\frac {2}{\Gamma ({\frac {\nu }{2}})}}\left({\frac {-i\tau ^{2}\nu t}{2}}\right)^{\!\!{\frac {\nu }{4}}}\!\!K_{\frac {\nu }{2}}\left({\sqrt {-2i\tau ^{2}\nu t}}\right),}

where K ν 2 ( z ) {\displaystyle K_{\frac {\nu }{2}}(z)} is the modified Bessel function of the second kind.

Parameter estimation

The maximum likelihood estimate of τ 2 {\displaystyle \tau ^{2}} is

τ 2 = n / i = 1 n 1 x i . {\displaystyle \tau ^{2}=n/\sum _{i=1}^{n}{\frac {1}{x_{i}}}.}

The maximum likelihood estimate of ν 2 {\displaystyle {\frac {\nu }{2}}} can be found using Newton's method on:

ln ( ν 2 ) ψ ( ν 2 ) = 1 n i = 1 n ln ( x i ) ln ( τ 2 ) , {\displaystyle \ln \left({\frac {\nu }{2}}\right)-\psi \left({\frac {\nu }{2}}\right)={\frac {1}{n}}\sum _{i=1}^{n}\ln \left(x_{i}\right)-\ln \left(\tau ^{2}\right),}

where ψ ( x ) {\displaystyle \psi (x)} is the digamma function. An initial estimate can be found by taking the formula for mean and solving it for ν . {\displaystyle \nu .} Let x ¯ = 1 n i = 1 n x i {\displaystyle {\bar {x}}={\frac {1}{n}}\sum _{i=1}^{n}x_{i}} be the sample mean. Then an initial estimate for ν {\displaystyle \nu } is given by:

ν 2 = x ¯ x ¯ τ 2 . {\displaystyle {\frac {\nu }{2}}={\frac {\bar {x}}{{\bar {x}}-\tau ^{2}}}.}

Bayesian estimation of the variance of a normal distribution

The scaled inverse chi-squared distribution has a second important application, in the Bayesian estimation of the variance of a Normal distribution.

According to Bayes' theorem, the posterior probability distribution for quantities of interest is proportional to the product of a prior distribution for the quantities and a likelihood function:

p ( σ 2 | D , I ) p ( σ 2 | I ) p ( D | σ 2 ) {\displaystyle p(\sigma ^{2}|D,I)\propto p(\sigma ^{2}|I)\;p(D|\sigma ^{2})}

where D represents the data and I represents any initial information about σ that we may already have.

The simplest scenario arises if the mean μ is already known; or, alternatively, if it is the conditional distribution of σ that is sought, for a particular assumed value of μ.

Then the likelihood term L(σ|D) = p(D|σ) has the familiar form

L ( σ 2 | D , μ ) = 1 ( 2 π σ ) n exp [ i n ( x i μ ) 2 2 σ 2 ] {\displaystyle {\mathcal {L}}(\sigma ^{2}|D,\mu )={\frac {1}{\left({\sqrt {2\pi }}\sigma \right)^{n}}}\;\exp \left}

Combining this with the rescaling-invariant prior p(σ|I) = 1/σ, which can be argued (e.g. following Jeffreys) to be the least informative possible prior for σ in this problem, gives a combined posterior probability

p ( σ 2 | D , I , μ ) 1 σ n + 2 exp [ i n ( x i μ ) 2 2 σ 2 ] {\displaystyle p(\sigma ^{2}|D,I,\mu )\propto {\frac {1}{\sigma ^{n+2}}}\;\exp \left}

This form can be recognised as that of a scaled inverse chi-squared distribution, with parameters ν = n and τ = s = (1/n) Σ (xi-μ)

Gelman and co-authors remark that the re-appearance of this distribution, previously seen in a sampling context, may seem remarkable; but given the choice of prior "this result is not surprising."

In particular, the choice of a rescaling-invariant prior for σ has the result that the probability for the ratio of σ / s has the same form (independent of the conditioning variable) when conditioned on s as when conditioned on σ:

p ( σ 2 s 2 | s 2 ) = p ( σ 2 s 2 | σ 2 ) {\displaystyle p({\tfrac {\sigma ^{2}}{s^{2}}}|s^{2})=p({\tfrac {\sigma ^{2}}{s^{2}}}|\sigma ^{2})}

In the sampling-theory case, conditioned on σ, the probability distribution for (1/s) is a scaled inverse chi-squared distribution; and so the probability distribution for σ conditioned on s, given a scale-agnostic prior, is also a scaled inverse chi-squared distribution.

Use as an informative prior

If more is known about the possible values of σ, a distribution from the scaled inverse chi-squared family, such as Scale-inv-χ(n0, s0) can be a convenient form to represent a more informative prior for σ, as if from the result of n0 previous observations (though n0 need not necessarily be a whole number):

p ( σ 2 | I , μ ) 1 σ n 0 + 2 exp [ n 0 s 0 2 2 σ 2 ] {\displaystyle p(\sigma ^{2}|I^{\prime },\mu )\propto {\frac {1}{\sigma ^{n_{0}+2}}}\;\exp \left}

Such a prior would lead to the posterior distribution

p ( σ 2 | D , I , μ ) 1 σ n + n 0 + 2 exp [ n s 2 + n 0 s 0 2 2 σ 2 ] {\displaystyle p(\sigma ^{2}|D,I^{\prime },\mu )\propto {\frac {1}{\sigma ^{n+n_{0}+2}}}\;\exp \left}

which is itself a scaled inverse chi-squared distribution. The scaled inverse chi-squared distributions are thus a convenient conjugate prior family for σ estimation.

Estimation of variance when mean is unknown

If the mean is not known, the most uninformative prior that can be taken for it is arguably the translation-invariant prior p(μ|I) ∝ const., which gives the following joint posterior distribution for μ and σ,

p ( μ , σ 2 D , I ) 1 σ n + 2 exp [ i n ( x i μ ) 2 2 σ 2 ] = 1 σ n + 2 exp [ i n ( x i x ¯ ) 2 2 σ 2 ] exp [ n ( μ x ¯ ) 2 2 σ 2 ] {\displaystyle {\begin{aligned}p(\mu ,\sigma ^{2}\mid D,I)&\propto {\frac {1}{\sigma ^{n+2}}}\exp \left\\&={\frac {1}{\sigma ^{n+2}}}\exp \left\exp \left\end{aligned}}}

The marginal posterior distribution for σ is obtained from the joint posterior distribution by integrating out over μ,

p ( σ 2 | D , I ) 1 σ n + 2 exp [ i n ( x i x ¯ ) 2 2 σ 2 ] exp [ n ( μ x ¯ ) 2 2 σ 2 ] d μ = 1 σ n + 2 exp [ i n ( x i x ¯ ) 2 2 σ 2 ] 2 π σ 2 / n ( σ 2 ) ( n + 1 ) / 2 exp [ ( n 1 ) s 2 2 σ 2 ] {\displaystyle {\begin{aligned}p(\sigma ^{2}|D,I)\;\propto \;&{\frac {1}{\sigma ^{n+2}}}\;\exp \left\;\int _{-\infty }^{\infty }\exp \leftd\mu \\=\;&{\frac {1}{\sigma ^{n+2}}}\;\exp \left\;{\sqrt {2\pi \sigma ^{2}/n}}\\\propto \;&(\sigma ^{2})^{-(n+1)/2}\;\exp \left\end{aligned}}}

This is again a scaled inverse chi-squared distribution, with parameters n 1 {\displaystyle \scriptstyle {n-1}\;} and s 2 = ( x i x ¯ ) 2 / ( n 1 ) {\displaystyle \scriptstyle {s^{2}=\sum (x_{i}-{\bar {x}})^{2}/(n-1)}} .

Related distributions

  • If X Scale-inv- χ 2 ( ν , τ 2 ) {\displaystyle X\sim {\mbox{Scale-inv-}}\chi ^{2}(\nu ,\tau ^{2})} then k X Scale-inv- χ 2 ( ν , k τ 2 ) {\displaystyle kX\sim {\mbox{Scale-inv-}}\chi ^{2}(\nu ,k\tau ^{2})\,}
  • If X inv- χ 2 ( ν ) {\displaystyle X\sim {\mbox{inv-}}\chi ^{2}(\nu )\,} (Inverse-chi-squared distribution) then X Scale-inv- χ 2 ( ν , 1 / ν ) {\displaystyle X\sim {\mbox{Scale-inv-}}\chi ^{2}(\nu ,1/\nu )\,}
  • If X Scale-inv- χ 2 ( ν , τ 2 ) {\displaystyle X\sim {\mbox{Scale-inv-}}\chi ^{2}(\nu ,\tau ^{2})} then X τ 2 ν inv- χ 2 ( ν ) {\displaystyle {\frac {X}{\tau ^{2}\nu }}\sim {\mbox{inv-}}\chi ^{2}(\nu )\,} (Inverse-chi-squared distribution)
  • If X Scale-inv- χ 2 ( ν , τ 2 ) {\displaystyle X\sim {\mbox{Scale-inv-}}\chi ^{2}(\nu ,\tau ^{2})} then X Inv-Gamma ( ν 2 , ν τ 2 2 ) {\displaystyle X\sim {\textrm {Inv-Gamma}}\left({\frac {\nu }{2}},{\frac {\nu \tau ^{2}}{2}}\right)} (Inverse-gamma distribution)
  • Scaled inverse chi square distribution is a special case of type 5 Pearson distribution

References

  • Gelman, Andrew; et al. (2014). Bayesian Data Analysis (Third ed.). Boca Raton: CRC Press. p. 583. ISBN 978-1-4398-4095-5.
  1. Gelman, Andrew; et al. (2014). Bayesian Data Analysis (Third ed.). Boca Raton: CRC Press. p. 65. ISBN 978-1-4398-4095-5.
Probability distributions (list)
Discrete
univariate
with finite
support
with infinite
support
Continuous
univariate
supported on a
bounded interval
supported on a
semi-infinite
interval
supported
on the whole
real line
with support
whose type varies
Mixed
univariate
continuous-
discrete
Multivariate
(joint)
Directional
Univariate (circular) directional
Circular uniform
Univariate von Mises
Wrapped normal
Wrapped Cauchy
Wrapped exponential
Wrapped asymmetric Laplace
Wrapped Lévy
Bivariate (spherical)
Kent
Bivariate (toroidal)
Bivariate von Mises
Multivariate
von Mises–Fisher
Bingham
Degenerate
and singular
Degenerate
Dirac delta function
Singular
Cantor
Families
Categories:
Scaled inverse chi-squared distribution Add topic