Notation in probability and statistics: Difference between revisions


Revision as of 21:33, 10 September 2020 editMiaumee (talk \| contribs)Extended confirmed users765 edits Improve C/E. + inline ref. + "Bibliography".Tags: Reverted Visual edit ← Previous edit		Revision as of 15:36, 21 September 2020 edit undoJayBeeEll (talk \| contribs)Extended confirmed users, New page reviewers28,266 edits Undid revision 977769409 by Miaumee (talk) Per User talk:Miaumee, this is apparently the preferred response to poor editingTag: Undo Next edit →
Line 4:		Line 4:

	==Probability theory==		==Probability theory==
	* ]s are usually written in ] roman letters: ''X'', ''Y~~'', ''Z'', ''T~~'', etc.<ref name=":0">{{Cite web\|date=2020-04-26\|title=List of Probability and Statistics Symbols\|url=https://mathvault.ca/hub/higher-math/math-symbols/probability-statistics-symbols/\|access-date=2020-09-10\|website=Math Vault\|language=en-US}}</ref>		* ]s are usually written in ] roman letters: ''X'', ''Y'', etc.
	* Particular realizations of a random variable are written in corresponding ] letters. For example, ''x''<sub>1</sub>, ''x''<sub>2</sub>, …, ''x''<sub>''n''</sub> could be a ] corresponding to the random variable ''X''. A cumulative probability is formally written <math>P(X\le x) </math> to differentiate the random variable from its realization.		* Particular realizations of a random variable are written in corresponding ] letters. For example, ''x''<sub>1</sub>, ''x''<sub>2</sub>, …, ''x''<sub>''n''</sub> could be a ] corresponding to the random variable ''X''. A cumulative probability is formally written <math>P(X\le x) </math> to differentiate the random variable from its realization.
	* The probability is sometimes written <math>\mathbb{P} </math> to distinguish it from other functions and measure ''P'' so as to avoid having to define “''P'' is a probability” and <math>\mathbb{P}(X\in A) </math> is short for <math>P(\{\omega \in\Omega: X(\omega) \in A\})</math>, where <math>\Omega</math> is the event space and <math>X(\omega)</math> is a random variable. <math>\Pr(A)</math> notation is used alternatively.~~<ref name=":0" />~~		* The probability is sometimes written <math>\mathbb{P} </math> to distinguish it from other functions and measure ''P'' so as to avoid having to define “''P'' is a probability” and <math>\mathbb{P}(X\in A) </math> is short for <math>P(\{\omega \in\Omega: X(\omega) \in A\})</math>, where <math>\Omega</math> is the event space and <math>X(\omega)</math> is a random variable. <math>\Pr(A)</math> notation is used alternatively.
	*<math>\mathbb{P}(A \cap B)</math> or <math>\mathbb{P}</math> indicates the probability that events ''A'' and ''B'' both occur. The ] of random variables ''X'' and ''Y'' is denoted as <math>P(X, Y)</math>, while joint probability mass function or probability density function as <math>f(x, y)</math> and joint cumulative distribution function as <math>F(x, y)</math>.~~<ref name=":0" />~~		*<math>\mathbb{P}(A \cap B)</math> or <math>\mathbb{P}</math> indicates the probability that events ''A'' and ''B'' both occur. The ] of random variables ''X'' and ''Y'' is denoted as <math>P(X, Y)</math>, while joint probability mass function or probability density function as <math>f(x, y)</math> and joint cumulative distribution function as <math>F(x, y)</math>.
	*<math>\mathbb{P}(A \cup B)</math> or <math>\mathbb{P}</math> indicates the probability of either event ''A'' or event ''B'' occurring (“or” in this case means ]).~~<ref name=":0" />~~		*<math>\mathbb{P}(A \cup B)</math> or <math>\mathbb{P}</math> indicates the probability of either event ''A'' or event ''B'' occurring (“or” in this case means ]).
	*] are usually written with uppercase ] (e.g. <math>\mathcal F</math> for the set of sets on which we define the probability ''P'')		*] are usually written with uppercase ] (e.g. <math>\mathcal F</math> for the set of sets on which we define the probability ''P'')
	*]s (pdfs) and ]s are denoted by lowercase letters, e.g. <math>f(x)</math>, or <math>f_X(x)</math>.~~<ref name=":0" />~~		*]s (pdfs) and ]s are denoted by lowercase letters, e.g. <math>f(x)</math>, or <math>f_X(x)</math>.
	*]s (cdfs) are denoted by uppercase letters, e.g. <math>F(x)</math>, or <math>F_X(x)</math>.~~<ref name=":0" />~~		*]s (cdfs) are denoted by uppercase letters, e.g. <math>F(x)</math>, or <math>F_X(x)</math>.
	* ]s or complementary cumulative distribution functions are often denoted by placing an ] over the symbol for the cumulative:<math>\overline{F}(x) =1-F(x)</math>, or denoted as <math>S(x)</math>,~~<ref name=":0" />~~		* ]s or complementary cumulative distribution functions are often denoted by placing an ] over the symbol for the cumulative:<math>\overline{F}(x) =1-F(x)</math>, or denoted as <math>S(x)</math>,
	*In particular, the pdf of the ] is denoted by φ(''z''), and its cdf by Φ(''z'').		*In particular, the pdf of the ] is denoted by φ(''z''), and its cdf by Φ(''z'').
	*Some common operators:~~<ref~~ ~~name=":0" />~~		*Some common operators:
	:* E : ] of ''X''		:* E : ] of ''X''
	:* var : ] of ''X''		:* var : ] of ''X''
Line 20:		Line 20:
	* X is independent of Y is often written <math>X \perp Y</math> or <math>X \perp\!\!\!\perp Y</math>, and X is independent of Y given W is often written		* X is independent of Y is often written <math>X \perp Y</math> or <math>X \perp\!\!\!\perp Y</math>, and X is independent of Y given W is often written
	:<math>X \perp\!\!\!\perp Y \,\|\, W </math> or		:<math>X \perp\!\!\!\perp Y \,\|\, W </math> or
	:<math>X \perp Y \,\|\, W</math~~><ref name=":0" /~~>		:<math>X \perp Y \,\|\, W</math>
	* <math>\textstyle P(A\mid B)</math>, the '']'', is the probability of <math>\textstyle A</math> ''given'' <math>\textstyle B</math>,~~<ref~~ ~~name=":0" /> that is~~, <math>\textstyle A</math> ''after'' <math>\textstyle B</math> is observed.{{fact\|date=May 2016}}		* <math>\textstyle P(A\mid B)</math>, the '']'', is the probability of <math>\textstyle A</math> ''given'' <math>\textstyle B</math>, i.e., <math>\textstyle A</math> ''after'' <math>\textstyle B</math> is observed.{{fact\|date=May 2016}}

	==Statistics==		==Statistics==
	*Greek letters (e.g. ''θ'', ''β'') are commonly used to denote unknown parameters (population parameters).		*Greek letters (e.g. ''θ'', ''β'') are commonly used to denote unknown parameters (population parameters).
	*A tilde (~) denotes "has the probability distribution of".		*A tilde (~) denotes "has the probability distribution of".
	*Placing a hat, or caret, over a true parameter denotes an ] of it. ~~For example~~, <math>\widehat{\theta}</math> is an estimator for <math>\theta</math>.~~<ref name=":0" />~~		*Placing a hat, or caret, over a true parameter denotes an ] of it, e.g., <math>\widehat{\theta}</math> is an estimator for <math>\theta</math>.
	*The ] of a series of values ''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub> is often denoted by placing an "]" over the symbol (e.g., <math>\bar{x}</math>, pronounced "''x'' bar").		*The ] of a series of values ''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''n''</sub> is often denoted by placing an "]" over the symbol, e.g. <math>\bar{x}</math>, pronounced "''x'' bar".
	*Some commonly used symbols for ] statistics are given below:		*Some commonly used symbols for ] statistics are given below:
	**the ] <math>\bar{x}</math>,		**the ] <math>\bar{x}</math>,
Line 40:		Line 40:
	**the population ] ''ρ'',		**the population ] ''ρ'',
	**the population ]s ''κ<sub>r</sub>'',		**the population ]s ''κ<sub>r</sub>'',
	*<math>x_{(k)}</math> is used for the <math>k^\text{th}</math> ],~~<ref name=":0" />~~ where <math>x_{(1)}</math> is the sample minimum and <math>x_{(n)}</math> is the sample maximum from a total sample size ''n''.		*<math>x_{(k)}</math> is used for the <math>k^\text{th}</math> ], where <math>x_{(1)}</math> is the sample minimum and <math>x_{(n)}</math> is the sample maximum from a total sample size ''n''.

	==Critical values==		==Critical values==
	The ''α''-level upper ] of a ] is the value exceeded with probability α, that is, the value ''x''<sub>''α''</sub> such that ''F''(''x''<sub>''α''</sub>) = 1 − ''α'' where ''F'' is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:~~<ref name=":0" />~~		The ''α''-level upper ] of a ] is the value exceeded with probability α, that is, the value ''x''<sub>''α''</sub> such that ''F''(''x''<sub>''α''</sub>) = 1 − ''α'' where ''F'' is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:
	*''z''<sub>''α''</sub> or ''z''(''α'') for the ]		*''z''<sub>''α''</sub> or ''z''(''α'') for the ]
	*''t''<sub>''α'',''ν''</sub> or ''t''(''α'',''ν'') for the ] with ν ]		*''t''<sub>''α'',''ν''</sub> or ''t''(''α'',''ν'') for the ] with ν ]
Line 59:		Line 59:
	*'''a.e.''' ]		*'''a.e.''' ]
	*'''a.s.''' ]		*'''a.s.''' ]
	* '''cdf''' ]~~<ref name=":0" />~~		* '''cdf''' ]
	* '''cmf''' ]		* '''cmf''' ]
	*'''df''' ] (also <math>\nu</math>)~~<ref name=":0" />~~		*'''df''' ] (also <math>\nu</math>)
	*'''i.i.d.''' ]~~<ref name=":0" />~~		*'''i.i.d.''' ]
	*'''pdf''' ]~~<ref name=":0" />~~		*'''pdf''' ]
	*'''pmf''' ]~~<ref name=":0" />~~		*'''pmf''' ]
	* '''r.v.''' ]~~<ref name=":0" />~~		* '''r.v.''' ]
	* '''w.p.''' with probability; '''wp1''' ]		* '''w.p.''' with probability; '''wp1''' ]
	* '''i.o.''' infinitely often, i.e. <math> \{ A_n\text{ i.o.} \} = \bigcap_N\bigcup_{n\geq N} A_n </math>		* '''i.o.''' infinitely often, i.e. <math> \{ A_n\text{ i.o.} \} = \bigcap_N\bigcup_{n\geq N} A_n </math>
Line 77:		Line 77:

	==References==		==References==
	<references />

	== Bibliography ==
	*{{Citation\| title=Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation\| first1=Max\|last1=Halperin \|first2=H. O. \|last2=Hartley \|first3=P. G.\|last3=Hoel \| journal=The American Statistician\| volume=19 \|year=1965 \| pages=12–14 \| issue=3\| doi=10.2307/2681417 \| jstor=2681417}}		*{{Citation\| title=Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation\| first1=Max\|last1=Halperin \|first2=H. O. \|last2=Hartley \|first3=P. G.\|last3=Hoel \| journal=The American Statistician\| volume=19 \|year=1965 \| pages=12–14 \| issue=3\| doi=10.2307/2681417 \| jstor=2681417}}

Revision as of 15:36, 21 September 2020


Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.

Probability theory

Random variables are usually written in upper case roman letters: X, Y, etc.
Particular realizations of a random variable are written in corresponding lower case letters. For example, x₁, x₂, …, x_n could be a sample corresponding to the random variable X. A cumulative probability is formally written $P(X\leq x)$ to differentiate the random variable from its realization.
The probability measure is sometimes written $\mathbb {P}$ to distinguish it from other functions and from a generic measure P, which avoids having to state “P is a probability”. $\mathbb {P} (X\in A)$ is short for $P(\{\omega \in \Omega :X(\omega )\in A\})$, where $\Omega$ is the sample space and $X$ is a random variable. The notation $\Pr(A)$ is used as an alternative.
$\mathbb {P} (A\cap B)$ or $\mathbb {P} [B\cap A]$ indicates the probability that events A and B both occur. The joint probability distribution of random variables X and Y is denoted $P(X,Y)$, the joint probability mass function or probability density function $f(x,y)$, and the joint cumulative distribution function $F(x,y)$.
$\mathbb {P} (A\cup B)$ or $\mathbb {P} [B\cup A]$ indicates the probability that event A or event B occurs (“or” here means one or the other or both).
σ-algebras are usually written with uppercase calligraphic letters (e.g. ${\mathcal {F}}$ for the set of sets on which we define the probability P).
Probability density functions (pdfs) and probability mass functions are denoted by lowercase letters, e.g. $f(x)$ , or $f_{X}(x)$ .
Cumulative distribution functions (cdfs) are denoted by uppercase letters, e.g. $F(x)$ , or $F_{X}(x)$ .
Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative distribution function, ${\overline {F}}(x)=1-F(x)$, or are denoted $S(x)$.
In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
Some common operators:

E[X] : expected value of X
var[X] : variance of X
cov[X, Y] : covariance of X and Y

That X is independent of Y is often written $X\perp Y$ or $X\perp \!\!\!\perp Y$, and that X is independent of Y given W is often written

$X\perp \!\!\!\perp Y\,|\,W$ or

$X\perp Y\,|\,W$.

$\textstyle P(A\mid B)$ , the conditional probability, is the probability of $\textstyle A$ given $\textstyle B$ , i.e., $\textstyle A$ after $\textstyle B$ is observed.
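
As a rough illustration of the pdf, cdf, and survival-function notation above, the following Python sketch evaluates φ(z), Φ(z), and S(z) = 1 − Φ(z) for the standard normal distribution. The function names `phi`, `Phi`, and `survival` are illustrative choices made for this example; only the `math` functions come from Python's standard library.

```python
import math

def phi(z: float) -> float:
    """Standard normal pdf, written φ(z) in the notation above."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z: float) -> float:
    """Standard normal cdf, written Φ(z), computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def survival(z: float) -> float:
    """Survival function S(z) = 1 - F(z), here with F = Φ."""
    return 1.0 - Phi(z)

# Print φ(z), Φ(z) and S(z) at a few points.
for z in (-1.0, 0.0, 1.0):
    print(z, phi(z), Phi(z), survival(z))
```

The lowercase/uppercase naming mirrors the convention that densities use lowercase letters and cumulative distribution functions use uppercase letters.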

Statistics

Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
A tilde (~) denotes "has the probability distribution of".
Placing a hat, or caret, over a true parameter denotes an estimator of it, e.g., ${\widehat {\theta }}$ is an estimator for $\theta$ .
The arithmetic mean of a series of values x₁, x₂, ..., x_n is often denoted by placing an "overbar" over the symbol, e.g. ${\bar {x}}$ , pronounced "x bar".
Some commonly used symbols for sample statistics are given below:
- the sample mean ${\bar {x}}$ ,
- the sample variance s²,
- the sample standard deviation s,
- the sample correlation coefficient r,
- the sample cumulants k_r.
Some commonly used symbols for population parameters are given below:
- the population mean μ,
- the population variance σ²,
- the population standard deviation σ,
- the population correlation ρ,
- the population cumulants κ_r.
$x_{(k)}$ is used for the $k^{\text{th}}$ order statistic, where $x_{(1)}$ is the sample minimum and $x_{(n)}$ is the sample maximum from a total sample size n.
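
The sample-statistic symbols above can be tied to concrete computations. The sketch below is one possible reading, assuming the common convention that the sample variance s² uses the divisor n − 1 (which is what `statistics.variance` implements); the helper names `sample_correlation` and `order_statistic` and the data are hypothetical, chosen only for this example.

```python
import math
import statistics

def sample_correlation(xs, ys):
    """Sample correlation coefficient r of paired observations."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def order_statistic(xs, k):
    """k-th order statistic x_(k), with k counted from 1."""
    return sorted(xs)[k - 1]

# Made-up illustrative data.
xs = [2.0, 4.0, 4.0, 5.0, 7.0]
ys = [1.0, 3.0, 2.0, 5.0, 6.0]

x_bar = statistics.mean(xs)            # sample mean, "x bar"
s2 = statistics.variance(xs)           # sample variance s^2 (divisor n - 1)
s = statistics.stdev(xs)               # sample standard deviation s
r = sample_correlation(xs, ys)         # sample correlation coefficient r
x_min = order_statistic(xs, 1)         # x_(1), the sample minimum
x_max = order_statistic(xs, len(xs))   # x_(n), the sample maximum
```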

Critical values

The α-level upper critical value of a probability distribution is the value exceeded with probability α, that is, the value x_α such that F(x_α) = 1 − α where F is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:

z_α or z(α) for the standard normal distribution
t_α,ν or t(α,ν) for the t-distribution with ν degrees of freedom
${\chi _{\alpha ,\nu }}^{2}$ or ${\chi }^{2}(\alpha ,\nu )$ for the chi-squared distribution with ν degrees of freedom
$F_{\alpha ,\nu _{1},\nu _{2}}$ or F(α,ν₁,ν₂) for the F-distribution with ν₁ and ν₂ degrees of freedom
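
For the standard normal case, the defining relation F(x_α) = 1 − α can be checked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library (available from Python 3.8); the function name `z_upper` is an illustrative choice, and the t, chi-squared, and F cases are not covered because the standard library has no inverse cdf for them.

```python
from statistics import NormalDist

def z_upper(alpha: float) -> float:
    """Upper critical value z_alpha: the point with F(z_alpha) = 1 - alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)

z_05 = z_upper(0.05)     # value exceeded with probability 0.05
z_025 = z_upper(0.025)   # value exceeded with probability 0.025
print(z_05, z_025)
```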

Linear algebra

Matrices are usually denoted by boldface capital letters, e.g. $\mathbf {A}$.
Column vectors are usually denoted by boldface lowercase letters, e.g. $\mathbf {x}$.
The transpose operator is denoted by either a superscript T (e.g. $\mathbf {A} ^{\mathrm {T} }$) or a prime symbol (e.g. $\mathbf {A} '$).
A row vector is written as the transpose of a column vector, e.g. $\mathbf {x} ^{\mathrm {T} }$ or $\mathbf {x} '$.
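
The transpose convention can be illustrated with a small sketch in plain Python, assuming a matrix is stored as a list of rows and a column vector as an n × 1 matrix. The function name `transpose` and the example values are hypothetical.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]

A = [[1, 2, 3],
     [4, 5, 6]]          # a 2 x 3 matrix
x = [[1], [2], [3]]      # a column vector, stored as a 3 x 1 matrix

A_T = transpose(A)       # the 3 x 2 transpose of A
x_T = transpose(x)       # the row vector obtained by transposing x
```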

Abbreviations

Common abbreviations include:

a.e. almost everywhere
a.s. almost surely
cdf cumulative distribution function
cmf cumulative mass function
df degrees of freedom (also $\nu$ )
i.i.d. independent and identically distributed
pdf probability density function
pmf probability mass function
r.v. random variable
w.p. with probability; wp1 with probability 1
i.o. infinitely often, i.e. $\{A_{n}{\text{ i.o.}}\}=\bigcap _{N}\bigcup _{n\geq N}A_{n}$
ult. ultimately, i.e. $\{A_{n}{\text{ ult.}}\}=\bigcup _{N}\bigcap _{n\geq N}A_{n}$
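
The two set expressions for i.o. and ult. can be made concrete in a special case. The sketch below assumes that the sequence of events eventually cycles forever through a fixed, non-empty list of sets; under that assumption the intersection of tail unions reduces to the union over one cycle, and the union of tail intersections reduces to the intersection over one cycle. The function name `io_and_ult` is hypothetical.

```python
def io_and_ult(period_sets):
    """Return ({A_n i.o.}, {A_n ult.}) for a sequence of events that
    eventually cycles through period_sets forever (an assumption of
    this sketch). period_sets must be non-empty."""
    sets = [set(s) for s in period_sets]
    infinitely_often = set().union(*sets)   # outcomes in some set of the cycle
    ultimately = set.intersection(*sets)    # outcomes in every set of the cycle
    return infinitely_often, ultimately

# Events alternate between {1, 2} and {2, 3} from some index onward.
io, ult = io_and_ult([{1, 2}, {2, 3}])
print(io, ult)
```

In this example the outcome 2 lies in all but finitely many events, while 1 and 3 each lie in infinitely many events without lying in all of them from some point on.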

References

Halperin, Max; Hartley, H. O.; Hoel, P. G. (1965), "Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation", The American Statistician, 19 (3): 12–14, doi:10.2307/2681417, JSTOR 2681417

External links

Earliest Uses of Symbols in Probability and Statistics, maintained by Jeff Miller.
