Misplaced Pages

Notation in probability and statistics

Article snapshot taken from [REDACTED] with Creative Commons Attribution-ShareAlike license.
Revision as of 15:36, 21 September 2020

Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.

Probability theory

  • Random variables are usually written in upper case roman letters: X, Y, etc.
  • Particular realizations of a random variable are written in corresponding lower case letters. For example, $x_1, x_2, \ldots, x_n$ could be a sample corresponding to the random variable X. A cumulative probability is formally written $P(X \le x)$ to differentiate the random variable from its realization.
  • The probability is sometimes written $\mathbb{P}$ to distinguish it from other functions and from a generic measure P, so that one need not state separately that “P is a probability”. $\mathbb{P}(X \in A)$ is short for $P(\{\omega \in \Omega : X(\omega) \in A\})$, where $\Omega$ is the sample space and X is a random variable, that is, a function $\omega \mapsto X(\omega)$ on $\Omega$. The notation $\Pr(A)$ is used as an alternative.
  • $\mathbb{P}(A \cap B)$ or $\mathbb{P}[B \cap A]$ indicates the probability that events A and B both occur. The joint probability distribution of random variables X and Y is denoted as $P(X, Y)$, while the joint probability mass function or probability density function is denoted as $f(x, y)$ and the joint cumulative distribution function as $F(x, y)$.
  • $\mathbb{P}(A \cup B)$ or $\mathbb{P}[B \cup A]$ indicates the probability of either event A or event B occurring (“or” in this case means one or the other or both).
  • σ-algebras are usually written with uppercase calligraphic letters (e.g. $\mathcal{F}$ for the set of sets on which we define the probability P).
  • Probability density functions (pdfs) and probability mass functions are denoted by lowercase letters, e.g. $f(x)$ or $f_X(x)$.
  • Cumulative distribution functions (cdfs) are denoted by uppercase letters, e.g. $F(x)$ or $F_X(x)$.
  • Survival functions or complementary cumulative distribution functions are often denoted by placing an overbar over the symbol for the cumulative, $\overline{F}(x) = 1 - F(x)$, or denoted as $S(x)$.
  • In particular, the pdf of the standard normal distribution is denoted by φ(z), and its cdf by Φ(z).
  • Some common operators:
    • E[X]: expected value of X
    • var[X]: variance of X
  • “X is independent of Y” is often written $X \perp Y$ or $X \perp\!\!\!\perp Y$, and “X is independent of Y given W” is often written $X \perp\!\!\!\perp Y \mid W$ or $X \perp Y \mid W$.
  • $P(A \mid B)$, the conditional probability, is the probability of A given B, i.e., the probability of A after B is observed.
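The notation above can be made concrete on a small finite sample space. The following is a minimal sketch, assuming a single roll of a fair six-sided die; the names `OMEGA`, `prob`, `prob_X_in`, `cdf`, `survival` and `conditional` are illustrative choices, not standard library functions.

```python
from fractions import Fraction

# Sample space Ω for one roll of a fair die (an illustrative assumption).
OMEGA = frozenset(range(1, 7))


def prob(event):
    """P(event) under the uniform measure on OMEGA; event is a set of outcomes."""
    return Fraction(len(event & OMEGA), len(OMEGA))


def X(omega):
    """A random variable X on Ω: 1 if the roll is odd, 0 if it is even."""
    return omega % 2


def prob_X_in(values):
    """P(X ∈ A), computed as P({ω ∈ Ω : X(ω) ∈ A})."""
    return prob(frozenset(w for w in OMEGA if X(w) in values))


def cdf(x):
    """F(x) = P(Y ≤ x) for the random variable Y(ω) = ω (the face shown)."""
    return prob(frozenset(w for w in OMEGA if w <= x))


def survival(x):
    """Complementary cdf: 1 - F(x)."""
    return 1 - cdf(x)


def conditional(a, b):
    """P(A | B) = P(A ∩ B) / P(B); only defined when P(B) > 0."""
    return prob(a & b) / prob(b)


A = frozenset({2, 4, 6})     # event "the roll is even"
B = frozenset({4, 5, 6})     # event "the roll is at least 4"

p_and = prob(A & B)          # P(A ∩ B)
p_or = prob(A | B)           # P(A ∪ B), with the inclusive "or"
p_given = conditional(A, B)  # P(A | B)
```

Exact fractions are used so that identities such as $\overline{F}(x) = 1 - F(x)$ hold without rounding error.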

Statistics

  • Greek letters (e.g. θ, β) are commonly used to denote unknown parameters (population parameters).
  • A tilde (~) denotes "has the probability distribution of".
  • Placing a hat, or caret, over a true parameter denotes an estimator of it, e.g., $\widehat{\theta}$ is an estimator for $\theta$.
  • The arithmetic mean of a series of values $x_1, x_2, \ldots, x_n$ is often denoted by placing an "overbar" over the symbol, e.g. $\bar{x}$, pronounced "x bar".
  • Some commonly used symbols for sample statistics are given below:
    • the sample mean $\bar{x}$,
  • Some commonly used symbols for population parameters are given below:
    • the population mean μ,
    • the population variance σ²,
    • the population standard deviation σ,
    • the population correlation ρ,
    • the population cumulants $\kappa_r$,
  • $x_{(k)}$ is used for the $k^\text{th}$ order statistic, where $x_{(1)}$ is the sample minimum and $x_{(n)}$ is the sample maximum from a total sample size n.
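As a rough illustration of the sample notation, the sketch below computes $\bar{x}$ and the order statistics $x_{(k)}$ for a small made-up sample. The data values and the names `order_statistic` and `mu_hat` are assumptions introduced for this example only.

```python
from statistics import mean

sample = [2.5, 0.7, 3.1, 1.9, 2.8]   # x_1, ..., x_n (made-up data)
n = len(sample)

x_bar = mean(sample)                 # sample mean, written x̄
ordered = sorted(sample)             # x_(1) <= x_(2) <= ... <= x_(n)


def order_statistic(k):
    """Return x_(k), the k-th smallest value, for k = 1, ..., n."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    return ordered[k - 1]


# The sample mean is one common estimator of the population mean μ,
# so in hat notation one could write mu_hat = x_bar.
mu_hat = x_bar
```

Here `order_statistic(1)` is the sample minimum and `order_statistic(n)` the sample maximum, matching the convention for $x_{(1)}$ and $x_{(n)}$.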

Critical values

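The α-level upper critical value of a probability distribution is the value exceeded with probability α, that is, the value $x_\alpha$ such that $F(x_\alpha) = 1 - \alpha$, where F is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:

  • $z_\alpha$ or $z(\alpha)$ for the standard normal distribution
  • $t_{\alpha,\nu}$ or $t(\alpha,\nu)$ for the t-distribution with ν degrees of freedom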
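The defining relation $F(x_\alpha) = 1 - \alpha$ means an upper critical value can be obtained by inverting the cumulative distribution function. A minimal sketch for the standard normal case, using Python's standard-library `statistics.NormalDist`; the helper name `upper_critical_value` is an assumption made for this example.

```python
from statistics import NormalDist


def upper_critical_value(alpha, dist=NormalDist()):
    """Return x_alpha with F(x_alpha) = 1 - alpha for the given distribution.

    By default the distribution is the standard normal, so the result is z_alpha.
    """
    return dist.inv_cdf(1 - alpha)


z_05 = upper_critical_value(0.05)    # z_{0.05}, roughly 1.645
z_025 = upper_critical_value(0.025)  # z_{0.025}, roughly 1.960
```

The same inversion applies to other distributions whenever an inverse cdf is available; the t-distribution is not covered by the `statistics` module and would need a separate library.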

Linear algebra

  • Matrices are usually denoted by boldface capital letters, e.g. A.
  • Column vectors are usually denoted by boldface lowercase letters, e.g. x.
  • The transpose operator is denoted by either a superscript T (e.g. $\mathbf{A}^T$) or a prime symbol (e.g. A′).
  • A row vector is written as the transpose of a column vector, e.g. $\mathbf{x}^T$ or x′.
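The relationship between column vectors, row vectors and the transpose can be sketched with plain nested lists. This is only an illustration: the list-of-rows storage and the helper name `transpose` are assumptions, not a recommended matrix representation.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


A = [[1, 2, 3],
     [4, 5, 6]]          # a 2 x 3 matrix A
x = [[7], [8], [9]]      # a column vector x, stored as a 3 x 1 matrix

A_T = transpose(A)       # the 3 x 2 transpose of A
x_T = transpose(x)       # the row vector obtained by transposing x, 1 x 3
```

Transposing twice returns the original object, so `transpose(x_T)` gives back the column vector `x`.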

Abbreviations

Common abbreviations include:

  • a.e. almost everywhere
  • a.s. almost surely
  • cdf cumulative distribution function
  • cmf cumulative mass function
  • df degrees of freedom (also $\nu$)
  • i.i.d. independent and identically distributed
  • pdf probability density function
  • pmf probability mass function
  • r.v. random variable
  • w.p. with probability; wp1 with probability 1
  • i.o. infinitely often, i.e. $\{A_n \text{ i.o.}\} = \bigcap_N \bigcup_{n \ge N} A_n$

References

  • Halperin, Max; Hartley, H. O.; Hoel, P. G. (1965), "Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation", The American Statistician, 19 (3): 12–14, doi:10.2307/2681417, JSTOR 2681417
