How do I calculate the cumulative normal distribution?


Adapted from http://mail.python.org/pipermail/python-list/2000-June/039873.html

from math import *

def erfcc(x):
    """Complementary error function."""
    z = abs(x)
    t = 1. / (1. + 0.5*z)
    r = t * exp(-z*z-1.26551223+t*(1.00002368+t*(.37409196+
        t*(.09678418+t*(-.18628806+t*(.27886807+
        t*(-1.13520398+t*(1.48851587+t*(-.82215223+
        t*.17087277)))))))))
    if (x >= 0.):
        return r
    else:
        return 2. - r


def ncdf(x):
    return 1. - 0.5*erfcc(x/(2**0.5))
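
As a rough check on the polynomial approximation above, the sketch below compares it with `math.erfc` from the standard library. The coefficient tuple and the names `erfcc_approx` and `max_abs_error` are my own packaging for illustration, not part of the original post.

```python
from math import erfc, exp

# Coefficients of the polynomial in t, lowest order first (copied from erfcc above).
_COEFFS = (-1.26551223, 1.00002368, 0.37409196, 0.09678418, -0.18628806,
           0.27886807, -1.13520398, 1.48851587, -0.82215223, 0.17087277)


def erfcc_approx(x):
    """Same approximation as erfcc, with the polynomial evaluated by Horner's rule."""
    z = abs(x)
    t = 1.0 / (1.0 + 0.5 * z)
    poly = 0.0
    for c in reversed(_COEFFS):
        poly = poly * t + c
    r = t * exp(-z * z + poly)
    return r if x >= 0.0 else 2.0 - r


def max_abs_error(lo=-5.0, hi=5.0, steps=1000):
    """Largest absolute difference from math.erfc on an evenly spaced grid."""
    xs = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(abs(erfcc_approx(x) - erfc(x)) for x in xs)


print(max_abs_error())
```

I would expect the difference to be on the order of 1e-7 or smaller, so for new code `math.erfc` is probably the simpler choice.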


Here is an example:

>>> from scipy.stats import norm
>>> norm.cdf(1.96)
0.9750021048517795
>>> norm.cdf(-1.96)
0.024997895148220435

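In other words, approximately 95% of the probability mass of the standard normal distribution lies within two standard deviations (more precisely, 1.96) of the mean, which is zero.

If you need the inverse CDF: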
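
The 95% figure can also be checked without SciPy. A minimal sketch, assuming only the standard library: for a standard normal variable Z, P(|Z| < k) equals erf(k / sqrt(2)). The function name `central_probability` is hypothetical.

```python
from math import erf, sqrt


def central_probability(k):
    """P(-k < Z < k) for a standard normal Z."""
    return erf(k / sqrt(2.0))


for k in (1.0, 1.96, 2.0, 3.0):
    print(k, central_probability(k))
```

With k = 1.96 this gives roughly 0.95; with k = 2 it is closer to 0.9545, which is why "two standard deviations" is only an approximation of the 95% interval.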

>>> norm.ppf(norm.cdf(1.96))
array(1.9599999999999991)
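
If SciPy is not available, a similar round trip can be sketched with `statistics.NormalDist` (Python 3.8+), whose `inv_cdf` method plays the role of `norm.ppf`. This is an alternative I am swapping in, not what the answer above uses.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mu=0, sigma=1 by default

p = std_normal.cdf(1.96)    # probability to the left of 1.96
x = std_normal.inv_cdf(p)   # quantile that corresponds to p

print(p, x)
```

As with `norm.ppf`, the recovered value should match 1.96 up to floating-point rounding.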


To build upon Unknown's example, the Python equivalent of the normdist() function implemented in many libraries would be:

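from math import exp, pi, sqrt

# erfcc() is the complementary error function defined in the earlier answer.

def normcdf(x, mu, sigma):
    t = x - mu
    y = 0.5*erfcc(-t/(sigma*sqrt(2.0)))
    # Guard against the approximation overshooting 1.
    if y > 1.0:
        y = 1.0
    return y


def normpdf(x, mu, sigma):
    u = (x - mu)/abs(sigma)
    y = (1/(sqrt(2*pi)*abs(sigma)))*exp(-u*u/2)
    return y


def normdist(x, mu, sigma, f):
    # f truthy: cumulative distribution; f falsy: probability density.
    if f:
        y = normcdf(x, mu, sigma)
    else:
        y = normpdf(x, mu, sigma)
    return y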


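It may be too late to answer this question, but since Google still leads people here, I decided to write up my solution.

Since Python 2.7, the math library has included the error function math.erf(x).

The erf() function can be used to compute traditional statistical functions such as the cumulative standard normal distribution: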

from math import erf, sqrt

def phi(x):
    # Cumulative distribution function for the standard normal distribution
    return (1.0 + erf(x / sqrt(2.0))) / 2.0

References:

https://docs.python.org/2/library/math.html

https://docs.python.org/3/library/math.html

How are the error function and the standard normal distribution function related?
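
For readers who want the link spelled out, here is a short derivation of the identity used in phi(x), starting from the usual definitions of the standard normal CDF and of erf. It is a sketch of the standard substitution argument, not a quotation from the references above.

```latex
\begin{aligned}
\Phi(x) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^{2}/2}\,dt
         = \frac{1}{2} + \frac{1}{\sqrt{2\pi}} \int_{0}^{x} e^{-t^{2}/2}\,dt \\
        &= \frac{1}{2} + \frac{1}{\sqrt{\pi}} \int_{0}^{x/\sqrt{2}} e^{-u^{2}}\,du
         \qquad (u = t/\sqrt{2},\ dt = \sqrt{2}\,du) \\
        &= \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right],
\qquad \operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_{0}^{z} e^{-u^{2}}\,du .
\end{aligned}
```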


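Alex's answer shows a solution for the standard normal distribution (mean = 0, standard deviation = 1). If you have a normal distribution with mean m and standard deviation s (that is, sqrt(var)), you can calculate: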

from scipy.stats import norm

# val, v1, v2 are the points of interest; m is the mean, s the standard deviation

# P(x < val)
print(norm.cdf(val, m, s))

# P(x > val)
print(1 - norm.cdf(val, m, s))

# P(v1 < x < v2)
print(norm.cdf(v2, m, s) - norm.cdf(v1, m, s))

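For more on the cdf and on the formulas behind this implementation of the normal distribution, see the documentation for scipy.stats.norm.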
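
For readers without SciPy, the same three probabilities can be sketched with `math.erf` by standardizing first, z = (x - m) / s. The function name `normal_cdf` and the sample values (m = 100, s = 15, and the cut points) are made up for illustration.

```python
from math import erf, sqrt


def normal_cdf(x, m=0.0, s=1.0):
    """P(X <= x) for X ~ Normal(mean=m, std=s); s must be positive."""
    if s <= 0:
        raise ValueError("s must be positive")
    z = (x - m) / s
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


m, s = 100.0, 15.0                # hypothetical mean and standard deviation
val, v1, v2 = 115.0, 85.0, 130.0  # hypothetical cut points

print(normal_cdf(val, m, s))                        # P(X < val)
print(1.0 - normal_cdf(val, m, s))                  # P(X > val)
print(normal_cdf(v2, m, s) - normal_cdf(v1, m, s))  # P(v1 < X < v2)
```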


Taken from above:

>>> from scipy.stats import norm
>>> norm.cdf(1.96)
0.9750021048517795
>>> norm.cdf(-1.96)
0.024997895148220435

For a two-tailed test:

import numpy as np
z = 1.96
p_value = 2 * norm.cdf(-np.abs(z))
# 0.04999579029644087
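
The same p-value can be sketched with the standard library alone, since 2 * Phi(-|z|) equals erfc(|z| / sqrt(2)). The name `two_tailed_p` is hypothetical.

```python
from math import erfc, sqrt


def two_tailed_p(z):
    """Two-tailed p-value for a z statistic under the standard normal."""
    return erfc(abs(z) / sqrt(2.0))


print(two_tailed_p(1.96))
print(two_tailed_p(-1.96))  # same value: the test is symmetric in z
```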


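Starting with Python 3.8, the standard library provides the NormalDist object as part of the statistics module.

It can be used to get the cumulative distribution function (cdf: the probability that a random sample X is less than or equal to x) for a given mean (mu) and standard deviation (sigma):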

from statistics import NormalDist


NormalDist(mu=0, sigma=1).cdf(1.96)
# 0.9750021048517796

This can be simplified for the standard normal distribution (mu = 0 and sigma = 1):

NormalDist().cdf(1.96)
# 0.9750021048517796


NormalDist().cdf(-1.96)
# 0.024997895148220428
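
For a non-standard distribution, upper-tail and interval probabilities follow by combining cdf values. A small sketch; the parameters mu=100 and sigma=15 are hypothetical.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)  # hypothetical parameters

p_below = dist.cdf(115)                   # P(X <= 115)
p_above = 1 - dist.cdf(115)               # P(X > 115)
p_between = dist.cdf(115) - dist.cdf(85)  # P(85 < X <= 115), i.e. within one sigma

print(p_below, p_above, p_between)
```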


As simple as this:

import math

def my_cdf(x):
    return 0.5*(1+math.erf(x/math.sqrt(2)))

I found the formula on this page: https://www.danielsoper.com/statcalc/formulas.aspx?id=55
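
One caveat with the erf form, sketched below: for large negative x, 1 + erf(...) rounds to zero in floating point, while the mathematically equivalent 0.5 * erfc(-x / sqrt(2)) keeps the small tail probability. The name `cdf_erfc` is my own, and this is a suggestion rather than part of the answer above.

```python
import math


def my_cdf(x):
    # Same formula as in the answer above.
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def cdf_erfc(x):
    # Equivalent form that avoids cancellation in the lower tail.
    return 0.5 * math.erfc(-x / math.sqrt(2))


for x in (-1.96, -10.0):
    print(x, my_cdf(x), cdf_erfc(x))
```

The two agree closely at x = -1.96, but at x = -10 the erf version returns 0.0 while the erfc version returns a tiny positive number (about 7.6e-24). This matters mainly when computing very small p-values.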