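Calculating Standard Deviation in Python

In this article, we will learn how to write a Python program that calculates the standard deviation of a dataset.

Consider a set of values plotted along an axis. The standard deviation measures how much those values vary around their mean. If the standard deviation is low, the values lie close to the mean; if it is high, the values are spread farther from the mean.

It is defined as the square root of the variance of the dataset. There are two types of standard deviation:

The population standard deviation is computed from every data value in the population, so it is a fixed value. It is defined as: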

$\mathrm{SD} = \sqrt{\frac{\sum (X_i - X_m)^2}{n}}$

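where

$X_m$ is the mean of the dataset.
$X_i$ is an element of the dataset.
$n$ is the number of elements in the dataset.

The sample standard deviation, by contrast, is a statistic computed from only some of the population's data values, so it depends on the sample chosen. It is defined as: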

$\mathrm{SD} = \sqrt{\frac{\sum (X_i - X_m)^2}{n - 1}}$

where

$X_m$ is the mean of the sample.
$X_i$ is an element of the sample.
$n$ is the number of elements in the sample.
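As a quick check of the two formulas, the sketch below evaluates them exactly with Python's fractions module. The dataset and the variable names are assumptions chosen for illustration only.

```python
from fractions import Fraction
import math

# Illustrative dataset (an assumption for this sketch)
data = [2, 3, 4, 1, 2, 5]
n = len(data)

# Exact mean X_m as a fraction
mean = Fraction(sum(data), n)

# Exact sum of squared deviations: sum((X_i - X_m)^2)
squared_deviations = sum((x - mean) ** 2 for x in data)

# Divide by n for the population variance, by n - 1 for the sample variance
population_variance = squared_deviations / n
sample_variance = squared_deviations / (n - 1)

print(mean)                 # 17/6
print(squared_deviations)   # 65/6
print(population_variance)  # 65/36
print(sample_variance)      # 13/6

# The standard deviation is the square root of the variance
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)
```

Working with exact fractions makes the only difference between the two formulas visible: the same sum of squared deviations is divided by n in one case and by n - 1 in the other.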

Input and Output Scenarios

Let us look at the expected input and output for a few datasets (results rounded to four decimal places):

Suppose the dataset contains only positive integers:

Input: [2, 3, 4, 1, 2, 5]
Result: Population Standard Deviation: 1.3437
Sample Standard Deviation: 1.4720

Suppose the dataset contains only negative integers:

Input: [-2, -3, -4, -1, -2, -5]
Result: Population Standard Deviation: 1.3437
Sample Standard Deviation: 1.4720

Suppose the dataset contains both positive and negative integers:

Input: [-2, -3, -4, 1, 2, 5]
Result: Population Standard Deviation: 3.1314
Sample Standard Deviation: 3.4303
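The first two scenarios give identical results because flipping the sign of every value flips the sign of every deviation, and the squares of the deviations are unchanged. A minimal sketch of that check, using the standard library's statistics module (the variable names are illustrative):

```python
import math
import statistics

# Illustrative datasets: the second is the first with every sign flipped
positive = [2, 3, 4, 1, 2, 5]
negative = [-x for x in positive]

# Deviations from the mean change sign, but their squares do not,
# so both datasets have the same standard deviation
same_population_sd = math.isclose(
    statistics.pstdev(positive), statistics.pstdev(negative)
)
same_sample_sd = math.isclose(
    statistics.stdev(positive), statistics.stdev(negative)
)

print(same_population_sd, same_sample_sd)  # True True
```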

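Using the Mathematical Formula

We have already seen the formulas for standard deviation above; now let us implement them in a Python program.

Example

In the following example, we import the math library and compute the standard deviation by applying the math.sqrt() function to the variance of the dataset.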

import math

# Declare the dataset
dataset = [2, 3, 4, 1, 2, 5]
n = len(dataset)

# Find the mean of the dataset
total = 0
for value in dataset:
    total += value
mean = total / n

# Sum the squared deviations from the mean
deviation_sum = 0
for value in dataset:
    deviation_sum += (value - mean) ** 2

# Population standard deviation: divide by n
psd = math.sqrt(deviation_sum / n)

# Sample standard deviation: divide by n - 1
ssd = math.sqrt(deviation_sum / (n - 1))

# Display the output
print("Population standard deviation of the dataset is", psd)
print("Sample standard deviation of the dataset is", ssd)

Output

The resulting standard deviations are as follows (the last digit can vary with floating-point rounding):

Population standard deviation of the dataset is 1.3437096247164249
Sample standard deviation of the dataset is 1.4719601443879744
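The same formula can be wrapped in a reusable function. In the sketch below, the function name std_dev and its ddof parameter are hypothetical choices for illustration, not part of any library: ddof=0 divides by n (population) and ddof=1 divides by n - 1 (sample).

```python
import math

def std_dev(data, ddof=0):
    """Return the standard deviation of data.

    ddof=0 divides by n (population); ddof=1 divides by n - 1 (sample).
    The name and signature are illustrative, not a library API.
    """
    n = len(data)
    if n - ddof <= 0:
        raise ValueError("need more than ddof data points")
    mean = sum(data) / n
    deviation_sum = sum((x - mean) ** 2 for x in data)
    return math.sqrt(deviation_sum / (n - ddof))

dataset = [2, 3, 4, 1, 2, 5]
population_sd = std_dev(dataset)       # divides by n
sample_sd = std_dev(dataset, ddof=1)   # divides by n - 1
print(f"{population_sd:.4f} {sample_sd:.4f}")  # 1.3437 1.4720
```

Keeping the divisor as a parameter puts the single difference between the two formulas in one place and guards against an empty or too-small dataset.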

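Using the std() Function in the numpy Module

In this approach, we import the numpy module and use the numpy.std() function, which by default computes the population standard deviation of the elements of a numpy array.

Example

The following Python program computes the standard deviation of the elements of a numpy array:

import numpy as np

# Declare the dataset as a numpy array
dataset = np.array([2, 3, 4, 1, 2, 5])

# Calculate the population standard deviation of the dataset
sd = np.std(dataset)

# Display the output
print("Population standard deviation of the dataset is", sd)

Output

The output is as follows:

Population standard deviation of the dataset is 1.3437096247164249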
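numpy.std() can also produce the sample standard deviation through its ddof ("delta degrees of freedom") parameter, which is subtracted from n in the divisor. A minimal sketch on the same dataset (the variable names are illustrative):

```python
import numpy as np

dataset = np.array([2, 3, 4, 1, 2, 5])

# ddof=0 (the default) divides by n: population standard deviation
population_sd = np.std(dataset)

# ddof=1 divides by n - 1: sample standard deviation
sample_sd = np.std(dataset, ddof=1)

print(f"{population_sd:.4f} {sample_sd:.4f}")  # 1.3437 1.4720
```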

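Using the stdev() and pstdev() Functions in the statistics Module

Python's statistics module provides two functions, stdev() and pstdev(), for calculating the standard deviation of a dataset. stdev() computes the sample standard deviation, while pstdev() computes the population standard deviation.

Both functions take the dataset as their first argument, accept an optional precomputed mean (xbar for stdev(), mu for pstdev()), and return the standard deviation.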
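Both functions also accept an optional second argument holding a mean that has already been computed: xbar for stdev() and mu for pstdev(). The sketch below, with illustrative variable names, computes the mean once and passes it to both calls.

```python
import statistics as st

dataset = [2, 3, 4, 1, 2, 5]

# Compute the mean once and reuse it
mean = st.mean(dataset)

# stdev() accepts the precomputed mean as xbar, pstdev() accepts it as mu
sample_sd = st.stdev(dataset, xbar=mean)
population_sd = st.pstdev(dataset, mu=mean)

print(f"{sample_sd:.4f} {population_sd:.4f}")  # 1.4720 1.3437
```

The value passed must be the actual mean of the data; passing any other value gives a meaningless result.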

Example 1: Using the stdev() Function

The following Python program uses the stdev() function to compute the sample standard deviation of a dataset:

import statistics as st

# Declare the dataset
dataset = [2, 3, 4, 1, 2, 5]

# Calculate the sample standard deviation of the dataset
sd = st.stdev(dataset)

# Display the output
print("Standard Deviation of the dataset is", sd)

Output

The sample standard deviation of the dataset is printed as follows:

Standard Deviation of the dataset is 1.4719601443879744

Example 2: Using the pstdev() Function

The following Python program uses the pstdev() function to compute the population standard deviation of a dataset:

import statistics as st

# Declare the dataset
dataset = [2, 3, 4, 1, 2, 5]

# Calculate the population standard deviation of the dataset
sd = st.pstdev(dataset)

# Display the output
print("Standard Deviation of the dataset is", sd)

Output

The population standard deviation of the dataset is printed as follows:

Standard Deviation of the dataset is 1.3437096247164249
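The two functions also differ in how many data points they need. Because stdev() divides by n - 1, it raises statistics.StatisticsError when given fewer than two values, whereas pstdev() accepts a single value. A minimal sketch of that behaviour (the variable names are illustrative):

```python
import statistics as st

# A single value has no spread, so its population standard deviation is zero
single_population_sd = st.pstdev([5])

# stdev() divides by n - 1, so it needs at least two data points
try:
    st.stdev([5])
    raised = False
except st.StatisticsError:
    raised = True

print(single_population_sd == 0, raised)  # True True
```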
