scipy.stats.spearmanr

scipy.stats.spearmanr(a, b=None, axis=0, nan_policy='propagate', alternative='two-sided')

Calculate a Spearman correlation coefficient with its associated p-value.

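The Spearman rank-order correlation coefficient is a nonparametric measure of the monotonicity of the relationship between two datasets. Like other correlation coefficients, it varies between -1 and +1, with 0 implying no correlation. Correlations of -1 or +1 imply an exact monotonic relationship. Positive correlations imply that as x increases, so does y. Negative correlations imply that as x increases, y decreases.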
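One way to see what "rank-order" means is that the coefficient coincides with the Pearson correlation computed on the ranks of the data, so a strictly increasing transformation of either sample leaves it unchanged. The following is a minimal sketch on made-up data; `stats.rankdata` and `stats.pearsonr` are existing SciPy functions, and the variable names are purely illustrative.

```python
import numpy as np
from scipy import stats

# Illustrative data: y is a strictly increasing but nonlinear function of x.
x = np.array([0.5, 1.0, 2.0, 3.5, 5.0, 7.0])
y = np.exp(x)

# Spearman's coefficient for an exactly monotonic (increasing) relationship.
rho = stats.spearmanr(x, y).statistic

# The same quantity obtained as Pearson's r applied to the ranks.
r_on_ranks = stats.pearsonr(stats.rankdata(x), stats.rankdata(y)).statistic

# Pearson's r on the raw values is lower because the relationship is not linear.
r_raw = stats.pearsonr(x, y).statistic

# Flipping the sign of y makes the relationship exactly decreasing.
rho_decreasing = stats.spearmanr(x, -y).statistic

print(rho, r_on_ranks, r_raw, rho_decreasing)
```

Here `rho` and `r_on_ranks` agree, `rho_decreasing` has the opposite sign, and `r_raw` falls short of 1 even though y is a deterministic function of x.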

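The p-value roughly indicates the probability of an uncorrelated system producing datasets that have a Spearman correlation at least as extreme as the one computed from these datasets. Although calculation of the p-value does not make strong assumptions about the distributions underlying the samples, it is only accurate for very large samples (>500 observations). For smaller sample sizes, consider a permutation test (see the Examples section below).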
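To make the large-sample caveat concrete, here is a sketch of how an asymptotic two-sided p-value of this kind can be reproduced by hand. It assumes the coefficient r is transformed to t = r * sqrt((n - 2) / ((1 + r)(1 - r))) and compared against a Student's t distribution with n - 2 degrees of freedom. The text above does not spell out this formula, so treat it as an assumption about the implementation rather than a specification.

```python
import numpy as np
from scipy import stats

x = [1, 2, 3, 4, 5]
y = [5, 6, 7, 8, 7]

res = stats.spearmanr(x, y)
r = res.statistic
n = len(x)
dof = n - 2  # degrees of freedom of the assumed t approximation

# Assumed approximation: t-transform of the rank correlation coefficient.
t = r * np.sqrt(dof / ((1.0 + r) * (1.0 - r)))

# Two-sided p-value from the t distribution with n - 2 degrees of freedom.
p_manual = 2 * stats.t.sf(abs(t), dof)

print(p_manual, res.pvalue)
```

With only five observations the t approximation is a rough one, which is why a permutation test is the safer choice for samples this small.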

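Parameters:

a, b : 1D or 2D array_like, b is optional

One or two 1-D or 2-D arrays containing multiple variables and observations. When these are 1-D, each represents a vector of observations of a single variable. For the behavior in the 2-D case, see under axis, below. Both arrays need to have the same length in the axis dimension.

axis : int or None, optional

If axis=0 (default), then each column represents a variable, with observations in the rows. If axis=1, the relationship is transposed: each row represents a variable, while the columns contain observations. If axis=None, then both arrays will be raveled.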

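nan_policy : {'propagate', 'raise', 'omit'}, optional

Defines how to handle input that contains nan. The following options are available (default is 'propagate'):

'propagate': returns nan
'raise': throws an error
'omit': performs the calculations ignoring nan values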
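The three policies can be compared side by side on a small sample with one missing value. This is a sketch on made-up data; it assumes that the error thrown by 'raise' is a `ValueError` and that 'omit' is equivalent to dropping every observation pair containing a nan before computing the coefficient.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, np.nan, 4.0, 5.0, 6.0])
y = np.array([2.0, 1.0, 3.0, 4.0, 6.0, 5.0])

# 'propagate' (the default): the nan carries through to the result.
propagated = stats.spearmanr(x, y, nan_policy='propagate')

# 'raise': a nan in the input triggers an exception (assumed to be ValueError).
try:
    stats.spearmanr(x, y, nan_policy='raise')
    raised = False
except ValueError:
    raised = True

# 'omit': the nan observation is ignored.
omitted = stats.spearmanr(x, y, nan_policy='omit')

# Assumed equivalent: drop incomplete pairs by hand, then compute as usual.
keep = ~(np.isnan(x) | np.isnan(y))
manual = stats.spearmanr(x[keep], y[keep])

print(propagated.statistic, raised, omitted.statistic, manual.statistic)
```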

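alternative : {'two-sided', 'less', 'greater'}, optional

Defines the alternative hypothesis. Default is 'two-sided'. The following options are available:

'two-sided': the correlation is nonzero
'less': the correlation is negative (less than zero)
'greater': the correlation is positive (greater than zero)

Added in version 1.7.0.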
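As a brief illustration of how the choice of alternative affects the result, the sketch below runs the same made-up sample through all three options. The variable names are illustrative only.

```python
from scipy import stats

x = [1, 2, 3, 4, 5]
y = [5, 6, 7, 8, 7]

two_sided = stats.spearmanr(x, y, alternative='two-sided')
greater = stats.spearmanr(x, y, alternative='greater')
less = stats.spearmanr(x, y, alternative='less')

# The statistic does not depend on `alternative`; only the p-value does.
print(two_sided.statistic, greater.statistic, less.statistic)
print(two_sided.pvalue, greater.pvalue, less.pvalue)
```

Because the observed correlation in this sample is positive, 'greater' yields a smaller p-value than 'two-sided', while 'less' yields a large one.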

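Returns:

res : SignificanceResult

An object containing attributes:

statistic : float or ndarray (2-D square)
    Spearman correlation matrix or correlation coefficient (if only 2 variables are given as parameters). The correlation matrix is square, with length equal to the total number of variables (columns or rows) in a and b combined.
pvalue : float
    The p-value for a hypothesis test whose null hypothesis is that two samples have no ordinal correlation. See alternative above for alternative hypotheses. pvalue has the same shape as statistic.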
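The shapes described above can be checked with a short sketch. The seed and array sizes are arbitrary choices for illustration, and the tuple-style unpacking of the result is an assumption about how the result object behaves, not something stated in this page.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)   # arbitrary seed, only for reproducibility
data = rng.standard_normal((50, 3))  # 50 observations (rows) of 3 variables (columns)

# Two variables: scalar statistic and p-value.
res = stats.spearmanr(data[:, 0], data[:, 1])
rho, p = res  # assumed: the result unpacks like a (statistic, pvalue) tuple

# Three variables: square matrices with one row and one column per variable.
res_matrix = stats.spearmanr(data)

print(np.shape(res.statistic), res_matrix.statistic.shape, res_matrix.pvalue.shape)
```

Entry `[i, j]` of `res_matrix.statistic` is the coefficient between variables i and j, so the matrix is symmetric with ones on the diagonal.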

Raises:

ValueError: If axis is not 0, 1 or None, or if the number of dimensions of a is greater than 2, or if b is None and the number of dimensions of a is less than 2.
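The three error conditions listed above can be exercised as follows. The helper `raises_value_error` is a hypothetical convenience function written for this sketch; it is not part of SciPy.

```python
import numpy as np
from scipy import stats

def raises_value_error(*args, **kwargs):
    """Hypothetical helper: True if stats.spearmanr raises ValueError."""
    try:
        stats.spearmanr(*args, **kwargs)
    except ValueError:
        return True
    return False

a2d = np.arange(12.0).reshape(4, 3)

# axis must be 0, 1 or None.
bad_axis = raises_value_error(a2d, axis=2)

# `a` may have at most 2 dimensions.
too_many_dims = raises_value_error(np.zeros((2, 3, 4)))

# A single 1-D array without `b` provides only one variable.
single_variable = raises_value_error([1, 2, 3, 4])

print(bad_axis, too_many_dims, single_variable)
```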

Warns:

ConstantInputWarning: Raised if an input is a constant array. The correlation coefficient is not defined in this case, so np.nan is returned.
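A constant input can be detected by recording warnings around the call. This sketch assumes the warning class is exposed as `stats.ConstantInputWarning`, which may not hold for older SciPy releases.

```python
import warnings

import numpy as np
from scipy import stats

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not suppressed
    res = stats.spearmanr([3, 3, 3, 3], [1, 2, 3, 4])  # first input is constant

# Check whether any recorded warning belongs to the assumed warning class.
got_constant_warning = any(
    issubclass(w.category, stats.ConstantInputWarning) for w in caught
)

print(res.statistic, got_constant_warning)
```

Checking `np.isnan(res.statistic)` after the call is a simple way to guard downstream code against this case.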

See also

Spearman correlation coefficient: Extended example

References

[1] Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. Section 14.7.

[2] Kendall, M. G. and Stuart, A. (1973). The Advanced Theory of Statistics, Volume 2: Inference and Relationship. Griffin. Section 31.18.

Examples

>>> import numpy as np
>>> from scipy import stats
>>> res = stats.spearmanr([1, 2, 3, 4, 5], [5, 6, 7, 8, 7])
>>> res.statistic
0.8207826816681233
>>> res.pvalue
0.08858700531354381

>>> rng = np.random.default_rng()
>>> x2n = rng.standard_normal((100, 2))
>>> y2n = rng.standard_normal((100, 2))
>>> res = stats.spearmanr(x2n)
>>> res.statistic, res.pvalue
(-0.07960396039603959, 0.4311168705769747)

>>> res = stats.spearmanr(x2n[:, 0], x2n[:, 1])
>>> res.statistic, res.pvalue
(-0.07960396039603959, 0.4311168705769747)

>>> res = stats.spearmanr(x2n, y2n)
>>> res.statistic
array([[ 1.        , -0.07960396, -0.08314431,  0.09662166],
       [-0.07960396,  1.        , -0.14448245,  0.16738074],
       [-0.08314431, -0.14448245,  1.        ,  0.03234323],
       [ 0.09662166,  0.16738074,  0.03234323,  1.        ]])
>>> res.pvalue
array([[0.        , 0.43111687, 0.41084066, 0.33891628],
       [0.43111687, 0.        , 0.15151618, 0.09600687],
       [0.41084066, 0.15151618, 0.        , 0.74938561],
       [0.33891628, 0.09600687, 0.74938561, 0.        ]])

>>> res = stats.spearmanr(x2n.T, y2n.T, axis=1)
>>> res.statistic
array([[ 1.        , -0.07960396, -0.08314431,  0.09662166],
       [-0.07960396,  1.        , -0.14448245,  0.16738074],
       [-0.08314431, -0.14448245,  1.        ,  0.03234323],
       [ 0.09662166,  0.16738074,  0.03234323,  1.        ]])

>>> res = stats.spearmanr(x2n, y2n, axis=None)
>>> res.statistic, res.pvalue
(0.044981624540613524, 0.5270803651336189)

>>> res = stats.spearmanr(x2n.ravel(), y2n.ravel())
>>> res.statistic, res.pvalue
(0.044981624540613524, 0.5270803651336189)

>>> rng = np.random.default_rng()
>>> xint = rng.integers(10, size=(100, 2))
>>> res = stats.spearmanr(xint)
>>> res.statistic, res.pvalue
(0.09800224850707953, 0.3320271757932076)

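For small samples, consider performing a permutation test instead of relying on the asymptotic p-value. Note that to calculate the null distribution of the statistic (for all possible pairings between observations in samples x and y), only one of the two inputs needs to be permuted.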

>>> x = [1.76405235, 0.40015721, 0.97873798,
...      2.2408932, 1.86755799, -0.97727788]
>>> y = [2.71414076, 0.2488, 0.87551913,
...      2.6514917, 2.01160156, 0.47699563]

>>> def statistic(x): # permute only `x`
...     return stats.spearmanr(x, y).statistic
>>> res_exact = stats.permutation_test((x,), statistic,
...     permutation_type='pairings')
>>> res_asymptotic = stats.spearmanr(x, y)
>>> res_exact.pvalue, res_asymptotic.pvalue # asymptotic pvalue is too low
(0.10277777777777777, 0.07239650145772594)

For a more detailed example, see Spearman correlation coefficient.