scipy.stats.anderson

scipy.stats.anderson(x, dist='norm')

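Anderson-Darling test for data coming from a particular distribution.

The Anderson-Darling test tests the null hypothesis that a sample is drawn from a population that follows a particular distribution. The critical values of the test depend on which distribution is being tested against. This function works for the normal, exponential, logistic, weibull_min, and Gumbel (Extreme Value Type I) distributions.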

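Parameters:

x (array_like): Array of sample data.
dist ({'norm', 'expon', 'logistic', 'gumbel', 'gumbel_l', 'gumbel_r', 'extreme1', 'weibull_min'}, optional): The type of distribution to test against. The default is 'norm'. The names 'extreme1', 'gumbel_l', and 'gumbel' are synonyms for the same distribution.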
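The synonym behaviour described for `dist` can be checked directly. The following is a minimal sketch: the seed, the sample size, and the choice to draw the sample from `scipy.stats.gumbel_l` are illustrative assumptions, not part of the documented API.

```python
import numpy as np
from scipy import stats

# Illustrative sample from a left-skewed Gumbel distribution (arbitrary seed).
rng = np.random.default_rng(12345)
sample = stats.gumbel_l.rvs(size=200, random_state=rng)

# Per the parameter description, these three names select the same distribution.
statistics = {
    name: stats.anderson(sample, dist=name).statistic
    for name in ("gumbel", "gumbel_l", "extreme1")
}
for name, value in statistics.items():
    print(f"{name:>9}: A2 = {value:.6f}")
```

All three printed statistics should agree, whereas `dist='gumbel_r'` tests against the right-skewed variant and will generally give a different value.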

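Returns:

result (AndersonResult)

An object with the following attributes:

statistic (float): The Anderson-Darling test statistic.
critical_values (list): The critical values for this distribution.
significance_level (list): The significance levels for the corresponding critical values, in percent. The function returns critical values for a different set of significance levels depending on the distribution being tested against.
fit_result (FitResult): An object containing the results of fitting the distribution to the data.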
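As a rough illustration of how these attributes relate to one another, the sketch below runs the test on synthetic normal data and prints each attribute. The seed, sample size, and distribution parameters are arbitrary assumptions, and it presumes a SciPy version whose result object exposes `fit_result` as documented here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)  # arbitrary seed for reproducibility
sample = rng.normal(loc=3.0, scale=2.0, size=100)

res = stats.anderson(sample, dist="norm")

print("statistic:", res.statistic)

# critical_values and significance_level are aligned element by element.
for level, critical in zip(res.significance_level, res.critical_values):
    print(f"{level:>5}% -> critical value {critical}")

# fit_result.params holds the fitted parameters (loc and scale for 'norm').
print("fitted parameters:", res.fit_result.params)
```

For `dist='norm'` the fitted location is the sample mean, so the first fitted parameter should be close to the `loc` used to generate the data.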

See also

kstest: The Kolmogorov-Smirnov test for goodness of fit.

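Notes

Critical values are provided for the following significance levels:

normal/exponential: 15%, 10%, 5%, 2.5%, 1%
logistic: 25%, 10%, 5%, 2.5%, 1%, 0.5%
gumbel_l / gumbel_r: 25%, 10%, 5%, 2.5%, 1%
weibull_min: 50%, 25%, 15%, 10%, 5%, 2.5%, 1%, 0.5%

If the returned statistic is larger than one of these critical values, then the null hypothesis that the data come from the chosen distribution can be rejected at the corresponding significance level. The returned statistic is referred to as 'A2' in the references.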
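The decision rule above can be sketched as a small helper that compares the statistic with each critical value. The helper name `rejected_levels`, the seed, and the synthetic exponential data are hypothetical choices made for illustration only; they are not part of SciPy.

```python
import numpy as np
from scipy import stats


def rejected_levels(result):
    """Hypothetical helper: significance levels (percent) where H0 is rejected."""
    return [
        float(level)
        for level, critical in zip(result.significance_level,
                                   result.critical_values)
        if result.statistic > critical
    ]


rng = np.random.default_rng(0)      # arbitrary seed
skewed = rng.exponential(size=500)  # strongly non-normal data

res = stats.anderson(skewed, dist="norm")
print("A2 =", res.statistic)
print("rejected at (percent):", rejected_levels(res))
```

Because a sample this large and this skewed is far from normal, the statistic exceeds every tabulated critical value and the null hypothesis is rejected at all five levels.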

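For weibull_min, maximum likelihood estimation is known to be challenging. If the test returns successfully, then the first-order conditions for a maximum likelihood estimate have been verified and the critical values correspond relatively well to the significance levels, provided that the sample is sufficiently large (>10 observations [7]). However, for some data, especially data with no left tail, anderson is likely to result in an error message. In this case, consider performing a custom goodness-of-fit test using scipy.stats.monte_carlo_test.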
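One way to handle the failure mode described for weibull_min is to guard the call. This is only a sketch: the helper name `try_weibull_anderson` is hypothetical, the synthetic data and seed are arbitrary, and catching `ValueError` is an assumption about which exception a failed fit raises.

```python
import numpy as np
from scipy import stats


def try_weibull_anderson(sample):
    """Hypothetical helper: return the anderson result, or None on failure."""
    try:
        return stats.anderson(sample, dist="weibull_min")
    except ValueError as exc:  # assumed exception type for a failed fit
        print("anderson could not complete:", exc)
        return None


rng = np.random.default_rng(7)  # arbitrary seed
sample = stats.weibull_min.rvs(2.0, loc=10.0, scale=5.0, size=200,
                               random_state=rng)

res = try_weibull_anderson(sample)
if res is not None:
    print("A2 =", res.statistic)
    print("fitted (c, loc, scale):", tuple(res.fit_result.params))
```

When the helper returns None, the paragraph above suggests falling back to a custom goodness-of-fit test built with scipy.stats.monte_carlo_test.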

References

[1]

https://www.itl.nist.gov/div898/handbook/prc/section2/prc213.htm

[2]

Stephens, M. A. (1974). EDF Statistics for Goodness of Fit and Some Comparisons, Journal of the American Statistical Association, Vol. 69, pp. 730-737.

[3]

Stephens, M. A. (1976). Asymptotic Results for Goodness-of-Fit Statistics with Unknown Parameters, Annals of Statistics, Vol. 4, pp. 357-369.

[4]

Stephens, M. A. (1977). Goodness of Fit for the Extreme Value Distribution, Biometrika, Vol. 64, pp. 583-588.

[5]

Stephens, M. A. (1977). Goodness of Fit with Special Reference to Tests for Exponentiality, Technical Report No. 262, Department of Statistics, Stanford University, Stanford, CA.

[6]

Stephens, M. A. (1979). Tests of Fit for the Logistic Distribution Based on the Empirical Distribution Function, Biometrika, Vol. 66, pp. 591-595.

[7]

Lockhart, R. A. and Stephens, M. A. (1994). Estimation and Tests of Fit for the Three-Parameter Weibull Distribution, Journal of the Royal Statistical Society. Series B (Methodological), Vol. 56, No. 3, pp. 491-500, Table 0.

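Examples

Test the null hypothesis that a random sample was drawn from a normal distribution (with unspecified mean and standard deviation).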

>>> import numpy as np
>>> from scipy.stats import anderson
>>> rng = np.random.default_rng()
>>> data = rng.random(size=35)
>>> res = anderson(data)
>>> res.statistic
0.8398018749744764
>>> res.critical_values
array([0.527, 0.6  , 0.719, 0.839, 0.998])
>>> res.significance_level
array([15. , 10. ,  5. ,  2.5,  1. ])

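The value of the statistic (barely) exceeds the critical value associated with a significance level of 2.5%, so the null hypothesis may be rejected at a significance level of 2.5%, but not at a significance level of 1%.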