
A Big List of Time-Series Anomaly Detection Tools and Datasets

极市平台 (Jishi Platform), 2021-01-26

Author | rob-med

Editor | 小极

Source | https://github.com/rob-med/awesome-TS-anomaly-detection

Original article | https://zhuanlan.zhihu.com/p/57432180

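[Overview] This post shares a big list of tools and datasets for time-series anomaly detection, covering anomaly detection software, related software, and benchmark datasets. GitHub: https://github.com/rob-med/awesome-TS-anomaly-detection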

Anomaly Detection Software

| Name | Language | Pitch | License |
|---|---|---|---|
| Numenta's NuPIC | C++ | Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM). | AGPL |
| Etsy's Skyline | Python | Skyline is a real-time anomaly detection system, built to enable passive monitoring of hundreds of thousands of metrics. | MIT |
| Twitter's AnomalyDetection | R | AnomalyDetection is an open-source R package for detecting anomalies that is statistically robust in the presence of seasonality and an underlying trend. | GPL |
| Netflix's Surus | Java | Robust Anomaly Detection (RAD) - An implementation of the Robust PCA. | Apache-2.0 |
| Lytics Anomalyzer | Go | Anomalyzer implements a suite of statistical tests that yield the probability that a given set of numeric input, typically a time series, contains anomalous behavior. | Apache-2.0 |
| Yahoo's EGADS | Java | EGADS is a library that contains a number of anomaly detection techniques applicable to many use cases in a single package, with Java as its only dependency. | GPL |
| LinkedIn's luminol | Python | Luminol is a lightweight Python library for time series data analysis. Its two major functionalities are anomaly detection and correlation. It can be used to investigate possible causes of an anomaly. | Apache-2.0 |
| Ele.me's banshee | Go | Anomaly detection system for periodic metrics. | MIT |
| Mentat's datastream.io | Python | An open-source framework for real-time anomaly detection using Python, Elasticsearch and Kibana. | Apache-2.0 |
| Donut | Python | Donut is an unsupervised anomaly detection algorithm for seasonal KPIs, based on Variational Autoencoders. | - |
| NASA's Telemanom | Python | A framework for using LSTMs to detect anomalies in multivariate time series data. Includes spacecraft anomaly data and experiments from the Mars Science Laboratory and SMAP missions. | custom |
| banpei | Python | Outlier detection (Hotelling's theory) and change point detection (singular spectrum transformation) for time series. | MIT |
| CAD | Python | Contextual Anomaly Detection for real-time AD on streaming data (winner algorithm of the 2016 NAB competition). | AGPL |
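To make the "Hotelling's theory" entry above more concrete, here is a minimal sketch of a univariate Hotelling-style outlier test. It is not banpei's API: the function names, the sample series, and the choice of a 99% chi-square threshold are all assumptions made for illustration. Each point is scored by its squared deviation from the sample mean divided by the sample variance, and points whose score exceeds the chi-square quantile (1 degree of freedom) are flagged.

```python
import numpy as np

# 0.99 quantile of the chi-square distribution with 1 degree of freedom
# (assumed threshold for this sketch; pick a level that suits your data).
CHI2_1DOF_99 = 6.635


def hotelling_scores(values):
    """Score each point as (x - mean)^2 / variance over the whole series."""
    x = np.asarray(values, dtype=float)
    var = x.var()  # population variance (ddof=0)
    if var == 0.0:
        # A constant series has no outliers under this score.
        return np.zeros_like(x)
    return (x - x.mean()) ** 2 / var


def hotelling_outliers(values, threshold=CHI2_1DOF_99):
    """Return the indices of points whose score exceeds the threshold."""
    scores = hotelling_scores(values)
    return np.flatnonzero(scores > threshold).tolist()


# Hypothetical sample data: a flat series with one spike at index 6.
series = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 25.0, 10.1, 9.9, 10.2, 10.0, 9.8]
print(hotelling_outliers(series))
```

On this sample series only the spike at index 6 is flagged. Note that the mean and variance are estimated from the same data being tested, so a large outlier inflates the variance and can mask smaller ones; estimating them on a clean reference window is a common alternative.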

Related Software

This section includes some time-series software for anomaly detection-related tasks, such as forecasting and labeling.

Forecasting

| Name | Language | Pitch | License |
|---|---|---|---|
| Facebook's Prophet | Python/R | Prophet is a procedure for forecasting time series data. It is based on an additive model where non-linear trends are fit with yearly and weekly seasonality, plus holidays. | BSD |
| PyFlux | Python | The library has a good array of modern time series models, as well as a flexible array of inference options (frequentist and Bayesian) that can be applied to these models. | BSD 3-Clause |
| Pyramid | Python | Port of R's auto.arima with a scikit-learn-friendly interface. | MIT |
| SaxPy | Python | General implementation of SAX, as well as HOTSAX for anomaly detection. | GPLv2.0 |
| tslearn | Python | tslearn is a Python package that provides machine learning tools for the analysis of time series. This package builds on the scikit-learn, numpy and scipy libraries. | BSD 2-Clause |
| seglearn | Python | Seglearn is a Python package for machine learning on time series or sequences. It provides an integrated pipeline for segmentation, feature extraction, feature processing, and a final estimator. | BSD 3-Clause |
| Tigramite | Python | Tigramite is a causal time series analysis Python package. It allows efficient reconstruction of causal graphs from high-dimensional time series datasets and modeling of the obtained causal dependencies for causal mediation and prediction analyses. | GPLv3.0 |
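The SAX (Symbolic Aggregate approXimation) representation mentioned in the SaxPy entry can be sketched in a few lines. This is an illustrative standard-library sketch, not SaxPy's API: the function names are hypothetical, the alphabet size is fixed at 4, and the series length is assumed to be divisible by the number of segments. The series is z-normalized, reduced with Piecewise Aggregate Approximation (PAA), and each segment mean is mapped to a letter using breakpoints that split a standard normal distribution into equiprobable regions.

```python
import bisect
import math

# Breakpoints that split a standard normal distribution into 4 regions
# of equal probability (approximately the 25%, 50% and 75% quantiles).
BREAKPOINTS_4 = (-0.6745, 0.0, 0.6745)
ALPHABET = "abcd"


def znorm(values):
    """Z-normalize a series; a constant series maps to all zeros."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        return [0.0] * n
    return [(v - mean) / std for v in values]


def paa(values, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment."""
    if len(values) % segments != 0:
        raise ValueError("this sketch assumes len(values) is divisible by segments")
    width = len(values) // segments
    return [sum(values[i * width:(i + 1) * width]) / width for i in range(segments)]


def sax_word(values, segments):
    """Convert a numeric series into a SAX word over the alphabet 'abcd'."""
    letters = []
    for segment_mean in paa(znorm(values), segments):
        # Number of breakpoints <= segment_mean selects the letter.
        letters.append(ALPHABET[bisect.bisect_right(BREAKPOINTS_4, segment_mean)])
    return "".join(letters)


# Hypothetical sample data: a steadily rising series.
print(sax_word([1, 1, 2, 2, 3, 3, 4, 4], 4))
```

For the rising sample series this prints `abcd`. HOTSAX builds on such words to search for the most unusual subsequence (the discord); that search is not shown here.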

Labeling

| Name | Language | Pitch | License |
|---|---|---|---|
| Microsoft's Taganomaly | R (dockerized web app) | Simple tool for tagging time series data. Works for univariate and multivariate data, provides a reference anomaly prediction using Twitter's AnomalyDetection package. | MIT |
| Baidu's Curve | Python | Curve is an open-source tool to help label anomalies on time-series data. | Apache-2.0 |

Benchmark Datasets

  • Numenta's NAB

NAB is a novel benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It comprises over 50 labeled real-world and artificial time-series data files, plus a novel scoring mechanism designed for real-time applications.

  • Yahoo's Webscope S5

The dataset consists of real and synthetic time-series with tagged anomaly points. The dataset tests the detection accuracy of various anomaly-types including outliers and change-points.
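When a dataset provides tagged anomaly points, a simple way to measure detection accuracy is point-wise precision, recall and F1. The sketch below assumes that both the labels and the detections are given as collections of time indices; the function name and the sample indices are hypothetical. This is neither NAB's scoring mechanism nor an official evaluation protocol for either dataset, only a plain exact-match baseline.

```python
def point_wise_scores(true_points, predicted_points):
    """Return (precision, recall, f1) for exact index matches."""
    true_set = set(true_points)
    pred_set = set(predicted_points)
    true_positives = len(true_set & pred_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(true_set) if true_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 3 tagged anomalies, 4 detections, 2 of them correct.
tagged = [3, 7, 12]
detected = [3, 8, 12, 20]
precision, recall, f1 = point_wise_scores(tagged, detected)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact index matching is strict: a detection that fires one step late counts as both a false positive and a missed anomaly, which is one reason benchmarks aimed at streaming use window-based scoring instead.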
