Morgoth is a framework for flexible anomaly detection algorithms, packaged for use with Kapacitor.
Morgoth provides a framework for implementing the smaller pieces of an anomaly detection
problem.
The basic framework is that Morgoth maintains a dictionary of normal behaviors and compares new
windows of data to the normal dictionary. If the new window of data is not found in the dictionary
then it is considered anomalous.
Morgoth uses algorithms, called fingerprinters, to compare windows of data and determine whether
they are similar. The Lossy Counting Algorithm (LCA) is used to maintain the dictionary of normal
windows. The LCA is a space-efficient algorithm that can account for drift in the normal
dictionary; more on the LCA below.
Morgoth uses a consensus model where each fingerprinter votes on whether it thinks the current
window is anomalous. If the percentage of anomalous votes is greater than a consensus threshold,
then the window is considered anomalous.
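To make the voting step concrete, here is a minimal sketch in Go. The Fingerprinter interface and
the IsAnomalous function are illustrative assumptions, not Morgoth's actual API:

    package morgoth

    // Fingerprinter reports whether a window of data matches data it has
    // fingerprinted before. (Illustrative interface, not Morgoth's API.)
    type Fingerprinter interface {
    	IsMatch(window []float64) bool
    }

    // IsAnomalous asks every fingerprinter to vote and flags the window as
    // anomalous when the fraction of "no match" votes exceeds the consensus
    // threshold, a value in [0, 1].
    func IsAnomalous(fps []Fingerprinter, window []float64, consensus float64) bool {
    	votes := 0
    	for _, fp := range fps {
    		if !fp.IsMatch(window) {
    			votes++
    		}
    	}
    	return float64(votes)/float64(len(fps)) > consensus
    }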
Fingerprinters
A fingerprinter is a method that can determine if a window of data is similar to a previously seen
window of data. In effect, the fingerprinters take fingerprints of the incoming data and can
compare fingerprints of new data to see if they match. These fingerprinting algorithms provide the
core of Morgoth, as they are the means by which Morgoth determines whether a window of data is new
or something already observed.
An example fingerprinting algorithm is a sigma algorithm that computes the mean and standard
deviation of a window and stores them as the fingerprint for the window. When a new window arrives,
it compares the fingerprint (mean, stddev) of the new window to that of the previous window. If the
windows are too far apart, then they are not considered a match.
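A minimal sketch of such a sigma fingerprinter in Go, implementing the Fingerprinter interface
sketched earlier; the type name, the seeding behavior, and the limit on how many standard
deviations apart two window means may be are all illustrative assumptions:

    package morgoth

    import "math"

    // SigmaFingerprinter remembers the mean and standard deviation of the
    // last matched window as its fingerprint. (Illustrative, not Morgoth's
    // actual implementation.)
    type SigmaFingerprinter struct {
    	mean, stddev float64
    	seeded       bool
    	limit        float64 // max allowed deviations between window means
    }

    // IsMatch compares the new window's mean against the stored fingerprint:
    // means more than limit standard deviations apart are not a match.
    func (s *SigmaFingerprinter) IsMatch(window []float64) bool {
    	mean, stddev := meanStddev(window)
    	if !s.seeded {
    		s.mean, s.stddev, s.seeded = mean, stddev, true
    		return true
    	}
    	if math.Abs(mean-s.mean) > s.limit*s.stddev {
    		return false
    	}
    	s.mean, s.stddev = mean, stddev // update the fingerprint
    	return true
    }

    // meanStddev computes the mean and population standard deviation of a window.
    func meanStddev(window []float64) (mean, stddev float64) {
    	for _, v := range window {
    		mean += v
    	}
    	mean /= float64(len(window))
    	var sq float64
    	for _, v := range window {
    		sq += (v - mean) * (v - mean)
    	}
    	return mean, math.Sqrt(sq / float64(len(window)))
    }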
By defining several fingerprinting algorithms, Morgoth can decide whether new data is anomalous or
normal.
Lossy Counting Algorithm
The LCA counts frequent items in a stream of data. It is lossy because, to conserve space, it drops
less frequent items. The result is that the algorithm will find the frequent items but may lose
track of the less frequent ones. More on the specific mathematical properties of the algorithm can
be found below.
There are two parameters to the algorithm: error tolerance (e) and minimum support (m). First, e is
in the range [0, 1] and is an error bound, interpreted as a percentage. For example, given
e = 0.01 (1%), items that occur in less than 1% of the data set can be dropped. Decreasing e
requires more space but keeps track of less frequent items; increasing e requires less space but
loses track of less frequent items. Second, m is in the range [0, 1] and is a minimum support such
that items considered frequent occur with frequency at least m. For example, if m = 0.05 (5%), then
an item with support less than 5% is not considered frequent, i.e. normal. The minimum support
becomes the threshold at which items are considered anomalous.
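The bookkeeping behind these parameters can be sketched briefly. The following Go sketch follows
Manku and Motwani's formulation of lossy counting: items are counted in buckets of width 1/e, and
at each bucket boundary any entry whose count plus maximum possible undercount no longer exceeds
the current bucket id is dropped. The LossyCounter type and method names are illustrative, not
Morgoth's internals:

    package morgoth

    import "math"

    type entry struct {
    	count int // observed frequency f
    	delta int // maximum possible undercount
    }

    // LossyCounter maintains approximate frequencies of items in a stream.
    // (Illustrative sketch of the Lossy Counting Algorithm.)
    type LossyCounter struct {
    	e, m    float64 // error tolerance and minimum support
    	width   int     // bucket width, ceil(1/e)
    	bucket  int     // current bucket id
    	n       int     // items seen so far (N)
    	entries map[string]entry
    }

    func NewLossyCounter(e, m float64) *LossyCounter {
    	return &LossyCounter{
    		e: e, m: m,
    		width:   int(math.Ceil(1 / e)),
    		bucket:  1,
    		entries: make(map[string]entry),
    	}
    }

    // Count records one occurrence of item, pruning infrequent entries at
    // bucket boundaries; the pruning is what makes the algorithm lossy.
    func (lc *LossyCounter) Count(item string) {
    	lc.n++
    	if ent, ok := lc.entries[item]; ok {
    		ent.count++
    		lc.entries[item] = ent
    	} else {
    		lc.entries[item] = entry{count: 1, delta: lc.bucket - 1}
    	}
    	if lc.n%lc.width == 0 { // bucket boundary
    		for id, ent := range lc.entries {
    			if ent.count+ent.delta <= lc.bucket {
    				delete(lc.entries, id)
    			}
    		}
    		lc.bucket++
    	}
    }

    // IsFrequent reports whether item has estimated support of at least m,
    // i.e. whether it is still considered normal.
    func (lc *LossyCounter) IsFrequent(item string) bool {
    	ent, ok := lc.entries[item]
    	return ok && float64(ent.count) >= (lc.m-lc.e)*float64(lc.n)
    }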
Notice that m > e; this is so that we reduce the number of false positives. For example, say we set
e = 5% and m = 5%. If a normal behavior X has a true frequency of 6%, then based on variations in
the true frequency, X might fall below 5% for a small interval and be dropped. This will cause X's
frequency to be underestimated, which will cause it to be flagged as an anomaly, triggering a false
positive. By setting e < m we create a buffer that helps mitigate false positives.
Properties
The Lossy Counting algorithm has three properties:
there are no false negatives,
false positives are guaranteed to have a frequency of at least (m - e)N,
the frequency of an item can be underestimated by at most eN,
where N is the number of items encountered.
The space requirement for the algorithm is at most (1 / e) log(eN). It has also been shown that if
the items with low frequency are uniformly random, then the space requirement is no more than
7 / e. This means that as Morgoth continues to process windows of data, its memory usage grows with
the log of the number of windows and can reach a stable upper bound.
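For instance, with e = 0.01 and N = 1,000,000 items, the bound works out to
(1 / 0.01) * log(0.01 * 1,000,000) = 100 * log(10,000), roughly 921 entries (assuming a natural
logarithm), regardless of how many distinct windows appeared in the stream.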