LASTS is a method to interpret the result $y = b(x)$ of classification by a blackbox function $b$ for an instance $x$, where $x$ is a time series sample and $y$ is a classification label.

The Variational Autoencoder

LASTS uses a variational autoencoder (VAE) comprising an encoder $z = \mathrm{enc}(x)$ and a decoder $\hat{x} = \mathrm{dec}(z)$. Here $z$ is effectively a “latent space” representation of $x$, and $\hat{x}$ is the decoder's approximation of $x$. This model minimizes a loss function $\mathcal{L}$ in order to ensure limited difference between $x$ and $\hat{x}$ while finding a good value set for $z$.
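For a VAE, $\mathcal{L}$ is typically the sum of a reconstruction term and a KL regulariser on the latent distribution (the exact form used by LASTS may differ, but this is the standard objective):

$$\mathcal{L}(x) = \lVert x - \hat{x} \rVert^2 + \mathrm{KL}\big(q(z \mid x)\,\Vert\,\mathcal{N}(0, I)\big)$$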

The usefulness of $z$ is that it condenses $x$ into a very simple vector. For example, LASTS uses a latent space of just two dimensions for an input sequence of 128 time series values.
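As a concrete illustration, here is a minimal VAE sketch in PyTorch with a 128-step input and a two-dimensional latent space. The layer sizes and architecture are illustrative assumptions, not the ones used by LASTS.

```python
# A minimal VAE sketch: 128-step series -> 2-dim latent -> 128-step reconstruction.
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, seq_len=128, latent_dim=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(seq_len, 32), nn.ReLU())
        self.mu = nn.Linear(32, latent_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(32, latent_dim)  # log-variance of q(z|x)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                                 nn.Linear(32, seq_len))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterisation trick: sample z ~ N(mu, sigma^2)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term keeps x_hat close to x; KL term regularises z.
    recon = ((x - x_hat) ** 2).sum(dim=1)
    kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=1)
    return (recon + kl).mean()
```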

Counterfactual Search (CFS)

Once I have $z$, LASTS uses a neighbourhood generation (NG) method to sample values very close to it. Think of this as a circle / sphere / $k$-dimensional hypersphere drawn around the point $z$ in the latent space, with random values taken from inside it. This neighbourhood $Z$ is then put through the decoder to give a set of time series $\hat{X}$, which is then put through the blackbox to get a set of labels $\hat{Y}$. If any of these labels differ from the original label $y$, we take note of the corresponding points as counterfactuals, store them in a database $C$, and halve the threshold (radius) of the sphere. This is repeated until NO new counterfactuals are found. After this, every point in $C$ is assessed against $z$ via a distance metric, and the one closest to $z$ is considered the closest counterfactual, $z'$, found by this counterfactual search (CFS) algorithm.
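Below is a hedged sketch of this CFS loop. It assumes the decoder and blackbox are plain callables over NumPy arrays and that counterfactuals are tracked as latent points; none of these names come from the LASTS implementation.

```python
import numpy as np

def sample_sphere(z, radius, n=100, rng=None):
    """Sample n points uniformly inside a hypersphere of the given radius around z."""
    rng = rng or np.random.default_rng(0)
    dirs = rng.normal(size=(n, z.shape[0]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)       # random directions
    r = radius * rng.uniform(size=(n, 1)) ** (1 / z.shape[0])  # uniform in volume
    return z + dirs * r

def counterfactual_search(z, y, decoder, blackbox, radius=1.0):
    """Halve the sampling sphere each time counterfactuals appear; stop when
    none are newly found, then return the counterfactual closest to z."""
    C = []  # database of counterfactual latent points
    while True:
        Z_new = sample_sphere(z, radius)
        X_new = decoder(Z_new)           # decode latents to time series
        Y_new = blackbox(X_new)          # classify with the blackbox
        found = [zc for zc, yc in zip(Z_new, Y_new) if yc != y]
        if not found:
            break                        # no new counterfactuals: stop
        C.extend(found)
        radius /= 2                      # halve the sphere and search again
    return min(C, key=lambda zc: np.linalg.norm(zc - z)) if C else None
```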

The Saliency Map, $s$

Once I have $z'$, I am able to put it through the decoder to obtain $\hat{x}' = \mathrm{dec}(z')$, which is effectively the closest sample to $x$ that is of a different class. This allows me to find the specific features within the signal that likely caused the classification to differ. The degree to which each timestep contributes to the class is known as a saliency map, $s$, which can be computed very simply via

$$s_t = \lvert x_t - \hat{x}'_t \rvert$$

The larger the difference at a timestep, the larger the saliency map reading at that timestep.
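In code, assuming `x` and `x_cf` are NumPy arrays holding the original series and the decoded counterfactual, the map is a one-liner:

```python
import numpy as np

def saliency_map(x, x_cf):
    # Larger difference -> larger saliency at that timestep.
    return np.abs(x - x_cf)
```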

Neighbourhood Generation (NG)

Next, LASTS uses another neighbourhood generation (NG) step, but this time around both $z$ and the closest counterfactual $z'$. The rationale for this is that

  1. Values around $z'$ sit right at the decision boundary. This means the likelihood of finding both similarly classified and differently classified samples within a small radius is very high.
  2. Values around $z$ are likely to be similarly classified, since $z$ may be very far from the decision boundary.

This NG provides a new latent neighbourhood $Z$, which I am then able to put through the decoder to obtain a “neighbourhood” of time series $\hat{X}$. I can then use the blackbox $b$ to assign the corresponding labels $\hat{Y}$. This gives me two sets, $\hat{X}_{=}$ and $\hat{X}_{\neq}$: values that are similar to the signal $x$, some classified the same as it and some classified differently.
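A sketch of this step, reusing the hypothetical `sample_sphere`, `decoder`, and `blackbox` helpers from the CFS sketch above and assuming they return NumPy arrays:

```python
import numpy as np

def generate_neighbourhood(z, z_cf, y, decoder, blackbox, radius=0.5):
    """Sample around both z and its closest counterfactual z_cf, then split
    the decoded series into same-class and different-class sets."""
    Z = np.vstack([sample_sphere(z, radius), sample_sphere(z_cf, radius)])
    X = decoder(Z)
    Y = blackbox(X)
    X_eq  = X[Y == y]   # exemplars: classified like the original x
    X_neq = X[Y != y]   # counter-exemplars: classified differently
    return X_eq, X_neq
```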

The Shapelet Tree

After this, LASTS has one last step: figuring out what sort of subsequences are present in certain time series objects that aren’t present in others. This can be visualised as a decision tree deciding whether certain subsequences exist within a sample. The merit of this is that decision tree models are generally considered highly interpretable, unlike most modern machine learning models.

To do this, LASTS utilizes a subsequence transform (the authors implement various methods, including SAX and shapelets). This transform breaks down our signals into an array of 0s and 1s that represents the presence of certain subsequences. These arrays have a fixed length across all samples, and can be used alongside the labels $\hat{Y}$ to train a Decision Tree Classifier, which we can call a Shapelet Tree.
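Here is a minimal sketch of a presence-based subsequence transform feeding a scikit-learn decision tree. The sliding-window distance test, the candidate shapelets, and the threshold are illustrative assumptions, not the LASTS implementation of SAX or shapelets.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def contains(series, shapelet, threshold):
    """1 if some window of `series` is within `threshold` of `shapelet`, else 0."""
    w = len(shapelet)
    dists = [np.linalg.norm(series[i:i + w] - shapelet)
             for i in range(len(series) - w + 1)]
    return int(min(dists) < threshold)

def shapelet_transform(X, shapelets, threshold=1.0):
    """Map each series to a fixed-length 0/1 presence vector."""
    return np.array([[contains(x, s, threshold) for s in shapelets] for x in X])

# Fit the "shapelet tree" on the transformed neighbourhood and its labels.
# X_nbr, Y_nbr, and shapelets would come from the earlier steps.
# features = shapelet_transform(X_nbr, shapelets)
# tree = DecisionTreeClassifier().fit(features, Y_nbr)
```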

Quick Reference