Learned Pattern Similarity (LPS)

This is a supporting page for our paper, Time series similarity based on a pattern-based representation (LPS),

by Mustafa Gokce Baydogan and George Runger
* This study was presented at INFORMS 2013 in Minneapolis. The presentation is available here.
* The paper was submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence on September 23, 2013 and rejected on June 1, 2014, with one resubmit-as-new, one major-revision, and one reject decision. We will start working on the revisions soon.
* During the review process, we came up with a more robust version of LPS that does not require tuning of any parameters. The LPS random column shows these results. The descriptions of the LPS versions are provided below the results table.
 


DATASETS
We test our proposed approach on 45 datasets from the UCR time series database. The table below lists the dataset characteristics (number of classes, length of time series, number of training and test instances). Test error rates are provided for nearest-neighbor classifiers with Euclidean distance and dynamic time warping (NNDTWBest and NNDTWNoWin), a nearest-neighbor classifier with sparse spatial sample kernels (NNSSSK), and SPatial Assembly DistancE (SpADe). LPS error rates (averaged over 10 replications) are also provided.
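For reference, here is a minimal R sketch of the nearest-neighbor baseline with Euclidean distance (the Euclidean column in the table). The `train`/`test` matrices and the first-column label layout are assumptions matching the usual UCR text files, not part of the LPS code.

```r
# Minimal sketch of a 1-NN classifier with Euclidean distance.
# 'train' and 'test' are assumed to be numeric matrices with one time series
# per row; the first column is assumed to hold the class label (the usual
# layout of the UCR text files).
nn_euclidean_error <- function(train, test) {
  train_labels <- train[, 1]
  test_labels  <- test[, 1]
  train_series <- as.matrix(train[, -1])
  test_series  <- as.matrix(test[, -1])

  predictions <- apply(test_series, 1, function(x) {
    # squared Euclidean distance to every training series
    dists <- rowSums(sweep(train_series, 2, x)^2)
    train_labels[which.min(dists)]
  })

  mean(predictions != test_labels)   # error rate, as reported in the table
}
```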
 
Dataset            Classes  Train  Test  Length  LPS^ obs (R)  LPS^ obs+diff (R)  LPS^ random (MATLAB)  Euclidean  DTW Best  DTW NoWin  SpADe  SSSK
50Words 50 450 455 270 0.226 0.207 0.183 0.369 0.242 0.310 0.215 0.488
Adiac 37 390 391 176 0.359 0.229 0.223 0.389 0.391 0.396 0.319 0.575
Beef 5 30 30 470 0.440 0.327 0.287 0.467 0.467 0.500 0.300 0.633
CBF 3 30 900 128 0.001 0.001 0.000 0.148 0.004 0.003 0.020 0.090
Coffee 2 28 28 286 0.071 0.039 0.029 0.250 0.179 0.179 0.036 0.071
ECG 2 100 100 96 0.143 0.155 0.146 0.120 0.120 0.230 0.130 0.220
Face (all) 14 560 1690 131 0.215 0.246 0.217 0.289 0.192 0.192 0.214 0.369
Face (four) 4 24 88 350 0.047 0.050 0.045 0.216 0.114 0.170 0.034 0.102
Fish 7 175 175 463 0.095 0.054 0.059 0.217 0.160 0.167 0.017 0.177
Gun-Point 2 50 150 150 0.035 0.002 0.005 0.087 0.087 0.093 0.007 0.133
Lighting-2 2 60 61 637 0.211 0.189 0.244 0.246 0.131 0.131 0.278 0.393
Lighting-7 7 70 73 319 0.359 0.349 0.336 0.425 0.288 0.274 0.315 0.438
OliveOil 4 30 30 570 0.120 0.123 0.117 0.133 0.167 0.133 0.267 0.300
OSU Leaf 6 200 242 427 0.299 0.259 0.238 0.483 0.384 0.409 0.132 0.326
Swedish Leaf 15 500 625 128 0.111 0.079 0.075 0.213 0.157 0.210 0.125 0.339
Synt. Control 6 300 300 60 0.018 0.023 0.026 0.120 0.017 0.007 0.080 0.067
Trace 4 100 100 275 0.062 0.025 0.031 0.240 0.010 0.000 0.000 0.300
Two Patterns 4 1000 4000 128 0.026 0.012 0.018 0.090 0.002 0.000 0.005 0.087
Wafer 2 1000 6174 152 0.009 0.003 0.003 0.005 0.005 0.020 0.012 0.029
Yoga 2 300 3000 426 0.148 0.127 0.125 0.170 0.155 0.164 0.123 0.172
ChlorineConc. 3 467 3840 166 0.382 0.329 0.326 0.350 0.350 0.352 0.377 0.428
CinC_ECG_torso 4 40 1380 1639 0.251 0.175 0.197 0.103 0.070 0.349 0.121 0.438
Cricket_X 12 390 390 300 0.280 0.262 0.260 0.426 0.236 0.223 0.251 0.585
Cricket_Y 12 390 390 300 0.252 0.238 0.209 0.356 0.197 0.208 0.262 0.654
Cricket_Z 12 390 390 300 0.254 0.234 0.242 0.380 0.180 0.208 0.295 0.574
DiatomSize 4 16 306 345 0.034 0.037 0.043 0.065 0.065 0.033 0.069 0.173
ECGFiveDays 2 23 861 136 0.173 0.104 0.122 0.203 0.203 0.232 0.167 0.360
FacesUCR 14 200 2050 131 0.048 0.070 0.069 0.231 0.088 0.095 0.098 0.356
Haptics 5 155 308 1092 0.609 0.594 0.555 0.630 0.588 0.623 0.630 0.591
InlineSkate 7 100 550 1882 0.558 0.488 0.525 0.658 0.613 0.616 0.560 0.729
ItalyPowerDemand 2 67 1029 24 0.073 0.063 0.082 0.045 0.045 0.050 0.191 0.101
MALLAT 8 55 2345 1024 0.091 0.101 0.093 0.086 0.086 0.066 0.199 0.153
MedicalImages 10 381 760 99 0.327 0.265 0.249 0.316 0.253 0.263 0.376 0.463
MoteStrain 2 20 1252 84 0.116 0.079 0.084 0.121 0.134 0.165 0.085 0.166
SonyRobot 2 20 601 70 0.241 0.228 0.105 0.305 0.305 0.275 0.351 0.376
SonyRobotII 2 27 953 65 0.114 0.113 0.093 0.141 0.141 0.169 0.201 0.339
StarLightCurves 3 1000 8236 1024 0.108 0.036 0.035 0.151 0.095 0.093 0.103 0.135
Symbols 6 25 995 398 0.087 0.039 0.031 0.100 0.062 0.050 0.034 0.184
TwoLeadECG 2 23 1139 82 0.083 0.018 0.004 0.253 0.132 0.096 0.018 0.257
uWaveGesture_X 8 896 3582 315 0.225 0.186 0.175 0.261 0.227 0.273 0.254 0.358
uWaveGesture_Y 8 896 3582 315 0.317 0.251 0.240 0.338 0.301 0.366 0.329 0.493
uWaveGesture_Z 8 896 3582 315 0.305 0.239 0.227 0.350 0.322 0.342 0.340 0.439
WordsSynonyms 25 267 638 270 0.278 0.268 0.242 0.382 0.252 0.351 0.225 0.553
Thorax1 42 1800 1965 750 0.216 0.198 0.193 0.171 0.185 0.209 * 0.362
Thorax2 42 1800 1965 750 0.179 0.158 0.150 0.120 0.129 0.135 * 0.315

^ There are two implementations of LPS. The R implementation is the one provided on this website; a MATLAB implementation is also available. Please read the detailed explanations below.

* indicates that the algorithm in the corresponding column did not return a result within 7 days of running time.

Each LPS column reports the average error rate over 10 replications. The three versions of LPS are described as follows:

obs: segments extracted from the observed values are used.

obs+diff: segments extracted from the observed values and from the differences between consecutive time points are used.

These two columns report the results from the original submission to TPAMI. They are from the R implementation; you can replicate them using the R code described below. For these results, the best depth and segment length factor are selected by cross-validation on the training data (comparable to NNDTW with the best warping window).
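A hedged sketch of this parameter search in R is given below; the candidate grid values and the `lps_cv_error` helper are hypothetical stand-ins (the actual function names of the LPS package are described in the blog entry linked below), shown only to illustrate that the parameters are chosen on the training data alone.

```r
# Hypothetical illustration of the cross-validated parameter search used for
# the 'obs' and 'obs+diff' columns. 'lps_cv_error' is NOT a function of the
# LPS package; it stands in for a call that returns the cross-validation
# error on the training data for the given parameters.
depths          <- c(2, 4, 6, 8)        # assumed candidate tree depths
segment_factors <- c(0.25, 0.5, 0.75)   # assumed segment length factors

grid <- expand.grid(depth = depths, seg_factor = segment_factors)
cv_error <- apply(grid, 1, function(p) {
  lps_cv_error(train, depth = p["depth"], seg_factor = p["seg_factor"])
})
best_params <- grid[which.min(cv_error), ]   # parameters then used on the test set
```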

random: the depth and segment length parameters are dropped by selecting a random segment length for each tree and growing a full tree. The only required parameter is the number of trees. We also introduced another parameter that provides faster computation: the number of segments evaluated for each tree. When it is set large enough, this parameter is not a concern. This version is implemented in MATLAB, which is also described in the codes section below. It selects a random segment length between 0.1 and 0.9 times the time series length for each of 200 trees, and only 10 segments are considered per tree. The parameters are kept the same for all the datasets, which illustrates the robustness of the proposed approach. These values are hard-coded in the MATLAB implementation described below.
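A minimal R sketch of the randomized choices described above (the actual version is implemented in MATLAB); `train_series` and the `grow_full_tree` learner are hypothetical stand-ins used only to show how the per-tree parameters are drawn:

```r
# Illustration of the 'random' version's per-tree parameter choices.
# 'grow_full_tree' is a hypothetical stand-in for the tree learner used
# inside LPS; 'train_series' is assumed to hold one series per row.
series_length <- ncol(train_series)
n_trees       <- 200   # the only real parameter
n_segments    <- 10    # candidate segments evaluated per tree

trees <- lapply(seq_len(n_trees), function(i) {
  # segment length drawn between 0.1 and 0.9 of the series length
  seg_len <- sample(seq(floor(0.1 * series_length),
                        floor(0.9 * series_length)), 1)
  # only a small number of candidate segment positions per tree
  seg_starts <- sample(seq_len(series_length - seg_len + 1),
                       n_segments, replace = TRUE)
  grow_full_tree(train_series, seg_len, seg_starts)   # hypothetical learner
})
```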

CODES

We implemented LPS as an R package. The source files are provided in the files section (this is the submitted version; an R package was recently submitted to CRAN). You need to install R first. The package was compiled and tested on a 64-bit Linux environment (Ubuntu 13.04) and more recently on a 64-bit Windows 8 system (Oct 14, 2013).

Install it from the local file with 'R CMD INSTALL yourfoldergoeshere/LPS_1.0.tar.gz'.
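Once installed, the package can be loaded in an R session as usual; we assume the package name follows the tarball name (LPS_1.0.tar.gz), and the exported functions are described in the blog entry linked below.

```r
# Assumes the package name matches the tarball name (LPS_1.0.tar.gz).
library(LPS)
```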

We also recently made a MATLAB implementation of LPS available. The source file is provided in the same folder mentioned above. This version differs slightly from the R implementation, but the overall idea is the same.

HOW TO RUN LPS

The details of the R implementation are described in the blog entry here.
The details of the MATLAB implementation are provided in the blog entry here.

Copyright © 2014 mustafa gokce baydogan
