Unsupervised QPC - Clusterization
Clustering with Data Relabelling
The UQPC index:
<latex> UQPC(\vec{w})=\sum_{i=1}^n\sum_{j=1}^k \alpha_{ij} G\left( \vec{w}(\vec{x}_i - \vec{t}_j) \right) </latex>
The coefficients <latex>\alpha_{ij}</latex> depend on the distance of the vector <latex>\vec{x}_i</latex> from the prototype (after projection onto the direction <latex>\vec{w}</latex>).
Each vector is assigned the label of its nearest prototype; the index is then computed with the standard QPC method.
<latex> \alpha_{ij}>0 \qquad \text{if} \qquad \vec{t}_j : j=\arg \min_l{|\vec{w}(\vec{x}_i-\vec{t}_l) |} </latex>
<latex> \alpha_{ij}<0 \qquad \text{if} \qquad \vec{t}_j : j\ne\arg \min_l{|\vec{w}(\vec{x}_i-\vec{t}_l) |} </latex>
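A minimal Python sketch of the index with projection-based relabelling is given here. The Gaussian form of G, its `width`, and the constant weights `a_plus` and `a_minus` are assumptions made for illustration; all names are hypothetical and are not taken from the `uqpc_train` code. The prototypes are treated as vectors in the original space, as in the formula above.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gauss(z, width):
    # Assumed window function G: a Gaussian of the projected offset.
    return math.exp(-(z * z) / (2.0 * width * width))

def uqpc_index(w, X, T, a_plus=1.0, a_minus=1.0, width=0.5):
    """Sum of alpha_ij * G(w.(x_i - t_j)) with projection-based labels."""
    total = 0.0
    for x in X:
        # Projected offsets w.(x - t_j) for every prototype t_j.
        offsets = [dot(w, [xi - ti for xi, ti in zip(x, t)]) for t in T]
        # Label: the prototype closest to x along the direction w.
        nearest = min(range(len(T)), key=lambda j: abs(offsets[j]))
        for j, z in enumerate(offsets):
            # Assumed constant weights: positive for the nearest
            # prototype, negative for all others.
            alpha = a_plus if j == nearest else -a_minus
            total += alpha * gauss(z, width)
    return total

# Two points sitting exactly on two prototypes.
s = 1.0 / math.sqrt(3.0)
r = 1.0 / math.sqrt(2.0)
X = [[-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]]
T = [[-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]]
good = uqpc_index([s, s, s], X, T)    # direction separating the prototypes
bad = uqpc_index([r, -r, 0.0], X, T)  # direction on which they coincide
print(good > bad)  # True
```

In this toy case the separating direction scores close to 2, while the direction that collapses both prototypes onto the same point scores 0, because the positive and negative terms cancel.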
Another variant may take into account the positions of the prototypes in the original space <latex>R^n</latex>, e.g.:
<latex> \alpha_{ij}>0 \qquad \text{if} \qquad \vec{t}_j : j=\arg \min_l{||\vec{x}_i-\vec{t}_l ||} </latex>
<latex> \alpha_{ij}<0 \qquad \text{if} \qquad \vec{t}_j : j\ne\arg \min_l{||\vec{x}_i-\vec{t}_l ||} </latex>
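The original-space variant could be sketched as below. The constant magnitudes `a_plus` and `a_minus` and the function names are assumptions for illustration only; the conditions above fix just the sign of each coefficient. Note that this labelling does not depend on the direction w, so it can be computed once per set of prototypes.

```python
import math

def nearest_prototype(x, T):
    # Index of the prototype closest to x in the original space (Euclidean norm).
    return min(range(len(T)), key=lambda j: math.dist(x, T[j]))

def alpha_signs(X, T, a_plus=1.0, a_minus=1.0):
    """Matrix alpha[i][j]: positive for the nearest prototype, negative otherwise."""
    alphas = []
    for x in X:
        j_star = nearest_prototype(x, T)
        alphas.append([a_plus if j == j_star else -a_minus
                       for j in range(len(T))])
    return alphas

X = [[0.0, 0.0], [3.0, 0.0]]
T = [[0.0, 1.0], [3.0, 1.0]]
print(alpha_signs(X, T))  # [[1.0, -1.0], [-1.0, 1.0]]
```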
Tests on simple data
Config: example configuration. The only parameter varied in these tests was the number of prototypes K.
>> [w q p]=uqpc_train(data,2,'dataName','iris')
uqpc_parameters =
    K: 2
    uqpc_initiations: 10
qpc_parameters =
    beta: 0.1000
    checkPeriod: 5
    dataName: 'dataname'
    directions: 2
    display: 'none'
    eps: 1.0000e-03
    function: 'gauss'
    indGmax: []
    initiations: 10
    initWeights: []
    killPeriod: 10
    killRatio: 0.5000
    lambda: 0.1000
    learningRate: 0.1000
    log: 'off'
    logFileName: []
    maxIterations: 1000
    multistart: 'no'
    OptConf: []
    OptMethod: 'gd'
    orthogonalizationMethod: 'projection'
    ortoWeights: []
    plot: 'none'
    plr: 0.1000
    prototypes: []
    QPCMethod: 'uqpc1'
    save: 'none'
    savedir: []
    stopCriterium: 2
Gauss2 - First two projections
Artificial data containing vectors drawn from a normal distribution
Features: 3 Instances: 400 Source: artificial data
Description: two Gaussian clusters (no overlapping). 200 vectors drawn from the distribution N([-1 -1 -1];[0.4 0.4 0.4]) and another 200 from the distribution N([+1 +1 +1];[0.4 0.4 0.4]).
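A data set of this shape could be generated roughly as follows. This is only a sketch: it assumes that the second argument of N(...) is a per-feature standard deviation (the description does not say whether 0.4 is a standard deviation or a variance), and the function name and seed are hypothetical.

```python
import random

def make_gauss2(n_per_cluster=200, spread=0.4, seed=0):
    """Two 3-D Gaussian clusters centred at (-1,-1,-1) and (+1,+1,+1)."""
    rng = random.Random(seed)
    data = []
    for center in (-1.0, 1.0):
        for _ in range(n_per_cluster):
            # Assumption: `spread` is the standard deviation of each feature.
            data.append([rng.gauss(center, spread) for _ in range(3)])
    return data

data = make_gauss2()
print(len(data), len(data[0]))  # 400 3
```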
w =
   -0.6774   -0.1991   -0.7081
    0.1694   -0.9790    0.1133
qpc =
    0.6528    0.7564
prototypes =
   -1.6026   -0.8391    1.0000
    1.5947    0.8877    2.0000

w =
   -0.6594   -0.0136   -0.7517
   -0.2458   -0.9410    0.2326
qpc =
    0.7164    0.7636
prototypes =
   -1.6482   -1.4852    1.0000
   -0.9964   -0.6550    2.0000
    1.4212    1.0054    3.0000

w =
   -0.8863   -0.4130   -0.2097
    0.4008   -0.4568   -0.7942
qpc =
    0.7706    0.7472
prototypes =
   -2.0251   -0.5340    1.0000
   -1.4904   -1.2452    2.0000
   -0.9402    0.6215    3.0000
    1.5231    1.3934    4.0000

w =
   -0.7445   -0.5505   -0.3776
    0.6299   -0.3918   -0.6706
qpc =
    0.7945    0.7679
prototypes =
   -2.2472   -0.0997    1.0000
   -1.7423   -1.2027    2.0000
   -1.2746   -0.7011    3.0000
   -0.9240    0.5470    4.0000
    1.6828    1.0826    5.0000

w =
   -0.6915   -0.4394   -0.5734
    0.4090   -0.8924    0.1905
qpc =
    0.8374    0.4182
prototypes =
   -2.6780   -1.1170    1.0000
   -2.2099   -1.9602    2.0000
   -1.7416   -1.9625    3.0000
   -1.3012   -1.9600    4.0000
   -0.9648   -0.6108    5.0000
   -0.6844    0.0602    6.0000
    1.7147    0.7292    7.0000
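The six values of w in each result read naturally as two projection directions with three components each. As a quick sanity check of that reading, the sketch below takes the first Gauss2 result and computes the length of each direction and their dot product. The variable names are illustrative.

```python
import math

# First Gauss2 result: two directions, three features each.
w = [[-0.6774, -0.1991, -0.7081],
     [0.1694, -0.9790, 0.1133]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

norms = [math.sqrt(dot(row, row)) for row in w]
overlap = dot(w[0], w[1])
print(norms, overlap)
```

Both lengths come out within 1e-3 of 1 and the dot product within 1e-3 of 0, which is consistent with unit-length, mutually orthogonal directions (compare `orthogonalizationMethod: 'projection'` in the configuration).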
Gauss3a - First two projections
Artificial data containing vectors drawn from a normal distribution
Features: 3 Instances: 600 Source: artificial data
Description: three Gaussian clusters (non-overlapping and weakly overlapping). 400 vectors are identical to those in the Gauss2 data.
An additional 200 vectors were drawn from the distribution N([0 3 3];[1 1 1]).
w =
   -0.7692   -0.1555   -0.6198
    0.5675   -0.6122   -0.5506
qpc =
    0.8484    0.8040
prototypes =
   -0.0668   -0.6088    1.0000
    0.9765    0.4789    2.0000

w =
   -0.1569   -0.6702   -0.7254
   -0.9876    0.1003    0.1210
qpc =
    0.8547    0.7066
prototypes =
   -0.5836    0.8915    1.0000
    0.2209   -0.2944    2.0000
    1.1085    0.4466    3.0000

w =
   -0.2446   -0.7543   -0.6092
   -0.9695    0.1826    0.1632
qpc =
    0.8474    0.7416
prototypes =
   -0.8046   -0.3162    1.0000
   -0.3981    0.4073    2.0000
    0.2201   -0.7951    3.0000
    1.1300    0.9064    4.0000

w =
   -0.2985   -0.7124   -0.6351
    0.9475   -0.1412   -0.2870
qpc =
    0.8501    0.7457
prototypes =
   -1.0304   -0.7514    1.0000
   -0.7630    0.7096    2.0000
   -0.3741   -1.0696    3.0000
    0.2116   -0.3223    4.0000
    1.1433    0.3275    5.0000
Iris - First two projections
w =
    0.1368    0.1681   -0.9666    0.1369
   -0.3418    0.5856   -0.0504   -0.7333
q =
    0.9010    0.8841
p =
   -0.3124   -0.4709    1.0000
    0.6586    1.0459    2.0000

w =
    0.1986    0.1252   -0.8541   -0.4640
   -0.6044    0.2767    0.2770   -0.6939
q =
    0.8516    0.7963
p =
   -0.7688   -0.6718    1.0000
   -0.0887   -0.0270    2.0000
    1.0320    0.8203    3.0000

w =
   -0.0979   -0.0137   -0.8268   -0.5538
   -0.7768   -0.6135    0.1341   -0.0477
q =
    0.8172    0.7590
p =
   -0.9725   -0.9724    1.0000
   -0.3340    0.4432    2.0000
    0.1701    0.9497    3.0000
    1.2426   -0.2903    4.0000
Gauss2n2 - First two projections
Artificial data containing vectors drawn from a normal distribution plus uniform noise.
Features: 4 Instances: 600 Source: artificial data
Description: two Gaussian clusters (weak overlapping) and uniform noise.
Features 1 and 2 were drawn from the distributions N(-1.3,1) and N(+1.3,1).
Features 2 and 4 were drawn from a uniform distribution over the range [-4,+4].
w =
   -0.5523   -0.0285   -0.8326   -0.0303
    0.1305   -0.0569   -0.0487   -0.9886
q =
    0.7557    0.7203
p =
   -0.5532   -0.6354    1.0000
    0.4429    0.6202    2.0000

w =
   -0.5920   -0.5247   -0.6050   -0.0910
   -0.1031    0.0999    0.1612   -0.9764
q =
    0.7224    0.7346
p =
   -0.7178   -0.7449    1.0000
    0.0330   -0.0118    2.0000
    0.7115    0.7362    3.0000

w =
   -0.1100   -0.9853   -0.0730   -0.1085
   -0.3473    0.1587   -0.3430   -0.8582
q =
    0.7451    0.7415
p =
   -0.8747   -0.9069    1.0000
   -0.3419   -0.2859    2.0000
    0.2004    0.3974    3.0000
    0.7912    0.9605    4.0000

w =
   -0.4693   -0.8418   -0.2383   -0.1195
   -0.3406    0.3571   -0.1619   -0.8545
q =
    0.7598    0.7474
p =
   -1.1052   -0.5754    1.0000
   -0.7851   -1.0211    2.0000
   -0.2884   -0.0100    3.0000
    0.3211    0.5271    4.0000
    0.8709    1.0255    5.0000
Notes
* There is a problem with prototype positions when combining projections. The prototype-based clustering method needs refinement. Does this way of determining clusters make sense? Marek Grochowski 2011/02/04 11:28