TBFY
TENDERS, CLUSTERING

Deviations
Gain chart

Unsupervised analysis looks for previously undetected patterns in data, usually ones we are not aware of in advance. Our method groups the data into clusters using the k-means method. This approach helps us identify commonalities in the data and, in turn, detect anomalous data points that do not fit into any of the identified clusters.
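The grouping and anomaly-detection idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used on this page: the function names `kmeans` and `find_anomalies`, the deterministic choice of starting centroids, and the rule that flags a point when its distance to its own centroid exceeds the mean distance plus `n_std` standard deviations are all assumptions made for the example, and the sample points are made up.

```python
import math
from statistics import mean, pstdev


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    n = len(points)
    # Assumption: start from k evenly spaced input points (deterministic).
    centroids = [points[i * n // k] for i in range(k)]
    labels = [0] * n
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(mean(dim) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels


def find_anomalies(points, centroids, labels, n_std=2.0):
    """Return indices of points unusually far from their own centroid.

    Assumption: "unusually far" means more than n_std population standard
    deviations above the mean point-to-centroid distance.
    """
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    threshold = mean(dists) + n_std * pstdev(dists)
    return [i for i, d in enumerate(dists) if d > threshold]


# Made-up example: two tight groups and one point far from both.
sample = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
    (30, 30),
]
centroids, labels = kmeans(sample, k=2)
anomalies = find_anomalies(sample, centroids, labels)
print(anomalies)
```

In this made-up sample the last point is pulled into the second cluster but remains much farther from its centroid than any other point, so it is the one that gets flagged. A different threshold rule would be equally valid; the page does not state which one it uses.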
The optimal number of clusters is calculated as the intersection of two straight lines: the first is a linear regression fitted to the initial logarithmic gain values, and the second is a linear regression fitted to the last logarithmic gain values.
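The intersection rule above could be computed roughly as follows. This is a hedged sketch: the helper names `fit_line` and `estimate_k`, the parameter `n_fit` (how many points count as "initial" and "last"), and the sample gain values are assumptions made for illustration, and the page does not define how the gain itself is computed.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_k(ks, gains, n_fit=3):
    """Return the x coordinate where the two regression lines cross.

    ks    -- candidate numbers of clusters, in increasing order
    gains -- positive gain value for each entry of ks
    n_fit -- assumption: number of points used for each regression line
    """
    log_gains = [math.log(g) for g in gains]
    s1, b1 = fit_line(ks[:n_fit], log_gains[:n_fit])    # initial values
    s2, b2 = fit_line(ks[-n_fit:], log_gains[-n_fit:])  # last values
    if s1 == s2:
        raise ValueError("regression lines are parallel; no intersection")
    return (b2 - b1) / (s1 - s2)


# Made-up gains: a steep drop at first, then an almost flat tail.
ks = [1, 2, 3, 4, 5, 6, 7, 8]
gains = [math.exp(v) for v in (8.0, 6.0, 4.0, 2.0, 0.8, 0.4, 0.3, 0.2)]
k_cross = estimate_k(ks, gains)
print(round(k_cross, 2), "-> rounded number of clusters:", round(k_cross))
```

The crossing point is generally not an integer, so it has to be rounded to obtain a usable number of clusters; whether to round to the nearest integer, up, or down is a design choice this sketch leaves open.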






Dataset:
y dimension:
x dimension:



CONTACT

Jožef Stefan Institute
Jamova cesta 39
1000 Ljubljana


DATA REPOSITORIES

Slovenian procurement data
 
Slovenian transaction data

CREDITS

TheyBuyForYou has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 780247.

Icons made by Becris from www.flaticon.com