How to Find Optimal Epsilon Value for DBSCAN Clustering?




  • DBSCAN has three hyperparameters:
    1. Epsilon: two points are considered neighbors if the distance between them is at most Epsilon.
    2. min_samples: the minimum number of neighbors a point needs to be classified as a core point.
    3. The distance metric.

  • We can use the Elbow Curve to find an optimal value of Epsilon:

    Set k equal to the min_samples hyperparameter. For every data point, compute the distance to its k-th nearest neighbor, then plot these distances sorted in increasing order. A good value of Epsilon lies near the elbow of the curve, where the distances start to rise sharply.
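
    The k-distance procedure can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the helper names `k_distances` and `elbow_index` are hypothetical, the toy `points` are made up, the metric is assumed to be Euclidean, and each point is assumed not to count as its own neighbor. The elbow is located with one simple heuristic (the point of the sorted curve that lies farthest below the straight line joining its endpoints); in practice the elbow is often just read off the plot.

```python
import math

def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbor, sorted ascending."""
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        # Euclidean distances from p to every other point (p itself is excluded).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)

def elbow_index(curve):
    """Heuristic elbow: index where the curve is farthest below the chord
    joining its first and last values. Assumes len(curve) >= 2."""
    n = len(curve)
    first, last = curve[0], curve[-1]
    best_i, best_gap = 0, -1.0
    for i, y in enumerate(curve):
        chord_y = first + (last - first) * i / (n - 1)
        gap = chord_y - y
        if gap > best_gap:
            best_i, best_gap = i, gap
    return best_i

# Toy data (made up): two tight groups of four points and one far-away outlier.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]

k = 2  # assumed to match the min_samples value chosen for DBSCAN
curve = k_distances(points, k)
eps_candidate = curve[elbow_index(curve)]
print("sorted k-distances:", curve)
print("candidate Epsilon:", eps_candidate)
```

    Plotting `curve` against its index gives the Elbow Curve described above. The pairwise loop is quadratic in the number of points, so for larger datasets a nearest-neighbor index (for example scikit-learn's `NearestNeighbors`) would be the more usual choice.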