Computes all possible silhouette indices from available functions in the package and returns a summary data frame comparing crisp, fuzzy, and median silhouette values across different methods.
Usage
calSilhouette(
prox_matrix = NULL,
proximity_type = c("dissimilarity", "similarity"),
prob_matrix = NULL,
a = 2,
print.summary = FALSE,
clust_fun = NULL,
...
)
Arguments
- prox_matrix
A numeric matrix where rows represent observations and columns represent proximity measures (e.g., distances or similarities) to clusters. Typically, this is a membership or dissimilarity matrix from clustering results. If clust_fun is provided, prox_matrix should be the name of the matrix component as a string (e.g., if clust_fun = fcm from the ppclust package, then prox_matrix = "d").
- proximity_type
Character string specifying the type of proximity measure in prox_matrix. Options are "similarity" (higher values indicate closer proximity) or "dissimilarity" (lower values indicate closer proximity). Defaults to "dissimilarity".
- prob_matrix
A numeric matrix of cluster membership probabilities, where rows represent observations and columns represent clusters (depending on prob_type). If clust_fun is provided, prob_matrix can be given as the name of the matrix component (e.g., "u" for the fcm function). Defaults to NULL.
- a
Numeric value controlling the fuzzifier or weight scaling in fuzzy silhouette averaging. Higher values increase the emphasis on strong membership differences. Must be positive. Defaults to 2.
- print.summary
Logical; if TRUE, prints a summary table of average silhouette widths and sizes for each cluster. Defaults to FALSE.
- clust_fun
Optional S3 or S4 function object, or the function name as a character string, specifying a clustering function that produces the proximity measure matrix. For example, fcm or "fcm". If provided, prox_matrix must be the name of the matrix component in the clustering output (e.g., "d" for fcm when proximity_type = "dissimilarity"). Defaults to NULL.
- ...
Additional arguments passed to clust_fun, such as x and centers for fcm.
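To give a feel for what a does, the following base-R sketch shows one common form of fuzzy silhouette averaging, in which each observation's silhouette width is weighted by the gap between its two largest membership probabilities raised to the power a. This weighting is an assumption made for illustration; this page does not state the exact formula the package uses, and fuzzy_average, s, and u are hypothetical names with toy values.

```r
# Hypothetical helper (not part of the package): weighted mean of silhouette
# widths with weights (largest membership - second-largest membership)^a.
fuzzy_average <- function(sil_width, prob_matrix, a = 2) {
  stopifnot(a > 0, length(sil_width) == nrow(prob_matrix))
  # Sort each row in decreasing order; the result has one column per observation
  sorted <- apply(prob_matrix, 1, sort, decreasing = TRUE)
  weight <- (sorted[1, ] - sorted[2, ])^a
  sum(weight * sil_width) / sum(weight)
}

# Toy data: three observations, two clusters
u <- matrix(c(0.9, 0.1,
              0.6, 0.4,
              0.5, 0.5), ncol = 2, byrow = TRUE)
s <- c(0.8, 0.2, -0.1)  # toy silhouette widths

crisp    <- mean(s)                     # unweighted average
fuzzy_a1 <- fuzzy_average(s, u, a = 1)  # mild emphasis on clear memberships
fuzzy_a2 <- fuzzy_average(s, u, a = 2)  # stronger emphasis
```

With these toy values the confidently assigned first observation dominates the weighted averages, and raising a from 1 to 2 strengthens that effect.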
Value
A data frame with the following columns:
- Method
Character vector of method names
- Crisp_Silhouette
Numeric vector of crisp (unweighted) average silhouette values
- Fuzzy_Silhouette
Numeric vector of fuzzy (weighted) average silhouette values (NA if prob_matrix is not available for the method)
- Median_Silhouette
Numeric vector of median silhouette values
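Because the result is an ordinary data frame, the usual base-R tools apply. The sketch below builds a stand-in data frame by hand, using the column names documented above and values copied from three rows of the example output on this page, then ranks the methods. The object names res, ranked, best_method, and has_fuzzy are hypothetical.

```r
# Stand-in for a calSilhouette() result, built by hand for illustration
res <- data.frame(
  Method            = c("pac", "medoid", "db"),
  Crisp_Silhouette  = c(0.7541271, 0.8288288, 0.3303585),
  Fuzzy_Silhouette  = c(0.8799106, 0.9297577, 0.4206051),
  Median_Silhouette = c(0.8484197, 0.9179945, 0.3097745)
)

# Rank methods by crisp average silhouette, best first
ranked <- res[order(res$Crisp_Silhouette, decreasing = TRUE), ]
best_method <- ranked$Method[1]

# Flag methods for which a fuzzy value was computed (not NA)
has_fuzzy <- !is.na(res$Fuzzy_Silhouette)
```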
Details
This function computes all available silhouette methods from the package and returns a comparative summary. The methods included depend on the available input matrices:
If prox_matrix is available:
- medoid: Medoid-based silhouette using Silhouette
- pac: PAC-based silhouette using Silhouette
If prob_matrix is available:
- pp_pac: Posterior probabilities silhouette with PAC method using softSilhouette
- pp_medoid: Posterior probabilities silhouette with Medoid method using softSilhouette
- nlpp_pac: Negative log posterior probabilities silhouette with PAC method using softSilhouette
- nlpp_medoid: Negative log posterior probabilities silhouette with Medoid method using softSilhouette
- pd_pac: Probability distribution silhouette with PAC method using softSilhouette
- pd_medoid: Probability distribution silhouette with Medoid method using softSilhouette
- cer: Certainty-based silhouette using cerSilhouette
- db: Density-based silhouette using dbSilhouette
At least one of prox_matrix or prob_matrix must be provided.
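The selection rule described in this section can be summarized in a few lines of base R. The helper available_methods is hypothetical and only mirrors the method names listed on this page; it is not part of the package and does not compute any silhouette values.

```r
# Hypothetical helper: which method names would be reported for given inputs
available_methods <- function(prox_matrix = NULL, prob_matrix = NULL) {
  if (is.null(prox_matrix) && is.null(prob_matrix)) {
    stop("At least one of prox_matrix or prob_matrix must be provided.")
  }
  methods <- character(0)
  # Proximity-based methods
  if (!is.null(prox_matrix)) {
    methods <- c(methods, "medoid", "pac")
  }
  # Probability-based methods
  if (!is.null(prob_matrix)) {
    methods <- c(methods, "pp_pac", "pp_medoid", "nlpp_pac", "nlpp_medoid",
                 "pd_pac", "pd_medoid", "cer", "db")
  }
  methods
}

only_prob <- available_methods(prob_matrix = matrix(0.5, 2, 2))
both      <- available_methods(matrix(1, 2, 2), matrix(0.5, 2, 2))
```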
Examples
if (requireNamespace("ppclust", quietly = TRUE)) {
  # Example with FCM clustering
  library(ppclust)
  data(iris)
  fcm_result <- fcm(iris[, -5], centers = 3)
  # Using matrices directly
  summary_result <- calSilhouette(
    prox_matrix = fcm_result$d,
    prob_matrix = fcm_result$u,
    proximity_type = "dissimilarity",
    print.summary = TRUE
  )
}
#>
#> Summary of All Silhouette Methods
#> ==========================================
#> Method Crisp Fuzzy Median
#> medoid 0.8288288 0.9297577 0.9179945
#> pac 0.7541271 0.8799106 0.8484197
#> pp_pac 0.7541271 0.8799106 0.8484197
#> pp_medoid 0.8288288 0.9297577 0.9179945
#> nlpp_pac 0.8185224 0.9304926 0.9261387
#> nlpp_medoid 0.8749545 0.9604788 0.9616532
#> pd_pac 0.7469894 0.8786105 0.8562152
#> pd_medoid 0.8196600 0.9280044 0.9225377
#> cer 0.8572481 0.9248327 0.9041529
#> db 0.3303585 0.4206051 0.3097745
if (requireNamespace("ppclust", quietly = TRUE)) {
  # Using clustering function
  summary_result2 <- calSilhouette(
    prox_matrix = "d",
    prob_matrix = "u",
    proximity_type = "dissimilarity",
    clust_fun = ppclust::fcm,
    x = iris[, -5],
    centers = 3,
    print.summary = TRUE
  )
}
#>
#> Summary of All Silhouette Methods
#> ==========================================
#> Method Crisp Fuzzy Median
#> medoid 0.8288288 0.9297577 0.9179945
#> pac 0.7541271 0.8799106 0.8484197
#> pp_pac 0.7541271 0.8799106 0.8484197
#> pp_medoid 0.8288288 0.9297577 0.9179945
#> nlpp_pac 0.8185224 0.9304926 0.9261387
#> nlpp_medoid 0.8749545 0.9604788 0.9616532
#> pd_pac 0.7469894 0.8786105 0.8562152
#> pd_medoid 0.8196600 0.9280044 0.9225377
#> cer 0.8572481 0.9248327 0.9041529
#> db 0.3303585 0.4206051 0.3097745
