Computes an extended silhouette width for multi-way clustering (e.g., biclustering, triclustering, or n-mode tensor clustering) by combining silhouette widths from a list of Silhouette objects, each representing one mode of clustering. The extended silhouette width is the weighted average of the average silhouette widths from each mode, weighted by the number of observations in each mode's silhouette analysis. The output is an object of class extSilhouette.

Usage

extSilhouette(sil_list, dim_names = NULL, print.summary = FALSE)

Arguments

sil_list

A list of objects of class "Silhouette", typically the output of Silhouette or softSilhouette, where each object represents the silhouette analysis for one mode of multi-way clustering (e.g., rows, columns, or other dimensions in biclustering or tensor clustering).

dim_names

An optional character vector of dimension names (e.g., c("Rows", "Columns")). If NULL, defaults to "Mode 1", "Mode 2", etc.

print.summary

Logical; if TRUE, prints a summary of the extended silhouette width and dimension table. Default is FALSE.

Value

A list of class "extSilhouette" with the following components:

ext_sil_width

A numeric scalar representing the extended silhouette width.

dim_table

A data frame with columns dimension (e.g., "Mode 1", "Mode 2"), n_obs (number of observations), and avg_sil_width (average silhouette width for each mode).

Details

The extended silhouette width is computed as:

$$ ExS = \frac{ \sum_i (n_i \cdot w_i) }{ \sum_i n_i } $$

where \(n_i\) is the number of observations in mode \(i\) (derived from nrow(x$widths)), and \(w_i\) is the average silhouette width for that mode (from x$avg.width). Each Silhouette object in sil_list must contain a non-empty widths data frame and a numeric avg.width value. Modes with zero observations (\(n_i = 0\)) are not allowed, because they would leave the weighted average undefined. For consistency, make sure that all Silhouette objects are derived from the same method and arguments.
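The weighted average described above can be sketched in base R. This is a minimal illustration, not the package's implementation: make_mock_sil() and ext_sil_sketch() are hypothetical helpers, and the mock objects are plain lists that carry only the widths and avg.width fields used by the formula.

```r
# Hypothetical stand-ins for Silhouette objects: plain lists holding only
# the two fields the formula relies on (widths and avg.width).
make_mock_sil <- function(sil_width) {
  list(
    widths = data.frame(sil_width = sil_width),
    avg.width = mean(sil_width)
  )
}

# Illustrative helper (an assumption, not the package's code).
ext_sil_sketch <- function(sil_list, dim_names = NULL) {
  # n_i: number of observations per mode; w_i: average width per mode
  n_obs <- vapply(sil_list, function(x) nrow(x$widths), integer(1))
  avg_w <- vapply(sil_list, function(x) x$avg.width, numeric(1))
  if (any(n_obs == 0L)) {
    stop("Each mode must contain at least one observation.")
  }
  if (is.null(dim_names)) {
    dim_names <- paste("Mode", seq_along(sil_list))
  }
  list(
    # ExS = sum(n_i * w_i) / sum(n_i)
    ext_sil_width = sum(n_obs * avg_w) / sum(n_obs),
    dim_table = data.frame(
      dimension = dim_names,
      n_obs = n_obs,
      avg_sil_width = avg_w
    )
  )
}

rows <- make_mock_sil(c(0.8, 0.6, 0.7, 0.9))  # 4 observations
cols <- make_mock_sil(c(0.5, 0.3))            # 2 observations
sketch <- ext_sil_sketch(list(rows, cols), dim_names = c("Rows", "Columns"))
sketch$ext_sil_width
sketch$dim_table
```

Because the weights are the observation counts, the larger mode dominates: here the row mode (four observations) pulls the result closer to its own average than to that of the column mode (two observations).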

References

Schepers, J., Ceulemans, E., & Van Mechelen, I. (2008). Selecting among multi-mode partitioning models of different complexities: A comparison of four model selection criteria. Journal of Classification, 25(1), 67–85. doi:10.1007/s00357-008-9005-9

Bhat Kapu, S., & Kiruthika, C. (2025). Block Probabilistic Distance Clustering: A Unified Framework and Evaluation. PREPRINT (Version 1) available at Research Square. doi:10.21203/rs.3.rs-6973596/v1

Examples

# Example using iris dataset with two modes
data(iris)
# \donttest{
if (requireNamespace("blockcluster", quietly = TRUE)) {
  library(blockcluster)
  result <- coclusterContinuous(
    as.matrix(iris[, -5]),
    nbcocluster = c(3, 2)
  )
} else {
  message("Install 'blockcluster': install.packages('blockcluster')")
}
#> Loading required package: rtkore
#> Loading required package: Rcpp
#> 
#> Attaching package: 'rtkore'
#> The following object is masked from 'package:Rcpp':
#> 
#>     LdFlags
#> blockcluster version 4.5.5 loaded
#> 
#> ----------------
#> Copyright (C)  <MODAL team @INRIA,Lille & U.M.R. C.N.R.S. 6599 Heudiasyc, UTC>
#> Please post questions and bugs at: <https://gforge.inria.fr/forum/forum.php?forum_id=11190&group_id=3679>
#> Co-Clustering successfully terminated! 

if (requireNamespace("blockcluster", quietly = TRUE)) {
  sil_mode1 <- softSilhouette(
    prob_matrix = result@rowposteriorprob,
    method = "pac"
  )
  sil_mode2 <- softSilhouette(
    prob_matrix = result@colposteriorprob,
    method = "pac"
  )

  # Extended silhouette
  ext_sil <- extSilhouette(list(sil_mode1, sil_mode2), print.summary = TRUE)
}
#> ---------------------------
#> Extended silhouette: 0.9325 
#> ---------------------------
#> 
#> Dimension Summary:
#>   dimension n_obs avg_sil_width
#> 1    Mode 1   150        0.9307
#> 2    Mode 2     4        1.0000
#> 
#> Available components:
#> [1] "ext_sil_width" "dim_table"    
# }
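
As a check on the printed summary, the values in the dimension table can be substituted into the formula for ExS given in Details. The table values are rounded to four decimals, so agreement with the reported width is expected only to that precision:

$$ ExS \approx \frac{150 \cdot 0.9307 + 4 \cdot 1.0000}{150 + 4} = \frac{143.605}{154} \approx 0.9325 $$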