This function optimises the number of elements per window used in calculate_expression_similarity_counts by requiring the distribution of correlations/distances to stabilise to a uniform distribution. Stability is assessed with the Jensen-Shannon divergence (abbreviated JSE below).

optimise_window_length(
  expression.matrix,
  similarity.measure = "correlation_pearson",
  window.length.min = NULL,
  window.length.max = NULL,
  window.length.by = NULL,
  n.step.fraction = 0.05,
  iteration.number = 50,
  minimum.similar.windows = 3,
  save.plot = NULL
)

Arguments

expression.matrix

the expression matrix; it can be normalized or not

similarity.measure

the correlation or distance metric to use; defaults to Pearson correlation ("correlation_pearson"). The full list of methods is available from get_methods_correlation_distance

window.length.min, window.length.max, window.length.by

definition of the parameter search space; default is between 1% and 33% of the number of rows in the expression matrix, incremented by 1%
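
As a rough illustration of the default search space described above, the candidate window lengths could be derived from the row count as sketched here. The rounding choices and the names `n.rows` and `candidate.lengths` are assumptions made for this example; the package may compute its defaults differently.

```r
# Hypothetical illustration: window lengths from 1% to 33% of the rows,
# incremented by 1% of the rows. Rounding is an assumption.
n.rows <- 2000  # assumed number of rows in the expression matrix

window.length.min <- round(0.01 * n.rows)
window.length.max <- round(0.33 * n.rows)
window.length.by  <- max(1, round(0.01 * n.rows))  # never step by less than 1

candidate.lengths <- seq(window.length.min, window.length.max,
                         by = window.length.by)
print(range(candidate.lengths))
print(length(candidate.lengths))
```

Passing explicit values for window.length.min, window.length.max and window.length.by narrows or coarsens this grid, which is the usual way to shorten the run time on large matrices.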

n.step.fraction

step size to slide across, as a fraction of the window length; default is 5%
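
To make the relationship between window length and step size concrete, the sketch below computes where successive windows would start for one window length. It assumes the windows slide over row indices and that the step is rounded to a whole number of rows; `window.starts` and `window.ends` are illustrative names, not objects returned by the package.

```r
# Hypothetical illustration of sliding windows with a fractional step.
n.rows          <- 2000  # assumed number of rows in the expression matrix
window.length   <- 200   # one candidate window length
n.step.fraction <- 0.05  # step as a fraction of the window length

# 5% of a 200-row window; at least one row per step (assumed rounding)
step.size <- max(1, round(window.length * n.step.fraction))

# First and last row index of every window that fits inside the matrix
window.starts <- seq(1, n.rows - window.length + 1, by = step.size)
window.ends   <- window.starts + window.length - 1

print(step.size)
print(head(cbind(start = window.starts, end = window.ends)))
```

A smaller n.step.fraction gives more, heavily overlapping windows per window length, at a proportional cost in run time.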

iteration.number

number of iterations for the subsampling and calculation of JSE; subsampling is needed because shorter windows have fewer points; default is 50
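
For readers unfamiliar with the measure, the following is a minimal base-R sketch of the Jensen-Shannon divergence between two discrete distributions. It is not the package's implementation: the helper names `kl_divergence` and `js_divergence` are hypothetical, and how the package bins the correlations/distances and subsamples them is not shown in this page.

```r
# Kullback-Leibler divergence in bits; terms with p == 0 contribute 0.
kl_divergence <- function(p, q) {
  keep <- p > 0
  sum(p[keep] * log2(p[keep] / q[keep]))
}

# Jensen-Shannon divergence: the average KL divergence of p and q
# from their mixture m. With base-2 logarithms it lies in [0, 1].
js_divergence <- function(p, q) {
  p <- p / sum(p)  # normalise counts to probabilities
  q <- q / sum(q)
  m <- (p + q) / 2
  0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
}

uniform <- rep(1 / 4, 4)
print(js_divergence(uniform, uniform))              # identical distributions
print(js_divergence(c(1, 0, 0, 0), c(0, 0, 0, 1)))  # disjoint support
```

Values near 0 indicate that two distributions are nearly identical, which is the sense in which a window length is considered stable.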

minimum.similar.windows

number of windows that a window needs to be similar to (including itself) in order to be accepted as optimal; default is 3, but can be reduced to 2 if no optimum is found

save.plot

name of the PDF file in which to save the output plot showing the distribution of JSE by window; by default (NULL) the plot is output to the console

Value

A single value: the optimal number of elements per window. If no optimal value is found, the function returns NULL.
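
Because the return value can be NULL, calling code may want to guard against that case before passing the result on. The helper below is a hypothetical sketch, not part of the package; the fallback value is an arbitrary choice by the caller.

```r
# Hypothetical helper: use the optimised window length when one was found,
# otherwise fall back to a value chosen by the caller.
resolve_window_length <- function(optimised, fallback) {
  if (is.null(optimised)) {
    message("No optimal window length found; using fallback of ", fallback)
    return(fallback)
  }
  optimised
}

print(resolve_window_length(NULL, fallback = 10))  # optimiser found nothing
print(resolve_window_length(7, fallback = 10))     # optimiser returned 7
```

In practice the first argument would be the result of optimise_window_length.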

Examples

optimise_window_length(
  matrix(1:100 + runif(100), ncol = 5, byrow = TRUE),
  window.length.min = 3,
  window.length.max = 5,
  iteration.number = 5
)
#> Window length optimisation
#> number of windows: 3
#> minimum window: 3
#> maximum window: 5
#> window step: 1
#> # of iterations: 5
#> minimum similar windows: 3
#> Calculating expression summary and JSE for each window...
#> Performing t-tests...
#> The number of similar windows found for each window (including itself) were:
#> 1 1 1
#> Optimal window not found, consider less stringent parameters, e.g.
#> > reducing minimum.smaller.windows
#> > reducing iteration.number
#> > reducing window.length.by
#> NULL