Snakemake Rule 3: Match observations between modalities
This rule creates matching observations between MSI and Visium and saves them into another spaceranger-style object.
Input
MSI data in spaceranger style:
output/[sample]/spaceranger
Parameters
radius_to_use determines which MSI pixels are aggregated to each Visium spot. There are three options: (‘visium’) all pixels whose centres lie within the true Visium spot radius (Visium spots are 55 μm in diameter); (‘visium_expanded’) all pixels whose centres lie within an expanded Visium radius, maximally extended to remove gaps between spots; or (‘msi’) all pixels with any overlap with the true Visium spot, calculated by estimating the MSI pixel size after transformation and adding it to the true Visium radius. The default is ‘visium_expanded’. For a very high-resolution MSI dataset, however, ‘visium’ may be a cleaner, more robust choice: entire pixels can fall outside the true Visium spot, and these should not be used in the aggregation. Conversely, for MSI datasets of lower resolution than Visium, ‘msi’ may be preferable, as each pixel can contribute to multiple Visium spots. This avoids gaps in coverage where a Visium spot overlaps an MSI pixel whose centre lies outside the spot, in which case the default setting would not link the MSI data for that pixel to the Visium spot.
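The three radius_to_use options can be illustrated with a minimal Python sketch. This is not the pipeline's own code: the function names, the choice of half the centre-to-centre spacing for the expanded radius, and the example coordinates are all assumptions made for illustration.

```python
import math


def matching_radius(radius_to_use, spot_radius, spot_spacing, msi_pixel_size):
    """Return a search radius (same units as the inputs) for one option."""
    if radius_to_use == "visium":
        # True spot radius.
        return spot_radius
    if radius_to_use == "visium_expanded":
        # Assumption: expand until neighbouring spots touch, i.e. half the
        # centre-to-centre spacing. The pipeline's exact expansion may differ.
        return spot_spacing / 2
    if radius_to_use == "msi":
        # Per the description: estimated MSI pixel size plus the true radius.
        return spot_radius + msi_pixel_size
    raise ValueError(f"unknown radius_to_use: {radius_to_use!r}")


def pixels_for_spot(spot_xy, pixel_centres, radius):
    """Return indices of pixels whose centres lie within `radius` of the spot."""
    sx, sy = spot_xy
    return [
        i for i, (px, py) in enumerate(pixel_centres)
        if math.hypot(px - sx, py - sy) <= radius
    ]


# Illustrative geometry in micrometres: 55 um spot diameter and 100 um
# centre-to-centre spacing (standard Visium), with a made-up 30 um MSI pixel.
pixel_centres = [(10.0, 0.0), (40.0, 0.0), (55.0, 0.0), (120.0, 0.0)]
selected = {
    option: pixels_for_spot(
        (0.0, 0.0), pixel_centres, matching_radius(option, 27.5, 100.0, 30.0)
    )
    for option in ("visium", "visium_expanded", "msi")
}
print(selected)
```

In this toy layout each successive option links one more pixel to the spot, which mirrors the trade-off described above: a tighter radius gives cleaner matches, a wider one gives fewer gaps in coverage.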
agg_fn selects how multiple MSI pixels corresponding to the same Visium barcode are aggregated (‘mean’, ‘sum’ or ‘weighted_average’). The options are: (‘mean’) the mean intensity across all selected pixels, per peak; (‘sum’) the sum of intensities across all selected pixels, per peak; or (‘weighted_average’) a weighted average of peak intensities, weighted by the distance of each selected pixel from the Visium spot centre.
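As a rough sketch of the three agg_fn options, the function below aggregates the pixels selected for a single spot. The weighting scheme is an assumption: inverse-distance weights are used here so that closer pixels count more, and the pipeline's actual weights may be defined differently. The function name and example values are hypothetical.

```python
def aggregate_pixels(intensities, distances, agg_fn="mean"):
    """Aggregate per-pixel peak intensities for one Visium spot.

    intensities: one list per selected pixel, one value per peak.
    distances:   distance of each selected pixel from the spot centre.
    """
    n_peaks = len(intensities[0])
    # Regroup the values so that each inner list holds one peak across pixels.
    per_peak = [[row[j] for row in intensities] for j in range(n_peaks)]
    if agg_fn == "sum":
        return [sum(values) for values in per_peak]
    if agg_fn == "mean":
        return [sum(values) / len(values) for values in per_peak]
    if agg_fn == "weighted_average":
        # Assumption: inverse-distance weights; the small constant avoids
        # division by zero for a pixel centred exactly on the spot.
        weights = [1.0 / (d + 1e-9) for d in distances]
        total = sum(weights)
        return [
            sum(w * v for w, v in zip(weights, values)) / total
            for values in per_peak
        ]
    raise ValueError(f"unknown agg_fn: {agg_fn!r}")


# Two pixels, two peaks; the first pixel is three times closer to the centre.
intensities = [[2.0, 10.0], [4.0, 30.0]]
distances = [1.0, 3.0]
for fn in ("mean", "sum", "weighted_average"):
    print(fn, aggregate_pixels(intensities, distances, fn))
```

With these values the weighted average falls between the closer pixel's intensities and the plain mean, since the closer pixel receives three times the weight.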
only_within_tissue specifies whether Visium barcodes should be filtered to keep only those labelled as within the tissue by the Space Ranger pre-processing.
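The effect of only_within_tissue can be sketched as a filter on the in_tissue flag that Space Ranger records for each barcode. The example assumes the headered tissue_positions.csv layout written by recent Space Ranger versions; the helper name and the barcodes are made up, and how the pipeline itself reads this flag is not shown in this section.

```python
import csv
import io


def select_barcodes(tissue_positions, only_within_tissue):
    """Return barcodes, optionally keeping only those flagged as in tissue."""
    reader = csv.DictReader(tissue_positions)
    return [
        row["barcode"]
        for row in reader
        if not only_within_tissue or row["in_tissue"] == "1"
    ]


# In-memory stand-in for a tissue_positions.csv file (made-up barcodes).
header = "barcode,in_tissue,array_row,array_col,pxl_row_in_fullres,pxl_col_in_fullres\n"
rows = (
    "AAAC-1,1,0,0,100,100\n"
    "AAAG-1,0,0,2,100,300\n"
    "AAAT-1,1,1,1,200,200\n"
)

kept = select_barcodes(io.StringIO(header + rows), only_within_tissue=True)
everything = select_barcodes(io.StringIO(header + rows), only_within_tissue=False)
print(kept, everything)
```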
verbose determines how much information about the different stages of the process is reported to the user.
Output
MSI data matched with Visium spots in spaceranger style:
output/[sample]/spaceranger_aggregated
Correspondence between MSI pixels and Visium spots:
output/[sample]/matched_Visium_MSI_IDs.csv
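One way to inspect the correspondence file is to group MSI pixel IDs by Visium barcode. The column layout of matched_Visium_MSI_IDs.csv is not described here, so the column names below (visium_barcode, msi_pixel) are purely hypothetical placeholders; check the header of the real file and pass its actual column names instead.

```python
import csv
import io
from collections import defaultdict


def pixels_per_spot(csv_file, spot_col, pixel_col):
    """Group pixel IDs by spot ID, using caller-supplied column names."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row[spot_col]].append(row[pixel_col])
    return dict(groups)


# In-memory example with hypothetical column names and made-up IDs.
example = io.StringIO(
    "visium_barcode,msi_pixel\n"
    "AAAC-1,pixel_1\n"
    "AAAC-1,pixel_2\n"
    "AAAT-1,pixel_2\n"
)
matches = pixels_per_spot(example, "visium_barcode", "msi_pixel")
print(matches)
```

In this made-up example pixel_2 is linked to two spots, the situation described above for MSI data of lower resolution than Visium.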
Code (from Snakemake file)
rule create_barcode_matrix:
    message:
        "Generating aggregated data."
    conda: 'magpie'
    input:
        "output/{sample}/spaceranger/filtered_feature_bc_matrix.h5"
    output:
        "output/{sample}/spaceranger_aggregated/filtered_feature_bc_matrix.h5"
    params:
        sample = "{sample}",
        radius_to_use = 'visium_expanded',
        agg_fn = 'mean',
        verbose = True,
        only_within_tissue = False
    script:
        "scripts/create_perbarcode_matrix.py"