SuperSCC.clustering.sub_consensus_cluster
- SuperSCC.clustering.sub_consensus_cluster(data, data2, n_components=30, resolution=0.2, class_weight='balanced', n_features_to_select=0.15, ratio_of_none_zero_counts=0.1, ep_cut_off=1, pct_cut_off=0.5, num_pos_genes=10, n_neighbors=100, robust=True, n_jobs=-1, file_name=None, save=True, logger=None, **kwargs)
Merges sub-clusters found within different global clusters and finds markers for each resulting sub-cluster.
- Parameters:
data – A log-normalized expression matrix. Rows are cells; Columns are features.
data2 – A dict returned by running global_consensus_cluster function.
n_components – An int that sets how many principal components are used for KNN clustering. Default is 30.
resolution – A float that controls the coarseness of the clustering. Higher values lead to more clusters. Default is 0.2.
class_weight – A string to decide whether class weights will be considered. If None, all classes are supposed to have weight one. The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)). Default is ‘balanced’.
n_features_to_select – An int or float that controls the number of features to select. If an integer, it is the absolute number of features to select. If a float between 0 and 1, it is the fraction of features to select. Default is 0.15.
ep_cut_off – A float value to control the selection criteria of positive markers. Default is 1.
pct_cut_off – A float value to control the selection criteria of positive markers. Default is 0.5.
num_pos_genes – An int that sets the cutoff for positive markers. If the number of positive markers in a cluster is above this number, the cluster will not be merged. Default is 10.
n_neighbors – An int that controls the number of neighbors considered for merging. Default is 100.
robust – A bool that decides whether to re-find the markers of each cluster after cluster merging. Default is True.
file_name – A string that sets the name of the output. Default is None.
save – A bool that decides whether to write the output to disk. Default is True.
n_jobs – An int that sets the number of threads used by the program. Default is -1, meaning all available threads are used.
logger – A log_file object. Default is None.
**kwargs – Other parameters passed to the feature_selection function.
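
The "balanced" formula quoted for class_weight, n_samples / (n_classes * np.bincount(y)), can be checked by hand. The snippet below is a minimal sketch using only NumPy. The helper name balanced_class_weights and the toy labels are hypothetical and not part of SuperSCC; the snippet illustrates the formula only, not how sub_consensus_cluster applies it internally.

```python
import numpy as np


def balanced_class_weights(y):
    """Hypothetical helper: map each class label to its 'balanced' weight.

    Implements n_samples / (n_classes * np.bincount(y)) as quoted in the
    class_weight description above.
    """
    y = np.asarray(y)
    # Encode arbitrary labels as integers 0..n_classes-1 so np.bincount applies.
    classes, y_encoded = np.unique(y, return_inverse=True)
    counts = np.bincount(y_encoded)
    weights = y.shape[0] / (classes.shape[0] * counts)
    return dict(zip(classes.tolist(), weights.tolist()))


# Toy cluster labels (made up for illustration): three cells in "a", one in "b".
labels = ["a", "a", "a", "b"]
weights = balanced_class_weights(labels)
print(weights)
```

With these toy labels the rarer class "b" receives the larger weight (4 / (2 * 1) = 2.0) and the more frequent class "a" the smaller one (4 / (2 * 3), about 0.67), which is what "inversely proportional to class frequencies" means.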