Module `explosig_connect.connection`

Classes

class Connection

Represents a connection to an ExploSig session, with functions for transforming and sending data.

Subclasses

ConfigConnection
EmptyConnection

Methods

def send_sample_metadata(self, df)

Send a dataframe containing sample metadata values to ExploSig.

Parameters

df : pandas.DataFrame: Dataframe with index of sample IDs. Columns are metadata variables. The following are recognized column names: {Study, Donor}.
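
As a minimal sketch of the expected input, the dataframe below uses sample IDs as the index and the two recognized column names. The sample IDs and values are hypothetical, and `conn` stands for an existing Connection instance whose creation is not shown here.

```python
import pandas as pd

# Hypothetical sample IDs and metadata values, for illustration only.
metadata_df = pd.DataFrame(
    {
        "Study": ["Cohort-A", "Cohort-A", "Cohort-B"],
        "Donor": ["Donor-1", "Donor-2", "Donor-3"],
    },
    index=["Sample-1", "Sample-2", "Sample-3"],
)

# With an existing Connection instance `conn` (creation not shown):
# conn.send_sample_metadata(metadata_df)
```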

def send_mutation_type_counts(self, df)

Send a dataframe containing mutation count values by mutation type to ExploSig.

Parameters

df : pandas.DataFrame: Dataframe with index of sample IDs. Columns are mutation types (SBS, DBS, INDEL).
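
A small sketch of a dataframe in this shape, with one column per mutation type. The sample IDs and counts are made up, and `conn` is assumed to be an existing Connection instance.

```python
import pandas as pd

# Hypothetical counts: one row per sample, one column per mutation type.
counts_df = pd.DataFrame(
    {"SBS": [120, 45], "DBS": [3, 0], "INDEL": [10, 7]},
    index=["Sample-1", "Sample-2"],
)

# With an existing Connection instance `conn` (creation not shown):
# conn.send_mutation_type_counts(counts_df)
```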

def send_signatures(self, mut_type, df, prob_max=None)

Send a dataframe containing signatures to ExploSig.

Parameters

mut_type : str: The mutation type corresponding to this set of signatures (SBS, DBS, INDEL).
df : pandas.DataFrame: Dataframe with index of signature names. Columns are mutation categories (A[C>A]A, etc.).
prob_max : None or 'auto', optional: How to compute the maximum y-value of signature plots. If None, the maximum is 0.2. If 'auto', the maximum is the largest value in the matrix. Defaults to None.
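
The `prob_max` rule described above can be sketched as follows. The helper `resolve_prob_max` is hypothetical and not part of explosig_connect; it only restates the documented behaviour. The toy matrix has three mutation categories rather than a full set.

```python
import pandas as pd


def resolve_prob_max(df, prob_max=None):
    """Illustrative helper (not part of explosig_connect)."""
    if prob_max is None:
        return 0.2  # documented default
    if prob_max == "auto":
        return float(df.values.max())  # largest value in the matrix
    raise ValueError("prob_max must be None or 'auto'")


# Hypothetical signature names; each row sums to 1.
signatures_df = pd.DataFrame(
    {"A[C>A]A": [0.5, 0.1], "A[C>A]C": [0.3, 0.6], "A[C>A]G": [0.2, 0.3]},
    index=["Signature-1", "Signature-2"],
)

default_max = resolve_prob_max(signatures_df)
auto_max = resolve_prob_max(signatures_df, "auto")

# With an existing Connection instance `conn` (creation not shown):
# conn.send_signatures("SBS", signatures_df, prob_max="auto")
```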

def send_exposures(self, mut_type, df, send_sigs=False)

Send a dataframe containing exposures to ExploSig.

Parameters

mut_type : str: The mutation type corresponding to this set of exposures (SBS, DBS, INDEL).
df : pandas.DataFrame: Dataframe with index of sample IDs. Columns are signature names.
send_sigs : bool, optional: Whether to also send signature names with the exposures. Useful when you do not intend to call send_signatures(). Defaults to False.
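
A sketch of an exposures dataframe, with samples on the index and signature names on the columns. The names and values are hypothetical. The signature names that `send_sigs=True` would accompany are presumably the column labels, which is an assumption drawn from the parameter descriptions.

```python
import pandas as pd

# Hypothetical exposures: one row per sample, one column per signature.
exposures_df = pd.DataFrame(
    {"Signature-1": [80.0, 10.0], "Signature-2": [40.0, 35.0]},
    index=["Sample-1", "Sample-2"],
)

# Assumption: the signature names are the dataframe's column labels.
signature_names = exposures_df.columns.tolist()

# With an existing Connection instance `conn` (creation not shown):
# conn.send_exposures("SBS", exposures_df, send_sigs=True)
```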

def send_clinical_data(self, df, types={}, scales={})

Send a dataframe containing clinical data.

Parameters

df : pandas.DataFrame: Dataframe with index of sample IDs. Columns are clinical variables.
types : dict, optional: A dict mapping column names to data types ('continuous' or 'categorical'). Defaults to {}. If a column name is not found in the dict, numeric columns are assumed to be continuous and string columns categorical.
scales : dict, optional: A dict mapping column names to scale domains. Defaults to {}. If a column name is not found in the dict, the scale of a categorical column is assumed to be the list of its unique elements, and the scale of a continuous column is assumed to be [min, max].
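
The fallback rules for `types` and `scales` can be sketched as below. The helper `infer_types_and_scales` is hypothetical and only mirrors the description above; the library's actual inference may differ in detail. Column names and values are made up.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def infer_types_and_scales(df, types=None, scales=None):
    """Illustrative helper (not part of explosig_connect)."""
    types = dict(types or {})
    scales = dict(scales or {})
    for col in df.columns:
        if col not in types:
            # Numeric columns are continuous, everything else categorical.
            types[col] = "continuous" if is_numeric_dtype(df[col]) else "categorical"
        if col not in scales:
            if types[col] == "continuous":
                scales[col] = [float(df[col].min()), float(df[col].max())]
            else:
                scales[col] = df[col].unique().tolist()
    return types, scales


# Hypothetical clinical variables.
clinical_df = pd.DataFrame(
    {"Age": [61, 47, 70], "Sex": ["F", "M", "F"]},
    index=["Sample-1", "Sample-2", "Sample-3"],
)

types, scales = infer_types_and_scales(clinical_df)

# With an existing Connection instance `conn` (creation not shown):
# conn.send_clinical_data(clinical_df, types=types, scales=scales)
```

Passing explicit dicts for only some columns leaves the remaining columns to the fallback rules.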

def send_gene_mutation_data(self, df)

Send a dataframe containing gene mutation data.

Parameters

df : pandas.DataFrame: Dataframe with index of sample IDs. Columns are gene IDs.
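
A sketch of a gene-level dataframe with samples on the index and gene IDs on the columns. The cell values are assumed to be mutation class labels, and the labels used here are hypothetical placeholders rather than the values ExploSig recognizes.

```python
import pandas as pd

# Hypothetical sample IDs and mutation class labels, for illustration only.
gene_mutation_df = pd.DataFrame(
    {"TP53": ["Missense", "None"], "KRAS": ["None", "Missense"]},
    index=["Sample-1", "Sample-2"],
)

# With an existing Connection instance `conn` (creation not shown):
# conn.send_gene_mutation_data(gene_mutation_df)
```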

def send_gene_expression_data(self, df)

Send a dataframe containing gene expression data.

Parameters

df : pandas.DataFrame: Dataframe with index of sample IDs. Columns are gene IDs.

def send_copy_number_data(self, df)

Send a dataframe containing copy number data.

Parameters

df : pandas.DataFrame: Dataframe with index of sample IDs. Columns are gene IDs.

class ConfigConnection (session_id, token, server_hostname, client_hostname)

Represents a connection to a previously-configured ExploSig session.

Ancestors

Connection

Methods

def get_config(self)

Get the current data configuration as a dict.

Returns

dict: A dictionary containing the selected samples, signatures, clinical variables, and genes.

def get_mutation_type_counts(self)

Get the counts by mutation type dataframe associated with the current config.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and mutation types (SBS, DBS, INDEL) on the columns. Values are counts.
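
As an example of working with a dataframe in the returned shape, the sketch below sums the counts per sample. The dataframe is a hand-built stand-in with hypothetical values, not real output from ExploSig.

```python
import pandas as pd

# Stand-in for the returned dataframe (hypothetical values).
# With a ConfigConnection instance `conn` (creation not shown):
# counts_df = conn.get_mutation_type_counts()
counts_df = pd.DataFrame(
    {"SBS": [120, 45], "DBS": [3, 0], "INDEL": [10, 7]},
    index=["Sample-1", "Sample-2"],
)

# Total mutation count per sample across all mutation types.
totals = counts_df.sum(axis=1)
```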

def get_mutation_category_counts(self, mut_type)

Get the counts by mutation category dataframe (for a particular mutation type) associated with the current config.

Parameters

mut_type : str: One of {'SBS', 'DBS', 'INDEL'}.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and mutation categories on the columns. Values are counts.
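
Because the values are counts, a common follow-up is to convert each sample's row to proportions. The sketch below does this on a hand-built stand-in with two hypothetical categories; a real table would have one column per mutation category.

```python
import pandas as pd

# Stand-in for the returned dataframe (hypothetical values).
# With a ConfigConnection instance `conn` (creation not shown):
# category_counts_df = conn.get_mutation_category_counts("SBS")
category_counts_df = pd.DataFrame(
    {"A[C>A]A": [6, 1], "A[C>A]C": [2, 3]},
    index=["Sample-1", "Sample-2"],
)

# Divide each row by its total so that every sample sums to 1.
proportions = category_counts_df.div(category_counts_df.sum(axis=1), axis=0)
```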

def get_clinical_data(self)

Get the clinical data dataframe associated with the current config.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and clinical variables on the columns.

def get_gene_mutation_data(self)

Get a dataframe containing mutation classes associated with the current config.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and genes on the columns. Values are mutation classes.

def get_gene_expression_data(self)

Get a dataframe containing gene expression values associated with the current config.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and genes on the columns. Values are gene expression classes.

def get_copy_number_data(self)

Get a dataframe containing copy number values associated with the current config.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and genes on the columns. Values are copy number classes.

def get_exposures(self, mut_type, tricounts_method=None)

Get the sample by signature exposures dataframe (for a particular mutation type) associated with the current config.

Parameters

mut_type : str: One of {'SBS', 'DBS', 'INDEL'}.
tricounts_method : str, optional: One of {'By Study', 'None'}. Whether or not to normalize trinucleotides by frequency (based on sequencing strategy of each selected cohort). By default, 'None'.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and signature names on the columns. Values are counts (exposures).
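
One way to use a dataframe in this shape is to find the signature with the largest exposure in each sample. The sketch below works on a hand-built stand-in with hypothetical names and values.

```python
import pandas as pd

# Stand-in for the returned dataframe (hypothetical values).
# With a ConfigConnection instance `conn` (creation not shown):
# exposures_df = conn.get_exposures("SBS", tricounts_method="By Study")
exposures_df = pd.DataFrame(
    {"Signature-1": [80.0, 10.0], "Signature-2": [40.0, 35.0]},
    index=["Sample-1", "Sample-2"],
)

# Column label of the largest exposure in each row.
dominant_signature = exposures_df.idxmax(axis=1).to_dict()
```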

Inherited members

Connection:
- send_clinical_data
- send_copy_number_data
- send_exposures
- send_gene_expression_data
- send_gene_mutation_data
- send_mutation_type_counts
- send_sample_metadata
- send_signatures

class EmptyConnection (session_id, token, server_hostname, client_hostname)

Represents a connection to an "empty" ExploSig session.

Ancestors

Connection

Methods

def open(self, how='auto')

Attempts to open the session URL in a browser. Calls webbrowser.open if how == 'browser'. Outputs JavaScript if how == 'nb_js'. Outputs HTML if how == 'nb_link'. Otherwise, simply prints the URL.

Parameters

how : str, optional: One of {'auto', 'nb_js', 'nb_link', 'browser'}, by default 'auto'
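
The mode dispatch described above can be restated as a small lookup. This is a literal reading of the description, not the library's code: the helper `describe_open` is hypothetical, and under this reading any value without its own rule, including 'auto', falls through to printing the URL. The actual handling of 'auto' may differ.

```python
# Behaviour per mode, as stated in the description of open().
DOCUMENTED_ACTIONS = {
    "browser": "call webbrowser.open with the session URL",
    "nb_js": "output JavaScript",
    "nb_link": "output HTML",
}


def describe_open(how="auto"):
    """Illustrative helper (not part of explosig_connect)."""
    # Any mode without its own rule falls back to printing the URL.
    return DOCUMENTED_ACTIONS.get(how, "print the session URL")


for mode in ("auto", "nb_js", "nb_link", "browser"):
    print(f"{mode}: {describe_open(mode)}")
```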

def get_mutation_type_counts(self, projects)

Get the counts by mutation type dataframe (for a particular set of sequencing projects).

Parameters

projects : list of str: A list of sample cohort IDs.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and mutation types (SBS, DBS, INDEL) on the columns. Values are counts.

def get_mutation_category_counts(self, mut_type, projects)

Get a mutation count dataframe (for a particular mutation type and set of sequencing projects).

Parameters

mut_type : str: One of {'SBS', 'DBS', 'INDEL'}.
projects : list of str: A list of sample cohort IDs.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and mutation categories on the columns. Values are counts.

def get_clinical_data(self, projects)

Get a clinical data dataframe (for a particular set of sequencing projects).

Parameters

projects : list of str: A list of sample cohort IDs.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and clinical variables on the columns.

def get_gene_mutation_data(self, genes, projects)

Get a dataframe containing mutation classes (for a particular set of genes and set of sequencing projects).

Parameters

genes : list of str: A list of gene IDs.
projects : list of str: A list of sample cohort IDs.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and genes on the columns. Values are mutation classes.

def get_gene_expression_data(self, genes, projects)

Get a dataframe containing gene expression values (for a particular set of genes and set of sequencing projects).

Parameters

genes : list of str: A list of gene IDs.
projects : list of str: A list of sample cohort IDs.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and genes on the columns. Values are gene expression classes.

def get_copy_number_data(self, genes, projects)

Get a dataframe containing copy number values (for a particular set of genes and set of sequencing projects).

Parameters

genes : list of str: A list of gene IDs.
projects : list of str: A list of sample cohort IDs.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and genes on the columns. Values are copy number classes.

def get_exposures(self, projects, signatures, mut_type, tricounts_method=None)

Get the sample by signature exposures dataframe (for a particular mutation type, set of signatures, and set of sequencing projects).

Parameters

projects : list of str: A list of sample cohort IDs.
signatures : list of str: A list of signature names.
mut_type : str: One of {'SBS', 'DBS', 'INDEL'}.
tricounts_method : str, optional: One of {'By Study', 'None'}. Whether or not to normalize trinucleotides by frequency (based on sequencing strategy of each selected cohort). By default, 'None'.

Returns

pandas.DataFrame: A dataframe with sample IDs on the index and signature names on the columns. Values are counts (exposures).
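
The positional order here (projects, signatures, mut_type) differs from ConfigConnection.get_exposures, which takes mut_type first, so keyword arguments help avoid mix-ups. The sketch below uses a hypothetical stub that mirrors only the documented signature and records its arguments; it does not contact ExploSig, and the project and signature names are placeholders.

```python
class StubEmptyConnection:
    """Hypothetical stand-in that only mirrors the documented signature."""

    def get_exposures(self, projects, signatures, mut_type, tricounts_method=None):
        # Record the arguments instead of requesting data.
        return {
            "projects": projects,
            "signatures": signatures,
            "mut_type": mut_type,
            "tricounts_method": tricounts_method,
        }


stub = StubEmptyConnection()

# Keyword arguments make the call independent of positional order.
request = stub.get_exposures(
    projects=["Project-1"],
    signatures=["Signature-1", "Signature-2"],
    mut_type="SBS",
    tricounts_method="By Study",
)
```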

Inherited members

Connection:
- send_clinical_data
- send_copy_number_data
- send_exposures
- send_gene_expression_data
- send_gene_mutation_data
- send_mutation_type_counts
- send_sample_metadata
- send_signatures