Module explosig_connect.connection
Classes
class Connection-
Represents a connection to an ExploSig session, with functions for transforming and sending data.
Methods
def send_sample_metadata(self, df)-
Send a dataframe containing sample metadata values to ExploSig.
Parameters
df:pandas.DataFrame- Dataframe with index of sample IDs. Columns are metadata variables.
The following column names are recognized: {Study, Donor}.
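As an illustration of the input described above, the following sketch builds a small sample metadata dataframe. The sample IDs, study names, and donor IDs are invented for illustration, and `conn` stands for a connection object that is assumed to exist already (it is not created here).

```python
import pandas as pd

# Hypothetical sample IDs and values, for illustration only.
metadata_df = pd.DataFrame(
    data={
        "Study": ["Study-A", "Study-A", "Study-B"],
        "Donor": ["Donor-1", "Donor-2", "Donor-3"],
    },
    index=["Sample-1", "Sample-2", "Sample-3"],  # index holds the sample IDs
)

# With an existing connection object `conn` (assumed, not created here):
# conn.send_sample_metadata(metadata_df)
```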
def send_mutation_type_counts(self, df)-
Send a dataframe containing mutation count values by mutation type to ExploSig.
Parameters
df:pandas.DataFrame- Dataframe with index of sample IDs. Columns are mutation types (SBS, DBS, INDEL).
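A minimal sketch of a dataframe in this shape, together with a local pre-flight check on the column names. The helper `check_mutation_type_counts` is hypothetical and not part of explosig_connect; the sample IDs and counts are invented.

```python
import pandas as pd

MUT_TYPES = ["SBS", "DBS", "INDEL"]

def check_mutation_type_counts(df):
    """Hypothetical local check: every column must be a documented mutation type."""
    unknown = [col for col in df.columns if col not in MUT_TYPES]
    if unknown:
        raise ValueError(f"Unexpected mutation type columns: {unknown}")
    return df

# Invented counts, one row per sample ID.
counts_df = pd.DataFrame(
    data=[[120, 3, 14], [87, 0, 9]],
    columns=MUT_TYPES,
    index=["Sample-1", "Sample-2"],
)
check_mutation_type_counts(counts_df)

# With an existing connection object `conn` (assumed, not created here):
# conn.send_mutation_type_counts(counts_df)
```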
def send_signatures(self, mut_type, df, prob_max=None)-
Send a dataframe containing signatures to ExploSig.
Parameters
mut_type:str- The mutation type corresponding to this set of signatures (SBS, DBS, INDEL).
df:pandas.DataFrame- Dataframe with index of signature names. Columns are mutation categories (A[C>A]A, etc.).
prob_max:None or 'auto', optional- How to compute the maximum y-value of signature plots. If None, defaults to 0.2. If 'auto', set to the maximum value in the matrix. By default None.
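The prob_max rule described above can be sketched as follows. `resolve_prob_max` is a hypothetical helper written for this illustration; it mirrors the documented behavior but is not taken from the library's source. The signature names and probabilities are invented.

```python
import pandas as pd

def resolve_prob_max(df, prob_max=None):
    """Sketch of the documented rule: None -> 0.2, 'auto' -> largest value in df."""
    if prob_max is None:
        return 0.2
    if prob_max == "auto":
        return float(df.to_numpy().max())
    raise ValueError("prob_max must be None or 'auto'")

# Invented signature matrix: rows are signatures, columns are mutation categories.
signatures_df = pd.DataFrame(
    data=[[0.10, 0.35, 0.55], [0.60, 0.30, 0.10]],
    columns=["A[C>A]A", "A[C>A]C", "A[C>A]G"],
    index=["Signature-1", "Signature-2"],
)

default_max = resolve_prob_max(signatures_df)       # fixed default
auto_max = resolve_prob_max(signatures_df, "auto")  # largest matrix entry

# With an existing connection object `conn` (assumed, not created here):
# conn.send_signatures("SBS", signatures_df, prob_max="auto")
```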
def send_exposures(self, mut_type, df, send_sigs=False)-
Send a dataframe containing exposures to ExploSig.
Parameters
mut_type:str- The mutation type corresponding to this set of signatures (SBS, DBS, INDEL).
df:pandas.DataFrame- Dataframe with index of sample IDs. Columns are signature names.
send_sigs:bool, optional- Whether to also send signature names with the exposures. Useful if not intending to call send_signatures(). By default False.
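Since exposure columns and signature rows both carry signature names, a quick local consistency check can be useful before sending. The sketch below uses a hypothetical helper, `missing_signatures`, and invented data; it is not part of explosig_connect.

```python
import pandas as pd

def missing_signatures(exposures_df, signatures_df):
    """Return exposure columns that have no matching row in the signatures dataframe."""
    return [name for name in exposures_df.columns if name not in signatures_df.index]

# Invented data: two signatures are defined, but exposures refer to three.
signatures_df = pd.DataFrame(
    data=[[0.5, 0.5], [0.2, 0.8]],
    columns=["A[C>A]A", "A[C>A]C"],
    index=["Signature-1", "Signature-2"],
)
exposures_df = pd.DataFrame(
    data=[[10, 5, 1], [0, 7, 3]],
    columns=["Signature-1", "Signature-2", "Signature-3"],
    index=["Sample-1", "Sample-2"],
)

unmatched = missing_signatures(exposures_df, signatures_df)

# With an existing connection object `conn` (assumed, not created here).
# send_sigs=True is the documented option when send_signatures() will not be called:
# conn.send_exposures("SBS", exposures_df, send_sigs=True)
```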
def send_clinical_data(self, df, types={}, scales={})-
Send a dataframe containing clinical data.
Parameters
df:pandas.DataFrame- Dataframe with index of sample IDs. Columns are clinical variables.
types:dict, optional- A dict mapping column names to data types ('continuous' or 'categorical'), by default {}. If a column name is not found in the dict, numeric columns are assumed to be continuous and string columns are assumed to be categorical.
scales:dict, optional- A dict mapping column names to scale domains, by default {}. If a column name is not found in the dict, the scale of a categorical column is assumed to be the list of its unique elements, and the scale of a continuous column is assumed to be [min, max].
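The fallback rules for types and scales can be sketched as below. `infer_types_and_scales` is a hypothetical helper that follows the description above; the library's own implementation may differ in details such as element order or numeric types. The column names and values are invented.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

def infer_types_and_scales(df, types=None, scales=None):
    """Fill in missing entries of `types` and `scales` using the documented defaults."""
    types = dict(types or {})    # copy so the caller's dicts are not modified
    scales = dict(scales or {})
    for col in df.columns:
        if col not in types:
            # Numeric columns default to continuous, everything else to categorical.
            types[col] = "continuous" if is_numeric_dtype(df[col]) else "categorical"
        if col not in scales:
            if types[col] == "continuous":
                scales[col] = [float(df[col].min()), float(df[col].max())]
            else:
                scales[col] = df[col].unique().tolist()
    return types, scales

# Invented clinical variables.
clinical_df = pd.DataFrame(
    data={"Age": [41, 67, 58], "Smoker": ["Yes", "No", "Yes"]},
    index=["Sample-1", "Sample-2", "Sample-3"],
)
types, scales = infer_types_and_scales(clinical_df)

# With an existing connection object `conn` (assumed, not created here):
# conn.send_clinical_data(clinical_df, types=types, scales=scales)
```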
def send_gene_mutation_data(self, df)-
Send a dataframe containing gene mutation data.
Parameters
df:pandas.DataFrame- Dataframe with index of sample IDs. Columns are gene IDs.
def send_gene_expression_data(self, df)-
Send a dataframe containing gene expression data.
Parameters
df:pandas.DataFrame- Dataframe with index of sample IDs. Columns are gene IDs.
def send_copy_number_data(self, df)-
Send a dataframe containing copy number data.
Parameters
df:pandas.DataFrame- Dataframe with index of sample IDs. Columns are gene IDs.
class ConfigConnection (session_id, token, server_hostname, client_hostname)-
Represents a connection to a previously-configured ExploSig session.
Methods
def get_config(self)-
Get the current data configuration as a dict.
Returns
dict- A dictionary containing the selected samples, signatures, clinical variables, and genes.
def get_mutation_type_counts(self)-
Get the counts by mutation type dataframe associated with the current config.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and mutation types (SBS, DBS, INDEL) on the columns. Values are counts.
def get_mutation_category_counts(self, mut_type)-
Get the counts by mutation category dataframe (for a particular mutation type) associated with the current config.
Parameters
mut_type:str- One of {'SBS', 'DBS', 'INDEL'}.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and mutation categories on the columns. Values are counts.
def get_clinical_data(self)-
Get the clinical data dataframe associated with the current config.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and clinical variables on the columns.
def get_gene_mutation_data(self)-
Get a dataframe containing mutation classes associated with the current config.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and genes on the columns. Values are mutation classes.
def get_gene_expression_data(self)-
Get a dataframe containing gene expression values associated with the current config.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and genes on the columns. Values are gene expression classes.
def get_copy_number_data(self)-
Get a dataframe containing copy number values associated with the current config.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and genes on the columns. Values are copy number classes.
def get_exposures(self, mut_type, tricounts_method=None)-
Get the sample by signature exposures dataframe (for a particular mutation type) associated with the current config.
Parameters
mut_type:str- One of {'SBS', 'DBS', 'INDEL'}.
tricounts_method:str, optional- One of {'By Study', 'None'}. Whether or not to normalize trinucleotides by frequency (based on sequencing strategy of each selected cohort). By default, 'None'.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and signature names on the columns. Values are counts (exposures).
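A minimal sketch of calling get_exposures on a configured connection with the documented argument values checked first. `fetch_exposures` is a hypothetical wrapper, not part of explosig_connect, and `conn` is assumed to be an existing ConfigConnection-like object.

```python
MUT_TYPES = ("SBS", "DBS", "INDEL")
TRICOUNTS_METHODS = ("By Study", "None")

def fetch_exposures(conn, mut_type, tricounts_method=None):
    """Validate arguments against the documented choices, then call conn.get_exposures."""
    if mut_type not in MUT_TYPES:
        raise ValueError(f"mut_type must be one of {MUT_TYPES}, got {mut_type!r}")
    if tricounts_method is not None and tricounts_method not in TRICOUNTS_METHODS:
        raise ValueError(
            f"tricounts_method must be one of {TRICOUNTS_METHODS}, got {tricounts_method!r}"
        )
    # Pass the value through unchanged; the connection applies its own default.
    return conn.get_exposures(mut_type, tricounts_method=tricounts_method)

# With an existing connection object `conn` (assumed, not created here):
# sbs_exposures = fetch_exposures(conn, "SBS", tricounts_method="By Study")
```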
class EmptyConnection (session_id, token, server_hostname, client_hostname)-
Represents a connection to an "empty" ExploSig session.
Methods
def open(self, how='auto')-
Attempts to open the session URL in a browser. Calls webbrowser.open if how == 'browser'. Outputs JavaScript if how == 'nb_js'. Outputs HTML if how == 'nb_link'. Otherwise, simply prints the URL.
Parameters
how:str, optional- One of {'auto', 'nb_js', 'nb_link', 'browser'}, by default 'auto'.
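The dispatch on `how` described above can be sketched as follows. This is an illustration only: `open_url` and its `browser_open` parameter are hypothetical, the JavaScript and HTML strings are placeholders rather than the library's actual output, and the strings are returned instead of being displayed in a notebook.

```python
import html
import json
import webbrowser

def open_url(url, how="auto", browser_open=webbrowser.open):
    """Sketch of the documented branches; returns the generated snippet, if any."""
    if how == "browser":
        browser_open(url)  # webbrowser.open by default
        return None
    if how == "nb_js":
        # Placeholder JavaScript that would open the URL from a notebook cell.
        return "window.open(" + json.dumps(url) + ");"
    if how == "nb_link":
        # Placeholder HTML link for notebook output.
        safe = html.escape(url, quote=True)
        return '<a href="' + safe + '" target="_blank">' + safe + "</a>"
    # Any other value, including 'auto' in this sketch, just prints the URL.
    print(url)
    return None
```

Passing a custom `browser_open` callable makes the 'browser' branch easy to exercise without launching a real browser.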
def get_mutation_type_counts(self, projects)-
Get the counts by mutation type dataframe (for a particular set of sequencing projects).
Parameters
projects:list of str- A list of sample cohort IDs.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and mutation types (SBS, DBS, INDEL) on the columns. Values are counts.
def get_mutation_category_counts(self, mut_type, projects)-
Get a mutation count dataframe (for a particular mutation type and set of sequencing projects).
Parameters
mut_type:str- One of {'SBS', 'DBS', 'INDEL'}.
projects:list of str- A list of sample cohort IDs.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and mutation categories on the columns. Values are counts.
def get_clinical_data(self, projects)-
Get a clinical data dataframe (for a particular set of sequencing projects).
Parameters
projects:list of str- A list of sample cohort IDs.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and clinical variables on the columns.
def get_gene_mutation_data(self, genes, projects)-
Get a dataframe containing mutation classes (for a particular set of genes and set of sequencing projects).
Parameters
genes:list of str- A list of gene IDs.
projects:list of str- A list of sample cohort IDs.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and genes on the columns. Values are mutation classes.
def get_gene_expression_data(self, genes, projects)-
Get a dataframe containing gene expression values (for a particular set of genes and set of sequencing projects).
Parameters
genes:list of str- A list of gene IDs.
projects:list of str- A list of sample cohort IDs.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and genes on the columns. Values are gene expression classes.
def get_copy_number_data(self, genes, projects)-
Get a dataframe containing copy number values (for a particular set of genes and set of sequencing projects).
Parameters
genes:list of str- A list of gene IDs.
projects:list of str- A list of sample cohort IDs.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and genes on the columns. Values are copy number classes.
def get_exposures(self, projects, signatures, mut_type, tricounts_method=None)-
Get the sample by signature exposures dataframe (for a particular mutation type, set of signatures, and set of sequencing projects).
Parameters
projects:list of str- A list of sample cohort IDs.
signatures:list of str- A list of signature names.
mut_type:str- One of {'SBS', 'DBS', 'INDEL'}.
tricounts_method:str, optional- One of {'By Study', 'None'}. Whether or not to normalize trinucleotides by frequency (based on sequencing strategy of each selected cohort). By default, 'None'.
Returns
pandas.DataFrame- A dataframe with sample IDs on the index and signature names on the columns. Values are counts (exposures).
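The three gene-level getters of this class share the same (genes, projects) arguments, so they can be fetched together. The sketch below assumes `conn` is an existing EmptyConnection-like object; `fetch_gene_data` is a hypothetical helper, and the gene and cohort IDs in the usage comment are placeholders.

```python
def fetch_gene_data(conn, genes, projects):
    """Collect the three gene-level dataframes for the same genes and cohorts."""
    return {
        "mutation": conn.get_gene_mutation_data(genes, projects),
        "expression": conn.get_gene_expression_data(genes, projects),
        "copy_number": conn.get_copy_number_data(genes, projects),
    }

# With an existing connection object `conn` (assumed, not created here),
# using placeholder gene and cohort IDs:
# gene_data = fetch_gene_data(conn, genes=["GENE-1", "GENE-2"], projects=["Cohort-A"])
# gene_data["mutation"] would then hold mutation classes, indexed by sample ID.
```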