Module explosig_connect.connection
Classes
class Connection
-
Represents a connection to an ExploSig session, with functions for transforming and sending data.
Methods
def send_sample_metadata(self, df)
-
Send a dataframe containing sample metadata values to ExploSig.
Parameters
df : pandas.DataFrame
- Dataframe with index of sample IDs. Columns are metadata variables. The following are recognized column names: {Study, Donor}.
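As a rough illustration of the expected input shape, the sketch below builds a small sample metadata dataframe. The sample IDs, study names, and donor IDs are made up, and `RecordingConnection` is a hypothetical stand-in that only records what it receives; it is not part of explosig_connect and does not contact any server.

```python
import pandas as pd


class RecordingConnection:
    """Hypothetical stand-in for a Connection; it only records what it is given."""

    def __init__(self):
        self.sent = []

    def send_sample_metadata(self, df):
        # A real connection would transform and send the data to ExploSig.
        self.sent.append(("sample_metadata", df))


# Index holds sample IDs; columns are metadata variables.
# "Study" and "Donor" are the recognized column names listed above.
sample_metadata = pd.DataFrame(
    {"Study": ["StudyA", "StudyA", "StudyB"], "Donor": ["D1", "D2", "D3"]},
    index=pd.Index(["S1", "S2", "S3"], name="sample_id"),
)

conn = RecordingConnection()
conn.send_sample_metadata(sample_metadata)
```

With a real connection object, the same dataframe would presumably be passed to `send_sample_metadata` unchanged.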
def send_mutation_type_counts(self, df)
-
Send a dataframe containing mutation count values by mutation type to ExploSig.
Parameters
df : pandas.DataFrame
- Dataframe with index of sample IDs. Columns are mutation types (SBS, DBS, INDEL).
def send_signatures(self, mut_type, df, prob_max=None)
-
Send a dataframe containing signatures to ExploSig.
Parameters
mut_type : str
- The mutation type corresponding to this set of signatures (SBS, DBS, INDEL).
df : pandas.DataFrame
- Dataframe with index of signature names. Columns are mutation categories (A[C>A]A, etc.).
prob_max : None or 'auto', optional
- How to compute the maximum y-value of signature plots. If None, defaults to 0.2. If 'auto', set to the maximum value in the matrix. By default None.
def send_exposures(self, mut_type, df, send_sigs=False)
-
Send a dataframe containing exposures to ExploSig.
Parameters
mut_type : str
- The mutation type corresponding to this set of signatures (SBS, DBS, INDEL).
df : pandas.DataFrame
- Dataframe with index of sample IDs. Columns are signature names.
send_sigs : bool, optional
- Whether to also send signature names with the exposures. Useful if not intending to call send_signatures(). By default False.
def send_clinical_data(self, df, types={}, scales={})
-
Send a dataframe containing clinical data.
Parameters
df : pandas.DataFrame
- Dataframe with index of sample IDs. Columns are clinical variables.
types : dict, optional
- A dict mapping column names to data types ('continuous' or 'categorical'), by default {}. If a column name is not found in the dict, numeric columns are assumed to be continuous and string columns categorical.
scales : dict, optional
- A dict mapping column names to scale domains, by default {}. If a column name is not found in the dict, the scale of a categorical column is assumed to be the list of its unique elements, and the scale of a continuous column is assumed to be [min, max].
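The fallback rules for `types` and `scales` can be sketched roughly as below. `infer_types_and_scales` is a hypothetical helper written for this example, and the clinical columns are invented; it assumes that a missing scale is derived from the column's resolved type, which the description above does not state explicitly.

```python
import pandas as pd


def infer_types_and_scales(df, types=None, scales=None):
    """Illustrative helper (an assumption, not library code): fill in missing entries."""
    types = dict(types or {})
    scales = dict(scales or {})
    for col in df.columns:
        if col not in types:
            # Numeric columns are treated as continuous, everything else as categorical.
            is_numeric = pd.api.types.is_numeric_dtype(df[col])
            types[col] = "continuous" if is_numeric else "categorical"
        if col not in scales:
            if types[col] == "continuous":
                scales[col] = [float(df[col].min()), float(df[col].max())]
            else:
                # Unique elements, in order of first appearance.
                scales[col] = list(df[col].unique())
    return types, scales


clinical = pd.DataFrame(
    {"Age": [34, 71, 58], "Smoker": ["yes", "no", "yes"]},
    index=["S1", "S2", "S3"],
)

types, scales = infer_types_and_scales(clinical)
```

Entries supplied by the caller are kept as given, so passing `scales={"Age": [0, 100]}` to this helper would override the inferred [min, max] domain.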
def send_gene_mutation_data(self, df)
-
Send a dataframe containing gene mutation data.
Parameters
df : pandas.DataFrame
- Dataframe with index of sample IDs. Columns are gene IDs.
def send_gene_expression_data(self, df)
-
Send a dataframe containing gene expression data.
Parameters
df : pandas.DataFrame
- Dataframe with index of sample IDs. Columns are gene IDs.
def send_copy_number_data(self, df)
-
Send a dataframe containing copy number data.
Parameters
df : pandas.DataFrame
- Dataframe with index of sample IDs. Columns are gene IDs.
class ConfigConnection (session_id, token, server_hostname, client_hostname)
-
Represents a connection to a previously-configured ExploSig session.
Methods
def get_config(self)
-
Get the current data configuration as a dict.
Returns
dict
A dictionary containing the selected samples, signatures, clinical variables, and genes.
def get_mutation_type_counts(self)
-
Get the counts by mutation type dataframe associated with the current config.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and mutation types (SBS, DBS, INDEL) on the columns. Values are counts.
def get_mutation_category_counts(self, mut_type)
-
Get the counts by mutation category dataframe (for a particular mutation type) associated with the current config.
Parameters
mut_type : str
- One of {'SBS', 'DBS', 'INDEL'}.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and mutation categories on the columns. Values are counts.
def get_clinical_data(self)
-
Get the clinical data dataframe associated with the current config.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and clinical variables on the columns.
def get_gene_mutation_data(self)
-
Get a dataframe containing mutation classes associated with the current config.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and genes on the columns. Values are mutation classes.
def get_gene_expression_data(self)
-
Get a dataframe containing gene expression values associated with the current config.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and genes on the columns. Values are gene expression classes.
def get_copy_number_data(self)
-
Get a dataframe containing copy number values associated with the current config.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and genes on the columns. Values are copy number classes.
def get_exposures(self, mut_type, tricounts_method=None)
-
Get the sample by signature exposures dataframe (for a particular mutation type) associated with the current config.
Parameters
mut_type : str
- One of {'SBS', 'DBS', 'INDEL'}.
tricounts_method : str, optional
- One of {'By Study', 'None'}. Whether or not to normalize trinucleotides by frequency (based on sequencing strategy of each selected cohort). By default, 'None'.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and signature names on the columns. Values are counts (exposures).
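Since the returned values are counts, a common follow-up step is to convert each sample's exposures to proportions. The sketch below uses a hand-built dataframe with the documented shape (sample IDs on the index, signature names on the columns); the IDs, names, and counts are invented, and nothing here calls explosig_connect itself.

```python
import pandas as pd

# Stand-in for the dataframe returned by get_exposures (values are counts).
exposures = pd.DataFrame(
    {"Signature 1": [30, 5], "Signature 2": [10, 15]},
    index=["S1", "S2"],
)

# Divide each row by its total so every sample's exposures sum to 1.
proportions = exposures.div(exposures.sum(axis=1), axis=0)
```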
class EmptyConnection (session_id, token, server_hostname, client_hostname)
-
Represents a connection to an "empty" ExploSig session.
Methods
def open(self, how='auto')
-
Attempts to open the session URL in a browser. Calls webbrowser.open if how == 'browser'. Outputs JavaScript if how == 'nb_js'. Outputs HTML if how == 'nb_link'. Otherwise, simply prints the URL.
Parameters
how : str, optional
- One of {'auto', 'nb_js', 'nb_link', 'browser'}, by default 'auto'.
def get_mutation_type_counts(self, projects)
-
Get the counts by mutation type dataframe associated with the current config.
Parameters
projects : list of str
- A list of sample cohort IDs.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and mutation types (SBS, DBS, INDEL) on the columns. Values are counts.
def get_mutation_category_counts(self, mut_type, projects)
-
Get a mutation count dataframe (for a particular mutation type and set of sequencing projects).
Parameters
mut_type : str
- One of {'SBS', 'DBS', 'INDEL'}.
projects : list of str
- A list of sample cohort IDs.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and mutation categories on the columns. Values are counts.
def get_clinical_data(self, projects)
-
Get a clinical data dataframe (for a particular set of sequencing projects).
Parameters
projects : list of str
- A list of sample cohort IDs.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and clinical variables on the columns.
def get_gene_mutation_data(self, genes, projects)
-
Get a dataframe containing mutation classes (for a particular set of genes and set of sequencing projects).
Parameters
genes : list of str
- A list of gene IDs.
projects : list of str
- A list of sample cohort IDs.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and genes on the columns. Values are mutation classes.
def get_gene_expression_data(self, genes, projects)
-
Get a dataframe containing gene expression values (for a particular set of genes and set of sequencing projects).
Parameters
genes : list of str
- A list of gene IDs.
projects : list of str
- A list of sample cohort IDs.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and genes on the columns. Values are gene expression classes.
def get_copy_number_data(self, genes, projects)
-
Get a dataframe containing copy number values (for a particular set of genes and set of sequencing projects).
Parameters
genes : list of str
- A list of gene IDs.
projects : list of str
- A list of sample cohort IDs.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and genes on the columns. Values are copy number classes.
def get_exposures(self, projects, signatures, mut_type, tricounts_method=None)
-
Get the sample by signature exposures dataframe (for a particular mutation type) associated with the current config.
Parameters
projects : list of str
- A list of sample cohort IDs.
signatures : list of str
- A list of signature names.
mut_type : str
- One of {'SBS', 'DBS', 'INDEL'}.
tricounts_method : str, optional
- One of {'By Study', 'None'}. Whether or not to normalize trinucleotides by frequency (based on sequencing strategy of each selected cohort). By default, 'None'.
Returns
pandas.DataFrame
A dataframe with sample IDs on the index and signature names on the columns. Values are counts (exposures).