Transform functions

Helper functions to process profile data.

cytominer_eval.transform.transform

cytominer_eval.transform.transform.get_pairwise_metric(df: pandas.core.frame.DataFrame, similarity_metric: str) → pandas.core.frame.DataFrame

cytominer_eval.transform.transform.metric_melt(df: pandas.core.frame.DataFrame, features: List[str], metadata_features: List[str], eval_metric: str = 'replicate_reproducibility', similarity_metric: str = 'pearson') → pandas.core.frame.DataFrame

cytominer_eval.transform.transform.process_melt(df: pandas.core.frame.DataFrame, meta_df: pandas.core.frame.DataFrame, eval_metric: str = 'replicate_reproducibility') → pandas.core.frame.DataFrame
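The three signatures above suggest a flow in which a pairwise similarity matrix between profiles is computed and then melted into long format. The following is a minimal sketch of that idea using only pandas and NumPy. It does not call cytominer_eval, and the helper name `pairwise_then_melt`, the toy data, and the output column names are all hypothetical.

```python
import numpy as np
import pandas as pd


def pairwise_then_melt(df: pd.DataFrame, features: list, method: str = "pearson") -> pd.DataFrame:
    """Illustrative only: correlate profiles (rows) and return one row per unordered pair."""
    # Transpose so that each profile becomes a column, then correlate columns.
    corr = df.loc[:, features].transpose().corr(method=method)

    # Positions of the strict upper triangle: each pair once, no self-comparisons.
    rows, cols = np.triu_indices(len(corr), k=1)

    return pd.DataFrame(
        {
            "pair_a_index": corr.index[rows],
            "pair_b_index": corr.columns[cols],
            "similarity_metric": corr.to_numpy()[rows, cols],
        }
    )


# Toy profiles: rows 0 and 1 are proportional, so their Pearson correlation is 1.
profiles = pd.DataFrame(
    {
        "Metadata_gene": ["A", "A", "B"],
        "f1": [1.0, 2.0, 3.0],
        "f2": [2.0, 4.0, 1.0],
        "f3": [3.0, 6.0, 0.0],
    }
)
melted = pairwise_then_melt(profiles, features=["f1", "f2", "f3"])
print(melted)
```

Metadata columns such as `Metadata_gene` could then be joined onto the pair indices, which is presumably the role of the `meta_df` argument of process_melt; that reading is an assumption based on the signature alone.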

cytominer_eval.transform.util

cytominer_eval.transform.util.assert_eval_metric(eval_metric: str) → None

Helper function to ensure that we support the input eval metric

Parameters: eval_metric (str) – The user input eval metric
Returns: Assertion will fail if we don’t support the input eval metric
Return type: None
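A check of this kind can be sketched as a plain membership assertion. The example below is not the library's code: the function name `assert_supported` is hypothetical, and the list of metrics is illustrative (the real list is whatever get_available_eval_metrics() returns).

```python
# Illustrative list only; assumed, not taken from the library.
SUPPORTED_EVAL_METRICS = ["replicate_reproducibility", "grit"]


def assert_supported(eval_metric: str, supported=SUPPORTED_EVAL_METRICS) -> None:
    """Fail with an AssertionError when the metric is not in the supported list."""
    assert eval_metric in supported, (
        f"{eval_metric} not supported. Select one of {supported}"
    )


assert_supported("grit")  # passes silently and returns None

try:
    assert_supported("not_a_metric")
except AssertionError as err:
    print(f"Rejected: {err}")
```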

cytominer_eval.transform.util.assert_melt(df: pandas.core.frame.DataFrame, eval_metric: str = 'replicate_reproducibility') → None

Helper function to ensure that we properly melted the pairwise correlation matrix

Downstream functions depend on how we process the pairwise correlation matrix. The processing is different depending on the evaluation metric.

Parameters

df (pandas.DataFrame) – A melted pairwise correlation matrix
eval_metric (str) – The user input eval metric

Returns

Assertion will fail if we incorrectly melted the matrix

Return type

None

cytominer_eval.transform.util.assert_pandas_dtypes(df: pandas.core.frame.DataFrame, col_fix: type = numpy.float64) → pandas.core.frame.DataFrame

Helper function to ensure pandas columns have compatible dtypes

Parameters

df (pandas.DataFrame) – A pandas dataframe whose columns will be converted
col_fix ({np.float64, np.str}, optional) – The column type to convert the input dataframe to.

Returns

A dataframe with converted columns

Return type

pd.DataFrame

cytominer_eval.transform.util.check_grit_replicate_summary_method(replicate_summary_method: str) → None

Helper function to ensure that we support the user input replicate summary

Parameters: replicate_summary_method (str) – The user input replicate summary method
Returns: Assertion will fail if the user inputs an incorrect replicate summary method
Return type: None

cytominer_eval.transform.util.check_replicate_groups(eval_metric: str, replicate_groups: Union[List[str], dict]) → None

Helper function checking that the user correctly constructed the input replicate groups argument

The package will not calculate evaluation metrics with incorrectly constructed replicate_groups. See cytominer_eval.evaluate.evaluate().

Parameters

eval_metric (str) – Which evaluation metric to calculate. See cytominer_eval.transform.util.get_available_eval_metrics().
replicate_groups ({list, dict}) – The tentative data structure listing replicate groups

Returns

Assertion will fail for improperly constructed replicate_groups

Return type

None
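The signature accepts either a list or a dict, which suggests that the expected structure depends on the evaluation metric. The sketch below shows one way such a check could look. The specific rule is an assumption, not the library's documented behavior: a dict keyed by `profile_col` and `replicate_group_col` (names borrowed from set_grit_column_info) for grit, and a list of metadata column names otherwise. The function name `check_groups_sketch` is hypothetical.

```python
from typing import List, Union


def check_groups_sketch(eval_metric: str, replicate_groups: Union[List[str], dict]) -> None:
    """Assumed rule: grit takes a dict with two known keys, other metrics take a list."""
    if eval_metric == "grit":
        assert isinstance(replicate_groups, dict), "grit expects a dict"
        expected_keys = {"profile_col", "replicate_group_col"}
        missing = expected_keys - set(replicate_groups)
        assert not missing, f"replicate_groups is missing keys: {sorted(missing)}"
    else:
        assert isinstance(replicate_groups, list), "expected a list of metadata columns"


# Hypothetical metadata column names.
check_groups_sketch("replicate_reproducibility", ["Metadata_gene", "Metadata_cell_line"])
check_groups_sketch(
    "grit",
    {"profile_col": "Metadata_guide", "replicate_group_col": "Metadata_gene"},
)
```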

cytominer_eval.transform.util.convert_pandas_dtypes(df: pandas.core.frame.DataFrame, col_fix: type = numpy.float64) → pandas.core.frame.DataFrame

Helper function to convert pandas column dtypes

Parameters

df (pandas.DataFrame) – A pandas dataframe whose columns will be converted
col_fix ({np.float64, np.str}, optional) – The column type to convert the input dataframe to.

Returns

A dataframe with converted columns

Return type

pd.DataFrame
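A dtype conversion of this kind can be approximated with `DataFrame.astype`. The sketch below is a stand-in, not the library's implementation; the helper name `to_dtype` and the toy dataframe are hypothetical.

```python
import numpy as np
import pandas as pd


def to_dtype(df: pd.DataFrame, col_fix: type = np.float64) -> pd.DataFrame:
    """Return a copy of df with every column cast to col_fix."""
    # astype returns a new dataframe and leaves the input untouched.
    return df.astype(col_fix)


# Mixed input: numeric strings in one column, integers in the other.
features = pd.DataFrame({"f1": ["1.5", "2"], "f2": [3, 4]})
converted = to_dtype(features)
print(converted.dtypes)
```

Casting every feature column to a single float dtype before computing correlations avoids errors caused by numeric values that were read in as strings.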

cytominer_eval.transform.util.get_available_eval_metrics()

Output the available eval metrics in the cytominer_eval library

cytominer_eval.transform.util.get_available_grit_summary_methods()

Output the available replicate summary methods for calculating grit in the cytominer_eval library

cytominer_eval.transform.util.get_available_similarity_metrics()

Output the available metrics for calculating pairwise similarity in the cytominer_eval library

cytominer_eval.transform.util.get_upper_matrix(df: pandas.core.frame.DataFrame) → numpy.array

Helper function to return an upper triangular matrix with the same shape as the input

Parameters: df (pandas.DataFrame) – Any dataframe with a shape
Returns: An upper triangular matrix with the same shape as the input dataframe
Return type: np.array
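An upper triangular matrix of this kind is typically used as a mask, so that each pair in a symmetric similarity matrix is kept only once. The sketch below builds such a mask with `numpy.triu`. Two details are assumptions rather than documented behavior: that the mask is boolean, and that the diagonal (self-comparisons) is excluded via `k=1`. The helper name `upper_mask` is hypothetical.

```python
import numpy as np
import pandas as pd


def upper_mask(df: pd.DataFrame) -> np.ndarray:
    """Boolean mask that is True strictly above the diagonal."""
    return np.triu(np.ones(df.shape), k=1).astype(bool)


# A small symmetric similarity matrix.
corr = pd.DataFrame(
    [[1.0, 0.8, 0.1],
     [0.8, 1.0, 0.3],
     [0.1, 0.3, 1.0]]
)
mask = upper_mask(corr)

# Entries outside the mask become NaN, leaving one value per unordered pair.
upper_only = corr.where(mask)
print(upper_only)
```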

cytominer_eval.transform.util.set_grit_column_info(profile_col: str, replicate_group_col: str) → dict

Transform column names to be used in calculating grit

To calculate grit, the data must have a metadata feature describing the core replicate perturbation (profile_col) and one or more separate metadata features describing the larger group (replicate_group_col) to which the perturbation belongs (e.g. gene, MOA).

Parameters

profile_col (str) – the metadata column storing profile ids. The column can have unique or replicate identifiers.
replicate_group_col (str) – the metadata column indicating a higher order structure (group) than the profile column. E.g. target gene vs. guide in a CRISPR experiment.

Returns

A nested dictionary of renamed columns indicating how to determine replicates

Return type

dict
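To make "a nested dictionary of renamed columns" more concrete, the sketch below builds one plausible shape for such a dictionary by appending pair suffixes to the two column names. The key names (`profile`, `group`, `id`, `comparison`), the suffixes, and the function name `grit_column_info_sketch` are all hypothetical; the actual structure returned by set_grit_column_info may differ.

```python
def grit_column_info_sketch(
    profile_col: str,
    replicate_group_col: str,
    suffixes=("_pair_a", "_pair_b"),  # assumed suffixes for the two sides of a pair
) -> dict:
    """Map each metadata column to its suffixed name on either side of a pair."""
    roles = ("id", "comparison")
    return {
        "profile": dict(zip(roles, [f"{profile_col}{s}" for s in suffixes])),
        "group": dict(zip(roles, [f"{replicate_group_col}{s}" for s in suffixes])),
    }


# Example from a CRISPR experiment: guides are profiles, genes are groups.
info = grit_column_info_sketch("Metadata_guide", "Metadata_gene")
print(info)
```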

cytominer_eval.transform.util.set_pair_ids()

Helper function to ensure consistent melted pairwise column names

Returns: A length-two dictionary of suffixes and indices for the two members of a pair.
Return type: collections.OrderedDict
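A naming helper of this kind could look like the sketch below, which returns one entry per side of a pair, each holding an index column name and a suffix. The key names and string values are assumptions chosen for illustration, and `pair_ids_sketch` is a hypothetical name.

```python
from collections import OrderedDict


def pair_ids_sketch() -> OrderedDict:
    """Return naming conventions for the two sides of a melted pair."""
    pair_a, pair_b = "pair_a", "pair_b"
    return OrderedDict(
        [
            (pair_a, {"index": f"{pair_a}_index", "suffix": f"_{pair_a}"}),
            (pair_b, {"index": f"{pair_b}_index", "suffix": f"_{pair_b}"}),
        ]
    )


pair_ids = pair_ids_sketch()
for name, info in pair_ids.items():
    print(name, info["index"], info["suffix"])
```

Keeping these names in one place means that every function working with a melted matrix refers to the same column names.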
