PLSA.surv package

Submodules

PLSA.surv.cutoff module

Module for determining cutoffs in survival analysis.

This module provides functions for determining cutoffs by different methods in survival analysis.

PLSA.surv.cutoff.coxph_coef(data, duration_col, event_col, silence=True)
PLSA.surv.cutoff.hazards_ratio(data, pred_col, duration_col, event_col, score_min=0, score_max=100, balance=True)

Find the cutoff that maximizes HR or BHR.

Parameters:
  • data (DataFrame) – full survival data.
  • pred_col (str) – Name of column to reference for dividing groups.
  • duration_col (str) – Name of column indicating time.
  • event_col (str) – Name of column indicating event.
  • score_min (int, optional) – min value in pred_col.
  • score_max (int, optional) – max value in pred_col.
  • balance (bool) – True to use BHR as the metric, otherwise HR.
Returns:

Optimal cutoff according to the hazard ratio method.

Return type:

float

Examples

>>> hazards_ratio(data, 'score', 'T', 'E', balance=True)
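
The idea of scanning candidate cutoffs and keeping the one with the largest hazard ratio can be sketched in plain Python. This is only an illustration under strong assumptions: the helper `crude_hr_cutoff` is hypothetical, and it replaces a fitted hazard ratio with a crude event-rate ratio (events divided by total follow-up time in each group), which is only reasonable under a constant hazard. It does not implement BHR and is not the package's method.

```python
def crude_hr_cutoff(scores, durations, events, score_min=0, score_max=100):
    """Hypothetical helper: integer cutoff maximizing a crude event-rate ratio."""
    best_cut, best_hr = None, 0.0
    for cut in range(score_min + 1, score_max):
        # Per group: [number of events, total follow-up time]
        groups = {"low": [0, 0.0], "high": [0, 0.0]}
        for s, t, e in zip(scores, durations, events):
            g = groups["high" if s >= cut else "low"]
            g[0] += e
            g[1] += t
        (e_lo, pt_lo), (e_hi, pt_hi) = groups["low"], groups["high"]
        if e_lo == 0 or e_hi == 0:
            continue  # the ratio is undefined without events in both groups
        hr = (e_hi / pt_hi) / (e_lo / pt_lo)
        hr = max(hr, 1 / hr)  # treat either direction of separation alike
        if hr > best_hr:
            best_cut, best_hr = cut, hr
    return best_cut

# Toy data: high scores fail early, low scores survive long.
scores = [10, 20, 30, 40, 60, 70, 80, 90]
durations = [50, 48, 45, 40, 10, 8, 6, 5]
events = [0, 1, 0, 1, 1, 1, 1, 1]
print(crude_hr_cutoff(scores, durations, events))
```

On this toy data the selected cutoff falls between the two clusters of scores. A real analysis would normally estimate the hazard ratio with a Cox model rather than a rate ratio.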
PLSA.surv.cutoff.loss_bhr(data_list, duration_col, event_col, base_val=2, silence=True)
PLSA.surv.cutoff.loss_dis(data, data_list, col)
PLSA.surv.cutoff.loss_hr(data_list, duration_col, event_col, base_val=0, silence=True)
PLSA.surv.cutoff.stats_var(data, x_col, y_col, score_min=0, score_max=100)

Find the cutoff that maximizes the distance between groups and minimizes the variance within each group.

Parameters:
  • data (pd.DataFrame) – Data set.
  • x_col (str) – Name of column to reference for dividing groups.
  • y_col (str) – Name of column to measure differences.
  • score_min (int, optional) – Min value in x_col.
  • score_max (int, optional) – Max value in x_col.
Returns:

Optimal cutoff according to the statistical method.

Return type:

float

Examples

>>> stats_var(data, 'score', 'y')
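
A criterion of this kind can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the helper name `best_cutoff_by_variance` and the score (squared difference of group means divided by the sum of within-group variances) are assumptions chosen for the example.

```python
from statistics import mean, pvariance

def best_cutoff_by_variance(x, y, score_min=0, score_max=100):
    """Hypothetical helper: scan integer cutoffs on x and score the split of y."""
    best_cut, best_score = None, float("-inf")
    for cut in range(score_min + 1, score_max):
        low = [yi for xi, yi in zip(x, y) if xi < cut]
        high = [yi for xi, yi in zip(x, y) if xi >= cut]
        if len(low) < 2 or len(high) < 2:
            continue  # need at least two points per group for a variance
        between = (mean(high) - mean(low)) ** 2     # distance between groups
        within = pvariance(low) + pvariance(high)   # spread inside groups
        score = between / (within + 1e-12)          # small term avoids division by zero
        if score > best_score:
            best_cut, best_score = cut, score
    return best_cut

# Toy data: y jumps from about 1 to about 5 once x exceeds 40.
x = [10, 20, 30, 40, 60, 70, 80, 90]
y = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
print(best_cutoff_by_variance(x, y))
```

With this toy data, any cutoff above 40 and up to 60 separates the two clusters equally well; the sketch returns the first such value it encounters.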
PLSA.surv.cutoff.youden_onecut(data, pred_col, duration_col, event_col, pt=None)

Find the cutoff that maximizes the Youden index.

Parameters:
  • data (pandas.DataFrame) – full survival data.
  • pred_col (str) – Name of column to reference for dividing groups.
  • duration_col (str) – Name of column indicating time.
  • event_col (str) – Name of column indicating event.
  • pt (int, default None) – Predicted time.
Returns:

Value indicating cutoff for pred_col of data.

Return type:

float

Examples

>>> youden_onecut(data, 'X', 'T', 'E')
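
The Youden index is J = sensitivity + specificity - 1, which equals the true positive rate minus the false positive rate. The sketch below shows the cutoff search for plain binary labels. It is an assumption-laden simplification: the helper `youden_cutoff` is hypothetical, the labels stand for "event observed by the predicted time", and censoring is ignored, which a time-dependent method would have to handle.

```python
def youden_cutoff(scores, labels):
    """Hypothetical helper: threshold maximizing J = TPR - FPR.

    Assumes labels contain at least one 0 and one 1.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_cut, best_j = None, float("-inf")
    for cut in sorted(set(scores)):
        # Predict "positive" when the score is at or above the cutoff.
        tp = sum(1 for s, l in zip(scores, labels) if s >= cut and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= cut and l == 0)
        j = tp / pos - fp / neg
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
cut, j = youden_cutoff(scores, labels)
print(cut, j)
```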
PLSA.surv.cutoff.youden_twocut(data, pred_col, duration_col, event_col, pt=None)

Find two cutoff values that maximize the Youden index.

Parameters:
  • data (pandas.DataFrame) – Full survival data.
  • pred_col (str) – Name of column to reference for dividing groups.
  • duration_col (str) – Name of column indicating time.
  • event_col (str) – Name of column indicating event.
  • pt (int) – Predicted time.
Returns:

(cutoff-1, cutoff-2) values indicating the cutoffs for pred_col of data.

Return type:

tuple

Examples

>>> youden_twocut(data, 'X', 'T', 'E')

PLSA.surv.utils module

Module for utility functions of survival analysis.

This module provides utility functions for survival analysis.

PLSA.surv.utils.surv_data_at_risk(data, duration_col, points=None)

Get the number of people at risk at selected time points.

Parameters:
  • data (pandas.DataFrame) – Full survival data.
  • duration_col (str) – Name of column indicating time.
  • points (list(int)) – Time points to observe.
Returns:

Number of people at risk.

Return type:

pandas.DataFrame

Examples

>>> surv_data_at_risk(data, "T", points=[0, 10, 20, 30, 40, 50])
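
Counting subjects at risk can be illustrated without any library. The sketch assumes the common convention that a subject is at risk at time t when their observed duration is at least t; the helper `count_at_risk` is hypothetical, and the package may use a different convention or return format.

```python
def count_at_risk(durations, points):
    """Hypothetical helper: number of subjects with duration >= t for each t."""
    return {t: sum(1 for d in durations if d >= t) for t in points}

durations = [5, 12, 12, 25, 31, 47]
at_risk = count_at_risk(durations, [0, 10, 20, 30, 40, 50])
print(at_risk)
```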
PLSA.surv.utils.surv_roc(data, pred_col, duration_col, event_col, pt=None)

Get survival ROC at predicted time.

Parameters:
  • data (pandas.DataFrame) – Full survival data.
  • pred_col (str) – Name of column to reference for dividing groups.
  • duration_col (str) – Name of column indicating time.
  • event_col (str) – Name of column indicating event.
  • pt (int) – Predicted time.
Returns:

Dict containing “FP”, “TP” and “AUC” of the ROC.

Return type:

dict

Examples

>>> surv_roc(data, 'X', 'T', 'E', pt=5)
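
To show what the “FP”, “TP” and “AUC” entries represent, here is a minimal ROC sketch for binary labels. It is a simplification under stated assumptions: the helper `simple_roc` is hypothetical, the labels stand for "event observed by the predicted time", and censoring is ignored, so it should not be read as the package's time-dependent method.

```python
def simple_roc(scores, labels):
    """Hypothetical helper: ROC points and trapezoidal AUC for binary labels.

    Assumes labels contain at least one 0 and one 1.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    fp, tp = [0.0], [0.0]
    # Lower the threshold step by step, from the highest score to the lowest.
    for cut in sorted(set(scores), reverse=True):
        tp.append(sum(1 for s, l in zip(scores, labels) if s >= cut and l == 1) / pos)
        fp.append(sum(1 for s, l in zip(scores, labels) if s >= cut and l == 0) / neg)
    # Trapezoidal rule over consecutive ROC points.
    auc = sum((fp[i] - fp[i - 1]) * (tp[i] + tp[i - 1]) / 2 for i in range(1, len(fp)))
    return {"FP": fp, "TP": tp, "AUC": auc}

roc = simple_roc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(roc["AUC"])
```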
PLSA.surv.utils.survival_by_hr(T0, S0, pred)

Get the survival function of patients according to the given hazard ratio.

Parameters:
  • T0 (np.array) – time.
  • S0 (np.array) – baseline estimated survival function of patients.
  • pred (pandas.Series) – hazard ratio of patients.
Returns:

T0, ST indicating survival function of patients.

Return type:

tuple

Examples

>>> survival_by_hr(T0, S0, data['hazard_ratio'])
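
Under the proportional hazards assumption, a patient with hazard ratio HR relative to the baseline has survival function S(t) = S0(t) ** HR. The sketch below applies that relation with plain lists; the helper `survival_from_hr` is hypothetical, and whether the package computes ST in exactly this way is an assumption.

```python
def survival_from_hr(S0, hazard_ratios):
    """Hypothetical helper: one survival curve per patient, S0(t) ** HR."""
    return [[s ** hr for s in S0] for hr in hazard_ratios]

T0 = [0, 1, 2, 3]             # time grid
S0 = [1.0, 0.9, 0.8, 0.5]     # baseline survival on that grid
ST = survival_from_hr(S0, [1.0, 2.0])
print(ST)
```

A hazard ratio of 1 reproduces the baseline curve, while a hazard ratio above 1 gives a curve that lies at or below it at every time point.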
PLSA.surv.utils.survival_status(data, duration_col, event_col, end_time, inplace=False)

Get status of event at a specified time.

If the original event value is 0:
  • status = 0, Time = end_time (T >= end_time)
  • status = 0, Time = T (T < end_time)
If the original event value is 1:
  • status = 1, Time = T (T <= end_time)
  • status = 0, Time = end_time (T > end_time)
Parameters:
  • data (pandas.DataFrame) – Full survival data.
  • duration_col (str) – Name of column indicating time.
  • event_col (str) – Name of column indicating event.
  • end_time (int) – End time of study.
  • inplace (bool, default False) – Whether to modify the original data in place.
Returns:

Data indicating the survival status.

None or tuple(time(pandas.Series), status(pandas.Series))

Return type:

None or tuple

Examples

>>> survival_status(data, 'T', 'E', 10, inplace=False)
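
The status rules listed above reduce to one check: any subject whose duration exceeds end_time is censored at end_time, and everyone else keeps their original time and event value. A minimal sketch with plain lists follows; the helper `status_at` is hypothetical and does not reproduce the pandas-based return type or the inplace behavior.

```python
def status_at(durations, events, end_time):
    """Hypothetical helper: truncate follow-up at end_time."""
    times, status = [], []
    for t, e in zip(durations, events):
        if t > end_time:
            # Follow-up extends past end_time: censor at end_time.
            times.append(end_time)
            status.append(0)
        else:
            # Observed within the window: keep time and event as recorded.
            times.append(t)
            status.append(e)
    return times, status

times, status = status_at([3, 10, 15, 8], [1, 1, 1, 0], end_time=10)
print(times, status)
```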

Module contents