If you use this code in your research, please cite the following:
- Paper: Machine Learning-Engineered Nanozyme System for Synergistic Anti-Tumor Ferroptosis/Apoptosis Therapy. Tianliang Li (李天亮)&, Bin Cao (曹斌)&, Tianhao Su (苏天昊)&, ..., Lingyan Feng^ (冯凌燕), Tongyi Zhang^ (张统一). (SMALL)
- Paper: Divide and conquer: Machine learning accelerated design of lead-free solder alloys with high strength and high ductility. Qinghua Wei (魏清华)&, Bin Cao (曹斌)&, Hao Yuan (元皓)&, ..., Ziqiang Dong^ (董自强), Tong-Yi Zhang^ (张统一). (npj Computational Materials)
- Patent: Zhang Tongyi (张统一), Cao Bin (曹斌), Yuan Hao, Wei Qinghua, Dong Ziqiang. Authorized Chinese Patent.
- 2022: I proposed TCGPR and developed the first version. In collaboration with Mr. Hao Yuan (元皓) (experimental) and Mr. Qinghua Wei (魏清华) (experimental), the method was successfully applied to lead-free solder optimization. Our first paper was published in npj Computational Materials.
- 2024: After two years of development, we introduced sequential forward/backward feature selection and outlier detection. In collaboration with Mr. Tianliang Li (李天亮) (experimental) and Mr. Tianhao Su (苏天昊) (computational), we successfully applied TCGPR to anti-tumor ferroptosis studies. The paper has been published in SMALL.
This Python-based library is compatible with Windows, Linux, and macOS operating systems.
For detailed algorithm information, refer to the Introduction.
To install TCGPR, use pip:
pip install PyTcgpr
You can check the installation details with:
pip show PyTcgpr
Update TCGPR to the latest version using:
pip install --upgrade PyTcgpr
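The same check can be done from inside Python, which is handy in notebooks or scripts. The sketch below uses only the standard library; the helper name `installed_version` is illustrative and not part of PyTcgpr.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "PyTcgpr"):
    """Return the installed version string, or None if the package is absent.

    `installed_version` is a hypothetical helper, not a PyTcgpr function.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"PyTcgpr version: {found}" if found else "PyTcgpr is not installed")
```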
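# Data screening with the 'Partition' task
from PyTcgpr import TCGPR

dataSet = "data.csv"   # path to the input CSV file
initial_set_cap = 3    # size of the initial set
sampling_cap = 2       # data points added at each iteration
up_search = 500        # upper boundary for brute-force search
CV = 'LOOCV'           # leave-one-out cross-validation
Task = 'Partition'

TCGPR.fit(
    filePath = dataSet,
    initial_set_cap = initial_set_cap,
    Task = Task,
    sampling_cap = sampling_cap,
    up_search = up_search,
    CV = CV
)
# Note: 'Mission' is 'DATA' by default; there is no need to declare it explicitly.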
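# Data screening with the 'Identification' task
from PyTcgpr import TCGPR

dataSet = "data.csv"   # path to the input CSV file
sampling_cap = 2       # data points added at each iteration
up_search = 500        # upper boundary for brute-force search
Task = 'Identification'
CV = 'LOOCV'           # leave-one-out cross-validation

TCGPR.fit(
    filePath = dataSet,
    Task = Task,
    sampling_cap = sampling_cap,
    up_search = up_search,
    CV = CV
)
# Note: 'Mission' is 'DATA' by default; there is no need to declare it. 'initial_set_cap' is masked (not used) in this case.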
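# Feature selection
from PyTcgpr import TCGPR

dataSet = "data.csv"   # path to the input CSV file
sampling_cap = 2       # features added at each iteration
Mission = 'FEATURE'
up_search = 500        # upper boundary for brute-force search
CV = 'LOOCV'           # leave-one-out cross-validation

TCGPR.fit(
    filePath = dataSet,
    Mission = Mission,
    sampling_cap = sampling_cap,
    up_search = up_search,
    CV = CV
)
# Note: For feature selection, 'Mission' must be explicitly set to 'FEATURE'.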
:param Mission: str, default='DATA'
    The task to perform:
    - 'DATA' for data screening
    - 'FEATURE' for feature selection
:param filePath: str
    Path to the input dataset in CSV format.
:param initial_set_cap: int or list
    Initial set capacity. For 'Partition' under 'DATA', defaults to 3.
    Can also be a list specifying the indices of the initial set.
:param sampling_cap: int, default=1
    Number of data points or features added at each iteration.
:param measure: str, default='Pearson'
    Correlation criterion. Can be 'Pearson' (R values) or 'Determination' (R² values).
:param ratio: float
    Tolerance ratio for correlation. Varies based on the mission and task.
:param target: int, default=1
    Used in feature selection. Specifies the number of targets in regression tasks.
:param weight: float, default=0.2
    Weight factor for calculating the GGMF score.
:param up_search: int, default=500
    Upper boundary for brute-force search.
:param exploit_coef: float, default=2
    Constraint on the variance in the Cal_EI function.
:param exploit_model: bool, default=False
    If True, only R values will be used for the search (GGMF will not be considered).
:param CV: int or str, default=10
    Cross-validation setting. Can be an integer (e.g., 5, 10) or 'LOOCV' for leave-one-out cross-validation.
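To make the `measure` and `CV` options more concrete, here is a minimal standalone sketch of what they generally refer to. It is not PyTcgpr's internal code: the function names are hypothetical, and it assumes that 'Determination' means the usual coefficient of determination, which this document does not state explicitly.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def cv_splits(n_samples, cv):
    """Yield (train_indices, test_indices) for an integer fold count or 'LOOCV'."""
    n_folds = n_samples if cv == "LOOCV" else int(cv)
    if not 2 <= n_folds <= n_samples:
        raise ValueError("fold count must be between 2 and n_samples")
    indices = list(range(n_samples))
    for fold in range(n_folds):
        test = indices[fold::n_folds]      # every n_folds-th sample
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test
```

With 'LOOCV' each sample is held out exactly once, so the number of model fits equals the number of rows; an integer such as 5 or 10 is cheaper on larger datasets. Note that the two measures can disagree: predictions that are perfectly correlated with the observations but wrongly scaled give R = 1 while the coefficient of determination can be negative.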
The algorithm will output a CSV file containing the processed dataset: Dataset_remained_by_TCGPR.csv.
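A quick way to see how much data a screening run kept is to compare row counts between the input file and the output file named above. The sketch below uses only the standard library. It assumes that both files have a single header row and that the output is written to the current working directory; neither is confirmed in this document, and the helper names are hypothetical.

```python
import csv


def count_rows(path):
    """Count non-empty data rows in a CSV file, skipping one header row (assumed)."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the assumed header row
        return sum(1 for row in reader if row)


def retention_summary(original_path, remained_path="Dataset_remained_by_TCGPR.csv"):
    """Compare row counts of the input dataset and the TCGPR output file."""
    total = count_rows(original_path)
    kept = count_rows(remained_path)
    return {"original": total, "remained": kept, "removed": total - kept}
```

After a data screening run, calling `retention_summary("data.csv")` returns the three counts as a dictionary.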
This project is maintained by Bin Cao. If you encounter any issues or have suggestions, feel free to open an issue on GitHub or contact:
- Email: [email protected]
We welcome contributions and suggestions! You can submit issues for questions, bugs, and feature requests, or submit a pull request directly. We are also open to research collaborations—please get in touch if you're interested!