Molecular dataset pre-processing toolkit

Takaogahara/moldataproc

Molecular Dataset Processing


Latest version: 0.0.4


Features of this package

  • Select desired columns from dataset
  • Remove columns with NaN values
  • Filter molecules with undesired atoms
  • Standardize SMILES
  • Remove duplicates
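
The steps above can be sketched in a few lines of standard-library Python. This is only an illustration of the idea, not the package's actual implementation: the function names (`atoms_in_smiles`, `preprocess`), the column names, and the sample data are all hypothetical. It also makes two assumptions: "remove NaN" is treated as dropping records with a missing value in a selected column (the package may operate on whole columns instead), and SMILES standardization is left out because it needs a cheminformatics toolkit such as RDKit, so duplicates are detected here by exact string match only.

```python
import csv
import io
import re

# Simplified SMILES atom matcher (an assumption, not a full SMILES parser):
# either a bracket atom such as [Na+] or [13C@@H], or an unbracketed
# organic-subset atom such as C, Cl, Br, or aromatic c/n/o.
ATOM_PATTERN = re.compile(
    r"\[\d*([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]|(Cl|Br|[BCNOPSFI]|[bcnops])"
)


def atoms_in_smiles(smiles):
    """Return the set of element symbols found in a SMILES string."""
    symbols = set()
    for match in ATOM_PATTERN.finditer(smiles):
        symbol = match.group(1) or match.group(2)
        # Aromatic atoms are lowercase in SMILES; normalize "c" -> "C".
        symbols.add(symbol.capitalize())
    return symbols


def preprocess(rows, columns, smiles_column, undesired_atoms):
    """Select columns, drop incomplete records, filter atoms, deduplicate."""
    seen = set()
    cleaned = []
    for row in rows:
        # 1. Keep only the desired columns.
        selected = {name: row.get(name) for name in columns}
        # 2. Drop records with a missing value in any selected column.
        if any(
            value is None or value.strip().lower() in ("", "nan")
            for value in selected.values()
        ):
            continue
        smiles = selected[smiles_column].strip()
        # 3. Drop molecules containing any undesired atom.
        if atoms_in_smiles(smiles) & undesired_atoms:
            continue
        # 4. Drop duplicates (exact string match; no standardization here).
        if smiles in seen:
            continue
        seen.add(smiles)
        selected[smiles_column] = smiles
        cleaned.append(selected)
    return cleaned


# Hypothetical sample dataset.
RAW = """smiles,activity,source
CCO,1.2,a
CCO,1.2,b
c1ccccc1Cl,0.4,a
CC(=O)O,,a
[Na+].[Cl-],3.1,c
CCN,2.0,c
"""

rows = list(csv.DictReader(io.StringIO(RAW)))
result = preprocess(rows, ["smiles", "activity"], "smiles", {"Cl", "Na"})
for record in result:
    print(record)
```

With this sample data only the `CCO` and `CCN` records survive: the second `CCO` is a duplicate, two molecules contain an undesired atom, and one record has no `activity` value.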


⚙️ Installation

  1. Install python3-tk and python3-dev with the following command:

    sudo apt install python3-dev python3-tk
  2. Clone the repo using the following command:

    git clone git@github.com:Takaogahara/molproc.git
  3. Create and activate your virtualenv with Python, for example as described here.

  4. Install required libraries using:

    python -m pip install -r requirements.txt


🔎 Usage

Once all parameters are configured and the virtualenv is activated, you can run the toolkit.

From the root folder, open the terminal and run:

streamlit run run.py
