- `infer.py`: (outdated) code for inference with a trained model
- `metrics.py`: PyTorch implementations of several metrics
- `model_conf.py`: code to configure and train models
- `train.py`: CLI to launch training runs
- `leadopt.py`: (outdated) CLI to run inference on new samples
- `util.py`: utility code for RDKit/Open Babel processing
- `pretrained`: pretrained models (see [`pretrained/README.md`](pretrained/README.md))
- `scripts`: (outdated) data processing scripts (see [`scripts/README.md`](scripts/README.md))
# Dependencies
You can create a virtual environment and install the requirements:
```sh
$ python3 -m venv leadopt_env
$ source ./leadopt_env/bin/activate
$ pip install -r requirements.txt
```
Note: CUDA 10.1 is required for training.
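A quick way to check the CUDA requirement from Python is sketched below. It assumes PyTorch is the installed framework and reads `torch.version.cuda`, which reports the CUDA version the PyTorch build was compiled against (or `None` for CPU-only builds). The `cuda_version_ok` helper is a hypothetical name used only for this example.

```python
def cuda_version_ok(found, required="10.1"):
    """Return True if `found` (e.g. torch.version.cuda) matches `required`.

    Only the major and minor components are compared.
    """
    if found is None:  # CPU-only build or CUDA not available
        return False
    return found.split(".")[:2] == required.split(".")[:2]


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        found = torch.version.cuda
        print("CUDA build:", found, "| matches 10.1:", cuda_version_ok(found))
```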
# Training
To train a model, use the `train.py` utility script. You can specify model parameters as command-line arguments or load them from a JSON configuration file such as `args.json`.
```bash
python train.py \
--save_path=/path/to/model \
--wandb_project=my_project \
{model_type} \
--model_arg1=x \
--model_arg2=y \
...
```
or
```bash
python train.py \
--save_path=/path/to/model \
--wandb_project=my_project \
--configuration=./configurations/args.json
```
`save_path` is a directory to save the best model. The directory will be created if it doesn't exist. If this is not provided, the model will not be saved.
`wandb_project` is an optional wandb project name. If provided, the run will be logged to wandb.
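As a rough illustration of how command-line arguments and a JSON configuration file can be combined, the sketch below parses a `--configuration` path and overlays the file's values onto the parsed arguments. This is not the code in `train.py`: the `build_parser` and `load_args` helpers, the `--learning_rate` default, and the precedence rule (file values override command-line values) are all assumptions made for the example.

```python
import argparse
import json
import os
import tempfile


def build_parser():
    # Hypothetical parser; the real train.py defines its own options.
    parser = argparse.ArgumentParser()
    parser.add_argument("--save_path", default=None)
    parser.add_argument("--wandb_project", default=None)
    parser.add_argument("--configuration", default=None)
    parser.add_argument("--learning_rate", type=float, default=1e-4)
    return parser


def load_args(argv):
    """Parse argv, then overlay values from the JSON file, if one is given."""
    args = vars(build_parser().parse_args(argv))
    config_path = args.pop("configuration")
    if config_path is not None:
        with open(config_path) as f:
            # Assumption: values in the file take precedence over the CLI.
            args.update(json.load(f))
    return args


# Usage: write a small config file and load it alongside CLI arguments.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "args.json")
    with open(path, "w") as f:
        json.dump({"learning_rate": 0.001}, f)
    merged = load_args(["--save_path=/path/to/model", "--configuration=" + path])

print(merged)
```

Keeping the merged result as a plain `dict` makes it easy to write back to disk next to the saved model, so a run can be reproduced from its own configuration file.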
See below for available models and model-specific parameters:
# Leadopt Models
In this repository, trainable models are subclasses of `model_conf.LeadoptModel`. This class encapsulates the model configuration arguments and the PyTorch models, and supports saving and loading multi-component models.
```py
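from leadopt.model_conf import LeadoptModel, MODELS

# Build a model from its configuration arguments (elided here).
model = MODELS['voxel']({args...})
model.train(save_path='./mymodel')
...
# Reload the saved model later.
model2 = LeadoptModel.load('./mymodel')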
```
Internally, model arguments are configured by setting up an `argparse` parser and passing around a `dict` of configuration parameters in `self._args`.
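The pattern described above can be sketched roughly as follows. This is not the `LeadoptModel` implementation: `SketchModel`, `SketchVoxel`, `setup_parser`, `REGISTRY`, the default values, and the `args.json` file name are hypothetical and chosen only for illustration. The real class also stores PyTorch weights, which are omitted here.

```python
import argparse
import json
import os
import tempfile

REGISTRY = {}  # hypothetical stand-in for a name -> class mapping like MODELS


class SketchModel:
    """Hypothetical base class that keeps its configuration in a plain dict."""

    model_type = None

    def __init__(self, args):
        # Start from the parser defaults, then apply user-supplied values.
        defaults = vars(self.setup_parser().parse_args([]))
        self._args = {**defaults, **args}

    @classmethod
    def setup_parser(cls):
        return argparse.ArgumentParser()

    def save(self, path):
        os.makedirs(path, exist_ok=True)
        payload = {"model_type": self.model_type, "args": self._args}
        with open(os.path.join(path, "args.json"), "w") as f:
            json.dump(payload, f)

    @staticmethod
    def load(path):
        with open(os.path.join(path, "args.json")) as f:
            payload = json.load(f)
        # Look up the subclass by name and rebuild it from its saved arguments.
        return REGISTRY[payload["model_type"]](payload["args"])


class SketchVoxel(SketchModel):
    model_type = "voxel"

    @classmethod
    def setup_parser(cls):
        parser = argparse.ArgumentParser()
        # Default values below are placeholders, not the repository's defaults.
        parser.add_argument("--learning_rate", type=float, default=1e-4)
        parser.add_argument("--num_epochs", type=int, default=50)
        return parser


REGISTRY["voxel"] = SketchVoxel

with tempfile.TemporaryDirectory() as tmp:
    model = REGISTRY["voxel"]({"num_epochs": 10})
    model.save(tmp)
    restored = SketchModel.load(tmp)

print(type(restored).__name__, restored._args)
```

Storing the model type next to the arguments is what lets a single `load` entry point on the base class return the right subclass.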
## VoxelNet
```
  --no_partitions       If set, disable the use of TRAIN/VAL partitions during
                        training.
  -f FRAGMENTS, --fragments FRAGMENTS
                        Path to fragments file.
  -fp FINGERPRINTS, --fingerprints FINGERPRINTS
                        Path to fingerprints file.
  -lr LEARNING_RATE, --learning_rate LEARNING_RATE
  --num_epochs NUM_EPOCHS
                        Number of epochs to train for.
  --test_steps TEST_STEPS
                        Number of evaluation steps per epoch.
  -b BATCH_SIZE, --batch_size BATCH_SIZE
  --grid_width GRID_WIDTH
  --grid_res GRID_RES
  --fdist_min FDIST_MIN
                        Ignore fragments closer to the receptor than this
                        distance (Angstroms).
  --fdist_max FDIST_MAX
                        Ignore fragments further from the receptor than this
                        distance (Angstroms).
  --fmass_min FMASS_MIN
                        Ignore fragments smaller than this mass (Daltons).