jdurrant / deepfrag
Commit 0fd6a7f2, authored Jan 05, 2021 by hgarrereyn
cleanup
Parent: 27b2cbf5
Changes: 88
.gitignore @ 0fd6a7f2
**/*.pyc
**/__pycache__
.ipynb_checkpoints
old
data/**
!data/README.md
pretrained/**
!pretrained/README.md
.DS_Store
grid_0.right_answer.json
README.md @ 0fd6a7f2
-# Lead Optimization
+# DeepFrag
 This repository contains code for machine learning based lead optimization.
 # Overview
 - `config`: fixed configuration information (eg. TRAIN/VAL/TEST partitions)
-- `configurations`: benchmark model configurations
+- `configurations`: benchmark model configurations (see [`configurations/README.md`](configurations/README.md))
 - `data`: training/inference data (see [`data/README.md`](data/README.md))
 - `leadopt`: main module code
     - `models`: pytorch architecture definitions
     - `data_util.py`: utility code for reading packed fragment/fingerprint data files
     - `grid_util.py`: GPU-accelerated grid generation code
     - (outdated) `infer.py`: code for inference with a trained model
     - `metrics.py`: pytorch implementations of several metrics
     - `model_conf.py`: contains code to configure and train models
     - `util.py`: utility code for rdkit/openbabel processing
-- (outdated) `scripts`: data processing scripts (see [`scripts/README.md`](scripts/README.md))
+- `scripts`: data processing scripts (see [`scripts/README.md`](scripts/README.md))
 - `train.py`: CLI interface to launch training runs
 - (outdated) `leadopt.py`: CLI interface to run inference on new samples
 # Dependencies
...
...
config/partitions.py (deleted, 100644 → 0) @ 27b2cbf5
This diff is collapsed.
configurations/README.md @ 0fd6a7f2
 This folder contains benchmark model configurations referenced in the paper.
-You can retrain from these configurations using the `train.py` script:
+Overview:
+- `layer_type_sweep/*`: experimenting with different parent/receptor typing schemes
+- `voxelation_sweep/*`: experimenting with different voxelation types and atomic influence radii
+- `final.json`: final production model
+You can train new models using these configurations with the `train.py` script:
 ```sh
 python train.py \
...
...
configurations/final.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 50,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1,
    "point_type": 0,
    "rec_typer": "simple",
    "acc_type": 0,
    "lig_typer": "simple"
}
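As a rough illustration of how the grid settings in `final.json` relate to each other, the sketch below loads a subset of its keys and computes the side length of the voxel grid. It assumes that `grid_width` counts voxels per side and that `grid_res` is the edge length of one voxel; this commit does not state either meaning. `grid_extent` is a hypothetical helper and not part of `leadopt`.

```python
import json

# Subset of configurations/final.json, copied from this commit.
FINAL_JSON = """
{
    "grid_width": 24,
    "grid_res": 0.75,
    "point_radius": 1,
    "point_type": 0
}
"""


def grid_extent(config):
    """Side length of the cubic grid (hypothetical helper).

    Assumes grid_width is the number of voxels per side and grid_res is
    the edge length of one voxel, which this commit does not confirm.
    """
    return config["grid_width"] * config["grid_res"]


config = json.loads(FINAL_JSON)
extent = grid_extent(config)
voxels = config["grid_width"] ** 3  # voxels in one channel of the grid

print(f"extent per side: {extent}, voxels per channel: {voxels}")
```

Under those assumptions, a coarser `grid_res` with the same `grid_width` covers a larger region at lower detail, which is the trade-off the `grid_res` changes in the renamed configurations touch on.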
test_config/cos_small.json → configurations/layer_type_sweep/lig_simple.json @ 0fd6a7f2
 {
     "version": "voxelnet",
     "no_partitions": false,
-    "fragments": "./data/fragments_desc.h5",
-    "fingerprints": "./data/fp_rdk_desc.h5",
-    "learning_rate": 0.0001,
-    "num_epochs": 500,
+    "fragments": "./data/moad.h5",
+    "fingerprints": "./data/rdk10_moad.h5",
+    "learning_rate": 0.001,
+    "num_epochs": 15,
     "test_steps": 400,
     "batch_size": 16,
     "grid_width": 24,
-    "grid_res": 0.5,
+    "grid_res": 0.75,
     "fdist_min": null,
-    "fdist_max": 3,
+    "fdist_max": 4,
     "fmass_min": null,
     "fmass_max": 150,
     "ignore_receptor": false,
     "ignore_parent": false,
-    "rec_typer": "simple",
-    "lig_typer": "simple",
-    "rec_channels": 4,
-    "lig_channels": 3,
-    "in_channels": 7,
     "output_size": 2048,
     "pad": false,
     "blocks": [64, 64],
     "fc": [512],
     "use_all_labels": true,
     "dist_fn": "cos",
-    "loss": "direct"
+    "loss": "direct",
+    "point_radius": 1,
+    "point_type": 3,
+    "rec_typer": "simple",
+    "acc_type": 0,
+    "lig_typer": "simple"
 }
test_config/cos_direct.json → configurations/layer_type_sweep/lig_simple_h.json @ 0fd6a7f2
 {
     "version": "voxelnet",
     "no_partitions": false,
-    "fragments": "./data/fragments_desc.h5",
-    "fingerprints": "./data/fp_rdk_desc.h5",
+    "fragments": "./data/moad.h5",
+    "fingerprints": "./data/rdk10_moad.h5",
     "learning_rate": 0.001,
-    "num_epochs": 100,
+    "num_epochs": 15,
     "test_steps": 400,
-    "batch_size": 32,
+    "batch_size": 16,
     "grid_width": 24,
-    "grid_res": 1.0,
+    "grid_res": 0.75,
     "fdist_min": null,
-    "fdist_max": 3,
+    "fdist_max": 4,
     "fmass_min": null,
     "fmass_max": 150,
     "ignore_receptor": false,
     "ignore_parent": false,
-    "rec_typer": "simple",
-    "lig_typer": "simple",
-    "rec_channels": 4,
-    "lig_channels": 3,
-    "in_channels": 7,
     "output_size": 2048,
     "pad": false,
-    "blocks": [32, 64],
-    "fc": [2048],
+    "blocks": [64, 64],
+    "fc": [512],
     "use_all_labels": true,
     "dist_fn": "cos",
-    "loss": "direct"
+    "loss": "direct",
+    "point_radius": 1,
+    "point_type": 3,
+    "rec_typer": "simple",
+    "acc_type": 0,
+    "lig_typer": "simple_h"
 }
test_config/20200813/base.json → configurations/layer_type_sweep/lig_single.json @ 0fd6a7f2
 {
     "version": "voxelnet",
     "no_partitions": false,
-    "fragments": "./data/fragments_desc.h5",
-    "fingerprints": "./data/fp_rdk_desc.h5",
+    "fragments": "./data/moad.h5",
+    "fingerprints": "./data/rdk10_moad.h5",
     "learning_rate": 0.001,
-    "num_epochs": 100,
+    "num_epochs": 15,
     "test_steps": 400,
-    "batch_size": 32,
+    "batch_size": 16,
     "grid_width": 24,
-    "grid_res": 1.0,
+    "grid_res": 0.75,
     "fdist_min": null,
-    "fdist_max": 3,
+    "fdist_max": 4,
     "fmass_min": null,
     "fmass_max": 150,
     "ignore_receptor": false,
     "ignore_parent": false,
-    "rec_typer": "simple",
-    "lig_typer": "simple",
-    "rec_channels": 4,
-    "lig_channels": 3,
-    "in_channels": 7,
     "output_size": 2048,
     "pad": false,
-    "blocks": [32, 64],
-    "fc": [2048],
+    "blocks": [64, 64],
+    "fc": [512],
     "use_all_labels": true,
     "dist_fn": "cos",
-    "loss": "direct"
+    "loss": "direct",
+    "point_radius": 1,
+    "point_type": 3,
+    "rec_typer": "simple",
+    "acc_type": 0,
+    "lig_typer": "single"
 }
test_config/cos_support_v1.json → configurations/layer_type_sweep/lig_single_h.json @ 0fd6a7f2
 {
     "version": "voxelnet",
     "no_partitions": false,
-    "fragments": "./data/fragments_desc.h5",
-    "fingerprints": "./data/fp_rdk_desc.h5",
+    "fragments": "./data/moad.h5",
+    "fingerprints": "./data/rdk10_moad.h5",
     "learning_rate": 0.001,
-    "num_epochs": 100,
+    "num_epochs": 15,
     "test_steps": 400,
-    "batch_size": 32,
+    "batch_size": 16,
     "grid_width": 24,
-    "grid_res": 1.0,
+    "grid_res": 0.75,
     "fdist_min": null,
-    "fdist_max": 3,
+    "fdist_max": 4,
     "fmass_min": null,
     "fmass_max": 150,
     "ignore_receptor": false,
     "ignore_parent": false,
-    "rec_typer": "simple",
-    "lig_typer": "simple",
-    "rec_channels": 4,
-    "lig_channels": 3,
-    "in_channels": 7,
     "output_size": 2048,
     "pad": false,
-    "blocks": [32, 64],
-    "fc": [2048],
+    "blocks": [64, 64],
+    "fc": [512],
     "use_all_labels": true,
     "dist_fn": "cos",
-    "loss": "support_v1"
+    "loss": "direct",
+    "point_radius": 1,
+    "point_type": 3,
+    "rec_typer": "simple",
+    "acc_type": 0,
+    "lig_typer": "single_h"
 }
configurations/layer_type_sweep/rec_meta.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1,
    "point_type": 3,
    "rec_typer": "meta",
    "acc_type": 0,
    "lig_typer": "simple"
}
configurations/layer_type_sweep/rec_meta_mix.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1,
    "point_type": 3,
    "rec_typer": "meta_mix",
    "acc_type": 0,
    "lig_typer": "simple"
}
configurations/layer_type_sweep/rec_simple_h.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1,
    "point_type": 3,
    "rec_typer": "simple_h",
    "acc_type": 0,
    "lig_typer": "simple"
}
configurations/layer_type_sweep/rec_single.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1,
    "point_type": 3,
    "rec_typer": "single",
    "acc_type": 0,
    "lig_typer": "simple"
}
configurations/layer_type_sweep/rec_single_h.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1,
    "point_type": 3,
    "rec_typer": "single_h",
    "acc_type": 0,
    "lig_typer": "simple"
}
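The `layer_type_sweep` files in this commit share every field except the typer settings, which is easy to miss when reading nine near-identical JSON bodies. The sketch below shows one way to confirm that with a small dictionary comparison. `config_diff` is a hypothetical helper, not something from the repository, and the dictionaries hold only the typing-related fields copied from three of the files.

```python
def config_diff(base, other):
    """Map each differing key to its (base, other) values.

    A key present in only one config is reported with None on the
    missing side, so an explicit null and a missing key compare equal.
    """
    changed = {}
    for key in sorted(set(base) | set(other)):
        if base.get(key) != other.get(key):
            changed[key] = (base.get(key), other.get(key))
    return changed


# Typing-related fields copied from three layer_type_sweep files;
# the remaining fields of those files are identical in this commit.
LIG_SIMPLE = {"point_type": 3, "rec_typer": "simple",
              "acc_type": 0, "lig_typer": "simple"}
LIG_SINGLE_H = {"point_type": 3, "rec_typer": "simple",
                "acc_type": 0, "lig_typer": "single_h"}
REC_META = {"point_type": 3, "rec_typer": "meta",
            "acc_type": 0, "lig_typer": "simple"}

for name, cfg in [("lig_single_h", LIG_SINGLE_H), ("rec_meta", REC_META)]:
    print(name, config_diff(LIG_SIMPLE, cfg))
```

In a checkout of the repository, the same helper could be applied to the full files after loading them with `json.load`.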
configurations/voxel_mse_direct_m150.json (deleted, 100644 → 0) @ 27b2cbf5
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/fragments_desc.h5",
    "fingerprints": "./data/fp_rdk_desc.h5",
    "learning_rate": 0.001,
    "num_epochs": 300,
    "test_steps": 500,
    "batch_size": 32,
    "grid_width": 24,
    "grid_res": 1.0,
    "fdist_min": 0.5,
    "fdist_max": 3,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "rec_typer": "simple",
    "lig_typer": "simple",
    "rec_channels": 4,
    "lig_channels": 3,
    "in_channels": 7,
    "output_size": 2048,
    "pad": false,
    "blocks": [32, 64],
    "fc": [2048],
    "use_all_labels": true,
    "dist_fn": "mse",
    "loss": "direct"
}
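The deleted `voxel_mse_direct_m150.json` uses a different set of keys from the configurations this commit adds. The sketch below makes that difference explicit by comparing the key names of the deleted file with those of `configurations/final.json`. Both key lists are transcribed from this page; why the keys changed is not stated in the commit, so the code only reports the difference.

```python
# Key names of the deleted configurations/voxel_mse_direct_m150.json.
OLD_KEYS = """
version no_partitions fragments fingerprints learning_rate num_epochs
test_steps batch_size grid_width grid_res fdist_min fdist_max fmass_min
fmass_max ignore_receptor ignore_parent rec_typer lig_typer rec_channels
lig_channels in_channels output_size pad blocks fc use_all_labels
dist_fn loss
""".split()

# Key names of the added configurations/final.json.
NEW_KEYS = """
version no_partitions fragments fingerprints learning_rate num_epochs
test_steps batch_size grid_width grid_res fdist_min fdist_max fmass_min
fmass_max ignore_receptor ignore_parent output_size pad blocks fc
use_all_labels dist_fn loss point_radius point_type rec_typer acc_type
lig_typer
""".split()

only_old = sorted(set(OLD_KEYS) - set(NEW_KEYS))
only_new = sorted(set(NEW_KEYS) - set(OLD_KEYS))

print("only in deleted config:", only_old)
print("only in final.json:   ", only_new)
```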
configurations/voxelation_sweep/t0.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1,
    "point_type": 0,
    "rec_typer": "simple",
    "acc_type": 0,
    "lig_typer": "simple"
}
\ No newline at end of file
configurations/voxelation_sweep/t1.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1,
    "point_type": 1,
    "rec_typer": "simple",
    "acc_type": 0,
    "lig_typer": "simple"
}
\ No newline at end of file
configurations/voxelation_sweep/t10.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1.75,
    "point_type": 4,
    "rec_typer": "simple",
    "acc_type": 0,
    "lig_typer": "simple"
}
\ No newline at end of file
configurations/voxelation_sweep/t11.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 1.75,
    "point_type": 5,
    "rec_typer": "simple",
    "acc_type": 0,
    "lig_typer": "simple"
}
\ No newline at end of file
configurations/voxelation_sweep/t12.json (new file, 0 → 100644) @ 0fd6a7f2
{
    "version": "voxelnet",
    "no_partitions": false,
    "fragments": "./data/moad.h5",
    "fingerprints": "./data/rdk10_moad.h5",
    "learning_rate": 0.001,
    "num_epochs": 15,
    "test_steps": 400,
    "batch_size": 16,
    "grid_width": 24,
    "grid_res": 0.75,
    "fdist_min": null,
    "fdist_max": 4,
    "fmass_min": null,
    "fmass_max": 150,
    "ignore_receptor": false,
    "ignore_parent": false,
    "output_size": 2048,
    "pad": false,
    "blocks": [64, 64],
    "fc": [512],
    "use_all_labels": true,
    "dist_fn": "cos",
    "loss": "direct",
    "point_radius": 2.5,
    "point_type": 0,
    "rec_typer": "simple",
    "acc_type": 0,
    "lig_typer": "simple"
}
\ No newline at end of file
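The five `voxelation_sweep` files shown on this page differ only in `point_radius` and `point_type`. As a sketch of how such variants could be produced from a shared base instead of being written by hand, the code below rebuilds those two fields for `t0`, `t1`, `t10`, `t11` and `t12`. `make_variant` is a hypothetical helper, the base dictionary holds only a few of the shared fields, and the meaning of the integer `point_type` codes is not documented in this commit.

```python
import json

# A few of the fields shared by every voxelation_sweep file in this commit.
BASE = {
    "grid_width": 24,
    "grid_res": 0.75,
    "rec_typer": "simple",
    "acc_type": 0,
    "lig_typer": "simple",
}

# (point_radius, point_type) pairs copied from the files on this page.
SWEEP = {
    "t0": (1, 0),
    "t1": (1, 1),
    "t10": (1.75, 4),
    "t11": (1.75, 5),
    "t12": (2.5, 0),
}


def make_variant(base, point_radius, point_type):
    """Return a copy of base with the two swept fields set (hypothetical)."""
    variant = dict(base)  # copy so the shared base is never mutated
    variant["point_radius"] = point_radius
    variant["point_type"] = point_type
    return variant


variants = {name: make_variant(BASE, radius, ptype)
            for name, (radius, ptype) in SWEEP.items()}

for name, cfg in variants.items():
    print(f"{name}.json", json.dumps(cfg))
```

Keeping the swept values in one table makes it harder for a variant to drift from the base in an unintended field.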