forked from tianlt/deepdirect
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
108 lines (71 loc) · 3.39 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "50%"
)
```
# deepdirect
<!-- badges: start -->
<!-- badges: end -->
Deepdirect is an in silico approach to generate mutations for protein complexes towards a specified direction (increase/decrease) in binding affinity.
## System requirements and dependencies
### Hardware Requirements
`deepdirect` model is able to be trained and perform its operations on a standard computer.
### OS Requirements
The `deepdirect` model should be compatible with Windows, Mac, and Linux operating systems. The package has been tested on the following systems:
* Linux 3.10.0
* Windows 10
### Dependencies
`deepdirect` framework is built and trained on the `Tensorflow 2.4.0` and `Keras 2.4.0`.
## Framework construction
The python file including all required deepdirect framework built function is able to be downloaded from GitHub: `deepdirect_framework/model_function.py`
## Data structure
* `data` folder contains the original datasets used for building the training datasets.
* `deepdirect_framework` folder contains the trained model weights and the model constructing functions.
* `deepdirect_paper` folder contains codes for building and training models, and performing analysis in the deepdirect manuscript. The file `ab_bind_data_extract.py` and `skempi_data_extract.py` contains code for constructing training datasets for Deepdirect framework. `train_step_1.py` contains code for training step 1 for the mutation mutator. `final_model.py` contains code for training step 2 (final) for the mutation mutator. `model_function.py` contains code for constructing the Deepdirect framework. `model_evaluation_application.py` contains code for model evaluation, and teh application on Novavax-vaccine. `evolution_analysis.py` contains code for performing evolution analysis.
## File source
For files that are required as input in the code but not generated from other codes, please refer to the data availability section in the original paper.
## Installation
Clone repository:
```
git clone https://github.com/tianlt/deepdirect.git
```
Create virtual environment:
```
conda create --name deepdirect python=3.6.8
```
Activate virtual environment:
```
conda activate deepdirect
```
Install dependencies:
```
pip install tensorflow==2.4.0
pip install keras==2.4.0
```
## Running deepdirect
### data processing
Data to be input to deepdirect include sequence to be mutated `pre`, RBD site `rbd`, ligand-receptor index `same`, protein tertiary structure information `x`, `y` and `z`, and random noise `input_noi`. All input has to be `tf.float32` type.
### Build deepdirect mutator with trained weights
```
aa_mutator = build_aa_mutator()
aa_mutator.load_weights(
'deepdirect_framework/model_i_weights.h5')
```
### Binding affinity-guided mutation
```
aa_mutator.predict([pre, rbd, same, x, y, z, input_noi])
```
### Additional information
Expected outputs: mutated amino acid sequence
Expected runtime for mutation: ~1 mintue
## Issues and bug reports
Please use <https://github.com/tianlt/deepdirect/issues> to submit issues, bug reports, and comments.
## License
deepdirect is distributed under the [GNU General Public License version 2 (GPLv2)](https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html).