This is a study of training performance degradation in the classification of handwritten numbers (MNIST dataset) when noise is added in the input images.
git clone [email protected]:iqf/iq-mnist-use-case
cd iq-mnist-use-case
- Then build the docker image with
make build
.(***) This will also download the dataset and weights - In order to execute the experiments:
make dockershell
(*)- Inside the docker terminal execute
python ./iqf-usecase.py
- Start the mlflow server by doing
make mlflow
(*) - Notebook examples can be launched and executed by
make notebookshell NB_PORT=[your_port]"
(**) - To access the notebook from your browser in your local machine you can do:
- If the executions are launched in a server, make a tunnel from your local machine.
ssh -N -f -L localhost:[your_port]:localhost:[your_port] [remote_user]@[remote_ip]
Otherwise skip this step. - Then, in your browser, access:
localhost:[your_port]/?token=IQF
- If the executions are launched in a server, make a tunnel from your local machine.
- The results of the IQF experiment can be seen in the MLflow user interface.
- For more information please check the IQF_expriment.ipynb or IQF_experiment.py.
- There are also examples of dataset Sanity check and Stats in SateAirportsStats.ipynb
- The default ports are
8888
for the notebookshell,5000
for the mlflow and9197
for the dockershell - (*)
Additional optional arguments can be added. The dataset location is:
DS_VOLUME=[path_to_your_dataset]
- To change the default port for the mlflow service:
MLF_PORT=[your_port]
- (**)
To change the default port for the notebook:
NB_PORT=[your_port]
- A terminal can also be launched by
make dockershell
with optional arguments such as (*) - (***)
Depending on the version of your cuda drivers and your hardware you might need to change the version of pytorch which is in the Dockerfile where it says:
pip3 install torch==1.7.0+cu110 torchvision==0.8.1+cu110 -f https://download.pytorch.org/whl/torch_stable.html
. - (***)
The dataset is downloaded with all the results of executing the dataset modifiers already generated. This allows the user to freely skip the
.execute
as well as theapply_metric_per_run
which take long time.