Pre-training the flow estimation network #10

Open
YaoooLiang opened this issue Jul 31, 2018 · 7 comments

Comments

@YaoooLiang

Hi @anchen1011. I pre-trained the flownet on the Sintel dataset, but training does not converge. The batch size is 16 and the learning rate is 0.0001; the loss is the L1 difference between the last sub-net's output and the ground truth. Can you share the details of how you pre-trained the flownet?
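
For reference, the loss described above can be sketched in NumPy as follows. This is only a minimal illustration, assuming the prediction and the ground truth share the shape [batch, height, width, 2]; the function name `l1_flow_loss` is hypothetical and not taken from the repository.

```python
import numpy as np


def l1_flow_loss(pred_flow, gt_flow):
    """Mean absolute difference between predicted and ground-truth flow.

    Both arrays are assumed to have shape [batch, height, width, 2].
    """
    pred_flow = np.asarray(pred_flow, dtype=np.float32)
    gt_flow = np.asarray(gt_flow, dtype=np.float32)
    if pred_flow.shape != gt_flow.shape:
        raise ValueError("prediction and target must have the same shape")
    # Average the per-component absolute error over the whole batch.
    return float(np.mean(np.abs(pred_flow - gt_flow)))
```

Because the flow is left unnormalized, the value of this loss is in pixels of displacement, which may help when judging whether a number such as 0.1 is small or large.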

@anchen1011
Owner

I think there should be no problem with a batch size of 16 and a learning rate of 0.0001.
Would you share the visualized input/output/target/flow so that I can get a sense of what is preventing the network from converging?

@YaoooLiang
Author

@anchen1011, thank you for your reply. Images are normalized to the range [0, 1] (image = image.astype(np.float32) / 255) and cropped (image = image[0:432, :, :]), while the flows are not preprocessed. The shape of the images is [16, 432, 1024, 3] and the shape of the flow is [16, 432, 1024, 2]. The 8x-downsampled images and a zero initial flow, flow0 = tf.constant(np.zeros((16, 54, 128, 2)), np.float32), are concatenated with tf.concat([frame1, frame2, flow0], axis=3) to form the first subnet's input; the remaining subnets receive their inputs in the same way.
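
The input assembly described above can be sketched for a single frame pair in NumPy. This is a sketch under assumptions: the thread does not say how the 8x downsampling is done, so average pooling is used here as a stand-in, and the helper names are hypothetical.

```python
import numpy as np


def preprocess_image(image_uint8, crop_height=432):
    """Scale a uint8 image to [0, 1] and keep the top `crop_height` rows."""
    image = image_uint8.astype(np.float32) / 255.0
    return image[:crop_height, :, :]


def downsample(x, factor=8):
    """Average-pool an [H, W, C] array (assumed method, not from the thread)."""
    h, w, c = x.shape
    if h % factor or w % factor:
        raise ValueError("height and width must be divisible by factor")
    return x.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))


def first_subnet_input(frame1, frame2, factor=8):
    """Concatenate two downsampled frames with a zero initial flow."""
    f1 = downsample(frame1, factor)
    f2 = downsample(frame2, factor)
    # Zero flow at the coarsest scale, two channels (horizontal, vertical).
    flow0 = np.zeros(f1.shape[:2] + (2,), dtype=np.float32)
    return np.concatenate([f1, f2, flow0], axis=-1)
```

With 436x1024 Sintel frames cropped to 432 rows, this yields a 54x128 input with 8 channels per pair, matching the shapes quoted above.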

@YaoooLiang
Author

The loss drops when I train on a single batch, but it fluctuates up and down when I train on the entire dataset.

@anchen1011
Owner

It seems like you are implementing the pre-training pipeline in TF, which could introduce many issues that I am not familiar with.

I think in general, to figure out why it doesn't converge, you need to:

  1. Visualize the network architecture (with tensorboard)
  2. Visualize a few groups of input/output/target/flow images

I would be happy to help if you attach these images so that I can take a look.

Also, your preprocessing of images is quite different from ours. We use defaultTrainTransform from this module.

@YaoooLiang
Author

Hi @anchen1011, sorry for the late reply. I visualized a few groups of the images in images.zip. After a long period of training, the training L1 loss stabilized around 0.1 and the validation L1 loss around 0.15. However, the model still performs poorly on both the training and validation sets. Can you share the details about:

1. Which training scheme did you use: end-to-end or step by step?
2. How do you normalize the optical flow data?
3. Are the input images kept at their original size or cropped to a smaller size?

@anchen1011
Owner

I think your network is learning something, which means the input/output format is fine.

However, the network structure seems problematic. Each subnet should output an optical flow, which you need to both resize and double in magnitude.
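
To illustrate the resize-and-double step, here is a minimal NumPy sketch. It assumes each subnet works at twice the resolution of the previous one and that flow is measured in pixels of the current resolution; nearest-neighbour repetition is used only to keep the example short, since the interpolation method is not stated here, and the function name is hypothetical.

```python
import numpy as np


def upsample_flow_2x(flow):
    """Upsample a [batch, H, W, 2] flow field by 2x and double its vectors.

    Doubling is needed because a displacement of d pixels at the coarse
    scale corresponds to 2 * d pixels at the next finer scale.
    """
    flow = np.asarray(flow, dtype=np.float32)
    # Nearest-neighbour upsampling along height (axis 1) and width (axis 2).
    up = np.repeat(np.repeat(flow, 2, axis=1), 2, axis=2)
    return up * 2.0
```

For example, a [16, 54, 128, 2] flow becomes [16, 108, 256, 2], and a uniform displacement of 1 pixel becomes 2 pixels.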

For your 3 questions:

  1. First step by step, then fine-tune end2end. Step-by-step training alone should be able to deliver a very good result.
  2. You don't normalize optical flow data.
  3. The input image is cropped to the network input size (if it is not already the same).
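
Answers 2 and 3 can be combined in a small NumPy sketch: the same crop window is applied to both frames and to the flow, and the flow values are left untouched. The random choice of the window and the function name `paired_crop` are assumptions for illustration; the thread only states that inputs are cropped to the network input size.

```python
import numpy as np


def paired_crop(frame1, frame2, flow, crop_h, crop_w, rng):
    """Crop two [H, W, 3] frames and their [H, W, 2] flow with one window.

    The flow values are not rescaled or normalized, only cropped.
    """
    h, w = flow.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop size exceeds input size")
    # Pick the top-left corner once so that all three arrays stay aligned.
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    window = (slice(top, top + crop_h), slice(left, left + crop_w))
    return frame1[window], frame2[window], flow[window]
```

A call such as `paired_crop(f1, f2, flow, 432, 1024, np.random.default_rng(0))` would return three arrays with matching spatial size.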

@YaoooLiang
Author

Hi @anchen1011. Actually, I resize each subnet's output flow and double it at the same time: flow2 = tf.image.resize_images(flow1, (flow1.shape[1] * 2, flow1.shape[2] * 2)) * 2. Then I trained each subnet one by one, but it failed again. I also checked that the input images and target flows match. Would you give me any suggestions? Thanks a lot!
