Replies: 4 comments, 2 replies
-
I see that paddings after the first pulse will be replaced with previous inputs. Am I right?
-
Maybe you have some pseudocode for this operation, including paddings, strides, kernel size, etc.?
-
Have you tried dumping your example network to ONNX and running the pulsification through the tract command line? It will show you the pulsified network.
-
Very good questions. The pulsification semantics are pretty clear in my head, but I never took the time to write about them somewhere public. Let's try to discuss them a bit. The gist of it: pulsification transforms a stateless "causal" network into a stateful network that performs the same computation with some delay. For instance, if your training network takes a 1D input of length 10 and performs a convolution with a kernel size of 3, your output will be of length 8.
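Concretely (sketch notation, with y_i standing for the convolution of frames x_i..x_i+2):

```
input : [x0 x1 x2 x3 x4 x5 x6 x7 x8 x9]    (length 10)
output: [y0 y1 y2 y3 y4 y5 y6 y7]          (length 8)
```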
Now let's say you want to pulse this network with a pulse size of 4. During the first turn, we only know the value of 4 input frames, so we can only compute the first two output frames. One option would be to output fewer frames in the first pulse:
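```
pulse 1: in [x0 x1 x2 x3]  ->  out [y0 y1]         (only 2 frames computable)
pulse 2: in [x4 x5 x6 x7]  ->  out [y2 y3 y4 y5]
```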
But that is NOT what tract is doing. The implementation sticks to one principle: fixed-size tensors. So instead, the first output pulse will be partially invalid:
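```
pulse 1: in [x0 x1 x2 x3]  ->  out [?? ?? y0 y1]   (first 2 frames invalid)
pulse 2: in [x4 x5 x6 x7]  ->  out [y2 y3 y4 y5]
```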
Then it's up to the caller to skip these two invalid frames. When tract pulsifies the network, it computes the overall delay (2 in our example) and stores it as a property of the model. Note that the delay is intrinsic to the network: if you think of convolutions, it is the global receptive field minus 1. The pulse size, on the other hand, is a value picked by the model integrator. So how is it implemented? The nice thing is, it is more or less composable: if your network is pulsifiable, you can pulsify it operator by operator. Let's have a look at the valid convolution case. It turns out that pulsifying the convolution can be done without altering the convolution code itself:
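The pulsified graph looks roughly like this (sketch):

```
Source ──▶ Delay(overlap=2) ──▶ Conv(kernel=3, valid) ──▶ Output
```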
In this case, a delay operator with an overlap of 2 is inserted before the convolution: it stores the last two frames of each pulse and prepends them to the next pulse, over and over. And that's about it: the convolution operator here is the same as the one in the original network. tract will "tag" the output network with a "delay" of 2, as it can tell that the conversion introduced two invalid frames. (There is a bit of a difficulty in giving semantics to the "delay" and "length" of the tensors between the Delay and the Convolution, but we pretend not to be aware of it: we think of this Delay+Conv more or less as an atomic thing.)

Delay has two main parameters, overlap and delay (a lot of things are called delay, right?). The delay parameter is necessary in some circumstances to offset the output without overlapping, but it is not super frequent. With a delay of 2, you get:
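```
pulse 1: in [x0 x1 x2 x3]  ->  out [?? ?? x0 x1]
pulse 2: in [x4 x5 x6 x7]  ->  out [x2 x3 x4 x5]
```

To make the Delay+Conv mechanics concrete, here is a minimal NumPy sketch of the overlap case (hypothetical names, not tract's actual Rust implementation), assuming 1D frames, a pulse size of 4, and a valid convolution with a kernel size of 3:

```python
import numpy as np

KERNEL = np.array([1.0, 2.0, 3.0])  # any kernel of size 3

def valid_conv(x, k=KERNEL):
    # Unchanged "valid" convolution: output length = len(x) - len(k) + 1.
    return np.correlate(x, k, mode="valid")

class Delay:
    """Caches the last `overlap` frames of each pulse and prepends them to
    the next one, so the downstream valid convolution always sees its
    kernel_size - 1 frames of left context."""
    def __init__(self, overlap):
        self.buffer = np.zeros(overlap)  # initial content is garbage: the 2 invalid frames

    def __call__(self, pulse):
        out = np.concatenate([self.buffer, pulse])  # fixed size: overlap + pulse
        self.buffer = pulse[len(pulse) - len(self.buffer):]
        return out

# Stateless reference: length-10 input, length-8 output.
x = np.arange(10.0)
reference = valid_conv(x)

# Pulsed run: each output pulse has the same fixed size as the input pulse.
delay_op = Delay(overlap=2)
pulses = [valid_conv(delay_op(x[i:i + 4])) for i in (0, 4)]
assert np.allclose(pulses[0][2:], reference[0:2])  # frames 0-1 of pulse 1 are invalid
assert np.allclose(pulses[1], reference[2:6])      # network delay = 2
```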
If we have more convolution layers, it just composes nicely, and tensor delays are simply additive. Things get a bit more complicated with padded convolutions: we need to convert them to valid convolutions first, prepending a padding operator, then pulsify the padding separately (actually it is a bit more complicated, because this approach can sometimes lead to unnecessary delays). Recurrent operators are super easy to pulsify: they just need to learn how to skip the invalid frames coming to their input (so the Scan operator has a "skip" property). While they have not yet seen "skip" frames, they operate as usual but keep their initial state unchanged. (It is frequent to have RNN ops following CNN ops, so the convolution part will introduce a delay that the RNN must know about.)
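To make the "skip" idea concrete, here is a minimal sketch of a pulsified recurrent op (again hypothetical names, not tract's actual Scan implementation): it consumes every frame of the pulse, but leaves its state untouched until the upstream delay has been flushed:

```python
import numpy as np

class PulsedScan:
    """Wraps a recurrent cell for pulsed execution. The first `skip` frames
    it sees are invalid (upstream delay), so it emits placeholder frames for
    them and keeps its initial state unchanged."""
    def __init__(self, cell, initial_state, skip):
        self.cell = cell            # fn(state, frame) -> (new_state, out_frame)
        self.state = initial_state
        self.skip = skip            # delay introduced upstream (e.g. by convolutions)

    def __call__(self, pulse):
        out = []
        for frame in pulse:
            if self.skip > 0:
                self.skip -= 1                    # invalid frame: don't touch the state,
                out.append(np.zeros_like(frame))  # and this output frame is invalid anyway
            else:
                self.state, y = self.cell(self.state, frame)
                out.append(y)
        return np.stack(out)
```

I hope this helps :)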
-
Hello! I'm interested in PulsedModel, which saves information about previous inputs, and I want to understand it and reimplement it as a torch model.
So, for example, on every step we have an input tensor of shape [batch_size=1, n_channels=1, time=1, inp_dim].
And, like in PulsedModel, we want to get the output tensor after convolving the incoming tensor, taking into account information from previous steps.
Can you explain what PulsedModel is doing, for example for this code?
Taken from https://github.com/Rikorose/DeepFilterNet/blob/12fe14af0790b4dfa537aa6011b082a0bfe609a2/DeepFilterNet/df/modules.py#L18