Linear predictor from output
In the real world we cannot measure the white noise; the only available information is the values of the process up to time $t$.
We therefore need to reconstruct the white noise underlying the generation of the process from the values of the process itself up to time $t$. Suppose we are able to express the noise as a linear combination of the present and past outputs: $$e(t)=h_0y(t)+h_1y(t-1)+\dots=\sum_{i=0}^{+\infty}{h_iy(t-i)}$$ Then the optimal predictor from noise, $\hat{y}(t+k|t)=\sum_{i=0}^{+\infty}{w_{k+i}e(t-i)}$, can be written as a predictor from output by substituting $e(t-i)=\sum_{j=0}^{+\infty}{h_jy(t-i-j)}$: $$\hat{y}(t+k|t)=\sum_{i=0}^{+\infty}{w_{k+i}\left[\sum_{j=0}^{+\infty}{h_jy(t-i-j)}\right]}$$
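
As a rough numerical illustration of the double sum above, the sketch below uses a hypothetical MA(1) process $y(t)=e(t)+c\,e(t-1)$ with $|c|<1$, which is not taken from these notes. For this process $w_0=1$, $w_1=c$, and the noise can be recovered with $h_i=(-c)^i$. The value of `c`, the truncation length, the simulated data, and the function name `predict_from_output` are all assumptions made for the example; the infinite sums are truncated and $y(s)$ is treated as zero for $s<0$.

```python
import numpy as np

# Hypothetical MA(1) process: y(t) = e(t) + c*e(t-1), with |c| < 1 (assumed).
c = 0.5
rng = np.random.default_rng(0)
e = rng.standard_normal(500)   # white noise (unknown in practice)
y = e.copy()
y[1:] += c * e[:-1]            # y(0) = e(0), i.e. e(-1) is taken as 0

# Impulse response of the process (w_0 = 1, w_1 = c, zero afterwards).
w = np.array([1.0, c])

# Truncated whitening coefficients: e(t) ~ sum_j h_j y(t-j), h_j = (-c)^j.
n_terms = 60
h = (-c) ** np.arange(n_terms)


def predict_from_output(y, w, h, k):
    """Truncated k-step predictor from output, evaluated at the last sample.

    Implements y_hat(t+k|t) = sum_i w[k+i] * (sum_j h[j] * y[t-i-j]),
    skipping any term whose time index t-i-j would be negative.
    """
    t = len(y) - 1
    y_hat = 0.0
    for i in range(len(w) - k):
        # Inner sum: reconstructed noise sample e(t-i) from past outputs.
        e_rec = sum(h[j] * y[t - i - j]
                    for j in range(len(h)) if t - i - j >= 0)
        y_hat += w[k + i] * e_rec
    return y_hat


# One-step prediction from output versus prediction from the true noise.
y_hat_output = predict_from_output(y, w, h, k=1)
y_hat_noise = c * e[-1]        # w_1 * e(t), usable only in simulation
print(y_hat_output, y_hat_noise)
```

The two printed values should agree up to the truncation error of the whitening filter, which shrinks like $|c|^{N}$ with the number of terms $N$ kept in `h`.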