Give two important design considerations when thinking about how to represent a 3D scene that is being reconstructed from 2D images. In other words, what properties are useful in a 3D representation, or what are some of the trade-offs?
The somewhat opaque equation used in NeRF for weighting samples along a volume-rendering ray is: \[ C(\mathbf{r})=\int_{t_n}^{t_f}T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t),\mathbf{d})\,dt \]
in its continuous form, where \(T(t)=\exp\left(-\int_{t_n}^{t}\sigma(\mathbf{r}(s))\,ds\right)\) is the transmittance accumulated along the ray up to \(t\). The discretized quadrature equation is: \[ \begin{align*} \hat{C}(\mathbf{r}) &= \sum_{i=1}^N w_i \mathbf{c}_i \\ &=\sum_{i=1}^{N}T_i(1-\exp(-\sigma_i\delta_i))\mathbf{c}_i \end{align*} \] where \(N\) is the number of samples, \(T_i=\exp(-\sum_{j=1}^{i-1}\sigma_j\delta_j)\), and \(\delta_i=t_{i+1}-t_i\) is the distance between adjacent samples. This boils down to a weighted sum of the colors (\(\mathbf{c}_i\)) along the sample ray.
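To make the quadrature concrete, here is a minimal NumPy sketch of the discrete weight computation (the function name `nerf_weights` and the array conventions are illustrative, not taken from any official NeRF codebase):

```python
import numpy as np

def nerf_weights(sigma, delta):
    """Discrete volume-rendering weights w_i = T_i * (1 - exp(-sigma_i * delta_i)).

    sigma: per-sample densities sigma_i, shape (N,)
    delta: spacings delta_i = t_{i+1} - t_i, shape (N,)
    """
    alpha = 1.0 - np.exp(-sigma * delta)  # opacity contributed by each segment
    # T_i = exp(-sum_{j<i} sigma_j * delta_j): transmittance surviving to
    # sample i (T_1 = 1, since the sum is empty for the first sample).
    T = np.exp(-np.concatenate([[0.0], np.cumsum(sigma * delta)[:-1]]))
    return T * alpha
```

A useful sanity check: since \(w_i = T_i - T_{i+1}\) (the sum telescopes), the weights sum to \(1 - T_{N+1}\), so they can never exceed 1; any leftover mass corresponds to light passing through the entire ray unabsorbed.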
To get some intuition for this, let's plug in a simple case and plot the weights. Let's take samples at \(t = 1, \dots, 10\) and assume that the density is 0 except for a constant-density object with density \(\sigma = 0.4\) ranging between \(t=4\) and \(t=6\) inclusive. Using software of your choice, plug this situation into the above equation to compute the weights \(w_{1..10}\), and plot these to show the weights at the 10 different sample points.
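For reference, here is a minimal sketch of one way to do this with NumPy and Matplotlib, assuming unit spacing \(\delta_i = 1\) (the spacing isn't specified beyond the sample locations) and repeating the weight computation so the block stands alone:

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(1, 11)                             # sample locations t = 1..10
sigma = np.where((t >= 4) & (t <= 6), 0.4, 0.0)  # density 0.4 inside the object, 0 elsewhere
delta = np.ones(10)                              # assumed unit spacing delta_i = 1

alpha = 1.0 - np.exp(-sigma * delta)             # per-segment opacity
T = np.exp(-np.concatenate([[0.0], np.cumsum(sigma * delta)[:-1]]))  # transmittance T_i
w = T * alpha                                    # weights w_i

plt.stem(t, w)
plt.xlabel("sample location $t_i$")
plt.ylabel("weight $w_i$")
plt.title("Volume-rendering weights for a constant-density object on $[4, 6]$")
plt.show()
```

Plugging in, the weights come out zero outside the object and decay inside it (\(w_4 \approx 0.33\), \(w_5 \approx 0.22\), \(w_6 \approx 0.15\)): each successive sample inside the object is attenuated by the material in front of it.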