Lab notes for the first two weeks of January.
I’ve been fiddling with a branch of the PamSeam project named Lattice in an effort to streamline the base and speed up the runtime. There are three algorithms enabled and all of them have similar processes. The basic engine is also very slow, with lots of large allocations and copying the image over and over.
The basic idea of the Lattice project is to turn the image into a graph, one node for each pixel, with eight pointers to the neighboring nodes.
The Lattice graph is a generic container; we can put anything at all into it, so the data structure the Lattice holds can contain the image, any metadata maps (such an energy or flow map), the accumulative energy or flow data from processing, and the upward pointers that calculate the final seam.
The Lattice can remove a seam without copying; that is, the pointers from left-to-right or top-to-bottom can be set to “jump” the removed seam without having to copy the data. This can be done all at once, removing the current need to “seam and copy” both the energy map and the original image. The resulting image can be extracted from the Lattice data at the end.
I believe this process will make a single seam slower, but will significantly speed up the process of removing multiple seams. Also, known problem with the energy map is that removing a seam “damages” not just the pixels immediately adjacent to the seam, but also the pixels of any residuals seam the removed seam traversed. With the lattice, before a seam is removed the energy map regions to the left & right of (or above & below) the scheduled seam can be recalculated prior to removing the seam, enabling the algorithm to detect that the recalculation is no longer creating changes and halting any further processing. It might even be possible to limit recalculating seams themselves only to the damaged region.
There are problems with this solution. The first is that it takes significantly more memory to store a single image for processing. While we may be able to eliminate the multiple-planes problem for pixels, energy, and total energy, we’ve replaced those with a [u32; 8]
array of neighbor offsets, and that’s a lot of memory.
Next, while the graph itself is represented by a vector and a node’s neighbors are stored as offsets into the vector (a common representation technique in Rust), the graph itself is not coherently “chunkable” the way a vector would be. This makes multi-threaded processing problematic, as Rust absolutely does NOT want you to have multiple threads writing to the graph simultaneously. Having atomic sub-graphs doesn’t seem to be an answer, as even linear processing relies on having access to neighboring nodes that, regardless of which algorithm is used, at least one other thread must also have read-access. This could theoretically be solved by having one graph for reading and another for writing, but the memory usage quickly becomes burdensome.
Third, the “graph walking” algorithms become fairly complex. The current solution is to create a Step and Walker; the Step knows where you are and the direction you are headed, and as the Walker takes Steps, it returns the desired contents of the node at that point. This solution allows the Step to modify its direction, which is needed for the meandering path a seam may take, and it allows the Step to terminate early, which is needed for the damaged-seam management pass. The biggest headache here is lifetime management, as the Step and Walker need similar lifetimes, but they’ll also be taking the Lattice as an argument, and I have yet to figure out the lifetime relationships between the two.
I’m confident I can get this to work, but as I now have a full-time job, it’s going much slower than I’d like. It’s not exactly something they need at the office, so I’m restricted to working on it only in my spare time, of which I have very little these days.