Graph cuts are one of the most popular techniques in computer vision for doing inference in MRFs (Markov random fields).
The most famous papers are:
"Fast approximate energy minimization via graph cuts" - Boykov, Y., Veksler, O. and Zabih, R.
and its follow-up
"What energy functions can be minimized via graph cuts?" - Kolmogorov, V. and Zabih, R.
Here I will comment on some points regarding the first of these papers.
A graph cut is a partition of a graph's nodes into two disjoint sets, one containing the source terminal and the other the sink.
The minimum cut and the maximum flow are equivalent problems (Ford-Fulkerson theorem).
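As a quick illustration (my own toy example; the papers don't depend on any particular library, I just use networkx here because it ships max-flow/min-cut routines):

```python
# Tiny sanity check that the max-flow value equals the min-cut capacity.
import networkx as nx

G = nx.DiGraph()
G.add_edge('s', 'a', capacity=3)
G.add_edge('s', 'b', capacity=2)
G.add_edge('a', 'b', capacity=1)
G.add_edge('a', 't', capacity=2)
G.add_edge('b', 't', capacity=3)

flow_value = nx.maximum_flow_value(G, 's', 't')
cut_value, (source_side, sink_side) = nx.minimum_cut(G, 's', 't')
assert flow_value == cut_value   # Ford-Fulkerson: the two values coincide
print(flow_value)                # 5
```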
CV people cast their problem as a graph and run graph cut algorithms to solve it.
There are two popular algorithms: alpha-beta swap and alpha-expansion.
In a computer vision problem where the pixels (nodes) take binary labels, these algorithms give exact results. The same is true when the edge cost is V = |f_p - f_q|, i.e., the absolute value of the label difference (so the labels are numbers).
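Here is a minimal sketch of the binary case: a 1-D "image" with labels {0, 1}, quadratic data costs and a Potts smoothness term, minimized exactly by one s-t min cut. The toy data and the use of networkx are my own choices; real CV code uses specialized max-flow solvers (e.g. Boykov-Kolmogorov).

```python
import networkx as nx

pixels = [0, 1, 2, 3]            # four pixels in a row
obs    = [0.1, 0.2, 0.9, 0.8]    # noisy observations in [0, 1]
lam    = 0.3                     # Potts smoothness weight

def unary(p, label):
    # Data cost: squared distance between the observation and the label.
    return (obs[p] - label) ** 2

G = nx.DiGraph()
for p in pixels:
    G.add_edge('s', p, capacity=unary(p, 1))  # cutting this t-link assigns p the label 1
    G.add_edge(p, 't', capacity=unary(p, 0))  # cutting this t-link assigns p the label 0
for p, q in zip(pixels, pixels[1:]):          # n-links between left/right neighbours
    G.add_edge(p, q, capacity=lam)
    G.add_edge(q, p, capacity=lam)

cut_value, (source_side, sink_side) = nx.minimum_cut(G, 's', 't')
labels = {p: (0 if p in source_side else 1) for p in pixels}
print(labels)   # expected: {0: 0, 1: 0, 2: 1, 3: 1}
```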
There are approximate algorithms for semi-metric and metric edge costs.
Typical edge costs are (see the small sketch after the list):
Truncated L2 distance, V(a,b) = min(K, ||a-b||) <-- makes sense for continuous "labels", e.g. intensity values.
Potts interaction penalty, V = delta(fp != fq) <-- for class "labels", when a distance between labels is hard to define.
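Written out as plain Python (function names are my own, and I assume scalar labels, so ||a-b|| is just the absolute difference):

```python
def truncated_l2(a, b, K):
    # Grows with the label difference but is capped at K, so a sharp
    # discontinuity is never penalized by more than K.
    return min(K, abs(a - b))

def potts(fp, fq, penalty=1.0):
    # A fixed penalty whenever two neighbouring labels differ.
    return penalty if fp != fq else 0.0
```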
Moves: operations on the current labeling of the nodes.
alpha-beta swap := some nodes labeled alpha become beta and some labeled beta become alpha; every node whose label is neither alpha nor beta keeps its label.
Requires the edge costs V to be a semi-metric.
alpha-expansion := some nodes switch from their current label to alpha; nodes already labeled alpha keep it.
Requires the edge costs V to be a metric.
If a labeling is a local minimum with respect to all possible alpha-expansions, its energy is at most a constant factor above the energy of the globally optimal labeling (so we know how far we can be from the best solution).
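Concretely, as I recall the bound from the Boykov-Veksler-Zabih paper (double-check it there): E(f_hat) <= 2c E(f*), where f_hat is the expansion-move local minimum, f* is the global optimum, and c = max_{a != b} V(a,b) / min_{a != b} V(a,b). For the Potts model c = 1, so the answer is within a factor of 2 of optimal.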
The trick in both alpha-beta swaps and alpha-expansions is to build a graph G_ab or G_a whose minimum cut corresponds to the best alpha-beta swap or alpha-expansion from the current labeling.
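To make the outer loop concrete, here is a sketch of alpha-expansion on a tiny 1-D Potts problem. The toy data and names are mine, and to keep the sketch short I find the best expansion move by brute force over the binary "switch to alpha?" choices; in the actual algorithm that inner step is exactly what the min cut on G_a computes.

```python
from itertools import product

obs    = [0, 0, 2, 2, 1]   # noisy per-pixel observations
labels = [0, 1, 2]         # label set
lam    = 0.6               # Potts smoothness weight

def energy(f):
    data   = sum(abs(obs[p] - f[p]) for p in range(len(f)))
    smooth = sum(lam for p in range(len(f) - 1) if f[p] != f[p + 1])
    return data + smooth

def best_expansion(f, alpha):
    # Best single alpha-expansion: each pixel either keeps its label or
    # switches to alpha. Brute force here; one graph cut in the real algorithm.
    best = list(f)
    for switch in product([False, True], repeat=len(f)):
        g = [alpha if s else fp for s, fp in zip(switch, f)]
        if energy(g) < energy(best):
            best = g
    return best

f = [0] * len(obs)        # arbitrary initial labeling
improved = True
while improved:           # cycle over labels until no expansion move helps
    improved = False
    for alpha in labels:
        g = best_expansion(f, alpha)
        if energy(g) < energy(f):
            f, improved = g, True
print(f, energy(f))       # -> [0, 0, 2, 2, 1] with energy 1.2 on this toy data
```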