Interesting, I’ve always wondered if the assignment problem could be solved as continuous optimization. I’m familiar with manifold optimization but didn’t realize the permutation matrix could be expressed as a manifold.
Has this been applied to networks like DETR for object detection? I’ve never been clear on how they made the Hungarian algorithm differentiable (and it seems like their approach has issues).
Of course it can, but likely you will also need some search as well.
In math programming, we create a relaxation of the assignment problem where all of the integer decisions are converted to continuous. Then you solve this super simple Linear Program. This is a theoretical lower bound on what the optimal assignment is. You can now start branch and bound search and find the optimal integer assignment.
Now the trick is that we have identified over the decades cuts (inequalities) that you can apply to this initial relaxed problem, that they chop off irrelevant continuous space but let the integer solutions are untouched. If your cuts are good enough and the integer optimal solution is at the vertex of the chopped polytope, congratulations! You don't have to search anything!
No cuts are necessary for minimum-weight bipartite matching as the constraint matrix is totally unimodular. Total unimodularity guarantees that any optimal basic solution is an integer solution.
Moreover, well-known algorithms for linear programming like simplex and primal-dual methods have straightforward (and illuminating!) combinatorial interpretations when restricted to minimum-weight bipartite matching.
Funny how things I never expect to see on here pop up occasionally. I had read some of the early papers by Boumal on optimization over Riemannian manifolds in 2016 for a similar problem and even wrote some Julia code for it (before manifolds.jl existed).
In my case, I was trying to perform synchronization over a set of noisy point clouds to extract the "ground truth" point cloud. I.e., take the set of coordinates corresponding to a point cloud, randomly permute the order of the coordinates, randomly rotate/reflect the point cloud, and then apply a Gaussian perturbation to each coordinate. Repeat this process multiple times. The goal then is to recover the original point cloud from the noisy rotated/reflected/permuted copies of the original.
Boumal and Singer had done some work to solve this problem for just rotations/reflections by essentially "stacking" the unknown orthogonal matrices (i.e., elements of O(3)) into a Stiefel manifold and then performing Riemannian optimization over this manifold. It worked fantastically well and was much faster than prior techniques involving semidefinite programming, so I decided to try the same approach for synchronization over permuted copies of the point cloud, and it also worked quite well. However, one can achieve better convergence properties by performing a discrete Fourier transform over the symmetric group on the sets of point clouds so that instead of optimizing over a Stiefel manifold that is supposed to represent stacked permutation matrices (well, stacked orthogonal matrices with constraints to make them doubly stochastic), you optimize over a manifold of stacked unitary matrices intended to represent irreps of S_n.
Ultimately I didn't know enough group theory at the time to figure out how to generate irreps for the "combined" group of O(3) and S_n, so I couldn't solve the problem simultaneously for both rotations/reflections and permutations, but ultimately Singer and Boumal developed an approach that bypassed the synchronization problem altogether by utilizing invariant polynomials to extract the ground truth signal in a more direct way.
Author here! Thank you f1shy for sharing.
I had to deal with minimum bipartite matching when implementing FedMA (a federated learning algorithm). I was lucky it could be solved with a scikit learn method because i wouldn't have been be able to write a solution in a short time, let alone an efficient one
When my supervisor read about fedma using matching he went "oh god, i have terrible memories about it". he was right
Yes, the problem can now be solved efficiently with a scikit one-liner, but the aim of this post was more speculative rather than beating the benchmark :)