The Maximum Flow problem is well-studied and there are very efficient algorithms to solve it. It turns out to be equivalent to finding the Minimum Cut of a network as well. If we can map the binarization problem to either of these problems, the result should be fast and accurate. Let's look at what the problems are and then see how to map it.
Suppose you have a network of pipes in a water system. The pipes may be different sizes (water capacities) and connect to each other at various junctions. Suppose you have one water supplier, called the source. And somewhere else you have one customer who wants as much water as possible, called the sink. The question is, given a particular configuration of pipes and junctions, what is the most water you can supply to the customer over the network?
In this example, the maximum flow possible through the network is 7 gallons per second.
Let's say you are a secret agent, trying to prevent the villain from escaping to safety. The villain will try to take a train from a starting city to their safehouse. Since the villain is wily, you know that if there is any path to safety, they will find it. But you only have so many resources and each train line you shut down has a cost. Shutting down all the train lines is far too expensive, so what is the cheapest set of train lines to shut down to guarantee success?
These two problems happen to be completely equivalent. The water source is the villain's starting city. The water sink is the villain's destination. The size of a pipe is equivalent to the cost of shutting down a train line. Once you map the one problem to the other, the total flow of water you can get through the water system is equal to the cost of trapping the villain.
In order to keep the villain away from safety at minimal cost, you have to cut off three routes whose cost adds up to $7 million. This is not a coincidence. When the flow is maximized, the water goes through pipes which collectively form a bottleneck. And those pipes, when removed, partition the graph at minimum cost.
We can turn the binarization problem into an instance of the mincut problem by constructing a graph from the pixels of a picture. Then when we run the mincut algorithm, some of the pixels will end up on the side of the source/villain. The rest will be on the side of the customer/escape. We set one group to be white and the other to be black.
Our main graph will form a lattice where every pixel has an edge to the four adjacent pixels. If the two pixels are in different groups, then the edge will have to be cut and this imposes a penalty. The bigger the weight, the more penalty is assessed when the two are in different groups and so the more likely they are to be in the same group.
Then we connect every pixel to both the source and sink, but vary the weights. Here is a simple way to do it. The brighter the pixel is, the more weight we assign to its link to the source and the less weight we assign its link to the sink. So bright pixels are biased toward one side and dark pixels towards the other.
The combination of these two kinds of weights helps ensure that contiguous regions of darker pixels are all tagged the same and helps reduce the chance of random speckles. If you want to see this method in action, take a look here.
Howe's binarization algorithm builds on this basic system. It builds a graph like I have described here and runs mincut on it. The tricky bit is the methods it uses for determining the weights between adjacent pixels and between the pixels and the source/sink.