Applying graph minors and tangles to image and cluster analysis
We show how an image can, in principle, be described by the tangles of the graph of its pixels.
The tangle-tree theorem provides a nested set of separations that efficiently distinguish all the distinguishable tangles in a graph. This translates to a small data set from which the image can be reconstructed.
The tangle duality theorem says that a graph either has a certain order tangle or a tree-structure witnessing that this cannot exist. This tells us the maximum resolution at which the image contains meaningful information.
Tangles are a novel mathematical concept designed to identify highly connected regions in a graph. Diestel and Whittle, two of the world’s leading graph theorists, have generalized this idea to serve to identify highly coherent clusters in data sets. This brings the mathematical theory of graph minors, one of the deepest developments in discrete mathematics of the last 25 years, to bear on some of today’s most topical computational problems: cluster analysis and image recognition. Tangles are a novel, precise, mathematical way to deal with inherently imprecise data sets.
Tangles represent a shift of paradigm in the identification of highly connected regions in a graph: rather
than attempting to describe what exactly these regions consist of, in terms of vertices and edges as usual, tangles identify them by merely orienting all the low-order separations of the graph consistently: `towards’ the intended region
All concrete highly-connected substructures induce such orientations. But while such concrete structures may vary, or may not be known in detail, the orientations of separations they induce are all that matters for their mathematical treatment: the two most fundamental theorems in the area, the tangle-tree theorem (which says that the highly connected substructures can all be separated from each other in a nested, tree-like way) and the tangle duality theorem (which says that every structure not containing a highly connected region has a tree structure witnessing this) can be proved at this very abstract level of separation systems. This has been carried out over the last few years by Professor Diestel and his group.
The level of abstraction is such that tangles can be defined on any data set with a meaningful notion of cutting it in two – such as lines cutting through an image. The tangles precisely identify clusters, or regions, even when these are fuzzy.
The tangle-tree theorem organizes the data sets’s clusters into a structure from which its essence can be reconstructed.
The duality theorem assesses the maximum cluster coherence that the data supports.
This innovation is now at its first, theoretical, stage. The input of firstrate mathematics is substantial, the idea to use graph minor theory in this way is entirely new.
Developing this theoretical breakthrough into concrete algorithms remains a non-trivial task that will require the efforts of some highly skilled computer scientists. If successful, the benefits in time are likely to be enormous – which makes this an attractive undertaking not only commercially, but also intellectually.
We therefore expect the CS community to take the idea up in the coming years; some leading researchers have already expressed interest and enthusiasm.
Whoever owns the patent at the time this comes to fruition will benefit.
- Image recognition and compression.
- Cluster analysis. Data mining.
- Data quality assessment.
- Patent right transfer
- PCT PCT/EP2017/058954 anhängig
StichworteImage recognition/compression, Cluster analysis, Data mining, Data quality assessment