Caterpillar tree

In graph theory, a caterpillar or caterpillar tree is a tree in which all the vertices are within distance 1 of a central path.

Caterpillars were first studied in a series of papers by Harary and Schwenk. The name was suggested by Arthur Hobbs. As Harary & Schwenk (1973) colorfully write, "A caterpillar is a tree which metamorphoses into a path when its cocoon of endpoints is removed."

The following characterizations all describe the caterpillar trees:

A k-tree is a chordal graph with exactly n − k maximal cliques, each containing k + 1 vertices; in a k-tree that is not itself a (k + 1)-clique, each maximal clique either separates the graph into two or more components, or it contains a single leaf vertex, a vertex that belongs to only a single maximal clique. A k-path is a k-tree with at most two leaves, and a k-caterpillar is a k-tree that can be partitioned into a k-path and some k-leaves, each adjacent to a separator k-clique of the k-path. In this terminology, a 1-caterpillar is the same thing as a caterpillar tree, and k-caterpillars are the edge-maximal graphs with pathwidth k.

A lobster graph is a tree in which all the vertices are within distance 2 of a central path.

Caterpillars provide one of the rare graph enumeration problems for which a precise formula can be given: when n ≥ 3, the number of caterpillars with n unlabeled vertices is

For n = 1, 2, 3, ... the numbers of n-vertex caterpillars are

Finding a spanning caterpillar in a graph is NP-complete. A related optimization problem is the Minimum Spanning Caterpillar Problem (MSCP), where a graph has dual costs over its edges and the goal is to find a caterpillar tree that spans the input graph and has the smallest overall cost. Here the cost of the caterpillar is defined as the sum of the costs of its edges, where each edge takes one of the two costs based on its role as a leaf edge or an internal one. There is no f(n)-approximation algorithm for the MSCP unless P = NP. Here f(n) is any polynomial-time computable function of n, the number of vertices of a graph.

There is a parametrized algorithm that finds an optimal solution for the MSCP in bounded treewidth graphs. So both the Spanning Caterpillar Problem and the MSCP have linear time algorithms if a graph is an outerplanar, a series-parallel, or a Halin graph.

Caterpillar trees have been used in chemical graph theory to represent the structure of benzenoid hydrocarbon molecules. In this representation, one forms a caterpillar in which each edge corresponds to a 6-carbon ring in the molecular structure, and two edges are incident at a vertex whenever the corresponding rings belong to a sequence of rings connected end-to-end in the structure. El-Basil (1987) writes, "It is amazing that nearly all graphs that played an important role in what is now called "chemical graph theory" may be related to caterpillar trees." In this context, caterpillar trees are also known as benzenoid trees and Gutman trees, after the work of Ivan Gutman in this area.