For surname studies I have a similar need to efficiently and effectively find and evaluate subtrees. Those subtrees are direct-line ancestor and descendant trees, optionally pruned to include only bearers of a given surname.
Currently I find subtrees "by hand", which is a chore, and the resulting set is very incomplete.
Once I find subtrees, I use Puzzilla graphs to get their approximate sizes, but drawing tree graphs wastes computing resources when all I want is statistics.
I need to find PIDs that do not have exactly one couple as parents (that is, no parents, only one parent, or more than one set of parents).
I especially need to find PIDs with a surname of interest that have 0 parents; those PIDs are the heads of subtrees, where research will lead to joining them to another subtree in my study.
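To show how small these two searches are as a computing task, here is a minimal Python sketch. It assumes the person records have already been exported into memory somehow; the `Person` record, its `parent_sets` field, and the sample PIDs are all made up for illustration and are not any real FamilySearch data format or API.

```python
from typing import NamedTuple, Optional, Tuple

class Person(NamedTuple):
    """Hypothetical in-memory record; not a real FamilySearch structure."""
    pid: str
    surname: str
    # One (father_pid, mother_pid) pair per recorded set of parents.
    # Either side of a pair may be None when that parent is missing.
    parent_sets: Tuple[Tuple[Optional[str], Optional[str]], ...]

def irregular_parents(people):
    """Return PIDs that do not have exactly one complete couple as parents."""
    found = []
    for person in people.values():
        one_couple = len(person.parent_sets) == 1 and all(person.parent_sets[0])
        if not one_couple:
            found.append(person.pid)
    return sorted(found)

def subtree_heads(people, surname):
    """Return PIDs bearing the surname that have no parents at all."""
    wanted = surname.casefold()
    return sorted(
        person.pid
        for person in people.values()
        if person.surname.casefold() == wanted and not person.parent_sets
    )

# Made-up sample data (PIDs are placeholders).
people = {
    "P1": Person("P1", "Smith", ()),                              # no parents: a head
    "P2": Person("P2", "Smith", (("P1", "M1"),)),                 # exactly one couple
    "P3": Person("P3", "Smith", (("P2", None),)),                 # father only
    "P4": Person("P4", "Smith", (("P1", "M1"), ("X1", "X2"))),    # two sets of parents
    "M1": Person("M1", "Jones", ()),                              # no parents, other surname
}

print(subtree_heads(people, "smith"))
print(irregular_parents(people))
```

With this sample, only P1 comes back as a Smith head, while P1, P3, P4, and M1 are all flagged as not having exactly one couple as parents.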
Various statistics exist for trees (aka rooted, connected, acyclic graphs). An ancestry tree that consists of a chain of son-to-father relationships (John Smith son of Jake Smith son of Jeff Smith son of ...) has a branching value of 1. I have several of these in my main surname study. An ancestry tree with 2 parents for each person has a branching value of 2. An ancestry tree with legacy disputes can have a branching value greater than 2.
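One way to make the branching value concrete is sketched below in Python. It rests on an assumption of mine: that the branching value is the average number of recorded parents per person, counting only persons who have at least one parent in the tree. The `parents_of` mapping and the placeholder IDs are hypothetical.

```python
def branching_value(parents_of):
    """Average number of distinct recorded parents per person.

    Persons with no recorded parents (the top of each line) are skipped,
    so a pure son-to-father chain gives 1.0 and a full pedigree gives 2.0.
    This definition is an assumption, not an established standard.
    """
    counts = [len(set(parents)) for parents in parents_of.values() if parents]
    if not counts:
        return 0.0
    return sum(counts) / len(counts)

# Son-to-father chain: John -> Jake -> Jeff.
chain = {"John": ["Jake"], "Jake": ["Jeff"], "Jeff": []}

# Two parents for every person who has parents recorded.
full = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}

# One person with three recorded parents (for example, a disputed father).
disputed = {"A": ["B", "C", "X"], "B": ["D", "E"]}

print(branching_value(chain))     # 1.0
print(branching_value(full))      # 2.0
print(branching_value(disputed))  # 2.5
```

A statistic like this needs only the parent links, so nothing has to be drawn to compute it.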
My concern is not "my" tree. My concern is that I need a more efficient and effective way to scoop up all those small fragment trees. The 1-, 2-, and 3-node trees are pretty easy because I can almost see from Find search results that they are that size. But the large family fragments give me fits, because in search results I see the tree listed once for each member and I have to pick out the head. That is an easy computing task but a strain on my brain.
With very rare surnames, generating a list of fragment trees by hand is tiring but not hard. With common surnames (which is where everyone needs the most help, right?) it becomes overwhelming.
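Picking out the heads could look something like the following Python sketch. It assumes the search results are available as a list of PIDs and that each person's surname-bearing parent is known; the `parent_of` mapping and the PIDs are placeholders I made up, not anything Find actually returns.

```python
from collections import defaultdict

def group_by_head(result_pids, parent_of):
    """Group search results into fragment trees keyed by each tree's head.

    parent_of maps a PID to its surname-bearing parent's PID (hypothetical
    input). A head is a result whose parent is missing or not in the results.
    """
    in_results = set(result_pids)
    fragments = defaultdict(list)
    for pid in result_pids:
        head = pid
        seen = {pid}  # guards against accidental loops in the data
        while True:
            parent = parent_of.get(head)
            if parent is None or parent not in in_results or parent in seen:
                break
            head = parent
            seen.add(head)
        fragments[head].append(pid)
    return dict(fragments)

# Made-up search results: one 3-person family and one 1-person fragment.
results = ["P3", "P1", "P2", "Q1"]
parent_of = {"P3": "P2", "P2": "P1"}

fragments = group_by_head(results, parent_of)
sizes = {head: len(members) for head, members in fragments.items()}
print(fragments)
print(sizes)
```

Here the four search results collapse to two heads, P1 with a fragment of size 3 and Q1 with a fragment of size 1, which is the list I currently build by eye.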
(Readers, if you agree, please upvote this discussion.)