A Possible Approach to Generalize Dan Hoey's Syllables
The thread on this forum entitled "Non Trivial Identities" has had the most responses of any thread to date (39 responses so far). The thread contains a great deal of useful information, including a detailed analysis of nontrivial identities of length 12q. The thread may be found at http://cubezzz.homelinux.org/drupal/?q=node/view/114 (you have to login to see all the responses).
I tend to view the problem of analyzing identities more in terms of duplicate positions than in terms of identities, but you can of course easily transition from one view of the problem to the other. In any case, the problem of finding a formula for the number of moves at each distance from Start is closely tied to the problem of finding duplicate positions and/or nontrivial identities.
At this time
I don't have a new formula
to offer up concerning the number of moves at each distance
from Start. The best available formula remains
Dan Hoey's original formula from 1981:
http://www.math.rwthaachen.de/~Martin.Schoenert/CubeLovers/Dan_Hoey__The_Supergroup__Part_2__At_least_25_qtw,_and_why.html
But I did want to offer up some ideas about how such an improved formula
might be developed.
For an optimal sequence of moves, Dan defined a syllable as a subsequence of moves on parallel faces, F and B, R and L, or U and D. Implicit in Dan's formula is that any syllable must be followed by a syllable from an orthogonal face. For example, it is implicit that the syllable F cannot be followed by the syllable B because instead the sequence (FB) would be considered to be the syllable. Dan's original posting concerned the quarter turn metric, but the concept of syllables applies equally well to the face turn metric.
Typical enumerations of Cube space are breadth first enumerations, going from distance n to distance n+1 in an iterative fashion. Dan's formula works a little differently. The formula begins conventionally. If P[n] is the number of moves that are n moves from Start, then P[0] and P[1] are trivially calculated as P[0]=1 (the Start position itself) and P[1]=12*P[0]=12 (the 12 quarter turns). But P[2] in Dan's formula is not calculated from P[1]. Rather, P[2] is calculated from both P[0] and P[1]. So for example, rather than going from the position I to the position R to the position RL in two steps, Dan's formula requires that you go from the position I to the position (RL) all in one step.
Similar, Dan's formula requires you to go from the position I to the position F to the position FR in two steps because FR is not a syllable. Indeed, F can be followed by any of the 8 orthogonal moves: (F)(R), (F)(R'), (F)(L), (F)(L'), (F)(U), (F)(U'), (F)(D), and (F)(D'), and these 8 moves are accounted for in Dan's formula. But the formula does not allow F to be followed by B to yield (F)(B). Rather, the formula allows I (the Start position) to be followed by the syllable FB to yield (FB), going from a distance of 0 from Start to a distance of 2 from Start all in one go. Which is to say, it is legal under Dan's formula to have two successive parallel moves, but it is not legal under Dan's formula to have two successive parallel syllables. So we go from a distance of 1 to a distance of 2 in some cases, and from a distance of 0 to a distance of 2 in other cases. Further from Start, quarter turn syllables may consist of 1, 2, 3, or 4 quarter turns. Similarly, face turn syllables may consist of 1 or 2 face turns. So the calculation of P[5] in the quarter turn metric involves going directly from a distance of 1 to a distance of 5 in some cases, going directly from a distance of 2 to a distance of 5 in some cases, going directly from a distance of 3 to a distance of 5 in some cases, and going from a distance of 4 to a distance of 5 in all the other cases.
Dan's formula is making note of the fact that successive moves on parallel faces commute. For example, the syllable (FB) and the syllable (BF) are the same syllable. The formula is also making note of trivial identities. For example, FFFF=I, so that FF=(FF)'=(F'F'), and therefore (FF) and (F'F') are the same syllable.
My idea is that rather than thinking about syllables in terms of moves that commute or in terms of identities, we think about syllables in terms of duplicate positions  that is, in terms of different optimal sequences that yield the same position. For example, we make note of the fact that FB=BF and that FF=F'F' without regard to why. And we discover these duplicate positions via search rather than by any sort of analysis. The approach I'm suggesting is much less elegant than Dan's, but it might offer the possibility of improving the formula to yield tighter bounds further from Start.
In thinking about syllables in terms of duplicate positions, we maintain Dan's notion of writing optimal sequences in terms of syllables rather than in terms of moves. And we also maintain Dan's notion of combining together adjacent syllables when combining the syllables will yield a longer syllable. A decomposition of an optimal sequence into syllables will be said to consist of fully joined syllables when all the adjacent syllables that can be joined together have been joined together. The "joining" process itself may not be unique. But a sequence of joined syllables is not fully joined if any additional joining may be accomplished.
The tricky part of using duplicate positions to define syllables is the following. One the one hand, URL and ULR are duplicate positions, but we don't want to treat URL=ULR as a syllable. On the other hand. RRL and RLR are duplicate positions, and we do want to treat RRL=RLR as a syllable. So we need a syllable definition that somehow or other is able to make a distinction between the URL=ULR case and the RRL=RLR case.
The main idea is the following: The Start position is not a syllable. A position that has exactly one decomposition into shorter fully joined syllables is not a syllable. All other positions are syllables.
This approach still does not give us a better formula (at least not yet!). But it does provide a syllable definition that depends totally on duplicate positions, rather than on moves of parallel faces. It provides a syllable definition that matches Dan's definition for distances from Start of 1q through 5q. And for 6q from Start, it provides a syllable that matches the analysis of 12q identities in the "Non Trivial Identities" thread (the 6q syllables are the halfway positions of the 12q identities).
My proposed definition of a syllable is recursive. Let's see how it works for a few distances close to Start.
 n=0. The Start position is not a syllable.
 n=1. The quarter turns have no decompositions into shorter fully joined syllables, so they are syllables.

n=2. (These examples exhaust all the ways in which
optimal sequences of length 2q may be formed.)
 Sequences of the form RL and RL' have two decompositions into shorter fully joined syllables (e.g., RL=(R)(L)=(L)(R) and RL'=(R)(L')=(L')(R) ), so they are syllables.
 Sequences of the form RR have two decompositions into shorter fully joined syllables (e.g., RR=(R)(R)=(R')(R') ), so they are syllables.
 Sequences of the form RU and RU' have one decomposition into shorter fully joined syllables (e.g., RU=(R)(U) and RU'=(R)(U') ), so they are not syllables.

n=3. (These examples do not exhaust all the ways in which
optimal sequences of length 3q may be formed, but they are
illustrative nonetheless.)
 Sequences of the form FRU have one decomposition into shorter fully joined syllables (e.g., FRU=(F)(R)(U) ), so they are not syllables.
 Sequences of the form FRL=FLR have one decomposition into shorter fully joined syllables (e.g., FRL=FLR=(F)(RL) ), so they are not syllables.
 Sequences of the form RRL=RLR=LRR=R'R'L=R'LR'=LR'R' have numerous decompositions into shorter fully joined syllables (e.g., (R)(RL), (RR)(L), (RL)(R), etc. ), so they are syllables. It might appear "obvious" that sequences such as (R)(RL) can and should be combined together to form the longer syllable (RRL) without first checking to see if (RRL) can be decomposed into shorter fully joined syllables. That is, RRL clearly complies with the Hoey definition of a syllable. But remember that the more general definition of syllable is recursive, and that in the case of syllables of length 3q we are defining them in terms of syllables of length 1q and 2q. So we can't declare (RRL) to be a syllable until we have found at least two different ways to decompose it into fully joined syllables each of which is less than 3q in length.
Finally, let's look at syllables of length 6q just a little bit.
In the
"Non Trivial Identities" thread, Herbert Kociemba posted
32 nontrivial identities of length 12q that are unique up to
symmetry. The first one is
U U R U R' F' U U L' U' L F (12q*)
We may write this as xy'=UURUR'F'UUL'U'LF where x=UURUR'F' and
y'=UUL'U'LF, so that y=F'L'ULU'U'. Because
xy'=I, we have x=y which manifests as duplicate positions when
enumerating cube space. Written as fully joined syllables,
we have x=(UU)(R)(U)(R')(F') and y=(F')(L')(U)(L)(UU).
Hence, x=y has more than one decomposition into shorter
fully joined syllables, and therefore x=y is a syllable of
length 6q. We note in passing that
StartsWith(x)={U,U',F'} and EndsWith(x)={U,U',F'}.
I haven't verified all 32 of Herbert's nontrivial identities of length 12q, but I'm sure that for each one of them the halfway position corresponds to a syllable. Indeed, I would argue that if one of them didn't correspond to a syllable, then the 12q identity wasn't really nontrivial and could be trivially changed to a shorter identity.
For example, suppose instead of 12q identities we were enumerating 6q identities, and we claimed that URLR'L'U' were a nontrivial identity. The obvious counterargument is that all the moves in the subsequence RLR'L' commute and can therefore be rewritten as the trivial identity RLL'R'. The syllable counterargument as applied to the 6q identity would suggest writing the identity as (U)(RL)(R'L')(U') and would note that the syllables (RL) and (R'L') are inverses. The syllable counterargument as applied to duplicate positions would suggest rewriting the identity as xy' with x=(U)(RL) and y=(U)(LR) and would note that (RL) and (LR) are the same syllable so that x=y only has one decomposition into shorter fully joined syllables.
Here's a question for Herbert. When you were calculating your 32 nontrivial identities of length 12q unique up to symmetry, what was your criterion for verifying that all of them were nontrivial (i.e., that they couldn't be shortened)?