I made it self-contained: Live On Coliru
Now, when you want to understand the X3 grammar - beyond mentally debugging - you can enable
#define BOOST_SPIRIT_X3_DEBUG
This debugs rules. Consider adding some debug-only rules for more detailed info:
auto dbg(auto name, auto p) { return x3::rule<struct _>{name} = p; };
auto name = dbg("name", x3::alpha >> *x3::alnum); // to be improved later
auto length = dbg("length", ':' >> x3::double_);
auto leaf = dbg("leaf", -name);
auto internal = dbg("internal", '(' >> (branch % ',') >> ')' >> -name);
auto subtree = dbg("subtree", leaf | internal);
auto tree = dbg("tree", subtree >> ';');
Now the output will be e.g.: Live
<tree>
<try>;</try>
<subtree>
<try>;</try>
<leaf>
<try>;</try>
<name>
<try>;</try>
<fail/>
</name>
<success>;</success>
</leaf>
<success>;</success>
</subtree>
<success></success>
</tree>
";" -> true true
You can "trace" the rule invocations and results. Now, let's look at the first fail:
<tree>
<try>(,);</try>
<subtree>
<try>(,);</try>
<leaf>
<try>(,);</try>
<name>
<try>(,);</try>
<fail/>
</name>
<success>(,);</success>
</leaf>
<success>(,);</success>
</subtree>
<fail/>
</tree>
"(,);" -> false false
You can see it tries subtree, which tries leaf, which succeeds because leaf is optional by definition:
auto leaf = -name;
A parser shaped -p will always succeed. So in a|b when a = -p, the alternative b will never be invoked. Either make name less optional, or reorder your branches, so an internal will get a chance before deciding an empty leaf was matched:
auto subtree = internal | leaf;
Now we get:
void quetzal::newick::test::tree()
";" -> true true
"(,);" -> true true
"(,,(,));" -> true true
"(A,B,(C,D));" -> true true
"(A,B,(C,D)E)F;" -> true true
"(:0.1,:0.2,(:0.3,:0.4):0.5);" -> true true
"(:0.1,:0.2,(:0.3,:0.4):0.5):0.0;" -> false false
"(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);" -> true true
"(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;" -> true true
"((B:0.2,(C:0.3,D:0.4)E:0.5)F:0.1)A;" -> true true
Looking at the one remaining failing parse:
<tree>
<try>(:0.1,:0.2,(:0.3,:0.</try>
<subtree>
<try>(:0.1,:0.2,(:0.3,:0.</try>
<internal>
<try>(:0.1,:0.2,(:0.3,:0.</try>
<branch>
<try>:0.1,:0.2,(:0.3,:0.4</try>
<subtree>
<try>:0.1,:0.2,(:0.3,:0.4</try>
<internal>
<try>:0.1,:0.2,(:0.3,:0.4</try>
<fail/>
</internal>
<leaf>
<try>:0.1,:0.2,(:0.3,:0.4</try>
<name>
<try>:0.1,:0.2,(:0.3,:0.4</try>
<fail/>
</name>
<success>:0.1,:0.2,(:0.3,:0.4</success>
</leaf>
<success>:0.1,:0.2,(:0.3,:0.4</success>
</subtree>
<length>
<try>:0.1,:0.2,(:0.3,:0.4</try>
<success>,:0.2,(:0.3,:0.4):0.</success>
</length>
<success>,:0.2,(:0.3,:0.4):0.</success>
</branch>
<branch>
<try>:0.2,(:0.3,:0.4):0.5</try>
<subtree>
<try>:0.2,(:0.3,:0.4):0.5</try>
<internal>
<try>:0.2,(:0.3,:0.4):0.5</try>
<fail/>
</internal>
<leaf>
<try>:0.2,(:0.3,:0.4):0.5</try>
<name>
<try>:0.2,(:0.3,:0.4):0.5</try>
<fail/>
</name>
<success>:0.2,(:0.3,:0.4):0.5</success>
</leaf>
<success>:0.2,(:0.3,:0.4):0.5</success>
</subtree>
<length>
<try>:0.2,(:0.3,:0.4):0.5</try>
<success>,(:0.3,:0.4):0.5):0.</success>
</length>
<success>,(:0.3,:0.4):0.5):0.</success>
</branch>
<branch>
<try>(:0.3,:0.4):0.5):0.0</try>
<subtree>
<try>(:0.3,:0.4):0.5):0.0</try>
<internal>
<try>(:0.3,:0.4):0.5):0.0</try>
<branch>
<try>:0.3,:0.4):0.5):0.0;</try>
<subtree>
<try>:0.3,:0.4):0.5):0.0;</try>
<internal>
<try>:0.3,:0.4):0.5):0.0;</try>
<fail/>
</internal>
<leaf>
<try>:0.3,:0.4):0.5):0.0;</try>
<name>
<try>:0.3,:0.4):0.5):0.0;</try>
<fail/>
</name>
<success>:0.3,:0.4):0.5):0.0;</success>
</leaf>
<success>:0.3,:0.4):0.5):0.0;</success>
</subtree>
<length>
<try>:0.3,:0.4):0.5):0.0;</try>
<success>,:0.4):0.5):0.0;</success>
</length>
<success>,:0.4):0.5):0.0;</success>
</branch>
<branch>
<try>:0.4):0.5):0.0;</try>
<subtree>
<try>:0.4):0.5):0.0;</try>
<internal>
<try>:0.4):0.5):0.0;</try>
<fail/>
</internal>
<leaf>
<try>:0.4):0.5):0.0;</try>
<name>
<try>:0.4):0.5):0.0;</try>
<fail/>
</name>
<success>:0.4):0.5):0.0;</success>
</leaf>
<success>:0.4):0.5):0.0;</success>
</subtree>
<length>
<try>:0.4):0.5):0.0;</try>
<success>):0.5):0.0;</success>
</length>
<success>):0.5):0.0;</success>
</branch>
<name>
<try>:0.5):0.0;</try>
<fail/>
</name>
<success>:0.5):0.0;</success>
</internal>
<success>:0.5):0.0;</success>
</subtree>
<length>
<try>:0.5):0.0;</try>
<success>):0.0;</success>
</length>
<success>):0.0;</success>
</branch>
<name>
<try>:0.0;</try>
<fail/>
</name>
<success>:0.0;</success>
</internal>
<success>:0.0;</success>
</subtree>
<fail/>
</tree>
"(:0.1,:0.2,(:0.3,:0.4):0.5):0.0;" -> false false
Looking at the end clearly indicates the problem is that the length (":0.0") is encountered outside the last parentheses, where it is not expected. Perhaps you forgot that you were using tree as the rule, not branch? Anyways, you can probably take it from here.
Side Notes
You are using a skipper which will probably make your life unless you make some rules lexeme (like name). I'd also suggest coding the skipper into your grammar:
auto tree = x3::skip(x3::space) [ subtree >> ';' ];
Note that space includes newlines, so maybe you really want blank instead. Finally, you can incorporate the f == l iterator check into the grammar by appending >> eoi:
auto tree = x3::skip(x3::space) [ subtree >> ';' >> x3::eoi ];
Full Listing
Also addressing the side notes and removing the debug/expositional stuff:
Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
namespace x3 = boost::spirit::x3;
namespace quetzal::newick::parser {
x3::rule<struct branch> branch{"branch"};
auto name = x3::lexeme[x3::alpha >> *x3::alnum]; // to be improved later
auto length = ':' >> x3::double_;
auto leaf = -name;
auto internal = '(' >> (branch % ',') >> ')' >> -name;
auto subtree = internal | leaf;
auto tree = x3::skip(x3::blank)[subtree >> ';' >> x3::eoi];
auto branch_def = subtree >> -length;
BOOST_SPIRIT_DEFINE(branch)
} // namespace quetzal::newick::parser
namespace quetzal::newick::test {
void run_tests(auto name, auto p, std::initializer_list<char const*> cases) {
std::cerr << "============ running " << name << " tests:\n";
for (std::string const input : cases)
std::cout << quoted(input) << " \t-> " << std::boolalpha
<< parse(begin(input), end(input), p) << std::endl;
}
void internal() {
run_tests("internal", quetzal::newick::parser::internal,
{
"(,)",
"(A,B)F",
"(A:10,B:10)F",
});
}
void tree() {
run_tests("tree", quetzal::newick::parser::tree,
{
";",
"(,);",
"(,,(,));",
"(A,B,(C,D));",
"(A,B,(C,D)E)F;",
"(:0.1,:0.2,(:0.3,:0.4):0.5);",
"(:0.1,:0.2,(:0.3,:0.4):0.5):0.0;",
"(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);",
"(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;",
"((B:0.2,(C:0.3,D:0.4)E:0.5)F:0.1)A;",
});
}
} // namespace quetzal::newick::test
int main() {
using namespace quetzal::newick::test;
internal();
tree();
}
Prints
============ running internal tests:
"(,)" -> true
"(A,B)F" -> true
"(A:10,B:10)F" -> true
============ running tree tests:
";" -> true
"(,);" -> true
"(,,(,));" -> true
"(A,B,(C,D));" -> true
"(A,B,(C,D)E)F;" -> true
"(:0.1,:0.2,(:0.3,:0.4):0.5);" -> true
"(:0.1,:0.2,(:0.3,:0.4):0.5):0.0;" -> false
"(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);" -> true
"(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;" -> true
"((B:0.2,(C:0.3,D:0.4)E:0.5)F:0.1)A;" -> true