This is not a formal, technically precise description of how partial specialization works (see the cppreference page for that), but as an intuition you can think of the code snippet you posted in terms of "pattern matching":
// `C` is a class template that takes two types as template parameters.
template<class F, class Alloc> class C;
// I'm going to specialize `C` so that:
// * The first type will be a function signature type, where the return
// type is going to be matched by `T` and the argument types will be
// matched by `Args...`.
// * The second type will be a user-provided `Alloc` type.
template<class T, class... Args, class Alloc>
class C<T(Args...), Alloc> { /* ... */ };
Let's say I instantiate C like this:
#include <memory> // for std::allocator
using My_C = C<int(float, char), std::allocator<int>>;
My_C c;
Very roughly speaking, My_C will "match" the C<T(Args...), Alloc> partial template specialization as follows:
int ( float, char )    std::allocator<int>
^^^ ^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^
 T  (   Args...   )           Alloc
^^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^
         F                    Alloc
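int is "matched" by T, and float, char are "matched" by Args.... Because Args... is deduced from the function signature instead of being listed explicitly, there's no reason why it should be restricted to be the last template parameter of the partial specialization.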
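If you want to check this "matching" yourself, one option is to have the specialization expose what it deduced and compare with static_assert. This is only a sketch: the member aliases return_type, argument_types and allocator_type are names made up for illustration and are not part of the snippet you posted.

```cpp
#include <memory>
#include <tuple>
#include <type_traits>

// Primary template: declared only, never defined.
template<class F, class Alloc> class C;

// Partial specialization for "function signature + allocator".
// The aliases below are hypothetical; they only expose what was deduced.
template<class T, class... Args, class Alloc>
class C<T(Args...), Alloc> {
public:
    using return_type    = T;                   // matched by the return type
    using argument_types = std::tuple<Args...>; // matched by the argument list
    using allocator_type = Alloc;               // matched by the second argument
};

using My_C = C<int(float, char), std::allocator<int>>;

// Each check compiles only if the deduction went as in the diagram above.
static_assert(std::is_same<My_C::return_type, int>::value,
              "T should be int");
static_assert(std::is_same<My_C::argument_types,
                           std::tuple<float, char>>::value,
              "Args... should be float, char");
static_assert(std::is_same<My_C::allocator_type,
                           std::allocator<int>>::value,
              "Alloc should be std::allocator<int>");
```

If any of the three assertions were wrong, the translation unit would fail to compile, so this works as a quick sanity check on the intuition.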
As you can see, the Args... parameter pack does not have to be the last template parameter of a partial specialization (unlike in a primary class template, where a pack must come last): we're just using it to "provide a name" for the argument type list in the passed function signature.
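To push the same idea a bit further, here is a small sketch where a partial specialization has two parameter packs, so at least one of them cannot be "last". It is accepted because both packs are deduced from the specialization's arguments. The names Sizes, first and second are invented for this example.

```cpp
#include <cstddef>
#include <tuple>

// Primary template: two plain type parameters, declared only.
template<class A, class B> struct Sizes;

// Partial specialization with two packs. `Xs...` is clearly not the last
// template parameter, yet both packs can be deduced from the two tuples.
template<class... Xs, class... Ys>
struct Sizes<std::tuple<Xs...>, std::tuple<Ys...>> {
    static constexpr std::size_t first  = sizeof...(Xs);
    static constexpr std::size_t second = sizeof...(Ys);
};

using S = Sizes<std::tuple<int, char>, std::tuple<float>>;

static_assert(S::first == 2,  "Xs... should be int, char");
static_assert(S::second == 1, "Ys... should be float");
```

Here int, char are "matched" by Xs... and float by Ys..., in the same way T and Args... were matched against the function signature.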