Although animation tools exist for a variety of languages, methods for using these tools to test specifications have not been adequately explored. Similarly, despite the close correspondence between specification testing and implementation testing, the two processes are often treated independently, and their relationship has received relatively little investigation. This paper presents the results of applying a framework and method for the systematic testing of specifications and their implementations, one that exploits this correspondence. The framework is evaluated on a sizable case study of the Global System for Mobile Communications (GSM) 11.11 Standard, which has been developed for use in a commercial application. The evaluation demonstrates that the framework is comparable in cost-effectiveness to the BZ-Testing-Tools framework and more cost-effective than manual testing. In a mutation analysis, the framework's tests detected more than 95% of non-equivalent specification and implementation mutants.