Performance improvement solely through transistor scaling is becoming more and more difficult; thus, it is increasingly common to see domain-specific accelerators used in conjunction with general-purpose processors to achieve future performance goals. There is a serious drawback to accelerators, though: binary compatibility. An application compiled to utilize an accelerator cannot run on a processor without that accelerator, and applications that do not utilize an accelerator will never use it. To overcome this problem, we propose decoupling the instruction set architecture from the underlying accelerators. Computation to be accelerated is expressed using a processor's baseline instruction set, and lightweight dynamic translation maps that representation to whatever accelerators are available in the system. In this paper, we describe the changes to a compilation framework and processor system needed to support this abstraction for an important set of accelerator designs that support innermost loops. In this analysis, we investigate the dynamic overheads associated with the abstraction as well as the static/dynamic tradeoffs that improve the dynamic mapping of loop nests. As part of the exploration, we also provide a quantitative analysis of the hardware characteristics of effective loop accelerators. We conclude that using a hybrid static-dynamic compilation approach to map computation onto loop-level accelerators is a practical way to increase computation efficiency without the overheads associated with instruction set modification.