Abstract: Recent work describes methods for systematic, data-driven improvement of instructional content and calls for diverse teams of learning engineers to implement and evaluate such improvements. Focusing on an approach called "design-loop adaptivity," we consider the problem of how developers might use data to target or prioritize particular instructional content for improvement when faced with large content portfolios and limited engineering resources. To do so, we examine two data-driven metrics that may capture different facets of how instructional content is "working." The first measures the extent to which learners struggle to master target skills; the second is based on the difference in prediction performance between deep learning and more "traditional" approaches to knowledge tracing. This second metric may point learning engineers to workspaces that are, effectively, "too easy." We illustrate the diversity of learning content and the variability in learner performance often represented in large educational datasets, and we suggest that "monolithic" treatment of such datasets in prediction tasks and other research endeavors may miss important opportunities to drive improved learning within target systems.
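The second metric above can be illustrated with a minimal sketch: compare the predictive performance (here, AUC) of a deep knowledge-tracing model against a traditional baseline on the same workspace, and inspect the gap. The helper names (`auc`, `auc_gap`), the toy data, and the interpretation comments are illustrative assumptions, not the paper's actual implementation or models.

```python
def auc(y_true, scores):
    """Rank-based AUC: probability a positive example outscores a negative one.

    Pure-Python stand-in for a library metric, for illustration only.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_gap(y_true, p_deep, p_traditional):
    """AUC of the deep model minus AUC of the traditional model on one workspace.

    A gap near zero may suggest the richer model finds no extra signal there;
    a large gap may flag workspaces worth closer inspection. (Hypothetical
    reading of the metric, for illustration.)
    """
    return auc(y_true, p_deep) - auc(y_true, p_traditional)

# Toy per-workspace data (fabricated for the sketch):
y = [0, 1, 0, 1, 1, 0]                     # correct/incorrect outcomes
p_deep = [0.1, 0.9, 0.2, 0.8, 0.7, 0.3]    # separates outcomes well
p_trad = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]    # uninformative baseline

print(round(auc_gap(y, p_deep, p_trad), 2))  # → 0.5
```

In a real pipeline, one would compute this gap per workspace across the whole content portfolio and rank workspaces by it, alongside the struggle-based metric, to prioritize engineering effort.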