Semantic Mapping of Learning Assets

Award Winner: 
2010 Sloan-C Effective Practice Award
Author Information
Phil Ice
Jennifer Staley
Chad Patrizi
Justin Beals
Institution(s) or Organization(s) Where EP Occurred: 
American Public University System
Effective Practice Abstract/Summary
Abstract/Summary of Effective Practice: 




Ensuring that courses touch upon desired goals and objectives is a problem that has long plagued instructional design. Compounding the problem is increased pressure for accountability and demonstration of learning effectiveness at institutions of higher education. From the organizational perspective, the overwhelming amount of information associated with online learning, residing in silo structures, presents a significant challenge when codification of tacit knowledge is required for accreditation by external agencies. While keyword and metadata strategies have provided some benefit, these methods suffer from a lack of robustness. As such, they provide only incremental improvements to what remains a highly manual process.


A research and development team at American Public University System addressed these interrelated problems by employing an open source repository and semantic engine for analysis and alignment of content, materials, and learning activities across all courses within the School of Business. The result was a highly detailed, accurate mapping of the programs' knowledge base to established goals and objectives. As an added benefit, resources were disaggregated to a very granular level and sorted into taxonomies that could be referenced independent of the Business program and utilized for cross-curricular consumption.

Description of the Effective Practice
Description of the Effective Practice: 




From the organizational perspective, the overwhelming amount of information associated with online learning, residing in silo structures, presents a challenge in terms of the efficacy of instructional design workflows and codification of tacit knowledge for accreditation by external agencies. In Learning or Content Management System environments, content management frequently translates into a single-purpose allocation of content resources, with cataloging and meta-tagging being a haphazard affair. This can lead to potential duplication of content and significant time loss associated with asset retrieval for incorporation into new curricula, as well as less effective, poorly constructed, and misaligned learning objectives. Because content is created on the assumption that all contributors have knowledge of the underlying taxonomies or common vernacular the information is based upon, it is difficult for organizations to survey their content universe for existing objects that can be incorporated into emerging workflows.


By automating the meta-tagging and gap analysis process, semantic analysis allows one not only to intelligently survey existing learning objects in a specific curriculum area, but also to examine learning objects across as-yet-unrealized curricula. The ability to determine content interrelationships by mapping assets across the content universe enables object reuse to be facilitated effectively and efficiently in service of curricular goals and objectives. This process makes it possible to locate learning objects that fulfill course-level objectives and to align them across course-level objectives, programmatic outcomes, and industry standards. Improving instructional outcomes by ingesting the work products of content distillation and semantic analysis increases return on investment and time on task. It also becomes possible to provide detailed analysis reports that demonstrate curricular alignment to accrediting bodies and other stakeholders.


At American Public University System (APUS), the Instructional Design and Development (IDD) Team created an Instructional Design Process Model to design and develop curricula. The APUS ID Process Model is a continuous course development process allowing for collaboration among Subject Matter Experts (SMEs), Instructional Designers (IDs), graphic and media specialists, and content area experts both internal and external to the institution. Through the APUS Instructional Design Process Model, a derivative of the ADDIE model (Molenda, 2003), the foundational principles of the Community of Inquiry (CoI) framework (Arbaugh, Cleveland-Innes, Diaz, Garrison, Ice, Richardson & Swan, 2008) were applied to every step of the process: planning, pre-development, development, design, evaluation, and maintenance.


This balanced six-phase process model is an architecture that enables the team to focus on three important elements of courseware development. First, the team implements accepted best practices in online instructional design (ID) systems, drawing on pedagogy, andragogy, and heutagogy to structure overall courseware development. Next, the CoI framework is applied to the online learning environment in the courseware to support successful student outcomes. Finally, the team utilizes agile project management principles to allow for collaboration and communication, both internally and with subject matter experts, while still maintaining discipline, quality, and rigor throughout the project's lifecycle (Staley, Gibson & Ice, 2010).


Though highly efficient and effective in rapidly creating quality content across a network of over 130 geographically dispersed contributors, the APUS IDD is still confronted with three problems that consistently confound the field. First, content and learning activities are created on the premise that SMEs have mastery of their area and will ensure that goals and objectives are met through tacit embedding of area knowledge within course structures. However, subject matter experts are human and prone to expert blind spots, a phenomenon long noted in the traditional classroom (McKeachie, 1986). Second, metatagging is a time-intensive process that, even when executed with a high degree of accuracy, provides little more than keyword associations. Finally, because content is created on the assumption that all contributors have knowledge of the underlying taxonomies or common vernacular the information is based upon, it is difficult for organizations to survey their content universe for existing objects that can be incorporated into emerging workflows.


As with all institutions, APUS is also confronted with the issue of providing adequate data for accreditation by external entities. Within the accreditation process is the task of demonstrating that curricula fulfill both course-level and program-level goals and objectives. While the IDD process described above is designed to help expedite goal and objective alignment, the limitations previously described still constrain the robustness of execution.


To solve this problem, APUS investigated the feasibility of using semantic analysis to: (1) match program- and course-level goals and objectives to course content and activities; (2) create a gap analysis noting where additional resources should be applied to meet goals and objectives; and (3) create robust content repositories with granular associations between course components and over-arching ontologies that can be applied in a cross-curricular fashion.


Like contemporary object-oriented programming languages, semantic analysis relies on defining data in terms of classes with attributes and instances. The vision of semantically aware applications builds upon this concept by refining these ontologies through comparisons of associated metadata. Currently, there are two approaches to developing semantic applications: the bottom-up approach and the top-down approach. The bottom-up approach is problematic in that it assumes metadata will be added to each piece of content to include information about its context; tagging at the concept level, if you will. The top-down approach appears to have a far greater likelihood of success, as it focuses on developing natural language search capability that can make those same kinds of determinations without any special metadata (Johnson, Levine & Smith, 2009).


Interestingly, though the NMC / ELI 2009 Horizon Report indicates that semantically aware applications are not likely to become mainstream for four to five years, a few cutting-edge prototypes are currently being utilized. Even as these applications undergo refinement, the prototypes demonstrate the potential power of semantic applications for both formal and informal learning. The IDD team at APUS vetted several of these cutting-edge solutions, both open source and proprietary, for the purposes previously described. The Common Library solution, which is open source and available under an Apache 2 license on SourceForge, was ultimately selected.


Developed from the ground up to address specific needs in education, Common Library is the first standards-based content management system to enable true collaborative potential through the integration of content development and social networking. The Common Library latent semantic search engine is a unique and powerful aspect of the application. In the current 2.0 implementation of the system, the metadata and content of each learning object are compared against defined standards systems. This higher-order logic enables Common Library to dynamically suggest interconnections between content items and applicable state standards, providing immediate value for users in the K-12 educational market. This functionality also creates the potential for constructing dynamic relationships between state standard systems that evolve over time. Search and aggregation technology generates references that feed new, granularly addressable connections between content and curriculum structures as more is learned about a specific user's requirements. A PowerPoint detailing the semantic analysis process can be found at:


An instance of this solution was stood up for the APUS IDD team, and all course components for the APUS Business Program were federated. After disaggregation of the materials, a granular analysis was conducted using Common Library's underlying latent Dirichlet allocation (LDA) engine. A full mapping of program goals and objectives was created across the content universe. Where deficiencies were noted, remedial action was taken to provide additional resources.
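Common Library's production engine is not reproduced here, but the underlying idea of latent Dirichlet allocation can be sketched in miniature. The following is a toy collapsed Gibbs sampler, with entirely hypothetical "asset" token lists, showing how LDA groups granular assets by topic so that topic mixtures can later be matched against goals and objectives:

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics, n_iter=50, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for latent Dirichlet allocation.

    docs: list of token lists. Returns per-document topic counts (ndk)
    and per-topic word counts (nkw); topic mixtures are read off ndk.
    """
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})             # vocabulary size

    ndk = [[0] * n_topics for _ in docs]              # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                               # tokens per topic
    z = []                                            # assignment per token

    # Random initialization of topic assignments.
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)
            zd.append(k)
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
        z.append(zd)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                old = z[d][i]
                ndk[d][old] -= 1; nkw[old][w] -= 1; nk[old] -= 1
                # Conditional distribution p(z = t | rest).
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta)
                           / (nk[t] + V * beta) for t in range(n_topics)]
                r = rng.random() * sum(weights)
                new = n_topics - 1  # fallback for floating-point edge cases
                for t, wt in enumerate(weights):
                    r -= wt
                    if r <= 0:
                        new = t
                        break
                z[d][i] = new
                ndk[d][new] += 1; nkw[new][w] += 1; nk[new] += 1
    return ndk, nkw

# Hypothetical disaggregated assets from two curriculum themes.
docs = [
    "market demand price supply market".split(),
    "price market demand equilibrium".split(),
    "ledger audit balance ledger credit".split(),
    "audit balance credit ledger".split(),
]
ndk, nkw = lda_gibbs(docs, n_topics=2)
```

In the actual deployment the inputs were the 5,227 disaggregated assets, and the resulting topic structure fed the ontological ordering described below; this sketch only illustrates the mechanics.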

Supporting Information for this Effective Practice
Evidence of Effectiveness: 




Content and activities from 29 courses in the APUS Business Program were federated in Common Library. Disaggregation of content yielded 5,227 granular-level assets. Ontological ordering was conducted using Dirichlet analysis, and categorical structuring was implemented using an iterative, multi-pass approach. A total of 538 goals and objectives, from both the program and course levels, were input into the system. Using a natural language approach, these goals and objectives were specified as representative of over-arching ontological structures. From this point, information from the granular asset analysis was converged on the goals and objectives.


Of the 538 goals and objectives, 465 were matched to assets, with matches ranging from two to 38 assets per goal. With respect to identification of gaps, 73 goals and objectives were identified for which there were no corresponding assets. These areas were noted and recommendations were made to the School of Business. Subsequent content development, which was added to the existing repository, resulted in a second run in which all but three goals and objectives could clearly be mapped to corresponding course content and activities.
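Common Library's matching internals are not public, but the match-and-gap step described above can be illustrated with a deliberately simplified model: bag-of-words cosine similarity in place of the production semantic engine. All objective and asset texts below are hypothetical; the threshold is an illustrative tuning parameter, not a value from the study:

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Bag-of-words term-frequency vector for a text snippet."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def map_objectives(objectives, assets, threshold=0.25):
    """Match each objective to sufficiently similar assets;
    objectives with no match at all are reported as gaps."""
    asset_vecs = {name: vectorize(text) for name, text in assets.items()}
    matches, gaps = {}, []
    for obj in objectives:
        ov = vectorize(obj)
        hits = [name for name, av in asset_vecs.items()
                if cosine(ov, av) >= threshold]
        if hits:
            matches[obj] = hits
        else:
            gaps.append(obj)
    return matches, gaps

# Hypothetical course-level objectives and disaggregated assets.
objectives = [
    "explain supply and demand",
    "prepare an audit report",
    "apply derivative pricing models",
]
assets = {
    "lesson1": "supply and demand curves determine market price",
    "quiz3": "audit report standards and procedures",
}
matches, gaps = map_objectives(objectives, assets)
# The third objective has no covering asset, so it surfaces as a gap
# for remedial content development, mirroring the 73-gap finding above.
```

The production process replaced these raw term vectors with LDA-derived topic representations, which is what allows matches that share no literal vocabulary; the control flow of matching and gap reporting is the same.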


A brief PowerPoint demonstrates the actual Common Library interface available to the IDD team. Evidence illustrates that the ability to round-trip content and goals / objectives is a key benefit of this process. In other words, a content universe can be checked for assets that relate to goals and objectives, or individual assets can be viewed and their correlations to goals and objectives identified.


For program evaluation, both internal and by external accrediting agencies, this capability provides a means for rapid assessment of both compliance across programs and relevance of course assets. From an ID perspective, this methodology also allows for rapid identification of existing assets that can be repurposed for use in new courses or programs. This latter capability is especially useful in instances where cross-curricular content may already have been developed but goes unrecognized due to programmatic silo effects and the related lack of awareness on the part of subject matter experts.


As semantic analysis is an emerging technology, it is still considered somewhat suspect in the realm of higher education. To assess efficacy, random samples of Common Library output, covering both matching and gap analysis, were evaluated for human-to-Common Library agreement as well as human-to-human agreement. For the first test, interrater reliability was established at .90; for the second, at .93. Thus, the reliability of the Common Library analysis is not considered significantly lower than that of human analysis.

How does this practice relate to pillars?: 




Learning Effectiveness

Application of this technique allows for large-scale analysis of correlations between goals / objectives and associated course assets. In turn, the ability to identify areas of deficiency and construct learning pathways ensures that all desired goals / objectives are addressed in a timely and thorough fashion. While not yet implemented, the ability to aggregate student work products, as well as discussion threads, also exists. Using this approach, the power of semantic analysis can be extended to relate learning outcomes back to goals / objectives, providing a complete map of the learning cycle.



The ability to rapidly map assets to goals and objectives is significantly more efficient when using a semantic engine than when similar processes are implemented by human coders. For this implementation, 137 hours were required from the IDD Team, project management, and technical personnel. A corresponding test case revealed that mapping one course against goals and objectives required 64 hours. Expanded to the 29-course sequence that was reviewed, the manual mapping process would have translated into 1,856 hours. Thus, the application of semantic analysis resulted in a 92.7% reduction in human labor. Translated into monetary terms, the savings, including fringe benefits, would be approximately $80,000 (U.S.).
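The labor-savings arithmetic can be reproduced directly from the figures reported above; note that the exact quotient rounds to about 92.6%, in line with the reported reduction:

```python
# Hours as reported in this practice write-up.
semantic_hours = 137          # IDD team + project management + technical staff
manual_hours_per_course = 64  # test-case estimate for mapping one course
courses = 29

manual_hours = manual_hours_per_course * courses  # projected manual effort
reduction = 1 - semantic_hours / manual_hours     # fraction of labor avoided

print(manual_hours)               # 1856
print(round(reduction * 100, 1))  # 92.6
```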


Notably, standing up the instance of Common Library required 32 hours, which will not be required in future mapping initiatives. Given the demands of mapping for both internal and external purposes, and including multiple programs, it is easy to visualize how this technique could result in savings of over $200,000 per year while drastically streamlining institutional course review and development processes. The ROI of this latter aspect should be the subject of more comprehensive program evaluation in future implementations of this technique.

Equipment necessary to implement Effective Practice: 

Copy of Common Library - Available as open source under an Apache 2 license on SourceForge.

Dedicated server.

Estimate the probable costs associated with this practice: 

Though Common Library is an open source product, standing up an instance requires a moderate startup cost. For APUS, the hardware and personnel costs associated with this were approximately $30,000. Ingesting content, data management, data refinement, and iterative coding and model development required slightly over 100 hours. Notably, some of this process required highly skilled programmers, bringing the total cost for this phase to approximately $20,000. However, these costs were offset by the drastically greater human labor that a manual approach to mapping would have required. A discussion of ROI is presented under the Scale section of this practice.

References, supporting documents: 




Arbaugh, J. B., Cleveland-Innes, M., Diaz, S. R., Garrison, D. R., Ice, P., Richardson, J. C., & Swan, K. P. (2008). Developing a community of inquiry instrument: Testing a measure of the Community of Inquiry framework using a multi-institutional sample. The Internet and Higher Education, 11(3-4), 133-136.

Ice, P., & Staley, J. (2010, March). Using semantic analysis for content alignment and gap analysis. Orlando, FL: eLearning Guild Learning Solutions Conference and Expo.


Johnson, L., Levine, A., & Smith, R. (2009). The 2009 Horizon Report. Austin, TX: The New Media Consortium.


McKeachie, W. (Ed.) (1986). Teaching and learning in the college classroom: A review of the research literature. Ann Arbor, MI: University of Michigan.

Molenda, M. (2003). In search of the elusive ADDIE model. Performance Improvement, 42(5), 34-36.


Staley, J., Gibson, A., & Ice, P. (2010, February). Impacts on student satisfaction through curricular design: Factors of the Community of Inquiry framework and instructional design practices. New Orleans, LA: Southwest Educational Research Association, 32nd Annual Conference.


Other Comments: 

Paper being prepared for submission to the Journal of Asynchronous Learning Networks.

Contact(s) for this Effective Practice
Effective Practice Contact: 
Phil Ice
Email this contact:
Effective Practice Contact 2: 
Jennifer Staley
Email contact 2:
Effective Practice Contact 3: 
Justin Beals
Email contact 3: