# Metrics

A number of predefined metrics for analyzing curricula and degree plans are described below. You may also define your own metrics for curricula and degree plans. Each of these data types has a `metrics`

dictionary where you may write these user-defined metrics.

## Curricular Metrics

`CurricularAnalytics`

— ModuleThe curriculum-based metrics in this toolbox are based upon the graph structure of a curriculum. Specifically, assume curriculum $c$ consists of $n$ courses $\{c_1, \ldots, c_n\}$, and that there are $m$ requisite (prerequisite or co-requsitie) relationships between these courses. A curriculum graph $G_c = (V,E)$ is formed by creating a vertex set $V = \{v_1, \ldots, v_n\}$ (i.e., one vertex for each course) along with an edge set $E = \{e_1, \ldots, e_m\}$, where a directed edge from vertex $v_i$ to $v_j$ is in $E$ if course $c_i$ is a requisite for course $c_j$.

### Blocking Factor

The blocking factor is an important curriculum-based metric because it measures the extent to which one course blocks the ability to take other courses in the curriculum. That is, a course with a high blocking factor acts as a gateway to many other courses in the curriculum. Students who are unable to pass the gateway course will be blocked from taking many other courses in the curriculum.

We define the blocking factor of a course $v_i$ as the number of courses in the graph that are reachable from $v_i$. As examples of the blocking factor metric, conisder the two four-course curricula, with courses $v_1, v_2, v_3$ and $v_4$, shown below. In part (a) of this figure, $v_1$ is a prerequisite for courses $v_2$ and $v_3$, and $v_2$ is a prerequisite for course $v_4$, while in part (b), courses $v_1$ and $v_2$ are prerequisites for course $v_3$, and $v_3$ is a prerequisite for course $v_4$. The blocking factor of each course are shown inside of the course vertices in this figure.

`CurricularAnalytics.blocking_factor`

— Function`blocking_factor(c::Curriculum, course::Int)`

The **blocking factor** associated with course $c_i$ in curriculum $c$ with curriculum graph $G_c = (V,E)$ is defined as:

\[b_c(v_i) = \sum_{v_j \in V} I(v_i,v_j)\]

where $I(v_i,v_j)$ is the indicator function, which is $1$ if $v_i \leadsto v_j$, and $0$ otherwise. Here $v_i \leadsto v_j$ denotes that a directed path from vertex $v_i$ to $v_j$ exists in $G_c$, i.e., there is a requisite pathway from course $c_i$ to $c_j$ in curriculum $c$.

`blocking_factor(c::Curriculum)`

The **blocking factor** associated with curriculum $c$ is defined as:

\[b(G_c) = \sum_{v_i \in V} b_c(v_i).\]

where $G_c = (V,E)$ is the curriculum graph associated with curriculum $c$.

### Delay Factor

Many curricula, particularly those in science, technology engineering and math (STEM) fields, contain a set of courses that must be completed in sequential order. The ability to successfully navigate these long pathways without delay is critical for student success and on-time graduation. If any course on the pathway is not completed on time, the student will then be delayed in completing the entire pathway by one term. The delay factor metric allows us to quanity this effect.

We define the delay factor of course vertex $v_i$ to be the length of the longest path that contains $v_i$. As an example of the delay factor metric, consider the same four-course curricula shown above. The delay factor of each course are shown inside of the course vertices in the figure below.

`CurricularAnalytics.delay_factor`

— Function`delay_factor(c::Curriculum, course::Int)`

The **delay factor** associated with course $c_k$ in curriculum $c$ with curriculum graph $G_c = (V,E)$ is the number of vertices in the longest path in $G_c$ that passes through $v_k$. If $\#(p)$ denotes the number of vertices in the directed path $p$ in $G_c$, then we can define the delay factor of course $c_k$ as:

\[d_c(v_k) = \max_{i,j,l,m}\left\{\#(v_i \overset{p_l}{\leadsto} v_k \overset{p_m}{\leadsto} v_j)\right\}\]

where $v_i \overset{p}{\leadsto} v_j$ denotes a directed path $p$ in $G_c$ from vertex $v_i$ to $v_j$.

`delay_factor(c::Curriculum)`

The **delay factor** associated with curriculum $c$ is defined as:

\[d(G_c) = \sum_{v_k \in V} d_c(v_k).\]

where $G_c = (V,E)$ is the curriculum graph associated with curriculum $c$.

### Centrality

A course can be thought of as central to a curriculum if it requires a number of foundational courses as prerequisites, and the course itself serves as a prerequisite to many additional discipline-specific courses in the curriculum. The centrality metric is meant to capture this notion.

We define the centrality of source and sink vertices to be 0. For all other course vertices, consider all of the long paths (i.e., unique paths from a source to a sink) containing course vertex $v_i$. The centrality of $v_i$ is given by the sum of these path lengths. As an example of the centrality metric, consider the same four-course curricula shown above. The centrality factor of each course are shown inside of the vertices in the figure below.

In the case of the curriculum in part (a), there is one long path of length three that includes course $v_2$, hence its centrality is 3, while in part (b), there are two long paths of length three that include course $v_2$, hence its centrality is 6.

`CurricularAnalytics.centrality`

— Function`centrality(c::Curriculum, course::Int)`

Consider a curriculum graph $G_c = (V,E)$, and a vertex $v_i \in V$. Furthermore, consider all paths between every pair of vertices $v_j, v_k \in V$ that satisfy the following conditions:

- $v_i, v_j, v_k$ are distinct, i.e., $v_i \neq v_j, v_i \neq v_k$ and $v_j \neq v_k$;
- there is a path from $v_j$ to $v_k$ that includes $v_i$, i.e., $v_j \leadsto v_i \leadsto v_k$;
- $v_j$ has in-degree zero, i.e., $v_j$ is a "source"; and
- $v_k$ has out-degree zero, i.e., $v_k$ is a "sink".

Let $P_{v_i} = \{p_1, p_2, \ldots\}$ denote the set of all directed paths that satisfy these conditions. Then the **centrality** of $v_i$ is defined as

\[q(v_i) = \sum_{l=1}^{\left| P_{v_i} \right|} \#(p_l).\]

where $\#(p)$ denotes the number of vertices in the directed path $p$ in $G_c$.

`centrality(c::Curriculum)`

Computes the total **centrality** associated with all of the courses in curriculum $c$, with curriculum graph $G_c = (V,E)$.

\[q(c) = \sum_{v \in V} q(v).\]

### Structural Complexity

The curricular complexity of a course is meant to capture the impact of curricular structure on student progression. Through experimentation, we have found that a simple linear combination of the delay and blocking factors provides a good measure for quantifying the structural complexity of a curriculum.

As an example of the structural complexity metric, consider the same four-course curricula shown above. The compleixty factor of each course, which is simply the sum of the course's delay and blocking factors, are shown inside of the course vertices in this figure.

`CurricularAnalytics.complexity`

— Function`complexity(c::Curriculum, course::Int)`

The **complexity** associated with course $c_i$ in curriculum $c$ with curriculum graph $G_c = (V,E)$ is defined as:

\[h_c(v_i) = d_c(v_i) + b_c(v_i)\]

i.e., as a linear combination of the course delay and blocking factors.

`complexity(c::Curriculum, course::Int)`

The **complexity** associated with curriculum $c$ with curriculum graph $G_c = (V,E)$ is defined as:

\[h(G_c) = \sum_{v \in V} \left(d_c(v) + b_c(v)\right).\]

For the example curricula considered above, the curriculum in part (a) has an overall complexity of 15, while the curriculum in part (b) has an overall complexity of 17. This indicates that the curriculum in part (b) will be slightly more difficult to complete than the one in part (a). In particular, notice that course $v_1$ in part (a) has the highest individual course complexity, but the combination of courses $v_1$ and $v_2$ in part (b), which both must be passed before a student can attempt course $v_3$ in that curriculum, has a higher combined complexity.

### Basic Metrics (Curriculum)

All of the predefined metrics for a given curriculum described above will be computed and stored in the curriculum's `metric`

dictionary by using the following function.

`CurricularAnalytics.basic_metrics`

— Method`basic_metrics(c::Curriculum)`

Compute the basic metrics associated with curriculum `c`

, and return an IO buffer containing these metrics. The basic metrics are also stored in the `metrics`

dictionary associated with the curriculum.

The basic metrics computed include:

- number of credit hours : The total number of credit hours in the curriculum.
- number of courses : The total courses in the curriculum.
- blocking factor : The blocking factor of the entire curriculum, and of each course in the curriculum.
- centrality : The centrality measure associated with the entire curriculum, and of each course in the curriculum.
- delay factor : The delay factor of the entire curriculum, and of each course in the curriculum.
- curricular complexity : The curricular complexity of the entire curriculum, and of each course in the curriculum.

Complete descriptions of these metrics are provided above.

```
julia> metrics = basic_metrics(curriculum)
julia> println(String(take!(metrics)))
julia> # The metrics are also stored in a dictonary that can be accessed as follows
julia> curriculum.metrics
```

## Degree Plan Metrics

The aforementioned curricular complexity metrics are independent of how a curriculum is layed out as a degree plan. That is, the curricular metrics will not change as different degree plans are created. Degree plan metrics are related to the manner in which courses in the curriculum are laid out across the terms in the degree plan. These metrics are used in the creation of optimal degree plans as described in Optimized Degree Plans.

### Basic Metrics (Degree Plans)

A set of basic statistics associated with the distribution of credit hours in a degree plan can be obtained by using:

`CurricularAnalytics.basic_metrics`

— Method`basic_metrics(plan)`

Compute the basic metrics associated with degree plan `plan`

, and return an IO buffer containing these metrics. The baseic metrics are primarily concerned with how credits hours are distributed across the terms in a plan. The basic metrics are also stored in the `metrics`

dictionary associated with the degree plan.

**Arguments**

Required:

`plan::DegreePlan`

: a valid degree plan (see Degree Plans).

The basic metrics computed include:

- number of terms : The total number of terms (semesters or quarters) in the degree plan, $m$.
- total credit hours : The total number of credit hours in the degree plan.
- max. credits in a term : The maximum number of credit hours in any one term in the degree plan.
- min. credits in a term : The minimum number of credit hours in any one term in the degree plan.
- max. credit term : The earliest term in the degree plan that has the maximum number of credit hours.
- min. credit term : The earliest term in the degree plan that has the minimum number of credit hours.
- avg. credits per term : The average number of credit hours per term in the degree plan, $\overline{ch}$.
- term credit hour std. dev. : The standard deviation of credit hours across all terms $\sigma$. If $ch_i$ denotes the number of credit hours in term $i$, then

\[\sigma = \sqrt{\sum_{i=1}^m {(ch_i - \overline{ch})^2 \over m}}\]

To view the basic degree plan metrics associated with degree plan `plan`

in the Julia console use:

```
julia> metrics = basic_metrics(plan)
julia> println(String(take!(metrics)))
julia> # The metrics are also stored in a dictonary that can be accessed as follows
julia> plan.metrics
```

### Requisite Distance

A degree plan metric that is based upon the separation of courses and their pre- and co-requisites in a degree plan is described next.

`CurricularAnalytics.requisite_distance`

— Function`requisite_distance(DegreePlan, course::Course)`

For a given degree plan `plan`

and target course `course`

, this function computes the total distance in the degree plan between `course`

and all of its requisite courses.

**Arguments**

Required:

`plan::DegreePlan`

: a valid degree plan (see Degree Plans).`course::Course`

: the target course.

The distance between a target course and one of its requisites is given by the number of terms that separate the target course from that particular requisite in the degree plan. To compute the requisite distance, we sum this distance over all requisites. That is, if write let $T_j^p$ denote the term in degree plan $p$ that course $c_j$ appears in, then for a degree plan with underlying curriculum graph $G_c = (V,E)$, the requisite distance for course $c_j$ in degree plan $p$, denoted $rd_{v_j}^p$, is:

\[rd_{v_j}^p = \sum_{\{i | (v_i, v_j) \in E\}} (T_j - T_i).\]

In general, it is desirable for a course and its requisites to appear as close together as possible in a degree plan. The requisite distance metric computed by this function is stored in the associated `Course`

data object.

`requisite_distance(plan::DegreePlan)`

For a given degree plan `plan`

, this function computes the total distance between all courses in the degree plan, and the requisites for those courses.

**Arguments**

Required:

`plan::DegreePlan`

: a valid degree plan (see Degree Plans).

The distance between a course a requisite is given by the number of terms that separate the course from its requisite in the degree plan. If $rd_{v_i}^p$ denotes the requisite distance between course $c_i$ and its requisites in degree plan $p$, then the total requisite distance for a degree plan, denoted $rd^p$, is given by:

\[rd^p = \sum_{v_i \in V} = rd_{v_i}^p\]

In general, it is desirable for a course and its requisites to appear as close together as possible in a degree plan. Thus, a degree plan that minimizes these distances is desirable. A optimization function that minimizes requisite distances across all courses in a degree plan is described in [Optimized Degree Plans]@ref. The requisite distance metric computed by this function will be stored in the associated `DegreePlan`

data object.