Radial Big Data Models

When the number of decision-making units (DMUs) is large, traditional DEA models are slow to solve. Khezrimotlagh, Zhu, Cook, and Toloo (2019) propose a framework that reduces computation time by identifying the best-practice DMUs in a subsample and then evaluating the remaining DMUs against those best performers.

The proposed framework includes five steps:

1. Select a subsample of DMUs.
2. Find the best practices in the subsample.
3. Find the DMUs that are exterior to the hull of the best practices.
4. Identify the set of all efficient DMUs.
5. Calculate performance scores as in the traditional DEA model.
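
The five steps can be sketched with the package's standard dea function and its Xref and Yref reference-set keywords. This is a minimal illustration, not the implementation behind deabigdata. The helper names scores and bigdata_sketch are hypothetical, the subsample is drawn uniformly at random rather than with the selection rule from the paper, and constant returns to scale are assumed so that every DMU can be scored against a reference set that excludes it.

```julia
using DataEnvelopmentAnalysis
using Random

# Hypothetical helper: radial input-oriented CRS scores of the DMUs in (X, Y),
# measured against the reference set (Xref, Yref).
scores(X, Y, Xref, Yref) =
    efficiency(dea(X, Y, orient = :Input, rts = :CRS, Xref = Xref, Yref = Yref))

# Hypothetical sketch of the five-step framework (simplified, see assumptions above).
function bigdata_sketch(X, Y; m = 50, atol = 1e-6, rng = Random.default_rng())
    n = size(X, 1)

    # Step 1: select a subsample (uniformly at random here, for simplicity).
    sub = randperm(rng, n)[1:min(m, n)]

    # Step 2: best practices are the subsample DMUs that score 1 within the subsample.
    theta_sub = scores(X[sub, :], Y[sub, :], X[sub, :], Y[sub, :])
    best = sub[theta_sub .>= 1 - atol]

    # Step 3: exterior DMUs score above 1 against the hull of the best practices.
    rest = setdiff(1:n, sub)
    exterior = Int[]
    if !isempty(rest)
        theta_rest = scores(X[rest, :], Y[rest, :], X[best, :], Y[best, :])
        exterior = rest[theta_rest .> 1 + atol]
    end

    # Step 4: the efficient DMUs of the full sample lie among best and exterior.
    cand = vcat(best, exterior)
    theta_cand = scores(X[cand, :], Y[cand, :], X[cand, :], Y[cand, :])
    efficient = cand[theta_cand .>= 1 - atol]

    # Step 5: score every DMU against the efficient set only.
    return scores(X, Y, X[efficient, :], Y[efficient, :])
end

# Small usage example with synthetic data in the interval (10, 20).
rng = MersenneTwister(1)
X = 10 .+ 10 .* rand(rng, 60, 2)
Y = 10 .+ 10 .* rand(rng, 60, 1)
sketch = bigdata_sketch(X, Y; m = 15, rng = rng)
println("Efficient DMUs found: ", count(>=(1 - 1e-6), sketch))
```

The saving comes from steps 3 and 5: most DMUs are scored against a small reference set instead of the full sample, so each linear program has far fewer variables.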

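This example computes the Big Data radial input-oriented DEA model using random data drawn from a uniform distribution. The call relies on the default arguments, which the function documentation lists as orient=:Input and rts=:CRS (constant returns to scale); pass rts=:VRS to obtain the variable returns to scale model. The data comprise 500 DMUs with six inputs and four outputs, each drawn from the interval (10, 20):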

# Generate random data
using DataEnvelopmentAnalysis
using Distributions
using Random
using StableRNGs

# Seed a stable generator and pass it to rand so the draws are reproducible
rng = StableRNG(1234567)
X = rand(rng, Uniform(10, 20), 500, 6);
Y = rand(rng, Uniform(10, 20), 500, 4);

# Calculate the Big Data DEA Model
deabig = deabigdata(X, Y)

# Get efficiency scores
efficiency(deabig)

500-element Vector{Float64}:
 0.9443166747955681
 0.9543433550948335
 0.9759522991840279
 0.9790697646919239
 0.9492405455403267
 0.8434690825376188
 0.9038735725245473
 0.7464778255026098
 0.8246436759251636
 0.7863832229341852
 ⋮
 0.9018335373646374
 0.9425238373580438
 1.0
 0.937474824652036
 0.5369262942592442
 0.9599514759375228
 1.0
 0.694898164000674
 0.920645990787349
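
Since the last step scores DMUs as in the traditional model, the results of deabigdata should agree with those of the package's standard dea function up to solver tolerance. The following sketch checks this on a smaller synthetic sample and times both calls. The sample size, the seed, and the variable names are illustrative assumptions, and the timings depend on the machine and solver, so no particular speed-up is implied.

```julia
using DataEnvelopmentAnalysis
using Random

rng = MersenneTwister(42)
X = 10 .+ 10 .* rand(rng, 300, 3)   # 300 DMUs, three inputs in (10, 20)
Y = 10 .+ 10 .* rand(rng, 300, 2)   # two outputs in (10, 20)

# Warm-up calls so the timings below exclude compilation.
deabigdata(X[1:20, :], Y[1:20, :], orient = :Input, rts = :CRS)
dea(X[1:20, :], Y[1:20, :], orient = :Input, rts = :CRS)

# Time the big data model and the traditional model on the same data.
t_big = @elapsed big = deabigdata(X, Y, orient = :Input, rts = :CRS)
t_ref = @elapsed ref = dea(X, Y, orient = :Input, rts = :CRS)

# Largest absolute difference between the two sets of efficiency scores.
maxdiff = maximum(abs.(efficiency(big) .- efficiency(ref)))
println("deabigdata: ", t_big, " s; dea: ", t_ref, " s; max score gap: ", maxdiff)
```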

deabigdata Function Documentation

DataEnvelopmentAnalysis.deabigdata — Function

deabigdata(X, Y)

Compute the big data radial model using data envelopment analysis for inputs X and outputs Y.

Optional Arguments

orient=:Input: selects the input-oriented radial model. For the output-oriented radial model, choose :Output.
rts=:CRS: selects constant returns to scale. For variable returns to scale, choose :VRS.
atol=1e-6: tolerance for a DMU to be considered efficient.
names: a vector of strings with the names of the decision-making units.
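
As an illustration of the optional arguments, the sketch below requests the output-oriented model under variable returns to scale and labels the DMUs. The data values and the DMU labels are made up for the example; only the keyword names come from the documentation above.

```julia
using DataEnvelopmentAnalysis

# Illustrative data: 11 DMUs, two inputs, one output.
X = Float64[5 13; 16 12; 16 26; 17 15; 18 14; 23 6; 25 10; 27 22; 37 14; 42 25; 5 17]
Y = reshape(Float64[12, 14, 25, 26, 8, 9, 27, 30, 31, 26, 12], :, 1)

# Hypothetical labels, one per DMU.
labels = ["DMU$i" for i in 1:size(X, 1)]

# Output-oriented model under variable returns to scale, with a tighter
# tolerance for classifying a DMU as efficient.
model = deabigdata(X, Y, orient = :Output, rts = :VRS, atol = 1e-8, names = labels)

result = efficiency(model)
println(result)
```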