AlgorithmicCompetition.AIAPCEnv
— TypeAIAPCEnv(p::AIAPCHyperParameters)
Build an environment to reproduce the results of the 2020 Calvano, Calzolari, Denicolò & Pastorello AER paper:
Calvano, E., Calzolari, G., Denicolò, V., & Pastorello, S. (2020). Artificial Intelligence, Algorithmic Pricing, and Collusion. American Economic Review, 110(10), 3267–3297. https://doi.org/10.1257/aer.20190623
AlgorithmicCompetition.AIAPCHyperParameters
— TypeAIAPCHyperParameters(
α::Float64,
β::Float64,
δ::Float64,
max_iter::Int,
competition_solution_dict::Dict{Symbol,CompetitionSolution};
convergence_threshold::Int = Int(1e5),
)
Hyperparameters which define a specific AIAPC environment.
AlgorithmicCompetition.AIAPCSummary
— TypeAIAPCSummary(α, β, is_converged, convergence_profit, iterations_until_convergence)
A struct to store the summary of an AIAPC experiment.
AlgorithmicCompetition.CompetitionSolution
— TypeCompetitionSolution(params::CompetitionParameters)
Solve the monopolist and Bertrand competition models for the given parameters and return the solution.
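A minimal end-to-end setup sketch follows. The CompetitionParameters argument order, the numeric calibration, and the :high/:low demand labels are illustrative assumptions rather than values taken from this reference.

using AlgorithmicCompetition

# Assumed calibration: constructor argument order and values are illustrative only.
competition_params_dict = Dict(
    :high => CompetitionParameters(0.25, 0.0, (2.0, 2.0), (1.0, 1.0)),
    :low => CompetitionParameters(0.25, -0.25, (2.0, 2.0), (1.0, 1.0)),
)

# Solve the Bertrand (Nash) and monopoly benchmarks for each demand state.
competition_solution_dict =
    Dict(d => CompetitionSolution(competition_params_dict[d]) for d in (:high, :low))

# Hyperparameters as documented above: α (learning rate), β (exploration decay),
# δ (discount factor), max_iter, then the solution dict; values are illustrative.
hyperparameters = AIAPCHyperParameters(
    0.1,
    0.4,
    0.95,
    Int(1e7),
    competition_solution_dict;
    convergence_threshold = Int(1e5),
)

env = AIAPCEnv(hyperparameters)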
AlgorithmicCompetition.ConvergenceCheck
— TypeConvergenceCheck(convergence_threshold::Int64)
Hook to check convergence, as defined by the best response for each state being stable for a given number of iterations.
AlgorithmicCompetition.DDDCEnv
— TypeDDDCEnv(p::DDDCHyperParameters)
Build an environment to reproduce the results of the Lewis 2023 extensions to AIAPC.
AlgorithmicCompetition.DDDCHyperParameters
— TypeDDDCHyperParameters(
α::Float64,
β::Float64,
δ::Float64,
max_iter::Int,
competition_solution_dict::Dict{Symbol,CompetitionSolution},
data_demand_digital_params::DataDemandDigitalParams;
convergence_threshold::Int = Int(1e5),
)
Hyperparameters which define a specific DDDC environment.
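Reusing competition_solution_dict from the sketch above, a DDDC environment can be set up analogously. The DataDemandDigitalParams keyword names below are assumptions made for illustration; consult the type's definition for the actual fields.

# Sketch only: all DataDemandDigitalParams keyword names are assumed.
data_demand_digital_params = DataDemandDigitalParams(
    weak_signal_quality_level = 0.99,      # assumed field name
    strong_signal_quality_level = 0.995,   # assumed field name
    signal_is_strong = [true, false],      # assumed field name
    frequency_high_demand = 0.9,           # assumed field name
)

dddc_hyperparameters = DDDCHyperParameters(
    0.1,
    0.4,
    0.95,
    Int(1e7),
    competition_solution_dict,
    data_demand_digital_params;
    convergence_threshold = Int(1e5),
)

dddc_env = DDDCEnv(dddc_hyperparameters)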
AlgorithmicCompetition.DDDCSummary
— TypeDDDCSummary(α, β, is_converged, data_demand_digital_params, convergence_profit, convergence_profit_demand_high, convergence_profit_demand_low, profit_gain, profit_gain_demand_high, profit_gain_demand_low, iterations_until_convergence, price_response_to_demand_signal_mse, percent_demand_high)
A struct to store the summary of a DDDC experiment.
AlgorithmicCompetition.AIAPCPolicy
— MethodAIAPCPolicy(env::AIAPCEnv; mode = "baseline")
Create a policy for the AIAPC environment, with symmetric agents, using a tabular Q-learner. The mode determines the initialization of the Q-matrix.
AlgorithmicCompetition.AIAPCStop
— MethodAIAPCStop(env::AIAPCEnv; stop_on_convergence = true)
Returns a stop condition that stops when the environment has converged for all players.
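A short sketch of wiring a policy and a stop condition to an environment built as in the first sketch; only the constructors documented here are used, and the surrounding ReinforcementLearning.jl run call is omitted.

policy = AIAPCPolicy(env; mode = "baseline")        # symmetric tabular Q-learners; mode sets the Q-matrix initialization
stop = AIAPCStop(env; stop_on_convergence = true)   # stops once every player has converged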
AlgorithmicCompetition.DDDCPolicy
— MethodDDDCPolicy(env::DDDCEnv; mode = "baseline")
Create a policy for the DDDC environment, with symmetric agents, using a tabular Q-learner. The mode determines the initialization of the Q-matrix.
AlgorithmicCompetition.InitMatrix
— MethodInitMatrix(env::AIAPCEnv, mode = "zero")
Initialize the Q-matrix for the AIAPC environment.
AlgorithmicCompetition.InitMatrix
— MethodInitMatrix(env::DDDCEnv, mode = "zero")
Initialize the Q-matrix for the DDDC environment.
AlgorithmicCompetition.Q_i_0
— MethodQ_i_0(env::AIAPCEnv)
Calculate the Q-value for player i at time t=0, given the price chosen by player i and assuming random play over the price options of player -i.
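A hedged sketch of the idea (not the package's internal code): with the opponent mixing uniformly over the price grid, the time-zero value of each own price is the expected one-period profit discounted into perpetuity.

using Statistics: mean

# profit_array holds π(p₁, p₂, player); δ is the discount factor. Both stand in
# for fields stored on the environment.
Q_0(profit_array, δ) = [mean(profit_array[p1, :, 1]) / (1 - δ) for p1 in axes(profit_array, 1)]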
AlgorithmicCompetition.Q_i_0
— MethodQ_i_0(env::DDDCEnv)
Calculate the Q-value for player i at time t=0, given the price chosen by player i and assuming random play over the price options of player -i, weighted by the demand state frequency.
AlgorithmicCompetition._best_action_lookup
— Method_best_action_lookup(state_, table)
Look up the best action for a given state in the Q-value matrix.
AlgorithmicCompetition.construct_AIAPC_profit_array
— Methodconstruct_AIAPC_profit_array(price_options, params, n_players)
Construct a 3-dimensional array which holds the profit for each player given a price pair. The first dimension is player 1's action, the second dimension is player 2's action, and the third dimension is the player index for their profit.
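For example, with 15 price options and 2 players the returned array has size (15, 15, 2). A hedged indexing sketch, where price_options and params stand for the environment's price grid and demand parameters from the setup above:

profit_array = construct_AIAPC_profit_array(price_options, params, 2)
profit_array[3, 5, 1]   # player 1's profit when player 1 posts price 3 and player 2 posts price 5
profit_array[3, 5, 2]   # player 2's profit at the same price pair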
AlgorithmicCompetition.construct_AIAPC_state_space_lookup
— Methodconstruct_AIAPC_state_space_lookup(action_space, n_prices)
Construct a lookup table from action space to the state space.
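The lookup table inverts a price pair into the scalar state index used by the tabular Q-learners; a usage sketch with assumed variable names:

state_space_lookup = construct_AIAPC_state_space_lookup(action_space, n_prices)
state = state_space_lookup[price_index_1, price_index_2]   # integer state for this one-period memory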
AlgorithmicCompetition.construct_DDDC_profit_array
— Methodconstruct_DDDC_profit_array(price_options, params, n_players)
Construct a 3-dimensional array which holds the profit for each player given a price pair. The first dimension is player 1's action, the second dimension is player 2's action, and the third dimension is the player index for their profit.
AlgorithmicCompetition.construct_DDDC_state_space_lookup
— Methodconstruct_DDDC_state_space_lookup(action_space, n_prices)
Construct a lookup table from action space to the state space.
AlgorithmicCompetition.extract_profit_vars
— Methodextract_profit_vars(env::AIAPCEnv)
Returns the Nash equilibrium and monopoly optimal profits, based on prices stored in env.
AlgorithmicCompetition.extract_profit_vars
— Methodextract_profit_vars(env::DDDCEnv)
Returns the Nash equilibrium and monopoly optimal profits, based on prices stored in env.
AlgorithmicCompetition.extract_quantity_vars
— Methodextract_quantity_vars(env::DDDCEnv)
Returns the Nash equilibrium and monopoly optimal quantities, based on prices stored in env.
AlgorithmicCompetition.extract_sim_results
— Methodextract_sim_results(exp_list::Vector{AIAPCSummary})
Extracts the results of a simulation experiment, given a list of AIAPCSummary objects, and returns a DataFrame.
AlgorithmicCompetition.extract_sim_results
— Methodextract_sim_results(exp_list::Vector{DDDCSummary})
Extracts the results of a simulation experiment, given a list of DDDCSummary objects, and returns a DataFrame.
AlgorithmicCompetition.get_convergence_profit_from_env
— Methodget_convergence_profit_from_env(env::AIAPCEnv, policy::MultiAgentPolicy)
Returns the average profit of the agent, after convergence, over the convergence state or states (in the case of a cycle).
AlgorithmicCompetition.get_convergence_profit_from_hook
— Methodget_convergence_profit_from_env(env::DDDCEnv, policy::MultiAgentPolicy)
Returns the average profit of the agent, after convergence, over the convergence state or states (in the case of a cycle). Also returns the average profit for the high and low demand states.
AlgorithmicCompetition.get_optimal_action
— Methodget_optimal_action(env::AIAPCEnv, policy::MultiAgentPolicy, last_observed_state)
Get the optimal action (best response) for each player, given the current policy and the last observed state.
AlgorithmicCompetition.get_prices_from_state
— Methodget_prices_from_state(env::AIAPCEnv, state)
Helper function. Returns the prices corresponding to the state passed.
AlgorithmicCompetition.get_profit_from_state
— Methodget_profit_from_state(env::AIAPCEnv, state)
Helper function. Returns the profit corresponding to the state passed.
AlgorithmicCompetition.get_state_from_memory
— Methodget_state_from_memory(env::AIAPCEnv)
Helper function. Returns the state corresponding to the current memory of the environment.
AlgorithmicCompetition.get_state_from_prices
— Methodget_state_from_prices(env::AIAPCEnv, memory)
Helper function. Returns the state corresponding to the memory vector passed.
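These helpers compose into a round trip between memory, state, and profit; a sketch using an env built as above:

state = get_state_from_memory(env)            # integer state implied by the current memory
prices = get_prices_from_state(env, state)    # the price pair encoded by that state
get_state_from_prices(env, prices)            # back to the integer state (assuming memory stores this same price representation)
profit = get_profit_from_state(env, state)    # per-player profit at that price pair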
AlgorithmicCompetition.profit_gain
— Methodprofit_gain(π_hat, env::AIAPCEnv)
Returns the profit gain of the agent based on the current policy.
AlgorithmicCompetition.profit_gain
— Methodprofit_gain(π_hat, env::DDDCEnv)
Returns the profit gain of the agent based on the current policy.
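The profit gain appears to follow the standard normalization from Calvano et al. (2020), Δ = (π̄ - π_N) / (π_M - π_N), where π_N and π_M are the Nash and monopoly profits. A hedged sketch; the tuple order returned by extract_profit_vars is an assumption:

π_N, π_M = extract_profit_vars(env)   # assumed (Nash profit, monopoly profit) order
π_hat = 0.25                          # illustrative observed average profit
Δ = (π_hat - π_N) / (π_M - π_N)       # ≈ 0 at Nash play, ≈ 1 at joint-monopoly play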
AlgorithmicCompetition.run_aiapc
— Methodrun_aiapc(
n_parameter_iterations=1,
max_iter=Int(1e9),
convergence_threshold=Int(1e5),
α_range=Float64.(range(0.0025, 0.25, 100)),
β_range=Float64.(range(0.02, 2, 100)),
version="v0.0.0",
start_timestamp=now(),
batch_size=1,
)
Run AIAPC, given a configuration for a set of experiments.
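The arguments shown in the signature appear to be keyword arguments with the defaults listed. A reduced-scale sketch; the coarser grids and shorter max_iter are illustrative choices, not recommended settings:

results = run_aiapc(
    n_parameter_iterations = 1,
    max_iter = Int(1e7),
    convergence_threshold = Int(1e5),
    α_range = Float64.(range(0.0025, 0.25, 10)),
    β_range = Float64.(range(0.02, 2, 10)),
    batch_size = 1,
)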
AlgorithmicCompetition.run_and_extract
— Methodrun_and_extract(hyperparameters::AIAPCHyperParameters; stop_on_convergence = true)
Runs the experiment and returns the economic summary.
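A sketch of the per-parameterization workflow, assuming hyperparameter_vect is a hypothetical vector of AIAPCHyperParameters built as in the setup sketch:

summaries = [run_and_extract(hp; stop_on_convergence = true) for hp in hyperparameter_vect]
df = extract_sim_results(summaries)   # one row per experiment, returned as a DataFrame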
AlgorithmicCompetition.run_and_extract
— Methodrun_and_extract(hyperparameters::DDDCHyperParameters; stop_on_convergence = true)
Runs the experiment and returns the economic summary.
AlgorithmicCompetition.run_dddc
— Methodrun_dddc(
n_parameter_iterations = 1,
max_iter = Int(1e9),
convergence_threshold = Int(1e5),
n_grid_increments = 100,
)
Run DDDC, given a configuration for a set of experiments.
ReinforcementLearningBase.act!
— MethodRLBase.act!(env::AIAPCEnv, price_tuple::Tuple{Int64,Int64})
Act in the environment by setting the memory to the given price tuple and setting is_done to true.
ReinforcementLearningBase.act!
— MethodRLBase.act!(env::DDDCEnv, price_tuple::Tuple{Int64,Int64})
Act in the environment by setting the memory to the given price tuple and setting is_done to true.
ReinforcementLearningBase.is_terminated
— MethodRLBase.is_terminated(env::AIAPCEnv)
Return whether the episode is done.
ReinforcementLearningBase.is_terminated
— MethodRLBase.is_terminated(env::DDDCEnv)
Return whether the episode is done.
ReinforcementLearningBase.reward
— MethodRLBase.reward(env::AIAPCEnv, p::Int)
Return the reward for the current state for player p, given as an integer index. If the episode is done, return the profit; otherwise return 0.
ReinforcementLearningBase.reward
— MethodRLBase.reward(env::AIAPCEnv, player::Player)
Return the reward for the current state for the given player. If the episode is done, return the profit; otherwise return 0.
ReinforcementLearningBase.reward
— MethodRLBase.reward(env::AIAPCEnv)
Return the reward for the current state. If the episode is done, return the profit for each player; otherwise return (0, 0).
ReinforcementLearningBase.state
— MethodRLBase.state(env::AIAPCEnv, player::Player)
Return the current state as an integer, mapped from the environment memory.
ReinforcementLearningBase.state
— MethodRLBase.state(env::DDDCEnv, player::Player)
Return the current state as an integer, mapped from the environment memory.
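Taken together, a single environment step looks roughly as follows; the price indices and the Player constructor argument are illustrative assumptions.

RLBase.act!(env, (3, 5))           # both players post a price index from the price grid
RLBase.is_terminated(env)          # true: a single pricing round ends the episode
RLBase.reward(env, 1)              # player 1's profit for the round
RLBase.state(env, Player(1))       # integer state derived from the updated memory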
ReinforcementLearningCore.check!
— MethodRLCore.check!(s::StopWhenConverged, agent, env)
Returns true if the environment has converged for all players.