AlgorithmicCompetition.AIAPCEnvType
AIAPCEnv(p::AIAPCHyperParameters)

Build an environment to reproduce the results of Calvano, Calzolari, Denicolò & Pastorello (2020, American Economic Review):

Calvano, E., Calzolari, G., Denicolò, V., & Pastorello, S. (2020). Artificial Intelligence, Algorithmic Pricing, and Collusion. American Economic Review, 110(10), 3267–3297. https://doi.org/10.1257/aer.20190623
AlgorithmicCompetition.AIAPCHyperParametersType
AIAPCHyperParameters(
    α::Float64,
    β::Float64,
    δ::Float64,
    max_iter::Int,
    competition_solution_dict::Dict{Symbol,CompetitionSolution};
    convergence_threshold::Int = Int(1e5),
)

Hyperparameters which define a specific AIAPC environment.

AlgorithmicCompetition.AIAPCSummaryType
AIAPCSummary(α, β, is_converged, convergence_profit, iterations_until_convergence)

A struct to store the summary of an AIAPC experiment.

AlgorithmicCompetition.ConvergenceCheckType
ConvergenceCheck(convergence_threshold::Int64)

Hook that checks convergence, defined as the best response for each state remaining stable for a given number of iterations.
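
A minimal sketch of this rule in plain Julia, independent of the package's hook machinery: track how many consecutive iterations the per-state best responses (the argmax of each Q-matrix column) have gone unchanged, and compare against the threshold.

```julia
# Sketch of the convergence rule: count consecutive iterations in which the
# best response (argmax of each Q-matrix column) is unchanged for every state.
function stable_iterations(Q::Matrix{Float64}, n_iter::Int)
    best = [argmax(view(Q, :, s)) for s in axes(Q, 2)]
    stable = 0
    for _ in 1:n_iter
        # In a real run, Q would be updated here by the Q-learning step.
        newbest = [argmax(view(Q, :, s)) for s in axes(Q, 2)]
        stable = newbest == best ? stable + 1 : 0
        best = newbest
    end
    return stable
end

convergence_threshold = 5
Q = rand(3, 4)  # hypothetical Q-matrix: 3 price actions × 4 states
is_converged = stable_iterations(Q, 10) >= convergence_threshold
```

Because Q is never updated in this sketch, the best responses stay fixed and the counter reaches the threshold; in a live experiment the counter resets to zero whenever any state's best response changes.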

AlgorithmicCompetition.DDDCEnvType
DDDCEnv(p::DDDCHyperParameters)

Build an environment to reproduce the results of the Lewis (2023) extensions to AIAPC.

AlgorithmicCompetition.DDDCHyperParametersType
DDDCHyperParameters(
    α::Float64,
    β::Float64,
    δ::Float64,
    max_iter::Int,
    competition_solution_dict::Dict{Symbol,CompetitionSolution},
    data_demand_digital_params::DataDemandDigitalParams;
    convergence_threshold::Int = Int(1e5),
)

Hyperparameters which define a specific DDDC environment.

AlgorithmicCompetition.DDDCSummaryType
DDDCSummary(α, β, is_converged, data_demand_digital_params, convergence_profit, convergence_profit_demand_high, convergence_profit_demand_low, profit_gain, profit_gain_demand_high, profit_gain_demand_low, iterations_until_convergence, price_response_to_demand_signal_mse, percent_demand_high)

A struct to store the summary of a DDDC experiment.

AlgorithmicCompetition.AIAPCPolicyMethod
AIAPCPolicy(env::AIAPCEnv; mode = "baseline")

Create a policy for the AIAPC environment, with symmetric agents, using a tabular Q-learner. The mode determines how the Q-matrix is initialized.

AlgorithmicCompetition.AIAPCStopMethod
AIAPCStop(env::AIAPCEnv; stop_on_convergence = true)

Returns a stop condition that stops when the environment has converged for all players.

AlgorithmicCompetition.DDDCPolicyMethod
DDDCPolicy(env::DDDCEnv; mode = "baseline")

Create a policy for the DDDC environment, with symmetric agents, using a tabular Q-learner. The mode determines how the Q-matrix is initialized.

AlgorithmicCompetition.Q_i_0Method
Q_i_0(env::AIAPCEnv)

Calculate the Q-value for player i at time t=0, given the price chosen by player i and assuming random play over the price options of player -i.
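
The computation can be sketched as follows, using a hypothetical linear-demand profit function and discount factor as stand-ins for the paper's logit-demand profits: average the per-period profit over a uniformly random rival price, then convert to a present value via the geometric series 1 / (1 - δ).

```julia
# Hypothetical single-period profit function — a stand-in for the paper's
# logit-demand profits, used only to illustrate the computation.
profit(p_i, p_j) = (p_i - 1.0) * max(0.0, 2.0 - p_i + 0.5p_j)

price_options = [1.5, 1.8, 2.1]
δ = 0.95  # discount factor

# Expected per-period profit for each own price under uniform random rival
# play, discounted to a present value with 1 / (1 - δ).
Q_0 = [sum(profit(p_i, p_j) for p_j in price_options) / length(price_options) / (1 - δ)
       for p_i in price_options]
```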

AlgorithmicCompetition.Q_i_0Method
Q_i_0(env::DDDCEnv)

Calculate the Q-value for player i at time t=0, given the price chosen by player i and assuming random play over the price options of player -i, weighted by the demand state frequency.
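
A sketch of the demand-weighted variant, with hypothetical high/low-demand profit functions and an assumed state frequency (neither taken from the package): the expected profit is a frequency-weighted average over the two demand states before discounting.

```julia
# Hypothetical demand-dependent profit function; `a` is the demand intercept.
profit_demand(p_i, p_j, a) = (p_i - 1.0) * max(0.0, a - p_i + 0.5p_j)

price_options = [1.5, 1.8, 2.1]
δ = 0.95
freq_high = 0.5            # hypothetical frequency of the high-demand state
a_high, a_low = 2.0, 1.5   # hypothetical demand intercepts

# Per-period profit averaged over random rival play, weighted by demand-state frequency.
expected_profit(p_i) = sum(freq_high * profit_demand(p_i, p_j, a_high) +
                           (1 - freq_high) * profit_demand(p_i, p_j, a_low)
                           for p_j in price_options) / length(price_options)

Q_0 = [expected_profit(p_i) / (1 - δ) for p_i in price_options]
```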

AlgorithmicCompetition.construct_AIAPC_profit_arrayMethod
construct_AIAPC_profit_array(price_options, params, n_players)

Construct a 3-dimensional array which holds the profit for each player given a price pair. The first dimension is player 1's action, the second dimension is player 2's action, and the third dimension is the player index for their profit.
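
The indexing convention can be illustrated with a hypothetical profit function (the package derives profits from the competition solution parameters instead):

```julia
# Illustrative profit function, for the array layout only.
profit(p_own, p_other) = (p_own - 1.0) * max(0.0, 2.0 - p_own + 0.5p_other)

price_options = [1.5, 1.8, 2.1]
n_players = 2
n_prices = length(price_options)

# profit_array[i, j, k]: profit of player k when player 1 charges price i
# and player 2 charges price j.
profit_array = [k == 1 ? profit(price_options[i], price_options[j]) :
                         profit(price_options[j], price_options[i])
                for i in 1:n_prices, j in 1:n_prices, k in 1:n_players]
```

With symmetric firms, swapping the two price indices swaps the two players' profits, which the test below checks.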

AlgorithmicCompetition.construct_DDDC_profit_arrayMethod
construct_DDDC_profit_array(price_options, params, n_players)

Construct a 3-dimensional array which holds the profit for each player given a price pair. The first dimension is player 1's action, the second dimension is player 2's action, and the third dimension is the player index for their profit.

AlgorithmicCompetition.extract_sim_resultsMethod
extract_sim_results(exp_list::Vector{AIAPCSummary})

Extract the results of a simulation experiment from a list of AIAPCSummary objects and return them as a DataFrame.

AlgorithmicCompetition.get_convergence_profit_from_envMethod
get_convergence_profit_from_env(env::DDDCEnv, policy::MultiAgentPolicy)

Return the average profit of each agent after convergence, over the convergence state or states (in the case of a cycle), along with the average profit in the high- and low-demand states.

AlgorithmicCompetition.get_optimal_actionMethod
get_optimal_action(env::AIAPCEnv, policy::MultiAgentPolicy, last_observed_state)

Get the optimal action (best response) for each player, given the current policy and the last observed state.

AlgorithmicCompetition.run_aiapcMethod
run_aiapc(
    n_parameter_iterations = 1,
    max_iter = Int(1e9),
    convergence_threshold = Int(1e5),
    max_alpha = 0.25,
    max_beta = 2,
    sample_fraction = 1,
)

Run AIAPC, given a configuration for a set of experiments.

AlgorithmicCompetition.run_dddcMethod
run_dddc(
    n_parameter_iterations = 1,
    max_iter = Int(1e9),
    convergence_threshold = Int(1e5),
    n_grid_increments = 100,
)

Run DDDC, given a configuration for a set of experiments.

ReinforcementLearningBase.act!Method
RLBase.act!(env::AIAPCEnv, price_tuple::Tuple{Int64,Int64})

Act in the environment by setting the memory to the given price tuple and setting is_done to true.

ReinforcementLearningBase.act!Method
RLBase.act!(env::DDDCEnv, price_tuple::Tuple{Int64,Int64})

Act in the environment by setting the memory to the given price tuple and setting is_done to true.

ReinforcementLearningBase.rewardMethod
RLBase.reward(env::AIAPCEnv, p::Int)

Return the reward for the current state for the player with integer index p. If the episode is done, return the profit; otherwise return 0.

ReinforcementLearningBase.rewardMethod
RLBase.reward(env::AIAPCEnv, player::Player)

Return the reward for the current state for player. If the episode is done, return the profit, else return 0.

ReinforcementLearningBase.rewardMethod
RLBase.reward(env::AIAPCEnv)

Return the reward for the current state. If the episode is done, return the profits; otherwise return zero for both players.

ReinforcementLearningBase.stateMethod
RLBase.state(env::AIAPCEnv, player::Player)

Return the current state as an integer, mapped from the environment memory.
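
One way such a mapping can work (illustrative; the exact indexing scheme used by the package is an assumption here): treat the pair of last-period price indices as coordinates on the price grid and take their column-major linear index, which gives a unique integer per memory configuration.

```julia
n_prices = 15      # price options per player, as in the AIAPC price grid
memory = (3, 7)    # hypothetical last-period price indices (player 1, player 2)

# Column-major linear index over the n_prices × n_prices memory grid.
state = LinearIndices((n_prices, n_prices))[memory...]
```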

ReinforcementLearningBase.stateMethod
RLBase.state(env::DDDCEnv, player::Player)

Return the current state as an integer, mapped from the environment memory.