Re: [agi] Pick the Bit and Competitive Computing Platform - Towards a New Benchmark for AGI System Performance

Matt Mahoney Mon, 09 Dec 2024 11:30:54 -0800

This is a multiplayer variant of the matching pennies game. The optimal
strategy is to pick randomly.
https://en.m.wikipedia.org/wiki/Matching_pennies
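
For concreteness, a minimal Python sketch of how one round resolves under
the rules quoted below. The name resolve_round, the dict layout, and the
handling of an exact tie (nobody loses health) are assumptions made for
illustration; none of this is taken from the proposal or the C2P code.

```python
from collections import Counter

def resolve_round(picks, health, loss):
    """Apply one round: agents that picked the majority bit lose `loss` health.

    picks  -- dict of agent name -> bit (0 or 1)
    health -- dict of agent name -> current health, updated in place
    Returns the majority bit, or None on an exact tie (assumed: nobody loses).
    """
    counts = Counter(picks.values())
    if counts[0] == counts[1]:
        return None
    majority = 0 if counts[0] > counts[1] else 1
    for agent, bit in picks.items():
        if bit == majority:
            health[agent] -= loss
    return majority

# Two players plus one noise agent with infinite health (hypothetical names).
health = {"a": 10, "b": 10, "random": float("inf")}
majority = resolve_round({"a": 0, "b": 0, "random": 1}, health, loss=3)
print(majority, health)
```

Here a and b both picked the majority bit 0 and drop to 7 health, while
the noise agent is untouched.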


It is also a proof of Wolpert's theorem, that two computers cannot both
predict each other's actions. Imagine a variation of the game where each
player receives the source code and initial state of its opponent as
input before the start of the game. Who wins?
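
A small Python sketch of that thought experiment, as an illustration
rather than a proof. It assumes the two-player matching pennies roles (one
player wants the bits to match, the other wants them to differ) and that
each player decides by simulating the other. The names matcher, mismatcher,
OutOfBudget, and the step budget are all hypothetical.

```python
class OutOfBudget(Exception):
    """Raised when a simulation uses up its step budget."""

def matcher(opponent, budget):
    # Wants to play the same bit as the opponent, so it simulates the opponent.
    if budget == 0:
        raise OutOfBudget
    return opponent(matcher, budget - 1)

def mismatcher(opponent, budget):
    # Wants to play the opposite bit, so it also simulates the opponent.
    if budget == 0:
        raise OutOfBudget
    return 1 - opponent(mismatcher, budget - 1)

def mutual_prediction_terminates(budget):
    """Return True if matcher can finish predicting mismatcher within budget."""
    try:
        matcher(mismatcher, budget)
        return True
    except OutOfBudget:
        return False

for budget in (10, 100, 200):
    print(budget, mutual_prediction_terminates(budget))
```

Each simulation calls the other, so the recursion never bottoms out and
every budget is exhausted. If both could finish, the two bits would have
to be equal and unequal at once.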

Wolpert's theorem is the reason AI is dangerous. We measure intelligence by
prediction accuracy. If an AI is more intelligent than you, then it can
predict your actions, but you can't predict (and therefore can't control)
its actions.

On Mon, Dec 9, 2024, 12:34 PM swkane <dissip...@gmail.com> wrote:

> I think static dataset benchmarks have their place, but they don't test
> everything. And maybe a meta-benchmark could be created that includes both
> static dataset benchmarks as well as things like competitions like PtB in
> sandboxed environments. I think any scientifically relevant benchmark that
> is independent of the AGI system's architecture and hardware, and is
> general enough is worthy of looking at.
>
> The static dataset benchmarks are the low hanging fruit, IMO. In addition
> to the Wikipedia compression benchmark, you could feed every problem on
> kaggle into an AGI system, just as an example. And by 'low hanging fruit',
> I'm not saying less relevant, but just easier to attain. Hence, I'm not as
> interested in static dataset benchmarks, at least currently. I'm more
> interested in building a dynamic competitive benchmark system.
>
> On Mon, Dec 9, 2024, 06:57 James Bowery <jabow...@gmail.com> wrote:
>
>> I like it, even though it is inferior to lossless compression of
>> Wikipedia as a standard benchmark.  At least it conveys the central idea of
>> Solomonoff Induction:  Converging on the algorithm generating one's
>> observations.
>>
>> In particular, I like the multi-agent "theory of mind" angle it takes
>> which may get people thinking about nuking the social pseudo-sciences with
>> the Algorithmic Information Criterion for macrosocial model selection.  The
>> primary thing lacking in this approach, particularly compared to Wikipedia,
>> is that the utility function of the other agents is a given -- whereas with
>> Wikipedia, one is required to infer the utility functions of the agents
>> generating Wikipedia.
>>
>> On Sat, Dec 7, 2024 at 11:51 PM <dissip...@gmail.com> wrote:
>>
>>> Pick the Bit and Competitive Computing Platform - Towards a New
>>> Benchmark for AGI System Performance
>>> 2024-12-07 Version 0.1.0
>>> Steven W. Kane
>>>
>>> 1. The Pick the Bit Game
>>> *1.1 Game Overview*
>>> Pick the Bit (PtB) is a turn-based, multi-agent (minimum of 2 agents but
>>> theoretically an unlimited number of agents) game where agents compete by
>>> guessing a binary value—either 0 or 1—each round. The goal is to avoid
>>> picking the bit chosen by the majority of agents. Agents that pick the
>>> majority bit lose health (when their health goes to 0 or below, the agent
>>> 'dies' and is removed from the game), and the game continues until only one
>>> agent remains.
>>> *1.2 Game Mechanics*
>>> Health Dynamics:
>>>
>>>    - Each agent starts with a fixed amount of health points (HP).
>>>    - Agents that guess the majority bit lose health points equal to a
>>>    predetermined loss value.
>>>    - Agents that guess the minority bit retain their health.
>>>    - Health loss scales asymptotically in later rounds, increasing the
>>>    stakes over time. This is because earlier rounds of the game are more
>>>    random, so a loss there should not cost as much health.
>>>
>>> Random Noise Agents:
>>>
>>>    - At a minimum, one random agent with infinite health is always
>>>    present to enable tie breaks when there are only two agents left.
>>>    - Further random agents, beyond the mandatory one, can be added from
>>>    the beginning to increase random noise and maintain
>>>    unpredictability.
>>>    - These random agents choose their bits pseudorandomly based on a
>>>    cryptographically secure PRNG with a securely selected seed value.
>>>
>>> Hidden Information:
>>>
>>>    - The health levels of other agents and the number of agents
>>>    choosing each bit are hidden, forcing agents to infer patterns and make
>>>    strategic guesses.
>>>
>>> The only four things that an agent receives as inputs each round are:
>>>
>>>    - The current round number.
>>>    - The majority and minority bits from the previous round.
>>>    - The agent's own current health level.
>>>    - The amount of health that will be lost for losing the next round
>>>    (at a minimum, the health loss schedule is also passed to the agent
>>>    at the beginning of the game).
>>>
>>> Incentivizing Monetary Rewards:
>>>
>>>    - Each round, agents that survive collect tokens, representing an
>>>    equal share of the health points lost by the defeated agents.
>>>    - The total tokens accrued by an agent are not revealed to any of
>>>    the agents at all (including the agent that is assigned the tokens),
>>>    and do not give any advantage in the game.
>>>    - The tokens an agent ends up with at the end of the game can be
>>>    redeemed for monetary rewards by the team that owns the agent at the
>>>    end of the game.
>>>    - A percentage of the prize pool is reserved for the game winner,
>>>    ensuring that strategic play and survival remain paramount.
>>>
>>> Game Complexity:
>>>
>>>    - PtB rewards non-random play by favoring agents that detect and
>>>    exploit patterns in opponents' choices. Random strategies are penalized
>>>    over time due to predictable health loss.
>>>    - Under the token system, pure random play by an agent will strongly
>>>    tend towards monetary loss since a percentage of the prize pool is
>>>    pre-allocated to the winning agent.
>>>
>>>
>>> 2. Running PtB on C2P
>>> *2.1 The Competitive Computing Platform (C2P)*
>>>
>>>    - The Competitive Computing Platform (C2P) is an isolated,
>>>    resource-constrained environment for executing AI-generated agents. It
>>>    enforces standardization across competitions, ensuring fairness and
>>>    reproducibility.
>>>
>>> *2.2 Agent Constraints*
>>> WASM WASI Modules:
>>>
>>>    - All agents are submitted as WebAssembly (WASM) WASI modules,
>>>    ensuring portability and security.
>>>
>>> Resource Limits:
>>>
>>>    - Memory: Limited to 4 GiB.
>>>    - Fuel: Execution is capped using Wasmtime's fuel feature to ensure
>>>    computational fairness.
>>>    - No Networking Access: Agents are entirely sandboxed, removing
>>>    external dependencies or external learning.
>>>
>>> Game State Communication:
>>>
>>>    - Agents receive game state updates via shared memory and submit
>>>    their moves back through the same mechanism. No external communication is
>>>    permitted, ensuring that all strategies are self-contained.
>>>
>>> *2.3 C2P Architecture*
>>> Broker and Hosts:
>>>
>>>    - The game broker orchestrates competitions, communicating game
>>>    state updates to agent hosts and logging outcomes.
>>>
>>> Single-Node Execution:
>>>
>>>    - For simplicity, C2P competitions can run on a single node with all
>>>    components (broker, Kafka instance, WASM modules) co-located.
>>>
>>> Turn-Based Execution:
>>>
>>>    - Each turn, agents receive the game state and submit their moves
>>>    asynchronously. The broker processes all moves, calculates health
>>>    adjustments, and updates the game state for the next round.
>>>
>>> *2.4 Benchmarking Independence*
>>>
>>>    - PtB on C2P can benchmark any AGI system, independent of its
>>>    architecture. The only requirement is that the AGI system generates
>>>    a WASM WASI agent for the PtB game.
>>>    - This architectural independence ensures that C2P provides a level
>>>    playing field for all AGI systems, allowing researchers and developers to
>>>    focus on algorithmic sophistication rather than hardware or
>>>    language-specific implementations.
>>>
>>>
>>> 3. PtB and C2P as a Benchmark for AGI Performance
>>> *3.1 Benchmarking AGI Through PtB*
>>> Pick the Bit is designed to test core AGI capabilities:
>>> Strategic Adaptation:
>>>
>>>    - AGI systems must adapt to the shifting meta-game, learning and
>>>    optimizing strategies with limited feedback.
>>>
>>> Pattern Recognition:
>>>
>>>    - Detecting and responding to subtle patterns in agent behavior and
>>>    game state is critical for survival.
>>>
>>> Robustness Under Constraints:
>>>
>>>    - The WASM WASI sandbox ensures that agent performance is tied
>>>    solely to its algorithmic sophistication, not hardware advantages.
>>>
>>> *3.2 C2P as a Universal Standard*
>>> Decoupling from Hardware:
>>>
>>>    - By requiring agents to run on commodity hardware with standardized
>>>    constraints, C2P removes externalities, enabling direct comparisons
>>>    between AGI systems.
>>>
>>> Interoperability:
>>>
>>>    - WASM WASI ensures agents can be developed in any language that
>>>    compiles to WASM, making C2P accessible to a wide range of
>>>    researchers and organizations.
>>>
>>> Transparent Competitions:
>>>
>>>    - C2P logs all game state updates and agent moves, providing a fully
>>>    auditable record of each competition.
>>>
>>> *3.3 Meta-Learning and AGI Evaluation*
>>> Dynamic Agent Generation:
>>>
>>>    - PtB encourages the use of meta-learning systems that dynamically
>>>    generate agents tailored to the game environment.
>>>    - By iteratively refining agents through competitions, AGI systems
>>>    can demonstrate their ability to generalize, adapt, and innovate.
>>>
>>>
>>> 4. Conclusion
>>> Pick the Bit (PtB) and the Competitive Computing Platform (C2P) together
>>> represent a new frontier in AGI benchmarking. PtB's dynamic and evolving
>>> meta-game challenges agents to excel in adaptability, pattern recognition,
>>> and strategic thinking, while C2P provides a standardized,
>>> resource-constrained environment for fair competition. By isolating agent
>>> performance from hardware advantages and enabling reproducible evaluations,
>>> PtB and C2P offer a universal platform for AGI research and benchmarking,
>>> pushing the boundaries of what intelligent systems can achieve. Through
>>> these competitions, the AI community can foster innovation, collaboration,
>>> and progress toward truly general intelligence.
>>>
>>> Software:
>>> https://github.com/Competitive-Computing-Network/c2n/tree/main/software
>>> (proof of concept is a work in progress)
>>>

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T705ed500a1a7e589-M6a4fd614a298864d3e6ca62d
Delivery options: https://agi.topicbox.com/groups/agi/subscription
