On Thu, May 1, 2025 at 2:50 PM Matt Mahoney <mattmahone...@gmail.com> wrote:
>
> I think I understand. We say that X causes Y if we can describe Y as a
> function of X. If the simplest description of X and Y has the form Y =
> f(X), then we are using algorithmic information to find causality. For
> example,
>
> X Y
> - -
> 1 1
> 2 2
> 3 2
>
> then I can write Y as a function of X, but not X as a function of Y.
> Thus, the DAG X -> Y is more plausible than Y -> X.
>
> To make this practical, the paper postulates a noise signal, as Y =
> f(X, N), where N can be 0 in the first case but not in the second.
> Thus, less algorithmic information is needed to encode the first case.
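The quoted table can be checked mechanically. A minimal sketch (the
variable names here are mine, for illustration only):

```python
# The three (X, Y) rows from the table above. Y is a function of X,
# but X is not a function of Y, because Y=2 maps back to both X=2
# and X=3, so the inverse needs an extra noise bit to disambiguate.
pairs = [(1, 1), (2, 2), (3, 2)]

# X -> Y: every X value maps to exactly one Y value.
x_to_y = {}
for x, y in pairs:
    assert x_to_y.setdefault(x, y) == y  # no conflicts arise
print("Y = f(X) holds")

# Y -> X: Y=2 would have to map to both X=2 and X=3.
y_to_x = {}
is_function = True
for x, y in pairs:
    if y_to_x.setdefault(y, x) != x:
        is_function = False
print("X = g(Y) holds:", is_function)  # prints False
```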

But is this a reasonable definition of causality? To test whether Y is
a simple function of X, we would use compression to approximate
K(Y|X), and say that X causes Y if that value is smaller than K(X|Y).
To measure K(Y|X), you would compress X (a column of numbers in a
table) and subtract its compressed size from the compressed size of X
concatenated with Y. To test the causal relationships among n
variables, you would need to compress on the order of n^2 pairs of
columns. But observe that

K(X) + K(Y|X) = K(X, Y) = K(Y) + K(X|Y)  (up to a logarithmic term)

so K(Y|X) < K(X|Y) exactly when K(Y) < K(X), and you really only need
to compare K(X) and K(Y). Whichever is larger causes the other.
Compress each of the n columns once and sort them by compressed size
from largest to smallest. That ordering is your DAG.
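A minimal sketch of the compression test, using zlib's compressed size
as a crude stand-in for K. The data and the Y = X mod 10 relation are
made up for illustration, not taken from any real table:

```python
import random
import zlib

def C(s: bytes) -> int:
    """Compressed size: an upper bound on K(s), not K itself."""
    return len(zlib.compress(s, 9))

def cond(y: bytes, x: bytes) -> int:
    """Approximate K(y|x) as C(x + y) - C(x), per the chain rule."""
    return C(x + y) - C(x)

# Made-up columns: Y is a simple function of X (here Y = X mod 10),
# so the estimate of K(Y|X) should come out below that of K(X|Y).
random.seed(0)
xs = [random.randrange(1000) for _ in range(2000)]
ys = [x % 10 for x in xs]
X = ",".join(map(str, xs)).encode()
Y = ",".join(map(str, ys)).encode()

print("K(Y|X) ~", cond(Y, X))
print("K(X|Y) ~", cond(X, Y))
# The gap is driven almost entirely by C(X) versus C(Y), which is
# the point above: only the marginal sizes end up mattering.
```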

This would be easy to do on a table with thousands of rows and
columns, for example LaboratoryOfTheCounties, where the rows are
counties and the columns are things like the percent of the population
under age 5 or the number of farms between 50 and 100 acres, to see
which variables cause which.
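A toy sketch of that procedure on a made-up stand-in for such a table
(the column names and values here are invented, not from the actual
dataset):

```python
import zlib

def C(s: str) -> int:
    # Compressed size as a crude upper bound on K.
    return len(zlib.compress(s.encode(), 9))

# Hypothetical columns: name -> per-county values.
table = {
    "pct_under_5":  ["6.1", "5.9", "7.2", "6.1", "5.5"] * 400,
    "farms_50_100": ["12", "30", "7", "12", "45"] * 400,
    "state_code":   ["12", "12", "12", "13", "13"] * 400,
}

# Per the proposal above: compress each column once, then sort
# largest-first and read the resulting order as the causal order.
sizes = {name: C(",".join(vals)) for name, vals in table.items()}
dag_order = sorted(sizes, key=sizes.get, reverse=True)
print(dag_order)
```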

But does that make sense?


-- 
-- Matt Mahoney, mattmahone...@gmail.com

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T0f47884dae19d52d-M5e93c344eaea154e09957715
Delivery options: https://agi.topicbox.com/groups/agi/subscription
