On 22/04/2013 10:48 AM, Lorenzo Isella wrote:
Dear All,
I hope this is not too off topic.
I am given a set of scatterplots (nothing too fancy; think about a
normal x-y 2D plot).
I do not deal with two time series (indeed I have no info about time).
If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two
vectors of numbers in most cases, but sometimes they can be
categorical variables), I can plot one against the other, and
essentially I need to determine whether
A=f(B, noise) or B=g(A, noise)
where the noise is the effect of other, possibly unknown, variables,
measurement errors, etc., and f and g are two functions.
Without the noise, if I want to test if A=f(B) [B causes A], then I
need at least to ensure that f(B1)!=f(B2) must imply B1!=B2 (different
effects must have a different cause), whereas it is not ruled out that
f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect).
However, in the presence of noise, these properties will hold only
approximately, so... any idea how to use a statistical test, rather than
eyeballing, to tell apart A=f(B, noise) vs B=g(A, noise)?
Any suggestion is welcome.
In general there can't be such a test. Think about the case of simple
linear regression. If I randomly draw X from a normal distribution,
then randomly draw Y_i = a + b X_i + e_i, where the e_i are normal
draws independent of X, I end up with (X,Y) having a
bivariate normal distribution.
In your notation, X would cause Y, but there is *nothing* here to
distinguish this from draws directly from the bivariate normal
distribution, or draws of Y first, followed by X from its conditional
distribution (which is also a linear regression model).
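The point can be checked numerically. What follows is only a minimal sketch, written in Python with the standard library rather than R, and it is not part of the original exchange. The parameter values (mu, s, a, b, t) and the function names are arbitrary choices made for illustration. It draws pairs in both orders, X first and then Y given X, or Y first and then X given Y, using the standard conditional-normal formulas, and compares the sample means, variances and covariance.

```python
import math
import random


def draw_x_then_y(n, rng, mu, s, a, b, t):
    """X ~ N(mu, s^2), then Y = a + b*X + e with e ~ N(0, t^2)."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(mu, s)
        y = a + b * x + rng.gauss(0.0, t)
        pairs.append((x, y))
    return pairs


def draw_y_then_x(n, rng, mu, s, a, b, t):
    """Same joint distribution, but Y is drawn first and X given Y second.

    Marginal of Y:   N(a + b*mu, b^2 s^2 + t^2)
    X given Y = y:   N(mu + k*(y - a - b*mu), s^2 t^2 / (b^2 s^2 + t^2)),
                     with k = b s^2 / (b^2 s^2 + t^2)
    so X given Y is again a linear regression with normal errors.
    """
    mean_y = a + b * mu
    var_y = b * b * s * s + t * t
    k = b * s * s / var_y
    resid_sd = math.sqrt(s * s * t * t / var_y)
    pairs = []
    for _ in range(n):
        y = rng.gauss(mean_y, math.sqrt(var_y))
        x = mu + k * (y - mean_y) + rng.gauss(0.0, resid_sd)
        pairs.append((x, y))
    return pairs


def moments(pairs):
    """Return (mean x, mean y, var x, var y, cov xy) of the sample."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    cxy = sum((x - mx) * (y - my) for x, y in pairs) / n
    return mx, my, vx, vy, cxy


if __name__ == "__main__":
    rng = random.Random(2013)
    params = dict(mu=1.0, s=2.0, a=0.5, b=1.5, t=1.0)  # illustrative values
    print("X first:", moments(draw_x_then_y(50000, rng, **params)))
    print("Y first:", moments(draw_y_then_x(50000, rng, **params)))
```

A bivariate normal distribution is fully determined by these five moments, so two samples that agree on them up to sampling error come from the same joint distribution, and the scatterplot alone carries no trace of which variable was drawn first.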
With some extra information inference might be possible, but not in the
generality you ask for.
Duncan Murdoch
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.