Friday, July 1, 2022

Week 6: Direct LiNGAM

 Direct LiNGAM


Experiments on more non-linear models


More experiments were run on non-linear models. The data-generating model and a summary of the experiments are shown below:



We can draw conclusions from the results above:
  • Regarding correctness, all non-linear regression models behave consistently: they fail to identify the causal direction on the same pairs. For (N, N3), the data contain extreme values and are hard to fit; for (M, M4), the noise depends on the cause, so residual independence cannot be detected; for (M, M6) and (M, M7), the formula is reversed and the noise sits on the other side of the relation, leading the algorithm to the opposite conclusion; for (IVB, IVA), the noise is Gaussian, which LiNGAM cannot distinguish.
  • The margin varies between models. Gaussian Process Regression achieves a larger margin with acceptable running time, while Kernel Ridge Regression with Random Fourier Features is the fastest, which makes it feasible to apply to larger datasets. Under that constraint, KRR gives the best overall performance. A sketch of the pairwise direction test these experiments rely on is given after this list.
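The following is a minimal sketch of the kind of pairwise direction test described above, not the project's actual code: it regresses each variable on the other with either Gaussian Process Regression or KRR with Random Fourier Features (approximated here with scikit-learn's RBFSampler plus Ridge), and scores each direction by how dependent the residual is on the putative cause. The correlation-based dependence score and the helper names are assumptions for illustration; the real experiments use a proper independence measure and their own margin definition.

```python
# Minimal sketch of a pairwise direction test (illustrative only).
# Assumed: residual dependence is approximated by |corr(residual, cause)|;
# the real experiments use a proper independence measure.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline


def make_regressor(kind="krr_rff"):
    """Return a non-linear regressor: GPR, or KRR approximated with RFF."""
    if kind == "gpr":
        return GaussianProcessRegressor()
    # Random Fourier Features + ridge regression approximates kernel ridge regression
    return make_pipeline(RBFSampler(n_components=100, random_state=0), Ridge(alpha=1.0))


def dependence_score(residual, cause):
    """Crude stand-in for an independence test: absolute correlation."""
    return abs(np.corrcoef(residual, cause)[0, 1])


def direction_margin(x, y, kind="krr_rff"):
    """Fit x->y and y->x, score residual dependence, return the margin.

    A positive margin favours x -> y (its residual looks more independent).
    """
    r_xy = y - make_regressor(kind).fit(x[:, None], y).predict(x[:, None])
    r_yx = x - make_regressor(kind).fit(y[:, None], x).predict(y[:, None])
    return dependence_score(r_yx, y) - dependence_score(r_xy, x)


# Toy example: y is a non-linear function of x with non-Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 500)
y = np.tanh(x) + rng.uniform(-0.5, 0.5, 500)
print(direction_margin(x, y, kind="krr_rff"))  # > 0 suggests x -> y
```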
Experiments were also run to test robustness to noise with different variances. The data-generating model and a summary of the experiments are shown below:



We can observe that:
  • Correctness varies between models. However, many tests produce a margin around 0.05, which is exactly the threshold I set for deciding correctness, so a high correctness rate does not necessarily mean high accuracy: different runs may give different results, even though each configuration is run 100 times. A sketch of this repeated-trial evaluation is given after this list.
  • Using more data improves performance significantly. This gives KRR with RFF a clear advantage, since it is a non-linear regression method that runs very fast.
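As a rough illustration of the point about the 0.05 threshold, the sketch below repeats the direction test on data with a chosen noise scale and counts how often the margin clears the threshold. It reuses the hypothetical direction_margin helper from the previous sketch; the generating model, the noise distribution, and the number of trials are placeholders, not the actual experiment configuration.

```python
# Repeated-trial evaluation against the margin threshold (illustrative only).
# Reuses the hypothetical direction_margin() from the sketch above.
import numpy as np


def correctness_rate(noise_scale, n_samples=500, n_trials=100, threshold=0.05):
    """Fraction of trials where the margin exceeds the decision threshold."""
    rng = np.random.default_rng(42)
    correct = 0
    for _ in range(n_trials):
        x = rng.uniform(-2, 2, n_samples)
        # Non-Gaussian (uniform) noise, as LiNGAM-style methods require.
        y = np.tanh(x) + rng.uniform(-noise_scale, noise_scale, n_samples)
        if direction_margin(x, y, kind="krr_rff") > threshold:
            correct += 1
    return correct / n_trials


for scale in (0.1, 0.5, 1.0):
    print(scale, correctness_rate(scale))
```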

Direct LiNGAM Conjecture


I made a Jupyter notebook to visualize the results of the experiments above and look more deeply into the Direct LiNGAM Conjecture. Since it is much more straightforward to explore there, I will leave a link to the notebook here instead.

Bug Fix


I fixed a bug that appeared when using RCoT for the conditional independence test. Previously, the conditional independence test was not implemented with RCoT; now that RCoT has become the default method, it had to be implemented.

I also fixed a bug involving the lpb4 algorithm. Under very extreme conditions, lpb4 can raise a ValueError (the exact reason lies inside the algorithm, which I am not familiar with). The code now falls back to the hbe algorithm when that error occurs. A minimal sketch of this fallback is shown below.
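A minimal sketch of that fallback, assuming the lpb4 and hbe implementations from the momentchi2 package (the wrapper name and the exact error handling in the project's own code may differ):

```python
# Fallback from lpb4 to hbe when lpb4 fails on extreme inputs (sketch only).
# Assumes the momentchi2 package, which provides lpb4(coeff, x) and hbe(coeff, x)
# for approximating the CDF of a weighted sum of chi-squared variables.
from momentchi2 import hbe, lpb4


def weighted_chi2_cdf(coeff, x):
    """Approximate the CDF, preferring lpb4 but falling back to hbe on failure."""
    try:
        return lpb4(coeff, x)
    except ValueError:
        # lpb4 can fail on very extreme inputs; hbe is more robust, if less accurate.
        return hbe(coeff, x)
```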

Plan for Next Week

  • Present the Direct LiNGAM Conjecture and discuss it.
  • Work on conditional expectation with ML.
