Hello everyone,

It has been quite a while since I last wrote a blog post. The last one must have been around 2008! So please bear with me while I adjust to writing blog posts again.

Anyway, I have finally updated my whole website. I hope to keep this site updated and use the blog functionality now and then to write about cool stuff I’m working on.

Currently, I am organising and documenting some bits and pieces of code that I used for random forest analysis in my PhD thesis, and turning them into a new Python package called RFtools. I started this endeavour because many of the helpful functions and cool visualisation tools for random forest analysis are only available in R, which is pretty slow at random forest training (about 66x slower than the scikit-learn implementation in Python; see the end of this presentation by Gilles Louppe for the benchmarks).

I will release the package as individual modules, starting with the corrected permutation importance measure (PIMP) of Altmann et al. (2010). It’s almost done. After that, I plan to move on to the visualisation side, beginning with proximity plots.
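
To give a rough idea of what PIMP does, here is a minimal sketch. It is not the RFtools code: the function name `pimp_pvalues` and all of its parameters are made up for illustration, and it assumes scikit-learn's impurity-based `feature_importances_` as the underlying importance measure. The idea is to shuffle the response several times, refit the forest on each shuffled copy to get a null distribution of importances, and compare the observed importances against it. Altmann et al. also describe fitting a parametric distribution to the null importances; this sketch only computes the simpler empirical p-value.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def pimp_pvalues(X, y, n_permutations=20, n_estimators=50, random_state=0):
    """Illustrative PIMP-style p-values (hypothetical helper, not RFtools)."""
    rng = np.random.default_rng(random_state)

    def fit_importances(target, seed):
        # Fit a forest and return its impurity-based importances.
        forest = RandomForestClassifier(n_estimators=n_estimators,
                                        random_state=seed)
        forest.fit(X, target)
        return forest.feature_importances_

    # Importances on the real response.
    observed = fit_importances(y, random_state)

    # Null importances: shuffle the response to break any
    # feature-response association, then refit.
    null = np.array([
        fit_importances(rng.permutation(y), random_state + 1 + i)
        for i in range(n_permutations)
    ])

    # Empirical p-value per feature, with the usual +1 correction
    # so that a p-value is never exactly zero.
    exceed = (null >= observed).sum(axis=0)
    p_values = (1 + exceed) / (n_permutations + 1)
    return observed, p_values
```

Calling `pimp_pvalues(X, y)` on a feature matrix and a label vector returns the observed importances and one p-value per feature. With only 20 permutations the smallest possible p-value is 1/21, so a real analysis would use many more.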

The code will appear at the project’s GitHub repository soon after I finish writing the tests and documentation. I also hope to write a detailed blog post soon describing why I chose this measure and how the RFtools implementation can be used in your own projects.

Stay tuned and thanks for your time!

  1. Mark

    Looks like you’ve already implemented the PIMP method.
    You may have just saved me an enormous amount of work.
    Is the code working? I should be able to figure out how to use the class but wanted to make sure it’s functioning before I try.
    Thank you!


  2. asyavuz

    Hi Mark,

    Yes, the PIMP implementation is functional. Please let me know if you encounter any issues.