Online Random Forests

A newer and relatively faster implementation of this algorithm is available in my “Online Multi-Class LPBoost” package.

This package implements the “Online Random Forests” (ORF) algorithm of Saffari et al., ICCV-OLCV 2009 [1]. This algorithm extends offline Random Forests (RF) to learn from training samples that arrive online. ORF is a multi-class classifier that learns directly, without 1-vs-all or 1-vs-1 binary decompositions.

The ORF package is implemented in C++ and uses ATLAS/LAPACK subroutines for high-performance computations. Currently, it has only been tested under Linux (Debian and Ubuntu), but it should be possible to run the package on other operating systems with minimal modifications. For installation instructions, refer to the “INSTALL” file in the package. Usage instructions are available in the “README” file.

Download: OnlineForest-0.11.tar.gz (Release date: 03/10/2009)

Author: Amir Saffari

[1] Amir Saffari, Christian Leistner, Jakob Santner, Martin Godec, and Horst Bischof, “On-line Random Forests,” in 3rd IEEE ICCV Workshop on On-line Computer Vision, 2009.

27 thoughts on “Online Random Forests”

  1. Dear Mr. Amir,
    I’m a student of Nguyen Dang Binh.
    I’m researching Online Random Forests.
    Can you show me how to use the OnlineForest-0.11.tar.gz package in Visual Studio 2005?

    1. Hi,
      I don’t program in Windows, so I don’t know how to compile the ORF in VS2005. There are dependencies which have to be compiled beforehand (like ATLAS and libconfig). So I would suggest you start by building those libraries before attempting to compile the ORF package.

      However, on many Linux distributions those libraries are either already installed or can be easily installed from the repositories. So if you just want to run a few experiments and see how ORF works, I would suggest installing a Linux distribution (Ubuntu or Debian, for example) and trying ORF there.

  2. Hi Amir,
    I have some problems while studying the code of the ORF algorithm.
    Could you explain the command-line usage of the ORF algorithm more clearly?
    Regards,

    1. Hi Minh,

      Could you please be more specific with your question? Do you have problems understanding how to call the ORF binary from the command line? Please take a look at the README file; it explains in detail how to use it. If that does not solve your problem, please write exactly what happens and what is not clear.

      Cheers, Amir

  3. Hi Amir,
    Could you explain to me the meaning of the attributes in the RandomTest class?
    const int *m_numClasses;
    double m_threshold;
    double m_trueCount;
    double m_falseCount;
    vector m_trueStats;
    vector m_falseStats;

    1. m_numClasses -> number of classes
      m_threshold -> threshold for a test in each node
      m_trueCount -> number of samples in the true (right) branch of a node
      m_falseCount -> number of samples in the false (left) branch of a node
      m_trueStats -> density of classes in the true (right) branch of a node
      m_falseStats -> density of classes in the false (left) branch of a node
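To make the roles of these fields concrete, here is a minimal sketch of how such a test could keep its statistics up to date as samples arrive. It is not the package's actual RandomTest class: the struct name `RandomTestSketch`, the `update` and `density` functions, and the choice to store raw per-class counts and normalize them on demand are all assumptions made for illustration. Only the field meanings come from the reply above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RandomTest; names and layout are assumptions.
struct RandomTestSketch {
    int numClasses;                  // number of classes
    double threshold;                // threshold for the test in this node
    double trueCount = 0.0;          // samples sent to the true (right) branch
    double falseCount = 0.0;         // samples sent to the false (left) branch
    std::vector<double> trueStats;   // per-class counts, true (right) branch
    std::vector<double> falseStats;  // per-class counts, false (left) branch

    RandomTestSketch(int nClasses, double thr)
        : numClasses(nClasses), threshold(thr),
          trueStats(static_cast<std::size_t>(nClasses), 0.0),
          falseStats(static_cast<std::size_t>(nClasses), 0.0) {
        assert(nClasses > 0);
    }

    // Route one sample by comparing a single feature value with the
    // threshold, and update the statistics of the branch it falls into.
    bool update(double featureValue, int label) {
        assert(label >= 0 && label < numClasses);
        const std::size_t c = static_cast<std::size_t>(label);
        const bool goesRight = featureValue > threshold;
        if (goesRight) {
            trueCount += 1.0;
            trueStats[c] += 1.0;
        } else {
            falseCount += 1.0;
            falseStats[c] += 1.0;
        }
        return goesRight;
    }

    // Class density of one branch: per-class counts divided by the branch
    // total. Returns all zeros while the branch is still empty.
    std::vector<double> density(bool rightBranch) const {
        const std::vector<double>& counts = rightBranch ? trueStats : falseStats;
        const double total = rightBranch ? trueCount : falseCount;
        std::vector<double> result(counts.size(), 0.0);
        if (total > 0.0) {
            for (std::size_t c = 0; c < counts.size(); ++c) {
                result[c] = counts[c] / total;
            }
        }
        return result;
    }
};
```

With a threshold of 0.5 and two classes, calling `update(0.9, 1)` sends the sample to the true branch and increments both `trueCount` and the class-1 entry of `trueStats`.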

  4. Sir, I’m a student new to online random forests. Can I ask you a question: which data do you use for tracking? Lately I have studied the CamShift algorithm, and in OpenCV they just use “hue” for tracking. So which data do you use: hue, a histogram, or something else?

  5. Hello Amir,
    A general programming question: is there a specific reason why most of the class member definitions are in header files? I am aware of the usual practice where definitions are placed in a cpp file and declarations in a header file, unless it is some kind of boilerplate code. But my knowledge is not complete and I constantly keep learning, so I wanted to understand if there’s a design issue involved here. Thank you.

    1. Hi Chris,

      The reason is mainly speed: usually, if the definition is in the header file, it gets automatically inlined by the compiler. So if the function body is small, it’s better to have it in the header file. Otherwise, there’s no reason as far as I know.
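As a small illustration of the header-only style discussed here, the sketch below shows the two ways a member function can live in a header. The `Counter` class is hypothetical and not taken from the ORF package. In C++, a member function defined inside the class body is implicitly `inline`, which lets the header be included from several source files without multiple-definition link errors; whether calls are actually expanded in place remains the compiler's decision.

```cpp
#include <cassert>

// counter.h (hypothetical header, for illustration only)
class Counter {
public:
    // Defined inside the class body: implicitly inline.
    void increment() { ++m_count; }

    // Small body with a sanity check, also implicitly inline.
    void decrement() {
        assert(m_count > 0);
        --m_count;
    }

    int count() const { return m_count; }

    // Declared here, defined below but still inside the header.
    int doubled() const;

private:
    int m_count = 0;
};

// Defined outside the class body in a header: the explicit `inline`
// keyword is needed so that including this header from several .cpp
// files does not produce duplicate definitions at link time.
inline int Counter::doubled() const { return 2 * m_count; }
```

A definition placed in a separate .cpp file would instead need link-time optimization before calls from other translation units could be inlined.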

      Cheers

  6. Thanks Amir. Another quick question: I’ve actually been looking for a good open source C++ implementation of a decision tree to understand the entropy and pruning implementation. Can you point me to the right functions in your code, or to any other external references?
    Thanks

  7. Sir, I have read your paper, “On Robustness of On-line Boosting – A Competitive Study” (2009). Comment #3 on page 4 says that the source code is available here, but I couldn’t find it. Is there anywhere I could find it?

    By the way, I took a glance at the slides of “On-line Random Forests”. They say that “RFs achieve state-of-the-art performance in many applications” and compare against Online AdaBoost, etc., but not against On-line GradientBoost. I’d like to ask for some advice on a comparison between Online GradientBoost and Online RF.

    Thanks a lot.

  8. Hi Amir
    Thanks for your nice code; however, it would be far better if it had more comments. The RF algorithm is a bit hard to follow.

    Regards

  9. ./Online-Forest -c conf/orf.conf --ort --train --test
    OnlineMCBoost Classification Package:
    Loading config file: conf/orf.conf … Done.
    Could not open input file ../data/dna-train.libsvm

  10. Hi Amir,
    The code is not the tracking source code, but only for data classification.
    Could you release the tracking code so that we can do some comparison?

    Regards

  11. Hi Amir,
    I’m researching Online Random Forests.
    I have a question. The data files in your code are in libsvm format.
    Can I use a normal CSV file format instead? Would that cause any problems?
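One way to approach the CSV question without touching the package's data loader is to convert each CSV row into a libsvm-style line (`label index:value index:value ...`, with 1-based feature indices). The sketch below is only an illustration under stated assumptions: the function name `csvRowToLibsvm` is hypothetical, the label is assumed to sit in the last column, and the row is assumed to be plain comma-separated numbers without quoting, spaces, or a header. Any extra header line or layout the ORF package expects in its data files should be checked against the README.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Convert one CSV row "f1,f2,...,fn,label" into a libsvm-style line
// "label 1:f1 2:f2 ... n:fn". Hypothetical helper, not part of the package.
std::string csvRowToLibsvm(const std::string& row) {
    // Split the row on commas.
    std::vector<std::string> fields;
    std::stringstream in(row);
    std::string field;
    while (std::getline(in, field, ',')) {
        fields.push_back(field);
    }
    // At least one feature plus the label is required.
    assert(fields.size() >= 2);

    // Assumption: the class label is the last column.
    std::ostringstream out;
    out << fields.back();
    // Feature indices in the libsvm format start at 1.
    for (std::size_t i = 0; i + 1 < fields.size(); ++i) {
        out << ' ' << (i + 1) << ':' << fields[i];
    }
    return out.str();
}
```

For example, the row `0.5,1.2,3` becomes `3 1:0.5 2:1.2`.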

  12. Sorry, I have another question. Why do the data have the same value in the libsvm file?
    Does every entry have the value 1.0? If the values were different, would that cause a problem, such as wrong classification?
    And where can I check the classification results?

  13. Hi Amir,
    Could you tell me the reference paper for the “random test” used in your code? I see it is different from Ho’s “random subspace test”; I am now writing a paper and need to reference this method. I will surely cite your paper if it was first proposed by you.

    Regards

  14. Hi Amir,

    Could you assist me with this problem? There is no gmm.h file, and it is required when compiling.

    src/data.h:6:21: fatal error: gmm/gmm.h: No such file or directory

    Thanks in advance.
