Online Multiclass LPBoost

The following is the OMCLPBoost package, implementing the Online Multi-Class LPBoost [1], Online Multi-Class Gradient Boost [1], and Online Random Forest [2] algorithms. Online boosting is one of the most successful online learning algorithms in computer vision. While many challenging online learning problems are inherently multi-class, online boosting and its variants are only able to solve binary tasks. In this work, we present Online Multi-Class LPBoost (OMCLP), which is directly applicable to multi-class problems. From a theoretical point of view, our algorithm maximizes the multi-class soft margin of the samples. To solve the LP problem in an online setting, we perform an efficient variant of online convex programming, based on primal-dual gradient descent-ascent update strategies.
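
For orientation, the soft-margin program has roughly the following form (a hedged sketch in the spirit of [1], not the paper's exact formulation; D is the capacity parameter trading margin against slack, and g_m(x, y) denotes weak learner m's confidence for class y):

    \max_{w,\rho,\xi} \quad \rho - D \sum_{i=1}^{N} \xi_i
    \text{s.t.} \quad \sum_m w_m \, ( g_m(x_i, y_i) - g_m(x_i, y) ) \ \ge\ \rho - \xi_i \qquad \forall i,\ \forall y \neq y_i,
    \sum_m w_m = 1, \qquad w_m \ge 0, \qquad \xi_i \ge 0.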

For the machine learning and Caltech101 categorization experiments, you can directly use the following code. The tracking code is not included in this package; however, the videos below show the performance of our tracker and how tracking with virtual classes works.

Algorithms implemented:

  • Online Multi-Class LPBoost
  • Online Gradient Boost (with exponential, logit, and Savage loss functions; see the sketch after this list)
  • Online Random Forests
  • Weighted Linear LaRank SVM (modifications to the original code by Antoine Bordes)
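
As a rough illustration of the Online Gradient Boost losses (a sketch with hypothetical function names, not the package's API): each loss acts on a sample's classification margin m, and the sample's boosting weight is the negative derivative of the loss at m.

    #include <cmath>

    // Sketch only: each sample is re-weighted by -l'(m), the negative
    // derivative of the loss at the sample's current margin m.
    double expLoss(double m)    { return std::exp(-m); }                  // exponential
    double logitLoss(double m)  { return std::log(1.0 + std::exp(-m)); }  // logit
    double savageLoss(double m) {                                         // Savage
        double s = 1.0 + std::exp(2.0 * m);
        return 1.0 / (s * s);
    }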

Tracking results: our tracker is shown in red.

[YouTube video]

Tracking with virtual classes.

[YouTube video]

Download: OMCLPBoost-0.11.tar.gz (Release date: 2010-05-06)

Author: Amir Saffari

Please read the INSTALL file for build instructions. The license is GPL V3.

Change Log:

2010-05-06: Release 0.11:

  • Bug fix in RandomTest class which was previously choosing only integer thresholds (thanks to Andreas Geiger for pointing out the problem).

2010-04-18: Release 0.1

  • First release.

[1] Amir Saffari, Martin Godec, Thomas Pock, Christian Leistner, and Horst Bischof, “Online Multi-Class LPBoost”, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.

[2] Amir Saffari, Christian Leistner, Jakob Santner, Martin Godec, and Horst Bischof, “Online Random Forests”, in 3rd IEEE ICCV Workshop on On-line Computer Vision, 2009.

24 thoughts on “Online Multiclass LPBoost”

    1. Right now I’m busy writing up my thesis, so I don’t have time to write a formal description of how to compile and use the code. But in 2-3 weeks I will have time, and then I will put the code online. Stay tuned.

  1. Hi Amir, I have tried your MCLPBoost code, and I ran into the following problem:

    OnlineMCBoost Classification Package:
    Loading config file: conf/omcb.conf … Done.
    Segmentation fault

    Have you ever encountered this problem, and how should I deal with it? Thank you!
    BTW: I thought this might be caused by some illegal points…

  2. Hi Amir, I’m sorry for coming back again after such a long time.
    It’s not a problem with your package!
    I had used the wrong version of the required lib…

  3. Amir,
    A tree doesn’t need normalized values, so why is the feature range used?
    Say I have 10 training sample sets; this will calculate and set a different range for each set, even though by the end of all 10 calls the trees have used the samples from all sets:

    dataset_tr1.load(hp.trainData, hp.trainLabels);
    train(model, dataset_tr1, hp);
    dataset_tr2.load(hp.trainData, hp.trainLabels);
    train(model, dataset_tr2, hp);
    ….
    ….
    dataset_tr10.load(hp.trainData, hp.trainLabels);
    train(model, dataset_tr10, hp);

    1. Hi,

      The feature range is used to choose a proper random threshold. If the min-max range of a feature is not known, you might end up choosing many thresholds that put all samples either to the left or to the right.

      Cheers, Amir
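
      (A rough sketch of this idea; the name below is hypothetical, not the package's actual code:)

      #include <cstdlib>

      // Draw a random split threshold inside the feature's known [min, max]
      // range; without the range, many candidate thresholds would send every
      // sample to the same side of the split.
      double randomThreshold(double featMin, double featMax) {
          double u = std::rand() / (double) RAND_MAX;  // uniform in [0, 1]
          return featMin + u * (featMax - featMin);
      }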

    1. Hi Raja,

      If you want to keep training the same model over and over again with each new dataset, yes, that would be the way to do it. In fact, if you look into the train function, it’s nothing more than a loop which updates the model sequentially with samples in a dataset.
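
      (A minimal sketch of that loop; the types and method names below are illustrative stand-ins for the package's Classifier and DataSet, not its real interfaces:)

      #include <vector>

      struct Sample { /* feature vector, label, weight */ };

      struct DataSet {
          std::vector<Sample> samples;
      };

      struct Classifier {
          void update(const Sample& s) { /* one online boosting update */ }
      };

      // What train() conceptually does: a single sequential pass over the
      // dataset, updating the online model one sample at a time.
      void train(Classifier& model, const DataSet& dataset) {
          for (const Sample& s : dataset.samples) {
              model.update(s);
          }
      }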

  4. Thanks Amir. What about the case where training samples arrive sequentially and I want to update the tree on the fly, then check a newly arrived test sample?
    In that scenario, how does one update with a single training sample?

  5. Sorry, continuing the question from above…
    Will a simple update to the model take care of it?
    Like this:
    model->update(sample);
    In that case, do you expose any method to convert a single-line sample into a sample object, and how do you take care of feat_range?

    1. Hi Raja,

      Yes, “model->update(sample);” is the correct way of updating the classifiers with incoming data samples. Take a look at the “experimenter.cpp” file; there you will see how training and testing over a dataset works.

      If you want to see how to create a “sample” object, take a look at the “data.h/cpp” file.

      Regarding the feature range: since you design the features and take care of feature extraction, you usually know a priori what their range will be, and you can give that range to the trees. Otherwise, you can always give a default range like [-1,1].
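
      (Putting this thread together, a hedged sketch of on-the-fly updating; the Sample field names below are assumptions, so check data.h/cpp for the real definition:)

      #include <Eigen/Core>

      // Assumed field names (x, y, w); see data.h/cpp for the real Sample.
      struct Sample {
          Eigen::VectorXd x;  // feature vector
          int y;              // class label
          double w;           // sample weight
      };

      // Wrap one incoming feature vector as a Sample and feed it to the
      // online model; model->update(sample) is the call confirmed above.
      template <typename Model>
      void updateOnline(Model* model, const Eigen::VectorXd& features, int label) {
          Sample s;
          s.x = features;  // keep features inside the range given to the
          s.y = label;     // trees, e.g. the default [-1, 1]
          s.w = 1.0;
          model->update(s);
      }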

  6. I successfully compiled your code.
    I made some folder changes; that’s why you can see the “../../” stuff.
    I got the following message running your data:

    ./OnlineMLP -c ../../conf/omcb.conf --omclp --train --test
    OnlineMCBoost Classification Package:
    Loading config file: ../../conf/omcb.conf … Done.
    Loading data file: ../../data/dna-train.data …
    Loaded 1400 samples with 180 features and 3 classes.
    Loading data file: ../../data/dna-test.data …
    Loaded 1186 samples with 180 features and 3 classes.
    Assertion failed: (index >= 0 && index < size()), function operator(), file /usr/local/include/eigen2/Eigen/src/Core/Coeffs.h, line 150.
    Abort trap

    Does the provided data have features that exceed the index size?

  7. Dear Amir,
    Salam,

    Thanks for your reply. I hope that I can draw on your great experience with
    semi-supervised learning. I need to know more about Semi-Supervised Random
    Forests. Actually, having the code from your SSRF paper would be a great help.
    May I ask for the code of the SS Random Forest?

    Thank you in advance.
    Regards,
    Jafar

  8. Dear Amir
    salam
    I’m going to use LPBoost for a classification task in image annotation, but I don’t know how it works. Would you please help me with this problem? Thank you in advance.
    Best regards
    mahta

    1. Dear Mahta,

      Just to understand your question: is it that you don’t know how LPBoost works, or that you don’t know how my code works?

      Cheers, Amir

  9. Hi Amir,

    I had a question regarding the ShouldISplit function in your online random forest implementation. In your paper from 2009, you imply that the Gini index or some other information-gain metric could be used to decide whether a node will split. Your code seems to instead use the criterion of whether or not the data at a node is pure (i.e., all of it belongs to the same class). It seems quite possible that with this criterion a single incorrectly labelled data point could cause a node to split, which will inevitably happen with noisy data.

    Did you experiment with other metrics for this? If so, did you pick this one because it is the fastest/simplest or does it seem to perform best?

    Thank you!

    Malcolm
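
    (For readers following this thread, a hedged sketch of the two criteria being compared; the code below is illustrative only, not the package's:)

    #include <vector>

    // Purity check: the node is a split candidate as soon as more than one
    // class is present, so one mislabeled sample can trigger a split.
    bool isPure(const std::vector<int>& classCounts) {
        int nonEmpty = 0;
        for (int c : classCounts)
            if (c > 0) ++nonEmpty;
        return nonEmpty <= 1;
    }

    // Gini impurity: splitting only when impurity exceeds a tolerance makes
    // the criterion more robust to label noise.
    double gini(const std::vector<int>& classCounts) {
        double total = 0.0;
        for (int c : classCounts) total += c;
        if (total == 0.0) return 0.0;
        double sumSq = 0.0;
        for (int c : classCounts) {
            double p = c / total;
            sumSq += p * p;
        }
        return 1.0 - sumSq;
    }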

  10. Dear Amir,
    Salam
    I tried to install OMCLPBoost, but I get these errors even though the libconfig and Eigen packages are installed!
    make
    g++ -c -O3 -Wall -march=native -mtune=native -DNDEBUG -Wno-deprecated -I/usr/local/include -I/root/local/include src/booster.cpp -o src/booster.o
    In file included from src/classifier.h:17,
    from src/booster.h:17,
    from src/booster.cpp:14:
    src/data.h:23: fatal error: Eigen/Core: No such file or directory
    compilation terminated.
    make: *** [src/booster.o] Error 1
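
    (A likely cause, judging from the eigen2 path in comment 6 above: Eigen 2 installs its headers under include/eigen2/Eigen, so the eigen2 directory itself must be on the include path. This is an assumption, not a verified fix; e.g.:)

    g++ -c -O3 -Wall -march=native -mtune=native -DNDEBUG -Wno-deprecated -I/usr/local/include -I/usr/local/include/eigen2 -I/root/local/include src/booster.cpp -o src/booster.o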
