03.27
The following is the OMCLPBoost package, implementing the Online Multi-Class LPBoost [1], Online Multi-Class Gradient Boost [1], and Online Random Forest algorithms [2]. Online boosting is one of the most successful online learning algorithms in computer vision. While many challenging online learning problems are inherently multi-class, online boosting and its variants are only able to solve binary tasks. In this work, we present Online Multi-Class LPBoost (OMCLP) which is directly applicable to multi-class problems. From a theoretical point of view, our algorithm tries to maximize the multi-class soft-margin of the samples. In order to solve the LP problem in online settings, we perform an efficient variant of online convex programming, which is based on primal-dual gradient descent-ascent update strategies.
For machine learning and Caltech101 categorization experiments, you can directly use the following code. The tracking code is not included in this package. However, you can see in the following videos the performance of our tracker, and how tracking with virtual classes works.
Algorithms implemented:
- Online Multi-Class LPBoost
- Online Gradient Boost (with exponential, logit, and Savage loss functions)
- Online Random Forests
- Weighted Linear LaRank SVM (modifications to the original code by Antoine Bordes)
Tracking Results: our tracker is shown in red color.
Tracking with virtual classes.
Download: OMCLPBoost-0.11.tar.gz (Release date: 2010-05-06)
Author: Amir Saffari
Please read the INSTALL file for build instructions. The license is GPL V3.
Change Log:
2010-05-06: Release 0.11:
- Bug fix in RandomTest class which was previously choosing only integer thresholds (thanks to Andreas Geiger for pointing out the problem).
2010-04-18: Release 0.1
- First release.
[1] Amir Saffari, Martin Godec, Thomas Pock, Christian Leistner, and Horst Bischof, “Online Multi-Class LPBoost”, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
[2] Amir Saffari, Christian Leistner, Jakob Santner, Martin Godec, and Horst Bischof, “Online Random Forests”, in 3rd IEEE ICCV Workshop on On-line Computer Vision, 2009.
Could I see the code?
Right now I’m busy writing up my thesis, so I don’t have time to write a formal description of how to compile and use the code. But in 2-3 weeks I will have time, and then I will put the code online. Stay tuned.
Hi Amir, I have tried your MCLPBoost code, and I got the following problem:
OnlineMCBoost Classification Package:
Loading config file: conf/omcb.conf … Done.
Segmentation fault
Have you ever run into the above problem, and do you know how to deal with it? Thank you!
BTW: I thought this may be caused by some illegal points…
Hi Amir, it’s me again, I have figured it out…
Hey Zhao, Was your problem related to any bug? If so, please let me know so that I could fix it in the package.
Hi Amir, I’m sorry for coming back again after such a long time.
It’s not the problem of your package!
I had used the wrong version of the required lib…
Amir,
A tree doesn’t need normalized values, so why is the feature range used?
Say I have 10 training sample sets. This will calculate the range and set a different range for each set, even though the trees use all the samples from all sets once all 10 calls have finished:
dataset_tr1.load(hp.trainData, hp.trainLabels);
train(model, dataset_tr1, hp);
dataset_tr2.load(hp.trainData, hp.trainLabels);
train(model, dataset_tr2, hp);
….
….
dataset_tr10.load(hp.trainData, hp.trainLabels);
train(model, dataset_tr10, hp);
Hi,
The feature range is used to choose a proper random threshold. If the min-max range of a feature is not known, you might end up choosing many thresholds which put all samples either to the left or to the right.
Cheers, Amir
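A minimal sketch of the point made in the reply above, assuming a threshold is drawn uniformly from a known [min, max] interval per feature. The names `FeatureRange`, `drawThreshold`, and `fractionRight` are hypothetical and are not taken from the package; the package's own sampling code may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper type (not from the package): one [min, max] interval per feature.
struct FeatureRange {
    double min_value;
    double max_value;
};

// Draw a real-valued split threshold uniformly from the feature's range.
double drawThreshold(const FeatureRange& range, std::mt19937& rng) {
    assert(range.min_value < range.max_value);
    std::uniform_real_distribution<double> dist(range.min_value, range.max_value);
    return dist(rng);
}

// Fraction of values that a test "value > threshold" sends to the right child.
double fractionRight(const std::vector<double>& values, double threshold) {
    assert(!values.empty());
    std::size_t right = 0;
    for (double v : values) {
        if (v > threshold) {
            ++right;
        }
    }
    return static_cast<double>(right) / static_cast<double>(values.size());
}
```

With values inside [0, 1], a threshold drawn from that range usually splits them into two non-empty groups, whereas a threshold far outside the range (for example 5.0) sends every sample to the same side and makes the test useless.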
thanks Amir,
Is the example I cited above the right way to update the model as I acquire more training sets?
Hi Raja,
If you want to keep training the same model over and over again with each new dataset, yes, that would be the way to do it. In fact, if you look into the train function, it’s nothing more than a loop which updates the model sequentially with samples in a dataset.
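The "loop which updates the model sequentially" can be sketched roughly as follows. This is an illustration only: `Sample`, `DataSet`, and `CountingModel` are hypothetical stand-ins, and the `train` signature here omits the `hp` argument used in the package.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; the package's real types are defined in data.h/cpp.
struct Sample {
    std::vector<double> features;
    int label;
};

struct DataSet {
    std::vector<Sample> samples;
};

// Toy online model that only counts labels, so its state is easy to inspect.
class CountingModel {
public:
    explicit CountingModel(std::size_t numClasses) : counts_(numClasses, 0) {}

    // Consume one sample and update the internal state.
    void update(const Sample& sample) {
        assert(sample.label >= 0 &&
               static_cast<std::size_t>(sample.label) < counts_.size());
        ++counts_[static_cast<std::size_t>(sample.label)];
    }

    const std::vector<int>& counts() const { return counts_; }

private:
    std::vector<int> counts_;
};

// Training is a sequential pass that feeds every sample to update().
template <typename Model>
void train(Model& model, const DataSet& dataset) {
    for (const Sample& sample : dataset.samples) {
        model.update(sample);
    }
}
```

Calling `train` on one dataset and then on a second one leaves this toy model in the same state as a single pass over both, which is why repeated calls on new datasets keep extending the same model.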
Thanks Amir. What about the case where training samples arrive sequentially and I want to update the tree on the fly, then check a newly arrived test sample?
In that scenario, how does one update the model with a single training sample?
Sorry, continuing the question from above…
Will a simple update to the model take care of it?
Like this:
model->update(sample);
In that case, do you expose any method to convert a single-line sample into a sample object, and how do you take care of feat_range?
Hi Raja,
Yes, “model->update(sample);” is the correct way of updating the classifiers with incoming data samples. Take a look at the “experimenter.cpp” file; there you will see how training and testing over a dataset works.
If you want to see how to create a “sample” object, take a look at the “data.h/cpp” file.
Regarding the feature range: since you design the features and take care of feature extraction, you usually know a priori what their range will be, and you can give that range to the trees. Other than that, you can always give a default range like [-1,1].
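As a rough sketch of the two steps discussed above, the following turns one text line into a sample and builds a default [-1,1] range for each feature. Everything here is an assumption made for illustration: the `Sample` struct, the function names, and the line layout (`<label> <f1> <f2> ...`) are hypothetical, and the package's actual sample type and file format are defined in “data.h/cpp”.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sample type; see data.h/cpp for the package's real definition.
struct Sample {
    std::vector<double> features;
    int label;
};

// Parse one whitespace-separated line, assumed to be "<label> <f1> <f2> ...".
Sample parseSampleLine(const std::string& line) {
    std::istringstream in(line);
    Sample sample{{}, 0};
    if (!(in >> sample.label)) {
        throw std::invalid_argument("missing label");
    }
    double value = 0.0;
    while (in >> value) {
        sample.features.push_back(value);
    }
    if (sample.features.empty()) {
        throw std::invalid_argument("no feature values");
    }
    return sample;
}

// Fallback when the true range is unknown: [-1, 1] for every feature.
std::vector<std::pair<double, double>> defaultFeatureRanges(std::size_t numFeatures) {
    return std::vector<std::pair<double, double>>(numFeatures,
                                                  std::make_pair(-1.0, 1.0));
}
```

A sample built this way could then be passed to `model->update(sample);` once it has been converted to the package's own sample type. If the features are known to lie in some other interval, replacing the default pairs with that interval gives the trees more useful thresholds.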
I successfully compiled your code.
I moved some folders around, which is why you can see the “../../” stuff.
I got the following message when running it on your data:
./OnlineMLP -c ../../conf/omcb.conf --omclp --train --test
OnlineMCBoost Classification Package:
Loading config file: ../../conf/omcb.conf … Done.
Loading data file: ../../data/dna-train.data …
Loaded 1400 samples with 180 features and 3 classes.
Loading data file: ../../data/dna-test.data …
Loaded 1186 samples with 180 features and 3 classes.
Assertion failed: (index >= 0 && index < size()), function operator(), file /usr/local/include/eigen2/Eigen/src/Core/Coeffs.h, line 150.
Abort trap
Does the provided data have features that exceed the index size?
Hmm, in my case it’s working. Did you change the data in some way?
Dear Amir,
Salam,
Thanks for your reply. I hope that I can draw on your great experience in
semi-supervised learning. I need to know more about Semi-Supervised Random
Forests. Having the code from your SSRF paper would be a great help. May I ask
for the code of the SS Random Forest?
Thank you in advance.
Regards,
Jafar
This work seems quite interesting. I will contact you once I get deeper into it.
Thanks for now
For the online random forest, do you have an option to print out the variable importance during training?
Hi Rex,
No, unfortunately not. But it should be easy to add.
Cheers
Dear Amir
salam
I’m going to use LPBoost for a classification task in image annotation, but I don’t know how it works. Would you please help me with this problem? Thank you in advance.
Best regards
mahta
Dear Mahta,
To make sure I understand your question: is it that you don’t know how LPBoost works, or that you don’t know how my code works?
Cheers, Amir