Box Constraint SVM - Mistake?

Ismail Kuzu on 14 Nov 2017
Edited: Lukasz Szyc on 7 Aug 2020
Hi,
In the documentation for support vector regression, the box constraint is described as a parameter that penalizes observations lying outside the epsilon margin. A higher box constraint imposes higher costs. www.mathworks.com/help/stats/understanding-support-vector-machine-regression.html
The documentation for the Regression Learner app describes it quite differently:
"The box constraint controls the penalty imposed on observations with large residuals. A larger box constraint gives a more flexible model. A smaller value gives a more rigid model, less sensitive to overfitting." www.mathworks.com/help/stats/choose-regression-model-options.html
Shouldn't a smaller box constraint lead to a more flexible model and reduce the risk of overfitting? Did I get it wrong?

Answers (2)

Lukasz Szyc on 17 Apr 2018
Edited: Lukasz Szyc on 7 Aug 2020
Increasing the box constraint leads (or at least can lead) to fewer support vectors, as described in the documentation. It might seem counterintuitive, but the number of support vectors is not a particularly good measure of model complexity. With a small box constraint you allow more points in the margin; with an infinitely large box constraint you allow none. The smaller the box constraint, the wider the margin, and usually the more data points lie within it. All data points in the margin are always support vectors. In other words: a large box constraint means that misclassification is heavily weighted, the margin is smaller, and there are fewer support vectors, and sometimes also a more complex model and a higher overfitting risk.
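For the regression case asked about, the same effect can be read off the soft-margin primal problem. The following is only a sketch in the usual notation for linear epsilon-insensitive SVR; the symbols \beta, b, \xi_n, \xi_n^* and N are assumptions introduced here for illustration, and C stands for the box constraint:

```latex
% Soft-margin primal for linear epsilon-insensitive SVR (sketch).
% C (the box constraint) weights the total slack, i.e. the residuals beyond epsilon.
\min_{\beta,\, b,\, \xi,\, \xi^*} \quad
  \frac{1}{2}\,\beta^{\top}\beta \;+\; C \sum_{n=1}^{N} \bigl(\xi_n + \xi_n^*\bigr)

\text{subject to} \quad
  y_n - (x_n^{\top}\beta + b) \le \varepsilon + \xi_n, \qquad
  (x_n^{\top}\beta + b) - y_n \le \varepsilon + \xi_n^*, \qquad
  \xi_n,\ \xi_n^* \ge 0 .
```

With a large C, every residual beyond \varepsilon is expensive, so the optimizer accepts a larger \beta^{\top}\beta in order to follow the data closely, which gives a more flexible model. With a small C, slack is cheap and the flatness term dominates, which gives a more rigid model.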

Ahmad Obeid on 6 Feb 2019
Edited: Ahmad Obeid on 6 Feb 2019
I think that in the standard definition of the SVM cost function, the penalizing parameter, also known as the regularization parameter, multiplies the first part of the cost function (as opposed to, say, linear regression, where the second part of the cost function is multiplied). This may be the source of your confusion, as you might have thought that the two definitions are the same. For better illustration, see the two equations in the attached image. I obtained these equations from the machine learning course on Coursera, taught by Dr. Andrew.
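The attached image is not reproduced here. As a sketch of what such a comparison typically looks like, the two cost functions can be written as follows. The notation (m training examples, n parameters \theta_j, hypothesis h_\theta, and the hinge-style costs cost_1 and cost_0) is an assumption for illustration and is not taken from the image itself:

```latex
% Regularized linear regression: lambda multiplies the second (penalty) term.
J_{\text{lin}}(\theta) =
  \frac{1}{2m} \sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)^2
  \;+\; \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2

% SVM: C multiplies the first (data-fit) term instead.
J_{\text{SVM}}(\theta) =
  C \sum_{i=1}^{m} \Bigl[ y^{(i)}\, \mathrm{cost}_1\bigl(\theta^{\top} x^{(i)}\bigr)
  + \bigl(1 - y^{(i)}\bigr)\, \mathrm{cost}_0\bigl(\theta^{\top} x^{(i)}\bigr) \Bigr]
  \;+\; \frac{1}{2} \sum_{j=1}^{n} \theta_j^2
```

Written this way, C plays a role similar to 1/\lambda: a large C corresponds to weak regularization and a more flexible fit, while a small C corresponds to strong regularization and a more rigid fit.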
Hope it helps.
