R^2 of my linear model changes every time I change the input, but the non-linear model gives the same R^2 regardless of the input

Question

Sam on 29 Apr. 2023

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/1955034-linear-model-of-my-code-and-the-r-2-answer-change-every-time-i-change-the-input-but-the-non-linear

Commented: the cyclist on 30 Apr. 2023

The R^2 of my linear model changes every time I change the input threshold, but the non-linear model gives me the same R^2 even when I change the input. I followed all the lecture notes but can't figure out my error. Is anyone able to spot my mistake? Below is all my code, which my lecturer told me is correct, but now that it is the weekend I don't have access to him. I am unsure whether I need to retrain my model for the non-linear model, given that this has already been done for the linear model.

% Load data from Excel file
DATA = readmatrix('Concrete_Data.xlsx');
% Compute mean and standard deviation for each column of data
M = mean(DATA);
S = std(DATA);
% Create labels for each variable
text = ["Cement (kgm^3)", "Blast Furnace Slag (kgm^3)", "Fly Ash (kgm^3)","Water (kgm^3)", "Superplasticizer (kgm^3)", "Coarse Aggregate (kgm^3)", "Fine Aggregate (kgm^3)", "Age (day)"];
 
% Plot scatter plots and compute R-squared values for each variable
r_squared = zeros(1,8); % pre-allocate array to store R-squared values
figure;
for i = 1:8
    subplot(2,4,i);
    scatter(DATA(:,i), DATA(:,9),'filled');
    title(sprintf('Average = %5.2f\n Standard Deviation = %5.2f',M(i), S(i)));
    xlabel(text(i));
    ylabel("Concrete Compressive Strength (MPa)");
    box on ; grid on ;
    hold on;
    x = DATA(:,i);
    y = DATA(:,9);
    p = polyfit(x, y, 1);
    yfit = polyval(p,x);
    yresid = y - yfit;
    SSresid = sum(yresid.^2);
    SStotal = (length(y)-1)*var(y);
    r_squared(i) = 1 - SSresid/SStotal;
    rsq = r_squared(i);
    fprintf('R-squared for %s: %5.2f\n', text(i), rsq);
    hold off;
end
input_threshold = input('Enter R-Squared threshold: ');
variable_names = [];
t = 0;
for i = 1:8
    if r_squared(i) > input_threshold
        t = t + 1;
        significant_data(:,t) = DATA(:,i);
        variable_names = [variable_names; text(i)];
    end
end
fprintf('Variables with R-Squared values above:\n');
disp(variable_names);
significant_data(:,t+1) = DATA(:,9);
rng(1);   
cv = cvpartition(length(significant_data),'HoldOut', 0.3);
training_DATA = significant_data(cv.training,:);
testing_DATA = significant_data(cv.test,:);
model = fitlm(training_DATA(:,1:end-1), training_DATA(:,end))
predictions = predict(model, testing_DATA(:,1:end-1));
nlm = @(b,x)b(1)+b(2).*x(:,1).^2+b(3).*x(:,1);
Non_linear_model = fitnlm(training_DATA(:,1:end-1), training_DATA(:,end), nlm,[1 1 1])
Non_linear_predictions = predict(Non_linear_model, testing_DATA(:,1:end-1));

5 Comments (3 older comments not shown)

the cyclist on 29 Apr. 2023

After having taken just a quick look (and not really answering your question), I'll mention one other general thing.

When one refers to a "linear" model, that means that it is linear in the parameters, not the terms. Having a term that is x1^2 does not make the model non-linear. That model is linear in the parameters (b1, b2, b3).

So, you can ignore what I said in my prior comment, about R^2 not being useful.

Also, there is no reason to use fitnlm to fit that model. You can use fitlm just fine. (I believe you should get the same coefficients, within perhaps some roundoff error.)

I don't know if that will help explain the problem you are seeing, which I have not tried to investigate yet.
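
To illustrate the point about a squared term not making the model non-linear, here is a minimal sketch on synthetic data. The data and the names x, y, A, b_ls and b_poly are made up for illustration and are not from the thread. It fits b1 + b2*x^2 + b3*x with a single linear least-squares solve and compares the result with polyfit.

```matlab
% Synthetic data, made up for illustration only (not the concrete data set)
rng(1);
x = linspace(0, 10, 50)';
y = 2 + 0.5*x.^2 - 3*x + 0.1*randn(size(x));

% Design matrix for y = b(1) + b(2)*x.^2 + b(3)*x.
% The model is linear in b, so one linear least-squares solve is enough.
A = [ones(size(x)), x.^2, x];
b_ls = A \ y;

% polyfit returns coefficients in descending powers: [x^2, x, constant]
p = polyfit(x, y, 2);
b_poly = [p(3); p(1); p(2)];

disp(max(abs(b_ls - b_poly)));   % difference should be at round-off level
```

With the Statistics and Machine Learning Toolbox, fitlm(x, y, 'purequadratic') should fit the same intercept, linear and squared terms, which is the route suggested in the comment above.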

Sam on 29 Apr. 2023

My knowledge of MATLAB is very poor, so that’s why I’m posting in this forum. The only reason I used fitnlm is that this is what we were taught in lectures.

Answer 1

the cyclist on 29 Apr. 2023

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/1955034-linear-model-of-my-code-and-the-r-2-answer-change-every-time-i-change-the-input-but-the-non-linear#answer_1225994

Here is why you always get the same result for your "non-linear" model.

Notice that your first variable (Cement) is the one with the highest R^2. Therefore, any threshold that is low enough to include any variables is going to include Cement, and it is going to be your first column.

Your "non-linear" model only looks at the first column of input data, because you specify it like this:

nlm = @(b,x)b(1)+b(2).*x(:,1).^2+b(3).*x(:,1);

Notice you only use x(:,1).

Therefore, the model you estimate using fitnlm will only ever use the Cement variable, and you will always get the same result.
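
A quick way to see this, as a sketch with made-up numbers (X1, X3 and b are hypothetical and are not the concrete data): evaluating the same model function on a one-column input and on a three-column input that shares the first column gives identical output, so the extra columns cannot affect the fit.

```matlab
% Model function exactly as in the question
nlm = @(b,x) b(1) + b(2).*x(:,1).^2 + b(3).*x(:,1);

% Hypothetical inputs: same first column, with and without extra columns
rng(1);
X1 = rand(5, 1);               % only the first selected variable
X3 = [X1, rand(5, 2)];         % the same variable plus two more columns
b  = [1 2 3];                  % arbitrary coefficients

% The extra columns are never read, so the predictions are identical
same = isequal(nlm(b, X1), nlm(b, X3));
disp(same);
```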

4 Comments (2 older comments not shown)

Sam on 30 Apr. 2023

I have now uploaded the brief as well, so hopefully that makes it clearer.

the cyclist on 30 Apr. 2023

I think there are two tricky aspects to answering #4. One is conceptual, and one is technical.

The conceptual issue is that it is very unclear what non-linear model to try here. Looking at your nice subplots, there seems to be almost no dependence of compressive strength on any variable, other than the slight correlation with cement. I can't really offer any suggestions there.

The technical challenge is that if you don't know how many explanatory variables are going into the model, it is difficult to write the model formula. You wrote

nlm = @(b,x)b(1)+b(2).*x(:,1).^2+b(3).*x(:,1);

which is fine if there is only one variable. But if there are two variables, then you maybe want

nlm = @(b,x) b(1) + b(2).*x(:,1).^2 + b(3).*x(:,1) ...
                  + b(4).*x(:,2).^2 + b(5).*x(:,2);

and for three variables you could do

nlm = @(b,x) b(1) + b(2).*x(:,1).^2 + b(3).*x(:,1) ...
                  + b(4).*x(:,2).^2 + b(5).*x(:,2) ...
                  + b(6).*x(:,3).^2 + b(7).*x(:,3) ...
                  ;

and so on. So, you would need a list of if statements that chooses the correct model formula, based on the number of selected variables. Maybe there is another way, but I can't think of one.
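
One possible way around a chain of if statements, offered as a sketch rather than something from the thread: build the quadratic terms for however many columns x has inside a single anonymous function. The names nlm_general, nlm2, X and max_diff are hypothetical. The coefficient order matches the formulas above (intercept, then the squared and linear term for each column in turn).

```matlab
% Works for any number of columns p in x; b needs 1 + 2*p entries.
% reshape interleaves the columns as [x1.^2, x1, x2.^2, x2, ...].
nlm_general = @(b,x) [ones(size(x,1),1), ...
                      reshape([x.^2; x], size(x,1), [])] * b(:);

% Check against the explicit two-variable formula on made-up inputs
nlm2 = @(b,x) b(1) + b(2).*x(:,1).^2 + b(3).*x(:,1) ...
                   + b(4).*x(:,2).^2 + b(5).*x(:,2);
rng(1);
X = rand(6, 2);
b = [1 2 3 4 5];
max_diff = max(abs(nlm_general(b, X) - nlm2(b, X)));
disp(max_diff);   % should be zero up to round-off
```

If this were passed to fitnlm, the start vector would need a matching length: with p = size(training_DATA,2) - 1 predictors, that would be ones(1, 1 + 2*p).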
