Asked by Swanand Kulkarni
on 6 Dec 2016

Hello all, I have some noisy data in the form of x and y variables. I plan to use a moving average filter to get satisfactory results while staying as close as possible to the real data. I understand that a larger window size means smoother data, and hence less realistic data. Is that correct? Is a window size of 5 generally considered decent enough to establish a relationship between the variables? Any leads are highly appreciated. Thanks and regards, Swanand.

Answer by Image Analyst
on 7 Dec 2016

Edited by Image Analyst
on 28 Jan 2019

Accepted Answer

It could be. Who's to say? It's more or less a judgement call as to what amount of smoothing is best, isn't it? You could determine the sum of absolute differences (SAD) between the smoothed and the noisy signal for different window sizes and plot it. Maybe some pattern will jump out at you, like a knee in the curve.

clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 20;

% Build a clean cosine and add uniform random noise to it.
numPoints = 5000;
noiseSignal = rand(1, numPoints);
x = linspace(0, 300, numPoints);
period = 100;
cleanSignal = cos(2*pi*x / period);
noisySignal = cleanSignal + noiseSignal;

% Plot the noisy signal in the top axes.
subplot(2, 1, 1);
plot(x, noisySignal, 'b-', 'LineWidth', 2);
grid on;
xlabel('x', 'FontSize', fontSize);
ylabel('Noisy Signal', 'FontSize', fontSize);

% Smooth with a range of window sizes and record the SAD for each one.
windowSizes = 3 : 3 : 51;
sad = zeros(size(windowSizes)); % Preallocate the result vector.
for k = 1 : length(windowSizes)
    smoothedSignal = movmean(noisySignal, windowSizes(k));
    sad(k) = sum(abs(smoothedSignal - noisySignal));
end

% Plot SAD versus window size in the bottom axes.
subplot(2, 1, 2);
plot(windowSizes, sad, 'b*-', 'LineWidth', 2);
grid on;
xlabel('Window Size', 'FontSize', fontSize);
ylabel('SAD', 'FontSize', fontSize);

Pick the smallest window size where the SAD seems to start to flatten out. Going beyond that (to larger window sizes) really doesn't produce much more benefit (smoothing) and will take longer.
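If you would rather not eyeball the knee, one rough way to automate the choice is to look at how much the SAD grows from one window size to the next and stop when that growth becomes small. The following is only a sketch under assumptions: the 5% cutoff (`flatThreshold`) and the names `relativeGain`, `kneeIndex`, and `chosenWindow` are made up for illustration, and a different cutoff will give a different answer.

```matlab
% Sketch: pick the window size where the SAD curve starts to flatten.
% The 5% cutoff below is an arbitrary assumption, not a recommended value.
rng(0);                                   % Reproducible noise.
numPoints = 5000;
x = linspace(0, 300, numPoints);
noisySignal = cos(2*pi*x / 100) + rand(1, numPoints);

windowSizes = 3 : 3 : 51;
sad = zeros(size(windowSizes));
for k = 1 : length(windowSizes)
    smoothedSignal = movmean(noisySignal, windowSizes(k));
    sad(k) = sum(abs(smoothedSignal - noisySignal));
end

% Relative increase in SAD when stepping to the next larger window size.
relativeGain = diff(sad) ./ sad(1 : end-1);
flatThreshold = 0.05;                     % Hypothetical cutoff.

% First window size whose next step adds less than the cutoff.
kneeIndex = find(relativeGain < flatThreshold, 1, 'first');
if isempty(kneeIndex)
    kneeIndex = length(windowSizes);      % Curve never flattened in this range.
end
chosenWindow = windowSizes(kneeIndex);
fprintf('Chosen window size: %d\n', chosenWindow);
```

Treat the result as a starting point and still look at the smoothed signal itself before settling on a window size.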

Image Analyst
on 7 Mar 2019

You probably have periodic structures in your signal (which you forgot to attach).


Answer by Walter Roberson
on 7 Dec 2016


Answer by Siyab Khan
on 28 Jan 2019

How can we select a window size for reading a DNA sequence like

ATCGGGCTTACGG

with a window length of 5? Please share the code.

Image Analyst
on 28 Jan 2019

I don't understand the question. How is a sequence of letters analogous to a noisy numerical signal?

Let's say the window size was 5. What would you expect the output to be for that short sequence you gave? Explain why/how you got that output.

Siyab Khan
on 28 Jan 2019

Basically, I am working on the classification of DNA sequences using neural networks. I have a DNA sequence like

ATCGTGGCCAATGGTAACCG...... up to 500 or more nucleotides

I converted it to binary. Now I want my network to read a stream of 5, 10, and 15 characters, so how do I write code for it?
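One way to read this request is as a sliding window over the sequence, with each window turned into a binary feature vector. The sketch below assumes a one-hot encoding in the order `ACGT` and a step of one nucleotide between windows. Neither is stated in the thread, and the variable names (`windowIdx`, `features`, and so on) are illustrative only. It also assumes the sequence contains no letters outside `ACGT`.

```matlab
% Sketch: sliding windows over a DNA string, one-hot encoded per nucleotide.
sequence = 'ATCGTGGCCAATGGTAACCG';    % Example fragment from the post above.
alphabet = 'ACGT';                    % Assumed encoding order.
windowLength = 5;                     % Could also be 10 or 15.

numWindows = length(sequence) - windowLength + 1;

% Each row of windowIdx holds the positions covered by one window.
% Implicit expansion needs R2016b or later; use bsxfun(@plus, ...) otherwise.
windowIdx = (1 : numWindows)' + (0 : windowLength - 1);
windows = sequence(windowIdx);        % numWindows-by-windowLength char array.

% Map each letter to 1..4, then expand each code into a 4-element one-hot column.
[~, codes] = ismember(windows, alphabet);
identity4 = eye(4);
features = zeros(numWindows, 4 * windowLength);
for w = 1 : numWindows
    oneHot = identity4(:, codes(w, :));   % 4-by-windowLength.
    features(w, :) = oneHot(:)';          % Flatten into one row per window.
end

disp(windows(1:3, :));                % Show the first three windows.
```

Each row of `features` could then be fed to a network as one input sample. Changing `windowLength` to 10 or 15 gives the other window sizes mentioned.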

Image Analyst
on 28 Jan 2019


Answer by Greg Heath
on 9 Mar 2019

I'm very surprised that none of the previous responses mentioned:

1. Determine characteristic self-correlation lengths using output autocorrelation functions.

2. Determine characteristic cross-correlation lengths using input-output cross-correlation functions.

Hope this helps.

Greg
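A minimal sketch of the first suggestion, using the same synthetic signal as the accepted answer, might look like the following. It computes the autocorrelation with a plain loop so that no toolbox is required, and takes the first lag at which the normalized autocorrelation drops below 1/e as the correlation length. That 1/e cutoff, the `maxLag` bound, and the variable names are assumptions made for illustration, not something specified in this thread.

```matlab
% Sketch: estimate a correlation length from the autocorrelation function.
rng(0);                                   % Reproducible noise.
numPoints = 5000;
x = linspace(0, 300, numPoints);
noisySignal = cos(2*pi*x / 100) + rand(1, numPoints);

centered = noisySignal - mean(noisySignal);
maxLag = 1000;                            % Assumed upper bound on lags of interest.
acf = zeros(1, maxLag + 1);
for lag = 0 : maxLag
    % Sum of products of the signal with a copy of itself shifted by 'lag'.
    acf(lag + 1) = sum(centered(1 : end-lag) .* centered(1+lag : end));
end
acf = acf / acf(1);                       % Normalize so that lag 0 equals 1.

% First lag where the autocorrelation falls below 1/e (a common convention).
firstBelow = find(acf < exp(-1), 1, 'first');
if isempty(firstBelow)
    correlationLag = NaN;                 % No crossing within maxLag.
else
    correlationLag = firstBelow - 1;      % Convert index to lag (index 1 is lag 0).
end
fprintf('Correlation length: %g samples\n', correlationLag);
```

A smoothing window much shorter than this lag leaves the underlying structure largely intact, while a window comparable to it starts to average the structure away. The cross-correlation between an input and an output can be computed the same way by multiplying one signal with a shifted copy of the other.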
