Spectral Deconvolution using Bayesian Information Criteria and Gaussian Peak Shapes

This is a problem that has been dealt with in part by many codes, but I am having trouble implementing the specific solution I need.
I have a continuous (x, y) dataset of UV-Vis absorption data for a compound. This convolved (macroscopic/classical) observable results from one or more underlying Gaussian-type functions.
What I would like to do is use a probabilistic method to find the most likely number of Gaussian peaks, together with the position and intensity of each peak underlying the continuous spectrum.
We have an old code in R that uses the MClust library, but I would like to use the Optimization Toolbox in MATLAB to find a better way of performing this task.
Thanks in advance for your ideas and help.
This is a crude figure to represent the general idea (with improper scaling)

Accepted Answer

Image Analyst on 16 Oct 2015
If you have the Statistics and Machine Learning Toolbox, use fitgmdist:
fitgmdist
Fit Gaussian mixture distribution to data
Syntax
GMModel = fitgmdist(X,k)
GMModel = fitgmdist(X,k,Name,Value)
Description
GMModel = fitgmdist(X,k) returns a Gaussian mixture distribution model (GMModel) with k components fitted to data (X).
GMModel = fitgmdist(X,k,Name,Value) returns a Gaussian mixture distribution model with additional options specified by one or more Name,Value pair arguments.
For example, you can specify a regularization value or the covariance type.
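The question asks for the most likely number of peaks, and a model returned by fitgmdist carries a BIC property that can be compared across values of k. The following is a minimal sketch of that idea, not a definitive recipe. The data X are synthetic 1-D samples, and maxK is a hypothetical upper bound chosen for illustration.

```matlab
% Synthetic 1-D samples drawn from two Gaussians (illustrative data only).
rng(0);                                    % reproducible draws
X = [randn(500,1)*0.5 + 2; randn(500,1)*0.8 + 6];

maxK   = 4;                                % hypothetical upper bound on components
bic    = inf(maxK, 1);
models = cell(maxK, 1);
for k = 1:maxK
    models{k} = fitgmdist(X, k, ...
        'Replicates', 5, ...               % restart EM several times
        'RegularizationValue', 1e-6, ...   % keep variances well conditioned
        'Options', statset('MaxIter', 1000));
    bic(k) = models{k}.BIC;                % Bayesian information criterion
end

[~, bestK] = min(bic);                     % the lowest BIC is preferred
best    = models{bestK};
centers = best.mu;                         % component means
widths  = sqrt(squeeze(best.Sigma));       % component standard deviations
weights = best.ComponentProportion(:);     % mixing proportions
```

Note that fitgmdist expects X to hold samples of a random variable (one observation per row), not the (x, y) points of a curve.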
  2 Comments
Soren on 16 Oct 2015
Thank you for the quick help. I knew MATLAB had just what I needed in there somewhere. It has been hard to find the right approach, given that almost all of the examples I had found were not entirely applicable to my case. I think fitgmdist will be a good solution.
Soren on 6 Dec 2015
I have been working with fitgmdist and related tools for some time since I originally posed this question. I have some concerns about the dimensionality of my dataset and how MATLAB handles it.
Scenario 1: I have a single x,y dataset consisting of n rows with two columns, energy and absorbance. In this case, I believe passing X = [energy,absorbance] will cause fitgmdist to fit a 2-D GMM to the data. However, a variance in absorbance has no real meaning; only the x-dimension holds data that is a true Gaussian mixture. So the model fits are weighted by their probability with respect to x (which would give our desired peak centers) and y (which is really meaningless in this case). If I instead create a true 1-D dataset by sampling the values of y at even intervals, the fit works well, BUT I lose all trace of the positions of the mu values it finds. That is, if it finds a center at absorbance = 1, many values of x can correspond to it, so the position information is lost.
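One possible way around the lost positions in Scenario 1 is to treat absorbance as a weight on the energy axis rather than as a second data dimension: draw energy values with probability proportional to absorbance and fit a 1-D mixture to those draws. This is only a sketch under stated assumptions. The spectrum is synthetic, the number of components is fixed at 2, nSamples is a hypothetical choice, and the baseline is assumed to be zero.

```matlab
rng(1);                                              % reproducible draws
energy = linspace(1, 6, 400)';                       % illustrative energy axis
gauss  = @(x, a, mu, s) a .* exp(-(x - mu).^2 ./ (2*s^2));
absorbance = gauss(energy, 1.0, 2.5, 0.30) + gauss(energy, 0.6, 4.0, 0.40);

% Treat the non-negative spectrum as an unnormalized density over energy.
w        = max(absorbance, 0);
nSamples = 20000;                                    % hypothetical sample size
idx      = randsample(numel(energy), nSamples, true, w);  % weighted draw
dE       = energy(2) - energy(1);
samples  = energy(idx(:)) + (rand(nSamples,1) - 0.5) * dE; % jitter within a bin

gm = fitgmdist(samples, 2, 'Replicates', 5);

[mu, order] = sort(gm.mu);                           % centers, in energy units
sigma = sqrt(squeeze(gm.Sigma));
sigma = sigma(order);
prop  = gm.ComponentProportion(:);
prop  = prop(order);

% Convert mixing proportions back to peak heights using the spectrum's area.
area   = trapz(energy, w);
height = prop .* area ./ (sigma .* sqrt(2*pi));
```

Because the samples live on the energy axis, gm.mu comes back in energy units and the peak heights can be rebuilt from the mixing proportions and the integrated area.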
Scenario 2: I have error data for y and feed it to fitgmdist by generating, for each value of y, a random Gaussian sample with our experimental sigma.
Both of these are causing problems.
So, I am out of ideas on how to effectively implement either a 2-D fit [energy, Gaussian spread of absorbance values based on the experimental SD] or a 1-D fit where the data are evenly spaced samples of absorbance. In the latter case, I believe I get good fits, but the index of the proposed values of mu is lost and I cannot definitively reconstruct the peaks.
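An alternative that keeps energy and absorbance paired is to fit a sum of Gaussians to the curve directly by nonlinear least squares and compare peak counts with a BIC of the form n*log(RSS/n) + p*log(n), the usual expression for Gaussian errors of unknown variance (up to an additive constant). The sketch below assumes the Optimization Toolbox (lsqcurvefit) and R2016b or later for implicit expansion. The data, noise level, starting guesses, bounds, and maxK are all hypothetical.

```matlab
rng(2);                                              % reproducible noise
energy = linspace(1, 6, 300)';
% Parameters are packed as a row vector [a1 mu1 s1 a2 mu2 s2 ...].
model = @(p, x) sum(p(1:3:end) .* ...
    exp(-(x - p(2:3:end)).^2 ./ (2 * p(3:3:end).^2)), 2);

truth      = [1.0 2.5 0.30, 0.6 4.0 0.40];           % synthetic ground truth
absorbance = model(truth, energy) + 0.01 * randn(size(energy));

n     = numel(energy);
span  = max(energy) - min(energy);
maxK  = 3;                                           % hypothetical upper bound
bic   = inf(maxK, 1);
fits  = cell(maxK, 1);
opts  = optimoptions('lsqcurvefit', 'Display', 'off');

for K = 1:maxK
    mu0 = linspace(min(energy), max(energy), K + 2); % evenly spread centers
    mu0 = mu0(2:end-1);
    p0  = reshape([max(absorbance) * ones(1, K); mu0; ...
                   span / (4*K) * ones(1, K)], 1, []);
    lb  = repmat([0,   min(energy), 1e-3], 1, K);
    ub  = repmat([Inf, max(energy), span], 1, K);
    [fits{K}, rss] = lsqcurvefit(model, p0, energy, absorbance, lb, ub, opts);
    bic(K) = n * log(rss / n) + numel(p0) * log(n);  % least-squares BIC
end

[~, bestK] = min(bic);
P = sortrows(reshape(fits{bestK}, 3, [])', 2);       % rows: [height center width]
```

If per-point standard deviations are available, one option is to divide both the model output and the data by them before fitting, which turns this into a weighted least-squares fit.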
