
6 dodecamers, 368 trimer plots, peak width, height, valley plots

EDIT: the plot shown below is what I think describes the data. It has the mean peak height, peak width and peak valley from 6 molecules. No smoothing or blur or anything else, just the numbers. It might be as useful as anything that some algorithm can invent.  Peak width is on the x axis, as the mean for the six molecules. The number of peaks per hexamer (15) and per trimer (8) was determined early on from signal and image processing data.

The number of peaks from each program depended upon various parameters (lag, threshold, influence, smoothing, and many I don’t understand), but the separation of the signal and image processing graphs into the 8 peaks (color coded) was performed by me, which, in my humble opinion, is just as good as, if not more “learned” than, any AI app.

The separation of each hexamer (or trimer) into peaks is reasonably consistent in terms of peak width, height and valleys.  Thus I have means for all tracings and means for individual molecules, plus SD for widths, heights, and valleys, all of which can be given in the table form shown below.
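A minimal sketch of how those three levels of summary (per molecule, all tracings pooled, and mean of the molecule means) could be computed. The molecule names and numbers are made-up placeholders, not my measurements, and `summarize` is just a helper name I picked for the example.

```python
from statistics import mean, stdev

# Hypothetical peak-height values (grayscale 0-255), keyed by molecule.
# Placeholder numbers only, not data from the SP-D plots.
heights_by_molecule = {
    "dodecamer_A": [240, 236, 244],
    "dodecamer_B": [230, 234, 232],
}

def summarize(values):
    """Return (mean, sample SD, n) for one list of measurements."""
    sd = stdev(values) if len(values) > 1 else 0.0
    return mean(values), sd, len(values)

# One summary per molecule.
per_molecule = {name: summarize(v) for name, v in heights_by_molecule.items()}

# Summary over every tracing, ignoring which molecule it came from.
all_tracings = summarize([x for v in heights_by_molecule.values() for x in v])

# Mean of the per-molecule means (n = number of molecules).
mean_of_means = summarize([m for m, _, _ in per_molecule.values()])

for name, (m, sd, n) in per_molecule.items():
    print(f"{name}: mean={m:.1f} SD={sd:.1f} n={n}")
print(f"all tracings: mean={all_tracings[0]:.1f} n={all_tracings[2]}")
print(f"mean of means: mean={mean_of_means[0]:.1f} n={mean_of_means[2]}")
```

The same function would work for widths and valleys; only the input lists change.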

Means of all plots, and of individual trimers and dodecamers, can be shown in the style of the graph below, with the SD of each parameter. The top image here is just a quick graphic of what that kind of plot would look like; its values are close to the actual numbers below but not exact, as this is a draft.  Peak width is in nm; peak height and valley are in grayscale (0-255).

Anyway, it is the format that I will use for collecting data on the remainder of the SP-D molecules.

Other options that I worked with for plots are below –

I have looked extensively for peak width, valley, height plot apps and cannot find one that works for me. This doesn’t mean they don’t exist, but I have not made the effort to get on the chats in scipy and octave to find them.  The basic set of numbers is super simple, so there is the possibility that I have not collected them in a way that is useful for making automatic plots.  They are perfectly useful for constructing a plot using a graphics program, however.

Height and valley values are in grayscale (0-255). Width has variable measures (pixels, inches, cm) and is not consistent, so it is changed to percent (left column).
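A small sketch of the percent conversion, assuming "percent" means each peak's share of the summed peak widths for one plot (that reading is my assumption here). The function name and the sample widths are hypothetical.

```python
def widths_to_percent(widths):
    """Convert peak widths in any single unit to percent of their sum."""
    total = sum(widths)
    if total <= 0:
        raise ValueError("widths must sum to a positive number")
    return [100.0 * w / total for w in widths]

# The same hypothetical trimer measured in pixels and in cm:
# the unit drops out once the widths are expressed as percent.
widths_px = [30, 10, 20, 40]
widths_cm = [3.0, 1.0, 2.0, 4.0]

print(widths_to_percent(widths_px))
print(widths_to_percent(widths_cm))
```

Because the unit cancels, plots measured in pixels, inches or cm can go into the same table as long as each plot is converted on its own.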

Basic numbers for the width, peak height, and peak valley (data from the valley closest to the N term side) are here. Help is certainly welcome. The data below were accumulated from image and signal processing on hundreds of plots of the AFM images of surfactant protein D. Previously it was determined that the mean number of peaks per hexamer was 15. In counting the trimer peaks, the N term center peak gets counted once for each trimer (8 peaks per trimer), but only once per hexamer. A bilaterally symmetrical hexamer is comprised of two trimers whose N termini blend into a single very bright peak, so the number of peaks per hexamer is an odd number (8 + 8 - 1 = 15).

An example of a real plot of an SP-D molecule as a hexamer is at top. Below that is the same plot trimmed, keeping the N term peak (light orange) whole, not dividing it in half (one part for each trimer).

The trimer plots below were assembled in various ways, with various problems and various programs (but mainly excel and corelDRAW). From top to bottom, beginning with the N term composite peak, pinkish peach (peak 1); tiny peak, purple (peak 2); glycosylation peak, blue-green (peak 3); peak 4, darker green; narrow peak 5, pink; unknown peaks, white; coiled coil neck domain, yellow, seen intermittently, and not seen when it is likely to be behind the yellow; and the last peak, which is the CRD peak.

I don’t think this is rocket science. I just need to find the right program, and certainly the data are consistent with the numbers in each case, just not “pretty plots”. In the case of Peak Valley Detection Template xlsx, there was no value between 1 and the next highest smoothing function (3) that would do a better job of keeping the peaks but smoothing the corners.   So this is a “taste” thing, not important.

So the issue becomes how better to collect the data.

Here is a cute thing, actually not so funny: I plugged the summary plot .csv file into octave, hoping to smooth the plot, and found that the corners of the line plot count as extra peaks.  Clearly, this plot has 8 peaks, not 12, and it didn’t bother counting the “tiny peak” (purple in above plots) or peak 5 (pink in above plots).
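One way around the corner problem, sketched in Python rather than octave: count a flat top as a single peak, and ignore bumps that do not stand far enough above the neighboring valleys. This is not what octave does internally; `count_peaks`, `min_drop` and the trace values are hypothetical, chosen only to illustrate the idea.

```python
def count_peaks(y, min_drop=0.0):
    """Return the indices of local maxima in y.

    A flat top (plateau) counts as one peak, reported at its midpoint.
    A peak is kept only if it stands at least min_drop above the higher
    of the two valleys that separate it from taller neighbours.
    """
    n = len(y)
    peaks = []
    i = 1
    while i < n - 1:
        if y[i] <= y[i - 1]:
            i += 1
            continue
        # Rising edge found: walk across any run of equal values.
        j = i
        while j < n - 1 and y[j + 1] == y[i]:
            j += 1
        if j < n - 1 and y[j + 1] < y[i]:
            # Lowest point on each side before the trace exceeds this peak.
            left = y[i]
            k = i - 1
            while k >= 0 and y[k] <= y[i]:
                left = min(left, y[k])
                k -= 1
            right = y[i]
            k = j + 1
            while k < n and y[k] <= y[i]:
                right = min(right, y[k])
                k += 1
            if y[i] - max(left, right) >= min_drop:
                peaks.append((i + j) // 2)
        i = j + 1
    return peaks

# Made-up trace with two flat tops and one small bump.
trace = [0, 5, 5, 3, 4, 3, 9, 9, 9, 2, 0]
print(count_peaks(trace))              # every local maximum
print(count_peaks(trace, min_drop=2))  # small bump dropped
```

The catch is the same one as with the tiny peak and peak 5: set `min_drop` too high and real but small peaks disappear along with the artifacts.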

I found a link to an online converter of svg to matrix called Coordinator.  I put in an actual plot (see top image below) and used this open source app to create a plot.  It was not exactly what I had expected (smiley face below). For a couple of years I had been thinking that I would really like to use the graphics flexibility of corelDRAW on the excel plots, then convert the vector graphics back into a matrix. It didn’t work that well the first time…???

Just for comparison, here are plots of this same molecule from several years ago, before signal processing was in the picture. The number of peaks per hexamer was 11; the additional 4 peaks (two pairs) are not present 100%, or even 60%, of the time, but show up consistently enough to be considered something to work out.

Six dodecamers: SP-D – peak height (peak 4)

Only 3 of the 368 total peak values for peak 5 were absent or not detected. I still calculated the numbers with and without the zeros. There was very little difference either way.
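A sketch of the with-and-without-zeros comparison. The list below is a made-up stand-in with 365 detected values and 3 zeros, not the real peak heights, and the function name is hypothetical.

```python
from statistics import mean, stdev

def summary_with_and_without_zeros(values):
    """Summarize a peak measurement twice: zeros kept, then zeros dropped.

    A zero here stands for a peak that was absent or not detected.
    Each entry is (mean, sample SD, n).
    """
    nonzero = [v for v in values if v != 0]
    return {
        "with_zeros": (mean(values), stdev(values), len(values)),
        "without_zeros": (mean(nonzero), stdev(nonzero), len(nonzero)),
    }

# Placeholder data: 365 detected heights plus 3 undetected peaks (368 total).
heights = [148, 152] * 182 + [150] + [0] * 3

result = summary_with_and_without_zeros(heights)
for label, (m, sd, n) in result.items():
    print(f"{label}: mean={m:.2f} SD={sd:.2f} n={n}")
```

With so few zeros the mean barely moves, although the SD is more sensitive to them than the mean is.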

Top graphic is the total count, and the mean for the individual dodecamers, n=6. Again, not a big difference, but clearly the n=6 gives the best values for variance and skew.  Bottom three graphics are the individual values for each of the six dodecamers for peak 5 height.

Values for the first four dodecamers are posted here.

Tiff image from Gwyddion as R, indexed to tiff RGB in ImageJ, grayscale plots: same x but different y

Dilemma:  How do I use the grayscale plots of SP-D hexamers obtained in ImageJ from images exported from Gwyddion (as red only) in the same datasets as ImageJ plots of SP-D hexamers obtained from tiff files exported as RGB???  I have left all those plots from Gwyddion out of my peak height and valley analysis because I did not know how to use them. They have very low grayscale peak points and can’t be used along with those which have a highest peak grayscale value (RGB) of around 250.  Peaks at about 90 (0-255) for one type of plot and at about 250 for the other just don’t work together, and I hate to ignore the plots from Gwyddion (as it has good “limit range” and “gaussian blur” filters).

Seeing whether I could safely adjust the plots required figuring out, in ImageJ, how to save a segmented line drawn in an image saved in R only, recall it, and use it on an identical image saved in RGB.  So I did this, and while the two plots are not totally “identical”, as I moved one node at the right end, they are almost identical. Each grayscale line plotted in ImageJ for the R and RGB image was saved to excel. (I wish I could figure out how to create a standard plot template in excel, because even if I choose a 0 to 255 scale in ImageJ, excel does what it wants with the y axis and I have to rescale it.)

Below are images of the identical SP-D image (named 127, aka supplement 4A) exported from Gwyddion (gw) as a red tiff and plotted, then changed to indexed RGB in ImageJ and plotted again using the same restored segmented plot line.  Both plots were saved and opened in excel, and a chart was created. Those two plots were saved as metafiles, pasted into corelDRAW and ungrouped. The line from the R only plot was scaled (without rescaling the x axis) to the same height as the plot from the RGB image and then moved to the bottom right of the RGB plot.

As you can see, there is no difference between the R and RGB plots.  So this means to me that I can take my gw plots, scale them on the y axis, and use them in my dataset with the RGB plots.  Any issues that I am missing that say “don’t do this”?

Top two images are the images with segmented plot lines created (and saved) in ImageJ. Bottom image is the two plots, with the lower plot scaled on the y axis (only) and pasted into the RGB plot. So the difference in grayscale peaks can be scaled using a formula.

Indeed it would be almost laughable if it just requires scaling to 300 percent. I think I would need to find the grayscale value for the highest peak of each image in order to align the R plot to a value.

Thanks to my kids, Dan and Aaron. Dan was right in seeing that the max grayscale value of the R image was 85.  I will use a factor of 3 when analyzing the other gw plots to find a grayscale comparable to the RGB plots.
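The factor of 3 is just 255 / 85. A minimal sketch of applying it to a plot exported from ImageJ, assuming the R only image really tops out at 85 (the function name and sample profile are hypothetical):

```python
def rescale_r_profile(profile, source_max=85, target_max=255):
    """Scale grayscale values from an R-only plot to the 0-255 RGB range.

    Only the y values change; the x positions along the segmented line
    are untouched. Any value above source_max would land above target_max.
    """
    factor = target_max / source_max  # 255 / 85 = 3.0
    return [v * factor for v in profile]

# Made-up R-only profile values along a segmented line.
r_profile = [0, 30, 85]
print(rescale_r_profile(r_profile))
```

If another gw image has a different maximum, `source_max` would need to be read from that image rather than assumed to be 85.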


Six dodecamers: SP-D – tiny peak height

The tiny peak (peak two, in each trimer) is located (about half the time) on the down-slope of the N termini junction peak (peak 1) and just before the glycosylation peak (peak 6). It is detected by many of the signal processing programs; at this point it is an unreported peak in SP-D.  Means, SD, median, etc. are listed below for each individual dodecamer of this set of 6, and for the group of 6. The MO is the same as for reporting on other peak widths, heights, and valleys. For the first set of 4 dodecamers the value was 134.7 +/- 19.5 (grayscale 0-255), and all data for that set is here.


Six dodecamers: SP-D – N termini junction peak height

Same procedure: N termini junction peak height for a selected data set (n=368 trimer plots), and for the mean of the select 6 dodecamers (all plots of each of the dodecamer images of SP-D). The grayscale values of the individual images do make a difference in the peak heights, which at some point I will try to normalize. But for now, the N termini peak has always been, and continues to be, the highest peak in the plots, typically close to center (there are very rare exceptions which can be attributed to overlapping molecules in an image).

With an extensive number of plots of the first four dodecamers, the grayscale peak was 238.6 +/- 3.75, so the fourth and fifth dodecamers added did change this number slightly. Clearly the values are influenced by the contrast of each image.  I think a “limit range” function (which can be found in Gwyddion and likely other programs for image analysis) could be useful. I am more interested in a quick normalizing of peak height and the lowest point in the plot using a graphics program.
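A sketch of what a quick per-plot normalization could look like: stretch each plot so its lowest point maps to 0 and its highest peak maps to 255. This is plain min-max scaling, which I am assuming is a reasonable stand-in; it is not the Gwyddion “limit range” function, and the function name and profile values are hypothetical.

```python
def normalize_profile(profile, new_min=0.0, new_max=255.0):
    """Rescale one grayscale plot so min -> new_min and max -> new_max."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        raise ValueError("a flat profile cannot be normalized")
    span = new_max - new_min
    return [new_min + (v - lo) / (hi - lo) * span for v in profile]

# Made-up low-contrast plot: lowest point 40, highest peak 240.
profile = [40, 90, 140, 240]
print(normalize_profile(profile))
```

After this step every plot has the same lowest and highest value, so differences in image contrast no longer show up in the N termini peak height; the trade-off is that the highest peak then carries no information of its own.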

(I continue to thank Arroyo et al., Thomas O’Haver, and my two sons, Aaron and Dan, for help with image and signal processing apps.)