Category Archives: surfactant proteins A and D

Six dodecamers: SP-D – an analysis of shape, peaks and size, AFM images – peaks numbers


Previous signal and image processing programs have shown that the mean number of peaks per trimer is about 8. That figure counts the N peak twice (once per trimer), since in AFM images the N peak only infrequently shows a decrease in grayscale at its center (sometimes it is very apparent, just not consistently apparent). The N peak is thus called a junction, as it is somehow the joining of the N term domains of two trimers in the case of the hexamer, and four trimers in the case of the dodecamer. Counting the shared N peak once, the number was 15 peaks across a hexamer.

Using that 15 as a guideline, and the signal processing peak finding apps mentioned many times in this blog, I have sorted the peaks into 8 general appearance-type peaks. These are not individual peaks in particular, but the ups and downs of the valleys and peaks in harmony. The N term junction peak is present in 100% of the grayscale plots (a given). Next, the glycosylation peak (peak 3), which is a prominent peak laterally on both sides of the N term peak in a hexamer, is also present 100% of the time. What I call peak 4 is a rolling but still prominent peak lateral to the glycosylation peaks; it is often wider than the other peaks and also has a higher number of “peak-divisions” than any other set of peaks. Peak 5 is usually thin, not prominent, and appears just 72% of the time, depending upon how “smoothed” the signal processing app is. Peak 6 is present frequently (measured above 94% of the time), is low and not too broad, and is isolated as a separate peak to justify the division number of 15 peaks that signal processing has defined. Peak 7 is pretty clearly the neck region. It is not present all the time because the CRD domain often just flops over and covers it, and it is thin and quite close beside the CRD peak. Peak 8 is the CRD peak(s) domain and can present with up to 5 smaller peaks within what is pretty surely the elevated grayscale values at either end of the hexamer.
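The "present X% of the time" numbers above are just detection frequencies across plots. A minimal sketch of that tally follows, assuming each plot is recorded as the list of peak types found in it. The function name, the integer peak labels, and the example plots are all made up for illustration and are not the real dataset.

```python
from collections import Counter

def detection_frequency(plots, peak_labels):
    """Return the percentage of plots in which each peak label was detected."""
    if not plots:
        raise ValueError("need at least one plot")
    counts = Counter()
    for detected in plots:
        # set() so a label listed twice in one plot is only counted once
        counts.update(set(detected) & set(peak_labels))
    return {label: 100.0 * counts[label] / len(plots) for label in peak_labels}

# Hypothetical example: each inner list holds the peak types found in one trimer plot.
example_plots = [
    [1, 3, 4, 5, 6, 7, 8],
    [1, 3, 4, 6, 8],
    [1, 3, 4, 5, 6, 8],
    [1, 3, 4, 6, 7, 8],
]
labels = [1, 3, 4, 5, 6, 7, 8]
freq = detection_frequency(example_plots, labels)
for label in labels:
    print(f"peak {label}: {freq[label]:.0f}% of plots")
```

With the real plots in place of `example_plots`, the same tally would give the per-peak percentages quoted above.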

Without question, the type of signal processing, and the settings within those programs, influence the way peaks are counted. I will at some point pick out my favorite signal processing programs and use them to recreate the dataset. Currently I like the PeakValleyDetectionTemplate.xlsx (Tom O’Haver), and Octave’s AFPPxy. These still are not as good as my own peak detection … ha ha… which I call “educated machine vision”.
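To make the point about settings concrete, here is a small sketch of how smoothing alone changes a peak count. This is not how PeakValleyDetectionTemplate.xlsx or AFPPxy work internally; it is a plain moving average plus a strict local-maximum test, and the trace is a made-up grayscale profile.

```python
def moving_average(values, window):
    """Smooth with a centered moving average; the window shrinks at the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def count_peaks(values, min_height=0):
    """Count strict local maxima whose value is at least min_height."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1] and values[i] >= min_height
    )

# Hypothetical grayscale trace (0-255): three broad peaks, each with a small dip on top.
trace = [10, 40, 90, 80, 95, 50, 20, 60, 120, 110, 125,
         70, 30, 80, 150, 140, 155, 60, 10]

raw_count = count_peaks(trace)                          # every small dip splits a peak
smoothed_count = count_peaks(moving_average(trace, 3))  # dips are averaged away
print(raw_count, smoothed_count)
```

On this trace the unsmoothed count is 6 and the 3-point smoothed count is 3, so the same profile gives two different peak numbers depending on one setting.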

Six dodecamers: SP-D – an analysis of shape, peaks and size, AFM images – trimer length

These trimers are not normalized in size (yet), so these values are for each of the four trimers in a dodecamer. Each measurement begins at the far side of the N term peak and moves laterally to the end of the CRD peak, which means the N term peak is measured as a whole for each of the trimers. In addition, when we get to peak width, the valley point is on the side of each peak which is proximal to the N term (maybe a better term is medial to the N term), so the peak valleys are found in mirror images of the trimers, running from N term to CRD in each direction. Here is a link to easy measurements (diameter) of dozens of SP-D molecules (Arroyo et al, cover image). Little has changed between 2019, when I began working on the size variation of SP-D hexamers, and what is seen now in 2023 (btw, the simplicity of that approach is really striking and the image is pretty nice too). It is really hard to rationalize the 3 to 4 hours required to make that image against the 3 or 4 years it is taking to validate those values in an “unbiased” (tongue in cheek) way. There is no unbiased observation here, just call it science with discretion. Below, the top section is the individual value summary (left) and the six dodecamers each counted as an N (right). The bottom row has data for each individual dodecamer.
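Because each trimer is measured from the far side of the N term peak, the two trimer lengths of a hexamer overlap by the width of the N peak. A quick sketch of that arithmetic follows. The function name, the four landmark positions, and the numbers are all hypothetical and are not measured values.

```python
def trimer_lengths(crd_left_end, n_left_edge, n_right_edge, crd_right_end):
    """Lengths along one hexamer trace, in whatever unit the positions use.

    Each trimer starts at the FAR edge of the N peak, so the whole N peak
    is included in both trimers.
    """
    left_trimer = n_right_edge - crd_left_end    # left CRD end to far (right) N edge
    right_trimer = crd_right_end - n_left_edge   # far (left) N edge to right CRD end
    hexamer = crd_right_end - crd_left_end
    n_width = n_right_edge - n_left_edge
    return left_trimer, right_trimer, hexamer, n_width

# Made-up positions along the trace (for example in nm).
left, right, hexamer, n_width = trimer_lengths(0.0, 60.0, 70.0, 130.0)
print(left, right, hexamer, n_width)
```

With these made-up positions the two trimers are 70 each and the hexamer is 130, so the trimer lengths add up to the hexamer length plus one N peak width.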

 

Six dodecamers: SP-D – an analysis of shape, peaks and size, AFM images – hexamer length

It has been a while since I posted results of my attempt to determine the shape of the SP-D dodecamer (hexamer and trimer) using AFM images (mostly from Arroyo et al). Plotting, and finding signal processing programs to do this without bias, took a couple of years (of course that was naive, as there is choice in the parameters of each program's algorithms), but adding two new dodecamers (hexamers, trimers) to the dataset took just a couple of months. The data can be compared, the first four vs the first six: that is, the original four reduced to more or less similar numbers of plots and similar numbers of signal processing programs, with two new ones added. The latter two will provide the format used for all others. That is: one image, six different processing methods, the first of which is “without signal and without image processing”, the other five using signal processing programs (those selected have been stated before). So these data are the first four fact-finding (but trimmed) datasets, plus the data from the last two images, which follow what will be the methods for all future plots.

Just trying to figure out how to make these plots tell a story….

Most preliminary numbers are of the hexamer width. See below: the top image was the original set of four. The image below that shows six individual dodecamers, listed separately, with two means for the whole: 1) one which includes all plots in a single number (n=1 big dataset), and 2) one from the data for each of the six molecules individually (n=6). You can compare the initial results with the more refined results… pretty similar.
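The two means described above can differ when molecules have different numbers of plots. A minimal sketch of both calculations, using Python's statistics module, is below. The molecule names and values are invented for illustration and are not the real measurements.

```python
from statistics import mean

def pooled_and_per_molecule(values_by_molecule):
    """Compare one pooled mean (all plots) with the mean of per-molecule means."""
    pooled = [v for values in values_by_molecule.values() for v in values]
    per_molecule = [mean(values) for values in values_by_molecule.values()]
    return {
        "pooled_n": len(pooled),              # every plot counts once
        "pooled_mean": mean(pooled),
        "molecule_n": len(per_molecule),      # every molecule counts once
        "molecule_mean": mean(per_molecule),
    }

# Hypothetical hexamer measurements; note the unequal number of plots per molecule.
example = {
    "mol_A": [130, 134, 138],
    "mol_B": [120, 124],
    "mol_C": [140],
}
summary = pooled_and_per_molecule(example)
print(summary)
```

Here the pooled mean is 131 (n=6 plots) and the mean of molecule means is 132 (n=3 molecules), because the molecule with a single plot carries as much weight as the one with three.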

SP-D new way to show ridge plot and grayscale peaks

Still working on the peak height, width, and valley measurements on surfactant protein D dodecamer grayscale values using AFM images. Just thought this was an interesting way to show both the signal processing divisions for peaks and the more subjective division into sections (also estimated from peak counts of image and signal processing peak numbers). The sections are: N peak center; on either side, the glycosylation peak(s); next a long lumpy section (dark green); a reasonably consistent tiny low peak (pink); a broader peak before the presumed neck coiled section (white); the likely neck domain (yellow); and the CRD (orange). These measurements are taken from the N term outwards in two directions to the respective CRD of hexamers. Two hexamers are measured separately, though mirrored and identical (theoretically) in all four trimeric arms.
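Measuring "from the N term outwards in two directions" amounts to splitting each hexamer trace at the N peak and reading the left arm backwards, so both arms run N to CRD. A small sketch follows, assuming the trace is a list of grayscale values and the N peak index is already known. The function name, the index, and the profile values are made up.

```python
def split_at_n_peak(profile, n_index):
    """Split a hexamer trace into two arms that both start at the N peak."""
    left_arm = profile[n_index::-1]   # N peak, then outward to the left end
    right_arm = profile[n_index:]     # N peak, then outward to the right end
    return left_arm, right_arm

# Hypothetical grayscale trace: bright CRD at each end, bright N peak in the middle.
profile = [200, 120, 90, 150, 220, 160, 95, 110, 210]
left_arm, right_arm = split_at_n_peak(profile, n_index=4)
print(left_arm)
print(right_arm)
```

Both lists now begin at the N peak value and end at a CRD value, so peak 1, peak 2 and so on line up by position in the two arms.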

This particular ridge plot is from molecule 80 (my designation number, as collected mostly from images found in Arroyo et al). The light gray lines along the plot lengths for each of the two hexamers are the divisions found by Octave’s ipeak.m program using M80 as the amplitude and smoothing settings (unbiased). In this randomly selected image these come close to fitting the 15 peak per hexamer consensus found previously. Color divisions are mine (educated bias), made to approximate the 15 peak per hexamer findings as well as my observations on the AFM images.

An obvious deviation is the unusually tall (bright) peak on the left side of the top hexamer at the domain of the CRD, and just as unusual is the low (less bright) peak on the left side of the bottom hexamer. In other respects this molecule (number 80) is very close to the means found for the first four dodecamers posted on this site.

Four dodecamers: CRD peak-valley grayscale values

Same MO. See from the bottom image that just one of the dodecamers (number 42a_aka_44) had any missing values, and then just 1. So the separate sum (172 with and without that one missing value) is listed in the top image, along with the summary of just the means of the four dodecamers (calculated separately), and the single dodecamer with one missing value is the very bottom set of valley parameters.

Four dodecamers: values for the valleys of peak 5

SP-D peak widths, heights and valley values for four dodecamers. These are the data for peak 5 (the count begins with peak 1 as the N term peak and moves toward the CRD, for each of the trimers).

When there are peaks which are “not detected”, a value of 0 was used, and data were plotted both ways, i.e. including the non-detected as 0 and eliminating the non-detected. The pattern was applied to all peaks as one dataset, and to individual dodecamers (N=4); see bottom image.
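The "both ways" calculation can be sketched in a few lines. This assumes a not-detected peak is stored as None in the list of values. The function name and the example numbers are invented for illustration only.

```python
from statistics import mean

def summarize_both_ways(values):
    """Mean with not-detected counted as 0, and mean over detected values only.

    values: list of numbers, with None marking a peak that was not detected.
    Raises statistics.StatisticsError if no value at all was detected.
    """
    as_zero = [0 if v is None else v for v in values]
    detected = [v for v in values if v is not None]
    return {
        "n_all": len(as_zero),
        "mean_with_zeros": mean(as_zero),
        "n_detected": len(detected),
        "mean_detected_only": mean(detected),
    }

# Hypothetical valley grayscale values for one peak; one plot had no detection.
result = summarize_both_ways([100, 110, None, 120, 90])
print(result)
```

With one missing value out of five, the mean drops from 105 (detected only) to 84 (zeros included), which shows how much a few non-detections can pull a small dataset.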

Four dodecamers: peak 4 “valley” grayscale values

Peak 4, the peak just lateral to the glycosylation peak (remember the count goes to the right and the left of the N term peak, with peak 1 being the center N term peak), is a very consistent finding with image and signal processing. The top image shows that only once in 172 plots of trimers was there no detection of peak 4. Means and other data are shown both ways. The image just below that is the mean grayscale valley value for peak 4, calculated with N as the number of dodecamers examined: the N of plots is not the same in each dodecamer, but each dodecamer is given the same weight in those means and SD. With and without the single missing value in one dodecamer, the means are very close, that is, a grayscale value of 112. The third image down shows the mean grayscale value for the peak 4 valley for each of the four dodecamers individually (these were used for the means shown in the second image from the top). Lastly, the single dodecamer where a single peak 4 was missing is shown at the very bottom.

Four dodecamers: glycosylation peak valley

Grayscale values (0-255) are on the y axis. The glycosylation peak was detected by image and signal processing algorithms 100% of the time, so there is a single value recorded for the valley (immediately lateral to the “tiny” peak), keeping in mind that this molecule has bilateral symmetry beginning in the center at the N termini junction peak. So many images of this molecule have been posted on this blog that the shape in the AFM images is easy to see.