This is a known peak in the SP-D dodecamer. Here it is measured as peak height in each trimer: on the left as the total of the grayscale plot values, on the right as the mean for each of the separate trimers in each of four dodecamers.
Four dodecamers: trimer pattern in glycosylation peak (peak 3) height
Does anyone see a pattern here in the glycosylation peak height of arm 1 and arm 2? It is very clear that the different image analysis filters don't mean a whole lot when compared to the actual molecule, that is, the relative heights of the peaks along the length of a trimer (hexamer = 2 trimers). This made me examine the trimer glycosylation peaks of this SP-D dodecamer separately, comparing peak heights. The repeating pattern for each arm is clear. Each hexamer is plotted as trimers 1a and 1b, and 2a and 2b. Even though this pattern occurs, the glycosylation peak heights of the hexamers are not significantly different using a t-test, so I am going to continue to use the data for each trimeric arm separately.
It is important for me to say that I think there are differences in the glycosylation peak height that relate to the “number” of arms in the trimer that actually have a glycan attached. Could it vary (one, two or three), which would then result in a different grayscale value for peak height (and valley)?
The mean for each glycosylation peak in each of the four trimers of the dodecamer, for each of the image and signal processing apps (n = 11 for each trimer).
t-tests comparing the four trimers follow.
In a t-test, the grayscale heights of the glycosylation peak for each hexamer of the single dodecamer (that is, arm 1a and 1b, and arm 2a and 2b) were not significantly different.
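For anyone who wants to repeat this kind of comparison outside of Excel, here is a minimal sketch of a two-sample t statistic. It assumes the Welch (unequal variance) form, which may not be the variant used for the numbers above, and the peak heights in it are hypothetical placeholders, not my measured values. The function name `welch_t` is just an illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n)
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical grayscale peak heights for two hexamer arms (NOT measured data)
arm_1 = [150.0, 162.0, 148.0, 171.0, 155.0, 160.0]
arm_2 = [152.0, 158.0, 149.0, 166.0, 157.0, 161.0]

t, df = welch_t(arm_1, arm_2)
print(f"t = {t:.3f}, df = {df:.1f}")
```

The p-value then comes from a t table at `df` degrees of freedom, or, if SciPy is installed, from `scipy.stats.ttest_ind(arm_1, arm_2, equal_var=False)`.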
Four dodecamers: tiny peak (peak 2) height
I know this is going to sound like I don't care about traditional statistical methods, but in this case it makes NO sense to me at all to include many 0s in a mean as missing values when the result puts the peak height below the background. So, while I have calculated the means with the missing values included for the old database and the new database (the only one I will continue to get peak heights from) and shown them both here, I have also taken just the values from instances where this tiny peak (peak 2, adjacent to the N term peak) occurs. Those means are presented in the bottom screen print from Excel. I will also stop reporting on the normal distribution (see medium green cells); a normal distribution can happen when missing values are excluded, and otherwise that makes no sense to me either.
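To show why the zeros matter, here is a small sketch that reports the mean both ways, along with how often the peak was detected at all. The values are hypothetical and only illustrate the arithmetic; `summarize_peak` is a made-up helper name, and it assumes an undetected peak is stored as 0.

```python
from statistics import mean


def summarize_peak(heights):
    """Summarize one peak; `heights` has one value per plot, 0 = not detected."""
    detected = [h for h in heights if h > 0]
    return {
        "n_plots": len(heights),
        "n_detected": len(detected),
        "incidence": len(detected) / len(heights),
        # Mean with undetected plots counted as 0 (drags the mean down)
        "mean_all": mean(heights),
        # Mean over only the plots where the peak actually occurs
        "mean_detected": mean(detected) if detected else None,
    }


# Hypothetical grayscale heights for the tiny peak (NOT measured data)
heights = [0, 0, 120, 0, 130, 110, 0, 0]
summary = summarize_peak(heights)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Reporting the incidence next to the detected-only mean keeps the information that the peak is often missing without letting the zeros pull the height below background.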
Four dodecamers of SP-D, four databases, mean grayscale peak height for the N term peak
For four dodecamers of SP-D and four databases, the mean grayscale peak height for the N term peak is about 238.6 +/- 3.75. The grayscale values come from plots made for the images in ImageJ, which were then subjected to a variety of image and signal processing algorithms. The column titled NEW is a smaller dataset taken from four dodecamers (8 hexamers, but plotted as trimers), so every n is a value for a trimer. The N term peak is plotted in its entirety for each trimer, not divided in half. Dividing the N term peak in half didn’t really fit the images, as in many cases there is a small peak or valley in the center of the N term peak, and that complicates where it should be divided into two or four. So the N term peak height below is the tallest point (not necessarily in the center) of the N peak.
Bottom line for me (maybe not for you) is that the value of 238.6 is just fine for creating one point on the “training plot” to use for supervised learning in AI, to be applied to the remaining SP-D images I have obtained from published literature.
The four dodecamers used in this training plot are among the better images of SP-D (from Arroyo et al.).
Four dodecamers: comparing mean peak widths +/- SD with a single image
This is going to be a hard sell to someone who has not spent dozens of hours looking at SP-D images. The mean peak widths (+/- SD) are placed, in relative proportion, just over a randomly selected image of an SP-D dodecamer. The color bar shows the mean peak widths for the 15 peaks. The number 15 was obtained from signal processing (peak finding programs), image processing (two or three image filters are represented here), 4 dodecamers, and hundreds of grayscale plots and width measurements. There is consensus, but it is not immediately intuitive. Colored bars with SD on either side represent peak width in nm (relative). Count from N term to CRD on the top bar, and from CRD to N term on the bottom bar. The glycosylation peak is the mid-green color. Each peak is identified on the bottom image with text, a circle, and a color matching the peak widths shown on the top image.
My sorting of those 15 peaks identified by signal processing, using circles and colors to match the image above. Four peaks per trimer (8 peaks per hexamer) that are as yet unidentified are marked with a question mark.
peak 1, N = 19.9 nm +/- 1.42 nm
peak 2, tiny = 0.9 nm +/- 0.38 nm
peak 3, glycos = 16.7 nm +/- 1.32 nm
peak 4 = 11.98 nm +/- 3.7 nm
peak 5 = 4.18 nm +/- 0.41 nm
peak 6 = 6.89 nm +/- 1.94 nm
peak 7, neck = 4.74 nm +/- 0.56 nm
peak 8, CRD = 16.56 nm +/- 2.04 nm
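As a rough check on how the bars relate to each other, the sketch below turns the eight mean widths listed above into relative proportions and also gives each peak's SD as a fraction of its mean. It assumes the eight widths can simply be laid end to end along one trimer arm, which is my simplification for illustration only. The names `PEAK_WIDTHS_NM`, `total_width` and `relative_widths` are made up for this example.

```python
# (mean width in nm, SD in nm) for each peak, copied from the list above
PEAK_WIDTHS_NM = {
    "peak 1, N": (19.9, 1.42),
    "peak 2, tiny": (0.9, 0.38),
    "peak 3, glycos": (16.7, 1.32),
    "peak 4": (11.98, 3.7),
    "peak 5": (4.18, 0.41),
    "peak 6": (6.89, 1.94),
    "peak 7, neck": (4.74, 0.56),
    "peak 8, CRD": (16.56, 2.04),
}


def total_width(widths):
    """Sum of the mean widths (assumes peaks sit end to end on one arm)."""
    return sum(mean_nm for mean_nm, _ in widths.values())


def relative_widths(widths):
    """Fraction of the summed width taken up by each peak."""
    total = total_width(widths)
    return {name: mean_nm / total for name, (mean_nm, _) in widths.items()}


fractions = relative_widths(PEAK_WIDTHS_NM)
for name, (mean_nm, sd_nm) in PEAK_WIDTHS_NM.items():
    cv = sd_nm / mean_nm  # SD relative to the mean (coefficient of variation)
    print(f"{name}: {fractions[name]:.1%} of summed width, CV = {cv:.2f}")
print(f"summed mean width: {total_width(PEAK_WIDTHS_NM):.2f} nm")
```

The relative SD is a quick way to see which peaks are measured most consistently: the tiny peak and peak 4 vary the most relative to their size.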
Four dodecamers: peak 6 width (nm)
The trial dataset for peak width is, I think, pretty much finished. There is not a lot more that I can do using signal processing peak finding apps. The data are pretty consistent from old to new values and across different programs, whether analyzed as a single large dataset or as individual dodecamers (N is only four dodecamers, or 8 hexamers, at this point, but hundreds of grayscale plots were used to determine peak width, using the basic peak count per hexamer from previous plots). This peak is around 7 nm wide and is consistently seen (354 times out of 392 plots of trimers, about 90%). It is consistently a low peak and fairly broad. Shown are the old dataset (from a few months ago) and the new, recent dataset, which has more equitable numbers of plots from each of the four dodecamers. Nothing really changed, which is nice.
Four dodecamers: peak 5 width (nm)
Peak 5 of the trimers of surfactant protein D is shown in this post. Peaks are numbered with the N termini junction peak as peak 1, moving in a mirrored direction from that point toward the CRD peak. Peak 1 is the N termini junction (well described, obvious and tallest); peak 2 is the tiny peak (not currently described, small, and between peak 1 and the alleged glycosylation peak); peak 3 is the described glycosylation peak; peak 4 is lateral to the glycosylation peak (detected very frequently, large, and not as tall as the glycosylation peak). Peak 5, below, is not tall and quite thin, but is detected frequently enough to make it a likely event.
Four dodecamers of surfactant protein D: peak 4 width
This peak appears consistently in a combined dataset of the old and the new (shown below): about 98.7% of the time, absent in only 5 of 392 plots. It is the fourth peak (N term peak = peak 1, tiny peak = peak 2, glycosylation peak = peak 3), and it lies lateral to the glycosylation peak. It appears to have a lower peak height and a little broader peak width. The data below, calculated two ways from the summed old and new datasets for four dodecamers, show that the choice between the two calculations matters very little (p = 0.9281). The peak width is about 11.9 nm.
Four dodecamers: peak width and incidence of the neck domain in AFM images of SP-D
A peak in the hexamer plots where the “neck domain” of SP-D likely appears is registered about 78% of the time (135/172 plots) in the new four-dodecamer dataset, and about 68% of the time (151/222 plots) in the old four-dodecamer dataset. This particular peak shows up as a slightly lower peak when the CRD domains don't quite cover this region. Molecular models (the neck and CRD domains are currently the only molecular models that I have found) show that the CRDs are sort of “floppy” and sometimes appear to cover, or partially cover, the neck domain. This arrangement is very apparent in the AFM images, and drawing the line used to plot the grayscale values is highly dependent upon the input of the observer. I plotted these lines trying to pick a center portion of the hexamers, not varying the plot line to include or eliminate the influence of the neck domain.
Peak width for the neck domain is about 5 nm, but keep in mind that this peak shows up under the CRD domain(s) only about 70% of the time.
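The incidence percentages quoted in these posts are simple ratios of plots where a peak was detected to total plots, and can be rechecked with a few lines. The counts below are the ones given in the text; for peak 4, I am reading "5/392" as 5 plots where the peak was missing, which is an interpretation. The pooled neck figure is just the two datasets added together, and `incidence_pct` is a made-up helper name.

```python
def incidence_pct(detected, total):
    """Percentage of plots in which a peak was detected."""
    return 100.0 * detected / total


# (detected, total) plot counts as reported in the text
COUNTS = {
    "peak 6, old + new": (354, 392),
    "peak 4, old + new": (392 - 5, 392),  # assumes 5 of 392 plots lacked it
    "neck, new dataset": (135, 172),
    "neck, old dataset": (151, 222),
}

for label, (detected, total) in COUNTS.items():
    pct = incidence_pct(detected, total)
    print(f"{label}: {detected}/{total} = {pct:.1f}%")

# Pooling the two neck datasets gives the overall incidence
neck_detected = 135 + 151
neck_total = 172 + 222
print(f"neck, pooled: {incidence_pct(neck_detected, neck_total):.1f}%")
```

Pooling the old and new neck counts is where the "about 70%" figure comes from.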