Optimization of skeletal protein preparation for LC-MS/MS sequencing yields additional coral skeletal proteins in Stylophora pistillata

Stony corals generate their calcium carbonate exoskeleton in a highly controlled biomineralization process mediated by a variety of macromolecules including proteins. Fully identifying and classifying these proteins is crucial to understanding their role in exoskeleton formation, yet no optimal method to purify and characterize the full suite of extracted coral skeletal proteins has been established and hence their complete composition remains obscure. Here, we tested four skeletal protein purification protocols using acetone precipitation and ultrafiltration dialysis filters to present a comprehensive scleractinian coral skeletal proteome. We identified a total of 60 proteins in the coral skeleton, 44 of which were not present in previously published stony coral skeletal proteomes. Extracted protein purification protocols carried out in this study revealed that no one method captures all proteins and each protocol revealed a unique set of method-exclusive proteins. To better understand the general mechanism of skeletal protein transportation, we further examined the proteins’ gene ontology, transmembrane domains, and signal peptides. We found that transmembrane domain proteins and signal peptide secretion pathways, by themselves, could not explain the transportation of proteins to the skeleton. We therefore propose that some proteins are transported to the skeleton via non-traditional secretion pathways.

[...] echinoderms [e.g., 17, 18] [and reviewed by 19, 20], mollusks [e.g., 21, 22-24] and mammals [e.g., 25, 26, 27], among others [28]. At present, the best described SOMP complex is that of mammalian bone and teeth [15].

In stony corals, current knowledge of SOMPs is limited and based on intraskeletal protein extraction [29][30][31][32][33]. It has been suggested that coral SOMPs aid in the molecular processes of crystallization as well as in the development and strengthening of the minerals ([34-37] among others). The constant advancement of mass spectrometry technology has broadened our capability to identify many proteins in skeletal extracts, even those proteins in low abundance [38]. However, this technology is sensitive to contamination by organic matter remnants from soft tissue and cell debris from the study organism, and the little-addressed issue remains of contamination by researchers during the protein extraction, preparation, and sequencing steps [39][40][41]. Upon extraction, the SOMPs are usually divided into two fractions: acid-soluble matrix and acid-insoluble matrix proteins (ASM and AIM, respectively), based on their solubility in the acid of choice [22, 31-33, 42].
While some past attention has been directed towards the soluble fraction [30,43,44], Pereira-Mouriès et al. [45] showed that, in the bivalve Pinctada maxima, the classification of AIM and ASM is misleading and that both fractions share common features. Furthermore, Goffredo et al. [42] found in the stony coral Balanophyllia europaea that both fractions consist of the same macromolecules; they associated the degree of solubility with the difference in cross-linking. They also showed that each solubility fraction has a different influence on calcium carbonate crystal morphology, aggregation, and polymorphism in vitro. In contrast, a different SOMP composition between solubility fractions was observed in the scleractinian coral Acropora millepora. Out of 36 SOMPs, only two were found exclusively in the soluble fraction and twelve were exclusive to the insoluble fraction. These examples demonstrate the attempts to attribute different properties to the two fractions, but the data remain inconclusive.

To date, three major coral skeletal proteomes have been published [31-33], each consisting of 30-40 proteins. Of the 30 proteins sequenced from A. digitifera skeleton, 26 were also detected in A. millepora skeleton [32,33]. They consist mostly of either transmembrane (TM) domain proteins or secretory proteins [33]. However, only 12 of the proteins identified in A. millepora skeleton matched those found in S. pistillata skeleton [31,32]. In A. millepora, 11 TM domain-containing proteins were identified, as well as two proteases that were not detected in S. pistillata [32]. The authors suggested that the proteases' role is in cleaving the extracellular domain of TM proteins and incorporating them into the skeleton.

The coral skeletal proteomes published to date reveal an overlap of several detected proteins, but at least 10 proteins from each species appear to be unique.
It is currently unknown whether this is truly due to species-specific gene expression and protein localization or to the methods used in extracting, purifying, and sequencing the proteins. In this study we analyzed several methods for protein purification to increase the detection of proteins from cleaned coral skeleton powder. We show that the use of acetone precipitation versus centrifugal filter washing, and the degree to which each purification method is performed, affects the numbers and types of proteins that can be sequenced by mass spectrometry. Further, we suggest that there is no one 'best' method for coral skeletal protein preparation, such that future research projects may need to utilize several preparation methods to detect the full breadth of proteins embedded in coral skeleton.

Methods
Sample collection and preparation for protein extraction
The hermatypic coral Stylophora pistillata (Esper, 1797) [...] pestle. Skeleton powder was then oxidized in sterile Falcon tubes and washed as above three more times (i.e., four complete rounds of oxidative cleaning) to ensure that no organic residue remained on the skeletal grains. In each cycle, the oxidizing or wash solution was removed by centrifugation at 5,000 x g for 3 min at 4°C. Cleaned skeletal powder was then dried overnight at 60°C. We carried out all the described processes in a laminar flow biological hood (apart from oven drying) with all preparation tools and surfaces bleached to avoid contamination.

To monitor the removal of proteins from the skeletal fragments, we checked the cleaning efficiency under SEM after the fourth oxidative cleaning. Samples were vacuum-coated with 4 nm gold prior to examination under a ZEISS Sigma TM scanning electron microscope with an in-lens detector (5 kV, WD = 5-7 mm) (SI Figure 1a,b). In addition, we sonicated the powder at 4°C in filter-sterilized phosphate buffered saline (PBS, pH 7.4) for 30 minutes, pelleted the powder at 5,000 x g for 3 minutes at 4°C, concentrated the supernatant on a 3-kDa cutoff centrifugal filter unit (Amicon), and loaded samples of supernatant on 8-16% SDS-PAGE TGX Stain-Free gels (Bio-Rad) (SI Figure 1c).

Extraction of skeletal proteins
We decalcified approximately 1.3 g cleaned skeleton powder per protein extraction protocol in 0.5 M acetic acid (30 ml acid/g cleaned skeleton powder) in Falcon tubes while rotating the tubes at room temperature for 3 hours. Samples were then centrifuged at 5,000 x g for 5 min at 4°C and the supernatant was transferred to a new tube and stored at 4°C. We continued the decalcification of the undissolved pellets with a second volume of 0.5 M acetic acid and allowed decalcification to proceed to completion. We then combined both liquid fractions (70 ml total) for each sample, froze the total volumes at -80°C, and dried them by overnight lyophilization. The dried pellets were stored at -80°C until further processing.

The lyophilized pellets were re-suspended in 12 ml MilliQ water and the proteins were concentrated on 3 kDa cutoff Amicon® Ultra 15 centrifugal filter units (Merck-Millipore) at 5,000 x g and 4°C to reach a final volume of 0.5 ml. To continue desalting the samples, we diluted them again to 12 ml in MilliQ water and repeated the concentration procedure as above. At this stage, we observed an insoluble pellet in all samples. [...]

Two filter-concentrated samples containing both soluble and insoluble fractions were examined for the effects of centrifugal filtration. Sample CF2 was centrifuged at 5,000 x g for 5 min at 4°C to pellet the AIM fraction. Both the ASM and AIM of CF2 were divided into separate sterile 1.5 ml microcentrifuge tubes, lyophilized, and stored at -80°C until further use. Sample CF4 was desalted as described above for a third time before separating ASM and AIM fractions.

Both ACT samples were centrifuged as above for CF2 to separate ASM and AIM fractions. To sample ACT1 ASM (concentrated to 0.5 ml by centrifugal filtration as described above), 2 ml of 100% ice-cold acetone was added.
The sample was vortexed for 10 seconds, incubated at -20°C for 30 minutes, and centrifuged at 4,300 x g for 30 min at 4°C. Finally, the pellet was washed three more times with 2 ml of 80% ice-cold acetone and then stored at -80°C until further use. The AIM fraction was similarly washed four times with 80% acetone. Both fractions of sample ACT3 were treated as in ACT1 but with one fewer washing step for each fraction.
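
As a quick arithmetic check on the acetone step described above, the following minimal sketch computes the nominal acetone fraction after adding 2 ml of 100% acetone to a 0.5 ml sample. The function name and the assumption of additive volumes are ours for illustration only (acetone-water mixtures are not perfectly additive), so the result should be read as a nominal value.

```python
def nominal_acetone_fraction(sample_ml: float, acetone_ml: float,
                             acetone_purity: float = 1.0) -> float:
    """Nominal v/v acetone fraction after mixing.

    Assumption (ours): volumes are additive, which is only approximately
    true for acetone-water mixtures.
    """
    total_ml = sample_ml + acetone_ml
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    return acetone_ml * acetone_purity / total_ml


# Volumes taken from the ACT1 description: 0.5 ml concentrate + 2 ml 100% acetone.
fraction = nominal_acetone_fraction(sample_ml=0.5, acetone_ml=2.0)
print(f"nominal acetone fraction: {fraction:.0%}")
```

Under these assumptions the initial precipitation takes place at a nominal 80% acetone, the same concentration used for the subsequent washes.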

LC-MS/MS
S. pistillata skeletal protein samples were dissolved in 5% SDS and digested with trypsin using the S-trap method overnight at room temperature. We analyzed the resulting peptides with a nanoflow ultra-performance liquid chromatograph (nanoAcquity) coupled to a high-resolution, high-mass-accuracy mass spectrometer (Fusion Lumos). The sample was trapped on a Symmetry C18 0.18 x 20 mm trap column (Waters, Inc.) and separated on an HSS T3 0.075 x 250 mm column (Waters, Inc.) using a gradient of 4-28% (80% acetonitrile, 0.1% formic acid) for 150 minutes. Spray voltage was set to +2 kV. The data were acquired in the Fusion Lumos using a Top Speed data-dependent acquisition method with a cycle time of 3 s. An MS1 scan was performed in the Orbitrap at 120,000 resolution with a maximum injection time of 60 ms. The data were scanned between 300-1800 m/z. MS2 was selected using a monoisotopic precursor selection set to peptides, peptide charge states set to +2 to +8, and dynamic exclusion set to 30 s. MS2 was performed using HCD [...]

[...] filter out these potential contaminants from our final list of coral-specific proteins, we BLASTed all sequences against the 'Primates' database in NCBI using Blast2GO. We then examined NCBI-generated sequence alignments of coral versus Homo sapiens proteins with e-values lower than e-50 and percent mean similarity greater than 50%, all sequences with e-values lower than e-100, and all sequences with percent similarity greater than 80%, and removed from our final list of coral proteins any sequences with three or more peptides, each seven or more amino acids in length, that were identical between S. pistillata and humans.

All proteins identified by the LC-MS/MS analysis were filtered to proteins with at least two significant spectra and at least one significant unique peptide.
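
The screening rules in the two preceding paragraphs can be summarized as a small decision procedure. The sketch below is illustrative only and is not the pipeline used in this study: the record type, field names, and function names are hypothetical, "e-50" is read as 1e-50, a single similarity field stands in for both "percent mean similarity" and "percent similarity", and the count of identical peptides (determined in the study by examining alignments) is treated as a precomputed input.

```python
from dataclasses import dataclass


@dataclass
class ProteinRecord:
    """Hypothetical per-protein summary; all field names are illustrative."""
    protein_id: str
    best_primate_evalue: float   # best BLAST e-value against primate sequences
    similarity_pct: float        # percent similarity of that alignment
    identical_peptides_7aa: int  # sequenced peptides >= 7 aa identical to human
    significant_spectra: int
    unique_peptides: int


def flagged_for_alignment_review(p: ProteinRecord) -> bool:
    # Any one of the three criteria in the text triggers a closer look.
    return (
        (p.best_primate_evalue < 1e-50 and p.similarity_pct > 50)
        or p.best_primate_evalue < 1e-100
        or p.similarity_pct > 80
    )


def likely_human_contaminant(p: ProteinRecord) -> bool:
    # Removed only if reviewed AND three or more identical peptides were found.
    return flagged_for_alignment_review(p) and p.identical_peptides_7aa >= 3


def passes_identification_filter(p: ProteinRecord) -> bool:
    return p.significant_spectra >= 2 and p.unique_peptides >= 1


def keep_as_coral_protein(p: ProteinRecord) -> bool:
    return passes_identification_filter(p) and not likely_human_contaminant(p)


# Made-up example records, for illustration only.
records = [
    ProteinRecord("hypothetical_A", 1e-120, 85.0, 4, 12, 3),  # contaminant-like
    ProteinRecord("hypothetical_B", 1e-60, 55.0, 0, 5, 2),    # reviewed, kept
    ProteinRecord("hypothetical_C", 1e-5, 30.0, 0, 1, 1),     # too few spectra
]
kept = [r.protein_id for r in records if keep_as_coral_protein(r)]
print(kept)
```

Separating the review trigger from the removal rule mirrors the text: conserved proteins with high similarity to human sequences are retained unless the sequenced peptides themselves are identical to human ones.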
We determined orthologous biomineralization genes, known from skeletal proteomic analysis, across coral taxa. First, we estimated orthology relationships between all non-redundant genes of selected metazoan species using OrthoFinder. We included all genes of all Cnidaria species with known genome-based annotations. OrthoFinder generates orthology groups (orthogroups) based on normalized reciprocal best BLAST hits' bit scores [58], and then estimates orthologous gene pairs within orthogroups [59]. We then selected all pairs of Acropora spp. orthologs to S. pistillata (1:1, 1:many, many:many relationships) ([31-33], this study). From these pairs we further selected S. pistillata proteins identified from spectra, or skeletal proteins known from the literature. Since not all skeletal protein annotations from the literature were included in our reference OrthoFinder proteome datasets, we further found their best matches in the reference OrthoFinder proteome using blastp.

Results:
After extensive cleaning of the powdered skeleton and acid extraction of embedded organic matter, we identified a total of 60 coral-specific proteins, with two or more spectra or one unique peptide with at least 10 spectra and an identification score of 250 or greater, in S. pistillata skeleton as predicted by the species genome [48] [...] proteins, although many remain uncharacterized (SI Table 3).

In order to evaluate the efficacy of, and improve, current methods for stony coral skeletal protein extraction, we examined four different protocols: two centrifugal ultrafiltration filter (CF) protocols and two further acetone precipitation (ACT) protocols. Proteomes of the four methodologies differed in composition and variety (Figure 1). Combining results of all CF fractions identified 52 coral-specific proteins, while combined ACT protocols yielded 13 such proteins (Figure 1A). Moreover, redundancy between methodologies was low. Only 8.3% of the proteins overlapped between methods, while 78.3% and 13.3% of the proteins were exclusive to combined CF and combined ACT fraction data, respectively.

To evaluate the extracted protein treatment efficiency of each protocol, we first compared the number of proteins detected in each method (e.g., ACT1 vs ACT3 and CF2 vs CF4). Of the 8 proteins found only in samples from the acetone wash protocols (ACT1 and ACT3), one was observed only in ACT1 while five were observed only in ACT3 (Figure 1, SI Table 2); both ACT1 solubility fractions had one more wash step than did those in ACT3.
In contrast, of the 47 proteins observed only in CF samples, 11 were found in CF2, which went through two filter centrifugation steps, while 16 were found in CF4, which went through a third filter centrifugation [...]

Our data compared to other coral skeletal proteomes
Since our analysis yielded a large number of new skeletal proteins, we also compared our results with the three previously published proteomes of S. pistillata, A. digitifera, and A. millepora. Out of our entire identified skeletal proteome of 60 proteins, only 16 were found, using OrthoFinder and BlastP, to be similar to proteins identified in these studies. Yet this proportion of overlap (16 out of 60) is significantly greater than the proportion expected by chance, since the proportion of known skeletal matrix proteins in the reference coral proteomes is extremely small (less than ~0.2%). Seven proteins were found to overlap all four proteomes: a coadhesin-like protein, an EGF and laminin G domain-containing protein, a hypothetical protein, a MAM and LDL-receptor class A domain-containing protein, a mucin, aspartic acid-rich protein 2-like, and a ZP domain-containing protein (Table 3).
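
To illustrate why 16 overlapping proteins out of 60 is far more than expected by chance, the sketch below computes a one-sided binomial tail probability. This is our illustrative choice of test, not necessarily the one used in the study; it assumes each of the 60 proteins independently has at most a 0.2% chance of matching a known skeletal matrix protein, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly over the tail."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_proteins = 60        # proteins identified in this study
n_overlap = 16         # proteins similar to previously published SOMPs
p_background = 0.002   # upper bound (~0.2%) on the background proportion

expected_by_chance = n_proteins * p_background
p_value = binomial_upper_tail(n_overlap, n_proteins, p_background)

print(f"expected overlap by chance: {expected_by_chance:.2f} proteins")
print(f"P(overlap >= {n_overlap}) = {p_value:.1e}")
```

Because the background proportion is stated as an upper bound, using 0.2% makes this estimate conservative; even so, the expected chance overlap is well under one protein.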

Skeletal proteome characterization
We interrogated the mechanisms by which proteins may be exported from or attached to the cell (SI Table 2). Seventeen coral skeletal proteins with likely complete N-terminus predictions possess signal peptides as a potential mechanism for export from the cell. Eight proteins contain at least one transmembrane span, suggesting that they are embedded in the cell membrane. Further, 10 proteins likely interact with the exterior of the cell membrane by GPI anchors. In total, 25 of the 60 sequenced coral skeletal proteins exhibit documented characteristics for localization in the ECM.

Because the majority of the proteins sequenced from the S. pistillata skeleton do not possess features for signaling their export from the cell, we queried the data set for further indications that the proteins are positioned in the membrane or that they may be exported by vesicles such as those that may be involved in calcium concentration. To do this, we examined the skeletal proteome annotations and GO classifications to find common features that allow grouping of proteins. Out of the entire skeletal proteome sequenced in this study (60 proteins), 39 genes were returned with GO terms that allowed their classification into five groups of interest based on their cellular component, biological process, and molecular function. These groups suggest likely cellular locations pertinent to the calcification mechanism, which may therefore be indicative of the proteins' function in this process: lipid/phosphate/glycan-related proteins (i.e., membrane processing); ECM-related/transmembrane and protein modification proteins; and vesicular/secretion-related and metal-binding proteins (SI Table 3).
Of the proteins with GO terms, 10 proteins are suggested to be involved with processing of the cell membrane (Figure 2). A much larger number are related to vesicles/secretion or to metal binding, with 21 and 25 assigned to each category, respectively. We combined these two categories in our proposed cellular location figure (Figure 2), as some of the skeletal proteins proposed to be found intracellularly in vesicles are also known to bind calcium [34,61]. Finally, 19 and 12 are potential ECM proteins or are involved in protein modification, respectively. It should be noted that many proteins are assigned to multiple categories.

Discussion:
In this study we show the importance of using complementary post-extraction methods to concentrate coral skeletal proteins for sequencing the full breadth of the skeletal proteome. Our results show a clear and marked difference in detected proteins between protein purification methods, with only two proteins observed in all methods. Centrifugal filter (CF) methods yielded a much greater abundance and diversity of proteins than did acetone precipitation (ACT) (Figure 1). Moreover, we found that protein detection is largely method-exclusive (Figure 1). We hypothesize that the differences in protein yield between extraction methods are based on the different properties of each method. CF is a mechanical filtration based on size and has a bias toward hydrophobic proteins, whereas ACT is based mainly on the chemical interactions of proteins, resulting in increased precipitation of hydrophilic proteins [62]. We speculate that the CF methods yielded more total proteins in our study because acetone-precipitated proteins may be difficult to resolubilize, potentially leading to the loss of many proteins in the pellet, which is not transferred to the trypsin digestion step.

A major challenge in working with intra-skeletal proteins is isolating the true skeletal proteins from soft tissue contamination [40,41]. In the present study, as in previous work on S. pistillata skeletal proteins [40], we carried out an intensive oxidative cleaning step on the skeletal powder, in addition to cleaning the skeletal fragments, to avoid contamination. We did not observe any organic residues by SEM on intact skeletons following the second cleaning step (SI Figure 1a [...]

Further differences in proteomes, beyond species differences, are the differing reagents used in precipitation (compared to [31]) and our smaller centrifugal filter cutoffs (compared to [32]).
This second difference is particularly important with respect to protein degradation. Even when embedded in biominerals so that amino acids and even short peptides persist, proteins may still succumb to degradation [65]. This results in small peptide fragments that may be lost from centrifugal filter units with 10 kDa pore sizes and larger. It is reasonable to assume that coral skeletal proteins go through the same process, and if so, the cutoff of the membrane directly affects the number of peptide spectrum matches (PSMs). Using smaller cutoff filters in this study might have allowed us to capture some of these sheared peptides and led to higher PSMs.
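
The effect of filter cutoff on fragment retention can be illustrated with a rough estimate. The sketch below is a simplification under stated assumptions, not a model of real membrane behavior: it uses an average residue mass of about 110 Da (a common rule of thumb, introduced here by us) and treats the nominal cutoff as a sharp threshold, whereas actual retention is gradual and depends on molecular shape. All names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rough average mass per amino acid residue


def approx_mass_da(n_residues: int) -> float:
    """Approximate peptide mass from its length in residues."""
    return n_residues * AVG_RESIDUE_MASS_DA


def nominally_retained(n_residues: int, cutoff_kda: float) -> bool:
    """True if the estimated mass is at or above the nominal cutoff.

    Assumption: a sharp cutoff, which real ultrafiltration membranes lack.
    """
    return approx_mass_da(n_residues) >= cutoff_kda * 1000.0


# Compare the 3 kDa filters used here with a 10 kDa filter.
for length in (20, 50, 120):
    mass_kda = approx_mass_da(length) / 1000.0
    print(
        f"{length:>3} residues (~{mass_kda:.1f} kDa): "
        f"3 kDa filter retains={nominally_retained(length, 3)}, "
        f"10 kDa filter retains={nominally_retained(length, 10)}"
    )
```

Under these assumptions, a degraded fragment of about 50 residues would be nominally retained by a 3 kDa filter but pass through a 10 kDa filter, which is consistent with the argument that smaller cutoffs capture more degraded peptides.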

Traditional protein trafficking by signal peptides and transmembrane domains does not explain the full extent of protein transportation to skeleton
The coral skeleton is external to the animal; therefore, proteins in the skeletal matrices must be transported outside the cells or span the membrane and have an extracellular portion to reach the skeletal crystallization front. Indeed, recent studies of anthozoans reveal a significant proportion of TM domain proteins (~35%) in the SOM [32,33], similar to that found in the better-studied Echinoidea [66]. Based on these findings, we examined the hypothesis that many of the skeletal proteins originate in the plasma membrane.
Our analysis revealed that TM domain proteins are not the major component [...]

Previous studies have shown Ca²⁺-rich granules in the calicoblastic epithelium (skeletogenic cells), but not in the other tissue layers, suggesting their role as Ca²⁺ reservoirs in the cells. Vesicles were previously identified in corals [72,73]; however, their origin and content were not detailed and they are sometimes attributed to preservation byproducts. These intracellular ion-rich vesicles may endocytose sea water [74], after which they are enriched in carbonate ions and then form hydrated ACC and anhydrous ACC precursors stabilized by acidic biomolecules including CARPs [75]. Using cell cultures, Mass et al. [61] suggest that the vesicles, which contain Asp-rich proteins, then transport their contents to the ECM, releasing them by exocytosis. The biomineral then further develops extracellularly, likely aided by other ECM proteins [30,35,49,76] as well as other biomolecules [77][78][79][80]. At present, the process of calcium delivery to the skeleton and the role of most of the proteins in skeletal deposition remain to be determined.

Since TM and SP analysis did not fully explain the mechanism of protein [...] while they were previously reported at less than 10% in Stylophora skeleton [31]. Our study revealed a much greater proportion of 28% uncharacterized proteins. While partially attributable to sample size, this is most likely due to the quality of genomic data available, since stony corals are non-model organisms and their genomic libraries are far from complete, resulting in incomplete databases on which to map the proteome and many uncharacterized genes.

Conclusion:
In this study we have considered the differential effects of coral skeletal protein extract preparation as well as the method by which these proteins, or parts thereof, are transported from intracellular to extracellular locations. When preparing coral skeletal proteomes, we propose that a multi-method approach to cleaning, demineralization, and protein extraction should be used. Our results showed that each protein preparation protocol yielded exclusive sets of proteins with little overlap between ACT and CF fractions. While CF protocols yielded many more proteins than did ACT methods, use of a single protocol to clean and concentrate coral skeletal proteins results in a significant amount of data loss, and it is therefore of crucial importance to consider alternative and complementary methods to obtain a fully comprehensive skeletal proteome. We showed that while the role of TM domain proteins cannot be overlooked, many of the proteins detected in the S. pistillata skeletal proteome, as well as in those of other species, point toward other secretory or vesicular pathways. Our categorization method, supported by data from other recent studies, also suggests that corals use an alternative secretory pathway, such as vesicles, and much work is required in order to determine the calcium deposition pathway and the proteins involved. Our study provides a large set of new uncharacterized coral skeletal proteins as well as others of purported function that have not been observed before in the coral SOM. These data expand the current knowledge of the SOM in corals and will help, in future studies, to resolve corals' calcium deposition mechanism and the various roles of the proteins involved.

Table 1: Summary of methods for protein concentration and cleaning after extraction in acetic acid. CF - centrifugation filtration methods; ACT - acetone precipitation methods; ASM - acid soluble matrix; AIM - acid insoluble matrix.

Table 2: 60 coral skeletal proteins detected by LC-MS/MS across all treatments and solubility fractions. Proteins are listed in order of accession number. Gene ontology categorization is represented as (a) ECM/transmembrane and protein modification, (b) membrane processing, and (c) vesicle/secretion and metal binding.

Table 3: Orthologous coral skeletal proteins in the present work and previously published. Gene ontology categorization is represented as (a) ECM/transmembrane and protein modification, (b) membrane processing, and (c) vesicle/secretion and metal binding.

Figures
Figure 1: Distribution of protein numbers according to the extraction methods. Distribution of all proteins (SOM and ISOM combined) by extraction methods (A). SOM and ISOM distribution by acetone precipitation methods (ACT). SOM and ISOM distribution by centrifugation and ultrafiltration method (CF