Vaccine design of coronavirus spike (S) glycoprotein in chicken: immunoinformatics and computational approaches

Background Infectious bronchitis (IB) is a highly contagious respiratory disease of chickens that causes economic losses in the poultry industry. The disease is caused by a single-stranded RNA virus belonging to the Coronaviridae family. This study aimed to design a potential multi-epitope vaccine against the infectious bronchitis virus (IBV) spike (S) protein. Protein characterization was also performed for the IBV spike protein. Methods Various tools in the Immune Epitope Database (IEDB) were used to predict conserved B and T cell epitopes of the IBV spike (S) protein that may play a significant role in provoking an immune response to IBV infection. Results In B cell prediction, three epitopes (1139KKSSYY1144, 1140KSSYYT1145, 1141SSYYT1145) were selected as surface, linear and antigenic epitopes. Many MHCI and MHCII epitopes were predicted for the IBV S protein. Among them, the epitopes 982YYITARDMY990 and 983YITARDMYM991 displayed high antigenicity, no allergenicity and no toxicity, as well as binding to a large number of MHCI and MHCII alleles. Moreover, docking analysis of MHCI epitopes showed strong binding affinity with BF2 alleles. Conclusion Five conserved epitopes were predicted from the spike glycoprotein of IBV as the best B and T cell epitopes owing to their high antigenicity, lack of allergenicity and lack of toxicity. In addition, the MHC epitopes bound many MHC alleles and interacted strongly with BF2 alleles. These epitopes should be synthesized, combined and then tested as a multi-epitope vaccine against IBV.


Introduction
Infectious bronchitis virus (IBV) is a single-stranded, positive-sense RNA coronavirus of the chicken (Gallus gallus). It causes a highly contagious respiratory disease that is most severe in very young chicks. The signs of illness include tracheal rales, coughing, sneezing and nasal discharge, and some strains may cause kidney damage [1,2]. Infected chickens shed the virus in respiratory discharges and feces, and it spreads by aerosol, ingestion of contaminated feed and water, and contact with contaminated equipment or clothing. The virus is not transmitted via eggs [3]. The disease causes economic losses within the poultry industry, affecting the performance of both meat-type and egg-laying birds. Birds of all ages can be affected, but chicks become more resistant to IBV-induced mortality with increasing age [4].
IBV has four structural proteins: the spike (S), membrane (M), envelope (E), and nucleocapsid (N) proteins [5]. The spike (S) glycoprotein is located at the surface of the virion, the membrane (M) glycoprotein is partially exposed at the surface of the virion, and the nucleocapsid (N) protein is located internally. The spike glycoprotein of IBV induces virus-neutralizing (VN) and haemagglutination-inhibition (HI) antibodies and has been considered the most likely inducer of protection [2,4]. The S protein is either a dimer or a trimer. It has two recognized functions: binding the virus to receptor molecules on host cells, and activating fusion of the virion membrane with host cell membranes, which releases the viral genome into the cell [2]. The spike gene, in particular the S1 part, is highly variable owing to insertions, deletions, substitutions and recombination events [6]. Vaccination is the most effective way to protect against infectious diseases, particularly those caused by pathogens with high mortality rates, such as IBV. On the other hand, the large number of IBV serotypes and strains (genotypes) complicates control. IBV exhibits both antigenic shift and antigenic drift [7].
Inactivated and live-attenuated vaccines are employed to control the disease. However, inactivated vaccines often fail to induce strong cellular immunity, while live-attenuated vaccines can contribute to the development of antigenic variant viruses [5]. The increasing number of new IBV serotypes, caused by frequent gene mutation and recombination, poses a major challenge for the prevention and control of infectious bronchitis [8].
RNA viruses such as IBV have high mutation rates. Thus, the most important step in the design of a cross-protective peptide vaccine against IBV is to target epitopes that are conserved across different IBV serotypes [5].
Antigen presentation by MHC molecules is important for developing vaccine-induced immunity. MHC class I and class II molecules are typically highly polymorphic and polygenic [9]. Avian MHC class I and class II genes are located in two regions (MHC-B and MHC-Y) on chromosome 16. The MHC-B and MHC-Y haplotypes assort independently as the result of an intervening region that supports highly frequent recombination [9,10]. Chicken MHC B-F molecules are structurally and functionally related to mammalian MHC class I molecules and are involved in the presentation of antigen to CD8+ T lymphocytes, which is important for the antiviral immune response [11]. Recently, the design of epitope-based vaccines has been advanced by developments in genomics, proteomics and the understanding of pathogens. An epitope is the minimal immunogenic region of a protein sequence that elicits a specific immune response [12]. The identification of specific B and T cell epitopes allows more precise manipulation of the immune response [13]. Designing multi-epitope vaccines with bioinformatics tools can significantly reduce the time and cost of production and give satisfactory results [14,15]. The production of safer and more reliable vaccines for controlling IBV is important. Therefore, the aim of this study was to analyze the spike (S) glycoprotein sequences of infectious bronchitis virus strains reported in the NCBI database using immunoinformatics and computational approaches, in order to select all possible epitopes that could be used in a multi-epitope vaccine. Protein characterization was also performed for the IBV spike protein.

Material and method
Protein sequence retrieval
Spike (S) protein sequences of different infectious bronchitis virus (IBV) strains were retrieved from the GenBank database of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/protein/) in March 2019. The sequences were saved in FASTA format (Table 1).

Structural analysis
The reference sequence of the spike (S) protein (NP_040831.1) was analyzed to identify its chemical and physical properties, including GRAVY (grand average of hydropathicity), half-life, molecular weight, instability index and amino acid and atomic composition, using the online tool ProtParam [16]. The secondary structure of the IBV spike (S) protein was analyzed through PSIPRED [17]. The secondary structure elements, including helix, sheet, turn, and coil, were predicted using the GOR IV server at https://npsa-prabi.ibcp.fr/cgi-bin/secpred_sopma.pl. The online tool TMHMM (http://www.cbs.dtu.dk/services/TMHMM/) was used to examine the trans-membrane topology of the S protein. The presence of disulphide bonds was predicted with the online tool DiANNA v1.1, which makes predictions based on a trained neural network [18]. CDD-BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) [19][20][21] and PFAM (https://pfam.xfam.org/) [22] were used to search for defined conserved domains in the target protein sequence. BLASTp at NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi), run against the reference sequence (refseq_protein) database, was used to compare spike reference sequences of different human and animal coronaviruses with the IBV spike protein sequence. A phylogenetic tree was also constructed based on COBALT multiple alignment (https://www.ncbi.nlm.nih.gov/blast/treeview/treeView.cgi) [19,20].

Multiple sequence alignment and epitope conservancy assessment
The retrieved sequences of the IBV S protein were aligned using the Clustal program, and a consensus sequence was generated using the multiple sequence alignment (MSA) tool Jalview version 2.10.5 (http://www.jalview.org/about/jalviewscientific-advisory-committee) [23]. The epitope conservancy analysis tool in the Immune Epitope Database (IEDB) was used to assess the conservancy of potential epitopes (http://tools.iedb.org/conservancy/) [24]. For calculating the conservancy score, the sequence identity threshold was set at 80%.
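The conservancy score described above can be illustrated with a minimal sketch. It assumes a simplified definition: an epitope counts as conserved in a sequence if some ungapped window of the same length matches it at or above the identity threshold, and conservancy is the fraction of sequences meeting that criterion. This is an approximation for illustration only and is not the IEDB tool's implementation; the function names and the toy sequences are hypothetical.

```python
from typing import Sequence


def best_identity(epitope: str, sequence: str) -> float:
    """Return the highest fraction of identical residues between the epitope
    and any ungapped window of the same length in the sequence."""
    k = len(epitope)
    if k == 0 or len(sequence) < k:
        return 0.0
    best = 0
    for i in range(len(sequence) - k + 1):
        matches = sum(a == b for a, b in zip(epitope, sequence[i:i + k]))
        best = max(best, matches)
    return best / k


def conservancy(epitope: str, sequences: Sequence[str], threshold: float = 0.8) -> float:
    """Fraction of sequences containing the epitope at >= threshold identity."""
    if not sequences:
        return 0.0
    hits = sum(best_identity(epitope, s) >= threshold for s in sequences)
    return hits / len(sequences)


# Hypothetical toy fragments for illustration, NOT real IBV strain sequences.
toy_strains = ["GGKKSSYYTAA", "GGKRSSYYTAA", "GGAAAAAATAA"]
score = conservancy("KSSYYT", toy_strains, threshold=0.8)
print(f"conservancy = {score:.2f}")
```

In this toy example the first fragment contains the epitope exactly, the second differs at one of six positions (identity 5/6, above the 80% threshold), and the third does not match, so two of the three fragments count as conserved.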

Phylogeny analysis
A phylogenetic tree of the retrieved spike (S) protein sequences was constructed with MEGA7.0.26 (7170509) software using the maximum likelihood method [25].

B cell prediction
The Immune Epitope Database (IEDB) (http://tools.iedb.org/mhci/) was used to predict B and T cell epitopes of the IBV S protein reference sequence (NP_040831.1) [26]. Linear B-cell epitopes were predicted using BepiPred from IEDB [27]. The Emini surface accessibility prediction tool was used to predict surface-located epitopes [28], whereas antigenic epitopes were investigated using the Kolaskar and Tongaonkar antigenicity method [29]. Discontinuous epitopes were predicted using the DiscoTope server [30]. The score threshold was set at ≥0.5, which corresponds to 90% specificity and 23% sensitivity. This method is based on surface accessibility and amino acid statistics in a compiled dataset of discontinuous epitopes determined by X-ray crystallography of antigen-antibody protein complexes. Chimera software was used to display the position of the predicted epitope clusters on the 3D structure of the S protein [31].

T-cell epitope prediction
The T cell epitopes were predicted using different human alleles of major histocompatibility complex class I (MHCI) and class II (MHCII).
MHC-I binding epitopes were predicted with the IEDB MHC I prediction tool at http://tools.iedb.org/mhci. The binding affinity of peptides to MHC I molecules was predicted using the artificial neural network (ANN) method [32,33]. Prior to prediction, the peptide length was set to 9-mers. Peptides with a predicted half-maximal inhibitory concentration (IC50) for binding to MHC-I molecules of 300 nM or less were retained. The IEDB MHCII prediction tool (http://tools.iedb.org/mhcii/) was used for MHC class II molecules [26]. Human MHC class II alleles (HLA-DR, HLA-DP and HLA-DQ) were used for MHCII binding prediction. The NN-align method was used with an IC50 cutoff of 1000 nM or less [34].
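The 9-mer windowing and IC50 cutoff described above can be sketched as follows. The sketch only enumerates candidate peptides with their 1-based residue positions and filters a list of predictions by the cutoff; it does not perform any binding prediction. The 12-residue fragment is the stretch spanning residues 982-993 implied by the epitopes reported in this study, whereas the function names, the dictionary layout and all IC50 values are hypothetical placeholders and not results of this work.

```python
from typing import NamedTuple


class Peptide(NamedTuple):
    start: int  # 1-based position of the first residue in the full protein
    seq: str
    end: int    # 1-based position of the last residue


def enumerate_peptides(fragment: str, length: int = 9, offset: int = 1) -> list[Peptide]:
    """List every overlapping window of the given length.
    `offset` is the protein position of the fragment's first residue."""
    return [
        Peptide(i + offset, fragment[i:i + length], i + offset + length - 1)
        for i in range(len(fragment) - length + 1)
    ]


def filter_binders(predictions: list[dict], cutoff_nm: float) -> list[dict]:
    """Keep predictions whose IC50 is at or below the cutoff (in nM)."""
    return [p for p in predictions if p["ic50_nm"] <= cutoff_nm]


# Residues 982-993 of the S protein, as implied by the reported epitopes.
FRAGMENT = "YYITARDMYMPR"
peptides = enumerate_peptides(FRAGMENT, length=9, offset=982)

# Placeholder IC50 values (hypothetical, for illustration only).
placeholder_ic50 = [120.0, 45.0, 2500.0, 280.0]
predictions = [
    {"peptide": p, "ic50_nm": ic50} for p, ic50 in zip(peptides, placeholder_ic50)
]

for hit in filter_binders(predictions, cutoff_nm=300.0):
    p = hit["peptide"]
    print(f"{p.start}{p.seq}{p.end}  IC50 = {hit['ic50_nm']} nM")
```

The same filter applies to the MHC class II predictions by passing `cutoff_nm=1000.0`.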

Molecular docking
To perform molecular docking, the 3D structures of the MHCI epitopes and the BF alleles were submitted together to the PatchDock online server, an automatic server for molecular docking (https://bioinfo3d.cs.tau.ac.il/PatchDock/) [44]. The five top models were selected using FireDock [44]. The results were visualized using UCSF Chimera software version 1.8 [31].

Structural analysis
The physicochemical properties of the spike (S) protein, computed with ProtParam, showed that it contains 1162 amino acids (aa) and has a molecular weight of 128,046.70 Da. The spike protein was predicted to be antigenic when submitted to VaxiJen v2.0.
The theoretical isoelectric point (pI) of the spike protein was 7.71. A pI above 7 indicates that the protein carries a net positive charge at neutral pH. In total, 81 residues were negatively charged and 84 residues were positively charged.
ProtParam computed an instability index (II) of 35.53, which classifies the protein as stable. The aliphatic index, which reflects the relative volume occupied by aliphatic side chains, was 86.05, and the GRAVY value of the protein sequence was 0.012. The half-life of the S protein, defined as the time taken for half of the protein to disappear after its synthesis in the cell, was computed as 30 h in mammalian reticulocytes, > 20 h in yeast and > 10 h in Escherichia coli. The secondary structure of the IBV spike (S) protein was analyzed with PSIPRED and the GOR IV server. The secondary structure components predicted by the GOR IV server were alpha helix (29.43%), extended strand (27.37%), beta turn (5.25%), and random coil (37.95%) (Fig. 1).
The DiANNA 1.1 tool, which makes predictions based on a trained neural network, predicted 19 disulphide bond (S-S) positions and assigned each a score. The trans-membrane topology of the protein was investigated with the online tool TMHMM. Residues 1 to 1093 were predicted to be exposed at the surface, residues 1094 to 1116 were located within the trans-membrane region, and residues 1117 to 1162 were buried within the core region of the S protein (Fig. 1).
In the IBV spike protein refseq, two conserved domains (Corona-S1 and Corona-S2) were identified by a Conserved Domain Database (CDD) BLAST search. The results revealed that corona-S1 (pfam01600) is the only member of the superfamily cl03276 and that the corona-S2 domain (pfam01601) is the only member of the superfamily cl20218. The top associated sequences in both domains were Feline infectious peritonitis virus (strain 79-1146), Avian infectious bronchitis virus (strain Beaudette), and Human coronavirus 229E, while Severe acute respiratory syndrome-related coronavirus sequences were associated only with the corona-S2 domain. When various human and animal coronaviruses were compared with the IBV spike protein sequence, the closest homologue obtained from BLASTP (refseq-protein) was the Turkey coronavirus S protein with an E value of 0.00, followed by Murine hepatitis virus strain JHM with an E value of 9e-109 (Table 2). A phylogenetic tree of IBV against other human and animal coronaviruses was created based on COBALT multiple alignment (Fig. 2).
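Two of the indices reported above follow simple published formulas, which can be sketched directly. GRAVY is the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index is X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percentage of each residue. The sketch below is an independent illustration, not ProtParam's code; the function names are hypothetical, and it is applied to a short epitope from this study rather than to the full S protein.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


def charged_counts(seq: str) -> tuple[int, int]:
    """Return (negative, positive) counts: Asp + Glu and Arg + Lys."""
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive


epitope = "YITARDMYM"  # residues 983-991 of the S protein
print(f"GRAVY = {gravy(epitope):.3f}")
print(f"aliphatic index = {aliphatic_index(epitope):.2f}")
print(f"(negative, positive) residues = {charged_counts(epitope)}")
```

Running the same functions on the full 1162-residue reference sequence would be the way to compare against the values reported by ProtParam; that comparison is not performed here.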

Multiple sequence alignment
Jalview was used to visualize the multiple sequence alignment of the retrieved sequences. Several regions of the alignment contained mutations (Fig. 3).

Phylogeny
A phylogenetic tree of the IBV spike (S) protein sequences was constructed with MEGA7.0.26 (7170509) software using the maximum likelihood method (Fig. 4).

B-cell epitopes
Several B cell epitopes were predicted using the BepiPred Linear Epitope Prediction tool. The conservancy percentages of these epitopes are presented in Table 3. After shortening of the predicted epitopes, 21 linear conserved epitopes were recognized. Of these, seven epitopes of different lengths, located between positions 1139 and 1146, were identified as linear, surface and antigenic epitopes (see Table 4). Among them, three epitopes (1139KKSSYY1144, 1140KSSYYT1145 and 1141SSYYT1145) were selected as the top B cell epitopes. The DiscoTope 2.0 server was used to predict discontinuous epitopes from the 3D structure of the S protein (PDB ID: 6CV0), with 90% specificity, a threshold of −3.700 and a propensity score radius of 22.000 Å [45]. In total, 30 discontinuous epitopes were recognized at different exposed surface areas (Table 5). The position of each predicted epitope on the surface of the 3D structure of the S protein, displayed with the Chimera visualization tool, is shown in Fig. 5 [31].

Prediction of MHC class I epitopes
In this study, human MHC class I HLA alleles were used to explore the interaction of epitopes with MHCI alleles, as chicken MHC alleles are not available in the IEDB database. The IEDB MHC-I binding prediction tool predicted 13 conserved epitopes of the spike (S) protein that interacted with many cytotoxic T cell alleles. These epitopes were 1115 [...]

Antigenicity, allergenicity and toxicity of MHCI and MHCII epitopes
The predicted MHCI and MHCII epitopes were submitted to the VaxiJen v2.0, AllerJen v2.0 and ToxinPred servers to estimate their potential antigenicity, allergenicity and toxicity. Five MHCI epitopes were identified as antigenic, non-allergenic and non-toxic, but only three of them (985TARDMYMPR993, 983YITARDMYM991 and 982YYITARDMY990) were linked to a large number of MHCI alleles (Table 6). Furthermore, six MHCII epitopes were predicted to be antigenic, non-allergenic and non-toxic (Table 7). The epitopes 983YITARDMYM991 and 982YYITARDMY990, which were also obtained in the MHCII prediction, showed high antigenicity, no allergenicity and no toxicity. These epitopes interacted with 52 and 38 MHCII alleles, respectively (Fig. 6).

Molecular docking
Molecular docking was performed by docking the MHCI epitopes with the chicken BF alleles BF2*2101 and BF2*0401, based on peptide-binding groove affinity. The chicken alleles were used as receptors, and the top MHCI epitopes 982YYITARDMY990, 983YITARDMYM991 and 985TARDMYMPR993 were used as ligands. Docking of the 983YITARDMYM991 epitope with the BF2*2101 and BF2*0401 alleles gave global energies of -72.11 and -37.39, respectively, indicating a stronger binding affinity between ligand and receptor than for the other epitopes (Figs. 7, 8 and 9). In general, the global energy of the ligands docked with the BF2*2101 receptor was lower than with BF2*0401, suggesting a stronger receptor-ligand interaction with BF2*2101.

Discussion
Epitopes capable of inducing both B-cell and T-cell immunity are considered strong vaccine candidates [46]. Peptide vaccines offer several potential benefits over traditional vaccines. Most importantly, they allow the immune response to focus only on relevant epitopes and to avoid those leading to non-protective responses, immune evasion, or unwanted side effects such as autoimmunity [47].
IBV vaccination studies have generally focused on humoral immune responses as the measure of protection. Acquired immunity results in the activation of antigen-specific effector mechanisms, including B cells (humoral), T cells (cellular) and macrophages, and in the production of memory cells [4]. Chickens develop a good humoral response to IBV infection, which is measured by ELISA, virus neutralization (VN) and haemagglutination-inhibition (HI) antibody tests [48].
The IBV S1 glycoprotein is known to induce virus-neutralizing (VN) and haemagglutination-inhibition (HI) antibodies and has been considered the most likely inducer of protection [4]. Multi-peptide vaccines have recently been designed in Sudan with immunoinformatics tools for several viral diseases of chickens, such as ILTV, fowlpox, Newcastle disease and Marek's disease virus [15,[49][50][51].
In the present study, the IBV spike protein was analyzed using various prediction servers. Characterization of the IBV spike (S) protein with ProtParam indicated that it is positively charged and stable. The protein was also predicted to have good antigenic properties by the VaxiJen v2.0 server.
Corona-S1 and Corona-S2 were identified as the major conserved domains in the IBV spike glycoprotein refseq. The Conserved Domain Database (CDD) BLAST search revealed that corona-S1 (pfam01600) is the only member of the superfamily cl03276 and that the corona-S2 domain (pfam01601) is the only member of the superfamily cl20218. The main related sequences in both domains were Feline infectious peritonitis virus (strain 79-1146), Avian infectious bronchitis virus (strain Beaudette), and Human coronavirus 229E. However, Severe acute respiratory syndrome-related coronavirus sequences were associated only with the corona-S2 domain [52]. Prediction of B-cell epitopes is essential for the design of vaccine components and immunodiagnostic reagents. B-cell antigenic epitopes are either continuous or discontinuous in nature.
Most epitope prediction methods are based on continuous epitopes [53]. It has been reported that linear B cell epitopes play a role in virus neutralization [11]. IEDB prediction tool was used to predict linear, surface and antigenic epitopes based on the properties of amino acids such as hydrophilicity, surface accessibility, flexibility, and antigenicity [15].
The majority of B-cell epitopes (around 90%) are conformational, and only a minority of native antigens have linear B-cell epitopes [54]. The DiscoTope server was used to predict discontinuous epitopes from the 3D structure of the IBV spike reference sequence. Around 30 discontinuous epitopes were recognized at different exposed surface areas with a specificity of 90%. These epitopes have a significant advantage in recognizing the native, well-structured protein antigen [55].
Cytotoxic T lymphocytes (CTL) provide a critical arm of the immune system for eliminating autologous cells that express foreign antigen. Unlike humoral immunity, the specificity of CTL activation depends on membrane receptors rather than secreted molecules, and the antigen receptors of CTL interact with peptide determinants only in association with matched major histocompatibility complex (MHC) molecules. Virus-specific CTL have been shown to be important, if not critical, for resolution of infection and elimination of viral shedding [1].
The MHC-restricted CTL response has been associated with decreases in viral load, and CD8+ lymphocytes were mostly responsible for the observed protection [1,56]. Cytotoxic T-lymphocyte (CTL) responses to infectious bronchitis virus (IBV) have been measured at regular intervals between 3 and 30 days post infection [1].
The MHCI prediction methods identified three conserved CTL epitopes, 985TARDMYMPR993, 983YITARDMYM991 and 982YYITARDMY990, as they were linked to 7 and 3 human MHCI alleles respectively and showed high antigenicity, no allergenicity and no toxicity. Recent studies showed that vigorous cytotoxic T lymphocyte (CTL) responses, which correlate with the initial decrease in infection and illness, can be detected after IBV infection. CD8+ T cells have been reported to become exhausted in the absence of CD4+ helper T cells. However, CD4+ T cells do not seem to be important in the initial resolution of IBV infection in chickens [56].
In the MHCII prediction, several core peptides were predicted to interact with MHCII alleles. Notably, the top core peptides were again 983YITARDMYM991 and 982YYITARDMY990, which were also obtained in the MHCI prediction. They were linked to 52 and 38 human alleles, respectively. These epitopes showed high antigenicity, no allergenicity and no toxicity.
The interaction of the ligands with the BF2*2101 receptor was found to be better than with BF2*0401. However, the docked molecules bound at a different site in the groove for each of the two BF alleles. Future studies should test the predicted epitopes experimentally to establish their safety and effectiveness.

Conclusion
In this study, five epitopes were predicted from the spike glycoprotein of IBV as the best B cell epitopes (1139KKSSYY1144, 1140KSSYYT1145 and 1141SSYYT1145) and T cell epitopes (982YYITARDMY990 and 983YITARDMYM991). They showed high antigenicity, no allergenicity and no toxicity, and the MHC epitopes were linked to a large number of alleles. The suggested epitopes should be synthesized, combined and tested as a multi-epitope vaccine against IBV. Such a peptide vaccine may help to control IBV infection in chickens by inducing humoral and cellular responses.