HELP !
Abbreviations: PG: paralogous groups. OG: orthologous groups. CTL: control.
HoxPred HELP
- Using HoxPred from the Website
- Sequence submission
- The server only accepts Protein sequences
- Up to 10 sequences may be submitted to the server at the same time. To analyse more sequences or for programmatic access, please consider using the SOAP access.
- It is recommended to submit at least 60 amino acids spanning the homeodomain region, as the scoring system is length-dependent. Nevertheless, HoxPred has been evaluated on 39 aa fragments produced by PCR surveys: 98% of the complete set of 64 amphibian PCR fragments (Mannaert et al., Molecular Phylogenetics and Evolution, 2006) are correctly classified with respect to PG, and 69% of these sequences are correctly classified into OG.
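The submission rules above (protein only, at most 10 sequences, at least 60 aa recommended) can be checked locally before pasting sequences into the form. The following is a minimal sketch, not part of HoxPred: the function names, the accepted alphabet, and the nucleotide heuristic are assumptions made for illustration.

```python
MAX_SEQUENCES = 10   # web form limit stated above
MIN_LENGTH = 60      # recommended minimum length, in amino acids
# Assumption: the 20 standard residues plus X (unknown) and * (stop) are acceptable.
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWYX*")
# Assumption: a sequence made only of these letters is probably nucleotide.
NUCLEOTIDE_ALPHABET = set("ACGTUN")


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(text):
    """Return a list of warnings; an empty list means no issue was found."""
    records = parse_fasta(text)
    warnings = []
    if len(records) > MAX_SEQUENCES:
        warnings.append(
            f"{len(records)} sequences: the web form accepts at most {MAX_SEQUENCES}"
        )
    for header, seq in records:
        letters = set(seq)
        if letters - PROTEIN_ALPHABET:
            bad = sorted(letters - PROTEIN_ALPHABET)
            warnings.append(f"{header}: non-protein characters {bad}")
        elif letters and letters <= NUCLEOTIDE_ALPHABET:
            warnings.append(f"{header}: looks like a nucleotide sequence")
        if len(seq) < MIN_LENGTH:
            warnings.append(
                f"{header}: only {len(seq)} aa, below the recommended {MIN_LENGTH}"
            )
    return warnings
```

A short fragment only triggers a warning here rather than an error, since the text above reports that fragments of 39 aa can still be classified into PG.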
- The different versions of HoxPred
- Initially, HoxPred was dedicated to the prediction of PG and OG in vertebrate sequences. It has since been extended to four versions, each with its own purpose. All versions need at least 60 amino acids spanning the homeodomain region for reliable classification, as the scoring system is length-dependent.
- The result page
- HoxPred results are displayed in a table where each row corresponds to a query sequence. The length of the query sequence is noted.
- HoxPred first predicts the PG and then the OG (brown arrows)
- Posterior probabilities returned by the discriminant analysis are also displayed (blue arrows). Only the prediction with the highest posterior probability is displayed in the table. Detailed results for each possible PG and OG can be found in the XML version. Though often informative, these posterior probabilities should be interpreted with care, since some misclassifications are known to be returned with a probability of 1.
- Raw results in XML format are accessible (green circle)
- The result table can be exported in a tab-delimited text format (red circle)
- Interpreting the results
- CTL: a CTL prediction indicates that HoxPred classified the sequence in the control group. The sequence is unlikely to be a Hox protein.
- Some OG cannot be predicted by the HoxPred method, since each is currently represented by only one non-redundant sequence: HoxA14, HoxD14, HoxD10. The evaluation of HoxPred showed that the PG are still correctly classified; only OG predictions are affected. This limitation should be circumvented as new sequences become available.
- Inside PG2, classification into A2 and B2 is limited by the very high similarity between these homeodomains. Although classification in PG2 is accurate, classification in the OG A2 and B2 should be interpreted with care.
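The interpretation rules above can be collected into a small helper that lists the caveats applying to one prediction. This is only a sketch: the label spellings ("CTL", "PG2", "HoxA2", "HoxB2") and the function name are assumptions, and the labels in real HoxPred output may be written differently.

```python
# Orthologous groups listed above as not predictable, keyed by their paralogous group.
UNPREDICTABLE_OGS = {"PG10": ["HoxD10"], "PG14": ["HoxA14", "HoxD14"]}


def caveats(pg, og=None, posterior=None):
    """Return the interpretation notes that apply to a single prediction.

    pg, og    -- predicted group labels (assumed spellings, see lead-in)
    posterior -- posterior probability of the prediction, if known
    """
    if pg == "CTL":
        return ["Control group: the sequence is unlikely to be a Hox protein."]
    notes = []
    if pg in UNPREDICTABLE_OGS:
        missing = ", ".join(UNPREDICTABLE_OGS[pg])
        notes.append(f"{missing} cannot be predicted; treat the OG with care.")
    if pg == "PG2" and og in ("HoxA2", "HoxB2"):
        notes.append("A2 and B2 homeodomains are very similar; treat the OG with care.")
    if posterior is not None and posterior >= 1.0:
        notes.append("A posterior probability of 1 does not rule out misclassification.")
    return notes
```

For example, `caveats("PG14", "HoxB14", 1.0)` returns two notes: one about the OG that cannot be predicted in PG14 and one about the posterior probability.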
- Using HoxPred remotely (SOAP Client)
- Why use the SOAP access? SOAP is a protocol for accessing a Web service. HoxPred is exposed as a Web service, accessible by remote client programs via the SOAP protocol. With the SOAP access, it is possible to submit more than 10 sequences to the server and to include HoxPred in a workflow. Although HoxPred is written in Java, the client program can be written in any language that supports the SOAP protocol.
- Example of clients for HoxPred
- Inputs and outputs
- Input parameter 1 = a FASTA sequence
- Output = an XML-formatted string
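To illustrate the inputs described above, the sketch below builds a SOAP 1.1 request envelope with the Python standard library, without sending it. The service namespace, the operation name, and the element names `sequence` and `version` are hypothetical placeholders; the real ones are defined by the service's WSDL, which is not reproduced on this page. The second parameter uses the four values listed on this page ("vert", "evo", "bilat_pred", "bilat_evo"), which presumably select the HoxPred version.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
# Hypothetical placeholders: take the real values from the HoxPred WSDL.
SERVICE_NS = "urn:example:hoxpred"
OPERATION = "predict"
# Values of input parameter 2 as listed on this help page.
VERSIONS = ("vert", "evo", "bilat_pred", "bilat_evo")


def build_request(fasta, version="vert"):
    """Return a SOAP envelope (as a string) carrying one FASTA sequence."""
    if version not in VERSIONS:
        raise ValueError(f"version must be one of {VERSIONS}")
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{OPERATION}")
    # Parameter 1: the FASTA sequence; parameter 2: the version keyword.
    ET.SubElement(call, "sequence").text = fasta
    ET.SubElement(call, "version").text = version
    return ET.tostring(envelope, encoding="unicode")
```

In practice a SOAP library that reads the WSDL would generate this envelope automatically; building it by hand only shows what travels over the wire. The XML string returned by the service can then be parsed with `ET.fromstring`.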
- Using Datab'Hox
- The query page
- Datab'Hox can be queried by sequence identifiers or by complex queries
- The identifiers can be Datab'Hox identifiers in the following form: DH_X, where X is a number
- The identifiers can also come from other databases:
- UniprotKB identifiers (eg: Q9UAL6)
- JGI identifiers (eg: e_gw.260.85.1|Brafl1)
- GLEAN identifiers for Strongylocentrotus purpuratus (eg: GLEAN3_02815)
- Ensembl ID for Gasterosteus aculeatus (eg: ENSGACG00000000310)
- Complex queries (= multiple criteria): Datab'Hox can be searched by combining multiple criteria. Empty fields are not taken into account.
- The Hox or ParaHox class predicted by HoxPred. Specify the class (eg PG1) and choose the version of HoxPred in the select menu. Default is "any".
- Specify an organism name (scientific or common)
- Specify a taxon, as given in NCBI Taxonomy (eg Echinodermata) or by its Taxonomy id (eg: 7586)
- Restrict the search to a given dataset by choosing it in the select menu. If no other field is filled, this simply lets you browse the chosen dataset.
- Demos of sample queries. Click on the Demo button to automatically fill the form.
- Warning
- Queries returning many results may take a moment to display the complete result.
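The identifier formats accepted by the query page can be told apart with a few regular expressions, which is convenient when preparing a mixed list of identifiers. This is a sketch built from the examples given above: the UniProtKB pattern follows the published accession format, while the JGI, GLEAN and Ensembl patterns are loose guesses generalised from the single example of each, and the function name is hypothetical.

```python
import re

# Order matters: the first pattern that matches the whole string wins.
ID_PATTERNS = [
    ("Datab'Hox", re.compile(r"DH_\d+")),
    ("GLEAN", re.compile(r"GLEAN3_\d+")),      # guess based on GLEAN3_02815
    ("Ensembl", re.compile(r"ENSGACG\d{11}")),  # G. aculeatus gene IDs
    ("UniProtKB", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
    )),
    ("JGI", re.compile(r"[^|\s]+\|\w+")),      # guess based on e_gw.260.85.1|Brafl1
]


def classify_identifier(identifier):
    """Return the name of the database an identifier appears to come from."""
    identifier = identifier.strip()
    for name, pattern in ID_PATTERNS:
        if pattern.fullmatch(identifier):
            return name
    return "unknown"
```

Identifiers reported as "unknown" are not necessarily invalid; they simply do not match any of the example formats shown above.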
- The result page
- To sort the results, click on the header of the column of interest. For large tables, this may take a short moment.
- Click to get additional information such as the sequence and external links
- Datab'Hox identifier
- Classification with the four versions of HoxPred. For vertebrates, the orthologous group is also indicated. The number in parentheses is the posterior probability associated with the prediction.
- Check this box to select this sequence
- Check this box to select all displayed sequences
- Click on this button to retrieve the FASTA sequences of the selected items.
- Detailed result page
- Datab'Hox identifier
- Various information on the protein
- Sequence and its length (important for interpreting the results, as at least 60 aa containing the homeodomain are required to obtain reliable results).
- External links, including when relevant : Uniprot, JGI, HomeoDB...
- This page also reports the posterior probability values returned by HoxPred (not shown on the screenshot)
| Version | Description | Purpose | How to interpret |
|---|---|---|---|
| Vertebrate | This is the original HoxPred version. It classifies into PG and OG. The training set comprised only vertebrate sequences. The CTL group included non-Hox homeodomains. | Classification of vertebrate Hox sequences into PG and OG | Non-CTL sequences are likely to be true Hox sequences. |
| Vertebrate_relaxed | It classifies into PG. The training set comprised only vertebrate sequences. The CTL group did not include non-Hox homeodomains (only random sequences). | Evolutionary studies of Metazoan homeodomain sequences | Hox sequences are classified into PG. Very divergent Hox sequences may be classified as CTL. Non-Hox homeodomains may be classified into CTL or into PG (like ParaHox). Non-CTL sequences thus contain both true Hox sequences and certain non-Hox homeodomain sequences. |
| Bilateria | It classifies into ANT/CENT/POST + GSX/CDX/XLOX. The training set comprised bilaterian sequences. The CTL group included non-Hox homeodomains. | Classification of bilaterian Hox and ParaHox sequences | This is a stringent version for Hox and ParaHox sequences. Non-CTL sequences are likely to be true Hox or ParaHox sequences. Can be used in combination with "Vertebrate_relaxed" to obtain the PG predictions and also distinguish Hox from non-Hox sequences. |
| Bilateria_relaxed | It classifies into ANT/CENT/POST. The training set comprised bilaterian sequences. The CTL group did not include non-Hox homeodomains (only random sequences). | Evolutionary studies of Metazoan homeodomain sequences | This is a less stringent version of "Bilateria". Non-CTL sequences contain both true Hox sequences and non-Hox homeodomain sequences. |
SOAP input parameter 2 = "vert", "evo", "bilat_pred" or "bilat_evo"