Data mining of public microarray data with TBrowser
TranscriptomeBrowser (TBrowser) hosts a large database of transcriptional signatures (TS, n~20 000) extracted from the GEO public microarray repository using the DBF-MCL algorithm. TBrowser comes with a search engine that lets users look for the biological contexts in which several genes were concomitantly regulated. Several examples are provided below and in the article published in PLoS ONE. A video tutorial is available here.
The current database contains about 20 000 TS derived from ~1 500 microarray datasets (~222 million expression values). Each TS was tested for functional enrichment using annotations obtained from numerous ontologies and curated databases (Gene Ontology, KEGG, BioCarta, Swiss-Prot, BBID, SMART, NIH Genetic Association DB, COG/KOG...) through the DAVID knowledgebase.
Simply paste your gene list in the search panel and modify the "%min." argument
Let's say you performed a microarray experiment and found 100 genes that best discriminate between your conditions A and B. You would like to find the biological contexts in which these genes have already been observed as co-regulated. The probability of finding a transcriptional signature that contains the whole gene list is low. In this case, it is advisable to paste the gene list directly into the search panel (line feeds are converted into spaces) (step 1). The "%min." argument sets the minimum proportion of the listed genes that must fall into a transcriptional signature. In the following example, we pasted 34 gene symbols into the search panel and set "%min." to 50%, which means that we are looking for transcriptional signatures containing at least 17 genes from the list (step 2). Pressing the "search" button (step 3) returns 16 signatures (result panel). Information about the signatures is available by pressing the "show" button. The signatures can then be sent to plugins.
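The arithmetic behind the "%min." argument can be illustrated with a minimal Python sketch. This is not TBrowser code: the function names `min_hits` and `matching_signatures`, the toy signatures, and the choice to round the threshold up are all assumptions made for illustration only.

```python
import math


def min_hits(n_query_genes, pct_min):
    """Smallest number of query genes a signature must contain.

    Assumption: the threshold is rounded up; TBrowser's actual rounding
    rule is not stated in this document.
    """
    return math.ceil(n_query_genes * pct_min / 100)


def matching_signatures(query_genes, signatures, pct_min):
    """Return names of signatures sharing at least pct_min % of the query genes.

    `signatures` is a hypothetical mapping: signature name -> iterable of gene symbols.
    """
    query = set(query_genes)
    needed = min_hits(len(query), pct_min)
    return [
        name
        for name, genes in signatures.items()
        if len(query & set(genes)) >= needed
    ]


# The example from the text: 34 genes with "%min." = 50% requires 17 shared genes.
threshold_for_example = min_hits(34, 50)

# Toy data (invented) showing the filter itself.
toy_signatures = {
    "TS_A": {"ESR1", "GATA3", "FOXA1", "KRT18"},
    "TS_B": {"CD3E", "PCNA"},
}
toy_hits = matching_signatures(
    ["ESR1", "GATA3", "CD3E", "MKI67"], toy_signatures, 50
)
```

With a four-gene query and "%min." at 50%, a signature needs two of the query genes, so only `TS_A` is retained in the toy data.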
Some examples of Boolean queries
In Boolean mode, one can search the GeneID Field with the following queries.
- "ESR1 & GATA3 & FOXA1" -> TS related to breast cancer (containing ESR1 and GATA3 and FOXA1)
- "CD3E & CD3D" -> TS related to T cells (containing CD3E and CD3D)
- "CD3E & CD3D & !CD14" -> TS that contain T-cell markers but not the monocyte/macrophage marker CD14 (containing CD3E and CD3D but not CD14)
- "PCNA & MKI67 & CDC2" -> TS containing cell-cycle related genes (containing PCNA and MKI67 and CDC2)
The user can then ask for genes that are frequently observed in the selected TS (TBCommonGene plugin).
In Boolean mode, one can search the Annotation Field with the following queries (the user may select the q-value):
- "CELL CYCLE"[5,12,18] -> TS enriched in genes associated with the functional annotation term "CELL CYCLE".
- 6p21.3[4] & 14q32.33[4] & "T CELL ACTIVATION"[5,12] -> TS enriched in genes from 6p21.3 and 14q32.33 cytobands (major histocompatibility complex locus and human immunoglobulin heavy-chain locus respectively) and containing genes related to T-cell activation.
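The GeneID queries listed earlier can be read as set-membership tests on the genes of each signature. The following Python sketch illustrates that reading under stated assumptions: it handles only the "&" (and) and "!" (not) operators that appear in the examples, the functions `parse_query` and `matches` are hypothetical, and TBrowser's real query grammar may well be richer.

```python
def parse_query(query):
    """Split a query such as 'CD3E & CD3D & !CD14' into required and excluded genes.

    Assumption: only '&' and a leading '!' are supported, as in the examples above.
    """
    required, excluded = set(), set()
    for term in query.split("&"):
        term = term.strip()
        if term.startswith("!"):
            excluded.add(term[1:].strip())
        elif term:
            required.add(term)
    return required, excluded


def matches(query, signature_genes):
    """True if the signature contains every required gene and no excluded gene."""
    required, excluded = parse_query(query)
    genes = set(signature_genes)
    return required <= genes and not (excluded & genes)


# Toy signatures (invented gene lists, for illustration only).
t_cell_like = {"CD3E", "CD3D", "CD2"}
mixed_blood = {"CD3E", "CD3D", "CD14"}

keeps_t_cell = matches("CD3E & CD3D & !CD14", t_cell_like)
keeps_mixed = matches("CD3E & CD3D & !CD14", mixed_blood)
```

In this toy case the query retains the T-cell-like signature and rejects the one that also contains CD14.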