Explanation of the Probe Information Table

  1. The probe identifier is usually a six-character symbol. The first three characters give the X coordinate of the probe cell on the Affymetrix array (U74Av2, M430AB, or M430 2.0); for example, 222 signifies column 222 and X85 signifies column 85 ("X" is used as a buffer character). Similarly, the second set of three characters in each probe ID gives the Y coordinate of the probe cell. Perfect Match (PM) probes always end in an odd integer; MisMatch (MM) probes always end in an even integer. Note that PM and MM probes come in pairs located in the same column but in adjacent rows (e.g., X22Y35 and X22Y36). Some probes on the 430 2.0 array have X and Y coordinates greater than 999. Data source: Affymetrix.

  2. The 25-nucleotide sequence of each probe in 5' to 3' order. The 5' end of each probe is attached to the quartz array substrate; the 3' end is free and often frayed. Roughly 10% to 20% of oligonucleotides are likely to be complete 25-mers. Alternating colors distinguish the Perfect Match sequences (GREEN) from the MisMatch sequences (BLACK). Note that the 13th nucleotide always differs between pairs of PM and MM probes in adjacent rows of this table. Data source: Affymetrix.

  3. Blast 2 Sequences (bl2seq) is an NCBI tool that aligns the 25-mer probe sequence to the GenBank accession that Affymetrix reports having used to design the probes. We automatically submit both sequences to NCBI's BLAST 2 server and return the results of this simple alignment in a new window. You can check that probes were correctly designed to hybridize to the cRNA sample and are not inadvertently antisense probes. Note that the GenBank entry may be backwards in orientation, in which case the probes may be correctly oriented even if they appear to be on the wrong strand.

  4. This column provides the approximate exon number to which the probe sequences should bind. If no exon is listed, then in 9 out of 10 cases the probes target the 3' untranslated region (3' UTR) at the end of the message near the polyadenylation site. A format such as "7*8" signifies that the probe sequence is taken from both exons 7 and 8. Affymetrix tries to use probes that target the 3' UTR since these tend to be more common to transcripts produced by a gene. However, many genes actually have alternate 3' UTRs. For example, Egr1 has two different mRNAs with 3' UTRs that are separated by several hundred base pairs. To confirm the exon assignment of probes, please click on the BLAT PM PROBES button above. This will open a BLAT Search Results window. Click on the link labeled "browser" at the far left. Finally, click on the Zoom Out 10X button (far right toward the top). Data source: Ensembl and UTHSC by Yanhua Qu. We thank Yan Cui for use of his Linux cluster to make these assignments.

  5. The approximate melting temperature of the probe sequence and the cRNA. These estimates are actually computed for a DNA-DNA duplex rather than a DNA-cRNA heteroduplex. Data source: Leonard Schalkwyk and Yanhua Qu.

  6. These stacking energy estimates (KbT) are the free energies computed using the method of Li Zhang and Mike Miles (Nat. Biotech 21:818). In essence, they provide an estimate of the gene-specific binding energy (GSB) and the non-specific binding energy (NSB). Low values of GSB and higher values of NSB tend to enhance specificity/stability of binding between the DNA probe and the cRNA target (lower free energy reflects tighter binding). These values are computed for DNA-DNA hybrids and will typically underestimate the binding of a DNA-cRNA heteroduplex. The larger the difference between GSB and NSB (assuming GSB is already lower), the better. It is often the case that lower GSB values are associated with higher mean signal of the probes. (text by M Miles 9/22/03, last update 01/13/05).

  7. The mean probe signal intensity averaged across all strains (not cases) in this specific data set. Please refer to the INFO page for a summary of strains and cases used to generate this particular data set. The scale of these numbers is close to a log base 2 re-expression of the original Affymetrix CEL file output. However, the values have usually been standardized as described on the INFO page ("2z+8" method). Data source: INFO page.

  8. The sample standard deviation of probe signal across the strain mean estimates. Please refer to the INFO page for a summary of strains and cases used to generate this data set. Data source: INFO page.

  9. Heritability of the probe signal intensity computed across the panel of isogenic strains. Heritability is essentially the ratio of the between-strain mean square error term to the sum of the within-strain mean square error term (the error mean square) plus the between-strain mean square error term. A highly informative probe is one with little within-strain variability but a great deal of among-strain variability. Naturally, we attempt to correct for batch effects and other non-genetic sources of among-strain variability. Some probes may have anomalously high heritability due to the presence of a sequence variant such as a SNP in the transcript sequence that is complementary to the probe. Small deletions in one of the parental strains will also generate high heritability in some probes (see Pparbp for an example). When we can confirm that a probe does contain a SNP or other sequence variant, we discount that probe and do not use it for generating a heritability-weighted consensus estimate of transcript expression.

  10. Mouse mm8 probe set locations (historical). Should be updated to mm9. Probe locations were obtained from Ensembl ftp://ftp.ensembl.org/pub/current_mus_musculus/data/mysql/mus_musculus_core_43 by Hongqiang Li. We made use of text files and MySQL tables called:

    1. oligo_feature.txt.table.gz (25774 KB, 3/1/07 1:53:00 AM)
    2. oligo_probe.txt.table.gz (24411 KB, 3/1/07 1:54:00 AM)
    3. seq_region.txt.table.gz (383 KB, 3/1/07 1:59:00 AM)

  11. This column provides the names of single nucleotide polymorphisms that overlap the probe sequence. You can link from entries in this column directly to the GeneNetwork SNP Browser.
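
To illustrate the probe identifier scheme described in item 1, the following Python sketch splits a six-character ID into its X and Y coordinates and reports whether it is a PM or MM probe. The function name parse_probe_id is hypothetical. Treating "Y" as the buffer character of the second triplet is an assumption based on the X22Y35 example, and longer identifiers (coordinates above 999) are deliberately not handled.

```python
def parse_probe_id(probe_id):
    """Split a six-character probe ID into (x, y, kind).

    Assumes "X" pads the first triplet and "Y" pads the second,
    as in the example X22Y35. Not an official Affymetrix parser.
    """
    if len(probe_id) != 6:
        raise ValueError("this sketch only handles six-character IDs")
    # Drop the buffer characters that pad coordinates below 100.
    x = int(probe_id[:3].lstrip("X"))
    y = int(probe_id[3:].lstrip("Y"))
    # PM probes end in an odd integer, MM probes in an even one.
    kind = "PM" if y % 2 == 1 else "MM"
    return x, y, kind


pm = parse_probe_id("X22Y35")
mm = parse_probe_id("X22Y36")
```

Here pm and mm share the same column (X) and sit in adjacent rows (Y), which is the pairing described in item 1.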
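
The statement in item 2 that PM and MM probes differ at the 13th nucleotide can be checked with a short Python sketch. The function name mismatch_positions is hypothetical and the two sequences are made up for illustration; they are not real probe sequences. The sketch reports every position that differs, so it makes no assumption about the other 24 positions.

```python
def mismatch_positions(pm_seq, mm_seq):
    """Return the 1-based positions at which two probe sequences differ."""
    if len(pm_seq) != len(mm_seq):
        raise ValueError("sequences must have the same length")
    return [
        pos
        for pos, (a, b) in enumerate(zip(pm_seq, mm_seq), start=1)
        if a != b
    ]


# Made-up 25-mers: the second differs from the first only at position 13.
example_pm = "ACGTACGTACGTACGTACGTACGTA"
example_mm = example_pm[:12] + "T" + example_pm[13:]
positions = mismatch_positions(example_pm, example_mm)
```

For a PM/MM pair taken from adjacent rows of the table, the returned list would be expected to contain position 13.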
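
The "2z+8" scale mentioned in item 7 can be sketched as follows. This is only an illustration under stated assumptions: that the label means a log base 2 transform, followed by conversion to z-scores, multiplication by 2, and addition of 8, and that the sample standard deviation is used. The INFO page for each data set remains the authoritative description, and the function name two_z_plus_eight is hypothetical.

```python
import math
import statistics


def two_z_plus_eight(raw_values):
    """Rescale positive raw intensities to a mean of 8 and an SD of 2.

    Assumed steps: log2 transform, z-score, then 2 * z + 8.
    """
    logs = [math.log2(v) for v in raw_values]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)  # sample SD; an assumption of this sketch
    return [2 * (x - mean) / sd + 8 for x in logs]


# Made-up raw intensities, for illustration only.
scaled = two_z_plus_eight([100, 200, 400, 800, 1600])
```

On this scale a difference of one unit corresponds to half a standard deviation of the log-transformed values.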
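
The heritability ratio described in item 9 can be written out as a small one-way analysis of variance. The sketch below is a minimal illustration, not the pipeline used to generate the table: it applies no batch correction, the function name heritability_ratio is hypothetical, and the strain labels and values are made up.

```python
from statistics import mean


def heritability_ratio(values_by_strain):
    """Between-strain mean square / (within + between mean squares).

    values_by_strain maps a strain label to a list of replicate signals.
    """
    groups = list(values_by_strain.values())
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    n_strains = len(groups)
    n_total = len(all_values)
    # Between-strain sum of squares, weighted by replicate count.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-strain (error) sum of squares.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (n_strains - 1)
    ms_within = ss_within / (n_total - n_strains)
    return ms_between / (ms_between + ms_within)


# Made-up signals: replicates agree closely within each strain.
example = {
    "strain_1": [9.0, 9.2],
    "strain_2": [10.0, 10.2],
    "strain_3": [11.0, 11.2],
}
h = heritability_ratio(example)
```

Because the made-up replicates vary little within a strain but the strain means are well separated, h is close to 1, which matches the description of a highly informative probe.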