Welcome to the L2L Microarray Analysis Tool

A simple tool for discovering the hidden biological significance
in microarray expression data

L2L uses simple, tab-delimited formats for all of its files.
The three types of files it needs are data files, translator libraries, and list files.

Data Files

Data files contain a user's own experimental data - the list of genes that were up- or down-regulated in a microarray experiment. Genes that were up-regulated and genes that were down-regulated should be put in separate files and analyzed separately. The file is simply a list of unique probe identifiers for the particular microarray system that was used, one identifier per line:

probeID1
probeID2
probeID3
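As a minimal sketch of how such a file could be read, the Python snippet below collects the unique identifiers from a data file. The function name `read_probe_ids` is hypothetical and not part of L2L; the case-insensitive comparison is an assumption based on this documentation describing identifiers as case-insensitive.

```python
import io


def read_probe_ids(handle):
    """Return the unique probe identifiers in a data file, in first-seen order."""
    seen = set()
    probes = []
    for line in handle:
        probe = line.strip()
        if not probe:
            continue  # skip blank lines
        key = probe.lower()  # assumption: compare identifiers case-insensitively
        if key not in seen:
            seen.add(key)
            probes.append(probe)
    return probes


# In-memory stand-in for a data file with one duplicate and one blank line.
sample = io.StringIO("200007_at\n200007_AT\n\n202116_at\n")
print(read_probe_ids(sample))  # ['200007_at', '202116_at']
```
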

Support for a variety of popular microarray systems is built into L2L. If your microarray system isn't among them, you can create your own translator library. The following tables list the supported microarray systems and a few sample probe identifiers. Note that probe identifiers for chip sets (U133 Set, U95 Set, etc.) include the chip ID. All identifiers are case-insensitive (e.g. 200007_at and 200007_AT are both fine).

Built-in microarray platforms and sample probe identifiers

Human microarrays
Platform | Probe IDs | Notes
Affy HG-U133 Plus 2.0 | 1552275_s_at, 211759_x_at | Affy's current one-chip whole-genome expression array.
Affy HG-U133 Set | 202116_at_HG-U133A, 244828_x_at_HG-U133B | Set of U133A and U133B; requires chip IDs.
Affy HG-U133A 2.0 | 203753_at, 222209_s_at | Current, revised version of U133A.
Affy HG-U133A | 202116_at, 219899_x_at | Chip from U133 Set that includes most known genes.
Affy HG-U95 Set | 36361_at_HG_U95Av2, 63169_at_HG_U95C | Previous-generation comprehensive chip set, including U95Av2 and U95B-E; requires chip IDs.
Affy HG-U95Av2 | 152_f_at, 32226_at | Chip from previous-generation set that includes most known genes.
Affy HG-Focus | 200018_at, 205651_x_at | "Starter chip" that contains a subset of the known genes on U133A.
Affy Hu6800/HuGeneFL | AB002332_at, M15465_s_at | Previous-generation "starter chip".
Affy Human Cancer G110 | 1531_at, 2050_s_at | Custom array that includes 1700 genes implicated in cancer.
Affy Hu35K Set | RC_AA283044_AT_HU35KSUBA, RC_N25923_F_AT_HU35KSUBD | Discontinued array set, including Hu35K subA, subB, subC and subD. Requires chip IDs.
Affy Hu35KA | AA019475_at, RC_AA071075_at | Chip from Hu35K set that includes most known genes.
Agilent Whole Human Genome | A_24_P417162, A_23_P414312 | Agilent's current one-chip whole-genome expression array.
Agilent H1 Set | A_23_P123587_1Av2, A_32_P449722_1B | Previous-generation comprehensive chip set, including H1Av2 and H1B; requires chip IDs.
Agilent H1A | A_23_P12628, A_23_P216610 | Chip from previous-generation set that includes most known genes.
Illumina Human-6 BeadChip | GI_10092672-S, GI_31415879-I | Whole-genome bead chips for analyzing 6 samples simultaneously.
Illumina HumanRef-8 BeadChip | GI_18765751-I, GI_42794764-A | Bead chips that permit analysis of 8 samples for characterized RefSeq transcripts.
NIH/NIA 15k Human | 1, 36382 | 15k cDNA array from the NIA's Gene Expression and Genomic Unit; probe ID is "14k index" column of annotation file.
JHU/NIH MGC1 Human | 1-R3C5, 384-R4C2 | cDNA array from the NIA/JHU Microarray Facility; probe ID is grid coordinate of feature.
All HUGO names | ACTB, LMAN1 | Default translator library if your array is not supported by L2L.

Mouse microarrays
Platform | Probe IDs | Notes
Affy Mouse 430 2.0 | 1436006_at, 1448142_x_at | Affy's current one-chip whole-genome expression array.
Affy MOE 430 Set | 1415691_at_MOE430A, 1444190_at_MOE430B | Set of 430A and 430B; requires chip IDs.
Affy Mouse 430A 2.0 | 1422256_at, 1439185_x_at | Current, revised version of 430A.
Affy MOE 430A | 1415689_s_at, 1419249_at | Chip from 430 Set that includes most known genes.
Affy MG_U74v2 Set | 162478_r_at_MG_U74Av2, 136558_at_MG_U74Cv2 | Previous-generation comprehensive chip set, including U74(A,B,C)v2; requires chip IDs.
Affy MG_U74Av2 | 100015_at, 160660_r_at | Chip from U74 set that includes most known genes.
Affy Mu11k Set | l25913_f_at_Mu11ksubA, Msa.3206.0_s_at_Mu11ksubB | Early chip set, including Mu11ksubA and Mu11ksubB; requires chip IDs.
Affy Mu11k SubA | aa000380_s_at, U35142_f_at | Chip from Mu11k set that includes most known genes.
Agilent Whole Mouse Genome | A_51_P438924, A_51_P477917 | Agilent's current one-chip whole-genome expression array.
Agilent Mouse v2 | A_51_P371972, A_51_P220343 | Previous-generation Agilent array.
Agilent Mouse Development | A_66_P100091, A_66_P137744 | Agilent array derived from NIA Mouse Gene Index, optimized for stem cell and developmental studies.
Illumina Mouse-6 BeadChip | GI_38074788-S, RI|0610030G03|R000004K23|AK002703|711-S | Whole-genome bead chips for analyzing 6 samples simultaneously.
Illumina MouseRef-8 BeadChip | GI_6671508-S, SCL40201.4.1_211-S | Bead chips that permit analysis of 8 samples for characterized RefSeq transcripts.
NIH/NIA Mouse 15k | H3015B04, H3073F11 |
NIH/NIA Mouse 7.4k | H4001C04, H4048B10 |
JHU/NIA M17Kam Set | 2-R1C3-A, 381-R1C1-B | Based on NIH/NIA mouse 15k clone set; probe ID is grid coordinate of feature.

Rat microarrays
Platform | Probe IDs | Notes
Affy Rat 230 2.0 | 1367474_at, 1387360_at | Affy's current one-chip whole-genome expression array.
Affy RAE 230 Set | 1367466_at_RAE230A, 1392885_at_RAE230B | Set of 230A and 230B; requires chip IDs.
Affy RAE 230A | 1367468_at, 1373932_at | Chip from 230 Set that includes most known genes.
Affy HT Rat Focus | 1390315_a_at, 1369905_at | High-throughput array plate that contains probes for 16,000 of the best-characterized genes on the Rat 230 2.0 array.
Affy RG_U34 Set | AA685112_at_RG_U34A, rc_AA819798_g_at_RG_U34C | Previous-generation comprehensive chip set, including U34(A,B,C); requires chip IDs.
Affy RG_U34A | L23148_at, rc_AI234969_s_at | Chip from U34 set that includes most known genes.
Agilent Whole Rat Genome | A_44_P1002173, A_44_P308858 | Agilent's current one-chip whole-genome expression array.
Agilent Rat v2 | A_42_P745958, A_43_P11226 | Previous-generation Agilent array.

Primate microarrays
Platform | Probe IDs | Notes
Affy Rhesus macaque | MmugDNA.3443.1.S1_at, MmugDNA.16349.1.S1_at |
Agilent Rhesus macaque | A_23_P202540, KRM_1_04524 |

"All HUGO Names" is intended to be used for gene annotation - you can put a few genes you want to annotate in your data file, use this translator library, and see which L2L Microarray Database lists your genes of interest are found on. "All HUGO Names" can also be a used as a default "microarray system" if your microarray is not represented and you do not want to create a translator library for it. However, L2L's statistical analysis relies on knowing how many genes are actually on your microarray, and how many of those were changed in your experiment. Therefore, you should not put much faith in any P values or fold-enrichment numbers if you use "All HUGO Names". It is also very easy to create a new translator library (see below), so we highly recommend you do this if L2L doesn't include a translator library for your microarray system.

Translator Libraries

A translator library allows L2L to translate gene names to microarray probe identifiers and back. It is a tab-delimited file with a paired probe identifier and HUGO name on each line:

#L2L     library
#NAME    Affy-SomeArray_v1
#RELEASE  2007.1
probeID1    XYZ1
probeID2    ABCD1
probeID3    HUJA6

The annotations at the top of the file are optional; they are not used by L2L. The probe identifiers can be anything, as long as they match the probe identifiers you use for your data. Try to avoid special characters, however. The web interface of L2L will warn you if your uploaded translator library has improper characters in it (this is a security measure). Gene names must be official HUGO gene symbols in order for L2L's gene annotation functions to work (linking to EntrezGene, for example).
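The two-way translation can be sketched in Python as a pair of lookup tables. This is an illustrative reading of the format, not L2L's own code: the function name `read_translator` is hypothetical, the columns are assumed to be separated by a single tab, and probe identifiers are lowercased on the assumption that they are matched case-insensitively.

```python
import io
from collections import defaultdict


def read_translator(handle):
    """Build probe -> gene and gene -> probes lookups from a translator library."""
    probe_to_gene = {}
    gene_to_probes = defaultdict(list)
    for raw in handle:
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the optional annotations
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a probe/gene pair
        probe = fields[0].strip().lower()
        gene = fields[1].strip()
        probe_to_gene[probe] = gene
        gene_to_probes[gene].append(probe)
    return probe_to_gene, dict(gene_to_probes)


# In-memory stand-in for a library in which two probes map to the same gene.
sample = io.StringIO(
    "#L2L\tlibrary\n"
    "#NAME\tAffy-SomeArray_v1\n"
    "probeID1\tXYZ1\n"
    "probeID2\tABCD1\n"
    "probeID3\tXYZ1\n"
)
probe_to_gene, gene_to_probes = read_translator(sample)
print(probe_to_gene["probeid2"])  # ABCD1
print(gene_to_probes["XYZ1"])     # ['probeid1', 'probeid3']
```
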

Database Lists

Each list in the database is a file with a few annotations at the top, followed by the HUGO gene symbols of all the genes on that list, one per line:

#L2L     listfile
#NAME    brca1_overexp_up
#REFERENCE    12032322
#DESCRIPTION  Upregulated by induction of BRCA1 in EcR-293 cells
#KEYWORDS cancer
#PLATFORM HuGeneFL
#RELEASE  2007.1
#ORGANISM human
#REPOSITORY 
FSTL1
GALNT3
SEC10L1
HTATIP...

The first line tells the L2L application that this is, indeed, a list. This line should be the same for all list files. The second line is a short, informative name for the list (the same as the file name). It should contain only alphanumeric characters and underscores. The third line is a reference to the source of the list. For L2L Microarray Database lists, this is the PubMed ID of the source publication. The fourth line is a description of the list. It can be as long as necessary, and can include any character except tabs.

The fifth line can contain one of a number of keywords for browsing the database and (in a future revision of L2L) restricting searches to particular topics. The sixth line describes the platform (microarray or otherwise) that was used to generate the data encompassed by the list. The seventh line is the release version of the list; this is for the user's reference only and is not used by the L2L program (i.e. it can be omitted). The eighth line is the organism from which the data were generated. The ninth line contains a repository accession number, if the publication was linked to any data in Gene Expression Omnibus or ArrayExpress. Every other line in the file contains one of the genes on the list. Note that the order and position of the annotation lines is not critical; L2L identifies annotations by their "#XXX" designators.
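Because annotations are identified by their designators rather than by position, a reader for this format can be sketched as follows. The function name `parse_list_file` is hypothetical, and splitting each annotation on the first run of whitespace is an assumption about how the designator is separated from its value.

```python
def parse_list_file(lines):
    """Split a list file into an annotations dict and a list of gene symbols."""
    annotations = {}
    genes = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("#"):
            # "#NAME<whitespace>value" -> key "NAME", value "value" (may be empty)
            parts = line[1:].split(None, 1)
            if parts:
                annotations[parts[0]] = parts[1] if len(parts) > 1 else ""
        else:
            genes.append(line)
    return annotations, genes


# Annotation lines are deliberately shuffled among the genes.
sample = [
    "#NAME\tbrca1_overexp_up",
    "FSTL1",
    "#L2L\tlistfile",
    "#DESCRIPTION\tUpregulated by induction of BRCA1 in EcR-293 cells",
    "#REPOSITORY",
    "GALNT3",
]
annotations, genes = parse_list_file(sample)
print(annotations["NAME"])  # brca1_overexp_up
print(genes)                # ['FSTL1', 'GALNT3']
```
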

Beginning with release 2006.2 (L2L v1.1), the L2L command-line tool can use a single-file database format to speed high-throughput batch processing. This format is simply a concatenation of all list files into a single text file, with a line containing "##" at the end of each list of genes.
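A single-file database of this kind could be split back into its individual lists along these lines. This is a sketch under the assumption that the terminator line consists of exactly "##"; the function name `split_database` and the sample list names are hypothetical.

```python
def split_database(lines):
    """Group the lines of a single-file database into one record per list."""
    records = []
    current = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.strip() == "##":
            records.append(current)  # "##" closes the current list
            current = []
        elif line.strip():
            current.append(line)
    if current:
        records.append(current)  # tolerate a final list with no trailing "##"
    return records


sample = [
    "#L2L\tlistfile", "#NAME\tlist_a", "GENE1", "GENE2", "##",
    "#L2L\tlistfile", "#NAME\tlist_b", "GENE3", "##",
]
for record in split_database(sample):
    print(record)
```
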

L2L provides a variety of annotations and hyperlinks to external references in its output HTML files. The data needed to generate these links are contained in a text file, "listsets.txt", with a format similar to the single-file databases:

#L2L     listset
#NAME    l2lmdb
#REFERENCE  http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=XURLX
#REFDESC    PubMed Abstract
#DESCRIPTION  L2L MDB
##

"NAME" is the shorthand name to be used in file and folder names. "REFERENCE" is a template for hyperlinking to the data source for a list in this database; "XURLX" will be replaced with the "REFERENCE" annotation from the individual list. "REFDESC" is a description of the hyperlink destination for display in output web pages. "DESCRIPTION" is also for display in output web pages (a longhand version of "NAME"). "##" marks the end of one entry in the file.
