This data is taken from the publication:

  http://www.ncbi.nlm.nih.gov/pubmed/12893887
  Science. 2003 Sep 12;301(5639):1503-8. Epub 2003 Jul 31.
  Discovery of gene function by expression profiling of the malaria parasite life cycle.
  Le Roch KG, Zhou Y, Blair PL, Grainger M, Moch JK, Haynes JD, De La Vega P, Holder AA, Batalov S, Carucci DJ, Winzeler EA.

SuppTable1_newIDs.txt:
  Publicly available Suppl. Table 1 from the above publication.

sexualGenes-clusters1-2-3.txt:
  This file was taken from the above publication and lists all the gene IDs that were assigned to clusters 1, 2 or 3.

geneCoordinates-expressionFoldChanges-SorbitolTreatedData.txt OR
geneCoordinates-expressionFoldChanges-SorbitolTreatedData.xls:
  To create this file we extracted fold change values from the sorbitol-treated dataset for:
   . 0hr (Early ring S, col #6), 12hr (Late ring S, col #8)
   . 18hr (Early trophozoite S, col #10) and 24hr (Late trophozoite S, col #12)
   . 30hr (Early schizont S, col #14) and 36hr (Late schizont S, col #16)
   . Merozoite (col #18), Gametocyte (col #42), Sporozoite (col #44)
  We cross-referenced gene IDs (new_ID) to gather gene coordinates.
  For genes that appear more than once (i.e. duplicate new_IDs), we kept the entry with the higher total expression.
  Each gene is also assigned to the 10kb window that contains the largest portion of that gene.
  We also mark each gene as sexual/notSexual (1/0) using the sexualGenes-clusters1-2-3.txt file described above. This value is in the last column.
  The script named "extractExprVals.sh" does this extraction and also converts the .txt file to an Excel file.
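
The extraction steps described above could be sketched in Python roughly as follows. This is an illustrative sketch only, not the contents of extractExprVals.sh. All function and variable names are hypothetical. It assumes that the column numbers are 1-based, that "total expression" means the sum of the nine extracted values, that gene coordinates are 1-based and inclusive, and that a gene split equally between two windows goes to the lower one.

```python
# Hypothetical sketch; none of these names come from extractExprVals.sh.

# Column numbers listed in this README, assumed here to be 1-based.
STAGE_COLS = [6, 8, 10, 12, 14, 16, 18, 42, 44]
WINDOW = 10_000  # window size in bp


def extract_values(fields):
    """Pick the nine stage values out of one tab-split row."""
    return [float(fields[c - 1]) for c in STAGE_COLS]


def keep_highest_total(records):
    """Given (new_ID, values) pairs, keep one entry per new_ID.

    Assumption: "total expression" is the sum of the extracted values.
    """
    best = {}
    for gene_id, values in records:
        if gene_id not in best or sum(values) > sum(best[gene_id]):
            best[gene_id] = values
    return best


def assign_window(start, end, window=WINDOW):
    """Return the 0-based index of the window holding most of the gene.

    Assumption: start/end are 1-based inclusive, so window 0 spans
    positions 1..10000, window 1 spans 10001..20000, and so on.
    Ties go to the lower-numbered window.
    """
    first = (start - 1) // window
    last = (end - 1) // window
    best, best_overlap = first, 0
    for w in range(first, last + 1):
        w_start = w * window + 1
        w_end = (w + 1) * window
        overlap = min(end, w_end) - max(start, w_start) + 1
        if overlap > best_overlap:
            best, best_overlap = w, overlap
    return best


def sexual_flag(gene_id, sexual_ids):
    """Return 1 if the gene is listed in clusters 1, 2 or 3, else 0."""
    return 1 if gene_id in sexual_ids else 0
```

For example, under these assumptions a gene spanning positions 9000 to 12000 has 1001 bases in the first window and 2000 in the second, so assign_window(9000, 12000) places it in window index 1.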