CONTENT
-------

These tar files contain the exome next-generation sequencing data from a
primary breast cancer and the associated metastatic lymph node. The data
were analyzed in the following paper to identify candidate somatic
mutations:

    Zare, Habil, Junfeng Wang, Alex Hu, Kris Weber, Josh Smith,
    Debbie Nickerson, ChaoZhong Song, Daniela Witten, C. Anthony Blau,
    and William Stafford Noble. "Inferring clonal composition from
    multiple sections of a breast cancer." PLoS Computational Biology
    10, no. 7 (2014): e1003703.

SUBSECTIONS
-----------

9 subsections were sequenced: 7 primary tumor subsections, 1 subsection
from a metastatic lymph node, and 1 normal sample. Each tar file name
has the following form:

    sample_.tar.gz

For each subsection, several fastq files are provided. Most of the
sequencing is paired-end, but some subsections (5773 and 5774) may also
contain single-end data.

The following table shows how each tar file maps to the subsection names
used in the paper above. Note that some of these subsections were not
analyzed in the subsequent round of targeted sequencing, and some
subsections analyzed by targeted sequencing are not included here. Thus,
the IDs from the exome sequencing data do not completely match the
subsection IDs in the paper.

Subsection-ID (in file name)   Reference ID (in paper)     Source
-----------------------------------------------------------------------
5775                           N1-1                        normal
5772                           P1-3                        primary
5773                           P1-2                        primary
5774                           P1-5 (no deep analysis)     primary
5776                           M1-1                        metastatic
31688                          P3-3 (no deep analysis)     primary
31689                          P3-4 (no deep analysis)     primary
31690                          P2-1                        primary
31691                          P2-3                        primary

Samples 5772-5776 were obtained in 2011, and samples 31688-31691 were
obtained in 2012.

MORE DETAILS
------------

For further information, please refer to the paper, and feel free to
contact a corresponding author of the paper if you have any questions.

This readme file was created by Habil Zare on 10 Nov 2014.
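
EXAMPLE: LOOKING UP A SUBSECTION
--------------------------------

The subsection table above can be transcribed into a small lookup for
scripted use. The following is a minimal sketch, not part of the
original data release. The names Subsection, SUBSECTIONS, and lookup
are hypothetical, and the code assumes only that the subsection ID
appears somewhere in the tar file name as a run of digits. The file
name "sample_5776.tar.gz" used below is an assumed example, not a
confirmed name.

```python
import re
from typing import NamedTuple, Optional


class Subsection(NamedTuple):
    reference_id: str              # ID used in the paper
    source: str                    # normal, primary, or metastatic
    marked_no_deep_analysis: bool  # True where the table says "(no deep analysis)"
    year: int                      # year the sample was obtained


# Transcribed from the table and the note on sample years in this readme.
SUBSECTIONS = {
    "5775": Subsection("N1-1", "normal", False, 2011),
    "5772": Subsection("P1-3", "primary", False, 2011),
    "5773": Subsection("P1-2", "primary", False, 2011),
    "5774": Subsection("P1-5", "primary", True, 2011),
    "5776": Subsection("M1-1", "metastatic", False, 2011),
    "31688": Subsection("P3-3", "primary", True, 2012),
    "31689": Subsection("P3-4", "primary", True, 2012),
    "31690": Subsection("P2-1", "primary", False, 2012),
    "31691": Subsection("P2-3", "primary", False, 2012),
}


def lookup(tar_name: str) -> Optional[Subsection]:
    """Return the table row for a tar file name, or None if no known ID is found.

    Assumption: the subsection ID appears in the name as a run of digits.
    """
    for token in re.findall(r"\d+", tar_name):
        if token in SUBSECTIONS:
            return SUBSECTIONS[token]
    return None
```

Under these assumptions, lookup("sample_5776.tar.gz").reference_id
gives "M1-1", and a name with no known ID gives None. Matching on
digit runs rather than on a fixed name pattern keeps the sketch usable
even if the actual file names differ from the assumed example.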