Hoffman MM, Buske OJ, Noble WS. 2010. The Genomedata format for storing large-scale functional genomics data. Bioinformatics, 26(11):1458-1459; doi:10.1093/bioinformatics/btq164
Genomedata is a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data while retaining a small disk space footprint. We have also developed utilities to load data into this format. A reference implementation, written in Python with C components, is available here under the GNU General Public License.
The easiest way to install Genomedata and its prerequisites, and to set up your environment to use them, is to run our interactive install script. Type these two commands on your Linux/Unix system*:
wget http://noble.gs.washington.edu/proj/genomedata/install.py
python install.py
To upgrade an existing Genomedata installation to the latest version, type the following command at the shell prompt:
easy_install -U genomedata
Genomedata is briefly described in the Bioinformatics application note cited and linked at the top of this page.
The application's documentation is available in two formats:
* Added support for adding additional tracks using genomedata-open-data and Genome.add_track_continuous().
* Added support for creating Genomedata archives without any tracks.
* Made chromosome.start and chromosome.end be based upon sequence instead of supercontigs.
* Made iter(chromosome) and chromosome.itercontinuous() yield supercontigs sorted by start index (instead of dictionary order).
* Fixed pointer dereference bug that could cause segfault in genomedata-load-data.
* Improved installation script robustness and clarity.
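The change to iteration order noted above (supercontigs sorted by start index rather than dictionary order) can be illustrated with a small standalone sketch. It does not use the Genomedata package: the `Supercontig` record and the `iter_sorted_by_start` helper are hypothetical stand-ins invented for this example, and the coordinates are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a supercontig record; NOT the Genomedata class.
Supercontig = namedtuple("Supercontig", ["name", "start", "end"])


def iter_sorted_by_start(supercontigs):
    """Return an iterator over supercontigs ordered by start index,
    regardless of the order in which the mapping stores them."""
    return iter(sorted(supercontigs.values(), key=lambda sc: sc.start))


# Illustrative data, deliberately stored out of genomic order.
supercontigs = {
    "supercontig_2": Supercontig("supercontig_2", 5000, 9000),
    "supercontig_0": Supercontig("supercontig_0", 0, 1200),
    "supercontig_1": Supercontig("supercontig_1", 1500, 4000),
}

starts = [sc.start for sc in iter_sorted_by_start(supercontigs)]
print(starts)  # [0, 1500, 5000]
```

Sorting by start index means code that walks a chromosome sees regions in genomic order, so it does not depend on how the underlying mapping happens to order its keys.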
There is a moderated genomedata-announce mailing list that you can subscribe to for information on new releases of Genomedata.
There is also a genomedata-users mailing list for general discussion and questions about the use of the Genomedata system.
If you want to report a bug or request a feature, please do so using the Genomedata issue tracker.
For other support with Genomedata, or to provide feedback, please e-mail Michael. We are interested in all comments on the package, including the ease of installation and the usability of the documentation.
Michael Hoffman < mmh1 at uw period edu >