Various programs, documents, and side projects are made available here for downloading.


SMRTsim is a simulation of the "Single-Molecule Real Time" (SMRT) DNA sequencing process. It is implemented as a package in the R programming language.


AutoSeq is a "base caller", a program that converts the binary electropherogram data generated by an automated fluorescent sequencing machine into a string of characters representing the DNA sequence. This program was originally written by Reece Hart circa 1993 and updated by Joe Burks in 2003-2005. The updated version compiles with GCC 3 and skips many logging calls so that it runs faster. This code is in the public domain.

Bioinformatics Algorithm Demonstrations

This is a collection of example programs demonstrating selected computer science algorithms important in bioinformatics, implemented in the spreadsheet program Microsoft Excel. You can download the documentation and programs in separate files. Spreadsheets provide an interesting platform for demonstration of algorithms, since various steps of the calculations can be exposed in a manner that is easily comprehensible to users with little programming experience. The algorithms demonstrated include two approaches to approximate string matching (dynamic programming and Shift-AND numeric approximate matching), Hierarchical Clustering (used in phylogenetic studies and microarray analysis of gene expression), a Naive Bayes Classifier for simulated microarray gene expression data, and a simple Neural Network. These demonstrations are designed to serve as instructional aids in bioinformatics courses.
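
As a rough illustration of the dynamic programming approach to approximate string matching mentioned above, here is a minimal Python sketch. It is not taken from the spreadsheets; the function name `approximate_match` and the unit edit costs are assumptions chosen for the example. Each column of the table corresponds to one character of the text, which is the kind of intermediate step a spreadsheet can expose cell by cell.

```python
def approximate_match(pattern, text):
    """Find the best approximate occurrence of pattern anywhere in text.

    Returns (edits, end), where edits is the smallest number of
    substitutions, insertions, and deletions needed, and end is the
    1-based text position where the first such best match ends.
    Hypothetical helper for illustration only.
    """
    m = len(pattern)
    # Column for the empty text prefix: matching pattern[:i] costs i deletions.
    prev = list(range(m + 1))
    best = (prev[m], 0)
    for j, ch in enumerate(text, start=1):
        # Row 0 stays 0 so that a match may start at any text position.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # pattern character missing from the text
            )
        if curr[m] < best[0]:
            best = (curr[m], j)
        prev = curr
    return best


print(approximate_match("GATTACA", "CCGATTACACC"))  # (0, 9): exact occurrence
print(approximate_match("GATTACA", "CCGATTTCACC"))  # best match needs 1 edit
```

Only two columns are kept in memory here; a spreadsheet version would more naturally keep the whole table visible so that each cell's three neighbors can be inspected.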

Cybertory Documentation

Here we present various publications and white papers describing Cybertory software, resources, and exercises.

Microarray Scan Simulator

The Cybertory Microarray Scan Simulator (MSS) is a Java application that creates simulated microarray images from two input files: one containing feature intensities, and the other describing the “style”, or details of how the image should be presented. Style details include feature shape, degree of variation allowed in feature shape and placement, background levels, and various types of noise that can affect interpretation of image data.
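
To give a feel for how intensities and style settings might combine into an image, here is a small Python sketch. It is not the MSS code or its file format: the function `simulate_scan` and every parameter name (`spot_radius`, `pitch`, `background`, `noise_sd`, `jitter`) are hypothetical, and the style is passed as arguments rather than read from a file. It draws circular features on a noisy background and lets each feature's position vary slightly.

```python
import random


def simulate_scan(intensities, spot_radius=3, pitch=10, background=50.0,
                  noise_sd=5.0, jitter=1, seed=0):
    """Render a grid of feature intensities as a 2D list of pixel values.

    All parameters are illustrative assumptions, not MSS options:
    pitch is the spacing between feature centers in pixels, jitter is
    the maximum random offset of each center, and noise_sd is the
    standard deviation of Gaussian background noise.
    """
    rng = random.Random(seed)
    rows, cols = len(intensities), len(intensities[0])
    height, width = rows * pitch, cols * pitch

    # Start from a uniform background level plus Gaussian noise.
    image = [[background + rng.gauss(0.0, noise_sd) for _ in range(width)]
             for _ in range(height)]

    for r in range(rows):
        for c in range(cols):
            # Feature center, displaced by a small random placement error.
            cy = r * pitch + pitch // 2 + rng.randint(-jitter, jitter)
            cx = c * pitch + pitch // 2 + rng.randint(-jitter, jitter)
            for y in range(cy - spot_radius, cy + spot_radius + 1):
                for x in range(cx - spot_radius, cx + spot_radius + 1):
                    inside = (y - cy) ** 2 + (x - cx) ** 2 <= spot_radius ** 2
                    if inside and 0 <= y < height and 0 <= x < width:
                        image[y][x] += intensities[r][c]
    return image


# A 2 x 2 array with noise and jitter switched off, so values are exact.
image = simulate_scan([[1000, 0], [0, 500]], noise_sd=0, jitter=0)
print(image[5][5], image[0][0], image[15][15])  # 1050.0 50.0 550.0
```

Separating the intensity data from the rendering parameters mirrors the two-input design described above: the same intensities can be rendered under different noise and placement settings.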