1. What is SimpleChangepointCalculator?

SimpleChangepointCalculator is our own implementation of PELT (Pruned Exact Linear Time), a changepoint detection algorithm. The program is specialized for the analysis of DNA methylome data obtained by whole-genome bisulfite sequencing (WGBS).
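To illustrate the idea behind PELT, here is a minimal sketch in Python. It is not the code used by SimpleChangepointCalculator: the function name `pelt`, the squared-error segment cost and the `penalty` parameter are assumptions chosen for illustration, and the cost function actually used by the program may differ.

```python
def pelt(values, penalty):
    """Return changepoint positions for a piecewise-constant mean model.

    Illustrative sketch only. Each returned index t means that a new
    segment starts at values[t]. The cost of a segment is its squared
    error around the segment mean (an assumption for this example).
    """
    n = len(values)

    # Prefix sums give the cost of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(values):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(a, b):
        # Squared error of values[a:b] around its own mean.
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)   # best[t]: optimal penalized cost of values[:t]
    best[0] = -penalty       # so that the first segment carries no penalty
    last = [0] * (n + 1)     # last[t]: start of the final segment in that optimum
    candidates = [0]

    for t in range(1, n + 1):
        scores = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(scores)
        # Pruning step: a candidate s is kept only while
        # best[s] + cost(s, t) <= best[t]; otherwise it can never
        # become optimal for any later t.
        candidates = [s for score, s in scores if score - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts.
    changepoints = []
    t = n
    while last[t] > 0:
        t = last[t]
        changepoints.append(t)
    return sorted(changepoints)
```

For example, `pelt([0.1] * 10 + [0.9] * 10, penalty=0.5)` returns `[10]`, the position where the mean level changes. A larger penalty yields fewer changepoints.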

References

  1. Killick R, Fearnhead P and Eckley IA (2012) Optimal detection of changepoints with a linear computational cost. J. Am. Stat. Assoc. 107, 1590–1598.
  2. Yokoyama T, Miura F, Araki H, Okamura K and Ito T. Changepoint detection in base-resolution methylome data reveals a robust signature of methylated domain landscape. Submitted.

2. How to use SimpleChangepointCalculator

1) Outline of usage

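SimpleChangepointCalculator is designed for use in an analytical pipeline developed in our laboratory. The analysis proceeds as follows: WGBS reads are mapped with BMap, the mapped reads are converted to a .graph file with MethylationPerspectiveTrackCreator, and domains are calculated from the .graph file with SimpleChangepointCalculator.

  All of these programs are provided by our laboratory.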

2) First step - Run BMap to map WGBS reads

First, map the WGBS reads to your reference genome. Use BMap for this purpose, because MethylationPerspectiveTrackCreator currently accepts only files in the "bisulalign" format, which is our original format exported by BMap. You can download BMap from here.

3) Second step - Run MethylationPerspectiveTrackCreator to create .graph file

Next, create a file in graph format, a binary file that stores the number of reads at each position of the genomic coordinates. You can download MethylationPerspectiveTrackCreator from here.

For example, type a command like the following.

>MethylationPerspectiveTrackCreator -species human -revision hg19 -track YFN -bisulalign \
             YFN.bisulalign -graph YFN.graph -thread 16

When the command completes, you get YFN.graph as the binary output file. The graph format is used for data storage by our own genome browser, Simple Genome Browser. This is why the program name contains "track creator".

4) Third step - Run SimpleChangepointCalculator to calculate domains

Finally, run SimpleChangepointCalculator on the .graph file generated by MethylationPerspectiveTrackCreator. For example, type a command like the following.

>SimpleChangepointCalculator -species human -revision hg19 -minread 5 -outprefix YFN -graph YFN.graph

When this command completes, you will find two files named "YFN.txt" and "YFN.png". The former is a text file containing all detected domains, and the latter is a PNG image file showing the MDL plot of the methylation data.
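If you want to script the second and third steps, the two command lines shown above can be assembled in Python. This is a sketch, not part of the distributed software: the helper names are hypothetical, and only the options that appear in the example commands of this document are used.

```python
import shlex


def track_creator_command(species, revision, track, bisulalign, graph, threads):
    """Argument list for MethylationPerspectiveTrackCreator (hypothetical helper)."""
    return [
        "MethylationPerspectiveTrackCreator",
        "-species", species,
        "-revision", revision,
        "-track", track,
        "-bisulalign", bisulalign,
        "-graph", graph,
        "-thread", str(threads),
    ]


def changepoint_command(species, revision, min_read, out_prefix, graph):
    """Argument list for SimpleChangepointCalculator (hypothetical helper)."""
    return [
        "SimpleChangepointCalculator",
        "-species", species,
        "-revision", revision,
        "-minread", str(min_read),
        "-outprefix", out_prefix,
        "-graph", graph,
    ]


def expected_outputs(out_prefix):
    """Files described above as the result of SimpleChangepointCalculator."""
    return [out_prefix + ".txt", out_prefix + ".png"]


step2 = track_creator_command("human", "hg19", "YFN", "YFN.bisulalign", "YFN.graph", 16)
step3 = changepoint_command("human", "hg19", 5, "YFN", "YFN.graph")

# Show the commands in a form that can be pasted into a shell.
print(shlex.join(step2))
print(shlex.join(step3))
```

Assuming both executables are on your PATH, each list can be passed to `subprocess.run(command, check=True)` to run the steps in order.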

The tab-delimited text file is formatted as follows.

column 1 Chromosome name
column 2 Position from (starting point of genomic coordinate)
column 3 Position to (end point of genomic coordinate)
column 4 Size of domain
column 5 Number of CpG sites in the domain
column 6 Averaged DNA methylation rate of the domain
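The six columns above can be read with a few lines of Python. This is a minimal sketch with assumptions: the names `Domain` and `read_domains` are hypothetical, columns 2 to 5 are assumed to be integers and column 6 a decimal number, and any line that does not fit this pattern (such as a possible header line) is skipped. The sample line in the code is made up for illustration and is not real output.

```python
import csv
from typing import NamedTuple


class Domain(NamedTuple):
    chromosome: str      # column 1
    start: int           # column 2, position from
    end: int             # column 3, position to
    size: int            # column 4
    cpg_count: int       # column 5
    methylation: float   # column 6, averaged methylation rate


def read_domains(lines):
    """Parse tab-delimited domain lines into Domain records."""
    domains = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 6:
            continue  # blank or incomplete line
        try:
            domains.append(Domain(row[0], int(row[1]), int(row[2]),
                                  int(row[3]), int(row[4]), float(row[5])))
        except ValueError:
            continue  # non-numeric fields, e.g. a header line (assumed)
    return domains


# Made-up sample line, only to show the expected shape of the data.
sample = ["chr1\t1\t1000\t1000\t25\t0.82\n"]
domains = read_domains(sample)
print(domains[0].chromosome, domains[0].cpg_count, domains[0].methylation)
```

To read a real result, open the file and pass the file object, for example `with open("YFN.txt", newline="") as f: domains = read_domains(f)`.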

The output is like this.

The exported MDL plot images are as follows.

  [Figure: MDL plots for H1 ESC and IMR90 fibroblast]


3. How to compile SimpleChangepointCalculator


SimpleChangepointCalculator is written in C++ with the Qt library, so you can compile and use it on any platform supported by Qt. The source code is available from here.

Once the Qt library is correctly installed on your system, compiling the source code is straightforward. Extract the source code to an appropriate directory, change the working directory to that directory, and type the following commands.

>qmake SimpleChangepointCalculator.pro
>make (or nmake for Visual Studio on Windows)

When the commands complete, you will find an executable file in the directory named "bin", located at the same level as the directory into which you extracted the source code.

4. Test data and some examples

For your convenience, some test data are provided.