General option notes:
1. seqMINER calculates the middle position of each peak and, by default, extends the window 5000 bp to the left and 5000 bp to the right of this position.
2. If enabled, seqMINER will turn the peaks on the reverse strand (if the reference peak file contains strand information) to the forward strand.
3. If enabled, each tag (read) will be extended to the expected tag length (200 bp by default).
4. For the distribution array (which is used in clustering), one maximum tag height value is kept for every 50 base pairs. This means that, for a 10 kb window, one peak site profile contains 10000/50 = 200 values.
5. The size of the separation bar between datasets in the visualization.
6. To reproduce the result of a previous run, enter the same KMeans seed value as before and run the clustering again.
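The windowing and binning described in notes 1 and 4 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not seqMINER's own code: the function names `peak_window` and `binned_max`, the toy peak coordinates, and the per-base `coverage` list are all assumptions made for the example.

```python
def peak_window(start, end, left=5000, right=5000):
    """Return (window_start, window_end) around the middle of a peak."""
    middle = (start + end) // 2
    return middle - left, middle + right


def binned_max(coverage, bin_size=50):
    """Keep one value (the maximum tag height) for every `bin_size` bases."""
    return [max(coverage[i:i + bin_size])
            for i in range(0, len(coverage), bin_size)]


# Hypothetical peak spanning positions 20000-20400.
window_start, window_end = peak_window(start=20000, end=20400)

# Toy per-base tag heights: flat background with one tall position.
coverage = [1] * (window_end - window_start)
coverage[5010] = 7

profile = binned_max(coverage)
print(window_start, window_end, len(profile))  # 15200 25200 200
```

With the default settings, the 10 kb window yields 200 values per peak, and the tall position at offset 5010 ends up as the value of bin 100.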
Gene profile option:
When the gene profile mode is enabled, seqMINER uses the body of each reference peak (usually the gene start and gene end coordinates; the RefSeq data for human and mouse is provided here: refGene_genebody.bed).
As genes do not all have the same length, each gene is divided into 160 bins (by default). Another 20 extra bins are given to the upstream and downstream regions (5 kb each, as set on the General option page).
Important: these settings must be made before step 2 (Extract data).
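As a rough illustration of the gene profile binning, the sketch below splits a gene body into 160 bins and each 5 kb flank into 20 bins. It is not seqMINER's implementation: the function name `gene_profile_bins`, the example coordinates, and the reading that each flank gets its own 20 bins (rather than 20 bins shared between both flanks) are assumptions. Strand is ignored here; for a reverse-strand gene the upstream and downstream flanks would swap sides.

```python
def gene_profile_bins(gene_start, gene_end,
                      body_bins=160, flank_bins=20, flank_size=5000):
    """Return (bin_start, bin_end) pairs: upstream, gene body, downstream.

    Assumption: each flank receives `flank_bins` bins of its own.
    """
    def split(start, end, n):
        # n consecutive bins; edges are rounded to integer coordinates.
        edges = [start + round(i * (end - start) / n) for i in range(n + 1)]
        return list(zip(edges[:-1], edges[1:]))

    upstream = split(gene_start - flank_size, gene_start, flank_bins)
    body = split(gene_start, gene_end, body_bins)
    downstream = split(gene_end, gene_end + flank_size, flank_bins)
    return upstream + body + downstream


# Hypothetical 32 kb gene on the forward strand.
bins = gene_profile_bins(100000, 132000)
print(len(bins), bins[0], bins[20], bins[-1])
```

Because the number of body bins is fixed, a longer gene simply gets wider bins, so every gene yields a profile of the same length and the profiles can be compared in clustering.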
Gene profile example: