Generation Information and Q&A
What can be used as a reference for optical mapping?
Reference sequences, sequence assemblies or scaffolds, and optical mapping assemblies can all serve as a reference for optical mapping.
How do I convert a sequence into a reference for optical mapping?
In-silico digestion is required to convert a sequence into a virtual reference for optical mapping. Optionally, you can merge closely spaced signals on the reference to tolerate resolution errors. Also be aware that optical mapping software differs in its requirements for the reference. For example, some software supports only a single concatenated reference, and some requires the reference name to be a number. Rename your virtual reference accordingly.
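The optional merging step can be pictured with a minimal sketch. It assumes labels are given as base-pair positions and that any run of labels spaced closer than a chosen resolution is replaced by its mean position. The function name merge_close_signals and the min_gap parameter are hypothetical and are not part of OMTools. Other strategies, such as keeping only the first label of each run, are equally plausible.

```python
def merge_close_signals(positions, min_gap):
    """Merge runs of labels spaced closer than min_gap into their mean position."""
    merged = []
    cluster = []
    for pos in sorted(positions):
        # Close the current cluster once the gap to the previous label is resolvable.
        if cluster and pos - cluster[-1] >= min_gap:
            merged.append(round(sum(cluster) / len(cluster)))
            cluster = []
        cluster.append(pos)
    if cluster:
        merged.append(round(sum(cluster) / len(cluster)))
    return merged


labels = [1000, 1400, 9000, 20000, 20300, 20900]
print(merge_close_signals(labels, min_gap=1000))  # [1200, 9000, 20400]
```

Choose the threshold to match the resolution of your imaging platform. The value of 1000 bp used here is only an illustration.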
Could I use a sequence from a different strain of the same species as a reference?
Yes. However, in certain species the genomic structure can differ greatly between strains. The resulting optical mapping patterns may then be too different for the sequence to serve as a proper reference, which results in a low mapping rate.
Could I use a sequence from a different species as a reference?
Yes, but we do not recommend it. Genomic structures usually differ greatly across species, which can result in a low mapping rate.
Detailed commands
In-silico digestion
In-silico digestion creates virtual optical maps from input DNA sequences and nicking enzymes. It is usually used to build a reference for optical mapping alignment from reference sequences.
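Conceptually, in-silico digestion scans a sequence for the enzyme recognition site on both strands and records where it occurs. The sketch below is a simplified illustration of that idea and not the OMTools implementation. The function names are hypothetical, and reporting the 1-based start of the recognition site (rather than the exact nick position) is an assumption.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(sequence, site):
    """Return sorted 1-based start positions of the site on either strand."""
    sequence = sequence.upper()
    # A site on the reverse strand appears as its reverse complement on the forward strand.
    motifs = {site.upper(), reverse_complement(site)}
    positions = set()
    for motif in motifs:
        start = sequence.find(motif)
        while start != -1:
            positions.add(start + 1)
            # Resume one base later so overlapping occurrences are not missed.
            start = sequence.find(motif, start + 1)
    return sorted(positions)


seq = "AAGCTCTTCAAAAGAAGAGCTT"
print(find_sites(seq, "GCTCTTC"))  # [3, 14]
```

In the toy sequence, position 3 is a forward-strand match and position 14 is a match to the reverse complement GAAGAGC.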
Example
java -jar OMTools.jar FastaToOM --fastain ReferenceSequence.fa --refmapout BspQIReference.data --enzyme BspQI
This command performs an in-silico digestion on the reference sequences using the nicking enzyme BspQI.
java -jar OMTools.jar FastaToOM --fastain ReferenceSequence.fa --refmapout BbvCIBspQIReference.data --enzyme BspQI BbvCI
This command performs an in-silico digestion on the reference sequences using two nicking enzymes, BspQI and BbvCI.
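One way to picture a two-enzyme digestion is as a single scan for the recognition sites of both enzymes, with all hits pooled into one label list. The sketch below assumes this pooling behaviour. Whether OMTools pools labels or keeps them separate per enzyme is not stated here. The SITES table and the combined_labels function are hypothetical, and the recognition sequences should be checked against the enzyme supplier's data.

```python
import re

# Recognition sequences assumed for illustration.
SITES = {"BspQI": "GCTCTTC", "BbvCI": "CCTCAGC"}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def combined_labels(sequence, enzymes):
    """Return sorted 1-based site starts for all listed enzymes, both strands."""
    motifs = set()
    for name in enzymes:
        motifs.add(SITES[name])
        motifs.add(reverse_complement(SITES[name]))
    # The lookahead keeps each match zero-width, so overlapping sites are all reported.
    pattern = re.compile("(?=(" + "|".join(sorted(motifs)) + "))")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


seq = "TTGCTCTTCAACCTCAGCTTGCTGAGGTT"
print(combined_labels(seq, ["BspQI", "BbvCI"]))  # [3, 12, 21]
print(combined_labels(seq, ["BspQI"]))           # [3]
```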
java -jar OMTools.jar FastaToOM --fastain ReferenceSequence.fa --refmapout CustomEnzymeReference.data --enzymestring GCTCTTC
This command performs an in-silico digestion on the reference sequences using a custom nicking enzyme recognition site, GCTCTTC.