Statistics
A total of 6,463 soybean accessions from more than 49 countries (140 regions) were collected to construct the STIPBase database (Table 1). The average sequencing depth, mapping ratio and coverage are 12.9x, 98.9% and 96.1%, respectively (Figures 1 and 2). STIPBase includes 1,080,534 transposon insertion sites (TISs), 69% of which are reference TISs identified by aligning transposable elements against the reference genome and 31% of which are non-reference insertion sites identified from WGS datasets (Figure 3). Most TISs are located in intergenic regions (64%), and the remaining TISs are mainly located in intronic regions (26%) (Figure 4). On average, there are 1,109 TISs per 1 Mb window (Figure 5). The TISs are mainly associated with 25 transposable element superfamilies (Figure 6). LTR/Gypsy, LTR/Copia and LINE/L1 are the three most abundant retrotransposon superfamilies, and DNA/MULE-MuDR, DNA/hAT-Ac and DNA/CMC-EnSpm are the three most abundant DNA transposon superfamilies. Among the 20 chromosomes, chromosome 18 harbors the most TISs (Figure 7).
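The per-window density quoted above can be illustrated with a minimal sketch. It is not the STIPBase pipeline: the function names, the 0-based positions, the toy site list and the chromosome lengths are all assumptions made for illustration, and the average is taken over all windows, including empty ones.

```python
from collections import Counter

WINDOW = 1_000_000  # 1 Mb window size, as used in the text


def window_counts(sites, window=WINDOW):
    """Count insertion sites per (chromosome, window index).

    `sites` is an iterable of (chromosome, 0-based position) pairs.
    """
    return Counter((chrom, pos // window) for chrom, pos in sites)


def mean_per_window(sites, chrom_lengths, window=WINDOW):
    """Average number of sites per window, counting empty windows too."""
    counts = window_counts(sites, window)
    # Ceiling division: a partial window at the chromosome end still counts.
    n_windows = sum(-(-length // window) for length in chrom_lengths.values())
    return sum(counts.values()) / n_windows


# Toy data (hypothetical, not taken from STIPBase)
sites = [
    ("Chr01", 150_000),
    ("Chr01", 900_000),
    ("Chr01", 1_200_000),
    ("Chr02", 50_000),
]
chrom_lengths = {"Chr01": 2_000_000, "Chr02": 1_000_000}

density = mean_per_window(sites, chrom_lengths)
print(f"{density:.2f} sites per 1 Mb window")
```

Whether the reported average includes windows without any TIS is not stated in the text, so the choice made here is only one possible convention.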
Across the collected accessions, we detected on average 8,167 reference site absences and 1,684 non-reference site presences per accession (Figure 8). As for transposon insertion polymorphisms (TIPs), more than 93% of reference sites are absent in at least 2 accessions (Figure 9), and more than 59% of non-reference sites are present in at least 2 accessions (Figure 10).
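As a hedged illustration of how the "at least 2 accessions" fractions could be derived, the sketch below assumes a simple presence/absence encoding (1 = insertion present, 0 = absent) per site across accessions. The function name, the data layout and the toy calls are hypothetical and are not taken from STIPBase.

```python
def fraction_polymorphic(calls, variant_state, min_accessions=2):
    """Fraction of sites showing `variant_state` in at least `min_accessions`.

    `calls` maps a site ID to a list of 0/1 calls, one per accession.
    Use variant_state=0 for reference sites (absence is the variant)
    and variant_state=1 for non-reference sites (presence is the variant).
    """
    if not calls:
        return 0.0
    hits = sum(
        1
        for states in calls.values()
        if sum(1 for s in states if s == variant_state) >= min_accessions
    )
    return hits / len(calls)


# Toy calls for four accessions (hypothetical values)
ref_calls = {
    "ref_1": [1, 1, 0, 0],  # absent in 2 accessions -> counted
    "ref_2": [1, 1, 1, 1],  # never absent
    "ref_3": [0, 1, 1, 1],  # absent in only 1 accession
}
nonref_calls = {
    "nr_1": [0, 1, 1, 0],  # present in 2 accessions -> counted
    "nr_2": [0, 0, 0, 1],  # present in only 1 accession
}

ref_frac = fraction_polymorphic(ref_calls, variant_state=0)
nonref_frac = fraction_polymorphic(nonref_calls, variant_state=1)
print(f"reference sites absent in >=2 accessions: {ref_frac:.1%}")
print(f"non-reference sites present in >=2 accessions: {nonref_frac:.1%}")
```

Treating absence as the variant state for reference sites and presence as the variant state for non-reference sites mirrors the wording of the two statistics above.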