THE GENETIC SEQUENCE BINARY FACTOR GROUPING ROUTINES

The relation between the time of creation and the time of the life of the creation is nothing
but the borderline between the human truth about the creation and the nature of the creation ...





The genetic sequence binary factor grouping is a set of routines that process V3 CEL scan data and produce catalogue(s) of genetic sequence data for genetic analysis , generating large catalogues of families of genetic sequence curves . (Above and below - partial display of groups of curves extracted from lung cancer datasets - lung normal .)

Above :
(1) permutations count - single permutation group ,
(2) 4 cell line scans (19 scans) , single permutation group
(3) scan file values merit factor curve , single permutation group permutations chart , single permutation chart


Genes were referenced by the merits of the number series of scan measured values ( M.J.E. Golay (1902-1989) , from The Merit Factor Problem by Jedwab at http://www.math.sfu.ca/~jed/Research/merit.html ) .

Chart from CL2002042639AACEL_merits_x32x16x8x4x2x1_b16d_41_.xls : column A contains the number scale , sorted column B contains the number merits as computed from CL2002042639AA.CEL .
Merit factor values were computed with b16d_41 from CL2002042639AA.CEL .
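For orientation, the classical Golay merit factor referenced above can be sketched as follows. This is the textbook definition for a sequence of +1/-1 values, F = N^2 / (2 * sum of squared aperiodic autocorrelations), and is not a reconstruction of b16d_41 or of the binary factor routines described here. The function names are illustrative assumptions.

```python
def aperiodic_autocorrelation(seq, shift):
    """Aperiodic autocorrelation C_k of a +1/-1 sequence at the given shift."""
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))


def golay_merit_factor(seq):
    """Golay merit factor F = N^2 / (2 * sum_{k=1}^{N-1} C_k^2)."""
    n = len(seq)
    sidelobe_energy = sum(
        aperiodic_autocorrelation(seq, k) ** 2 for k in range(1, n)
    )
    if sidelobe_energy == 0:
        raise ValueError("merit factor is undefined when all sidelobes are zero")
    return n * n / (2 * sidelobe_energy)


# Barker sequence of length 13, a standard example in the merit factor literature.
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

if __name__ == "__main__":
    print(round(golay_merit_factor(BARKER_13), 3))
```

For the length 13 Barker sequence the value is 169/12, about 14.08, which is the largest merit factor known for any binary sequence.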

Binary factor grouping merits : (example below computed by principia routine binary_factor_number_merit_scale_routine) and permutation chart from CL2001032255AA.CEL

* Upper charts computed by binary_factor_number_merit_scale_routine


Binary factor grouping merits computed with binary_factor_number_merit_scale_routine (upper charts top to bottom) :
(1) text file example (computer log)
(2) wav file (voice recording)
(3) single probe scan .CEL file
(4) merit values chart
(5) scan CEL file mean column values chart


Merit factor curve (left) vs binary factor merit factor curve (right) - binary factor number merit scale routine (1) vs binary factor number merit scale routine (2)


Binary factor merit factor curve logn = 0 region , Y-axis absolute measured values




ADENO_1 dataset 1173 genes merited values chart(s) , 4 of 19 , as produced by binary_factor_number_merit_scale_routine___ and rightmost resulting chart and related Blast searches and
Mean square values chart from CL2001032002AA.CEL , as computed by b16d_43 , 1588 values ,


merit factor values from 6.1 - 6.6 , list of gi(s) : 34815 , 285942 , 457784 , 460085 , 663009 , 1377762 , 2662150 , 3036839 , 3043565 , 3046899 , 3165456 , 3327037 , 3413799 , 3766196 , 4371373 , 5262535 , 5743356 and Blast result Figure (.) and taxonomy


Multicolumn values from linear binary factor intervals produced by binary factor number scale routine


CEL file header :
DatHeader=[0..46124] CL2002042636AA:CLS=4733 RWS=4733 XIN=3 YIN=3 VE=17 2.0 04/26/02 14:38:50 HG-U133A.1sq 6
Algorithm=Percentile
AlgorithmParameters=Percentile:75;CellMargin:2;OutlierHigh:1.500;OutlierLow:1.004
Left chart selected interval -0.29 to +0.29 and right chart - mean square measured values having :
(1) 41.079720 , 42.067210 , 43.054707 , 44.042200 , 45.029694
(2) 58.064606 , 60.039590 , 61.027084 , 62.014576 , 63.002070
(3) 76.036980 , 77.024475 , 78.011970
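
The CEL header fields shown above are plain key=value text, so they can be read with a few lines of code. The following is a minimal sketch, assuming the header lines are available as strings exactly as printed here. The function names are hypothetical, and the reading of CLS and RWS as column and row counts is an assumption based on the usual Affymetrix DAT header convention.

```python
import re


def parse_algorithm_parameters(line):
    """Split 'AlgorithmParameters=Name:value;Name:value' into a dict of strings."""
    _, _, payload = line.partition("=")
    params = {}
    for item in payload.split(";"):
        if not item:
            continue
        name, _, value = item.partition(":")
        params[name.strip()] = value.strip()
    return params


def parse_dat_header_geometry(line):
    """Extract the CLS= and RWS= integers from a DatHeader line."""
    found = dict(re.findall(r"\b(CLS|RWS)=(\d+)", line))
    return {key: int(value) for key, value in found.items()}


if __name__ == "__main__":
    dat = ("DatHeader=[0..46124] CL2002042636AA:CLS=4733 RWS=4733 XIN=3 YIN=3 "
           "VE=17 2.0 04/26/02 14:38:50 HG-U133A.1sq 6")
    alg = ("AlgorithmParameters=Percentile:75;CellMargin:2;"
           "OutlierHigh:1.500;OutlierLow:1.004")
    print(parse_dat_header_geometry(dat))
    print(parse_algorithm_parameters(alg))
```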

Sequence (1) : gi(s) 464183 , 5362869 , 5856829 , 8922743 Figure (.) , Figure (..) and taxonomy

Number permutations grouping generated chart using num_c_perm_2_cel from CL2001032612AA.CEL (below)
(1) Resulting table (sorted) columns A and B and (yellow) frequency line
(2) Peak frequency group list and real (intensity) values chart
(3) Intensity values and linearity chart
(4) Comparison of charts from lung cancer


- Lung normal results (7 rows) and their intensity values from 17 probes
- Example procedure for a single CEL file (CL2001031609AA.CEL from lung cancer (small cells))
- A single CEL file (N01_normal.CEL from prostate normal)
- Lung adenocarcinoma ADENO_1

Following results computed with b16d_43
Gene annotations were queried by accession numbers by http://genecards.weizmann.ac.il/cgi-bin/geneannot/GA_search.pl
NCBI gi numbers were queried by gene annotations by http://www.ncbi.nlm.nih.gov/nuccore?term=SMAD4 using an automated PHP script url_get.txt .
Factor grouping permutation chart was derived by a31_t.cpp , and a linear cross-section of the chart values was used for number range(s) selection .
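
The query step described above (one NCBI nuccore lookup per gene annotation) can be sketched as a URL builder. This is not the url_get.txt PHP script itself, only a hedged Python illustration of the URL pattern quoted in the text. The function names are assumptions, and nothing is fetched from the network.

```python
from urllib.parse import urlencode

# Base address taken from the example URL quoted in the text.
NUCCORE_URL = "http://www.ncbi.nlm.nih.gov/nuccore"


def nuccore_query_url(gene_symbol):
    """Build the nuccore search URL for one gene annotation (symbol)."""
    return NUCCORE_URL + "?" + urlencode({"term": gene_symbol})


def build_query_urls(gene_symbols):
    """Build one URL per non-empty gene symbol, preserving input order."""
    return [nuccore_query_url(s.strip()) for s in gene_symbols if s.strip()]


if __name__ == "__main__":
    for url in build_query_urls(["SMAD4", "  ", "TP53"]):
        print(url)
```

Keeping URL construction separate from fetching makes it easy to review the full list of queries before any request is sent.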

- Charts :
()Title : Lung adenocarcinoma ADENO_1 chart in adeno1_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_2 chart in adeno2_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_3 chart in adeno3_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_4 chart in adeno4_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_5 chart in adeno5_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_6 chart in adeno6_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_7 chart in adeno7_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_8 chart in adeno8_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_9 chart in adeno9_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_10 chart in adeno10_res_20110224 (txt)
()Title : Lung adenocarcinoma ADENO_1-10 chart in adeno_1-10_20110516 (txt) .
()Title : Prostate tumor Prostate Tumor Sample CEL files T01-T30 and T31-T60 chart in pc_20110517 (txt) .
()Title : Haining Lab : Human Naive and Memory T cell Subsets chart 90 probes in tmap_1-90_20110329 (txt) .
()Title : Gene expression-based chemical genomics identifies rapamycin as a modulator of MCL-1 and glucocorticoid resistance in leukemia 29 probes results chart in armstrong_part_20110721_1 (txt) .
()Title : Gene Expression-Based Classification and Outcome Prediction of Central Nervous System Embryonal Tumors PT_SCANS_1 chart in pt1_20110305 (txt) .

Blast (www.ncbi.nih.gov) searches produce long genomic regions and numerous unmapped sequences .
Analysis of fasta coded genomic regions may become easier with searches based on single sequence permutations .
Determining typical sequences in a long genomic region results in numerous SNPs .
num_c_perm_2_fasta locates long permuting sequences in a fasta coded genomic region .
Two runs (eg: num_c_perm_2_fasta LOC100653050.txt c > a.txt and num_c_perm_2_fasta LOC100653050.txt d > b.txt) output permutation index arrays and fasta enumerated groups at the top of the output list .
The permutation index array appearing at the end of the output list must be inserted into a spreadsheet (calc) utility to produce
a four column line chart . (Below) Line chart describing permutation groups , red and blue lines from sorted columns A and B , yellow line frequency count .


The permutation index array appearing at the beginning of the output list may be searched for composite index sequences .
Their line numbers reference the corresponding fasta sequences in the line enumerated fasta array .
Although they have the same index sequence numbers , they reference permuted fasta sequences .
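
One possible reading of "permuting sequences" is that two fixed-length windows belong to the same group when one is a rearrangement of the letters of the other. The sketch below illustrates only that reading. It is an assumption, not the algorithm of num_c_perm_2_fasta, and all names are hypothetical.

```python
from collections import defaultdict


def read_fasta_sequence(lines):
    """Join the sequence lines of a single-record FASTA text, skipping '>' headers."""
    return "".join(
        line.strip().upper()
        for line in lines
        if line.strip() and not line.startswith(">")
    )


def permutation_groups(sequence, length):
    """Group windows of the given length by letter composition.

    Two windows share a group when one is a permutation of the other.
    Groups containing only one distinct window are dropped.
    """
    groups = defaultdict(list)
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        key = "".join(sorted(window))  # same key for every rearrangement
        groups[key].append((start, window))
    return {
        key: members
        for key, members in groups.items()
        if len({window for _, window in members}) > 1
    }


if __name__ == "__main__":
    seq = read_fasta_sequence([">example", "ACGTTGCA"])
    for key, members in sorted(permutation_groups(seq, 4).items()):
        print(key, members)
```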






17 CCTGGGGCTC
18 CTGGTCCCAGGT
19 GTTTCTAAGAAG
20 CCATCACTCTCA
21 GTGCAGCCGGGT

Further analysis would require Blast (www.ncbi.nih.gov) searches having these fasta sequences as input .
Above mRNA data was published by the Cancer Program Datasets title at www.broadinstitute.org .
Subsequent computations were motivated by the article (published by www.ncbi.nih.gov) :

Routine sources are included in genetic_sequence_binary_factor_grouping.zip ( at 20.12.2010 ) in the following execution order : genetic_sequence_binary_factor_grouping.txt
These routines use a permuted number index order based on a merit factor attributed number scale .
The merit factor attributed number scale is computed using scan numbers as 4x4-bit binary factor indexes in a 16-bit integer array .
The exponents of array sub group counts and corresponding factor sums in relation to actual number series factor bounds produce the merit factor number attributes .
All computed data and corresponding sequences depend on multiplier values .
Multiplier values determine and abstract ranges and contents of the permutation groups .
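
The indexing step can be illustrated with a small sketch, assuming that "4x4-bit binary factor indexes" means splitting each 16-bit scan number into four 4-bit fields and counting how often each field value occurs. This covers only the index and count bookkeeping. The exponent, factor sum and multiplier steps are not reconstructed, and the names are hypothetical.

```python
def nibble_indexes(value):
    """Split a 16-bit value into four 4-bit indexes, most significant first."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("expected a 16-bit unsigned value")
    return [(value >> shift) & 0xF for shift in (12, 8, 4, 0)]


def count_nibbles(values):
    """Return counts[p][n]: how often 4-bit value n occurs at field position p."""
    counts = [[0] * 16 for _ in range(4)]
    for value in values:
        for position, nibble in enumerate(nibble_indexes(value)):
            counts[position][nibble] += 1
    return counts


if __name__ == "__main__":
    scans = [0x1A2F, 0x1A20, 0x0B2F]  # made-up scan numbers for illustration
    for position, row in enumerate(count_nibbles(scans)):
        print(position, row)
```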


References

(1) On the complexity measures of genetic sequences.


MOTIVATION: It is well known that the regulatory regions of genomes are highly repetitive. They are rich in direct, symmetric and complemented repeats, and there is no doubt about
the functional significance of these repeats. Among known measures of complexity, the Ziv-Lempel complexity measure reflects most adequately repeats occurring in the
text. But this measure does not take into account isomorphic repeats. By isomorphic repeats we mean fragments that are identical (or symmetric) modulo some permutation
of the alphabet letters.

RESULTS: In this paper, two complexity measures of symbolic sequences are proposed that generalize the Ziv-Lempel complexity measure by taking into account any isomorphic
repeats in the text (rather than just direct repeats as in Ziv-Lempel). The first of them, the complexity vector, is designed for small alphabets such as the alphabet
of nucleotides. The second is based on a search for the longest isomorphic fragment in the history of sequence synthesis and can be used for alphabets of
arbitrary cardinality. These measures have been used for recognition of structural regularities in DNA sequences. Some interesting structures related to the
regulatory region of the human growth hormone are reported.


(2) Key-string algorithm--novel approach to computational analysis of repetitive sequences in human centromeric DNA.


AIM: To use a novel computational approach, Key-string Algorithm (KSA), for the identification and analysis of arbitrarily large repetitive sequences and higher-order repeats (HORs) in noncoding DNA. This approach is based on the use of key string that plays a role of an arbitrarily constructed "computer enzyme".

RESULTS: Fifty-five copies of 2734-bp 16mer HORs were identified and investigated, and a start-string TTTTTTAAAAA was identified. The HOR-matrix was constructed and employed
for graphical display of mutations. KSA identification of HORs in AC017075.8 was compared with that of RepeatMasker and Tandem Repeat Finder, which
identified alpha monomers in AC017075.8, but not the HORs. On the basis of KSA study, the centromere folding was described as an effect of HORs and super-HORs (3 x
2734 bp) in AC017075.8. The following novel computational KSA-based methods, easy-to-use and intended for computational "pedestrians", were demonstrated: color-HOR
diagram, KSA-divergence method, 171-bp subsequence-convergence diagram, and total frequency distribution of the key-string subsequence lengths. The results were
supplemented by Fast Fourier Transform, employing a novel mapping of symbolic genomic sequence into a numerical sequence.


(3) FAST PATTERN MATCHING IN STRINGS*


Abstract. An algorithm is presented which finds all occurrences of one given string within
another, in running time proportional to the sum of the lengths of the strings. The constant of
proportionality is low enough to make this algorithm of practical use, and the procedure can also be
extended to deal with some more general pattern-matching problems. A theoretical application of
the algorithm shows that the set of concatenations of even palindromes, i.e., the language {αα^R}*, can be
recognized in linear time. Other algorithms which run even faster on the average are also considered.



(*1) Implementation of compression in (1) , (2) , (3) , (4) and (5) .
(*2) A simple entropy measurement (1) , (2) and (3).
(*3) See text entropy sequence in enthropy_text_sequence.txt .
(*4) 3-bit permuting groups (000,110,011,101...) in (1) (15bit) and (2) (24bit) .
(*5) Arbitrary lengths of text sequences in (1) , (2) , (3) and in (4) , (5) and permuting sums composed of 1 , 10 and 30 in one byte numbers (-4~+4) generator code
(*6) Similar text sequence repeats matching as in (1) and in (2).
(*7) Permutation groups of text sequences in (1) and (2)
(*8) 16-bit parity patterns in (1) and text sequence patterns in (2)
(*9) Text patterns , whole text count for letters and variances , from (1) and text search by variance (.) templates (where '_' stands for any single letter)
(eg) T__G__CAAG_ and T__T__CAAG__ and G__T__CAAG__ and T__C__CAAG__ in (2) and conjunction count of 8 in (3)
(*10) Uniform length text patterns , self-comparing , binary count , from (1)
(*11) Text sequencing , from (1) , (1.1) ,(2) and (3)
(*12) Texts longest string , from (1)
(*13) Text binary patterns, from (1) , (2) and binary indexes and counts , patterns matching binary indexes (3), (4), and patterns search , select and count
(eg) Sequence selected by having ( AAAA_A_A or AACA_A_A or AAGA_A_A or CAAA_A_A or CACA_A_A or CAGA_A_A or GAAA_A_A or GACA_A_A or GAGA_A_A)
and the subsequence selected from (.) by having only CAGA_A_A Blast taxonomy
Sequence and Blast taxonomy from a sequence having ( TAAC_A and TACC_A and TAGC_A and TATC_A )
Sequence and Blast taxonomy from a sequence having ( TACC_A and TAGC_A and TATC_A )
Sequence and Blast taxonomy from a sequence having ( TAAC_A and TAGC_A and TATC_A )
Sequence and Blast taxonomy from a sequence having ( TAAC_A and TACC_A and TATC_A )

Sequences from gi|568815323|ref|NT_187193.1| Homo sapiens chromosome 5 genomic scaffold, GRCh38.p2 Primary Assembly CEN5_8

count	order								sequence

01423	00000001000000000000000000000000	___x____________	TTTTC_TTTCATTCA			Blast taxonomy
00729	00000000010000000000000000000000	____x___________	TTT_ATT_ACACAG			Blast taxonomy
00087	00000000000100000000000000000000	_____x__________	G_TTTTTTTTCAT			Blast taxonomy
00002	01000000000000000000000000000000	x_______________	ACATTCAACACACAGATTTG		Blast taxonomy
00003	00000100000000000000000000000000	__x_____________	A_ACAGATCAGATTTG		Blast taxonomy
00003	00010000000000000000000000000000	_x______________	ATTCAGACATCATTG			Blast taxonomy
Sequence and Blast taxonomy from a sequence having ( ATTCAGACATCATTG , A_ACAGATCAGATTTG )
Sequence from gi|568815323|ref|NT_187193.1| Homo sapiens chromosome 5 genomic scaffold, GRCh38.p2 Primary Assembly CEN5_8 having ( ACTCAC and TCACAG and CTCACA and (2x) CACAGA and (2x) ACAGAG ) and resulting Blast taxonomy
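
The template searches listed above, where '_' stands for any single letter and lists are combined with "and" or "or", can be sketched with regular expressions. This is a minimal illustration under that stated meaning of '_'. The function names are assumptions and do not correspond to the (13.-search) or (13.-select) routines.

```python
import re


def template_to_regex(template):
    """Translate a template into a regex source; '_' matches any single letter."""
    return "".join("[A-Za-z]" if ch == "_" else re.escape(ch) for ch in template)


def find_template(sequence, template):
    """Return the start positions of all matches, overlapping ones included."""
    # A lookahead keeps each match zero-width so overlapping hits are reported.
    pattern = re.compile("(?=" + template_to_regex(template) + ")")
    return [match.start() for match in pattern.finditer(sequence)]


def has_all(sequence, templates):
    """True when every template occurs at least once ('and' selection)."""
    return all(find_template(sequence, t) for t in templates)


def has_any(sequence, templates):
    """True when at least one template occurs ('or' selection)."""
    return any(find_template(sequence, t) for t in templates)


if __name__ == "__main__":
    seq = "TAACGATACCTA"  # made-up sequence for illustration
    print(find_template(seq, "TAAC_A"), find_template(seq, "TACC_A"))
    print(has_all(seq, ["TAAC_A", "TACC_A"]), has_any(seq, ["TAGC_A"]))
```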

(*14) Texts longest string (2), from (1) and counter frequency charts (145 sequences , l=4) , (362 sequences , l=5) , (3525 sequences , l=9) from a long genomic sequence
Sequence and Blast taxonomy from a sequence having ATTC (403512) and GAAT (403733) and AGGA (407298) and AACT (408146) and TGTA (408598)
Sequence and Blast taxonomy from a sequence having ((75681) AAGGT and (95571) CTCCA and (147587) AACAT)
Sequence and Blast taxonomy from a sequence having (ATGTAGGC (810) or ATGAGCAC (810) or TGTAGGTC (811) or GTAGGCTT (811) or GGTCATCA (812) or GTAGGTGT (812) or GTCAGGTG (813) or CTGTACAC (813))
(*15) Similar text sequences , from (1) and (2)
_____AACTCACAG_G_T
____AACTCACAG_G_TG
___AACTCACAG_G_TG_
Genomic sequence (from gi|834763093|gb|CP006636.1| Escherichia coli PCN061) chart ordered by binary group and frequency
Sequence ordered chart of search list : AAAC_G_T , CAAC_G_T , GAAC_G_T , TAAC_G_T , AACC_G_T , CACC_G_T , GACC_G_T , TACC_G_T , AAGC_G_T , CAGC_G_T , GAGC_G_T , TAGC_G_T , AATC_G_T , CATC_G_T , GATC_G_T , TATC_G_T
Sequence grouping(.) search strings : AAAC_G_T , AACC_G_T , CACC_G_T , CAGC_G_T , GAAC_G_T
(*16) Texts longest string (3) , (4) and (5)
(*17) Sequence search execution example , using (15.2),(2),(3),(4),(5) in (6)

(eg) Sequence resulting from a string search having __GT__C____T and AAT_____CG__ and A____TT____G and __GT____TT__ from
gi|568802162|ref|NT_187285.1| Homo sapiens chromosome 19 genomic scaffold, GRCh38 Primary Assembly CEN19_5 and resulting Blast taxonomy
Ref : Key-string Algorithm - Novel approach to computational analysis of repetitive sequences in human centromeric DNA
Ref : Curr Genomics. Apr 2007; 8(2): 93-111.
Consensus Higher Order Repeats and Frequency of String Distributions in Human Genome


(*18) Grouping text sequences using binary (1) and letter (2,5,3,4) distances
Use of parameters in 18.2 , 18.3 , 18.4 and 18.5 
-   Sequence list from 18.2 (file out_.txt) , 
        may be attributed by 18.3 , for a given sequence length , into a string template list of sequences :
	    by letter(1) , letter count(2) , sequence length(3) and letter position(4),
-   Sequence list from 18.2 (screen redirect into a.txt) , may be browsed by 18.5 , for a given sequence template (attributed by any letter(1) at distance(2)) , 
        into a string template list of sequences : by letter(s)(1) at distances(2),
-   (13.-search) and (13.-select)
(*19) Use of list(s) of (incomplete (_)) Fasta sequences ,
Generate (incomplete (_)) lists of Fasta sequences
-   Sequence list(s) generated by (15.2) , (20.1) , (20.2) , (20.3) , (20.4)  
-   Search by (incomplete) sequence list (13.-search) 
-   Count by (incomplete) sequence counter (19.1) 
     parameter 1.(output filename from (13.-search)) , 
     parameter 2.,3.(1 , number of lines to skip (used for subsequent runs on selections)) , 
     parameter 3.(output filename)
-   and Select row(s) from resulting by specific (and) or any (or) from (incomplete) sequence list (13.-select)
-   or Select row(s) from sequence list and list (sub)sequence group(s) (19.2)

    parameter 1.(output filename from 20.1) 
(20.5.1) parameter 2.                       parameter 3.
    01111010110111001111111001111111   10000101001000110000000110000000
    sub list (1s) from a list of 32    OR (any) from sub list 1. and other (from a list of 32) none included
(20.5.2) parameter 2.                       parameter 3. 
    01111010110111001111111001111111   11111111111111111111111111111111
    sub list (1s) from a list of 32    AND (all) from sub list 1. and other (from a list of 32) none included
(20.5.3) parameter 2.                       parameter 3.
    01111010110111001111111001111111   00000000000000000000000000000000
    sub list (1s) from a list of 32    OR (all) from sub list 1. and other (from a list of 32) any included
(20.5.4) parameter 2.                       parameter 3.
    01111010110111001111111001111111   01111010110111001111111001111111   
    sub list (1s) from a list of 32    AND (all) from sub list 1. and other (from a list of 32) any included
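
The four parameter combinations above suggest one consistent reading, sketched here as an assumption rather than as the documented behaviour of 20.5: parameter 2 marks the sub list, a 1 in parameter 3 inside the sub list makes that entry required (AND), a 0 inside the sub list leaves it optional (OR), and a 1 in parameter 3 outside the sub list excludes that entry. The function and variable names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def row_selected(row_bits, sub_list, mode):
    """Decide whether a row passes, given which of 32 list entries it contains.

    row_bits : bit i set when the row contains list entry i
    sub_list : parameter 2, the entries of interest
    mode     : parameter 3, under the inferred reading described above
    """
    inside_required = sub_list & mode
    outside_forbidden = ~sub_list & mode & MASK32
    if row_bits & outside_forbidden:
        return False  # row contains an excluded entry
    if inside_required:
        return (row_bits & inside_required) == inside_required  # AND
    return (row_bits & sub_list) != 0  # OR


if __name__ == "__main__":
    p2 = int("01111010110111001111111001111111", 2)
    inside, outside = 1, 1 << 31  # one entry inside and one outside the sub list
    print(row_selected(inside, p2, ~p2 & MASK32))       # 20.5.1
    print(row_selected(p2, p2, MASK32))                 # 20.5.2
    print(row_selected(inside | outside, p2, 0))        # 20.5.3
    print(row_selected(p2 | outside, p2, p2))           # 20.5.4
```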

(*20) Texts context similar strings from (1) , (2) , (3) , (4) and (below)
redundant/sequence (probability) count at distances chart from two genomic sequences   



Sequence d=57 l=15 from Streptomyces rapamycinicus NRRL 5491 having
ACCACCCACCCACCA CACCACCACCGCCAC CAGCGCCGTGCCTGC
CCCAGTGGCGCGGTG CGGCGGCCGCCGCCG CGGGGCCGGGGCCGG
GGATGTGTTGTTCGG GGATGTGTTGTTCGG GTGGGTGAGGCGGAG
GTTGGGCGTTGGTGA TCCCCTCGGAACAGC
and sequence Blast taxonomy


Sequence (1) d=57 l=8 from Streptomyces rapamycinicus NRRL 5491 having
CATCGAGG CGCCACCG GCAGCAGC GCCACCGG CGAGGACG CAGCAGCC GCGCCACC CCGGACGG CAGCAGCA CCCGGACG CCACCTCG 
CGTCCCGG AGCAGCAG GCGCAGCC CGCAGCAG GCGCAGCA CCGTCCCG GCAGATCG CCACCGTC CCTGCGCC GAGGACGG CACCGTCC 
CGCCACCT GATCGAGG CCTCGATC TCGAGGAC TGCGCAGC GCCACCGT GCCACCTC CCCTGCGC CACCTCGA GGACGTCC CACCGGAC
and sequence Blast taxonomy


Sequence from Streptomyces rapamycinicus NRRL 5491 having
T_G__G_A_CGGGTGAT____ ____C___G_CGTCTCCAA__ ___C___G_ACAG_GCG_TG_ ________CGCCGCGT__G__ __C___G_CGTCTCCA____G
A____C_C_GCGTCCGCGGC_ ___C__CG_CGTCTCCA____ _____C__CG_CGTCTCCCA_ ________C_CGG_CGC_T__ C_______G_CGGA_AGCCC_
_GT_C__CG_TGGGTGACC__ T_C__CG_TGGGTGAC__GCG ___A_G_TC_CGG_CGCCG__ ___C______TGATCCG_T__ G_______G_CGACCTT____
__GG__C_GGGTGAT___TCG _____G__G_GCAGCTCGGC_ ___C__C_G_GA_GACGAA_C ________C_CGGCCGC____ ________C_CGGCCGCC___
________CGACCAGCCGGAC __CG__G_T_TTCGGGGAA_G ___C______TGATCCG_GCG GATGCG__G_TCTTCGGGGT_ __G___C_GGGTGAT___TCG
_G_G______GGTGAT__CGC __T__TG_G_GGCGTTC_G__ A__A__A_C_CGGCCGCCCC_ A________ACGGCCGCCG__ ___AC_A_C_CGGCCGCCCCG
AC_______CCGGCGGCTT__ _A_CCG_A_GACCAGCCGG_C
selected from (1)
and sequence Blast taxonomy


Sequence d=344 l=15 from Mycobacterium tuberculosis having
CACCACCGGCGCCGC , CGCCGGCGCCACCGG , CGGCACCGGCGGCCA , CGGCGCCGGCGGCAT , CTCCGGCGGCGCCGG , GTGAGGGCATCGAGG
and sequence Blast taxonomy


(*21) Sequence redundancy quotient from (20.1) in (1)
(*23) Segregation of a list of text sequences by their complexity from (1) , (2)
(*24) Texts longest string from (4) , (5) , (6) , (7)
redundant/sequence (probability) count at distances chart from 24.5 and 24.6    


(*25) (Simplified) use of maximum entropy sequence search in (1,1.1 (32-bit),1.2 (48-bit)) , (2) and (3,3.1 (32-bit))
sequence entropy from 25.1.1 , sequence entropy from 25.1.2 and redundant/sequence (probability) count at distances chart from 25.3.1    


Sequence CTGCTCCCCGCACCCGCGG from Streptomyces rapamycinicus NRRL 5491 Blast taxonomy
Sequence from Streptomyces rapamycinicus NRRL 5491 having CTGCTCCCCGCACCCGCGG and Blast taxonomy
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having AAGTGTGCT , GGTGATCCT , CTTATGATG and Blast taxonomy
Ref : J Hum Genet. 2011 Sep;56(9):676-81. doi: 10.1038/jhg.2011.80. Epub 2011 Jul 28.
Genetic variation in phosphodiesterase (PDE) 7B in chronic lymphocytic leukemia: overview of genetic variants of cyclic nucleotide PDEs in human disease.
Peiro AM, Tang CM, Murray F, Zhang L, Brown LM, Chou D, Rassenti L, Kipps TJ, Insel PA.

Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having
         GCTCCTCACTTCCCAGACGG
GGCAGAGACGCTCCTCACTT
and Blast taxonomy
Ref : J Biol Chem. 2015 Oct 23; 290(43): 26292-26302.
Stringent Control of NFE2L3 (Nuclear Factor, Erythroid 2-Like 3; NRF3) Protein Degradation by FBW7 (F-box/WD Repeat-containing Protein 7) and Glycogen Synthase Kinase 3 (GSK3)
Meenakshi B. Kannan, Isadore Dodard-Friedman, and Volker Blank


(*26) Text sequence browsing by their complexity in (1)
- sequences having 6T2G5C2G2G2G2A :
TTTTTTCAGACGGAGTCTCGCTCTGTCCCCCAGGCTGGAGTGCTGTGGTGCGATCTCAGCTCACTGCAAG	
    TTGAGATGGAGTCTCGCTCTGTCACCCAGGCTGGAGTGCAGTGGCGAGATCTCAGCTCACTGCAAGCTCT	
   TTTGAGAGGGAGTCTCGCTCTGTCGCCCAGGCTGGAGTGCAGTGGTGCGATCTCAGCTCACTGCAAGCTC	
   TTTGAGACGGAGTCTCGCTCTGTCACCCAGGCTGGAGTGCAGTGGCGCGATCTCTGCTCACTGCAAGCTC	

- search string :
   TTT_AGA_GGAGTCTCGCTCTGTC_CCCAGGCTGGAGTGC_GTGG_G_GATCTC_GCTCACTGCAAGCTC

- Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having TTT_AGA_GGAGTCTCGCTCTGTC_CCCAGGCTGGAGTGC_GTGG_G_GATCTC_GCTCACTGCAAGCTC
and Blast taxonomy
Ref : Nat Commun. 2016; 7: 10994.
LIG4 mediates Wnt signalling-induced radioresistance
Sohee Jun, Youn-Sang Jung, Han Na Suh, Wenqi Wang, Moon Jong Kim, Young Sun Oh, Esther M. Lien, Xi Shen, Yoshihisa Matsumoto, Pierre D. McCrea, Lei Li, Junjie Chen and Jae-Il Park

(*27) Binary counter of fasta sequences by length in (1)
(parameter 1-fasta sequence filename , parameter 2-data rows to skip , parameter 3-number of indexed sequences) 
charts ,sorted by sequence frequency, output from 27.1 (parameter 3) 
100K-400K in first row and 500K sequences in second row
chart legend X-axis ordinal fasta sequence number , 
Y-axis , red-sequence frequency , blue-sequence length , yellow-quotient=log(sequences having count of 1 and sequences having count greater than 1) 
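
The quotient in the chart legend can be sketched for fixed-length sequences. The reading used here, log of (number of distinct sequences seen once) divided by (number of distinct sequences seen more than once), is an assumption about the wording above, and the names are hypothetical. It is not the code of routine 27.1.

```python
import math
from collections import Counter


def kmer_counts(sequence, length):
    """Count every window of the given length in the sequence."""
    return Counter(
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    )


def singleton_quotient(sequence, length):
    """Return log(singletons / repeated), or None when either count is zero."""
    counts = kmer_counts(sequence, length)
    singles = sum(1 for c in counts.values() if c == 1)
    repeated = sum(1 for c in counts.values() if c > 1)
    if singles == 0 or repeated == 0:
        return None  # quotient undefined for this length
    return math.log(singles / repeated)


if __name__ == "__main__":
    seq = "ACACGT"  # made-up sequence for illustration
    for length in (2, 3):
        print(length, singleton_quotient(seq, length))
```

Evaluating the quotient over a range of lengths gives one value per length, which could be charted alongside sequence frequency as in the legend.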


2 Step(s) example (27.1) using fasta codes from Streptomyces rapamycinicus NRRL 5491 :

1. Step
sequences having max delta quotient
GATGTGCAAG
CTGCGCGATGT
CTGCGCGATGTG
CTGCGCGATGTGC
GTCCGCTGCGCTCC
GTCCGCTGCGCTCCC
GGTCGGCCGGGGCTGC
GGTCGGCCGGGGCTGCT
GACCCGGCGGGTGCACCG
TGGCCCAGCGGGCCTACGA
TGGCCCAGCGGGCCTACGAC

subset selection sequences having min delta quotient 
CGGCGGCGGC GGCGGCGGCC GGCGGCGGCG GGCCGCCGCG CGCGGCGGCG CGGCCGCCGC CGCCGCCGCG GCGGCGGCCG
GGCCGCCGCC GCCGCCGCCG CGCGGCGGCC CCGCGGCGGC CGGCGGCCGC GGCGGCCGCG CGCCGCCGCC GCGGGCGGCG
GGGCGGCGGC CTCGGCGGCG GCCGCCGCGC GGCCGCGGCC GGCCGCGGCG GCCGCGGCCG GGCGCGGGCG GCCCGCGCCG
CGGCGGCCAG CCGGCCGCCG GGCGGCGCGG CCCGGCGGCG GCCGCCGCGG CGCCGCGGCC CGCCGCCCGC CGGCGGCCGG

2. Step
resulting sequences having max delta quotient from subset selection
GCATCGCGGA
GGTGGGCATCG
GGTGGGCATCGC
ACTGGGGGCGGCG
GGGGGCGGCGCGGC
GGGGGCGGCGCGGCG
CGGCCGGTCGGCCGGG
GCCCGCGCCGCCAGCGG
CTCGTCGGCCGCCGCGCC
CTCGTCGGCCGCCGCGCCG
CGCCGCGGCCACGGCCGCCC

(*28) Longest sequence from binary counter of fasta sequences by length in (1)
(parameter 1-fasta sequence filename , parameter 2-data rows to skip , parameter 3-number of indexed sequences)
- sequence GGAACACTTTTACACTGTTGGTGGGACTGTAAACTAGTTCAA and sequence Blast taxonomy
- sequence GAGTGGCCGATCAGCACATCCG and sequence Blast taxonomy
- sequence GTAGCTGCCGGCATTACCAAACCCGATATTACCCA and sequence Blast taxonomy
Ref : J Clin Microbiol. 2013 May; 51(5): 1558-1562.
Polymorphism of Antigen MPT64 in Mycobacterium tuberculosis Strains
Yi Jiang, Haican Liu, Haiyin Wang, Xiangfeng Dou, Xiuqin Zhao, Yun Bai, Li Wan, Guilian Li, Wen Zhang, Chen Chen, and Kanglin Wan

- sequence AGCACTTTGGGAGGCCAAGGAGGGCAGATCACAAGGTCA and sequence Blast taxonomy

- sequence TCCCAGCACTTTGGGAGGCCAAGGAGGGCAGATCACAAGGTCAGGAGATCGAGACCATCCTGGCCACATG
           AAACCTTCTGGGCCGGGCACCTATAATCCCAGCACTTTGGGAGGCCAAGGAGGGCAGATCACAAGGTCAG and sequence Blast taxonomy

- sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having :
           AGCACTTTGGGAGGCCAAGGAGGGCAGATCACAAGGTCA
           AGCACTTTGGGAGGCCGAAGTGGGCGGATCACGAGGTCA
           AGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCA
 
           GTAATCCCAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCTGGCTA
           GCAGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCCGAAGTGGGCGGATCACGAGGTCAGGAGGTC
           TCGTCCCAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCAGGAGATGGAGACCATCCTGGCTAA
           CAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCAGGAGATTGAGATCATCCTGGCTAACACGGT
           CACCTGTAATCCCAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCAGGAGATCAAGACCATCCT
           CGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCAGGAGAT
           GGCCGGGCGCATGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCA
           GCAGATCAAGACCACCCTGGCTAACACGGTGAAACTCCATCTCTACTAAAAATACAAAAAAATTAGCCAG
           GTAATCCCAGCACTTTGGGAGGCCGAAGTGGGCGGATCACGAGGTCAGGAAATTGAGACCATCCTGGCTA
           ACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGACGGGCGGATCACGAGGTCAGGAGATTGAGACCATCC
           TATAAAATTTAAAATAGGAGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAG
           ACGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAAA
           TCCCAGCACTTTGGGAGGCCAAGGAGGGCAGATCACAAGGTCAGGAGATCGAGACCATCCTGGCCACATG
           AAACCTTCTGGGCCGGGCACCTATAATCCCAGCACTTTGGGAGGCCAAGGAGGGCAGATCACAAGGTCAG and sequence Blast taxonomy

- sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having :
           GGC_GATCATGAGGTCAAGAGAT_GAGACCATCCTG

           GGCATGGTGGCTCACACCTGTAATCCCAGAACTTTGGGAGGCCGAGGCGGCAGATCATGAGGTCAAGAGA
           TTGAGACCATCCTGGCCAACATGGTGAAACCTCGTCTCTACTAAAAATACAAAAATTAGCTGGGCGTGGT
           CCAGCACTTTGGGAGGCTGAGGCAGGCGGATCATGAGGTCAAGAGATGGAGACCATCCTGACCAACATGT
           GGCGGATCATGAGGTCAAGAGATAGAGACCATCCTGGCCAACATGGTGAAACCCCATCTCTACTAAAAAT
           CCAAGCACTTTGGGAGGCTGAGGCGGGCAGATCATGAGGTCAAGAGATTGAGACCATCCTGTCCAACATG
           GCTCACGCCTGTAATCACAGCACTTTGGGAGGCTGAGGCAGGCAGATCATGAGGTCAAGAGATCGAGACC
           ATCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAATACAAAAATTAGCTGGGCGTGGTGGTGCATG
           TCACGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCAGGCAGATCATGAGGTCAAGAGATCGAGACCAT
           CCTGGCCAACATGGTGAAACCCGTCTCTACTAAAAATACAAAAATTAGCTAAGCGTGGTGGCAGGCGCCT
           GGCAGATCATGAGGTCAAGAGATTGAGACCATCCTGGCCAACATGCTACTAAAAGTACAGAAATTACCTG
           GGTGCGGTGGCTCCTCACGCCTGTAATCCCAGCAGTTTGGGAGGCCAAGGTGGGCAGATCATGAGGTCAA
           GAGATCGAGACCATCCTGGCCAACATGGTGAAACCCCATCTCTACTAAAAATACAAAAATTAGCTGGGTG
           GTGGCTCAAGCTTGTAATGCCAGCACTTTGTGAGGCCGAGGCGGGCAGATCATGAGGTCAAGAGATAGAG
           ACCATCCTGGCTAACACGGTGAAACCCCATCTCTACTAAAAATACAAAAAATTACCCGGTTGTGGTGCTG
           AGGCTGAGGCAGGCAGATCATGAGGTCAAGAGATCGAGACCATCCTGGCCAACATGGTGAAACCCCGTCT
           TAAAATACAGACTGCTGGGCACGGTGGCTCACGCCTGTAATCCCAGTACTTTGGGAGGCCAAGACAGGCA
           GATCATGAGGTCAAGAGATCGAGACCATCCTGGCCAACATGGAGAAACCCGGTCTCTACTAAAAATACAA
           AAAACATTAAGCACTTTGAGAGGCTGAGGCTGGCAGATCATGAGGTCAAGAGATCGAGACCATCCTGGCC
           ATACCAGTCGGCCAGTCATGGTGGCCCACGCCTATAATCCCAGCACTTTGGGAGGCCGAGACTGGCAGAT
           CATGAGGTCAAGAGATCGAGACCATCCTGGCCAACATGGTGAAACCCTGTCCCTACTAAAAATACAAAAA
           GAAAGGTCCAGGCTGGGCATGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCGGGCAGA
           TCATGAGGTCAAGAGATAGAGACCATCCTGGCCAACATGGTGAAACCCTATCTCTACTAAAAATACAAAA and sequence Blast taxonomy

- sequence ACTTTGGGAGCCTGAGGTGGGTGGATCAC and sequence Blast taxonomy

- sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having :
           ACTTTGGGAGCCTGAGGTGGGTGGATCAC

           GGGTTTCAAAAATCCTCGATAGGCCGGGCACGGTAGCTCACGCCTGTAATCCCAGCACTTTGGGAGCCTG
           AGGTGGGTGGATCACGAGGTCAGAAGTTCGAGACCAGCTTGGCCAACATAGTGAAACCCTGTCTCTACTA
           TTCTGGCTGGGTGTGGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGCCTGAGGTGGGTGGATCACGA
           TAAAAATTACGTAACATGAGGGCCGGGTGCGGTGGTTCACACCTGTAATCGCAGCACTTTGGGAGCCTGA
           GGTGGGTGGATCACCAGGTCAGGAGATCCAGACCATCCTGGCTAACACAGTGAAACCCTGTCTCTACTAA
           ACTTTGGGAGCCTGAGGTGGGTGGATCACCTGAGCTCAGGAGTTTGAGACCAGCCTGGCCAACATGGCAA and sequence Blast taxonomy

- sequence CCTCGGCCTCCCAAAGTGCTGGGATTACAGGCATGAGCCACC and sequence Blast taxonomy

- Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having CCTCGGCCTCCCAAAGTGCTGGGATTACAGGCATGAGCCACC and Blast taxonomy

- sequence ATGGGGTTTCCCCATG and sequence Blast taxonomy

- sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having :
           ATGGGGTTTCCCCATG

           CATGGGGTTTCCCCATGTTGGCCAGGCTGGTCTTGAACTCCTGACCTCATGATCCGCCTGCCTTGGACTC
           GCTTCCTGAGTAGCTGAGACTACAGGCATGGGCCACCATGTCAGGCTAATTTTTGTATTTTTAGTAGAGA
           TGGGGTTTCCCCATGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCAAGTGATCCACCCACCTTGGCCTC
           CATGCAGCACCATGCCCGGCTAATTTTGTATTTTTAGTAGAGATGGGGTTTCCCCATGTTGGCCAGGCTG
           CTAATTTTTTGTATTTTTAGTAGAGATGGGGTTTCCCCATGTTGGCCAAGCTGGTCTCGAACTCCTGACT
           GCCACCACGCCTGGCTAATTTTGTGTTTTTAGTAGAGATGGGGTTTCCCCATGTTGGCCAGGCTGGTCTC
           ACTGCGCCTGGCCATAGTTTGTATTTTAGTCGTGATGGGGTTTCCCCATGTTGCCTAGGCTCGTCCCGAA
           GCTAATTTTGTATATTTAGTAGAGATGGGGTTTCCCCATGTTAGCCAGGCTGATCTCAAACTCCTGACCT
           GGTAGAGATGGGGTTTCCCCATGTTGCTTAGGCTGGTATTGAATTCCTGGCCTCAGGCAATCCTCCCACC
           AATATTCTTTTATTTATTTAATAGAGATGGGGTTTCCCCATGTTGCCCAGGCTGGTCTTGAATTCCTGAG
           TACAGGCCTGCGCCACCATGCCCGACTAATTTTTGTATTTTTAGTAGAGATGGGGTTTCCCCATGTTGGC and sequence Blast taxonomy

- Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having AAGACAAATTA , TAAAAGAGAAG , CTGATGAAAAA , GTCAGAAGCA , GAACAAGAAAG (from Papio anubis centriolar coiled-coil protein 110 (CCP110), transcript variant X11, mRNA) and Blast taxonomy


Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 having AGTGAG ATTCCA CAGTGA and Blast taxonomy
Sequence CAATGAGAACACATGGACACA from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy

Redundant counts of Fasta sequences from 20.1 and 21.1 (similar sequences at a given distance , segregated by complexity of chosen list(s)) were used in search for a texts longest (Fasta) string from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 :
Sequence TTGACTCAACAATTGTACCTCTAGAAATTCCTCTAATATAAAAAAC (l=46 from d=409) , and sequence Blast taxonomy
Sequence AACTTCTCTTCTCGCTTCATTTCATTCATTTGATCTTCC (l=39 from d=243) , and sequence Blast taxonomy
Sequence(s) from (.) having AACTTCTCTTCTCGCTTCATTTCATTCATTTGATCTTCC
TATTTCTTGGAGGCTTTGTTTGTTTCTTTTTATTCTTTTTTCTCTAAACTTCTCTTCTCGCTTCATTTCA
TTCATTTGATCTTCCATCCTGATACCCTTTCTTCCAGTTGATCGAATCGGCTACTGAGGCTTGTGCATTC
TTTTTCTCTAAACTTCTCTTCTCGCTTCATTTCATTCATTTGATCTTCCATCACTGATACCCTCTCTTCC
TATTTCTTGGAGGCTTTGTTCATTTCTTTTTATTCTTTTTTCTCTAAACTTCTCTTCTCGCTTCATTTCA
TTCATTTGATCTTCCATCACTGATACCCTCTCTTCCAGTTGATTGAATCGGCTACTGAAGCTTTTGCATT
TCTTTTTTCTCTAAACTTCTCTTCTCGCTTCATTTCATTCATTTGATCTTCCATCACTGATACCCTTTCT
TTCGTTTCTTTTTATTCTTTTTTCTCTAAACTTCTCTTCTCGCTTCATTTCATTCATTTGATCTTCCATC
CTTTTTTCTCTAAACTTCTCTTCTCGCTTCATTTCATTCATTTGATCTTCCATCACTGATACCCTTTCTT
and sequence Blast taxonomy
Sequence AAAATACAAAAAATTAGCCAGGCGTGGT (l=28 from d=121) , and sequence Blast taxonomy
Sequence(s) from (.) having AAAATACAAAAAATTAGCCAGGCGTGGT
CCCCATCTCTACTAAAAAAATACAAAAAATTAGCCAGGCGTGGTGGTGGGTGCCTGTAGTCCCAGCTACT
TGGCTAACACGGTGAAACCCCGTCTCTATTAAAAATACAAAAAATTAGCCAGGCGTGGTGGCAGGCGCCT
CTTCTCTACTAAAAAATACAAAAAATTAGCCAGGCGTGGTGGCAGGCGCCTGTAGTCCCAGCTACTCGGG
AAGATGGTGAAACCCTGTCTCTACTAAAAATACAAAAAATTAGCCAGGCGTGGTGGCACGCGCCTGTAAT
and sequence Blast taxonomy
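
The search for a text's longest string is not specified in detail here. As a minimal sketch, assuming "longest string" means the longest substring that occurs at least twice in one text, a suffix-sorting approach could look like the following. The function name is hypothetical, and the code does not attempt to reproduce the l and d values quoted above.

```python
def longest_repeated_substring(text):
    """Return the longest substring that occurs at least twice in text.

    Assumption: occurrences may overlap. Returns "" when nothing repeats.
    """
    # After sorting, the longest repeat is a common prefix of two
    # neighbouring suffixes.
    suffixes = sorted(text[i:] for i in range(len(text)))
    best = ""
    for a, b in zip(suffixes, suffixes[1:]):
        n = 0
        limit = min(len(a), len(b))
        while n < limit and a[n] == b[n]:
            n += 1
        if n > len(best):
            best = a[:n]
    return best


if __name__ == "__main__":
    print(longest_repeated_substring("ACGTTTACGTA"))
```

Sorting whole suffixes needs memory quadratic in the text length, so this form suits short texts only. For chromosome-scale input a suffix array or suffix automaton would be the usual choice.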
Sequence CTCTTTGTCTTT from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and (selected) sequences Blast taxonomy
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having AGTTTCTTTTGCTGTGCAGAAGCTCT and Blast taxonomy
Sequence from gi|568801956|ref|NT_011786.17| Homo sapiens chromosome X genomic scaffold, GRCh38.p2 Primary Assembly HSCHRX_CTG14 , having AGTCTTTGGC and resulting Blast taxonomy
Sequence from gi|568801956|ref|NT_011786.17| Homo sapiens chromosome X genomic scaffold, GRCh38.p2 Primary Assembly HSCHRX_CTG14 , having CCACTGCACTCCAGCCTGG , CCTGGGAGGTGGAGGTTGCAGTGAGCCAAGATGGCCCCACTGCACTCCAGCCTGGGTGACAGAGCGAGACTCAGTTTCAAAAAAAAAAAAAA and resulting Blast taxonomy
Search for a text's longest string as in (12.1,16.3,15.2,16.4,16.5) and
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2
AAAGCTCAGTATTCGGGTGGGAGTGACCCGATTTTCCAGGTGCGTCCGTCACCCCTTTCTTTGACTCGGAAAGGGAACTCCCTGACCCCTTGCACTTCCCAAGTGAGGCAATGTCTCG and resulting Blast taxonomy
Sequence from gi|568801956|ref|NT_011786.17| Homo sapiens chromosome X genomic scaffold, GRCh38.p2 Primary Assembly HSCHRX_CTG14 , having :
CTT , GAA , TGA , CCT , TCC , TTG , TTC , AGA , ACA , AAA , CAG , CCA , CAC , CAT , CTG , GAG , AGC , ACC , GTG and resulting Blast taxonomy
Ref : Mol Cell Biol. 2000 Feb; 20(3): 1021-1029.
PR48, a Novel Regulatory Subunit of Protein Phosphatase 2A, Interacts with Cdc6 and Modulates DNA Replication in Human Cells
Zhen Yan, Sergei A. Fedorov, Marc C. Mumby, and R. Sanders Williams


- Sequence from gi|354792485|gb|JN956986.1| Mus musculus targeted KO-first, conditional ready, lacZ-tagged mutant allele Pfkfb2:tm1a(KOMP)Wtsi; transgenic and Blast taxonomy
Ref : Nucleic Acids Res. 2013 January; 41(D1): D666-D675.
The Standard European Vector Architecture (SEVA): a coherent platform for the analysis and deployment of complex prokaryotic phenotypes

Sequence from gi|568801956|ref|NT_011786.17| Homo sapiens chromosome X genomic scaffold, GRCh38.p2 Primary Assembly HSCHRX_CTG14
ACATTTTAATTTTTATCTTTCCCTAATTTAGTTTAACAGGCTTTTCTCATGAAGAACTAG
ACGACTCTTGGTAACCATGTTTGCTGCCCAGCTTCTAACTTACATACCGTGAGAAGTTAC
GTAACATTTACTCCTTTGTAAATGTTTCCCTATCATCAGACAAAACTCAATAAAAATGTG
TGTAATCCAATGTGGGTTTTTTTTTCCATAATTAATTTTGATACCATAGTGTGTGAACCA
AGAATAATCTAGTCACGTGAAACCTCTTCTCCAGTCATAGTATTTCTCATTCATTATAAT
AAAAGTAACTGGCTTTTAACCTCCTTATTTTTGTCTTAAATTTATAAATGAGTATAACCT
ATCTTACTGCTCAACTGCAGGCCTACATTTTGGAAGTATAAGCTGTGTTCTTTCCTCTCC
AAAGTTGGGCTGCCACCTGTAAAACAGGAGGTCATTTTTCCTAGCACACTCCTGTTTTGT
CCCTGAACATAAAAACAGATTTTTCTAGGCATTAGAAATAGTAAGAAAGATACTGCCTCT
GGTCTTTACTTCTGATGTCTGTGGTTTGTATGTATGTGTCCCTCCAAAATTCATATGTTG
GAATTTAAACCCCCCCGTGTGATAGAATTAAGAGGTGGCACCTTTTGCAGGTGACTAAGA
GTGGGATTAGTGACCTTATAAAAGGGCTCAATGGAACTAGCTAGGCCCTTTTGCCCTTGC
ACCATTTGAAGACACTGCATTGATCCCCTCTGGAGAACACAGCAACAAGGCACCATCTTG
GAAGCCGAGACCGGGCCTTCACCAGACATCGAACCTGCCAGTGCCTTCCTTGATCTTCGA
CTTCCCAGCCTCTAAAACTGTAAGAAATACATTTCTGGTCTTTGTAAGTTACCCAGTCTA
AGGTATTTTGCTATAGCAGCAGGAACAGACTGAGATAGCATCTTTATTATGGAACCTTTG
GTCCTCATACTTAATGAATGGGCGTATTTCCTTTGTTTTTAAATGAATGGGCTCCTTCCC
TATAATTGACTTCATCTCTTAAAAAATCAAGTTTTACAAGGCAGAACTGGAAAAAAATCT
CTACTGAACCCTCTAAAGGGTCCTAGGTTTCGTGGCAGGATCACTATGGGTAATTAAGAG
GCTGCTTGTGATTTTATCTAACCATTATTTACAGTTTTCCTCTGGTGACTCATGTGACTT
AAATAATTGCACAAAATGCTACTTAAACAGAAGATACTAAGTAGTGTGACAAATAGCACA
CTAATTGAGCCTCCTGATTGTTCTGCCATGCATGAAGGAAGGGTGATGATCTAGAGTCAG
CACCTCATTTAGGAATGGCCATGTGTCAATAGTGACAAAGGCTGGGCCTCACTAAGAAAG
GTCAGACGCAGAGAATCCAAATGCATAATCATGTTGAGAGGTAAGAGTTTAAACAGATTA
TAAAATAAAATCCTAAGTAACATATTTAGCTTGGCTACTCTAGAGAGTCAACTGTTAGAA
CTAGGTTTTTTTGTATTCCAGAGTAAATCTTTCAATAGACTTTTGTACCTTTGGTTTGAA
AAATAACTACACACATGCCCTTATTTTGTTTCTGCTTTGCCCACGTTTAAGCTTTGTTGA
TCCTATAGGTAGATGGCATTTATTTGTTGAAAAACTAGTCAGAAAAATTCTTACTTTATA
GAGAACTACAGACGTGCTAGTGGCACTTTTATGCCCAATACAGCCCAGTGGATTATTTAA
TATTTTCTTTAATTGCTGAAGAGATTTTTGCCAGAAATAAAACAATTAGCACTCACAGAT
TTGATTTCCATACCAATGTGAAATCTTGACTCTGCACATTATTCAAAAGTGAAACCATAC
TGAGAACAAAGAAAGATAGGATTTATGTATGGTAAAAAGGCAAGTGAATGAAGCTATAAT
TTTATTTCATTACTCTTAGTACATCCACAACTATTTCCACCTAAATAGATCTTTAAAATA
and Blast taxonomy
Ref : Autophagy Volume 10, Issue 2, 2014
Self-eating to remove cilia roadblock
Zaiming Tang, Muyuan Zhu & Qing Zhong


Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having AGTCGG , CCGGAC , GCGGAC , CGCGGC , CGGCGC , CGTCGA and Blast taxonomy
Sequence      GGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCTGAGGTGGG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence   GGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCTGAGGCGGGTGGA from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence GCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCTGAGGCGGGTGGATT from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence GTGGCTCACGCCTGTAATCCCAGCACTTTGGGAG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having GTGGCTCACGCCTGTAATCCCAGCACTTTGGGAG and Blast taxonomy
Sequence ACACCAAAAGCAATGGCAACAAAAGCCAAAATTGACAAATGGG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having ACACCAAAAGCAATGGCAACAAAAGCCAAAATTGACAAATGGG and Blast taxonomy
Sequence ACAAACAACCCCATCAAAAAGTGGGTGAAGGACATGAACAGACACTTCTCAAAAGAAG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having ACAAACAACCCCATCAAAAAGTGGGTGAAGGACATGAACAGACACTTCTCAAAAGAAG and Blast taxonomy
TAGCAAAGACTTGGAA CAACCCAAATGTCCAACAAT ATAGACTGGATTAAG
       (1)                (2)              (3)
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having (1) and Blast taxonomy
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having (2) and Blast taxonomy
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having (3) and Blast taxonomy

TAGCAAAGACTTGGAA_CAACCCAAATGTCCAACAAT_ATAGACTGGATTAAG
                          (4)
Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having (4) and Blast taxonomy
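
Pattern (4) joins the three strings with underscores. A minimal sketch of searching with such a gapped pattern follows, assuming each underscore stands for exactly one arbitrary base (A, C, G, T or N). That reading of the underscore and the function names are assumptions made for illustration.

```python
import re


def gapped_to_regex(pattern, gap="_"):
    """Compile a gapped pattern; each gap character matches one base."""
    parts = ["[ACGTN]" if ch == gap else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))


def find_gapped(pattern, sequence):
    """Return (start, matched_text) for every non-overlapping match."""
    rx = gapped_to_regex(pattern)
    return [(m.start(), m.group(0)) for m in rx.finditer(sequence)]


if __name__ == "__main__":
    pattern4 = "TAGCAAAGACTTGGAA_CAACCCAAATGTCCAACAAT_ATAGACTGGATTAAG"
    # Synthetic test sequence: the three parts joined by single bases.
    seq = ("GG" + "TAGCAAAGACTTGGAA" + "A" + "CAACCCAAATGTCCAACAAT"
           + "C" + "ATAGACTGGATTAAG" + "TT")
    print(find_gapped(pattern4, seq))
```

If a gap may cover a variable number of bases, the wildcard would need a quantifier such as `[ACGTN]{0,3}` instead.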

Sequence GAGACGGAGTCTCGCTCTGTCGCCCAGGCTGGAGTGCAGT from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , and Blast taxonomy
Sequence (.) from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having (at least) one from ACG AGA AGG AGT CAG CCA CCC CGC CGG CTC CTG GAC GAG GCA GCC GCT GGA GGC GTC GTG TCG TCT TGC TGG TGT
Sequence from (.)                                                                                                                            CTG         GCA     GCT GGA GGC GTC GTG TCG TCT TGC TGG TGT  and Blast taxonomy
Sequence from (.)                                                                                                                                                GCT GGA GGC GTC GTG     TCT TGG     TGT  and Blast taxonomy
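
The "having (at least) one from" condition, and the "having ... and ..." condition used in other entries, can be read as simple membership tests on a sequence. A minimal sketch under that reading follows. The function names are hypothetical.

```python
def which_present(sequence, motifs):
    """Return the motifs that occur in sequence, in the order given."""
    return [m for m in motifs if m in sequence]


def has_at_least_one(sequence, motifs):
    """True when any listed motif occurs ("having at least one from")."""
    return any(m in sequence for m in motifs)


def has_all(sequence, motifs):
    """True when every listed motif occurs ("having X and Y and Z")."""
    return all(m in sequence for m in motifs)


if __name__ == "__main__":
    seq = "GAGACGGAGTCTCGCTCTGTCGCCCAGGCTGGAGTGCAGT"
    triplets = ["GCT", "GGA", "GGC", "GTC", "GTG", "TCT", "TGG", "TGT"]
    print(which_present(seq, triplets))
    print(has_all(seq, triplets))
```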

Sequence         AGTGCAGTGGCGGGATCTCGGCTCACTGCAAGCTCCGCCTCCCGGGTTCACGCCATTCTCCTGCCTCAG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence                     CGATCTCGGCTCACTGCAAGCTCCGCCTCCCGGGTTCACGCCATTCTCCTGCCTCAGCCT from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence having GAGTGCAGTGGC___ATCTCGGCTCACTGCAAGCTCCGCCTCCCGGGTTCACGCCATTCTCCTGCCTCAG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence having GAGTGCAGTGG____ATCTCGGCTCACTGCAAGCTCCGCCTCCCGGGTTCACGCCATTCTCCTGCCTCAGCCT___GAGTAGCTGG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy

Sequence having AAGTCATTGGTAGCTTGA , GGTAGCTTGATGGGGATG , GTAGCTTGATGGGGATGG , TAGCTTGATGGGGATGGC ,  GCTTGATGGGGATGGCAT , TTGATGGGGATGGCATTG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence having                          CATTGGTAGCTTGATGGGGATGGCATTGAATCTATAAATTACCTTG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence having TTTTTTCCAATTCTGTGAAGAAAGTCATTGGTAGCTTGATGGGGATGGCATTGAATCTGTAAATTACCTTGGGCAGTATGGC from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Ref : Chromosome Research, September 2016, Volume 24, Issue 3, pp 309-323
LINE-related component of mouse heterochromatin and complex chromocenters' composition
Inna S. Kuznetsova, Dmitrii I. Ostromyshenskii, Alexei S. Komissarov, Andrei N. Prusov, Irina S. Waisertreiger, Anna V. Gorbunova, Vladimir A. Trifonov, Malcolm A. Ferguson-Smith, Olga I. Podgornaya


Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having AGTCTT and GTCAGT and TCAGTC and Blast taxonomy
Ref : Mol Biol Cell. 1998 Oct; 9(10): 2963-2971.
The Cytoplasmic Zinc Finger Protein ZPR1 Accumulates in the Nucleolus of Proliferating Cells
Zoya Galcheva-Gargova, Laxman Gangwani, Konstantin N. Konstantinov, Monique Mikrut, Steven J. Theroux, Tamar Enoch, and Roger J. Davis

Sequence from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 , having TTATTATTATACTTTAAGTTTTAGGGTACATGTGCACA and Blast taxonomy
Ref : Int J Cancer. 2011 Sep 1; 129(5): 1162-1169.
Cyclic nucleotide phosphodiesterase 7B mRNA: an unfavorable characteristic in chronic lymphocytic leukemia
Lingzhi Zhang, Fiona Murray, Laura Z. Rassenti, Minya Pu, Colleen Kelly, Joan R. Kanter, Andrew Greaves, Karen Messer, Thomas J. Kipps, and Paul A. Insel
Sequence           CCTCCCAAAGTGCTGGGATTACAGG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence          GCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCAC from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence        CGGCCTCCCAAAGTGTTGGGATTACAGG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence having CGGCCTCCCAAAGTGTTGGGATTACAGG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Sequence having   GCCTCCCAGAGTGCTGGGATTACAGG from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
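
The aligned entries above differ from one another at single positions (for example GTGCTGGG versus GTGTTGGG). A minimal sketch of counting such position-wise mismatches, and of scanning a text for windows within a given mismatch limit, is shown below. It assumes substitutions only (Hamming distance, no insertions or deletions). The function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_with_mismatches(query, text, max_mismatches=1):
    """Return (start, mismatches) for each window of text that lies
    within max_mismatches substitutions of query."""
    k = len(query)
    hits = []
    for i in range(len(text) - k + 1):
        d = hamming(query, text[i:i + k])
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


if __name__ == "__main__":
    query = "CCTCCCAAAGTGCTGGGATTACAGG"
    text = "CGGCCTCCCAAAGTGTTGGGATTACAGG"
    print(find_with_mismatches(query, text, max_mismatches=1))
```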
Sequence having AAAACAGCATGGTACTGGT , ACCAAAACAGCATGGTACT from Homo sapiens chromosome 6 genomic scaffold, GRCh38.p2 and Blast taxonomy
Ref : Oncotarget. 2016 Jan 26; 7(4): 4048-4061.
Enhanced expression of LINE-1-encoded ORF2 protein in early stages of colon and prostate transformation
Chiara De Luca, Fiorella Guadagni, Paola Sinibaldi-Vallebona, Steno Sentinelli, Michele Gallucci, Andreas Hoffmann, Gerald G. Schumann, Corrado Spadafora, and Ilaria Sciamanna
Sequence having    CCTCCCAAAGTGCTGGGATTACAGGCGTGAGCA from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7
TCACCACGTTGGCCAGGATGGTCTTGAACTCCTGACCTCGTGATCCGCCCACCTCGGCCTCCCAAAGTGC
TGGGATTACAGGCGTGAGCAACCGCGCCCAGCCCCCACCATGTTTTTAAATGACACCAACTTTCGACTCA
ACCCTGACCAAGCTGGTCTTGAACTCCTGACGATCTGCCCGCCTGGGCCTCCCAAAGTGCTGGGATTACA
GGCGTGAGCAACCGCGACCGGCCTATTTTTTTTCTTTTTTCTAAACGCGTCCAGCTTCCCTGATTTCGGA
TTCACCATGTTGGCCAGGATGGTCTCGATCTCTTGACCTCGTGATCCGCCTGCCTTGGCCTCCCAAAGTG
CTGGGATTACAGGCGTGAGCACCGCGCCCAGCCTCAAGGTTTATTTTTATGTTTCTAAAGACAGAGTCTC
CTCAGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCAACCGCACCTGGCTTCTTTTTCTTTCTTTCTTT
and Blast taxonomy
Sequence        GGCTGCATAGTATTCCATGGTGTATATGTGCCACATTTTCTTAATCCAGTCTATC from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having GGCTGCATAGTATTCCATGGTGTATATGTGCCACATTTTCTTAATCCAGTCTATC from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
and AAB59368.1 ORF2 contains a reverse transcriptase domain [Homo sapiens] Blast taxonomy
Sequence        TTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTT from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having TTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTT from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 
CTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTGTTTTTTTCTTGTAAAT
TGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTGTT
GTGTTTTTTGGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATG
TGTGCTTTTTGTCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGAT
CATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTGTTTT
GGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTT
GCATTTTTTCATGTGTTTTTTGGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCC
CACTTTTTGACGGGGTTGTTTGTTTTTTTCTTGTAAATTTGTCTGAGTTCATTGTAGATTCTGGATATTA
ATTTCTCTGATGGCCAGTGATGATGAGCATTTCTTCATGTGTTTTTTGGCTGCATAAATGTCTTCTTTTG
AGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTGTTTTTTTCTTGTAAATTTGTTT
GATGGCCAGTGATGGTGAGCATTTTCTCATGTGTTTTTTGGCTGCATAAATGTCTTCTTTTGAGAAGTGT
CTGTTCATGTCCTTCGCCCACTTTTTTATGGGGTTGTTTGTTTTTTTCTTGTAAATTTGTTTGAGTTCAT
TTCACGTGTTTTTTGGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTT
TGATGGGGTTGTTTGTTTTTTTCTTGTAAATTTGTTTGAGTTCATTGTAGATTCTGGATATTAGCCCTTT
GGTCAGTGATGGTGAGCATTTTTTCATGTGTTTTTTTGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTG
TTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTTTTTCTTGTAAATTTGTTTGAGTTCATTGTAGAT
and Blast taxonomy
Sequence having AAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGG from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 
ATTTGCATTTCTCTGATGGCCAGTGATGATGAGCATTTCTTCATGTGTTTTTTGGCTGCATAAATGTCTT
CTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTGTTTTTTTCTTGTAAAT
TGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTGTT
GTGTTTTTTGGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATG
GGGTTGTTTTTTTCTTGTAAATTTGTTTGAGTTCATTGTAGATTCTGGATATTAGCCCTTTGTCAGATGA
TGTGCTTTTTGTCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGAT
GGGGTTGTTTGTTTTTTTCTTGTAAATTTGTTTGAGTTCATTGTAGATTCTGGATATTAGCCCTTTGTCA
CATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTGTTTT
GGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTT
ATTTCTCTGATGGCCAGTGATGATGAGCATTTCTTCATGTGTTTTTTGGCTGCATAAATGTCTTCTTTTG
AGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTGTTTTTTTCTTGTAAATTTGTTT
TTCACGTGTTTTTTGGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTT
TGATGGGGTTGTTTGTTTTTTTCTTGTAAATTTGTTTGAGTTCATTGTAGATTCTGGATATTAGCCCTTT
GGTCAGTGATGGTGAGCATTTTTTCATGTGTTTTTTTGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTG
TTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTTTTTTTCTTGTAAATTTGTTTGAGTTCATTGTAGAT
and Blast taxonomy
Sequence having ATGGGGTTG_TTGTTTTT_TCTTGTA from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having ATGGGGTTG_TTGTTTTT_TCTTGTAAAT_TGTTT from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having ATGGGGTTG_TTGTTTTT_TCTTGTAAAT_TGTTTGAGTTCATTGTAGATT from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having ATGGGGTTG_TTGTTTTT_TCTTGTAAAT_TGTTTGAGTTCATTGTAGATTCTGGATATTAGCCCTTTGTCAGATGAGTAGGTTGCGAAAA from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having             GGCTGGAGTGCAGTGGTGCGATCTC from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence                    GGCTGGAGTGCAGTGGTGCGATCTCGGCTCACTG from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having             GGCTGGAGTGCAGTGGTGCGATCTCGGCTCACTG from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having CTCTG_C_CCCAGGCTGGAGTGCAGTGGTGCGATCTCGGCTCACTG from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence                                CTGCCTCAGCCTCCCAAGTAGCTGGGATTACAGG from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence   
GGCTCACTCCAACCTCTGCCTTCTGGGTGCAAGCGATTCTCCTGCCTCAGCCTCCGAGTAGCTGGGATTACAGGCACGTGCCACCACAGCCGGCTAATTTTTTGTAGTTTGTAGAGAGGAGATTTTGCCATGTCGGCCAG
    ACTGCAACCTCTAGCCTTCTGGGTTCAAGCAATTCTCCTGCCTCAGCCTCCCAGTAGCTGGGATTACAGGTACGCCCCATCATGCCTGGCTAAATTTTGTATTTTTAGTAGAGATGGAGTTTCACCATGTTTGCCAGGCT
having                             ATTCTCCTGCCTCAGCCTCC__GTAGCTGGGATTACAGG_AC_ from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having                                 GCCTCCC_AGTAGCTGGGATTACAGGT from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 and Blast taxonomy
Sequence having GGAGGCTGAGGCAGGAGAATCACTTGAACC_GGGAG from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7  and Blast taxonomy
and protein sequence TPDVSSALDKLKDKLKEFGNTLEDKARELISRIKQSELSAKMREWFSETFQKVKEKLKID alignment ACN81312.1 apolipoprotein C-I [Homo sapiens] Blast taxonomy
Sequence having ATATGA and TGAAAA and TTTTAT from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7  and Blast taxonomy
and protein sequence QHYHQKGQNGSFDAPNERPYSLKIRNTTSCNSGTYRCTLQDPDGQRNLS alignment NP_001238830.1 CD83 antigen isoform c [Homo sapiens] Blast taxonomy
Sequence having AGTAATTAA ,AGTGACACA ,ACAGTAACA ,TCAGTCATT ,TAATTCAGT ,CCAAACATT from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7  and Blast taxonomy
Sequence having AGATGCAGAAAAA , ACAGATGGAAA , AGATGAAGATG , ATGGAAAGAAA , CCACACAAAAA , CTTATATTTGA , TTTTTATATTTAT from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7  and Blast taxonomy
Sequence having AAAAA_A_AAAA_AA_AA_AA_AA_AA_AAA from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7  and Blast taxonomy
Sequence from Streptomyces rapamycinicus NRRL 5491 having ACG and ACT and CCG and CGC and CGG and CGT and CTG and GCA and GCG and GCT and GGC and GGG and GGT and GTG and TGC and TGG and Blast taxonomy
and Blast X86780.1 S.hygroscopicus gene cluster for polyketide immunosuppressant rapamycin taxonomy
and Blast CAA60460.1 polyketide synthase [Streptomyces rapamycinicus] taxonomy
and Blast CAA60461.1 pipecolate incorporating enzyme [Streptomyces rapamycinicus] taxonomy

Sequence from gi|568801956|ref|NT_011786.17| Homo sapiens chromosome X genomic scaffold, GRCh38.p2 Primary Assembly HSCHRX_CTG14 having GTC and CGT and CGC and CGA and ACG and GAC and Blast taxonomy
and Blast NP_000108.1 emerin [Homo sapiens] taxonomy
Ref : Tremblay et al. BMC Genomics 2010, 11:632
Expression, tandem repeat copy number variation and stability of four macrosatellite arrays in the human genome
Deanna C Tremblay, Graham Alexander Jr, Shawn Moseley, Brian P Chadwick

Sequence from Mycobacterium tuberculosis H37Rv having CAAGCTGTC or CCGAAGGGC or CTGGAAGCC or GGAGGCGAT and Blast taxonomy
Sequence from Mycobacterium tuberculosis H37Rv having CAGCGCGCAGCAC
CGCCTCACGCCGGAAGCGAAGGTAAAAACTGGGATCGCGGGCTAGATCAGCGCGCAGCACCTTGACCGCA
GGTTGGCCAGCGCGCAGCACAGCGTGCCGCCCAGTTCTTCCTCGCTGGCTTGGGTGTCGAGTTGGTTGAT
CGTCGGCGATGCGCCGGGCGGCGTCGGGCTTGGTGATGCGTAACCGGTTGGCCAGCGCGCAGCACAGCGT
CGTCTCGCGCACAGCGCGCAGCACGCTGCTGCGAGCGGTGATCAGCGCACGGGACTCAGCGTTGACCGCC
CCCGACCGACTCCCCTTTTCGAGGCGTCAAGAAGGTGCACTACTGGGCAGCGCGCAGCACCGGTGGGGAA
and Blast taxonomy
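
Listings like the one above show the lines of a FASTA record in which a given string occurs. A minimal sketch of locating such hits follows, including hits that run across a line break. It assumes the input is the list of consecutive sequence lines of one record, with the header line removed. The function name is hypothetical.

```python
import bisect


def lines_with_motif(fasta_lines, motif):
    """Return (line_index, column) for every occurrence of motif.

    The lines are joined first, so an occurrence that starts on one line
    and ends on the next is reported at the line where it starts.
    """
    joined = "".join(fasta_lines)
    # starts[i] is the offset of line i within the joined string.
    starts = []
    offset = 0
    for line in fasta_lines:
        starts.append(offset)
        offset += len(line)
    hits = []
    pos = joined.find(motif)
    while pos != -1:
        line_idx = bisect.bisect_right(starts, pos) - 1
        hits.append((line_idx, pos - starts[line_idx]))
        pos = joined.find(motif, pos + 1)  # allow overlapping hits
    return hits


if __name__ == "__main__":
    lines = ["AAACG", "TTGGA", "CGTTA"]
    print(lines_with_motif(lines, "CGTT"))
```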
Sequence from Mycobacterium tuberculosis H37Rv having GGTGATGCCGCGTTTGCGGGCG
TGGCCCGCGCTTGGGGGGTCAGGTAGCCACTTAGCCGTGACATGCCGTCGTATTGCTGGTTGCTCAGGGT
GATGCCGCGTTTGCGGGCGCGTTCGGTGTCGGTGAGGTCGCCGTCGGGGTGTAGCCAGTCCATGACCCGC
CCACTTAGCCGTGACATGCCGTCGTATTGCTGGTTGCTCAGGGTGATGCCGCGTTTGCGGGCGCGTTCGG
and Blast taxonomy
Sequence GGTGATGCCGCGTTTGCGGGCG from Mycobacterium tuberculosis H37Rv and Blast taxonomy
Sequence from Streptomyces rapamycinicus NRRL 5491 having at least one from sequence list and Blast taxonomy
Sequence from Streptomyces rapamycinicus NRRL 5491 having CTCGGCGGCGGCAA or CTTCGGCGGCGGCC or GCGGCCTGGATCA
CCGCTACGCCACTCCCTGGGTGCCATCCGACCTCGCCGCCCTCGGCGGCGGCAACTGGGCGGAGCACCTG
TCGCGGCGGCCGGTGAGCCGACCGAGCACAACGGCCCAGGCGGCCTGGATCACGGTGTTCAGGGTCAGGC
CGCCGTCCTCCAGCCAGGTGATGTCCACGGGCTGGCCGACGGCCTTCCGGGCGGGGCTGTCGGCGGCGGC
CTGCGGGGCGTTGTGGGCGAGCAGCCAGGAGAACGGCACGGCGGCGCCGGTGGCGGCCGCGGCCTTGAGA
ACTGCGGCCTGGATCAGCACCAGTATCGGCAACTCAGGGCTTTGATGTTTTCGGCGGCGGCGCGTTAGTC
TCTGCGGCGCCGCGTGGTCGGTCGGCAGCCTGATCGCCTTCCGGGTCCTCCAGGGCATCGGCGGCGGCCT
GATCACCCCGGTGATGCAGACCATCCTGGTCCGCGCGGCCGGTCCCCAGCGGATCGGCCGCATCATGAGC
CGAACGCTGTTTGTGCGGTACCCGGGCGTGGCCCCGGTTGCCGCGCCGCAGCAGCTCCTGGAGGTCGGCG
GCGGCCTGCCGCCGCCCGTCGGACTCCGCGCCGGGCAGCGCCCGCAGGGCGGCGATCAGTTGCTCCAGGT
CCCGGTCCGAGCGGGTGGCCCTCGGCGGCGGCAAGGAGGACCTGACGTTCTTCGGCGCGGCGGCCAAGTG
CGCTGCGCGCCGCTCGTTCGCGGCCTGGATCAGGGCGGCGATACGGCGGTGGGCGGCGTGGTCCTCGGGG
GCACCGGCGCGGTTTCGCCACCGCCGTGCTGCGCTTCGGCGGCGGCCACCGCGACACCGGCCTGCGCGGA
TGGTGCTGCTCGGCGGCGGCAAGCTGCTGATGGCGGCGAGCGCGCCGCGCAGCGCCGAGCTGTTCGCCGA
CCTACCGCGGACTGGCCATGCGCGACCTCGCCGACCTCCAGGCGCGGAGCCTCGGCGGCGGCAAGAGCCT
TCCTCATCCTCACCGGTAACGTCGCCCTGGAGACGATGGGCTTCGAGACGTTCGGCTTCGGCGGCGGCCG
GCCCGTTGTGGCTGGGGTCGGTGAAGTCGAACATCGGCCACACTTCGGCGGCGGCCGGCGTGGCCGGTGT
CGCCGCCCACTCTGTGGAGGACCGCCAGCGCCTGCTCGGCCTCGGCGGCGGCCTGAGCGGGGCGGTGAGC
GCCTCCGCGACCATCCGAGAGCTCGACGTGACCGACGAAGCGGCCTGGATCAGGACGATCGAGCAGGTGC
CGCGCCGCCGCGGCTTCGGCGGCGGCCGCCCTGGAGCTCGTCTGGGGCCAGCTGCACGCCGCCCCCTGGA
and Blast taxonomy
Sequence from Streptomyces rapamycinicus NRRL 5491 having
ACG and CGT and GTG and TGG and GGA and CAC and TCC and CCA and CTG and GAG and CTC and CAG and GCT and AGC
and Blast taxonomy
Sequence from (.) GGAGCGGGGTCTGGGGCGGAGCCCCA and Blast taxonomy
Sequence from (.) having GGAGCGGGGTCTGGGGCGGAGCCCCA
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGA
GGAGCCCCGCCTTCCAGCCCCTCCGGCGATTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTTCGGGAAG
AGCGGTAGCTGGGGGAGTTTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTTCGGGAAGGGGCGGGGCGG
GCCCCACCTAACAGCCCCTCCAGGGGGCACCTCCCAGCGGTAGCTGGGGGAGTTTGAGGAGCGGGGTCTG
AGCCGCACCATTCAGCCCCTCCGGCGATTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGAGGGAAGGG
AAGCCCCGCATTCCAGCCCTCCGGCGCTTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGTGGGAAGGG
CCTCCGGCACTTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGTGGGAAGGGGCGGGTAGGGGAGCAGC
GCCCCCTGACGGGCTGAAATCAGCCCCTCCGGCGATTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGT
GGGGGCCATTGCCCCCTGGGCGCCACCTTCCAGCCCCTCCGGCGTTTGAGGAGCGGGGTCTGGGGCGGAG
CCAGCGGTAGCTGGGGGAGTTTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGAGGGAAGGGGCGGGGA
TGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGTGGGAAGGGGCGGGGAGGGGAGCAGCCCGCCGCAGGC
GGGGGTACCTCCCAGCGGTAGCTGGGGGAGATTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGCGGGA
and Blast taxonomy

Sequence from Streptomyces rapamycinicus NRRL 5491 having CGGCC or GCGGG or TCGCC or CGAGG or CTCGG or GTCGG and Blast taxonomy
Sequence from Streptomyces autolyticus strain CGMCC0516, complete genome having CCGGCGCG , CGCGCCCG , GCACCGCC and Blast taxonomy
Sequence from Streptomyces rapamycinicus NRRL 5491 having GGGGTCTGGGGCGGAGCCCCAGTTGTGGGAAGGGGCGGG
AAGCCCCGCATTCCAGCCCTCCGGCGCTTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGTGGGAAGGG
GCGGGTAGGGGAACAGCCCGCCGCAGGCGCCACAGCCCGCCGGACGCCTCCCGGCCCCGCCGCACGGCGT
CCTCCGGCACTTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGTGGGAAGGGGCGGGTAGGGGAGCAGC
GCCCCCTGACGGGCTGAAATCAGCCCCTCCGGCGATTGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGT
GGGAAGGGGCGGGGAGGGGGAACGGCCCGCCGCAGGCGTACGCGTACCGCCGGACACCCCCTGCGCCCCA
CCGCGTGGCAGGGGCTTCGCCCCTGGACCTTGGGGTCTGGGGCGGAGCCCCAGTTGTGGGAAGGGGCGGG
GAGGGGAGCAGCCCGCCGCAGGCGTACGCGTACCGCCGGACACTCCTCCGGGCCTCGCGGTTTCTGGGTC
TGAGGAGCGGGGTCTGGGGCGGAGCCCCAGTTGTGGGAAGGGGCGGGGAGGGGAGCAGCCCGCCGCAGGC
GGGGTCTGGGGCGGAGCCCCGGCGGGGTCTGGGGCGGAGCCCCAGTTGTGGGAAGGGGCGGGGAGGGGAG
TGCTGCGGGGTGTCCGCCGCGTGGCGAGGGCTTCGCCCCTGGACCCGGGGTCTGGGGCGGAGCCCCAGTT
GTGGGAAGGGGCGGGGAGGGGAGCGGCCCGCCGCAGGCGTCACGATCCGTCGGACATCCCCTGGCGGGAG
and Blast taxonomy
and protein sequence AGP56635.1 hypothetical protein M271_25750 [Streptomyces rapamycinicus NRRL 5491] Blast taxonomy
and protein sequence AQA16328.1 acyltransferase [Streptomyces autolyticus] Blast taxonomy

Sequence from Streptomyces rapamycinicus NRRL 5491 having GCTCCCCGC_CCCGCGGGGATGGTCCCC
CATCGGCTGACCCTGCCGCGACACGACTGCTCCCCGCGCCCGCGGGGATGGTCCCCACATAAAGTCCGCG
CACGGAGTCTGCTCCCCGCGCCCGCGGGGATGGTCCCCCTCCGGCTTTGTCCCAGAGGTTGAAGAAGAGC
GGATGGTCCCAAGCCCCGACCACCTTCCGGGTTAGGGGGATGCTGCTCCCCGCGCCCGCGGGGATGGTCC
CCCGCCGGGCCGATGTTCAAGATCAGCGTGGTCTGCTCCCCGCACCTGCGGGGATGGTCCCATCCACGAC
TACTCAGAGCTGAGGGCGGAGACAGCAGTCTCCTGCTCCCCGCACCCGCGGGGATGGTCCCCCGACCGAT
TGCTCCCCGCACCCGCGGGGATGGTCCCCGCGAAACCGCGCCGCTCTCGCCGTCGCAGGACTGCTCCCCG
GTGGTCCCACCACGGTCATGGAGATCAAGGCCAAGTGCCGCTGCTCCCCGCACCCGCGGGGATGGTCCCC
GCTCCCCGGTCGGGGTCAGGATGATGAGCTGCTGCTCCCCGCACCCGCGGGGATGGTCCCCAAGATCGCA
CCTGCTGCTCCCCGCACCCGCGGGGATGGTCCCCGCAGCTGACCAGCCGCCTCACGCCGCGTCCGCTGCT
CCGCACGGCGCTCGGCTGGCGGTTCTTCAACGGCCTGCTCCCCGCACCCGCGGGGATGGTCCCCCCATGG
ATCACCTGGGTCAGGTCCGGCTCGCACTGCTCCCCGCACCCGCGGGGATGGTCCCCCGAAGGAAGCCCTG
NNNNNNNNNNNNNNNNNNNNGGGTGGTGATCATGCCCTCGGGCATGCGGATCTGCTCCCCGCACCCGCGG
GGATGGTCCCCACCGCAGGTCAGCTGGCGTAGCAGCTATGCCCTGCTCCCCGCACCCGCGGGGATGGTCC
ATGGTCCCGCCGCGCCCATCAGCGCGAGCACGTGCTCCTCCTGCTCCCCGCACCCGCGGGGATGGTCCCC
AAGATCGCATCACGGCCGCGCTCCAGCTCGGCTGCTCCCCGCACCCGCGGGGATGGTCCCTGGAGGCCCA
CGAACTGCTCCCCGCACCCGCGGGGATGGTCCCCAGCGGCGGCCGGACCGGCCCTTCCCCCGCTCCTGCT
GGTCCTCAGCGGTTCACTGTGGAGCGACTCGATATGCACTGCTCCCCGCACCCGCGGGGATGGTCCCCCC
GCGGGGATGGTCCCGAGGGCAGCATCCGCCAGCTCCTCGACGAGGGCTGCTCCCCGCACCCGCGGGGATG
GTCCCCACCACCAACCACCACACTCGAAGGAGATCAACTGCTCCCCGCACCCACGGGGATGGTCCCGTCG
CGCACCCGCGGGGATGGTCCCTCCCCATCACCGCCCGCCGTCCGTACGTGTCCGCTGCTCCCCGCACCCG
CGGGGATGGTCCCCATCCTTGACCGGCATCGCGTCCCCTCTCGCTCTGCTCCCCGCACCCGCGGGGATGG
CGTGGAGGACTGCTCCCCGCACCCGCGGGGATGGTCCCCCCGGGCGACAGCGACGGCCTGGCCGACGACC
CACCCGCGGGGATGGTCCCGCGCGCCGGCGGACATCGCGGCGGAACTTGGCCTGCTCCCCGCACCCGCGG
GGATGGTCCCCGTCGACCCGCTGACCCTCAACGAGAGCCGCGCTGCTCCCCGCACCCGCGGGGATGGTCC
CGGCTCGGCCTGTGGTCAGACGTCGCATGCCGTCTGCTCCCCGCACCCGCGGGGATGGTCCCCATCTCTT
CTCCCCGCACCCGCGGGGATGGTCCCACAACGGGCGCCGCAGCGGGCGGACTCGGGTCCTGCTCCCCGCA
CCCGCGGGGATGGTCCCCACCGCCACCGTCTGACCTCTCGAATTACCTCCTGCTCCCCGCACCCGCGGGG
and Blast taxonomy

Sequence from Streptomyces rapamycinicus NRRL 5491 having
CCACACCACCACC CCACCACACCACC CACCACCACCCCG CCACCACACCCAC ACACCACCCACAC ACCACACCACCAC
CACCACCACCCGC CACCACCCACCCC CCCCACCACCACC CCCCACCCCACCG CACCACACCCACC AACACCACCACCG
CACCACCACACCC CACCCACACCCCG CCACCCCCACCCC CCCACACCCACCG CCCACCCCACCGC CCCCACCACCCCC
ACCACACCCCACG ACCACCCACCACC ACCACCCCACCCG CACACCACCACCA CACCCACACCCTG CACCCCCACCCAC
CCACACCACCCCG CCACCACCACACG CCCACCACCACAC CCCCCACCCCACC
and Blast taxonomy
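
The strings listed above all have the same length (13 characters). How the list was selected is not stated here. As a minimal sketch, assuming that fixed-length substrings are counted with overlap and that those occurring more than once are of interest, the counting step could look like this. The function names and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def repeated_kmers(sequence, k, min_count=2):
    """Return sorted (kmer, count) pairs with count >= min_count."""
    counts = kmer_counts(sequence, k)
    return sorted((kmer, n) for kmer, n in counts.items() if n >= min_count)


if __name__ == "__main__":
    print(repeated_kmers("CACCACCACACCACC", 3))
```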

Sequence from Streptomyces rapamycinicus NRRL 5491 having
AACACACCCACACCC , AACCCACCACACCCT , CACCACACCCCACAC 
CACCACCACACCACC , CCACCACCCACCCCC , CCACCCACCCACCAT
CAGTTCGAAGTCGCCGTTAGGGCCACCACCCACCCCCTGCCCGTGCTCGGCGCGCTCCGCCAGTGCCCGG
CCCATCCTGCAGCCCTTCAAGCAGGCCATCAACGGGCTCACCTACCACCCACCCACCATCCCCCTCATCA
ACCCGTCCACTTCCACCCCGCCATCACCCACACCACACCCCACACCAGCATCTACCTCGAACTCGGCCCC
AACCCCATCCTCACCACCGCAACCCACCACACCCTCCACCACCACACCACCCAAAACCCCACCAACCGAC
CCCTGATGGACCCCATCCTGCAGCCCTTCAAGCAGGCCATCAACGGGCTCACCTACCACCCACCCACCAT
CCCCCTCATCAGCAACCTCACCGGACAACCAGCCGACGAACACATCACCACCCCCGACTACTGGACCCAA
CACATCCGCCAACCCGTCCACTTCCACCCCGCCATCACCCACACCACACCCCACACCAGCATCTACCTCG
AACTCGGCCCCAACCCCATCCTCACCACCGCAACCCACCACACCCTCCACCACCACACCACCCAAAACCC
CAGCGAGCTCACCTACCACCCACCCACCATCCCCCTCATCAGCAACCTCACCGGACAACCAGCCGACGAA
ACACCACACCCCACACCAGCATCTACCTCGAACTCGGCCCCAACCCCATCCTCACCACCGCAACCCACCA
CACCCTCCACCACCACACCACCCAAAACCCCACCAACCGACCAACCCCACTCATCACCTCCACCCTCACC
ACAACACACCCACACCCTCGCCCCAGCCGGTGCCGTCCGTGGCGGCTGCGAACGCCTTGCACCGGCCGTC
GTCACCAACAACACACCCACACCCTCCGCCCACCCCGTACCGTCCGCCGACGCCGAGAACGCCTTGCACC
GCACCACCGCCAACACCCGAAGCCCACGCCGCCGCGCCTCCGACAACCGCGTCACCAACAACACACCCAC
ACCCTCCGCCCACCCCGTACCGTCCGCCGACGCCGAGAACGCCTTGCACCGCCCATCGGCAGCCAACCCC
CCGAACCCCGCTGGGCGGCCTGCGACGCGGCCGACCGCAACGCCCTGGCCGAAACCCTGGCAGCGATCCC
CACCACCCACCCCCTCACCGCCGTGATCCACACGGCGGGCGTCCTGGACGACGGAGTGATCGGCTCGCTC
GGGCCCACCACCCACCCCCCGACCACCACGGTCCGCGAGGCGGTGCTGGCACTGTGCCGCGCCACCGGGA
and Blast taxonomy

Ref : Genome Announc. 2013 Jul-Aug; 1(4): e00581-13.
Draft Genome Sequence of Streptomyces rapamycinicus Strain NRRL 5491, the Producer of the Immunosuppressant Rapamycin
Damir Baranasic, Ranko Gacesa, Antonio Starcevic, Jurica Zucko, Marko Blazic, Marinka Horvat, Kresimir Gjuracic, Stefan Fujs, Daslav Hranueli, Gregor Kosec, John Cullum, and Hrvoje Petkovic


Sequence from Pseudomonas aeruginosa PAO1 chromosome having
CGGCCATCAGCATGCCGG GCGGCCATCAGCACCG CATCAGCATGCCGCG
CCATCAGCATGCCGG CATCAGCATGCCGG CGCGGCCATCAA GCGGCCATCAGCCA 
GGCGGTTTCCAA CGCGGCCATCCAG CGGCCATCAGGCA CGGCCATCAGGCC 
CCGCGGCCATTC CGCGGCCATTCA TCCATCATA
TGGCGTTCAGCAGTGGGCCGACGGCGGCCATGGAGAGCCGCAGGTTGAGGCCGACCAGGACGATGGCGGC
CATCAGCCAGAGGGTGCGGCCGGGTGGGTTGGCAGTGGGCATGGCGATCTCCTTTCGGGAGCACCATTGC
GCCGGAGACGAACACTACCGGGATGTTCAGGCGCAGCGCGGCCATCAGCATGCCGGGGGTGATCTTGTCG
CGGCATCCGGTCGGCCATCAGCATGCCGCGGGTCCAGGCGTCCGGCAGGCCGTCGAAGGCCAGCCGGCGC
GCTCGGCGCCCAACATCCGCGCGCAGGCGGCGGCCATCAGGCCGACCGGGCCGGCGCCGTAGATCGCCAC
TCACCCTGGTCGCCAGCGCTGCGCCGCGCTGCATGGCCGACGCGCGGCCATTCATGGCCTTCATGGAAGC
TCCCTCGACCTTGTAGCCGCGGCCATCCAGCAGCAGTTGCGCGCCCTCCTCCACGCCCTGGCCGATCAGG
ACCCCGGGCGCTCCATCATATGCGCAAGCCATGTGACTTGGCGCCTCCCTGGCGCGCCGGGGGAATGTGC
GACCGGAATCTTGTGCGCGTTGAACATCAGGATGTCCGCGGCCATCAGCACCGGATAGCTGTAGAGGCCC
GAGCCCCAGCCCTCGATAACCGGGATCGACCACCAGGTGCTGGAGGTAACCACGGCGACCGTCATGCCCG
GCCATCAGGCAGGCGATCACCTCCCCTTCGGTTTCGACCAGCAGGCTCAGCCCGGGGTTGCGTTGAAGGT
CCGGAACCTGAACGCGGCCATCAACAACGCCAGCGCCCATGGCGATGTCAGCCTGCAAGCCGGTCGCTAC
TCGATCTTTCCGTCCAGCTGGTCCAGGCCCTGGGCGGCGGTTTCCAACCCGATTCCCGCTCCGCTGCGCT
CCGCGGCCATTCGGGCAGCACCGCCCAGGCTATGAGCAGCCCCAGCACGATCCGTTTCGCAGGCTTCGGC
and Blast taxonomy
and protein sequence (.) dgaaaqklarhsaarlhregyld Blast taxonomy
and protein sequence (..) addlrrasqdapakatar Blast taxonomy

Sequence from Mycobacterium tuberculosis H37Rv having
CGGCGGGGCCGGCGG GGCGGGGCCGGCGGC GGCGGCGCCGGCGGC
GCCTCAGCGGTAACGCCAACGGCGGGGCCGGCGGCGACAGCGGCCGTGGCGGCACGGGCGGCGCCGGCGG
CGAGGGCGGCGCCGCCGGGCTGCTGGTGGGCACCGGCGGGCACGGCGGTGACGGCGGGGCCGGCGGCGCC
GGCGGGGCCGGCGGCGCTGGCCGCGGTCTATTCCTTGGCCTGGGCGGTGATGGCGGCGCCGGCGGCACCA
and Blast taxonomy

Sequence from Homo sapiens BAC clone RP11-435D24 from 7, complete sequence , GenBank: AC017075.8 having AGTTTCTGAG and Blast taxonomy
Sequence from Homo sapiens BAC clone RP11-435D24 from 7, complete sequence , GenBank: AC017075.8 having TCTAGTTTTTATGTGAAGATATTTCCTTTTC and Blast taxonomy
Sequence from Homo sapiens BAC clone RP11-435D24 from 7, complete sequence , GenBank: AC017075.8 having GAGAATGCTTCTGTCTAGTTTTTATGTGAAGATATTTCCTTTTCTACCAGAG and Blast taxonomy
Sequence from Homo sapiens BAC clone RP11-435D24 from 7, complete sequence , GenBank: AC017075.8 having ATGTGAAGATATTTCCTTTT and Blast taxonomy
Sequence ATGTGAAGATATTTCCTTTT from Homo sapiens BAC clone RP11-435D24 from 7, complete sequence , GenBank: AC017075.8 and Blast taxonomy
Ref : Croat Med J. 2003 Aug;44(4):386-406.
Key-string algorithm--novel approach to computational analysis of repetitive sequences in human centromeric DNA.
Rosandic M, Paar V, Gluncic M, Basar I, Pavin N.


Sequence (.) from gi|568801956|ref|NT_011786.17| Homo sapiens chromosome X genomic scaffold, GRCh38.p2 Primary Assembly HSCHRX_CTG14 , having :
CTT , TGA , GAA , CCT , TCC , TTG , CTA , TGT , AGA , CCA , CAG , TTC , CAC , ATG , CAT , CTG , ACC , GAG , AGC , CAA , GTG
and Blast taxonomy
and the following sequence(s) from (.)
TCCTCCTGTGAAGATCCTTCAGGTGGGTCCAGGTCTTCGTCCTCCTGTGAAGATCCTTCA
GATGAGTCCAGGTCTTCGTCCTCCTGTGAAGATCCTTCAGATGAGTCCAGGTCTTCGTCC
TCCTGTGAAGATCCTTCAGATGAGTCCAGGTCTTCGTCCTCCTGTGAAGATCCTTCAGAT
GAGTCCAGGTCTTCGTCCTCCTGTGAAGATCCTTCAGCTGAGTCTAGGCCTTCGTCCTCC
Blast taxonomy
GACGTAGGCTGGCGTGGGCTCTTCCCCCAGCCCCTTGCCGGTGCTGCCACGTGAGAAGGG
AGGCTGGCGTGGGCTCTTCCCCCAGCCCCTTGCCGGTGCTGCCACGTGAGAAGGGCCCGG
GTAGGCTGGCGTGGGCTCTTCCCCCAGCCCCTTGCCGGTGCTGCCACGTGAGAAGGGCCC
Blast taxonomy
ACCATGTTGGCCAGGATGGTCTCGATCTCCTGACCTTGTGATCCACCTGCCTCAGCCTCC
CACCATGTTGGCCAGGATGGTCTCGATCTCCTGACCTTGTGATCCACCTGCCTCAGCCTC
CACCATGTTGGCCAGGATGGTCTCGATCTCCTGACCTTGTGATCCACCTGCCTCAGCCTC
CACCATGTTGGCCAGGATGGTCTCGATCTCCTGACCTTGTGATCCACCTGCCTCAGCCTC
CACCATGTTGGCCAGGATGGTCTCGATCTCCTGACCTTGTGATCCACCTGCCTCAGCCTC
CACCATGTTGGCCAGGATGGTCTCGATCTCCTGACCTTGTGATCCACCTGCCTCAGCCTC
Blast taxonomy

Sequence having ATTAGCCGGACATGGTGGC from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7
CACAGTATAATCCCATCTCTACTAAAAATACAAAAAATTAGCCGGACATGGTGGCGTGCGCCTGTAGTCC
CATGGTGAAACCCTGTCTCTACAAAAATACAAAAATTAGCCGGACATGGTGGCAGGCGCCTGTAATCCCA
TGAGGTCAGGAGCCCAAGACCAGCCTGACCAACATGGTGAAAACCCATCTCTACTAAAAATACAAAATTA
GCCGGACATGGTGGCACAAGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATTGCTTGAACC
AGTGAAACCCCATCTCTACTAAAAATACAAAAAAATTAGCCGGACATGGTGGCGGGCACCTGTAGTCCCA
ACTAAAAATACAAAAATTAGCCGGACATGGTGGCAGGTGCCTGTAATCTCAGCTACTCTGGAGGCTGAGG
GTTCAAGACCAGCCTGACCAACATGGAGAAACCCCGTCTCTACTAAAAATACAAAAATATTAGCCGGACA
TGGTGGCACATGCCTGTAATCCCAGCTACTCAAGAGGCTGAGGCAGGAGAATCGCTTGAACCCAGGAGGC
TCACTGCACTCCAGCCTGGGTGACAGAGCAAGATTCTGTCTCAAAAAAAAAAAAAAAAAAAAAAAAATTA
GCCGGACATGGTGGCGGGTGACTGTAATCCCAGCTACTCATGAATCTGAGTTTTGAGGATTGCTTGAGCC
GGCCAACATGGCAAAACCCCGTCTCTACTAAAAATACAAAAATTAGCCGGACATGGTGGCGCAGGCCTGT
and Blast taxonomy

Sequence from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 having at d=73 , l=9
ACGACGCTT CCGTACGAC CGACCCGAA CGACGTTCA CGTACCGTG  
CTCGAACGT TGCGAACCG CCGAACCGA CCGTACCGG CGACGATTC  
CGTAACTCG CGTACGATA TCGAACGAC TCGAACGTT TACGACCGA
AGTCTTTGGGCTGCCCACAGACCGATGCCTGGCACTGTGTACTCTCGAACGACGGACACGTTCAGCATTC
AGCCGCCGCCTCAGGGACACGGCCCCCGGGACGACGCTTCAGTGGGGCGCCGCGCCCTGCCCGCCGCTGC
TCATACATATAGTTTGGAGGAACCTCCAAAAATTAAAAATAGAATTACCGTACGACCCAGCAATCCCTCT
GCAACCCTGAATGGACGCCGTCCTGCTTCCTATACGACCCGAAGTGTTTAGTGACCCCCTGGCTTTCACT
TAATAAAGGGTATAGGAGTTTACAACTCAAAACAATTCGTAAGAACTATCAGATACGTACGATAAATCAC
AAGAGATACCATTATACCGAACCGAGAACAGCTAAAAATAAAAAGACTACTAATAATCTCATTTATATAC
AACAGATGACTCGAACGACCCTCTTGACTTTCATTTAAGTTCTGTGACCAGAGATTTCCTGTGGTTAATG
TGCAGGTCCCAAGCTCCCGTGTCCCAGTGATGATGCCCCTGTAGGGACACATTTTGAGCAGCAAGACCCC
GTACCGTGGGTCTCACTGTACAGTCTGCTTGTACATATCTTAGTCTCATTTTATAAAAAGAAAGAAATAG
ATGGTCCACTAAACCATAGCAATTAACCTTGGACTTCTCCTTGGATGTCAGCTGGTGACGTAACTCGGTA
TGCAGGCTATTAATGCAAATCATATGCACCGTACCGGCATGCTGAATTATGCAATGCCATGTGAATGTCT
CTTAGCCTTCATGTCCGAACCCTGACCCCGTACCGTGGTCTGTGGATGTTTTGGCTGTCTTGGGAGAGCG
GCGTGCATGCGTGCACACACACACGTAACTCGTTTTTTATAATACACAACCATCCTATTAACATGAGCTA
GATTACAGGCATGAGCCATCATGCCTGGTCTCGAACGTTCTCTTGAGGAAGCAATATGCACAACTAGAAA
TGCATGTAAACATCGCCTCTAAGCCACCGGAGGAGCCCTTGAAGCCTTGTCATCTCATCTTTCGAACGTT
CTGCAGTTTTGCTTGTTCCGCTTCTCTTGGAGGTATCAATCTGTTTTTCTCTTATTAATAGAGTTCAGAG
GAACGGGAGGCCGGGGAATGCGAACCGGCGCAAACTCTCGAGGCGCAAACTCTCGAGGCGCAAACTTGGC
AACAGTAACTCCCCATTCCTCCTTTCTTCCAATCCCTGGTAACGACGATTCTACTTTCTGCCTCTCTGAA
TAGATAGATGGAGCCCGACCCGAAAAAAAAGGAAGAAAGGAAGAAAGAGAAAGAAAGAAAAGAAAGGAAG
AGAAGTTAATCTCCTTCCTAGACGACGCTTAGTAAGAGCTACAGCCAGAATTAGCATCAGGTCTCTAAGA
ACAGAACGACGCTTTATCCACAGAGAAAAGAGATAGAGCTAGATCAGCTTCAATAATGAGCCTGTCTCTC
ATGTTTTCTGGGTCTTGGTAAATAACAAACCGGGCATTCGACGTTCAATCCAAATCTAGATCAACGACTT
GAGTGTGGCCCCATCAAAGACTGGAGCGACGTTCACTGAAATGATACAAGACCAGCAGGGGCGCAGGGCG
AAAGGCCCAGCGAGCCCAGCACTCAGCGGAACCTGTAGAAGCGTCGGGGGTTGCCCCGCTTCTGCGAACC
GCCCTGGGATGCGAGGTTCAGCGTCACTATCTCATCCTGGGAAAGGATGAGATCCGGGTGGGCCGAGGCT
CCGCTCAAGCTCTGTTCCCTGGGGAAGAAACCTGGAAAGTGCGAACCGCGCGTCGGGACCCAAGCGTCGG
GGTGTTGTGTGCCCTTGACGTCAGCCCGTACCGGCTCCGCCTCCGGGCGAGTTGCGACATTTTCAGTGCT
CACTGAACAGTTACGACGTTCAGACATCAAACTCATTCAAATATTACAGCCTTGATGTAAGGCACAAGTT
CATAACTGCTGTTAGGGGGTGTTGGACGACGATTCTTTCTGGCTACTTCCTGCTGAAAAGGGGCGTCGTG
GGGGCCATCCTGTGCACCGTACGATACTGAGCAGTACCCCGACCTCTACCCACTGGAGGCCAGCAGCAGC
TGTGCAGGGCCAGAGTGGCAGTCTTGAAAGGGCTGAGGAAGTGGAAGGGAGAGGGGGCGACCCGAAAGGG
ACAGGACGGAACGCTGGTAAGAGGAAATCCATCACCTCTCCTCTCCCTCTCCCAACCCTACTGTCCCAAC
CGTACGACCGACACCCACTACCCAGCTCCTTCACCACCAGGGAAGCCTCCCTCCTTCAGCTGCCATCCTT
CCAGACTCACCTGCCTTCTCATGCTTTACTCTTTCCTTTATTCAGGTGCAAGCCGTACGACTGCTGAGCT
AATACATCATGCCATTGCCTTAATAAGAGCTTATATTTCTAAGTCTTCTGAATTTTGCACGTACCGTGCA
TTTTTGTATTTTCAGTAGAGACAGGGTTTCACCATGTTGGCCAGCCTGGTCTCGAACGTCTGACCTCAGG
TTGGTATTCATGCCCTTGCATAATCAGCTCCTCTCGAACGTGGGCTGGACCTAGTGACACACTTGTACTG
ACAAAAGGAAGAGATCACCGAACCGAAGGAGTCTTAGAACATGTTATTGATCATATGGTCTCCCAGTCCT
and Blast taxonomy

Sequence from NC_000006.12 Homo sapiens chromosome 6, GRCh38.p7 having at d=73 , l=15
AACAACAAACAAACA AACTCCTGAGTTCAA AAGTAGCTGGGATTG AATATATAAAGAACT ACAAAGAAATGATAA
ACACACACAGAAATA ACATGCTACAACATG AGAAACAAACAAACA AGCACAGCCACCAAC AGCCTGGGAAACATG 
CCACTGTGGAAAACA CCCCCGAGTAGCTGG CCGAAGTAGCTGGGA CTAAGAACTTGCTTT GCCTGGGAAACATGG
GGGGAATATCACACT GGTAGTGCACACCTG TGCCATAACAAAGTA TGCCTGCCTGCCTGC TTATATTTATTTATT 
TTCATGCCACTGCAC
and Blast taxonomy





To Borce Dzinleski

These routines were written by Dzinleski Jasenko (jasenko17@gmail.com), the author of C/C++ based routines for encryption/decryption and large number operations, the 123SQL database engine and the simplified mariaBasic interpreter, all of which are ongoing projects. This project is self-financed and any contributions are welcome.
This site is the result of years of support from Borce and Dusica Dzinleski and is devoted to them, and especially to my daughter Maria Dzinleska and to the faith of Nada Popstefanova. The author is currently seeking a developer's job and this is his cv.

References(1)
GINN AND COMPANY
A Xerox Company
Waltham, Massachusetts - Toronto - London

TOPICS IN ALGEBRA

I.N.HERSTEIN


Lemma 2.22 Every permutation is the product of its cycles .


References(2)
THE MERIT FACTOR PROBLEM

PETER BORWEIN, RON FERGUSON, AND JOSHUA KNAUER

Abstract

The merit factor problem is of considerable practical interest to communications engineers and theoretical interest to number theorists. For binary sequences, although it is generally believed that the merit factor is bounded, it still has not been completely established that the number of even length Barker sequences, each with merit factor N, is bounded. In this paper, we present an overview of the problem and results of quite extensive searches we have conducted in lengths up to slightly beyond 200.

The merit factor, F, of the sequence relates energy in the sidelobes to energy in the main lobe, F = N^2 / (2E)

References(3)
Discrete Applied Mathematics 155 (2007) 831-839
Binary templates for comma-free DNA codes, by Oliver D. King and Philippe Gaborit:

Abstract
Arita and Kobayashi proposed a method for constructing comma-free DNA codes using binary templates, and showed that the
separation d of any such binary template of length n satisfies d <= n/2. Kobayashi, Kondo and Arita later produced an infinite family
of binary templates with d >= 11n/30. Here we demonstrate the existence of an infinite family of binary templates with
d >= n/2 - (18 n log_e n)^(1/2). We also give an explicit construction for an infinite family of binary templates with d >= n/2 - 19 n^(1/2) log_e n.

2006 Elsevier B.V. All rights reserved.

References(4)
In DNA Sequence Design Using Templates (Ohmsha, Ltd., 2002), by Masanori Arita and Satoshi Kobayashi, published by SpringerLink (15 February 2002), a paper selected to receive the New Generation Computing Award for Distinguished Papers, the authors state:

Abstract
Sequence design is a crucial problem in information-based biotechnology such as DNA-based computation. We introduce a simple strategy named template method that systematically generates a set of sequences of length l such that any of its member will have approximately 1/3 mismatches with other sequences, their complements, and the overlaps of their concatenations.

References(5)
Proc. of The Fifth Int. Workshop on Frontiers in Evolutionary Algorithms (FEA 2003) under JCIS 2003 Cary, NC, USA, September 26-30, 2003
Reliable Cost Predictions for Finding Optimal Solutions to
LABS Problem: Evolutionary and Alternative Algorithms

Franc Brglez1, Xiao Yu Li1, Matthias F. Stallmann1 and Burkhard Militzer2
1Computer Science Department, NC State University, Raleigh, NC 27695, USA
2Lawrence Livermore National Laboratory, Livermore, CA 94550, USA


Abstract

The low-autocorrelation binary sequence (LABS) problem represents a major challenge to all search algorithms, with the evolutionary algorithms claiming the best results so far. However, the termination criteria for these types of stochastic algorithms are not well-defined and no claims have been made about optimality. Our approach to find the optima of the LABS problem is based on (1) experiments with problem sizes for which optimal solutions are known, (2) an asymptotic analysis of statistics generated by such experiments, (3) reliable predictions of the cost required to find optimal solutions for larger problem sizes. The proposed methodology provides a well-defined termination criterion for evolutionary and alternative search algorithms alike.



IMPLEMENTATION

  • Bit Parity Compression
  • This is a tiny portable compression based on data 24-bit word bit parity distances .

    Bit parity complement codes for n = 1-65536. The following bit-parity table was produced by 16-bit permutations of 12, 48, 60, 192, 204, 240, 252, 768, 780, 816, 828, 960, 972, 1008, 1020 and 3.



    16-bit parity table

    21337 0101001101011001
    21341 0101001101011101
    23369 0101101101001001
    28952 0111000100011000
    29021 0111000101011101
    29465 0111001100011001
    29517 0111001101001101
    29960 0111010100001000
    30045 0111010101011101
    30477 0111011100001101
    31517 0111101100011101
    31561 0111101101001001
    32012 0111110100001100
    32029 0111110100011101
    32092 0111110101011100


    2-byte coding based on this table is in 16_table_coding_min_max__ and practical WAV scrambling is in wavscr_16_16b_min_max_.
    The 240-entry 16-bit parity table, and generator code comparing any file's 2-byte data against the 240-entry 16-bit parity table threshold bit entries.


    Illustration of, and generator code for, data coding using the bit parity table.
    Below: transformation of a 16-bit, 8 kHz WAV file into a white-noise WAV using the 240-entry bit parity table. The upper chart shows the transformed values; the lower chart shows the merits of the transformed values, ranging from -1.95496 to 6.35886.


    Merits chart from simple min-max in the transformation of an 8 kHz WAV file into a white-noise WAV using the 240-entry bit parity table; merits range from -1.66785 to 10.39834.


    The 128 entries 16-bit parities table and
    the 131 entries 24-bit parities table
    followed by a binary distances permutation chart , 2 and 4 bit complementary binary distances : 0 3 48 51 .


    Below: 1152 24-bit table coding (partial) chart.



    24-bit parity table

    16676387 0 111111100111011000100011
    16676386 1 111111100111011000100010
    16675364 2 111111100111001000100100
    16675360 3 111111100111001000100000
    16413730 4 111110100111010000100010
    16412705 5 111110100111000000100001
    16412704 6 111110100111000000100000
    16397315 7 111110100011010000000011
    16397314 8 111110100011010000000010
    16396289 9 111110100011000000000001
    16396288 10 111110100011000000000000
    16156144 11 111101101000010111110000
    15625775 12 111011100110111000101111
    15625770 13 111011100110111000101010
    15624749 14 111011100110101000101101
    15624744 15 111011100110101000101000
    15362088 16 111010100110100000101000
    15346699 17 111010100010110000001011
    12473890 18 101111100101011000100010
    12472865 19 101111100101001000100001
    12472864 20 101111100101001000100000
    12456453 21 101111100001001000000101
    12456452 22 101111100001001000000100
    12210213 23 101110100101000000100101
    12194822 24 101110100001010000000110
    12194818 25 101110100001010000000010
    12193793 26 101110100001000000000001
    12193792 27 101110100001000000000000
    11806144 28 101101000010010111000000
    11422248 29 101011100100101000101000
    11406922 30 101011100000111001001010
    11405832 31 101011100000101000001000
    11160618 32 101010100100110000101010
    11143180 33 101010100000100000001100
    11143177 34 101010100000100000001001

    Below: transformation of a 24-bit, 8 kHz WAV file into a white-noise WAV using the 24-bit bit parity table. The upper chart shows the transformed values; the lower chart shows the merits of the transformed values, ranging from -2.65405 to 4.57071.




    17.03.2016 VRM 2.0.1 Download Bit parity compression File mbpc_201.zip
    05.11.2016 VRM 4.0.1 Download Bit parity compression File bpc_401.zip
    06.11.2016 VRM 4.0.2 Download Bit parity compression File bpc_402.zip
    22.12.2016 VRM 4.0.3 Download Bit parity compression File bpc_403.zip

    15.03.2016 VRM 2.0.0 Download files packaging utility mls_200.zip



  • RIFF(WAV) COMPRESSION (principia example)
  • This is a binary factor number merit scale routine implementation on a 16-bit, 8000 Hz, 64 kbps WAV recording. The output file out_.wav illustrates the merit numbers scale implementation.

  • BYTE ORDER COMPRESSION
  • The mdiff and boc routines use byte order compression by indexing (re)occurring 2-byte order pairs. The routine passes each data byte and the next data byte through this C/C++ expression: ((((b8e&<data byte>)>>1)^((b8e&<next data byte>)>>1))<<1)|((b8o&<data byte>)^(b8o&<next data byte>)) where b8e is an 8-bit even bit mask and b8o is an 8-bit odd bit mask. Each 2-byte order (sequence) is thus truncated into a single byte field. Two such resultant fields, one from byte 1 and byte 2 and one from byte 2 and byte 3, form a single 16-bit dictionary entry. The compressed data is then the index number order of the (re)occurring entries. Index entries are bit truncated when written to the compressed data output file. The compression dictionary produced while compressing a typical ASCII data file, e.g. source code or HTML code, is relatively small, and average compression gains in such files are good.

    12.09.2008 VRM 1.1.0 Download File e1173.zip

  • 6-bit BINARY COMPRESSION
  • This is a fast compression routine based on translating a 4-byte data word via a 6-bit bit parity table. The 6-bit table entries were computed by omitting the last two bits (having decimal values 1 and 2) of a single byte (0 to 255), giving 64 index entry combinations per data byte. Also, all truncation entries from a 4-byte data word have only 256 combinations. The four 6-bit (truncated) dictionary entries are (re)indexed and written to a compressed data output file in a truncated index number order, with binary lengths from 7 to 18 bits compared to the original 32-bit data word length. Useful for compressing large document data files (binary and text data documents). The compression gain in ASCII data files is evident due to the short binary length of the index number entries.


    30.03.2010 VRM 3.5.1 Download File mar6_351_1.zip
    08.10.2012 VRM 3.6.0 Download File mar6_360_1.zip

  • BINARY COMPRESSION 79
  • This is a binary compression based on 2-byte data word right (low) bit truncation, dependent on the maximum of dictionary (re)occurrences for each data buffer. Such (truncated) dictionary entries thus represent most of the input data buffer. Dictionary entries are written to the compressed data file in a left-truncating manner with the leftmost significant bit omitted, having 7 to 16 bits in length. This method performs fast and efficient data storage. This routine applies the principia used in the methods listed below.

    14.08.2010 VRM 3.1.3 Download File mar79_313_1.zip
    18.08.2010 VRM 4.0.1 Download File mar79_401_1.zip
    20.08.2010 VRM 4.1.1 Download File mar79_411_1.zip
    06.02.2011 VRM 5.0.1 Download File mar79_501_2.zip
    16.02.2011 VRM 5.1.0 Download File mar79_510_2.zip
    12.06.2011 VRM 5.3.0 Download File mar79_530_1.zip
    29.06.2011 VRM 5.3.1 Download File mar79_531_1.zip
    15.06.2011 VRM 5.4.0 Download File mar79_540_1.zip
    25.06.2011 VRM 5.4.1 Download File mar79_541_1.zip
    20.12.2012 VRM 5.4.3 Download File mar79_543_1.zip
    20.12.2012 VRM 5.3.3 Download File mar79_533_1.zip

    mls_200 is a file packager useful in conjunction with mar79_510_1
    To create a filesystem subtree archive : mls_200 -c (source) drive:\path...\
    To restore a mar archive : mls_200 -r (target) drive:\path...\

    15.03.2016 VRM 2.0.0 Download files packaging utility mls_200.zip

  • BINARY TEXT COMPRESSION
  • This is a fast and efficient compression example that executes fast input data indexing and dictionary occurrence search based on binary 4x4-bit long data samples . Indexed sequences are checked vs variable data length buffer .
    Thus this compression method gains speed concerning strict 4x4(16) - bit long dictionary patterns . This routine is subject of further development .

    04.09.2007 VRM 1.3.3 Download File mar9.zip

  • BINARY COMPRESSION ROUTINE
  • Binary compression methods are widely used in communications, data storage and numeric analysis. Exploring the numeric sequences of genetic complexity employs such methods. Some of them are presented on this site together with command-line Win32 implementations that demonstrate compression of large ASCII data files and binary files and that are also used, slightly modified, in numeric data sequence analysis. This binary compression method is based on the numeric sequence generated by the following binary formula, given in C/C++ syntax: #define op_7(x,y)(((x+y)^y)|(((x&y)!=0)?(x&y)/y:0)) . This numeric sequence represents all numbers from 0-255 (8-bit) for 0-127 (7-bit) arguments in an x-y matrix manner. When x=y always and x ranges over 0-127, it results in all 8-bit odd numbers. When applied to a 2-byte data sequence it results in an index 14 or fewer bits long. Combined with a 1-bit subtract indicator, this allows compression. Using these arguments as dictionary entries coded by hi/lo/length indicators, whose reoccurring indexes are stored instead of the input data, gives an average compression gain of 30% in large ASCII text files. This numeric sequence formula was generated by another routine written for the purpose of exploring numeric sequence generation. This is a Win32 command-line compression tool based on binary compression. This example demonstrates the speed and efficiency of this static compression method for large ASCII files.

    04.09.2007 VRM 1.3.3 Download File mar.zip

  • BINARY COMPRESSION 77
  • This is a binary compression based on binary shifting concatenation (bit density increase) of 2-byte long data into dictionary entries that are left truncated (common in ASCII data text files). Compression gain depends inversely on data redundancy: the larger the entropy, the larger the compression gain.

    04.06.2008 VRM 1.1.0 Download File mar77.zip

  • BINARY FACTOR GROUPING COMPRESSION ROUTINE
  • This compression example uses binary pattern indexing by 2-byte sequence bit truncation from 16-12 bits in order to gain max of dictionary occurrence . This compression method is a compression gain vs unoptimized compression speed compromise .
    This example states the correctness of the genetic text complexity display routine since its dictionary covers most of the numeric sequences occurrence . Yet this compression example is subject of further development .

    21.09.2007 VRM 1.4.0 Download File mar73.zip

  • ASCII TEXT FILE FAST SORT/INDEXING Routine
  • This is a fast sorting/indexing example that builds a file position sorting tree as a result of n-depth text file line byte sorting. The sorted sequence tree may expand to further depth levels; this routine uses a default depth of 6. It exhibits fast sorting of a text file up to a size of 100K lines/rows.
    E.g.: C:\msort3 -f "War and Peace NT.txt"

    30.10.2007 VRM 1.3.1 Download File msort3.zip
    07.01.2014 VRM 1.3.1 Download File msort4_131.zip
    21.01.2014 VRM 1.4.0 Download File msort3_140.zip

  • MCALC Simple Calc routine
  • This is a simple CALC screen routine .
    The -d2 or -d4 or -d6 command line switch stands for number of decimal places . Keyboard input exit char is q and reset char is c .

    09.10.2009 VRM 1.0.1 Download File mcalc.zip

  • MDUMP ASCII Text File Sequence Redundance Dump
  • This is an ASCII text file dump method that finds text data sequences inside an ASCII text file. Useful for creating text file data sentence definition(s).

    18.09.2009 VRM 2.0.1 Download File mdump3.zip

  • MDADD STRING DATE ADD
  • This is a string date add routine that adds an increment in days, months and/or years to a start date. The command line requires the start date string and the increment string together with a date formatting string, e.g. mdadd 20081008 00010000 YYYYMMDD (for adding 1 year).

    16.11.2008 VRM 1.1.1 Download File mdadd.zip

  • MDDIFF STRING DATE DIFFERENCE
  • This is a string date difference routine that computes the difference between two date strings in days, months and years. The command line requires two date strings together with a date formatting string, e.g. mddiff 19591117 20081008 YYYYMMDD.

    08.10.2008 VRM 1.1.0 Download File mddiff.zip

  • MDIFF FILE COMPARE
  • This is a file compare routine based on a bit parity comparison of 2-byte sequences. Hashing is built on sequenced use of this C/C++ code ((((b8e&<data byte>)>>1)^((b8e&<next data byte>)>>1))<<1)|((b8o&<data byte>)^(b8o&<next data byte>)) where b8e is an 8-bit even bit mask and b8o is an 8-bit odd bit mask, and files are examined by comparing byte sequence group counts (file 1 vs file 2). The command line requires the file 1 name and the file 2 name and produces a fast comparison result message. The -d (detail) switch displays all differing lines. Useful for file compare and change tracking in document and source code files.

    26.05.2010 VRM 3.0.1 Download File mdiff_301_1.zip

  • BIT PARITY BYTE ORDER FILE CHECKSUM
  • This is a file fingerprint routine that outputs a file of checksum number(s). Hashing is built on sequenced use of this C/C++ code ((((b8e&<data byte>)>>1)^((b8e&<next data byte>)>>1))<<1)|((b8o&<data byte>)^(b8o&<next data byte>)) where b8e is an 8-bit even bit mask and b8o is an 8-bit odd bit mask. The sequenced number threshold count is computed by comparing the original byte (bit parity) result with the generated result. Number points are computed inside a 1024-byte file buffer and stored consecutively (XOR op) inside a 512-number sequence. The output fingerprint file numbers state data order and integrity, e.g. when compared with the checksum(s) of the same file (a copy from a restore or a data transfer). The command line requires the input filename and the checksum output filename. Useful for building checksum lists of important data archives.

    25.03.2009 VRM 1.3.0 Download File bp_boc3.zip
    09.01.2013 VRM 5.2.0
    Download msearch , mgrep , mdiff , boc , mar79 and mls in bp_tools_520_1.zip
    10.05.2013 VRM 5.3.1
    Download msearch , mgrep , mdiff , boc , mar79 and mls in bp_tools_531_1.zip


    THE RANDOM KEYS DISTRIBUTION ENCRYPTION ROUTINE

    This is a strong encryption method based on a random number distribution over 4 number keys. The 4 (5-digit) number keys provide strong encryption protection due to message hashing based on random number generation in which the entered keys are used as random seeds. Each user-chosen key is randomized and the message hash is produced with a different randomizing method. Execution requires the following command line switches:

    eg r71 -a 11111 -b 22222 -c 33333 -d 44444 -e < filename to encrypt>
    and to decrypt eg r71 -a 11111 -b 22222 -c 33333 -d 44444 -f < filename to decrypt>


    where the numbers following the -a -b -c and -d switches are user chosen encryption 5 digit number keys.

    1.1(min)...
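    minmv=999;
    for(l=0;l<rsi;++l)
    {
        /* start (or restart) a run at the current value */
        if(n==0||l==0){n=rs[l][1];continue;}
        /* value within +/-1 of the previous one: the run continues */
        if(n==rs[l][1]||n+1==rs[l][1]||n-1==rs[l][1]){n=rs[l][1];}
        else
        {
            /* run broken: track the minimum of column 2, then reset */
            if(minmv>rs[l][2]){minmv=rs[l][2];minl=l;}
            n=0;
        }
    }
    if(df){printf(" %d",rs[minl][0]%outm);}
    htable[hti_dmin][0]=rs[minl][0]%outm;++hti_dmin;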
    ...

    1.2(max)...
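    maxmv=0;
    for(l=0;l<rsi;++l)
    {
        /* start (or restart) a run at the current value */
        if(n==0||l==0){n=rs[l][1];continue;}
        /* value within +/-1 of the previous one: the run continues */
        if(n==rs[l][1]||n+1==rs[l][1]||n-1==rs[l][1]){n=rs[l][1];}
        else
        {
            /* run broken: track the maximum of column 2, then reset */
            if(maxmv<rs[l][2]){maxmv=rs[l][2];maxl=l;}
            n=0;
        }
    }
    if(df){printf(" %d",rs[maxl][0]%outm);}
    htable[hti_dmax][1]=rs[maxl][0]%outm;++hti_dmax;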
    ...

    2.1(min)...
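    minmv=999;
    for(l=0;l<rsi;++l)
    {
        /* start (or restart) a run at the current value */
        if(n==0||l==0){n=rs[l][1];continue;}
        if(n==rs[l][1]||n+1==rs[l][1]||n-1==rs[l][1])
        {
            /* run continues: track the minimum of column 2 inside the run */
            n=rs[l][1];
            if(minmv>rs[l][2]){minmv=rs[l][2];minl=l;}
        }else{n=0;}  /* run broken: reset */
    }
    if(df){printf(" %d",rs[minl][0]%outm);}
    htable[hti_rmin][3]=rs[minl][0]%outm;++hti_rmin;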
    ...

    2.2(max)...
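    maxmv=0;
    for(l=0;l<rsi;++l)
    {
        /* start (or restart) a run at the current value */
        if(n==0||l==0){n=rs[l][1];continue;}
        if(n==rs[l][1]||n+1==rs[l][1]||n-1==rs[l][1])
        {
            /* run continues: track the maximum of column 2 inside the run */
            n=rs[l][1];
            if(maxmv<rs[l][2]){maxmv=rs[l][2];maxl=l;}
        }else{n=0;}  /* run broken: reset */
    }
    if(df){printf(" %d",rs[maxl][0]%outm);}
    htable[hti_rmax][3]=rs[maxl][0]%outm;++hti_rmax;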
    ...

    (1) For each of the entered key numbers, the resultant distribution series (3-133)*(3-7) according to these criteria is written in a 4-column table.
    (2) Each table is hashed according to the binary criteria listed below.
    (3) The 4 resulting tables are then re-hashed using the same binary criteria.

    #define op_A(w,x,y,z)(((((w&0x0000ffff)<<16)|x)&0xffff0000)|((((y&0x0000ffff)<<16)|z)&0x0000ffff))
    #define op_B(w,x,y,z)(((((x&0x0000ffff)<<16)|w)&0xffff0000)|((((z&0x0000ffff)<<16)|y)&0x0000ffff))
    #define op_E(w,x,y,z)(op_A(w,x,y,z)>op_B(w,x,y,z)?op_A(w,x,y,z):op_B(w,x,y,z))

    One of the 4 functions running inside this encryption was used in the Game of Life, using 100x100 cells, which outputs the generations data in a graphics BMP file format.
    Download a game of life VRM 1.3.1 at 17.07.2007; it demonstrates the diversity of the random number distributions produced.











    Try looping this encryption in the following way:

    Step 1.C:\r7 -a <key1 number> -b <key2 number> -c <key3 number> -d <key4 number> -e "filename.txt"
    Step 2.C:\r7 -a <key5 number> -b <key6 number> -c <key7 number> -d <key8 number> -e "previous_output.mar"
    ...
    ...
    Step n.

    and repeat it in the same manner n times until the desired security level is gained .

    18.12.2007 VRM 1.3.3 Download File r7.zip

    MARIAHASH THE ENCRYPTION ROUTINE

    This is a fast encryption routine using a proprietary hashing method. Cipher strength depends on a large hashing number and on password length. The password text must be entered in a password.txt file and should have between 50 and 100 characters. This routine was written out of the author's wish to improve message privacy when messages are sent across networks.

    09.06.2007 VRM 1.3.0 Download File 79923.zip

    THE 123SQL DATABASE ENGINE

    This is an ongoing project aimed at constructing a small portable SQL database engine for PDAs; this release is a functional browsing engine that contains data and sample browsing statements. Data may be imported together with table/column creation. Typically the source data is TAB-delimited spreadsheet column export data. Database/table/column creation may be viewed in the included code following the -c switch. Table names, column names and field byte sizes should be specified, but column/field lengths may also vary in size row by row. The engine performs SQL keyword/syntax checking using the included syntax/keywords list files. Object name checks and object attribute reads are performed in the system database files named 123SQL_db_1.mar and 123SQL_db_2.mar. The database structure allows multiple object browsing. The sorting/searching routines require low machine resources, thus meeting most modern PDA specifications, and their sources were also published under different names.
    This project is founded on the author's unique relational database engine structure design. The 123SQL engine requires the following command line syntax:

    E.g.: C:\910791 -d "Sample"
    for attaching and browsing the included database, where Sample is the database name included . When

    E.g.: C:\910791 -c "import_data_file.txt"
    the engine will create a database table and table columns as specified in the included create.txt syntax and import the data from the file named after the -c switch. The number of column definitions and TAB-delimited fields must match; if a specified column length is greater than the data length, space justification will occur. The supported SQL-like data browsing syntax is:

    {select}

    {*|column_name|column_name_1,...column_name_n}

    {from}

    {table_name|table_name_1,...table_name_n}

    [where

    |[column_name=string_literal|column_name>string_literal|column_name<string_literal]

    |[column_name>string_literal and column_name<string_literal]

    |[column_name[>|<]string_literal and column_name=string_literal]

    |[column_name=string_literal or column_name=string_literal or column_name=string_literal]

    |[column_name>string_literal and column_name<string_literal and column_name=string_literal]

    ]



    123SQL 15.04.2008 VRM 1.5.0 Download file 123SQL.zip




    The MariaBasic Interpreter


    The Maria Basic Interpreter is a command-line programming tool - interpreter aimed to help PDA users code formula/calculations, string and file procedures that execute on their handhelds.
    The included source code may easily be (re)compiled for various OS/CPU architectures, since it was written in ISO/ANSI C and requires moderate machine resources.
    The interpreter design allows fast execution of BASIC-syntax-like procedures with calculations and file/string operations. Its simplified syntax requires only basic programming skills and may be used for learning, but may extend to execution of rather complex routines.
    This interpreter supports BASIC-like (simplified) syntax such as nesting, statement loops and conditional execution. The ZIP archive ready for download includes a few txt files which are examples of the supported syntax, nesting and file/string functions.
    Source procedures may be executed with a command line such as: C:\mariabasic3505w32 -edayofweek.txt .
    The decimal to binary, day of week, bubble sort, decimal remainder, tax calc, lexical calc and calendar example sources show the code structure necessary for program execution.
    Supported code syntax :

                               MariaBASIC 3.5.0.5 Coding Structure:


    1. Coding convention (s)

       1.1.Declarations:

       <varname> is a <string literal> + <type declaration> = <initial value>

       where
           <string literal> = {[_]|[a-z]|[A-Z]|[0-9]}
           <type declaration> = {[%]|[&]|[#]|[$]}

               where % stands for integer data type
               where & stands for long integer data type
               where # stands for double data type
               where $ stands for char data type with <=256 bytes
               
           <initial value> =
           {
               [<string constant> is a single quoted literal having [a-z]|[A-Z]|[0-9]]
               |
               [<num constant> is a number literal having [0-9]|[.]]
           }

       logical expression operators are [and]|[or]|[xor]
       conditional expression operators are [>]|[<]|[=]|[>=]|[<=]|[<>]

       1.2.Program body: Declaration(s) | Statement(s) | Logical expression(s) | Simple Block Statement(s) | Nested Statement(s) | End statement

           1.2.1.Statement:

           varname[%|&|#]=[[varname[%|&|#]]|[<num constant>]][^,*,/,+,-][[varname[%|&|#]]|[<num constant>]]
           varname$=varname$+varname$
           varname[%|&]=len$(varname$)
           varname[%|&|#]=val$(varname$)
           varname$=trim$(varname$)
           varname$=left$(varname$,<num constant>)
           varname$=right$( varname$,<num constant>)
           varname$=mid$( varname$,<num constant>,<num constant>)
           varname$=format$(varname+{[%|&|#]},<string constant>)
           varname[%|&|#]=round$(varname#)
           open varname$ for [input]|[output] as #<num constant>
           input #<num constant>, varname$
           print #<num constant>, [<string constant>| varname[%|&|#|$][,]][;]
           close #<num constant>
           print [<string constant>| varname[%|&|#|$][,]][;]

           1.2.2.Logical expression:

           varname[%|&]=(varname[%|&|#|$][=,<>,>,<,>=,<=]varname[%|&|#|$]
               [and]|[or]|[xor]
               [ varname[%|&|#|$][=,<>,>,<,>=,<=]varname[%|&|#|$])

           1.2.3.Simple Block Statement:

           {if (<conditional expression>) then}
               <statement(s)>
           {end if}
           {while (<conditional expression>)}
               <statement(s)>
           {wend}
           {for varname[%|&]=[[<num constant>]| varname[%|&]] to [<num constant>| varname[%|&]]}
               <statement(s)>
           {next varname[%|&]}

           1.2.4.Nested Statement:

           {if (<conditional expression>) then}
               <statement(s)>
               <simple block statement(s)>
           {end if}
           {while (<conditional expression>)}
               <statement(s)>
               <simple block statement(s)>
           {wend}
           {for varname[%|&]=[[<num constant>]| varname[%|&]] to [<num constant>| varname[%|&]]}
               <statement(s)>
               <simple block statement(s)>
           {next varname[%|&]}

           1.2.5.Comment(s):
           rem <string constant>

       1.3. End Statement:
       {end}
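
The statement rules above can be checked mechanically. The following Python sketch is only an illustration and not part of MariaBASIC: it maps the type suffixes to their data types and splits an arithmetic assignment into tokens. The regular expressions, the function names and the assumption that a variable name starts with a letter are hypothetical; chained operators are accepted because the example listing further down this page uses them.

```python
import re

# Type suffixes as listed in the declaration rules.
SUFFIX_TYPES = {"%": "integer", "&": "long integer", "#": "double", "$": "char"}

# Assumption: a name is a letter followed by letters or digits.
NAME = r"[A-Za-z][A-Za-z0-9]*"
# A numeric operand is a numeric variable or a number literal of digits and '.'.
NUM_OPERAND = rf"(?:{NAME}[%&#]|[0-9]+(?:\.[0-9]+)?)"
# target = operand (operator operand)*   with the operators ^ * / + -
STATEMENT = re.compile(rf"^({NAME}[%&#])=({NUM_OPERAND}(?:[\^*/+\-]{NUM_OPERAND})*)$")
TOKEN = re.compile(rf"{NUM_OPERAND}|[\^*/+\-]")


def variable_type(name):
    """Return the data type implied by the trailing suffix, or None."""
    return SUFFIX_TYPES.get(name[-1:])


def parse_arithmetic(line):
    """Return (target, tokens) for an arithmetic assignment, or None if it does not fit."""
    match = STATEMENT.match(line.replace(" ", ""))
    if match is None:
        return None
    target, expression = match.groups()
    return target, TOKEN.findall(expression)
```

For instance, parse_arithmetic("Varn$=Varn1$") returns None, because string variables are not valid targets of an arithmetic statement.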

    Maria BASIC source code
    MariaBASIC for Pocket PC 09.08.2016 VRM 3.5.0.5
    MariaBASIC 01.08.2016 VRM 3.5.0.5

    Output of the MariaBASIC Number Permutation Cycle Function (ASCII Text Rhymes) for inputs consisting of all 1(s) , 2(s) , ... 9(s) , and its C/C++ code (!) in num_c_perm.cpp .

    The Eleven Comedies , an English translation of comedies by Aristophanes et al. (from the Project Gutenberg eBook) : chart 1 from Part 1
    and chart 2 from Part 2 ,
    War and Peace , an English translation of Tolstoy (from the Project Gutenberg eBook) : permutation chart ,
    The Notebooks of Leonardo Da Vinci , an English translation of Leonardo Da Vinci (from the Project Gutenberg eBook) : permutation chart generated by num_c_perm_2.cpp .


    THE FAST (ASCII and Unicode) TEXT FILES SEARCH ROUTINE

    This is a fast text search routine that searches one or more ASCII or Unicode text files for a single string (or a quoted composite string). A Unicode search string may also mix characters from different Unicode tables.
    E.g.:
    1. (ASCII search) msearch3 <ASCII_input_filename.txt> <search_string>
    2. (Unicode search) msearch3 <Unicode_input_filename.txt>
    (the search string is taken from the Unicode file uarg.txt and the search results are placed in the Unicode file ures.txt)


    03.07.2008 VRM 1.1.1 Download File msearch3.zip

    THE FAST ASCII TEXT FILES SEARCH ROUTINE

    This is a fast text search routine that searches an ASCII text file for up to 10 search strings at once. Each quoted search string may contain one or more words. The -s switch allows any match, while the -e switch allows only exact matches.
    E.g.: C:\msearch -s(-e) "package install"+"media"+"component" -f "FreeBSD Handbook.html"
    E.g.: C:\msearch -s(-e) "network devices installation" -f "FreeBSD Handbook.html"
    E.g.: C:\msearch -s(-e) "trodes in his hands" -f "book_sd.txt"
    E.g.: C:\msearch -s(-e) "Bezukhov and Natasha"+"Buonaparte Napoleon"+"Pierre" -f "War_and_Peace_NT.txt"
    The program output displays all results with their line numbers in the file, together with the matches of each single phrase and of the composite search sentence and their total occurrence counts.
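
The kind of output described above can be sketched in a few lines of Python. This is not the msearch source: the function name is hypothetical, and the meaning of the two switches is an assumption (here -s is read as a case-insensitive substring match and -e as a case-sensitive whole-word match).

```python
import re


def search_lines(lines, phrases, exact=False):
    """Return (hits, counts): hits are (line_number, phrase) pairs,
    counts holds the total number of occurrences of each phrase."""
    hits = []
    counts = {phrase: 0 for phrase in phrases}
    for number, line in enumerate(lines, start=1):
        for phrase in phrases:
            if exact:
                # Assumed meaning of -e: whole-word, case-sensitive match.
                pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
                found = len(re.findall(pattern, line))
            else:
                # Assumed meaning of -s: case-insensitive substring match.
                found = line.lower().count(phrase.lower())
            if found:
                hits.append((number, phrase))
                counts[phrase] += found
    return hits, counts
```

Passing several phrases corresponds to the "phrase1"+"phrase2" form of the command line; each phrase is reported with its own line numbers and total count.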

    15.04.2008 VRM 1.3.3 Download File msearch.zip
    09.04.2010 VRM 1.4.1 Download File msearch_141_1.zip

    mSearch4 Single sentence , single file :
    13.03.2016 VRM 2.0.0 Download File mSearch4_200.zip

    mSearch4 Single sentence , file folders tree walk :
    12.03.2016 VRM 1.5.0
    Download File mGrep4_150.zip


    mSearchSen(tence) 4 , 1-10 search strings separated by logical conjunction and inclusion symbols
    - Example , single search sentence : mSearchSen4_120 -s"Mucius|Scaevola" -f"War_and_Peace_NT.txt"
    - Example , two search sentences in conjunction : mSearchSen4_120 -s"Scaevola&burned" -f"War_and_Peace_NT.txt"
    - Example , search sentences with inclusion : mSearchSen4_120 -s"Scaevola&burned&hand" -f"War_and_Peace_NT.txt"
    10.03.2016 VRM 2.0.0
    Download File mSearchSen4_200.zip
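
A minimal Python sketch of how such a search expression could be evaluated line by line. The exact semantics of the symbols are not stated on this page, so the reading used here is an assumption: parts joined by & must all be present in a line, and within one part any of the alternatives joined by | is enough. The function names are hypothetical.

```python
def line_matches(expression, line):
    """True when every '&' part has at least one of its '|' alternatives in the line."""
    text = line.lower()
    return all(
        any(term.strip().lower() in text for term in part.split("|"))
        for part in expression.split("&")
    )


def search(expression, lines):
    """Return (line_number, line) for every line that satisfies the expression."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if line_matches(expression, line)
    ]
```

Under this reading, "Scaevola&burned&hand" selects only lines that contain all three words, while "Mucius|Scaevola" selects lines that contain either name.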



    THE ASCII TEXT FILES SENTENCE CONTEXT SEARCH ROUTINE

    This is a complex text file search routine that builds the search on context: the words of the sentences concerning a given subject. Search criteria are built automatically from the sentence word contents and the user's choice. The sentence words files and their sentence links are built during the indexing phase for a given text file. After indexing, the routine displays all sentences for a chosen sentence subject (as listed in the words file), then allows a detailed context search and displays all sentences concerning the chosen context.
    For indexing, type e.g.: C:\dp_13_201_1 -f "War_and_Peace_NT.txt"
    For the context search, type e.g.: C:\msearch_141_d_1 -s(e) "Bagration" -f "War_and_Peace_NT.txt"
    The -s switch enables any match search when d was chosen, and the -e switch enables only exact word match. The included files contain the example book already indexed. Typically the search word is a name, or a subject that is often described and attributed in the book text. After the desired sentence/search combination has been viewed and chosen, all text lines containing the chosen words are displayed. Viewing book contents by the desired subject details thus takes less time.
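
The indexing and context search phases can be illustrated with a small Python sketch. It is not the routine's actual file format or algorithm: the naive sentence splitter, the in-memory index and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_index(sentences):
    """Indexing phase: map each lowercased word to the sentence numbers containing it."""
    index = defaultdict(set)
    for number, sentence in enumerate(sentences):
        for word in re.findall(r"[A-Za-z']+", sentence.lower()):
            index[word].add(number)
    return {word: sorted(numbers) for word, numbers in index.items()}


def context_search(index, sentences, subject, context_word):
    """Search phase: sentences that mention the subject and the chosen context word."""
    wanted = set(index.get(subject.lower(), [])) & set(index.get(context_word.lower(), []))
    return [sentences[number] for number in sorted(wanted)]
```

The index is built once per text; each later search is then a set intersection over sentence numbers instead of a scan of the whole file.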

    15.04.2008 VRM 1.3.0 Download File r113.zip

    05.09.2013 VRM 2.1.0 Download File text_file_context_search_210_1.zip
    05.09.2013 VRM 3.2.0 Download File text_file_context_search_320.zip

    This package contains :
    (1) the dictionary routine and
    (2) the mSearchSen4 routine that allows multiple text sentence searches with logical conjunctions and inclusions
    10.03.2016 VRM 4.0.1 Download File text_file_context_search_401.zip

    THE FONT IMAGE RECOGNITION ROUTINE

    This routine creates a vector shape sequence file (using the -i switch) out of a 100x100 pixel, 24-bit colour depth, black and white image representing a character TrueType (font) image or a freehand character drawing. Then, using the -c switch, the two index files derived from two different images are compared and the graphics match result is displayed.
    For indexing, type:
    E.g.: C:\cr13 -i "Drawing1.bmp" "Drawing1_Index.txt"
    To compare two different index files, type:
    E.g.: C:\cr13 -c "Drawing1_Index.txt" "Drawing2_Index.txt"
    At present the routine builds shape vectors on black/white bitmaps only; it does not support other resolutions, colours or colour depths.
    But how does it work?

    (1) indexing, which creates a vector txt file (that might be the meta character file) out of the bmp image file in the following manner:
    - inverts the b/w file matrix (the way the human eye sees it),
    - searches for quadrants (10x10 pixels in size) with a 40/60% b/w ratio, thus finding character image edges (up to 8 pairs in the same row),
    - creates vectors out of each quadrant,
    - shifts quadrants UP by (only) a few pixels, since bmp edges do not always REALLY represent character ID curves, and repeats the vector creation ...
    and
    (2) comparison of two vector files:
    - shifts back all X-axis values by subtracting the absolute minX value from them,
    - computes curve angles out of each quadrant's values,
    - computes resultant angles out of quadrant pairs, building the most realistic character curves,
    - compares the angle pairs of the two vector files,
    - computes match statistics.
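
The quadrant search step of the indexing phase can be sketched in Python as follows. This is an illustration only, not the cr13 source: the image is assumed to be already decoded into rows of 1 (black) and 0 (white), the "40/60% b/w ratio" is read as a black pixel share between 40% and 60%, and the function name is hypothetical. Vector creation, quadrant shifting and angle comparison are not shown.

```python
def edge_quadrants(pixels, size=10, low=0.4, high=0.6):
    """Return (row, column) of every size x size quadrant whose share of
    black pixels lies between low and high, i.e. a likely character edge.

    pixels is a list of rows holding 1 for black and 0 for white.
    """
    quadrants = []
    for top in range(0, len(pixels) - size + 1, size):
        for left in range(0, len(pixels[top]) - size + 1, size):
            black = sum(
                pixels[row][col]
                for row in range(top, top + size)
                for col in range(left, left + size)
            )
            ratio = black / (size * size)
            if low <= ratio <= high:
                quadrants.append((top // size, left // size))
    return quadrants
```

Fully black or fully white quadrants are skipped, so only quadrants that straddle a black/white boundary remain as candidates for vector creation.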


    This development is aimed at PDA users looking for easier ways of text input.
    To Maria Dzinleska

    27.04.2007 VRM 1.0.1 Download File cr13.zip

    THE ROUTINE THAT GENERATES THE PRIME NUMBERS KEY PAIR OUT OF THEIR PRODUCT

    These routines were written during and for the www.rsa.com prime key numbers contest, which requires finding the exact prime number key pair of a very large (256, 512 ... 1024 ... bits long) product number. The routines were written in Java and use the Java BigInteger class to compute the prime key pair. The starting point routine finds a prime number key pair of product_number_bit_length/2 bit length whose product approximates the product number as closely as possible; the higher the precision, the more computing time is spent. The loop that computes the suggested starting prime number pair is therefore limited by the corresponding number of equal product-target significant digits. The remaining procedures then perform subtraction(s) of a very long number (all 1's followed by trailing ZEROS, 111 ... *10^N) from the suggested key pair, measuring the distance (difference) from the target product number by subsequent multiplication checks. Once the point of divergence is found, at a certain precision (number of equal significant digits), a new key pair may be generated through the first routine. Then the process is repeated, gaining more and more equal product-target significant digits.

    23.07.2006 Download File Welcome.zip

    How do these computations compute a very similar or near prime key pair out of a large product key?

    Examining the MariaBASIC code listed below and its (partial) output shows a few number products appearing at large division loop distances and having a 0000 period between decimal remainder values. Testing those (listed) numbers might prove that most of them are prime numbers. Testing large (200 decimal digits or more) product keys in this way would take indefinite time. So the WelcomeQ routine uses a subtraction operation on a proposed prime key pair. The routine that generates prime key pairs matching a given decimal target product number is based on modifying a binary field seed number, using only the target match numbers as the starting point of the match search loop. The subtractor (having the decimal value of e.g. 1111111111000000000000000) shifts the 1111111111 period to the right, confirming that the product of the prime key pair truncated in this way matches more and more decimals of the target product number. There are in fact sets of prime key pairs that obtain a certain decimal match. Usually it is necessary to switch between different pairs in order to increase the decimal match of the product. That is the main iteration of this method, which sometimes requires examining and rejecting a large number of prime key pairs in order to gain one or more additional matching decimals. Gaining a precision of 100 decimals on a common PC would thus not be hard to achieve. These computations generate prime keys having a computable decimal match gain, or a complete product number match, compared to a given huge product number.
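
The subtraction idea can be illustrated with a toy Python sketch. It is one possible reading of the description, not the WelcomeQ code: all names are hypothetical, the second factor q is held fixed (a real search would not know it), primality is not tested, and a step is accepted only while the product stays at or above the target, which is an assumed interpretation of the "point of diverging".

```python
def matching_leading_digits(a, b):
    """Count the equal leading decimal digits of two positive integers of equal length."""
    text_a, text_b = str(a), str(b)
    if len(text_a) != len(text_b):
        return 0
    count = 0
    for digit_a, digit_b in zip(text_a, text_b):
        if digit_a != digit_b:
            break
        count += 1
    return count


def subtractor(ones, zeros):
    """Build a number such as 1110000: a block of ones followed by trailing zeros."""
    return int("1" * ones + "0" * zeros)


def refine(target, p, q, ones=2):
    """Lower p by subtractor steps; when a step would push p * q below the
    target, shift the block of ones one place to the right instead."""
    zeros = max(len(str(p)) - ones, 0)
    while zeros >= 0:
        step = subtractor(ones, zeros)
        if step < p and (p - step) * q >= target:
            p -= step      # product still at or above the target: keep the step
        else:
            zeros -= 1     # the step would diverge: shift the ones to the right
    return p, matching_leading_digits(p * q, target)
```

With the target 99400891 (the product of 9973 and 9967), the call refine(99400891, 9999, 9967, ones=1) walks p down from 9999 to 9973 and reports 8 matching digits, that is a complete match. With a longer block of ones the final steps are coarser, so the match is usually only partial, which is where switching to another candidate pair would come in.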

    Brief order and explanation of execution steps:

    rem Short multiply factor pair routine
    rem example in MariaBasic 3.2.1.1
    rem April , 26 , 2012


    Varn$='3539572063110071'
    Varn1$=''
    Varn2$=''
    Varn3$=''
    Varn4$=''

    Vard1#=0
    Vard2#=0
    Vard3#=0
    Vard4#=0
    Vard5#=0
    Vard6#=0

    Vari1%=0
    Vari2%=0
    Vari3%=0
    Vari4%=0
    Vari5%=1

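    rem slide a 3 digit window over Varn$ , window start positions 1 , 4 , 7 , 10
    while (Vari5%<=12)

    Vard1#=val$(Varn$)
    Varn1$=mid$(Varn$,Vari5%,3)
    Vari1%=len$(Varn$)
    Vari2%=len$(Varn1$)
    rem power of ten derived from the string length and the window position
    Vari3%=Vari1%+1-Vari2%-Vari5%
    Vard2#=val$(Varn1$)
    Vard2#=Vard2#+0
    rem window value scaled by that power of ten
    Vard2#=Vard2#*10^Vari3%
    rem quotient of the full number by the scaled window
    Vard6#=Vard1#/Vard2#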


    print '--------------------'
    Varn2$=format$(Vard1#,'000000000000000')
    print Varn2$
    Varn2$=format$(Vard2#,'000000000000000')
    print Varn2$
    Varn2$=format$(Vard6#,'000000000000000')
    print Varn2$
    print '--------------------'

    Vari5%=Vari5%+3

    wend

    print '--------------------'
    print '--------------------'

    Vard11#=0
    Vard12#=0
    Vard13#=0

    Vard21#=0
    Vard22#=0
    Vard23#=0
    Vard24#=0
    Vard25#=0

    rem window 957 scaled by 10^10 , and the integer part of 3539572063110071 divided by it
    Vars11$='9570000000000'
    Vars12$='369'

    Vard11#=val$(Vars11$)
    Vard12#=val$(Vars12$)
    Vard13#=Vard13#+Vard11#*Vard12#
    Varn2$=format$(Vard13#,'000000000000000')
    print Varn2$
    Vard21#=val$(Varn2$)

    print '--------------------'
    print '--------------------'

    rem window 206 scaled by 10^7 , and the integer part of 3539572063110071 divided by it
    Vars11$='2060000000'
    Vars12$='1718238'

    Vard11#=val$(Vars11$)
    Vard12#=val$(Vars12$)
    Vard13#=Vard13#+Vard11#*Vard12#
    Varn2$=format$(Vard13#,'000000000000000')
    print Varn2$
    Vard22#=val$(Varn2$)

    print '--------------------'
    print '--------------------'

    rem window 311 scaled by 10^4 , and the integer part of 3539572063110071 divided by it
    Vars11$='3110000'
    Vars12$='1138126065'

    Vard11#=val$(Vars11$)
    Vard12#=val$(Vars12$)
    Vard13#=Vard13#+Vard11#*Vard12#
    Varn2$=format$(Vard13#,'000000000000000')
    print Varn2$
    Vard23#=val$(Varn2$)


    Vard24#=Vard24#+Vard21#+Vard22#/1000+Vard23#/1000000

    print '--------------------'
    print '--------------------'

    Varn2$=format$(Vard24#,'000000000000000')
    print Varn2$

    Varn2$=format$(Vard1#,'000000000000000')
    print Varn2$

    Vard25#=Vard1#-Vard24#

    print '------difference---'

    Varn2$=format$(Vard25#,'000000000000000')
    print Varn2$

    print '--------------------'
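
For readers without a MariaBASIC interpreter, the first loop of the listing can be mirrored in Python. This is a hedged transcription, not the author's code: the function name is hypothetical, the input is taken as the 16 digits of Varn$ without surrounding blanks, and integer division replaces the double division, so each quotient is truncated rather than formatted.

```python
def window_quotients(number_text, width=3, step=3, last_start=12):
    """Slide a 3 digit window over the number, scale the window back to its
    decimal position and divide the full number by the scaled window."""
    number = int(number_text)
    rows = []
    start = 1  # 1-based start position, as used by mid$
    while start <= last_start:
        window = number_text[start - 1:start - 1 + width]
        exponent = len(number_text) + 1 - len(window) - start
        divisor = int(window) * 10 ** exponent
        rows.append((window, divisor, number // divisor))
        start += step
    return rows


for window, divisor, quotient in window_quotients("3539572063110071"):
    print(window, divisor, quotient)
```

The rows for the windows 957, 206 and 311 reproduce the constant pairs used in the second half of the listing: 9570000000000 with 369, 2060000000 with 1718238, and 3110000 with 1138126065.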

    Dzinleski Jasenko - jasenko17@gmail.com

    Mailing Address:
    +38922770296
    Dositej Obradovik 15/8
    1000 Skopje Republic of Macedonia


    (1) All published data, executables and sources from this site described above apply to GNU General Public License as published by the Free Software Foundation and can not be used, copied, sold, redistributed or used in any other way but only by written permission by Jasenko Dzinleski . Copyright (C) from 2001 - 2012 and later by Jasenko Dzinleski
    (2) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version , if not opposite to (1) .
    (3) This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE . See the GNU General Public License for more details .
    You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc ., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA .