Exact probability of fixed patterns occurring in a random sequence
Published in: | Communications in statistics. Simulation and computation Vol. 51; no. 9; pp. 4867 - 4882 |
---|---|
Main Authors: | , |
Format: | Journal Article |
Language: | English |
Published: | Philadelphia; Taylor & Francis; 27-09-2022; Taylor & Francis Ltd |
Subjects: | |
Summary: | We derive a procedure to obtain the exact probability that a specific pattern of letters occurs in a longer random sequence of letters. The procedure generalizes to give the exact probability of a single fixed (specific) pattern, or of a union or intersection of multiple fixed patterns, within a random sequence, for any distribution of the cells in the sequence, and it can handle patterns with uncertain letters (missing, blank, unclear, ambiguous, transposed, etc.). The procedure also finds the probability that a randomly picked pattern will appear in a separate, longer random sequence of letters. These methods are particularly applicable to genetic sequence analysis, diagnostics, anthropology, clinical medicine, data mining, computational molecular biology, and pattern analysis and recognition. |
---|---|
ISSN: | 0361-0918, 1532-4141 |
DOI: | 10.1080/03610918.2020.1766500 |
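
To make the quantity in the summary concrete, the following is a minimal sketch of one standard way to compute the exact probability that a fixed pattern occurs at least once in a random sequence. It is not the procedure derived in the article, which this record does not reproduce. It is a textbook dynamic program over the states of a pattern-matching (KMP) automaton. It assumes a non-empty pattern and letters drawn independently from the same distribution at every position. The names `failure_table`, `occurrence_probability`, and `letter_probs` are hypothetical and chosen for illustration only.

```python
from fractions import Fraction


def failure_table(pattern):
    """KMP failure function: fail[i] is the length of the longest
    proper prefix of pattern[:i + 1] that is also its suffix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def occurrence_probability(pattern, n, letter_probs):
    """Exact P(pattern occurs at least once in n i.i.d. letters).

    Assumption (ours, for this sketch): every position uses the same
    distribution letter_probs, a dict mapping letter -> Fraction.
    """
    m = len(pattern)
    fail = failure_table(pattern)

    def step(state, letter):
        # Next automaton state after reading `letter` with `state`
        # pattern letters currently matched (0 <= state < m).
        while state and pattern[state] != letter:
            state = fail[state - 1]
        return state + 1 if pattern[state] == letter else state

    # dist[s] = P(no occurrence so far, and s letters currently matched)
    dist = [Fraction(0)] * m
    dist[0] = Fraction(1)
    for _ in range(n):
        new = [Fraction(0)] * m
        for s, p in enumerate(dist):
            if p == 0:
                continue
            for letter, q in letter_probs.items():
                t = step(s, letter)
                if t < m:  # t == m means the pattern just occurred
                    new[t] += p * q
        dist = new
    # Whatever mass was never absorbed corresponds to "no occurrence".
    return 1 - sum(dist)


coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
print(occurrence_probability("HH", 3, coin))  # prints 3/8
```

Using `Fraction` keeps the result exact rather than a floating-point approximation. For the coin example, the three sequences HHH, HHT, and THH out of eight contain the pattern, which gives 3/8.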