python - Traceback in Smith-Wateman algorithm with affine gap penalty -
i'm trying implement smith-waterman algorithm local sequence alignment using affine gap penalty function. think understand how initiate , compute matrices required calculating alignment scores, clueless how traceback find alignment. generate 3 matrices required have following code
for j in range(1, len2): in range(1, len1): fxopen = f[i][j-1] + gap xextend = ix[i][j-1] + extend ix[i][j] = max(fxopen, xextend) fyopen = f[i-1][j] + gap yextend = iy[i-1][j] + extend iy[i][j] = max(fyopen, yextend) matchscore = (f[i-1][j-1] + simmatrixdict[seq1[i-1]+seq2[j-1]]) xscore = ix[i-1][j-1] + simmatrixdict[seq1[i-1]+seq2[j-1]] yscore = iy[i-1][j-1] + simmatrixdict[seq1[i-1]+seq2[j-1]] f[i][j] = max(0, matchscore, xscore, yscore)
i unsure if need single matrix traceback, or 1? clarification on how go tracing max score in f appreciated.
the important thing remember traceback in smith-waterman matrix value in determines direction move. so, if in f
you're moving diagonally, if you're in ix
, you're moving horizontally, , if you're in iy
, you're moving vertically. means need store in pointer matrix matrix arrived @ square from. matrix coming from, not 1 going to, determines dirction go.
for example:
say @ f[5][5]
:
- if pointer matrix says go
ix
, goix[4][4]
- if pointer matrix says go
iy
, goiy[4][4]
- if pointer matrix says go
f
, gof[4][4]
whereas if @ ix[5][5]
:
- if pointer matrix says go
ix
, goix[4][5]
- if pointer matrix says go
f
, gof[4][5]
or if @ iy[5][5]
:
- if pointer matrix says go
iy
, goiy[5][4]
- if pointer matrix says go
f
, gof[5][4]
assuming first index x coordinate , second y coordinate.
continue tracing until reach cell maximum value of 0.
building pointer matrix: need 1 pointer matrix each f
, ix
, , iy
. these matrices need indicate matrix value came from, because tells direction moving in. so, you're running through dynamic programming phase of algorithm, should building pointer matrices. every time store new maximum value in cell in f
, ix
, or iy
, should update corresponding matrix indicate came from. if, instance, highest value can have in f[5][5]
comes aligning 2 next bases when you're in f[4][4]
, fpointer[5][5] should set f
, because got there f
matrix.
Comments
Post a Comment