Python pattern matching algorithm tuple error

**DossarLX ODI** · 02-9-2014, 04:34 PM

Re: Python pattern matching algorithm tuple error

This picture is showing what's happening in the iterative loops for our dp matrix. dp[0][0] gives us the final score of the best path (or paths in the case of a tied score) available.

It's basically doing what I said the backtrace algorithm did in my earlier posts in the thread -- the path is there, but the sequences are in reverse so when the strings are saved they have to be reversed to be the normal alignment.

Now there's a problem of saving the new aligned sequences into two string variables. possibleScore1 could have the same value as possibleScore2 when max is called etc. so saving the characters of the string is a lot more complex than what I originally thought. This is to save the string of the best alignment.

**Reincarnate** · 02-9-2014, 07:19 PM

Re: Python pattern matching algorithm tuple error

If you want to return a score-maximizing solution, you could use a second matrix to keep the corresponding substrings based on maximal score at each step of the dp, or something:

untested air code:

Code:

g=-5
dp = [[0 for j in xrange(len(seq2)+1)] for i in xrange(len(seq1)+1)]
bestPieces = [[("","") for j in xrange(len(seq2)+1)] for i in xrange(len(seq1)+1)]

for i in xrange(len(seq1)+1):
    dp[i][len(seq2)] = g*abs(len(seq1)-i)
    bestPieces[i][len(seq2)] = (seq1[i:],"-"*abs(len(seq1)-i))
for j in xrange(len(seq2)+1):
    dp[len(seq1)][j] = g*abs(len(seq2)-j)
    bestPieces[len(seq1)][j] = ("-"*abs(len(seq2)-j),seq2[j:])

for i in xrange(len(seq1)-1,-1,-1):
    for j in xrange(len(seq2)-1,-1,-1):
        possibleScore1 = g + dp[i+1][j]
        possibleScore2 = g + dp[i][j+1]
        possibleScore3 = mat[baseMap[seq1[i]]][baseMap[seq2[j]]] + dp[i+1][j+1]
        bestScore = max([possibleScore1, possibleScore2, possibleScore3])
        dp[i][j] = bestScore

        if bestScore==possibleScore1:
            bestPieces[i][j] = (seq1[i] + bestPieces[i+1][j][0], "-" + bestPieces[i+1][j][1])
        elif bestScore==possibleScore2:
            bestPieces[i][j] = ("-" + bestPieces[i][j+1][0], seq2[j] + bestPieces[i][j+1][1])
        else:
            bestPieces[i][j] = (seq1[i] + bestPieces[i+1][j+1][0], seq2[j] + bestPieces[i+1][j+1][1])
        
print dp[0][0], bestPieces[0][0]

You don't have to "reverse" anything (if I am understanding you correctly) -- although technically, sure, DP is generally a bottom-up type of thing where you work backwards from smaller subproblems and work your way towards the final result you're after. But you don't have to store things in reverse if you don't want to.

Python pattern matching algorithm tuple error

Comment

Comment