Python pattern matching algorithm tuple error

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • DossarLX ODI
    Batch Manager
    Game Manager
    FFR Simfile Author
    • Mar 2008
    • 14989

    #31
    Re: Python pattern matching algorithm tuple error



    This picture is showing what's happening in the iterative loops for our dp matrix. dp[0][0] gives us the final score of the best path (or paths in the case of a tied score) available.

    It's basically doing what I said the backtrace algorithm did in my earlier posts in the thread -- the path is there, but the sequences are in reverse so when the strings are saved they have to be reversed to be the normal alignment.

    Now there's a problem of saving the new aligned sequences into two string variables. possibleScore1 could have the same value as possibleScore2 when max is called etc. so saving the characters of the string is a lot more complex than what I originally thought. This is to save the string of the best alignment.
    Originally posted by hi19hi19
    oh boy, it's STIFF, I'll stretch before I sit down at the computer so not I'm not as STIFF next time I step a file

    Comment

    • Reincarnate
      x'); DROP TABLE FFR;--
      • Nov 2010
      • 6332

      #32
      Re: Python pattern matching algorithm tuple error

      If you want to return a score-maximizing solution, you could use a second matrix to keep the corresponding substrings based on maximal score at each step of the dp, or something:

      untested air code:

      Code:
      g=-5
      dp = [[0 for j in xrange(len(seq2)+1)] for i in xrange(len(seq1)+1)]
      bestPieces = [[("","") for j in xrange(len(seq2)+1)] for i in xrange(len(seq1)+1)]
      
      for i in xrange(len(seq1)+1):
          dp[i][len(seq2)] = g*abs(len(seq1)-i)
          bestPieces[i][len(seq2)] = (seq1[i:],"-"*abs(len(seq1)-i))
      for j in xrange(len(seq2)+1):
          dp[len(seq1)][j] = g*abs(len(seq2)-j)
          bestPieces[len(seq1)][j] = ("-"*abs(len(seq2)-j),seq2[j:])
      
      for i in xrange(len(seq1)-1,-1,-1):
          for j in xrange(len(seq2)-1,-1,-1):
              possibleScore1 = g + dp[i+1][j]
              possibleScore2 = g + dp[i][j+1]
              possibleScore3 = mat[baseMap[seq1[i]]][baseMap[seq2[j]]] + dp[i+1][j+1]
              bestScore = max([possibleScore1, possibleScore2, possibleScore3])
              dp[i][j] = bestScore
      
              if bestScore==possibleScore1:
                  bestPieces[i][j] = (seq1[i] + bestPieces[i+1][j][0], "-" + bestPieces[i+1][j][1])
              elif bestScore==possibleScore2:
                  bestPieces[i][j] = ("-" + bestPieces[i][j+1][0], seq2[j] + bestPieces[i][j+1][1])
              else:
                  bestPieces[i][j] = (seq1[i] + bestPieces[i+1][j+1][0], seq2[j] + bestPieces[i+1][j+1][1])
              
      print dp[0][0], bestPieces[0][0]
      You don't have to "reverse" anything (if I am understanding you correctly) -- although technically, sure, DP is generally a bottom-up type of thing where you work backwards from smaller subproblems and work your way towards the final result you're after. But you don't have to store things in reverse if you don't want to.
      Last edited by Reincarnate; 02-9-2014, 08:45 PM.

      Comment

      Working...