Monday, June 15, 2015

Edit Distance | Leetcode

Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.)
You have the following 3 operations permitted on a word:
a) Insert a character
b) Delete a character
c) Replace a character

---

Solution: The only thing that works is dynamic programming on a 2D matrix. Great details available on wikipedia: https://en.wikipedia.org/wiki/Levenshtein_distance

Defining the edit distance function f(y,x):
For 1 <= y <= len(word1) and 1 <= x <= len(word2):
   f(y, x) = minimum {
                         f(y-1, x) + 1,
                         f(y, x-1) + 1,
                         f(y-1, x-1) + (word1[y] == word2[x] ? 0 : 1)
                     }
OR
   f(y, x) = minimum of:
                         DELETE_COST
                         INSERT_COST
                         (word1[y] == word2[x] ? NO_OP_COST : REPLACE_COST)

There are two different ways to implement the same algorithm:
1. Use a len1 x len2 matrix.
2. Use two arrays of the same length, either len1 or len2, depending on the direction of iteration.

The first algorithm is more straightforward to implement. The second algorithm is a space optimization of the first algorithm, and thus is a little more complex than the first algorithm. But the idea is identical in both cases. You probably should use the first algorithm in normal interview cases, but be prepared to also use the second algorithm if you do too well and the interviewer wants you to optimize for space. :-)

Algorithm 1: O(N^2) space
public class Solution {
    public int minDistance(String word1, String word2) {
        int len1 = word1.length();
        int len2 = word2.length();
        if (len1 == 0) return len2;
        if (len2 == 0) return len1;
        
        // For all y and x, m[y,x] will hold the edit distance between
        // the first y characters of word1 and the first x characters of word2.
        // Note that m has (len1 + 1) * (len2 + 1) values.
        int[][] m = new int[len1 + 1][];
        for (int y = 0; y <= len1; y++) {
            m[y] = new int[len2 + 1];
        }
        
        // Cost of dropping all characters from word1.
        // Note: we must include m[len1][0] also.
        for (int y = 1; y <= len1; y++) {
            m[y][0] = y;
        }
        
        // Cost of inserting all characters from word2.
        // Note: we must include m[0][len2] also.
        for (int x = 1; x <= len2; x++) {
            m[0][x] = x;
        }
        
        for (int y = 1; y <= len1; y++) {
            int char1 = word1.charAt(y - 1);

            for (int x = 1; x <= len2; x++) {
                int char2 = word2.charAt(x - 1);

                int leftCostPlusInsert = m[y][x - 1] + 1;
                int topCostPlusDelete = m[y - 1][x] + 1;
                int minCost = Math.min(leftCostPlusInsert, topCostPlusDelete);
                int replacementCostIfAny = (char1 == char2) ? 0 : 1;

                m[y][x] = Math.min(minCost, m[y - 1][x - 1] + replacementCostIfAny);
            }
        }
        
        return m[len1][len2];
    }
}

Algorithm 2: O(N) space
public class Solution {
    public int minDistance(String word1, String word2) {
        int len1 = word1.length();
        int len2 = word2.length();
        if (len1 == 0) return len2;
        if (len2 == 0) return len1;
        
        int[] a1 = new int[len2 + 1];
        int[] a2 = new int[len2 + 1];
        
        for (int x = 1; x <= len2; x++) {
            a1[x] = x;
        }

        for (int y = 1; y <= len1; y++) {
            a2[0] = y;
            int char1 = word1.charAt(y - 1);

            for (int x = 1; x <= len2; x++) {
                int char2 = word2.charAt(x - 1);

                int leftCostPlusInsert = a2[x - 1] + 1;
                int topCostPlusDelete = a1[x] + 1;
                int minCost = Math.min(leftCostPlusInsert, topCostPlusDelete);
                int replacementCostIfAny = (char1 == char2) ? 0 : 1;

                a2[x] = Math.min(minCost, a1[x - 1] + replacementCostIfAny);
            }
            
            // Swap a1 and a2.
            int[] tmp = a1;
            a1 = a2;
            a2 = tmp;
        }
        
        return a1[len2];
    }
}

No comments: