The Edit Distance, also known as Levenshtein Distance, is a widely used algorithmic concept in computer science and natural language processing. It measures the similarity between two strings by calculating the minimum number of single-character insertions, deletions, or substitutions required to transform one string into the other.
Step-by-Step Approach:
- We are given two input strings,
str1andstr2. - The goal is to find the minimum number of edit operations needed to transform
str1intostr2. - The possible edit operations are insertion, deletion, and substitution of a single character.
- To solve this problem, we use a Dynamic Programming approach.
- We create a 2D DP array of size
(m+1) x (n+1), wheremandnare the lengths ofstr1andstr2, respectively. - Initialize the DP array with base cases:
dp[i][0] = ianddp[0][j] = j, representing the number of edit operations required to convert an empty string tostr1orstr2. - We then iterate through the DP array, starting from
(1, 1)to(m, n). - For each position
(i, j)in the DP array, we check the characters atstr1[i-1]andstr2[j-1]. If they are equal, no edit operation is needed at this position, so we setdp[i][j]to the same value asdp[i-1][j-1]. - If the characters are different, we consider three edit operations:
- Insertion:
dp[i][j] = dp[i][j-1] + 1 - Deletion:
dp[i][j] = dp[i-1][j] + 1 - Substitution:
dp[i][j] = dp[i-1][j-1] + 1
- We update
dp[i][j]to the minimum value among the three edit operations. - After completing the iteration, the value at
dp[m][n]represents the minimum edit distance betweenstr1andstr2.
Java Solution:
public class EditDistance {
public int minDistance(String word1, String word2) {
int m = word1.length();
int n = word2.length();
int[][] dp = new int[m + 1][n + 1];
for (int i = 0; i <= m; i++) {
dp[i][0] = i;
}
for (int j = 0; j <= n; j++) {
dp[0][j] = j;
}
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
dp[i][j] = dp[i - 1][j - 1];
} else {
int insertion = dp[i][j - 1] + 1;
int deletion = dp[i - 1][j] + 1;
int substitution = dp[i - 1][j - 1] + 1;
dp[i][j] = Math.min(Math.min(insertion, deletion), substitution);
}
}
}
return dp[m][n];
}
public static void main(String[] args) {
String word1 = "intention";
String word2 = "execution";
EditDistance editDistance = new EditDistance();
System.out.println("Minimum Edit Distance: " + editDistance.minDistance(word1, word2));
}
}
Python Solution:
class EditDistance:
def minDistance(self, word1: str, word2: str) -> int:
m, n = len(word1), len(word2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(m + 1):
dp[i][0] = i
for j in range(n + 1):
dp[0][j] = j
for i in range(1, m + 1):
for j in range(1, n + 1):
if word1[i - 1] == word2[j - 1]:
dp[i][j] = dp[i - 1][j - 1]
else:
insertion = dp[i][j - 1] + 1
deletion = dp[i - 1][j] + 1
substitution = dp[i - 1][j - 1] + 1
dp[i][j] = min(insertion, deletion, substitution)
return dp[m][n]
if __name__ == "__main__":
word1 = "intention"
word2 = "execution"
edit_distance = EditDistance()
print("Minimum Edit Distance:", edit_distance.minDistance(word1, word2))
Explanation of Java/Python Solution:
The Java and Python solutions implement the Edit Distance (Levenshtein Distance) problem using Dynamic Programming to find the minimum number of edit operations required to transform one string into another.
The minDistance method in Java and minDistance method in Python take two input strings word1 and word2 and return the minimum edit distance between the two strings.
The provided example with word1 = "intention" and word2 = "execution" demonstrates how the Dynamic Programming approach calculates the minimum edit distance as 5. The minimum number of edit operations required to convert “intention” to “execution” is 5, which includes 3 deletions and 2 insertions.