We're pleased to announce a "better, faster and cheaper" EZ-Metrix.
What's New?
1. The Comparison feature now uses a more accurate and 33% faster algorithm (Levenshtein Distance).
2. The Difference threshold feature now works as a percentage of the line, instead of a fixed number of characters.
3. The price has been reduced to attract new users: now only $9.95 per month, or $49.95 per year.
4. The first month is still FREE.
5. EZ-Metrix now supports more than 80 programming languages (and counting).
6. New help system, now with complete definitions of all measures.
7. Updated User Guide, Admin Guide and Overview slides.
Now also on Facebook! Stop by and say "like."
Many Thanks!
Thanks to Jason Jones and his team at Creative Software Services for their support on this project!
Thanks also to Mike Mah (QSM Associates) for the inspiration for this new version.
Thursday, December 4, 2014
Sunday, October 26, 2014
Levenshtein Distance detects differences in lines of code
After chatting with a colleague, Michael Mah, about the finer points of code counters a couple of weeks ago, I've been studying ways to detect differences between two lines of code for my EZ-Metrix product.
The problem is not simply to detect differences of any kind, as this could be accomplished by a simple string compare (e.g., does string A = string B?). The challenge is in deciding if the extent of the differences between the lines constitutes, in effect, a completely new line of code. In other words, at what point does the accumulation of differences between two lines of code exceed what one might characterize as 'changes' to the point where the line is considered brand new?
This research led me to the Levenshtein Distance (LD) algorithm (en.wikipedia.org/wiki/Edit_distance). In simple terms, the LD is the minimum number of single-character edits (insertions, substitutions or deletions) that it takes to change string A into string B. If applied to a code counter, this LD value could be compared against a threshold value to decide whether a line of code has been modified (LD value at or below the threshold) or is new (LD value above the threshold).
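To make the definition concrete, here is a minimal sketch of the textbook dynamic-programming form of LD in Python. It is only an illustration, not the EZ-Metrix implementation, and the function name `levenshtein` is my own choice for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; it starts as the row for an empty prefix.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters of a into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Two lines of code that differ by a single character.
    print(levenshtein("x = 1;", "x = 2;"))
```

Keeping only two rows at a time means memory grows with the length of one line rather than with the product of both lengths.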
In EZ-Metrix, I plan to implement a percentage threshold, which lets the user configure their own difference threshold relative to each line's length. This way, short and long lines are treated the same when a line is characterized as either modified or new. For example, a user who chooses a threshold value of 75% would see a line counted as new if the number of changed characters exceeds 75% of the number of characters in the original line. Otherwise, the line would be counted as modified.
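The percentage rule can be sketched in a few lines of Python. This is a hedged illustration of the rule as described above, not shipped EZ-Metrix code: the function name `classify_line`, its parameters, the "unchanged" result, and the handling of an empty original line are all assumptions made for the example. The `distance` argument is the edit distance between the original and revised line, computed separately.

```python
def classify_line(distance: int, original_length: int,
                  threshold_pct: float = 75.0) -> str:
    """Classify a revised line as 'unchanged', 'modified', or 'new'.

    distance        -- edit distance between the original and revised line
    original_length -- number of characters in the original line
    threshold_pct   -- user-configurable threshold, as a percentage
    """
    if distance == 0:
        return "unchanged"  # assumption: identical lines are not counted as changes
    if original_length == 0:
        return "new"        # assumption: nothing to compare against, so treat as new
    changed_pct = 100.0 * distance / original_length
    # "New" only when the changed share exceeds the threshold; otherwise "modified".
    return "new" if changed_pct > threshold_pct else "modified"


if __name__ == "__main__":
    print(classify_line(1, 6))  # 1 of 6 characters changed
    print(classify_line(5, 6))  # 5 of 6 characters changed
```

Because the comparison is relative to the original line's length, one changed character in a 6-character line and ten changed characters in a 60-character line are treated alike.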
In the near future, a new version of EZ-Metrix (v5.0) will be released, which implements LD, among other improvements.
I'm hopeful this improvement will be embraced by the industry.
Monday, September 1, 2014
EZ-Metrix for Windows?
After six years in the field, EZ-Metrix v4.1 has drawn many comments from over 500 users. Now, finally, we're considering taking action on these comments.
What's on the drawing board? A version of EZ-Metrix for Windows.
This "v5.0" will incorporate many of the features our users have been asking for:
1. Run locally, with no need to connect to the internet to measure source code.
2. Support more programming languages.
3. Support languages with complex rules (e.g., PHP).
4. Ability to delete individual reports (Admin only).
5. Support languages with multiple comment delimiters (e.g., JSP).
6. Improve the difference threshold from characters to percent.
7. Repair defects in the comparison algorithms.
If you have additional requested improvements for EZ-Metrix, let us know! We'll do our best to include as many as possible.