Prometheus 6   

Do not make the mistake of thinking that because my conclusion is the same as another person's that my reasoning is the same

September 09, 2003

 

You think Teenage Mutant Ninja Turtles have a problem with The Shredder?

Look at the position SCO is in.

Will Linux Luminary 'Shred' SCO's Unix Claims?
By Peter Galli

Linux luminary Eric S. Raymond is taking the fight with The SCO Group right back to the basics: he has developed a utility known as a comparator that looks for common code segments in large source trees and which, on an Athlon 1.8 GHz box, has an effective comparison rate of over 55,000 lines per second.

…His comparator, the code for which can be downloaded here, uses a variant of an algorithm called "shred," which bears a resemblance to some techniques used for DNA sequencing.

The source trees get sliced into overlapping three-line shreds. The shreds then get turned into a list of 16-byte (128-bit) signatures by a process called MD5 hashing; each signature keeps information about its file and line number range.

"If the MD5 signatures are different, then the shreds that they were made from are different. When they match, it is almost certain that the two shreds they were made from are the same, to within odds of eighteen quadrillion to one. MD5 is normally used for making unforgeable digital signatures, but the side effect I'm exploiting is that it gives you a fast way to compare texts for equality," Raymond told eWEEK on Monday.

So, once all the signatures from all the code trees have been included in the comparator, all the "unique" signatures are then thrown out, leaving a list of shreds with duplicate signatures or common code segments. From there it is just report generation, he said.
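To make the steps described above concrete, here is a minimal Python sketch of the shred idea. It is not Raymond's comparator: the function names `shred_signatures` and `common_segments` are hypothetical, and the sketch skips any normalization (whitespace, comments) that a real tool would likely apply. It only assumes what the article states: overlapping three-line shreds, an MD5 signature per shred, and discarding signatures that occur only once.

```python
import hashlib
from collections import defaultdict

SHRED_SIZE = 3  # lines per shred, as described in the article


def shred_signatures(name, text, size=SHRED_SIZE):
    """Yield (digest, name, first_line, last_line) for each overlapping shred."""
    lines = text.splitlines()
    for start in range(len(lines) - size + 1):
        chunk = "\n".join(lines[start:start + size])
        # md5(...).digest() is 16 raw bytes (32 characters when written as hex)
        digest = hashlib.md5(chunk.encode("utf-8")).digest()
        yield digest, name, start + 1, start + size


def common_segments(trees):
    """Return {signature: [(file, first_line, last_line), ...]} for every
    signature that occurs in more than one place.

    `trees` is a dict of {file name: file contents} (an assumption of this
    sketch; a real tool would walk directories on disk).
    """
    index = defaultdict(list)
    for name, text in trees.items():
        for digest, fname, first, last in shred_signatures(name, text):
            index[digest].append((fname, first, last))
    # Throw out the unique signatures; what remains is shared code.
    # Note: repeats inside a single file are reported too.
    return {d: locs for d, locs in index.items() if len(locs) > 1}


# Two tiny made-up "source files" that share three consecutive lines.
a = "int x;\nx = 1;\nfoo(x);\nbar();\n"
b = "setup();\nint x;\nx = 1;\nfoo(x);\n"
for locations in common_segments({"a.c": a, "b.c": b}).values():
    print(locations)  # [('a.c', 1, 3), ('b.c', 2, 4)]
```

Comparing fixed-size digests instead of the text itself is what keeps the matching step cheap: the index lookup costs the same no matter how long the lines in a shred are.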

Posted by P6 at September 9, 2003 06:24 AM