diff multi-line whitespace to verify Beautifier output
Hi,

We're reformatting a lot of our project code using the excellent uncrustify beautifier.

However, to gain confidence that it really is only changing whitespace (forget { } issues for just now), we were hoping to do a diff - a textual comparison of the files, ignoring whitespace.

However, most diffs we've tried can't handle multi-line whitespace, so the following two prototypes are deemed to be different:

void doStuff( int a, float b);

void doStuff(int a,
             float b);

Has anyone found a way to do a diff like this, that handles multi-line whitespace?

Shug
Shug wrote:
> We're reformatting a lot of our project code using the excellent
> uncrustify beautifier.
> However, to gain confidence that it really is only changing whitespace
> (forget { } issues for just now), we were hoping to do a diff - a
> textual comparison of the files, ignoring whitespace.
> However, most diffs we've tried can't handle multi-line whitespace, so
> the following two
> prototypes are deemed to be different:
> void doStuff( int a, float b);
> void doStuff(int a,
> float b);
> Has anyone found a way to do a diff like this, that handles multi-line
> whitespace?
I would actually do it differently: tokenize both sources. If the set of tokens is the same, you have the same source (now, don't ask me where you can find C++ tokenizers, I don't know, GIYF).

The other way is to convert both of those into a third type of formatting (which should give you exactly the same output) and compare them. If the formatter makes mistakes, it's likely to make them independently.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
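The token comparison suggested above can be sketched roughly as follows. This is only a crude approximation of a C++ lexer, not a real tokenizer: the names `roughTokens`, `sameTokens` and `demoTokenCompare` are made up for illustration, and the sketch assumes the sources contain no string or character literals, no comments, and no multi-character operators whose spacing matters (it would treat `> >` and `>>` as the same tokens).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Crude stand-in for a C++ lexer (hypothetical helper, not a real tokenizer):
// identifiers and numbers are runs of [A-Za-z0-9_], every other non-space
// character becomes a one-character token, and whitespace only separates tokens.
std::vector<std::string> roughTokens(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;  // whitespace, including newlines, is never a token
        } else if (std::isalnum(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}

// True when both sources yield the same tokens in the same order.
bool sameTokens(const std::string& before, const std::string& after) {
    return roughTokens(before) == roughTokens(after);
}

// Usage example with the two prototypes from the original post.
void demoTokenCompare() {
    assert(sameTokens("void doStuff( int a, float b);",
                      "void doStuff(int a,\n             float b);"));
    // Whitespace that splits an identifier is still detected as a change.
    assert(!sameTokens("int ab;", "int a b;"));
}
```

Note that the comparison is on the ordered sequence of tokens, not an unordered set, so reordered code would still be reported as different. For real project code a proper lexer would be needed to cope with literals, comments and the preprocessor.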
On Feb 12, 3:24 pm, "Victor Bazarov" <v.Abaza@comAcast.net> wrote:
> Shug wrote:
> > We're reformatting a lot of our project code using the excellent
> > uncrustify beautifier.
> > However, to gain confidence that it really is only changing whitespace
> > (forget { } issues for just now), we were hoping to do a diff - a
> > textual comparison of the files, ignoring whitespace.
> > However, most diffs we've tried can't handle multi-line whitespace, so
> > the following two
> > prototypes are deemed to be different:
> > void doStuff( int a, float b);
> > void doStuff(int a,
> > float b);
> > Has anyone found a way to do a diff like this, that handles multi-line
> > whitespace?
> I would actually do it differently: tokenize both sources. If the set
> of tokens is the same, you have the same source (now, don't ask me where
> you can find C++ tokenizers, I don't know, GIYF). The other way is to
> convert both of those into the third type of formatting (which should
> give you the exactly same output) and compare them. If the formatter
> make mistakes, it's likely to make them independently.
A simple third format would be one where every whitespace is replaced by a newline, which will give a format that is easy to compare (and I think it will still be valid C++ :-)

--
Erik Wikström
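A minimal sketch of that "third format", assuming it is acceptable to collapse each run of whitespace into a single newline rather than replacing every whitespace character one for one. The names `oneWordPerLine` and `demoOneWordPerLine` are hypothetical. The sketch does not look inside string literals or comments, so whitespace changes there would be hidden too.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Replace each run of whitespace with a single newline (hypothetical helper).
// Leading and trailing whitespace is dropped so it cannot cause a mismatch.
std::string oneWordPerLine(const std::string& src) {
    std::string out;
    bool pendingBreak = false;
    for (char ch : src) {
        if (std::isspace(static_cast<unsigned char>(ch))) {
            // Only remember a break once some output has been written.
            pendingBreak = !out.empty();
        } else {
            if (pendingBreak) out += '\n';
            pendingBreak = false;
            out += ch;
        }
    }
    return out;
}

// Usage example: a line break inside existing whitespace is ignored,
// but whitespace that appears or disappears entirely is not.
void demoOneWordPerLine() {
    assert(oneWordPerLine("void doStuff(int a, float b);") ==
           oneWordPerLine("void doStuff(int a,\n             float b);"));
    assert(oneWordPerLine("doStuff( int a)") != oneWordPerLine("doStuff(int a)"));
}
```

As the second assertion shows, this normalization handles whitespace that moves across lines, but the two prototypes from the original post would still differ, because one has a space after the opening parenthesis and the other has none. Writing both normalized files to disk and running an ordinary diff on them would then show one word per line.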
Erik Wikström wrote:
> On Feb 12, 3:24 pm, "Victor Bazarov" <v.Abaza@comAcast.net> wrote:
>> Shug wrote:
>>> We're reformatting a lot of our project code using the excellent
>>> uncrustify beautifier.
>>> However, to gain confidence that it really is only changing
>>> whitespace (forget { } issues for just now), we were hoping to do a
>>> diff - a textual comparison of the files, ignoring whitespace.
>>> However, most diffs we've tried can't handle multi-line whitespace,
>>> so the following two
>>> prototypes are deemed to be different:
>>> void doStuff( int a, float b);
>>> void doStuff(int a,
>>> float b);
>>> Has anyone found a way to do a diff like this, that handles
>>> multi-line whitespace?
>> I would actually do it differently: tokenize both sources. If the
>> set of tokens is the same, you have the same source (now, don't ask
>> me where you can find C++ tokenizers, I don't know, GIYF). The
>> other way is to convert both of those into the third type of
>> formatting (which should give you the exactly same output) and
>> compare them. If the formatter make mistakes, it's likely to make
>> them independently.
> A simple third format would be one where every whitespace is replaced
> by a newline, which will give a format that is easy to compare (and I
> think it will still be valid C++ :-)
It wouldn't be valid C++ without some continuation characters (\) in macro definitions. And broken-up include directives aren't going to work either. :-)

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
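If the normalized text needs to stay compilable, one possible workaround (not something proposed in the thread) is to leave preprocessor lines alone and only split the rest. The sketch below assumes a directive is any line whose first non-blank character is `#`, and that directives are not themselves continued onto following lines with a trailing backslash. The name `splitKeepingDirectives` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: put each whitespace-separated word on its own line,
// but copy preprocessor directive lines through on a single line.
// Assumption: directives do not span several lines via a trailing backslash.
std::string splitKeepingDirectives(const std::string& src) {
    std::istringstream lines(src);
    std::string line;
    std::string out;
    while (std::getline(lines, line)) {
        std::size_t first = line.find_first_not_of(" \t");
        if (first != std::string::npos && line[first] == '#') {
            // Keep the directive intact, minus its leading indentation.
            out += line.substr(first) + '\n';
            continue;
        }
        std::istringstream words(line);
        std::string word;
        while (words >> word) {
            out += word + '\n';
        }
    }
    return out;
}

// Usage example: the include survives, the declaration is split.
void demoSplitKeepingDirectives() {
    assert(splitKeepingDirectives("#include <vector>\nint  x;\n") ==
           "#include <vector>\nint\nx;\n");
}
```

The trade-off is that whitespace changes inside a directive line are no longer ignored, so a beautifier that re-spaces macro bodies would still show up as a difference.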