
OneAndOneIs2


Wed, Jan 15, 2014

Letting vim take the strain

• Post categories: Omni, FOSS, Technology, Programming, Helpful

I recently embarked on a project at work to begin trying to replace Subversion with Git. Unfortunately, svn is used not only by people checking in code; it's also called by the code itself. So I need to find every place in the code where either 'svn' is called or a 'Subversion' module is used.

I figured the best way to do this was grep all the files, dump the output to file, manually filter out the ones that I could ignore, and so end up with a list of files I need to look at.

I used my editor of choice, vim, for handling the grep output. I thought it might work as a good example of some of the slightly less well-known ways vim can make your life easier. So, here's what I did:

Firstly, filter out warnings/errors. Grep will tell you about things like binary files that it can't output, but I wasn't interested in binaries, only text.

The format grep outputs for matches I was interested in is:

(filename):(matching line)

So nice & simple, I can presumably filter out all lines that don't have a colon. Firstly, I like to verify my assumptions by doing a search for the regex I propose to filter with. In this case, that was:

/^[^:]*$

Broken down:

  • / = Search for the following regex
  • ^ = beginning of line (in this context)
  • [] = character class - define a set of characters to match
  • ^ = not any of the following characters (in this context)
  • [^:] = any character that's not a colon
  • * = any number (including zero) of the preceding item
  • $ = end of line

so ^[^:]*$ = find all lines that are made up entirely of non-colons
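The same assumption can be checked outside vim. Here is a minimal sketch in Python, where this particular regex happens to be spelled identically. The sample lines and the helper name has_no_colon are made up for illustration; they are not from my real grep output.

```python
import re

# Same pattern as the vim search: a line made up entirely of non-colons.
NO_COLON = re.compile(r"^[^:]*$")

def has_no_colon(line):
    """True when the whole line contains no colon (empty lines count too)."""
    return NO_COLON.match(line) is not None

# Hypothetical grep output lines, for illustration only.
print(has_no_colon("Binary file bin/tool matches"))   # a warning, no colon
print(has_no_colon("scripts/release.sh:svn update"))  # a real match line
print(has_no_colon(""))                                # empty line
```

Note that because * allows zero characters, the pattern also matches completely empty lines.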

Because I have search highlighting turned on, this lit up every match of my regex in the current screen of text. A quick scan told me that this was indeed doing the right thing - no false matches, etc. So now I want to check the entire file with this regex. This calls for grep. Or rather, g/re/p - Globally search for a REgex and Print matching lines.

So the ex command I want is of course:

:g/^[^:]*$/p

However, I don't need to type all that, nor copy and paste the regex. When typing an ex command, hitting 'ctrl-r' followed by the '/' key inserts the current search pattern into the command line. So I instead type

:g/'ctrl-r /'/p

and now I get shown every matching line in the file. Again, all looks good. So I want to delete all these lines. Again, I can shortcut this: Pressing the up-arrow in ex mode takes me to the previous command, so

:'up'

gets me back to my grep, and then replacing the trailing 'p' with a 'd' sets the line to delete rather than print matching lines.

Just like that, I have filtered out the first bunch of unwanted results.
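For anyone who wants to see the print-then-delete idea spelled out as code, here is a rough sketch in Python. The function names global_print and global_delete and the sample lines are hypothetical; they only mimic what :g/re/p and :g/re/d do to a list of lines.

```python
import re

NO_COLON = re.compile(r"^[^:]*$")

def global_print(lines, pattern):
    """Rough analogue of :g/pattern/p, returning the matching lines."""
    return [line for line in lines if pattern.search(line)]

def global_delete(lines, pattern):
    """Rough analogue of :g/pattern/d, returning the lines that survive."""
    return [line for line in lines if not pattern.search(line)]

# Hypothetical grep output, for illustration only.
grep_output = [
    "lib/Deploy.pm:use Subversion::Client;",
    "Binary file bin/tool matches",
    "scripts/release.sh:svn update",
]
print(global_print(grep_output, NO_COLON))
print(global_delete(grep_output, NO_COLON))
```

Looking at what would be removed before removing it is the same safety check as running the 'p' version first in vim.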

Next, I want to set up highlighting to make filenames and the matching 'svn/Subversion' terms easier to see, so I can identify false positives as easily as possible. So, define a couple of syntax terms with the relevant regexes:

(Use "ctrl-R /" again to pull in the earlier regex, then change the ending from *$ to \+:)

:syntax match filename "^[^:]\+:"
:highlight filename ctermfg=magenta

:syntax match svn "svn\|Subversion"
:highlight svn ctermfg=red

And now filenames are magenta and uses of 'svn' or the 'Subversion' module are highlighted red.

Now that it's all visually marked, there's not much else I can do beyond wade through it all and try to whittle it down.

Once that's done, I want to get from my list of matching lines to a list of filenames. This is two steps: First, whittle down each line to the filename:

:%s/\(^[^:]\+\):.*/\1/

This is basically the same regex as I used at the start, but the escaped parentheses capture what's inside them, which is everything up to the first colon. It then replaces the entire line with just the captured text, which in this case is the filename.

(I much prefer Perl regexes where you don't have to escape quite so many special characters, but c'est la vie)
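To illustrate the difference in escaping, here is a sketch of the same substitution in Python, whose regex syntax is close to Perl's on this point: the group and the + need no backslashes. The helper name to_filename and the sample line are made up for the example.

```python
import re

# Python spelling of the vim pattern \(^[^:]\+\):.*
FILENAME_PREFIX = re.compile(r"^([^:]+):.*")

def to_filename(line):
    """Replace the whole line with the text captured before the first colon."""
    return FILENAME_PREFIX.sub(r"\1", line)

# Hypothetical grep output line, for illustration only.
print(to_filename("lib/Deploy.pm:use Subversion::Client;"))
```

A line with no colon does not match the pattern at all, so it is left unchanged, just as :%s leaves non-matching lines alone.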

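Lastly, any file that had more than one mention of Subversion will be duplicated. This can be fixed by piping the buffer through the standard 'uniq' command. uniq only collapses adjacent duplicates, which is enough as long as each file's matches sit together, as they do in the output of a single grep run: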

:%!uniq

The % indicates "entire document", and the ! filters those lines through an external shell command, replacing them with its output. So this command says "Run the 'uniq' command on the entire document" and now we're down to every unique filename that I need to look at.
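One thing worth knowing about uniq is that it only collapses runs of adjacent identical lines. A small sketch in Python shows the behaviour; the function below is my own stand-in for the command, not its implementation, and the file names are hypothetical.

```python
from itertools import groupby

def uniq(lines):
    """Collapse runs of adjacent identical lines, like the uniq command."""
    return [key for key, _group in groupby(lines)]

# Hypothetical filenames; the last one repeats but is not adjacent.
names = ["lib/Deploy.pm", "lib/Deploy.pm", "scripts/release.sh", "lib/Deploy.pm"]
print(uniq(names))
```

If the same filename could show up in separate places in the list, sorting first (for example :%!sort -u) would be the safer choice.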

Now I just need to go through all of them and mark the ones I actually care about. Don't think vim can save me from this one...

