Showing posts with label Vim. Show all posts

2009-07-27

Regular Expressions in Vim

Regular expressions are a fantastic thing to keep in your arsenal. To get a rough idea of what regular expressions are for, check out this xkcd comic on regular expressions if you haven't already seen it. Basically, regular expressions allow you to find and act on patterns in a file or group of files. I'm going to look at how you can use regular expressions in Vim, but they are certainly available in many other places throughout Linux.

When learning a new programming language or programming concept, I find it useful to be able to see examples of code that other people have written. Unfortunately, this approach is probably not quite as helpful for regular expressions. Reading regular expressions can be quite tricky, and deciphering one may take a fair amount of time and energy if you don't have any notes handy to explain it. Just a hint from my personal experience: you should probably leave comments in any program or script you write to explain any regular expressions contained within. If you don't leave comments, you may find yourself looking back at the regular expressions in your code and feeling like you're reading Brainfuck. The best way to learn regular expressions, in my experience, is to dive right in and start writing them yourself.

That all said, let's get started here with where the bulk of the work is done with regular expressions: metacharacters. Metacharacters, or escaped characters, are special characters that represent something else in a regular expression. That maybe doesn't make a whole lot of sense as you read it, but, hopefully, it will be clearer with the following list of metacharacters:
  • . - Represents any character except the "new line" character

  • \n - Represents the "new line" character

  • \s - Represents any whitespace character (e.g. space, tab, etc.)

  • \S - Represents any non-whitespace character

  • \d - Represents any numerical digit (0-9)

  • \D - Represents any non-numerical character

  • \x - Represents any hex digit (0-f), case insensitive

  • \X - Represents any non-hex digit

  • \o - Represents any octal digit (0-7)

  • \O - Represents any non-octal digit

  • \h - Represents any head of word character (a-z,A-Z,_)

  • \H - Represents any non-head of word character

  • \p - Represents any printable character

  • \P - Represents any non-digit printable character

  • \w - Represents any word character

  • \W - Represents any non-word character

  • \a - Represents any alphabetic character

  • \A - Represents any non-alphabetic character

  • \l - Represents any lowercase character

  • \L - Represents any non-lowercase character

  • \u - Represents any uppercase character

  • \U - Represents any non-uppercase character

I know, that's quite a list, but once you start writing some regular expressions, it's not too bad to refer to a list, and before you know it, you'll find you won't even have to reference one. Now that we've covered the metacharacters, the next place we want to look is how to denote the number of times something is to be repeated. The answer is another class of escaped characters, called quantifiers. Quantifiers can be either greedy or non-greedy. Greedy quantifiers try to match as many times as possible. Non-greedy quantifiers try to match as few times as possible. This should be clear once I run down the list of quantifiers and use them in a couple of examples:
  • * - Matches 0 or more of the preceding characters, as many as possible (greedy)

  • \{-} - Matches 0 or more of the preceding characters, as few as possible (non-greedy)

  • \+ - Matches 1 or more of the preceding characters, as many as possible (greedy)

  • \= - Matches 0 or 1 of the preceding characters

  • \{n} - Matches the preceding characters exactly n times

  • \{n,m} - Matches the preceding characters at least n times and at most m times, as many as possible (greedy)

  • \{-n,m} - Matches the preceding characters at least n times and at most m times, as few as possible (non-greedy)

  • \{n,} - Matches the preceding characters at least n times, as many as possible (greedy)

  • \{-n,} - Matches the preceding characters at least n times, as few as possible (non-greedy)

  • \{,m} - Matches the preceding characters at most m times, as many as possible (greedy)

  • \{-,m} - Matches the preceding characters at most m times, as few as possible (non-greedy)
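
To see the difference between greedy and non-greedy matching outside of Vim, here is a small sketch using Python's re module. Keep in mind that this is Python syntax, not Vim syntax: Python writes the non-greedy form of * as *?, where Vim uses \{-}. The sample string is made up for illustration.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* grabs as much as it can, so the match runs to the last ">"
greedy = re.findall(r"<.*>", text)

# Non-greedy: .*? grabs as little as it can, so each tag matches on its own
# (the Vim spelling of this pattern would be <.\{-}>)
non_greedy = re.findall(r"<.*?>", text)

print(greedy)
print(non_greedy)
```

The greedy version returns one long match covering the whole string, while the non-greedy version returns the four tags separately.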

We have enough to go through some common examples.

Find dates in YYYY-MM-DD format (two equivalent expressions):

/\d\d\d\d-\d\d-\d\d
/\d\{4}-\d\{2}-\d\{2}

There is another point that I should make: the / character is what starts a search in Vim, so to match a literal / inside a pattern it has to be escaped as \/. For example, if you want to find dates in mm/dd/yyyy format:

/\d\{1,2}\/\d\{1,2}\/\d\{4}
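
If you want to try these date patterns outside of Vim, the following is a rough equivalent in Python's re module. The syntax differs slightly from Vim's: in Python the braces are not escaped, and / needs no backslash because it is not used as a delimiter. The sample text is invented for the example.

```python
import re

text = "Started 2009-07-19, revised 7/23/2009, published 07/27/2009."

# The two YYYY-MM-DD expressions, written in Python syntax
long_form = re.findall(r"\d\d\d\d-\d\d-\d\d", text)
short_form = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# mm/dd/yyyy: "/" is an ordinary character in a Python pattern
slash_dates = re.findall(r"\d{1,2}/\d{1,2}/\d{4}", text)

print(long_form, short_form, slash_dates)
```

Both YYYY-MM-DD patterns find the same date, which is what makes them interchangeable.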

Before we can do much with finding and replacing text within Vim, there is one more thing we should go over. A lot of times when finding and replacing text, we will want to keep part of the pattern that we find and put it in a different place (or even keep it in the same place). This will probably be easier if I just show an example. The following will convert all dates from MM/DD/YYYY format to YYYY-MM-DD format:

:%s/\(\d\{2}\)\/\(\d\{2}\)\/\(\d\{4}\)/\3-\1-\2/g

There are a couple of things to explain in the above example. The \( and \) characters are not actually searched for, but are used to delimit the parts of the pattern that you want to keep for the replacement text. The \1, \2, \3 references in the replacement (a backslash followed by a number, not to be confused with the \n new line metacharacter) map to the parts surrounded by \( and \), in order. In other words, the first thing in the "find" portion of the command surrounded by \( and \) maps to \1, the second to \2, and so on.
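
The same grouping idea exists in most regular expression flavors. As a sketch, here is the date conversion written with Python's re.sub; in Python the grouping parentheses are unescaped, and re.sub replaces every match by default, much like the g option does in Vim. The sample text is made up.

```python
import re

text = "Due 07/27/2009, filed 07/23/2009"

# Group 1 = month, group 2 = day, group 3 = year; the replacement reorders them
converted = re.sub(r"(\d{2})/(\d{2})/(\d{4})", r"\3-\1-\2", text)

print(converted)
```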

What if you want to turn a list with each value on separate lines into a comma separated list? Here's how:

:%s/\n/,/
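
For comparison, here is a sketch of the same idea in Python, with a made-up list. Replacing every new line character also replaces the one at the end of the last line, which leaves a trailing comma in Python; the Vim command may behave the same way, so it is worth checking the end of the result.

```python
import re

text = "apple\nbanana\ncherry\n"

# Replace every new line character, including the final one
joined = re.sub(r"\n", ",", text)

# Alternative that avoids the trailing comma by joining the lines directly
trimmed = ",".join(text.splitlines())

print(joined)
print(trimmed)
```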

What about that problem from the xkcd comic linked above? To find text formatted as an address I will assume the following about the address:
  1. The address is of the form that you would use to mail a letter in the United States

  2. The name is two words (i.e. first and last name only)

  3. The house number is no more than 4 digits

  4. The street name ends with a common ending (st., ave., blvd., etc.), and none is longer than four characters

  5. The zip code is in extended form: #####-####

Before the example, I should explain that since the . character is a metacharacter, if you want to search for a period character specifically, you need to escape it with a backslash, like this: \. Also, the ^ character is used by Vim to denote the beginning of a line and the $ character is used to denote the end of a line. Here's my example of a search command to find text formatted like an address:

/^\w*\s\w*\n\d\{,4}\P*\w\{2,4}\.\n\P*,\s\u\{2}\s\d\{5}-\d\{4}$

I know that's kind of long, but let me break this example down one part at a time:
  • ^ - Makes sure that it starts finding the address at the beginning of a line

  • \w* - This is to find the first name

  • \s - This is to find the space between the first and last name

  • \w* - This is to find the last name

  • \n - This is to make sure that there is nothing else on this line

  • \d\{,4} - This looks for a house number of at most 4 digits

  • \P* - This looks for the text of the street name, and takes into account street names with hyphens or multiple words

  • \w\{2,4}\. - This looks for the street ending (e.g. st. or blvd.), which must end with a period character

  • \n - This makes sure that there is nothing else on this line

  • \P*, - This looks for the city name ending with a comma character, and takes into account cities with hyphens or multiple words

  • \s - Makes sure there is a space between the city and state abbreviation

  • \u\{2} - This looks for the two-letter state abbreviation

  • \s - This makes sure there is a space between the state abbreviation and the zip code

  • \d\{5} - This looks for the first five digits of the extended zip code

  • - - This makes sure there is a hyphen separating the two sections of the zip code

  • \d\{4} - This looks for the last four digits of the extended zip code

  • $ - This makes sure that there is nothing else on this line
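
If you would like to experiment with this pattern outside of Vim, here is an approximate translation into Python's re module. It is only a sketch: Python has no \P or \u classes, so I am assuming [^\d\n] as a stand-in for \P and [A-Z] for \u, and the sample addresses are invented.

```python
import re

# Python approximation of the Vim address pattern.
# Assumption: Vim's \P (printable, non-digit) is approximated with [^\d\n],
# and Vim's \u (uppercase) with [A-Z].
ADDRESS = re.compile(
    r"^\w*\s\w*\n"            # first name, space, last name, end of line
    r"\d{0,4}[^\d\n]*"        # house number (up to 4 digits) and street name
    r"\w{2,4}\.\n"            # street ending such as "st." or "blvd."
    r"[^\d\n]*,\s[A-Z]{2}\s"  # city, comma, space, state abbreviation, space
    r"\d{5}-\d{4}$",          # extended zip code, then end of line
    re.MULTILINE,             # let ^ and $ match at line boundaries
)

sample = "John Smith\n123 Main st.\nSpringfield, IL 62704-1234"
short_zip = "John Smith\n123 Main st.\nSpringfield, IL 62704"

found = ADDRESS.search(sample) is not None
rejected = ADDRESS.search(short_zip) is None

print(found, rejected)
```

The second sample is rejected because its zip code is not in the extended form, which is one of the assumptions listed earlier.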

I know that this may all seem rather daunting, especially if you've never really worked with regular expressions, but the best advice I can give is what I said before: start trying to write your own regular expressions. Sure, you'll make mistakes at first; heck, I still make mistakes when I write regular expressions. But, at least you can learn from your mistakes.

Well, that pretty much does it for tonight's post. Have fun writing regular expressions. See you next time.

2009-07-23

Vim: Visual Modes

I realized that in my last post, I mentioned the Visual and Visual Line modes in Vim without actually mentioning what you would use these for. The visual modes are used for selecting text and doing things like cutting and copying text, or in vim-speak deleting and yanking text, and for running commands on the selected text. Once you change to visual mode, moving the cursor around will automatically select text. Visual line mode works much like visual mode, only you can only select entire lines.

Cutting and copying in visual modes:
  • y - yanks (copies) the selected text

  • d - deletes (cuts) the selected text

Cutting, copying, and pasting in normal mode:
  • :y or yy or Y - yanks the current line

  • :ny - yanks line number n

  • :n,my - yanks line numbers n through m, inclusive

  • :d or dd - deletes the current line

  • D - deletes from the current cursor position to the end of the line

  • :nd - deletes line number n

  • :n,md - deletes from line number n through m, inclusive

  • p - puts (pastes) the most recently yanked or deleted text (Vim keeps it in a register, which plays the role of a clipboard) starting to the right of the cursor, or below the cursor if putting entire lines of text

  • P - puts the same text starting to the left of the cursor, or above the cursor if putting entire lines of text

Another important thing when you are editing anything is to know how to undo and redo changes made. This is really easy to do in Vim:
  • u - undo the most recent change

  • U - undo all of the most recent changes to the current line

  • Ctrl + r - redo the last undone change

That pretty much does it for this post. See you next time.

2009-07-19

Vim: an Introduction

If you saw my favorite Linux applications post, then you'll know that Vim is my text editor of choice. Vim is a very powerful text editor, but it does come with a bit of a learning curve. I figured I would present an introduction to Vim that should help in getting to the point where using Vim is not a challenge.

I suppose the first thing that we should do is cover how to open a file in Vim:

$ vim filename

This will open the file filename; if the file does not exist, Vim will create it once you save. Probably the first thing everyone notices when they first open up Vim is that they cannot immediately enter text into the file. The reason for this is that Vim is a modal editor with the following modes: Insert, Replace, Visual, Visual Line, and Normal. Vim opens by default in Normal mode, which is where you can issue commands to the program.

Another common problem that people face when they are first learning to use Vim is that they have a hard time keeping track of which mode they are in. The first thing I will point out is that at the bottom of the Vim window, it does tell you what mode you're in; it will display the following:
  • -- INSERT --: When in Insert Mode

  • -- REPLACE --: When in Replace Mode

  • -- VISUAL --: When in Visual Mode

  • -- VISUAL LINE --: when in Visual Line Mode

  • Or it will be blank when in Normal Mode

Part of the reason that people have a hard time keeping track of what mode they are in is that once they open up Vim, they set it to Insert Mode and leave it there. If you only set Vim to Insert mode when you are actually inserting text into the file and leave it in Normal mode otherwise, this will cut down on the confusion and encourage you to learn Vim commands, rather than just using Vim like Notepad.

OK, enough of my introductory rant. Let's get down to the business at hand and start talking about how to use Vim:

  • Changing Modes
    Before we start discussing the commands and how to use the various modes, it may be a good idea to know how to change between them so that you don't get yourself stuck somewhere in unfamiliar territory. The first thing I will point out is that from Normal mode, you can get to any of the other modes, but from any other mode, you have to return to Normal mode before you can change to a different mode. Here is a list of hot-keys that you can use to change modes:
    • i - Puts you into Insert Mode where the cursor currently sits

    • I - Puts you into Insert Mode at the beginning of the current line

    • a - Puts you into Insert Mode one character after where the cursor currently sits

    • A - Puts you into Insert Mode at the end of the current line

    • s - Puts you into Insert Mode and deletes the character immediately under the cursor

    • S - Puts you into Insert Mode and deletes the current line

    • r - Puts you into Replace Mode for only one character where the cursor currently sits

    • R - Puts you into Replace Mode and keeps you there where the cursor currently sits

    • v - Puts you into Visual Mode where the cursor currently sits

    • V - Puts you into Visual Line mode on the current line

    • Esc - Returns you back to Normal Mode



  • Moving around in Normal Mode
    The first thing I will say about Normal Mode is that there are what I call hot-keys and commands. Hot-keys are keys that do something immediately when pressed. Commands start with either the : character for normal commands or the / character for search commands. Now I will run down a list of hot-keys and commands for moving around in Normal Mode in Vim:
    • h - Moves the cursor to the left one character

    • j - Moves the cursor down one line

    • k - Moves the cursor up one line

    • l - Moves the cursor to the right one character

    • w - Moves the cursor to the right one word

    • b - Moves the cursor to the left one word

    • $ - Moves the cursor to the end of the line

    • ^ - Moves the cursor to the beginning of the line (specifically, to the first non-blank character)

    • gg - Moves the cursor to the first line of the file

    • G - Moves the cursor to the last line of the file

    • :n - Moves the cursor to line number n


  • Searching for and Replacing Text
    I shouldn't even have to explain how important it is to be able to do find and replace commands inside a text editor. I use them all the time and they make things go a whole lot faster. Searching for text in Vim is very easy:

    /text

    Typing the above while in Normal Mode will move the cursor to the next appearance of text after the cursor's current position. To move to the next appearance, simply type n; to move to the previous appearance, type N.

    Replacing text is a little bit more complicated, but here is the basic syntax:

    :[region]s/[text to find]/[text to replace with]/[options]


    This may look a little daunting, but let's just break it down one piece at a time:
    • [region]
      The region tells the command what section of the file to do the find and replace on. You can use a number to do the replace only on one line, you can use two numbers separated by a comma to replace between two lines inclusive, or you can use the % character to replace throughout the entire document. Here are three examples:

      :5s/find/replace/
      :5,10s/find/replace/
      :%s/find/replace/

      The first example will substitute the first instance of find with replace on line number 5. The second example will substitute the first instance of find with replace on each line between line numbers 5 and 10, inclusive. The third example will substitute the first instance of find with replace on each line in the entire file.

    • [text to find] and [text to replace with]
      These two sections are where the bulk of the work for the substitute command is done. This is where you identify the text that you want to replace and what you want it to be replaced with. The real power of these areas comes when you start using regular expressions, which I will cover in a future post.

    • [options]
      The options are where you tell the substitute command how to behave. If you do not specify any options, then it only performs the substitution on the first match on each line in the specified region. Here is a list of the options available:
      • g - This option will perform the substitution on all matches in the specified region

      • c - This option will prompt you for confirmation before making substitutions

      • i - This option will ignore case when looking for matches

      The above options can be combined. For example, if you wanted to do a substitute on all matches, ignoring case, you could do:

      :%s/find/replace/gi
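
To make the region and options a little more concrete, here is a rough Python model of what the substitute command does to a list of lines. This is only a sketch of the behavior described above, not how Vim implements it: the substitute function and its parameters are my own invention, the c option is left out, and case sensitivity in real Vim can also depend on your settings.

```python
import re

def substitute(lines, first, last, pattern, repl, options=""):
    """Rough model of :[first],[last]s/pattern/repl/[options] (hypothetical helper)."""
    flags = re.IGNORECASE if "i" in options else 0
    # Without g, only the first match on each line is replaced;
    # count=0 tells re.sub to replace every match.
    count = 0 if "g" in options else 1
    out = list(lines)
    for idx in range(first - 1, last):  # line numbers are 1-based and inclusive
        out[idx] = re.sub(pattern, repl, out[idx], count=count, flags=flags)
    return out

lines = ["find find", "Find find", "nothing here"]

first_only = substitute(lines, 1, 3, "find", "replace")
all_ignoring_case = substitute(lines, 1, 3, "find", "replace", "gi")
line_two_only = substitute(lines, 2, 2, "find", "replace", "gi")

print(first_only)
print(all_ignoring_case)
print(line_two_only)
```

With no options, only the first case-sensitive match on each line changes; with gi, every match on every line in the region changes regardless of case.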


    • Working with files
      To really be able to use Vim effectively, you're going to have to know how to open and save files as well as exit the program when you are done. Here is a list of commands for handling these actions:
      • :e [filename] - This will open the file [filename] in Vim

      • :w - This will save the open file

      • :sav [filename] - This will save the open file as [filename]

      • :q - This will exit Vim

      • :q! - This will exit Vim, discarding any changes since the last save

      • :wq - This will save the open file and exit Vim

      • :x - This behaves much like :wq, it will save the open file and exit Vim, but will not save if no changes have been made

    Well, that does it for a pretty basic introduction to Vim. See you next time.