DIY Homophone Checker in Microsoft Word

Homophones marked with an index tag (XE)

I am a very poor speller, but I write a lot. I often think of myself not so much as a writer as a manufacturer of typos. I have found word processors to be both a blessing and a curse on this front. I can’t imagine writing a book-length work without one. And yet their built-in spellcheckers often silently correct mistyped words. Many of these corrections, although correctly spelled, are not the correct word in context. That is, they are homophone errors: words that are spelled correctly but do not mean what I intended. To make matters worse, these words often resemble the correct word, and in some cases I can’t tell them apart without scrutinizing them carefully. In trying to help me, word processors have actually compounded the problem by turning my jumbled typing into errors that are invisible to me. Can’t live with them, can’t live without them….

I’ve become aware that among writers and editors there are two types of people: those who can naturally spell and those who cannot. The natural spellers can see these errors. Each error is like jabbing them in the eye with an unfolded paper clip. They will say things like, “How can you be so careless as to write something full of words that jab me in the eye with a tiny, sharp object? Don’t you feel the pain?” Actually, no. I don’t. They don’t bother me. It’s not that I ignore them; I can’t see them. But I do not want people reading my books to suddenly feel like I’ve jabbed them in the eye with an unfolded paper clip.

The solution to this is easy yet elusive. I can pay a copy editor who has this ability to be wounded by homophone errors to identify them. It is elusive because I have no idea whether they have caught the errors. I can’t tell. It is expensive because a good copy editor earns about 21 US dollars an hour and can check at most about 1,500 words in that time. (Of course, a good editor comes with all kinds of abilities beyond catching homophone errors, but for me that is the biggest deal. Fact checking, name checking, and just plain re-parsing weird phrasings are all great, but expensive when you consider that I typically earn zero dollars on a published short story.)
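To put a number on "expensive," here is a back-of-the-envelope sketch in Python using the two figures above (about 21 dollars an hour, at most about 1,500 words an hour). The 5,000-word story length is my own hypothetical, not a figure from any real invoice.

```python
HOURLY_RATE_USD = 21    # approximate copy editor rate quoted above
WORDS_PER_HOUR = 1500   # approximate upper bound on words checked per hour


def copyedit_cost(word_count):
    """Estimate the copyediting cost in US dollars for a manuscript."""
    return word_count * HOURLY_RATE_USD / WORDS_PER_HOUR


if __name__ == "__main__":
    story_words = 5000  # hypothetical short story length
    print(f"{story_words} words: about ${copyedit_cost(story_words):.2f}")
```

At these rates every 1,000 words costs about 14 dollars, which adds up quickly against zero dollars of income.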

Homophone checking would seem to be something that computers would be good at. It is based on a defined word list, and natural language processing should be able to determine whether a word has been used correctly. And yet most grammar checkers and spellcheckers, even those that promise homophone checking such as Grammatik (now part of WordPerfect), do a lousy job of checking them, except for the most common ones such as “their/there/they’re” and “its/it’s.”

In fact, the standard spellchecker or autocorrector tends to conceal homophone errors.

There are some commercial, dedicated homophone checkers on the market targeted toward dyslexics. I used a macro called RWord that was compatible with MS Word on the PC. It merely flagged all of the possible homophone errors in a document, and I could not figure out how to adjust its list, so words such as “the” and “I” were always flagged. Then, when I upgraded to Word 2007, it stopped working. It never worked in Mac Office.

I used ClaroRead from Claro Software, and it did the same thing as RWord but let me adjust the word list. However, the software costs more than 180 dollars. It included some other gewgaws, such as reading the text out loud, that have been built into the Mac and PC for years now.

I realized I could use a list of homophones and flag their occurrence in a Microsoft Word file using Word’s concordance feature. I could then evaluate each possible homophone in context and quickly look up words in the dictionary to check the correct use.
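The idea itself, flagging every word that appears on a homophone list and then reviewing it in context, can be sketched outside of Word in a few lines of Python. This is only an illustration under my own assumptions: the tiny word set is a hypothetical sample rather than the real list, the function name is made up, and unlike Word's concordance matching this sketch ignores case.

```python
import re

# Hypothetical sample; the real list is far longer.
HOMOPHONES = {"their", "there", "they're", "its", "it's", "affect", "effect"}


def flag_homophones(text, homophones=HOMOPHONES, context=20):
    """Return (offset, word, snippet) for each word found in the homophone set."""
    hits = []
    # Match words, allowing straight or curly apostrophes inside them.
    for match in re.finditer(r"[A-Za-z]+(?:['’][A-Za-z]+)*", text):
        word = match.group(0)
        normalized = word.lower().replace("’", "'")
        if normalized in homophones:
            start = max(0, match.start() - context)
            end = min(len(text), match.end() + context)
            hits.append((match.start(), word, text[start:end]))
    return hits


if __name__ == "__main__":
    sample = "Their dog wagged its tail over there."
    for offset, word, snippet in flag_homophones(sample):
        print(f"{offset:>4}  {word:<8} ...{snippet}...")
```

The snippet around each hit plays the same role as reading the sentence around an XE tag: the program only points, and the human decides.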

This post provides a link to a concordance-ready Word file and the steps you can follow to check possible homophone errors in your own document.

DIY Homophone Checker in MS Word

Prerequisite
You will need a copy of Microsoft Word 2003 or later. These instructions are written for Word 2008 (Mac) and Word 2007 (PC), but should work with Mac Office X (2000) and 2004, and on the PC with MS Word 2000 and later.

The archive americanhomophones.zip (which you can access here) contains two files: Americanhomophones.txt and Homophone_Concordance.rtf. The text file (.txt) is a text-only version of the list in case you would like to create your own concordance file or use the information in some other way. The list comes from The Handbook of Homophones by William Cameron Townsend (1975), as updated by Evan Antworth at the Summer Institute of Linguistics. The only modification I made to the list was to place each homophone string on its own line in the Word file.

A note about Word and Concordances
In Microsoft Word, a concordance file is a document that contains only a Word table. The table has two columns: the first column contains a word or phrase used to mark index entries, and the second column contains the index entry. Typically, the concordance file is used to auto-generate an index. Word reads the file and marks each matching entry in the source document with a field using the tag XE. In this case, we are going to use the list to mark possible homophone errors.
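If you want to build your own concordance table from a word list, the two-column layout described above is easy to generate. The Python sketch below rests on assumptions of mine: that each homophone group is written as one comma-separated line (the actual layout of Americanhomophones.txt may differ), and that it is useful to put the whole group in the second column so the XE tag shows the alternatives. The function names are hypothetical. The output is tab-separated text, which you can paste into Word and turn into a table with Convert Text to Table.

```python
def parse_group(line):
    """Split one line into words (assumed format: comma-separated group)."""
    return [word.strip() for word in line.split(",") if word.strip()]


def concordance_rows(groups):
    """Build (search text, index entry) pairs, one row per homophone."""
    rows = []
    for group in groups:
        entry = " / ".join(group)  # second column: the whole group
        for word in group:
            rows.append((word, entry))  # first column: the word to mark
    return rows


def to_tsv(rows):
    """Render the rows as tab-separated lines, ready to paste into Word."""
    return "\n".join(f"{word}\t{entry}" for word, entry in rows)


if __name__ == "__main__":
    lines = ["affect, effect", "their, there, they're"]
    print(to_tsv(concordance_rows(parse_group(line) for line in lines)))
```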

To check homophones in Word 2008 (MAC)

  1. Download the archive americanhomophones.zip, and unpack the archive.
    The archive contains two files: Americanhomophones.txt and Homophone_Concordance.rtf.
    The rich text format file (.rtf) is a Microsoft Word concordance file created from the list. You will need to use the RTF file in step 4.
  2. Open the document you would like to check in MS Word, and use Save As to save a copy under a new name, for example, “documentname_homo.docx.”
  3. Select Insert from the MS Word Menu in the document you would like to check. Select Index and Tables, and then Automark. The Choose a File dialogue box will open.
  4. Navigate to your homophone concordance list, select the file and click Open. Word will mark each possible homophone error with an index tag.
  5. In the document you are checking, click Show Invisible (the reverse P) to make sure that invisible codes are revealed. Each possible homophone error will be followed by an XE tag.
  6. Sweep through the document to check each occurrence. As you accept them, you can remove the tag.
  7. When you have questions about a word, you can check its meaning. To quickly look up a word in MS Word:
    a. Select the word.
    b. Right-click, select Look Up, and then select Definition.
    c. Word’s built-in dictionary will open to show the word’s part of speech and definition.

To check homophones in Word 2007 (PC)

  1. Download the archive americanhomophones.zip, and unpack the archive. The archive contains two files: Americanhomophones.txt and Homophone_Concordance.rtf.
    The rich text format file (.rtf) is a Microsoft Word concordance file created from the list. You will need to use the RTF file in step 4.
  2. Open the document you would like to check in MS Word, and use Save As to save a copy under a new name, for example, “documentname_homo.docx.”
  3. Click the References tab. From the ribbon, select Insert Index, and then click Automark. The Open Index Automark dialogue box will open. The file I provided is an RTF file and will not be visible by default, so select All Word Documents in the File type box.
  4. Navigate to your homophone concordance list, select the file and click Open.
    Word will mark each possible homophone error with an index tag.
  5. In the document you are checking, click Show Invisible (the reverse P in the ribbon on the Home tab) to make sure that invisible codes are revealed. Each possible homophone error will be followed by an XE tag.
  6. Sweep through the document to check each occurrence. As you accept them, you can remove the tags.
  7. When you have questions about a word, you can check its meaning. To quickly look up a word in MS Word:
    a. Select the word.
    b. Right-click and select Look Up.
    c. Word’s built-in dictionary will open to show the word’s part of speech and definition.

Notes about using Word concordance for checking homophone errors

You can create a list of the flagged items by creating an index using the XE Tags. Word will generate a list of words and their page numbers, just as it would with an index.
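For plain text outside of Word, a rough analogue of that generated index is a mapping from each flagged word to the places it occurs. The Python sketch below uses line numbers in place of page numbers; the function name and the sample text are hypothetical, and matching here is case-insensitive, which is a simplification of mine.

```python
import re
from collections import defaultdict


def build_index(text, homophones):
    """Map each flagged word (lowercased) to the line numbers where it occurs."""
    index = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z']+", line):
            key = word.lower()
            # List each line once per word, as an index lists each page once.
            if key in homophones and line_no not in index[key]:
                index[key].append(line_no)
    return dict(sorted(index.items()))


if __name__ == "__main__":
    text = "Their dog ran.\nIt sat over there.\nThere it stayed."
    for word, lines in build_index(text, {"their", "there"}).items():
        print(f"{word}: {', '.join(str(n) for n in lines)}")
```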

You can easily add or remove items from the homophone concordance list by editing the list. The list I provided is from Evan Antworth’s update of The Handbook of Homophones by William Cameron Townsend. For my own use, I needed to augment this list with common “sounds similar” errors and remove errors that I’m not likely to make. For instance, I don’t often misuse the pronoun “I.”

Note that the list is case sensitive, so you may want to include both capitalized and lowercase entries.
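Since matching is case sensitive, one way to avoid typing every entry twice is to generate the capitalized variants from the lowercase list. This is a minimal Python sketch; the function name is hypothetical, and it only adds a first-letter-capitalized form, leaving all-caps text uncovered.

```python
def with_case_variants(words):
    """Return each word followed by its capitalized form, without duplicates."""
    seen = set()
    result = []
    for word in words:
        # Capitalize only the first letter so "they're" becomes "They're".
        for variant in (word, word[:1].upper() + word[1:]):
            if variant not in seen:
                seen.add(variant)
                result.append(variant)
    return result


if __name__ == "__main__":
    for entry in with_case_variants(["there", "its", "they're"]):
        print(entry)
```

Words that are already capitalized, such as “I,” pass through once, so the list does not grow with duplicates.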

You can also use problem phrases rather than just single words.

Looking up each error may be time-consuming. MS Word’s built-in dictionary is also not the greatest dictionary on the planet.

I use Merriam-Webster. Bartleby.com also has some great online, searchable references that include discussions of common errors.

Finally, tagging each potential error with an index entry is kludgy, but it does the trick. I can easily imagine an improvement: a dialogue box, as in some of the commercially available programs, that would let you assess each flagged word, make changes, and consult notes from a reference guide and dictionary about the difference between terms. In addition, you could tune the program to ignore words that are not a problem, or to add words that are. An even more sophisticated program would use natural language processing to filter out easily resolved cases based on the grammatical slot occupied by the word. This wouldn’t work for homophones that occur as the same part of speech, but it could, for instance, easily dispatch affect and effect.
