Corruption in MS Word Document

Issue

When opening, closing or saving a document, you are warned that the document may be corrupt.

Solution

Being prompted with a message that the document that you're trying to open may be corrupt is one of the worst experiences that you can have when working with any software application. Corrupted documents can cause any application to exhibit unusual behavior if you are able to open the document. Such behavior occurs because the application attempts to make decisions about what to do next based on incorrect information in the corrupted document.

Ways to avoid corruption

  1. Turn off Word Fast Save, whether you are working in or out of RoboHelp. To reach this setting, open the Word drop-down menu TOOLS | OPTIONS and click the "Save" tab.

    Fast Save, a feature of Word since version 6.0, appends newly saved information to the file instead of rewriting the whole document. The document can bloat until it finally corrupts.
  2. Refrain from embedding objects unless absolutely necessary. To check for embedded objects, turn on field codes, and insert the elements into the document without embedding wherever possible.
  3. Always work in True Code mode. RoboHelp simply embeds buttons when displaying the document in Dynamic WYSIWYG mode. If you are creating a highly graphical document, sooner or later, that one last embedded object will corrupt the file, and you will no longer be able to save the file. Use Active Test instead to test your topics.
  4. Per Microsoft, avoid frequently "round-tripping" a document from one document type to another or from one Word format to another program's document format.
  5. Keep documents shorter than several hundred pages where possible. If a document corrupts completely, the smaller it is, the better: you do not want to have to recreate a 1000-page document. You can have as many documents in a RoboHelp project as you need, so limit each document to 100-150 pages or fewer when possible.
  6. Avoid storing, accessing, and saving a file from diskette. A diskette is useful for transporting files, but once you are ready to work on the file again, be sure to save the file to your computer's hard drive.

Identifying a Corrupted Document

Corrupted documents often exhibit behavior that is not part of the program's design (for example, infinite repagination, incorrect document layout and formatting, unreadable characters on the screen, error messages during processing, system hangs or crashes when you load or view the file, or any other unusual behavior that cannot be attributed to the normal operation of the program). These behaviors can be caused by factors other than document corruption.

To rule out other factors, use the following troubleshooting steps:

  1. Check for similar behavior in other documents.
  2. Check for similar behavior in other programs.
  3. Take the file in question to another computer and attempt to duplicate the behavior.
  4. Use a different printer driver and attempt to duplicate the behavior.
  5. Rename any templates attached to the document and attempt to duplicate the behavior.
  6. Change other system components (such as video drivers or fonts) and attempt to duplicate the behavior. For example, if you are using an OEM version of a video driver, switch to a Microsoft Windows video driver using the Windows Setup program.
  7. Disable any third-party programs that are running (such as terminate-and-stay-resident programs [TSRs], font managers, screen savers, and system shells), then attempt to duplicate the behavior.

If the problem occurs only with a single document after performing the steps above, your document has probably been corrupted.
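As a supplementary check that is not part of the steps above, you can look at the first bytes of the suspect file. The sketch below assumes that binary Word .doc files are stored in an OLE compound file, which begins with a fixed eight-byte signature, and that RTF files begin with `{\rtf`. The function name `classify_header` and its return labels are hypothetical. A file that passes this check can still be corrupted elsewhere, and very old formats such as Word 2.x are not expected to match.

```python
# First eight bytes of an OLE compound file, the container assumed here
# for binary Word .doc files.
OLE_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def classify_header(path: str) -> str:
    """Return a rough label for a file based only on its first bytes.

    Hypothetical helper: the labels are illustrative and say nothing
    about damage deeper in the file.
    """
    with open(path, "rb") as handle:
        head = handle.read(8)
    if len(head) == 0:
        return "empty"
    if head == OLE_SIGNATURE:
        return "ole-compound"
    if head.startswith(b"{\\rtf"):
        return "rtf"
    return "unrecognized"
```

An "unrecognized" or "empty" result for a file that should be a .doc suggests that the start of the file itself is damaged.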

There are several techniques you can use to try to correct a corrupted document.

Which method you use depends on the nature and severity of the corruption and the nature of the behavior exhibited. Although many of the methods that follow succeed regularly, not every corrupted document can be recovered. A backup copy of the document is the best way to recover a corrupted document.

Note: There is never any reason to create a new RH project to handle document corruption. An RH project does not in any way create or contribute to Word document corruption.

When following the instructions below, do not save until a step explicitly asks you to save. If you do, you might see broken links or duplicate files. Make a copy of your project before attempting any of the suggestions below. Suggestions offered are in the order of least radical to most radical. Proceed at your own risk.
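The copy of the project can be made by hand in Windows Explorer. If you prefer to script it, the following is a minimal sketch that assumes the whole project lives in a single folder. The function name `backup_project` and the timestamped folder naming are hypothetical choices, not RoboHelp features.

```python
import shutil
import time
from pathlib import Path


def backup_project(project_dir: str, backup_root: str) -> Path:
    """Copy the whole project folder to a new, timestamped folder.

    Hypothetical helper: assumes the project is one self-contained folder.
    """
    source = Path(project_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"Project folder not found: {source}")

    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-backup-{stamp}"

    # Refuse to place the backup inside the project it is copying.
    if source.resolve() in target.resolve().parents:
        raise ValueError("Backup folder must be outside the project folder")

    # copytree refuses to overwrite an existing folder, so an earlier
    # backup is never silently replaced.
    shutil.copytree(source, target)
    return target
```

Run it once before Phase One, and work only on the original or only on the copy so that you always have one untouched set of files.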

Phase One (Complete this phase only if you are working in WYSIWYG mode)

  1. Assuming that you can save the document, convert to True Code if possible. Either way, follow the procedure below until you can first save the document, and then until you can convert to True Code mode.
  2. If you cannot save or convert the document (in that order), turn on field codes and begin removing embedded objects a few at a time. Alternatively, start by removing authorable buttons, leaving yourself a note in color where each button belongs that describes what the button does.
  3. Repeat until you can successfully convert to True Code. At some point, you will have removed enough embedded objects to accomplish this conversion.
  4. Save and compile.
  5. Recreate lost buttons, embedded objects, etc., but do so in True Code mode.
  6. Compile.

Phase Two

  1. Round-trip the file to .rtf and back to .doc.
  2. Save and compile.

Phase Three

  1. Open the suspect document and copy everything except the last paragraph to a new Word document outside of RH.
  2. Close and delete the suspect document (you are working on a copy of the entire RH project, remember?).
  3. Create a new document in the RH project with the same name as the one you just deleted.
  4. Copy and paste everything from the document you created outside RH into the new document in RH.
  5. Save and compile.

Phase Four

  1. Repeat Phase Three, but instead of copying the entire document, copy it in pieces. Do not copy any suspect parts of the document, such as complex tables or embedded objects (which you should have removed in Phase One).
  2. Save and compile.
  3. From this point forward, everything remaining in the normal Word document is suspect.
  4. Return suspect parts of the document a few at a time (without any paragraph markers) to the new .doc file, and save and compile until you run into a problem.
  5. Repeat step 4 until you have successfully restored the document.
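The logic of Phase Four can be sketched as a small loop: add suspect parts back a few at a time, and when a batch causes a problem, retry its parts one by one to isolate the culprit. This is only an illustration of the reasoning. The names `restore_in_batches` and `builds_cleanly` are hypothetical, and `builds_cleanly` stands in for the manual "save and compile" check that you perform yourself in Word and RoboHelp.

```python
from typing import Callable, List, Sequence, Tuple


def restore_in_batches(
    suspects: Sequence[str],
    builds_cleanly: Callable[[List[str]], bool],
    batch_size: int = 3,
) -> Tuple[List[str], List[str]]:
    """Return (restored, rejected) after adding suspects back in batches.

    `builds_cleanly` is a stand-in for saving and compiling by hand; it
    receives every part currently in the new document.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    restored: List[str] = []
    rejected: List[str] = []
    for start in range(0, len(suspects), batch_size):
        batch = list(suspects[start:start + batch_size])
        if builds_cleanly(restored + batch):
            restored.extend(batch)
            continue
        # The batch failed: retry its parts one at a time.
        for part in batch:
            if builds_cleanly(restored + [part]):
                restored.append(part)
            else:
                rejected.append(part)
    return restored, rejected
```

Anything that ends up in the rejected list is a candidate for recreating by hand rather than copying, as Phase Five describes.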

Phase Five

  1. Copy from the normal Word document to the RH document one paragraph at a time, without copying any paragraph markers, skipping any suspect parts of the document.
  2. Recreate (do not copy) suspect portions of the document in the new RH document.
  3. Save and compile.

Phase Six

  1. Copy everything from the corrupt document into a text editor, and save as corruptedfile.txt.
  2. Insert the text into your new RH document.
  3. Reformat.
  4. Relink.
  5. Recompile.

Recover Text From Any File

If a Word document will not open and you need to at least get a copy of the text within the document, you can choose File > Open, and in the Files of type drop-down list, select Recover Text From Any File. Using the Recover Text From Any File converter does have its limitations. Document formatting will be lost, along with anything that is not of a text nature. Graphics, fields, drawing objects, and so on, will not be converted. However, headers, footers, footnotes, endnotes, and field text, will be retained as simple text.

NOTE: If the Recover Text From Any File converter is not installed, you will need to re-run Setup to install this converter.

NOTE: When you change the Files of type box to Recover Text From Any File in the Open dialog box (on the File menu, click Open), Word will 'remember' this setting and will use it the next time you open a document. To avoid this problem, reset the Files of type box back to Word Document (*.doc) after you have recovered the document.
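If the converter is unavailable, a rough approximation of the same idea is to scan the raw bytes of the file for runs of readable characters. The sketch below is not Word's converter and makes simplifying assumptions: it only recognizes printable ASCII characters, stored either one byte per character or as UTF-16LE (each character followed by a zero byte), and it ignores anything else. The function name `extract_text_runs` and the `min_len` parameter are hypothetical.

```python
import re
from typing import List


def extract_text_runs(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of at least `min_len` printable ASCII characters.

    Rough salvage only: formatting, graphics, and non-ASCII text are lost.
    """
    count = str(min_len).encode("ascii")
    # Two-byte form: a printable character followed by a zero byte.
    wide = rb"(?P<wide>(?:[\x20-\x7e]\x00){" + count + rb",})"
    # One-byte form: plain printable characters.
    narrow = rb"(?P<narrow>[\x20-\x7e]{" + count + rb",})"
    pattern = re.compile(wide + rb"|" + narrow)

    runs: List[str] = []
    for match in pattern.finditer(data):
        if match.lastgroup == "wide":
            runs.append(match.group().decode("utf-16-le"))
        else:
            runs.append(match.group().decode("ascii"))
    return runs
```

To use it, read a copy of the damaged file in binary mode, pass the bytes to the function, and paste the resulting lines into a new document. Expect stray fragments such as font or style names mixed in with the text.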

You can attempt to recover a document that will no longer open by linking a good file to a blank document and then changing the link source to the damaged document.

Use the following steps to link and change the link to the damaged file:

  1. Create a new document.
  2. Type "This is a Test."
  3. Save the document.
  4. From the Edit menu, choose Select All.
  5. From the Edit menu, choose Copy.
  6. From the File menu, choose New.
  7. From the Edit menu, choose Paste Special.
  8. Select either Unformatted or Formatted text, and click Paste Link.
  9. From the Edit menu, choose Links.
  10. The Links dialog box appears. Select the filename of the first linked document and click Change Source (in Word 2.x, the button is Change Link).
  11. The Open dialog box appears and asks which document you want to change the link to. Select the document you can no longer open and click Open.
  12. Click OK in the Links dialog box (in Word 2.x, enter the path and filename).
  13. The data/text from the damaged document will appear (provided there was any recoverable data/text). On the Edit menu, click Links, and select Break Links (Cancel Links in Word 2.x).
  14. You can now reformat and save the recovered text. This method works because during a link, part of the header information is not read. This allows you to open the file if this part of the header is the damaged area of the document.