Microsoft Exchange and Windows 2000 Server contain many mechanisms to prevent data corruption, including the NTFS file system and the various safeguards against errors within the Exchange database itself. Unfortunately, data errors can occur even in the best of circumstances, so it helps to know in advance what kinds of hard data errors can appear.
There are three major types of damaged-database errors in Exchange, all of which are written to the system Application log:
- -1018 (JET_errReadVerifyFailure): This happens when Exchange tries to read a page out of the database and either gets the wrong page or gets a page with an incorrect checksum. The most common cause of a -1018 error is a transient problem with hardware, firmware or drivers. I stress the word transient because not all problems with those three things are permanent or immediately noticeable; sometimes a driver or hardware fault appears only under the kind of sustained, long-running load that an Exchange server experiences.
If there are repeated -1018 errors on the same Exchange machine, the machine itself should have its hardware scrutinized or even swapped out. A single, isolated -1018 error is not always a big problem, although it should be taken as a warning sign that other problems can develop.
- -1019 (JET_errPageNotInitialized): You get a -1019 error when a page that Exchange expects to be in use is empty. -1019 errors are usually caused by hard problems in the file system, such as cross-linked clusters. Because such problems are uncommon, -1019 errors are extremely rare.
- -1022 (JET_errDiskIO): A -1022 is logged when Exchange cannot access a specific page in the database because of a generic I/O problem. This often happens if the database file itself was truncated, so the first suspect is usually the file system. It can also happen if another application holds exclusive locks on the Exchange database; if you are running a third-party program that works with the Exchange database, it may be the reason for this error.
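To make the difference between the three conditions concrete, here is a minimal Python sketch of a toy page reader. It is not how ESE actually stores or verifies pages: the 8-byte header layout, the use of CRC-32 as the checksum, and the names make_page and verify_page are all assumptions made for illustration. Only the three error codes come from the list above.

```python
import struct
import zlib

PAGE_SIZE = 4096                 # toy page size for this sketch
HEADER = struct.Struct("<II")    # hypothetical header: checksum, page number

JET_errReadVerifyFailure = -1018
JET_errPageNotInitialized = -1019
JET_errDiskIO = -1022


def make_page(page_number: int, payload: bytes) -> bytes:
    """Build a toy page: header followed by a zero-padded payload."""
    body = payload.ljust(PAGE_SIZE - HEADER.size, b"\x00")
    checksum = zlib.crc32(struct.pack("<I", page_number) + body)
    return HEADER.pack(checksum, page_number) + body


def verify_page(raw: bytes, requested: int) -> int:
    """Return 0 for a good page, otherwise the matching error code."""
    if len(raw) < PAGE_SIZE:
        # Short read, for example because the file was truncated.
        return JET_errDiskIO
    if not any(raw):
        # A page expected to be in use is entirely empty.
        return JET_errPageNotInitialized
    stored_checksum, stored_number = HEADER.unpack_from(raw)
    body = raw[HEADER.size:]
    if stored_number != requested:
        # The read returned a different page than the one asked for.
        return JET_errReadVerifyFailure
    if zlib.crc32(struct.pack("<I", stored_number) + body) != stored_checksum:
        # Right page, but its contents do not match the stored checksum.
        return JET_errReadVerifyFailure
    return 0
```

In this toy model, asking for page 3107 and getting back page 3106, or flipping a single byte of a stored page, both surface as -1018, which mirrors the "wrong page or bad checksum" description above.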
A typical -1018 entry in the Application log looks like this:
MSExchangeIS (248) Synchronous read page checksum error -1018 ((1:3106 1:3106)(0-310013)(0-312215)) occurred. Please restore the databases from a previous backup.
The (1:3106 1:3106) portion identifies the database number and page number where the error took place. The first pair of numbers is the page that was asked for, and the second pair is the page that was returned. If you want to see the data on the page, you can output it to a text file using the following command:
Esefile /d database.edb
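If you collect these log entries in bulk, pulling the requested and returned page numbers out by hand gets tedious. The following is a minimal Python sketch that extracts them from a message shaped like the sample entry above. The function name parse_page_error and the regular expression are assumptions for illustration, written only against that one sample; other Exchange versions may word the message differently.

```python
import re

# Matches "... error -1018 ((1:3106 1:3106)..." as in the sample entry.
EVENT_RE = re.compile(
    r"error\s+(?P<code>-\d+)\s+\(\("
    r"(?P<req_db>\d+):(?P<req_page>\d+)\s+"
    r"(?P<ret_db>\d+):(?P<ret_page>\d+)\)"
)


def parse_page_error(message: str):
    """Return the error code and page pairs, or None if nothing matches."""
    m = EVENT_RE.search(message)
    if m is None:
        return None
    requested = (int(m["req_db"]), int(m["req_page"]))
    returned = (int(m["ret_db"]), int(m["ret_page"]))
    return {
        "code": int(m["code"]),
        "requested": requested,      # (database, page) that was asked for
        "returned": returned,        # (database, page) that came back
        "wrong_page": requested != returned,
    }


sample = (
    "MSExchangeIS (248) Synchronous read page checksum error -1018 "
    "((1:3106 1:3106)(0-310013)(0-312215))"
)
info = parse_page_error(sample)
print(info["code"], info["requested"], info["returned"], info["wrong_page"])
```

For the sample entry, the requested and returned pairs are identical, so the right page came back and the checksum is what failed.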
To recover from -1018 and -1019 errors, there are three basic choices:
- Restore the database from a backup copy. This should be the first line of defense against errors. If you have transaction logs available since the last backup, you can often restore the database to the point where you left off, with little or no data loss.
- Run ESEUTIL.EXE /D
The ESEUTIL.EXE /D command performs an offline defragmentation of the database. This will be helpful if the -1018 error was reported on an empty page. If the damaged page is not empty, the defragmentation will stop before finishing and report an error.
- Run ESEUTIL.EXE /P
The /P switch should be the last resort in a case like this, since it rebuilds the database and discards the damaged page entirely. There is a chance that the damaged data will not be recoverable in any form (although ESEUTIL will make a good effort to recover it all), so this should be used only if there is no backup and no transaction log to work with.
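The order of preference among the three choices can be summarized in a few lines of Python. This is only a sketch of the reasoning above, not a tool that touches a database: the function name recovery_plan and its two boolean inputs are hypothetical, and you would have to determine both answers yourself (for example, by checking your backups and dumping the damaged page).

```python
RESTORE = "Restore from backup, then replay any available transaction logs"
DEFRAG = "Run ESEUTIL.EXE /D (offline defragmentation)"
REPAIR = "Run ESEUTIL.EXE /P (last resort; may lose the damaged data)"


def recovery_plan(has_backup: bool, damaged_page_empty: bool) -> list:
    """Return the recovery options to consider, in order of preference."""
    plan = []
    if has_backup:
        # A backup is the first line of defense.
        plan.append(RESTORE)
    if damaged_page_empty:
        # Defragmentation only helps when the bad page is empty.
        plan.append(DEFRAG)
    if not has_backup:
        # Repair is reserved for when there is nothing to restore from.
        plan.append(REPAIR)
    return plan


for step in recovery_plan(has_backup=False, damaged_page_empty=True):
    print(step)
```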