----------------------------------------------------------
 Logmode Considerations and General Archiving Information
----------------------------------------------------------
 
 
 
                     By Ron Castelletto
 
 
 
                 IBM Canada Laboratory Ltd.
                       Toronto, Canada
 
 
 
                        February 1995
 
 
 
                 (c) Copyright IBM Corp 1995
 
 
 
SQL/DS is a trademark of International Business Machines Corp.
 
 
 
--------
CONTENTS
--------
 
 
 
1. LOGMODE CONSIDERATIONS AND GENERAL ARCHIVING INFORMATION
   1.1 CHECKPOINT OVERVIEW
   1.2 LOGMODES
       1.2.1 LOGMODE N
       1.2.2 LOGMODE Y
       1.2.3 LOGMODE A
       1.2.4 LOGMODE L
 
2. HISTORY AREA
 
3. DUAL LOGGING
 
4. WHAT NOT TO DO
 
5. GOOD ARCHIVING TIPS AND PROCEDURES
   5.1 ADVANTAGES OF LOGMODE L OVER LOGMODE A
   5.2 ADVANTAGES OF LOGMODE A OVER LOGMODE L
 
6. BIBLIOGRAPHY
 
 
 
-----------------------------------------------------------
1. LOGMODE CONSIDERATIONS AND GENERAL ARCHIVING INFORMATION
-----------------------------------------------------------
 
 
 
This paper presents an overview of the logmodes available
in SQL/DS, followed by a discussion of the reasons to use
each logmode and some points to consider when running in
these logmodes.  For each logmode, there is an explana-
tion of what to do in the event of a soft crash or in the
event of a hard crash.  As an introduction to the concept
of logmodes there is an overview of the checkpoint
process. A good understanding of the checkpoint process
is required to understand how recovery is performed.
Additionally, there are follow-on sections describing the
History Area and Dual Logging.  Finally there is a
section on what not to do for archive/recovery procedures
and a section on good ideas for archive/recovery.
 
 
 
-----------------------
1.1 CHECKPOINT OVERVIEW
-----------------------
 
 
 
This section describes what checkpoints are and why they
are required.  Among other information, the SQL/DS direc-
tory contains 2 bitmaps.  The shadow bitmap, which is
also known as the checkpoint version of the database,
indicates which data and index pages were written to DASD
during the most recent checkpoint.  The shadow bitmap is
not updated until the next checkpoint.  The other bitmap,
called the current bitmap, points to the up-to-date data
and index pages that have been updated since the most
recent checkpoint. When a checkpoint is taken, the
current pages (those which have changed since the most
recent checkpoint) are written to DASD.  As part of the
checkpoint process, the shadow bitmap is updated to
reflect this.  Immediately after the checkpoint the
shadow and current bitmaps are equal.  If the database
abends abnormally, some recent updates to data pages have
been logged, but the updated pages which were not yet
written to DASD have been lost.  The directory on DASD,
which contains the shadow bitmap, is used for recovery.
To recover, SQL/DS starts with the checkpoint version of
the database and applies the changes logged since the
checkpoint.
 
 
This section describes how the log is interconnected with
the checkpoint process.  As users make updates to the
database, all data required to redo or undo these updates
is written to the log.  Note: All data written to the log
is immediately written to the log disk, otherwise, log
data in memory would be lost if an abend or other failure
occurred.  Data and index pages that have been updated
are kept in the page buffers.  Page buffers are kept in
memory for as long as possible, however, if many pages
are updated and not all can be kept in the page buffers,
the oldest pages in the page buffers are written to DASD.
These changes are reflected in the current bitmap (since
the current bitmap is always up to date), but are not
reflected in the shadow bitmap (this is only updated
after a checkpoint).
 
 
When a checkpoint is taken, a checkpoint record is
written to the log.  Checkpoints are useful because they
are a starting point for recovery if there is a soft
crash (a soft crash can be a power failure, an internal
processing error or system abend).
 
 
Also note that a checkpoint does not wait for LUWs to be
committed. A checkpoint may occur in the middle of an
LUW.  This means that the database is usually not in a
consistent state when a checkpoint is taken.
 
 
 
------------
1.2 LOGMODES
------------
 
 
 
SQL/DS provides 4 logmodes. Logmodes determine how the
log is used for recovery, if archiving will be done, and
if so, what type of archiving will be done.
 
 
 
---------------
1.2.1 LOGMODE N
---------------
 
 
 
In logmode N, there is no logging of data changes.  This
logmode is available only in single user mode.  During an
LUW, the shadow bitmap contains the state of the database
as of the beginning of the LUW.  Meanwhile the current
bitmap contains the state of the database including all
updates made by current LUW.  If a rollback is done then
the current bitmap is reset to the shadow bitmap, which
brings the database back to the state it was in at the
beginning of the LUW.  If Commit Work is issued, then a
checkpoint is taken to write the changes to DASD, which
synchronizes the current and shadow bitmaps.
 
 
Logmode N is mainly used for large dataload that would
otherwise fill the log disk.  Changing to logmode N ends
the current restore set.  After switching to logmode N,
you can only restart the database in logmode Y and need
to take a database archive to establish a new restore
set. After taking this database archive you can start in
logmode A or logmode L.
 
 
If there is a soft crash, the database is restored from
DASD which contains the shadow bitmap. Since there is no
log data, the LUW that was running at the time of the
crash cannot be recovered.
 
 
If there is a hard crash, you must recover from a pre-
vious archive.  The procedure used to recover from a pre-
vious archive depends on the logmode in use prior to the
switch to logmode N; refer to one of the following
sections for appropriate recovery procedures.  However,
in all cases, any changes made in logmode N have not been
logged and must be redone entirely after restoring from
an archive.
 
 
 
---------------
1.2.2 LOGMODE Y
---------------
 
 
 
In logmode Y, all changes to the database are logged and
checkpoints are taken periodically (based on the
CHKINTVL startup parameter).  Logmode Y allows recovery
from soft crashes (power failure, software abend) but not
from hard crash (DASD failure).
 
 
If there is a soft crash, the database on DASD is the
checkpoint version of the database.  Since a checkpoint
may be taken in middle of an LUW, the database may be in
an inconsistent state. Log data is used to bring the
database to a consistent state. Refer to SQL/DS System
Administration, Release 3.4.0, pg. 186 for VM, and pg.122
for VSE.  SQL/DS uses log data to complete LUWs that were
committed between the checkpoint and the soft crash (REDO
processing).  Also, log data is used to rollback LUWs
that had been written to DASD by checkpoint, but were not
completed or were rolled back (UNDO processing).  In
logmode Y, only log data required to recover from a soft
crash is saved in the log.
 
 
If there is a hard crash (DASD failure) the only avail-
able option is to restore from a previous database
archive (if there is one).  Note: no log data can be
applied to the database archive, since log data from the
time of the database archive was not saved.  In logmode
Y, after restoring from a database archive, you do not
have all the data since the archive saved in the log, you
only have log data since the last checkpoint.
 
 
In all logmodes, after a database archive is restored
SQL/DS will try to apply any data contained in the log.
In logmode Y, this log data does not logically apply to
the restored database.  If SQL/DS tries to apply the log
data, most likely SQL/DS will encounter an internal
system error and the database will abend.  To prevent
this error and to allow the database to start up after a
database restore, you must perform a COLDLOG prior to the
database restore.  A COLDLOG will wipe out all log data
from the log, so SQL/DS will not have any log data to try
to apply to the database.
 
 
 
---------------
1.2.3 LOGMODE A
---------------
 
 
 
In logmode A, all changes to the database are logged.
Checkpoints are taken periodically (based on the CHKINTVL
startup parameter).  You must take an initial database
archive to be able to start SQL/DS in logmode A. Once in
logmode A, log data since the previous database archive
will be saved until a subsequent database archive is
taken.
 
 
When the user requests a database archive or when the log
is ARCHPCT full, a database archive is taken.  This data-
base archive establishes a new starting point for saving
log data.  Once the database archive has been taken, the
log data accumulated from the previous archive is no
longer necessary and will be eventually overwritten.  Log
data since this new archive will be saved until the next
database archive is taken.
 
 
Online database archives do not wait for LUWs to com-
plete. For an online database archive, SQL/DS takes a
pre-archive checkpoint (which may be taken in the middle
of an LUW). Once the checkpoint is complete, SQL/DS
starts taking the database archive. SQL/DS uses the
shadow bitmap to take the database archive. Recall that
the shadow bitmap is not changed until a subsequent
checkpoint is taken.  Since we do not want the shadow
bitmap to be changed while the archive is proceeding,
further checkpoints are inhibited until the archive is
completed. At the completion of the archive, a post-
archive checkpoint is taken.  While the archive is being
taken, LUWs are allowed to proceed.  Updates made while
the archive is being written are reflected in the current
bitmap.  In this manner, users may continue their proc-
essing while an online archive is taken.
 
 
If there is a soft crash while running in logmode A,
recovery is the same as for logmode Y. On DASD, we have
the checkpoint version of the database. Log data is used
to do the appropriate UNDO and REDO operations. After
UNDO/REDO, the database is in a consistent state; all
committed LUWs have been restored and all partially com-
pleted LUWs have been rolled back.
 
 
If there is a hard crash, you must recover from a data-
base archive.  If you recover from the most recent data-
base archive, you may use the current log to recover the
database to the point of failure  .  because all changes
made since that database archive have been saved in the
log disk.  After the database is restored, SQL/DS uses
the log data to redo all changes that are recorded in the
log. The database will be completely up to date with all
completed LUWs redone, and any LUWs that were partially
completed at the time of failure undone.
 
 
If there is a problem with the most recent database
archive (the tape is corrupted/unreadable, or there is a
tape missing), you may recover from a backlevel database
archive.  If you recover from a backlevel database
archive, you cannot apply any log data to the database
archive since log data is only saved since the most
recent database archive.  As mentioned earlier, SQL/DS
will always try to apply log data after a database
restore.  In this case, the log data does not logically
apply to the backlevel database archive you must perform
a COLDLOG before the restore.
 
 
 
---------------
1.2.4 LOGMODE L
---------------
 
 
 
In logmode L, all changes to the database are logged.
Checkpoints are taken periodically (based on the CHKINTVL
startup parameter) You must take an initial database
archive to start in logmode L.  This logmode is similar
to logmode A in that all log data since the most recent
archive is saved in the log.  When the user requests a
log archive or when the log reaches ARCHPCT full, the log
is archived (not the entire database). Once the log is
archived, the data on the log disk can be overwritten.
SQL/DS will not overwrite any log data since the most
recent log archive.  If a database archive is requested,
all log data currently on the log disk must first be
archived. This log data is needed in case a backlevel
database archive is restored. Once the current log is
archived SQL/DS will begin taking the database archive,
By taking the log archive before the database archive,
SQL/DS ensures that log archives are LUW consistent (ie:
LUWs cannot span across log archives).  In LOGMODE L, all
LUWs must complete before a log archive can be taken.
Therefore, at the time of a log archive, the database is
in a consistent state.  Since database archives are
always preceded by a log archive, database archives in
logmode L are also in a consistent state.
 
 
If there is a soft crash, recovery is the same as in
logmode Y and logmode A.
 
 
If there is a hard crash, you can restore from the most
recent database archive.  At the time of the hard crash
there will be data on the current log that has not been
archived.  To recover completely, SQL/DS must restore the
most recent database archive, then apply all log archives
taken since the database archive and then apply the
current log.  After the database archive is restored,
SQL/DS will archive the current log.  This is because any
log archives must be copied to the log disk to be applied
and if the current log isn't archived before this is
done, it will be lost.  The archive of the current log
becomes the last log archive in the restore set.  After
the current log is archived, the operator is prompted to
restore the first log archive taken after the database
archive.  When this log archive has been applied, the
next log archive is applied.  This process continues
until all log archives, including the archive of the
current log, have been applied. At this point the data-
base is in exactly the state it was in when the crash
occurred.
 
 
If there is a problem with the most recent database
archive, you may restore from a backlevel archive.  Since
all log data is archived, it is possible to restore all
the log archives taken from the backlevel database
archive up to the just completed current log archive.  So
in logmode L, you can bring your database completely up
to date from a backlevel archive.
 
 
 
---------------
2. HISTORY AREA
---------------
 
 
 
How does SQL/DS know which database archive has been
restored ?  How does SQL/DS know which log archives need
to be applied?  In the log disk, SQL/DS sets aside one
page to store history information.  When an archive is
taken, when a coldlog is done and when the logmode is
switched, an entry is added to the history area.  When a
restore is required, SQL/DS first restores the database
archive, and then based on the timestamp contained in the
database archive, SQL/DS looks for this database archive
entry in the history area. Since the history area is
chronological, SQL/DS can determine if any log archives
have been taken after the database archive, and so can
determine if these log archives should be part of the
restore set. Refer to SQL/DS System Administration,
Special Topics in Recovery Design, (Chapter 10 for VM or
Chapter 9 for VSE) for a discussion of the history area.
 
 
 
---------------
3. DUAL LOGGING
---------------
 
 
 
What if there is a DASD error on the log disk itself ?
As mentioned earlier, log data is required to restore
from both soft and hard crashes.  If the log cannot be
read after a soft crash, you must do a COLDLOG to erase
the log data.  Your database will be in the state it was
in at the time of the last checkpoint, which is not nec-
essarily consistent. The other alternative is to restore
from the most recent database archive (if you have one).
However, since the current log has been damaged, you will
not be able to apply the current log. To protect your
database from a DASD error on the log disk, you should
use dual logging.  Dual logging means SQL/DS will manage
two copies of the log.  If there is a DASD error on one
log disk, you will be notified and SQL/DS will automat-
ically switch to the other log. Dual logging is HIGHLY
recommended. The log disks should be physically separated
to avoid the possibility of both logs being damaged
simultaneously.
 
 
 
-----------------
4. WHAT NOT TO DO
-----------------
 
 
 
The following points are to be avoided.
 
 
o   Running in logmode Y and not having a database
    archive is a bad idea.  If there is a DASD failure,
    you will have to regenerate the database before you
    can start to reload data.
 
 
o   Do not take a user archive or system backup while the
    database is running. The following example illus-
    trates the type of problem that results. Let's assume
    I use DDR to backup my directory and all dbextents
    every night starting at 9:00PM.  I first backup the
    directory disk. The shadow bitmap is in the direc-
    tory. This process takes 15 minutes. At 9:15, I start
    to backup my dbextents, this process takes 2 hours.
    Meanwhile, the database is running and users are
    making updates to the database.  If I use DDR to
    restore the directory and dbextents, the shadow
    bitmap expects all pages to exists as they were at
    9:00PM (the time the shadow bitmap was archived).
    However, if pages were updated and written to DASD
    between 9:00PM and the time the page was archived
    using DDR, there is an inconsistency between the
    shadow bitmap and the actual data pages.  This will
    likely cause internal processing errors during
    startup or subsequent processing. To ensure a DDR
    backup will restore successfully you must ensure
    there is no activity on the database while the backup
    is running.  The best way to do this is to shutdown
    the database.
 
 
o   Continuing with the above example, if you shutdown
    the database to take a system backup, you should
    shutdown with SQLEND UARCHIVE.  This lets SQL/DS know
    that a user archive is taking place and records this
    information in the history area.  On the next
    startup, the operator is prompted if the user archive
    was successful. In this way, SQL/DS will manage the
    log disk and log history area appropriately for the
    logmode you are running.
 
 
o   Be careful when doing a COLDLOG; understand the dif-
    ference between Reformat and Reconfigure.  After
    Reconfigure, you lose your history area and, there-
    fore, the ability to apply any log archives.
 
 
o   If you start a restore and choose to temporarily
    interrupt the restore, by entering STOP SYSTEM, you
    must restart with STARTUP=W. With STARUP=W, SQL/DS
    will recognize that you interrupted a restore, and
    will continue where you left off.  If you restart
    with STARTUP=R or STARTUP=U, SQL/DS will be expecting
    you to restore from a database archive.
 
 
o   In logmode L, SQL/DS has the ability to restore all
    log archives taken, including the current log disk,
    after a backlevel database archive is restored.  In
    logmode L, if you do a coldlog before restoring, you
    lose the ability to bring your database completely up
    to date since the COLDLOG erases the current log.
 
 
o   Follow-on from the previous point: In logmode L, you
    should take database archives regularly to reduce the
    size of the restore set in the event that you need to
    restore. If you have a long series of log archives
    following a database archive, the time to restore all
    the log archives may be considerable. A general rule
    of thumb is to take a database archive weekly
    (usually offshift) and to take log archives nightly.
 
 
o   Take care of your archive tapes! You may consider
    making backup copies of your archive tapes and
    keeping these offsite. This makes it extremely impor-
    tant that you label your archive tapes clearly.  Fol-
    lowing is an example of a real situation caused by an
    operator not familiar with SQL/DS archiving proce-
    dures.  Running in logmode L, soon after a database
    archive was taken the operator asked for a log
    archive to be taken.  Since there was very little log
    data, the archive completed in seconds.  The operator
    was expecting the archive to take longer, and assumed
    that something SQL/DS did not in fact take the
    archive, so asked for another log archive.  SQL/DS
    then took another log archive but the operator never
    switched tapes !!  The second log archive overwrote
    the first log archive.  Later in the week, the data-
    base had to be restored due to a DASD failure.  The
    database archives was restored successfully, and
    SQL/DS prompted for the first log archive.  The tape,
    which actually contained the second log archive was
    mounted.  SQL/DS recognized that the wrong tape was
    loaded, and asked for the first log archive tape
    again.  In this scenario the database can not be
    brought up to date. However, this particular story
    does have a happy ending.  Since the first log
    archive was taken offshift, just a few hours after
    the database archive, and because the operator could
    determine that no users were logged on to the system
    during this time, SQL/DS Service and the Customer
    were convinced that the first log archive hadn't con-
    tained any data which meant that the second and fol-
    lowing log archives could logically be applied to the
    database.  SQL/DS Service was able to patch the
    history to skip over the first log archive and all
    following log archives were applied successfully.
    Moral:  be careful with your tapes, label them
    clearly, understand the archive and restore proce-
    dures.
 
 
 
-------------------------------------
5. GOOD ARCHIVING TIPS AND PROCEDURES
-------------------------------------
 
 
 
The most important decision is which logmode to run in.
Usually corporate databases run in either logmode A or L.
 
 
 
------------------------------------------
5.1 ADVANTAGES OF LOGMODE L OVER LOGMODE A
------------------------------------------
 
 
 
o   if the log fills during the business day. only
    require a log archive not a database archive.
 
o   can take log archive at end of business day to free
    up all log space for next day. In logmode A, you need
    enough log space to store all log data between data-
    base archives.  Usually database archives are taken
    less frequently than log archives, so in Logmode A
    you would need a larger log.
 
o   if there is a problem with the most recent database
    archive, you can recover completely from a backlevel
    database archive.
 
 
 
------------------------------------------
5.2 ADVANTAGES OF LOGMODE A OVER LOGMODE L
------------------------------------------
 
 
 
o   It is usually faster to restore since you only need
    to apply the current log after a database restore,
    not a series of log archives.
 
At least once a year: go over your archive and restore
procedures.  Take your most recent (and perhaps a back-
level) database archive and restore it on a test system.
If in LOGMODE L, restore all log archives as well.  Run
some tests to ensure all the data was restored success-
fully.  In the event you have an emergency, it is invalu-
able to have the experience of going through a restore
scenario so you are familiar with the messages and how to
respond to them.
 
 
If you run in Logmode Y, you should take database
archives periodically.  Some customers have read-only
databases. On daily or weekly basis, data is loaded into
this database from an external source, so they think that
if there is ever a DASD failure, they can simply reload
their data from the external source.  However, for SQL/DS
to recover, it needs a copy of the directory definitions
from database generation to know the structure of the
database.  Without an archive, you would have to regen-
erate the database, re-create all tables, views and
indexes as well as grant authorities,  With an archive,
you can restore the archive, and then run local proce-
dures to reload any data that may have been lost.
 
 
 
---------------
6. BIBLIOGRAPHY
---------------
 
 
 
SQL/DS System Administration for IBM VM Systems, Version
3 Release 4, Chapter 9 and 10
 
SQL/DS System Administration for VSE, Version 3 Release
4, Chapter 8 and 9
 
SQL/DS Managing the SQL/DS Environment for IBM VM
Systems, Version 3 Release 3, Chapter 7