$Id: offlinegc.txt,v 1.2 2007-10-22 18:18:44 otisa Exp $

Offline Garbage Collection User Action Libraries
Norman R. Green
Director of Smalltalk Engineering
GemStone Systems Inc.
norm.green@gemstone.com
September 11, 2007

What's New As Of September 11, 2007

Changes in the 6.2.0 version:

The binary file format has changed to be compatible with the GemStone/64
bitmap file format.  Everything still works as before as long as you run
both the FDC and GC using the new code.  Mixing files generated with
the 6.1 version of offlinegc.c is known not to work and is begging for trouble.

Also, a C++ compiler is now required to build the user actions because
6.2 was built using C++ and there are some C++-only things in the
header files.

Makefiles have been cleaned up and adjusted.




Introduction
The primary repository-wide garbage collection mechanism in GemStone is 
markForCollection; however, this mechanism can prove too tedious for large, 
highly available production systems.  The offline GC strategies and code 
presented in this paper are intended as an alternative to markForCollection.

Installation
The user action code is distributed in gzipped tar format.
1. Unzip the file using gzip:
gzip -d offlinegc.tar.gz

2. Unpack the tar file:
tar -xvf offlinegc.tar

3. Build the user action library
You can then run the makeofflinegc script to build the user action library in a subdirectory that will be created.  The name of the subdirectory depends on which operating system is being used.

4. Copy the user action library built in step 3 to $GEMSTONE/ualib.

Definitions and Acronyms
GC = garbage collection.

MFC means markForCollection, the standard global GC method in GemStone.

FDC means findDisconnectedObjects.  This is a method in GemStone that invokes the MFC algorithm but does not submit the resulting set of dead objects to the system for reclamation.  Instead, FDC returns the resulting garbage objects in an Array.

MGC means markGcCandidates.  MGC is an alternative GC method where a collection of candidate objects is submitted to the system as garbage.  The system then validates which objects are in fact eligible for garbage collection and processes them through the normal reclaim mechanism.  MGC usually runs approximately 30 times faster than MFC on the same system.  In addition, the finalization of garbage objects identified by MGC is faster than the same mechanism for MFC.
Support
This code is offered as-is, without warranty and free of charge.  It has been tested on several systems and is known to function as described in this document.  However, bugs may exist; please report any you find directly to the author.

The user action code requires GemStone version 6.2 or greater.  The oop load user action will not function correctly on versions prior to 6.2.

A working C++ compiler is required to build the user action library.  Several precompiled versions of the library may also be available from GemStone. 

Note: As of GemStone 6.2.0, a C++ compiler is required.  A C compiler will not work.

The user action code may be compiled and run on any GemStone-supported UNIX platform, including:
* IBM AIX
* Sun Solaris
* Linux (any flavor of Linux that supports GemStone)
Sample makefiles to build the user action library for each platform are also provided.

Microsoft Windows is not tested or supported with these user actions.
Offline GC Algorithm
Most of the complexities of the offline GC are handled by the user action code.  The remaining offline GC steps are fairly simple.

There are two user actions in the library: one for running the FDC and writing the garbage object identifiers to a file, and a second for loading the garbage object identifiers from the file and running the MGC.
FDC User Action
The first user action is run against a copy of production to run the FDC and write the identifiers of the resulting garbage objects to a binary file.  The size of this file will be approximately equal to the number of dead objects found multiplied by 4, which can be quite a large file on some systems.  Care must be taken to ensure enough disk space is available to write the oop file or the user action will fail.
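
The sizing rule above can be turned into a quick pre-flight check.  The following is a minimal sketch in Python and is not part of the distributed user action code.  The function names, the reading of the rule as 4 bytes per identifier, and the 20% safety margin are all assumptions made for illustration.  Because the number of dead objects is only known once the FDC has finished, the count passed in has to be an estimate, for example the result of a previous run.

```python
import shutil

# Assumption: the "multiplied by 4" rule is read as 4 bytes per identifier.
BYTES_PER_OOP = 4


def estimate_oop_file_bytes(dead_objects, bytes_per_oop=BYTES_PER_OOP):
    """Approximate size of the oop file for a given number of dead objects."""
    return dead_objects * bytes_per_oop


def has_room_for_oop_file(directory, dead_objects, safety_factor=1.2):
    """Return True if `directory` has enough free space for the estimated file.

    safety_factor is a hypothetical margin, not a GemStone recommendation.
    """
    needed = estimate_oop_file_bytes(dead_objects) * safety_factor
    free = shutil.disk_usage(directory).free
    return free >= needed


if __name__ == "__main__":
    # 88 million dead objects is used here only as an illustrative estimate.
    print(estimate_oop_file_bytes(88_000_000), "bytes estimated")
    print("enough space:", has_room_for_oop_file(".", 88_000_000))
```

Under these assumptions, 88 million dead objects would need roughly 352 MB for the oop file.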

The FDC user action should be invoked from Smalltalk by either SystemUser or DataCurator.  It has the following format and arguments:

System userAction: #uaRunFdcAndWriteOopsToFile with: <filename> with: <FDC buffer size>

The first argument is a String object which is the full path of the oop file name.  This file will be created by the user action code and must not already exist.  The second argument is a SmallInteger that represents the size of the data page buffer to be used by the FDC.  The system default is 320 pages but larger values may improve FDC performance.  A value of 2048 is recommended.

If successful, the FDC user action returns the array object that would be returned by the Repository>>findDisconnectedObjects method.  This array can be used to analyze what types of garbage objects are being generated by the system if so desired.  The array is not committed automatically, so it will be discarded when the session logs off unless it is explicitly committed to the repository.

If the user action fails for any reason, a String object is returned describing the error.
MGC User Action
The MGC user action is run against the production repository after the FDC on the copy of production has completed.  It first loads the object identifiers of the garbage objects found by the FDC into the GcCandidates hidden set of the gem session.  Each object is also verified to exist in the object table and verified to not be already in the deadNotReclaimed set.  Objects that do not meet these criteria are not loaded and a warning message is printed to the log file.

The MGC user action should be invoked from Smalltalk by either SystemUser or DataCurator.  It has the following format and arguments:

System userAction: #uaLoadOopsAndRunMgc with: <filename> with: aBoolean

The first argument is a string containing the full path and file name of the oop file generated by the FDC user action.  The second argument is a Boolean that determines if the MGC should be run if one or more oops in the file failed validation and could not be loaded.  If the Boolean is false, the MGC will not be run if one or more oops did not load successfully.  If it is true, the MGC is run if any oops were successfully loaded.
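
The effect of the Boolean argument can be summarized as a small decision function.  This is only a model of the behavior described above, written in Python for illustration; it is not the user action's actual code, and the function and parameter names are hypothetical.  The case where no oops load at all is an assumption: the sketch treats it as "do not run".

```python
def should_run_mgc(loaded, failed, run_despite_failures):
    """Model of the MGC run decision described in the text.

    loaded: number of oops that passed validation and were loaded
    failed: number of oops that failed validation and were not loaded
    run_despite_failures: the Boolean passed as the second argument
    """
    if loaded == 0:
        # Assumption: with nothing loaded there is nothing for the MGC to do.
        return False
    if failed == 0:
        # Every oop loaded cleanly, so the Boolean does not matter.
        return True
    # Some oops failed validation: run only if the caller allowed it.
    return run_despite_failures


if __name__ == "__main__":
    print(should_run_mgc(loaded=1000, failed=3, run_despite_failures=False))
    print(should_run_mgc(loaded=1000, failed=3, run_despite_failures=True))
```

Passing false is the conservative choice: any validation failure stops the run so that the cause can be investigated first.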

The MGC session requires the GC lock during the load and MGC process.  The user action will fail if the GC lock is not granted by the stone as requested at the beginning of the load.  If this happens, determine which session holds the lock and stop it, or else wait for the other process to finish and return the lock to the stone.  Note that the GC lock will also not be granted if there are any outstanding possible dead objects on the system.  MGC cannot run until the GcGem has finished processing any existing possible dead objects.  No possible dead objects should be present on the system if the steps below are followed and the epoch GC is disabled on the production system.

Like MFC, at the end of the MGC the candidates verified to be dead objects are recorded as possible dead objects by the stone.  The possible dead objects require further finalization by the system before they can be reclaimed.  Finalization occurs automatically and no further action is required by the database administrator.

Summary of Offline GC Steps
1. Disable epoch GC on production.   
The epoch GC must be disabled before the copy of production is taken in step 2.  This is done to ensure that an attempt is not made to garbage collect the same object twice.  If an epoch GC runs and garbage collects some objects after the copy is taken, these same objects will be found by the FDC and written to the oop file.  When the file is loaded for the MGC phase, the objects may be in the stone's free oop list and will not be loaded for the MGC.  If the object identifiers have been reused and assigned to new objects, the MGC will not allow them to be collected and the MGC will run longer.
Epoch GC is disabled by committing the following entry into the UserGlobals dictionary of GcUser:
	UserGlobals at: #epochGcEnabled put: false.

2. Make a copy of the production repository.  
This can be done by shutting down production and copying the extents, or by restoring a full backup.

3. Restart production.  
After the copy is completed, the production system may continue operating normally during the next steps.

4. Reclaim shadow pages on the copy of production (optional)
The FDC will run faster if there are few or no shadow pages on the production copy.  Shadow pages are reclaimed by running either GcGem or PSR (parallel shadow reclaim) gems.  Monitor the GcPagesNeedReclaiming statistic and the stone's log to determine how many shadow pages remain in the system.

5. Run the FDC user action on the production copy
The FDC will determine which objects are dead and write their identifiers to the oop file.  Useful messages may be written to stdout of the gem process.  For a gem process, these go to the gem's log file, typically named gemnetobjectxxxx.log where xxxx is the process ID of the gem.  For topaz sessions, the output goes to the console, which can be redirected to a file.  It is recommended that topaz be used to run the FDC and the MGC.  Sample UNIX scripts for running the FDC and MGC are also supplied.

6. Copy the oop file to the production server
When the FDC is finished, copy the oop file to the production server where the MGC will be run.

7. Reclaim shadow pages on the production repository (optional)
The MGC will run faster if there are few or no shadow pages on the production system.  Shadow pages are reclaimed by running either GcGem or PSR (parallel shadow reclaim) gems.  Monitor the GcPagesNeedReclaiming statistic and the stone's log to determine how many shadow pages remain in the system.

8. Run the MGC user action on the production repository
This will load the garbage object identifiers from the file and run the MGC.  If you have followed all steps, all oops in the file should load successfully and the MGC should identify all of them as garbage objects.  Useful log messages will again be written to stdout of the gem process as described in step 5 above.



Figure 1: The Offline GC Process

Performance Tuning Suggestions for FDC and MGC
The MGC is usually fast enough that performance tuning is not required.  MGC rarely takes more than 6 hours on any system unless there is a high number of shadow pages or insufficient system resources.

Both of these operations will run faster if the gem has a large private page cache.  The size to be used should be as large as permissible on the system, up to the supported maximum of 512 MB.  In general it is a good idea to use at least 200 MB of private page cache on a large system.

Also ensure the free frame limit of the FDC/MGC gem is set low enough.  For large shared page caches, a value of 1% of the total frames in the cache should be used.

For the FDC, do not specify too large an FDC buffer size, as a large value can decrease performance.  In general, values larger than 3000 are not usually recommended.
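
The numeric guidelines above can be collected into a simple sanity check to run before starting a long FDC or MGC.  The sketch below is in Python and is illustrative only: the function names and warning texts are hypothetical, and it does not read any real GemStone configuration.  The thresholds are the ones given in this section.

```python
MAX_PRIVATE_PAGE_CACHE_MB = 512   # supported maximum stated above
SUGGESTED_MIN_CACHE_MB = 200      # suggested minimum on a large system
MAX_SUGGESTED_FDC_BUFFER = 3000   # larger values are not usually recommended


def suggested_free_frame_limit(total_cache_frames):
    """1% of the total frames in the shared page cache, rounded down."""
    return total_cache_frames // 100


def tuning_warnings(private_cache_mb, fdc_buffer_pages):
    """Return a list of warnings for values outside the suggested ranges."""
    warnings = []
    if private_cache_mb > MAX_PRIVATE_PAGE_CACHE_MB:
        warnings.append("private page cache exceeds the 512 MB maximum")
    elif private_cache_mb < SUGGESTED_MIN_CACHE_MB:
        warnings.append("private page cache is below the suggested 200 MB")
    if fdc_buffer_pages > MAX_SUGGESTED_FDC_BUFFER:
        warnings.append("FDC buffer size above 3000 may decrease performance")
    return warnings


if __name__ == "__main__":
    # Hypothetical values: a 500,000-frame shared cache, 256 MB private cache.
    print(suggested_free_frame_limit(500_000))
    print(tuning_warnings(private_cache_mb=256, fdc_buffer_pages=2048))
```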

Performance gains may also be realized by striping the disks that hold the extents.


Performance Example
GemStone Version 6.0.1NC
Operating System: Red Hat Linux Advanced Server
Repository size: 24 GB, 280 million objects

FDC
FDC time: 10 hrs, 45 mins
Number of dead objects: 88 million
Time to write oop file: < 2 mins

MGC
Oop file load time: 5 minutes
MGC time: 22 minutes
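
The figures above can be checked against the earlier statement that MGC usually runs approximately 30 times faster than MFC.  The short Python calculation below treats the FDC time as a stand-in for MFC time, which is an assumption based on FDC invoking the MFC algorithm.  It compares the MGC run alone and leaves out the oop file load time.

```python
# Figures taken from the performance example above.
fdc_minutes = 10 * 60 + 45      # FDC time: 10 hrs, 45 mins
mgc_minutes = 22                # MGC time: 22 minutes
total_objects = 280_000_000     # repository size in objects
dead_objects = 88_000_000       # dead objects found by the FDC

# Assumption: FDC time approximates MFC time on the same repository.
speedup = fdc_minutes / mgc_minutes
dead_fraction = dead_objects / total_objects

print(f"FDC/MGC time ratio: {speedup:.1f}")        # FDC/MGC time ratio: 29.3
print(f"Dead object share: {dead_fraction:.1%}")   # Dead object share: 31.4%
```

On this system the ratio comes out at about 29, in line with the approximate factor of 30 quoted for MGC.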
