Some cleanup possibilities:
   1) I believe the following directories can be removed:
          Runtime, Old, lib 
   2) We can probably remove classicMakefile ... I'm not sure that
      anyone can run these anymore.

   3) Change ML_transpose() to something like ML_implicit_tranpose().
      Change ML_transpose_byrow() to ML_transpose().
   
   4) Get rid of ml_xxt.c ml_xyt.c. No one is going to use these.
      However, there are probably some subroutines that are used elsewhere.
  
   5) I think we can get rid of driver.c. This is something that I
      think Charles put in that nobody uses.  ml_pde.c/h and mli_solver.c/h
      can go as well.

   6) Get rid of ml_seg_precond.c ml_seg_precond.h. These
      were used before MEROS existed.

************************************************************************
************************************************************************
Below are a list of things turned on in ml_common. We should probably
go through this list. Some things are always turned on and they should
just be inlined. Other things should probably switch to the HAVE_... 
stuff. Finally, others should just be regular user options.
   AZTEC,ML_WITH_EPETRA,SUPERLU,DSUPERLU,ML_MPI,ML_USING_MPI_FUNCTIONS,
   USE_VENDOR_BLAS,USE_VENDOR_LAPACK,ML_ENRICH,ML_MEMORY_CHK,
   ML_NEW_T_PE,GREG,ML_TIMING,ML_TIMING_DETAILED,WKC,
   ML_MULTIPLE_RHS_BLOCK_FACTOR,ML_CPP,ML_FLOPS,ML_BENCHMARK

************************************************************************
************************************************************************

Here is the list of ifdef's 

RST_MODIF/MB_MODIF (rstumin)
   summary: RST_MODIF and MB_MODIF are always turned on.  MB_MODIF 
            surrounds Marian's code to implement MLS and to avoid dividing
            by the diagonal. RST_MODIF divides by diagonal.
   action:  We can probably get rid of this (keeping all code that is 
            normally turned on).
   details: These ifdefs started with Marian when he put in MLS. Marian
            often scales his matrices so that the diagonal is 1 and so 
            some computation can be avoided. I don't think this is worth it.

ML_MPI  (rstumin)
   summary: Obvious
   action:  We should change these to HAVE_MPI
   details: 

NOTSTRICT_PROTO (charles)
   summary: Looks like this is always off (but we should check if MPSalsa
            turns on). Defines one function ML_GridFunc_Set_Function()
            where the function ptr does not have a real proto-type. My
            guess is that Charles did this to get rid of warnings from some
            compilers. It looks like we could get rid of warnings with the
            proper casts ... though it is not clear who is using it?
   action:  Check if Salsa uses and check what kind of warnings it generates
            when casts are used. Decide whether to get rid of it.
   details: ml_gridfunc.c/.h

ML_CPP   (jhu?)
   summary: My guess is that this determines whether code is compiled with
            c++ or not. It looks like some of this affects casting.
   action:  Would probably be best to get rid of as much of these as possible.
   details: lots of files

DEBUG   (?)
   summary: Obvious (sort of). There are lots of debugs in ML. Most are probably
            not used on any regular basis though they are not harmful.
   action:  It might be good to first mark with a comment who if anyone
            cares about a particular ifdef DEBUG and then rip out those
            that no one wants.
   details:

WKC  (jhu)
   summary: Allows for processing multiple right hand sides with MLS and
            compiling with C++.  Some of these appear in funny places for
            functions that are not used (and could probably be removed).
   action:  Some day it would be nice to clean this up and really support
            it with a special V cycle.
   details: 

ML_MATCHED (rstumin)
   summary: Not used
   action:  Remove
   details: Include/ml_defs.h

ML_CLASSIC_BUILD (jhu)
   summary: Not used
   action:  Remove
   details: Include/ml_common.h

ML_FUNCTION_NAME (jhu)
   summary: This seems to be used to print out the function name
            in error messages. I'm not sure why this is done as
            an ifdef and not another variable?
   action:  Change to a real variable.
   details:

ML_TIMING_DETAILED/ML_TIMING (jhu)
   summary: This inconjunction with ML_TIMING is used to produce low
            level timing information (when objects are destroyed).
            I believe that strange things may happen if one of these
            is turned on and not the other.
   action:  It would probably be best to make this a regular user option.
   details: The easiest thing would be to make this a global variable
            like the output_level. I would also prefer if this
            was combined with ML_TIMING & ML_FLOP and any other
            diagnostic so that you either turn on all performance
            monitoring or you turn them all off.

ML_FLOPS = HAVE_ML_FLOPS (jhu?)
   summary: This seems to return the number of flops executed associated
            with matrix-vector products.
   action:  This one should be treated the same as the timing flags. The
            best would be to make these a real user option.
   details: The main problem with making low level things user options is
            that there are not really any structures that ML passes to 
            everyone. The closest thing would probably be 'Comm'. However,
            the timing/flop things could be passed in the primary
            structures like: ML_Smoother, ML_Operator, ML_CSolve?
            
ML_CATCH_MPI_ERRORS_IN_DEBUGGER (jhu?)
   summary: This controls whether MPI errors return control to a debugger.
   action:  Perhaps this guy should be rolled up with other useful options
            (like breaking for a debugger, others?).  This way, developers
            could turn on one configure switch to get everything.
   details: This is useful on occasion for developers.  Users would never
            want this, obviously.  Developer's 

ML_LOWMEMORY  (rstumin)
   summary: This is off by default. This turns on memory savings like
            single precision grid transfers.
   action:  Should probably always be turned on. The only thing funny is
            that ML does alter some operators (which potentially the user
            may have set).
   details:

takeout/out    (rstumin)
   summary: These are something Tuminaro uses to ifdef code out.
   action:  Double check code and remove most.
   details: ml_matmat_mult.c, ml_struct.c, ml_smoother.c, ...

NOT_DEF  (msala)
   summary: Looks like something to avoid running out of memory in 
            ML_Operator_Eigensolver_Dense(). Oh ... now I see that this
            is used elsewhere
   action:  ?
   details: ml_op_utils.c ml_MultiLevelPreconditioner_*

previousmatmat  (rstumin)
   summary: Just ifdef'ing out old matmat code.
   action:  Just remove.
   details:

new2row (rstumin)
   summary: This function takes a funky matrix (one that is stored by columns 
            and has a post_comm, eg. Rmat) and converts it to one that is 
            stored in the normal way by rows. The only problem is that this
            one has a bug and so is ifdef'd out. The current working one
            which is used by Transpose_byrow() is slower.
   action:  Either fix, get rid of, or make new ifdef ML_NEEDS_DEBUGGING
   details: ml_op_utils.c

charles  (rstumin)
   summary: This is essentially debug information in matmat_mult.c. It is
            not needed anymore.
   action:  Get rid of it.
   details: There are even some undef charle things in here.

ML_OBSOLETE	(jhu?)
   summary: Code that can be cut out.
   action:  Get rid of it.
   details: There is only one instance of this in SGS().
            The code that is taken out appears to do
            communication between the foward and backward
            sweep of symmetric GS. I believe that this
            messes up symmetry so it was taken out.

obsolete        (jhu)
   summary: Code that can be cut out.
   action:  Get rid of it.
   details: This code sits inside of ml_amg_MIS.c. These
            were created by Charles and do a Ruge-Stueben
            type algorithm. No one other than charles has used
            them. I think we should keep the amg files but 
            get rid of this ifdef.

ML_SUPERLU2/ML_SUPERLU2_0 (???)
   summary: Mystery code. There are some comments indicating
            compatablity between SuperLU 1.0 and SuperLU 2.0.
   action:  Figure out what this is.
   details: I could find two references to this in the 
            code. The first turns on some defines like
            DN & GE (these things are passed into superlu
            type calls). The second one decides whether
            superLU is called or not inside ML_SuperLU_SolveLocal().

dump/vecdump/printcoeff (jhu?)
   summary: Prints out some info for ATmat_trans.
   action:  Looks like it can be dumped.
   details: It looks like this was used to debug a particular
            problem. There are some hardwired references to col 45
            and things like that.
        

fastrb/rblack  (rstumin?)
   summary: These are used inside ML_Smoother_MSR_GSforwardnodamping
            to switch between lex and red-black smoothing on a 
            2D regular mesh.
   action:  Get rid of it.
   details: Determines the order that points are visited.

IWANTONESTEP	(rstumin?)
   summary: This is a hardwired way to do some kind of 1-step
            Krylov smoother when Greg Newman's application.
   action:  Do nothing for now.
   details: This is experimental code. I hope that we get things
            flushed out when we start working with Greg again.
          
MatrixFreeHiptmair/MatrixProductHiptmair/debugSmoother/SPECIAL (jhu)
   summary: I believe there are 2 implementations of the Hiptmair
            smoother for edge elements. We always use the 
            product version.
   action:  Get rid of MatrixFreeHiptmair and get rid of 
            MatrixProductHiptmair ifdef (but leave code). Get rid
            of debugSmoother as this is part of matrixfree version.
   details: I believe that the MatrixFree version does not
            run in parallel and it would probably be pretty
            touch to get it working. I don't really know what 
            SPECIAL is but it looks like it is always turned on
            and lives entirely within MatrixFree version.

PRINTITNOW (jhu)
   summary: Used in conjunction with ML_DEBUG_SMOOTHER to 
            print out Hiptmair stuff.
   action:  Get rid of it.
   details: The Hiptmair smoother seems to work.

BLOCKMLS   (rstumin)
   summary: Used to switch between cheby and block cheby
            within the Hiptmair smoother.
   action:  Make a real user option to switch between these
            and get rid of ifdef.
   details: 

GREG	      (rstumin)
   summary: Turns on some smoothing things for complex maxwell smoother.
   action:  Clean this up. Determine what is needed and make a user
            input option. Get rid of junk.
   details: As I recall, this is a mixture of things we used when
            we were working with Greg. Some of these are hardwired
            eigenvalues. Others are switching between complex cheby
            and regular cheby. Other are for printing.

MB_FORNOW	(rstumin or jhu)
   summary: This is normally turned on and indicates to allocate
            and free memory within MLS smoother.
   action:  Keep code and get rid of ifdef.
   details: Marian wanted to allocate memory in the setup phase
            so that we don't need to keep allocating and 
            unallocating it. However, he never got that working.
            I don't think it is worth it to try.
   
ML_FAST	      (jhu)
   summary: Looks like something Charles put in to do SGS faster.
            Might be worth keeping?
   action:  Maybe keep for now?
   details: This looks like something which uses the connection
            between GS and ILU to run the GS faster. It has
            the advantage that it does not need to do getrow
            all the time but it needs to store the matrix data.

ML_DEBUG_SMOOTHER (jhu)
   summary: Use for debugging various smoothers.
   action:  Keep but maybe clean a bit.
   details: The main thing I don't like about this one is there are
            some #undef ML_DEBUG_SMOOTHER. It would be nice to have
            something more elegant.
ML_DEBUG_SUPERLU	(jhu)
   summary: This is used to print the condition number of the
            superlu factors.
   action:  Can we just roll this into ML_DEBUG_SMOOTHER?
   details: Only used twice (and one of those is the multiple
            rhs code). 
XYT	(jhu)
   summary: Decides whether to use XXT or XYT (I think).
   action:  Dump the whole XXT and XYT codes.
   details: There are two direct solvers XXT for symmetric
            systems and XYT for nonsymmetric systems. No one
            uses them so just toss. 

COMPILE_DRIVER  (jhu)
   summary: Inside driver.c. I don't believe anyone uses driver.c
            as all the code is ifdef'd out.
   action:  Get rid of.
   details: Charles put this in.

ML_NOTALWAYS_LOWEST  (rstumin)
   summary: Used in ml_agg_reitzinger to decide which processor
            is assigned an edge shared by two.
   action:  Probably get rid of. Perhaps keep but should be run through
            all the parallel maxwell tests first?
   details: When creating the coarse grid gradient, some edges have a node in
            two different processors.  The current version of the
            code always assigns the new edge to the lower numbered processor.
            I believe the NOTALWAYS option checks whether something is even
            or odd and then assigns to the lowest or largest processor. This
            version is probably a bit better and probably works (though it
            has not gone through extensive testing). Of course, the best
            thing to do when parmetis or zoltan is available to to just
            really load balance everything.

ML_PRINT_COARSE_MAT     (jhu)
   summary: Used by ML_Gen_Amatrix_Global() to
            print coarse matrix.
   action:  Get rid of.

ML_SYNCH        (jhu)
   summary: This is used for synchronizing the V cycle so that timing
            information really represents the total time spent in each kernel.
   action:  Leave as is or make a global user defined variable. There is a
            chance this variable can be shoved into the ML structure. 
            We could also always turn this on when timing is requested?
   details: Timing information can sometimes be misleading because some
            operators are just spending time in a receive waiting for other
            processors to catch up. This synchronization is meant to take 
            care of this by sync'ing processors after each major kernel.

ML_USE_INTERNAL_COMM_FUNCTIONS  (jhu)
   summary: Use to switch between ML's communication wrappers or go directly
            to MPI calls.
   action:  This can probably be rolled into a global user defined variable.
            There is some chance that this can be rolled into ML_Comm.
   details:

ML_SINGLE_LEVEL_PROFILING       (jhu)
   summary: Use to turn on profiling of individual levels with Janus profiling.
   action:  Keep but clean and make into an option (either global or even 
            better in the ML struct). I don't like how this makes the V cycle 
            code look even uglier.  Also, the standard V cycle happens inside 
            a 'default:' statement of a switch. I would prefer if this ugly 
            stuff is moved into a subroutine so as not to clutter the V cycle 
            (with the standard recursive call not appearing in this subroutine).
   details: Used to differentiate between levels in the Janus profiler so
            that we can figure out where the time is spent. I'm not completely
            sure if it is needed as the best way to profile is to just run
            a 1 level method first, and then a 2 level method, etc.
 
newrap  (msala)
   summary: Always on and used by GGB to define things in a more efficient way.
   action:  Keep code but get rid of ifdef and get rid of else code.
   details: GGB can be made more efficient by doing some matrix-matrix
            mults. I'm not sure if this code is always more efficient.

store_QtransA   (msala)
   summary: Another efficiency attempt for GGB.
   action:  Check with Haim to see if this is worth keeping.
   details: I'm not sure if this one helps or not?

ML_SEG  (rstumin)
   summary: This code is always turned off and it corresponds
            to some stuff we did before MEROS.
   action:  Get rid of
   details: ml_seg_precond.c ml_seg_precond.h

RAP_CHECK       (rstumin)
   summary: This code is used to check that Rr = 0 after
            the coarse grid correction on a two level method
            with an exact solve.
   action:  Make it a user input option. Put in ML struct.
   details: Sometimes this is useful for debugging.

RAP2_CHECK      (rstumin)
   summary: This looks like AMGV's version of RAP_CHECK.
   action:  Get rid of and use same logic as RAP_CHECK.
   details: I'm not entirely sure what AMGV. Charles put
            it together and I think it should do the same
            thing as the regular cycle only more efficiently.
            We should probably check it out to see if it is
            worth keeping.

ML_DEBUG        
   summary: This print out some stuff.
   action:  This one seems funny by itself and should
            probably be combined with plain old DEBUG.
            Perhaps all DEBUGs should be changed to ML_DEBUG?
   details: In ml_aggregate.c this prints ML_Aggregate_Coarsen ends.
            In ml_struct.c this does a ML_memory_inquire()
            at the end of ML_Destroy().

ML_ANALYSIS  (jhu)
   summary: This was put by Charles into the V cycle. It
            is supposed to help us figure out why things
            don't converge right. No one uses it.
   action:  Get rid of.
   details: This has propogated to various V cycles. I
            don't think anyone knows how to use it and
            we are developing other tools with ANASAZI
            to analyze things.
newstuff        (jhu)
   summary: This is always on and causes the uncoupled
            aggregation to use tuminaro's phase2_3 cleanup.
   action:  Get rid of ifdef and remove code in the else.
   details: At some point I wasn't happy with Charles'
            phase2 & 3 algorithm.

MAYBE  (marzio)
   summary: Always off in ml_agg_ParMETIS.c
   action:  ?
   details: ?

MARZIO  (marzio)
   summary: Always looks off.
   action:  ?
   details: Used in Coarsen/ml_agg_genP.c,ml_aztec_utils.c

LATER   (marzio)
   summary: Always off.
   action:  ?
   details: In ml_MultiLevelPreconditioner_Filtering, ml_agg_METIS.c,
            ml_MultiLevelPreconditioner_Smoothers, ml_agg_VBMETIS.c,

GEOMETRIC_2D    (rstumin)
   summary: Always off. Used to build a 2D geometric grid transfer.
   action:  Get rid of.
   details: It might be useful to have some geometric grid transfers
            for structured grid problems. However, this thing is 
            shoved in ml_agg_genP.c in a pretty dumb spot. If we want
            this it would probably be best to build from scratch.
            
ML_NONSYMMETRIC (rstumin)
  summary:  Used to switch to max norm instead of using CG to compute
            largest lambda value. This is out-of-date and not used.
  action:   Get rid of.
  details:  In the old days we didn't have a lot of stuff for nonsymmetric
            systems and so this is what we used to do. Things have changed.

SYMMETRIZE      (marzio or tuminaro)
  summary:  Make prolongator smoother (I-alpha B) where B = (A + A^T)/2;
  action:   Get rid of.
  details:  For convective problems it is sometimes useful to ignore the
            nonsymmetric part. This option, however, is available as a
            user input. We should probably also clean out the code in 
            ml_agg_genP.c so that each time we want an eigenvalue it
            uses the wrapper Gimmie_Eigenvalue or something like that.

ML_NEW_T_PE     (jhu)
  summary:  Turns on some new code for fixing up bcs when generating a
            RS interpolation for edge elements. Always on.
  action:   Get rid of ifdef and leave code.
  details:  This is the only option that properly preserves commuting.

RANDMISROOTS    (rstumin)
  summary:  Randomize the order of the points considered for Root nodes.
  action:   Perhaps make this a user option (ml_aggregate.h) instead?
  details:  Most of the time I don't think we want this. The main idea is
            that this can make the serial and parallel codes act 
            qualitatively more the same. This is because the natural ordering
            (which is the default) can produce better results in serial
            (and can then degrade as more processors are used).

USE_GRAPH      (marzio) 
  summary:  I would guess that this is used to do graph partitioning as
            opposed to coordinate partitioning with Zoltan.
  action:   ? Can we make this a user option instead? It would be nice if
            it started with an ML at least.
  details:  I have a feeling Marzio told me that this option does not work.

USE_AT         (marzio)
  summary:  This was used for nonsymmetric problems to build R based on A^T.
  action:   I think we can get rid?
  details:  I believe that Marzio has new code to do this. In particular,
            I think he has a newer gen_hierarchy().

ML_VAMPIR    (jhu)
  summary:  Do stuff so that we can use Vampir to profile.
  action:   Probably nothing?
  details:

Here are some ifdefs that still need comments ....


INPUT_AGGREGATES      2
ML_AGGR_INAGGR  2
ML_AGGR_MARKINAGGR      1
ML_AGGR_OUTAGGR 2
ALEGRA  7
CLEAN_DEBUG     16
DDEBUG  5
DEBUG   4
DUMP_MATLAB_FILE        1
DUMP_WEST       1
EXTREME_DEBUG   2
EXTREME_DEBUGGING       10
METIS_DEBUG     2
ML_DEBUG_AMG    2
ML_AGGR_DEBUG   26

ML_AGGR_OUTPUT  6
ML_AGGR_PARTEST 3
ML_AGGR_TEST    1
ML_AGGR_out     1
ML_AMG_DEBUG    1
ML_NEWDROPSCHEME        5
ML_AGGR_NSOUTPUT        4
