%%%%%%%%%%%%%%%%%%%%%%%%CUT HERE%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% This is ltexproc.tex, an example file for use with the SIAM LaTeX
% Proceedings Series macros. 
% Please take the time to read the following comments, as they describe
% how to use these macros. This file can be composed and printed out for
% use as sample output.

% Any comments or questions regarding these macros should be directed to:
%
%                 Edward A. Cilurso
%                 SIAM
%                 3600 University City Science Center
%                 Philadelphia, PA 19104-2688
%                 USA
%                 Telephone: (215) 382-9800, ext. 384
%                 Fax: (215) 386-7999
%                 e-mail: cilurso@siam.org


% This file is to be used as an example for style only. It should not be read
% for content.

%%%%%%%%%%%%%%% PLEASE NOTE THE FOLLOWING STYLE RESTRICTIONS %%%%%%%%%%%%%%%

%%  1. There are no new tags.  Existing LaTeX tags have been formatted to
%%     match the Proceedings series style.  Using LaTeX's automatic numbering
%%     is highly recommended. Use of the manual \leqno or \tag commands
%%     may cause numbering problems.    
%%
%%  2. You must use \cite in the text to mark your reference citations and 
%%     \bibitem in the listing of references at the end of your chapter. See
%%     the examples in the following file. The file siamproc.bst has been 
%%     included for use with BiBTeX. Please be sure to include the appropriate
%%     .bib file with BiBTeX submissions. You may also submit the .bbl file
%%     (run with siamproc.bst) instead of the .bib file.
%% 
%%  3. Unless otherwise stated by your editor, do your chapter as if it
%%     is Chapter 1. The appropriate chapter number will be included during
%%     the production of the proceedings.
%%
%%  4. This macro is set up for three levels of headings (\section, 
%%     \subsection, and \subsubsection). The macro will automatically number 
%%     the headings for you.  If you wish to use an itemized list format,
%%     we suggest {itemize}, including whatever markers you want as follows:
%%     \item[$1.$]		
%%
%%  5. The running heads are indicated by the \markboth command. Please 
%%     define the running heads by placing the authors last names in the 
%%     first field and the title of the paper in the second field.
%%     They should be typed initial cap and lower case. Please see the example.
%%     Neither field can contain more than 50 characters including spaces, 
%%     so please use a  shortened version of the title if necessary. For 
%%     papers with multiple authors please follow these rules: for 
%%     two authors type {Author 1 and Author 2}; for more than two authors type 
%%     {Author 1 et al.}.
%% 
%%  6. Theorems, Lemmas, Definitions, etc. are to be double-numbered, 
%%     indicating the section and the occurrence of that element
%%     within that section. (For example, the first theorem in the second
%%     section would be numbered 2.1.) The macro will automatically do 
%%     the numbering for you.  We ask that authors not hardcode such items,
%%     since this can cause numbering and citation problems.	
%%
%%  7. Proofs are handled with the commands \begin{proof} \end{proof}.
%%     If you wish to use an end of proof box, use \qed preceding \end{proof}.
%%     The example uses one. It is not required.
%%
%%  8. Figures, equations, and tables must be single-numbered. All equation
%%     numbers are to be on the left. Figure captions should be placed under
%%     the figures they pertain to. Table captions should be placed above 
%%     the tables. Use existing LaTeX tags for these elements. Numbering of 
%%     these elements will be done automatically. SIAM supports the use of 
%%     epsfig for including Postscript figures in LaTeX2e, and psfig for the
%%     LaTeX 2.09. SIAM does not support psfrag for figure labelling. If you
%%     have used psfrag, high-quality hardcopies will be absolutely necessary.
%%     All Postscript figures should be sent as separate files. 
%%     A hardcopy version of all Postscript figures is also required. See 
%%     note regarding this under How to Submit Your Paper.
%%   
%%	Here is an example of the best way to include a figure:
%%
%%	\begin{figure}
%%	\centerline{\epsfig{file=myfig.eps}}
%%	\caption{This is my caption.}
%%	\end{figure}
%%
%%	Note that the figure is centered using \centerline, is included as
%%	a single .eps file, and has already been sized properly.  If you have
%%	a figure that is too large, please adjust it using either "width=" or
%%	"height=" after the "file=," such as in this example:
%%
%%	\begin{figure}
%%	\centerline{\epsfig{file=myfig.eps,width=25pc}}
%%	\caption{This is my caption.}
%%	\end{figure}
%%
%%	Any LaTeX-generated figures must still begin and end with \begin{figure}
%%	and \end{figure}, just as tables must be included using \begin{table} 
%%	and \end{table}.  
%%
%%  9. Grant information and author affiliations.
%%     This information is included in the file with the two commands,
%%     \thanks and \footnotemark []. (See example). The \thanks command 
%%     produces a footnote for the title or author, and places the 
%%     appropriate footnote symbol with the title or author and at the
%%     bottom of the page. The \footnotemark [] command allows the use of
%%     duplicate footnote symbols. This macro follows the normal LaTeX order 
%%     of footnote symbols. Below is a list of these symbols, and their 
%%     corresponding footnotemark:
%%
%%      asterisk                \footnotemark[1]
%%      single-dagger           \footnotemark[2]
%%      double-dagger           \footnotemark[3]
%%      section sign            \footnotemark[4]
%%      paragraph               \footnotemark[5]
%%      parallel                \footnotemark[6]
%%      double asterisk         \footnotemark[7]
%%      double single-dagger    \footnotemark[8]
%%      double double-dagger    \footnotemark[9]   
%%
%%    The following general rules for grants and affiliations apply:
%%      a) If there is a single grant for the paper, then the grant 
%%         information should be footnoted to the title.
%%      b) If there is more than one grant, include the grant information
%%         with each author's affiliation.
%%      c) If there are different grants for the paper but the authors share
%%         the same affiliation, footnote the grant information to the title.
%%         For example, The work of the first author was supported by xyz.
%%         The work of the second author was supported by abc. And so on.
%%      d) For authors sharing the same affiliation, use \thanks for the
%%         first author with that affiliation and the appropriate
%%         \footnotemark[] (from the list above) for all subsequent authors
%%         with that affiliation.
%%
%% 10. Special fonts.
%%     SIAM supports the use of AMS-TeX fonts version 2.0 and later. As 
%%     described in the manual for these fonts, they can be included by
%%     \usepackage{amsmath} for LaTeX2e and \input{amssym.def} and 
%%     \input{amssym.tex} for LaTeX 2.09. These macros are not yet
%%     updated to make use of the New Font Selection Scheme (NFSS) of
%%     Mittelbach and Schopf. To make these macros compatible with NFSS, use 
%%     the oldlfont style option. 
%%
%% 11. How to Submit Your Paper.
%%     The electronic version of your paper should be sent to proceed@siam.org.
%%     A hardcopy version should also be submitted. Instructions are included
%%     in your acceptance letter. Please be sure to send hardcopy
%%     versions of any Postscript figures you have submitted electronically.
%%     Be sure to return your signed Copyright Transfer Agreement. We cannot
%%     publish your paper without it.
%%     
%%
%%
%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%-
%%%




\documentclass[leqno,final,11pt]{article}   %You must set up
\usepackage{proc2e}			    %your \documentclass line like this.
\usepackage{latexsym}


%\input{psfig}
\begin{document}
\bibliographystyle{plain}
\cleardoublepage
\pagestyle{myheadings}

\title{Parallelization of Non-Equilibrium Radiation Transport Code}
\author{Igor E. Golovkin\footnotemark[1] \and
Roberto C. Mancini\thanks{Department of Physics, University of
%\author{Igor E. Golovkin; Roberto C. Mancini\thanks{Department of Physics, University of
Nevada, Reno, golovkin@physics.unr.edu; rcman@physics.unr.edu}
\and 
Frederick C. Harris, Jr.\thanks{Department of Computer Science,
University of Nevada, Reno, fredh@cs.unr.edu}
}
\date{}
\maketitle
\pagenumbering{arabic}   %Required

%\begin{abstract}
%short abstract goes here...
%\end{abstract}


	Radiation
	transport effects on X-ray line emission are
	important in modeling spectroscopic-quality, synthetic
	spectra of mid- and high-Z tracer elements in hot dense
	plasmas. Plasmas produced in the laboratory are usually in
	non-equilibrium, i.e. the ionization balance and distribution
	of atomic-level populations are determined by a set of
	collisional and radiative atomic processes. Under these conditions
	the radiation in the plasma and the
	level-population distribution are interdependent and have to be
	determined self-consistently. This involves the simultaneous
	solution of a set of atomic kinetics rate equations and the
	radiation transport equation, a problem that is
	non-linear and non-local. The result is an
	integro-differential problem that in general cannot
	be solved analytically~\cite{mihalas78}.
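
        For orientation, the transport equation for a plane-parallel
        slab can be written in the following standard form; the symbols
        are illustrative notation introduced here, not quantities taken
        from our code:
        \begin{equation}
        \mu \, \frac{\partial I_\nu(z,\mu)}{\partial z}
            = \eta_\nu(z) - \chi_\nu(z) \, I_\nu(z,\mu) ,
        \end{equation}
        where $I_\nu$ is the specific intensity at frequency $\nu$,
        $z$ is the depth in the slab, $\mu$ is the cosine of the angle
        between the ray and the slab normal, and $\eta_\nu$ and
        $\chi_\nu$ are the emissivity and the opacity.  Both $\eta_\nu$
        and $\chi_\nu$ depend on the level populations, while the
        radiative rates in the kinetics equations depend on integrals
        of $I_\nu$ over angle and frequency, which is the origin of the
        interdependence described above.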

	The problem of non-equilibrium radiation transport is
	quite computationally intensive.  Within the last 15
	years Lambda operator techniques have been introduced and have
	produced robust iterative schemes~\cite{sc85}.  We have utilized a
	combination of linearization and the Lambda operator approach.
	Our model focuses on a spectral range that covers
	line transitions relevant for spectroscopy
	diagnostics. We assume a plane parallel slab geometry with no
	incident radiation on either side of the slab.

        The basic outline of this iterative method can be described as
        follows: first we introduce spatial discretization.  Then for
        each spatial zone we:
                (1) linearize the atomic kinetic rate equations;
                (2) replace one of the equations by a particle
                conservation condition, since the set of atomic kinetic
                rate equations is redundant;
                (3) express the radiation-dependent terms in these
                equations through population numbers
                using the Lambda operator, an integral
                operator based on a formal solution of the radiation
                transport equation;
                (4) solve the resulting system
                for corrections to
                population numbers;
                (5) update the values of the population numbers and
                the radiation field and then repeat all these steps
                until a convergence criterion is satisfied.
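
        Schematically, and in illustrative notation that is assumed
        here rather than taken from the code, let $n_i$ be the
        population of level $i$ and $R_{ij}$ the total rate from level
        $i$ to level $j$.  Assuming steady state, the rate equations in
        one zone are $\sum_{j \neq i} ( n_j R_{ji} - n_i R_{ij} ) = 0$
        for $i = 1, \ldots, L$.  Summing over $i$ gives zero
        identically, which is the redundancy removed in step (2) by
        imposing
        \begin{equation}
        \sum_{i=1}^{L} n_i = N ,
        \end{equation}
        with $N$ the total particle number density in the zone.  In
        step (3) the mean intensity entering the radiative rates is
        written as $J_\nu = \Lambda_\nu [ S_\nu ]$, where the source
        function $S_\nu$, the ratio of emissivity to opacity, is a
        function of the populations.  Steps (4) and (5) then amount to
        the update
        \begin{equation}
        n_i^{(k+1)} = n_i^{(k)} + \delta n_i^{(k)} ,
        \end{equation}
        repeated until, for example,
        $\max_i | \delta n_i^{(k)} / n_i^{(k)} | < \epsilon$ for a
        chosen tolerance $\epsilon$; this particular criterion is only
        one possible choice.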

        This procedure allows us to obtain the solution, i.e. the population
        numbers and radiation field for each spatial zone.  The
        method is robust and stable, does not require much
        memory, and converges rapidly.  Its only disadvantage is
        that it is computationally very intensive, mainly
        because of the complexity of the Lambda operator.

        Several approximations and acceleration 
        techniques have been
        developed to reduce the computational time required
        for calculating the Lambda operator.
        Many of these methods have been shown
        to reduce computational time significantly; unfortunately, these
        methods also have several drawbacks.  The first is
        that the choice of an approximate Lambda operator may depend
        on the physical problem under consideration, which
        limits the generality of the code.  The second
        is that an approximate operator may contain
        optimization parameters that cannot be known {\it a priori}
        but have to be found by trial and error.  Finally,
        reducing the computational time
        per iteration may come at the price of
        increasing the number of iteration steps.  This may result in a
        small improvement, no improvement, or, worse yet, a longer
        running time.
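
        In terms of a simple cost model (our notation, introduced only
        for illustration), suppose the exact operator needs $K$
        iterations of cost $t$ each and an approximate operator needs
        $K'$ iterations of cost $t'$ each.  The approximation pays off
        only when
        \begin{equation}
        K' t' < K t ,
        \end{equation}
        so a reduction of the cost per iteration by some factor is
        lost if the iteration count $K'/K$ grows by the same factor or
        more.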

        Because we wanted to keep our algorithm general,
        straightforward, and therefore easy to modify, we chose to
        exploit the power of parallelization instead of
        pursuing these acceleration techniques.  The algorithm lends
        itself well to distributing tasks among processing elements.
        Each iteration consists of
        two steps: building the system of equations {\bf Ax=b} (i.e.
        setting up matrix {\bf A} and vector {\bf b}), and solving it. The matrix's
        dimension is equal to the number of atomic
        levels times the number of discretization points in space.  The
        number of levels is determined by the physics of the problem
        and in many cases is not very high.  The number of spatial
        zones can also be kept small.  The matrix is not
        singular and therefore the system can be solved quite easily.
        What takes most of the computational time is setting up the
        matrix and the right-hand-side vector.  Each non-zero element
        in the matrix contains a combination of triple integrals
        arising from the Lambda operator. These are integrals in space,
        angle and frequency approximated by Gauss-Legendre quadrature
        formulas.  The first two sums can have a moderate number of
        terms, but for the frequency integral we must use a fine grid
        to get high quality spectra.
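
        To make the cost explicit (the symbols and the numbers below
        are hypothetical and serve only as an illustration), let $L$ be
        the number of levels and $D$ the number of spatial zones, so
        that {\bf A} is an $LD \times LD$ matrix; for instance, $L = 10$
        and $D = 20$ would give a $200 \times 200$ system.  An
        $n$-point Gauss-Legendre rule with nodes $x_k$ and weights
        $w_k$ on $[-1,1]$ approximates
        \begin{equation}
        \int_a^b f(x) \, dx \approx \frac{b-a}{2} \sum_{k=1}^{n}
            w_k \, f\!\left( \frac{b-a}{2} \, x_k + \frac{a+b}{2} \right) ,
        \end{equation}
        and applying such a rule in space, angle and frequency with
        $n_z$, $n_\mu$ and $n_\nu$ points turns each triple integral
        into a triple sum of $n_z n_\mu n_\nu$ terms.  The work per
        matrix element therefore grows in proportion to the size
        $n_\nu$ of the fine frequency grid.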

        From the physics point of view, each row of the matrix shows how
        the population of a particular atomic level in a particular spatial
        zone is affected by all the other levels. The rows of this
        matrix are completely independent, so each can be
        calculated by a separate processor.  Hence our main
        efforts have focused on parallelizing the matrix
        set-up: an almost perfect speed-up can be expected as
        long as the matrix dimension is divisible by the number of
        processors, and each processor can be assigned the same amount
        of work.
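
        This expectation can be quantified with a simple model, under
        the assumptions (ours, for illustration) that every row costs
        the same time $t$, that rows are assigned statically, and that
        the solve and any parallel overhead are negligible.  For a
        matrix with $M$ rows and $P$ processors the set-up time is
        $T_P = \lceil M/P \rceil \, t$, so the speed-up is
        \begin{equation}
        S_P = \frac{T_1}{T_P} = \frac{M}{\lceil M/P \rceil} ,
        \end{equation}
        which equals $P$ exactly when $P$ divides $M$.  With
        hypothetical values, $M = 200$ and $P = 8$ give
        $S_8 = 200/25 = 8$, whereas $M = 100$ and $P = 8$ give
        $S_8 = 100/13 \approx 7.7$.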

        We started our work by writing a sequential version of the
        program.  This was done for two reasons. First, we needed a
        working sequential code to serve as a diagnostic tool for our
        parallel code, and second, we used the sequential code as a
        springboard for our parallel development.  The parallel code
        was developed and tested on our SGI Power Challenge machine
        with 8 processors and shared memory. Close to linear speed-up
        has been achieved, as shown in Figure~1.


        \begin{figure}[h]
        %\centerline{\psfig{figure=fig.ps,height=2.0in}}
        \begin{center}
	{\footnotesize
        \input{fig.tex}
	}
        \end{center}
        \caption{Problem Speed-up vs. the Number of Processors Used}
        \end{figure}
        %Our initial
        %results are very promising, and we expect that we will achieve
        %similar performance when the code is ported to a larger
        %machine.

\quad
\baselineskip=12pt
\parindent=0pt
\bibliography{/staff/fredh/papers/bibdir/physics}

\end{document}

% end of example file.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

