Changeset 134

Timestamp:
12/27/06 22:12:42
Author:
ant_39
Message:
  • Write an introductory chapter Why GCC, Why Now.
  • Bend fitprj to give me proper stuffing for a thesis written in English.
Files:

  • trunk/doc/dp.tex

    r125 r134  
    1 \documentclass[czech]{fitprj} 
     1\documentclass[english]{fitprj} 
    22%\documentclass[a4paper,11pt]{report} 
    33%\usepackage{a4wide} 
     
    1717\topmargin 0in 
    1818 
    19 \date{29. října
    20 \setyear{2006
     19\date{January 3rd
     20\setyear{2007
    2121\author{Bc. Petr Machata} 
    2222\title{Construction of GNU Compiler Collection Frontend} 
    23 \FITproject{DP} 
    24  
    25 \edeclaration{ 
    26     Prohlašuji, že jsem tuto diplomovou práci vypracoval samostatně pod 
    27     vedením Ing. Miloše Eysselta, CSc. 
    28     Další informace mi poskytl zaměstnanec firmy ANF Data, Ing. Lukáš Szemla 
    29     % ujistit se ze je to tak spravne, jestli neni nahodou taky vedouci 
    30     Uvedl jsem všechny literární prameny a publikace, ze kterých jsem čerpal. 
    31     } 
    32  
    33 \acknowledgements{ 
    34 Thanks to... (FIT BUT that they gave up their rights) 
    35 
     23\FITproject{SP} 
     24 
     25\def\Algol{{\sc Algol}\space} 
     26 
     27\edeclaration{ I hereby declare that this work has been created by me, 
     28  under the supervision of Miloš Eysselt, and under technical lead of 
     29  Lukáš Szemla.  All information resources that I have used are 
     30  properly cited.  } 
     31 
     32\acknowledgements{ I owe thanks for patience and advice to the 
     33  thesis' technical advisor, Lukáš Szemla, who was bombarded by my 
     34  status reports monthly; and Professor Jan van Katwijk of Delft 
     35  University of Technology, for advice regarding odds and ends of 
     36  \Algol 60. } 
    3637 
    3738 
     
    4950% tady bude zadani 
    5051\grantrights 
     52\abstractkeywordsCZ 
    5153\abstractkeywords 
    52 \abstractkeywordsCZ 
    5354\FITstart 
    54  
    55 \def\Algol{{\sc Algol}\space} 
    56 \def\GCC{{\sc GCC}\space} 
    57 \def\C{{\sc C}\space} 
    5855 
    5956% for various computer-related terms 
     
    7875\def\output#1{{\ttfamily\begin{tabbing}#1\end{tabbing}}} 
    7976 
    80 % ==================================================================== 
    81  
    82 \tableofcontents 
     77\def\term#1{{\it #1}} 
    8378 
    8479% ==================================================================== 
     
    8984insight to how to read it. 
    9085 
     86In this work, I will refer to myself as ``me'', and a reader as 
     87``you''. 
     88 
    9189\chapter{Why GCC, Why Now} 
    9290 
    93 (Reference: Tom Tromey's Java paper) Summarize options available when 
    94 one writes new language.  Interpreters and compilers, hand-rolling 
    95 compiler, compiling via C (this sucks mostly because C compiler has to 
    96 recover the higher level model from C, which of course isn't possible; 
    97 and debugging--GCC will preserve symbol names where necessary, while 
    98 in C, you'd have to mangle them without chance to give a clue to 
    99 target C compiler (you want to emit \#line directives to point the user 
    100 to the place where the error originated, and in debugger you have no 
    101 way to hide the fact that you are going through C)), compiling via GCC 
    102 (what if it's dynamic language, what if static). 
    103  
    104 Personal opinions aside, writing GCC frontend is probably best choice. 
    105 GCC is ported to dozens of platforms, has tons of optimizations, has 
    106 necessary community, corporation and academia drive, and finally 
    107 writing such frontend isn't nearly as difficult as rolling your own 
    108 backend (fingers crossed it's true). 
    109 Even rolling C backend would be a pain for some language features, 
    110 such as exception handling and object orientation, that GCC readily 
    111 supports. 
    112  
    113 Describe briefly that the author has written \Algol 60 parser 
    114 \cite{TR:ALGOL60} 
    115 to get 
    116 himself familiar with the stuff he describes.  Maybe drop a few notes 
    117 about how the parser (independent from GCC) and GCC frontend got 
    118 joined. 
     91When facing a~task of engineering a processor 
     92\footnote{In a general sense of a tool that allows direct or indirect 
     93  execution of a program in a given language.} of a given language, 
     94you have several options. 
     95 
     96\section{Language Processors Breakdown} 
     97 
     98You could write an interpreter, a tool that, given a~program, emulates 
     99its actions token by token without restricting to any form of 
     100compilation into intermediate code.  Interpreter has the advantage of 
     101being rather simple to write.  And if the language still isn't sorted 
     102out completely, it will be simple to adjust the processor. 
     103 
     104You could write a compiler that emits other high-level language, such 
     105as C.  Compilation via C is quite popular, but it has drawbacks. 
     106E.g. programs in C will contain artifacts introduced during the 
     107compilation, and those will be visible in debugger.  C may not support 
     108all the necessary constructs that your language needs, and you may 
     109have to use an even higher-level language.  You will typically have to 
     110give artificial names to various thunks of code, and mangle program 
     111identifiers.  On the other hand, C is well understood, with 
     112ubiquitous compilers, and tons of documentation. 
     113 
     114Another option is producing virtual machine instructions, such as JVM 
     115or CLR.  This can be advantageous, if you can count on having the 
     116target machine on binary host site.  Virtual machines typically do 
     117just in time compilation, so you can experience near-native speed of 
     118programs.  And they can be ported to several platforms, which means 
     119your compiler will be portable for free. 
     120 
     121\section{Processing With GCC} 
     122 
     123This thesis describes yet another option: writing the processor as a 
     124part of a well known compiler suite, GCC. 
     125 
     126Crafting GCC frontends used to be hard.  One had to understand quirks 
     127of RTL, GCC's intermediate language, which was neither easy, nor 
     128high-level.  But things changed: GCC team took tree language used as 
     129an AST representation in C and C++ frontends, and generalized it into 
     130an official intermediate language called GENERIC.  Work with GENERIC 
     131is rather easy\cite{LJ:2005:Tromey}, on par, I'd say, with emitting C. 
     132Unlike C, there are all GCC bells and whistles: attributes, inline 
     133assembly, OpenMP, whole C++ (albeit e.g. templates make up a {\em 
     134  very} obscure corner).  So it's very powerful platform, compared 
     135with C. 
     136 
     137Unlike C, however, the documentation is rather thin.  Quite often I 
     138found myself scanning through other frontends in hunt for particular 
     139usage of some feature.  Quite often GCC died on me with assertion 
     140error, and I had to look up what went wrong, and figure out why.  This 
     141is actually easier than it sounds, but the fact is, GCC would benefit 
     142from better documentation. 
     143 
     144GCC is a work in progress.  For some ten years as of now.  Things change. 
     145If you won't get your frontend into GCC trunk, you will have to deal 
     146with those changes yourself.  And as far as I know, new languages are 
     147not exactly the priority of GCC team, supposedly for exactly the 
     148reason that they would have to maintain them.  It would have to be 
     149very high-impact language for that to happen.  If you won't carry your 
     150compiler forward, it will become irrelevant as time passes.  Just as 
     151are GCC-2.9x frontends irrelevant today, and just as GCC-3.x frontends 
     152will become one day. 
     153 
     154Given GCC's strong C heritage, it's still best fit for compilation of 
     155C-like languages.  Currently the GCC family contains C, C++, their 
     156Objective variants, Java, Fortran, and Ada.  Those are all rather 
     157C-like languages.  I can imagine compiling something like Python 
     158through GCC, and Mercury, a declarative logic/functional language was 
     159implemented as GCC frontend.  But still the best match will be for 
     160imperative C-like languages. 
     161 
     162Last but not least, GCC is a mature, even if underdocumented compiler. 
     163Big companies depend on GCC, on both the vendor and the consumer 
     164sides.  It is ported to a huge number of platforms, and has the 
     165necessary drive and impact.  Lots of people know how to use it, build 
     166it, package it, distribute it, and that means lots of people will know 
     167how to work with your frontend from the day one.  This is very 
     168important.  Clever usage of GCC features will make your processor 
     169another option for wide range of tasks, from parallel to systems 
     170programming. 
     171 
     172\section{What's Ahead} 
     173 
     174I have written an \Algol 60 \cite{TR:ALGOL60} frontend for 
     175GCC\footnote{\url{http://projects.almad.net/gcc-algol}} to get myself 
     176familiar with the platform.  I was expecting days\footnote{Or rather 
     177  nights. With lots of tea.} spent in GCC's internals, digging in code 
     178knee deep.  Much to my surprise, no such thing happened.  I had my 
     179share of cursing, but overall I was pleasantly surprised.  I doubt 
     180going via C would make the work significantly easier. 
     181 
     182What's ahead is description of my experience with development of 
     183\Algol 60 frontend.  It's a mix of things from GCC Internals 
     184documentation \cite{TR:GCCInt}, things cut'n'pasted from other 
     185people's code and later analyzed, and things either found in GCC 
     186comments or tried by chance and found to work, all wrapped up and 
     187delivered as a continuous, and hopefully coherent text. 
     188 
     189 
    119190 
    120191\chapter{GCC Architecture} 
     
    224295\section{Numbers} 
    225296 
    226 Elaborate on numeric support in \GCC.  Including big integers, long 
     297Elaborate on numeric support in GCC.  Including big integers, long 
    227298floats, what's necessary big/little endian-wise, if there's anything 
    228299to know about bits-per-byte, complex numbers (and what are possible 
     
    270341\section{Commandline options} 
    271342 
    272 Each \GCC frontend has the capability of processing commandline 
    273 options.  Moreover it inherits all the options from \GCC proper, so 
     343Each GCC frontend has the capability of processing commandline 
     344options.  Moreover it inherits all the options from GCC proper, so 
    274345e.g. \option{-O3}, \option{-o file} and others are available for all 
    275346frontends with no work.  The only work is necessary for definition of 
     
    277348commandline parsing left off programmer's shoulders. 
    278349 
    279 \GCC understands both positive and negative variants of \option{-f}, 
     350GCC understands both positive and negative variants of \option{-f}, 
    280351\option{-W} and \option{-m} options.  E.g. when your frontend supports 
    281 \option{-fdump-ast}, \GCC will understand also \option{-fno-dump-ast}. 
     352\option{-fdump-ast}, GCC will understand also \option{-fno-dump-ast}. 
    282353Furthermore, each option can be parametrized.  Thus you can have 
    283354e.g. \option{-{}-output-pch=} option for output of precompiled headers, 
     
    287358Of course, programmer has to write the handler for frontend-specific 
    288359options herself.  All work takes place in 
    289 \function{LANG\_HOOKS\_HANDLE\_OPTION} hook, \GCC calls this function 
     360\function{LANG\_HOOKS\_HANDLE\_OPTION} hook, GCC calls this function 
    290361each time it hits an option that the frontend understands.  The 
    291 communication isn't done through option strings, though.  Instead, \GCC 
     362communication isn't done through option strings, though.  Instead, GCC 
    292363associates each option a symbolic identifier with unique integer 
    293364value.  When option is handled, simple \literal{switch} statement can 
     
    308379All frontend-specific options are defined in \file{lang.opt}.  This 
    309380file gives, through build magic, rise to \file{options.h}.  Format and 
    310 features of \file{lang.opt} are to be found in \GCC internals 
     381features of \file{lang.opt} are to be found in GCC internals 
    311382documentation 
    312383\cite[chapter Options]{TR:GCCInt}. 
     
    336407 
    337408Runtime support also includes system's \literal{libc} and 
    338 \literal{libm}.  To honor \GCC interfaces, you will have to pay some 
     409\literal{libm}.  To honor GCC interfaces, you will have to pay some 
    339410attention to these, too. 
    340411 
    341 \GCC doesn't support integration of the runtime library with the same 
     412GCC doesn't support integration of the runtime library with the same 
    342413ease it supports new language frontends.  There are files to be 
    343414patched, an operation inherently unsafe in a volatile environment of 
    344 \GCC trunk.  Apart from that, however, the automation works nicely, 
     415GCC trunk.  Apart from that, however, the automation works nicely, 
    345416and with some rules in mind, you can build cross-compilation safe 
    346417runtime library that is linked to your binaries by default.  Let's get 
     
    367438simply copy preexisting library files there. 
    368439 
    369 \GCC expects that the library provides a \file{configure} script, 
     440GCC expects that the library provides a \file{configure} script, 
    370441which, when launched, creates \file{Makefile}.  (Note that the created 
    371442\file{Makefile} has to reside in a {\em build} directory, but the 
     
    377448bend it to suit your needs.  The files are mostly classical autotools 
    378449source files, but there are certain hacks here and there necessary for 
    379 integration into \GCC build system.  In particular: 
     450integration into GCC build system.  In particular: 
    380451 
    381452\begin{itemize} 
    382453\item \file{configure.ac} has to call \variable{GCC\_TOPLEV\_SUBDIRS} 
    383454  macro after \variable{AC\_INIT}.  This relates to the dreaded 
    384   \GCC{}'s build/host/target trichotomy.  Each of builds is separated 
     455  GCC{}'s build/host/target trichotomy.  Each of builds is separated 
    385456  in directory of its own: \variable{build\_subdir}, 
    386457  \variable{host\_subdir}, and \variable{target\_subdir}.  This macro 
     
    409480 
    410481One last step is necessary before your library gets built as part of 
    411 build process.  \GCC has to {\em know} about it.  As it is, you have 
    412 to touch \GCC{}'s privates to do it: you need to patch toplevel 
     482build process.  GCC has to {\em know} about it.  As it is, you have 
     483to touch GCC{}'s privates to do it: you need to patch toplevel 
    413484\file{configure} and \file{Makefile.def}.  Fortunately, both patches 
    414485are trivial. 
     
    431502\subsection{Linking the Binaries With Runtime Library} 
    432503 
    433 The method that \GCC uses to decide which runtime libraries to link in 
     504The method that GCC uses to decide which runtime libraries to link in 
    434505is pretty much a hack: you will inject necessary libraries into the 
    435506commandline, before it's processed. 
     
    444515be), in which situation you will add or remove it from the vector and 
    445516adjust \variable{argc} accordingly.  It's necessary to support native 
    446 \GCC switches, such as \option{-nostdlib} and \option{-nodefaultlibs}, 
     517GCC switches, such as \option{-nostdlib} and \option{-nodefaultlibs}, 
    447518as well as your own switches that imply that no linking is to be done 
    448519(e.g. \option{-fsyntax-only} or similar). 
     
    461532 
    462533@TODO:maybe move to numerical support chapter: 
    463 Since version 4.3.0, \GCC uses \literal{libgmp} to do its constant 
     534Since version 4.3.0, GCC uses \literal{libgmp} to do its constant 
    464535folding, which means that the library will be available at compiler 
    465536runtime.  Very often that means the library will also be available at 
    466537binary run site.  That doesn't hold universally, but at least 
    467 \literal{libgmp} was ported to all platforms where \GCC has supported 
     538\literal{libgmp} was ported to all platforms where GCC has supported 
    468539backends.  What this means for you is that you can dispatch to this 
    469540library many numerical algorithms that would otherwise have to be 
  • trunk/doc/fitprj.cls

    r102 r134  
    2525 
    2626\ifx\pdfoutput\undefined  % nejedeme pod pdftexem 
     27  \usepackage[czech,english]{babel} 
     28  \usepackage[latin2]{inputenc} 
    2729  \ifczech 
    28     \usepackage[czech,english]{babel} 
    2930    \main@language{czech} 
    30     \usepackage[latin2]{inputenc} 
    31   \else 
    32     \usepackage[english]{babel} 
     31  \else 
     32    \main@language{english} 
    3333  \fi 
    3434  \usepackage{graphics} 
    3535  \usepackage{epsfig} 
    3636\else % je to pdftex ! 
     37  \usepackage[czech,english]{babel} 
     38  \usepackage[latin2]{inputenc} 
    3739  \ifczech 
    38     \usepackage[czech,english]{babel} 
    3940    \main@language{czech} 
    40     \usepackage[latin2]{inputenc} 
    41   \else 
    42     \usepackage[english]{babel} 
     41  \else 
     42    \main@language{english} 
    4343  \fi 
    4444  \usepackage[pdftex]{graphicx} 
     
    9393  \def\@upgmname{Department of Computer Graphics and Multimedia} 
    9494 
     95  \def\@keywordsnameCZ{Klíčová slova} 
    9596  \def\@keywordsname{Keywords} 
     97  \def\@abstractnameCZ{Abstrakt} 
    9698  \def\@abstractname{Abstract} 
    9799  \def\@projectBP{Bachelor project} 
     
    102104  \def\@projectPGSD{Doctoral Thesis} 
    103105 
    104   \def\@grantstuff{{\em The author thereby grants to \@vutname \@fitnam
     106  \def\@grantstuff{{\em The author thereby grants to \@vutname \space \@fitname \spac
    105107                   permission to reproduce and distribute copies 
    106108   of this thesis document in whole or in part.}} 
    107109  \def\@authsign{Signature of the Author} 
    108110  \def\@suprsign{Certified by} 
    109   \def\@submittext{Submitted %to the {\em \@departmentname}  
    110                    in partial %%%% as??? PP 
    111                    fulfillment of the requirements for 
    112    \@requirementstext at} 
     111  \def\@submittext{Submitted to Faculty of Information Technologies % 
     112  of the Brno University of Technology} 
    113113  \def\@acknowname{Acknowledgements} 
     114  \def\@prohlaseni{Declaration} 
    114115  \def\@tocheader{Contents} 
    115116\fi 
     
    160161   %  {\Large \sc \MakeUppercase{\@fitnameL} }\\[5mm] 
    161162   %  \@date 
    162    \\  dne \@date 
     163\ifczech 
     164   \\  dne 
     165\else 
     166   \\  at 
     167\fi 
     168   \@date 
    163169   %\end{center} 
    164170 
     
    211217} 
    212218 
    213 \ifczech 
    214219\newcommand{\abstractkeywordsCZ}{% 
    215220  \thispagestyle{empty} 
     
    234239  \clearpage 
    235240} 
    236 \fi 
    237241 
    238242\newcommand{\FITstart}{% 
     
    264268} 
    265269 
    266 \ifczech 
    267270\newcommand{\fitabstractCZ}[1]{% 
    268271  \def\@abstracttextCZ{#1} 
     
    271274  \def\@keywordstextCZ{#1} 
    272275} 
    273 \fi 
    274276 
    275277\newcommand{\requirements}[1]{%