Changeset 134
- Timestamp:
- 12/27/06 22:12:42
- Files:
- trunk/doc/dp.tex (modified) (17 diffs)
- trunk/doc/fitprj.cls (modified) (8 diffs)
trunk/doc/dp.tex
--- trunk/doc/dp.tex (r125)
+++ trunk/doc/dp.tex (r134)
@@ -1,3 +1,3 @@
-\documentclass[czech]{fitprj}
+\documentclass[english]{fitprj}
 %\documentclass[a4paper,11pt]{report}
 %\usepackage{a4wide}
@@ -17,21 +17,22 @@
 \topmargin 0in
 
-\date{29. října}
-\setyear{2006}
+\date{January 3rd}
+\setyear{2007}
 \author{Bc. Petr Machata}
 \title{Construction of GNU Compiler Collection Frontend}
-\FITproject{DP}
-
-\edeclaration{
-Prohlašuji, že jsem tuto diplomovou práci vypracoval samostatně pod
-vedením Ing. Miloše Eysselta, CSc.
-Další informace mi poskytl zaměstnanec firmy ANF Data, Ing. Lukáš Szemla
-% ujistit se ze je to tak spravne, jestli neni nahodou taky vedouci
-Uvedl jsem všechny literární prameny a publikace, ze kterých jsem čerpal.
-}
-
-\acknowledgements{
-Thanks to... (FIT BUT that they gave up their rights)
-}
+\FITproject{SP}
+
+\def\Algol{{\sc Algol}\space}
+
+\edeclaration{ I hereby declare that this work has been created by me,
+under the supervision of Miloš Eysselt, and under technical lead of
+Lukáš Szemla. All information resources that I have used are
+properly cited. }
+
+\acknowledgements{ I owe thanks for patience and advice to the
+thesis' technical advisor, Lukáš Szemla, who was bombarded by my
+status reports monthly; and Professor Jan van Katwijk of Delft
+University of Technology, for advice regarding odds and ends of
+\Algol 60. }
 
 
@@ -49,11 +50,7 @@
 % tady bude zadani
 \grantrights
+\abstractkeywordsCZ
 \abstractkeywords
-\abstractkeywordsCZ
 \FITstart
-
-\def\Algol{{\sc Algol}\space}
-\def\GCC{{\sc GCC}\space}
-\def\C{{\sc C}\space}
 
 % for various computer-related terms
@@ -78,7 +75,5 @@
 \def\output#1{{\ttfamily\begin{tabbing}#1\end{tabbing}}}
 
-% ====================================================================
-
-\tableofcontents
+\def\term#1{{\it #1}}
 
 % ====================================================================
@@ -89,32 +84,108 @@
 insight to how to read it.
 
+In this work, I will refer to myself as ``me'', and a reader as
+``you''.
+
 \chapter{Why GCC, Why Now}
 
-(Reference: Tom Tromey's Java paper) Summarize options available when
-one writes new language. Interpreters and compilers, hand-rolling
-compiler, compiling via C (this sucks mostly because C compiler has to
-recover the higher level model from C, which of course isn't possible;
-and debugging--GCC will preserve symbol names where necessary, while
-in C, you'd have to mangle them without chance to give a clue to
-target C compiler (you want to emit \#line directives to point the user
-to the place where the error originated, and in debugger you have no
-way to hide the fact that you are going through C)), compiling via GCC
-(what if it's dynamic language, what if static).
-
-Personal opinions aside, writing GCC frontend is probably best choice.
-GCC is ported to dozens of platforms, has tons of optimizations, has
-necessary community, corporation and academia drive, and finally
-writing such frontend isn't nearly as difficult as rolling your own
-backend (fingers crossed it's true).
-Even rolling C backend would be a pain for some language features,
-such as exception handling and object orientation, that GCC readily
-supports.
-
-Describe briefly that the author has written \Algol 60 parser
-\cite{TR:ALGOL60}
-to get
-himself familiar with the stuff he describes. Maybe drop a few notes
-about how the parser (independent from GCC) and GCC frontend got
-joined.
+When facing a~task of engineering a processor
+\footnote{In a general sense of a tool that allows direct or indirect
+execution of a program in a given language.} of a given language,
+you have several options.
+
+\section{Language Processors Breakdown}
+
+You could write an interpreter, a tool that, given a~program, emulates
+its actions token by token without resorting to any form of
+compilation into intermediate code. Interpreter has the advantage of
+being rather simple to write. And if the language still isn't sorted
+out completely, it will be simple to adjust the processor.
+
+You could write a compiler that emits other high-level language, such
+as C. Compilation via C is quite popular, but it has drawbacks.
+E.g. programs in C will contain artifacts introduced during the
+compilation, and those will be visible in debugger. C may not support
+all the necessary constructs that your language needs, and you may
+have to use an even higher-level language. You will typically have to
+give artificial names to various thunks of code, and mangle program
+identifiers. On the other hand, C is well understood, with
+ubiquitous compilers, and tons of documentation.
+
+Another option is producing virtual machine instructions, such as JVM
+or CLR. This can be advantageous, if you can count on having the
+target machine on binary host site. Virtual machines typically do
+just in time compilation, so you can experience near-native speed of
+programs. And they can be ported to several platforms, which means
+your compiler will be portable for free.
+
+\section{Processing With GCC}
+
+This thesis describes yet another option: writing the processor as a
+part of a well known compiler suite, GCC.
+
+Crafting GCC frontends used to be hard. One had to understand quirks
+of RTL, GCC's intermediate language, which was neither easy, nor
+high-level. But things changed: GCC team took tree language used as
+an AST representation in C and C++ frontends, and generalized it into
+an official intermediate language called GENERIC. Work with GENERIC
+is rather easy\cite{LJ:2005:Tromey}, on par, I'd say, with emitting C.
+Unlike C, there are all GCC bells and whistles: attributes, inline
+assembly, OpenMP, whole C++ (albeit e.g. templates make up a {\em
+very} obscure corner). So it's very powerful platform, compared
+with C.
+
+Unlike C, however, the documentation is rather thin. Quite often I
+found myself scanning through other frontends in hunt for particular
+usage of some feature. Quite often GCC died on me with assertion
+error, and I had to look up what went wrong, and figure out why. This
+is actually easier than it sounds, but the fact is, GCC would benefit
+from better documentation.
+
+GCC is work in progress. For some ten years as of now. Things change.
+If you won't get your frontend into GCC trunk, you will have to deal
+with those changes yourself. And as far as I know, new languages are
+not exactly the priority of GCC team, supposedly for exactly the
+reason that they would have to maintain them. It would have to be
+very high-impact language for that to happen. If you won't carry your
+compiler forward, it will become irrelevant as time passes. Just as
+are GCC-2.9x frontends irrelevant today, and just as GCC-3.x frontends
+will become one day.
+
+Given GCC's strong C heritage, it's still best fit for compilation of
+C-like languages. Currently the GCC family contains C, C++, their
+Objective variants, Java, Fortran, and Ada. Those are all rather
+C-like languages. I can imagine compiling something like Python
+through GCC, and Mercury, a declarative logic/functional language was
+implemented as GCC frontend. But still the best match will be for
+imperative C-like languages.
+
+Last but not least, GCC is a mature, even if underdocumented compiler.
+Big companies depend on GCC, on both the vendor and the consumer
+sides. It is ported to a huge number of platforms, and has the
+necessary drive and impact. Lots of people know how to use it, build
+it, package it, distribute it, and that means lots of people will know
+how to work with your frontend from the day one. This is very
+important. Clever usage of GCC features will make your processor
+another option for wide range of tasks, from parallel to systems
+programming.
+
+\section{What's Ahead}
+
+I have written an \Algol 60 \cite{TR:ALGOL60} frontend for
+GCC\footnote{\url{http://projects.almad.net/gcc-algol}} to get myself
+familiar with the platform. I was expecting days\footnote{Or rather
+nights. With lots of tea.} spent in GCC's internals, digging in code
+knee deep. Much to my surprise, no such thing happened. I had my
+share of cursing, but overall I was pleasantly surprised. I doubt
+going via C would make the work significantly easier.
+
+What's ahead is description of my experience with development of
+\Algol 60 frontend. It's a mix of things from GCC Internals
+documentation \cite{TR:GCCInt}, things cut'n'pasted from other
+people's code and later analyzed, and things either found in GCC
+comments or tried by chance and found to work, all wrapped up and
+delivered as a continuous, and hopefully coherent text.
+
+
 
 \chapter{GCC Architecture}
@@ -224,5 +295,5 @@
 \section{Numbers}
 
-Elaborate on numeric support in \GCC. Including big integers, long
+Elaborate on numeric support in GCC. Including big integers, long
 floats, what's necessary big/little endian-wise, if there's anything
 to know about bits-per-byte, complex numbers (and what are possible
@@ -270,6 +341,6 @@
 \section{Commandline options}
 
-Each \GCC frontend has the capability of processing commandline
-options. Moreover it inherits all the options from \GCC proper, so
+Each GCC frontend has the capability of processing commandline
+options. Moreover it inherits all the options from GCC proper, so
 e.g. \option{-O3}, \option{-o file} and others are available for all
 frontends with no work. The only work is necessary for definition of
@@ -277,7 +348,7 @@
 commandline parsing left off programmer's shoulders.
 
-\GCC understands both positive and negative variants of \option{-f},
+GCC understands both positive and negative variants of \option{-f},
 \option{-W} and \option{-m} options. E.g. when your frontend supports
-\option{-fdump-ast}, \GCC will understand also \option{-fno-dump-ast}.
+\option{-fdump-ast}, GCC will understand also \option{-fno-dump-ast}.
 Furthermore, each option can be parametrized. Thus you can have
 e.g. \option{-{}-output-pch=} option for output of precompiled headers,
@@ -287,7 +358,7 @@
 Of course, programmer has to write the handler for frontend-specific
 options herself. All work takes place in
-\function{LANG\_HOOKS\_HANDLE\_OPTION} hook, \GCC calls this function
+\function{LANG\_HOOKS\_HANDLE\_OPTION} hook, GCC calls this function
 each time it hits an option that the frontend understands. The
-communication isn't done through option strings, though. Instead, \GCC
+communication isn't done through option strings, though. Instead, GCC
 associates each option a symbolic identifier with unique integer
 value. When option is handled, simple \literal{switch} statement can
@@ -308,5 +379,5 @@
 All frontend-specific options are defined in \file{lang.opt}. This
 file gives, through build magic, rise to \file{options.h}. Format and
-features of \file{lang.opt} are to be found in \GCC internals
+features of \file{lang.opt} are to be found in GCC internals
 documentation
 \cite[chapter Options]{TR:GCCInt}.
@@ -336,11 +407,11 @@
 
 Runtime support also includes system's \literal{libc} and
-\literal{libm}. To honor \GCC interfaces, you will have to pay some
+\literal{libm}. To honor GCC interfaces, you will have to pay some
 attention to these, too.
 
-\GCC doesn't support integration of the runtime library with the same
+GCC doesn't support integration of the runtime library with the same
 ease it supports new language frontends. There are files to be
 patched, an operation inherently unsafe in a volatile environment of
-\GCC trunk. Apart from that, however, the automation works nicely,
+GCC trunk. Apart from that, however, the automation works nicely,
 and with some rules in mind, you can build cross-compilation safe
 runtime library that is linked to your binaries by default. Let's get
@@ -367,5 +438,5 @@
 simply copy preexisting library files there.
 
-\GCC expects that the library provides a \file{configure} script,
+GCC expects that the library provides a \file{configure} script,
 which, when launched, creates \file{Makefile}. (Note that the created
 \file{Makefile} has to reside in a {\em build} directory, but the
@@ -377,10 +448,10 @@
 bend it to suit your needs. The files are mostly classical autotools
 source files, but there are certain hacks here and there necessary for
-integration into \GCC build system. In particular:
+integration into GCC build system. In particular:
 
 \begin{itemize}
 \item \file{configure.ac} has to call \variable{GCC\_TOPLEV\_SUBDIRS}
 macro after \variable{AC\_INIT}. This relates to the dreaded
-\GCC{}'s build/host/target trichotomy. Each of builds is separated
+GCC{}'s build/host/target trichotomy. Each of builds is separated
 in directory of its own: \variable{build\_subdir},
 \variable{host\_subdir}, and \variable{target\_subdir}. This macro
@@ -409,6 +480,6 @@
 
 One last step is necessary before your library gets built as part of
-build process. \GCC has to {\em know} about it. As it is, you have
-to touch \GCC{}'s privates to do it: you need to patch toplevel
+build process. GCC has to {\em know} about it. As it is, you have
+to touch GCC{}'s privates to do it: you need to patch toplevel
 \file{configure} and \file{Makefile.def}. Fortunately, both patches
 are trivial.
@@ -431,5 +502,5 @@
 \subsection{Linking the Binaries With Runtime Library}
 
-The method that \GCC uses to decide which runtime libraries to link in
+The method that GCC uses to decide which runtime libraries to link in
 is pretty much a hack: you will inject necessary libraries into the
 commandline, before it's processed.
@@ -444,5 +515,5 @@
 be), in which situation you will add or remove it from the vector and
 adjust \variable{argc} accordingly. It's necessary to support native
-\GCC switches, such as \option{-nostdlib} and \option{-nodefaultlibs},
+GCC switches, such as \option{-nostdlib} and \option{-nodefaultlibs},
 as well as your own switches that imply that no linking is to be done
 (e.g. \option{-fsyntax-only} or similar).
@@ -461,9 +532,9 @@
 
 @TODO:maybe move to numerical support chapter:
-Since version 4.3.0, \GCC uses \literal{libgmp} to do its constant
+Since version 4.3.0, GCC uses \literal{libgmp} to do its constant
 folding, which means that the library will be available at compiler
 runtime. Very often that means the library will also be available at
 binary run site. That doesn't hold universally, but at least
-\literal{libgmp} was ported to all platforms where \GCC has supported
+\literal{libgmp} was ported to all platforms where GCC has supported
 backends. What this means for you is that you can dispatch to this
 library many numerical algorithms that would otherwise have to be
trunk/doc/fitprj.cls
--- trunk/doc/fitprj.cls (r102)
+++ trunk/doc/fitprj.cls (r134)
@@ -25,20 +25,20 @@
 
 \ifx\pdfoutput\undefined % nejedeme pod pdftexem
+\usepackage[czech,english]{babel}
+\usepackage[latin2]{inputenc}
 \ifczech
-\usepackage[czech,english]{babel}
 \main@language{czech}
-\usepackage[latin2]{inputenc}
-\else
-\usepackage[english]{babel}
+\else
+\main@language{english}
 \fi
 \usepackage{graphics}
 \usepackage{epsfig}
 \else % je to pdftex !
+\usepackage[czech,english]{babel}
+\usepackage[latin2]{inputenc}
 \ifczech
-\usepackage[czech,english]{babel}
 \main@language{czech}
-\usepackage[latin2]{inputenc}
-\else
-\usepackage[english]{babel}
+\else
+\main@language{english}
 \fi
 \usepackage[pdftex]{graphicx}
@@ -93,5 +93,7 @@
 \def\@upgmname{Department of Computer Graphics and Multimedia}
 
+\def\@keywordsnameCZ{Klíčová slova}
 \def\@keywordsname{Keywords}
+\def\@abstractnameCZ{Abstrakt}
 \def\@abstractname{Abstract}
 \def\@projectBP{Bachelor project}
@@ -102,14 +104,13 @@
 \def\@projectPGSD{Doctoral Thesis}
 
-\def\@grantstuff{{\em The author thereby grants to \@vutname \@fitname
+\def\@grantstuff{{\em The author thereby grants to \@vutname \space \@fitname \space
 permission to reproduce and distribute copies
 of this thesis document in whole or in part.}}
 \def\@authsign{Signature of the Author}
 \def\@suprsign{Certified by}
-\def\@submittext{Submitted %to the {\em \@departmentname}
-in partial %%%% as??? PP
-fulfillment of the requirements for
-\@requirementstext at}
+\def\@submittext{Submitted to Faculty of Information Technologies %
+of the Brno University of Technology}
 \def\@acknowname{Acknowledgements}
+\def\@prohlaseni{Declaration}
 \def\@tocheader{Contents}
 \fi
@@ -160,5 +161,10 @@
 % {\Large \sc \MakeUppercase{\@fitnameL} }\\[5mm]
 % \@date
-\\ dne \@date
+\ifczech
+\\ dne
+\else
+\\ at
+\fi
+\@date
 %\end{center}
 
@@ -211,5 +217,4 @@
 }
 
-\ifczech
 \newcommand{\abstractkeywordsCZ}{%
 \thispagestyle{empty}
@@ -234,5 +239,4 @@
 \clearpage
 }
-\fi
 
 \newcommand{\FITstart}{%
@@ -264,5 +268,4 @@
 }
 
-\ifczech
 \newcommand{\fitabstractCZ}[1]{%
 \def\@abstracttextCZ{#1}
@@ -271,5 +274,4 @@
 \def\@keywordstextCZ{#1}
 }
-\fi
 
 \newcommand{\requirements}[1]{%
