GCC 4.0.0 Released

Posted by CowboyNeal on Thu Apr 21, 2005 09:08 PM
from the funrolled-loops dept.
busfahrer writes "Version 4.0.0 of the GNU Compiler Collection has been released. You can read the changelog or you can download the source tarball. The new version finally features SSA for trees, allowing for a completely new optimization framework." The changelog is pretty lengthy, and there are updates for every language supported from Ada to Java in addition to the usual flavors of C.
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Moving fast (Score:4, Interesting)

    by slapout (93640) on Thursday April 21 2005, @09:10PM (#12309554)
    Is it just me or did the jump from version 3 to 4 happen a lot faster than the one from 2 to 3?
    • by ari_j (90255) on Thursday April 21 2005, @09:12PM (#12309564)
      There was a version 3?
    • Re:Moving fast (Score:5, Interesting)

      by JohnsonWax (195390) on Thursday April 21 2005, @09:18PM (#12309606)
      Apple wasn't working on GCC until version 3. I suspect a lot of other companies weren't either.
    • by Anonymous Coward on Thursday April 21 2005, @10:12PM (#12309940)
      due to the fact that all its c++ shared libraries will now be 40% smaller due to the symbol visibility improvements (i.e., no runtime adjustment needed by the linker for internal-only functions). This translates into a significant speed improvement for all KDE code.
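For readers unfamiliar with the feature: GCC 4.0 lets a library mark symbols as hidden, either per declaration with the `visibility` attribute or for a whole build with `-fvisibility=hidden`. The sketch below is only an illustration. The macro and function names are made up, and how much a given library shrinks depends entirely on its code.

```cpp
#include <cassert>

// Hypothetical macros: on GCC 4.0+ (and Clang) they expand to the
// visibility attribute, elsewhere they expand to nothing.
#if defined(__GNUC__)
#  define API_EXPORT __attribute__((visibility("default")))
#  define API_LOCAL  __attribute__((visibility("hidden")))
#else
#  define API_EXPORT
#  define API_LOCAL
#endif

// Internal helper (hypothetical): hidden, so it is left out of the shared
// library's dynamic symbol table and calls to it can be bound at link time.
API_LOCAL int clamp_to_byte(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

// Public entry point (hypothetical): stays visible to users of the library.
API_EXPORT int brighten(int pixel, int amount) {
    return clamp_to_byte(pixel + amount);
}

// Small usage check.
void visibility_demo() {
    assert(brighten(100, 5) == 105);
    assert(brighten(250, 10) == 255);
    assert(brighten(10, -20) == 0);
}
```

When this is built as a shared library, only `brighten` should show up in the exported symbols; the behaviour of the code itself does not change.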
        • by Anonymous Coward on Friday April 22 2005, @01:25AM (#12310833)
          In case you hadn't noticed, the "slow" part of running KDE is the startup time. Once you actually get KDE loaded, the runtime speed is fine.

          -a GNOME/KDE agnostic fluxbox user
      • Re:Moving fast (Score:5, Interesting)

        by burns210 (572621) <maburns@gmail.com> on Thursday April 21 2005, @09:32PM (#12309692) Homepage Journal
        Apple is using it in their Tiger (OS X 10.4) release come the 29th of this month. So there are a few million new GCC 4.0 users right there.
            • by 1lus10n (586635) on Thursday April 21 2005, @11:50PM (#12310471) Journal
              You have no concept of numbers. Both Linux and Mac are minor on the desktop, but close to 50% of the backend of the internet is handled by Unix or Unix-like systems (not including Apple). The vast majority of those use gcc or some derivative.

              Unix and its children and cousins on the back and front end probably double the total number of Apple boxes out there, if not more. Hell, some numbers suggest that there actually are more Linux desktops than Mac desktops. Even if it's close between Apple and Linux on the desktop (which is likely), the number of *nix systems in use in general at least matches the number on either side (though they are not desktops).
  • Lisp? (Score:4, Funny)

    by ari_j (90255) on Thursday April 21 2005, @09:10PM (#12309555)
    Yeah, but does it have a Common Lisp compiler yet?
    • Re:Lisp? (Score:5, Informative)

      by sketerpot (454020) * <sketerpot&gmail,com> on Thursday April 21 2005, @09:37PM (#12309718)
      Try SBCL, CMUCL, GCL, or CLISP. They're all good Lisp implementations. SBCL and CMUCL compile to native code directly and are probably the fastest free CL implementations, GCL compiles via C (and therefore GCC), and CLISP has a bytecode interpreter.
    • Re:Lisp? (Score:5, Funny)

      by jm92956n (758515) on Thursday April 21 2005, @09:49PM (#12309814) Journal
      every language supported from Ada to Java

      From A to J. Lisp starts with an L.

      Therefore, according to the summary... no.

    • Re:Lisp? (Score:5, Informative)

      by Theatetus (521747) on Friday April 22 2005, @01:58AM (#12310924) Journal

      Yes, gcl (formerly known as kyoto common lisp). But it doesn't need the assembler/linker part of the toolchain so it's packaged separately. But I think it is "Part of the GNU Compiler Collection", for what that's worth, and it does depend on GCC.

  • by ribo-bailey (724061) on Thursday April 21 2005, @09:11PM (#12309558) Homepage
    of the 2.95 -> 3.0 transition.
    • Why? (Score:5, Funny)

      by Mr. Underbridge (666784) on Thursday April 21 2005, @09:28PM (#12309658)
      of the 2.95 -> 3.0 transition.

      Did you not get pleasure out of things being errors in 3.0 that weren't even warnings in 2.95?

      I'm sure all the contractors loved it! ;)

      GCC motto: "What code can we break today?"

      • Misplaced blame (Score:5, Insightful)

        by tepples (727027) <<moc.thgienip> <ta> <6002hsals>> on Thursday April 21 2005, @09:35PM (#12309708) Homepage Journal

        Did you not get pleasure out of things being errors in 3.0 that weren't even warnings in 2.95?

        At least the maintainers of the ISO C++ standard did.

        GCC motto: "What code can we break today?"

        Blame the standards committee, not the GCC maintainers.

        • by Screaming Lunatic (526975) on Thursday April 21 2005, @10:38PM (#12310108) Homepage
          Blame the standards committee, not the GCC maintainers.

          Insightful? Jesus eff-ing Christ. Now the slashbots don't like standards. I bet you wouldn't be presenting the same argument if this discussion was about the transition from MSVC 6.0 to 7.0/7.1.

        • by Mancat (831487) on Thursday April 21 2005, @10:40PM (#12310121) Homepage
          Mechanic: Sir, your car is ready.

          Customer: Thanks for fixing it so quickly!

          Mechanic: We didn't fix it. We just brought it up to standards. Oh, by the way, your air conditioning no longer works, and your rear brakes are now disabled.

          Customer: Uhh.. What?

          Mechanic: That's right. The standard refrigerant is now R-134A, so we removed your old R-14 air conditioning system. Also, disc brakes are now standard in the automotive world, so we removed your drum brakes. Don't drive too fast.

          Customer: What the fuck?

          Mechanic: Oh, I almost forgot. Your car doesn't have airbags. We're going to have to remove your car's body and replace it with a giant tube frame lined with air mattresses.
      • Re:Why? (Score:5, Insightful)

        by Sivar (316343) <{moc.liamg]} {ta} {[snrubnselrahc}> on Friday April 22 2005, @12:04AM (#12310527)
        I know you were just poking fun but--

        Standards are the reason that computers are tolerable to use for any purpose.
        If a programmer can't be bothered to follow an international standard of his own language, there is no guarantee that the code is future-proof. One can hardly blame the compiler vendor, as we can't expect a compiler to mindlessly maintain backwards compatibility with every weird use of a bug and every bizarre code construct that has ever been supported in the past.

        The ability to compile code written for GCC in another compiler is a *good* thing. If it requires informing the programmer that their code has always been broken, then so be it. A little inconvenience is a small price to pay for standards compliance, or should we expect that the GCC authors "embrace and extend" C and other languages until so much code relies on weird GCC nuggets that programmers (and users) are "locked in" to using just that compiler? (But Douglas Adams forbid if Microsoft does the same thing!)

        Maybe I am missing something. If so, please enlighten me (This is not a sarcastic remark--I haven't done much research on what 4.0 has broken so I may be way out of line).

        Sheesh, for as hard as the GCC authors work, and for as much as GCC has improved in the last 10 years, the contributors sure get a lot of flak. Anyone who doesn't contribute code themselves should be grateful for (or at least appreciative of) their efforts, even when they do make mistakes.
  • by Da w00t (1789) on Thursday April 21 2005, @09:11PM (#12309559) Homepage
    Not a C coder myself, (sticking mainly to perl).. I've just got to ask, what are SSA trees, and what benefit do they serve?
    • by Entrope (68843) on Thursday April 21 2005, @09:17PM (#12309595) Homepage
      Static single assignment (SSA) is a way the compiler can rewrite the code (usually for optimization purposes) so each "variable" being analyzed is only written once. This makes a lot of optimizations easier to do, since it eliminates aliasing due to the programmer assigning different values to the same variable. You'd probably learn these things if you would RTFA.
    • by GillBates0 (664202) on Thursday April 21 2005, @09:22PM (#12309626) Homepage Journal
      Wikipedia (as usual) has a nice article [wikipedia.org] about the Static Single Assignment (SSA) form.

      To put it simply, SSA is an intermediate representation where each variable in a block is defined only *once*. If a variable is defined multiple times, the target of any subsequent definitions of the same variable is replaced by a new variable name.

      SSA helps to simplify later optimization passes of a compiler (for example: eliminating unused definitions, etc) as described in greater detail (with examples and flowcharts) in the article linked to.
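To make the renaming concrete, here is a rough sketch of SSA conversion for a single straight-line block. It is a toy, not how GCC does it: the `Stmt` type and the `to_ssa` and `show` helpers are invented for illustration, and branches (which need phi functions) are not handled at all.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One three-address statement: dst = lhs op rhs (hypothetical representation).
struct Stmt {
    std::string dst, lhs, op, rhs;
};

// Rename a straight-line block into SSA form. Every assignment creates a
// fresh version of its destination (x1, x2, ...) and every use refers to
// the latest version. Names never assigned in the block (inputs, literals)
// are left untouched.
std::vector<Stmt> to_ssa(const std::vector<Stmt>& block) {
    std::map<std::string, int> version;  // latest version number per variable
    auto use = [&](const std::string& name) {
        auto it = version.find(name);
        return it == version.end() ? name : name + std::to_string(it->second);
    };
    std::vector<Stmt> out;
    for (const Stmt& s : block) {
        Stmt r;
        r.op = s.op;
        r.lhs = use(s.lhs);  // operands are renamed before the new definition
        r.rhs = use(s.rhs);
        r.dst = s.dst + std::to_string(++version[s.dst]);
        out.push_back(r);
    }
    return out;
}

std::string show(const Stmt& s) {
    return s.dst + " = " + s.lhs + " " + s.op + " " + s.rhs;
}

// Usage check: x is assigned twice, so it becomes x1 and x2.
void ssa_demo() {
    std::vector<Stmt> block = {
        {"x", "a", "+", "b"},
        {"x", "x", "*", "2"},
        {"y", "x", "+", "1"},
    };
    std::vector<Stmt> ssa = to_ssa(block);
    assert(show(ssa[0]) == "x1 = a + b");
    assert(show(ssa[1]) == "x2 = x1 * 2");
    assert(show(ssa[2]) == "y1 = x2 + 1");
}
```

Once each name has exactly one definition, a pass that wants to drop unused definitions only has to ask whether a name is ever read.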

      That's the SSA form in short. Now I need to ask somebody the difference between the standard SSA form and "SSA for trees".

    • by Dink Paisy (823325) on Thursday April 21 2005, @10:00PM (#12309885) Homepage
      As other people have said, SSA is static single assignment. It means that each variable in the program is assigned in only one place. SSA is for optimization, and is usually done in intermediate forms generated by the compiler, rather than in programs written by a human in common computer languages such as C, C++, Perl or assembly languages.

      Trying to recall my knowledge of optimizing compilers:

      SSA makes optimization easier, since it is obvious where a variable was assigned (since it was assigned in only one location) and what value it contains (since there is only one value being assigned to it). The complexity moves to register allocation, where there can be many more variables to allocate because of SSA. Register allocation is Hard, but doing an ok job is quite possible. Most optimizations are impossible unless you can prove various properties about the variables involved, which is often much easier with variables in SSA form.

    • by mindriot (96208) on Thursday April 21 2005, @10:02PM (#12309895)

      Hmm. Funny. Seems like perfect timing, in retrospect. I just held a presentation on SSA (and efficiently transforming code into SSA) today.

      Get the slides here [udel.edu].

      HTH

    • by IvyMike (178408) on Thursday April 21 2005, @10:20PM (#12309986)
      There have been several good answers to your question, but if you're really new to compilers, you might want a little more context. Want a quick lesson in how compilers work? If you're willing to accept some gross oversimplifications, here's how most modern compilers work:

      1) Tokenize the input. For example, if you were compiling perl, you might choose to turn "print $foo" into three tokens; KEYWORD_PRINT, TYPE_SCALAR, and IDENTIFIER('foo'). The output is typically a stream of tokens. This step might be done by lex or flex.

      2) Parse the sequence of tokens using a set of rules called a grammar. For example, "TYPE_SCALAR" followed by "IDENTIFIER()" might match a rule to generate a variable called "$foo", and "KEYWORD_PRINT" followed by a variable means call the function print on the contents of the variable. The output is typically an abstract syntax tree (AST); a high-level data structure representing the program. This step might be done by yacc or bison.

      3) Match the AST against a series of rules to output the final code. This might actually be two steps; you might generate something into a low-level register transfer language (RTL) that looks very much like assembly, and then turn THAT into actual machine instructions.
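Step 1 above can be sketched by hand in a few lines. This is a deliberately tiny, hypothetical tokenizer that only knows the three token kinds from the `print $foo` example; a real front end would be generated by lex or flex and would handle far more.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Token kinds taken from the example in the text.
enum class TokenKind { KeywordPrint, TypeScalar, Identifier };

struct Token {
    TokenKind kind;
    std::string text;  // only meaningful for Identifier
};

// Turn source text into a flat stream of tokens.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (c == '$') {  // scalar sigil
            tokens.push_back({TokenKind::TypeScalar, ""});
            ++i;
            continue;
        }
        if (std::isalpha(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            std::string word = src.substr(start, i - start);
            // "print" is a keyword unless it directly names a variable ($print).
            bool after_sigil = !tokens.empty() &&
                               tokens.back().kind == TokenKind::TypeScalar;
            if (word == "print" && !after_sigil) {
                tokens.push_back({TokenKind::KeywordPrint, ""});
            } else {
                tokens.push_back({TokenKind::Identifier, word});
            }
            continue;
        }
        ++i;  // skip anything this sketch does not understand
    }
    return tokens;
}

// Usage check for the example from the text.
void tokenizer_demo() {
    std::vector<Token> t = tokenize("print $foo");
    assert(t.size() == 3);
    assert(t[0].kind == TokenKind::KeywordPrint);
    assert(t[1].kind == TokenKind::TypeScalar);
    assert(t[2].kind == TokenKind::Identifier && t[2].text == "foo");
}
```

The parser in step 2 would then consume this token stream rather than raw characters.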

      At each stage, you might choose to optimize the output. You might also insert optimization passes between steps. (For example, you might insert a pass between 2 and 3 to optimize the AST into a simpler AST.)

      Before SSA, GCC sort of skipped making any high-level AST; it used to go from parsing almost immediately into RTL. You can still optimize RTL, but since it's pretty low-level, it misses out on higher-level context, which makes some optimizations really difficult.

      SSA is simply a form used for the high-level AST. Why SSA? It is a very nice form to optimize. Read the Wikipedia article for more details on why SSA is particularly useful for some optimizations.

      Page 181 of this PDF file [linux.org.uk] from the 2003 GCC Summit explains the flow of the GCC compiler.
  • Sweetness (Score:5, Informative)

    by kronak (723456) on Thursday April 21 2005, @09:12PM (#12309561)
    Glad to see they are targeting the AMD64 architecture for improvements.
  • debian (Score:5, Funny)

    by Anonymous Coward on Thursday April 21 2005, @09:12PM (#12309562)
    i wonder when debian sid will integrate GCC 4.0...
  • Autovectorization (Score:5, Interesting)

    by QuantumG (50515) <qg@biodome.org> on Thursday April 21 2005, @09:29PM (#12309662) Homepage Journal
    Correct me if I'm wrong here, but most Linux distributions are still i386, right? It's only the people who use Gentoo who actually compile everything with i686 options, right? So, if autovectorization and all the other improvements in GCC 4.0 make binaries massively faster on modern platforms, how long will it be before the major binary-based distributions (like Ubuntu) start making i686 the default and i386 an available alternative (like AMD64 is now)?
    • by vlad_petric (94134) on Thursday April 21 2005, @10:04PM (#12309910) Homepage
      The main problem is the C language. While vectorizing a loop is generally not that difficult, figuring out if it's the right thing to do is extremely tough. To do that, you have to "prove" that iterations of a loop are independent of each other. This, in turn, requires good pointer alias analysis, and gcc isn't doing it well enough yet. BTW ... a language like Fortran, that doesn't have pointers at all, is much easier to vectorize; that's one of the reasons a lot of scientific codes are still in Fortran.

      Without automatic vectorization, the performance benefit of compiling for 686 as opposed to 386 is simply minimal. A lot of people have done benchmarks on this, and found out that tuning for 686 with gcc only provides 1-2% improvements in the best case. Keep in mind that current X86 processors execute instructions out-of-order, so instruction scheduling for a specific pipeline is not going to do much (it's very important for in-order machines, though)
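The aliasing problem described above is easy to show. In this sketch (function names are hypothetical) the first loop must behave correctly even when `dst` overlaps its inputs, so the compiler cannot blindly process several elements at once. The second loop uses `__restrict__`, a GCC/Clang extension in C++ (MSVC spells it `__restrict`, C99 spells it `restrict`), to promise that the arrays do not overlap. Whether a particular GCC version actually vectorizes either loop is not something this example claims.

```cpp
#include <cassert>

#if defined(_MSC_VER)
#  define RESTRICT __restrict
#else
#  define RESTRICT __restrict__
#endif

// dst may overlap a or b, so iteration i+1 may depend on iteration i.
void add_may_alias(int* dst, const int* a, const int* b, int n) {
    for (int i = 0; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}

// The caller promises the three arrays do not overlap, so the iterations
// are independent and can legally be done several at a time.
void add_no_alias(int* RESTRICT dst, const int* RESTRICT a,
                  const int* RESTRICT b, int n) {
    for (int i = 0; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}

void alias_demo() {
    // Overlapping call: dst is v + 1, so each iteration reads the value the
    // previous iteration just wrote. Doing 4 elements at once would be wrong.
    int v[4] = {1, 0, 0, 0};
    add_may_alias(v + 1, v, v, 3);
    assert(v[1] == 2 && v[2] == 4 && v[3] == 8);

    // Disjoint arrays: the restrict-qualified version is safe to call.
    int a[4] = {1, 2, 3, 4};
    int b[4] = {10, 20, 30, 40};
    int out[4] = {0, 0, 0, 0};
    add_no_alias(out, a, b, 4);
    assert(out[0] == 11 && out[1] == 22 && out[2] == 33 && out[3] == 44);
}
```

Calling `add_no_alias` with overlapping arrays would be undefined behaviour, which is exactly the guarantee the optimizer relies on.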

      • Re:Autovectorization (Score:5, Interesting)

        by QuantumG (50515) <qg@biodome.org> on Thursday April 21 2005, @09:41PM (#12309753) Homepage Journal
        I used to work for Codeplay, a company that made compilers for games development, and we were pretty surprised at the kinds of speedups you would get on non-gaming applications. Obviously compiling open source software was a great way to test our compiler. Basically any loop which performs the same operation on multiple data can be unrolled 4 times and vectorized. That's a massive speedup. So yes, I would expect OpenOffice to be faster.
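As a rough picture of what "unrolled 4 times" means, here is the transformation written out by hand. The function names are made up, and this shows only the unrolling step; a vectorizing compiler would go on to replace the four scalar multiplies with one SIMD instruction.

```cpp
#include <cassert>

// Straightforward loop: one element per iteration.
void scale_simple(float* data, int n, float k) {
    for (int i = 0; i < n; ++i) {
        data[i] *= k;
    }
}

// Same work, unrolled by 4. The body is four independent operations on
// adjacent elements, which is the shape a 4-wide vector unit wants.
void scale_unrolled4(float* data, int n, float k) {
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        data[i]     *= k;
        data[i + 1] *= k;
        data[i + 2] *= k;
        data[i + 3] *= k;
    }
    for (; i < n; ++i) {  // leftover elements when n is not a multiple of 4
        data[i] *= k;
    }
}

void unroll_demo() {
    float a[10], b[10];
    for (int i = 0; i < 10; ++i) {
        a[i] = b[i] = static_cast<float>(i + 1);
    }
    scale_simple(a, 10, 2.0f);
    scale_unrolled4(b, 10, 2.0f);
    for (int i = 0; i < 10; ++i) {
        assert(a[i] == b[i]);
        assert(b[i] == 2.0f * static_cast<float>(i + 1));
    }
}
```

The remainder loop is the part people forget; without it, the last `n % 4` elements are silently skipped.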
              • Re:Autovectorization (Score:5, Informative)

                by thalakan (14668) <jspence&lightconsulting,com> on Friday April 22 2005, @12:53AM (#12310732) Homepage
                Wrong. The SSE instruction set includes several instructions for doing vector integer ops, such as average and multiplication. These things are a huge speed win even in "average" applications, as the game compiler developer noted above. If you don't believe me, fire up a profiler and look at how much time an office app or web browser spends doing rectangle intersection calculations and TrueType font math.

                Also, there aren't nearly enough people using MOVNTDQ to avoid polluting the instruction pipeline and dumping useless garbage into the system cache. If you're copying stuff into main memory and you aren't going to use it for a while, use MOVNTDQ to get a big speed win. If you do need it cached, use MOVDQA to get both caching and 128 bit transfers in one instruction! We all paid for these fancy schmancy new instructions in our processors, and it's extremely annoying to see programmers not use them.
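For anyone who would rather not write assembly, these instructions are reachable from C or C++ through the SSE2 intrinsics in `<emmintrin.h>`: `_mm_load_si128` is an aligned 128-bit load (normally emitted as MOVDQA) and `_mm_stream_si128` is a non-temporal store (normally MOVNTDQ). The sketch below is an assumption-laden illustration: `copy_streaming` is a made-up name, both buffers must be 16-byte aligned, the size must be a multiple of 16, and it falls back to `memcpy` where SSE2 is not available. Whether it is faster than `memcpy` on a given machine has to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64)
#  include <emmintrin.h>  // SSE2 intrinsics
#  define HAVE_SSE2 1
#endif

// Copy `bytes` bytes from src to dst without pulling dst into the cache.
// Preconditions: both pointers 16-byte aligned, bytes a multiple of 16.
void copy_streaming(void* dst, const void* src, std::size_t bytes) {
    assert(bytes % 16 == 0);
#ifdef HAVE_SSE2
    __m128i* d = static_cast<__m128i*>(dst);
    const __m128i* s = static_cast<const __m128i*>(src);
    for (std::size_t i = 0; i < bytes / 16; ++i) {
        __m128i v = _mm_load_si128(s + i);  // aligned 128-bit load
        _mm_stream_si128(d + i, v);         // non-temporal 128-bit store
    }
    _mm_sfence();  // order the streaming stores before later stores
#else
    std::memcpy(dst, src, bytes);  // portable fallback
#endif
}

void streaming_demo() {
    alignas(16) unsigned char src[64];
    alignas(16) unsigned char dst[64];
    for (int i = 0; i < 64; ++i) {
        src[i] = static_cast<unsigned char>(i);
        dst[i] = 0;
    }
    copy_streaming(dst, src, sizeof src);
    assert(std::memcmp(dst, src, sizeof src) == 0);
}
```

Streaming stores only pay off when the destination really will not be read again soon; for data that is about to be reused, the ordinary cached store is the better choice.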

  • by jtshaw (398319) * on Thursday April 21 2005, @09:29PM (#12309663) Homepage
    When they announced the release of Apple 10.4 "Tiger" I noticed this page: At that point I kinda figured gcc 4.0.0 had to be out by April 29th since Apple claimed they were using it for OS X.
    • by k98sven (324383) on Thursday April 21 2005, @09:47PM (#12309794) Journal
      When they announced the release of Apple 10.4 "Tiger" I noticed this page: At that point I kinda figured gcc 4.0.0 had to be out by April 29th since Apple claimed they were using it for OS X.

      Well, you're wrong because GCC doesn't follow Apple's schedule, or anyone else's for that matter. Even a cursory glance at the GCC mailing list will tell you that.

      The reason Apple can promise this is that they're not actually shipping GCC 4. They're shipping their own fork of the GCC 4 code. It's probably about 99% the same code, but don't make the mistake of thinking they're shipping exactly what the FSF is distributing.
        • by dvdeug (5033) <dvdeug@e[ ]l.ro ['mai' in gap]> on Friday April 22 2005, @01:24AM (#12310824)
          We're not shipping "a fork" of GCC 4. We're shipping GCC 4.0.0, which we compiled from source for Darwin 8.

          According to http://gcc.gnu.org/install/specific.html#powerpc-x-darwin [gnu.org],
          The version of GCC shipped by Apple typically includes a number of extensions not available in a standard GCC release. These extensions are generally for backwards compatibility and best avoided.

          i.e. you're using a forked version of GCC, and definitely not 4.0.0 out of the box.

          the whole notion of "a fork" runs 100% counter to all that open-source stuff

          No, actually, the ability to fork, and the wisdom to know when to fork, are very important to "that open-source stuff".
  • *chuckle* (Score:5, Funny)

    by fr2asbury (462941) on Thursday April 21 2005, @09:29PM (#12309664)
    I can see my Gentoo box sweating now all nervous for the night I get a little drunk and decide to see how this gcc 4 thing works out. heh heh heh.
  • Readme.SCO (Score:5, Interesting)

    by karvind (833059) <karvind@gm a i l . com> on Thursday April 21 2005, @09:48PM (#12309802) Journal
    The gcc tarball has a README.SCO file (reproduced below)

    The GCC team has been urged to drop support for SCO Unix from GCC, as a protest against SCO's irresponsible aggression against free software and GNU/Linux. We have decided to take no action at this time, as we no longer believe that SCO is a serious threat.

    For more on the FSF's position regarding SCO's attacks on free software, please read:

    http://www.gnu.org/philosophy/sco/sco.html

  • by phoenix.bam! (642635) on Thursday April 21 2005, @09:54PM (#12309846)
    and no gentoo users commenting on how they've already recompiled their entire system with the new optimizations. Or maybe they're just waiting for some free resources to open a browser.
  • by mrcrowbar (821370) on Thursday April 21 2005, @10:02PM (#12309896)
    Screenshots anyone? ;)
  • Patent issues (Score:5, Informative)

    by plgs (447731) on Thursday April 21 2005, @10:25PM (#12310006) Homepage
    "Unfortunately we cannot implement Steensgaard [pointer] analysis due to patent issues."

    They mean this patent [uspto.gov] owned by this company [microsoft.com]. What a surprise.

  • by Old Wolf (56093) on Thursday April 21 2005, @10:51PM (#12310183)
    One of the changes in 4.0.0 is autovectorization [gnu.org] optimizing.
    One _ancient_ compiler (10+ years) I have to use, already has this feature -- and on a large scale: it'll do it over several screensful of code. What took GCC so long?

    Unfortunately, this compiler I mention also has a bug: once it's factored out 'i' in a piece of code like that below, it then complains that 'i' is an unused variable. So you have to do something with 'i' to suppress that warning, which kinda defeats the purpose of the autovectorization.

    Sample code:

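    int a[256], b[256], c[256];

    /* Element-wise add over fixed-size arrays: a simple loop of the kind
       an autovectorizer is meant to handle. */
    void foo () {
        int i;

        for (i=0; i<256; i++){
            a[i] = b[i] + c[i];
        }
    }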
  • TR1 included! (Score:5, Informative)

    by Anthony Liguori (820979) on Friday April 22 2005, @12:40AM (#12310684) Homepage
    I'm surprised no one's mentioned the inclusion of the C++ TR1. There's a ton of very cool new library features. Here are my two favorites:
    #include <tr1/functional>

    int foo(int x, int y) { return x * y; }

    using namespace std::tr1::placeholders;

    int main() {
        std::tr1::function<int (int, int)> f;
        std::tr1::function<int (int)> g;

        // f can be stored in a container
        f = foo;

        f(2, 3);

        // bind f's second argument to 3; _1 stands for g's own argument
        g = std::tr1::bind(f, _1, 3);

        // this is equivalent to f(2, 3)
        g(2);
    }
    Not to mention the inclusion of shared_ptr, which provides a reference-counted pointer wrapper. This will eliminate 99% of the need to do manual memory management in C++. It's all very exciting, kudos to the G++ team on this!
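A short sketch of the reference counting in action. It uses `std::shared_ptr` from `<memory>`, the form the TR1 class later took in the standard library; with GCC 4.0 the same code would spell it `std::tr1::shared_ptr` from `<tr1/memory>`. The `Node` type and `make_node` helper are invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical payload type.
struct Node {
    int value;
    explicit Node(int v) : value(v) {}
};

// Hand out a counted pointer; no matching delete is ever written by hand.
std::shared_ptr<Node> make_node(int v) {
    return std::shared_ptr<Node>(new Node(v));
}

void shared_ptr_demo() {
    std::shared_ptr<Node> a = make_node(42);
    assert(a.use_count() == 1);
    {
        // Copies share ownership and bump the count.
        std::vector<std::shared_ptr<Node> > owners;
        owners.push_back(a);
        owners.push_back(a);
        assert(a.use_count() == 3);
    }  // the vector is destroyed here, releasing its two references
    assert(a.use_count() == 1);

    std::weak_ptr<Node> watcher = a;  // observes without owning
    a.reset();                        // last owner gone, Node is deleted
    assert(watcher.expired());
}
```

One caveat to the 99% figure: objects that point at each other in a cycle keep each other alive, which is what `weak_ptr` is there to break.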
          • by Anonymous Coward on Thursday April 21 2005, @09:57PM (#12309862)
            The parent poster is refering to the deprecation of Managed Extensions for C++ syntax in favor of C++/CLI (which is undergoing ISO standardization).

            While it is true the syntax has changed (much for the better: templates are now supported in managed C++ code and so are generics, keywords replace ugly __gc, and more), support for the old syntax is still in the compiler (/clr:oldSyntax), and IntelliSense.

            However, you will be unable to mix new syntax and old syntax code in the same project without taking some penalties (IntelliSense will break, at the least). The designer will even spit out old syntax code when designing an old form or control.

            While the old syntax is definitely on its last legs, the VC++ team was very concerned about not screwing over those (early) adopters of C++ code for the CLR thus far.

            A good resource to read up more on the subject would be Herb Sutter's Blog [msdn.com], Stan Lippman's Blog [msdn.com], or any of the other VC++ team member's blogs.

            Take this from a former VC++ teammate who left during the Whidbey product cycle (posting AC since I've never bothered to get a slashdot account).
      • by Anonymous Coward on Friday April 22 2005, @01:47AM (#12310892)
        > I'm amazed that the NO CARRIER joke can be made so often, and always get modded funny.

        It doesn't always get modded fun&@^4-- NO CARRIER