C++ coding guidelines

The Mozilla C++ portability guidelines provide an excellent source of information on portable C++ programming. Many of these guidelines are directly applicable to coding OpenOffice.org projects and are taken over verbatim into the guidelines below. I've taken out some of the Mozilla rules (don't use exceptions, RTTI and namespaces) which restrict the usage of newer language features. OpenOffice.org can't be compiled with older compilers anyway, and current compilers implement these features properly. I've also added a few rules from our OpenOffice.org programming guidelines.

The guidelines are roughly ordered by importance. Guidelines 1, 2, 4 and 5 cover problems which are almost always present in any major build of OpenOffice.org and sometimes require an excessive amount of code tweaking by the release engineers.

Don't use static constructors.

Non-portable example:

  FooBarClass static_object(87, 92);

  void
  bar()
  {
    if (static_object.count > 15) {
       ...
    }
  }
Static constructors don't work reliably. A statically initialized object is an object which is instantiated at startup time (just before main() is called). Usually there are two components to these objects. First there is the data segment, which is static data loaded into the global data segment of the program. The second part is an initializer function that is called by the loader before main() is called. We've found that many compilers do not reliably implement the initializer function. So you get the object data, but it is never initialized. One workaround for this limitation is to write a wrapper function that creates a single instance of an object, and replace all references to the statically initialized object with a call to the wrapper function:

Portable example:

  static FooBarClass* static_object;

  FooBarClass*
  getStaticObject()
  {
    if (!static_object)
      static_object =
        new FooBarClass(87, 92);
    return static_object;
  }

  void
  bar()
  {
    if (getStaticObject()->count > 15) {
      ...
    }
  }

Use the common denominator between members of a C/C++ compiler family.

For many of the compiler families we use, the implementations of the C and C++ compilers are completely different. Sometimes this means that there are things you can do in the C language that you cannot do in the C++ language on the same machine. One example is the 'long long' type. On some systems (IBM's compiler used to be one, but I think it's better now), the C compiler supports long long, while the C++ compiler does not. This can make porting a pain, as these types are often in header files shared between C and C++ files. The only thing you can do is to go with the common denominator that both compilers support. In the special case of long long, we developed a set of macros for supporting 64 bit integers when the long long type is not available. We have to use these macros if either the C or the C++ compiler does not support the special 64 bit type.
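The following is a minimal sketch of what such a macro layer could look like. It is not the actual OpenOffice.org macro set: HAVE_LONG_LONG, my_int64 and the MY_INT64_* macros are hypothetical names chosen for illustration. Because the code is meant for headers shared between C and C++, it uses only C style comments.

```cpp
/* HAVE_LONG_LONG is a hypothetical configuration symbol. Define it only
   when BOTH the C and the C++ compiler support long long. */
#ifdef HAVE_LONG_LONG

typedef long long my_int64;

#define MY_INT64_FROM_UINT32(r, v) ((r) = (my_int64)(v))
#define MY_INT64_ADD(r, a, b)      ((r) = (a) + (b))
#define MY_INT64_LOW32(a)          ((unsigned long)((a) & 0xFFFFFFFFUL))
#define MY_INT64_HIGH32(a)         ((unsigned long)(((a) >> 32) & 0xFFFFFFFFUL))

#else

/* Fallback: emulate a 64 bit value with two 32 bit halves. The halves
   are masked explicitly because unsigned long may be wider than 32 bits. */
typedef struct my_int64 {
    unsigned long lo; /* low 32 bits */
    unsigned long hi; /* high 32 bits */
} my_int64;

#define MY_INT64_FROM_UINT32(r, v) \
    ((r).lo = (v) & 0xFFFFFFFFUL, (r).hi = 0UL)

/* Add the low halves first, detect the carry, then add the high halves. */
#define MY_INT64_ADD(r, a, b) \
    do { \
        unsigned long lo_sum_ = ((a).lo + (b).lo) & 0xFFFFFFFFUL; \
        unsigned long carry_ = (lo_sum_ < (a).lo) ? 1UL : 0UL; \
        (r).hi = ((a).hi + (b).hi + carry_) & 0xFFFFFFFFUL; \
        (r).lo = lo_sum_; \
    } while (0)

#define MY_INT64_LOW32(a)  ((a).lo)
#define MY_INT64_HIGH32(a) ((a).hi)

#endif
```

Code that goes through the macros compiles the same way on both kinds of platform, at the cost of never writing a + b directly on the 64 bit type.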

Don't put C++ comments in C code.

The quickest way to raise the blood pressure of a Release engineer is to put C++ comments (// comments) into C files. Yes, this might work on your Microsoft Visual C compiler, but it's wrong, and is not supported by the vast majority of C compilers in the world. Just do not go there.

Many header files will be included by C files and included by C++ files. We think it's a good idea to apply this same rule to those headers. Don't put C++ comments in header files included in C files. You might argue that you could use C++ style comments inside #ifdef __cplusplus blocks, but we are not convinced that is always going to work (some compilers have weird interactions between comment stripping and pre-processing), and it hardly seems worth the effort. Just stick to C style /**/ comments for any header file that is ever likely to be included by a C file.

Put a new line at end-of-file.

Not having a new-line char at end-of-file breaks some compilers (Solaris).

Don't put extra top-level semi-colons in code.

Non-portable example:

  int
  A::foo()
  {
  };
This is another problem that seems to show up more on C++ than C code. This problem is really a bit of a drag. That extra little semi-colon at the end of the function is ignored by most compilers, but it makes some compilers very unhappy (IBM's AIX compiler doesn't like extra top-level semi-colons). Don't do it.

Portable example:

  int
  A::foo()
  {
  }

C++ filename extension is .cxx.

This one is another plain annoying problem. What's the name of a C++ file? file.cpp, file.cc, file.C, file.cxx, file.c++, file.C++? Most compilers couldn't care less, but some are very particular. We have not been able to find one file extension which we can use on all the platforms we have ported OpenOffice.org code to. For no great reason, we've settled on file.cxx, probably because the first C++ code in OpenOffice.org was checked in with that extension. Well, it's done. The extension we use is .cxx. This extension seems to make most compilers happy, but there are some which do not like it.

Don't use initializer lists with objects.

Non-portable example:

  FooClass myFoo = {10, 20};
Some compilers won't allow this syntax for objects (HP-UX won't); in fact, only some will allow it. So don't do it. Again, use a wrapper function; see "Don't use static constructors".

Always have a default constructor.

Always have a default constructor, even if it doesn't make sense in terms of the object structure/hierarchy. HP-UX will barf on statically initialized objects that don't have default constructors.

Be careful with inner-classes.

Some compilers (HP-UX) generally require that types (classes, enums, etc.) declared inside of another class should be referred to with their fully scoped form (e.g., Foo::kListMaxLen versus kListMaxLen).
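As a small illustration, the sketch below refers to the nested enum constant and the nested type by their fully scoped names everywhere outside the class body. Foo, Entry, firstEntry() and capacity() are hypothetical names used only for this example.

```cpp
class Foo {
public:
    enum { kListMaxLen = 16 };

    struct Entry {
        int value;
    };

    Foo();
    Entry firstEntry() const;
    int capacity() const;

private:
    Entry entries[kListMaxLen];
};

Foo::Foo()
{
    int i;
    for (i = 0; i < Foo::kListMaxLen; i++) {
        entries[i].value = i;
    }
}

// The return type is spelled Foo::Entry; a bare Entry is not in scope here.
Foo::Entry
Foo::firstEntry() const
{
    Foo::Entry result = entries[0];
    return result;
}

int
Foo::capacity() const
{
    // Fully scoped even inside a member function, for the picky compilers.
    return Foo::kListMaxLen;
}
```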

Be careful of variable declarations that require construction or initialization.

Non-portable example:

  void
  A::foo(int c)
  {
    switch(c) {
    case FOOBAR_1:
      XyzClass buf(100);
      // stuff
      break;
    }
  }
Be careful with variable placement around if blocks and switch statements. Some compilers (HP-UX) require that any variable that needs a constructor/initializer to be run be declared at the start of the method; they won't compile code in which a variable is declared inside a switch statement and needs a constructor to run.

Portable example:

  void
  A::foo(int c)
  {
    XyzClass buf(100);

    switch(c) {
    case FOOBAR_1:
      // stuff
      break;
    }
  }

Make header files compatible with C and C++.

Non-portable example:

  /*oldCheader.h*/
  int existingCfunction(char*);
  int anotherExistingCfunction(char*);

  /* oldCfile.c */
  #include "oldCheader.h"
  ...

  // new file.cxx
  extern "C" {
  #include "oldCheader.h"
  };
  ...
If you make new header files with exposed C interfaces, make the header files work correctly when they are included by both C and C++ files. If you start including an existing C header in new C++ files, fix the C header file to support C++ (as well as C); don't just extern "C" {} the old header file. Do this:

Portable example:

  /*oldCheader.h*/
  #ifdef __cplusplus
  extern "C" {
  #endif
  int existingCfunction(char*);
  int anotherExistingCfunction(char*);
  #ifdef __cplusplus
  }
  #endif

  /* oldCfile.c */
  #include "oldCheader.h"
  ...

  // new file.cxx
  #include "oldCheader.h"
  ...
There are a number of reasons for doing this, other than just good style. For one thing, you are making life easier for everyone else, doing the work in one common place (the header file) instead of all the C++ files that include it. Also, by making the C header safe for C++, you document that "hey, this file is now being included in C++". That's a good thing. You also avoid a big portability nightmare that is nasty to fix...

Some systems include C++ in system header files that are designed to be included by C or C++. Not just extern "C" {} guarding, but actual C++ code, usually in the form of inline functions that serve as "optimizations". While we question the wisdom of vendors doing this, there is nothing we can do about it. Changing system header files is not a path we wish to take. So why is this a problem? Take for example the following code fragment:

Non-portable example:

  /*system.h*/
  #ifdef __cplusplus
    /* optimization */
  inline int sqr(int x) {return(x*x);}
  #endif

  /*header.h*/
  #include <system.h>
  int existingCfunction(char*);

  // file.cxx 
  extern "C" {
  #include "header.h"
  }
What's going to happen? When the C++ compiler finds the extern "C" declaration in file.cxx, it will switch dialects to C, because it's assumed all the code inside is C code, and C's type-free name rules need to be applied. But the __cplusplus pre-processor macro is still defined (that's seen by the pre-processor, not the compiler). In the system header file the C++ code inside the #ifdef __cplusplus block will be seen by the compiler (now running in C mode). Syntax errors galore! If instead the extern "C" was done in the header file, the C functions can be correctly guarded, leaving the system header file out of the equation. This works:

Portable example:

  /*system.h*/
  #ifdef __cplusplus
    /* optimization */
  inline int sqr(int x) {return(x*x);}
  #endif

  /*header.h*/
  #include <system.h>
  #ifdef __cplusplus
  extern "C" {
  #endif
  int existingCfunction(char*);
  #ifdef __cplusplus
  }
  #endif

  // file.cxx
  #include "header.h"
One more thing before we leave the extern "C" segment of the program. Sometimes you're going to have to extern "C" system files. This is because you need to include C system header files that do not have extern "C" guarding themselves. Most vendors have updated all their headers to support C++, but there are still a few out there that won't grok C++. You might have to do this only for some platforms, not for others (using #ifdef SYSTEM_X). The safest place to extern "C" a system header file (in fact the safest place to include a system header file) is at the lowest place possible in the header file inclusion hierarchy. That is, push all this stuff down to the header files closer to the system code; don't do this stuff in the mail header files. Ideally the best place to do this is in the NSPR or XP header files - which sit directly on the system code.

Be careful of the scoping of variables declared inside for() statements.

Non-portable example:

  void
  A::foo()
  {
      for (int i = 0; i < 10; i++) {
        // do something
      }
      // i might get referenced
      //  after the loop.
      ...
  }
This is actually an issue that comes about because the C++ standard has changed over time. The original C++ specification would scope the i as part of the outer block (in this case function A::foo()). The standard changed so that now the i is scoped within the for() {} block. Most compilers use the new standard. Some compilers (for example, HP-UX) still use the old standard. Some other compilers (for example, gcc) use the new rules, but will tolerate the old. If i was referenced after the for() {} block, gcc will allow the construct, but give a warning about use of an "obsolete binding". So, while the code above is valid, it would become ambiguous if i was used later in the function. It's probably better to be on the safe side and declare the iterator variable outside of the for() loop. Then you'll know what you are getting on all platforms:

Portable example:

  void
  A::foo()
  {
    int i;
    for (i = 0; i < 10; i++) {
      // do something
    }
    // i might get referenced
    //  after the loop.
    ...
  }

Declare local initialized aggregates as static.

Non-portable example:

  void
  A::func_foo()
  {
    char* foo_int[] = {"1", "2", "C"};
    ...
  }
This seemingly innocent piece of code will generate a "loader error" using the HP-UX compiler/linker. If you really meant for the array to be static data, say so:

Portable example:

  void
  A::func_foo()
  {
    static char *foo_int[] = {"1", "2", "C"};
    ...
  }
Otherwise you can keep the array as an automatic, and initialize by hand:

Portable example:

  void
  A::func_foo()
  {
    char *foo_int[3];

    foo_int[0] = XP_STRDUP("1");
    foo_int[1] = XP_STRDUP("2");
    foo_int[2] = XP_STRDUP("C");
    // or something equally Byzantine...
    ...
  }

Expect complex inlines to be non-portable.

Non-portable example:

  class FooClass {
    ...
    int fooMethod(char* p) {
      if (p[0] == '\0')
        return -1;

      doSomething();
      return 0;
    }
    ...
  };
It's surprising, but many C++ compilers do a very bad job of handling inline member functions. Cfront-based compilers (like those on SCO and HP-UX) are prone to give up on all but the most simple inline functions, with the error message "sorry, unimplemented". Often the source of this problem is an inline with multiple return statements. The fix for this is to resolve the returns into a single point at the end of the function. But there are other constructs which will result in "not implemented". For this reason, you'll see that most of the C++ code in Mozilla does not use inline functions. We don't want to legislate inline functions away, but you should be aware that there is some danger in using them, so do so only when there is some measurable gain (not just a random hope of a performance win). Maybe you should just not go there.

Portable example:

  class FooClass {
    ...
    int fooMethod(char* p) {
      int return_value;

        if (p[0] == '\0') {
           return_value = -1;
        } else {
           doSomething();
           return_value = 0;
        }
        return return_value;
    }
    ...
  };
Or

Portable example:

  class FooClass {
    ...
    int fooMethod(char* p);
    ...
  };

  int FooClass::fooMethod(char* p)
  {
    if (p[0] == '\0')
      return -1;

    doSomething();
    return 0;
  }

Don't use return statements that have an inline function in the return expression.

For the same reason as the previous tip, don't use return statements that have an inline function in the return expression. You'll get that same "sorry, unimplemented" error. Store the return value in a temporary, then pass that back.
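A minimal sketch of the temporary-variable pattern follows. Counter and doubledCount() are hypothetical names used only for illustration.

```cpp
class Counter {
public:
    Counter() : count(0) {}

    // Simple inline accessor.
    int getCount() const { return count; }

    void increment() { count++; }

private:
    int count;
};

// Instead of "return c.getCount() * 2;", the result of the inline call is
// stored in a temporary first, and only plain variables are returned.
int
doubledCount(const Counter& c)
{
    int current = c.getCount();
    int result = current * 2;
    return result;
}
```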

Use virtual declaration on all subclass virtual member functions.

Non-portable example:

  class A {
    virtual void foobar(char*);
  };

  class B : public A {
    void foobar(char*);
  };
Another drag. In the class declarations above, A::foobar() is declared as virtual. C++ says that all implementations of void foobar(char*) in subclasses will also be virtual (once virtual, always virtual). This code is really fine, but some compilers want the virtual declaration also used on functions in subclasses that override the virtual. If you don't do it, you get warnings. While this is not a hard error, because this stuff tends to be in header files, you'll get so many warnings that you'll go nuts. Better to silence the compiler warnings by including the virtual declaration in the subclasses. It's also better documentation:

Portable example:

  class A {
    virtual void foobar(char*);
  };

  class B : public A {
    virtual void foobar(char*);
  };

Always declare a copy constructor and assignment operator.

One feature of C++ that can be problematic is the use of copy constructors. Because a class's copy constructor defines what it means to pass and return objects by value (or if you prefer, pass by value means call the copy constructor), it's important to get this right. There are times when the compiler will silently generate a call to a copy constructor that maybe you do not want. For example, when you pass an object by value as a function parameter, a temporary copy is made, which gets passed, then destroyed on return from the function. Maybe you don't want this to happen, maybe you'd always like instances of your class to be passed by reference. If you do not define a copy constructor the C++ compiler will generate one for you (the default copy constructor), and this automatically generated copy constructor might, well, suck. So you have a situation where the compiler is going to silently generate calls to a piece of code that might not be the greatest code for the job (it may be wrong).

Ok, you say, "no problem, I know when I'm calling the copy constructor, and I know I'm not doing it". But what about other people using your class? The safe bet is to do one of two things: if you want your class to support pass by value, then write a good copy constructor for your class. If you see no reason to support pass by value on your class, then you should explicitly prohibit this; don't let the compiler's default copy constructor do it for you. The way to enforce your policy is to declare the copy constructor as private, and not supply a definition. While you're at it, do the same for the assignment operator used for assignment of objects of the same class. Example:

  class foo {
    ...
    private:
    // These are not supported
    // and are not implemented!
    foo(const foo& x);
    foo& operator=(const foo& x);
  };
When you do this, you ensure that code that implicitly calls the copy constructor will not compile and link. That way nothing happens in the dark. When a user's code won't compile, they'll see that they were passing by value, when they meant to pass by reference (oops).
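For the other choice, a class that does support pass by value, here is a minimal sketch of a "good" copy constructor and assignment operator. Label and lengthByValue() are hypothetical names; the class owns a heap buffer, which is exactly the case where the compiler-generated member-wise copy would be wrong, because two objects would end up deleting the same pointer.

```cpp
#include <cstring>

class Label {
public:
    Label();
    explicit Label(const char* s);
    Label(const Label& other);
    Label& operator=(const Label& other);
    ~Label();

    const char* c_str() const;

private:
    void assign(const char* s);
    char* text;
};

Label::Label() : text(0) { assign(""); }

Label::Label(const char* s) : text(0) { assign(s); }

// Deep copy: the new object gets its own buffer.
Label::Label(const Label& other) : text(0) { assign(other.text); }

Label&
Label::operator=(const Label& other)
{
    if (this != &other) {
        assign(other.text);
    }
    return *this;
}

Label::~Label() { delete[] text; }

const char*
Label::c_str() const
{
    return text;
}

// Copies s into a fresh buffer before releasing the old one.
void
Label::assign(const char* s)
{
    char* copy = new char[std::strlen(s) + 1];
    std::strcpy(copy, s);
    delete[] text;
    text = copy;
}

// Passing by value calls the copy constructor on entry and the destructor
// on return; with the deep copy above, the caller's object stays intact.
std::size_t
lengthByValue(Label copy)
{
    return std::strlen(copy.c_str());
}
```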

Be careful of overloaded methods with like signatures.

It's best to avoid overloading methods when the type signature of the methods differs only by one "abstract" type (e.g. PR_Int32 or int32). What you will find as you move that code to different platforms is that suddenly, on the Foo2000 compiler, your overloaded methods will have the same type signature.
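One way to sidestep the problem is to give the methods distinct names, as in the sketch below. my_int32 stands in for an "abstract" platform type, and Setting with its methods is likewise hypothetical.

```cpp
// my_int32 is a hypothetical "abstract" type. Which built-in type it maps
// to differs between platforms; int is just the choice made for this sketch.
typedef int my_int32;

class Setting {
public:
    Setting() : value(0) {}

    // Risky: on a platform where my_int32 is a typedef for long, these two
    // declarations would have the same signature and would not compile.
    //   void set(my_int32 v);
    //   void set(long v);

    // Safer: distinct names, so nothing depends on what my_int32 maps to.
    void setFromInt32(my_int32 v) { value = static_cast<long>(v); }
    void setFromLong(long v) { value = v; }

    long get() const { return value; }

private:
    long value;
};
```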

Type scalar constants to avoid unexpected ambiguities.

Non-portable code:

  class FooClass {
    // having such similar signatures
    // is a bad idea in the first place.
    void doit(long);
    void doit(short);
  };

  void
  B::foo(FooClass* xyz)
  {
    xyz->doit(45);
  }
Be sure to type your scalar constants, e.g., PR_INT32(10) or 10L. Otherwise, you can produce ambiguous function calls which potentially could resolve to multiple methods, particularly if you haven't followed the previous guideline on overloaded methods with like signatures. Not all of the compilers will flag ambiguous method calls.

Portable code:

  class FooClass {
    // having such similar signatures
    // is a bad idea in the first place.
    void doit(long);
    void doit(short);
  };

  void
  B::foo(FooClass* xyz)
  {
    xyz->doit(45L);
  }

Use sal_Bool for boolean variables.

Some platforms (e.g. Linux) have native definitions of types like bool which sometimes conflict with definitions in OpenOffice.org code. Always use sal_Bool.
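A minimal sketch of a function written against sal_Bool follows. The three definitions at the top are stand-ins so that the sketch compiles on its own; in the real code base they are expected to come from the SAL headers (sal/types.h) and must not be redefined. isInRange() is a hypothetical helper.

```cpp
// Stand-in definitions, assumed to match what the SAL headers provide.
typedef unsigned char sal_Bool;
#define sal_False ((sal_Bool)0)
#define sal_True  ((sal_Bool)1)

// Reports whether value lies in the closed interval [low, high].
sal_Bool
isInRange(int value, int low, int high)
{
    sal_Bool result = sal_False;

    if (value >= low && value <= high) {
        result = sal_True;
    }
    return result;
}
```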

Stuff that is good to do for C or C++.

Do not wrap include statements with an #ifdef.

Do not wrap include statements with an #ifdef. The reason is that when the symbol is not defined, other compiler symbols will not be defined and it will be hard to test the code on all platforms. An example of what not to do:

Bad code example:

  // don't do this
  #ifdef X
  #include "foo.h"
  #endif
The exception to this rule is when you are including different system files for different machines. In that case, you may need to have an #ifdef SYSTEM_X include.
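A sketch of that exception, assuming the compiler-defined _WIN32 symbol is used to tell the platforms apart. currentProcessId() is a hypothetical wrapper; the system-specific includes and calls are kept together in one place so that the rest of the code never needs a guarded include.

```cpp
// System-specific includes: the one case where a guarded #include is fine.
#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#endif

// Returns the id of the current process as a plain long on every platform.
long
currentProcessId()
{
#if defined(_WIN32)
    return static_cast<long>(GetCurrentProcessId());
#else
    return static_cast<long>(getpid());
#endif
}
```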

Macs complain about assignments in boolean expressions.

Here is an example of code that will generate warnings on a Mac:

Non-portable example:

  if ((a = b) == c) ...
Macs don't like assignments in if statements, even if you properly wrap them in parentheses.

More portable example:

  a=b;
  if (a == c) ...

Every source file must have a unique name.

Non-portable file tree:

  feature_x
      private.h
      x.cxx
  feature_y
      private.h
      y.cxx
For Mac compilers, every file has to have a unique name. Don't assume that just because your file is only used locally that it's OK to use the same name as a header file elsewhere. It's not OK. Every filename must be different.

Portable file tree:

  feature_x
      xprivate.h
      x.cxx
  feature_y
      yprivate.h
      y.cxx

Use #if 0 rather than comments to temporarily kill blocks of code.

Non-portable example:

  int
  foo()
  {
    ...
    a = b + c;
    /*
     * Not doing this right now.
    a += 87;
    if (a > b) (* have to check for the
                  candy factor *)
      c++;
     */
    ...
  }
This is a bad idea, because you always end up wanting to kill code blocks that include comments already. No, you can't rely on comments nesting properly. That's far from portable. You have to do something crazy like changing /**/ pairs to (**) pairs. You'll forget. And don't try using #ifdef NOTUSED; the day after you do that, someone will quietly start defining NOTUSED somewhere. It's much better to block the code out with a #if 0, #endif pair, and a good comment at the top. Of course, this kind of thing should always be a temporary thing, unless the blocked out code fulfills some amazing documentation purpose.

Portable example:

  int
  foo()
  {
    ...
    a = b + c;
  #if 0
    /* Not doing this right now. */
    a += 87;
    if (a > b) /* have to check for the
                  candy factor */
      c++;
  #endif
    ...
  }

Source code formatting.

After some discussion we agreed that basic indentation should be four spaces, no tabs. Please use an editor setting which expands a typed tab into spaces. Some examples:


vim: set expandtab
emacs: (setq-default indent-tabs-mode nil)
msdev: check Options->Tabs->Insert Spaces
