
Be Careful with file URLs

Different Ways to Name Files

There are (at least) five ways to name files:

  1. The platform-specific notation, called pathnames here (e.g., /abc/def/ghi.txt on Unix, a:\bcd\efg\hij.txt on DOS and Windows, and abc:def:ghi.txt on Macintosh).

  2. A UNC-like notation, called UNC names here (e.g., //./abc/def/ghi.txt or //./a:/bcd/efg/hij.txt). The osl layer used to make heavy use of these as a platform-independent notation. Since osl has shifted to file URLs as the platform-independent notation (see below), UNC names have been deprecated and have become pretty much useless; they are mentioned here only for completeness.

  3. The file URLs used by the osl layer as a platform-independent notation, called osl URLs here (e.g., file:///abc/def/ghi.txt or file:///a:/bcd/efg/hij.txt). Read on to learn why it is important to explicitly label these file URLs as osl URLs.

  4. The file URLs used by the File Content Provider (FCP) within the Universal Content Broker (UCB), called FCP URLs (e.g., file:///home/usr123/work/abc.txt or file:///user/work/abc.txt). Normally, osl URLs and FCP URLs are the same (after all, the FCP uses osl to access the files). But the FCP has a feature called mount points that allows it to restrict access to only certain files (those that lie below a given set of mount points in the file system hierarchy), and to give names to these files that hide their real locations.

    For example, if you have a mount point named user at the osl URL file:///home/usr123, the osl URL file:///home/usr123/work/abc.txt corresponds to the FCP URL file:///user/work/abc.txt. If you only have that single mount point, the osl URL file:///home/usr567/work/def.txt has no corresponding FCP URL (and cannot be accessed via the FCP).

  5. The URLs used by the UCB, called UCB URLs (e.g., file:///a:/bcd/efg/hij.txt). Normally, FCP URLs and UCB URLs are the same, because the UCB hands file URLs directly to the FCP. But there is a special content provider, the Remote Access Content Provider (RAP), that allows URLs to be rewritten before they are passed on to other content providers. This is used, for example, in the Sun ONE Webtop (S1W), where there are typically two file systems: a client file system accessed via normal (FCP) file URLs (i.e., there is no rewriting RAP between the UCB and the client FCP), and a server file system accessed via (FCP) URLs in which the file scheme has been replaced with a different scheme (i.e., there is a rewriting RAP between the UCB and the server FCP).
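
The mount point example in item 4 can be sketched as a small prefix mapping. This is a toy model only, not the FCP's implementation: the type MountPoints and the functions liesBelow, oslToFcp, and fcpToOsl are hypothetical names introduced for illustration, and the sketch ignores percent-encoding and case sensitivity.

```cpp
#include <map>
#include <string>

// Toy model (an assumption of this sketch): mount point name -> osl URL
// of the directory that the mount point exposes.
using MountPoints = std::map<std::string, std::string>;

// True if url equals base or names something below base in the hierarchy.
static bool liesBelow(const std::string& url, const std::string& base) {
    if (url.compare(0, base.size(), base) != 0) return false;
    return url.size() == base.size() || url[base.size()] == '/';
}

// Maps an osl URL to the corresponding FCP URL.
// Returns false if the file lies below none of the mount points.
bool oslToFcp(const MountPoints& mounts, const std::string& oslUrl,
              std::string& fcpUrl) {
    for (const auto& entry : mounts) {
        const std::string& name = entry.first;
        const std::string& base = entry.second;
        if (liesBelow(oslUrl, base)) {
            fcpUrl = "file:///" + name + oslUrl.substr(base.size());
            return true;
        }
    }
    return false;
}

// Maps an FCP URL back to the osl URL whose real location it hides.
// Returns false if the first path segment is not a known mount point name.
bool fcpToOsl(const MountPoints& mounts, const std::string& fcpUrl,
              std::string& oslUrl) {
    const std::string prefix = "file:///";
    if (fcpUrl.compare(0, prefix.size(), prefix) != 0) return false;
    const std::string::size_type end = fcpUrl.find('/', prefix.size());
    const std::string name =
        end == std::string::npos ? fcpUrl.substr(prefix.size())
                                 : fcpUrl.substr(prefix.size(), end - prefix.size());
    const std::string rest =
        end == std::string::npos ? std::string() : fcpUrl.substr(end);
    const MountPoints::const_iterator it = mounts.find(name);
    if (it == mounts.end()) return false;
    oslUrl = it->second + rest;
    return true;
}
```

With a single mount point named user at file:///home/usr123, oslToFcp maps file:///home/usr123/work/abc.txt to file:///user/work/abc.txt and reports failure for file:///home/usr567/work/def.txt, matching the example in item 4.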

The last two notations (FCP URLs and UCB URLs) are relatively unknown, because in a plain OpenOffice installation neither mount points nor the RAP are used, so osl URLs, FCP URLs, and UCB URLs are all identical. But if you want to write correct code that also works in unusual deployments (or in the S1W, which should not be regarded as too unusual), you have to be well aware of these different notations, all of which are labeled "URLs."

Where Different Notations are Used

As mentioned before, use of UNC names is deprecated. Also, since most code accesses the FCP not directly but via the UCB, FCP URLs are only of interest to hard-core UCB users (who should know what they are doing, anyway). So, in the following we can concentrate on three notations: pathnames, osl URLs, and UCB URLs.

Where Pathnames are Used

Pathnames are used in only a few places, because the default notation used by osl (the lowest level of concern to us) is already osl URLs, which are one level above pathnames. It can be argued that interfaces that use pathnames should use osl URLs instead, and that pathnames are only of interest when communicating with the external world (other processes, or the human user).

One place where pathnames are used is class utl::TempFile.

Where osl URLs are Used

The osl file system functions (in osl/file.h and osl/file.hxx) now generally use osl URLs in their interfaces.

There should be few places above osl where osl URLs instead of UCB URLs are used (because generally all file access should be done through the UCB, and not directly via osl). One notable exception is the handling of temporary files (see above).

Where UCB URLs are Used

Generally, all interfaces that are designed to communicate resource names within the OpenOffice framework should use UCB URLs, and all implementations that access resources by these names should do so via the UCB. A further advantage is that, without any extra effort, not only file resources can be accessed but also other resources such as HTTP and FTP. This only requires appropriate URLs, and these URLs can be opaque to the code, interpreted only by the UCB.

Converting between Different Notations

Sometimes it is necessary to convert between different notations. Routines to do so are readily available.

There is no direct way to convert between osl URLs and UCB URLs; the conversion goes through a pathname. To convert from an osl URL to a UCB URL, use osl::FileBase::getSystemPathFromFileURL() followed by utl::LocalFileHelper::ConvertPhysicalNameToURL(). To convert from a UCB URL to an osl URL, use utl::LocalFileHelper::ConvertURLToPhysicalName() followed by osl::FileBase::getFileURLFromSystemPath(). But be aware that this only works if the osl URL and the UCB URL denote files within the same file system.
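
The two-step shape of the osl URL to UCB URL conversion can be sketched as follows. To keep the sketch self-contained, it does not call the real routines: toyUrlToPath and toyPathToUrl are hypothetical stand-ins for osl::FileBase::getSystemPathFromFileURL() and utl::LocalFileHelper::ConvertPhysicalNameToURL(), and they assume Unix-style absolute pathnames, no percent-encoding, and a plain installation in which osl URLs and UCB URLs are identical. The real routines have their own signatures and return types.

```cpp
#include <string>

// Toy stand-in for the osl URL -> pathname step.
// Accepts only "file://" URLs with an empty host and an absolute path.
bool toyUrlToPath(const std::string& url, std::string& path) {
    const std::string prefix = "file://";
    if (url.compare(0, prefix.size(), prefix) != 0) return false;
    const std::string rest = url.substr(prefix.size());
    if (rest.empty() || rest[0] != '/') return false;
    path = rest;
    return true;
}

// Toy stand-in for the pathname -> UCB URL step.
// Accepts only absolute Unix-style pathnames.
bool toyPathToUrl(const std::string& path, std::string& url) {
    if (path.empty() || path[0] != '/') return false;
    url = "file://" + path;
    return true;
}

// osl URL -> pathname -> UCB URL. The conversion fails as a whole
// if either step fails, so callers must check the result.
bool toyOslUrlToUcbUrl(const std::string& oslUrl, std::string& ucbUrl) {
    std::string path;
    return toyUrlToPath(oslUrl, path) && toyPathToUrl(path, ucbUrl);
}
```

The point of the sketch is the intermediate pathname and the error check after each step: either step can fail, for example when the URL does not denote a local file, and code that ignores the result would pass a stale or empty string on to the next layer.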

Author: Stephan Bergmann (Last modification $Date: 2003/12/06 22:37:31 $). Copyright 2001 Foundation. All Rights Reserved.
