Cascade is a suite of tools that aims to speed up software development. Its primary components are Cascade File System, Cascade Proxy, and Cascade Manager, which runs builds and tests on machines running Cascade Worker.
Normally, when you access your source control repository, the first step is to "check out" a tree. When you check out a tree, the client downloads the latest revision of all of the files and stores a copy of each file on your local disk. In some systems, like Subversion, the system stores not one but two copies of each file. You are only allowed to modify one of these two copies. The system can determine which files you've edited by comparing each file against the second copy, which contains the original, unmodified file contents.
Checking out a large tree can take a long time and consume a lot of disk space. Once you've checked out a tree, further updates are generally incremental, meaning that you only need to download new files or files that have changed since your last update/checkout. Even so, for large projects, an "update" can take a long time.
Cascade File System (sometimes abbreviated CFS) provides a more efficient way to access your source control repository. Instead of spending minutes or hours downloading a tree, you can be up and running in just seconds with Cascade. You don't need to download all the files up front. Instead, CFS provides you a view of the tree where it looks like all of the files are present, even though in reality—behind the scenes—a file is only downloaded the first time you access it.
Because CFS interfaces with the operating system kernel as a file system driver rather than as a shell/GUI (e.g. Windows Explorer) plugin, all of your applications can access files inside CFS just like any other file on your hard drive, even if those applications are not aware of CFS. CFS simply looks like another hard drive with a different drive letter (on Windows) or another file system with a different mount point (on Linux or Macintosh).
Once CFS downloads a file for the first time, the file is cached locally on your hard drive, so subsequent accesses to that file are just as fast as your local disk.
When you work from a site with a slow network link to your source control repository, downloading files can be even slower. Cascade provides a proxy server called Cascade Proxy that caches files that have already been downloaded by other users. With Cascade Proxy, you don't have to waste time downloading files that other people already downloaded earlier.
You don't need to explicitly install a Cascade Proxy server. The Cascade Proxy server is built into the Cascade File System service, so any computer running CFS can also potentially act as a proxy server. To use a proxy, you just need to specify the proxy server's hostname during the Cascade install process, and Cascade will automatically route all requests through the proxy.
Using Cascade Proxy is entirely optional. All Cascade functionality works with or without the proxy server.
Cascade provides an automated build and regression test environment, sometimes known as a "continuous integration" system. You can set up builds and tests through its Cascade Manager web interface and run these builds and tests on a farm of PCs running the Cascade Worker software. When someone breaks a build or regression test, Cascade Manager can send out an email letting you know. You can see the status of these builds and tests, look at their log files, and download their results from the web interface.
Once you've set up your automated builds and tests, you can clone a pre-built, pre-tested CFS tree from Cascade Manager. Cloning is nearly instantaneous, because you don't need to download any files from the repository to clone. The cloned tree contains not just all the files in your source control repository but also all of the output files produced by your builds and tests. You can access these output files just like any other file in the file system. Cloning allows you to get started on development and testing right away, without having to wait for a build to complete.
When you clone a tree, you can tell Cascade Manager to give you the last known good revision rather than the latest revision. The last known good revision is the most recent revision where all builds and tests have completed and passed. By cloning last known good, you know that you are starting with a tree that works, rather than grabbing the latest revision and hoping that someone hasn't broken things recently.
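The selection rule for the last known good revision can be sketched in a few lines of Python. This is an illustrative model, not Cascade Manager's implementation: the `results` mapping and the status strings "passed", "failed", and "running" are assumptions made for the example.

```python
from typing import Optional

def last_known_good(results: dict[int, dict[str, str]]) -> Optional[int]:
    """Return the highest revision whose tasks have all completed and passed.

    `results` maps a revision number to {task name: status}. The layout and
    the status strings are hypothetical.
    """
    good = [
        rev for rev, tasks in results.items()
        if tasks and all(status == "passed" for status in tasks.values())
    ]
    return max(good, default=None)

results = {
    98: {"build": "passed", "unit_tests": "passed"},
    99: {"build": "passed", "unit_tests": "failed"},    # broken revision
    100: {"build": "passed", "unit_tests": "running"},  # not finished yet
}
print(last_known_good(results))  # prints 98
```

Revision 100 is the latest, but it is skipped because one of its tasks has not completed; revision 99 is skipped because a test failed.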
After cloning a tree and making changes to it, you can checkpoint your changes, or upload them onto Cascade Manager for safekeeping. Unlike a commit to your repository, a checkpoint will never break anything. A checkpoint is a lightweight way to save your changes without permanently recording them for posterity in the repository. Once you've created a checkpoint, you or another person can clone a new tree from the checkpoint, so checkpointing is a fast and easy way to hand off changes between engineers.
Cascade Manager can kick off the builds and tests affected by a checkpointed change. This way, you can know what your change might break before you commit it, rather than hoping for the best after you commit it. This is especially helpful for cross-platform development, where it can take a lot of time and effort to manually verify that your software builds and runs on each of your target platforms.
Finally, Cascade Manager can impose a commit policy to prevent broken changes from being committed to the repository. By default, Cascade Manager will only allow you to commit a change if all of the builds and tests affected by the change have completed and passed.
Cascade File System and Cascade Proxy provide faster, more efficient access to your source control repository. Cascade Manager tracks changes to your repository and kicks off builds and tests affected by those changes on various Cascade Worker clients. Once the entire system is set up, users can benefit from Cascade's powerful tools for cloning trees and for checkpointing and committing changes.
Cascade File System is a file system driver. It exposes a tree of files and directories that any program running on your computer can access, just like you would access files on your local disk. The difference is that the files you see in Cascade File System are generally not backed by storage allocated on your local hard disk—instead, they're only cached on your local hard drive as needed.
The first time you access a file, it will be downloaded from the repository. Subsequent accesses will obtain the file from the local cache. Eventually, if you don't use a file for a long time, it may be evicted from your cache to make room for other files. All of the files will still appear to be present, even the ones that aren't in your cache; the only way to tell that a file is cached is that accessing it is faster.
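The download-on-first-access behavior described above can be modeled with a small cache. The sketch below is a toy model and not how CFS is implemented: the class name, the `fetch` callback standing in for a repository download, and the least-recently-used eviction policy are all assumptions (the manual only says that long-unused files may be evicted).

```python
from collections import OrderedDict
from typing import Callable

class LazyFileCache:
    """Fetch a file on first access, then serve it from a bounded local cache."""

    def __init__(self, fetch: Callable[[str], bytes], max_entries: int) -> None:
        self._fetch = fetch              # stand-in for a repository download
        self._max_entries = max_entries
        self._cache = OrderedDict()      # path -> contents, oldest use first
        self.downloads = 0               # counts cache misses, for illustration

    def read(self, path: str) -> bytes:
        if path in self._cache:
            self._cache.move_to_end(path)      # mark as most recently used
            return self._cache[path]
        data = self._fetch(path)               # cache miss: download now
        self.downloads += 1
        self._cache[path] = data
        if len(self._cache) > self._max_entries:
            self._cache.popitem(last=False)    # evict the least recently used file
        return data

repo = {"a.c": b"int a;", "b.c": b"int b;", "c.c": b"int c;"}
cache = LazyFileCache(fetch=repo.__getitem__, max_entries=2)
cache.read("a.c")   # downloaded
cache.read("a.c")   # served from the cache
cache.read("b.c")   # downloaded
cache.read("c.c")   # downloaded; a.c is evicted to make room
cache.read("a.c")   # still readable, but downloaded again
print(cache.downloads)  # prints 4
```

From the caller's point of view every file is always readable; the only observable difference between a cached and an evicted file is the extra download.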
On Windows, you can assign Cascade File System its own drive letter. By
default, it uses the X: drive, but you can choose any unused drive
letter at install time. On Linux and Macintosh, Cascade File System is
typically available at the /mnt/cfs directory.
Cascade File System is organized into trees and mount points.
You can create any number of trees at the top level of Cascade
File System. For example, you might have trees X:\tree1,
X:\tree2, and X:\tree3. Trees can only live at the
top level of Cascade File System; for example, you cannot have a tree two levels
deep at X:\foo\bar.
You can create one or more mount points underneath each tree. A
mount point is a mapping of a source code repository (e.g. a Subversion or
Perforce repository) into a Cascade File System tree. Mount points can only be
created inside trees; for example, you cannot create a mount point
X:\svn at the top level of Cascade File System or at
C:\svn outside of Cascade File System entirely, only inside a CFS
tree at a path such as X:\tree\svn. A tree can, however, have more
than one mount point, and these mount points can be placed at any directory you
want inside the tree (except inside another mount point). For instance, you
could map your Subversion repository at X:\tree\svn, your Perforce
repository at X:\tree\p4, and a third-party vendor's Subversion
repository at X:\tree\vendor_name\svn. However, you could not
map a second Subversion repository at X:\tree\svn\svn2, since a
mount point cannot live inside another mount point.
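The placement rules for mount points can be summarized as a small validation function. This is only a sketch of the rules as stated above, not Cascade's own check; the function name and the returned messages are hypothetical, and it assumes CFS is on the default X: drive.

```python
from pathlib import PureWindowsPath

CFS_ROOT = PureWindowsPath("X:/")   # default drive letter; yours may differ

def check_mount_point(path: str, existing: list[str]) -> str:
    """Return "ok", or the reason why `path` cannot be a mount point."""
    candidate = PureWindowsPath(path)
    try:
        relative = candidate.relative_to(CFS_ROOT)
    except ValueError:
        return "outside Cascade File System"
    if len(relative.parts) < 2:          # needs at least <tree>\<directory>
        return "not inside a tree"
    for other in map(PureWindowsPath, existing):
        if other == candidate or other in candidate.parents:
            return "inside another mount point"
    return "ok"

mounted = [r"X:\tree\svn", r"X:\tree\p4"]
print(check_mount_point(r"X:\tree\vendor_name\svn", mounted))  # ok
print(check_mount_point(r"X:\tree\svn\svn2", mounted))         # inside another mount point
print(check_mount_point(r"X:\svn", mounted))                   # not inside a tree
print(check_mount_point(r"C:\svn", mounted))                   # outside Cascade File System
```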
A mount point maps a repository at a particular revision. While
source control terminology often differs from vendor to vendor, in Cascade,
revisions count the total number of changes made since the beginning of time,
not the number of changes made to a particular file. (This is similar to how
Subversion uses the term "revision." Perforce uses the term "changelist" to
describe the same concept. In general, Cascade tries to model its terminology
as closely as possible after Subversion.) A mount point might map to repository
X at revision 1000; this means that you will see the contents of the repository
as they existed after 1000 commits, just as though you had run an svn
checkout -r 1000 or p4 sync @1000.
In addition to reading them, you can modify, add, or delete files under CFS trees. Each CFS tree you create is independent of all your other CFS trees. When you make changes to files under the mount points, those changes will be private to just that tree. Also, CFS will never commit your changes to the repository unless you specifically ask it to.
There is one important "gotcha" to remember about destroying a CFS tree.
Normally, you would delete a directory tree on your hard drive by typing
rm -rf tree (Unix) or rmdir /s tree (Windows), or by
clicking on it and selecting "Delete" from the context menu or typing
Shift-Delete in Windows Explorer. Destroying a CFS tree this way will work, but
this is suboptimal: these commands will recursively walk through the entire
directory structure, deleting one file at a time. Since it's unlikely that you
have the entire directory structure in your cache, this will take a while.
A much faster way to destroy a CFS tree is to type rmdir tree.
CFS tweaks the normal file system semantics slightly and allows an
rmdir operation on an entire tree to succeed, even though the
directory underneath is not empty. Under Windows Explorer, the Cascade shell
extension provides a "Delete Tree" option in its right-click context menu that
will do the same thing.
CFS enforces normal rmdir semantics inside a tree,
failing this operation if the directory is not empty. You can only use this
shortcut to delete an entire CFS tree.
The mount point structure of each tree is entirely independent from that of the others—there is no restriction that all trees must have the same mount point structure. In practice, however, you will almost always want to set up the same mount point structure, and it would be tedious to type these directory names and their associated repositories over and over. Instead, you can clone a tree from Cascade Manager. Here, you fill in the paths of the mount points (relative to the root of their associated CFS trees) and their associated repositories only once. Then, when you clone a tree, the Cascade client software will set up this same mount point structure.
Cloning is particularly useful when working with more than one repository. If you tell Cascade Manager to watch more than one repository, it will look at the timestamps of the changes committed to each repository and interleave them appropriately into a single unified timeline. Instead of saying "revision 300 from repository A and revision 400 from repository B", you would simply clone revision 700 from the unified timeline of revisions that Cascade Manager builds up.
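The interleaving idea can be illustrated with a short Python sketch. It is a simplified model, not Cascade Manager's algorithm: the tuple layout, the integer timestamps, and the choice to number unified revisions from 1 are assumptions made for the example.

```python
import heapq
from typing import Iterable

# Each commit is (timestamp, repository name, repository-local revision).
Commit = tuple[int, str, int]

def unified_timeline(*histories: Iterable[Commit]) -> list[tuple[int, str, int]]:
    """Interleave per-repository histories by commit timestamp.

    Returns (unified revision, repository, local revision) triples. Each
    input history must already be sorted by timestamp.
    """
    merged = heapq.merge(*histories)   # tuples compare by timestamp first
    return [(n, repo, rev) for n, (_, repo, rev) in enumerate(merged, start=1)]

repo_a = [(100, "A", 1), (250, "A", 2), (400, "A", 3)]
repo_b = [(180, "B", 1), (300, "B", 2)]
for unified, repo, rev in unified_timeline(repo_a, repo_b):
    print(unified, repo, rev)
```

With three commits in repository A and two in repository B, the unified timeline has five revisions, and unified revision 5 corresponds to revision 3 of A together with revision 2 of B.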
Note that even with just one repository, Cascade Manager's revision numbers can and probably will diverge from the underlying repository revision numbers. This can happen because:
There is one special mount point that exists in all cloned trees. This mount
point is called /results. Its directory structure parallels that
of all your other mount points. The contents of this mount point are all of the
output files of all of the tasks you have asked Cascade Manager to run.
For example, if you have a source file mapped at /svn/trunk/foo.c,
you have a task that compiles foo.c into foo.o, and
you've told Cascade Manager to archive the file /svn/trunk/foo.o as
an output file, you can find the file foo.o underneath any cloned
CFS tree at /results/svn/trunk/foo.o.
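In other words, the location under /results is the output file's tree-relative path with /results in front. A one-function Python sketch of that mapping follows; the function name is hypothetical.

```python
import posixpath

def results_path(output_file: str) -> str:
    """Map a tree-relative output file path to its location under /results."""
    return posixpath.join("/results", output_file.lstrip("/"))

print(results_path("/svn/trunk/foo.o"))  # prints /results/svn/trunk/foo.o
```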
The /results mount point has a few special properties that make
it unlike other mount points.
It is read-only. You cannot directly edit the files under
/results; if you want to influence their contents, you must do so
by editing the files they are generated from, then checkpointing your changes,
or by reconfiguring the tasks in Cascade Manager.
Its contents can change asynchronously as tasks complete. If a task
hasn't completed yet, its corresponding output files will be missing from
/results. The files will appear in real time as the tasks finish.
Cascade does not attempt to update /results immediately as you
edit tasks' input files. You must checkpoint your changes first before
/results will reflect those changes. Cascade will not warn you
if /results is out of sync with the edits you've made since your
last checkpoint.
You can access different versions of a file using @ suffixes on
filenames.
You can access the unedited version of any file by appending
@original to its filename. For example, if you've made changes to
foo.txt, you can access the unchanged version as
foo.txt@original.
If you've edited a file and then updated your tree, and someone else has
edited the same file, you can access the version you will need to merge against
by adding a @merge suffix. In other words, foo.txt
is "yours" and foo.txt@merge is "theirs."
If you want to access a file at a specific revision number, you can add a
@<revision> suffix. For example, foo.txt@50
would be the contents of file foo.txt at revision 50.
Note that even though @ is a legal filename character on both Windows and
Unix, some programs have been observed to get confused by @
characters in paths.
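If you write scripts that build or inspect such names, a small helper can separate the filename from its suffix. The sketch below only illustrates the naming convention; it says nothing about how CFS itself parses paths, and the function name is hypothetical.

```python
import re
from typing import Optional, Union

# Recognizes only the three suffix forms described above.
_SUFFIX = re.compile(r"^(?P<name>.+)@(?P<version>original|merge|\d+)$")

def split_version(filename: str) -> tuple[str, Optional[Union[str, int]]]:
    """Split `name@suffix` into (name, version); version is None if absent."""
    match = _SUFFIX.match(filename)
    if match is None:
        return filename, None
    version = match.group("version")
    return match.group("name"), (int(version) if version.isdigit() else version)

print(split_version("foo.txt"))           # ('foo.txt', None)
print(split_version("foo.txt@original"))  # ('foo.txt', 'original')
print(split_version("foo.txt@merge"))     # ('foo.txt', 'merge')
print(split_version("foo.txt@50"))        # ('foo.txt', 50)
```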
Cascade refers to repositories and files in repositories by their "URL." This URL has a similar (but not identical) format to that of the URLs you would use in your web browser.
This release of Cascade supports Perforce and Subversion repositories. If you are interested in using Cascade with another type of repository, please contact us.
Subversion already uses URLs to identify repositories and files. Cascade's Subversion URLs are slightly different from normal Subversion URLs.
Subversion supports several different repository access methods. Cascade
supports the http://, https://, and svn://
repository access methods. For the http:// and https://
repository access methods, to turn your Subversion URL into a Cascade repository
URL, add svn- in front of it. For example:
svn-http://svn.collab.net/repos/svn
svn-https://svn.collab.net/repos/svn
For the svn:// repository access method, you can use the
Subversion URL unmodified. For example:
svn://my-server/var/svn/repos
Cascade does not natively support the svn+ssh:// repository
access method, but you can emulate it and achieve the same level of security
with your own SSH tunnel. Use ssh (on Windows, you can use the free
PuTTY SSH
client) to log in to the target machine, with a tunnel set up to forward local
port 3690 to remote port 3690, and run svnserve -d at the prompt.
This will start an svnserve server process on port 3690. Once
started, the server will continue to run until someone explicitly shuts it down
or the system is rebooted, so you don't need to restart it each time you log in.
Also, several users wanting to access the same repository can all share a single
svnserve process. If port 3690 is already in use, the server is
probably already running and you probably don't need to start a new server. If
someone else happens to be using port 3690 for something unrelated, you can use
svnserve's --listen-port argument to specify a
custom port number; in this case, you will need to specify this same port number
as the remote port when setting up your SSH tunnel (you should still use port
3690 as the local port).
Once you have set up your SSH tunnel, your repository URL is simply
svn://localhost/path rather than svn+ssh://server/path.
You will need to leave this ssh tunnel running all the time in the background.
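As an illustration of the port arrangement, the following Python sketch builds the corresponding command line for the OpenSSH client, whose -L option forwards a local port to a port on the remote side. The function name is hypothetical, and the host name my-server is only an example; PuTTY users would configure the same forwarding in its tunnel settings instead.

```python
def ssh_tunnel_command(server: str, remote_port: int = 3690) -> list[str]:
    """Build an ssh command that forwards local port 3690 to `remote_port`."""
    # -L local_port:host:remote_port is OpenSSH's local port forwarding option.
    # "localhost" is resolved on the remote side, i.e. the svnserve machine.
    return ["ssh", "-L", f"3690:localhost:{remote_port}", server]

# Default case: svnserve listens on port 3690 on the server.
print(" ".join(ssh_tunnel_command("my-server")))
# Custom case: svnserve was started with --listen-port 3700.
print(" ".join(ssh_tunnel_command("my-server", remote_port=3700)))
```

The first command is `ssh -L 3690:localhost:3690 my-server`. In both cases the local port stays 3690, so the repository URL remains svn://localhost/path.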
You can further simplify the use of an svn+ssh:// repository
with Cascade Proxy. When Cascade is configured with a proxy, it will send
queries to the proxy rather than directly to the repository. This means that
you only need to set up the ssh tunnel on the machine acting as the proxy
server, rather than on every single machine running Cascade File System. (If
several proxies are chained together, the only one that needs the ssh tunnel is
the final one in the chain.) If you set up a proxy on a computer that is
"always on", you can start the tunnel once and simply leave it running.
Cascade Proxy's network protocol is not secure, but you can secure it using a
VPN or forward Cascade Proxy over an SSH tunnel on port 4187.
Cascade does not support the file:// Subversion repository
access method.
To construct a Cascade URL for a Perforce server, take the value of the
P4PORT environment variable and add p4:// in front of
it.
For example, if your P4PORT is perforce:1666, your repository
URL would be p4://perforce:1666.
The special-purpose /results mount point in each Cascade tree is
like a repository in many ways, so it, too, has a URL. This mount point is
hosted by your Cascade Manager server, so its URL is the same as that of your
Cascade Manager server but with csc- in front. For example, it
might be:
csc-http://hostname:8080
csc-https://hostname/cascade
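The URL rules in this section are simple prefix transformations, which the following Python sketch collects in one place. The function names are hypothetical and are not part of any Cascade tool.

```python
from urllib.parse import urlsplit

def cascade_svn_url(svn_url: str) -> str:
    """Turn a Subversion URL into a Cascade repository URL."""
    scheme = urlsplit(svn_url).scheme
    if scheme in ("http", "https"):
        return "svn-" + svn_url          # svn-http:// or svn-https://
    if scheme == "svn":
        return svn_url                   # used unmodified
    # svn+ssh:// and file:// end up here: Cascade does not accept them.
    raise ValueError(f"unsupported Subversion access method: {scheme}://")

def cascade_p4_url(p4port: str) -> str:
    """Prefix the value of P4PORT with p4://."""
    return "p4://" + p4port

def cascade_results_url(manager_url: str) -> str:
    """Prefix the Cascade Manager URL with csc-."""
    return "csc-" + manager_url

print(cascade_svn_url("https://svn.collab.net/repos/svn"))
print(cascade_svn_url("svn://my-server/var/svn/repos"))
print(cascade_p4_url("perforce:1666"))
print(cascade_results_url("http://hostname:8080"))
```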
One of the key concepts Cascade introduces is checkpointing changes. It's important to understand how checkpointing a change differs from committing it to the repository. In both cases, you are uploading a change to a server, but there are also some key differences between the two operations.
Commits are permanent. When you commit a change to the repository, it becomes part of the permanent archive of all the changes made to the files stored in your repository. This is a good thing if your changes are done, but usually not if they are still in progress. Because a commit becomes part of the permanent record, you might first want to clean up any clutter in your code: delete temporary debugging code, write better comments, fix spelling and grammatical errors, and so on. Also, commits permanently grow the repository and require the server hosting the repository to do more work. This is usually not a concern for small projects, but on very large projects, too many people committing too much code too frequently can slow down the server.
Checkpoints are immutable (once created, they cannot be modified) but temporary. Cascade Manager stores them in its database for safekeeping, and you can keep them around as long as you want, but typically administrators will purge old checkpoints every so often to free up disk space and remove clutter. For example, you might purge checkpoints more than a month old once a week during the weekend. Checkpointing also does not store any data in the repository, nor does it require the repository server to do any work. Checkpoints are therefore an ideal way to save off changes that aren't done yet and that don't yet meet the level of professionalism you would expect in a commit. You also don't need to worry about how often you checkpoint. Whereas you might commit a few times per day, you can checkpoint every few minutes if you so desire.
Committing a change pushes it on all other users working in the same codeline (project and branch) when they update their trees. This can be good or bad. If the change fixes an important bug, for instance, the sooner people pick it up, the better. On the other hand, if the change introduces a bug, this is dangerous. Either way, this change now becomes part of the baseline that future commits to the codeline must be relative to. This shifts the responsibility for who has to merge. As long as a change stays in your local tree, it is your responsibility to merge your changes with changes that other people commit. Once you commit a change to the repository, other people are now responsible for merging their changes with yours.
Checkpointing a change does not push it on anyone, nor does it affect who is responsible for merging. If another user wants to grab your checkpointed change and use it as a baseline, they can; this is simply a matter of cloning a new tree from the checkpoint. You are still responsible for merging your checkpointed changes with other users' commits. It is OK to checkpoint code that is buggy or that doesn't even compile, since no one else is forced to pick it up.
One way of looking at checkpointing is that it is a lightweight way to create a temporary branch. Traditionally, if you wanted to checkpoint your incomplete work in progress several times while working on a large change, you might create your own "private branch" and commit to it each time you wanted to save off your work. Then, when your change was finished, you would merge your private branch back into the main codeline and (optionally) delete your private branch. Checkpointing is a similar way of accomplishing the same result with less overhead.
Checkpoints are a natural way to build pre-commit workflows. For example, you might have a convention that, before someone is allowed to commit their changes, they must first obtain a code review from another engineer, and the code must compile and pass a certain list of tests. Of course, this relies on trust and requires discipline: it's tempting to make "one last change" to your code before committing it and not rerun all the test cases.
By separating a commit into separate "checkpoint" and "commit the checkpointed change" phases, we can prevent these sorts of problems. Because checkpoints are immutable, if you make another change after getting your code review or running your tests, you must create a new checkpoint. That might be OK, and you might want to allow the commit anyway—that's a decision your project has to make for itself—but it makes it possible for the tools to enforce a higher level of discipline.
In particular, Cascade's default policy is that a checkpointed change cannot be committed until Cascade has run through all of the tasks (builds and tests) affected by that change to demonstrate that the change doesn't break any of them. This allows you to keep your project at a guaranteed minimum level of quality. Cascade cannot write high-quality software or test cases for you, nor can it stop people from bypassing it altogether and committing directly to the underlying repository, but Cascade can help ensure that you don't get stuck waiting for someone to fix a careless, preventable build or regression test break.
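The combination of immutable checkpoints and the default commit policy can be modeled in a few lines of Python. This is a conceptual sketch only: the `Checkpoint` fields, the task names, and the status strings are assumptions for illustration and do not describe Cascade Manager's data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Checkpoint:
    """An immutable snapshot of a change; the field names are hypothetical."""
    checkpoint_id: int
    affected_tasks: tuple[str, ...]

def may_commit(checkpoint: Checkpoint, task_status: dict[str, str]) -> bool:
    """Default-policy sketch: every affected task must have completed and passed."""
    return all(task_status.get(task) == "passed" for task in checkpoint.affected_tasks)

cp = Checkpoint(checkpoint_id=1,
                affected_tasks=("build_linux", "build_windows", "unit_tests"))
status = {"build_linux": "passed", "build_windows": "passed"}
print(may_commit(cp, status))   # False: unit_tests has not reported a result yet
status["unit_tests"] = "passed"
print(may_commit(cp, status))   # True: all affected tasks completed and passed
```

Because the dataclass is frozen, a "one last change" cannot alter an existing checkpoint; it requires a new `Checkpoint`, whose tasks start without results.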
A Cascade installation needs to know various configuration information, such
as where to find Cascade Manager on the network (in order to be able to clone a
tree), or what diff program to launch when the user types csc diff.
Cascade refers to these settings as configuration variables.
On Windows, when looking up a configuration variable, Cascade first looks for
an environment variable with that name. If it can't find one, it looks in the
Windows registry under HKEY_CURRENT_USER\Software\Conifer Systems\Cascade
for a REG_SZ (i.e. a string, not a DWORD) registry key. If it
can't find it there, it looks in the registry under
HKEY_LOCAL_MACHINE\Software\Conifer Systems\Cascade. Finally, if
it doesn't exist in any of these three locations, it may fall back to a default
value.
On Linux or Macintosh, Cascade first looks for an environment variable. If
it can't find one, it looks in the file /etc/cascade.conf. Each
line in this file has the format var=value. (Note that this file
format is very restrictive. The file format does not support comments, and it
is sensitive to whitespace. For example, if you put a space before and after
the =, the variable name would end with a space, and the variable's
value would begin with a space.)
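The lookup order on Linux or Macintosh, and the whitespace pitfall in particular, can be demonstrated with a short Python sketch. It mimics the behavior described above rather than reproducing Cascade's parser: splitting each line at the first = is an assumption, the `default` fallback mirrors what the Windows description mentions, and the value `meld` is just an example diff program.

```python
import os
from typing import Mapping, Optional

def parse_conf(text: str) -> dict[str, str]:
    """Parse var=value lines without trimming whitespace or handling comments."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            name, value = line.split("=", 1)   # assumption: split at the first "="
            settings[name] = value
    return settings

def lookup(name: str, conf: Mapping[str, str], default: Optional[str] = None,
           environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Environment variable first, then the configuration file, then a default."""
    if name in environ:
        return environ[name]
    return conf.get(name, default)

# In practice the text would be read from /etc/cascade.conf.
conf = parse_conf("CFS_ROOT=/mnt/cfs\nCSC_DIFF = meld\n")
print(lookup("CFS_ROOT", conf, environ={}))   # /mnt/cfs
print(lookup("CSC_DIFF", conf, environ={}))   # None: the stored name is "CSC_DIFF "
print(repr(conf["CSC_DIFF "]))                # ' meld', with a leading space
```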
Some of the configuration variables are set up by the Cascade installer, and you may never need to touch their values after installation.
Here are some of the important configuration variables you should know about:
CFS_IGNORE: A space-separated list of filename patterns that
CFS should not pay attention to.
CFS_PROXY: The hostname of the Cascade Proxy that Cascade
will attempt to connect to when it encounters a cache miss. If not set, Cascade
will go straight to the underlying repository to service the cache miss.
CFS_ROOT: The path at which Cascade File System can be
found, e.g., X: or /mnt/cfs.
CSC_DIFF: The diff program that should be run by
csc diff.
CSC_MANAGER: The URL at which Cascade Manager can be found,
e.g., http://hostname/cascade.
CSC_MERGE: The merge program that should be run by
csc merge.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit. (http://www.openssl.org/)
This product includes TortoiseMerge and TortoiseOverlays, developed by the TortoiseSVN project.
Comments or questions about the manual? Please email info@conifersystems.com with your feedback.
Copyright © 2008 Conifer Systems LLC. All rights reserved.
Cascade contains valuable trade secrets and other confidential information belonging to Conifer Systems LLC. This software and its associated documentation may not be copied, duplicated or disclosed to third parties without the express written permission of Conifer Systems LLC.