Copying a File

Problem

You need to copy one file to another in a portable manner, i.e., without using OS-specific APIs.

Solution

Use C++ file streams to copy data from one stream to another. Example 10-9 gives an example of a buffered stream copy.

Example 10-9. Copying a file

#include <iostream>
#include <fstream>

const static int BUF_SIZE = 4096;

using std::ios_base;

int main(int argc, char** argv) {

   std::ifstream in(argv[1],
      ios_base::in | ios_base::binary);  // Use binary mode so we can
   std::ofstream out(argv[2],            // handle all kinds of file
      ios_base::out | ios_base::binary); // content.

   // Make sure the streams opened okay...

   char buf[BUF_SIZE];

   do {
      in.read(&buf[0], BUF_SIZE);       // Read at most n bytes into
      out.write(&buf[0], in.gcount( )); // buf, then write the buf to
   } while (in.gcount( ) > 0);          // the output.

   // Check streams for problems...

   in.close( );
   out.close( );
}

 

Discussion

Copying a file may appear to be a simple matter of reading from one stream and writing to another. But the C++ streams library is large, and there are a number of different ways to do the reading and the writing, so you should know a little about the library to avoid costly performance mistakes.

Example 10-9 runs fast because it buffers input and output. The read and write functions operate on entire buffers at a time (instead of using a character-at-a-time copy loop) by reading from the input stream into the buffer and writing from the buffer to the output stream in chunks. They also do not do any kind of formatting on the data, as the left- and right-shift operators do, which keeps things fast. Additionally, since the streams are in binary mode, EOF characters can be read and written without incident. Depending on your hardware, OS, and so on, you will get different results for different buffer sizes. Experiment to find the best parameters for your system.

But there's more to it than this. All C++ streams already buffer data when reading or writing, so Example 10-9 is actually doing double buffering. The input stream has its own internal stream buffer that holds characters that have been read from the source but not yet extracted with read, operator>>, get, or any other member function, and the output stream has a buffer that holds output that has been written to the stream but not yet to the destination (in the case of an ofstream, it's a file, but it could be a string, a network connection, or who-knows-what). Therefore, the best thing to do is to let the buffers exchange data directly. You can do this with operator<<, which behaves differently than usual when used with stream buffers. For example, instead of the do/while loop in Example 10-9, use this:

out << in.rdbuf( );

Don't place this statement in the body of the loop; replace the loop with this single line. It looks a little odd, since, typically, operator<< says, "take the righthand side and send it to the lefthand stream," but bear with me and it will make sense. rdbuf returns the buffer from the input stream, and the implementation of operator<< that takes a stream buffer as a righthand argument reads a character at a time from the input buffer and writes it to the output buffer. When the input buffer is emptied, it knows it has to refill itself with data from the real source, and operator<< is none the wiser.
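
Put together with the checks that Example 10-9 leaves as comments, the stream-buffer approach might look like the sketch below. The function name copy_file_rdbuf is hypothetical, and the handling of an empty source file is my own addition: this form of operator<< sets failbit on the output stream when it inserts no characters at all, so an empty input would otherwise look like a failure.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper (not from the original example): copies the file
// named src to the file named dst by handing the input stream's buffer
// directly to the output stream.
bool copy_file_rdbuf(const std::string& src, const std::string& dst) {
    std::ifstream in(src.c_str(),
                     std::ios_base::in | std::ios_base::binary);
    if (!in) {
        return false;                   // Source could not be opened.
    }
    std::ofstream out(dst.c_str(),
                      std::ios_base::out | std::ios_base::binary);
    if (!out) {
        return false;                   // Destination could not be opened.
    }

    // Empty source: opening `out` has already created an empty file, and
    // `out << in.rdbuf()` would set failbit because nothing is inserted.
    if (in.peek() == std::ifstream::traits_type::eof()) {
        return !in.bad();
    }

    out << in.rdbuf();                  // Let the buffers exchange data.
    out.close();                        // Flush now so errors are visible.
    return !in.bad() && !out.fail();
}
```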

Example 10-9 shows how to copy the contents of a file yourself, but your OS is responsible for managing the filesystem, which includes copying files, so why not let the OS do the work? In most cases, the answer to this question is that a direct call to the OS API is, of course, not portable. Boost's Filesystem library masks a lot of the OS-specific APIs for you by providing the function copy_file, which makes different OS calls based on the platform it was compiled for. Example 10-10 contains a short program that copies a file from one location to another.

Example 10-10. Copying a file with Boost

#include <cstdlib>
#include <exception>
#include <iostream>
#include <boost/filesystem/operations.hpp>

using namespace std;
using namespace boost::filesystem;

int main(int argc, char** argv) {

   // Parameter checking...

   try {
      // Turn the args into absolute paths using native formatting
      path src = complete(path(argv[1], native));
      path dst = complete(path(argv[2], native));
      copy_file(src, dst);
   }
   catch (exception& e) {
      cerr << e.what( ) << endl;
   }

   return(EXIT_SUCCESS);
}

This is a small program, but there are a few key parts that need explaining because other recipes in this chapter use the Boost Filesystem library. To begin with, the central component of the Boost Filesystem library is the path class, which represents, in an OS-independent way, a path to a file or directory. You can create a path using either a portable or OS-native string. In Example 10-10, I create a path out of the program arguments (that I then pass to complete, which I discuss in a moment):

path src=complete(path(argv[1], native));

The first argument is the text of the path, e.g., "tmp\foo.txt", and the second argument is the name of a function that accepts a string argument and returns a boolean indicating whether the path is valid according to certain rules. The native function means to use the OS's native format for validation. I used it in Example 10-10 because the arguments are passed in from the command line, where they are presumably typed in by a human user, who will probably use the native OS format when specifying files. There are a number of functions that you can use to validate file and directory names, all of which are self-explanatory: portable_posix_name, windows_name, portable_name, portable_directory_name, portable_file_name, and no_check. See the documentation for specifics.

complete composes an absolute path using the current working directory and the relative path you pass it. Thus, I can do this to create an absolute path to the source file:

path src = complete(path("tmp\\foo.txt", native));

complete handles the case where the first argument is already an absolute filename by using the value given rather than trying to merge it with the current working directory. In other words, the following code invoked from a current directory of "c:\myprograms" ignores the current working directory since the path given is already complete:

path src = complete(path("c:\\windows\\garbage.txt", native));
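
The same two cases can be tried out without Boost if a C++17 compiler is available, since the standard library's std::filesystem (which grew out of Boost Filesystem) provides absolute. The sketch below is not the API used in Example 10-10; resolve_argument is a hypothetical name, and I am assuming std::filesystem::absolute is a fair stand-in for complete for the purposes of illustration.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: turn a command-line argument into an absolute path.
// A relative argument is resolved against the current working directory;
// an argument that is already absolute keeps its value.
fs::path resolve_argument(const std::string& arg) {
    return fs::absolute(fs::path(arg));
}
```

With this helper, a relative argument such as "tmp/foo.txt" comes back prefixed with the current working directory, while an argument that is already absolute comes back naming the same location.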

Many of the Boost Filesystem functions will throw an exception if a precondition is not met. The documentation has all the details, but a good example is with the copy_file function itself. A file must exist before it can be copied, so if the source file does not exist, the operation cannot succeed; therefore, copy_file will throw an exception. Catch the exception as I did in Example 10-10 and you will get an error message that explains the problem.
