Copying Large Files

January 10, 2011

Windows Server 2008 provides at least three different command-line utilities for copying files. The familiar COPY and XCOPY commands have been around since the early DOS days. And ROBOCOPY, a more robust and feature-rich tool, began shipping with Windows Vista and Windows Server 2008. These tools work just fine in most cases. But all three suffer from a fatal flaw when you're trying to copy a file that's larger than available memory.
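
If you haven't used ROBOCOPY, note that its command line works a little differently from COPY's: it takes a source directory, a destination directory, and an optional file name or wildcard, rather than two full file paths. A single-file copy looks something like this (paths invented for illustration):

robocopy c:\source \\CLIENT\dest bigfile.dat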

Manifestation of the Problem

Our main servers are quad-core machines with 32 gigabytes of RAM, running the 64-bit version of Windows Server 2008. Our custom-format repository file is about 90 gigabytes in size. That's after compression. In addition to daily backups, I periodically need to copy that file to another computer. And that's where the problems start.

For simplicity, let's say that the repository file, Repository.dat, lives in C:\Data\ on SERVER (shared as \\SERVER\Data\), and a copy needs to go to \\CLIENT\Data\. That's simple enough, right? To copy from SERVER to CLIENT, you'd give this command from the SERVER computer:

copy c:\data\repository.dat \\CLIENT\Data\repository.dat

That seems to work just fine on my systems. However, it’s usually more convenient to have the client pull information from the server, rather than have the server push it. So on the CLIENT machine, you want to enter this command:

copy \\SERVER\data\repository.dat c:\data\repository.dat

That causes big problems.

At first, the copy operation goes along as expected. If I open Task Manager and watch network activity, I see that the network is very busy transporting data, and Resource Monitor shows my computer writing to the disk like mad. After a while, though, network activity drops off, disk activity goes down, and, much to my horror, the server machine becomes totally unresponsive. This is bad.

Repeating the test while keeping an eye on the server machine shows that memory usage on the server climbs steadily during the copy until memory is full; the machine then starts swapping pages to disk and eventually begins thrashing.

Although this problem is not unknown, detailed information about the cause is hard to come by. It’s pretty obvious that Windows is reading ahead and buffering the data in preparation for sending it across the network. Why Windows seems to think that it has to buffer 32 gigabytes of data is beyond me.

Possible Solutions

If you dig a little bit, you’ll find descriptions of this problem going back to Windows 2000 Server. Similar problems are reported for Windows Server 2003, especially the x64 version, and for Windows Vista. Microsoft has addressed the problem in a few places.

In "Slow Large File Copy Issues," the Windows Server Performance team explains that the problem is buffering (we knew that), and that the solution is to use a utility that doesn't rely on the CopyFile or CopyFileEx API functions. Their recommendation is the ESEUTIL program that ships with Exchange Server. That's fine if you have Exchange Server, I guess, but I don't have that program. And from what I've read online, ESEUTIL isn't especially fast; certainly not like RoboCopy, which appeared to be much faster than COPY or XCOPY.
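
For reference, file copying is ESEUTIL's /y mode, with /d giving the destination. Using my paths from above, the command should look something like this (hedging a bit, since I don't have the program to try it myself):

eseutil /y \\SERVER\data\repository.dat /d c:\data\repository.dat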

Another blog posting recommended RichCopy, a formerly "internal use only" Microsoft utility. Although it appears to be a useful tool, and the documentation says that it will work from the command line, I was unable to get it to do anything that way. If I ever need a GUI copy utility, I'll definitely look at RichCopy.

Par Hansson over at TechSpot ran into this problem on his Windows XP x64 system and solved it by changing a registry entry. Unfortunately, his registry edit doesn’t seem to have any effect under Windows Vista or Windows 7 or, I assume, Windows Server 2008.

Other blog postings have suggested various third-party utilities such as TeraCopy, with varying degrees of success. And although TeraCopy appears to work well and solves my immediate problem, I've not yet seen anything approaching consensus on a solution that doesn't involve third-party utilities.

What does this have to do with .NET programming?

I’m so glad you asked. First, the days of a programmer being “just a programmer” are long gone. At least, they appear to be from where I sit. When I first started in this industry, there were people who did nothing but write COBOL programs all day, with the occasional JCL script thrown in. Today, a programmer might be primarily a C# developer, but he’s also building Web pages with HTML and JavaScript, dabbling in VBScript or JScript for scripting, crafting stored procedures with T-SQL, and endlessly fiddling with hardware and operating system configuration issues. Add to that budget cuts and an economic downturn, and it’s highly likely that programmers will run into these issues that in the past were left to system administrators.

More to the point, you’ll likely run into this issue if you attempt to copy a very large file using the .NET Framework’s File.Copy method. File.Copy is little more than a wrapper around the CopyFile Windows API function, and it will exhibit the same behavior as COPY, XCOPY, and any other tool that uses CopyFile or CopyFileEx. So it’s good to understand what’s going on and how to solve the problem. In the next section, we’ll start looking at different ways to copy a file in .NET.
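
As a preview, here's the basic idea in the form of a minimal sketch (my own, not a drop-in solution): instead of calling File.Copy, open both files with FileStream and move the data yourself in fixed-size chunks, so memory usage stays bounded by your buffer size. The FileOptions values here are my assumption about a reasonable starting point, not the final word:

using System.IO;

class LargeFileCopy
{
    // Copy source to destination in fixed-size chunks. Memory usage is
    // bounded by the buffer size rather than growing with the file.
    static void CopyChunked(string source, string destination)
    {
        const int BufferSize = 1024 * 1024; // 1 MB chunks
        byte[] buffer = new byte[BufferSize];

        // SequentialScan hints to the cache manager that we won't re-read
        // the data; WriteThrough pushes writes to disk rather than letting
        // them pile up in the cache. (Assumed starting point; tune to taste.)
        using (var input = new FileStream(source, FileMode.Open,
            FileAccess.Read, FileShare.Read, BufferSize,
            FileOptions.SequentialScan))
        using (var output = new FileStream(destination, FileMode.Create,
            FileAccess.Write, FileShare.None, BufferSize,
            FileOptions.WriteThrough))
        {
            int bytesRead;
            while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, bytesRead);
            }
        }
    }

    static void Main(string[] args)
    {
        // e.g. LargeFileCopy.exe \\SERVER\data\repository.dat c:\data\repository.dat
        CopyChunked(args[0], args[1]);
    }
}

Note that this alone doesn't fully bypass the Windows file cache on the read side; true unbuffered I/O (the FILE_FLAG_NO_BUFFERING flag) has no named FileOptions value, which is one reason the problem deserves the closer look we'll give it next.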
