Commit d4c085bc authored by cvs2svn

This commit was manufactured by cvs2svn to create tag 'r0-7-5'.

git-svn-id: http://svn.savannah.nongnu.org/svn/rdiff-backup@100 2b77aa54-bcbc-44c9-a7ec-4f6cf2b41109
parent 8a22c618
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>rdiff-backup FAQ</title>
</head>
<body>
<h1>rdiff-backup FAQ</h1>
<h2>Table of contents</h2>
<ol><li><a href="#__future__">When I try to run rdiff-backup it says
"ImportError: No module named __future__" or "SyntaxError: invalid
syntax". What's happening?</a></li>
<li><a href="#verbosity">What do the different verbosity levels mean?</a></li>
<li><a href="#windows">Does rdiff-backup run under Windows?</a></li>
<li><a href="#remove_dir">My backup set contains some files that I just realized I don't want/need backed up. How do I remove them from the backup volume to save space?</a></li>
<li><a href="#redhat">How do I install the RPMs on a Redhat Linux system?</a></li>
<li><a href="#solaris">Under Solaris, rdiff-backup keeps failing with the error message "open(/dev/zero): Too many open files".</a></li>
</ol>
<h2>FAQ</h2>
<ol>
<a name="__future__">
<li><strong>When I try to run rdiff-backup it says "ImportError: No
module named __future__" or "SyntaxError: invalid syntax". What's
happening?</strong>
<P>rdiff-backup versions 0.2.x require Python version 2.1 or later,
and versions 0.3.x and later require Python version 2.2 or later. If
you don't know what version of python you are running, type in "python
-V" from the shell. I'm sorry if this is inconvenient, but
rdiff-backup uses generators, iterators, nested scoping, and
static/class methods extensively, and these were only added in version
2.2.
<P>If you have two versions of python installed, and running "python"
defaults to an early version, you'll probably have to change the first
line of the rdiff-backup script. For instance, you could set it to:
<pre>
#!/usr/bin/env python2.2
</pre>
</li>
<a name="verbosity">
<li><strong>What do the different verbosity levels mean?</strong>
<P>There is no formal specification, but here is a rough description
(settings are always cumulative, so 5 displays everything 4 does):
<P>
<table cellspacing="10">
<tr><td>0</td><td>No information given</td></tr>
<tr><td>1</td><td>Fatal Errors displayed</td></tr>
<tr><td>2</td><td>Warnings</td></tr>
<tr><td>3</td><td>Important messages, and maybe later some global statistics (default)</td></tr>
<tr><td>4</td><td>Some global settings, miscellaneous messages</td></tr>
<tr><td>5</td><td>Mentions which files were changed</td></tr>
<tr><td>6</td><td>More information on each file processed</td></tr>
<tr><td>7</td><td>More information on various things</td></tr>
<tr><td>8</td><td>All logging is dated</td></tr>
<tr><td>9</td><td>Details on which objects are moving across the connection</td></tr>
</table>
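<P>For example, to record which files changed (level 5) in the session
log while keeping terminal output down at the warning level (2),
something like the following should work (check the switch names
against the man page of your version):
<pre>
rdiff-backup -v5 --terminal-verbosity 2 /usr /backup
</pre>
</li>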
<a name="windows">
<li><strong>Does rdiff-backup run under Windows?</strong>
<P>Yes, apparently it is possible. First, follow Jason Piterak's
instructions:
<pre>
Subject: Cygwin rdiff-backup
From: Jason Piterak &lt;Jason_Piterak@c-i-s.com&gt;
Date: Mon, 4 Feb 2002 16:54:24 -0500 (13:54 PST)
To: rdiff-backup@keywest.Stanford.EDU
Hello all,
On a lark, I thought I would attempt to get rdiff-backup to work under
Windows98 under Cygwin. We have a number of NT/Win2K servers in the field
that I'd love to be backing up via rdiff-backup, and this was the start of
getting that working.
SUMMARY:
o You can get all the pieces for rdiff-backup working under Cygwin.
o The backup process works up to the point of writing any files with
timestamps.
... This is because the ':' character is reserved for Alternate Data
Stream (ADS) file designations under NTFS.
HOW TO GET IT WORKING (to a point, anyway):
o Install Cygwin
o Download the Python 2.2 update through the Cygwin installer and install.
o Download the librsync libraries from the usual place, but before
compiling...
o Cygwin does not use/provide glibc. Because of this, you have to repoint
some header files in the Makefile:
-- Make sure that you have /usr/include/inttypes.h
redirected to /usr/include/sys/types.h. Do this by:
create a file /usr/include/inttypes.h with the contents:
#include &lt;sys/types.h&gt;
o Put rdiff-backup in your PATH, as you normally would.
</pre>
Then, whenever you use rdiff-backup (or at least if you are backing up
to or restoring from a Windows system), use the
<strong>--windows-time-format</strong> switch, which will tell
rdiff-backup not to put a colon (":") in a filename (this option was
added after Jason posted his message). Finally, as Michael Muegel
points out, you have to exclude all files from the source directory
which have colons in them, so add something like the --exclude ".*:.*"
option. In the near future some quoting facility may be added to deal
with these issues.
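<P>Put together, a backup from a Cygwin source to a Windows volume
might look something like this (the paths are only illustrative):
<pre>
rdiff-backup --windows-time-format --exclude ".*:.*" /cygdrive/c/data /backup
</pre>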
</li>
<P>
<a name="remove_dir">
<li><strong>My backup set contains some files that I just realized I
don't want/need backed up. How do I remove them from the backup
volume to save space?</strong>
<P>Let's take an example. Suppose you ran
<pre>rdiff-backup /usr /backup</pre>
and now realize that you don't want /usr/local backed up on /backup.
Next time you back up, you run
<pre>rdiff-backup --exclude /usr/local /usr /backup</pre>
so that /usr/local is no longer copied to /backup/usr/local.
However, old information about /usr/local is still present in
/backup/rdiff-backup-data/increments/usr/local. You could wait for
this information to expire and then run rdiff-backup with the
--remove-older-than option, or you could remove the increments
manually by typing:
<pre>rm -rf /backup/rdiff-backup-data/increments/usr/local
rm /backup/rdiff-backup-data/increments/usr/local.*.dir</pre>
</li>
<P>
<a name="redhat">
<li><strong>How do I install the RPMs on a Redhat Linux system?</strong>
<P>The problem is that the default version of python for Redhat 7.x is
1.5.x, and rdiff-backup requires python >= 2.2. Redhat/rawhide
provides python 2.2 RPMs, but they are packaged under the "python2"
name.
<P>So, if you are running Redhat 7.x:
<ol>
<li>Make sure the python2 >= 2.2 package is installed,
leaving python 1.5 the way it is
<li>Install the rdiff-backup RPM, using --nodeps if it only complains
about python 2.2 missing.
<li>Edit the first line of /usr/bin/rdiff-backup so it says<pre>
#!/usr/bin/env python2
</pre>
so "python2" gets run instead of "python".
</ol>
<P>You can also upgrade using a non-Redhat python 2.2 RPM and avoid
the above steps (this is what I did). Because of all the dependencies
it is usually easier to use source RPMs for this.
</li>
<P>
<a name="solaris">
<li><strong>Under Solaris, rdiff-backup keeps failing with
the error message "open(/dev/zero): Too many open files".</strong>
<P>Kevin Spicer reported this problem and then posted the following
update:
<pre>
Subject: RE: Crash report....still not^H^H^H working
From: "Spicer, Kevin" <Kevin.Spicer@bmrb.co.uk>
Date: Sat, 11 May 2002 23:36:42 +0100
To: rdiff-backup@keywest.Stanford.EDU
Quick mail to follow up on this..
My rdiff backup (on Solaris 2.6 if you remember) has now worked
reliably for nearly two weeks after I added...
ulimit -n unlimited
to the start of my cron job and created a wrapper script on the remote
machine which looked like this...
#!/bin/sh
ulimit -n unlimited
rdiff-backup --server
exit
And changed the remote schema on the command line of rdiff-backup to
call the wrapper script rather than rdiff-backup itself on the remote
machine. As for the /dev/zero thing I've done a bit of Googling and
it seems that /dev/zero is used internally by libthread on Solaris
(which doesn't really explain why it's opening more than 64 files - but
at least I think I've now got round it).
</pre>
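<P>The "remote schema" change he mentions would look something like
the following, where the path to the wrapper script is of course
site-specific:
<pre>
rdiff-backup --remote-schema 'ssh %s /usr/local/bin/rdiff-backup-wrapper' \
        localdir remotehost.net::remotedir
</pre>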
</li>
</ol>
<hr>
<a href="http://www.stanford.edu/~bescoto">Ben Escoto</a> <address><a href="mailto:bescoto@stanford.edu">&lt;bescoto@stanford.edu&gt;</a></address>
<!-- Created: Fri Sep 7 15:34:45 PDT 2001 -->
<!-- hhmts start -->
Last modified: Sat May 11 19:26:17 PDT 2002
<!-- hhmts end -->
</body>
</html>
Thank you for trying rdiff-backup.
Remember that you must have Python 2.2 or later and librsync installed
(this means that "python" and "rdiff" should be in your path). To
download, see http://www.python.org and
http://sourceforge.net/projects/rproxy/ respectively.
For remote operation, rdiff-backup should be installed and in the
PATH on the remote system(s) (see the man page for more information).
If you have the above installed, and it still doesn't work, contact
Ben Escoto <bescoto@stanford.edu>, or post to the mailing list (see
web page at http://www.stanford.edu/~bescoto/rdiff-backup for more
information).
Security audit
--read-only and --write-only /usr/foo switches to tighten security up some.
Don't produce a stack trace (which looks like a crash); include the
file name in logging stats.
Add to above Dean Gaudet's suggestion: make errors look prettier (like tar).
Examine and fix recovery problems.
Think about adding Gaudet's idea for keeping track of renamed files.
If hardlink data can't be recovered while hardlink support is on, don't resume.
#!/usr/bin/env python

import os, re, shutil, time

SourceDir = "src"
filelist = [SourceDir + "/rdiff-backup", "CHANGELOG",
            "COPYING", "README", "FAQ.html"]

# Various details about the files must also be specified by the rpm
# spec template.
spec_template = "dist/rdiff-backup.spec"
redhat_spec_template = "dist/rdiff-backup.rh7x.spec"

def GetVersion():
    """Return version string by reading in ./rdiff-backup"""
    fp = open(SourceDir + "/rdiff-backup", "r")
    match = re.search("Version (.*?) ", fp.read())
    fp.close()
    return match.group(1)
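# Illustrative note (added; not in the original script): the regex above
# matches the version header of the rdiff-backup script itself, e.g. the
# line "# Version 0.7.5 released May 21, 2002" yields "0.7.5".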
def CopyMan(destination, version):
    """Create updated man page at the specified location"""
    fp = open(destination, "w")
    date = time.strftime("%B %Y", time.localtime(time.time()))
    version = "Version " + version
    firstline = ('.TH RDIFF-BACKUP 1 "%s" "%s" "User Manuals"\n' %
                 (date, version))
    fp.write(firstline)
    infp = open("rdiff-backup.1", "r")
    infp.readline()
    fp.write(infp.read())
    fp.close()
    infp.close()

def MakeTar(version):
    """Create rdiff-backup tar file"""
    tardir = "rdiff-backup-%s" % version
    tarfile = "rdiff-backup-%s.tar.gz" % version
    try:
        os.lstat(tardir)
        os.system("rm -rf " + tardir)
    except OSError: pass

    os.mkdir(tardir)
    for file in filelist: os.system("cp -a %s %s" % (file, tardir))
    os.chmod(os.path.join(tardir, "rdiff-backup"), 0755)
    CopyMan(os.path.join(tardir, "rdiff-backup.1"), version)
    os.system("tar -cvzf %s %s" % (tarfile, tardir))
    shutil.rmtree(tardir)
    return tarfile

def MakeSpecFile(version):
    """Create spec file using spec template"""
    def helper(spec_template, specfile):
        """Added now that there are special redhat rpms"""
        outfp = open(specfile, "w")
        outfp.write("Version: %s\n" % version)
        infp = open(spec_template, "r")
        outfp.write(infp.read())
        infp.close()
        outfp.close()

    specfile = "rdiff-backup-%s-1.spec" % version
    redhat_specfile = "rdiff-backup-%s-1.rh7x.spec" % version
    helper(spec_template, specfile)
    helper(redhat_spec_template, redhat_specfile)
    return (specfile, redhat_specfile)

def Main():
    cwd = os.getcwd()
    os.chdir(SourceDir)
    assert not os.system("./Make")
    os.chdir(cwd)

    version = GetVersion()
    print "Processing version " + version
    tarfile = MakeTar(version)
    print "Made tar file " + tarfile
    specfiles = MakeSpecFile(version)
    print "Made specfiles %s and %s" % specfiles

if __name__ == "__main__": Main()
#!/usr/bin/env python

import os, sys, re

SourceDir = "src"

def GetVersion():
    """Return version string by reading in ./rdiff-backup"""
    fp = open(SourceDir + "/rdiff-backup", "r")
    match = re.search("Version (.*?) ", fp.read())
    fp.close()
    return match.group(1)

# version is always needed below for the redhat specfile name (the
# original only set it when no specfile argument was given)
version = GetVersion()
if len(sys.argv) == 1:
    specfile = "rdiff-backup-%s-1.spec" % version
    print "Using specfile %s" % specfile
elif len(sys.argv) == 2:
    specfile = sys.argv[1]
    print "Using specfile %s" % specfile
else:
    print ("%s takes zero or one argument, the name of the rpm spec "
           "file" % sys.argv[0])
    sys.exit(1)

base = ".".join(specfile.split(".")[:-1])
srcrpm = base + ".src.rpm"
noarchrpm = base + ".noarch.rpm"
tarfile = "-".join(base.split("-")[:-1]) + ".tar.gz"
redhat_srcrpm = base + "rh7x.src.rpm"
redhat_noarchrpm = base + "rh7x.noarch.rpm"
redhat_patch = "rdiff-backup-rh7x.patch"
redhat_specfile = "rdiff-backup-%s-1.rh7x.spec" % version

#os.system("install -o root -g root -m 644 %s %s /usr/src/redhat/SOURCES" %
#          (tarfile, redhat_patch))
os.system("install -o root -g root -m 644 %s /usr/src/redhat/SOURCES" %
          (tarfile,))
os.system("rpm -ba --sign -vv --target noarch " + specfile)
#os.system("rpm -ba --sign -vv --target noarch.rh7x " + redhat
#os.system("install -o ben -g ben -m 644 /usr/src/redhat/SRPMS/%s ." % srcrpm)
os.system("install -o ben -g ben -m 644 /usr/src/redhat/RPMS/noarch/%s ." %
          noarchrpm)
#!/usr/bin/env python

import sys, os

def RunCommand(cmd):
    print cmd
    os.system(cmd)

if not sys.argv[1:]:
    print 'Call with version number, as in "./makeweb 0.3.1"'
    sys.exit(1)

version = sys.argv[1]
webprefix = "/home/ben/misc/html/mirror/rdiff-backup/"

RunCommand("cp *%s* %s" % (version, webprefix))
RunCommand("rman -f html -r '' rdiff-backup.1 > %srdiff-backup.1.html"
           % webprefix)
RunCommand("cp FAQ.html CHANGELOG src/rdiff-backup %s" % webprefix)

os.chdir(webprefix)
print "cd ", webprefix
RunCommand("rm latest latest.rpm latest.tar.gz")
RunCommand("ln -s *%s*rpm latest.rpm" % (version,))
RunCommand("ln -s *%s*tar.gz latest.tar.gz" % (version,))
--- rdiff-backup.old	Sat Apr 6 10:05:18 2002
+++ rdiff-backup	Sat Apr 6 10:05:25 2002
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python2
 #
 # rdiff-backup -- Mirror files while keeping incremental changes
 # Version 0.7.1 released March 25, 2002
Summary: Convenient and transparent local/remote incremental mirror/backup
Name: rdiff-backup
Release: 1
URL: http://www.stanford.edu/~bescoto/rdiff-backup/
Source: %{name}-%{version}.tar.gz
Copyright: GPL
Group: Applications/Archiving
BuildRoot: %{_tmppath}/%{name}-root
Requires: librsync, python2 >= 2.2
Patch: rdiff-backup-rh7x.patch
%description
rdiff-backup is a script, written in Python, that backs up one
directory to another and is intended to be run periodically (nightly
from cron for instance). The target directory ends up a copy of the
source directory, but extra reverse diffs are stored in the target
directory, so you can still recover files lost some time ago. The idea
is to combine the best features of a mirror and an incremental
backup. rdiff-backup can also operate in a bandwidth efficient manner
over a pipe, like rsync. Thus you can use rdiff-backup and ssh to
securely back a hard drive up to a remote location, and only the
differences from the previous backup will be transmitted.
%prep
%setup
%patch
%build
%install
rm -rf $RPM_BUILD_ROOT
mkdir -p $RPM_BUILD_ROOT/usr/bin
mkdir -p $RPM_BUILD_ROOT/usr/share/man/man1
install -m 755 rdiff-backup $RPM_BUILD_ROOT/usr/bin/rdiff-backup
install -m 644 rdiff-backup.1 $RPM_BUILD_ROOT/usr/share/man/man1/rdiff-backup.1
%clean
%files
%defattr(-,root,root)
/usr/bin/rdiff-backup
/usr/share/man/man1/rdiff-backup.1.gz
%doc CHANGELOG COPYING README FAQ.html
%changelog
* Sat Apr 6 2002 Ben Escoto <bescoto@stanford.edu>
- Made new version for Redhat 7.x series
* Sun Nov 4 2001 Ben Escoto <bescoto@stanford.edu>
- Initial RPM
Summary: Convenient and transparent local/remote incremental mirror/backup
Name: rdiff-backup
Release: 1
URL: http://www.stanford.edu/~bescoto/rdiff-backup/
Source: %{name}-%{version}.tar.gz
Copyright: GPL
Group: Applications/Archiving
BuildRoot: %{_tmppath}/%{name}-root
Requires: librsync, python >= 2.2
%description
rdiff-backup is a script, written in Python, that backs up one
directory to another and is intended to be run periodically (nightly
from cron for instance). The target directory ends up a copy of the
source directory, but extra reverse diffs are stored in the target
directory, so you can still recover files lost some time ago. The idea
is to combine the best features of a mirror and an incremental
backup. rdiff-backup can also operate in a bandwidth efficient manner
over a pipe, like rsync. Thus you can use rdiff-backup and ssh to
securely back a hard drive up to a remote location, and only the
differences from the previous backup will be transmitted.
%prep
%setup
%build
%install
rm -rf $RPM_BUILD_ROOT
mkdir -p $RPM_BUILD_ROOT/usr/bin
mkdir -p $RPM_BUILD_ROOT/usr/share/man/man1
install -m 755 rdiff-backup $RPM_BUILD_ROOT/usr/bin/rdiff-backup
install -m 644 rdiff-backup.1 $RPM_BUILD_ROOT/usr/share/man/man1/rdiff-backup.1
%clean
%files
%defattr(-,root,root)
/usr/bin/rdiff-backup
/usr/share/man/man1/rdiff-backup.1.gz
%doc CHANGELOG COPYING README FAQ.html
%changelog
* Sun Nov 4 2001 Ben Escoto <bescoto@stanford.edu>
- Initial RPM
#!/usr/bin/env python
#
# Compresses old rdiff-backup increments. See
# http://www.stanford.edu/~bescoto/rdiff-backup for information on
# rdiff-backup.

from __future__ import nested_scopes, generators
import os, sys, getopt, re  # re added: Main() below uses re.compile

rdiff_backup_location = "/usr/bin/rdiff-backup"
no_compression_regexp_string = None
__no_execute__ = 1  # tells the exec'd rdiff-backup script not to run Main

def print_help():
    """Print usage, exit"""
    print """
Usage: compress-rdiff-backup-increments [options] mirror_directory

This script will compress the old rdiff-backup increments under
mirror_directory, in a format compatible with rdiff-backup version
0.7.1 and later.  So for instance if you were using an old version
of rdiff-backup like this:

    rdiff-backup /foo /backup

and now you want to take advantage of v0.7.1's space saving
compression, you can run:

    compress-rdiff-backup-increments /backup

Options:

--rdiff-backup-location location
    This script reads your rdiff-backup executable.  The default
    is "/usr/bin/rdiff-backup", so if your rdiff-backup is in a
    different location, you must use this switch.

--no-compression-regexp regexp
    Any increments whose base name match this regular expression
    won't be compressed.  This is generally used to avoid
    compressing already compressed files.  See the rdiff-backup
    man page for the default.
"""
    sys.exit(1)

def parse_args(arglist):
    """Check and evaluate command line arguments, return dirname"""
    global rdiff_backup_location
    global no_compression_regexp_string
    try: optlist, args = getopt.getopt(arglist, "v:",
                                       ["rdiff-backup-location=",
                                        "no-compression-regexp="])
    except getopt.error: print_help()

    for opt, arg in optlist:
        if opt == "--no-compression-regexp":
            no_compression_regexp_string = arg
        elif opt == "--rdiff-backup-location": rdiff_backup_location = arg
        else:
            print "Bad option: ", opt
            print_help()

    if len(args) != 1:
        print "Wrong number of arguments"
        print_help()
    return args[0]

def exec_rdiff_backup():
    """Execs rdiff-backup"""
    try: execfile(rdiff_backup_location, globals())
    except IOError:
        print "Unable to read", rdiff_backup_location
        print "You may need to use the --rdiff-backup-location argument"
        sys.exit(1)

    if not map(int, Globals.version.split(".")) >= [0, 7, 1]:
        print "This script requires rdiff-backup version 0.7.1 or later,",
        print "found version", Globals.version
        sys.exit(1)

def gzip_file(rp):
    """gzip rp, adding .gz to path and deleting original"""
    newrp = RPath(rp.conn, rp.base, rp.index[:-1] + (rp.index[-1]+".gz",))
    if newrp.lstat():
        print "Warning: %s already exists, skipping" % newrp.path
        return

    print "gzipping ", rp.path
    newrp.write_from_fileobj(rp.open("rb"), compress = 1)
    RPath.copy_attribs(rp, newrp)
    rp.delete()

def Main():
    dirname = parse_args(sys.argv[1:])
    exec_rdiff_backup()
    if no_compression_regexp_string is not None:
        no_compression_regexp = re.compile(no_compression_regexp_string, re.I)
    else:
        no_compression_regexp = \
            re.compile(Globals.no_compression_regexp_string, re.I)
    Globals.change_source_perms = 1
    Globals.change_ownership = (os.getuid() == 0)

    # Check to make sure rbdir exists
    root_rp = RPath(Globals.local_connection, dirname)
    rbdir = root_rp.append("rdiff-backup-data")
    if not rbdir.lstat():
        print "Cannot find %s, exiting" % rbdir.path
        sys.exit(1)

    for dsrp in DestructiveStepping.Iterate_with_Finalizer(rbdir, 1):
        if (dsrp.isincfile() and dsrp.isreg() and
            not dsrp.isinccompressed() and
            (dsrp.getinctype() == "diff" or dsrp.getinctype() == "snapshot")
            and dsrp.getsize() != 0 and
            not no_compression_regexp.match(dsrp.getincbase_str())):
            gzip_file(dsrp)

Main()
#!/usr/bin/env python

from __future__ import generators
import sys, os, stat

def usage():
    print "Usage: find2dirs dir1 dir2"
    print
    print "Given the name of two directories, list all the files in both, one"
    print "per line, but don't repeat a file even if it is in both directories"
    sys.exit(1)

def getlist(base, ext = ""):
    """Return iterator yielding filenames from directory"""
    if ext: yield ext
    else: yield "."

    fullname = os.path.join(base, ext)
    if stat.S_ISDIR(stat.S_IFMT(os.lstat(fullname)[stat.ST_MODE])):
        for subfile in os.listdir(fullname):
            for fn in getlist(base, os.path.join(ext, subfile)): yield fn

def main(dir1, dir2):
    d = {}
    for fn in getlist(dir1): d[fn] = 1
    for fn in getlist(dir2): d[fn] = 1
    for fn in d.keys(): print fn

if not len(sys.argv) == 3: usage()
else: main(sys.argv[1], sys.argv[2])
#!/usr/bin/env python

"""init_smallfiles.py

This program makes a number of files of the given size in the
specified directory.
"""

import os, stat, sys, math

if len(sys.argv) > 5 or len(sys.argv) < 4:
    print "Usage: init_files [directory name] [file size] [file count] [base]"
    print
    print "Creates file_count files in directory_name of size file_size."
    print "The created directory has a tree type structure where each level"
    print "has at most base files or directories in it.  Default is 50."
    sys.exit(1)

dirname = sys.argv[1]
filesize = int(sys.argv[2])
filecount = int(sys.argv[3])
block_size = 16384
block = "." * block_size
block_change = "." * (filesize % block_size)
if len(sys.argv) == 4: base = 50
else: base = int(sys.argv[4])

def make_file(path):
    """Make the file at path"""
    fp = open(path, "w")
    for i in xrange(int(math.floor(filesize/block_size))): fp.write(block)
    fp.write(block_change)
    fp.close()

def find_sublevels(count):
    """Return number of sublevels required for count files"""
    return int(math.ceil(math.log(count)/math.log(base)))

def make_dir(dir, count):
    """Make count files in the directory, making subdirectories if necessary"""
    print "Making directory %s with %d files" % (dir, count)
    os.mkdir(dir)
    level = find_sublevels(count)
    assert count <= pow(base, level)
    if level == 1:
        for i in range(count): make_file(os.path.join(dir, "file%d" % i))
    else:
        files_per_subdir = pow(base, level-1)
        full_dirs = int(count/files_per_subdir)
        assert full_dirs <= base
        for i in range(full_dirs):
            make_dir(os.path.join(dir, "subdir%d" % i), files_per_subdir)

        change = count - full_dirs*files_per_subdir
        assert change >= 0
        if change > 0:
            make_dir(os.path.join(dir, "subdir%d" % full_dirs), change)

def start(dir):
    try: os.stat(dir)
    except os.error: pass
    else:
        print "Directory %s already exists, exiting." % dir
        sys.exit(1)
    make_dir(dirname, filecount)

start(dirname)
#!/usr/bin/python

import sys, os

curdir = os.getcwd()
os.chdir("../src")
execfile("selection.py")
os.chdir(curdir)

lc = Globals.local_connection

for filename in sys.argv[1:]:
    #print "Deleting %s" % filename
    rp = RPath(lc, filename)
    if rp.lstat(): rp.delete()
    #os.system("rm -rf " + rp.path)
#!/usr/bin/env python

"""remove-comments.py

Given a python program on standard input, spit one out on stdout that
should work the same, but has blank and comment lines removed.
"""

import sys, re

triple_regex = re.compile('"""')

def eattriple(initial_line_stripped):
    """Keep reading until end of doc string"""
    assert initial_line_stripped.startswith('"""')
    if triple_regex.search(initial_line_stripped[3:]): return
    while 1:
        line = sys.stdin.readline()
        if not line or triple_regex.search(line): break

while 1:
    line = sys.stdin.readline()
    if not line: break
    stripped = line.strip()
    if not stripped: continue
    if stripped[0] == "#": continue
    if stripped.startswith('"""'):
        eattriple(stripped)
        continue
    sys.stdout.write(line)
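# Illustrative usage (added; the invocation is assumed, not documented here):
#   python remove-comments.py < rdiff-backup > rdiff-backup.stripped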
from __future__ import generators
import types
execfile("rorpiter.py")

#######################################################################
#
# destructive-stepping - Deal with side effects from traversing trees
#

class DSRPPermError(Exception):
    """Exception used when a DSRPath can't get sufficient permissions"""
    pass

class DSRPath(RPath):
    """Destructive Stepping RPath

    Sometimes when we traverse the directory tree, even when we just
    want to read files, we have to change things, like the permissions
    of a file or directory in order to read it, or the file's access
    times.  This class is like an RPath, but the permission and time
    modifications are delayed, so that they can be done at the very
    end when they won't be disturbed later.

    Here are the new class variables:

    delay_perms - true iff future perm changes should be delayed
    newperms - holds the perm values while they are delayed
    delay_atime - true iff some atime changes are being delayed
    newatime - holds the new atime
    delay_mtime - true if some mtime change is being delayed
    newmtime - holds the new mtime

    """
    def __init__(self, source, *args):
        """Initialize DSRP

        Source should be true iff the DSRPath is taken from the
        "source" partition and thus settings like
        Globals.change_source_perms should be paid attention to.

        If args is [rpath], return the dsrpath equivalent of rpath,
        otherwise use the same arguments as the RPath initializer.

        """
        if len(args) == 1 and isinstance(args[0], RPath):
            rp = args[0]
            RPath.__init__(self, rp.conn, rp.base, rp.index)
        else: RPath.__init__(self, *args)

        if source != "bypass":
            # "bypass" val is used when unpackaging over connection
            assert source is None or source is 1
            self.source = source
            self.set_delays(source)
            self.set_init_perms(source)

    def set_delays(self, source):
        """Delay writing permissions and times where appropriate"""
        if not source or Globals.change_source_perms:
            self.delay_perms, self.newperms = 1, None
        else: self.delay_perms = None

        if Globals.preserve_atime:
            self.delay_atime = 1
            # Now get atime right away if possible
            if self.data.has_key('atime'): self.newatime = self.data['atime']
            else: self.newatime = None
        else: self.delay_atime = None

        if source:
            self.delay_mtime = None # we'll never change mtime of source file
        else:
            self.delay_mtime = 1
            # Save mtime now for a dir, because it might inadvertently change
            if self.isdir(): self.newmtime = self.data['mtime']
            else: self.newmtime = None

    def set_init_perms(self, source):
        """If necessary, change permissions to ensure access"""
        if self.isreg() and not self.readable():
            if (source and Globals.change_source_perms or
                not source and Globals.change_mirror_perms):
                self.chmod_bypass(0400)
        elif self.isdir():
            if source and Globals.change_source_perms:
                if not self.readable() or not self.executable():
                    self.chmod_bypass(0500)
            elif not source and Globals.change_mirror_perms:
                if not self.hasfullperms(): self.chmod_bypass(0700)

    def warn(self, err):
        Log("Received error '%s' when dealing with file %s, skipping..."
            % (err, self.path), 1)
        raise DSRPPermError(self.path)

    def __getstate__(self):
        """Return picklable state.  See RPath __getstate__."""
        assert self.conn is Globals.local_connection # Can't pickle a conn
        return self.getstatedict()

    def getstatedict(self):
        """Return dictionary containing the attributes we can save"""
        pickle_dict = {}
        for attrib in ['index', 'data', 'delay_perms', 'newperms',
                       'delay_atime', 'newatime',
                       'delay_mtime', 'newmtime',
                       'path', 'base', 'source']:
            if self.__dict__.has_key(attrib):
                pickle_dict[attrib] = self.__dict__[attrib]
        return pickle_dict

    def __setstate__(self, pickle_dict):
        """Set state from object produced by getstate"""
        self.conn = Globals.local_connection
        for attrib in pickle_dict.keys():
            self.__dict__[attrib] = pickle_dict[attrib]

    def chmod(self, permissions):
        """Change permissions, delaying if self.perms_delayed is set"""
        if self.delay_perms: self.newperms = self.data['perms'] = permissions
        else: RPath.chmod(self, permissions)

    def getperms(self):
        """Return dsrp's intended permissions"""
        if self.delay_perms and self.newperms is not None:
            return self.newperms
        else: return self.data['perms']

    def chmod_bypass(self, permissions):
        """Change permissions without updating the data dictionary"""
        self.delay_perms = 1
        if self.newperms is None: self.newperms = self.getperms()
        Log("DSRP: Perm bypass %s to %o" % (self.path, permissions), 8)
        self.conn.os.chmod(self.path, permissions)

    def settime(self, accesstime, modtime):
        """Change times, delaying if self.times_delayed is set"""
        if self.delay_atime: self.newatime = self.data['atime'] = accesstime
        if self.delay_mtime: self.newmtime = self.data['mtime'] = modtime

        if not self.delay_atime or not self.delay_mtime:
            RPath.settime(self, accesstime, modtime)

    def setmtime(self, modtime):
        """Change mtime, delaying if self.times_delayed is set"""
        if self.delay_mtime: self.newmtime = self.data['mtime'] = modtime
        else: RPath.setmtime(self, modtime)

    def getmtime(self):
        """Return dsrp's intended modification time"""
        if self.delay_mtime and self.newmtime is not None:
            return self.newmtime
        else: return self.data['mtime']

    def getatime(self):
        """Return dsrp's intended access time"""
        if self.delay_atime and self.newatime is not None:
            return self.newatime
        else: return self.data['atime']

    def write_changes(self):
        """Write saved up permission/time changes"""
        if not self.lstat(): return # File has been deleted in meantime

        if self.delay_perms and self.newperms is not None:
            Log("Finalizing permissions of dsrp %s to %s" %
                (self.path, self.newperms), 8)
            RPath.chmod(self, self.newperms)

        do_atime = self.delay_atime and self.newatime is not None
        do_mtime = self.delay_mtime and self.newmtime is not None
        if do_atime and do_mtime:
            RPath.settime(self, self.newatime, self.newmtime)
        elif do_atime and not do_mtime:
            RPath.settime(self, self.newatime, self.getmtime())
        elif not do_atime and do_mtime:
            RPath.setmtime(self, self.newmtime)

    def newpath(self, newpath, index = ()):
        """Return similar DSRPath but with new path"""
        return self.__class__(self.source, self.conn, newpath, index)

    def append(self, ext):
        """Return similar DSRPath with new extension"""
        return self.__class__(self.source, self.conn, self.base,
                              self.index + (ext,))

    def new_index(self, index):
        """Return similar DSRPath with new index"""
        return self.__class__(self.source, self.conn, self.base, index)


class DestructiveSteppingFinalizer(IterTreeReducer):
    """Finalizer that can work on an iterator of dsrpaths

    The reason we have to use an IterTreeReducer is that some files
    should be updated immediately, but for directories we sometimes
    need to update all the files in the directory before finally
    coming back to it.

    """
    dsrpath = None
    def start_process(self, index, dsrpath):
        self.dsrpath = dsrpath

    def end_process(self):
        if self.dsrpath:
            # the original lambda referenced an undefined name "dsrp" here
            Robust.check_common_error(self.dsrpath.write_changes,
                lambda exc: Log("Error %s finalizing file %s" %
                                (str(exc), self.dsrpath.path)))
from __future__ import generators
execfile("manage.py")

#######################################################################
#
# filelist - Some routines that help with operations over files listed
# in standard input instead of over whole directories.
#

class FilelistError(Exception): pass

class Filelist:
    """Many of these methods have analogs in highlevel.py"""
    def File2Iter(fp, baserp):
        """Convert file obj with one pathname per line into rpiter

        Closes fp when done.  Given files are added to baserp.

        """
        while 1:
            line = fp.readline()
            if not line: break
            if line[-1] == "\n": line = line[:-1] # strip trailing newline
            if not line: continue # skip blank lines
            elif line[0] == "/": raise FilelistError(
                "Read in absolute file name %s." % line)
            yield baserp.append(line)
        assert not fp.close(), "Error closing filelist fp"

    def Mirror(src_rpath, dest_rpath, rpiter):
        """Copy files in fileiter from src_rpath to dest_rpath"""
        sigiter = dest_rpath.conn.Filelist.get_sigs(dest_rpath, rpiter)
        diffiter = Filelist.get_diffs(src_rpath, sigiter)
        dest_rpath.conn.Filelist.patch(dest_rpath, diffiter)
        dest_rpath.setdata()

    def Mirror_and_increment(src_rpath, dest_rpath, inc_rpath, rpiter):
        # rpiter added to the signature; the original referenced it
        # without defining it (this module is marked unfinished below)
        """Mirror + put increment in tree based at inc_rpath"""
        sigiter = dest_rpath.conn.Filelist.get_sigs(dest_rpath, rpiter)
        diffiter = Filelist.get_diffs(src_rpath, sigiter)
        dest_rpath.conn.Filelist.patch_and_increment(dest_rpath, diffiter,
                                                     inc_rpath)
        dest_rpath.setdata()

    def get_sigs(dest_rpbase, rpiter):
        """Get signatures of file analogs in rpiter

        This is meant to be run on the destination side.  Only the
        extension part of the rps in rpiter will be used; the base is
        ignored.

        """
        def dest_iter(src_iter):
            for src_rp in src_iter: yield dest_rpbase.new_index(src_rp.index)
        return RORPIter.Signatures(dest_iter())

    def get_diffs(src_rpbase, sigiter):
        """Get diffs based on sigiter and files in src_rpbase

        This should be run on the local side.

        """
        for sig_rorp in sigiter:
            new_rp = src_rpbase.new_index(sig_rorp.index)
            yield RORPIter.diffonce(sig_rorp, new_rp)

    def patch(dest_rpbase, diffiter):
        """Process diffs in diffiter and update files in dest_rpbase.

        Run remotely.

        """
        for diff_rorp in diffiter:
            basisrp = dest_rpbase.new_index(diff_rorp.index)
            if basisrp.lstat(): Filelist.make_subdirs(basisrp)
            Log("Processing %s" % basisrp.path, 7)
            RORPIter.patchonce(dest_rpbase, basisrp, diff_rorp)

    def patch_and_increment(dest_rpbase, diffiter, inc_rpbase):
        """Apply diffs in diffiter to dest_rpbase, and increment to inc_rpbase

        Also to be run remotely.

        """
        for diff_rorp in diffiter:
            basisrp = dest_rpbase.new_index(diff_rorp.index)
            if diff_rorp.lstat(): Filelist.make_subdirs(basisrp)
            Log("Processing %s" % basisrp.path, 7)
            # XXX This isn't done yet...

    def make_subdirs(rpath):
        """Make sure that all the directories under the rpath exist

        This function doesn't try to get the permissions right on the
        underlying directories, just do the minimum to make sure the
        file can be created.

        """
        dirname = rpath.dirsplit()[0]
        if dirname == '.' or dirname == '': return
        dir_rp = RPath(rpath.conn, dirname)
        Filelist.make_subdirs(dir_rp)
        if not dir_rp.lstat(): dir_rp.mkdir()

MakeStatic(Filelist)
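# Illustrative note (added; not in the original): Filelist.File2Iter expects
# relative pathnames, one per line.  A file object containing the two lines
# "foo" and "foo/bar" would yield baserp.append("foo") and then
# baserp.append("foo/bar"), in that order.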
#!/usr/bin/env python
#
# rdiff-backup -- Mirror files while keeping incremental changes
# Version 0.7.5 released May 21, 2002
# Copyright (C) 2001, 2002 Ben Escoto <bescoto@stanford.edu>
#
# This program is licensed under the GNU General Public License (GPL).
# Distributions of rdiff-backup usually include a copy of the GPL in a
# file called COPYING. The GPL is also available online at
# http://www.gnu.org/copyleft/gpl.html.
#
# See http://www.stanford.edu/~bescoto/rdiff-backup for more
# information. Please send mail to me or the mailing list if you find
# bugs or have any suggestions.
from __future__ import nested_scopes, generators
import os, stat, time, sys, getopt, re, cPickle, types, shutil, sha, marshal, traceback, popen2, tempfile, gzip, UserList, errno
execfile("filename_mapping.py")
#######################################################################
#
# increment - Provides Inc class, which writes increment files
#
# This code is what writes files ending in .diff, .snapshot, etc.
#
class Inc:
"""Class containing increment functions"""
# This is a hack. _inc_file holds the dsrp of the latest
# increment file created, to be used in IncrementITR for
# statistics purposes. It should be given directly to the ITR
# object but there didn't seem to be a good way to pass it out.
_inc_file = None
def Increment_action(new, mirror, incpref):
"""Main file incrementing function, returns RobustAction
new is the file on the active partition,
mirror is the mirrored file from the last backup,
incpref is the prefix of the increment file.
This function basically moves mirror -> incpref.
"""
if not (new and new.lstat() or mirror.lstat()):
return Robust.null_action # Files deleted in meantime, do nothing
Log("Incrementing mirror file " + mirror.path, 5)
if ((new and new.isdir()) or mirror.isdir()) and not incpref.isdir():
incpref.mkdir()
if not mirror.lstat(): return Inc.makemissing_action(incpref)
elif mirror.isdir(): return Inc.makedir_action(mirror, incpref)
elif new.isreg() and mirror.isreg():
return Inc.makediff_action(new, mirror, incpref)
else: return Inc.makesnapshot_action(mirror, incpref)
def Increment(new, mirror, incpref):
Inc.Increment_action(new, mirror, incpref).execute()
def makemissing_action(incpref):
"""Signify that mirror file was missing"""
return RobustAction(lambda: None,
Inc.get_inc_ext(incpref, "missing").touch,
lambda exp: None)
def makesnapshot_action(mirror, incpref):
"""Copy mirror to incfile, since new is quite different"""
if (mirror.isreg() and Globals.compression and
not Globals.no_compression_regexp.match(mirror.path)):
snapshotrp = Inc.get_inc_ext(incpref, "snapshot.gz")
return Robust.copy_with_attribs_action(mirror, snapshotrp, 1)
else:
snapshotrp = Inc.get_inc_ext(incpref, "snapshot")
return Robust.copy_with_attribs_action(mirror, snapshotrp, None)
def makediff_action(new, mirror, incpref):
"""Make incfile which is a diff new -> mirror"""
if (Globals.compression and
not Globals.no_compression_regexp.match(mirror.path)):
diff = Inc.get_inc_ext(incpref, "diff.gz")
return Robust.chain([Rdiff.write_delta_action(new, mirror,
diff, 1),
Robust.copy_attribs_action(mirror, diff)])
else:
diff = Inc.get_inc_ext(incpref, "diff")
return Robust.chain([Rdiff.write_delta_action(new, mirror,
diff, None),
Robust.copy_attribs_action(mirror, diff)])
def makedir_action(mirrordir, incpref):
"""Make file indicating directory mirrordir has changed"""
dirsign = Inc.get_inc_ext(incpref, "dir")
def final():
dirsign.touch()
RPath.copy_attribs(mirrordir, dirsign)
return RobustAction(lambda: None, final, dirsign.delete)
def get_inc_ext(rp, typestr):
"""Return RPath/DSRPath like rp but with inc/time extension
If the file exists, then probably a previous backup has been
aborted. We then keep asking FindTime to get a time later
than the one that already has an inc file.
"""
def get_newinc(timestr):
"""Get new increment rp with given time suffix"""
addtostr = lambda s: "%s.%s.%s" % (s, timestr, typestr)
if rp.index:
incrp = rp.__class__(rp.conn, rp.base, rp.index[:-1] +
(addtostr(rp.index[-1]),))
else: incrp = rp.__class__(rp.conn, addtostr(rp.base), rp.index)
if Globals.quoting_enabled: incrp.quote_path()
return incrp
inctime = 0
while 1:
inctime = Resume.FindTime(rp.index, inctime)
incrp = get_newinc(Time.timetostring(inctime))
if not incrp.lstat(): break
Inc._inc_file = incrp
return incrp
MakeStatic(Inc)
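# Illustrative note (added; not in the original): get_inc_ext produces names
# of the form <name>.<timestring>.<typestr>, so a changed file "foo" might
# leave an increment named something like
# "foo.2002-05-21T12:00:00-07:00.diff.gz" -- note the colons in the time
# string, which is what the Windows FAQ entry above is about.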
class IncrementITR(IterTreeReducer):
    """Patch and increment iterator of increment triples

    This has to be an ITR because directories that have files in them
    changed are flagged with an increment marker.  There are four
    possibilities as to the order:

    1.  Normal file -> Normal file:  right away
    2.  Directory -> Directory:  wait until files in the directory
        are processed, as we won't know whether to add a marker
        until the end.
    3.  Normal file -> Directory:  right away, so later files will
        have a directory to go into.
    4.  Directory -> Normal file:  Wait until the end, so we can
        process all the files in the directory.

    Remember this object needs to be picklable.

    """
    directory, directory_replacement = None, None
    changed = None

    def __init__(self, inc_rpath):
        """Set inc_rpath, an rpath of the base of the tree"""
        self.inc_rpath = inc_rpath
        IterTreeReducer.__init__(self, inc_rpath)

    def start_process(self, index, diff_rorp, dsrp):
        """Initial processing of file

        diff_rorp is the RORPath of the diff from the remote side, and
        dsrp is the local file to be incremented

        """
        self.init_statistics(diff_rorp, dsrp)
        incpref = self.inc_rpath.new_index(index)
        if Globals.quoting_enabled: incpref.quote_path()
        if dsrp.isdir():
            self.init_dir(dsrp, diff_rorp, incpref)
            self.setvals(diff_rorp, dsrp, incpref)
        else: self.init_non_dir(dsrp, diff_rorp, incpref)

    def init_statistics(self, diff_rorp, dsrp):
        """Set initial values for various statistics

        These refer to the old mirror or to new increment files.  Note
        that changed_file_size could be bigger than total_file_size.
        The other statistic, increment_file_size, is set later when we
        have that information.

        """
        if dsrp.lstat():
            self.total_files = 1
            self.total_file_size = dsrp.getsize()
        else: self.total_files = self.total_file_size = 0

        if diff_rorp:
            self.changed_files = 1
            if dsrp.lstat(): self.changed_file_size = dsrp.getsize()
            else: self.changed_file_size = 0
        else: self.changed_files = self.changed_file_size = 0

        self.increment_file_size = 0

    def override_changed(self):
        """Set changed flag to true

        This is used only at the top level of a backup, to make sure
        that a marker is created recording every backup session.

        """
        self.changed = 1

    def setvals(self, diff_rorp, dsrp, incpref):
        """Record given values in state dict since in directory

        We don't do these earlier in case of a problem inside the
        init_* functions.  Index isn't given because it is done by the
        superclass.

        """
        self.directory = 1
        self.diff_rorp = diff_rorp
        self.dsrp = dsrp
        self.incpref = incpref

    def init_dir(self, dsrp, diff_rorp, incpref):
        """Process a directory (initial pass)

        If the directory is changing into a normal file, we need to
        save the normal file data in a temp file, and then create the
        real file once we are done with everything inside the
        directory.

        """
        if not (incpref.lstat() and incpref.isdir()): incpref.mkdir()
        if diff_rorp and diff_rorp.isreg() and diff_rorp.file:
            tf = TempFileManager.new(dsrp)
            RPathStatic.copy_with_attribs(diff_rorp, tf)
            tf.set_attached_filetype(diff_rorp.get_attached_filetype())
            self.directory_replacement = tf

    def init_non_dir(self, dsrp, diff_rorp, incpref):
        """Process a non directory file (initial pass)"""
        if not diff_rorp: return # no diff, so no change necessary

        if diff_rorp.isreg() and (dsrp.isreg() or diff_rorp.isflaglinked()):
            tf = TempFileManager.new(dsrp)
            def init_thunk():
                if diff_rorp.isflaglinked():
                    Hardlink.link_rp(diff_rorp, tf, dsrp)
                else: Rdiff.patch_with_attribs_action(dsrp, diff_rorp,
                                                      tf).execute()
                Inc.Increment_action(tf, dsrp, incpref).execute()
            Robust.make_tf_robustaction(init_thunk, (tf,), (dsrp,)).execute()
        else:
            Robust.chain([Inc.Increment_action(diff_rorp, dsrp, incpref),
                          RORPIter.patchonce_action(None, dsrp, diff_rorp)]
                         ).execute()

        self.increment_file_size += ((Inc._inc_file and Inc._inc_file.lstat()
                                      and Inc._inc_file.getsize()) or 0)
        self.changed = 1

    def end_process(self):
        """Do final work when leaving a tree (directory)"""
        if not self.directory: return
        diff_rorp, dsrp, incpref = self.diff_rorp, self.dsrp, self.incpref
        if not diff_rorp and not self.changed: return

        if self.directory_replacement:
            tf = self.directory_replacement
            Inc.Increment(tf, dsrp, incpref)
            RORPIter.patchonce_action(None, dsrp, tf).execute()
            tf.delete()
        else:
            Inc.Increment(diff_rorp, dsrp, incpref)
            if diff_rorp:
                RORPIter.patchonce_action(None, dsrp, diff_rorp).execute()

        self.increment_file_size += ((Inc._inc_file and Inc._inc_file.lstat()
                                      and Inc._inc_file.getsize()) or 0)
        self.write_statistics()

    def write_statistics(self):
        """Write the accumulated totals into file in inc directory"""
        if not self.incpref.isdir(): return # only write for directories
        statrp = Inc.get_inc_ext(self.incpref.append("directory_statistics"),
                                 "data")
        tf = TempFileManager.new(statrp)
        def init_thunk():
            fp = tf.open("w")
            fp.write("TotalFiles %d\n" % self.total_files)
            fp.write("TotalFileSize %d\n" % self.total_file_size)
            fp.write("ChangedFiles %d\n" % self.changed_files)
            fp.write("ChangedFileSize %d\n" % self.changed_file_size)
            fp.write("IncrementFileSize %d\n" % self.increment_file_size)
            fp.close()
        Robust.make_tf_robustaction(init_thunk, (tf,), (statrp,)).execute()

    def branch_process(self, subinstance):
        """Update statistics, and the has_changed flag if change in branch"""
        if subinstance.changed: self.changed = 1

        self.total_files += subinstance.total_files
        self.total_file_size += subinstance.total_file_size
        self.changed_files += subinstance.changed_files
        self.changed_file_size += subinstance.changed_file_size
        self.increment_file_size += subinstance.increment_file_size
execfile("ttime.py")
import cPickle
#######################################################################
#
# iterfile - Convert an iterator to a file object and vice-versa
#
class IterFileException(Exception): pass
class UnwrapFile:
"""Contains some basic methods for parsing a file containing an iter"""
def __init__(self, file):
self.file = file
def _s2l(self, s):
"""Convert string to long int"""
assert len(s) == 7
l = 0L
for i in range(7): l = l*256 + ord(s[i])
return l
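    # Illustrative note (added; not in the original): the length field is a
    # 7-byte big-endian integer, so for example
    #   _s2l("\x00\x00\x00\x00\x00\x01\x2c") == 300L
    # and FileWrappingIter._l2s below is its inverse.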
    def _get(self):
        """Return pair (type, data) next in line on the file

        type is a single character which is either "o" for object, "f"
        for file, "c" for a continuation of a file, or None if no more
        data can be read.  Data is either the file's data, if type is
        "c" or "f", or the actual object if the type is "o".

        """
        header = self.file.read(8)
        if not header: return None, None
        assert len(header) == 8, "Header is only %d bytes" % len(header)
        type, length = header[0], self._s2l(header[1:])
        buf = self.file.read(length)
        if type == "o": return type, cPickle.loads(buf)
        else: return type, buf


class IterWrappingFile(UnwrapFile):
    """An iterator generated from a file.

    Initialize with a file type object, and then it will return the
    elements of the file in order.

    """
    def __init__(self, file):
        UnwrapFile.__init__(self, file)
        self.currently_in_file = None

    def __iter__(self): return self

    def next(self):
        if self.currently_in_file:
            self.currently_in_file.close() # no error checking by this point
        type, data = self._get()
        if not type: raise StopIteration
        if type == "o": return data
        elif type == "f":
            file = IterVirtualFile(self, data)
            if data: self.currently_in_file = file
            else: self.currently_in_file = None
            return file
        else: raise IterFileException("Bad file type %s" % type)


class IterVirtualFile(UnwrapFile):
    """Another version of a pretend file

    This is returned by IterWrappingFile when a file is embedded in
    the main file that the IterWrappingFile is based around.

    """
    def __init__(self, iwf, initial_data):
        """Initializer

        initial_data is the data from the first block of the file.
        iwf is the iter wrapping file that spawned this
        IterVirtualFile.

        """
        UnwrapFile.__init__(self, iwf.file)
        self.iwf = iwf
        self.bufferlist = [initial_data]
        self.bufferlen = len(initial_data)
        self.closed = None

    def check_consistency(self):
        l = len("".join(self.bufferlist))
        assert l == self.bufferlen, \
               "Length of IVF bufferlist doesn't match (%s, %s)" % \
               (l, self.bufferlen)

    def read(self, length):
        assert not self.closed
        if self.iwf.currently_in_file:
            while length >= self.bufferlen:
                if not self.addtobuffer(): break

        real_len = min(length, self.bufferlen)
        combined_buffer = "".join(self.bufferlist)
        assert len(combined_buffer) == self.bufferlen, \
               (len(combined_buffer), self.bufferlen)
        self.bufferlist = [combined_buffer[real_len:]]
        self.bufferlen = self.bufferlen - real_len
        return combined_buffer[:real_len]

    def addtobuffer(self):
        """Read a chunk from the file and add it to the buffer"""
        assert self.iwf.currently_in_file
        type, data = self._get()
        assert type == "c", "Type is %s instead of c" % type
        if data:
            self.bufferlen = self.bufferlen + len(data)
            self.bufferlist.append(data)
            return 1
        else:
            self.iwf.currently_in_file = None
            return None

    def close(self):
        """Currently just reads what's left and discards it"""
        while self.iwf.currently_in_file:
            self.addtobuffer()
        self.bufferlist = []
        self.bufferlen = 0
        self.closed = 1


class FileWrappingIter:
    """A file interface wrapping around an iterator

    This is initialized with an iterator, and then converts it into a
    stream of characters.  The object will evaluate as little of the
    iterator as is necessary to provide the requested bytes.

    The actual file is a sequence of marshaled objects, each preceded
    by 8 bytes which identify the type of the following object and
    specify its length.  File objects are not marshaled, but the
    data is written in chunks of Globals.blocksize, and the following
    blocks can identify themselves as continuations.

    """
    def __init__(self, iter):
        """Initialize with iter"""
        self.iter = iter
        self.bufferlist = []
        self.bufferlen = 0L
        self.currently_in_file = None
        self.closed = None

    def read(self, length):
        """Return next length bytes in file"""
        assert not self.closed
        while self.bufferlen < length:
            if not self.addtobuffer(): break

        combined_buffer = "".join(self.bufferlist)
        assert len(combined_buffer) == self.bufferlen
        real_len = min(self.bufferlen, length)
        self.bufferlen = self.bufferlen - real_len
        self.bufferlist = [combined_buffer[real_len:]]
        return combined_buffer[:real_len]

    def addtobuffer(self):
        """Updates self.bufferlist and self.bufferlen, adding on a chunk

        Returns None if we have reached the end of the iterator,
        otherwise return true.

        """
        if self.currently_in_file:
            buf = "c" + self.addfromfile()
        else:
            try: currentobj = self.iter.next()
            except StopIteration: return None
            if hasattr(currentobj, "read") and hasattr(currentobj, "close"):
                self.currently_in_file = currentobj
                buf = "f" + self.addfromfile()
            else:
                pickle = cPickle.dumps(currentobj, 1)
                buf = "o" + self._l2s(len(pickle)) + pickle

        self.bufferlist.append(buf)
        self.bufferlen = self.bufferlen + len(buf)
        return 1

    def addfromfile(self):
        """Read a chunk from the current file and return it"""
        buf = self.currently_in_file.read(Globals.blocksize)
        if not buf:
            assert not self.currently_in_file.close()
            self.currently_in_file = None
        return self._l2s(len(buf)) + buf

    def _l2s(self, l):
        """Convert long int to string of 7 characters"""
        s = ""
        for i in range(7):
            l, remainder = divmod(l, 256)
            s = chr(remainder) + s
        assert remainder == 0
        return s

    def close(self): self.closed = 1


class BufferedRead:
    """Buffer the .read() calls to the given file

    This is used to lessen overhead and latency when a file is sent
    over a connection.

    """
    def __init__(self, file):
        self.file = file
        self.buffer = ""
        self.bufsize = Globals.conn_bufsize

    def read(self, l = -1):
        if l < 0: # Read as much as possible
            result = self.buffer + self.file.read()
            self.buffer = ""
            return result

        if len(self.buffer) < l: # Try to make buffer as long as l
            self.buffer += self.file.read(max(self.bufsize,
                                              l - len(self.buffer)))
        actual_size = min(l, len(self.buffer))
        result = self.buffer[:actual_size]
        self.buffer = self.buffer[actual_size:]
        return result

    def close(self): return self.file.close()
from __future__ import generators
execfile("static.py")
import os, stat, types

#######################################################################
#
# lazy - Define some lazy data structures and functions acting on them
#

class Iter:
    """Hold static methods for the manipulation of lazy iterators"""
    def filter(predicate, iterator):
        """Like filter in a lazy functional programming language"""
        for i in iterator:
            if predicate(i): yield i

    def map(function, iterator):
        """Like map in a lazy functional programming language"""
        for i in iterator: yield function(i)

    def foreach(function, iterator):
        """Run function on each element in iterator"""
        for i in iterator: function(i)

    def cat(*iters):
        """Lazily concatenate iterators"""
        for iter in iters:
            for i in iter: yield i

    def cat2(iter_of_iters):
        """Lazily concatenate iterators, iterated by big iterator"""
        for iter in iter_of_iters:
            for i in iter: yield i

    def empty(iter):
        """True if iterator has length 0"""
        for i in iter: return None
        return 1

    def equal(iter1, iter2, verbose = None, operator = lambda x, y: x == y):
        """True if iterator 1 has same elements as iterator 2

        Use equality operator, or == if it is unspecified.

        """
        for i1 in iter1:
            try: i2 = iter2.next()
            except StopIteration:
                if verbose: print "End when i1 = %s" % i1
                return None
            if not operator(i1, i2):
                if verbose: print "%s not equal to %s" % (i1, i2)
                return None
        try: i2 = iter2.next()
        except StopIteration: return 1
        if verbose: print "End when i2 = %s" % i2
        return None

    def Or(iter):
        """True if any element in iterator is true.  Short circuiting"""
        i = None
        for i in iter:
            if i: return i
        return i

    def And(iter):
        """True if all elements in iterator are true.  Short circuiting"""
        i = 1
        for i in iter:
            if not i: return i
        return i

    def len(iter):
        """Return length of iterator"""
        i = 0
        while 1:
            try: iter.next()
            except StopIteration: return i
            i = i+1

    def foldr(f, default, iter):
        """foldr the "fundamental list recursion operator"?"""
        try: next = iter.next()
        except StopIteration: return default
        return f(next, Iter.foldr(f, default, iter))

    def foldl(f, default, iter):
        """the fundamental list iteration operator.."""
        while 1:
            try: next = iter.next()
            except StopIteration: return default
            default = f(default, next)

    def multiplex(iter, num_of_forks, final_func = None, closing_func = None):
        """Split a single iterator into a number of streams

        The return val will be a list with length num_of_forks, each
        of which will be an iterator like iter.  final_func is the
        function that will be called on each element in iter just as
        it is being removed from the buffer.  closing_func is called
        when all the streams are finished.

        """
        if num_of_forks == 2 and not final_func and not closing_func:
            im2 = IterMultiplex2(iter)
            return (im2.yielda(), im2.yieldb())
        if not final_func: final_func = lambda i: None
        if not closing_func: closing_func = lambda: None

        # buffer is a list of elements that some iterators need and others
        # don't
        buffer = []

        # buffer[forkposition[i]] is the next element yielded by iterator
        # i.  If it is -1, yield from the original iter
        starting_forkposition = [-1] * num_of_forks
        forkposition = starting_forkposition[:]
        called_closing_func = [None]

        def get_next(fork_num):
            """Return the next element requested by fork_num"""
            if forkposition[fork_num] == -1:
                try: buffer.insert(0, iter.next())
                except StopIteration:
                    # call closing_func if necessary
                    if (forkposition == starting_forkposition and
                        not called_closing_func[0]):
                        closing_func()
                        called_closing_func[0] = 1 # record the call (the
                        # original reset this to None, allowing repeat calls)
                    raise StopIteration
                for i in range(num_of_forks): forkposition[i] += 1

            return_val = buffer[forkposition[fork_num]]
            forkposition[fork_num] -= 1

            blen = len(buffer)
            if not (blen-1) in forkposition:
                # Last position in buffer no longer needed
                assert forkposition[fork_num] == blen-2
                final_func(buffer[blen-1])
                del buffer[blen-1]
            return return_val

        def make_iterator(fork_num):
            while(1): yield get_next(fork_num)

        return tuple(map(make_iterator, range(num_of_forks)))

MakeStatic(Iter)
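# Illustrative examples (added; not in the original module):
#   Iter.equal(Iter.map(lambda x: x+1, iter([1, 2])), iter([2, 3]))  # => 1
#   Iter.And(Iter.map(lambda x: x > 0, iter([1, 2, 3])))             # => true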
class IterMultiplex2:
"""Multiplex an iterator into 2 parts
This is a special optimized case of the Iter.multiplex function,
used when there is no closing_func or final_func, and we only want
to split it into 2. By profiling, this is a time sensitive class.
"""
def __init__(self, iter):
self.a_leading_by = 0 # How many places a is ahead of b
self.buffer = []
self.iter = iter
def yielda(self):
"""Return first iterator"""
buf, iter = self.buffer, self.iter
while(1):
if self.a_leading_by >= 0: # a is in front, add new element
elem = iter.next() # exception will be passed
buf.append(elem)
else: elem = buf.pop(0) # b is in front, subtract an element
self.a_leading_by += 1
yield elem
def yieldb(self):
"""Return second iterator"""
buf, iter = self.buffer, self.iter
while(1):
if self.a_leading_by <= 0: # b is in front, add new element
elem = iter.next() # exception will be passed
buf.append(elem)
else: elem = buf.pop(0) # a is in front, subtract an element
self.a_leading_by -= 1
yield elem
class IterTreeReducer:
"""Tree style reducer object for iterator
The indicies of a RORPIter form a tree type structure. This class
can be used on each element of an iter in sequence and the result
will be as if the corresponding tree was reduced. This tries to
bridge the gap between the tree nature of directories, and the
iterator nature of the connection between hosts and the temporal
order in which the files are processed.
There are three stub functions below: start_process, end_process,
and branch_process. A class that subclasses this one should fill
in these functions with real values.
It is important that this class be pickable, so keep that in mind
when subclassing (this is used to resume failed sessions).
"""
def __init__(self, *args):
"""ITR initializer"""
self.init_args = args
self.index = None
self.subinstance = None
self.finished = None
def intree(self, index):
"""Return true if index is still in current tree"""
return self.base_index == index[:len(self.base_index)]
def set_subinstance(self):
"""Return subinstance of same type as self"""
self.subinstance = self.__class__(*self.init_args)
def process_w_subinstance(self, args):
"""Give object to subinstance, if necessary update branch_val"""
if not self.subinstance: self.set_subinstance()
if not self.subinstance(*args):
self.branch_process(self.subinstance)
self.set_subinstance()
assert self.subinstance(*args)
def start_process(self, *args):
"""Do some initial processing (stub)"""
pass
def end_process(self):
"""Do any final processing before leaving branch (stub)"""
pass
def branch_process(self, subinstance):
"""Process a branch right after it is finished (stub)"""
pass
def Finish(self):
"""Call at end of sequence to tie everything up"""
assert not self.finished, (self.base_index, self.index)
if self.subinstance:
self.subinstance.Finish()
self.branch_process(self.subinstance)
self.end_process()
self.finished = 1
def __call__(self, *args):
"""Process args, where args[0] is current position in iterator
Returns true if args successfully processed, false if index is
not in the current tree and thus the final result is
available.
Also note below we set self.index after doing the necessary
start processing, in case there is a crash in the middle.
"""
index = args[0]
assert type(index) is types.TupleType, type(index)
if self.index is None:
self.start_process(*args)
self.index = self.base_index = index
return 1
if index <= self.index:
Log("Warning: oldindex %s >= newindex %s" % (self.index, index), 2)
if not self.intree(index):
self.Finish()
return None
else:
self.process_w_subinstance(args)
self.index = index
return 1
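# Illustrative subclass, not part of the original source: an ITR that
# counts the entries under each directory.  It assumes each args[0] is
# a tuple index as described in the class docstring.
class CountingITR(IterTreeReducer):
    def start_process(self, index):
        self.total = 1  # count this branch's own entry
    def branch_process(self, subinstance):
        self.total = self.total + subinstance.total
    def end_process(self):
        Log("%d entries under %s" % (self.total, self.base_index), 6)
# Typical driving loop: create one instance, feed it every index in
# depth-first order, then call Finish():
#     itr = CountingITR()
#     for index in ((), ('a',), ('a', 'b'), ('c',)): itr(index)
#     itr.Finish()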
import time, sys, types, traceback
execfile("lazy.py")
#######################################################################
#
# log - Manage logging
#
class LoggerError(Exception): pass
class Logger:
"""All functions which deal with logging"""
def __init__(self):
self.log_file_open = None
self.log_file_local = None
self.verbosity = self.term_verbosity = 3
# termverbset is true if the term_verbosity has been explicitly set
self.termverbset = None
def setverbosity(self, verbosity_string):
"""Set verbosity levels. Takes a number string"""
try: self.verbosity = int(verbosity_string)
except ValueError:
Log.FatalError("Verbosity must be a number, received '%s' "
"instead." % verbosity_string)
if not self.termverbset: self.term_verbosity = self.verbosity
def setterm_verbosity(self, termverb_string):
"""Set verbosity to terminal. Takes a number string"""
try: self.term_verbosity = int(termverb_string)
except ValueError:
Log.FatalError("Terminal verbosity must be a number, received "
"'%s' insteaxd." % termverb_string)
self.termverbset = 1
def open_logfile(self, rpath):
"""Inform all connections of an open logfile.
rpath.conn will write to the file, and the others will pass
write commands off to it.
"""
assert not self.log_file_open
for conn in Globals.connections:
conn.Log.open_logfile_allconn(rpath.conn)
rpath.conn.Log.open_logfile_local(rpath)
def open_logfile_allconn(self, log_file_conn):
"""Run on all connections to signal log file is open"""
self.log_file_open = 1
self.log_file_conn = log_file_conn
def open_logfile_local(self, rpath):
"""Open logfile locally - should only be run on one connection"""
assert self.log_file_conn is Globals.local_connection
self.log_file_local = 1
self.logrp = rpath
self.logfp = rpath.open("a")
def close_logfile(self):
"""Close logfile and inform all connections"""
if self.log_file_open:
for conn in Globals.connections:
conn.Log.close_logfile_allconn()
self.log_file_conn.Log.close_logfile_local()
def close_logfile_allconn(self):
"""Run on every connection"""
self.log_file_open = None
def close_logfile_local(self):
"""Run by logging connection - close logfile"""
assert self.log_file_conn is Globals.local_connection
assert not self.logfp.close()
self.log_file_local = None
def format(self, message, verbosity):
"""Format the message, possibly adding date information"""
if verbosity < 9: return message + "\n"
else: return "%s %s\n" % (time.asctime(time.localtime(time.time())),
message)
def __call__(self, message, verbosity):
"""Log message that has verbosity importance"""
if verbosity <= self.verbosity: self.log_to_file(message)
if verbosity <= self.term_verbosity:
self.log_to_term(message, verbosity)
def log_to_file(self, message):
"""Write the message to the log file, if possible"""
if self.log_file_open:
if self.log_file_local:
self.logfp.write(self.format(message, self.verbosity))
else: self.log_file_conn.Log.log_to_file(message)
def log_to_term(self, message, verbosity):
"""Write message to stdout/stderr"""
if verbosity <= 2 or Globals.server: termfp = sys.stderr
else: termfp = sys.stdout
termfp.write(self.format(message, self.term_verbosity))
def conn(self, direction, result, req_num):
"""Log some data on the connection
The main worry with this function is that something in here
will create more network traffic, which will spiral to
infinite regress. So, for instance, logging must only be done
to the terminal, because otherwise the log file may be remote.
"""
if self.term_verbosity < 9: return
if type(result) is types.StringType: result_repr = repr(result)
else: result_repr = str(result)
if Globals.server: conn_str = "Server"
else: conn_str = "Client"
self.log_to_term("%s %s (%d): %s" %
(conn_str, direction, req_num, result_repr), 9)
def FatalError(self, message):
self("Fatal Error: " + message, 1)
Globals.Main.cleanup()
sys.exit(1)
def exception(self, only_terminal = 0, verbosity = 4):
"""Log an exception and traceback
If only_terminal is 0, log normally. If it is 1, then only
log to disk if a log file is open (self.log_file_open = 1). If
it is 2, don't log to disk at all.
"""
assert only_terminal in (0, 1, 2)
if (only_terminal == 0 or
(only_terminal == 1 and self.log_file_open)):
logging_func = self.__call__
else: logging_func = self.log_to_term
exc_info = sys.exc_info()
logging_func("Exception %s raised of class %s" %
(exc_info[1], exc_info[0]), verbosity)
logging_func("".join(traceback.format_tb(exc_info[2])), verbosity+1)
Log = Logger()
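# Illustrative sketch, not part of the original source: typical use of
# the global Log object defined above.
def _logging_example():
    Log.setverbosity("5")                   # takes a number string
    Log("Processing changed file foo", 5)   # shown at verbosity 5 and up
    Log.FatalError("disk full")             # logs at level 1 and exits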
execfile("restore.py")
#######################################################################
#
# manage - list, delete, and otherwise manage increments
#
class ManageException(Exception): pass
class Manage:
def get_incobjs(datadir):
"""Return Increments objects given the rdiff-backup data directory"""
return map(IncObj, Manage.find_incrps_with_base(datadir, "increments"))
def get_file_type(rp):
"""Returns one of "regular", "directory", "missing", or "special"."""
if not rp.lstat(): return "missing"
elif rp.isdir(): return "directory"
elif rp.isreg(): return "regular"
else: return "special"
def get_inc_type(inc):
"""Return file type increment represents"""
assert inc.isincfile()
type = inc.getinctype()
if type == "dir": return "directory"
elif type == "diff": return "regular"
elif type == "missing": return "missing"
elif type == "snapshot": return Manage.get_file_type(inc)
else: assert None, "Unknown type %s" % (type,)
def describe_incs_parsable(incs, mirror_time, mirrorrp):
"""Return a string parsable by computer describing the increments
Each line is a time in seconds of the increment, and then the
type of the file. It will be sorted oldest to newest. For example:
10000 regular
20000 directory
30000 special
40000 missing
50000 regular <- last will be the current mirror
"""
incpairs = [(Time.stringtotime(inc.getinctime()), inc) for inc in incs]
incpairs.sort()
result = ["%s %s" % (time, Manage.get_inc_type(inc))
for time, inc in incpairs]
result.append("%s %s" % (mirror_time, Manage.get_file_type(mirrorrp)))
return "\n".join(result)
def describe_incs_human(incs, mirror_time, mirrorrp):
"""Return a string describing all the the root increments"""
incpairs = [(Time.stringtotime(inc.getinctime()), inc) for inc in incs]
incpairs.sort()
result = ["Found %d increments:" % len(incpairs)]
for time, inc in incpairs:
result.append(" %s %s" %
(inc.dirsplit()[1], Time.timetopretty(time)))
result.append("Current mirror: %s" % Time.timetopretty(mirror_time))
return "\n".join(result)
def delete_earlier_than(baserp, time):
"""Deleting increments older than time in directory baserp
time is in seconds. It will then delete any empty directories
in the tree. To process the entire backup area, the
rdiff-backup-data directory should be the root of the tree.
"""
baserp.conn.Manage.delete_earlier_than_local(baserp, time)
def delete_earlier_than_local(baserp, time):
"""Like delete_earlier_than, but run on local connection for speed"""
assert baserp.conn is Globals.local_connection
def yield_files(rp):
yield rp
if rp.isdir():
for filename in rp.listdir():
for sub_rp in yield_files(rp.append(filename)):
yield sub_rp
for rp in yield_files(baserp):
if ((rp.isincfile() and
Time.stringtotime(rp.getinctime()) < time) or
(rp.isdir() and not rp.listdir())):
Log("Deleting increment file %s" % rp.path, 5)
rp.delete()
MakeStatic(Manage)
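# The nested yield_files in delete_earlier_than_local above uses the
# pre-Python-2.3 idiom for recursive generators: each level explicitly
# re-yields its children's elements.  A standalone sketch of the same
# pattern (illustrative, not part of the original source):
def _walk_tree(name, children):
    """Yield name and then, depth-first, everything below it"""
    yield name
    for child in children.get(name, []):
        for sub in _walk_tree(child, children):
            yield sub
# e.g. list(_walk_tree('/', {'/': ['a', 'b'], 'a': ['a/x']}))
# yields ['/', 'a', 'a/x', 'b']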
class IncObj:
"""Increment object - represent a completed increment"""
def __init__(self, incrp):
"""IncObj initializer
incrp is an RPath of a path like increments.TIMESTR.dir
standing for the root of the increment.
"""
if not incrp.isincfile():
raise ManageException("%s is not an inc file" % incrp.path)
self.incrp = incrp
self.time = Time.stringtotime(incrp.getinctime())
def getbaserp(self):
"""Return rp of the incrp without extensions"""
return self.incrp.getincbase()
def pretty_time(self):
"""Return a formatted version of inc's time"""
return Time.timetopretty(self.time)
def full_description(self):
"""Return string describing increment"""
s = ["Increment file %s" % self.incrp.path,
"Date: %s" % self.pretty_time()]
return "\n".join(s)
#!/usr/bin/env python
"""Run rdiff-backup with profiling on
Same as rdiff-backup but runs profiler, and prints profiling
statistics afterwards.
"""
__no_execute__ = 1
execfile("main.py")
import profile, pstats
profile.run("Globals.Main.Main(%s)" % repr(sys.argv[1:]), "profile-output")
p = pstats.Stats("profile-output")
p.sort_stats('time')
p.print_stats(20)
p.sort_stats('cumulative')
p.print_stats(20)
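# Usage sketch: invoke this script with the same arguments you would
# give rdiff-backup; when the run completes, the 20 most expensive
# functions (first by internal time, then by cumulative time) are
# printed from the saved "profile-output" file.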
#!/bin/sh
# This script will create the testing/restoretest3 directory as it
# needs to be for one of the tests in restoretest.py to work.
rm -rf testfiles/restoretest3
rdiff-backup --current-time 10000 testfiles/increment1 testfiles/restoretest3
rdiff-backup --current-time 20000 testfiles/increment2 testfiles/restoretest3
rdiff-backup --current-time 30000 testfiles/increment3 testfiles/restoretest3
rdiff-backup --current-time 40000 testfiles/increment4 testfiles/restoretest3