New in v0.12.2 (2003/07/24)
---------------------------
Patch by Arkadiusz Patyk fixes building with Python 2.3c1.
Restore of archives made by 0.10.x and earlier fixed, although hard
link information is not restored unless it is current in the mirror.
(Bug reported by Jeff Lessem.)
Fixed a problem with door files locally when the repository is remote.
(Reported by Robert Weber.)
New in v0.12.1 (2003/07/22)
---------------------------
Added --no-change-dir-inc-perms switch, to avoid some weird errors on
FreeBSD, and possibly on Solaris. Thanks to Troels Arvin for the
report.
Fixed a bug when regressing a destination directory made with
--windows-mode. Reported by Tucker Sylvestro.
The librsync blocksize is now chosen based on filesize. This should
make operations on large files faster (in some cases, orders of
magnitude faster). Thanks to Ty! Boyack for bringing this issue to my
attention.
New in v0.12.0 (2003/06/26)
---------------------------
Fixed (?) a bug that caused a crash when a file changed type from a
regular file in the middle of a download (reported by Ty! Boyack).
Failure to construct a regular file during regression/restoration now
causes only a warning, not a fatal error.
Removed --exclude-mirror option. (Probably no one uses this, and it
adds clutter.)
--include and --exclude options should work now with restores, with
some speed penalty.
New in v0.11.5 (2003/06/20)
---------------------------
Added EDEADLOCK to the list of skippable errors. (Thanks to Dave
Kempe for report.)
Added --list-at-time option at request of Farkas Levente.
Various fixes for backing up onto windows directories. Thanks to
Keith Edmunds for bug reports and testing.
Fixed possible crash when a file would be deleted while being
processed (reported by Robert Weber).
Better handling of cases where there are two files with the same name
in the same directory.
Added --windows-restore switch, for use when restoring from a
windows-style file system to a normal one. Use --windows-mode when
backing up.
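For illustration only (paths and hostname are hypothetical, and the
exact combination with -r is an assumption; see the man page), a
backup might be made with --windows-mode and later restored with
--windows-restore:
  rdiff-backup --windows-mode /home/user jones@winhost::/backup
  rdiff-backup --windows-restore -r now jones@winhost::/backup/foo foo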
Scott Bender's patch fixes backing up hard links when first linked
file is quoted.
New in v0.11.4 (2003/03/15)
---------------------------
Fixed a bug incrementing sockets whose filenames were pretty long, but
not super long. Reported by Olivier Mueller.
Added Albert Chin-A-Young's patch to add a few options to the setup.py
install script.
Apparently fixed rare utime type bug. Thanks to Christian Skarby for
report and testing.
Added detailed file_statistics (in addition to session_statistics) as
requested by Dean Gaudet. Disable with --no-file-statistics option.
Minor speed enhancements.
New in v0.11.3 (2003/03/04)
---------------------------
Fixed a number of bugs reported by Olivier Mueller:
Brought some old parts of the man page up-to-date.
Fixed a bug that occurred if there was an unrecoverable error on a
second backup to a directory.
Fixed spurious error message that could appear after a successful
backup.
--print-statistics option works again (before it would be silently
ignored).
Fixed cache pipeline overflow bug. This error could appear on
large remote backups when many files have not changed.
New in v0.11.2 (2003/03/01)
---------------------------
Fixed seg fault bug reported by a couple of sparc/openbsd users. Thanks
to Dave Steinberg for giving me an account on his system for testing.
Re-enabled --windows-mode and filename quoting.
Fixed selection bug: In 0.11.1, files which were included in one
backup would be automatically included in the next. Now you can
include/exclude files session-by-session.
Fixed ownership compare bug: In 0.11.1, backups where the destination
side was not root would preserve ownership information by recording it
in the metadata file. However, mere ownership changes would not
trigger creation of new increments. This has been fixed.
Added the --no-inode-compare switch. You probably don't need to use
it though.
If a special file cannot be created on the destination side, a 0
length regular file will be written instead as a placeholder.
(Restores should work fine because of the metadata file.)
Yet another error handling strategy (hopefully this is the last one
for a while, because this stuff isn't very exciting, and takes a long
time to write):
All recoverable errors are classified into one of three groups:
ListErrors, UpdateErrors, and SpecialFileErrors. rdiff-backup's
reaction to each error is more formally defined (see the error
policy page, currently at
http://rdiff-backup.stanford.edu/error_policy.html).
rdiff-backup makes no attempt to recover or clean up after
unrecoverable errors.
However, it now uses fsync() to increment the destination
directory in a reversible way. If there is an error, the next
backup will regress the destination directory into its state
before the aborted backup.
The above process can be done without a backup with the
--check-destination-dir option.
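For example (path hypothetical), the destination directory can be
checked, and regressed if necessary, with:
  rdiff-backup --check-destination-dir /backup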
Improved error logging. Instead of the old haphazard reporting
method, which sometimes didn't indicate the file an error occurred on,
now all recoverable errors are reported in a standard format and also
written to the error_log.<time>.data file in the rdiff-backup-data
directory. Thanks to Dean Gaudet and others for repeatedly bugging me
about this.
New in v0.11.1 (2002/12/31)
---------------------------
**Warning** Various features have been removed from this version, so
this is not a safe upgrade. Also this version has less error
checking, and, if it crashes, this version may be more prone to leave
the destination directory in an inconsistent state. I plan to look at
these issues in the next version. Also, this version is quite
different from previous ones, so you cannot run version 0.11.1 on one
end of a connection and any previous version on the other side.
The following features have been removed:
--mirror-only option: If you just want to mirror something, use
rsync. (Or you could use rdiff-backup and then just delete the
rdiff-backup-data directory, and then update the root mtime.)
--change-source-perms option: This feature was pretty complicated
to implement, and if something happened to rdiff-backup during a
transfer, the old permissions could not be restored.
All "resume" related functionality, like --checkpoint-interval:
This was complicated to implement, and didn't seem to work all
that well.
Directory statistics file: Although the session statistics file is
still generated, the directory statistics file no longer is,
because the new code structure makes it less convenient.
The various --exclude and --include options no longer work when
restoring. This may be added later if there is demand.
--windows-mode and filename quoting don't work. There have been
several requests for this in the past, so it will probably be
re-added in the next version.
Extensive refactoring. A lot of rdiff-backup's code was structured as
if it were still in one file, so it didn't make enough use of Python's
module system.
Now rdiff-backup writes metadata (uid, gid, mtime, etc.) to a
compressed text file in the rdiff-backup-data directory. Here are
some ramifications:
A user does not need root access on the destination side to record
file ownership information.
Some files may be recognized as not having changed based on this
metadata, so it may not be necessary to traverse the whole mirror
directory. This can reduce file access on the destination side.
Even when the --no-hard-links option is given when backing up,
link relationships can be restored properly. However, if this
option is given, mirror files will not be linked together.
Special file types like device and sockets which cannot be created
on the remote side for some reason can still be backed up and
restored properly.
Fixed bug with the --{include|exclude}-globbing-filelist options
(reported by Claus Herwig).
Added --list-changed-since option to list the files changed since the
given date, and added Bud Bruegger's patch to that. The format and
information this option provides will probably change in the near
future.
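A sketch of the usage (the path and the time string are hypothetical,
and the accepted time formats are an assumption):
  rdiff-backup --list-changed-since 2W /backup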
Restoring is now pipelined for better high latency performance, and
unchanged files in the target directory will not be recopied.
New in v0.11.0 (2002/10/05)
---------------------------
If rdiff-backup gets a socket error from trying to create a socket
whose name is too long, it now just skips the file instead of exiting
with an error (bug report by Ivo De Decker).
Added --exclude-special-files switch, which excludes fifos, symlinks,
sockets, and device files.
--windows-mode is now short for --windows-time-format --chars-to-quote
A-Z: --no-hard-links --exclude-special-files. Thanks to Paul-Erik
Törrönen for some helpful windows info.
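In other words, these two invocations should be equivalent (paths are
hypothetical; the quoting is only for the shell):
  rdiff-backup --windows-mode /home winhost::/backup
  rdiff-backup --windows-time-format --chars-to-quote 'A-Z:' \
    --no-hard-links --exclude-special-files /home winhost::/backup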
Multiple --include and --exclude statements can now be given in a
single file. See the documentation on
--{include|exclude}-globbing-filelist. Thanks to Henrik Lewander for
pointing out that command line length could otherwise be a problem.
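As a sketch (the filename and patterns are made up; see the man page
for the exact syntax), a globbing filelist can mix '+ ' (include) and
'- ' (exclude) lines:
  + /home/user1
  - /home
and is then passed in one option:
  rdiff-backup --include-globbing-filelist mylist / /backup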
Fixed bug in filelist processing that ignored leading or trailing
whitespace in filelists. Now filenames with, for instance, trailing
spaces can be used in filelists. Filelists which took advantage of
this bug for formatting may have to be edited.
Applied major/minor patch contributed by David S. rdiff-backup should
now correctly copy device files on platforms such as NetBSD.
It is now possible to restore from a read-only filesystem (before
rdiff-backup would fail when trying to open log file). Thanks to
Gregor Zattler for bug report.
Fixed bug that prevented certain restores when the source directory
was specified with a trailing backslash.
Added a bit more logging so it should be apparent which file was being
processed when an error occurs (thanks to Gerd Knops for suggestion).
Fixed a bug when using --chars-to-quote and a directory containing
quoted characters was deleted.
New in v0.10.1 (2002/09/16)
---------------------------
rdiff-backup should now correctly handle files larger than 2GB.
Thanks to Russ Allbery for telling me how to do this.
New in v0.10.0 (2002/09/10)
---------------------------
Fixed bug, probably introduced in 0.9.3, which prevented restores from
a local source to a remote destination. Reported by Phillip Eby.
Fixed another bug reported by Phillip Eby, where restores would fail
if rdiff-backup had only been run once and no increments were
available.
A few man page additions regarding restoring, statistics, and
--test-server (thanks to Gregor Zattler, Christopher Schanzle, and
Tobias Polzin for suggestions).
Fixed comparison bug where rdiff-backup would unnecessarily report a
directory as changed when its source size differed from its mirror
size. Thanks to Tim Allen for report.
New in v0.9.5 (2002/08/09)
--------------------------
Fixed --verbosity option (now both -v and --verbosity work). Thanks
to Chris Dumont for report.
****** IMPORTANT ****** Fixed serious permissions bug found by Robert
Weber. Previous versions in the 0.9.x branch would throw away high
bit permissions (like the setuid and setgid bits). This would be
especially bad when running with the --change-source-perms operation.
Anyone running 0.9.0 - 0.9.4 should upgrade immediately.
Complain about --change-source-perms when running as root, as this
option should not be necessary then.
Fixed bug with --windows-mode. Thanks to Chris Grindstaff for report.
New in v0.9.4 (2002/07/24)
--------------------------
Man page now correctly included in rpm.
To prevent confusion, rdiff-backup script does not have exec
permissions until it is installed (thanks Jason Piterak).
Sockets are now replicated. Why not? (Suggestion by Mickey Everts)
Bad resuming information (because, say, it is left over from a
previous version) should no longer cause exit, except when --resume is
specified.
Better error handling in certain cases when errors occur in file reads
(thanks to John Goerzen for report).
New in v0.9.3 (2002/07/15)
--------------------------
Added --sleep-ratio option after hearing that rdiff-backup was too
hard on hard disks (thanks to Steve Alexander for the suggestion).
Quick example: --sleep-ratio 0.25 makes rdiff-backup sleep about 25%
of the time. Maybe this will help with bandwidth usage also.
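A full invocation might look like this (paths hypothetical):
  rdiff-backup --sleep-ratio 0.25 /home /backup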
Fixed -m/--mirror-only option.
Added --exclude-other-filesystems option. Thanks to Paul Wouters for
the suggestion.
Added convenience field TotalDestinationSizeChange (total change in
the destination directory: mirror change + increments change) to the
session_statistics file.
Handle a particular situation better where a file changes in a certain
way while rdiff-backup is processing it. Before rdiff-backup would
just crash; now it skips the file. Thanks to Scott Bender for the bug
report.
A couple of interface fixes to --remove-older-than.
Added some security features to the protocol, so rdiff-backup now
restricts which commands remote connections may run. The extra security will
be enabled automatically on the client (it knows what to expect), but
the extra switches --restrict, --restrict-update-only, and
--restrict-read-only have been added for use with --server.
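For instance, a server that should only accept backups into /backup
might be started like this (a hedged example; see the man page for the
exact semantics of the restrict options):
  rdiff-backup --server --restrict-update-only /backup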
New in v0.9.2 (2002/06/27)
--------------------------
Interface directly with librsync(.a|.so) instead of running "rdiff"
command line utility. This can save significant fork()ing time when
processing lots of smallish files that have changed. Also, rdiff is
no longer required to be in the PATH.
Further speed optimizations, mostly reducing CPU consumption when
scanning through unchanged files.
Fixed a Path bug which could cause globbing and regexp include/exclude
statements to malfunction when the base of the source directory was
"/" (root of filesystem). Thanks to Vlastimil Adamovsky for noting
this bug.
Added quoting for spaces in directory_statistics file, hopefully
making it easier to parse.
New in v0.9.1 (2002/06/19)
--------------------------
Fixed some bad C. Besides being unportable and leaking memory, it may
have led to someone's backup directory getting deleted (?).
Tweaked some error recovery code to make it more like 0.8.0.
Improved the installation a bit.
New in v0.9.0 (2002/06/17)
--------------------------
Changed lots of the code to distribute rdiff-backup as a standard
python package instead of a single script. The installation procedure
is also different.
Speed optimizations - the average user might see a speed increase of
2x or more.
New in v0.8.0 (2002/06/14)
--------------------------
Added --null-separator argument so filenames can safely include
newlines in an include/exclude filelist.
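For example (directories are made up), a null-separated list from find
can be fed straight in:
  find /home -name core -print0 | \
    rdiff-backup --null-separator --exclude-filelist-stdin /home /backup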
Fixed bug that affected restoring from current mirror with the '-r
now' option.
New in v0.7.6 (2002/05/31)
--------------------------
Improved statistics support, and added --print-statistics and
--calculate-average switches. See the directory_statistics and
session_statistics files in the rdiff-backup-data directory.
Major improvements to error correction and resuming.
Now signals SIGQUIT, SIGHUP, and SIGTERM are caught to exit more
gracefully.
Fixed crankiness when --exclude-filelist is the last exclude option
and it is given an empty file (thanks to Bryce C for report).
New in v0.7.5 (2002/05/21)
--------------------------
Fixed resuming bug.
After a bit of empirical testing, increased Globals.conn_bufsize and
enabled ssh compression by default (and also added
--ssh-no-compression option). This should speed up the "typical"
remote session.
Fixed bug noticed by Dean Gaudet in processing of
--(include|exclude)-filelist[-stdin] options when source directory was
remote.
Fixed --include error reporting bug reported by Ben Edwards.
Small change so 'door' files and other unknown file types will be
ignored. (Thanks to Steve Simitzis for sending in a patch for this.)
Fixed bug noticed by Dean Gaudet where, unless the
--change-source-perms option is specified, rdiff-backup wouldn't even
attempt to open files lacking ownership permissions.
New in v0.7.4 (2002/05/11)
--------------------------
Added new restore syntax and corresponding -r and --restore-as-of
options. For instance, "rdiff-backup -r 1/3/2002 /backup/foo out"
will try to restore /backup/foo (a file on the mirror directory) to
out, as it was January 3rd, 2002. See man page for more information.
directory_statistics.<time>.data files will now be created in the
directories underneath rdiff-backup-data/increments. Just look at one
to see what's inside.
Added extra options --chars-to-quote, --quoting-char, and
--windows-mode, mostly to allow files whose names have colons (:) in
them to be backed up to windows machines.
Now the -l and --list-increments switches can list the increments
corresponding to any mirror file, not just the root directory. Also
the option --parsable-output was added to control whether the
--list-increments output looks better for a human, or computer.
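For example (path hypothetical):
  rdiff-backup --list-increments /backup/some/file
  rdiff-backup --parsable-output --list-increments /backup/some/file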
Improved --remove-older-than handling so it should run approximately
as fast locally and remotely.
Probably fixed a bug noticed by Erminio Baranzini which caused
rdiff-backup to try to preserve access times unnecessarily (the
default is not to preserve access times).
Rewrote a few large chunks of code for clarity and simplicity.
Allow extended time strings for the --remove-older-than option.
Added RESTORING section to the manual page because there seemed to be
some general confusion about this.
hardlink_data, current_mirror, and a few other files now carry the
.data extension (instead of .snapshot), to make it clearer they are
not copies of source files.
New in v0.7.3 (2002/04/29)
--------------------------
Fixed broken remote operation in v0.7.2 by applying (a variant of)
Daniel Robbins' patch. Also fixed associated bug in test set.
Fixed bug recognizing --[include|exclude]-filelist-stdin options, and
IndexError bug reading some filelists.
--force is no longer necessary if the target directory is empty.
--include/--exclude/etc now work for restoring as they do for backing up.
Raised verbosity level for traceback output - if long log error
messages are annoying you, set verbosity to 2. Will come up with a
better logging system later.
May have fixed a problem encountered by Matthew Farrellee and Kevin
Spicer wherein the _session_info_list information was stored on the
wrong computer. This could cause rdiff-backup to fail when running
after another backup that failed for a different reason. May backport
this fix to 0.6.0 later.
May have fixed a problem also noticed by Matthew Farrellee which can
cause rdiff-backup to exit when a directory changes into a
non-directory file while rdiff-backup is processing the directory.
(May also apply to 0.6.0).
Fixed a bug noticed by Jamie Heilman where restoring could fail if a
recent rdiff-backup process which produced the backup set was aborted
while processing a new directory. (May also apply to 0.6.0)
New in v0.7.2 (2002/04/11)
--------------------------
Added new selection options --exclude-filelist,
--exclude-filelist-stdin, --exclude-regexp, --include-filelist,
--include-filelist-stdin, --include-regexp.
*** WARNING *** the --include and --exclude options have changed. The
new --include-regexp and --exclude-regexp are close to, but still
different from the old --include and --exclude options. See the man
page for details.
Friendlier error reporting when remote connection doesn't start.
New in v0.7.1 (2002/03/25)
--------------------------
Now by default .snapshot and .diff increments are compressed with
python's internal gzip. The new increments format is backwards
compatible, but only rdiff-backup >0.7.1 will be able to restore if
any gzipped increments are present.
Added --no-compression and --no-compression-regexp to control which
files are compressed.
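For instance, to avoid recompressing already-compressed files, one
might run (the regexp is only an illustration):
  rdiff-backup --no-compression-regexp '.*\.(gz|bz2|zip)$' /home /backup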
New in v0.7.0 (2002/03/21)
--------------------------
Added hard link support. This is now the default, but can be turned
off with --no-hard-links.
Clarified a bit of the manual.
May have fixed a bug with remote handling of device files.
New in v0.6.0 (2002/03/14)
--------------------------
Fixed some assorted manual "bugs".
Fixed endless loop bug in certain error recovery situation reported by
Nick Duffek, and slightly changed around some other error correction
code.
Switching to new version numbering system: versions x.2n+1.x are
unstable, versions x.2n.x are supposed to be more stable.
New in v0.5.4 (2002/03/06)
--------------------------
Fixed bug present since 0.5.0 wherein rdiff-backup would make
snapshots instead of diffs when regular files change.
May have fixed race condition involving rdiff execution.
New in v0.5.3 (2002/03/03)
--------------------------
It turns out the previous version broke device handling. Sorry about
that.
New in v0.5.2 (2002/03/02)
--------------------------
Fixed bugs which made rdiff-backup try to preserve mod times when it
wasn't necessary, and exit instead of warning when it wasn't being run
as root and found a file it didn't own. (Reported by Alberto
Accomazi.)
Added some more error checking; maybe this will fix a bug reported by
John Goerzen wherein rdiff-backup can crash if file is deleted while
rdiff-backup is processing it.
Changed locations of some of the temp files; filenames will be
determined by the tempfile module.
New in v0.5.1 (2002/02/22)
--------------------------
When establishing a connection, print a warning if the server version
is different from the client version.
When rdiff returns error value 256, tell the user that it is probably
because rdiff couldn't be found in the path.
Fixed a serious bug that can apparently cause remote backups to fail
(reported by John Goerzen).
May have fixed a bug that causes recovery from certain errors to fail.
New in v0.5.0 (2002/02/17)
--------------------------
Now every so often (default is 20 seconds, the --checkpoint-interval
option controls it) rdiff-backup checkpoints by dumping its state to
temporary files in the rdiff-backup-data directory. If rdiff-backup
is rerun with the same destination directory, it can either try to
resume the previous backup or at least clean things up so the archive
is consistent and accurate.
Added new options --resume, --no-resume, and --resume-interval, which
control when rdiff-backup tries to resume a previous failed backup.
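A hypothetical invocation combining these options (paths made up):
  rdiff-backup --checkpoint-interval 60 --resume /home /backup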
Fixed a bug with the --exclude-device-files option which caused the
option to be ignored when the source directory was remote.
By default, if rdiff-backup encounters a certain kind of IOError
(currently types 26 and 5) while trying to access a file, it logs the
error, skips the file, and tries to continue.
If settings requiring an integer argument (like -v or
--checkpoint-interval) are given a bad (non-integer) argument,
rdiff-backup now fails with a better explanation.
Fixed annoying logging bug. Now no matter which computer a logging
message originates on, it should be routed to the process which is
writing to the logging file, and written correctly. However, logging
messages about network traffic will not be routed, as this will
generate more traffic and lead to an infinite regress.
When calling rdiff, uses popen2.Popen3 and os.spawnvp instead of
os.popen and os.system. This should make rdiff-backup more secure.
Thanks to Jamie Heilman for the suggestion.
Instead of calling the external shell command 'stat', rdiff-backup
uses os.lstat().st_rdev to determine a device file's major and minor
numbers. The new method should be more portable. Thanks to Jamie
Heilman for the suggestion.
All the file operations were examined and tweaked to try to
minimize/eliminate the chance of leaving the backup directory in an
inconsistent state.
Upon catchable kinds of errors, try to checkpoint before exiting so
later rdiff-backup processes have more information to work with.
At the suggestion of Jason Piterak, added a --windows-time-format
option so rdiff-backup will (perhaps) work under MS windows NT.
New in v0.4.4 (2002/01/09)
--------------------------
Applied Berkan Eskikaya's "xmas patch" (I was travelling and didn't
have a chance on Christmas). He fixed important bugs in the
--terminal-verbosity and --remove-older-than options.
Added an --exclude-device-files option, which makes rdiff-backup skip
any device files in the same way it skips files selected with the
--exclude option.
New in v0.4.3 (2001/12/17)
--------------------------
Plugged another memory hole. At first I thought it might have been
python's fault, but it was all me. If rdiff-backup uses more than a
few megabytes of memory, tell me because it is probably another memory
hole.
rdiff-backup is now a bit more careful about deleting temporary files
it creates when it is done with them.
Changed the rpm spec a little. The enclosed man page is gzipped and
the package file is GPG signed (it can be checked with, for example,
"rpm --checksig -v rdiff-backup-0.4.3-1.noarch.rpm").
rdiff-backup no longer checks the mtimes or atimes of device files.
Use of these times was inconsistent (sometimes writing to device files
updates their times, sometimes not) and led to unnecessary backing up
of files.
New in v0.4.2 (2001/11/19)
--------------------------
Significant speed increases (maybe 20% for local sessions) when
dealing with directories that do not need to be updated much.
Fixed memory leak. rdiff-backup should now run in almost constant
memory (about 6MB on my system).
Enabled buffering of object transfers, so remote sessions can be
50-100%+ faster.
rdiff-backup now thinks it is running as root if the destination
connection is root. Thus rdiff-backup will preserve ownership even if
it is not running as root on the source end.
If you abort rdiff-backup or it fails for some reason, it is now more
robust about recovering the next time it is run (before it could fail
in ways which made subsequent sessions fail also). However, it is
still not a good idea to abort, as individual files could be in the
process of being written and could get corrupted.
If rdiff-backup encounters an unreadable file (or, if
--change-source-perms is given, a file whose permissions it cannot
change), it will log a warning, ignore the file, and continue, instead
of exiting with an error.
New in v0.4.1 (2001/11/9)
-------------------------
Now either the source, or the target, or both can be remote. To make
this less confusing, now rdiff-backup supports host::file notation.
So it is legal to run:
rdiff-backup bill@host1.net::source_file jones@host2.net::target
Also, the test suites have been improved and found a number of bugs
(which were then fixed).
New in v0.4.0 (2001/11/4)
-------------------------
Much of the rdiff-backup internals were rewritten. The result should
be better performance when operating remotely over a pipe with
significant latency. Also the code dealing with changing permissions
is much cleaner, and should generalize later to similar jobs (for
instance preserving atimes.)
Listing and deleting increments and restoring should work remotely
now. In earlier versions a file or directory had to be restored
locally and then copied over to its final destination.
At the request of the FSF, a copy of the GPL has been included in the
packaged distributions. It is in the file "COPYING".
New in v0.3.4 (2001/10/31)
--------------------------
A change in python from the 2.2a series to 2.2b series made remote
backup on version 0.3.3 stop working; a small change fixes it. (Thanks
to Berkan Eskikaya for telling me about this.)
Listed some missing features/bugs on the manual page.
New in v0.3.3 (2001/10/16)
--------------------------
Changed quoting system yet again after learning that the old system
was not very portable between shells (thanks Hans
<hguevremont@eternitee.com>)
New in v0.3.2 (2001/10/9)
-------------------------
Added --list-increments and --remove-older-than commands.
--list-increments will just tell you what increments you have and
their dates. This isn't anything you couldn't get from "ls", but it
may be formatted more nicely. The --remove-older-than command is used
to delete older increments that you don't want, or don't have space
for.
Also, on some systems ssh was adding a spurious "Broken pipe" message,
even though everything went fine. Maybe this version will prevent
this confusing message.
New in v0.3.1 (2001/9/11)
-------------------------
Fix for stupid bug - when running remotely as users with different
uids, rdiff-backup now doesn't check the uid/gid. Before it kept
thinking that the files needed to be updated because they didn't have
the right ownership. This shouldn't have resulted in any data loss -
just some unnecessary .rdiff files. (Thanks to Michael Friedlander
for finding this.)
Added check to make sure that rdiff exits successfully.
New in v0.3.0 (2001/9/9 - Billennium edition)
---------------------------------------------
rdiff-backup has been almost completely rewritten for v0.3.0, as it
was for v0.1.0. The main problem with versions 0.2.x was that the
networking code was added to the not-remote-capable v0.1, and the
result was unwieldy and prone to bugs when operating over a pipe.
There are some new features:
- Hopefully very few bugs, at least in basic file handling.
rdiff-backup has an extensive testing suite now, so it should be
much more reliable.
- Complete support for reading and writing from and to files and
directories that lack permissions, by temporarily changing them, and
then changing them back later. (See for instance the
--change-source-perms switch.) As I found out there is a lot to
this, so much that I'm not sure in retrospect I should have
bothered. :-)
- New more standard format for increment files. See
http://www.w3.org/TR/NOTE-datetime for the time standard. The old
format, besides being less standard, didn't take timezones into
account.
- In the initial mirroring, rdiff-backup only copies the files that it
needs to, so it is much quicker when you almost have an initial
mirror already. You can even use the --mirror-only switch and make
rdiff-backup into a slow version of rsync.
- Terminal and file verbosity levels can be selected separately. So
if you want a lot in your backup.log/restore.log but not much on
your terminal, or vice versa, you can set them to different numbers.
- New --test-server option so if something goes wrong you can see if
it is because the server on the other side isn't being initialized
properly.
- New --no-rdiff-copy option, which disables using rdiff to move files
across a connection (it will still be used to make increment files
however). If the bottleneck is not bandwidth but local disks/CPUs,
this option should speed things up.
There are, however, a few negatives:
- rdiff-backup now requires Python version 2.2 or later. Sorry for
the inconvenience but I use the new features a lot.
- It may be slightly slower overall than versions 0.2.x - the remote
code is cleaner, but probably has higher overhead. At least on my
computer, rdiff-backup is still quicker than rsync for local
mirroring of large files, but for remote mirroring, rsync will
usually be much quicker, because it uses a fairly low-overhead
pipelining protocol.
- Any old increments are incompatible because they use a different
date/time standard. If this is a big deal, try mailing me. A
converter shouldn't be very difficult to write, but I didn't want to
take the time unless someone really wanted it.
New in v0.2.8 (2001/9/4)
-------------------------
Fixed two stupid bugs that would cause rdiff-backup to exit with an
exception. (I can't believe they were in there.)
New in v0.2.7 (2001/8/29)
-------------------------
Added new long options --backup-mode and --verbosity which are
equivalent to -b and -v.
rdiff-backup should be a little more resistant to the filesystem it is
backing up changing underneath it (although it is not set up to handle
this in general). Thanks Alberto Accomazzi
<aaccomazzi@cfa.harvard.edu> for these suggestions.
New in v0.2.6 (2001/8/27)
-------------------------
Fixed bug where, for non-root users, rdiff-backup could, in the
process of mirroring an unwritable directory, make the copy
unwriteable and then fail. Now rdiff-backup goes through and makes
what it needs to be readable and writeable, and then changes things
back at the end. (Another one found by Jeb Campbell!)
New in v0.2.5 (2001/8/26)
-------------------------
Added better error reporting when server throws an exception.
Fixed bug so that backed-up setuid files will also be setuid.
Now rdiff-backup thinks it's running as root only if both client and
server are running as root (Thanks to Jeb Campbell for finding these
previous two bugs).
Fixed miscellaneous Path bug that could occur in remote operation.
New in v0.2.4 (2001/8/25)
-------------------------
Added more logging options that may help others track down a
mysterious bug.
New in v0.2.3 (2001/8/24)
-------------------------
Fixed a typing bug that caused an AssertionError in remote operation;
thanks again to Jeb Campbell for finding it.
New in v0.2.2 (2001/8/24)
-------------------------
Fixed bug in remote creation of special files and symlinks (thanks to
Jeb Campbell <jebc@c4solutions.net> for finding it).
Fixed another error report.
New in v0.2.1 (2001/8/7)
------------------------
Now if rdiff-backup isn't running as root, it doesn't try to change
file ownership.
Fixed an error report.
Stopped flushing an open pipe to fix a race condition on IRIX.
New in v0.2 (2001/8/3)
----------------------
rdiff-backup can now operate in a bandwidth efficient manner (a la
rsync) using a pipe setup with, for instance, ssh.
I was too hasty with the last bug fix and didn't deal with all
filenames properly. Maybe this one will work.
New in v0.1.1 (2001/8/2)
-------------------------
Bug fix: Filenames that may contain spaces, backslashes, and other
special characters are quoted now and should be handled correctly.
New in v0.1 (2001/7/15)
----------------------
A large portion (majority?) of rdiff-backup was rewritten for v0.1. New
version highlights:
- No new features!
- No speed improvements! It may even be slower...
- No bug fixes! (ok maybe a few)
However, the new version is much cleaner and better documented. This
version should have fewer bugs, and it should be easier to fix any
future bugs.
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.
CVS README - Notes for people checking out of CVS
-------------------------------------------------
Getting rdiff-backup to run:
----------------------------
If you want to run a version of rdiff-backup checked out of CVS into
your $RDB_CVS directory, change to $RDB_CVS/rdiff_backup and run the
./compilec.py file:
cd $RDB_CVS/rdiff_backup; python compilec.py
With any luck, the _librsync.so and C.so libraries will appear in that
directory. Then run rdiff-backup, making sure that all the files are
in your PYTHONPATH:
PYTHONPATH=$RDB_CVS $RDB_CVS/rdiff-backup <arguments>
Running the unit tests:
-----------------------
If you want to try some of the tests, you first have to get the
testfiles.tar.gz tarball. It is available at
http://rdiff-backup.stanford.edu/testfiles.tar.gz
To untar it, root is required because the tarball contains device
files, files with various uid/gid, etc. If you don't have root, it's
OK; all the tests except roottest.py may still work.
So, three steps:
1) Make sure the C modules are compiled as explained above:
cd $RDB_CVS/rdiff_backup; python compilec.py
2) Untar the testfiles tarball, as root if you have it:
cd $RDB_CVS/testing; tar -xvzf testfiles.tar.gz
3) In the testing directory, run each of the *test.py files as
desired. For instance,
cd $RDB_CVS/testing; python rpathtest.py
If python restoretest.py doesn't work, try running
./makerestoretest3
<h3>Table of contents</h3>
<ol><li><a href="#__future__">When I try to run rdiff-backup it says
"ImportError: No module named __future__" or "SyntaxError: invalid
syntax". What's happening?</a></li>
<li><a href="#verbosity">What do the different verbosity levels mean?</a></li>
<li><a href="#windows">Does rdiff-backup run under Windows?</a></li>
<li><a href="#OSX">Does rdiff-backup run under Mac OS X?</a></li>
<li><a href="#remove_dir">My backup set contains some files that I just realized I don't want/need backed up. How do I remove them from the backup volume to save space?</li>
<li><a href="#solaris">Does rdiff-backup work under Solaris?</a></li>
<li><a href="#speed">How fast is rdiff-backup? Can it be run on large
data sets?</a></li>
<li><a href="#statistics">What do the various fields mean in the
session statistics and directory statistics files?</a></li>
<li><a href="#bwlimit">Is there some way to limit rdiff-backup's
bandwidth usage, as in rsync's --bwlimit option?</a></li>
<li><a href="#leak">How much memory should rdiff-backup use? Is there a
memory leak?</a></li>
</ol>
<h3>Questions and Answers</h3>
<ol>
<a name="__future__">
<li><strong>When I try to run rdiff-backup it says "ImportError: No
module named __future__" or "SyntaxError: invalid syntax". What's
happening?</strong>
<P>rdiff-backup versions 0.2.x require Python version 2.1 or later,
and versions 0.3.x and later require Python version 2.2 or later. If
you don't know what version of python you are running, type in "python
-V" from the shell. I'm sorry if this is inconvenient, but
rdiff-backup uses generators, iterators, nested scoping, and
static/class methods extensively, and these were only added in version
2.2.
<P>If you have two versions of python installed, and running "python"
defaults to an early version, you'll probably have to change the first
line of the rdiff-backup script. For instance, you could set it to:
<pre>#!/usr/bin/env python2.2</pre>
</li>
<a name="verbosity">
<li><strong>What do the different verbosity levels mean?</strong>
<P>There is no formal specification, but here is a rough description
(settings are always cumulative, so 5 displays everything 4 does):
<P>
<table cellspacing="10">
<tr><td>0</td><td>No information given</td></tr>
<tr><td>1</td><td>Fatal Errors displayed</td></tr>
<tr><td>2</td><td>Warnings</td></tr>
<tr><td>3</td><td>Important messages, and maybe later some global statistics (default)</td></tr>
<tr><td>4</td><td>Some global settings, miscellaneous messages</td></tr>
<tr><td>5</td><td>Mentions which files were changed</td></tr>
<tr><td>6</td><td>More information on each file processed</td></tr>
<tr><td>7</td><td>More information on various things</td></tr>
<tr><td>8</td><td>All logging is dated</td></tr>
<tr><td>9</td><td>Details on which objects are moving across the connection</td></tr>
</table>
</li>
<a name="windows">
<li><strong>Does rdiff-backup run under Windows?</strong>
<P>Yes, apparently it is possible. First, follow Jason Piterak's
instructions:
<pre>
Subject: Cygwin rdiff-backup
From: Jason Piterak &lt;Jason_Piterak@c-i-s.com&gt;
Date: Mon, 4 Feb 2002 16:54:24 -0500 (13:54 PST)
To: rdiff-backup@keywest.Stanford.EDU
Hello all,
On a lark, I thought I would attempt to get rdiff-backup to work under
Windows98 under Cygwin. We have a number of NT/Win2K servers in the field
that I'd love to be backing up via rdiff-backup, and this was the start of
getting that working.
SUMMARY:
o You can get all the pieces for rdiff-backup working under Cygwin.
o The backup process works up to the point of writing any files with
timestamps.
... This is because the ':' character is reserved for Alternate Data
Stream (ADS) file designations under NTFS.
HOW TO GET IT WORKING (to a point, anyway):
o Install Cygwin
o Download the Python 2.2 update through the Cygwin installer and install.
o Download the librsync libraries from the usual place, but before
compiling...
o Cygwin does not use/provide glibc. Because of this, you have to repoint
some header files in the Makefile:
-- Make sure that you have /usr/include/inttypes.h
redirected to /usr/include/sys/types.h. Do this by:
create a file /usr/include/inttypes.h with the contents:
<protect>#include &lt;sys/types.h&gt;</protect>
o Put rdiff-backup in your PATH, as you normally would.
</pre>
Then, whenever you use rdiff-backup to back up from a unix system to
Windows, use the <strong>--windows-mode</strong> switch. This
compensates for some windows file systems' inability to store hard
links, symlinks, device files, sockets, fifos, case sensitive
filenames, and filenames with colons (":") in them. (Note: device
files, symlinks, fifos, and sockets will simply be skipped, and hard
link information will not be recorded.)
<p>If you are backing up one windows system to another, full
--windows-mode is not necessary, but you'll still need
<strong>--windows-time-format</strong>, which stops rdiff-backup from
trying to make increment files with colons in them. Whichever
--windows* option you use, remember to use the same one when restoring
or listing that backup directory.
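<P>For example, to back up a unix home directory to a windows file
system mounted at /mnt/win (an illustrative path):
<pre>rdiff-backup --windows-mode /home /mnt/win</pre>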
</li>
<P>
<a name="OSX">
<li><strong>Does rdiff-backup run under Mac OS X?</strong>
<p>
Yes, but there may be some issues installing librsync. See this
message from Gerd Knops:
<pre>
From: Gerd Knops &lt;gerti@bitart.com&gt;
Date: Thu, 3 Oct 2002 03:56:47 -0500 (01:56 PDT)
[parts of original message deleted]
these instructions build it fine with all tests running OK
(librsync-0.9.5.1 on OS X 10.2.1):
aclocal
autoconf
automake --foreign --add-missing
CFLAGS=-no-cpp-precomp ./configure
make
make install
</pre>
Also, if you are backing up to a file system that is not case
sensitive you may need to use "--chars-to-quote A-Z". If you do use
--chars-to-quote, remember to use it with the same arguments when
restoring or listing increments.
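<P>For example (the paths here are just illustrative):
<pre>rdiff-backup --chars-to-quote A-Z /Users/gerd /Volumes/Backup/gerd</pre>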
</li>
<P>
<a name="remove_dir">
<li><strong>My backup set contains some files that I just realized I
don't want/need backed up. How do I remove them from the backup
volume to save space?</strong>
<P>Let's take an example. Suppose you ran
<pre>rdiff-backup /usr /backup</pre>
and now realize that you don't want /usr/local backed up on /backup.
Next time you back up, you run
<pre>rdiff-backup --exclude /usr/local /usr /backup</pre>
so that /usr/local is no longer copied to /backup/usr/local.
However, old information about /usr/local is still present in
/backup/rdiff-backup-data/increments/usr/local. You could wait for
this information to expire and then run rdiff-backup with the
--remove-older-than option, or you could remove the increments
manually by typing:
<pre>rm -rf /backup/rdiff-backup-data/increments/usr/local
rm /backup/rdiff-backup-data/increments/usr/local.*.dir</pre>
</li>
<P>
<a name="solaris">
<li><strong>Does rdiff-backup work under Solaris?</strong>
<P>There may be a problem with rdiff-backup and Solaris' libthread.
Adding "ulimit -n unlimited" may fix the problem though. Here is a
post by Kevin Spicer on the subject:
<pre>
Subject: RE: Crash report....still not^H^H^H working
From: "Spicer, Kevin" <Kevin.Spicer@bmrb.co.uk>
Date: Sat, 11 May 2002 23:36:42 +0100
To: rdiff-backup@keywest.Stanford.EDU
Quick mail to follow up on this..
My rdiff backup (on Solaris 2.6 if you remember) has now worked
reliably for nearly two weeks after I added...
ulimit -n unlimited
to the start of my cron job and created a wrapper script on the remote
machine which looked like this...
#!/bin/sh
ulimit -n unlimited
rdiff-backup --server
exit
And changed the remote schema on the command line of rdiff-backup to
call the wrapper script rather than rdiff-backup itself on the remote
machine. As for the /dev/zero thing I've done a bit of Googleing and
it seems that /dev/zero is used internally by libthread on Solaris
(which doesn't really explain why its opening more than 64 files - but
at least I think I've now got round it).
</pre>
</li>
<P>
<a name="speed">
<li><strong>How fast is rdiff-backup? Can it be run on large
data sets?</strong>
<P>rdiff-backup can be limited by the CPU, disk IO, or available
bandwidth, and the length of a session can be affected by the amount
of data, how much the data changed, and how many files are present.
That said, in the typical case the number/size of changed files is
relatively small compared to that of unchanged files, and rdiff-backup
is often either CPU or bandwidth bound, and takes time proportional to
the total number of files. Initial mirrorings will usually be
bandwidth or disk bound, and will take much longer than subsequent
updates.
<P>To give two arbitrary data points, when I back up my personal HD
locally (about 9GB, 600000 files, maybe 50 MB turnover, 1.1GHz
Athlon), rdiff-backup takes about 35 minutes and is usually CPU bound.
Another user reports that backing up remotely from Tru64 to linux
(80GB, ~1 million files, 2GB turnover) takes about 3 hours.
</li>
<p>
<a name="statistics">
<li><strong>What do the various fields mean in the
session statistics and directory statistics files?</strong>
<P>Let's examine an example session statistics file:
<pre>
StartTime 1028200920.44 (Thu Aug 1 04:22:00 2002)
EndTime 1028203082.77 (Thu Aug 1 04:58:02 2002)
ElapsedTime 2162.33 (36 minutes 2.33 seconds)
SourceFiles 494619
SourceFileSize 8535991560 (7.95 GB)
MirrorFiles 493797
MirrorFileSize 8521756994 (7.94 GB)
NewFiles 1053
NewFileSize 23601632 (22.5 MB)
DeletedFiles 231
DeletedFileSize 10346238 (9.87 MB)
ChangedFiles 572
ChangedSourceSize 86207321 (82.2 MB)
ChangedMirrorSize 85228149 (81.3 MB)
IncrementFiles 1857
IncrementFileSize 13799799 (13.2 MB)
TotalDestinationSizeChange 28034365 (26.7 MB)
Errors 0
</pre>
<P>StartTime and EndTime are measured in seconds since the epoch.
ElapsedTime is just EndTime - StartTime, the length of the
rdiff-backup session.
<P>SourceFiles are the number of files found in the source directory,
and SourceFileSize is the total size of those files. MirrorFiles are
the number of files found in the mirror directory (not including the
rdiff-backup-data directory) and MirrorFileSize is the total size of
those files. All sizes are in bytes. If the source directory hasn't
changed since the last backup, MirrorFiles == SourceFiles and
SourceFileSize == MirrorFileSize.
<P>NewFiles and NewFileSize are the total number and size of the files
found in the source directory but not in the mirror directory. They
are new as of the last backup.
<P>DeletedFiles and DeletedFileSize are the total number and size of
the files found in the mirror directory but not the source directory.
They have been deleted since the last backup.
<P>ChangedFiles are the number of files that exist both on the mirror
and on the source directories and have changed since the previous
backup. ChangedSourceSize is their total size on the source
directory, and ChangedMirrorSize is their total size on the mirror
directory.
<P>IncrementFiles is the number of increment files written to the
rdiff-backup-data directory, and IncrementFileSize is their total
size. Generally one increment file will be written for every new,
deleted, and changed file.
<P>TotalDestinationSizeChange is the number of bytes the destination
directory as a whole (mirror portion and rdiff-backup-data directory)
has grown during the given rdiff-backup session. This is usually
close to IncrementFileSize + NewFileSize - DeletedFileSize +
ChangedSourceSize - ChangedMirrorSize, but it also includes the space
taken up by the hardlink_data file to record hard links.
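<P>If you want to check these values from a script, the format is easy
to parse. Here is a minimal Python sketch; the function name and the
file path are illustrative, not part of rdiff-backup (the real file
name includes a timestamp):
<pre>
#!/usr/bin/env python
# Minimal sketch: read a session statistics file into a dictionary.
def parse_session_statistics(filename):
    stats = {}
    for line in open(filename):
        fields = line.split()
        if len(fields) >= 2:
            # The second field is the numeric value; any parenthesized
            # human-readable form after it is ignored.
            stats[fields[0]] = float(fields[1])
    return stats

# Illustrative path; adjust to your backup volume and session time.
stats = parse_session_statistics(
    "/backup/rdiff-backup-data/session_statistics.2002-08-01T04:22:00-07:00.data")
print "Errors:", stats["Errors"]
</pre>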
</li>
<a name="bwlimit">
<li><strong>Is there some way to limit rdiff-backup's
bandwidth usage, as in rsync's --bwlimit option?</strong>
<P>There is no internal rdiff-backup option to do this. However, the
--sleep-ratio option can limit overall resource usage, including
bandwidth. Also, external utilities such as <a href="http://www.cons.org/cracauer/cstream.html">cstream</a> can be
used to monitor bandwidth explicitly. trevor@tecnopolis.ca writes:
<pre>
rdiff-backup --remote-schema
'cstream -v 1 -t 10000 | ssh %s '\''rdiff-backup --server'\'' | cstream -t 20000'
'netbak@foo.bar.com::/mnt/backup' localbakdir
(must run from a bsh-type shell, not a csh type)
That would apply a limit in both directions [10000 bytes/sec outgoing,
20000 bytes/sec incoming]. I don't think you'd ever really want to do
this though as really you just want to limit it in one direction.
Also, note how I only -v 1 in one direction. You probably don't want
to output stats for both directions as it will confuse whatever script
you have parsing the output. I guess it wouldn't hurt for manual runs
however.
</pre>
To limit bandwidth in only one direction, simply remove one of the
cstream commands. Two cstream caveats may be worth mentioning:
<ol> <li>Because cstream is limiting the uncompressed data heading
into or out of ssh, if ssh compression is turned on, cstream may be
overly restrictive.</li>
<li>cstream may be "bursty", limiting average bandwidth but allowing
rdiff-backup to exceed it for significant periods.</li>
</ol>
<p>
Another option is to limit bandwidth at a lower (and perhaps more
appropriate) level. Adam Lazur mentions <a
href="http://lartc.org/wondershaper/">The Wonder Shaper</a>.
</li>
<a name="leak">
<li><strong>How much memory should rdiff-backup use? Is there a
memory leak?</strong>
<p>The amount of memory rdiff-backup uses should not depend much on
the size of directories being processed. Keeping track of hard links
may use up memory, so if you have, say, hundreds of thousands of files
hard linked together, rdiff-backup may need tens of MB.
<p>If rdiff-backup seems to be leaking memory, it is probably because
it is using an early version of librsync. <strong>librsync 0.9.5
leaks lots of memory.</strong> Version 0.9.5.1 should not leak and is
available from the rdiff-backup homepage.
</li>
</ol>
INSTALLATION:
Thank you for trying rdiff-backup. To install, run:
python setup.py install
The build process can also be run separately:
python setup.py build
The default prefix is /usr, so files are put in /usr/bin,
/usr/share/man/, etc. An alternate prefix can be specified using the
--prefix=<prefix> option. For example:
python setup.py install --prefix=/usr/local
A few special build arguments can be specified such as --librsync-dir,
--lflags, and --libs, and the LIBRSYNC_DIR, LFLAGS, and LIBS
environment variables will also be used. Running setup.py with no
arguments will also display some help.
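For example, if librsync is installed under /usr/local (adjust the
path to your system), you might build with:
python setup.py build --librsync-dir=/usr/local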
REQUIREMENTS:
Remember that you must have Python 2.2 or later and librsync 0.9.5.1
or later installed. For Python, see http://www.python.org. The
rdiff-backup homepage at http://rdiff-backup.stanford.edu/ should have
a recent version of librsync; otherwise see the librsync homepage at
http://sourceforge.net/projects/librsync/.
For remote operation, rdiff-backup should be installed and in the
PATH on remote system(s) (see man page for more information).
TROUBLESHOOTING:
If you have everything installed properly, and it still doesn't work,
see the enclosed FAQ.html, the web page at
http://rdiff-backup.stanford.edu, and/or the mailing list.
---------[ Medium term ]---------------------------------------
Look into sparse file support (requested by Stelios K. Kyriacou)
Look into security.py code, do some sort of security audit.
Don't require increments.<date>.dir files to be setuid/setgid, or
don't even have the backup files. (Andrew Bressen)
Look at Kent Borg's suggestion for restore options and digests.
Add --list-files-changed-between or similar option, to list files that
have changed between two times
Add ACL support
Add --dry-run option (target for v1.1.x)
Add # of increments option to --remove-older-than
Make argument shortcut for cstream, or some other bandwidth limiter.
Write configuration file, to make sure settings like --quoting-char,
--windows-time-format, etc., don't change between sessions,
backup/restoring, etc.
---------[ Long term ]---------------------------------------
Think about adding Gaudet's idea for keeping track of renamed files.
Look into different inode generation techniques (see treescan, Dean
Gaudet's other post).
#!/usr/bin/env python
import os, re, shutil, time, sys, getopt
SourceDir = "rdiff_backup"
DistDir = "dist"
# Various details about the files must also be specified by the rpm
# spec template.
spec_template = "dist/rdiff-backup.spec.template"
#redhat_spec_template = "dist/rdiff-backup.rh7x.spec"
def CopyMan(destination, version):
"""Create updated man page at the specified location"""
fp = open(destination, "w")
date = time.strftime("%B %Y", time.localtime(time.time()))
version = "Version "+version
firstline = ('.TH RDIFF-BACKUP 1 "%s" "%s" "User Manuals"\n' %
(date, version))
fp.write(firstline)
infp = open("rdiff-backup.1", "r")
infp.readline()
fp.write(infp.read())
fp.close()
infp.close()
def MakeFAQ():
"""Create FAQ.html and FAQ.wml files from FAQ-body.html"""
faqbody_fp = open("FAQ-body.html", "r")
faqbody_string = faqbody_fp.read()
faqbody_fp.close()
wml_fp = open("FAQ.wml", "w")
wml_fp.write(
"""#include 'template.wml' home=. curpage=faq title="rdiff-backup: FAQ"
<divert body>
<p><h2>FAQ:</h2>
""")
wml_fp.write(faqbody_string)
wml_fp.write("\n</divert>\n")
wml_fp.close()
html_fp = open("FAQ.html", "w")
html_fp.write(
"""<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>rdiff-backup FAQ</title>
</head>
<body>
<h1>rdiff-backup FAQ</h1>
""")
html_fp.write(faqbody_string)
html_fp.write("\n</body></html>")
html_fp.close()
def VersionedCopy(source, dest):
"""Copy source to dest, substituting $version with version"""
fin = open(source, "rb")
inbuf = fin.read()
assert not fin.close()
outbuf = re.sub("\$version", Version, inbuf, 1)
if outbuf == inbuf: assert 0, "No $version string replaced"
assert not re.search("\$version", outbuf), \
"Two $version strings found in the same file %s" % (source,)
fout = open(dest, "wb")
fout.write(outbuf)
assert not fout.close()
def MakeTar():
"""Create rdiff-backup tar file"""
tardir = "rdiff-backup-%s" % Version
tarfile = "rdiff-backup-%s.tar.gz" % Version
try:
os.lstat(tardir)
os.system("rm -rf " + tardir)
except OSError: pass
os.mkdir(tardir)
for filename in ["CHANGELOG", "COPYING", "README", "FAQ.html",
SourceDir + "/cmodule.c",
SourceDir + "/_librsyncmodule.c",
DistDir + "/setup.py"]:
assert not os.system("cp %s %s" % (filename, tardir)), filename
os.mkdir(tardir+"/rdiff_backup")
for filename in ["backup.py", "connection.py",
"FilenameMapping.py", "Hardlink.py",
"increment.py", "__init__.py", "iterfile.py",
"lazy.py", "librsync.py", "log.py", "Main.py",
"manage.py", "metadata.py", "Rdiff.py",
"regress.py", "restore.py", "robust.py",
"rorpiter.py", "rpath.py", "Security.py",
"selection.py", "SetConnections.py", "static.py",
"statistics.py", "TempFile.py", "Time.py"]:
assert not os.system("cp %s/%s %s/rdiff_backup" %
(SourceDir, filename, tardir)), filename
VersionedCopy("%s/Globals.py" % (SourceDir,),
"%s/rdiff_backup/Globals.py" % (tardir,))
VersionedCopy("rdiff-backup", "%s/rdiff-backup" % (tardir,))
VersionedCopy(DistDir + "/setup.py", "%s/setup.py" % (tardir,))
os.chmod(os.path.join(tardir, "setup.py"), 0755)
os.chmod(os.path.join(tardir, "rdiff-backup"), 0644)
CopyMan(os.path.join(tardir, "rdiff-backup.1"), Version)
os.system("tar -cvzf %s %s" % (tarfile, tardir))
shutil.rmtree(tardir)
return tarfile
def MakeSpecFile():
"""Create spec file using spec template"""
#specfile = "rdiff-backup-%s-2.spec" % Version
specfile = "rdiff-backup.spec" # Fedora standard name
VersionedCopy(spec_template, specfile)
return specfile
def parse_cmdline(arglist):
"""Returns action"""
global Version
def error():
print "Syntax: makedist [--faq-only] [version_number]"
sys.exit(1)
optlist, args = getopt.getopt(arglist, "", ["faq-only"])
if len(args) != 1: error()
else: Version = args[0]
for opt, arg in optlist:
if opt == "--faq-only": return "FAQ"
else: assert 0, "Bad argument"
return "All"
def Main():
action = parse_cmdline(sys.argv[1:])
print "Making FAQ"
MakeFAQ()
if action != "FAQ":
assert action == "All"
print "Processing version " + Version
tarfile = MakeTar()
print "Made tar file " + tarfile
specfile = MakeSpecFile()
print "Made specfile ", specfile
if __name__ == "__main__" and not globals().has_key('__no_execute__'):
Main()
#!/usr/bin/env python
import os, sys, re
rpmroot = "/home/ben/rpm"
if len(sys.argv) == 2:
version = sys.argv[1]
specfile = "rdiff-backup.spec"
print "Using specfile %s" % specfile
else:
print "Syntax: %s version_number" % sys.argv[0]
sys.exit(1)
base = "rdiff-backup-%s" % (version,)
tarfile = base + ".tar.gz"
rpmbase = base + "-0.fdr.2" # Fedora suffix, with release number 2
i386_rpm = rpmbase + ".i386.rpm"
source_rpm = rpmbase + ".src.rpm"
# These assume the rpm root directory $HOME/rpm. The
# nonstandard location allows for building by non-root user.
assert not os.system("cp %s %s/SOURCES" % (tarfile, rpmroot))
#assert not os.system("rpm -ba --sign -vv --target i386 " + specfile)
assert not os.system("rpmbuild -ba -v --sign " + specfile)
assert not os.system("mv %s/RPMS/i386/%s ." % (rpmroot, i386_rpm))
assert not os.system("mv %s/SRPMS/%s ." % (rpmroot, source_rpm))
#!/usr/bin/env python
import sys, os
def RunCommand(cmd):
print cmd
os.system(cmd)
webprefix = "/home/ben/misc/html/mirror/rdiff-backup/"
if not sys.argv[1:]:
print 'Call with version number, as in "./makeweb 0.3.1"'
print "to move new rpms and tarballs. Now just remaking FAQ and man page."
print
else:
version = sys.argv[1]
RunCommand("cp *%s* %s" % (version, webprefix))
RunCommand("rman -f html -r '' rdiff-backup.1 > %srdiff-backup.1.html"
% webprefix)
RunCommand("cp FAQ.wml CHANGELOG %s" % webprefix)
os.chdir(webprefix)
print "cd ", webprefix
if sys.argv[1:]:
RunCommand("rm latest latest.rpm latest.tar.gz")
RunCommand("ln -s *%s*rpm latest.rpm" % (version,))
RunCommand("ln -s *%s*tar.gz latest.tar.gz" % (version,))
RunCommand("./Make")
--- rdiff-backup.old Sat Apr 6 10:05:18 2002
+++ rdiff-backup Sat Apr 6 10:05:25 2002
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python2
#
# rdiff-backup -- Mirror files while keeping incremental changes
# Version 0.7.1 released March 25, 2002
Summary: Convenient and transparent local/remote incremental mirror/backup
Name: rdiff-backup
Release: 1
URL: http://www.stanford.edu/~bescoto/rdiff-backup/
Source: %{name}-%{version}.tar.gz
Copyright: GPL
Group: Applications/Archiving
BuildRoot: %{_tmppath}/%{name}-root
requires: librsync, python2 >= 2.2
Patch: rdiff-backup-rh7x.patch
%description
rdiff-backup is a script, written in Python, that backs up one
directory to another and is intended to be run periodically (nightly
from cron for instance). The target directory ends up a copy of the
source directory, but extra reverse diffs are stored in the target
directory, so you can still recover files lost some time ago. The idea
is to combine the best features of a mirror and an incremental
backup. rdiff-backup can also operate in a bandwidth efficient manner
over a pipe, like rsync. Thus you can use rdiff-backup and ssh to
securely back a hard drive up to a remote location, and only the
differences from the previous backup will be transmitted.
%prep
%setup
%patch
%build
%install
rm -rf $RPM_BUILD_ROOT
mkdir -p $RPM_BUILD_ROOT/usr/bin
mkdir -p $RPM_BUILD_ROOT/usr/share/man/man1
install -m 755 rdiff-backup $RPM_BUILD_ROOT/usr/bin/rdiff-backup
install -m 644 rdiff-backup.1 $RPM_BUILD_ROOT/usr/share/man/man1/rdiff-backup.1
%clean
%files
%defattr(-,root,root)
/usr/bin/rdiff-backup
/usr/share/man/man1/rdiff-backup.1.gz
%doc CHANGELOG COPYING README FAQ.html
%changelog
* Sat Apr 6 2002 Ben Escoto <bescoto@stanford.edu>
- Made new version for Redhat 7.x series
* Sun Nov 4 2001 Ben Escoto <bescoto@stanford.edu>
- Initial RPM
%define PYTHON_NAME %((rpm -q --quiet python2 && echo python2) || echo python)
Version: $version
Summary: Convenient and transparent local/remote incremental mirror/backup
Name: rdiff-backup
Release: 0.fdr.2
Epoch: 0
URL: http://rdiff-backup.stanford.edu/
Source: http://rdiff-backup.stanford.edu/%{name}-%{version}.tar.gz
License: GPL
Group: Applications/Archiving
BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
Requires: librsync >= 0.9.5.1, %{PYTHON_NAME} >= 2.2
BuildPrereq: %{PYTHON_NAME}-devel >= 2.2, librsync-devel = 0.9.5.1
%description
rdiff-backup is a script, written in Python, that backs up one
directory to another and is intended to be run periodically (nightly
from cron for instance). The target directory ends up a copy of the
source directory, but extra reverse diffs are stored in the target
directory, so you can still recover files lost some time ago. The idea
is to combine the best features of a mirror and an incremental
backup. rdiff-backup can also operate in a bandwidth efficient manner
over a pipe, like rsync. Thus you can use rdiff-backup and ssh to
securely back a hard drive up to a remote location, and only the
differences from the previous backup will be transmitted.
%prep
%setup -q
%build
%{PYTHON_NAME} setup.py build
%install
%{PYTHON_NAME} setup.py install --prefix=$RPM_BUILD_ROOT/usr
%clean
[ "$RPM_BUILD_ROOT" != "/" ] && rm -rf $RPM_BUILD_ROOT
%files
%defattr(-,root,root)
%{_bindir}/rdiff-backup
%{_mandir}/man1/rdiff-backup*
%{_libdir}/
%doc CHANGELOG COPYING FAQ.html README
%changelog
* Thu Jul 24 2003 Ben Escoto <bescoto@stanford.edu>
- Set librsync = 0.9.5.1, because new version will use librsync.h
instead of rsync.h
* Sun Jul 20 2003 Ben Escoto <bescoto@stanford.edu>
- Minor changes to comply with Fedora standards.
* Sun Jan 19 2002 Troels Arvin <troels@arvin.dk>
- Builds, no matter if Python 2.2 is called python2-2.2 or python-2.2.
* Sun Nov 4 2001 Ben Escoto <bescoto@stanford.edu>
- Initial RPM
#!/usr/bin/env python
import sys, os, getopt
from distutils.core import setup, Extension
version_string = "$version"
if sys.version_info[:2] < (2,2):
print "Sorry, rdiff-backup requires version 2.2 or later of python"
sys.exit(1)
# Defaults
lflags_arg = []
libname = ['rsync']
incdir_list = libdir_list = None
if os.name == 'posix':
LIBRSYNC_DIR = os.environ.get('LIBRSYNC_DIR', '')
# Split env-supplied flags into lists so they combine correctly
# with any command-line values parsed below.
LFLAGS = os.environ.get('LFLAGS', '').split()
LIBS = os.environ.get('LIBS', '').split()
# Handle --librsync-dir=[PATH] and --lflags=[FLAGS]
args = sys.argv[:]
for arg in args:
if arg.startswith('--librsync-dir='):
LIBRSYNC_DIR = arg.split('=')[1]
sys.argv.remove(arg)
elif arg.startswith('--lflags='):
LFLAGS = arg.split('=')[1].split()
sys.argv.remove(arg)
elif arg.startswith('--libs='):
LIBS = arg.split('=')[1].split()
sys.argv.remove(arg)
if LFLAGS or LIBS:
lflags_arg = LFLAGS + LIBS
if LIBRSYNC_DIR:
incdir_list = [os.path.join(LIBRSYNC_DIR, 'include')]
libdir_list = [os.path.join(LIBRSYNC_DIR, 'lib')]
if '-lrsync' in LIBS:
libname = []
setup(name="rdiff-backup",
version=version_string,
description="Local/remote mirroring+incremental backup",
author="Ben Escoto",
author_email="bescoto@stanford.edu",
url="http://rdiff-backup.stanford.edu",
packages = ['rdiff_backup'],
ext_modules = [Extension("rdiff_backup.C", ["cmodule.c"]),
Extension("rdiff_backup._librsync",
["_librsyncmodule.c"],
include_dirs=incdir_list,
library_dirs=libdir_list,
libraries=libname,
extra_link_args=lflags_arg)],
scripts = ['rdiff-backup'],
data_files = [('share/man/man1', ['rdiff-backup.1']),
('share/doc/rdiff-backup-%s' % (version_string,),
['CHANGELOG', 'COPYING', 'README', 'FAQ.html'])])
#!/usr/bin/env python
#
# Compresses old rdiff-backup increments. See
# http://www.stanford.edu/~bescoto/rdiff-backup for information on
# rdiff-backup.
from __future__ import nested_scopes, generators
import os, re, sys, getopt
rdiff_backup_location = "/usr/bin/rdiff-backup"
no_compression_regexp_string = None
__no_execute__ = 1
def print_help():
"""Print usage, exit"""
print """
Usage: compress-rdiff-backup-increments [options] mirror_directory
This script will compress the old rdiff-backup increments under
mirror_directory, in a format compatible with rdiff-backup version
0.7.1 and later. So for instance if you were using an old version
of rdiff-backup like this:
rdiff-backup /foo /backup
and now you want to take advantage of v0.7.1's space saving
compression, you can run:
compress-rdiff-backup-increments /backup
Options:
--rdiff-backup-location location
This script reads your rdiff-backup executable. The default
is "/usr/bin/rdiff-backup", so if your rdiff-backup is in a
different location, you must use this switch.
--no-compression-regexp regexp
Any increments whose base name match this regular expression
won't be compressed. This is generally used to avoid
compressing already compressed files. See the rdiff-backup
man page for the default.
"""
sys.exit(1)
def parse_args(arglist):
"""Check and evaluate command line arguments, return dirname"""
global rdiff_backup_location
global no_compression_regexp_string
try: optlist, args = getopt.getopt(arglist, "v:",
["rdiff-backup-location=",
"no-compression-regexp="])
except getopt.error: print_help()
for opt, arg in optlist:
if opt == "--no-compression-regexp":
no_compression_regexp_string = arg
elif opt == "--rdiff-backup-location": rdiff_backup_location = arg
else:
print "Bad option: ", opt
print_help()
if len(args) != 1:
print "Wrong number of arguments"
print_help()
return args[0]
def exec_rdiff_backup():
"""Execs rdiff-backup"""
try: execfile(rdiff_backup_location, globals())
except IOError:
print "Unable to read", rdiff_backup_location
print "You may need to use the --rdiff-backup-location argument"
sys.exit(1)
if not map(int, Globals.version.split(".")) >= [0, 7, 1]:
print "This script requires rdiff-backup version 0.7.1 or later,",
print "found version", Globals.version
sys.exit(1)
def gzip_file(rp):
"""gzip rp, adding .gz to path and deleting original"""
newrp = RPath(rp.conn, rp.base, rp.index[:-1] + (rp.index[-1]+".gz",))
if newrp.lstat():
print "Warning: %s already exists, skipping" % newrp.path
return
print "gzipping ", rp.path
newrp.write_from_fileobj(rp.open("rb"), compress = 1)
RPath.copy_attribs(rp, newrp)
rp.delete()
def Main():
dirname = parse_args(sys.argv[1:])
exec_rdiff_backup()
if no_compression_regexp_string is not None:
no_compression_regexp = re.compile(no_compression_regexp_string, re.I)
else: no_compression_regexp = \
re.compile(Globals.no_compression_regexp_string, re.I)
Globals.change_source_perms = 1
Globals.change_ownership = (os.getuid() == 0)
# Check to make sure rbdir exists
root_rp = RPath(Globals.local_connection, dirname)
rbdir = root_rp.append("rdiff-backup-data")
if not rbdir.lstat():
print "Cannot find %s, exiting" % rbdir.path
sys.exit(1)
for dsrp in DestructiveStepping.Iterate_with_Finalizer(rbdir, 1):
if (dsrp.isincfile() and dsrp.isreg() and
not dsrp.isinccompressed() and
(dsrp.getinctype() == "diff" or dsrp.getinctype() == "snapshot")
and dsrp.getsize() != 0 and
not no_compression_regexp.match(dsrp.getincbase_str())):
gzip_file(dsrp)
Main()
#!/usr/bin/env python
from __future__ import generators
import sys, os, stat
def usage():
print "Usage: find2dirs dir1 dir2"
print
print "Given the name of two directories, list all the files in both, one"
print "per line, but don't repeat a file even if it is in both directories"
sys.exit(1)
def getlist(base, ext = ""):
"""Return iterator yielding filenames from directory"""
if ext: yield ext
else: yield "."
fullname = os.path.join(base, ext)
if stat.S_ISDIR(stat.S_IFMT(os.lstat(fullname)[stat.ST_MODE])):
for subfile in os.listdir(fullname):
for fn in getlist(base, os.path.join(ext, subfile)): yield fn
def main(dir1, dir2):
d = {}
for fn in getlist(dir1): d[fn] = 1
for fn in getlist(dir2): d[fn] = 1
for fn in d.keys(): print fn
if not len(sys.argv) == 3: usage()
else: main(sys.argv[1], sys.argv[2])
#!/usr/bin/env python
"""init_smallfiles.py
This program makes a number of files of the given size in the
specified directory.
"""
import os, stat, sys, math
if len(sys.argv) > 5 or len(sys.argv) < 4:
print "Usage: init_files [directory name] [file size] [file count] [base]"
print
print "Creates file_count files in directory_name of size file_size."
print "The created directory has a tree type structure where each level"
print "has at most base files or directories in it. Default is 50."
sys.exit(1)
dirname = sys.argv[1]
filesize = int(sys.argv[2])
filecount = int(sys.argv[3])
block_size = 16384
block = "." * block_size
block_change = "." * (filesize % block_size)
if len(sys.argv) == 4: base = 50
else: base = int(sys.argv[4])
def make_file(path):
"""Make the file at path"""
fp = open(path, "w")
for i in xrange(int(math.floor(filesize/block_size))): fp.write(block)
fp.write(block_change)
fp.close()
def find_sublevels(count):
"""Return number of sublevels required for count files"""
return int(math.ceil(math.log(count)/math.log(base)))
def make_dir(dir, count):
"""Make count files in the directory, making subdirectories if necessary"""
print "Making directory %s with %d files" % (dir, count)
os.mkdir(dir)
level = find_sublevels(count)
assert count <= pow(base, level)
if level == 1:
for i in range(count): make_file(os.path.join(dir, "file%d" %i))
else:
files_per_subdir = pow(base, level-1)
full_dirs = int(count/files_per_subdir)
assert full_dirs <= base
for i in range(full_dirs):
make_dir(os.path.join(dir, "subdir%d" % i), files_per_subdir)
change = count - full_dirs*files_per_subdir
assert change >= 0
if change > 0:
make_dir(os.path.join(dir, "subdir%d" % full_dirs), change)
def start(dir):
try: os.stat(dir)
except os.error: pass
else:
print "Directory %s already exists, exiting." % dir
sys.exit(1)
make_dir(dirname, filecount)
start(dirname)
#!/usr/bin/env python
"""Use librsync to transform everything in one dir to another"""
import sys, os, librsync
dir1, dir2 = sys.argv[1:3]
for i in xrange(1000):
dir1fn = "%s/%s" % (dir1, i)
dir2fn = "%s/%s" % (dir2, i)
# Write signature file
f1 = open(dir1fn, "rb")
sigfile = open("sig", "wb")
librsync.filesig(f1, sigfile, 2048)
f1.close()
sigfile.close()
# Write delta file
f2 = open(dir2fn, "r")
sigfile = open("sig", "rb")
deltafile = open("delta", "wb")
librsync.filerdelta(sigfile, f2, deltafile)
f2.close()
sigfile.close()
deltafile.close()
# Write patched file
f1 = open(dir1fn, "rb")
newfile = open("%s/%s.out" % (dir1, i), "wb")
deltafile = open("delta", "rb")
librsync.filepatch(f1, deltafile, newfile)
f1.close()
deltafile.close()
newfile.close()
#!/usr/bin/env python
"""Make 10000 files consisting of data
Syntax: test.py directory_name number_of_files character filelength"""
import sys, os
dirname = sys.argv[1]
num_files = int(sys.argv[2])
character = sys.argv[3]
filelength = int(sys.argv[4])
os.mkdir(dirname)
for i in xrange(num_files):
fp = open("%s/%s" % (dirname, i), "w")
fp.write(character * filelength)
fp.close()
fp = open("%s.big" % dirname, "w")
fp.write(character * (filelength*num_files))
fp.close()
#!/usr/bin/python
import sys, os
#sys.path.insert(0, "../src")
from rdiff_backup.rpath import *
from rdiff_backup.connection import *
from rdiff_backup import Globals
lc = Globals.local_connection
for filename in sys.argv[1:]:
#print "Deleting %s" % filename
rp = RPath(lc, filename)
if rp.lstat(): rp.delete()
#os.system("rm -rf " + rp.path)
#!/usr/bin/env python
"""Run rdiff to transform everything in one dir to another"""
import sys, os
dir1, dir2 = sys.argv[1:3]
for i in xrange(1000):
assert not os.system("rdiff signature %s/%s sig" % (dir1, i))
assert not os.system("rdiff delta sig %s/%s diff" % (dir2, i))
assert not os.system("rdiff patch %s/%s diff %s/%s.out" %
(dir1, i, dir1, i))
#!/usr/bin/env python
"""remove-comments.py
Given a python program on standard input, spit one out on stdout that
should work the same, but has blank and comment lines removed.
"""
import sys, re
triple_regex = re.compile('"""')
def eattriple(initial_line_stripped):
"""Keep reading until end of doc string"""
assert initial_line_stripped.startswith('"""')
if triple_regex.search(initial_line_stripped[3:]): return
while 1:
line = sys.stdin.readline()
if not line or triple_regex.search(line): break
while 1:
line = sys.stdin.readline()
if not line: break
stripped = line.strip()
if not stripped: continue
if stripped[0] == "#": continue
if stripped.startswith('"""'):
eattriple(stripped)
continue
sys.stdout.write(line)
#!/usr/bin/env python
# rdiff-backup -- Mirror files while keeping incremental changes
# Version $version released October 5, 2002
# Copyright (C) 2001, 2002 Ben Escoto <bescoto@stanford.edu>
#
# This program is licensed under the GNU General Public License (GPL).
# You can redistribute it and/or modify it under the terms of the GNU
# General Public License as published by the Free Software Foundation,
# Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA; either
# version 2 of the License, or (at your option) any later version.
# Distributions of rdiff-backup should include a copy of the GPL in a
# file called COPYING. The GPL is also available online at
# http://www.gnu.org/copyleft/gpl.html.
#
# See http://rdiff-backup.stanford.edu/ for more information. Please
# send mail to me or the mailing list if you find bugs or have any
# suggestions.
import sys
import rdiff_backup.Main
if __name__ == "__main__" and not globals().has_key('__no_execute__'):
rdiff_backup.Main.Main(sys.argv[1:])
.TH RDIFF-BACKUP 1 "AUGUST 2001" "Version 0.2.1" "User Manuals" \" -*- nroff -*-
.SH NAME
rdiff-backup \- local/remote mirror and incremental backup
.SH SYNOPSIS
.B rdiff-backup
.BI [ options ]
.BI [[[ user@ ] host1.foo ]:: source_directory ]
.BI [[[ user@ ] host2.foo ]:: destination_directory ]
.B rdiff-backup
.B {{ -l | --list-increments }
.BI "| --remove-older-than " time_interval
.BI "| --list-at-time " time
.BI "| --list-changed-since " time }
.BI [[[ user@ ] host2.foo ]:: destination_directory ]
.B rdiff-backup --calculate-average
.I statfile1 statfile2 ...
.B rdiff-backup --test-server
.BI [ user1 ] @host1.net1 :: path
.BI [[ user2 ] @host2.net2 :: path ]
.I ...
.SH DESCRIPTION
.B rdiff-backup
is a script, written in
.BR python (1)
that backs up one directory to another. The target directory ends up
an exact copy (mirror) of the source directory, but extra reverse
diffs are stored in the target directory, so you can still recover
files lost some time ago. The idea is to combine the best features of
a mirror and an incremental backup. rdiff-backup also preserves
symlinks, special files, hardlinks, permissions, uid/gid ownership (if
it is running as root), and modification times.
.B rdiff-backup
can also operate
in a bandwidth efficient manner over a pipe, like
.BR rsync (1).
Thus you can use ssh and rdiff-backup to securely back a hard drive up
to a remote location, and only the differences will be transmitted.
Using the default settings, rdiff-backup requires that the remote
system accept ssh connections, and that
.B rdiff-backup
is installed in the user's PATH on the remote system. For information
on other options, see the section on
.B REMOTE OPERATION.
Note that you
.B should not write to the mirror directory
except with rdiff-backup. Many of the increments are stored as
reverse diffs, so if you delete or modify a file, you may lose the
ability to restore previous versions of that file.
.SH OPTIONS
.TP
.B -b, --backup-mode
Force backup mode even if first argument appears to be an increment file.
.TP
.B --calculate-average
Enter calculate average mode. The arguments should be a number of
statistics files. rdiff-backup will print the average of the listed
statistics files and exit.
.TP
.BI "--chars-to-quote " chars
If this option is set, any characters in
.I chars
present in filenames on the source side will be quoted on the
destination side, so that they do not appear in filenames on the
remote side. See
.B --quoting-char
and
.BR --windows-mode .
.TP
.B --check-destination-dir
If an rdiff-backup session fails, running rdiff-backup with this
option on the destination dir will undo the failed session. This
happens automatically if you attempt to back up to a directory and the
last backup failed.
.TP
.BI "--current-time " seconds
This option is useful mainly for testing. If set, rdiff-backup will
use it as the current time instead of consulting the clock. The argument
is the number of seconds since the epoch.
.TP
.BI "--exclude " shell_pattern
Exclude the file or files matched by
.IR shell_pattern .
If a directory is matched, then files under that directory will also
be matched. See the
.B FILE SELECTION
section for more information.
.TP
.B "--exclude-device-files"
Exclude all device files. This can be useful for security/permissions
reasons or if rdiff-backup is not handling device files correctly.
.TP
.BI "--exclude-filelist " filename
Excludes the files listed in
.IR filename .
See the
.B FILE SELECTION
section for more information.
.TP
.B --exclude-filelist-stdin
Like
.B --exclude-filelist,
but the list of files will be read from standard input. See the
.B FILE SELECTION
section for more information.
.TP
.BR "--exclude-globbing-filelist " filename
Like
.B --exclude-filelist
but each line of the filelist will be interpreted according to the
same rules as
.B --include
and
.B --exclude.
.TP
.B --exclude-other-filesystems
Exclude files on file systems (identified by device number) other than
the file system the root of the source directory is on.
.TP
.BI "--exclude-regexp " regexp
Exclude files matching the given regexp. Unlike the
.B --exclude
option, this option does not match files in a directory it matches.
See the
.B FILE SELECTION
section for more information.
.TP
.B --exclude-special-files
Exclude all device files, fifos, sockets, and symlinks. This option
is implied by --windows-mode.
.TP
.B --force
Authorize the updating or overwriting of a destination path.
rdiff-backup will generally tell you if it needs this.
.TP
.BI "--include " shell_pattern
Similar to
.B --exclude
but include matched files instead. Unlike
.BR --exclude ,
this option will also match parent directories of matched files
(although not necessarily their contents). See the
.B FILE SELECTION
section for more information.
.TP
.BI "--include-filelist " filename
Like
.BR --exclude-filelist ,
but include the listed files instead. See the
.B FILE SELECTION
section for more information.
.TP
.B --include-filelist-stdin
Like
.BR --include-filelist ,
but read the list of included files from standard input.
.TP
.BI "--include-globbing-filelist " filename
Like
.B --include-filelist
but each line of the filelist will be interpreted according to the
same rules as
.B --include
and
.B --exclude.
.TP
.BI "--include-regexp " regexp
Include files matching the regular expression
.IR regexp .
Only files explicitly matched by
.I regexp
will be included by this option. See the
.B FILE SELECTION
section for more information.
.TP
.BI "--list-at-time " time
List the files in the archive that were present at the given time. If
a directory in the archive is specified, list only the files under
that directory.
.TP
.BI "--list-changed-since " time
List the files that have changed since the given time. See
.B TIME FORMATS
for the format of
.IR time .
If a directory in the archive is specified, list only the files under
that directory.
.TP
.B "-l, --list-increments"
List the number and date of partial incremental backups contained in
the specified destination directory. No backup or restore will take
place if this option is given.
.TP
.B "-m, --mirror-only"
Do not create an rdiff-backup-data directory or make any increments.
In this mode rdiff-backup is similar to rsync (but usually
slower).
.TP
.B --no-change-dir-inc-perms
Do not change the permissions of the directory increments to match the
directories they represent. This option may be required on file
systems where regular files cannot have their sticky bit set.
.TP
.B --no-compare-inode
This relatively esoteric option prevents rdiff-backup from flagging a
file as changed when its inode changes. This option may be useful if
you are backing up two different directories to the same rdiff-backup
destination directory. The downside is that hard link information may
get messed up, as the metadata file may no longer have the correct
inode information.
.TP
.B --no-compression
Disable the default gzip compression of most of the .snapshot and .diff
increment files stored in the rdiff-backup-data directory. A backup
volume can contain compressed and uncompressed increments, so using
this option inconsistently is fine.
.TP
.B "--no-compression-regexp " regexp
Do not compress increments based on files whose filenames match regexp.
The default is
"(?i).*\\.(gz|z|bz|bz2|tgz|zip|rpm|deb|jpg|gif|png|jp2|mp3|ogg|avi|wmv|mpeg|mpg|rm|mov)$"
.TP
.B --no-file-statistics
This will disable writing to the file_statistics file in the
rdiff-backup-data directory. rdiff-backup will run slightly quicker
and take up a bit less space.
.TP
.BI --no-hard-links
Don't replicate hard links on destination side. Note that because
metadata is written to a separate file, hard link information will not
be lost even if the --no-hard-links option is given (however, mirror
files will not be linked). If many hard-linked files are present,
this option can drastically decrease memory usage.
.TP
.B --null-separator
Use nulls (\\0) instead of newlines (\\n) as line separators, which
may help when dealing with filenames containing newlines. This
affects the expected format of the files specified by the
--{include|exclude}-filelist[-stdin] switches as well as the format of
the directory statistics file.
.TP
.B --parsable-output
If set, rdiff-backup's output will be tailored for easy parsing by
computers, instead of convenience for humans. Currently this only
applies when listing increments using the
.B -l
or
.B --list-increments
switches, where the time will be given in seconds since the epoch.
.TP
.B --print-statistics
If set, summary statistics will be printed after a successful backup.
If not set, this information will still be available from the
session statistics file. See the
.B STATISTICS
section for more information.
.TP
.BI "--quoting-char " char
Use the specified character for quoting characters specified to be
escaped by the
.B --chars-to-quote
option. The default is the semicolon ";". See also
.BR --windows-mode .
.TP
.BI "-r, --restore-as-of " restore_time
Restore the specified directory as it was as of
.IR restore_time .
See the
.B TIME FORMATS
section for more information on the format of
.IR restore_time ,
and see the
.B RESTORING
section for more information on restoring.
.TP
.BI "--remote-cmd " command
This option has been deprecated as of version 0.4.1. Use
--remote-schema instead.
.TP
.BI "--remote-schema " schema
Specify an alternate method of connecting to a remote computer. This
is necessary to get rdiff-backup not to use ssh for remote backups, or
if, for instance, rdiff-backup is not in the PATH on the remote side.
See the
.B REMOTE OPERATION
section for more information.
.TP
.BI "--remove-older-than " time_spec
Remove the incremental backup information in the destination directory
that has been around longer than the given time.
.I time_spec
can be either an absolute time, like "2002-01-04", or a time interval.
The time interval is an integer followed by the character s, m, h, D,
W, M, or Y, indicating seconds, minutes, hours, days, weeks, months,
or years respectively, or a number of these concatenated. For
example, 32m means 32 minutes, and 3W2D10h7s means 3 weeks, 2 days, 10
hours, and 7 seconds. In this context, a month means 30 days, a year
is 365 days, and a day is always 86400 seconds.
rdiff-backup cannot remove-older-than and back up or restore in a
single session. If you want to, for instance, back up a directory and
remove old files in it, you must run rdiff-backup twice.
Note that snapshots of deleted files are covered by this operation.
Thus if you deleted a file two weeks ago, backed up immediately
afterwards, and then ran rdiff-backup with --remove-older-than 10D
today, no trace of that file would remain. Finally, file selection
options such as --include and --exclude don't affect
--remove-older-than.
.TP
.BI "--restrict " path
Require that all file access be inside the given path. This switch,
and the following two, are intended to be used with the --server
switch to provide a bit more protection when doing automated remote
backups. They are
.B not intended as your only line of defense
so please don't do something silly like allow public access to an
rdiff-backup server run with --restrict-read-only.
.TP
.BI "--restrict-read-only " path
Like
.BR --restrict ,
but also reject all write requests.
.TP
.BI "--restrict-update-only " path
Like
.BR --restrict ,
but only allow writes as part of an incremental backup. Requests for other types of writes (for instance, deleting
.IR path )
will be rejected.
.TP
.B --server
Enter server mode (not to be invoked directly, but instead used by
another rdiff-backup process on a remote computer).
.TP
.B --ssh-no-compression
When running ssh, do not use the -C option to enable compression.
.B --ssh-no-compression
is ignored if you specify a new schema using
.B --remote-schema.
.TP
.BI "--terminal-verbosity " [0-9]
Select which messages will be displayed to the terminal. If missing,
the level defaults to the verbosity level.
.TP
.B --test-server
Test for the presence of a compatible rdiff-backup server as specified
in the following host::filename argument(s). The filename section
will be ignored.
.TP
.BI -v [0-9] ", --verbosity " [0-9]
Specify verbosity level (0 is totally silent, 3 is the default, and 9
is noisiest). This determines how much is written to the log file.
.TP
.B "-V, --version"
Print the current version and exit
.TP
.B --windows-mode
This option quotes characters not allowable on windows, and does not
try to preserve ownership, hardlinks, or permissions on the
destination side. It is appropriate when backing up a normal unix
file system to a windows one such as VFAT, or a file system with
similar limitations. Because metadata is stored in a separate regular
file, this option does not prevent all data from being restored.
.TP
.B --windows-restore
This option turns on windows quoting, but does not disable
permissions, hard linking, or ownership. Use this when restoring from
an rdiff-backup directory on a windows file system to a unix file
system.
.SH EXAMPLES
Simplest case---backup directory foo to directory bar, with increments
in bar/rdiff-backup-data:
.PP
.RS
rdiff-backup foo bar
.PP
.RE
This is exactly the same as the previous example because trailing
slashes are ignored:
.PP
.RS
rdiff-backup foo/ bar/
.PP
.RE
Back files up from /home/bob to /mnt/backup, leaving increments in /mnt/backup/rdiff-backup-data. Do not back up directory /home/bob/tmp or any files in it.
.PP
.RS
rdiff-backup --exclude /home/bob/tmp /home/bob /mnt/backup
.PP
.RE
The file selection options can be combined in various ways. The
following command backs up the whole file system to /usr/local/backup.
However, the entire /usr directory is skipped, with the exception of
/usr/local, which is included, except for /usr/local/backup, which is
excluded to prevent a circularity:
.PP
.RS
rdiff-backup --exclude /usr/local/backup --include /usr/local --exclude /usr / /usr/local/backup
.PP
.RE
You can also use regular expressions in the --exclude statements.
This will skip any files whose full pathnames contain the word
"cache", or any files whose name is "tmp", "temp", "TMP", "tEmP", etc.
.PP
.RS
rdiff-backup --exclude-regexp cache --exclude-regexp '(?i)/te?mp$' /home/bob /mnt/backup
.PP
.RE
After the previous command was completed, this command will list the
backups present on the destination disk:
.PP
.RS
rdiff-backup --list-increments /mnt/backup
.PP
.RE
If space is running out on the /mnt/backup directory, older
incremental backups can be erased. The following command erases
backup information older than a week:
.PP
.RS
rdiff-backup --remove-older-than 7D /mnt/backup
.PP
.RE
The following reads the file
important-data.2001-07-15T04:09:38-07:00.dir and restores the
resulting directory important-data as it was on July 15, 2001,
calling the new directory "temp". Note that rdiff-backup goes into
restore mode because it recognizes the suffix of the file. The -v9
means keep lots of logging information.
.PP
.RS
rdiff-backup -v9 important-data.2001-07-15T04:09:38-07:00.dir temp
.PP
.RE
This command causes rdiff-backup to back up the directory
/some/local-dir to the directory /whatever/remote-dir on the machine
hostname.net. It uses ssh to open the necessary pipe to the remote
copy of rdiff-backup. Here the username on the local machine and on
hostname.net are the same.
.PP
.RS
rdiff-backup /some/local-dir hostname.net::/whatever/remote-dir
.PP
.RE
This command logs into hostname.net as smith and restores the remote
increment old-file on a remote computer to the current directory on
the local computer:
.PP
.RS
rdiff-backup smith@hostname.net::/foo/rdiff-backup-data/increments/bar/old-file.2001-11-09T12:43:53-04:00.diff
.PP
.RE
Backup foo on one remote machine to bar on another. This will
probably be slower than running rdiff-backup from either machine.
.PP
.RS
rdiff-backup smith@host1::foo jones@host2::bar
.PP
.RE
Test to see if the specified ssh command really opens up a working
rdiff-backup server on the remote side.
.RS
rdiff-backup --test-server hostname.net::/ignored
.RE
.SH RESTORING
There are two ways to tell rdiff-backup to restore a file or
directory. Firstly, you can run rdiff-backup on a mirror file and use
the
.B -r
or
.B --restore-as-of
options. Secondly, you can run it on an increment file.
.PP
For example, suppose in the past you have run:
.PP
.RS
rdiff-backup /usr /usr.backup
.PP
.RE
to back up the /usr directory into the /usr.backup directory, and now
want a copy of the /usr/local directory the way it was 3 days ago
placed at /usr/local.old.
.PP
One way to do this is to run:
.PP
.RS
rdiff-backup -r 3D /usr.backup/local /usr/local.old
.PP
.RE
where above the "3D" means 3 days (for other ways to specify the time,
see the
.B TIME FORMATS
section). The /usr.backup/local directory was selected, because that
is the directory containing the current version of /usr/local.
.PP
Note that the option to
.B --restore-as-of
always specifies an exact time. (So "3D" refers to the instant 72
hours before the present.) If there was no backup made at that time,
rdiff-backup restores the state recorded for the previous backup. For
instance, in the above case, if "3D" is used, and there are only
backups from 2 days and 4 days ago, /usr/local as it was 4 days ago
will be restored.
.PP
The second way to restore files involves finding the corresponding
increment file. It would be in the
/usr.backup/rdiff-backup-data/increments directory, and its name would
be something like "local.2002-11-09T12:43:53-04:00.dir" where the time
indicates it is from 3 days ago. Note that the increment files all
end in ".diff", ".snapshot", ".dir", or ".missing", where ".missing"
just means that the file didn't exist at that time (finally, some of
these may be gzip-compressed, and have an extra ".gz" to indicate
this). Then running:
.PP
.RS
rdiff-backup /usr.backup/rdiff-backup-data/increments/local.<time>.dir /usr/local.old
.PP
.RE
would also restore the file as desired.
.PP
If you are not sure exactly which version of a file you need, it is
probably easiest to either restore from the increments files as
described immediately above, or to see which increments are available
with -l/--list-increments, and then specify exact times into
-r/--restore-as-of.
.SH TIME FORMATS
rdiff-backup uses time strings in two places. Firstly, all of the
increment files rdiff-backup creates will have the time in their
filenames in the w3 datetime format as described in a w3 note at
http://www.w3.org/TR/NOTE-datetime. Basically they look like
"2001-07-15T04:09:38-07:00", which means what it looks like. The
"-07:00" section means the time zone is 7 hours behind UTC.
.PP
Secondly, the
.BI -r , " --restore-as-of" ", and " --remove-older-than
options take a time string, which can be given in any of several
formats:
.IP 1.
the string "now" (refers to the current time)
.IP 2.
a sequence of digits, like "123456890" (indicating the time in
seconds since the epoch)
.IP 3.
A string like "2002-01-25T07:00:00+02:00" in datetime format
.IP 4.
An interval, which is a number followed by one of the characters s, m,
h, D, W, M, or Y (indicating seconds, minutes, hours, days, weeks,
months, or years respectively), or a series of such pairs. In this
case the string refers to the time that preceded the current time by
the length of the interval. For instance, "1h78m" indicates the time
that was one hour and 78 minutes ago. The calendar here is
unsophisticated: a month is always 30 days, a year is always 365 days,
and a day is always 86400 seconds.
.IP 5.
A date format of the form YYYY/MM/DD, YYYY-MM-DD, MM/DD/YYYY, or
MM-DD-YYYY, which indicates midnight on the day in question, relative
to the current timezone settings. For instance, "2002/3/5",
"03-05-2002", and "2002-3-05" all mean March 5th, 2002.
.SH REMOTE OPERATION
In order to access remote files, rdiff-backup opens up a pipe to a
copy of rdiff-backup running on the remote machine. Thus rdiff-backup
must be installed on both ends. To open this pipe, rdiff-backup first
splits the filename into host_info::pathname. It then substitutes
host_info into the remote schema, and runs the resulting command,
reading its input and output.
.PP
The default remote schema is 'ssh -C %s rdiff-backup --server', meaning
that if the host_info is user@host.net, then rdiff-backup runs 'ssh -C
user@host.net rdiff-backup --server'. The '%s' keyword is substituted
with the host_info. Using --remote-schema, rdiff-backup can invoke an
arbitrary command in order to open up a remote pipe. For instance,
.RS
rdiff-backup --remote-schema 'cd /usr; %s' foo 'rdiff-backup
--server'::bar
.RE
is basically equivalent to (but slower than)
.RS
rdiff-backup foo /usr/bar
.RE
.PP
Concerning quoting, if for some reason you need to put two consecutive
colons in the host_info section of a host_info::pathname argument, or
in the pathname of a local file, you can quote one of them by
prepending a backslash. So in 'a\\::b::c', host_info is 'a::b' and
the pathname is 'c'. Similarly, if you want to refer to a local file
whose filename contains two consecutive colons, like 'strange::file',
you'll have to quote one of the colons as in 'strange\\::file'.
Because the backslash is a quote character in these circumstances, it
too must be quoted to get a literal backslash, so 'foo\\::\\\\bar'
evaluates to 'foo::\\bar'. To make things more complicated, because
the backslash is also a common shell quoting character, you may need
to type in '\\\\\\\\' at the shell prompt to get a literal backslash
(if it makes you feel better, I had to type in 8 backslashes to get
that in this man page...). And finally, to include a literal % in the
string specified by --remote-schema, quote it with another %, as in
%%.
.SH FILE SELECTION
.B rdiff-backup
supports file selection options similar to (but different from)
.BR rsync (1).
The system may appear complicated, but it is supposed to be flexible
and easy-to-use.
When rdiff-backup is run, it searches through the given source
directory and backs up all the files specified by the file selection
system. The file selection system comprises a number of file
selection conditions, which are set using one of the following command
line options:
.BR --exclude , --exclude-device-files , --exclude-filelist ,
.BR --exclude-globbing-filelist ,
.BR --exclude-filelist-stdin , --exclude-regexp , --exclude-special-files ,
.BR --include ,
.BR --include-filelist , --include-globbing-filelist ,
.BR --include-filelist-stdin ,
and
.BR --include-regexp .
Each file selection condition either matches or doesn't match a given
file. A given file is excluded by the file selection system exactly
when the first matching file selection condition specifies that the
file be excluded; otherwise the file is included.
For instance,
.PP
.RS
rdiff-backup --include /usr --exclude /usr /usr /backup
.PP
.RE
is exactly the same as
.PP
.RS
rdiff-backup /usr /backup
.PP
.RE
because the include and exclude directives match exactly the same
files, and the
.B --include
comes first, giving it precedence. Similarly,
.PP
.RS
rdiff-backup --include /usr/local/bin --exclude /usr/local /usr /backup
.PP
.RE
would back up the /usr/local/bin directory (and its contents), but not
/usr/local/doc.
The
.BR include ,
.BR exclude ,
.BR include-globbing-filelist ,
and
.B exclude-globbing-filelist
options accept
.IR "extended shell globbing patterns" .
These patterns can contain the special patterns
.BR * ,
.BR ** ,
.BR ? ,
and
.BR [...] .
As in a normal shell,
.B *
can be expanded to any string of characters not containing "/",
.B ?
expands to any character except "/", and
.B [...]
expands to a single character of those characters specified (ranges
are acceptable). The new special pattern,
.BR ** ,
expands to any string of characters whether or not it contains "/".
Furthermore, if the pattern starts with "ignorecase:", this prefix
will be removed and the pattern will match case-insensitively: any
character in the string can be replaced with an upper- or lowercase
version of itself.
Remember that you may need to quote these characters when typing them
into a shell, so the shell does not interpret the globbing patterns
before rdiff-backup sees them.
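For example, to back up only the C source files under /home, something
like
.PP
.RS
rdiff-backup --include '/home/**.c' --exclude '**' /home /backup
.PP
.RE
should work: the include pattern selects every filename ending in
".c" (and, as described below, the directories containing such files),
while the final --exclude '**' drops everything else. Note the single
quotes, which keep the shell from expanding the patterns itself.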
The
.BI "--exclude " pattern
option matches a file iff:
.TP
.B 1.
.I pattern
can be expanded into the file's filename, or
.TP
.B 2.
the file is inside a directory matched by the option.
.PP
.RE
Conversely,
.BI "--include " pattern
matches a file iff:
.TP
.B 1.
.I pattern
can be expanded into the file's filename,
.TP
.B 2.
the file is inside a directory matched by the option, or
.TP
.B 3.
the file is a directory which contains a file matched by the option.
.PP
.RE
For example,
.PP
.RS
.B --exclude
/usr/local
.PP
.RE
matches /usr/local, /usr/local/lib, and /usr/local/lib/netscape. It
is the same as --exclude /usr/local --exclude '/usr/local/**'.
.PP
.RS
.B --include
/usr/local
.PP
.RE
specifies that /usr, /usr/local, /usr/local/lib, and
/usr/local/lib/netscape (but not /usr/doc) all be backed up. Thus you
don't have to worry about including parent directories to make sure
that included subdirectories have somewhere to go. Finally,
.PP
.RS
.B --include
ignorecase:'/usr/[a-z0-9]foo/*/**.py'
.PP
.RE
would match a file like /usR/5fOO/hello/there/world.py. If it did
match anything, it would also match /usr. If there is no existing
file that the given pattern can be expanded into, the option will not
match /usr.
The
.BR --include-filelist ,
.BR --exclude-filelist ,
.BR --include-filelist-stdin ,
and
.B --exclude-filelist-stdin
options also introduce file selection conditions. They direct
rdiff-backup to read in a file, each line of which is a file
specification, and to include or exclude the matching files. Lines
are separated by newlines or nulls, depending on whether the
--null-separator switch was given. Each line in a filelist is
interpreted similarly to the way
.I extended shell patterns
are, with a few exceptions:
.TP
.B 1.
Globbing patterns like
.BR * ,
.BR ** ,
.BR ? ,
and
.B [...]
are not expanded.
.TP
.B 2.
Include patterns do not match files in a directory that is included.
So /usr/local in an include file will not match /usr/local/doc.
.TP
.B 3.
Lines starting with "+ " are interpreted as include directives, even
if found in a filelist referenced by
.BR --exclude-filelist .
Similarly, lines starting with "- " exclude files even if they are
found within an include filelist.
.RE
For example, if the file "list.txt" contains the lines:
.RS
/usr/local
.RE
.RS
- /usr/local/doc
.RE
.RS
/usr/local/bin
.RE
.RS
+ /var
.RE
.RS
- /var
.RE
then "--include-filelist list.txt" would include /usr, /usr/local, and
/usr/local/bin. It would exclude /usr/local/doc,
/usr/local/doc/python, etc. It neither excludes nor includes
/usr/local/man, leaving the fate of this directory to the next
specification condition. Finally, it is undefined what happens with
/var. A single file list should not contain conflicting file
specifications.
The
.B --include-globbing-filelist
and
.B --exclude-globbing-filelist
options also specify filelists, but each line in the filelist will be
interpreted as a globbing pattern the way
.B --include
and
.B --exclude
options are interpreted (although "+ " and "- " prefixing is still
allowed). For instance, if the file "globbing-list.txt" contains the
lines:
.RE
.RS
dir/foo
.RE
.RS
+ dir/bar
.RE
.RS
- **
.RE
Then "--include-globbing-filelist globbing-list.txt" would be exactly
the same as specifying "--include dir/foo --include dir/bar --exclude **"
on the command line.
Finally, the
.B --include-regexp
and
.B --exclude-regexp
options allow files to be included and excluded if their filenames
match a Python regular expression. Regular expression syntax is too
complicated to explain here, but is covered in Python's library
reference. Unlike the
.B --include
and
.B --exclude
options, the regular expression options don't match files containing
or contained in matched files. So for instance
.PP
.RS
--include '[0-9]{7}(?!foo)'
.PP
.RE
matches any files whose full pathnames contain 7 consecutive digits
which aren't followed by 'foo'. However, it wouldn't match /home even
if /home/ben/1234567 existed.
.SH STATISTICS
Every session rdiff-backup saves various statistics into two files,
the session statistics file at
rdiff-backup-data/session_statistics.<time>.data and the directory
statistics file at rdiff-backup-data/directory_statistics.<time>.data.
They are both text files and contain similar information: how many
files changed, how many were deleted, the total size of increment
files created, etc. However, the session statistics file is intended
to be very readable and only describes the session as a whole. The
directory statistics file is more compact (and slightly less readable)
but describes every directory backed up. It also may be compressed to
save space.
Statistics related options include
.B --print-statistics
and
.BR --null-separator .
Also, rdiff-backup will save various messages to the log file, which
is rdiff-backup-data/backup.log for backup sessions and
rdiff-backup-data/restore.log for restore sessions. Generally what is
written to this file will coincide with the messages displayed to
stdout or stderr, although this can be changed with the
.B --terminal-verbosity
option.
The log file is not compressed and can become quite large if
rdiff-backup is run with high verbosity.
.SH BUGS
rdiff-backup uses the shell command
.BR mknod (1)
to back up device files (e.g. /dev/ttyS0), so device files won't be
handled correctly on systems with non-standard mknod syntax.
.PP
Files whose names are close to the maximum length (e.g. 235 chars if
the maximum is 255) may be skipped because the filenames of related
increment files would be too long.
.PP
The gzip library in versions 2.2 and earlier of python (but fixed in
2.3a1) has trouble producing files over 2GB in length. This bug will
prevent rdiff-backup from producing large compressed increments
(snapshots or diffs). A workaround is to disable compression for
large incompressible files.
.SH AUTHOR
Ben Escoto <bescoto@stanford.edu>
.PP
Feel free to ask me questions or send me bug reports, but you may want to see the web page, mentioned below, first.
.SH SEE ALSO
.BR python (1),
.BR rdiff (1),
.BR rsync (1),
.BR ssh (1).
The main rdiff-backup web page is at
.IR http://rdiff-backup.stanford.edu/ .
It has more information, links to the mailing list and CVS, etc.
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Coordinate corresponding files with different names
For instance, some source filenames may contain characters not allowed
on the mirror end. These files must be called something different on
the mirror end, so we escape the offending characters with semicolons.
One problem/complication is that all this escaping may put files over
the 256 or whatever limit on the length of file names. (We just don't
handle that error.)
"""
import re
import Globals, log, rpath
max_filename_length = 255
# If set, enable character quoting.  The value is a string of
# characters to quote, given as a regex-style range.
chars_to_quote = None
# These compiled regular expressions are used in quoting and unquoting
chars_to_quote_regexp = None
unquoting_regexp = None
# Use given char to quote. Default is set in Globals.
quoting_char = None
class QuotingException(Exception): pass
def set_init_quote_vals():
"""Set quoting value from Globals on all conns"""
for conn in Globals.connections:
conn.FilenameMapping.set_init_quote_vals_local()
def set_init_quote_vals_local():
"""Set value on local connection, initialize regexps"""
global chars_to_quote, quoting_char
chars_to_quote = Globals.chars_to_quote
if len(Globals.quoting_char) != 1:
log.Log.FatalError("Expected single character for quoting char,"
"got '%s' instead" % (Globals.quoting_char,))
quoting_char = Globals.quoting_char
init_quoting_regexps()
def init_quoting_regexps():
"""Compile quoting regular expressions"""
global chars_to_quote_regexp, unquoting_regexp
try:
chars_to_quote_regexp = \
re.compile("[%s]|%s" % (chars_to_quote, quoting_char), re.S)
unquoting_regexp = re.compile("%s[0-9]{3}" % quoting_char, re.S)
	except re.error, e:
		log.Log.FatalError("Error '%s' when processing char quote list %s" %
						   (e, chars_to_quote))
def quote(path):
"""Return quoted version of given path
Any characters quoted will be replaced by the quoting char and
the ascii number of the character. For instance, "10:11:12"
would go to "10;05811;05812" if ":" were quoted and ";" were
the quoting character.
"""
return chars_to_quote_regexp.sub(quote_single, path)
def quote_single(match):
"""Return replacement for a single character"""
return "%s%03d" % (quoting_char, ord(match.group()))
def unquote(path):
"""Return original version of quoted filename"""
return unquoting_regexp.sub(unquote_single, path)
def unquote_single(match):
"""Unquote a single quoted character"""
	if len(match.group()) != 4:
raise QuotingException("Quoted group wrong size: " + match.group())
try: return chr(int(match.group()[1:]))
except ValueError:
raise QuotingException("Quoted out of range: " + match.group())
class QuotedRPath(rpath.RPath):
"""RPath where the filename is quoted version of index
We use QuotedRPaths so we don't need to remember to quote RPaths
derived from this one (via append or new_index). Note that only
the index is quoted, not the base.
"""
def __init__(self, connection, base, index = (), data = None):
"""Make new QuotedRPath"""
quoted_index = tuple(map(quote, index))
rpath.RPath.__init__(self, connection, base, quoted_index, data)
self.index = index
def listdir(self):
"""Return list of unquoted filenames in current directory
We want them unquoted so that the results can be sorted
		correctly and append()ed to the correct QuotedRPath.
"""
return map(unquote, self.conn.os.listdir(self.path))
def __str__(self):
return "QuotedPath: %s\nIndex: %s\nData: %s" % \
(self.path, self.index, self.data)
def isincfile(self):
"""Return true if path indicates increment, sets various variables"""
result = rpath.RPath.isincfile(self)
if result: self.inc_basestr = unquote(self.inc_basestr)
return result
def get_quotedrpath(rp, separate_basename = 0):
"""Return quoted version of rpath rp"""
	assert not rp.index # Why would we start quoting "in the middle"?
if separate_basename:
dirname, basename = rp.dirsplit()
return QuotedRPath(rp.conn, dirname, (unquote(basename),), rp.data)
else: return QuotedRPath(rp.conn, rp.base, (), rp.data)
def get_quoted_sep_base(filename):
"""Get QuotedRPath from filename assuming last bit is quoted"""
return get_quotedrpath(rpath.RPath(Globals.local_connection, filename), 1)
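# Illustrative sketch (hypothetical values): with chars_to_quote "A-Z"
# and quoting_char ";", appending to a QuotedRPath quotes the new index
# component on disk but leaves the logical index alone:
#
#   qrp = get_quotedrpath(rpath.RPath(Globals.local_connection, "/backup"))
#   qrp.append("Mixed").path   => "/backup/;077ixed"  (ord("M") == 77)
#   qrp.append("Mixed").index  => ("Mixed",)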
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Hold a variety of constants usually set at initialization."""
import re, os
# The current version of rdiff-backup
version = "$version"
# If this is set, use this value in seconds as the current time
# instead of reading it from the clock.
current_time = None
# This determines how many bytes to read at a time when copying
blocksize = 32768
# This is used by the BufferedRead class to determine how many
# bytes to request from the underlying file per read(). Larger
# values may save on connection overhead and latency.
conn_bufsize = 98304
# This is used in rorpiter.CacheIndexable. The number represents the
# number of rpaths which may be stuck in buffers when moving over a
# remote connection.
pipeline_max_length = 500
# True if script is running as a server
server = None
# uid and gid of the owner of the rdiff-backup process. This can
# vary depending on the connection.
process_uid = os.getuid()
process_gid = os.getgid()
# If true, when copying attributes, also change target's uid/gid
change_ownership = None
# If true, when copying attributes, also change target's permission.
change_permissions = 1
# If true, change the permissions of unwriteable mirror files
# (such as directories) so that they can be written, and then
# change them back. This defaults to 1 just in case the process
# is not running as root (root doesn't need to change
# permissions).
change_mirror_perms = (process_uid != 0)
# If true, try to reset the atimes of the source partition.
preserve_atime = None
# This will be set as soon as the LocalConnection class loads
local_connection = None
# All connections should be added to the following list, so
# further global changes can be propagated to the remote systems.
# The first element should be Globals.local_connection. For a
# server, the second is the connection to the client.
connections = []
# Each process should have a connection number unique to the
# session. The client has connection number 0.
connection_number = 0
# Dictionary pairing connection numbers with connections. Set in
# SetConnections for all connections.
connection_dict = {}
# True if the script is the end that reads the source directory
# for backups. It is true for purely local sessions.
isbackup_reader = None
# Connection of the real backup reader (for which isbackup_reader
# is true)
backup_reader = None
# True if the script is the end that writes to the increment and
# mirror directories. True for purely local sessions.
isbackup_writer = None
# Connection of the backup writer
backup_writer = None
# Connection of the client
client_conn = None
# This list is used by the set function below. When a new
# connection is created with init_connection, its Globals class
# will match this one for all the variables mentioned in this
# list.
changed_settings = []
# The RPath or QuotedRPath of the rdiff-backup-data directory.
rbdir = None
# quoting_enabled is true if we should quote certain characters in
# filenames on the source side (see FilenameMapping for more
# info). chars_to_quote is a string whose characters should be
# quoted, and quoting_char is the character to quote with.
quoting_enabled = None
chars_to_quote = "A-Z:"
quoting_char = ';'
# If true, emit output intended to be easily readable by a
# computer. False means output is intended for humans.
parsable_output = None
# If true, then hardlinks will be preserved to mirror and recorded
# in the increments directory.  There is also a difference here
# between None and 0.  When restoring, None or 1 means to preserve
# hardlinks iff a hardlink dictionary can be found.  0 means to ignore
# hardlink information regardless.
preserve_hardlinks = 1
# If this is false, then rdiff-backup will not compress any
# increments. Default is to compress based on regexp below.
compression = 1
# Increments based on files whose names match this
# case-insensitive regular expression won't be compressed (applies
# to .snapshots and .diffs). The second below will be the
# compiled version of the first.
no_compression_regexp_string = "(?i).*\\.(gz|z|bz|bz2|tgz|zip|rpm|deb|" \
"jpg|gif|png|jp2|mp3|ogg|avi|wmv|mpeg|mpg|rm|mov)$"
no_compression_regexp = None
# If true, filelists and directory statistics will be split on
# nulls instead of newlines.
null_separator = None
# Determines whether or not ssh will be run with the -C switch
ssh_compression = 1
# If true, print statistics after successful backup
print_statistics = None
# Controls whether file_statistics file is written in
# rdiff-backup-data dir. These can sometimes take up a lot of space.
file_statistics = 1
# On the writer connection, the following will be set to the mirror
# Select iterator.
select_mirror = None
# On the backup writer connection, holds the root incrementing branch
# object. Access is provided to increment error counts.
ITRB = None
# security_level has 4 values and controls which requests from remote
# systems will be honored. "all" means anything goes. "read-only"
# means that the requests must not write to disk. "update-only" means
# that requests shouldn't destructively update the disk (but normal
# incremental updates are OK). "minimal" means only listen to a few
# basic requests.
security_level = "all"
# If this is set, it indicates that the remote connection should only
# deal with paths inside of restrict_path.
restrict_path = None
# If set, a file will be marked as changed if its inode changes. See
# the man page under --no-compare-inode for more information.
compare_inode = 1
# If set, directories can be fsync'd just like normal files, to
# guarantee that any changes have been committed to disk.
fsync_directories = 1
# If set, directory increments are given the same permissions as the
# directories they represent. Otherwise they have the default
# permissions.
change_dir_inc_perms = 1
def get(name):
"""Return the value of something in this module"""
return globals()[name]
def is_not_None(name):
"""Returns true if value is not None"""
return globals()[name] is not None
def set(name, val):
"""Set the value of something in this module
Use this instead of writing the values directly if the setting
matters to remote sides. This function updates the
changed_settings list, so other connections know to copy the
changes.
"""
changed_settings.append(name)
globals()[name] = val
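# Sketch of the intended usage (names as in this module):
#
#   Globals.set('compression', None)  # applied locally and recorded
#                                     # in changed_settings
#
# When a new connection is initialized later, every name recorded in
# changed_settings is replayed onto the remote side's Globals module.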
def set_integer(name, val):
"""Like set, but make sure val is an integer"""
	try: intval = int(val)
	except ValueError:
		from log import Log # imported here to avoid an import cycle at load time
		Log.FatalError("Variable %s must be set to an integer -\n"
					   "received %s instead." % (name, val))
set(name, intval)
def set_float(name, val, min = None, max = None, inclusive = 1):
"""Like set, but make sure val is float within given bounds"""
	def error():
		from log import Log # imported here to avoid an import cycle at load time
		s = "Variable %s must be set to a float" % (name,)
if min is not None and max is not None:
s += " between %s and %s " % (min, max)
if inclusive: s += "inclusive"
else: s += "not inclusive"
elif min is not None or max is not None:
if inclusive: inclusive_string = "or equal to "
else: inclusive_string = ""
if min is not None:
s += " greater than %s%s" % (inclusive_string, min)
			else: s += " less than %s%s" % (inclusive_string, max)
Log.FatalError(s)
try: f = float(val)
except ValueError: error()
if min is not None:
if inclusive and f < min: error()
elif not inclusive and f <= min: error()
if max is not None:
if inclusive and f > max: error()
elif not inclusive and f >= max: error()
set(name, f)
def get_dict_val(name, key):
"""Return val from dictionary in this class"""
return globals()[name][key]
def set_dict_val(name, key, val):
"""Set value for dictionary in this class"""
globals()[name][key] = val
def postset_regexp(name, re_string, flags = None):
"""Compile re_string on all existing connections, set to name"""
for conn in connections:
conn.Globals.postset_regexp_local(name, re_string, flags)
def postset_regexp_local(name, re_string, flags):
"""Set name to compiled re_string locally"""
if flags: globals()[name] = re.compile(re_string, flags)
else: globals()[name] = re.compile(re_string)
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Preserve and restore hard links
If the preserve_hardlinks option is selected, linked files in the
source directory will be linked in the mirror directory. Linked files
are treated like any other with respect to incrementing, but their
link status can be retrieved because their device location and inode #
is written in the metadata file.
All these functions are meant to be executed on the mirror side. The
source side should only transmit inode information.
"""
from __future__ import generators
import cPickle
import Globals, Time, rpath, log, robust
# In all of these dictionaries, lists of indices are the values.  The
# keys in the _inode_ ones are (inode, devloc) pairs.
_src_inode_indicies = None
_dest_inode_indicies = None
# The keys for these two are just indices.  They share values
# with the earlier dictionaries.
_src_index_indicies = None
_dest_index_indicies = None
# When a linked file is restored, its path is added to this dict,
# so it can be found when later paths being restored are linked to
# it.
_restore_index_path = None
def initialize_dictionaries():
"""Set all the hard link dictionaries to empty"""
global _src_inode_indicies, _dest_inode_indicies
global _src_index_indicies, _dest_index_indicies, _restore_index_path
_src_inode_indicies = {}
_dest_inode_indicies = {}
_src_index_indicies = {}
_dest_index_indicies = {}
_restore_index_path = {}
def clear_dictionaries():
"""Delete all dictionaries"""
global _src_inode_indicies, _dest_inode_indicies
global _src_index_indicies, _dest_index_indicies, _restore_index_path
_src_inode_indicies = _dest_inode_indicies = None
_src_index_indicies = _dest_index_indicies = _restore_index_path = None
def get_inode_key(rorp):
"""Return rorp's key for _inode_ dictionaries"""
return (rorp.getinode(), rorp.getdevloc())
def get_indicies(rorp, source):
"""Return a list of similarly linked indicies, using rorp's index"""
if source: dict = _src_index_indicies
else: dict = _dest_index_indicies
try: return dict[rorp.index]
except KeyError: return []
def add_rorp(rorp, source):
"""Process new rorp and update hard link dictionaries
First enter it into src_inode_indicies. If we have already
seen all the hard links, then we can delete the entry.
Everything must stay recorded in src_index_indicies though.
"""
if not rorp.isreg() or rorp.getnumlinks() < 2: return
if source:
inode_dict, index_dict = _src_inode_indicies, _src_index_indicies
else: inode_dict, index_dict = _dest_inode_indicies, _dest_index_indicies
rp_inode_key = get_inode_key(rorp)
if inode_dict.has_key(rp_inode_key):
index_list = inode_dict[rp_inode_key]
index_list.append(rorp.index)
if len(index_list) == rorp.getnumlinks():
del inode_dict[rp_inode_key]
else: # make new entry in both src dicts
index_list = [rorp.index]
inode_dict[rp_inode_key] = index_list
index_dict[rorp.index] = index_list
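# Illustration: a file hard linked at indices ("a",) and ("b",) with
# getnumlinks() == 2.  add_rorp of ("a",) creates the shared list
# [("a",)]; add_rorp of ("b",) appends to it, sees that all links are
# now accounted for, and drops the (inode, devloc) entry, while the
# index entries keep sharing the completed list.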
def add_rorp_iter(iter, source):
"""Return new rorp iterator like iter that add_rorp's first"""
for rorp in iter:
add_rorp(rorp, source)
yield rorp
def rorp_eq(src_rorp, dest_rorp):
"""Compare hardlinked for equality
Two files may otherwise seem equal but be hardlinked in
different ways. This function considers them equal enough if
they have been hardlinked correctly to the previously seen
indicies.
"""
if (not src_rorp.isreg() or not dest_rorp.isreg() or
src_rorp.getnumlinks() == dest_rorp.getnumlinks() == 1):
return 1 # Hard links don't apply
src_index_list = get_indicies(src_rorp, 1)
dest_index_list = get_indicies(dest_rorp, None)
# If a list only has one element, then it is only hardlinked
# to itself so far, so that is not a genuine difference yet.
if not src_index_list or len(src_index_list) == 1:
return not dest_index_list or len(dest_index_list) == 1
if not dest_index_list or len(dest_index_list) == 1: return None
# Both index lists exist and are non-empty
return src_index_list == dest_index_list # they are always sorted
def islinked(rorp):
"""True if rorp's index is already linked to something on src side"""
return len(get_indicies(rorp, 1)) >= 2
def get_link_index(rorp):
"""Return first index on target side rorp is already linked to"""
return get_indicies(rorp, 1)[0]
def restore_link(index, rpath):
"""Restores a linked file by linking it
When restoring, all the hardlink data is already present, and
we can only link to something already written. In either
case, add to the _restore_index_path dict, so we know later
	that the file is available for hard linking.
Returns true if succeeded in creating rpath, false if must
restore rpath normally.
"""
if index not in _src_index_indicies: return None
for linked_index in _src_index_indicies[index]:
if linked_index in _restore_index_path:
srcpath = _restore_index_path[linked_index]
log.Log("Restoring %s by hard linking to %s" %
(rpath.path, srcpath), 6)
rpath.hardlink(srcpath)
return 1
_restore_index_path[index] = rpath.path
return None
def link_rp(diff_rorp, dest_rpath, dest_root = None):
"""Make dest_rpath into a link using link flag in diff_rorp"""
if not dest_root: dest_root = dest_rpath # use base of dest_rpath
dest_link_rpath = dest_root.new_index(diff_rorp.get_link_flag())
dest_rpath.hardlink(dest_link_rpath.path)
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Start (and end) here - read arguments, set global settings, etc."""
from __future__ import generators
import getopt, sys, re, os
from log import Log, LoggerError, ErrorLog
import Globals, Time, SetConnections, selection, robust, rpath, \
	   manage, backup, connection, restore, FilenameMapping, \
	   Security, Hardlink, regress, C, statistics
action = None
remote_cmd, remote_schema = None, None
force = None
select_opts = []
select_files = []
def parse_cmdlineoptions(arglist):
"""Parse argument list and set global preferences"""
global args, action, force, restore_timestr, remote_cmd, remote_schema
global remove_older_than_string
def sel_fl(filename):
"""Helper function for including/excluding filelists below"""
try: return open(filename, "r")
except IOError: Log.FatalError("Error opening file %s" % filename)
try: optlist, args = getopt.getopt(arglist, "blr:sv:V",
["backup-mode", "calculate-average", "chars-to-quote=",
"check-destination-dir", "current-time=", "exclude=",
"exclude-device-files", "exclude-filelist=",
"exclude-filelist-stdin", "exclude-globbing-filelist=",
"exclude-mirror=", "exclude-other-filesystems",
"exclude-regexp=", "exclude-special-files", "force",
"include=", "include-filelist=", "include-filelist-stdin",
"include-globbing-filelist=", "include-regexp=",
"list-at-time=", "list-changed-since=", "list-increments",
"no-change-dir-inc-perms", "no-compare-inode",
"no-compression", "no-compression-regexp=",
"no-file-statistics", "no-hard-links", "null-separator",
"parsable-output", "print-statistics", "quoting-char=",
"remote-cmd=", "remote-schema=", "remove-older-than=",
"restore-as-of=", "restrict=", "restrict-read-only=",
"restrict-update-only=", "server", "ssh-no-compression",
"terminal-verbosity=", "test-server", "verbosity=",
"version", "windows-mode", "windows-restore"])
except getopt.error, e:
commandline_error("Bad commandline options: %s" % str(e))
for opt, arg in optlist:
if opt == "-b" or opt == "--backup-mode": action = "backup"
elif opt == "--calculate-average": action = "calculate-average"
elif opt == "--check-destination-dir": action = "check-destination-dir"
elif opt == "--chars-to-quote":
Globals.set('chars_to_quote', arg)
Globals.set('quoting_enabled', 1)
elif opt == "--current-time":
Globals.set_integer('current_time', arg)
elif opt == "--exclude": select_opts.append((opt, arg))
elif opt == "--exclude-device-files": select_opts.append((opt, arg))
elif opt == "--exclude-filelist":
select_opts.append((opt, arg))
select_files.append(sel_fl(arg))
elif opt == "--exclude-filelist-stdin":
select_opts.append(("--exclude-filelist", "standard input"))
select_files.append(sys.stdin)
elif opt == "--exclude-globbing-filelist":
select_opts.append((opt, arg))
select_files.append(sel_fl(arg))
elif (opt == "--exclude-other-filesystems" or
opt == "--exclude-regexp" or
opt == "--exclude-special-files"): select_opts.append((opt, arg))
elif opt == "--force": force = 1
elif opt == "--include": select_opts.append((opt, arg))
elif opt == "--include-filelist":
select_opts.append((opt, arg))
select_files.append(sel_fl(arg))
elif opt == "--include-filelist-stdin":
select_opts.append(("--include-filelist", "standard input"))
select_files.append(sys.stdin)
elif opt == "--include-globbing-filelist":
select_opts.append((opt, arg))
select_files.append(sel_fl(arg))
elif opt == "--include-regexp": select_opts.append((opt, arg))
elif opt == "--list-at-time":
restore_timestr, action = arg, "list-at-time"
elif opt == "--list-changed-since":
restore_timestr, action = arg, "list-changed-since"
elif opt == "-l" or opt == "--list-increments":
action = "list-increments"
elif opt == "--no-change-dir-inc-perms":
Globals.set("change_dir_inc_perms", 0)
elif opt == "--no-compare-inode": Globals.set("compare_inode", 0)
elif opt == "--no-compression": Globals.set("compression", None)
elif opt == "--no-compression-regexp":
Globals.set("no_compression_regexp_string", arg)
elif opt == "--no-file-statistics": Globals.set('file_statistics', 0)
elif opt == "--no-hard-links": Globals.set('preserve_hardlinks', 0)
elif opt == "--null-separator": Globals.set("null_separator", 1)
elif opt == "--parsable-output": Globals.set('parsable_output', 1)
elif opt == "--print-statistics": Globals.set('print_statistics', 1)
elif opt == "--quoting-char":
Globals.set('quoting_char', arg)
Globals.set('quoting_enabled', 1)
elif opt == "-r" or opt == "--restore-as-of":
restore_timestr, action = arg, "restore-as-of"
elif opt == "--remote-cmd": remote_cmd = arg
elif opt == "--remote-schema": remote_schema = arg
elif opt == "--remove-older-than":
remove_older_than_string = arg
action = "remove-older-than"
elif opt == "--restrict": Globals.restrict_path = arg
elif opt == "--restrict-read-only":
Globals.security_level = "read-only"
Globals.restrict_path = arg
elif opt == "--restrict-update-only":
Globals.security_level = "update-only"
Globals.restrict_path = arg
elif opt == "-s" or opt == "--server":
action = "server"
Globals.server = 1
elif opt == "--ssh-no-compression":
Globals.set('ssh_compression', None)
elif opt == "--terminal-verbosity": Log.setterm_verbosity(arg)
elif opt == "--test-server": action = "test-server"
elif opt == "-V" or opt == "--version":
print "rdiff-backup " + Globals.version
sys.exit(0)
elif opt == "-v" or opt == "--verbosity": Log.setverbosity(arg)
elif opt == "--windows-mode":
Globals.set('chars_to_quote', "^a-z0-9._ -")
Globals.set('quoting_enabled', 1)
Globals.set('preserve_hardlinks', 0)
Globals.set('change_ownership', 0)
Globals.set('change_permissions', 0)
Globals.set('fsync_directories', 0)
elif opt == '--windows-restore':
Globals.set('chars_to_quote', "^a-z0-9._ -")
Globals.set('quoting_enabled', 1)
else: Log.FatalError("Unknown option %s" % opt)
def isincfilename(path):
"""Return true if path is of a (possibly quoted) increment file"""
rp = rpath.RPath(Globals.local_connection, path)
if Globals.quoting_enabled:
if not FilenameMapping.quoting_char:
FilenameMapping.set_init_quote_vals()
rp = FilenameMapping.get_quotedrpath(rp, separate_basename = 1)
result = rp.isincfile()
return result
def set_action():
"""Check arguments and try to set action"""
global action
l = len(args)
if not action:
if l == 0: commandline_error("No arguments given")
elif l == 1: action = "restore"
elif l == 2:
if isincfilename(args[0]): action = "restore"
else: action = "backup"
else: commandline_error("Too many arguments given")
if l == 0 and action != "server":
commandline_error("No arguments given")
if l > 0 and action == "server":
commandline_error("Too many arguments given")
if l < 2 and (action == "backup" or action == "restore-as-of"):
commandline_error("Two arguments are required (source, destination).")
if l == 2 and (action == "list-increments" or
action == "remove-older-than" or
action == "list-at-time" or
action == "list-changed-since" or
action == "check-destination-dir"):
commandline_error("Only use one argument, "
"the root of the backup directory")
if l > 2 and action != "calculate-average":
commandline_error("Too many arguments given")
def commandline_error(message):
sys.stderr.write("Error: %s\n" % message)
sys.stderr.write("See the rdiff-backup manual page for instructions\n")
sys.exit(1)
def misc_setup(rps):
"""Set default change ownership flag, umask, relay regexps"""
os.umask(077)
Time.setcurtime(Globals.current_time)
FilenameMapping.set_init_quote_vals()
SetConnections.UpdateGlobal("client_conn", Globals.local_connection)
Globals.postset_regexp('no_compression_regexp',
Globals.no_compression_regexp_string)
for conn in Globals.connections:
conn.robust.install_signal_handlers()
conn.Hardlink.initialize_dictionaries()
def take_action(rps):
"""Do whatever action says"""
if action == "server":
connection.PipeConnection(sys.stdin, sys.stdout).Server()
sys.exit(0)
elif action == "backup": Backup(rps[0], rps[1])
elif action == "restore": Restore(*rps)
elif action == "restore-as-of": RestoreAsOf(rps[0], rps[1])
elif action == "test-server": SetConnections.TestConnections()
elif action == "list-at-time": ListAtTime(rps[0])
elif action == "list-changed-since": ListChangedSince(rps[0])
elif action == "list-increments": ListIncrements(rps[0])
elif action == "remove-older-than": RemoveOlderThan(rps[0])
elif action == "calculate-average": CalculateAverage(rps)
elif action == "check-destination-dir": CheckDest(rps[0])
else: raise AssertionError("Unknown action " + action)
def cleanup():
"""Do any last minute cleaning before exiting"""
Log("Cleaning up", 6)
if ErrorLog.isopen(): ErrorLog.close()
Log.close_logfile()
if not Globals.server: SetConnections.CloseConnections()
def Main(arglist):
"""Start everything up!"""
parse_cmdlineoptions(arglist)
set_action()
cmdpairs = SetConnections.get_cmd_pairs(args, remote_schema, remote_cmd)
Security.initialize(action, cmdpairs)
rps = map(SetConnections.cmdpair2rp, cmdpairs)
misc_setup(rps)
take_action(rps)
cleanup()
def Backup(rpin, rpout):
"""Backup, possibly incrementally, src_path to dest_path."""
if Globals.quoting_enabled:
rpout = FilenameMapping.get_quotedrpath(rpout)
SetConnections.BackupInitConnections(rpin.conn, rpout.conn)
backup_set_select(rpin)
backup_init_dirs(rpin, rpout)
if prevtime:
rpout.conn.Main.backup_touch_curmirror_local(rpin, rpout)
Time.setprevtime(prevtime)
backup.Mirror_and_increment(rpin, rpout, incdir)
rpout.conn.Main.backup_remove_curmirror_local()
else:
backup.Mirror(rpin, rpout)
rpout.conn.Main.backup_touch_curmirror_local(rpin, rpout)
def backup_set_select(rpin):
"""Create Select objects on source connection"""
rpin.conn.backup.SourceStruct.set_source_select(rpin, select_opts,
*select_files)
def backup_init_dirs(rpin, rpout):
"""Make sure rpin and rpout are valid, init data dir and logging"""
global datadir, incdir, prevtime
if rpout.lstat() and not rpout.isdir():
if not force: Log.FatalError("Destination %s exists and is not a "
"directory" % rpout.path)
else:
Log("Deleting %s" % rpout.path, 3)
rpout.delete()
if not rpin.lstat():
Log.FatalError("Source directory %s does not exist" % rpin.path)
elif not rpin.isdir():
Log.FatalError("Source %s is not a directory" % rpin.path)
datadir = rpout.append_path("rdiff-backup-data")
SetConnections.UpdateGlobal('rbdir', datadir)
checkdest_if_necessary(rpout)
incdir = datadir.append_path("increments")
prevtime = backup_get_mirrortime()
if rpout.lstat():
if rpout.isdir() and not rpout.listdir(): # rpout is empty dir
if Globals.change_permissions:
rpout.chmod(0700) # just make sure permissions aren't too lax
elif not datadir.lstat() and not force: Log.FatalError(
"""Destination directory
%s
exists, but does not look like a rdiff-backup directory. Running
rdiff-backup like this could mess up what is currently in it. If you
want to update or overwrite it, run rdiff-backup with the --force
option.""" % rpout.path)
if not rpout.lstat():
try: rpout.mkdir()
except os.error:
Log.FatalError("Unable to create directory %s" % rpout.path)
if not datadir.lstat(): datadir.mkdir()
inc_base = datadir.append_path("increments")
if not inc_base.lstat(): inc_base.mkdir()
if Log.verbosity > 0:
Log.open_logfile(datadir.append("backup.log"))
ErrorLog.open(Time.curtimestr, compress = Globals.compression)
backup_warn_if_infinite_regress(rpin, rpout)
def backup_warn_if_infinite_regress(rpin, rpout):
"""Warn user if destination area contained in source area"""
if rpout.conn is rpin.conn: # it's meaningful to compare paths
if ((len(rpout.path) > len(rpin.path)+1 and
rpout.path[:len(rpin.path)] == rpin.path and
rpout.path[len(rpin.path)] == '/') or
(rpin.path == "." and rpout.path[0] != '/' and
rpout.path[:2] != '..')):
# Just a few heuristics, we don't have to get every case
if Globals.backup_reader.Globals.select_source.Select(rpout): Log(
"""Warning: The destination directory '%s' may be contained in the
source directory '%s'. This could cause an infinite regress. You
may need to use the --exclude option.""" % (rpout.path, rpin.path), 2)
def backup_get_mirrortime():
"""Return time in seconds of previous mirror, or None if cannot"""
incbase = Globals.rbdir.append_path("current_mirror")
mirror_rps = restore.get_inclist(incbase)
assert len(mirror_rps) <= 1, \
"Found %s current_mirror rps, expected <=1" % (len(mirror_rps),)
if mirror_rps: return mirror_rps[0].getinctime()
else: return None
def backup_touch_curmirror_local(rpin, rpout):
"""Make a file like current_mirror.time.data to record time
When doing an incremental backup, this should happen before any
other writes, and the file should be removed after all writes.
That way we can tell whether the previous session aborted if there
are two current_mirror files.
When doing the initial full backup, the file can be created after
everything else is in place.
"""
mirrorrp = Globals.rbdir.append("current_mirror.%s.%s" % (Time.curtimestr,
"data"))
Log("Touching mirror marker %s" % mirrorrp.path, 6)
mirrorrp.touch()
mirrorrp.fsync_with_dir()
def backup_remove_curmirror_local():
"""Remove the older of the current_mirror files. Use at end of session"""
assert Globals.rbdir.conn is Globals.local_connection
curmir_incs = restore.get_inclist(Globals.rbdir.append("current_mirror"))
assert len(curmir_incs) == 2
if curmir_incs[0].getinctime() < curmir_incs[1].getinctime():
older_inc = curmir_incs[0]
else: older_inc = curmir_incs[1]
C.sync() # Make sure everything is written before curmirror is removed
older_inc.delete()
def Restore(src_rp, dest_rp = None):
"""Main restoring function
Here src_rp should be an increment file, and if dest_rp is
missing it defaults to the base of the increment.
"""
rpin, rpout = restore_check_paths(src_rp, dest_rp)
restore_common(rpin, rpout, rpin.getinctime())
def RestoreAsOf(rpin, target):
"""Secondary syntax for restore operation
rpin - RPath of mirror file to restore (not nec. with correct index)
target - RPath of place to put restored file
"""
rpin, rpout = restore_check_paths(rpin, target, 1)
try: time = Time.genstrtotime(restore_timestr)
except Time.TimeException, exc: Log.FatalError(str(exc))
restore_common(rpin, target, time)
def restore_common(rpin, target, time):
"""Restore operation common to Restore and RestoreAsOf"""
if target.conn.os.getuid() == 0:
SetConnections.UpdateGlobal('change_ownership', 1)
mirror_root, index = restore_get_root(rpin)
restore_check_backup_dir(mirror_root)
mirror = mirror_root.new_index(index)
inc_rpath = datadir.append_path('increments', index)
restore_set_select(mirror_root, target)
restore_start_log(rpin, target, time)
restore.Restore(mirror, inc_rpath, target, time)
Log("Restore ended", 4)
def restore_set_select(mirror_rp, target):
"""Set the selection iterator on mirror side from command line args
Here we set the selector on the mirror side, because that is where
we will be filtering, but the pathnames are relative to the target
directory.
"""
if select_opts:
mirror_rp.conn.restore.MirrorStruct.set_mirror_select(
target, select_opts, *select_files)
def restore_start_log(rpin, target, time):
"""Open restore log file, log initial message"""
try: Log.open_logfile(datadir.append("restore.log"))
except LoggerError, e: Log("Warning, " + str(e), 2)
# Log following message at file verbosity 3, but term verbosity 4
log_message = ("Starting restore of %s to %s as it was as of %s." %
(rpin.path, target.path, Time.timetopretty(time)))
if Log.term_verbosity >= 4: Log.log_to_term(log_message, 4)
if Log.verbosity >= 3: Log.log_to_file(log_message)
def restore_check_paths(rpin, rpout, restoreasof = None):
"""Check paths and return pair of corresponding rps"""
if not restoreasof:
if not rpin.lstat():
Log.FatalError("Source file %s does not exist" % rpin.path)
if Globals.quoting_enabled:
rpin = FilenameMapping.get_quotedrpath(rpin, 1)
if not rpin.isincfile():
Log.FatalError("""File %s does not look like an increment file.
Try restoring from an increment file (the filenames look like
"foobar.2001-09-01T04:49:04-07:00.diff").""" % rpin.path)
if not rpout: rpout = rpath.RPath(Globals.local_connection,
rpin.getincbase_str())
if rpout.lstat() and not force:
Log.FatalError("Restore target %s already exists, "
"specify --force to overwrite." % rpout.path)
return rpin, rpout
def restore_check_backup_dir(rpin):
"""Make sure backup dir root rpin is in consistent state"""
result = checkdest_need_check(rpin)
if result is None:
Log.FatalError("%s does not appear to be an rdiff-backup directory."
% (rpin.path,))
	elif result == 1: Log.FatalError(
		"Previous backup to %s seems to have failed.\n"
		"Rerun rdiff-backup with --check-destination-dir option to revert\n"
		"directory to state before unsuccessful session." % (rpin.path,))
def restore_get_root(rpin):
"""Return (mirror root, index) and set the data dir
The idea here is to keep backing up on the path until we find
a directory that contains "rdiff-backup-data". That is the
mirror root. If the path from there starts
"rdiff-backup-data/increments*", then the index is the
remainder minus that. Otherwise the index is just the path
minus the root.
All this could fail if the increment file is pointed to in a
funny way, using symlinks or somesuch.
"""
global datadir
if rpin.isincfile(): relpath = rpin.getincbase().path
else: relpath = rpin.path
pathcomps = os.path.join(rpin.conn.os.getcwd(), relpath).split("/")
assert len(pathcomps) >= 2 # path should be relative to /
i = len(pathcomps)
while i >= 2:
parent_dir = rpath.RPath(rpin.conn, "/".join(pathcomps[:i]))
if (parent_dir.isdir() and
"rdiff-backup-data" in parent_dir.listdir()): break
i = i-1
else: Log.FatalError("Unable to find rdiff-backup-data directory")
if not Globals.quoting_enabled: rootrp = parent_dir
else: rootrp = FilenameMapping.get_quotedrpath(parent_dir)
Log("Using mirror root directory %s" % rootrp.path, 6)
datadir = rootrp.append_path("rdiff-backup-data")
SetConnections.UpdateGlobal('rbdir', datadir)
if not datadir.isdir():
Log.FatalError("Unable to read rdiff-backup-data directory %s" %
datadir.path)
from_datadir = tuple(pathcomps[i:])
if not from_datadir or from_datadir[0] != "rdiff-backup-data":
return (rootrp, from_datadir) # in mirror, not increments
assert from_datadir[1] == "increments"
return (rootrp, from_datadir[2:])
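# Worked example (hypothetical paths): when restoring the increment
# /backup/rdiff-backup-data/increments/home/f.<timestamp>.diff, the
# loop above stops at /backup, the first parent directory containing
# "rdiff-backup-data".  The remaining components begin with
# "rdiff-backup-data/increments", so the returned index is ("home", "f").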
def ListIncrements(rp):
"""Print out a summary of the increments and their times"""
mirror_root, index = restore_get_root(rp)
restore_check_backup_dir(mirror_root)
mirror_rp = mirror_root.new_index(index)
inc_rpath = Globals.rbdir.append_path('increments', index)
incs = restore.get_inclist(inc_rpath)
mirror_time = restore.MirrorStruct.get_mirror_time()
if Globals.parsable_output:
print manage.describe_incs_parsable(incs, mirror_time, mirror_rp)
else: print manage.describe_incs_human(incs, mirror_time, mirror_rp)
def CalculateAverage(rps):
"""Print out the average of the given statistics files"""
	statobjs = map(lambda rp: statistics.StatsObj().read_stats_from_rp(rp), rps)
average_stats = StatsObj().set_to_average(statobjs)
print average_stats.get_stats_logstring(
"Average of %d stat files" % len(rps))
def RemoveOlderThan(rootrp):
"""Remove all increment files older than a certain time"""
rom_check_dir(rootrp)
try: time = Time.genstrtotime(remove_older_than_string)
except Time.TimeException, exc: Log.FatalError(str(exc))
timep = Time.timetopretty(time)
Log("Deleting increment(s) before %s" % timep, 4)
times_in_secs = [inc.getinctime() for inc in
restore.get_inclist(Globals.rbdir.append_path("increments"))]
times_in_secs = filter(lambda t: t < time, times_in_secs)
if not times_in_secs:
Log.FatalError("No increments older than %s found, exiting."
% (timep,), 1, errlevel = 0)
times_in_secs.sort()
inc_pretty_time = "\n".join(map(Time.timetopretty, times_in_secs))
if len(times_in_secs) > 1 and not force:
Log.FatalError("Found %d relevant increments, dated:\n%s"
"\nIf you want to delete multiple increments in this way, "
"use the --force." % (len(times_in_secs), inc_pretty_time))
if len(times_in_secs) == 1:
Log("Deleting increment at time:\n" + inc_pretty_time, 3)
else: Log("Deleting increments at times:\n" + inc_pretty_time, 3)
manage.delete_earlier_than(Globals.rbdir, time)
def rom_check_dir(rootrp):
"""Check destination dir before RemoveOlderThan"""
SetConnections.UpdateGlobal('rbdir',
rootrp.append_path("rdiff-backup-data"))
if not Globals.rbdir.isdir():
Log.FatalError("Unable to open rdiff-backup-data dir %s" %
(datadir.path,))
checkdest_if_necessary(rootrp)
def ListChangedSince(rp):
"""List all the files under rp that have changed since restoretime"""
try: rest_time = Time.genstrtotime(restore_timestr)
except Time.TimeException, exc: Log.FatalError(str(exc))
mirror_root, index = restore_get_root(rp)
restore_check_backup_dir(mirror_root)
mirror_rp = mirror_root.new_index(index)
inc_rp = mirror_rp.append_path("increments", index)
restore.ListChangedSince(mirror_rp, inc_rp, rest_time)
def ListAtTime(rp):
"""List files in archive under rp that are present at restoretime"""
try: rest_time = Time.genstrtotime(restore_timestr)
except Time.TimeException, exc: Log.FatalError(str(exc))
mirror_root, index = restore_get_root(rp)
restore_check_backup_dir(mirror_root)
mirror_rp = mirror_root.new_index(index)
inc_rp = mirror_rp.append_path("increments", index)
restore.ListAtTime(mirror_rp, inc_rp, rest_time)
def CheckDest(dest_rp):
"""Check the destination directory, """
if Globals.quoting_enabled:
dest_rp = FilenameMapping.get_quotedrpath(dest_rp)
if Globals.rbdir is None:
SetConnections.UpdateGlobal('rbdir',
dest_rp.append_path("rdiff-backup-data"))
need_check = checkdest_need_check(dest_rp)
if need_check is None:
Log.FatalError("No destination dir found at %s" % (dest_rp.path,))
elif need_check == 0:
Log.FatalError("Destination dir %s does not need checking" %
(dest_rp.path,))
dest_rp.conn.regress.Regress(dest_rp)
def checkdest_need_check(dest_rp):
"""Return None if no dest dir found, 1 if dest dir needs check, 0 o/w"""
if not dest_rp.isdir() or not Globals.rbdir.isdir(): return None
curmirroot = Globals.rbdir.append("current_mirror")
curmir_incs = restore.get_inclist(curmirroot)
if not curmir_incs:
Log.FatalError(
"""Bad rdiff-backup-data dir on destination side
The rdiff-backup data directory
%s
exists, but we cannot find a valid current_mirror marker. You can
avoid this message by removing this directory; however any data in it
will be lost.
""" % (Globals.rbdir.path,))
elif len(curmir_incs) == 1: return 0
else:
assert len(curmir_incs) == 2, "Found too many current_mirror incs!"
return 1
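# How the two markers come about: an incremental session touches a new
# current_mirror.<time>.data file before its first real write and
# removes the older marker only after a final sync (see
# backup_touch_curmirror_local and backup_remove_curmirror_local
# above).  So one marker means a clean archive; two markers mean the
# last session died partway through and the directory needs regressing.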
def checkdest_if_necessary(dest_rp):
"""Check the destination dir if necessary.
This can/should be run before an incremental backup.
"""
need_check = checkdest_need_check(dest_rp)
if need_check == 1:
Log("Previous backup seems to have failed, regressing "
"destination now.", 2)
dest_rp.conn.regress.Regress(dest_rp)
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Invoke rdiff utility to make signatures, deltas, or patch"""
import os, librsync
import Globals, log, static, TempFile, rpath
def get_signature(rp, blocksize = None):
"""Take signature of rpin file and return in file object"""
if not blocksize: blocksize = find_blocksize(rp.getsize())
log.Log("Getting signature of %s with blocksize %s" %
(rp.get_indexpath(), blocksize), 7)
return librsync.SigFile(rp.open("rb"), blocksize)
def find_blocksize(file_len):
"""Return a reasonable block size to use on files of length file_len
If the block size is too big, deltas will be bigger than is
necessary. If the block size is too small, making deltas and
patching can take a really long time.
"""
if file_len < 1024000: return 512 # set minimum of 512 bytes
else: # Split file into about 2000 pieces, rounding to 512
return long((file_len/(2000*512))*512)
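# For example (sketch): a 100MB file gives 104857600/(2000*512) == 102
# after integer division, so the blocksize is 102*512 == 52224 bytes,
# i.e. roughly 2000 blocks of about 51KB each.  Anything under 1024000
# bytes gets the 512-byte minimum.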
def get_delta_sigfileobj(sig_fileobj, rp_new):
"""Like get_delta but signature is in a file object"""
log.Log("Getting delta of %s with signature stream" % (rp_new.path,), 7)
return librsync.DeltaFile(sig_fileobj, rp_new.open("rb"))
def get_delta_sigrp(rp_signature, rp_new):
"""Take signature rp and new rp, return delta file object"""
log.Log("Getting delta of %s with signature %s" %
(rp_new.path, rp_signature.get_indexpath()), 7)
return librsync.DeltaFile(rp_signature.open("rb"), rp_new.open("rb"))
def write_delta(basis, new, delta, compress = None):
"""Write rdiff delta which brings basis to new"""
log.Log("Writing delta %s from %s -> %s" %
(basis.path, new.path, delta.path), 7)
deltafile = librsync.DeltaFile(get_signature(basis), new.open("rb"))
delta.write_from_fileobj(deltafile, compress)
def write_patched_fp(basis_fp, delta_fp, out_fp):
"""Write patched file to out_fp given input fps. Closes input files"""
rpath.copyfileobj(librsync.PatchedFile(basis_fp, delta_fp), out_fp)
assert not basis_fp.close() and not delta_fp.close()
def write_via_tempfile(fp, rp):
"""Write fileobj fp to rp by writing to tempfile and renaming"""
tf = TempFile.new(rp)
tf.write_from_fileobj(fp)
rpath.rename(tf, rp)
def patch_local(rp_basis, rp_delta, outrp = None, delta_compressed = None):
"""Patch routine that must be run locally, writes to outrp
This should be run local to rp_basis because it needs to be a real
file (librsync may need to seek around in it). If outrp is None,
patch rp_basis instead.
"""
assert rp_basis.conn is Globals.local_connection
if delta_compressed: deltafile = rp_delta.open("rb", 1)
else: deltafile = rp_delta.open("rb")
patchfile = librsync.PatchedFile(rp_basis.open("rb"), deltafile)
if outrp: outrp.write_from_fileobj(patchfile)
else: write_via_tempfile(patchfile, rp_basis)
def copy_local(rpin, rpout, rpnew = None):
"""Write rpnew == rpin using rpout as basis. rpout and rpnew local"""
assert rpout.conn is Globals.local_connection
deltafile = rpin.conn.librsync.DeltaFile(get_signature(rpout),
rpin.open("rb"))
patched_file = librsync.PatchedFile(rpout.open("rb"), deltafile)
if rpnew: rpnew.write_from_fileobj(patched_file)
else: write_via_tempfile(patched_file, rpout)
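# The librsync pipeline these helpers wrap, as a sketch:
#
#   sig   = get_signature(basis_rp)                # signature of old data
#   delta = librsync.DeltaFile(sig, new_fp)        # old + new -> delta
#   out   = librsync.PatchedFile(basis_fp, delta)  # old + delta -> new
#
# write_delta() runs the first two steps; patch_local() runs the last
# one; copy_local() chains all three, writing through a tempfile when
# patching in place.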
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Functions to make sure remote requests are kosher"""
import sys, tempfile
import Globals, Main, rpath
class Violation(Exception):
"""Exception that indicates an improper request has been received"""
pass
# This will store the list of functions that will be honored from
# remote connections.
allowed_requests = None
# This stores the list of global variables that the client can not
# set on the server.
disallowed_server_globals = ["server", "security_level", "restrict_path"]
def initialize(action, cmdpairs):
"""Initialize allowable request list and chroot"""
global allowed_requests
set_security_level(action, cmdpairs)
set_allowed_requests(Globals.security_level)
def set_security_level(action, cmdpairs):
"""If running client, set security level and restrict_path
To find these settings, we must look at the action to see what is
supposed to happen, and then look at the cmdpairs to see what end
the client is on.
"""
def islocal(cmdpair): return not cmdpair[0]
def bothlocal(cp1, cp2): return islocal(cp1) and islocal(cp2)
def bothremote(cp1, cp2): return not islocal(cp1) and not islocal(cp2)
def getpath(cmdpair): return cmdpair[1]
if Globals.server: return
cp1 = cmdpairs[0]
if len(cmdpairs) > 1: cp2 = cmdpairs[1]
else: cp2 = cp1
if action == "backup" or action == "check-destination-dir":
if bothlocal(cp1, cp2) or bothremote(cp1, cp2):
sec_level = "minimal"
rdir = tempfile.gettempdir()
elif islocal(cp1):
sec_level = "read-only"
rdir = getpath(cp1)
else:
assert islocal(cp2)
sec_level = "update-only"
rdir = getpath(cp2)
elif action == "restore" or action == "restore-as-of":
if len(cmdpairs) == 1 or bothlocal(cp1, cp2) or bothremote(cp1, cp2):
sec_level = "minimal"
rdir = tempfile.gettempdir()
elif islocal(cp1):
sec_level = "read-only"
rdir = Main.restore_get_root(rpath.RPath(Globals.local_connection,
getpath(cp1)))[0].path
else:
assert islocal(cp2)
sec_level = "all"
rdir = getpath(cp2)
elif action == "mirror":
if bothlocal(cp1, cp2) or bothremote(cp1, cp2):
sec_level = "minimal"
rdir = tempfile.gettempdir()
elif islocal(cp1):
sec_level = "read-only"
rdir = getpath(cp1)
else:
assert islocal(cp2)
sec_level = "all"
rdir = getpath(cp2)
elif (action == "test-server" or action == "list-increments" or
action == "list-at-time" or action == "list-changed-since"
or action == "calculate-average" or action == "remove-older-than"):
sec_level = "minimal"
rdir = tempfile.gettempdir()
else: assert 0, "Unknown action %s" % action
Globals.security_level = sec_level
Globals.restrict_path = rpath.RPath(Globals.local_connection,
rdir).normalize().path
def set_allowed_requests(sec_level):
"""Set the allowed requests list using the security level"""
global allowed_requests
if sec_level == "all": return
allowed_requests = ["VirtualFile.readfromid", "VirtualFile.closebyid",
"Globals.get", "Globals.is_not_None",
"Globals.get_dict_val",
"log.Log.open_logfile_allconn",
"log.Log.close_logfile_allconn",
"Log.log_to_file",
"SetConnections.add_redirected_conn",
"RedirectedRun",
"sys.stdout.write"]
if sec_level == "minimal": pass
elif sec_level == "read-only" or sec_level == "update-only":
allowed_requests.extend(
["C.make_file_dict",
"log.Log.log_to_file",
"os.getuid",
"os.listdir",
"Time.setcurtime_local",
"robust.Resume.ResumeCheck",
"backup.SourceStruct.split_initial_dsiter",
"backup.SourceStruct.get_diffs_and_finalize",
"rpath.gzip_open_local_read",
"rpath.open_local_read"])
if sec_level == "update-only":
allowed_requests.extend(
["Log.open_logfile_local", "Log.close_logfile_local",
"Log.close_logfile_allconn", "Log.log_to_file",
"log.Log.log_to_file",
"robust.SaveState.init_filenames",
"robust.SaveState.touch_last_file",
"backup.DestinationStruct.get_sigs",
"backup.DestinationStruct.patch_w_datadir_writes",
"backup.DestinationStruct.patch_and_finalize",
"backup.DestinationStruct.patch_increment_and_finalize",
"Main.backup_touch_curmirror_local",
"Globals.ITRB.increment_stat",
"statistics.record_error",
"log.ErrorLog.write_if_open"])
if Globals.server:
allowed_requests.extend(
["SetConnections.init_connection_remote",
"log.Log.setverbosity",
"log.Log.setterm_verbosity",
"Time.setprevtime_local",
"FilenameMapping.set_init_quote_vals_local",
"Globals.postset_regexp_local",
"Globals.set_select",
"backup.SourceStruct.set_session_info",
"backup.DestinationStruct.set_session_info"])
def vet_request(request, arglist):
"""Examine request for security violations"""
#if Globals.server: sys.stderr.write(str(request) + "\n")
security_level = Globals.security_level
if Globals.restrict_path:
for arg in arglist:
if isinstance(arg, rpath.RPath): vet_rpath(arg)
if security_level == "all": return
if request.function_string in allowed_requests: return
if request.function_string == "Globals.set":
if Globals.server and arglist[0] not in disallowed_server_globals:
return
raise Violation("\nWarning: Security Violation!\n"
"Bad request for function: %s\n"
"with arguments: %s\n" % (request.function_string,
arglist))
def vet_rpath(rpath):
"""Require rpath not to step outside retricted directory"""
if Globals.restrict_path and rpath.conn is Globals.local_connection:
normalized, restrict = rpath.normalize().path, Globals.restrict_path
components = normalized.split("/")
# 3 cases for restricted dir /usr/foo: /var, /usr/foobar, /usr/foo/..
if (not normalized.startswith(restrict) or
(len(normalized) > len(restrict) and
normalized[len(restrict)] != "/") or
".." in components):
raise Violation("\nWarning: Security Violation!\n"
"Request to handle path %s\n"
"which doesn't appear to be within "
"restrict path %s.\n" % (normalized, restrict))
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Parse args and setup connections
The functions in this module are used once by Main to parse file
descriptions like bescoto@folly.stanford.edu::/usr/bin/ls and to set up
the related connections.
"""
import os, sys
from log import Log
import Globals, FilenameMapping, connection, rpath
# This is the schema that determines how rdiff-backup will open a
# pipe to the remote system. If the file is given as A::B, %s will
# be substituted with A in the schema.
__cmd_schema = 'ssh -C %s rdiff-backup --server'
__cmd_schema_no_compress = 'ssh %s rdiff-backup --server'
# This is a list of remote commands used to start the connections.
# The first is None because it is the local connection.
__conn_remote_cmds = [None]
class SetConnectionsException(Exception): pass
def get_cmd_pairs(arglist, remote_schema = None, remote_cmd = None):
"""Map the given file descriptions into command pairs
Command pairs are tuples of length 2. cmdpair[0] is
None iff it describes a local path, and cmdpair[1] is the path.
"""
global __cmd_schema
if remote_schema: __cmd_schema = remote_schema
elif not Globals.ssh_compression: __cmd_schema = __cmd_schema_no_compress
if not arglist: return []
desc_pairs = map(parse_file_desc, arglist)
if filter(lambda x: x[0], desc_pairs): # True if any host_info found
if remote_cmd:
Log.FatalError("The --remote-cmd flag is not compatible "
"with remote file descriptions.")
elif remote_schema:
Log("Remote schema option ignored - no remote file "
"descriptions.", 2)
cmdpairs = map(desc2cmd_pairs, desc_pairs)
if remote_cmd: # last file description gets remote_cmd
    cmdpairs[-1] = (remote_cmd, cmdpairs[-1][1])
return cmdpairs
def cmdpair2rp(cmd_pair):
"""Return normalized RPath from cmd_pair (remote_cmd, filename)"""
cmd, filename = cmd_pair
if cmd: conn = init_connection(cmd)
else: conn = Globals.local_connection
return rpath.RPath(conn, filename).normalize()
def desc2cmd_pairs(desc_pair):
"""Return pair (remote_cmd, filename) from desc_pair"""
host_info, filename = desc_pair
if not host_info: return (None, filename)
else: return (fill_schema(host_info), filename)
def parse_file_desc(file_desc):
"""Parse file description returning pair (host_info, filename)
In other words, bescoto@folly.stanford.edu::/usr/bin/ls =>
("bescoto@folly.stanford.edu", "/usr/bin/ls"). The
complication is to allow for quoting of : by a \. If the
string contains no unquoted :: separator, then the host_info is None.
"""
def check_len(i):
if i >= len(file_desc):
raise SetConnectionsException(
"Unexpected end to file description %s" % file_desc)
host_info_list, i, last_was_quoted = [], 0, None
while 1:
if i == len(file_desc):
return (None, file_desc)
if file_desc[i] == '\\':
i = i+1
check_len(i)
last_was_quoted = 1
elif (file_desc[i] == ":" and i > 0 and file_desc[i-1] == ":"
and not last_was_quoted):
host_info_list.pop() # Remove last colon from name
break
else: last_was_quoted = None
host_info_list.append(file_desc[i])
i = i+1
check_len(i+1)
return ("".join(host_info_list), file_desc[i+1:])
def fill_schema(host_info):
"""Fills host_info into the schema and returns remote command"""
return __cmd_schema % host_info
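# Example (illustrative): with the default schema above,
# fill_schema("bescoto@folly.stanford.edu") returns
# 'ssh -C bescoto@folly.stanford.edu rdiff-backup --server'.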
def init_connection(remote_cmd):
"""Run remote_cmd, register connection, and then return it
If remote_cmd is None, then the local connection will be
returned. This also updates some settings on the remote side,
like global settings, its connection number, and verbosity.
"""
if not remote_cmd: return Globals.local_connection
Log("Executing " + remote_cmd, 4)
stdin, stdout = os.popen2(remote_cmd)
conn_number = len(Globals.connections)
conn = connection.PipeConnection(stdout, stdin, conn_number)
check_connection_version(conn, remote_cmd)
Log("Registering connection %d" % conn_number, 7)
init_connection_routing(conn, conn_number, remote_cmd)
init_connection_settings(conn)
return conn
def check_connection_version(conn, remote_cmd):
"""Log warning if connection has different version"""
try: remote_version = conn.Globals.get('version')
except connection.ConnectionReadError, exception:
Log.FatalError("""%s
Couldn't start up the remote connection by executing
%s
Remember that, under the default settings, rdiff-backup must be
installed in the PATH on the remote system. See the man page for more
information on this. This message may also be displayed if the remote
version of rdiff-backup is quite different from the local version (%s)."""
% (exception, remote_cmd, Globals.version))
if remote_version != Globals.version:
Log("Warning: Local version %s does not match remote version %s."
% (Globals.version, remote_version), 2)
def init_connection_routing(conn, conn_number, remote_cmd):
"""Called by init_connection, establish routing, conn dict"""
Globals.connection_dict[conn_number] = conn
conn.SetConnections.init_connection_remote(conn_number)
for other_remote_conn in Globals.connections[1:]:
conn.SetConnections.add_redirected_conn(
other_remote_conn.conn_number)
other_remote_conn.SetConnections.add_redirected_conn(conn_number)
Globals.connections.append(conn)
__conn_remote_cmds.append(remote_cmd)
def init_connection_settings(conn):
"""Tell new conn about log settings and updated globals"""
conn.log.Log.setverbosity(Log.verbosity)
conn.log.Log.setterm_verbosity(Log.term_verbosity)
for setting_name in Globals.changed_settings:
conn.Globals.set(setting_name, Globals.get(setting_name))
FilenameMapping.set_init_quote_vals()
def init_connection_remote(conn_number):
"""Run on server side to tell self that have given conn_number"""
Globals.connection_number = conn_number
Globals.local_connection.conn_number = conn_number
Globals.connection_dict[0] = Globals.connections[1]
Globals.connection_dict[conn_number] = Globals.local_connection
def add_redirected_conn(conn_number):
"""Run on server side - tell about redirected connection"""
Globals.connection_dict[conn_number] = \
connection.RedirectedConnection(conn_number)
def UpdateGlobal(setting_name, val):
"""Update value of global variable across all connections"""
for conn in Globals.connections:
conn.Globals.set(setting_name, val)
def BackupInitConnections(reading_conn, writing_conn):
"""Backup specific connection initialization"""
reading_conn.Globals.set("isbackup_reader", 1)
writing_conn.Globals.set("isbackup_writer", 1)
UpdateGlobal("backup_reader", reading_conn)
UpdateGlobal("backup_writer", writing_conn)
if writing_conn.os.getuid() == 0 and Globals.change_ownership != 0:
UpdateGlobal('change_ownership', 1)
def CloseConnections():
"""Close all connections. Run by client"""
assert not Globals.server
for conn in Globals.connections: conn.quit()
del Globals.connections[1:] # Only leave local connection
Globals.connection_dict = {0: Globals.local_connection}
Globals.backup_reader = Globals.isbackup_reader = \
Globals.backup_writer = Globals.isbackup_writer = None
def TestConnections():
"""Test connections, printing results"""
if len(Globals.connections) == 1: print "No remote connections specified"
else:
for i in range(1, len(Globals.connections)): test_connection(i)
def test_connection(conn_number):
"""Test connection. conn_number 0 is the local connection"""
print "Testing server started by: ", __conn_remote_cmds[conn_number]
conn = Globals.connections[conn_number]
try:
assert conn.pow(2,3) == 8
assert conn.os.path.join("a", "b") == "a/b"
version = conn.reval("lambda: Globals.version")
except:
sys.stderr.write("Server tests failed\n")
raise
if not version == Globals.version:
print """Server may work, but there is a version mismatch:
Local version: %s
Remote version: %s""" % (Globals.version, version)
else: print "Server OK"
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Manage temp files
Earlier this had routines for keeping track of existing tempfiles.
Now we just use normal rpaths instead of the TempFile class.
"""
import os
import Globals, rpath
from log import Log
# To make collisions less likely, this gets put in the file name
# and incremented whenever a new file is requested.
_tfindex = 0
def new(rp_base):
"""Return new tempfile that isn't in use in same dir as rp_base"""
return new_in_dir(rp_base.get_parent_rp())
def new_in_dir(dir_rp):
"""Return new temp rpath in directory dir_rp"""
global _tfindex
assert dir_rp.conn is Globals.local_connection
while 1:
if _tfindex > 100000000:
Log("Warning: Resetting tempfile index", 2)
_tfindex = 0
tf = dir_rp.append('rdiff-backup.tmp.%d' % _tfindex)
_tfindex = _tfindex+1
if not tf.lstat(): return tf
def remove_listing(tempfile):
    """Obsolete hook from the old tempfile registry; now a no-op
    Defined here so TempFile's methods below don't raise NameError.
    """
    pass

class TempFile(rpath.RPath):
"""Like an RPath, but keep track of which ones are still here"""
def rename(self, rp_dest):
"""Rename temp file to permanent location, possibly overwriting"""
if not self.lstat(): # "moving" a nonexistent file, so just delete dest
if rp_dest.lstat(): rp_dest.delete()
remove_listing(self)
return
if self.isdir() and not rp_dest.isdir():
# Cannot move a directory directly over another file
rp_dest.delete()
rpath.rename(self, rp_dest)
# Sometimes this just seems to fail silently, as when one
# hardlinked twin is moved over the other. So check below to
# make sure it worked.
self.setdata()
if self.lstat():
rp_dest.delete()
rpath.rename(self, rp_dest)
self.setdata()
if self.lstat(): raise OSError("Cannot rename tmp file correctly")
remove_listing(self)
def delete(self):
rpath.RPath.delete(self)
remove_listing(self)
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Provide time related exceptions and functions"""
import time, types, re
import Globals
class TimeException(Exception): pass
_interval_conv_dict = {"s": 1, "m": 60, "h": 3600, "D": 86400,
"W": 7*86400, "M": 30*86400, "Y": 365*86400}
_integer_regexp = re.compile("^[0-9]+$")
_interval_regexp = re.compile("^([0-9]+)([smhDWMY])")
_genstr_date_regexp1 = re.compile("^(?P<year>[0-9]{4})[-/]"
"(?P<month>[0-9]{1,2})[-/](?P<day>[0-9]{1,2})$")
_genstr_date_regexp2 = re.compile("^(?P<month>[0-9]{1,2})[-/]"
"(?P<day>[0-9]{1,2})[-/](?P<year>[0-9]{4})$")
curtime = curtimestr = None
def setcurtime(curtime = None):
"""Sets the current time in curtime and curtimestr on all systems"""
t = curtime or time.time()
for conn in Globals.connections:
conn.Time.setcurtime_local(long(t))
def setcurtime_local(timeinseconds):
"""Only set the current time locally"""
global curtime, curtimestr
curtime, curtimestr = timeinseconds, timetostring(timeinseconds)
def setprevtime(timeinseconds):
"""Sets the previous inc time in prevtime and prevtimestr"""
assert 0 < timeinseconds < curtime, \
"Time %s is out of bounds" % (timeinseconds,)
timestr = timetostring(timeinseconds)
for conn in Globals.connections:
conn.Time.setprevtime_local(timeinseconds, timestr)
def setprevtime_local(timeinseconds, timestr):
"""Like setprevtime but only set the local version"""
global prevtime, prevtimestr
prevtime, prevtimestr = timeinseconds, timestr
def timetostring(timeinseconds):
"""Return w3 datetime compliant listing of timeinseconds"""
s = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(timeinseconds))
return s + gettzd()
def stringtotime(timestring):
"""Return time in seconds from w3 timestring
If there is an error parsing the string, or it doesn't look
like a w3 datetime string, return None.
"""
try:
date, daytime = timestring[:19].split("T")
year, month, day = map(int, date.split("-"))
hour, minute, second = map(int, daytime.split(":"))
assert 1900 < year < 2100, year
assert 1 <= month <= 12
assert 1 <= day <= 31
assert 0 <= hour <= 23
assert 0 <= minute <= 59
assert 0 <= second <= 61 # leap seconds
timetuple = (year, month, day, hour, minute, second, -1, -1, -1)
if time.daylight:
utc_in_secs = time.mktime(timetuple) - time.altzone
else: utc_in_secs = time.mktime(timetuple) - time.timezone
return long(utc_in_secs) + tzdtoseconds(timestring[19:])
except (TypeError, ValueError, AssertionError): return None
def timetopretty(timeinseconds):
"""Return pretty version of time"""
return time.asctime(time.localtime(timeinseconds))
def stringtopretty(timestring):
"""Return pretty version of time given w3 time string"""
return timetopretty(stringtotime(timestring))
def inttopretty(seconds):
"""Convert num of seconds to readable string like "2 hours"."""
partlist = []
hours, seconds = divmod(seconds, 3600)
if hours > 1: partlist.append("%d hours" % hours)
elif hours == 1: partlist.append("1 hour")
minutes, seconds = divmod(seconds, 60)
if minutes > 1: partlist.append("%d minutes" % minutes)
elif minutes == 1: partlist.append("1 minute")
if seconds == 1: partlist.append("1 second")
elif not partlist or seconds > 1:
if isinstance(seconds, int) or isinstance(seconds, long):
partlist.append("%s seconds" % seconds)
else: partlist.append("%.2f seconds" % seconds)
return " ".join(partlist)
def intstringtoseconds(interval_string):
"""Convert a string expressing an interval (e.g. "4D2s") to seconds"""
def error():
raise TimeException("""Bad interval string "%s"
Intervals are specified like 2Y (2 years) or 2h30m (2.5 hours). The
allowed special characters are s, m, h, D, W, M, and Y. See the man
page for more information.
""" % interval_string)
if len(interval_string) < 2: error()
total = 0
while interval_string:
match = _interval_regexp.match(interval_string)
if not match: error()
num, ext = int(match.group(1)), match.group(2)
if not ext in _interval_conv_dict or num < 0: error()
total += num*_interval_conv_dict[ext]
interval_string = interval_string[match.end(0):]
return total
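# Examples (illustrative): intstringtoseconds("2h30m") == 9000 and
# intstringtoseconds("4D2s") == 4*86400 + 2 == 345602.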
def gettzd():
"""Return w3's timezone identification string.
Expressed as [+/-]hh:mm. For instance, PST is -08:00. The zone
coincides with what localtime(), etc., use.
"""
if time.daylight: offset = -1 * time.altzone/60
else: offset = -1 * time.timezone/60
if offset > 0: prefix = "+"
elif offset < 0: prefix = "-"
else: return "Z" # time is already in UTC
hours, minutes = divmod(abs(offset), 60) # abs before divmod, so -03:30 isn't mangled to -04:30
assert 0 <= hours <= 23
assert 0 <= minutes <= 59
return "%s%02d:%02d" % (prefix, hours, minutes)
def tzdtoseconds(tzd):
"""Given w3 compliant TZD, return how far ahead UTC is"""
if tzd == "Z": return 0
assert len(tzd) == 6 # only accept forms like +08:00 for now
assert (tzd[0] == "-" or tzd[0] == "+") and tzd[3] == ":"
return -60 * (60 * int(tzd[:3]) + int(tzd[4:]))
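# Example (illustrative): in US Pacific standard time gettzd() returns
# "-08:00", and tzdtoseconds("-08:00") == 28800, since UTC is eight hours
# ahead of a -08:00 zone.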
def cmp(time1, time2):
"""Compare time1 and time2 and return -1, 0, or 1"""
if type(time1) is types.StringType:
time1 = stringtotime(time1)
assert time1 is not None
if type(time2) is types.StringType:
time2 = stringtotime(time2)
assert time2 is not None
if time1 < time2: return -1
elif time1 == time2: return 0
else: return 1
def genstrtotime(timestr, curtime = None):
"""Convert a generic time string to a time in seconds"""
if curtime is None: curtime = globals()['curtime']
if timestr == "now": return curtime
def error():
raise TimeException("""Bad time string "%s"
The acceptable time strings are intervals (like "3D64s"), w3-datetime
strings, like "2002-04-26T04:22:01-07:00" (strings like
"2002-04-26T04:22:01" are also acceptable - rdiff-backup will use the
current time zone), or ordinary dates like 2/4/1997 or 2001-04-23
(various combinations are acceptable, but the month always precedes
the day).""" % timestr)
# Test for straight integer
if _integer_regexp.search(timestr): return int(timestr)
# Test for w3-datetime format, possibly missing tzd
t = stringtotime(timestr) or stringtotime(timestr+gettzd())
if t: return t
try: # test for an interval string like "3D", meaning that long ago
return curtime - intstringtoseconds(timestr)
except TimeException: pass
# Now check for dates like 2001/3/23
match = _genstr_date_regexp1.search(timestr) or \
_genstr_date_regexp2.search(timestr)
if not match: error()
timestr = "%s-%02d-%02dT00:00:00%s" % (match.group('year'),
int(match.group('month')), int(match.group('day')), gettzd())
t = stringtotime(timestr)
if t: return t
else: error()
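# Examples (illustrative): genstrtotime("now") returns curtime;
# genstrtotime("3D") returns the time three days before curtime; and
# genstrtotime("2002-04-26") parses as midnight, local time, on that date.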
/* ----------------------------------------------------------------------- *
*
* Copyright 2002, 2003 Ben Escoto
*
* This file is part of rdiff-backup.
*
* rdiff-backup is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of
* the License, or (at your option) any later version.
*
* rdiff-backup is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with rdiff-backup; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
* 02111-1307 USA
*
* ----------------------------------------------------------------------- */
#include <Python.h>
#include <rsync.h>
#define RS_JOB_BLOCKSIZE 65536
static PyObject *librsyncError;
/* Sets python error string from result */
static void
_librsync_seterror(rs_result result, char *location)
{
char error_string[200];
sprintf(error_string, "librsync error %d while in %s", result, location);
PyErr_SetString(librsyncError, error_string);
}
/* --------------- SigMaker Object for incremental signatures */
staticforward PyTypeObject _librsync_SigMakerType;
typedef struct {
PyObject_HEAD
PyObject *x_attr;
rs_job_t *sig_job;
} _librsync_SigMakerObject;
static PyObject*
_librsync_new_sigmaker(PyObject* self, PyObject* args)
{
_librsync_SigMakerObject* sm;
long blocklen;
if (!PyArg_ParseTuple(args, "l:new_sigmaker", &blocklen))
return NULL;
sm = PyObject_New(_librsync_SigMakerObject, &_librsync_SigMakerType);
if (sm == NULL) return NULL;
sm->x_attr = NULL;
sm->sig_job = rs_sig_begin((size_t)blocklen,
(size_t)RS_DEFAULT_STRONG_LEN);
return (PyObject*)sm;
}
static void
_librsync_sigmaker_dealloc(PyObject* self)
{
rs_job_free(((_librsync_SigMakerObject *)self)->sig_job);
PyObject_Del(self);
}
/* Take an input string, and generate a signature from it. The output
will be a triple (done, bytes_used, signature_string), where done
is true iff there is no more data coming and bytes_used is the
number of bytes of the input string processed.
*/
static PyObject *
_librsync_sigmaker_cycle(_librsync_SigMakerObject *self, PyObject *args)
{
char *inbuf, outbuf[RS_JOB_BLOCKSIZE];
int inbuf_length;
rs_buffers_t buf;
rs_result result;
if (!PyArg_ParseTuple(args, "s#:cycle", &inbuf, &inbuf_length))
return NULL;
buf.next_in = inbuf;
buf.avail_in = (size_t)inbuf_length;
buf.next_out = outbuf;
buf.avail_out = (size_t)RS_JOB_BLOCKSIZE;
buf.eof_in = (inbuf_length == 0);
result = rs_job_iter(self->sig_job, &buf);
if (result != RS_DONE && result != RS_BLOCKED) {
_librsync_seterror(result, "signature cycle");
return NULL;
}
return Py_BuildValue("(ils#)", (result == RS_DONE),
(long)inbuf_length - (long)buf.avail_in,
outbuf, RS_JOB_BLOCKSIZE - (long)buf.avail_out);
}
static PyMethodDef _librsync_sigmaker_methods[] = {
{"cycle", (PyCFunction)_librsync_sigmaker_cycle, METH_VARARGS},
{NULL, NULL, 0, NULL} /* sentinel */
};
static PyObject *
_librsync_sigmaker_getattr(_librsync_SigMakerObject *sm,
char *name)
{
if (sm->x_attr != NULL) {
PyObject *v = PyDict_GetItemString(sm->x_attr, name);
if (v != NULL) {
Py_INCREF(v);
return v;
}
}
return Py_FindMethod(_librsync_sigmaker_methods, (PyObject *)sm, name);
}
static int
_librsync_sigmaker_setattr(_librsync_SigMakerObject *sm,
char *name, PyObject *v)
{
if (sm->x_attr == NULL) {
sm->x_attr = PyDict_New();
if (sm->x_attr == NULL) return -1;
}
if (v == NULL) {
int rv = PyDict_DelItemString(sm->x_attr, name);
if (rv < 0)
PyErr_SetString(PyExc_AttributeError,
"delete non-existing sigmaker attribute");
return rv;
}
else return PyDict_SetItemString(sm->x_attr, name, v);
}
static PyTypeObject _librsync_SigMakerType = {
PyObject_HEAD_INIT(NULL)
0,
"sigmaker",
sizeof(_librsync_SigMakerObject),
0,
_librsync_sigmaker_dealloc, /*tp_dealloc*/
0, /*tp_print*/
(getattrfunc)_librsync_sigmaker_getattr, /*tp_getattr*/
(setattrfunc)_librsync_sigmaker_setattr, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
};
/* --------------- DeltaMaker Object for incremental deltas */
staticforward PyTypeObject _librsync_DeltaMakerType;
typedef struct {
PyObject_HEAD
PyObject *x_attr;
rs_job_t *delta_job;
rs_signature_t *sig_ptr;
} _librsync_DeltaMakerObject;
/* Call with the entire signature loaded into one big string */
static PyObject*
_librsync_new_deltamaker(PyObject* self, PyObject* args)
{
_librsync_DeltaMakerObject* dm;
char *sig_string, outbuf[RS_JOB_BLOCKSIZE];
int sig_length;
rs_job_t *sig_loader;
rs_signature_t *sig_ptr;
rs_buffers_t buf;
rs_result result;
if (!PyArg_ParseTuple(args,"s#:new_deltamaker", &sig_string, &sig_length))
return NULL;
dm = PyObject_New(_librsync_DeltaMakerObject, &_librsync_DeltaMakerType);
if (dm == NULL) return NULL;
dm->x_attr = NULL;
/* Put signature at sig_ptr and build hash */
sig_loader = rs_loadsig_begin(&sig_ptr);
buf.next_in = sig_string;
buf.avail_in = (size_t)sig_length;
buf.next_out = outbuf;
buf.avail_out = (size_t)RS_JOB_BLOCKSIZE;
buf.eof_in = 1;
result = rs_job_iter(sig_loader, &buf);
rs_job_free(sig_loader);
if (result != RS_DONE) {
_librsync_seterror(result, "delta rs_signature_t builder");
return NULL;
}
if ((result = rs_build_hash_table(sig_ptr)) != RS_DONE) {
_librsync_seterror(result, "delta rs_build_hash_table");
return NULL;
}
dm->sig_ptr = sig_ptr;
dm->delta_job = rs_delta_begin(sig_ptr);
return (PyObject*)dm;
}
static void
_librsync_deltamaker_dealloc(PyObject* self)
{
_librsync_DeltaMakerObject *dm = (_librsync_DeltaMakerObject *)self;
rs_signature_t *sig_ptr = dm->sig_ptr;
rs_free_sumset(sig_ptr);
rs_job_free(dm->delta_job);
PyObject_Del(self);
}
/* Take a chunk of the new file in an input string, and return a
triple (done, bytes_used, delta_string), where done is true iff no
more data is coming and bytes_used is the number of bytes of the
input string processed.
*/
static PyObject *
_librsync_deltamaker_cycle(_librsync_DeltaMakerObject *self, PyObject *args)
{
char *inbuf, outbuf[RS_JOB_BLOCKSIZE];
int inbuf_length;
rs_buffers_t buf;
rs_result result;
if (!PyArg_ParseTuple(args, "s#:cycle", &inbuf, &inbuf_length))
return NULL;
buf.next_in = inbuf;
buf.avail_in = (size_t)inbuf_length;
buf.next_out = outbuf;
buf.avail_out = (size_t)RS_JOB_BLOCKSIZE;
buf.eof_in = (inbuf_length == 0);
result = rs_job_iter(self->delta_job, &buf);
if (result != RS_DONE && result != RS_BLOCKED) {
_librsync_seterror(result, "delta cycle");
return NULL;
}
return Py_BuildValue("(ils#)", (result == RS_DONE),
(long)inbuf_length - (long)buf.avail_in,
outbuf, RS_JOB_BLOCKSIZE - (long)buf.avail_out);
}
static PyMethodDef _librsync_deltamaker_methods[] = {
{"cycle", (PyCFunction)_librsync_deltamaker_cycle, METH_VARARGS},
{NULL, NULL, 0, NULL} /* sentinel */
};
static PyObject *
_librsync_deltamaker_getattr(_librsync_DeltaMakerObject *dm, char *name)
{
if (dm->x_attr != NULL) {
PyObject *v = PyDict_GetItemString(dm->x_attr, name);
if (v != NULL) {
Py_INCREF(v);
return v;
}
}
return Py_FindMethod(_librsync_deltamaker_methods, (PyObject *)dm, name);
}
static int
_librsync_deltamaker_setattr(_librsync_DeltaMakerObject *dm,
char *name, PyObject *v)
{
if (dm->x_attr == NULL) {
dm->x_attr = PyDict_New();
if (dm->x_attr == NULL) return -1;
}
if (v == NULL) {
int rv = PyDict_DelItemString(dm->x_attr, name);
if (rv < 0)
PyErr_SetString(PyExc_AttributeError,
"delete non-existing deltamaker attribute");
return rv;
}
else return PyDict_SetItemString(dm->x_attr, name, v);
}
static PyTypeObject _librsync_DeltaMakerType = {
PyObject_HEAD_INIT(NULL)
0,
"deltamaker",
sizeof(_librsync_DeltaMakerObject),
0,
_librsync_deltamaker_dealloc, /*tp_dealloc*/
0, /*tp_print*/
(getattrfunc)_librsync_deltamaker_getattr, /*tp_getattr*/
(setattrfunc)_librsync_deltamaker_setattr, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
};
/* --------------- PatchMaker Object for incremental patching */
staticforward PyTypeObject _librsync_PatchMakerType;
typedef struct {
PyObject_HEAD
PyObject *x_attr;
rs_job_t *patch_job;
PyObject *basis_file;
} _librsync_PatchMakerObject;
/* Call with the basis file */
static PyObject*
_librsync_new_patchmaker(PyObject* self, PyObject* args)
{
_librsync_PatchMakerObject* pm;
PyObject *python_file;
FILE *cfile;
if (!PyArg_ParseTuple(args, "O:new_patchmaker", &python_file))
return NULL;
if (!PyFile_Check(python_file)) {
PyErr_SetString(PyExc_TypeError, "Need true file object");
return NULL;
}
Py_INCREF(python_file);
pm = PyObject_New(_librsync_PatchMakerObject, &_librsync_PatchMakerType);
if (pm == NULL) return NULL;
pm->x_attr = NULL;
pm->basis_file = python_file;
cfile = PyFile_AsFile(python_file);
pm->patch_job = rs_patch_begin(rs_file_copy_cb, cfile);
return (PyObject*)pm;
}
static void
_librsync_patchmaker_dealloc(PyObject* self)
{
_librsync_PatchMakerObject *pm = (_librsync_PatchMakerObject *)self;
Py_DECREF(pm->basis_file);
rs_job_free(pm->patch_job);
PyObject_Del(self);
}
/* Take a chunk of the delta file in an input string, and return a
triple (done, bytes_used, patched_string), where done is true iff
there is no more data coming out and bytes_used is the number of
bytes of the input string processed.
*/
static PyObject *
_librsync_patchmaker_cycle(_librsync_PatchMakerObject *self, PyObject *args)
{
char *inbuf, outbuf[RS_JOB_BLOCKSIZE];
int inbuf_length;
rs_buffers_t buf;
rs_result result;
if (!PyArg_ParseTuple(args, "s#:cycle", &inbuf, &inbuf_length))
return NULL;
buf.next_in = inbuf;
buf.avail_in = (size_t)inbuf_length;
buf.next_out = outbuf;
buf.avail_out = (size_t)RS_JOB_BLOCKSIZE;
buf.eof_in = (inbuf_length == 0);
result = rs_job_iter(self->patch_job, &buf);
if (result != RS_DONE && result != RS_BLOCKED) {
_librsync_seterror(result, "patch cycle");
return NULL;
}
return Py_BuildValue("(ils#)", (result == RS_DONE),
(long)inbuf_length - (long)buf.avail_in,
outbuf, RS_JOB_BLOCKSIZE - (long)buf.avail_out);
}
static PyMethodDef _librsync_patchmaker_methods[] = {
{"cycle", (PyCFunction)_librsync_patchmaker_cycle, METH_VARARGS},
{NULL, NULL, 0, NULL} /* sentinel */
};
static PyObject *
_librsync_patchmaker_getattr(_librsync_PatchMakerObject *pm, char *name)
{
if (pm->x_attr != NULL) {
PyObject *v = PyDict_GetItemString(pm->x_attr, name);
if (v != NULL) {
Py_INCREF(v);
return v;
}
}
return Py_FindMethod(_librsync_patchmaker_methods, (PyObject *)pm, name);
}
static int
_librsync_patchmaker_setattr(_librsync_PatchMakerObject *pm,
char *name, PyObject *v)
{
if (pm->x_attr == NULL) {
pm->x_attr = PyDict_New();
if (pm->x_attr == NULL) return -1;
}
if (v == NULL) {
int rv = PyDict_DelItemString(pm->x_attr, name);
if (rv < 0)
PyErr_SetString(PyExc_AttributeError,
"delete non-existing patchmaker attribute");
return rv;
}
else return PyDict_SetItemString(pm->x_attr, name, v);
}
static PyTypeObject _librsync_PatchMakerType = {
PyObject_HEAD_INIT(NULL)
0,
"patchmaker",
sizeof(_librsync_PatchMakerObject),
0,
_librsync_patchmaker_dealloc, /*tp_dealloc*/
0, /*tp_print*/
(getattrfunc)_librsync_patchmaker_getattr, /*tp_getattr*/
(setattrfunc)_librsync_patchmaker_setattr, /*tp_setattr*/
0, /*tp_compare*/
0, /*tp_repr*/
0, /*tp_as_number*/
0, /*tp_as_sequence*/
0, /*tp_as_mapping*/
0, /*tp_hash */
};
/* --------------- _librsync module definition */
static PyMethodDef _librsyncMethods[] = {
{"new_sigmaker", _librsync_new_sigmaker, METH_VARARGS,
"Return a sigmaker object, for finding the signature of an object"},
{"new_deltamaker", _librsync_new_deltamaker, METH_VARARGS,
"Return a deltamaker object, for computing deltas"},
{"new_patchmaker", _librsync_new_patchmaker, METH_VARARGS,
"Return a patchmaker object, for patching basis files"},
{NULL, NULL, 0, NULL}
};
void init_librsync(void)
{
PyObject *m, *d;
_librsync_SigMakerType.ob_type = &PyType_Type;
_librsync_DeltaMakerType.ob_type = &PyType_Type;
_librsync_PatchMakerType.ob_type = &PyType_Type;
m = Py_InitModule("_librsync", _librsyncMethods);
d = PyModule_GetDict(m);
librsyncError = PyErr_NewException("_librsync.librsyncError", NULL, NULL);
PyDict_SetItemString(d, "librsyncError", librsyncError);
PyDict_SetItemString(d, "RS_JOB_BLOCKSIZE",
Py_BuildValue("l", (long)RS_JOB_BLOCKSIZE));
PyDict_SetItemString(d, "RS_DEFAULT_BLOCK_LEN",
Py_BuildValue("l", (long)RS_DEFAULT_BLOCK_LEN));
}
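/* Illustrative Python-level use of this module (a sketch, assuming the
 * extension has been built as _librsync, and that basis_data is a string
 * holding the old file's contents):
 *
 *     import _librsync
 *     sm = _librsync.new_sigmaker(_librsync.RS_DEFAULT_BLOCK_LEN)
 *     sig_parts = []
 *     while 1:
 *         done, bytes_used, sig = sm.cycle(basis_data)
 *         sig_parts.append(sig)
 *         basis_data = basis_data[bytes_used:]
 *         if done: break
 *     signature = "".join(sig_parts)
 *
 * An empty input string signals end-of-input, since cycle() sets eof_in
 * when the input length is zero. new_deltamaker(signature) and
 * new_patchmaker(open(basis_path, "rb")) follow the same cycle()
 * protocol. */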
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""High level functions for mirroring and mirror+incrementing"""
from __future__ import generators
import errno
import Globals, metadata, rorpiter, TempFile, Hardlink, robust, increment, \
rpath, static, log, selection, Time, Rdiff, statistics, iterfile
def Mirror(src_rpath, dest_rpath):
"""Turn dest_rpath into a copy of src_rpath"""
SourceS = src_rpath.conn.backup.SourceStruct
DestS = dest_rpath.conn.backup.DestinationStruct
source_rpiter = SourceS.get_source_select()
DestS.set_rorp_cache(dest_rpath, source_rpiter, 0)
dest_sigiter = DestS.get_sigs(dest_rpath)
source_diffiter = SourceS.get_diffs(dest_sigiter)
DestS.patch(dest_rpath, source_diffiter)
def Mirror_and_increment(src_rpath, dest_rpath, inc_rpath):
"""Mirror + put increments in tree based at inc_rpath"""
SourceS = src_rpath.conn.backup.SourceStruct
DestS = dest_rpath.conn.backup.DestinationStruct
source_rpiter = SourceS.get_source_select()
DestS.set_rorp_cache(dest_rpath, source_rpiter, 1)
dest_sigiter = DestS.get_sigs(dest_rpath)
source_diffiter = SourceS.get_diffs(dest_sigiter)
DestS.patch_and_increment(dest_rpath, source_diffiter, inc_rpath)
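# The data flow above, for reference: the destination side yields
# signatures of changed files (get_sigs), the source side turns each
# signature into a delta or a full snapshot (get_diffs), and the
# destination applies the results (patch, or patch_and_increment when
# increments are kept).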
class SourceStruct:
"""Hold info used on source side when backing up"""
source_select = None # will be set to source Select iterator
def set_source_select(cls, rpath, tuplelist, *filelists):
"""Initialize select object using tuplelist
Note that each list in filelists must each be passed as
separate arguments, so each is recognized as a file by the
connection. Otherwise we will get an error because a list
containing files can't be pickled.
Also, cls.source_select needs to be cached so get_diffs below
can retrieve the necessary rps.
"""
sel = selection.Select(rpath)
sel.ParseArgs(tuplelist, filelists)
sel.set_iter()
cache_size = Globals.pipeline_max_length * 3 # room for both pipelines, plus leeway
cls.source_select = rorpiter.CacheIndexable(sel, cache_size)
def get_source_select(cls):
"""Return source select iterator, set by set_source_select"""
return cls.source_select
def get_diffs(cls, dest_sigiter):
"""Return diffs of any files with signature in dest_sigiter"""
source_rps = cls.source_select
error_handler = robust.get_error_handler("ListError")
def attach_snapshot(diff_rorp, src_rp):
"""Attach file of snapshot to diff_rorp, w/ error checking"""
fileobj = robust.check_common_error(
error_handler, rpath.RPath.open, (src_rp, "rb"))
if fileobj: diff_rorp.setfile(fileobj)
else: diff_rorp.zero()
diff_rorp.set_attached_filetype('snapshot')
def attach_diff(diff_rorp, src_rp, dest_sig):
"""Attach file of diff to diff_rorp, w/ error checking"""
fileobj = robust.check_common_error(
error_handler, Rdiff.get_delta_sigrp, (dest_sig, src_rp))
if fileobj:
diff_rorp.setfile(fileobj)
diff_rorp.set_attached_filetype('diff')
else:
diff_rorp.zero()
diff_rorp.set_attached_filetype('snapshot')
for dest_sig in dest_sigiter:
if dest_sig is iterfile.RORPIterFlushRepeat:
yield iterfile.RORPIterFlush # Flush buffer when get_sigs does
continue
src_rp = (source_rps.get(dest_sig.index) or
rpath.RORPath(dest_sig.index))
diff_rorp = src_rp.getRORPath()
if dest_sig.isflaglinked():
diff_rorp.flaglinked(dest_sig.get_link_flag())
elif dest_sig.isreg() and src_rp.isreg():
attach_diff(diff_rorp, src_rp, dest_sig)
elif src_rp.isreg(): attach_snapshot(diff_rorp, src_rp)
else: diff_rorp.set_attached_filetype('snapshot')
yield diff_rorp
static.MakeClass(SourceStruct)
class DestinationStruct:
"""Hold info used by destination side when backing up"""
def get_dest_select(cls, rpath, use_metadata = 1):
"""Return destination select rorpath iterator
If metadata file doesn't exist, select all files on
destination except rdiff-backup-data directory.
"""
if use_metadata:
metadata_iter = metadata.GetMetadata_at_time(Globals.rbdir,
Time.prevtime)
if metadata_iter: return metadata_iter
log.Log("Warning: Metadata file not found.\n"
"Metadata will be read from filesystem.", 2)
sel = selection.Select(rpath)
sel.parse_rbdir_exclude()
return sel.set_iter()
def set_rorp_cache(cls, baserp, source_iter, for_increment):
"""Initialize cls.CCPP, the destination rorp cache
for_increment should be true if we are mirror+incrementing,
false if we are just mirroring.
"""
dest_iter = cls.get_dest_select(baserp, for_increment)
collated = rorpiter.Collate2Iters(source_iter, dest_iter)
cls.CCPP = CacheCollatedPostProcess(
collated, Globals.pipeline_max_length*4)
# pipeline len adds some leeway over just 3x (to and from and back)
def get_sigs(cls, dest_base_rpath):
"""Yield signatures of any changed destination files
If we are backing up across a pipe, we must flush the pipeline
every so often so it doesn't get congested on destination end.
"""
flush_threshold = int(Globals.pipeline_max_length/2)
num_rorps_skipped = 0
for src_rorp, dest_rorp in cls.CCPP:
if (src_rorp and dest_rorp and src_rorp == dest_rorp and
(not Globals.preserve_hardlinks or
Hardlink.rorp_eq(src_rorp, dest_rorp))):
num_rorps_skipped += 1
if (Globals.backup_reader is not Globals.backup_writer and
num_rorps_skipped > flush_threshold):
num_rorps_skipped = 0
yield iterfile.RORPIterFlushRepeat
else:
index = src_rorp and src_rorp.index or dest_rorp.index
sig = cls.get_one_sig(dest_base_rpath, index,
src_rorp, dest_rorp)
if sig:
cls.CCPP.flag_changed(index)
yield sig
def get_one_sig(cls, dest_base_rpath, index, src_rorp, dest_rorp):
"""Return a signature given source and destination rorps"""
if (Globals.preserve_hardlinks and
Hardlink.islinked(src_rorp or dest_rorp)):
dest_sig = rpath.RORPath(index)
dest_sig.flaglinked(Hardlink.get_link_index(dest_sig))
elif dest_rorp:
dest_sig = dest_rorp.getRORPath()
if dest_rorp.isreg():
dest_rp = dest_base_rpath.new_index(index)
if not dest_rp.isreg():
log.ErrorLog.write_if_open("UpdateError", dest_rp,
"File changed from regular file before signature")
return None
dest_sig.setfile(Rdiff.get_signature(dest_rp))
else: dest_sig = rpath.RORPath(index)
return dest_sig
def patch(cls, dest_rpath, source_diffiter, start_index = ()):
"""Patch dest_rpath with an rorpiter of diffs"""
ITR = rorpiter.IterTreeReducer(PatchITRB, [dest_rpath, cls.CCPP])
for diff in rorpiter.FillInIter(source_diffiter, dest_rpath):
log.Log("Processing changed file " + diff.get_indexpath(), 5)
ITR(diff.index, diff)
ITR.Finish()
cls.CCPP.close()
dest_rpath.setdata()
def patch_and_increment(cls, dest_rpath, source_diffiter, inc_rpath):
"""Patch dest_rpath with rorpiter of diffs and write increments"""
ITR = rorpiter.IterTreeReducer(IncrementITRB,
[dest_rpath, inc_rpath, cls.CCPP])
for diff in rorpiter.FillInIter(source_diffiter, dest_rpath):
log.Log("Processing changed file " + diff.get_indexpath(), 5)
ITR(diff.index, diff)
ITR.Finish()
cls.CCPP.close()
dest_rpath.setdata()
static.MakeClass(DestinationStruct)
class CacheCollatedPostProcess:
"""Cache a collated iter of (source_rorp, dest_rp) pairs
This is necessary for two reasons:
1. The patch function may need the original source_rorp or
dest_rp information, which is not present in the diff it
receives.
2. The metadata must match what is stored in the destination
directory. If there is an error, either we do not update the
dest directory for that file and the old metadata is used, or
the file is deleted on the other end. Thus we cannot write
any metadata until we know the file has been processed
correctly.
The class caches older source_rorps and dest_rps so the patch
function can retrieve them if necessary. The patch function can
also update the processed correctly flag. When an item falls out
of the cache, we assume it has been processed, and write the
metadata for it.
"""
def __init__(self, collated_iter, cache_size):
"""Initialize new CCWP."""
self.iter = collated_iter # generates (source_rorp, dest_rorp) pairs
self.cache_size = cache_size
self.statfileobj = statistics.init_statfileobj()
if Globals.file_statistics: statistics.FileStats.init()
metadata.OpenMetadata()
# the following should map indices to lists
# [source_rorp, dest_rorp, changed_flag, success_flag, increment]
# changed_flag should be true if the rorps are different, and
# success_flag should be 1 if dest_rorp has been successfully
# updated to source_rorp, and 2 if the destination file is
# deleted entirely. They both default to false (0).
# increment holds the RPath of the increment file if one
# exists. It is used to record file statistics.
self.cache_dict = {}
self.cache_indicies = []
def __iter__(self): return self
def next(self):
"""Return next (source_rorp, dest_rorp) pair. StopIteration passed"""
source_rorp, dest_rorp = self.iter.next()
self.pre_process(source_rorp, dest_rorp)
index = source_rorp and source_rorp.index or dest_rorp.index
self.cache_dict[index] = [source_rorp, dest_rorp, 0, 0, None]
self.cache_indicies.append(index)
if len(self.cache_indicies) > self.cache_size: self.shorten_cache()
return source_rorp, dest_rorp
def pre_process(self, source_rorp, dest_rorp):
"""Do initial processing on source_rorp and dest_rorp
It will not be clear whether source_rorp and dest_rorp have
errors at this point, so don't do anything which assumes they
will be backed up correctly.
"""
if source_rorp: Hardlink.add_rorp(source_rorp, source = 1)
if dest_rorp: Hardlink.add_rorp(dest_rorp, source = 0)
def shorten_cache(self):
"""Remove one element from cache, possibly adding it to metadata"""
first_index = self.cache_indicies[0]
del self.cache_indicies[0]
try: (old_source_rorp, old_dest_rorp, changed_flag,
success_flag, inc) = self.cache_dict[first_index]
except KeyError: # probably caused by error in file system (dup)
log.Log("Warning index %s missing from CCPP cache" %
(first_index,),2)
return
del self.cache_dict[first_index]
self.post_process(old_source_rorp, old_dest_rorp,
changed_flag, success_flag, inc)
def post_process(self, source_rorp, dest_rorp, changed, success, inc):
"""Post process source_rorp and dest_rorp.
The point of this is to write statistics and metadata.
changed will be true if the files have changed. success will
be true if the files have been successfully updated (this is
always false for un-changed files).
"""
if not changed or success:
if source_rorp: self.statfileobj.add_source_file(source_rorp)
if dest_rorp: self.statfileobj.add_dest_file(dest_rorp)
if success == 0: metadata_rorp = dest_rorp
elif success == 1 or success == 2:
self.statfileobj.add_changed(source_rorp, dest_rorp)
metadata_rorp = source_rorp
else: metadata_rorp = None
if metadata_rorp and metadata_rorp.lstat():
metadata.WriteMetadata(metadata_rorp)
if Globals.file_statistics:
statistics.FileStats.update(source_rorp, dest_rorp, changed, inc)
def in_cache(self, index):
"""Return true if given index is cached"""
return self.cache_dict.has_key(index)
def flag_success(self, index):
"""Signal that the file with given index was updated successfully"""
self.cache_dict[index][3] = 1
def flag_deleted(self, index):
"""Signal that the destination file was deleted"""
self.cache_dict[index][3] = 2
def flag_changed(self, index):
"""Signal that the file with given index has changed"""
self.cache_dict[index][2] = 1
def set_inc(self, index, inc):
"""Set the increment of the current file"""
self.cache_dict[index][4] = inc
def get_rorps(self, index):
"""Retrieve (source_rorp, dest_rorp) from cache"""
return self.cache_dict[index][:2]
def get_source_rorp(self, index):
"""Retrieve source_rorp with given index from cache"""
assert index >= self.cache_indicies[0], \
("CCPP index out of order: %s %s" %
(repr(index), repr(self.cache_indicies[0])))
return self.cache_dict[index][0]
def get_mirror_rorp(self, index):
"""Retrieve mirror_rorp with given index from cache"""
return self.cache_dict[index][1]
def close(self):
"""Process the remaining elements in the cache"""
while self.cache_indicies: self.shorten_cache()
metadata.CloseMetadata()
if Globals.print_statistics: statistics.print_active_stats()
if Globals.file_statistics: statistics.FileStats.close()
statistics.write_active_statfileobj()
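# Typical flag lifecycle (illustrative): get_sigs calls flag_changed(index)
# when it yields a signature; the patching ITRBs later call
# flag_success(index) or flag_deleted(index); and when the entry falls out
# of the cache, post_process writes statistics and metadata accordingly.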
class PatchITRB(rorpiter.ITRBranch):
"""Patch an rpath with the given diff iters (use with IterTreeReducer)
The main complication here involves directories. We have to
finish processing the directory after its contents, since the
directory's permissions may forbid altering the contents, and
since the directory's mtime could change as we change the
contents.
"""
def __init__(self, basis_root_rp, CCPP):
"""Set basis_root_rp, the base of the tree to be incremented"""
self.basis_root_rp = basis_root_rp
assert basis_root_rp.conn is Globals.local_connection
self.statfileobj = (statistics.get_active_statfileobj() or
statistics.StatFileObj())
self.dir_replacement, self.dir_update = None, None
self.cached_rp = None
self.CCPP = CCPP
self.error_handler = robust.get_error_handler("UpdateError")
def get_rp_from_root(self, index):
"""Return RPath by adding index to self.basis_root_rp"""
if not self.cached_rp or self.cached_rp.index != index:
self.cached_rp = self.basis_root_rp.new_index(index)
return self.cached_rp
def can_fast_process(self, index, diff_rorp):
"""True if diff_rorp and mirror are not directories"""
rp = self.get_rp_from_root(index)
return not diff_rorp.isdir() and not rp.isdir()
def fast_process(self, index, diff_rorp):
"""Patch base_rp with diff_rorp (case where neither is directory)"""
rp = self.get_rp_from_root(index)
tf = TempFile.new(rp)
if self.patch_to_temp(rp, diff_rorp, tf):
if tf.lstat():
rpath.rename(tf, rp)
self.CCPP.flag_success(index)
elif rp.lstat():
rp.delete()
self.CCPP.flag_deleted(index)
else:
tf.setdata()
if tf.lstat(): tf.delete()
def patch_to_temp(self, basis_rp, diff_rorp, new):
"""Patch basis_rp, writing output in new, which doesn't exist yet"""
if diff_rorp.isflaglinked():
Hardlink.link_rp(diff_rorp, new, self.basis_root_rp)
elif diff_rorp.get_attached_filetype() == 'snapshot':
if diff_rorp.isspecial():
self.write_special(diff_rorp, new)
rpath.copy_attribs(diff_rorp, new)
return 1
elif robust.check_common_error(self.error_handler, rpath.copy,
(diff_rorp, new)) == 0: return 0
else:
assert diff_rorp.get_attached_filetype() == 'diff'
if robust.check_common_error(self.error_handler,
Rdiff.patch_local, (basis_rp, diff_rorp, new)) == 0: return 0
if new.lstat(): rpath.copy_attribs(diff_rorp, new)
return self.matches_cached_rorp(diff_rorp, new)
def matches_cached_rorp(self, diff_rorp, new_rp):
"""Return true if new_rp matches cached src rorp
This is a final check to make sure the temp file just written
matches the stats which we got earlier. If it doesn't it
could confuse the regress operation. This is only necessary
for regular files.
"""
if not new_rp.isreg(): return 1
cached_rorp = self.CCPP.get_source_rorp(diff_rorp.index)
if cached_rorp and cached_rorp.equal_loose(new_rp): return 1
log.ErrorLog.write_if_open("UpdateError", diff_rorp, "Updated mirror "
"temp file %s does not match source" % (new_rp.path,))
return 0
def write_special(self, diff_rorp, new):
"""Write diff_rorp (which holds special file) to new"""
eh = robust.get_error_handler("SpecialFileError")
if robust.check_common_error(eh, rpath.copy, (diff_rorp, new)) == 0:
new.setdata()
if new.lstat(): new.delete()
new.touch()
def start_process(self, index, diff_rorp):
"""Start processing directory - record information for later"""
base_rp = self.base_rp = self.get_rp_from_root(index)
assert diff_rorp.isdir() or base_rp.isdir() or not base_rp.index
if diff_rorp.isdir(): self.prepare_dir(diff_rorp, base_rp)
elif self.set_dir_replacement(diff_rorp, base_rp):
self.CCPP.flag_success(index)
def set_dir_replacement(self, diff_rorp, base_rp):
"""Set self.dir_replacement, which holds data until done with dir
This is used when base_rp is a dir, and diff_rorp is not.
"""
assert diff_rorp.get_attached_filetype() == 'snapshot'
self.dir_replacement = TempFile.new(base_rp)
if not self.patch_to_temp(None, diff_rorp, self.dir_replacement):
if self.dir_replacement.lstat(): self.dir_replacement.delete()
# Was an error, so now restore original directory
rpath.copy_with_attribs(self.CCPP.get_mirror_rorp(diff_rorp.index),
self.dir_replacement)
success = 0
else: success = 1
if base_rp.isdir() and Globals.change_permissions: base_rp.chmod(0700)
return success
def prepare_dir(self, diff_rorp, base_rp):
"""Prepare base_rp to turn into a directory"""
self.dir_update = diff_rorp.getRORPath() # make copy in case changes
if not base_rp.isdir():
if base_rp.lstat(): base_rp.delete()
base_rp.mkdir()
self.CCPP.flag_success(diff_rorp.index)
else: # maybe no change, so query CCPP before tagging success
if self.CCPP.in_cache(diff_rorp.index):
self.CCPP.flag_success(diff_rorp.index)
if Globals.change_permissions: base_rp.chmod(0700)
def end_process(self):
"""Finish processing directory"""
if self.dir_update:
assert self.base_rp.isdir()
rpath.copy_attribs(self.dir_update, self.base_rp)
else:
assert self.dir_replacement
self.base_rp.rmdir()
if self.dir_replacement.lstat():
rpath.rename(self.dir_replacement, self.base_rp)
class IncrementITRB(PatchITRB):
"""Patch an rpath with the given diff iters and write increments
Like PatchITRB, but this time also write increments.
"""
def __init__(self, basis_root_rp, inc_root_rp, rorp_cache):
self.inc_root_rp = inc_root_rp
self.cached_incrp = None
PatchITRB.__init__(self, basis_root_rp, rorp_cache)
def get_incrp(self, index):
"""Return inc RPath by adding index to self.basis_root_rp"""
if not self.cached_incrp or self.cached_incrp.index != index:
self.cached_incrp = self.inc_root_rp.new_index(index)
return self.cached_incrp
def inc_with_checking(self, new, old, inc_rp):
"""Produce increment taking new to old checking for errors"""
try: inc = increment.Increment(new, old, inc_rp)
except OSError, exc:
if (errno.errorcode.has_key(exc[0]) and
errno.errorcode[exc[0]] == 'ENAMETOOLONG'):
self.error_handler(exc, old)
return None
else: raise
return inc
def fast_process(self, index, diff_rorp):
"""Patch base_rp with diff_rorp and write increment (neither is dir)"""
rp = self.get_rp_from_root(index)
tf = TempFile.new(rp)
if self.patch_to_temp(rp, diff_rorp, tf):
inc = self.inc_with_checking(tf, rp, self.get_incrp(index))
if inc is not None:
self.CCPP.set_inc(index, inc)
if inc.isreg():
inc.fsync_with_dir() # Write inc before rp changed
if tf.lstat():
rpath.rename(tf, rp)
self.CCPP.flag_success(index)
elif rp.lstat():
rp.delete()
self.CCPP.flag_deleted(index)
return # normal return, otherwise error occurred
tf.setdata()
if tf.lstat(): tf.delete()
def start_process(self, index, diff_rorp):
"""Start processing directory"""
base_rp = self.base_rp = self.get_rp_from_root(index)
assert diff_rorp.isdir() or base_rp.isdir()
if diff_rorp.isdir():
inc = self.inc_with_checking(diff_rorp, base_rp,
self.get_incrp(index))
if inc and inc.isreg():
inc.fsync_with_dir() # must write inc before rp changed
self.prepare_dir(diff_rorp, base_rp)
elif self.set_dir_replacement(diff_rorp, base_rp):
inc = self.inc_with_checking(self.dir_replacement, base_rp,
self.get_incrp(index))
if inc:
self.CCPP.set_inc(index, inc)
self.CCPP.flag_success(index)
/* ----------------------------------------------------------------------- *
*
* Copyright 2002 Ben Escoto
*
* This file is part of rdiff-backup.
*
* rdiff-backup is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation, Inc., 675 Mass Ave,
* Cambridge MA 02139, USA; either version 2 of the License, or (at
* your option) any later version; incorporated herein by reference.
*
* ----------------------------------------------------------------------- */
#include <Python.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <errno.h>
/* choose the appropriate stat and fstat functions and return structs */
/* This code taken from Python's posixmodule.c */
#undef STAT
#if defined(MS_WIN64) || defined(MS_WIN32)
# define STAT _stati64
# define FSTAT _fstati64
# define STRUCT_STAT struct _stati64
#else
# define STAT stat
# define FSTAT fstat
# define STRUCT_STAT struct stat
#endif
#ifndef PY_LONG_LONG
#define PY_LONG_LONG LONG_LONG
#endif
static PyObject *UnknownFileTypeError;
static PyObject *c_make_file_dict(PyObject *self, PyObject *args);
static PyObject *long2str(PyObject *self, PyObject *args);
static PyObject *str2long(PyObject *self, PyObject *args);
static PyObject *my_sync(PyObject *self, PyObject *args);
/* Turn a stat structure into a python dictionary. The preprocessor
stuff taken from Python's posixmodule.c */
static PyObject *c_make_file_dict(self, args)
PyObject *self;
PyObject *args;
{
PyObject *size, *inode, *mtime, *atime, *devloc, *return_val;
char *filename, filetype[5];
STRUCT_STAT sbuf;
long int mode, perms;
int res;
if (!PyArg_ParseTuple(args, "s", &filename)) return NULL;
Py_BEGIN_ALLOW_THREADS
res = lstat(filename, &sbuf);
Py_END_ALLOW_THREADS
if (res != 0) {
if (errno == ENOENT || errno == ENOTDIR)
return Py_BuildValue("{s:s}", "type", NULL);
else {
PyErr_SetFromErrnoWithFilename(PyExc_OSError, filename);
return NULL;
}
}
#ifdef HAVE_LARGEFILE_SUPPORT
size = PyLong_FromLongLong((PY_LONG_LONG)sbuf.st_size);
inode = PyLong_FromLongLong((PY_LONG_LONG)sbuf.st_ino);
#else
size = PyInt_FromLong(sbuf.st_size);
inode = PyInt_FromLong((long)sbuf.st_ino);
#endif
mode = (long)sbuf.st_mode;
perms = mode & 07777;
#if defined(HAVE_LONG_LONG) && !defined(MS_WINDOWS)
devloc = PyLong_FromLongLong((PY_LONG_LONG)sbuf.st_dev);
#else
devloc = PyInt_FromLong((long)sbuf.st_dev);
#endif
#if SIZEOF_TIME_T > SIZEOF_LONG
mtime = PyLong_FromLongLong((PY_LONG_LONG)sbuf.st_mtime);
atime = PyLong_FromLongLong((PY_LONG_LONG)sbuf.st_atime);
#else
mtime = PyInt_FromLong((long)sbuf.st_mtime);
atime = PyInt_FromLong((long)sbuf.st_atime);
#endif
/* Build return dictionary from stat struct */
if (S_ISREG(mode) || S_ISDIR(mode) || S_ISSOCK(mode) || S_ISFIFO(mode)) {
/* Regular files, directories, sockets, and fifos */
if S_ISREG(mode) strcpy(filetype, "reg");
else if S_ISDIR(mode) strcpy(filetype, "dir");
else if S_ISSOCK(mode) strcpy(filetype, "sock");
else strcpy(filetype, "fifo");
return_val = Py_BuildValue("{s:s,s:O,s:l,s:l,s:l,s:O,s:O,s:l,s:O,s:O}",
"type", filetype,
"size", size,
"perms", perms,
"uid", (long)sbuf.st_uid,
"gid", (long)sbuf.st_gid,
"inode", inode,
"devloc", devloc,
"nlink", (long)sbuf.st_nlink,
"mtime", mtime,
"atime", atime);
} else if S_ISLNK(mode) {
/* Symbolic links */
char linkname[1024];
int len_link = readlink(filename, linkname, 1023);
if (len_link < 0) {
PyErr_SetFromErrno(PyExc_OSError);
return_val = NULL;
} else {
linkname[len_link] = '\0';
return_val = Py_BuildValue("{s:s,s:O,s:l,s:l,s:l,s:O,s:O,s:l,s:s}",
"type", "sym",
"size", size,
"perms", perms,
"uid", (long)sbuf.st_uid,
"gid", (long)sbuf.st_gid,
"inode", inode,
"devloc", devloc,
"nlink", (long)sbuf.st_nlink,
"linkname", linkname);
}
} else if (S_ISCHR(mode) || S_ISBLK(mode)) {
/* Device files */
char devtype[2];
#if defined(HAVE_LONG_LONG) && !defined(MS_WINDOWS)
PY_LONG_LONG devnums = (PY_LONG_LONG)sbuf.st_rdev;
PyObject *major_num = PyLong_FromLongLong(major(devnums));
#else
long int devnums = (long)sbuf.st_rdev;
PyObject *major_num = PyInt_FromLong(devnums >> 8);
#endif
int minor_num = (int)(minor(devnums));
		if (S_ISCHR(mode)) strcpy(devtype, "c");
		else strcpy(devtype, "b");
return_val = Py_BuildValue("{s:s,s:O,s:l,s:l,s:l,s:O,s:O,s:l,s:N}",
"type", "dev",
"size", size,
"perms", perms,
"uid", (long)sbuf.st_uid,
"gid", (long)sbuf.st_gid,
"inode", inode,
"devloc", devloc,
"nlink", (long)sbuf.st_nlink,
"devnums", Py_BuildValue("(s,O,i)", devtype,
major_num, minor_num));
Py_DECREF(major_num);
} else {
/* Unrecognized file type - raise exception */
PyErr_SetString(UnknownFileTypeError, filename);
return_val = NULL;
}
Py_DECREF(size);
Py_DECREF(inode);
Py_DECREF(devloc);
Py_DECREF(mtime);
Py_DECREF(atime);
return return_val;
}
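/* Illustrative example (values assumed, not from a real system): from
 * Python, C.make_file_dict("/etc/passwd") returns a dictionary like
 *   {'type': 'reg', 'size': 1062L, 'perms': 0644, 'uid': 0, 'gid': 0,
 *    'inode': 357313L, 'devloc': 770L, 'nlink': 1,
 *    'mtime': 1058997601, 'atime': 1059000000}
 * while a path that does not exist yields just {'type': None}. */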
/* Convert python long into 7 byte string */
static PyObject *long2str(self, args)
PyObject *self;
PyObject *args;
{
	unsigned char s[7];
	PyLongObject *pylong;
	if (!PyArg_ParseTuple(args, "O!", &PyLong_Type, &pylong)) return NULL;
	if (_PyLong_AsByteArray(pylong, s, 7, 0, 0) != 0) return NULL;
	return Py_BuildValue("s#", s, 7);
}
/* Run sync() and return None */
static PyObject *my_sync(self, args)
PyObject *self;
PyObject *args;
{
if (!PyArg_ParseTuple(args, "")) return NULL;
sync();
return Py_BuildValue("");
}
/* Reverse of above; convert 7 byte string into python long */
static PyObject *str2long(self, args)
PyObject *self;
PyObject *args;
{
unsigned char *s;
int ssize;
if (!PyArg_ParseTuple(args, "s#", &s, &ssize)) return NULL;
if (ssize != 7) {
PyErr_SetString(PyExc_TypeError, "Single argument must be 7 char string");
return NULL;
}
return _PyLong_FromByteArray(s, 7, 0, 0);
}
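/* Round-trip sketch (illustrative): long2str and str2long are inverses,
 * encoding a non-negative Python long as a 7-byte big-endian string:
 *   s = C.long2str(1234567L)    # "\x00\x00\x00\x00\x12\xd6\x87"
 *   assert C.str2long(s) == 1234567L */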
static PyMethodDef CMethods[] = {
{"make_file_dict", c_make_file_dict, METH_VARARGS,
"Make dictionary from file stat"},
{"long2str", long2str, METH_VARARGS, "Convert python long to 7 byte string"},
{"str2long", str2long, METH_VARARGS, "Convert 7 byte string to python long"},
{"sync", my_sync, METH_VARARGS, "sync buffers to disk"},
{NULL, NULL, 0, NULL}
};
void initC(void)
{
PyObject *m, *d;
m = Py_InitModule("C", CMethods);
d = PyModule_GetDict(m);
UnknownFileTypeError = PyErr_NewException("C.UnknownFileTypeError",
NULL, NULL);
PyDict_SetItemString(d, "UnknownFileTypeError", UnknownFileTypeError);
}
#!/usr/bin/env python
import sys, os
from distutils.core import setup, Extension
assert len(sys.argv) == 1
sys.argv.append("build")
setup(name="CModule",
version="0.9.0",
description="rdiff-backup's C component",
ext_modules=[Extension("C", ["cmodule.c"]),
Extension("_librsync", ["_librsyncmodule.c"],
libraries=["rsync"])])
def get_libraries():
"""Return filename of C.so and _librsync.so files"""
build_files = os.listdir("build")
lib_dirs = filter(lambda x: x.startswith("lib"), build_files)
assert len(lib_dirs) == 1, "No library directory or too many"
libdir = lib_dirs[0]
clib = os.path.join("build", libdir, "C.so")
rsynclib = os.path.join("build", libdir, "_librsync.so")
try:
os.lstat(clib)
os.lstat(rsynclib)
except os.error:
print "Library file missing"
sys.exit(1)
return clib, rsynclib
for filename in get_libraries():
assert not os.system("mv %s ." % (filename,))
assert not os.system("rm -rf build")
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Support code for remote execution and data transfer"""
from __future__ import generators
import types, os, tempfile, cPickle, shutil, traceback, pickle, \
socket, sys, gzip
class ConnectionError(Exception): pass
class ConnectionReadError(ConnectionError): pass
class ConnectionQuit(Exception): pass
class Connection:
"""Connection class - represent remote execution
The idea is that, if c is an instance of this class, c.foo will
return the object on the remote side. For functions, c.foo will
return a function that, when called, executes foo on the remote
side, sending over the arguments and sending back the result.
"""
def __repr__(self): return self.__str__()
def __str__(self): return "Simple Connection" # override later
def __nonzero__(self): return 1
class LocalConnection(Connection):
"""Local connection
This is a dummy connection class, so that LC.foo just evaluates to
foo using global scope.
"""
def __init__(self):
"""This prevents two instances of LocalConnection"""
assert not Globals.local_connection
self.conn_number = 0 # changed by SetConnections for server
def __getattr__(self, name):
if name in globals(): return globals()[name]
elif isinstance(__builtins__, dict): return __builtins__[name]
else: return __builtins__.__dict__[name]
def __setattr__(self, name, value): globals()[name] = value
def __delattr__(self, name): del globals()[name]
def __str__(self): return "LocalConnection"
def reval(self, function_string, *args):
return apply(eval(function_string), args)
def quit(self): pass
class ConnectionRequest:
"""Simple wrapper around a PipeConnection request"""
def __init__(self, function_string, num_args):
self.function_string = function_string
self.num_args = num_args
def __str__(self):
return "ConnectionRequest: %s with %d arguments" % \
(self.function_string, self.num_args)
class LowLevelPipeConnection(Connection):
"""Routines for just sending objects from one side of pipe to another
Each thing sent down the pipe is paired with a request number,
currently limited to be between 0 and 255. The size of each thing
should be less than 2^56.
Each thing also has a type, indicated by one of the following
characters:
o - generic object
i - iterator/generator of RORPs
f - file object
b - string
q - quit signal
R - RPath
r - RORPath only
c - PipeConnection object
"""
def __init__(self, inpipe, outpipe):
"""inpipe is a file-type open for reading, outpipe for writing"""
self.inpipe = inpipe
self.outpipe = outpipe
def __str__(self):
"""Return string version
This is actually an important function, because otherwise
requests to represent this object would result in "__str__"
being executed on the other side of the connection.
"""
return "LowLevelPipeConnection"
def _put(self, obj, req_num):
"""Put an object into the pipe (will send raw if string)"""
log.Log.conn("sending", obj, req_num)
if type(obj) is types.StringType: self._putbuf(obj, req_num)
elif isinstance(obj, connection.Connection):self._putconn(obj, req_num)
elif isinstance(obj, rpath.RPath): self._putrpath(obj, req_num)
elif isinstance(obj, rpath.RORPath): self._putrorpath(obj, req_num)
elif ((hasattr(obj, "read") or hasattr(obj, "write"))
and hasattr(obj, "close")): self._putfile(obj, req_num)
elif hasattr(obj, "next"): self._putiter(obj, req_num)
else: self._putobj(obj, req_num)
def _putobj(self, obj, req_num):
"""Send a generic python obj down the outpipe"""
		# for some unknown reason cPickle fails here, so use the slower pickle
self._write("o", pickle.dumps(obj, 1), req_num)
def _putbuf(self, buf, req_num):
"""Send buffer buf down the outpipe"""
self._write("b", buf, req_num)
def _putfile(self, fp, req_num):
"""Send a file to the client using virtual files"""
self._write("f", str(VirtualFile.new(fp)), req_num)
def _putiter(self, iterator, req_num):
"""Put an iterator through the pipe"""
self._write("i",
str(VirtualFile.new(iterfile.RORPIterToFile(iterator))),
req_num)
def _putrpath(self, rpath, req_num):
"""Put an rpath into the pipe
The rpath's connection will be encoded as its conn_number. It
and the other information is put in a tuple.
"""
rpath_repr = (rpath.conn.conn_number, rpath.base,
rpath.index, rpath.data)
self._write("R", cPickle.dumps(rpath_repr, 1), req_num)
def _putrorpath(self, rorpath, req_num):
"""Put an rorpath into the pipe
This is only necessary because if there is a .file attached,
it must be excluded from the pickling
"""
rorpath_repr = (rorpath.index, rorpath.data)
self._write("r", cPickle.dumps(rorpath_repr, 1), req_num)
def _putconn(self, pipeconn, req_num):
"""Put a connection into the pipe
		A pipe connection is represented just as the integer (in
		string form) of the connection number it is *connected to*.
"""
self._write("c", str(pipeconn.conn_number), req_num)
def _putquit(self):
"""Send a string that takes down server"""
self._write("q", "", 255)
def _write(self, headerchar, data, req_num):
"""Write header and then data to the pipe"""
self.outpipe.write(headerchar + chr(req_num) +
C.long2str(long(len(data))))
self.outpipe.write(data)
self.outpipe.flush()
def _read(self, length):
"""Read length bytes from inpipe, returning result"""
return self.inpipe.read(length)
def _s2l_old(self, s):
"""Convert string to long int"""
assert len(s) == 7
l = 0L
for i in range(7): l = l*256 + ord(s[i])
return l
def _l2s_old(self, l):
"""Convert long int to string"""
s = ""
for i in range(7):
l, remainder = divmod(l, 256)
s = chr(remainder) + s
		assert l == 0 # the value must fit in 7 bytes
return s
def _get(self):
"""Read an object from the pipe and return (req_num, value)"""
header_string = self.inpipe.read(9)
		if len(header_string) != 9:
raise ConnectionReadError("Truncated header string (problem "
"probably originated remotely)")
try:
format_string, req_num, length = (header_string[0],
ord(header_string[1]),
C.str2long(header_string[2:]))
except IndexError: raise ConnectionError()
if format_string == "q": raise ConnectionQuit("Received quit signal")
data = self._read(length)
if format_string == "o": result = cPickle.loads(data)
elif format_string == "b": result = data
elif format_string == "f": result = VirtualFile(self, int(data))
elif format_string == "i":
result = iterfile.FileToRORPIter(VirtualFile(self, int(data)))
elif format_string == "r": result = self._getrorpath(data)
elif format_string == "R": result = self._getrpath(data)
else:
assert format_string == "c", header_string
result = Globals.connection_dict[int(data)]
log.Log.conn("received", result, req_num)
return (req_num, result)
def _getrorpath(self, raw_rorpath_buf):
"""Reconstruct RORPath object from raw data"""
index, data = cPickle.loads(raw_rorpath_buf)
return rpath.RORPath(index, data)
def _getrpath(self, raw_rpath_buf):
"""Return RPath object indicated by raw_rpath_buf"""
conn_number, base, index, data = cPickle.loads(raw_rpath_buf)
return rpath.RPath(Globals.connection_dict[conn_number],
base, index, data)
def _close(self):
"""Close the pipes associated with the connection"""
self.outpipe.close()
self.inpipe.close()
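# Wire-format sketch (illustrative, following _write() and _get() above):
# each message is a 9-byte header followed by the payload,
#   headerchar + chr(req_num) + C.long2str(long(len(data))) + data
# so sending the string "hi" with request number 3 puts
#   "b" + "\x03" + "\x00\x00\x00\x00\x00\x00\x02" + "hi"
# down the pipe.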
class PipeConnection(LowLevelPipeConnection):
"""Provide server and client functions for a Pipe Connection
	Both sides act as modules that allow for remote execution. For
instance, self.conn.pow(2,8) will execute the operation on the
server side.
The only difference between the client and server is that the
client makes the first request, and the server listens first.
"""
def __init__(self, inpipe, outpipe, conn_number = 0):
"""Init PipeConnection
conn_number should be a unique (to the session) integer to
identify the connection. For instance, all connections to the
client have conn_number 0. Other connections can use this
number to route commands to the correct process.
"""
LowLevelPipeConnection.__init__(self, inpipe, outpipe)
self.conn_number = conn_number
self.unused_request_numbers = {}
for i in range(256): self.unused_request_numbers[i] = None
def __str__(self): return "PipeConnection %d" % self.conn_number
def get_response(self, desired_req_num):
"""Read from pipe, responding to requests until req_num.
Sometimes after a request is sent, the other side will make
another request before responding to the original one. In
that case, respond to the request. But return once the right
response is given.
"""
while 1:
try: req_num, object = self._get()
except ConnectionQuit:
self._put("quitting", self.get_new_req_num())
self._close()
return
if req_num == desired_req_num: return object
else:
assert isinstance(object, ConnectionRequest)
self.answer_request(object, req_num)
def answer_request(self, request, req_num):
"""Put the object requested by request down the pipe"""
del self.unused_request_numbers[req_num]
argument_list = []
for i in range(request.num_args):
arg_req_num, arg = self._get()
assert arg_req_num == req_num
argument_list.append(arg)
try:
Security.vet_request(request, argument_list)
result = apply(eval(request.function_string), argument_list)
except: result = self.extract_exception()
self._put(result, req_num)
self.unused_request_numbers[req_num] = None
def extract_exception(self):
"""Return active exception"""
if log.Log.verbosity >= 5 or log.Log.term_verbosity >= 5:
log.Log("Sending back exception %s of type %s: \n%s" %
(sys.exc_info()[1], sys.exc_info()[0],
"".join(traceback.format_tb(sys.exc_info()[2]))), 5)
return sys.exc_info()[1]
def Server(self):
"""Start server's read eval return loop"""
Globals.server = 1
Globals.connections.append(self)
log.Log("Starting server", 6)
self.get_response(-1)
def reval(self, function_string, *args):
"""Execute command on remote side
The first argument should be a string that evaluates to a
function, like "pow", and the remaining are arguments to that
function.
"""
req_num = self.get_new_req_num()
self._put(ConnectionRequest(function_string, len(args)), req_num)
for arg in args: self._put(arg, req_num)
result = self.get_response(req_num)
self.unused_request_numbers[req_num] = None
if isinstance(result, Exception): raise result
else: return result
def get_new_req_num(self):
"""Allot a new request number and return it"""
if not self.unused_request_numbers:
raise ConnectionError("Exhaused possible connection numbers")
req_num = self.unused_request_numbers.keys()[0]
del self.unused_request_numbers[req_num]
return req_num
def quit(self):
"""Close the associated pipes and tell server side to quit"""
assert not Globals.server
self._putquit()
self._get()
self._close()
def __getattr__(self, name):
"""Intercept attributes to allow for . invocation"""
return EmulateCallable(self, name)
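# Dispatch sketch (illustrative): if conn is a PipeConnection, an expression
# like conn.os.listdir("/tmp") builds EmulateCallable(conn, "os"), then
# EmulateCallable(conn, "os.listdir") (see below), and calling that runs
# conn.reval("os.listdir", "/tmp"); the remote side answers by eval'ing
# "os.listdir" and applying it to the shipped arguments.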
class RedirectedConnection(Connection):
"""Represent a connection more than one move away
For instance, suppose things are connected like this: S1---C---S2.
If Server1 wants something done by Server2, it will have to go
through the Client. So on S1's side, S2 will be represented by a
RedirectedConnection.
"""
def __init__(self, conn_number, routing_number = 0):
"""RedirectedConnection initializer
Returns a RedirectedConnection object for the given
conn_number, where commands are routed through the connection
with the given routing_number. 0 is the client, so the
default shouldn't have to be changed.
"""
self.conn_number = conn_number
self.routing_number = routing_number
self.routing_conn = Globals.connection_dict[routing_number]
def reval(self, function_string, *args):
"""Evalution function_string on args on remote connection"""
return self.routing_conn.reval("RedirectedRun", self.conn_number,
function_string, *args)
def __str__(self):
return "RedirectedConnection %d,%d" % (self.conn_number,
self.routing_number)
def __getattr__(self, name):
return EmulateCallableRedirected(self.conn_number, self.routing_conn,
name)
def RedirectedRun(conn_number, func, *args):
"""Run func with args on connection with conn number conn_number
This function is meant to redirect requests from one connection to
another, so conn_number must not be the local connection (and also
for security reasons since this function is always made
available).
"""
conn = Globals.connection_dict[conn_number]
assert conn is not Globals.local_connection, conn
return conn.reval(func, *args)
class EmulateCallable:
"""This is used by PipeConnection in calls like conn.os.chmod(foo)"""
def __init__(self, connection, name):
self.connection = connection
self.name = name
def __call__(self, *args):
return apply(self.connection.reval, (self.name,) + args)
def __getattr__(self, attr_name):
return EmulateCallable(self.connection,
"%s.%s" % (self.name, attr_name))
class EmulateCallableRedirected:
"""Used by RedirectedConnection in calls like conn.os.chmod(foo)"""
def __init__(self, conn_number, routing_conn, name):
self.conn_number, self.routing_conn = conn_number, routing_conn
self.name = name
def __call__(self, *args):
return apply(self.routing_conn.reval,
("RedirectedRun", self.conn_number, self.name) + args)
def __getattr__(self, attr_name):
return EmulateCallableRedirected(self.conn_number, self.routing_conn,
"%s.%s" % (self.name, attr_name))
class VirtualFile:
"""When the client asks for a file over the connection, it gets this
The returned instance then forwards requests over the connection.
	The class's dictionary is used by the server to associate each
	file with a unique file number.
"""
#### The following are used by the server
vfiles = {}
counter = 0
def getbyid(cls, id):
return cls.vfiles[id]
getbyid = classmethod(getbyid)
def readfromid(cls, id, length):
if length is None: return cls.vfiles[id].read()
else: return cls.vfiles[id].read(length)
readfromid = classmethod(readfromid)
def readlinefromid(cls, id):
return cls.vfiles[id].readline()
readlinefromid = classmethod(readlinefromid)
def writetoid(cls, id, buffer):
return cls.vfiles[id].write(buffer)
writetoid = classmethod(writetoid)
def closebyid(cls, id):
fp = cls.vfiles[id]
del cls.vfiles[id]
return fp.close()
closebyid = classmethod(closebyid)
def new(cls, fileobj):
"""Associate a new VirtualFile with a read fileobject, return id"""
count = cls.counter
cls.vfiles[count] = fileobj
cls.counter = count + 1
return count
new = classmethod(new)
#### And these are used by the client
def __init__(self, connection, id):
self.connection = connection
self.id = id
def read(self, length = None):
return self.connection.VirtualFile.readfromid(self.id, length)
def readline(self):
return self.connection.VirtualFile.readlinefromid(self.id)
def write(self, buf):
return self.connection.VirtualFile.writetoid(self.id, buf)
def close(self):
return self.connection.VirtualFile.closebyid(self.id)
def __iter__(self):
"""Iterates lines in file, like normal iter(file) behavior"""
while 1:
line = self.readline()
if not line: break
yield line
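# Usage sketch (illustrative): the server side registers an open file with
#   id = VirtualFile.new(open("some_file", "rb"))
# and sends the id over the pipe; the client then reads it remotely with
#   fp = VirtualFile(conn, id); data = fp.read(); fp.close()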
# everything has to be available here for remote connection's use, but
# put at bottom to reduce circularities.
import Globals, Time, Rdiff, Hardlink, FilenameMapping, C, Security, \
Main, rorpiter, selection, increment, statistics, manage, lazy, \
iterfile, rpath, robust, restore, manage, backup, connection, \
TempFile, SetConnections, librsync, log, regress
Globals.local_connection = LocalConnection()
Globals.connections.append(Globals.local_connection)
# Following changed by server in SetConnections
Globals.connection_dict[0] = Globals.local_connection
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Determine the capabilities of given file system
rdiff-backup needs to read and write to file systems with varying
abilities. For instance, some file systems, but not others, have ACLs,
are case-sensitive, or can store ownership information. The code in
this module tests the file system for various features, and returns an
FSAbilities object describing it.
"""
import errno
import Globals, log, TempFile
class FSAbilities:
"""Store capabilities of given file system"""
chars_to_quote = None # Hold characters not allowable in file names
ownership = None # True if chown works on this filesystem
acls = None # True if access control lists supported
eas = None # True if extended attributes supported
hardlinks = None # True if hard linking supported
fsync_dirs = None # True if directories can be fsync'd
read_only = None # True if capabilities were determined non-destructively
def init_readonly(self, rp):
"""Set variables using fs tested at RPath rp
This method does not write to the file system at all, and
should be run on the file system when the file system will
only need to be read.
Only self.acls and self.eas are set.
"""
self.read_only = 1
self.set_eas(rp, 0)
self.set_acls(rp)
return self
def init_readwrite(self, rbdir, use_ctq_file = 1):
"""Set variables using fs tested at rp_base
This method creates a temp directory in rp_base and writes to
it in order to test various features. Use on a file system
that will be written to.
This sets self.chars_to_quote, self.ownership, self.acls,
self.eas, self.hardlinks, and self.fsync_dirs.
		If use_ctq_file is true, try reading the "chars_to_quote"
file in directory.
"""
assert rbdir.isdir()
self.read_only = 0
subdir = TempFile.new_in_dir(rbdir)
subdir.mkdir()
self.set_ownership(subdir)
self.set_hardlinks(subdir)
self.set_fsync_dirs(subdir)
self.set_eas(subdir, 1)
self.set_acls(subdir)
self.set_chars_to_quote(subdir)
if use_ctq_file: self.compare_chars_to_quote(rbdir)
subdir.delete()
return self
def compare_chars_to_quote(self, rbdir):
"""Read chars_to_quote file, compare with current settings"""
assert self.chars_to_quote is not None
ctq_rp = rbdir.append("chars_to_quote")
def write_new_chars():
"""Replace old chars_to_quote file with new value"""
if ctq_rp.lstat(): ctq_rp.delete()
fp = ctq_rp.open("wb")
fp.write(self.chars_to_quote)
assert not fp.close()
def get_old_chars():
fp = ctq_rp.open("rb")
old_chars = fp.read()
assert not fp.close()
return old_chars
if not ctq_rp.lstat(): write_new_chars()
else:
old_chars = get_old_chars()
if old_chars != self.chars_to_quote:
if self.chars_to_quote == "":
log.Log("Warning: File system no longer needs quoting, "
"but will retain for backwards compatibility.", 2)
else: log.FatalError("""New quoting requirements
This may be caused by copying an rdiff-backup directory from a
normal file system onto a windows one that cannot support the same
characters. If you want to risk it, remove the file
rdiff-backup-data/chars_to_quote.
""")
def set_ownership(self, testdir):
"""Set self.ownership to true iff testdir's ownership can be changed"""
tmp_rp = testdir.append("foo")
tmp_rp.touch()
uid, gid = tmp_rp.getuidgid()
try:
			tmp_rp.chown(uid+1, gid+1) # just choose an arbitrary uid/gid
tmp_rp.chown(0, 0)
except (IOError, OSError), exc:
if exc[0] == errno.EPERM:
log.Log("Warning: ownership cannot be changed on filesystem "
"at device %s" % (testdir.getdevloc(),), 2)
self.ownership = 0
else: raise
else: self.ownership = 1
tmp_rp.delete()
def set_hardlinks(self, testdir):
"""Set self.hardlinks to true iff hard linked files can be made"""
hl_source = testdir.append("hardlinked_file1")
hl_dest = testdir.append("hardlinked_file2")
hl_source.touch()
try:
hl_dest.hardlink(hl_source.path)
assert hl_source.getinode() == hl_dest.getinode()
except (IOError, OSError), exc:
if exc[0] in (errno.EOPNOTSUPP, errno.EPERM):
log.Log("Warning: hard linking not supported by filesystem %s"
% (testdir.getdevloc(),), 2)
self.hardlinks = 0
else: raise
else: self.hardlinks = 1
def set_fsync_dirs(self, testdir):
"""Set self.fsync_dirs if directories can be fsync'd"""
try: testdir.fsync()
except (IOError, OSError), exc:
log.Log("Warning: Directories on file system at %s are not "
"fsyncable.\nAssuming it's unnecessary." %
(testdir.getdevloc(),), 2)
self.fsync_dirs = 0
else: self.fsync_dirs = 1
def set_chars_to_quote(self, subdir):
"""Set self.chars_to_quote by trying to write various paths"""
def is_case_sensitive():
"""Return true if file system is case sensitive"""
upper_a = subdir.append("A")
upper_a.touch()
lower_a = subdir.append("a")
if lower_a.lstat():
lower_a.delete()
upper_a.setdata()
assert not upper_a.lstat()
return 0
else:
upper_a.delete()
return 1
def supports_unusual_chars():
"""Test handling of several chars sometimes not supported"""
for filename in [':', '\\', chr(175)]:
rp = subdir.append(filename)
try: rp.touch()
except IOError:
assert not rp.lstat()
return 0
assert rp.lstat()
rp.delete()
return 1
def sanity_check():
"""Make sure basic filenames writable"""
for filename in ['5-_ a']:
rp = subdir.append(filename)
rp.touch()
assert rp.lstat()
rp.delete()
sanity_check()
if is_case_sensitive():
if supports_unusual_chars(): self.chars_to_quote = ""
else: self.chars_to_quote = "^A-Za-z0-9_ -"
else:
if supports_unusual_chars(): self.chars_to_quote = "A-Z;"
else: self.chars_to_quote = "^a-z0-9_ -"
def set_acls(self, rp):
"""Set self.acls based on rp. Does not write. Needs to be local"""
assert Globals.local_connection is rp.conn
assert rp.lstat()
try: import posix1e
except ImportError:
log.Log("Warning: Unable to import module posix1e from pylibacl "
"package.\nACLs not supported on device %s" %
(rp.getdevloc(),), 2)
self.acls = 0
return
try: posix1e.ACL(file=rp.path)
except IOError, exc:
if exc[0] == errno.EOPNOTSUPP:
log.Log("Warning: ACLs appear not to be supported by "
"filesystem on device %s" % (rp.getdevloc(),), 2)
self.acls = 0
else: raise
else: self.acls = 1
def set_eas(self, rp, write):
"""Set extended attributes from rp. Run locally.
Tests writing if write is true.
"""
assert Globals.local_connection is rp.conn
assert rp.lstat()
try: import xattr
except ImportError:
log.Log("Warning: Unable to import module xattr. ACLs not "
"supported on device %s" % (rp.getdevloc(),), 2)
self.eas = 0
return
try:
xattr.listxattr(rp.path)
if write:
xattr.setxattr(rp.path, "user.test", "test val")
assert xattr.getxattr(rp.path, "user.test") == "test val"
except IOError, exc:
if exc[0] == errno.EOPNOTSUPP:
log.Log("Warning: Extended attributes not supported by "
"filesystem on device %s" % (rp.getdevloc(),), 2)
self.eas = 0
else: raise
else: self.eas = 1
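# Typical use (illustrative sketch): probe a writable destination with
#   fsa = FSAbilities().init_readwrite(Globals.rbdir)
# or a read-only source with FSAbilities().init_readonly(rp); afterwards
# fields like fsa.chars_to_quote, fsa.ownership, and fsa.hardlinks record
# what the file system supports.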
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Provides functions and *ITR classes, for writing increment files"""
import Globals, Time, rpath, Rdiff, log, statistics, robust
def Increment(new, mirror, incpref):
"""Main file incrementing function, returns inc file created
new is the file on the active partition,
mirror is the mirrored file from the last backup,
incpref is the prefix of the increment file.
This function basically moves the information about the mirror
file to incpref.
"""
log.Log("Incrementing mirror file " + mirror.path, 5)
if ((new and new.isdir()) or mirror.isdir()) and not incpref.isdir():
incpref.mkdir()
if not mirror.lstat(): incrp = makemissing(incpref)
elif mirror.isdir(): incrp = makedir(mirror, incpref)
elif new.isreg() and mirror.isreg():
incrp = makediff(new, mirror, incpref)
else: incrp = makesnapshot(mirror, incpref)
statistics.process_increment(incrp)
return incrp
def makemissing(incpref):
"""Signify that mirror file was missing"""
incrp = get_inc(incpref, "missing")
incrp.touch()
return incrp
def iscompressed(mirror):
"""Return true if mirror's increments should be compressed"""
return (Globals.compression and
not Globals.no_compression_regexp.match(mirror.path))
def makesnapshot(mirror, incpref):
"""Copy mirror to incfile, since new is quite different"""
compress = iscompressed(mirror)
if compress and mirror.isreg():
snapshotrp = get_inc(incpref, "snapshot.gz")
else: snapshotrp = get_inc(incpref, "snapshot")
if mirror.isspecial(): # check for errors when creating special increments
eh = robust.get_error_handler("SpecialFileError")
if robust.check_common_error(eh, rpath.copy_with_attribs,
(mirror, snapshotrp, compress)) == 0:
snapshotrp.setdata()
if snapshotrp.lstat(): snapshotrp.delete()
snapshotrp.touch()
else: rpath.copy_with_attribs(mirror, snapshotrp, compress)
return snapshotrp
def makediff(new, mirror, incpref):
"""Make incfile which is a diff new -> mirror"""
compress = iscompressed(mirror)
if compress: diff = get_inc(incpref, "diff.gz")
else: diff = get_inc(incpref, "diff")
Rdiff.write_delta(new, mirror, diff, compress)
rpath.copy_attribs(mirror, diff)
return diff
def makedir(mirrordir, incpref):
"""Make file indicating directory mirrordir has changed"""
dirsign = get_inc(incpref, "dir")
dirsign.touch()
if Globals.change_dir_inc_perms: rpath.copy_attribs(mirrordir, dirsign)
return dirsign
def get_inc(rp, typestr, time = None):
"""Return increment like rp but with time and typestr suffixes
To avoid any quoting, the returned rpath has empty index, and the
whole filename is in the base (which is not quoted).
"""
if time is None: time = Time.prevtime
addtostr = lambda s: "%s.%s.%s" % (s, Time.timetostring(time), typestr)
if rp.index:
incrp = rp.__class__(rp.conn, rp.base, rp.index[:-1] +
(addtostr(rp.index[-1]),))
else:
dirname, basename = rp.dirsplit()
incrp = rp.__class__(rp.conn, dirname, (addtostr(basename),))
assert not incrp.lstat()
return incrp
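# Naming sketch: for a changed mirror file "foo" and session time T, the
# functions above create increments named like
#   foo.<T>.diff.gz  foo.<T>.snapshot(.gz)  foo.<T>.dir  foo.<T>.missing
# where <T> is Time.timetostring(T), e.g. "2003-07-24T12:00:00-07:00".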
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Convert an iterator to a file object and vice-versa"""
import cPickle, array, types
import Globals, C, robust, log, rpath
class IterFileException(Exception): pass
class UnwrapFile:
"""Contains some basic methods for parsing a file containing an iter"""
def __init__(self, file):
self.file = file
def _s2l_old(self, s):
"""Convert string to long int"""
assert len(s) == 7
l = 0L
for i in range(7): l = l*256 + ord(s[i])
return l
def _get(self):
"""Return pair (type, data) next in line on the file
type is a single character which is either
"o" for object,
"f" for file,
"c" for a continution of a file,
"e" for an exception, or
None if no more data can be read.
Data is either the file's data, if type is "c" or "f", or the
actual object if the type is "o" or "e".
"""
header = self.file.read(8)
if not header: return None, None
if len(header) != 8:
assert None, "Header %s is only %d bytes" % (header, len(header))
type, length = header[0], C.str2long(header[1:])
buf = self.file.read(length)
if type == "o" or type == "e": return type, cPickle.loads(buf)
else: return type, buf
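# Record sketch (illustrative): the stream parsed by _get() above is a
# sequence of records of the form
#   <1 type byte> + <7-byte big-endian length> + <payload>
# so a pickled object p travels as "o" + C.long2str(long(len(p))) + p.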
class IterWrappingFile(UnwrapFile):
"""An iterator generated from a file.
Initialize with a file type object, and then it will return the
elements of the file in order.
"""
def __init__(self, file):
UnwrapFile.__init__(self, file)
self.currently_in_file = None
def __iter__(self): return self
def next(self):
if self.currently_in_file:
self.currently_in_file.close() # no error checking by this point
type, data = self._get()
if not type: raise StopIteration
if type == "o" or type == "e": return data
elif type == "f":
file = IterVirtualFile(self, data)
if data: self.currently_in_file = file
else: self.currently_in_file = None
return file
else: raise IterFileException("Bad file type %s" % type)
class IterVirtualFile(UnwrapFile):
"""Another version of a pretend file
This is returned by IterWrappingFile when a file is embedded in
the main file that the IterWrappingFile is based around.
"""
def __init__(self, iwf, initial_data):
"""Initializer
initial_data is the data from the first block of the file.
iwf is the iter wrapping file that spawned this
IterVirtualFile.
"""
UnwrapFile.__init__(self, iwf.file)
self.iwf = iwf
self.buffer = initial_data
self.closed = None
def read(self, length = -1):
"""Read length bytes from the file, updating buffers as necessary"""
assert not self.closed
if self.iwf.currently_in_file:
if length >= 0:
while length >= len(self.buffer):
if not self.addtobuffer(): break
real_len = min(length, len(self.buffer))
else:
while 1:
if not self.addtobuffer(): break
real_len = len(self.buffer)
else: real_len = min(length, len(self.buffer))
return_val = self.buffer[:real_len]
self.buffer = self.buffer[real_len:]
return return_val
def addtobuffer(self):
"""Read a chunk from the file and add it to the buffer"""
assert self.iwf.currently_in_file
type, data = self.iwf._get()
if type == "e":
self.iwf.currently_in_file = None
raise data
assert type == "c", "Type is %s instead of c" % type
if data:
self.buffer += data
return 1
else:
self.iwf.currently_in_file = None
return None
def close(self):
"""Currently just reads whats left and discards it"""
while self.iwf.currently_in_file:
self.addtobuffer()
self.buffer = ""
self.closed = 1
class FileWrappingIter:
"""A file interface wrapping around an iterator
This is initialized with an iterator, and then converts it into a
stream of characters. The object will evaluate as little of the
iterator as is necessary to provide the requested bytes.
	The actual file is a sequence of marshaled objects, each preceded
	by 8 bytes which identify the type of the following object and
	specify its length. File objects are not marshalled; instead their
	data is written in chunks of Globals.blocksize, and the following
	blocks can identify themselves as continuations.
"""
def __init__(self, iter):
"""Initialize with iter"""
self.iter = iter
self.array_buf = array.array('c')
self.currently_in_file = None
self.closed = None
def read(self, length):
"""Return next length bytes in file"""
assert not self.closed
while len(self.array_buf) < length:
if not self.addtobuffer(): break
result = self.array_buf[:length].tostring()
del self.array_buf[:length]
return result
def addtobuffer(self):
"""Updates self.buffer, adding a chunk from the iterator.
Returns None if we have reached the end of the iterator,
otherwise return true.
"""
if self.currently_in_file: self.addfromfile("c")
else:
try: currentobj = self.iter.next()
except StopIteration: return None
if hasattr(currentobj, "read") and hasattr(currentobj, "close"):
self.currently_in_file = currentobj
self.addfromfile("f")
else:
pickle = cPickle.dumps(currentobj, 1)
self.array_buf.fromstring("o")
self.array_buf.fromstring(C.long2str(long(len(pickle))))
self.array_buf.fromstring(pickle)
return 1
def addfromfile(self, prefix_letter):
"""Read a chunk from the current file and add to array_buf
prefix_letter and the length will be prepended to the file
data. If there is an exception while reading the file, the
exception will be added to array_buf instead.
"""
buf = robust.check_common_error(self.read_error_handler,
self.currently_in_file.read,
[Globals.blocksize])
if buf == "" or buf is None:
assert not self.currently_in_file.close()
self.currently_in_file = None
if buf is None: # error occurred above, encode exception
prefix_letter = "e"
buf = cPickle.dumps(self.last_exception, 1)
total = "".join((prefix_letter, C.long2str(long(len(buf))), buf))
self.array_buf.fromstring(total)
def read_error_handler(self, exc, blocksize):
"""Log error when reading from file"""
self.last_exception = exc
return None
def _l2s_old(self, l):
"""Convert long int to string of 7 characters"""
s = ""
for i in range(7):
l, remainder = divmod(l, 256)
s = chr(remainder) + s
		assert l == 0 # the value must fit in 7 bytes
return s
def close(self): self.closed = 1
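# Usage sketch (illustrative): wrap an iterator of picklable objects and
# consume it through the file interface; only as much of the iterator is
# evaluated as the reads require:
#   fp = FileWrappingIter(iter([{"a": 1}, "some string"]))
#   first_bytes = fp.read(64)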
class RORPIterFlush:
"""Used to signal that a RORPIterToFile should flush buffer"""
pass
class RORPIterFlushRepeat(RORPIterFlush):
"""Flush, but then cause RORPIter to yield this same object
	Thus if we put together a pipeline of these, one RORPIterFlushRepeat
	can cause all the segments to flush in sequence.
"""
pass
class RORPIterToFile(FileWrappingIter):
"""Take a RORPIter and give it a file-ish interface
This is how we send signatures and diffs across the line. As
sending each one separately via a read() call would result in a
lot of latency, the read()'s are buffered - a read() call with no
arguments will return a variable length string (possibly empty).
To flush the RORPIterToFile, have the iterator yield a
RORPIterFlush class.
"""
def __init__(self, rpiter, max_buffer_bytes = None, max_buffer_rps = None):
"""RORPIterToFile initializer
max_buffer_bytes is the maximum size of the buffer in bytes.
max_buffer_rps is the maximum size of the buffer in rorps.
"""
self.max_buffer_bytes = max_buffer_bytes or Globals.conn_bufsize
self.max_buffer_rps = max_buffer_rps or Globals.pipeline_max_length
self.rorps_in_buffer = 0
self.next_in_line = None
FileWrappingIter.__init__(self, rpiter)
def read(self, length = None):
"""Return some number of bytes, including 0"""
assert not self.closed
if length is None:
while (len(self.array_buf) < self.max_buffer_bytes and
self.rorps_in_buffer < self.max_buffer_rps):
if not self.addtobuffer(): break
result = self.array_buf.tostring()
del self.array_buf[:]
self.rorps_in_buffer = 0
return result
else:
assert length >= 0
read_buffer = self.read()
while len(read_buffer) < length: read_buffer += self.read()
self.array_buf.fromstring(read_buffer[length:])
			return read_buffer[:length]
def addtobuffer(self):
"""Add some number of bytes to the buffer. Return false if done"""
if self.currently_in_file:
self.addfromfile("c")
if not self.currently_in_file: self.rorps_in_buffer += 1
else:
if self.next_in_line:
currentobj = self.next_in_line
self.next_in_line = 0
else:
try: currentobj = self.iter.next()
except StopIteration:
self.addfinal()
return None
if hasattr(currentobj, "read") and hasattr(currentobj, "close"):
self.currently_in_file = currentobj
self.addfromfile("f")
elif (type(currentobj) is types.ClassType and
issubclass(currentobj, iterfile.RORPIterFlush)):
if currentobj is iterfile.RORPIterFlushRepeat:
self.add_flush_repeater()
return None
else: self.addrorp(currentobj)
return 1
def add_flush_repeater(self):
"""Add a RORPIterFlushRepeat object to the buffer"""
pickle = cPickle.dumps(iterfile.RORPIterFlushRepeat, 1)
self.array_buf.fromstring("o")
self.array_buf.fromstring(C.long2str(long(len(pickle))))
self.array_buf.fromstring(pickle)
def addrorp(self, rorp):
"""Add a rorp to the buffer"""
if rorp.file:
pickle = cPickle.dumps((rorp.index, rorp.data, 1), 1)
self.next_in_line = rorp.file
else:
pickle = cPickle.dumps((rorp.index, rorp.data, 0), 1)
self.rorps_in_buffer += 1
self.array_buf.fromstring("o")
self.array_buf.fromstring(C.long2str(long(len(pickle))))
self.array_buf.fromstring(pickle)
def addfinal(self):
"""Signal the end of the iterator to the other end"""
self.array_buf.fromstring("z")
self.array_buf.fromstring(C.long2str(0L))
def close(self): self.closed = 1
class FileToRORPIter(IterWrappingFile):
"""Take a RORPIterToFile and turn it back into a RORPIter"""
def __init__(self, file):
IterWrappingFile.__init__(self, file)
self.buf = ""
def __iter__(self): return self
def next(self):
"""Return next object in iter, or raise StopIteration"""
if self.currently_in_file:
self.currently_in_file.close()
type = None
while not type: type, data = self._get()
if type == "z": raise StopIteration
elif type == "o":
if data is iterfile.RORPIterFlushRepeat: return data
else: return self.get_rorp(data)
else: raise IterFileException("Bad file type %s" % (type,))
def get_rorp(self, pickled_tuple):
"""Return rorp that data represents"""
index, data_dict, num_files = pickled_tuple
rorp = rpath.RORPath(index, data_dict)
if num_files:
assert num_files == 1, "Only one file accepted right now"
rorp.setfile(self.get_file())
return rorp
def get_file(self):
"""Read file object from file"""
type, data = self._get()
if type == "f":
file = IterVirtualFile(self, data)
if data: self.currently_in_file = file
else: self.currently_in_file = None
return file
assert type == "e", "Expected type e, got %s" % (type,)
assert isinstance(data, Exception)
return ErrorFile(data)
def _get(self):
"""Return (type, data or object) pair
This is like UnwrapFile._get() but reads in variable length
blocks. Also type "z" is allowed, which means end of
iterator. An empty read() is not considered to mark the end
of remote iter.
"""
if not self.buf: self.buf += self.file.read()
if not self.buf: return None, None
assert len(self.buf) >= 8, "Unexpected end of RORPIter file"
type, length = self.buf[0], C.str2long(self.buf[1:8])
data = self.buf[8:8+length]
self.buf = self.buf[8+length:]
if type == "o" or type == "e": return type, cPickle.loads(data)
else: return type, data
class ErrorFile:
"""File-like that just raises error (used by FileToRORPIter above)"""
def __init__(self, exc):
"""Initialize new ErrorFile. exc is the exception to raise on read"""
self.exc = exc
def read(self, l=-1): raise self.exc
def close(self): return None
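# Round-trip sketch (illustrative): RORPIterToFile and FileToRORPIter are
# inverses, so
#   iter2 = FileToRORPIter(RORPIterToFile(iter1))
# yields the same RORPaths (with any attached file data) that iter1
# produced; this is how signatures and diffs travel over a connection.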
import iterfile
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
# UPDATE: I have decided not to use journaling and use the regress
# stuff exclusively. This code is left here for posterity.
"""Application level journaling for better error recovery
This module has routines for maintaining a "journal" to keep track of
writes to an rdiff-backup destination directory. This is necessary
because otherwise data could be lost if the program is abruptly
stopped (say, due to a computer crash). For instance, various temp files
could be left on the mirror drive. Or it may not be clear whether an
increment file has been fully written.
To keep this from happening, various writes may be journaled (a write
corresponds to the updating of a single file). To do this, a separate
file in the journal directory is created, and the necessary
information is written to it. When the transaction is finished, that
journal entry file will be deleted. If there is a crash, the next
time rdiff-backup is run, it will see the journal file, and process
it, bringing the rdiff-backup destination directory back into a
consistent state.
Two caveats:
1) The journal is only meant to be used in conjunction with a
regression to the state before the backup was attempted. If the
failed session is not "cleaned" out right after the journal is
recovered, something bad could happen.
2) This journal will only be effective if the actual hardware and OS
are working. If disk failures are causing data loss, or if a crash
causes your filesystem to be corrupted, rdiff-backup could lose
data despite all this journal stuff.
"""
import Globals, log, rpath, cPickle, TempFile, os, restore
# Holds an rpath of the journal directory and an open file object for it
journal_dir_rp = None
journal_dir_fp = None
# Set to time in seconds of previous aborted backup
unsuccessful_backup_time = None
def open_journal():
"""Make sure the journal dir exists (creating it if necessary)"""
global journal_dir_rp, journal_dir_fp
assert journal_dir_rp is journal_dir_fp is None
journal_dir_rp = Globals.rbdir.append("journal")
if not journal_dir_rp.lstat():
log.Log("Creating journal directory %s" % (journal_dir_rp.path,), 5)
journal_dir_rp.mkdir()
assert journal_dir_rp.isdir()
journal_dir_fp = journal_dir_rp.open("rb")
def close_journal():
"""Close the journal at the end of a session"""
global journal_dir_rp, journal_dir_fp
assert not journal_dir_rp.listdir()
assert not journal_dir_fp.close()
journal_dir_rp = journal_dir_fp = None
def sync_journal():
"""fsync the journal directory.
Note that fsync'ing a particular entry file may also be required
to guarantee writes have been committed.
"""
journal_dir_rp.fsync(journal_dir_fp)
def recover_journal():
"""Read the journal and recover each of the events"""
for entry in get_entries_from_journal():
entry.recover()
entry.delete()
def get_entries_from_journal():
"""Return list of entries in the journal (deletes bad entries)"""
entry_list = []
for filename in journal_dir_rp.listdir():
entry_rp = journal_dir_rp.append(filename)
e = Entry()
success = e.init_from_rp(entry_rp)
if not success: entry_rp.delete()
else: entry_list.append(e)
return entry_list
def write_entry(index, temp_index, testfile_option, testfile_type):
"""Write new entry given variables into journal, return entry"""
e = Entry()
e.index = index
	e.temp_index = temp_index
e.testfile_option = testfile_option
e.testfile_type = testfile_type
e.write()
return e
class Entry:
"""A single journal entry, describing one transaction
Although called a journal entry, this is less a description of
	what is going to happen than a short recipe of how to recover if
something goes wrong.
Currently the recipe needs to be very simple and is determined by
the four variables index, temp_index, testfile_option,
testfile_type. See the recover() method for details.
"""
index = None
temp_index = None
testfile_option = None
testfile_type = None # None is a valid value for this variable
# This points to the rpath in the journal dir that holds this entry
entry_rp = None
def recover(self):
"""Recover the current journal entry
self.testfile_option has 3 possibilities:
1 - testfile is mirror file
2 - testfile is increment file
3 - testfile is temp file
Either way, see if the type of the testfile matches
		testfile_type. If so, delete the increment file. Delete the
		tempfile regardless.
		We express things in terms of indices because we need paths
relative to a fixed directory (like Globals.dest_root).
It's OK to recover the same entry multiple times.
"""
assert self.index is not None and self.temp_index is not None
mirror_rp = Globals.dest_root.new_index(self.index)
if self.temp_index:
temp_rp = Globals.dest_root.new_index(self.temp_index)
inc_rp = self.get_inc()
assert 1 <= self.testfile_option <= 3
if self.testfile_option == 1: test_rp = mirror_rp
elif self.testfile_option == 2: test_rp = inc_rp
else: test_rp = temp_rp
if test_rp and test_rp.lstat() == self.testfile_type:
if inc_rp and inc_rp.lstat(): inc_rp.sync_delete()
if temp_rp and temp_rp.lstat(): temp_rp.sync_delete()
def get_inc(self):
"""Return inc_rpath, if any, corresponding to self.index"""
incroot = Globals.rbdir.append_path("increments")
incbase = incroot.new_index(self.index)
inclist = restore.get_inclist(incbase)
inclist = filter(lambda inc:
inc.getinctime() == unsuccessful_backup_time, inclist)
assert len(inclist) <= 1
if inclist: return inclist[0]
else: return None
def to_string(self):
"""Return string form of entry"""
return cPickle.dumps({'index': self.index,
'testfile_option': self.testfile_option,
'testfile_type': self.testfile_type,
'temp_index': self.temp_index})
def write(self):
"""Write the current entry into the journal"""
entry_rp = TempFile.new_in_dir(journal_dir_rp)
fp = entry_rp.open("wb")
fp.write(self.to_string())
entry_rp.fsync(fp)
assert not fp.close()
sync_journal()
self.entry_rp = entry_rp
def init_from_string(self, s):
"""Initialize values from string. Return 0 if problem."""
try: val_dict = cPickle.loads(s)
except cPickle.UnpicklingError: return 0
try:
self.index = val_dict['index']
self.testfile_type = val_dict['testfile_type']
self.testfile_option = val_dict['testfile_option']
self.temp_index = val_dict['temp_index']
		except (TypeError, KeyError): return 0
return 1
def init_from_rp(self, entry_rp):
"""Initialize values from an rpath. Return 0 if problem"""
if not entry_rp.isreg(): return 0
success = self.init_from_string(entry_rp.get_data())
if not success: return 0
self.entry_rp = entry_rp
return 1
def delete(self):
"""Remove entry from the journal. self.entry_rp must be set"""
self.entry_rp.sync_delete()
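# Lifecycle sketch (illustrative; index values assumed): a writer journals a
# transaction, then removes the entry once the write is committed:
#   open_journal()
#   entry = write_entry(index, temp_index, 3, "reg")  # 3 = temp file test
#   ... perform the journaled write ...
#   entry.delete()
# After a crash, the next session runs recover_journal() to replay any
# entries left behind.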
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Define some lazy data structures and functions acting on them"""
from __future__ import generators
import os, stat, types
import static
class Iter:
"""Hold static methods for the manipulation of lazy iterators"""
def filter(predicate, iterator):
"""Like filter in a lazy functional programming language"""
for i in iterator:
if predicate(i): yield i
def map(function, iterator):
"""Like map in a lazy functional programming language"""
for i in iterator: yield function(i)
def foreach(function, iterator):
"""Run function on each element in iterator"""
for i in iterator: function(i)
def cat(*iters):
"""Lazily concatenate iterators"""
for iter in iters:
for i in iter: yield i
def cat2(iter_of_iters):
"""Lazily concatenate iterators, iterated by big iterator"""
for iter in iter_of_iters:
for i in iter: yield i
def empty(iter):
"""True if iterator has length 0"""
for i in iter: return None
return 1
def equal(iter1, iter2, verbose = None, operator = lambda x, y: x == y):
"""True if iterator 1 has same elements as iterator 2
Use equality operator, or == if it is unspecified.
"""
for i1 in iter1:
try: i2 = iter2.next()
except StopIteration:
if verbose: print "End when i1 = %s" % (i1,)
return None
if not operator(i1, i2):
if verbose: print "%s not equal to %s" % (i1, i2)
return None
try: i2 = iter2.next()
except StopIteration: return 1
if verbose: print "End when i2 = %s" % (i2,)
return None
def Or(iter):
"""True if any element in iterator is true. Short circuiting"""
i = None
for i in iter:
if i: return i
return i
def And(iter):
"""True if all elements in iterator are true. Short circuiting"""
i = 1
for i in iter:
if not i: return i
return i
def len(iter):
"""Return length of iterator"""
i = 0
while 1:
try: iter.next()
except StopIteration: return i
i = i+1
def foldr(f, default, iter):
"""foldr the "fundamental list recursion operator"?"""
try: next = iter.next()
except StopIteration: return default
return f(next, Iter.foldr(f, default, iter))
def foldl(f, default, iter):
"""the fundamental list iteration operator.."""
while 1:
try: next = iter.next()
except StopIteration: return default
default = f(default, next)
def multiplex(iter, num_of_forks, final_func = None, closing_func = None):
"""Split a single iterater into a number of streams
The return val will be a list with length num_of_forks, each
of which will be an iterator like iter. final_func is the
function that will be called on each element in iter just as
it is being removed from the buffer. closing_func is called
when all the streams are finished.
"""
if num_of_forks == 2 and not final_func and not closing_func:
im2 = IterMultiplex2(iter)
return (im2.yielda(), im2.yieldb())
if not final_func: final_func = lambda i: None
if not closing_func: closing_func = lambda: None
# buffer is a list of elements that some iterators need and others
# don't
buffer = []
		# buffer[forkposition[i]] is the next element yielded by iterator
# i. If it is -1, yield from the original iter
starting_forkposition = [-1] * num_of_forks
forkposition = starting_forkposition[:]
called_closing_func = [None]
def get_next(fork_num):
"""Return the next element requested by fork_num"""
if forkposition[fork_num] == -1:
try: buffer.insert(0, iter.next())
except StopIteration:
# call closing_func if necessary
if (forkposition == starting_forkposition and
not called_closing_func[0]):
closing_func()
						called_closing_func[0] = 1
raise StopIteration
for i in range(num_of_forks): forkposition[i] += 1
return_val = buffer[forkposition[fork_num]]
forkposition[fork_num] -= 1
blen = len(buffer)
if not (blen-1) in forkposition:
# Last position in buffer no longer needed
assert forkposition[fork_num] == blen-2
final_func(buffer[blen-1])
del buffer[blen-1]
return return_val
def make_iterator(fork_num):
while(1): yield get_next(fork_num)
return tuple(map(make_iterator, range(num_of_forks)))
static.MakeStatic(Iter)
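# Quick examples of the combinators above (illustrative):
#   Iter.foldl(lambda x, y: x + y, 0, iter(range(5)))      # => 10
#   Iter.And(Iter.map(lambda x: x > 0, iter([1, 2, 3])))   # => true
#   a_iter, b_iter = Iter.multiplex(iter(range(3)), 2)     # two copies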
class IterMultiplex2:
"""Multiplex an iterator into 2 parts
This is a special optimized case of the Iter.multiplex function,
used when there is no closing_func or final_func, and we only want
	to split it into 2. Profiling shows this is a time-sensitive class.
"""
def __init__(self, iter):
self.a_leading_by = 0 # How many places a is ahead of b
self.buffer = []
self.iter = iter
def yielda(self):
"""Return first iterator"""
buf, iter = self.buffer, self.iter
while(1):
if self.a_leading_by >= 0: # a is in front, add new element
elem = iter.next() # exception will be passed
buf.append(elem)
else: elem = buf.pop(0) # b is in front, subtract an element
self.a_leading_by += 1
yield elem
def yieldb(self):
"""Return second iterator"""
buf, iter = self.buffer, self.iter
while(1):
if self.a_leading_by <= 0: # b is in front, add new element
elem = iter.next() # exception will be passed
buf.append(elem)
else: elem = buf.pop(0) # a is in front, subtract an element
self.a_leading_by -= 1
yield elem
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Provides a high-level interface to some librsync functions
This is a python wrapper around the lower-level _librsync module,
which is written in C. The goal was to use C as little as possible...
"""
import _librsync, types, array
blocksize = _librsync.RS_JOB_BLOCKSIZE
class librsyncError(Exception):
"""Signifies error in internal librsync processing (bad signature, etc.)
	Underlying _librsync.librsyncError exceptions are regenerated
	using this class because the C-created exceptions are unpicklable
	by default. There is probably a way to fix this in _librsync,
but this scheme was easier.
"""
pass
class LikeFile:
"""File-like object used by SigFile, DeltaFile, and PatchFile"""
mode = "rb"
# This will be replaced in subclasses by an object with
# appropriate cycle() method
maker = None
def __init__(self, infile, need_seek = None):
"""LikeFile initializer - zero buffers, set eofs off"""
self.check_file(infile, need_seek)
self.infile = infile
self.closed = self.infile_closed = None
self.inbuf = ""
self.outbuf = array.array('c')
self.eof = self.infile_eof = None
def check_file(self, file, need_seek = None):
"""Raise type error if file doesn't have necessary attributes"""
if not hasattr(file, "read"):
raise TypeError("Basis file must have a read() method")
if not hasattr(file, "close"):
raise TypeError("Basis file must have a close() method")
if need_seek and not hasattr(file, "seek"):
raise TypeError("Basis file must have a seek() method")
def read(self, length = -1):
"""Build up self.outbuf, return first length bytes"""
if length == -1:
while not self.eof: self._add_to_outbuf_once()
real_len = len(self.outbuf)
else:
while not self.eof and len(self.outbuf) < length:
self._add_to_outbuf_once()
real_len = min(length, len(self.outbuf))
return_val = self.outbuf[:real_len].tostring()
del self.outbuf[:real_len]
return return_val
def _add_to_outbuf_once(self):
"""Add one cycle's worth of output to self.outbuf"""
if not self.infile_eof: self._add_to_inbuf()
try: self.eof, len_inbuf_read, cycle_out = self.maker.cycle(self.inbuf)
except _librsync.librsyncError, e: raise librsyncError(str(e))
self.inbuf = self.inbuf[len_inbuf_read:]
self.outbuf.fromstring(cycle_out)
def _add_to_inbuf(self):
"""Make sure len(self.inbuf) >= blocksize"""
assert not self.infile_eof
while len(self.inbuf) < blocksize:
new_in = self.infile.read(blocksize)
if not new_in:
self.infile_eof = 1
assert not self.infile.close()
self.infile_closed = 1
break
self.inbuf += new_in
def close(self):
"""Close infile"""
if not self.infile_closed: assert not self.infile.close()
self.closed = 1
class SigFile(LikeFile):
"""File-like object which incrementally generates a librsync signature"""
def __init__(self, infile, blocksize = _librsync.RS_DEFAULT_BLOCK_LEN):
"""SigFile initializer - takes basis file
basis file only needs to have read() and close() methods. It
will be closed when we come to the end of the signature.
"""
LikeFile.__init__(self, infile)
try: self.maker = _librsync.new_sigmaker(blocksize)
except _librsync.librsyncError, e: raise librsyncError(str(e))
class DeltaFile(LikeFile):
"""File-like object which incrementally generates a librsync delta"""
def __init__(self, signature, new_file):
"""DeltaFile initializer - call with signature and new file
Signature can either be a string or a file with read() and
close() methods. New_file also only needs to have read() and
close() methods. It will be closed when self is closed.
"""
LikeFile.__init__(self, new_file)
if type(signature) is types.StringType: sig_string = signature
else:
self.check_file(signature)
sig_string = signature.read()
assert not signature.close()
try: self.maker = _librsync.new_deltamaker(sig_string)
except _librsync.librsyncError, e: raise librsyncError(str(e))
class PatchedFile(LikeFile):
"""File-like object which applies a librsync delta incrementally"""
def __init__(self, basis_file, delta_file):
"""PatchedFile initializer - call with basis delta
Here basis_file must be a true Python file, because we may
need to seek() around in it a lot, and this is done in C.
delta_file only needs read() and close() methods.
"""
LikeFile.__init__(self, delta_file)
if type(basis_file) is not types.FileType:
raise TypeError("basis_file must be a (true) file")
try: self.maker = _librsync.new_patchmaker(basis_file)
except _librsync.librsyncError, e: raise librsyncError(str(e))
class SigGenerator:
"""Calculate signature.
Input and output are the same as SigFile's, but the interface is
like the md5 module, not a file-like object
"""
def __init__(self, blocksize = _librsync.RS_DEFAULT_BLOCK_LEN):
"""Return new signature instance"""
try: self.sig_maker = _librsync.new_sigmaker(blocksize)
except _librsync.librsyncError, e: raise librsyncError(str(e))
self.gotsig = None
self.buffer = ""
self.sig_string = ""
def update(self, buf):
"""Add buf to data that signature will be calculated over"""
if self.gotsig:
raise librsyncError("SigGenerator already provided signature")
self.buffer += buf
while len(self.buffer) >= blocksize:
if self.process_buffer():
raise librsyncError("Premature EOF received from sig_maker")
def process_buffer(self):
"""Run self.buffer through sig_maker, add to self.sig_string"""
try: eof, len_buf_read, cycle_out = self.sig_maker.cycle(self.buffer)
except _librsync.librsyncError, e: raise librsyncError(str(e))
self.buffer = self.buffer[len_buf_read:]
self.sig_string += cycle_out
return eof
def getsig(self):
"""Return signature over given data"""
while not self.process_buffer(): pass # keep running until eof
self.gotsig = 1 # mark done so further update() calls raise an error
return self.sig_string
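# Hedged usage sketch (not part of the original module): SigGenerator's
# md5-style interface, feeding data in with update() and collecting the
# signature with getsig(); the data fed in is illustrative only.
def _example_siggenerator():
    sig_gen = SigGenerator()
    for i in xrange(4):
        sig_gen.update("x" * blocksize)  # one full block per update()
    signature = sig_gen.getsig()  # drain remaining data, return signature
    assert signature  # non-empty librsync signature string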
#!/usr/bin/env python
"""Demonstrate a memory leak in pysync/librsync"""
import os, _librsync
from librsync import *
os.chdir("/tmp")
# Write 2 1 byte files
afile = open("a", "wb")
afile.write("a")
afile.close()
efile = open("e", "wb")
efile.write("e")
efile.close()
def copy(infileobj, outpath):
outfile = open(outpath, "wb")
while 1:
buf = infileobj.read(32768)
if not buf: break
outfile.write(buf)
assert not outfile.close()
assert not infileobj.close()
def test_cycle():
for i in xrange(100000):
sm = _librsync.new_sigmaker()
sm.cycle("a")
def main_test():
for i in xrange(100000):
# Write signature file
afile = open("a", "rb")
copy(SigFile(afile), "sig")
# Write delta file
efile = open("e", "r")
sigfile = open("sig", "rb")
copy(DeltaFile(sigfile, efile), "delta")
# Write patched file
afile = open("e", "rb")
deltafile = open("delta", "rb")
copy(PatchedFile(afile, deltafile), "a.out")
main_test()
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Manage logging, displaying and recording messages with required verbosity"""
import time, sys, traceback, types
import Globals, static, re
class LoggerError(Exception): pass
class Logger:
"""All functions which deal with logging"""
def __init__(self):
self.log_file_open = None
self.log_file_local = None
self.verbosity = self.term_verbosity = 3
# termverbset is true if the term_verbosity has been explicity set
self.termverbset = None
def setverbosity(self, verbosity_string):
"""Set verbosity levels. Takes a number string"""
try: self.verbosity = int(verbosity_string)
except ValueError:
Log.FatalError("Verbosity must be a number, received '%s' "
"instead." % verbosity_string)
if not self.termverbset: self.term_verbosity = self.verbosity
def setterm_verbosity(self, termverb_string):
"""Set verbosity to terminal. Takes a number string"""
try: self.term_verbosity = int(termverb_string)
except ValueError:
Log.FatalError("Terminal verbosity must be a number, received "
"'%s' instead." % termverb_string)
self.termverbset = 1
def open_logfile(self, rpath):
"""Inform all connections of an open logfile.
rpath.conn will write to the file, and the others will pass
write commands off to it.
"""
assert not self.log_file_open
rpath.conn.log.Log.open_logfile_local(rpath)
for conn in Globals.connections:
conn.log.Log.open_logfile_allconn(rpath.conn)
def open_logfile_allconn(self, log_file_conn):
"""Run on all connections to signal log file is open"""
self.log_file_open = 1
self.log_file_conn = log_file_conn
def open_logfile_local(self, rpath):
"""Open logfile locally - should only be run on one connection"""
assert rpath.conn is Globals.local_connection
try: self.logfp = rpath.open("a")
except (OSError, IOError), e:
raise LoggerError("Unable to open logfile %s: %s"
% (rpath.path, e))
self.log_file_local = 1
self.logrp = rpath
def close_logfile(self):
"""Close logfile and inform all connections"""
if self.log_file_open:
for conn in Globals.connections:
conn.log.Log.close_logfile_allconn()
self.log_file_conn.log.Log.close_logfile_local()
def close_logfile_allconn(self):
"""Run on every connection"""
self.log_file_open = None
def close_logfile_local(self):
"""Run by logging connection - close logfile"""
assert self.log_file_conn is Globals.local_connection
assert not self.logfp.close()
self.log_file_local = None
def format(self, message, verbosity):
"""Format the message, possibly adding date information"""
if verbosity < 9: return message + "\n"
else: return "%s %s\n" % (time.asctime(time.localtime(time.time())),
message)
def __call__(self, message, verbosity):
"""Log message that has verbosity importance
message can be a string, which is logged as-is, or a function,
which is then called and should return the string to be
logged. We do it this way in case producing the string would
take a significant amount of CPU.
"""
if verbosity > self.verbosity and verbosity > self.term_verbosity:
return
if not type(message) is types.StringType:
assert type(message) is types.FunctionType
message = message()
if verbosity <= self.verbosity: self.log_to_file(message)
if verbosity <= self.term_verbosity:
self.log_to_term(message, verbosity)
def log_to_file(self, message):
"""Write the message to the log file, if possible"""
if self.log_file_open:
if self.log_file_local:
self.logfp.write(self.format(message, self.verbosity))
else: self.log_file_conn.log.Log.log_to_file(message)
def log_to_term(self, message, verbosity):
"""Write message to stdout/stderr"""
if verbosity <= 2 or Globals.server: termfp = sys.stderr
else: termfp = sys.stdout
termfp.write(self.format(message, self.term_verbosity))
def conn(self, direction, result, req_num):
"""Log some data on the connection
The main worry with this function is that something in here
will create more network traffic, which will spiral to
infinite regress. So, for instance, logging must only be done
to the terminal, because otherwise the log file may be remote.
"""
if self.term_verbosity < 9: return
if type(result) is types.StringType: result_repr = repr(result)
else: result_repr = str(result)
if Globals.server: conn_str = "Server"
else: conn_str = "Client"
self.log_to_term("%s %s (%d): %s" %
(conn_str, direction, req_num, result_repr), 9)
def FatalError(self, message, no_fatal_message = 0, errlevel = 1):
"""Log a fatal error and exit"""
assert no_fatal_message == 0 or no_fatal_message == 1
if no_fatal_message: prefix_string = ""
else: prefix_string = "Fatal Error: "
self(prefix_string + message, 1)
import Main
Main.cleanup()
sys.exit(errlevel)
def exception_to_string(self, arglist = []):
"""Return string version of current exception plus what's in arglist"""
type, value, tb = sys.exc_info()
s = ("Exception '%s' raised of class '%s':\n%s" %
(value, type, "".join(traceback.format_tb(tb))))
if arglist:
s += "__Arguments:\n" + "\n".join(map(str, arglist))
return s
def exception(self, only_terminal = 0, verbosity = 5):
"""Log an exception and traceback
If only_terminal is 0, log normally. If it is 1, then only
log to disk if log file is local (self.log_file_open = 1). If
it is 2, don't log to disk at all.
"""
assert only_terminal in (0, 1, 2)
if (only_terminal == 0 or
(only_terminal == 1 and self.log_file_open)):
logging_func = self.__call__
else: logging_func = self.log_to_term
logging_func(self.exception_to_string(), verbosity)
Log = Logger()
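# Hedged usage sketch (not part of the original module): the lazy-message
# feature described in Logger.__call__ above. Passing a function defers
# building an expensive message until the verbosity check has passed;
# the messages below are illustrative only.
def _example_lazy_logging():
    Log("starting backup", 3)  # plain string, formatted immediately
    def expensive_message():
        return "details: " + ", ".join(map(str, range(1000)))
    Log(expensive_message, 9)  # only called if verbosity >= 9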
class ErrorLog:
"""Log each recoverable error in error_log file
There are three types of recoverable errors: ListError, which
happens trying to list a directory or stat a file, UpdateError,
which happen when trying to update a changed file, and
SpecialFileError, which happen when a special file cannot be
created. See the error policy file for more info.
"""
_log_fileobj = None
_log_inc_rp = None
def open(cls, time_string, compress = 1):
"""Open the error log, prepare for writing"""
if not Globals.isbackup_writer:
return Globals.backup_writer.log.ErrorLog.open(time_string,
compress)
assert not cls._log_fileobj and not cls._log_inc_rp, "log already open"
assert Globals.isbackup_writer
if compress: typestr = 'data.gz'
else: typestr = 'data'
cls._log_inc_rp = Globals.rbdir.append("error_log.%s.%s" %
(time_string, typestr))
assert not cls._log_inc_rp.lstat(), ("""Error file %s already exists.
This is probably caused by your attempting to run two backups simultaneously
or within one second of each other. Wait a second and try again.""" %
(cls._log_inc_rp.path,))
cls._log_fileobj = cls._log_inc_rp.open("wb", compress = compress)
def isopen(cls):
"""True if the error log file is currently open"""
if Globals.isbackup_writer or not Globals.backup_writer:
return cls._log_fileobj is not None
else: return Globals.backup_writer.log.ErrorLog.isopen()
def write(cls, error_type, rp, exc):
"""Add line to log file indicating error exc with file rp"""
if not Globals.isbackup_writer:
return Globals.backup_writer.log.ErrorLog.write(error_type,
rp, exc)
s = cls.get_log_string(error_type, rp, exc)
Log(s, 2)
if Globals.null_separator: s += "\0"
else:
s = re.sub("\n", " ", s)
s += "\n"
cls._log_fileobj.write(s)
def get_indexpath(cls, obj):
"""Return filename for logging. rp is a rpath, string, or tuple"""
try: return obj.get_indexpath()
except AttributeError:
if type(obj) is types.TupleType: return "/".join(obj)
else: return str(obj)
def write_if_open(cls, error_type, rp, exc):
"""Call cls.write(...) if error log open, only log otherwise"""
if not Globals.isbackup_writer:
return Globals.backup_writer.log.ErrorLog.write_if_open(
error_type, rp, str(exc)) # stringify exc due to exception pickling problem
if cls.isopen(): cls.write(error_type, rp, exc)
else: Log(cls.get_log_string(error_type, rp, exc), 2)
def get_log_string(cls, error_type, rp, exc):
"""Return log string to put in error log"""
assert (error_type == "ListError" or error_type == "UpdateError" or
error_type == "SpecialFileError"), "Unknown type "+error_type
return "%s %s %s" % (error_type, cls.get_indexpath(rp), str(exc))
def close(cls):
"""Close the error log file"""
if not Globals.isbackup_writer:
return Globals.backup_writer.log.ErrorLog.close()
assert not cls._log_fileobj.close()
cls._log_fileobj = cls._log_inc_rp = None
static.MakeClass(ErrorLog)
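# Hedged sketch (not part of the original module): the one-line format
# that get_log_string produces for the error_log file; the values below
# are illustrative only.
def _example_error_log_line():
    line = "%s %s %s" % ("UpdateError", "home/user/file", "Permission denied")
    assert line == "UpdateError home/user/file Permission denied"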
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""list, delete, and otherwise manage increments"""
from __future__ import generators
from log import Log
import Globals, Time, static, manage
class ManageException(Exception): pass
def get_file_type(rp):
"""Returns one of "regular", "directory", "missing", or "special"."""
if not rp.lstat(): return "missing"
elif rp.isdir(): return "directory"
elif rp.isreg(): return "regular"
else: return "special"
def get_inc_type(inc):
"""Return file type increment represents"""
assert inc.isincfile()
type = inc.getinctype()
if type == "dir": return "directory"
elif type == "diff": return "regular"
elif type == "missing": return "missing"
elif type == "snapshot": return get_file_type(inc)
else: assert None, "Unknown type %s" % (type,)
def describe_incs_parsable(incs, mirror_time, mirrorrp):
"""Return a string parsable by computer describing the increments
Each line is a time in seconds of the increment, and then the
type of the file. It will be sorted oldest to newest. For example:
10000 regular
20000 directory
30000 special
40000 missing
50000 regular <- last will be the current mirror
"""
incpairs = [(inc.getinctime(), inc) for inc in incs]
incpairs.sort()
result = ["%s %s" % (time, get_inc_type(inc)) for time, inc in incpairs]
result.append("%s %s" % (mirror_time, get_file_type(mirrorrp)))
return "\n".join(result)
def describe_incs_human(incs, mirror_time, mirrorrp):
"""Return a string describing all the the root increments"""
incpairs = [(inc.getinctime(), inc) for inc in incs]
incpairs.sort()
result = ["Found %d increments:" % len(incpairs)]
for time, inc in incpairs:
result.append(" %s %s" %
(inc.dirsplit()[1], Time.timetopretty(time)))
result.append("Current mirror: %s" % Time.timetopretty(mirror_time))
return "\n".join(result)
def delete_earlier_than(baserp, time):
"""Deleting increments older than time in directory baserp
time is in seconds. It will then delete any empty directories
in the tree. To process the entire backup area, the
rdiff-backup-data directory should be the root of the tree.
"""
baserp.conn.manage.delete_earlier_than_local(baserp, time)
def delete_earlier_than_local(baserp, time):
"""Like delete_earlier_than, but run on local connection for speed"""
assert baserp.conn is Globals.local_connection
def yield_files(rp):
yield rp
if rp.isdir():
for filename in rp.listdir():
for sub_rp in yield_files(rp.append(filename)):
yield sub_rp
for rp in yield_files(baserp):
if ((rp.isincfile() and rp.getinctime() < time) or
(rp.isdir() and not rp.listdir())):
Log("Deleting increment file %s" % rp.path, 5)
rp.delete()
class IncObj:
"""Increment object - represent a completed increment"""
def __init__(self, incrp):
"""IncObj initializer
incrp is an RPath of a path like increments.TIMESTR.dir
standing for the root of the increment.
"""
if not incrp.isincfile():
raise ManageException("%s is not an inc file" % incrp.path)
self.incrp = incrp
self.time = incrp.getinctime()
def getbaserp(self):
"""Return rp of the incrp without extensions"""
return self.incrp.getincbase()
def pretty_time(self):
"""Return a formatted version of inc's time"""
return Time.timetopretty(self.time)
def full_description(self):
"""Return string describing increment"""
s = ["Increment file %s" % self.incrp.path,
"Date: %s" % self.pretty_time()]
return "\n".join(s)
#include <stdio.h>
#include <stdlib.h>
#include <rsync.h>

int main(void)
{
	FILE *basis_file, *sig_file;
	rs_stats_t stats;
	rs_result result;
	long i;

	for (i = 0; i <= 100000; i++) {
		basis_file = fopen("a", "r");
		sig_file = fopen("sig", "w");
		result = rs_sig_file(basis_file, sig_file,
				     RS_DEFAULT_BLOCK_LEN, RS_DEFAULT_STRONG_LEN,
				     &stats);
		if (result != RS_DONE) exit(result);
		fclose(basis_file);
		fclose(sig_file);
	}
	return 0;
}
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Store and retrieve metadata in destination directory
The plan is to store metadata information for all files in the
destination directory in a special metadata file. There are two
reasons for this:
1) The filesystem of the mirror directory may not be able to handle
types of metadata that the source filesystem can. For instance,
rdiff-backup may not have root access on the destination side, so
cannot set uid/gid. Or the source side may have ACLs and the
destination side doesn't.
Hopefully every file system can store binary data. Storing
metadata separately allows us to back up anything (ok, maybe
strange filenames are still a problem).
2) Metadata can be read from a file much more quickly than by
traversing the mirror directory over and over again. In many
cases most of rdiff-backup's time is spent comparing metadata (like
file size and modtime), trying to find differences. Reading this
data sequentially from a file is significantly less taxing than
listing directories and statting files all over the mirror
directory.
The metadata is stored in a text file, which is a bunch of records
concatenated together. Each record has the format:
File <filename>
<field_name1> <value>
<field_name2> <value>
...
Where the lines are separated by newlines. See the code below for the
field names and values.
"""
from __future__ import generators
import re, gzip, os
import log, Globals, rpath, Time, robust, increment
class ParsingError(Exception):
"""This is raised when bad or unparsable data is received"""
pass
def RORP2Record(rorpath):
"""From RORPath, return text record of file's metadata"""
str_list = ["File %s\n" % quote_path(rorpath.get_indexpath())]
# Store file type, e.g. "dev", "reg", or "sym", and type-specific data
type = rorpath.gettype()
if type is None: type = "None"
str_list.append(" Type %s\n" % type)
if type == "reg":
str_list.append(" Size %s\n" % rorpath.getsize())
# If file is hardlinked, add that information
if Globals.preserve_hardlinks:
numlinks = rorpath.getnumlinks()
if numlinks > 1:
str_list.append(" NumHardLinks %s\n" % numlinks)
str_list.append(" Inode %s\n" % rorpath.getinode())
str_list.append(" DeviceLoc %s\n" % rorpath.getdevloc())
elif type == "None": return "".join(str_list)
elif type == "dir" or type == "sock" or type == "fifo": pass
elif type == "sym":
str_list.append(" SymData %s\n" % quote_path(rorpath.readlink()))
elif type == "dev":
major, minor = rorpath.getdevnums()
if rorpath.isblkdev(): devchar = "b"
else:
assert rorpath.ischardev()
devchar = "c"
str_list.append(" DeviceNum %s %s %s\n" % (devchar, major, minor))
# Store time information
if type != 'sym' and type != 'dev':
str_list.append(" ModTime %s\n" % rorpath.getmtime())
# Add user, group, and permission information
uid, gid = rorpath.getuidgid()
str_list.append(" Uid %s\n" % uid)
str_list.append(" Gid %s\n" % gid)
str_list.append(" Permissions %s\n" % rorpath.getperms())
return "".join(str_list)
line_parsing_regexp = re.compile("^ *([A-Za-z0-9]+) (.+)$", re.M)
def Record2RORP(record_string):
"""Given record_string, return RORPath
For speed reasons, write the RORPath data dictionary directly
instead of calling rorpath functions. Profiling has shown this to
be a time critical function.
"""
data_dict = {}
for field, data in line_parsing_regexp.findall(record_string):
if field == "File":
if data == ".": index = ()
else: index = tuple(unquote_path(data).split("/"))
elif field == "Type":
if data == "None": data_dict['type'] = None
else: data_dict['type'] = data
elif field == "Size": data_dict['size'] = long(data)
elif field == "NumHardLinks": data_dict['nlink'] = int(data)
elif field == "Inode": data_dict['inode'] = long(data)
elif field == "DeviceLoc": data_dict['devloc'] = long(data)
elif field == "SymData": data_dict['linkname'] = unquote_path(data)
elif field == "DeviceNum":
devchar, major_str, minor_str = data.split(" ")
data_dict['devnums'] = (devchar, int(major_str), int(minor_str))
elif field == "ModTime": data_dict['mtime'] = long(data)
elif field == "Uid": data_dict['uid'] = int(data)
elif field == "Gid": data_dict['gid'] = int(data)
elif field == "Permissions": data_dict['perms'] = int(data)
else: raise ParsingError("Unknown field '%s' in record" % (field,))
return rpath.RORPath(index, data_dict)
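# Hedged usage sketch (not part of the original module): parsing a
# handwritten record in the format documented at the top of this module;
# the field values below are illustrative only.
def _example_record_parse():
    record = ("File foo/bar\n"
              "  Type reg\n"
              "  Size 7\n"
              "  ModTime 1060000000\n"
              "  Uid 1000\n"
              "  Gid 1000\n"
              "  Permissions 420\n")
    rorp = Record2RORP(record)
    assert rorp.index == ("foo", "bar")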
chars_to_quote = re.compile("\\n|\\\\")
def quote_path(path_string):
"""Return quoted verson of path_string
Because newlines are used to separate fields in a record, they are
replaced with \n. Backslashes become \\ and everything else is
left the way it is.
"""
def replacement_func(match_obj):
"""This is called on the match obj of any char that needs quoting"""
char = match_obj.group(0)
if char == "\n": return "\\n"
elif char == "\\": return "\\\\"
assert 0, "Bad char %s needs quoting" % char
return chars_to_quote.sub(replacement_func, path_string)
def unquote_path(quoted_string):
"""Reverse what was done by quote_path"""
def replacement_func(match_obj):
"""Unquote match obj of two character sequence"""
two_chars = match_obj.group(0)
if two_chars == "\\n": return "\n"
elif two_chars == "\\\\": return "\\"
log.Log("Warning, unknown quoted sequence %s found" % two_chars, 2)
return two_chars
return re.sub("\\\\n|\\\\\\\\", replacement_func, quoted_string)
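# Hedged usage sketch (not part of the original module): quoting keeps
# each record field on a single line, and unquote_path inverts it.
def _example_quote_roundtrip():
    original = "name\nwith\\oddities"  # embedded newline and backslash
    quoted = quote_path(original)
    assert "\n" not in quoted  # real newline replaced by literal \n
    assert unquote_path(quoted) == original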
def write_rorp_iter_to_file(rorp_iter, file):
"""Given iterator of RORPs, write records to (pre-opened) file object"""
for rorp in rorp_iter: file.write(RORP2Record(rorp))
class rorp_extractor:
"""Controls iterating rorps from metadata file"""
def __init__(self, fileobj):
self.fileobj = fileobj # holds file object we are reading from
self.buf = "" # holds the next part of the file
self.record_boundary_regexp = re.compile("\\nFile")
self.at_end = 0 # True if we are at the end of the file
self.blocksize = 32 * 1024
def get_next_pos(self):
"""Return position of next record in buffer"""
while 1:
m = self.record_boundary_regexp.search(self.buf)
if m: return m.start(0)+1 # the +1 skips the newline
else: # add next block to the buffer, loop again
newbuf = self.fileobj.read(self.blocksize)
if not newbuf:
self.at_end = 1
return len(self.buf)
else: self.buf += newbuf
def iterate(self):
"""Return iterator over all records"""
while 1:
next_pos = self.get_next_pos()
try: yield Record2RORP(self.buf[:next_pos])
except ParsingError, e:
log.Log("Error parsing metadata file: %s" % (e,), 2)
if self.at_end: break
self.buf = self.buf[next_pos:]
assert not self.close()
def skip_to_index(self, index):
"""Scan through the file, set buffer to beginning of index record
Here we make sure that the buffer always ends in a newline, so
we will not be splitting lines in half.
"""
assert not self.buf or self.buf.endswith("\n")
if not index: indexpath = "."
else: indexpath = "/".join(index)
# Must double all backslashes, because they will be
# reinterpreted. For instance, to search for index \n
# (newline), it will be \\n (backslash n) in the file, so the
# regular expression is "File \\\\n\\n" (File two backslash n
# backslash n)
double_quote = re.sub("\\\\", "\\\\\\\\", indexpath)
begin_re = re.compile("(^|\\n)(File %s\\n)" % (double_quote,))
while 1:
m = begin_re.search(self.buf)
if m:
self.buf = self.buf[m.start(2):]
return
self.buf = self.fileobj.read(self.blocksize)
self.buf += self.fileobj.readline()
if not self.buf:
self.at_end = 1
return
def iterate_starting_with(self, index):
"""Iterate records whose index starts with given index"""
self.skip_to_index(index)
if self.at_end: return
while 1:
next_pos = self.get_next_pos()
try: rorp = Record2RORP(self.buf[:next_pos])
except ParsingError, e:
log.Log("Error parsing metadata file: %s" % (e,), 2)
else:
if rorp.index[:len(index)] != index: break
yield rorp
if self.at_end: break
self.buf = self.buf[next_pos:]
assert not self.close()
def close(self):
"""Return value of closing associated file"""
return self.fileobj.close()
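# Hedged usage sketch (not part of the original module): reading RORPs
# back out of an already-opened metadata file object.
def _example_extract_rorps(fileobj):
    for rorp in rorp_extractor(fileobj).iterate():
        print rorp.get_indexpath()  # one line per stored file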
metadata_rp = None
metadata_fileobj = None
metadata_record_buffer = [] # Use this because gzip writes are slow
def OpenMetadata(rp = None, compress = 1):
"""Open the Metadata file for writing, return metadata fileobj"""
global metadata_rp, metadata_fileobj
assert not metadata_fileobj, "Metadata file already open"
if rp: metadata_rp = rp
else:
if compress: typestr = 'snapshot.gz'
else: typestr = 'snapshot'
metadata_rp = Globals.rbdir.append("mirror_metadata.%s.%s" %
(Time.curtimestr, typestr))
metadata_fileobj = metadata_rp.open("wb", compress = compress)
def WriteMetadata(rorp):
"""Write metadata of rorp to file"""
global metadata_fileobj, metadata_record_buffer
metadata_record_buffer.append(RORP2Record(rorp))
if len(metadata_record_buffer) >= 100: write_metadata_buffer()
def write_metadata_buffer():
global metadata_record_buffer
metadata_fileobj.write("".join(metadata_record_buffer))
metadata_record_buffer = []
def CloseMetadata():
"""Close the metadata file"""
global metadata_rp, metadata_fileobj
assert metadata_fileobj, "Metadata file not open"
if metadata_record_buffer: write_metadata_buffer()
try: fileno = metadata_fileobj.fileno() # will not work if GzipFile
except AttributeError: fileno = metadata_fileobj.fileobj.fileno()
os.fsync(fileno)
result = metadata_fileobj.close()
metadata_fileobj = None
metadata_rp.setdata()
return result
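# Hedged sketch (not part of the original module): the intended calling
# sequence for writing a metadata snapshot. It requires a configured
# Globals.rbdir, so this is illustrative only.
def _example_metadata_session(rorp_iter):
    OpenMetadata()  # creates mirror_metadata.<time>.snapshot.gz
    for rorp in rorp_iter: WriteMetadata(rorp)  # buffered in batches of 100
    CloseMetadata()  # flush the buffer, fsync and close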
def GetMetadata(rp, restrict_index = None, compressed = None):
"""Return iterator of metadata from given metadata file rp"""
if compressed is None:
if rp.isincfile():
compressed = rp.inc_compressed
assert rp.inc_type == "data" or rp.inc_type == "snapshot"
else: compressed = rp.get_indexpath().endswith(".gz")
fileobj = rp.open("rb", compress = compressed)
if restrict_index is None: return rorp_extractor(fileobj).iterate()
else: return rorp_extractor(fileobj).iterate_starting_with(restrict_index)
def GetMetadata_at_time(rbdir, time, restrict_index = None, rblist = None):
"""Scan through rbdir, finding metadata file at given time, iterate
If rblist is given, use that instead of listing rbdir. Time here
is exact, we don't take the next one older or anything. Returns
None if no matching metadata found.
"""
if rblist is None: rblist = map(lambda x: rbdir.append(x),
robust.listrp(rbdir))
for rp in rblist:
if (rp.isincfile() and
(rp.getinctype() == "data" or rp.getinctype() == "snapshot") and
rp.getincbase_str() == "mirror_metadata"):
if rp.getinctime() == time: return GetMetadata(rp, restrict_index)
return None
#!/usr/bin/env python
"""Like rdiff, but written in python and uses librsync module.
Useful for benchmarking and testing of librsync and _librsync.
"""
import librsync, sys
blocksize = 32768
def makesig(inpath, outpath):
"""Write a signature of inpath at outpath"""
sf = librsync.SigFile(open(inpath, "rb"))
fout = open(outpath, "wb")
while 1:
buf = sf.read(blocksize)
if not buf: break
fout.write(buf)
assert not sf.close()
assert not fout.close()
def makedelta(sigpath, newpath, deltapath):
"""Write delta at deltapath using signature at sigpath"""
df = librsync.DeltaFile(open(sigpath, "rb"), open(newpath, "rb"))
fout = open(deltapath, "wb")
while 1:
buf = df.read(blocksize)
if not buf: break
fout.write(buf)
assert not df.close()
assert not fout.close()
def makepatch(basis_path, delta_path, new_path):
"""Write new given basis and delta"""
pf = librsync.PatchedFile(open(basis_path, "rb"), open(delta_path, "rb"))
fout = open(new_path, "wb")
while 1:
buf = pf.read(blocksize)
if not buf: break
fout.write(buf)
assert not pf.close()
assert not fout.close()
if sys.argv[1] == "signature":
makesig(sys.argv[2], sys.argv[3])
elif sys.argv[1] == "delta":
makedelta(sys.argv[2], sys.argv[3], sys.argv[4])
elif sys.argv[1] == "patch":
makepatch(sys.argv[2], sys.argv[3], sys.argv[4])
else: assert 0, "Bad mode argument %s" % (sys.argv[1],)
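# Hedged usage sketch (not part of the original script; assumes it is
# saved as rdiff.py). The three modes mirror the rdiff utility:
#   python rdiff.py signature basis-file sig-file
#   python rdiff.py delta sig-file new-file delta-file
#   python rdiff.py patch basis-file delta-file new-file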
#!/usr/bin/env python
"""Run rdiff-backup with profiling on
Same as rdiff-backup but runs profiler, and prints profiling
statistics afterwards.
"""
__no_execute__ = 1
import sys, rdiff_backup.Main, profile, pstats
profile.run("rdiff_backup.Main.Main(%s)" % repr(sys.argv[1:]),
"profile-output")
p = pstats.Stats("profile-output")
p.sort_stats('time')
p.print_stats(40)
#p.print_callers(20)
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Code for reverting the rdiff-backup directory to prev state
This module is used after an aborted session, and the rdiff-backup
destination directory may be in-between states. In this situation we
need to bring back the directory as it was after the last successful
backup. The basic strategy is to restore all the attributes from the
metadata file (which we assume is intact) and delete the extra
increments. For regular files we examine the mirror file and use the
increment file to get the old data if the mirror file is out of date.
Currently this does not recover hard links. This may make the
regressed directory take up more disk space, but hard links can
still be recovered.
"""
from __future__ import generators
import Globals, restore, log, rorpiter, TempFile, metadata, rpath, C, \
Time, backup, robust
# regress_time should be set to the time we want to regress back to
# (usually the time of the last successful backup)
regress_time = None
# This should be set to the latest unsuccessful backup time
unsuccessful_backup_time = None
class RegressException(Exception):
"""Raised on any exception in regress process"""
pass
def Regress(mirror_rp):
"""Bring mirror and inc directory back to regress_to_time
Also affects the rdiff-backup-data directory, so Globals.rbdir
should be set. Regress should only work one step at a time
(i.e. don't "regress" through two separate backup sets). This
function should be run locally to the rdiff-backup-data directory.
"""
inc_rpath = Globals.rbdir.append_path("increments")
assert mirror_rp.index == () and inc_rpath.index == ()
assert mirror_rp.isdir() and inc_rpath.isdir()
assert mirror_rp.conn is inc_rpath.conn is Globals.local_connection
set_regress_time()
set_restore_times()
ITR = rorpiter.IterTreeReducer(RegressITRB, [])
for rf in iterate_meta_rfs(mirror_rp, inc_rpath): ITR(rf.index, rf)
ITR.Finish()
remove_rbdir_increments()
def set_regress_time():
"""Set global regress_time to previous sucessful backup
If there are two current_mirror increments, then the last one
corresponds to a backup session that failed.
"""
global regress_time, unsuccessful_backup_time
curmir_incs = restore.get_inclist(Globals.rbdir.append("current_mirror"))
assert len(curmir_incs) == 2, \
"Found %s current_mirror flags, expected 2" % len(curmir_incs)
inctimes = [inc.getinctime() for inc in curmir_incs]
inctimes.sort()
regress_time = inctimes[0]
unsuccessful_backup_time = inctimes[-1]
log.Log("Regressing to " + Time.timetopretty(regress_time), 4)
def set_restore_times():
"""Set _rest_time and _mirror_time in the restore module
_rest_time (restore time) corresponds to the last successful
backup time. _mirror_time is the unsuccessful backup time.
"""
restore._mirror_time = unsuccessful_backup_time
restore._rest_time = regress_time
def remove_rbdir_increments():
"""Delete the increments in the rdiff-backup-data directory"""
old_current_mirror = None
for filename in Globals.rbdir.listdir():
rp = Globals.rbdir.append(filename)
if rp.isincfile() and rp.getinctime() == unsuccessful_backup_time:
if rp.getincbase_str() == "current_mirror": old_current_mirror = rp
else:
log.Log("Removing rdiff-backup-data increment " + rp.path, 5)
rp.delete()
if old_current_mirror:
C.sync() # Sync first, since we are marking dest dir as good now
old_current_mirror.delete()
def iterate_raw_rfs(mirror_rp, inc_rp):
"""Iterate all RegressFile objects in mirror/inc directory"""
root_rf = RegressFile(mirror_rp, inc_rp, restore.get_inclist(inc_rp))
def helper(rf):
yield rf
if rf.mirror_rp.isdir() or rf.inc_rp.isdir():
for sub_rf in rf.yield_sub_rfs():
for sub_sub_rf in helper(sub_rf):
yield sub_sub_rf
return helper(root_rf)
def yield_metadata():
"""Iterate rorps from metadata file, if any are available"""
metadata_iter = metadata.GetMetadata_at_time(Globals.rbdir, regress_time)
if metadata_iter: return metadata_iter
log.Log.FatalError("No metadata for time %s found, cannot regress"
% Time.timetopretty(regress_time))
def iterate_meta_rfs(mirror_rp, inc_rp):
"""Yield RegressFile objects with extra metadata information added
Each RegressFile will have an extra object variable .metadata_rorp
which will contain the metadata attributes of the mirror file at
regress_time.
"""
raw_rfs = iterate_raw_rfs(mirror_rp, inc_rp)
collated = rorpiter.Collate2Iters(raw_rfs, yield_metadata())
for raw_rf, metadata_rorp in collated:
if raw_rf:
raw_rf.set_metadata_rorp(metadata_rorp)
yield raw_rf
else:
log.Log("Warning, metadata file has entry for %s,\n"
"but there are no associated files." %
(metadata_rorp.get_indexpath(),), 2)
yield RegressFile(mirror_rp.new_index(metadata_rorp.index),
inc_rp.new_index(metadata_rorp.index), ())
class RegressFile(restore.RestoreFile):
"""Like RestoreFile but with metadata
Hold mirror_rp and related incs, but also put metadata info for
the mirror file at regress time in self.metadata_rorp.
self.metadata_rorp is not set by the constructor; callers fill it
in later via set_metadata_rorp().
"""
def __init__(self, mirror_rp, inc_rp, inc_list):
restore.RestoreFile.__init__(self, mirror_rp, inc_rp, inc_list)
self.set_regress_inc()
def set_metadata_rorp(self, metadata_rorp):
"""Set self.metadata_rorp, creating empty if given None"""
if metadata_rorp: self.metadata_rorp = metadata_rorp
else: self.metadata_rorp = rpath.RORPath(self.index)
def isdir(self):
"""Return true if regress needs before/after processing"""
return ((self.metadata_rorp and self.metadata_rorp.isdir()) or
(self.mirror_rp and self.mirror_rp.isdir()))
def set_regress_inc(self):
"""Set self.regress_inc to increment to be removed (or None)"""
newer_incs = self.get_newer_incs()
assert len(newer_incs) <= 1, "Too many recent increments"
if newer_incs: self.regress_inc = newer_incs[0] # first is mirror_rp
else: self.regress_inc = None
class RegressITRB(rorpiter.ITRBranch):
"""Turn back state of dest directory (use with IterTreeReducer)
The arguments to the ITR will be RegressFiles. There are two main
assumptions this procedure makes (besides those mentioned above):
1. The mirror_rp and the metadata_rorp equal_loose correctly iff
they contain the same data. If this is the case, then the inc
file is unnecessary and we can delete it.
2. If they don't match, then applying the inc file will
successfully get us back to the previous state.
Since the metadata file is required, the two above really only
matter for regular files.
"""
def __init__(self):
"""Just initialize some variables to None"""
self.rf = None # will hold RegressFile applying to a directory
def can_fast_process(self, index, rf):
"""True if none of the rps is a directory"""
return not rf.mirror_rp.isdir() and not rf.metadata_rorp.isdir()
def fast_process(self, index, rf):
"""Process when nothing is a directory"""
if not rf.metadata_rorp.equal_loose(rf.mirror_rp):
log.Log("Regressing file %s" %
(rf.metadata_rorp.get_indexpath()), 5)
if rf.metadata_rorp.isreg(): self.restore_orig_regfile(rf)
else:
if rf.mirror_rp.lstat(): rf.mirror_rp.delete()
if rf.metadata_rorp.isspecial():
robust.check_common_error(None, rpath.copy_with_attribs,
(rf.metadata_rorp, rf.mirror_rp))
else: rpath.copy_with_attribs(rf.metadata_rorp, rf.mirror_rp)
if rf.regress_inc:
log.Log("Deleting increment " + rf.regress_inc.path, 5)
rf.regress_inc.delete()
def restore_orig_regfile(self, rf):
"""Restore original regular file
This is the trickiest case for avoiding information loss,
because we don't want to delete the increment before the
mirror is fully written.
"""
assert rf.metadata_rorp.isreg()
if rf.mirror_rp.isreg():
tf = TempFile.new(rf.mirror_rp)
tf.write_from_fileobj(rf.get_restore_fp())
rpath.copy_attribs(rf.metadata_rorp, tf)
tf.fsync_with_dir() # make sure tf fully written before move
rpath.rename(tf, rf.mirror_rp) # move is atomic
else:
if rf.mirror_rp.lstat(): rf.mirror_rp.delete()
rf.mirror_rp.write_from_fileobj(rf.get_restore_fp())
rpath.copy_attribs(rf.metadata_rorp, rf.mirror_rp)
rf.mirror_rp.fsync_with_dir() # require move before inc delete
def start_process(self, index, rf):
"""Start processing directory"""
if rf.metadata_rorp.isdir():
# make sure mirror is a readable dir
if not rf.mirror_rp.isdir():
if rf.mirror_rp.lstat(): rf.mirror_rp.delete()
rf.mirror_rp.mkdir()
if Globals.change_permissions and not rf.mirror_rp.hasfullperms():
rf.mirror_rp.chmod(0700)
self.rf = rf
def end_process(self):
"""Finish processing a directory"""
rf = self.rf
if rf.metadata_rorp.isdir():
if rf.mirror_rp.isdir():
rf.mirror_rp.setdata()
if not rf.metadata_rorp.equal_loose(rf.mirror_rp):
log.Log("Regressing attributes of " + rf.mirror_rp.path, 5)
rpath.copy_attribs(rf.metadata_rorp, rf.mirror_rp)
else:
rf.mirror_rp.delete()
log.Log("Regressing file " + rf.mirror_rp.path, 5)
rpath.copy_with_attribs(rf.metadata_rorp, rf.mirror_rp)
else: # replacing a dir with some other kind of file
assert rf.mirror_rp.isdir()
log.Log("Replacing directory " + rf.mirror_rp.path, 5)
if rf.metadata_rorp.isreg(): self.restore_orig_regfile(rf)
else:
rf.mirror_rp.delete()
rpath.copy_with_attribs(rf.metadata_rorp, rf.mirror_rp)
if rf.regress_inc:
log.Log("Deleting increment " + rf.regress_inc.path, 5)
rf.regress_inc.delete()
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Read increment files and restore to original"""
from __future__ import generators
import tempfile, os, cStringIO
import Globals, Time, Rdiff, Hardlink, rorpiter, selection, rpath, \
log, static, robust, metadata, statistics, TempFile
# This should be set to selection.Select objects over the source and
# mirror directories respectively.
_select_source = None
_select_mirror = None
# This will be set to the time of the current mirror
_mirror_time = None
# This will be set to the exact time to restore to (not restore_to_time)
_rest_time = None
class RestoreError(Exception): pass
def Restore(mirror_rp, inc_rpath, target, restore_to_time):
"""Recursively restore mirror and inc_rpath to target at rest_time"""
MirrorS = mirror_rp.conn.restore.MirrorStruct
TargetS = target.conn.restore.TargetStruct
MirrorS.set_mirror_and_rest_times(restore_to_time)
MirrorS.initialize_rf_cache(mirror_rp, inc_rpath)
target_iter = TargetS.get_initial_iter(target)
diff_iter = MirrorS.get_diffs(target_iter)
TargetS.patch(target, diff_iter)
def get_inclist(inc_rpath):
"""Returns increments with given base"""
dirname, basename = inc_rpath.dirsplit()
parent_dir = inc_rpath.__class__(inc_rpath.conn, dirname, ())
if not parent_dir.isdir(): return [] # inc directory not created yet
index = inc_rpath.index
inc_list = []
for filename in parent_dir.listdir():
inc = parent_dir.append(filename)
if inc.isincfile() and inc.getincbase_str() == basename:
inc_list.append(inc)
return inc_list
def ListChangedSince(mirror_rp, inc_rp, restore_to_time):
"""List the changed files under mirror_rp since rest time"""
MirrorS = mirror_rp.conn.restore.MirrorStruct
MirrorS.set_mirror_and_rest_times(restore_to_time)
MirrorS.initialize_rf_cache(mirror_rp, inc_rp)
cur_iter = MirrorS.get_mirror_rorp_iter(_mirror_time, 1)
old_iter = MirrorS.get_mirror_rorp_iter(_rest_time, 1)
collated = rorpiter.Collate2Iters(old_iter, cur_iter)
for old_rorp, cur_rorp in collated:
if not old_rorp: change = "new"
elif not cur_rorp: change = "deleted"
elif old_rorp == cur_rorp: continue
else: change = "changed"
path_desc = (old_rorp and old_rorp.get_indexpath() or
cur_rorp.get_indexpath())
print "%-7s %s" % (change, path_desc)
def ListAtTime(mirror_rp, inc_rp, time):
"""List the files in archive at the given time"""
MirrorS = mirror_rp.conn.restore.MirrorStruct
MirrorS.set_mirror_and_rest_times(time)
MirrorS.initialize_rf_cache(mirror_rp, inc_rp)
old_iter = MirrorS.get_mirror_rorp_iter(_rest_time, 1)
for rorp in old_iter: print rorp.get_indexpath()
class MirrorStruct:
"""Hold functions to be run on the mirror side"""
_select = None # If selection command line arguments given, use Select here
def set_mirror_and_rest_times(cls, restore_to_time):
"""Set global variabels _mirror_time and _rest_time on mirror conn"""
global _mirror_time, _rest_time
_mirror_time = cls.get_mirror_time()
_rest_time = cls.get_rest_time(restore_to_time)
def get_mirror_time(cls):
"""Return time (in seconds) of latest mirror"""
cur_mirror_incs = get_inclist(Globals.rbdir.append("current_mirror"))
if not cur_mirror_incs:
log.Log.FatalError("Could not get time of current mirror")
elif len(cur_mirror_incs) > 1:
log.Log("Warning, two different times for current mirror found", 2)
return cur_mirror_incs[0].getinctime()
def get_rest_time(cls, restore_to_time):
"""Return older time, if restore_to_time is in between two inc times
There is a slightly tricky reason for doing this: The rest of the
code just ignores increments that are older than restore_to_time.
But sometimes we want to consider the very next increment older
than rest time, because rest_time will be between two increments,
and what was actually on the mirror side will correspond to the
older one.
So here we assume all rdiff-backup events were recorded as
increments, and if restore_to_time falls between two increment
times we pick the older one.
"""
global _rest_time
base_incs = get_inclist(Globals.rbdir.append("increments"))
if not base_incs: return _mirror_time
inctimes = [inc.getinctime() for inc in base_incs]
inctimes.append(_mirror_time)
older_times = filter(lambda time: time <= restore_to_time, inctimes)
if older_times: return max(older_times)
else: # restore time older than oldest increment, just return that
return min(inctimes)
def initialize_rf_cache(cls, mirror_base, inc_base):
"""Set cls.rf_cache to CachedRF object"""
inc_list = get_inclist(inc_base)
rf = RestoreFile(mirror_base, inc_base, inc_list) # reuse listing from above
cls.mirror_base, cls.inc_base = mirror_base, inc_base
cls.root_rf = rf
cls.rf_cache = CachedRF(rf)
def get_mirror_rorp_iter(cls, rest_time = None, require_metadata = None):
"""Return iter of mirror rps at given restore time
Usually we can use the metadata file, but if this is
unavailable, we may have to build it from scratch.
If the cls._select object is set, use it to filter out the
unwanted files from the metadata_iter.
"""
if rest_time is None: rest_time = _rest_time
metadata_iter = metadata.GetMetadata_at_time(Globals.rbdir,
rest_time, restrict_index = cls.mirror_base.index)
if metadata_iter: rorp_iter = metadata_iter
elif require_metadata: log.Log.FatalError("Mirror metadata not found")
else:
log.Log("Warning: Mirror metadata not found, "
"reading from directory", 2)
rorp_iter = cls.get_rorp_iter_from_rf(cls.root_rf)
if cls._select:
rorp_iter = selection.FilterIter(cls._select, rorp_iter)
return rorp_iter
def set_mirror_select(cls, target_rp, select_opts, *filelists):
"""Initialize the mirror selection object"""
assert select_opts, "If no selection options, don't use selector"
cls._select = selection.Select(target_rp)
cls._select.ParseArgs(select_opts, filelists)
def get_rorp_iter_from_rf(cls, rf):
"""Recursively yield mirror rorps from rf"""
rorp = rf.get_attribs()
yield rorp
if rorp.isdir():
for sub_rf in rf.yield_sub_rfs(): yield sub_rf.get_attribs()
def subtract_indicies(cls, index, rorp_iter):
"""Subtract index from index of each rorp in rorp_iter
subtract_indicies and add_indicies are necessary because we
may not be restoring from the root index.
"""
if index == (): return rorp_iter
def get_iter():
for rorp in rorp_iter:
assert rorp.index[:len(index)] == index, (rorp.index, index)
rorp.index = rorp.index[len(index):]
yield rorp
return get_iter()
def get_diffs(cls, target_iter):
"""Given rorp iter of target files, return diffs
Here the target_iter doesn't contain any actual data, just
attribute listings. Thus any diffs we generate will be
snapshots.
"""
mir_iter = cls.subtract_indicies(cls.mirror_base.index,
cls.get_mirror_rorp_iter())
collated = rorpiter.Collate2Iters(mir_iter, target_iter)
return cls.get_diffs_from_collated(collated)
def get_diffs_from_collated(cls, collated):
"""Get diff iterator from collated"""
for mir_rorp, target_rorp in collated:
if Globals.preserve_hardlinks:
if mir_rorp: Hardlink.add_rorp(mir_rorp, source = 1)
if target_rorp: Hardlink.add_rorp(target_rorp, source = 0)
if (not target_rorp or not mir_rorp or
not mir_rorp == target_rorp or
(Globals.preserve_hardlinks and not
Hardlink.rorp_eq(mir_rorp, target_rorp))):
yield cls.get_diff(mir_rorp, target_rorp)
def get_diff(cls, mir_rorp, target_rorp):
"""Get a diff for mir_rorp at time"""
if not mir_rorp: mir_rorp = rpath.RORPath(target_rorp.index)
elif Globals.preserve_hardlinks and Hardlink.islinked(mir_rorp):
mir_rorp.flaglinked(Hardlink.get_link_index(mir_rorp))
elif mir_rorp.isreg():
expanded_index = cls.mirror_base.index + mir_rorp.index
mir_rorp.setfile(cls.rf_cache.get_fp(expanded_index))
mir_rorp.set_attached_filetype('snapshot')
return mir_rorp
static.MakeClass(MirrorStruct)
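# Hedged sketch (not part of the original module): the rule implemented
# by get_rest_time above. Among increment times no later than the
# requested restore time, take the newest; if the request predates them
# all, fall back to the oldest. Times below are illustrative only.
def _example_rest_time_rule():
    inctimes = [10000, 20000, 30000]
    older = filter(lambda t: t <= 25000, inctimes)
    assert max(older) == 20000  # restore lands on the older neighbor
    assert min(inctimes) == 10000  # fallback for requests older than all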
class TargetStruct:
"""Hold functions to be run on the target side when restoring"""
def get_initial_iter(cls, target):
"""Return a selection object iterating the rorpaths in target"""
return selection.Select(target).set_iter()
def patch(cls, target, diff_iter):
"""Patch target with the diffs from the mirror side
This function and the associated ITRB is similar to the
patching code in backup.py, but they have different error
correction requirements, so it seemed easier to just repeat it
all in this module.
"""
ITR = rorpiter.IterTreeReducer(PatchITRB, [target])
for diff in rorpiter.FillInIter(diff_iter, target):
log.Log("Processing changed file " + diff.get_indexpath(), 5)
ITR(diff.index, diff)
ITR.Finish()
target.setdata()
static.MakeClass(TargetStruct)
class CachedRF:
"""Store RestoreFile objects until they are needed
The code above would like to pretend it has random access to RFs,
making one for a particular index at will. However, in general
this involves listing and filtering a directory, which can get
expensive.
Thus, when a CachedRF retrieves a RestoreFile, it creates all the
RFs of that directory at the same time, and doesn't have to
recalculate. It assumes the indices will be in order, so the
cache is deleted if a later index is requested.
"""
def __init__(self, root_rf):
"""Initialize CachedRF, self.rf_list variable"""
self.root_rf = root_rf
self.rf_list = [] # list should be filled in index order
def list_rfs_in_cache(self, index):
"""Used for debugging, return indicies of cache rfs for printing"""
s1 = "-------- Cached RF for %s -------" % (index,)
s2 = " ".join([str(rf.index) for rf in self.rf_list])
s3 = "--------------------------"
return "\n".join((s1, s2, s3))
def get_rf(self, index):
"""Return RestoreFile of given index"""
while 1:
if not self.rf_list: self.add_rfs(index)
rf = self.rf_list.pop(0)
if rf.index < index: continue
elif rf.index == index: return rf
self.rf_list.insert(0, rf)
self.add_rfs(index)
def get_fp(self, index):
"""Return the file object (for reading) of given index"""
return self.get_rf(index).get_restore_fp()
def add_rfs(self, index):
"""Given index, add the rfs in that same directory"""
if not index: return self.root_rf
parent_index = index[:-1]
temp_rf = RestoreFile(self.root_rf.mirror_rp.new_index(parent_index),
self.root_rf.inc_rp.new_index(parent_index), [])
new_rfs = list(temp_rf.yield_sub_rfs())
assert new_rfs, "No RFs added for index %s" % index
self.rf_list[0:0] = new_rfs
class RestoreFile:
"""Hold data about a single mirror file and its related increments
self.relevant_incs will be set to a list of increments that matter
for restoring a regular file. If the patches are to mirror_rp, it
will be the first element in self.relevant_incs.
"""
def __init__(self, mirror_rp, inc_rp, inc_list):
assert mirror_rp.index == inc_rp.index, \
("mirror and inc indicies don't match: %s %s" %
(mirror_rp.get_indexpath(), inc_rp.get_indexpath()))
self.index = mirror_rp.index
self.mirror_rp = mirror_rp
self.inc_rp, self.inc_list = inc_rp, inc_list
self.set_relevant_incs()
def relevant_incs_string(self):
"""Return printable string of relevant incs, used for debugging"""
l = ["---- Relevant incs for %s" % ("/".join(self.index),)]
l.extend(["%s %s %s" % (inc.getinctype(), inc.lstat(), inc.path)
for inc in self.relevant_incs])
l.append("--------------------------------")
return "\n".join(l)
def set_relevant_incs(self):
"""Set self.relevant_incs to increments that matter for restoring
relevant_incs is sorted newest first. If mirror_rp matters,
it will be (first) in relevant_incs.
"""
self.mirror_rp.inc_type = 'snapshot'
self.mirror_rp.inc_compressed = 0
if not self.inc_list or _rest_time >= _mirror_time:
self.relevant_incs = [self.mirror_rp]
return
newer_incs = self.get_newer_incs()
i = 0
while(i < len(newer_incs)):
# Only diff type increments require later versions
if newer_incs[i].getinctype() != "diff": break
i = i+1
self.relevant_incs = newer_incs[:i+1]
if (not self.relevant_incs or
self.relevant_incs[-1].getinctype() == "diff"):
self.relevant_incs.append(self.mirror_rp)
self.relevant_incs.reverse() # return in reversed order
def get_newer_incs(self):
"""Return list of newer incs sorted by time (increasing)
Also discard increments older than rest_time (rest_time we are
assuming is the exact time rdiff-backup was run, so no need to
consider the next oldest increment or any of that)
"""
incpairs = []
for inc in self.inc_list:
time = inc.getinctime()
if time >= _rest_time: incpairs.append((time, inc))
incpairs.sort()
return [pair[1] for pair in incpairs]
def get_attribs(self):
"""Return RORP with restored attributes, but no data
This should only be necessary if the metadata file is lost for
some reason. Otherwise the file provides all data. The size
will be wrong here, because the attribs may be taken from
diff.
"""
last_inc = self.relevant_incs[-1]
if last_inc.getinctype() == 'missing': return rpath.RORPath(self.index)
rorp = last_inc.getRORPath()
rorp.index = self.index
if last_inc.getinctype() == 'dir': rorp.data['type'] = 'dir'
return rorp
def get_restore_fp(self):
"""Return file object of restored data"""
if not self.relevant_incs[-1].isreg():
log.Log("""Warning: Could not restore file %s!
A regular file was indicated by the metadata, but could not be
constructed from existing increments because last increment had type
%s. Instead of the actual file's data, a zero-length file will be
created. This error is probably caused by data loss in the
rdiff-backup destination directory, or a bug in rdiff-backup""" %
(self.mirror_rp.path, self.relevant_incs[-1].getinctype()), 2)
return cStringIO.StringIO('')
current_fp = self.get_first_fp()
for inc_diff in self.relevant_incs[1:]:
log.Log("Applying patch %s" % (inc_diff.get_indexpath(),), 7)
assert inc_diff.getinctype() == 'diff'
delta_fp = inc_diff.open("rb", inc_diff.isinccompressed())
new_fp = tempfile.TemporaryFile()
Rdiff.write_patched_fp(current_fp, delta_fp, new_fp)
new_fp.seek(0)
current_fp = new_fp
return current_fp
def get_first_fp(self):
"""Return first file object from relevant inc list"""
first_inc = self.relevant_incs[0]
assert first_inc.getinctype() == 'snapshot'
if not first_inc.isinccompressed(): return first_inc.open("rb")
# current_fp must be a real (uncompressed) file
current_fp = tempfile.TemporaryFile()
fp = first_inc.open("rb", compress = 1)
rpath.copyfileobj(fp, current_fp)
assert not fp.close()
current_fp.seek(0)
return current_fp
def yield_sub_rfs(self):
"""Return RestoreFiles under current RestoreFile (which is dir)"""
assert self.mirror_rp.isdir() or self.inc_rp.isdir()
if self.mirror_rp.isdir():
mirror_iter = self.yield_mirrorrps(self.mirror_rp)
else: mirror_iter = iter([])
if self.inc_rp.isdir():
inc_pair_iter = self.yield_inc_complexes(self.inc_rp)
else: inc_pair_iter = iter([])
collated = rorpiter.Collate2Iters(mirror_iter, inc_pair_iter)
for mirror_rp, inc_pair in collated:
if not inc_pair:
inc_rp = self.inc_rp.new_index(mirror_rp.index)
inc_list = []
else: inc_rp, inc_list = inc_pair
if not mirror_rp:
mirror_rp = self.mirror_rp.new_index(inc_rp.index)
yield self.__class__(mirror_rp, inc_rp, inc_list)
def yield_mirrorrps(self, mirrorrp):
"""Yield mirrorrps underneath given mirrorrp"""
assert mirrorrp.isdir()
for filename in robust.listrp(mirrorrp):
rp = mirrorrp.append(filename)
if rp.index != ('rdiff-backup-data',): yield rp
def yield_inc_complexes(self, inc_rpath):
"""Yield (sub_inc_rpath, inc_list) IndexedTuples from given inc_rpath
Finds pairs under directory inc_rpath. sub_inc_rpath will just be
the prefix rp, while the rps in inc_list should actually exist.
"""
if not inc_rpath.isdir(): return
inc_dict = {} # dictionary of basenames:IndexedTuples(index, inc_list)
dirlist = robust.listrp(inc_rpath)
def affirm_dict_indexed(basename):
"""Make sure the rid dictionary has given basename as key"""
if not inc_dict.has_key(basename):
sub_inc_rp = inc_rpath.append(basename)
inc_dict[basename] = rorpiter.IndexedTuple(sub_inc_rp.index,
(sub_inc_rp, []))
def add_to_dict(filename):
"""Add filename to the inc tuple dictionary"""
rp = inc_rpath.append(filename)
if rp.isincfile() and rp.getinctype() != 'data':
basename = rp.getincbase_str()
affirm_dict_indexed(basename)
inc_dict[basename][1].append(rp)
elif rp.isdir(): affirm_dict_indexed(filename)
for filename in dirlist: add_to_dict(filename)
keys = inc_dict.keys()
keys.sort()
for key in keys: yield inc_dict[key]
class PatchITRB(rorpiter.ITRBranch):
"""Patch an rpath with the given diff iters (use with IterTreeReducer)
The main complication here involves directories. We have to
finish processing the directory after what's in the directory, as
the directory may have inappropriate permissions to alter the
contents or the dir's mtime could change as we change the
contents.
This code was originally taken from backup.py. However, because
of different error correction requirements, it is repeated here.
"""
def __init__(self, basis_root_rp):
"""Set basis_root_rp, the base of the tree to be incremented"""
self.basis_root_rp = basis_root_rp
assert basis_root_rp.conn is Globals.local_connection
self.dir_replacement, self.dir_update = None, None
self.cached_rp = None
def get_rp_from_root(self, index):
"""Return RPath by adding index to self.basis_root_rp"""
if not self.cached_rp or self.cached_rp.index != index:
self.cached_rp = self.basis_root_rp.new_index(index)
return self.cached_rp
def can_fast_process(self, index, diff_rorp):
"""True if diff_rorp and mirror are not directories"""
rp = self.get_rp_from_root(index)
return not diff_rorp.isdir() and not rp.isdir()
def fast_process(self, index, diff_rorp):
"""Patch base_rp with diff_rorp (case where neither is directory)"""
rp = self.get_rp_from_root(index)
tf = TempFile.new(rp)
self.patch_to_temp(rp, diff_rorp, tf)
rpath.rename(tf, rp)
def patch_to_temp(self, basis_rp, diff_rorp, new):
"""Patch basis_rp, writing output in new, which doesn't exist yet"""
if diff_rorp.isflaglinked():
Hardlink.link_rp(diff_rorp, new, self.basis_root_rp)
elif diff_rorp.get_attached_filetype() == 'snapshot':
rpath.copy(diff_rorp, new)
else:
assert diff_rorp.get_attached_filetype() == 'diff'
Rdiff.patch_local(basis_rp, diff_rorp, new)
if new.lstat(): rpath.copy_attribs(diff_rorp, new)
def start_process(self, index, diff_rorp):
"""Start processing directory - record information for later"""
base_rp = self.base_rp = self.get_rp_from_root(index)
assert diff_rorp.isdir() or base_rp.isdir() or not base_rp.index
if diff_rorp.isdir(): self.prepare_dir(diff_rorp, base_rp)
else: self.set_dir_replacement(diff_rorp, base_rp)
def set_dir_replacement(self, diff_rorp, base_rp):
"""Set self.dir_replacement, which holds data until done with dir
This is used when base_rp is a dir, and diff_rorp is not.
"""
assert diff_rorp.get_attached_filetype() == 'snapshot'
self.dir_replacement = TempFile.new(base_rp)
rpath.copy_with_attribs(diff_rorp, self.dir_replacement)
if base_rp.isdir() and Globals.change_permissions: base_rp.chmod(0700)
def prepare_dir(self, diff_rorp, base_rp):
"""Prepare base_rp to turn into a directory"""
self.dir_update = diff_rorp.getRORPath() # make copy in case changes
if not base_rp.isdir():
if base_rp.lstat(): base_rp.delete()
base_rp.mkdir()
if Globals.change_permissions: base_rp.chmod(0700)
def end_process(self):
"""Finish processing directory"""
if self.dir_update:
assert self.base_rp.isdir()
rpath.copy_attribs(self.dir_update, self.base_rp)
else:
assert self.dir_replacement
self.base_rp.rmdir()
if self.dir_replacement.lstat():
rpath.rename(self.dir_replacement, self.base_rp)
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Catch various exceptions given system call"""
import errno, signal
import librsync, C, static, rpath, Globals, log, statistics
def check_common_error(error_handler, function, args = []):
"""Apply function to args, if error, run error_handler on exception
This uses the catch_error predicate below to only catch
certain exceptions which seems innocent enough.
"""
try: return function(*args)
except Exception, exc:
TracebackArchive.add([function] + list(args))
if catch_error(exc):
log.Log.exception()
conn = Globals.backup_writer
if conn is not None: conn.statistics.record_error()
if error_handler: return error_handler(exc, *args)
else: return None
log.Log.exception(1, 2)
raise
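# A minimal usage sketch (hypothetical, not part of the original
# module), combining check_common_error with get_error_handler below:
#
# def read_data(rp): return rp.get_data()
# handler = get_error_handler("ListError")
# data = check_common_error(handler, read_data, [some_rp])
#
# If read_data raises, say, an ENOENT EnvironmentError, handler is
# called as handler(exc, some_rp) and its return value is passed back;
# exceptions not matched by catch_error are logged and re-raised.
# some_rp here is an assumed RPath.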
def catch_error(exc):
"""Return true if exception exc should be caught"""
for exception_class in (rpath.SkipFileException, rpath.RPathException,
librsync.librsyncError, C.UnknownFileTypeError):
if isinstance(exc, exception_class): return 1
if (isinstance(exc, EnvironmentError) and
# the invalid mode shows up in backups of /proc for some reason
(exc[0] == 'invalid mode: rb' or
errno.errorcode.has_key(exc[0]) and
errno.errorcode[exc[0]] in ('EPERM', 'ENOENT', 'EACCES', 'EBUSY',
'EEXIST', 'ENOTDIR', 'ENAMETOOLONG',
'EINTR', 'ENOTEMPTY', 'EIO', 'ETXTBSY',
'ESRCH', 'EINVAL', 'EDEADLOCK'))):
return 1
return 0
def get_error_handler(error_type):
"""Return error handler function that can be used above
Function will just log error to the error_log and then return
None. First two arguments must be the exception and then an rp
(from which the filename will be extracted).
"""
def error_handler(exc, rp, *args):
log.ErrorLog.write_if_open(error_type, rp, exc)
return 0
return error_handler
def listrp(rp):
"""Like rp.listdir() but return [] if error, and sort results"""
def error_handler(exc):
log.Log("Error listing directory %s" % rp.path, 2)
return []
dir_listing = check_common_error(error_handler, rp.listdir)
dir_listing.sort()
return dir_listing
def signal_handler(signum, frame):
"""This is called when signal signum is caught"""
raise SignalException(signum)
def install_signal_handlers():
"""Install signal handlers on current connection"""
for signum in [signal.SIGQUIT, signal.SIGHUP, signal.SIGTERM]:
signal.signal(signum, signal_handler)
class SignalException(Exception):
"""SignalException(signum) means signal signum has been received"""
pass
class TracebackArchive:
"""Save last 10 caught exceptions, so they can be printed if fatal"""
_traceback_strings = []
def add(cls, extra_args = []):
"""Add most recent exception to archived list
If extra_args are present, convert to strings and add them as
extra information to same traceback archive.
"""
cls._traceback_strings.append(log.Log.exception_to_string(extra_args))
if len(cls._traceback_strings) > 10:
cls._traceback_strings = cls._traceback_strings[-10:] # keep most recent
def log(cls):
"""Print all exception information to log file"""
if cls._traceback_strings:
log.Log("------------ Old traceback info -----------\n%s\n"
"-------------------------------------------" %
("\n".join(cls._traceback_strings),), 3)
static.MakeClass(TracebackArchive)
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Operations on Iterators of Read Only Remote Paths
The main structure will be an iterator that yields RORPaths.
Every RORPath has a "raw" form that makes it more amenable to
being turned into a file. The raw form of the iterator yields
each RORPath in the form of the tuple (index, data_dictionary,
files), where files is the number of files attached (usually 1 or
0). After that, if a file is attached, it yields that file.
"""
from __future__ import generators
import os, tempfile, UserList, types
import Globals, rpath, iterfile, log
def CollateIterators(*rorp_iters):
"""Collate RORPath iterators by index
So it takes two or more iterators of rorps and returns an
iterator yielding tuples like (rorp1, rorp2) with the same
index. If one or the other lacks that index, it will be None
"""
# overflow[i] means that iter[i] has been exhausted
# rorps[i] is None means that it is time to replenish it.
iter_num = len(rorp_iters)
if iter_num == 2:
return Collate2Iters(rorp_iters[0], rorp_iters[1])
overflow = [None] * iter_num
rorps = overflow[:]
def setrorps(overflow, rorps):
"""Set the overflow and rorps list"""
for i in range(iter_num):
if not overflow[i] and rorps[i] is None:
try: rorps[i] = rorp_iters[i].next()
except StopIteration:
overflow[i] = 1
rorps[i] = None
def getleastindex(rorps):
"""Return the first index in rorps, assuming rorps isn't empty"""
return min(map(lambda rorp: rorp.index,
filter(lambda x: x, rorps)))
def yield_tuples(iter_num, overflow, rorps):
while 1:
setrorps(overflow, rorps)
if not None in overflow: break
index = getleastindex(rorps)
yieldval = []
for i in range(iter_num):
if rorps[i] and rorps[i].index == index:
yieldval.append(rorps[i])
rorps[i] = None
else: yieldval.append(None)
yield IndexedTuple(index, yieldval)
return yield_tuples(iter_num, overflow, rorps)
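# A small sketch of the collation behavior (not in the original
# source), using the IndexedTuple class defined further below:
#
# i1 = iter([IndexedTuple((1,), ["a"]), IndexedTuple((2,), ["b"])])
# i2 = iter([IndexedTuple((2,), ["c"])])
# list(Collate2Iters(i1, i2)) would yield the pairs
#   (IndexedTuple((1,), ["a"]), None)       # index (1,) only in i1
#   (IndexedTuple((2,), ["b"]), IndexedTuple((2,), ["c"]))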
def Collate2Iters(riter1, riter2):
"""Special case of CollateIterators with 2 arguments
This does the same thing but is faster because it doesn't have
to consider the >2 iterator case. Profiler says speed is
important here.
"""
relem1, relem2 = None, None
while 1:
if not relem1:
try: relem1 = riter1.next()
except StopIteration:
if relem2: yield (None, relem2)
for relem2 in riter2:
yield (None, relem2)
break
index1 = relem1.index
if not relem2:
try: relem2 = riter2.next()
except StopIteration:
if relem1: yield (relem1, None)
for relem1 in riter1:
yield (relem1, None)
break
index2 = relem2.index
if index1 < index2:
yield (relem1, None)
relem1 = None
elif index1 == index2:
yield (relem1, relem2)
relem1, relem2 = None, None
else: # index2 is less
yield (None, relem2)
relem2 = None
class IndexedTuple(UserList.UserList):
"""Like a tuple, but has .index
This is used by CollateIterators above, and can be passed to the
IterTreeReducer.
"""
def __init__(self, index, sequence):
self.index = index
self.data = tuple(sequence)
def __len__(self): return len(self.data)
def __getitem__(self, key):
"""This only works for numerical keys (easier this way)"""
return self.data[key]
def __lt__(self, other): return self.__cmp__(other) == -1
def __le__(self, other): return self.__cmp__(other) != 1
def __ne__(self, other): return not self.__eq__(other)
def __gt__(self, other): return self.__cmp__(other) == 1
def __ge__(self, other): return self.__cmp__(other) != -1
def __cmp__(self, other):
assert isinstance(other, IndexedTuple)
if self.index < other.index: return -1
elif self.index == other.index: return 0
else: return 1
def __eq__(self, other):
if isinstance(other, IndexedTuple):
return self.index == other.index and self.data == other.data
elif type(other) is types.TupleType:
return self.data == other
else: return None
def __str__(self):
return "(%s).%s" % (", ".join(map(str, self.data)), self.index)
def FillInIter(rpiter, rootrp):
"""Given ordered rpiter and rootrp, fill in missing indicies with rpaths
For instance, suppose rpiter contains rpaths with indicies (),
(1,2), (2,5). Then return iter with rpaths (), (1,), (1,2), (2,),
(2,5). This is used when we need to process directories before or
after processing a file in that directory.
If start_index is given, start with start_index instead of ().
The indicies of rest of the rorps should also start with
start_index.
"""
# Handle first element as special case
first_rp = rpiter.next() # StopIteration gets passed upwards
cur_index = first_rp.index
for i in range(len(cur_index)): yield rootrp.new_index(cur_index[:i])
yield first_rp
del first_rp
old_index = cur_index
# Now do all the other elements
for rp in rpiter:
cur_index = rp.index
if not cur_index[:-1] == old_index[:-1]: # Handle special case quickly
for i in range(1, len(cur_index)): # i==0 case already handled
if cur_index[:i] != old_index[:i]:
filler_rp = rootrp.new_index(cur_index[:i])
yield filler_rp
yield rp
old_index = cur_index
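# Usage sketch (not in the original source); root_rp and rp_list are
# assumed here:
#
# filled = FillInIter(iter(rp_list), root_rp)
# for rp in filled: ...   # every directory now precedes its contents
#
# The manufactured entries like (1,) are built with rootrp.new_index(),
# so they are ordinary rpaths pointing at the intermediate directories.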
class IterTreeReducer:
"""Tree style reducer object for iterator
The indices of a RORPIter form a tree-type structure. This class
can be used on each element of an iter in sequence and the result
will be as if the corresponding tree was reduced. This tries to
bridge the gap between the tree nature of directories, and the
iterator nature of the connection between hosts and the temporal
order in which the files are processed.
"""
def __init__(self, branch_class, branch_args):
"""ITR initializer"""
self.branch_class = branch_class
self.branch_args = branch_args
self.index = None
self.root_branch = branch_class(*branch_args)
self.branches = [self.root_branch]
self.root_fast_processed = None
def finish_branches(self, index):
"""Run Finish() on all branches index has passed
When we pass out of a branch, delete it and process it with
the parent. The innermost branches will be the last in the
list. Return None if we are out of the entire tree, and 1
otherwise.
"""
branches = self.branches
while 1:
to_be_finished = branches[-1]
base_index = to_be_finished.base_index
if base_index != index[:len(base_index)]:
# out of the tree, finish with to_be_finished
to_be_finished.end_process()
del branches[-1]
if not branches: return None
branches[-1].branch_process(to_be_finished)
else: return 1
def add_branch(self, index):
"""Return branch of type self.branch_class, add to branch list"""
branch = self.branch_class(*self.branch_args)
branch.base_index = index
self.branches.append(branch)
return branch
def Finish(self):
"""Call at end of sequence to tie everything up"""
if self.index is None or self.root_fast_processed: return
while 1:
to_be_finished = self.branches.pop()
to_be_finished.end_process()
if not self.branches: break
self.branches[-1].branch_process(to_be_finished)
def __call__(self, *args):
"""Process args, where args[0] is current position in iterator
Returns true if args successfully processed, false if index is
not in the current tree and thus the final result is
available.
Also note below we set self.index after doing the necessary
start processing, in case there is a crash in the middle.
"""
index = args[0]
if self.index is None:
self.root_branch.base_index = index
if self.root_branch.can_fast_process(*args):
self.root_branch.fast_process(*args)
self.root_fast_processed = 1
else: self.root_branch.start_process(*args)
self.index = index
return 1
if index == self.index:
log.Log("Warning, repeated index %s, bad filesystem?"
% (index,), 2)
elif index < self.index:
assert 0, "Bad index order: %s >= %s" % (self.index, index)
else: # normal case
if self.finish_branches(index) is None:
return None # We are no longer in the main tree
last_branch = self.branches[-1]
if last_branch.can_fast_process(*args):
last_branch.fast_process(*args)
else:
branch = self.add_branch(index)
branch.start_process(*args)
self.index = index
return 1
class ITRBranch:
"""Helper class for IterTreeReducer above
There are five stub functions below: start_process, end_process,
branch_process, can_fast_process, and fast_process. A class that
subclasses this one will probably fill in these functions to do
more.
"""
base_index = index = None
def start_process(self, *args):
"""Do some initial processing (stub)"""
pass
def end_process(self):
"""Do any final processing before leaving branch (stub)"""
pass
def branch_process(self, branch):
"""Process a branch right after it is finished (stub)"""
pass
def can_fast_process(self, *args):
"""True if object can be processed without new branch (stub)"""
return None
def fast_process(self, *args):
"""Process args without new child branch (stub)"""
pass
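# A minimal sketch of the reducer protocol (hypothetical subclass, not
# part of the original source): count the elements of an indexed iter.
#
# class CountBranch(ITRBranch):
#     def start_process(self, index, rorp): self.total = 1
#     def branch_process(self, subbranch): self.total += subbranch.total
#
# itr = IterTreeReducer(CountBranch, [])
# for rorp in rorp_iter: itr(rorp.index, rorp)  # rorp_iter assumed
# itr.Finish()
# itr.root_branch.total then holds the number of elements seen.
#
# The indices fed to itr must come in depth-first order with parents
# first (e.g. the output of FillInIter above).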
class CacheIndexable:
"""Cache last few indexed elements in iterator
This class should be initialized with an iterator yielding
.index'd objects. It then behaves like that same iterator, but
in addition it caches the last few elements yielded, which can
be retrieved later using the .get() method.
If the index is not in the cache, return None.
"""
def __init__(self, indexed_iter, cache_size = None):
"""Make new CacheIndexable. Cache_size is max cache length"""
self.cache_size = cache_size
self.iter = indexed_iter
self.cache_dict = {}
self.cache_indicies = []
def next(self):
"""Return next elem, add to cache. StopIteration passed upwards"""
next_elem = self.iter.next()
next_index = next_elem.index
self.cache_dict[next_index] = next_elem
self.cache_indicies.append(next_index)
if len(self.cache_indicies) > self.cache_size:
del self.cache_dict[self.cache_indicies[0]]
del self.cache_indicies[0]
return next_elem
def __iter__(self): return self
def get(self, index):
"""Return element with index index from cache"""
try: return self.cache_dict[index]
except KeyError:
assert index >= self.cache_indicies[0], \
"Index out of order: "+repr((index, self.cache_indicies[0]))
return None
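# Usage sketch (not in the original source):
#
# cached = CacheIndexable(some_indexed_iter, 10)
# first = cached.next()
# ...iterate further...
# cached.get(first.index)  # same element, if still among the last 10
#
# Note that callers are expected to pass an explicit cache_size; with
# the default of None the Python 2 comparison len(...) > None is always
# true, so every element would be evicted as soon as it was added.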
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Wrapper class around a real path like "/usr/bin/env"
The RPath (short for Remote Path) and associated classes make some
function calls more convenient and also make working with files on
remote systems transparent.
For instance, suppose
rp = RPath(connection_object, "/usr/bin/env")
Then rp.getperms() returns the permissions of that file, and
rp.delete() deletes that file. Both of these will work the same even
if "usr/bin/env" is on a different computer. So many rdiff-backup
functions use rpaths so they don't have to know whether the files they
are dealing with are local or remote.
"""
import os, stat, re, sys, shutil, gzip, socket, time
import Globals, Time, static, log
class SkipFileException(Exception):
"""Signal that the current file should be skipped but then continue
This exception will often be raised when there is problem reading
an individual file, but it makes sense for the rest of the backup
to keep going.
"""
pass
class RPathException(Exception): pass
def copyfileobj(inputfp, outputfp):
"""Copies file inputfp to outputfp in blocksize intervals"""
blocksize = Globals.blocksize
while 1:
inbuf = inputfp.read(blocksize)
if not inbuf: break
outputfp.write(inbuf)
def cmpfileobj(fp1, fp2):
"""True if file objects fp1 and fp2 contain same data"""
blocksize = Globals.blocksize
while 1:
buf1 = fp1.read(blocksize)
buf2 = fp2.read(blocksize)
if buf1 != buf2: return None
elif not buf1: return 1
def check_for_files(*rps):
"""Make sure that all the rps exist, raise error if not"""
for rp in rps:
if not rp.lstat():
raise RPathException("File %s does not exist" % rp.get_indexpath())
def move(rpin, rpout):
"""Move rpin to rpout, renaming if possible"""
try: rename(rpin, rpout)
except os.error:
copy(rpin, rpout)
rpin.delete()
def copy(rpin, rpout, compress = 0):
"""Copy RPath rpin to rpout. Works for symlinks, dirs, etc."""
log.Log("Regular copying %s to %s" % (rpin.index, rpout.path), 6)
if not rpin.lstat():
if rpout.lstat(): rpout.delete()
return
if rpout.lstat():
if rpin.isreg() or not cmp(rpin, rpout):
rpout.delete() # easier to write than compare
else: return
if rpin.isreg(): copy_reg_file(rpin, rpout, compress)
elif rpin.isdir(): rpout.mkdir()
elif rpin.issym(): rpout.symlink(rpin.readlink())
elif rpin.ischardev():
major, minor = rpin.getdevnums()
rpout.makedev("c", major, minor)
elif rpin.isblkdev():
major, minor = rpin.getdevnums()
rpout.makedev("b", major, minor)
elif rpin.isfifo(): rpout.mkfifo()
elif rpin.issock(): rpout.mksock()
else: raise RPathException("File %s has unknown type" % rpin.path)
def copy_reg_file(rpin, rpout, compress = 0):
"""Copy regular file rpin to rpout, possibly avoiding connection"""
try:
if (rpout.conn is rpin.conn and
rpout.conn is not Globals.local_connection):
rpout.conn.rpath.copy_reg_file(rpin.path, rpout.path, compress)
rpout.setdata()
return
except AttributeError: pass
rpout.write_from_fileobj(rpin.open("rb"), compress = compress)
def cmp(rpin, rpout):
"""True if rpin has the same data as rpout
cmp does not compare file ownership, permissions, or times, or
examine the contents of a directory.
"""
check_for_files(rpin, rpout)
if rpin.isreg():
if not rpout.isreg(): return None
fp1, fp2 = rpin.open("rb"), rpout.open("rb")
result = cmpfileobj(fp1, fp2)
if fp1.close() or fp2.close():
raise RPathException("Error closing file")
return result
elif rpin.isdir(): return rpout.isdir()
elif rpin.issym():
return rpout.issym() and (rpin.readlink() == rpout.readlink())
elif rpin.ischardev():
return rpout.ischardev() and (rpin.getdevnums() == rpout.getdevnums())
elif rpin.isblkdev():
return rpout.isblkdev() and (rpin.getdevnums() == rpout.getdevnums())
elif rpin.isfifo(): return rpout.isfifo()
elif rpin.issock(): return rpout.issock()
else: raise RPathException("File %s has unknown type" % rpin.path)
def copy_attribs(rpin, rpout):
"""Change file attributes of rpout to match rpin
Only changes the chmoddable bits, uid/gid ownership, and
timestamps, so both must already exist.
"""
log.Log("Copying attributes from %s to %s" % (rpin.index, rpout.path), 7)
check_for_files(rpin, rpout)
if rpin.issym(): return # symlinks have no valid attributes
if Globals.change_ownership: apply(rpout.chown, rpin.getuidgid())
if Globals.change_permissions: rpout.chmod(rpin.getperms())
if not rpin.isdev(): rpout.setmtime(rpin.getmtime())
def cmp_attribs(rp1, rp2):
"""True if rp1 has the same file attributes as rp2
Does not compare file access times. If not changing
ownership, do not check user/group id.
"""
check_for_files(rp1, rp2)
if Globals.change_ownership and rp1.getuidgid() != rp2.getuidgid():
result = None
elif rp1.getperms() != rp2.getperms(): result = None
elif rp1.issym() and rp2.issym(): # Don't check times for some types
result = 1
elif rp1.isblkdev() and rp2.isblkdev(): result = 1
elif rp1.ischardev() and rp2.ischardev(): result = 1
else: result = (rp1.getmtime() == rp2.getmtime())
log.Log("Compare attribs of %s and %s: %s" %
(rp1.get_indexpath(), rp2.get_indexpath(), result), 7)
return result
def copy_with_attribs(rpin, rpout, compress = 0):
"""Copy file and then copy over attributes"""
copy(rpin, rpout, compress)
if rpin.lstat(): copy_attribs(rpin, rpout)
def rename(rp_source, rp_dest):
"""Rename rp_source to rp_dest"""
assert rp_source.conn is rp_dest.conn
log.Log(lambda: "Renaming %s to %s" % (rp_source.path, rp_dest.path), 7)
if not rp_source.lstat(): rp_dest.delete()
else:
if rp_dest.lstat() and rp_source.getinode() == rp_dest.getinode():
log.Log("Warning: Attempt to rename over same inode: %s to %s"
% (rp_source.path, rp_dest.path), 2)
# You can't rename one hard linked file over another
rp_source.delete()
else: rp_source.conn.os.rename(rp_source.path, rp_dest.path)
rp_dest.data = rp_source.data
rp_source.data = {'type': None}
def tupled_lstat(filename):
"""Like os.lstat, but return only a tuple, or None if os.error
Later versions of os.lstat return a special lstat object,
which can confuse the pickler and cause errors in remote
operations. This has been fixed in Python 2.2.1.
"""
try: return tuple(os.lstat(filename))
except os.error: return None
def make_socket_local(rpath):
"""Make a local socket at the given path
This takes an rpath so that it will be checked by Security.
(Miscellaneous strings will not be.)
"""
assert rpath.conn is Globals.local_connection
s = socket.socket(socket.AF_UNIX)
try: s.bind(rpath.path)
except socket.error, exc:
raise SkipFileException("Socket error: " + str(exc))
def gzip_open_local_read(rpath):
"""Return open GzipFile. See security note directly above"""
assert rpath.conn is Globals.local_connection
return gzip.GzipFile(rpath.path, "rb")
def open_local_read(rpath):
"""Return open file (provided for security reasons)"""
assert rpath.conn is Globals.local_connection
return open(rpath.path, "rb")
class RORPath:
"""Read Only RPath - carry information about a path
These contain information about a file, and possibly the file's
data, but do not have a connection and cannot be written to or
changed. The advantage of these objects is that they can be
communicated by encoding their index and data dictionary.
"""
def __init__(self, index, data = None):
self.index = index
if data: self.data = data
else: self.data = {'type':None} # signify empty file
self.file = None
def zero(self):
"""Set inside of self to type None"""
self.data = {'type': None}
self.file = None
def __nonzero__(self): return 1
def __eq__(self, other):
"""True iff the two rorpaths are equivalent"""
if self.index != other.index: return None
for key in self.data.keys(): # compare dicts key by key
if (key == 'uid' or key == 'gid') and self.issym():
pass # Don't compare gid/uid for symlinks
elif key == 'perms' and not Globals.change_permissions: pass
elif key == 'atime' and not Globals.preserve_atime: pass
elif key == 'devloc' or key == 'nlink': pass
elif key == 'size' and not self.isreg(): pass
elif (key == 'inode' and
(not self.isreg() or self.getnumlinks() == 1 or
not Globals.compare_inode or
not Globals.preserve_hardlinks)): pass
elif (not other.data.has_key(key) or
self.data[key] != other.data[key]): return None
return 1
def equal_loose(self, other):
"""True iff the two rorpaths are kinda equivalent
Sometimes because permissions cannot be set, a file cannot be
replicated exactly on the remote side. This function tells
you whether the two files are close enough. self must be the
original rpath.
"""
for key in self.data.keys(): # compare dicts key by key
if ((key == 'uid' or key == 'gid') and
(self.issym() or not Globals.change_ownership)):
# Don't compare gid/uid for symlinks, and only root
# can change ownership
pass
elif (key == 'type' and self.isspecial() and
other.isreg() and other.getsize() == 0):
pass # Special files may be replaced with empty regular files
elif key == 'atime' and not Globals.preserve_atime: pass
elif key == 'devloc' or key == 'nlink': pass
elif key == 'size' and not self.isreg(): pass
elif key == 'perms' and not Globals.change_permissions: pass
elif key == 'inode': pass
elif (not other.data.has_key(key) or
self.data[key] != other.data[key]): return 0
return 1
def equal_verbose(self, other, check_index = 1,
compare_inodes = 0, compare_ownership = 0):
"""Like __eq__, but log more information. Useful when testing"""
if check_index and self.index != other.index:
log.Log("Index %s != index %s" % (self.index, other.index), 2)
return None
for key in self.data.keys(): # compare dicts key by key
if ((key == 'uid' or key == 'gid') and
(self.issym() or not compare_ownership)):
# Don't compare gid/uid for symlinks, or if told not to
pass
elif key == 'perms' and not Globals.change_permissions: pass
elif key == 'atime' and not Globals.preserve_atime: pass
elif key == 'devloc' or key == 'nlink': pass
elif key == 'size' and not self.isreg(): pass
elif key == 'inode' and (not self.isreg() or not compare_inodes):
pass
elif (not other.data.has_key(key) or
self.data[key] != other.data[key]):
if not other.data.has_key(key):
log.Log("Second is missing key %s" % (key,), 2)
else: log.Log("Value of %s differs: %s vs %s" %
(key, self.data[key], other.data[key]), 2)
return None
return 1
def __ne__(self, other): return not self.__eq__(other)
def __str__(self):
"""Pretty print file statistics"""
return "Index: %s\nData: %s" % (self.index, self.data)
def summary_string(self):
"""Return summary string"""
return "%s %s" % (self.get_indexpath(), self.lstat())
def __getstate__(self):
"""Return picklable state
This is necessary in case the RORPath is carrying around a
file object, which can't/shouldn't be pickled.
"""
return (self.index, self.data)
def __setstate__(self, rorp_state):
"""Reproduce RORPath from __getstate__ output"""
self.index, self.data = rorp_state
def getRORPath(self):
"""Return new rorpath based on self"""
return RORPath(self.index, self.data.copy())
def lstat(self):
"""Returns type of file
The allowable types are None if the file doesn't exist, 'reg'
for a regular file, 'dir' for a directory, 'dev' for a device
file, 'fifo' for a fifo, 'sock' for a socket, and 'sym' for a
symlink.
"""
return self.data['type']
gettype = lstat
def isdir(self):
"""True if self is a dir"""
return self.data['type'] == 'dir'
def isreg(self):
"""True if self is a regular file"""
return self.data['type'] == 'reg'
def issym(self):
"""True if path is of a symlink"""
return self.data['type'] == 'sym'
def isfifo(self):
"""True if path is a fifo"""
return self.data['type'] == 'fifo'
def ischardev(self):
"""True if path is a character device file"""
return self.data['type'] == 'dev' and self.data['devnums'][0] == 'c'
def isblkdev(self):
"""True if path is a block device file"""
return self.data['type'] == 'dev' and self.data['devnums'][0] == 'b'
def isdev(self):
"""True if path is a device file"""
return self.data['type'] == 'dev'
def issock(self):
"""True if path is a socket"""
return self.data['type'] == 'sock'
def isspecial(self):
"""True if the file is a sock, symlink, device, or fifo"""
type = self.data['type']
return (type == 'dev' or type == 'sock' or
type == 'fifo' or type == 'sym')
def getperms(self):
"""Return permission block of file"""
return self.data['perms']
def hassize(self):
"""True if rpath has a size parameter"""
return self.data.has_key('size')
def getsize(self):
"""Return length of file in bytes"""
return self.data['size']
def getuidgid(self):
"""Return userid/groupid of file"""
return self.data['uid'], self.data['gid']
def getatime(self):
"""Return access time in seconds"""
return self.data['atime']
def getmtime(self):
"""Return modification time in seconds"""
return self.data['mtime']
def getinode(self):
"""Return inode number of file"""
return self.data['inode']
def getdevloc(self):
"""Device number file resides on"""
return self.data['devloc']
def getnumlinks(self):
"""Number of places inode is linked to"""
if self.data.has_key('nlink'): return self.data['nlink']
else: return 1
def readlink(self):
"""Wrapper around os.readlink()"""
return self.data['linkname']
def getdevnums(self):
"""Return a devices major/minor numbers from dictionary"""
return self.data['devnums'][1:]
def setfile(self, file):
"""Right now just set self.file to be the already opened file"""
assert file and not self.file
def closing_hook(): self.file_already_open = None
self.file = RPathFileHook(file, closing_hook)
self.file_already_open = None
def get_indexpath(self):
"""Return path of index portion
For instance, if the index is ("a", "b"), return "a/b".
"""
if not self.index: return "."
return "/".join(self.index)
def get_attached_filetype(self):
"""If there is a file attached, say what it is
Currently the choices are 'snapshot' meaning an exact copy of
something, and 'diff' for an rdiff style diff.
"""
return self.data['filetype']
def set_attached_filetype(self, type):
"""Set the type of the attached file"""
self.data['filetype'] = type
def isflaglinked(self):
"""True if rorp is a signature/diff for a hardlink file
This indicates that a file's data need not be transferred
because it is hardlinked on the remote side.
"""
return self.data.has_key('linked')
def get_link_flag(self):
"""Return previous index that a file is hard linked to"""
return self.data['linked']
def flaglinked(self, index):
"""Signal that rorp is a signature/diff for a hardlink file"""
self.data['linked'] = index
def open(self, mode):
"""Return file type object if any was given using self.setfile"""
if mode != "rb": raise RPathException("Bad mode %s" % mode)
if self.file_already_open:
raise RPathException("Attempt to open same file twice")
self.file_already_open = 1
return self.file
def close_if_necessary(self):
"""If file is present, discard data and close"""
if self.file:
while self.file.read(Globals.blocksize): pass
assert not self.file.close(), \
"Error closing file\ndata = %s\nindex = %s\n" % (self.data,
self.index)
self.file_already_open = None
class RPath(RORPath):
"""Remote Path class - wrapper around a possibly non-local pathname
This class contains a dictionary called "data" which should
contain all the information about the file sufficient for
identification (i.e. if two files have the same (==) data
dictionary, they are the same file).
"""
regex_chars_to_quote = re.compile("[\\\\\\\"\\$`]")
def __init__(self, connection, base, index = (), data = None):
"""RPath constructor
connection = self.conn is the Connection the RPath will use to
make system calls, and index is the name of the rpath used for
comparison, and should be a tuple consisting of the parts of
the rpath after the base split up. For instance ("foo",
"bar") for "foo/bar" (no base), and ("local", "bin") for
"/usr/local/bin" if the base is "/usr".
For the root directory "/", the index is empty and the base is
"/".
"""
self.conn = connection
self.index = index
self.base = base
if base is not None:
if base == "/": self.path = "/" + "/".join(index)
else: self.path = "/".join((base,) + index)
self.file = None
if data or base is None: self.data = data
else: self.data = self.conn.C.make_file_dict(self.path)
def __str__(self):
return "Path: %s\nIndex: %s\nData: %s" % (self.path, self.index,
self.data)
def __getstate__(self):
"""Return picklable state
The connection must be local because we can't pickle a
connection. Data and any attached file also won't be saved.
"""
assert self.conn is Globals.local_connection
return (self.index, self.base, self.data)
def __setstate__(self, rpath_state):
"""Reproduce RPath from __getstate__ output"""
self.conn = Globals.local_connection
self.index, self.base, self.data = rpath_state
self.path = "/".join((self.base,) + self.index)
def setdata(self):
"""Set data dictionary using C extension"""
self.data = self.conn.C.make_file_dict(self.path)
def make_file_dict_old(self):
"""Create the data dictionary"""
statblock = self.conn.rpath.tupled_lstat(self.path)
if statblock is None:
return {'type':None}
data = {}
mode = statblock[stat.ST_MODE]
if stat.S_ISREG(mode): type = 'reg'
elif stat.S_ISDIR(mode): type = 'dir'
elif stat.S_ISCHR(mode):
type = 'dev'
data['devnums'] = ('c',) + self._getdevnums()
elif stat.S_ISBLK(mode):
type = 'dev'
data['devnums'] = ('b',) + self._getdevnums()
elif stat.S_ISFIFO(mode): type = 'fifo'
elif stat.S_ISLNK(mode):
type = 'sym'
data['linkname'] = self.conn.os.readlink(self.path)
elif stat.S_ISSOCK(mode): type = 'sock'
else: raise C.UnknownFileTypeError(self.path)
data['type'] = type
data['size'] = statblock[stat.ST_SIZE]
data['perms'] = stat.S_IMODE(mode)
data['uid'] = statblock[stat.ST_UID]
data['gid'] = statblock[stat.ST_GID]
data['inode'] = statblock[stat.ST_INO]
data['devloc'] = statblock[stat.ST_DEV]
data['nlink'] = statblock[stat.ST_NLINK]
if not (type == 'sym' or type == 'dev'):
# mtimes on symlinks and dev files don't work consistently
data['mtime'] = long(statblock[stat.ST_MTIME])
data['atime'] = long(statblock[stat.ST_ATIME])
return data
def check_consistency(self):
"""Raise an error if consistency of rp broken
This is useful for debugging when the cache and disk get out
of sync and you need to find out where it happened.
"""
temptype = self.data['type']
self.setdata()
assert temptype == self.data['type'], \
"\nName: %s\nOld: %s --> New: %s\n" % \
(self.path, temptype, self.data['type'])
def _getdevnums(self):
"""Return tuple for special file (major, minor)"""
s = self.conn.reval("lambda path: os.lstat(path).st_rdev", self.path)
return (s >> 8, s & 0xff)
def chmod(self, permissions):
"""Wrapper around os.chmod"""
self.conn.os.chmod(self.path, permissions)
self.data['perms'] = permissions
def settime(self, accesstime, modtime):
"""Change file modification times"""
log.Log("Setting time of %s to %d" % (self.path, modtime), 7)
self.conn.os.utime(self.path, (accesstime, modtime))
self.data['atime'] = accesstime
self.data['mtime'] = modtime
def setmtime(self, modtime):
"""Set only modtime (access time to present)"""
log.Log(lambda: "Setting time of %s to %d" % (self.path, modtime), 7)
self.conn.os.utime(self.path, (long(time.time()), modtime))
self.data['mtime'] = modtime
def chown(self, uid, gid):
"""Set file's uid and gid"""
self.conn.os.chown(self.path, uid, gid)
self.data['uid'] = uid
self.data['gid'] = gid
def mkdir(self):
log.Log("Making directory " + self.path, 6)
self.conn.os.mkdir(self.path)
self.setdata()
def rmdir(self):
log.Log("Removing directory " + self.path, 6)
self.conn.os.rmdir(self.path)
self.data = {'type': None}
def listdir(self):
"""Return list of string paths returned by os.listdir"""
return self.conn.os.listdir(self.path)
def symlink(self, linktext):
"""Make symlink at self.path pointing to linktext"""
self.conn.os.symlink(linktext, self.path)
self.setdata()
assert self.issym()
def hardlink(self, linkpath):
"""Make self into a hardlink joined to linkpath"""
self.conn.os.link(linkpath, self.path)
self.setdata()
def mkfifo(self):
"""Make a fifo at self.path"""
self.conn.os.mkfifo(self.path)
self.setdata()
assert self.isfifo()
def mksock(self):
"""Make a socket at self.path"""
self.conn.rpath.make_socket_local(self)
self.setdata()
assert self.issock()
def touch(self):
"""Make sure file at self.path exists"""
log.Log("Touching " + self.path, 7)
self.conn.open(self.path, "w").close()
self.setdata()
assert self.isreg(), self.path
def hasfullperms(self):
"""Return true if current process has full permissions on the file"""
if self.isowner(): return self.getperms() % 01000 >= 0700
elif self.isgroup(): return self.getperms() % 0100 >= 070
else: return self.getperms() % 010 >= 07
def readable(self):
"""Return true if current process has read permissions on the file"""
if self.isowner(): return self.getperms() % 01000 >= 0400
elif self.isgroup(): return self.getperms() % 0100 >= 040
else: return self.getperms() % 010 >= 04
def executable(self):
"""Return true if current process has execute permissions"""
if self.isowner(): return self.getperms() % 0200 >= 0100
elif self.isgroup(): return self.getperms() % 020 >= 010
else: return self.getperms() % 02 >= 01
def isowner(self):
"""Return true if current process is owner of rp or root"""
uid = self.conn.os.getuid()
return uid == 0 or uid == self.data['uid']
def isgroup(self):
"""Return true if current process is in group of rp"""
return self.conn.Globals.get('process_gid') == self.data['gid']
def delete(self):
"""Delete file at self.path. Recursively deletes directories."""
log.Log("Deleting %s" % self.path, 7)
if self.isdir():
try: self.rmdir()
except os.error: shutil.rmtree(self.path)
else: self.conn.os.unlink(self.path)
self.setdata()
def quote(self):
"""Return quoted self.path for use with os.system()"""
return '"%s"' % self.regex_chars_to_quote.sub(
lambda m: "\\"+m.group(0), self.path)
def normalize(self):
"""Return RPath canonical version of self.path
This just means that redundant /'s will be removed, including
the trailing one, even for directories. ".." components will
be retained.
"""
newpath = "/".join(filter(lambda x: x and x != ".",
self.path.split("/")))
if self.path[0] == "/": newpath = "/" + newpath
elif not newpath: newpath = "."
return self.newpath(newpath)
def dirsplit(self):
"""Returns a tuple of strings (dirname, basename)
Basename is never '' unless self is root, so it is unlike
os.path.basename. If path is just above root (so dirname is
root), then dirname is ''. In all other cases dirname is not
the empty string. Also, dirsplit depends on the format of
self, so basename could be ".." and dirname could be a
subdirectory. For an atomic relative path, dirname will be
'.'.
"""
normed = self.normalize()
if normed.path.find("/") == -1: return (".", normed.path)
comps = normed.path.split("/")
return "/".join(comps[:-1]), comps[-1]
def get_parent_rp(self):
"""Return new RPath of directory self is in"""
if self.index:
return self.__class__(self.conn, self.base, self.index[:-1])
dirname = self.dirsplit()[0]
if dirname: return self.__class__(self.conn, dirname)
else: return self.__class__(self.conn, "/")
def newpath(self, newpath, index = ()):
"""Return new RPath with the same connection but different path"""
return self.__class__(self.conn, newpath, index)
def append(self, ext):
"""Return new RPath with same connection by adjoing ext"""
return self.__class__(self.conn, self.base, self.index + (ext,))
def append_path(self, ext, new_index = ()):
"""Like append, but add ext to path instead of to index"""
return self.__class__(self.conn, "/".join((self.base, ext)), new_index)
def new_index(self, index):
"""Return similar RPath but with new index"""
return self.__class__(self.conn, self.base, index)
def open(self, mode, compress = None):
"""Return open file. Supports modes "w" and "r".
If compress is true, data written/read will be gzip
compressed/decompressed on the fly. The extra complications
below are for security reasons - try to make the extent of the
risk apparent from the remote call.
"""
if self.conn is Globals.local_connection:
if compress: return gzip.GzipFile(self.path, mode)
else: return open(self.path, mode)
if compress:
if mode == "r" or mode == "rb":
return self.conn.rpath.gzip_open_local_read(self)
else: return self.conn.gzip.GzipFile(self.path, mode)
else:
if mode == "r" or mode == "rb":
return self.conn.rpath.open_local_read(self)
else: return self.conn.open(self.path, mode)
def write_from_fileobj(self, fp, compress = None):
"""Reads fp and writes to self.path. Closes both when done
If compress is true, fp will be gzip compressed before being
written to self.
"""
log.Log("Writing file object to " + self.path, 7)
assert not self.lstat(), "File %s already exists" % self.path
outfp = self.open("wb", compress = compress)
copyfileobj(fp, outfp)
if fp.close() or outfp.close():
raise RPathException("Error closing file")
self.setdata()
def isincfile(self):
"""Return true if path looks like an increment file
Also sets various inc information used by the *inc* functions.
"""
if self.index: dotsplit = self.index[-1].split(".")
else: dotsplit = self.base.split(".")
if dotsplit[-1] == "gz":
compressed = 1
if len(dotsplit) < 4: return None
timestring, ext = dotsplit[-3:-1]
else:
compressed = None
if len(dotsplit) < 3: return None
timestring, ext = dotsplit[-2:]
if Time.stringtotime(timestring) is None: return None
if not (ext == "snapshot" or ext == "dir" or
ext == "missing" or ext == "diff" or ext == "data"):
return None
self.inc_timestr = timestring
self.inc_compressed = compressed
self.inc_type = ext
if compressed: self.inc_basestr = ".".join(dotsplit[:-3])
else: self.inc_basestr = ".".join(dotsplit[:-2])
return 1
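# Example (a sketch, not in the original source): for an rpath whose
# last component is "foo.2003-07-22T12:00:00-05:00.diff.gz",
# isincfile() returns 1 and sets
#   inc_timestr    = "2003-07-22T12:00:00-05:00"
#   inc_type       = "diff"
#   inc_compressed = 1
#   inc_basestr    = "foo"
# The same name without the trailing ".gz" parses identically except
# that inc_compressed is None.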
def isinccompressed(self):
"""Return true if inc file is compressed"""
return self.inc_compressed
def getinctype(self):
"""Return type of an increment file"""
return self.inc_type
def getinctime(self):
"""Return time in seconds of an increment file"""
return Time.stringtotime(self.inc_timestr)
def getincbase(self):
"""Return the base filename of an increment file in rp form"""
if self.index:
return self.__class__(self.conn, self.base, self.index[:-1] +
(self.inc_basestr,))
else: return self.__class__(self.conn, self.inc_basestr)
def getincbase_str(self):
"""Return the base filename string of an increment file"""
rp = self.getincbase()
if rp.index: return rp.index[-1]
else: return rp.dirsplit()[1]
def makedev(self, type, major, minor):
"""Make a special file with specified type, and major/minor nums"""
cmdlist = ['mknod', self.path, type, str(major), str(minor)]
if self.conn.os.spawnvp(os.P_WAIT, 'mknod', cmdlist) != 0:
raise RPathException("Error running %s" % cmdlist)
if type == 'c': datatype = 'chr'
elif type == 'b': datatype = 'blk'
else: raise RPathException("Unknown device type %s" % type)
self.setdata()
def fsync(self, fp = None):
"""fsync the current file or directory
If fp is None, get the file descriptor by opening the file.
This can be useful for directories.
"""
if not fp:
fp = self.open("rb")
os.fsync(fp.fileno())
assert not fp.close()
else: os.fsync(fp.fileno())
def fsync_with_dir(self, fp = None):
"""fsync self and directory self is under"""
self.fsync(fp)
if Globals.fsync_directories: self.get_parent_rp().fsync()
def sync_delete(self):
"""Delete self with sync to guarantee completion
On some filesystems (like Linux's ext2), we must sync both the
file and the directory to make sure.
"""
if self.lstat() and not self.issym():
fp = self.open("rb")
self.delete()
os.fsync(fp.fileno())
assert not fp.close()
if Globals.fsync_directories: self.get_parent_rp().fsync()
def get_data(self):
"""Open file as a regular file, read data, close, return data"""
fp = self.open("rb")
s = fp.read()
assert not fp.close()
return s
class RPathFileHook:
"""Look like a file, but add closing hook"""
def __init__(self, file, closing_thunk):
self.file = file
self.closing_thunk = closing_thunk
def read(self, length = -1): return self.file.read(length)
def write(self, buf): return self.file.write(buf)
def close(self):
"""Close file and then run closing thunk"""
result = self.file.close()
self.closing_thunk()
return result
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Iterate exactly the requested files in a directory
Parses includes and excludes to yield correct files. More
documentation on what this code does can be found on the man page.
"""
from __future__ import generators
import re
import FilenameMapping, robust, rpath, Globals, log, rorpiter
class SelectError(Exception):
"""Some error dealing with the Select class"""
pass
class FilePrefixError(SelectError):
"""Signals that a specified file doesn't start with correct prefix"""
pass
class GlobbingError(SelectError):
"""Something has gone wrong when parsing a glob string"""
pass
class Select:
"""Iterate appropriate RPaths in given directory
This class acts as an iterator on account of its next() method.
Basically, it just goes through all the files in a directory in
order (depth-first) and subjects each file to a bunch of tests
(selection functions) in order. The first test that includes or
excludes the file means that the file gets included (iterated) or
excluded. The default is include, so with no tests we would just
iterate all the files in the directory in order.
The one complication to this is that sometimes we don't know
whether or not to include a directory until we examine its
contents. For instance, if we want to include all the **.py
files. If /home/ben/foo.py exists, we should also include /home
and /home/ben, but if these directories contain no **.py files,
they shouldn't be included. For this reason, a test may not
include or exclude a directory, but merely "scan" it. If later a
file in the directory gets included, so does the directory.
As mentioned above, each test takes the form of a selection
function. The selection function takes an rpath, and returns:
None - means the test has nothing to say about the related file
0 - the file is excluded by the test
1 - the file is included
2 - the test says the file (must be directory) should be scanned
Also, a selection function f has a variable f.exclude which should
be true iff f could potentially exclude some file. This is used
to signal an error if the last function only includes, which would
be redundant and presumably isn't what the user intends.
"""
# This re should not match normal filenames, but usually just globs
glob_re = re.compile("(.*[*?[]|ignorecase\\:)", re.I | re.S)
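# A sketch of the selection-function protocol described in the class
# docstring (hypothetical function, not part of the original source):
#
# def exclude_tmp_sf(rp):
#     if rp.index and rp.index[-1] == "tmp": return 0  # exclude
#     return None  # nothing to say about other files
# exclude_tmp_sf.exclude = 1
# exclude_tmp_sf.name = "Example function excluding files named 'tmp'"
#
# Registered with add_selection_func(), this would exclude any file
# named "tmp" while leaving every other decision to later functions.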
def __init__(self, rootrp):
"""Select initializer. rpath is the root directory"""
assert isinstance(rootrp, rpath.RPath)
self.selection_functions = []
self.rpath = rootrp
self.prefix = self.rpath.path
def set_iter(self, sel_func = None):
"""Initialize more variables, get ready to iterate
Selection function sel_func is called on each rpath and is
usually self.Select. Returns self just for convenience.
"""
if not sel_func: sel_func = self.Select
self.rpath.setdata() # this may have changed since Select init
self.iter = self.Iterate_fast(self.rpath, sel_func)
self.next = self.iter.next
self.__iter__ = lambda: self
return self
def Iterate_fast(self, rpath, sel_func):
"""Like Iterate, but don't recur, saving time"""
def error_handler(exc, filename):
log.ErrorLog.write_if_open("ListError",
rpath.index + (filename,), exc)
return None
def diryield(rpath):
"""Generate relevant files in directory rpath
Returns (rpath, num) where num == 0 means rpath should be
generated normally, num == 1 means the rpath is a directory
and should be included iff something inside is included.
"""
for filename in self.listdir(rpath):
new_rpath = robust.check_common_error(error_handler,
rpath.append, (filename,))
if new_rpath and new_rpath.lstat():
s = sel_func(new_rpath)
if s == 1: yield (new_rpath, 0)
elif s == 2 and new_rpath.isdir(): yield (new_rpath, 1)
yield rpath
if not rpath.isdir(): return
diryield_stack = [diryield(rpath)]
delayed_rp_stack = []
while diryield_stack:
try: rpath, val = diryield_stack[-1].next()
except StopIteration:
diryield_stack.pop()
if delayed_rp_stack: delayed_rp_stack.pop()
continue
if val == 0:
if delayed_rp_stack:
for delayed_rp in delayed_rp_stack: yield delayed_rp
del delayed_rp_stack[:]
yield rpath
if rpath.isdir(): diryield_stack.append(diryield(rpath))
elif val == 1:
delayed_rp_stack.append(rpath)
diryield_stack.append(diryield(rpath))
def Iterate(self, rpath, rec_func, sel_func):
"""Return iterator yielding rpaths in rpath
rec_func is usually the same as this function and is what
Iterate uses to find files in subdirectories. It is used in
iterate_starting_from.
sel_func is the selection function to use on the rpaths. It
is usually self.Select.
"""
s = sel_func(rpath)
if s == 0: return
elif s == 1: # File is included
yield rpath
if rpath.isdir():
for rp in self.iterate_in_dir(rpath, rec_func, sel_func):
yield rp
elif s == 2:
if rpath.isdir(): # Directory is merely scanned
iid = self.iterate_in_dir(rpath, rec_func, sel_func)
try: first = iid.next()
except StopIteration: return # no files inside; skip rp
yield rpath
yield first
for rp in iid: yield rp
else: assert 0, "Invalid selection result %s" % (str(s),)
def listdir(self, dir_rp):
"""List directory rpath with error logging"""
def error_handler(exc):
log.ErrorLog.write_if_open("ListError", dir_rp, exc)
return []
dir_listing = robust.check_common_error(error_handler, dir_rp.listdir)
dir_listing.sort()
return dir_listing
def iterate_in_dir(self, rpath, rec_func, sel_func):
"""Iterate the rpaths in directory rpath."""
def error_handler(exc, filename):
log.ErrorLog.write_if_open("ListError",
rpath.index + (filename,), exc)
return None
for filename in self.listdir(rpath):
new_rp = robust.check_common_error(
error_handler, rpath.append, [filename])
if new_rp:
for rp in rec_func(new_rp, rec_func, sel_func):
yield rp
def FilterIter(self, rorp_iter):
"""Filter rorp_iter using Select below, removing excluded rorps"""
def getrpiter(rorp_iter):
"""Return rp iter by adding indicies of rorp_iter to self.rpath"""
for rorp in rorp_iter:
yield rpath.RPath(self.rpath.conn, self.rpath.base,
rorp.index, rorp.data)
ITR = rorpiter.IterTreeReducer(FilterIterITRB, [self])
for rp in getrpiter(rorp_iter): ITR(rp.index, rp)
ITR.Finish()
def Select(self, rp):
"""Run through the selection functions and return dominant val 0/1/2"""
for sf in self.selection_functions:
result = sf(rp)
if result is not None: return result
return 1
def ParseArgs(self, argtuples, filelists):
"""Create selection functions based on list of tuples
The tuples have the form (option string, additional argument)
and are created when the initial commandline arguments are
read. The reason for the extra level of processing is that
the filelists may only be openable by the main connection, but
the selection functions need to be on the backup reader or
writer side. When the initial arguments are parsed the right
information is sent over the link.
"""
filelists_index = 0
try:
for opt, arg in argtuples:
if opt == "--exclude":
self.add_selection_func(self.glob_get_sf(arg, 0))
elif opt == "--exclude-device-files":
self.add_selection_func(self.devfiles_get_sf(0))
elif opt == "--exclude-filelist":
self.add_selection_func(self.filelist_get_sf(
filelists[filelists_index], 0, arg))
filelists_index += 1
elif opt == "--exclude-globbing-filelist":
map(self.add_selection_func,
self.filelist_globbing_get_sfs(
filelists[filelists_index], 0, arg))
filelists_index += 1
elif opt == "--exclude-other-filesystems":
self.add_selection_func(self.other_filesystems_get_sf(0))
elif opt == "--exclude-regexp":
self.add_selection_func(self.regexp_get_sf(arg, 0))
elif opt == "--exclude-special-files":
self.add_selection_func(self.special_get_sf(0))
elif opt == "--include":
self.add_selection_func(self.glob_get_sf(arg, 1))
elif opt == "--include-filelist":
self.add_selection_func(self.filelist_get_sf(
filelists[filelists_index], 1, arg))
filelists_index += 1
elif opt == "--include-globbing-filelist":
map(self.add_selection_func,
self.filelist_globbing_get_sfs(
filelists[filelists_index], 1, arg))
filelists_index += 1
elif opt == "--include-regexp":
self.add_selection_func(self.regexp_get_sf(arg, 1))
else: assert 0, "Bad selection option %s" % opt
except SelectError, e: self.parse_catch_error(e)
assert filelists_index == len(filelists)
self.parse_last_excludes()
self.parse_rbdir_exclude()
def parse_catch_error(self, exc):
"""Deal with selection error exc"""
if isinstance(exc, FilePrefixError):
log.Log.FatalError(
"""Fatal Error: The file specification
' %s'
cannot match any files in the base directory
' %s'
Useful file specifications begin with the base directory or some
pattern (such as '**') which matches the base directory.""" %
(exc, self.prefix))
elif isinstance(exc, GlobbingError):
log.Log.FatalError("Fatal Error while processing expression\n"
"%s" % exc)
else: raise
def parse_rbdir_exclude(self):
"""Add exclusion of rdiff-backup-data dir to front of list"""
self.add_selection_func(
self.glob_get_tuple_sf(("rdiff-backup-data",), 0), 1)
def parse_last_excludes(self):
"""Exit with error if last selection function isn't an exclude"""
if (self.selection_functions and
not self.selection_functions[-1].exclude):
log.Log.FatalError(
"""Last selection expression:
%s
only specifies that files be included. Because the default is to
include all files, the expression is redundant. Exiting because this
probably isn't what you meant.""" %
(self.selection_functions[-1].name,))
def add_selection_func(self, sel_func, add_to_start = None):
"""Add another selection function at the end or beginning"""
if add_to_start: self.selection_functions.insert(0, sel_func)
else: self.selection_functions.append(sel_func)
def filelist_get_sf(self, filelist_fp, inc_default, filelist_name):
"""Return selection function by reading list of files
The format of the filelist is documented in the man page.
filelist_fp should be an (open) file object.
inc_default should be true if this is an include list,
false for an exclude list.
filelist_name is just a string used for logging.
"""
log.Log("Reading filelist %s" % filelist_name, 4)
tuple_list, something_excluded = \
self.filelist_read(filelist_fp, inc_default, filelist_name)
log.Log("Sorting filelist %s" % filelist_name, 4)
tuple_list.sort()
i = [0] # Box the index in a list: nested functions can't rebind outer locals in Python 2
def selection_function(rp):
while 1:
if i[0] >= len(tuple_list): return None
include, move_on = \
self.filelist_pair_match(rp, tuple_list[i[0]])
if move_on:
i[0] += 1
if include is None: continue # later line may match
return include
selection_function.exclude = something_excluded or inc_default == 0
selection_function.name = "Filelist: " + filelist_name
return selection_function
def filelist_read(self, filelist_fp, include, filelist_name):
"""Read filelist from fp, return (tuplelist, something_excluded)"""
prefix_warnings = [0]
def incr_warnings(exc):
"""Warn if prefix is incorrect"""
prefix_warnings[0] += 1
if prefix_warnings[0] < 6:
log.Log("Warning: file specification '%s' in filelist %s\n"
"doesn't start with correct prefix %s. Ignoring." %
(exc, filelist_name, self.prefix), 2)
if prefix_warnings[0] == 5:
log.Log("Future prefix errors will not be logged.", 2)
something_excluded, tuple_list = None, []
separator = Globals.null_separator and "\0" or "\n"
for line in filelist_fp.read().split(separator):
if not line: continue # skip blanks
try: tuple = self.filelist_parse_line(line, include)
except FilePrefixError, exc:
incr_warnings(exc)
continue
tuple_list.append(tuple)
if not tuple[1]: something_excluded = 1
if filelist_fp.close():
log.Log("Error closing filelist %s" % filelist_name, 2)
return (tuple_list, something_excluded)
def filelist_parse_line(self, line, include):
"""Parse a single line of a filelist, returning a pair
pair will be of form (index, include), where index is another
tuple, and include is 1 if the line specifies that we are
including a file. The default is given as an argument.
self.prefix is the string that the index is taken relative to.
"""
if line[:2] == "+ ": # Check for "+ "/"- " syntax
include = 1
line = line[2:]
elif line[:2] == "- ":
include = 0
line = line[2:]
if not line.startswith(self.prefix): raise FilePrefixError(line)
line = line[len(self.prefix):] # Discard prefix
index = tuple(filter(lambda x: x, line.split("/"))) # remove empties
return (index, include)
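# Illustrative parse mirroring the steps above; the prefix and paths are
# made-up examples. A "+ "/"- " marker overrides the list's default, the
# prefix is stripped, and empty path components disappear.
def _demo_parse_line(line, include, prefix = "/usr"):
    if line[:2] == "+ ": include, line = 1, line[2:]
    elif line[:2] == "- ": include, line = 0, line[2:]
    assert line.startswith(prefix)
    index = tuple(filter(lambda x: x, line[len(prefix):].split("/")))
    return (index, include)
assert _demo_parse_line("+ /usr/local/bin", 0) == (("local", "bin"), 1)
assert _demo_parse_line("/usr//local/", 1) == (("local",), 1)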
def filelist_pair_match(self, rp, pair):
"""Matches a filelist tuple against a rpath
Returns a pair (include, move_on). include is None if the
tuple doesn't match either way, and 0/1 if the tuple excludes
or includes the rpath.
move_on is true if the tuple cannot match a later index, and
so we should move on to the next tuple in the list.
"""
index, include = pair
if include == 1:
if index < rp.index: return (None, 1)
if index == rp.index: return (1, 1)
elif index[:len(rp.index)] == rp.index:
return (1, None) # /foo/bar implicitly includes /foo
else: return (None, None) # rp greater, not initial sequence
elif include == 0:
if rp.index[:len(index)] == index:
return (0, None) # /foo implicitly excludes /foo/bar
elif index < rp.index: return (None, 1)
else: return (None, None) # rp greater, not initial sequence
else: assert 0, "Include is %s, should be 0 or 1" % (include,)
def filelist_globbing_get_sfs(self, filelist_fp, inc_default, list_name):
"""Return list of selection functions by reading fileobj
filelist_fp should be an open file object
inc_default is true iff this is an include list
list_name is just the name of the list, used for logging
See the man page on --[include/exclude]-globbing-filelist
"""
log.Log("Reading globbing filelist %s" % list_name, 4)
separator = Globals.null_separator and "\0" or "\n"
for line in filelist_fp.read().split(separator):
if not line: continue # skip blanks
if line[:2] == "+ ": yield self.glob_get_sf(line[2:], 1)
elif line[:2] == "- ": yield self.glob_get_sf(line[2:], 0)
else: yield self.glob_get_sf(line, inc_default)
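# A sketch of how the three line forms above map to (glob, include)
# pairs before glob_get_sf is called; the globs are invented examples.
def _demo_glob_line(line, inc_default):
    if line[:2] == "+ ": return (line[2:], 1)
    elif line[:2] == "- ": return (line[2:], 0)
    else: return (line, inc_default)
assert _demo_glob_line("+ /home/ben", 0) == ("/home/ben", 1)
assert _demo_glob_line("- /home/*/.cache", 1) == ("/home/*/.cache", 0)
assert _demo_glob_line("/tmp/**", 0) == ("/tmp/**", 0)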
def other_filesystems_get_sf(self, include):
"""Return selection function matching files on other filesystems"""
assert include == 0 or include == 1
root_devloc = self.rpath.getdevloc()
def sel_func(rp):
if rp.getdevloc() == root_devloc: return None
else: return include
sel_func.exclude = not include
sel_func.name = "Match other filesystems"
return sel_func
def regexp_get_sf(self, regexp_string, include):
"""Return selection function given by regexp_string"""
assert include == 0 or include == 1
try: regexp = re.compile(regexp_string)
except:
log.Log("Error compiling regular expression %s" % regexp_string, 1)
raise
def sel_func(rp):
if regexp.search(rp.path): return include
else: return None
sel_func.exclude = not include
sel_func.name = "Regular expression: %s" % regexp_string
return sel_func
def devfiles_get_sf(self, include):
"""Return a selection function matching all dev files"""
if self.selection_functions:
log.Log("Warning: exclude-device-files is not the first "
"selector.\nThis may not be what you intended", 3)
def sel_func(rp):
if rp.isdev(): return include
else: return None
sel_func.exclude = not include
sel_func.name = (include and "include" or "exclude") + " device files"
return sel_func
def special_get_sf(self, include):
"""Return sel function matching sockets, symlinks, sockets, devs"""
if self.selection_functions:
log.Log("Warning: exclude-special-files is not the first "
"selector.\nThis may not be what you intended", 3)
def sel_func(rp):
if rp.issym() or rp.issock() or rp.isfifo() or rp.isdev():
return include
else: return None
sel_func.exclude = not include
sel_func.name = (include and "include" or "exclude") + " special files"
return sel_func
def glob_get_sf(self, glob_str, include):
"""Return selection function given by glob string"""
assert include == 0 or include == 1
if glob_str == "**": sel_func = lambda rp: include
elif not self.glob_re.match(glob_str): # normal file
sel_func = self.glob_get_filename_sf(glob_str, include)
else: sel_func = self.glob_get_normal_sf(glob_str, include)
sel_func.exclude = not include
sel_func.name = "Command-line %s glob: %s" % \
(include and "include" or "exclude", glob_str)
return sel_func
def glob_get_filename_sf(self, filename, include):
"""Get a selection function given a normal filename
Some of the parsing is better explained in
filelist_parse_line. The reason this is split from normal
globbing is things are a lot less complicated if no special
globbing characters are used.
"""
if not filename.startswith(self.prefix):
raise FilePrefixError(filename)
index = tuple(filter(lambda x: x,
filename[len(self.prefix):].split("/")))
return self.glob_get_tuple_sf(index, include)
def glob_get_tuple_sf(self, tuple, include):
"""Return selection function based on tuple"""
def include_sel_func(rp):
if (rp.index == tuple[:len(rp.index)] or
rp.index[:len(tuple)] == tuple):
return 1 # /foo/bar implicitly matches /foo, vice-versa
else: return None
def exclude_sel_func(rp):
if rp.index[:len(tuple)] == tuple:
return 0 # /foo excludes /foo/bar, not vice-versa
else: return None
if include == 1: sel_func = include_sel_func
elif include == 0: sel_func = exclude_sel_func
sel_func.exclude = not include
sel_func.name = "Tuple select %s" % (tuple,)
return sel_func
def glob_get_normal_sf(self, glob_str, include):
"""Return selection function based on glob_str
The basic idea is to turn glob_str into a regular expression,
and just use the normal regular expression. There is a
complication because the selection function should return '2'
(scan) for directories which may contain a file which matches
the glob_str. So we break up the glob string into parts, and
any file which matches an initial sequence of glob parts gets
scanned.
Thanks to Donovan Baarda who provided some code which did some
things similar to this.
"""
if glob_str.lower().startswith("ignorecase:"):
re_comp = lambda r: re.compile(r, re.I | re.S)
glob_str = glob_str[len("ignorecase:"):]
else: re_comp = lambda r: re.compile(r, re.S)
# matches what glob matches and any files in directory
glob_comp_re = re_comp("^%s($|/)" % self.glob_to_re(glob_str))
if glob_str.find("**") != -1:
glob_str = glob_str[:glob_str.find("**")+2] # truncate after **
scan_comp_re = re_comp("^(%s)$" %
"|".join(self.glob_get_prefix_res(glob_str)))
def include_sel_func(rp):
if glob_comp_re.match(rp.path): return 1
elif scan_comp_re.match(rp.path): return 2
else: return None
def exclude_sel_func(rp):
if glob_comp_re.match(rp.path): return 0
else: return None
# Check to make sure prefix is ok
if not include_sel_func(self.rpath): raise FilePrefixError(glob_str)
if include: return include_sel_func
else: return exclude_sel_func
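# A standalone illustration of the 0/1/2 logic above for the invented
# glob '/usr/*/bin'. The first regexp is the full glob translation; the
# second is built from the glob's prefixes, so parent directories that
# might contain a match get a 2 (scan).
import re
_demo_glob_re = re.compile(r"^/usr/[^/]*/bin($|/)")
_demo_scan_re = re.compile(r"^(/|/usr|/usr/[^/]*)$")
def _demo_glob_sel(path):
    if _demo_glob_re.match(path): return 1
    elif _demo_scan_re.match(path): return 2
    else: return None
assert _demo_glob_sel("/usr/local/bin") == 1
assert _demo_glob_sel("/usr/local/bin/python") == 1  # contents match too
assert _demo_glob_sel("/usr/local") == 2             # may contain a match
assert _demo_glob_sel("/etc") is None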
def glob_get_prefix_res(self, glob_str):
"""Return list of regexps equivalent to prefixes of glob_str"""
glob_parts = glob_str.split("/")
if "" in glob_parts[1:-1]: # "" OK if comes first or last, as in /foo/
raise GlobbingError("Consecutive '/'s found in globbing string "
+ glob_str)
prefixes = map(lambda i: "/".join(glob_parts[:i+1]),
range(len(glob_parts)))
# we must make an exception for the root "/", the only dir to end in a slash
if prefixes[0] == "": prefixes[0] = "/"
return map(self.glob_to_re, prefixes)
def glob_to_re(self, pat):
"""Returned regular expression equivalent to shell glob pat
Currently only the ?, *, [], and ** expressions are supported.
Ranges like [a-z] are also currently unsupported. There is no
way to quote these special characters.
This function taken with minor modifications from efnmatch.py
by Donovan Baarda.
"""
i, n, res = 0, len(pat), ''
while i < n:
c, s = pat[i], pat[i:i+2]
i = i+1
if s == '**':
res = res + '.*'
i = i + 1
elif c == '*': res = res + '[^/]*'
elif c == '?': res = res + '[^/]'
elif c == '[':
j = i
if j < n and pat[j] in '!^': j = j+1
if j < n and pat[j] == ']': j = j+1
while j < n and pat[j] != ']': j = j+1
if j >= n: res = res + '\\[' # interpret the [ literally
else: # Deal with inside of [..]
stuff = pat[i:j].replace('\\','\\\\')
i = j+1
if stuff[0] in '!^': stuff = '^' + stuff[1:]
res = res + '[' + stuff + ']'
else: res = res + re.escape(c)
return res
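# Spot checks of the translation above, using invented globs: '*' and
# '?' refuse to cross '/', '**' matches anything, bracket expressions
# pass through, and other characters are re.escape'd.
import re
_star = re.compile(r"^[^/]*\.py$")          # from glob '*.py'
assert _star.match("foo.py") and not _star.match("dir/foo.py")
assert re.match(r"^.*$", "a/b/c")           # from glob '**'
_qmark = re.compile(r"^a[^/]c$")            # from glob 'a?c'
assert _qmark.match("abc") and not _qmark.match("a/c")
_brack = re.compile(r"^[^abc]$")            # from glob '[!abc]'
assert _brack.match("d") and not _brack.match("a")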
class FilterIter:
"""Filter rorp_iter using a Select object, removing excluded rorps"""
def __init__(self, select, rorp_iter):
"""Constructor
Input is the Select object to use and the iter of rorps to be
filtered. The rorps will be converted to rps using the Select
base.
"""
self.rorp_iter = rorp_iter
self.base_rp = select.rpath
self.stored_rorps = []
self.ITR = rorpiter.IterTreeReducer(FilterIterITRB,
[select.Select, self.stored_rorps])
self.itr_finished = 0
def __iter__(self): return self
def next(self):
"""Return next object, or StopIteration"""
while not self.stored_rorps:
try: next_rorp = self.rorp_iter.next()
except StopIteration:
if self.itr_finished: raise
else:
self.ITR.Finish()
self.itr_finished = 1
else:
next_rp = rpath.RPath(self.base_rp.conn, self.base_rp.base,
next_rorp.index, next_rorp.data)
self.ITR(next_rorp.index, next_rp, next_rorp)
return self.stored_rorps.pop(0)
class FilterIterITRB(rorpiter.ITRBranch):
"""ITRBranch used in above FilterIter class
This is necessary because for directories we sometimes don't
know whether a rorp is excluded until we see what is in the
directory.
"""
def __init__(self, select, rorp_cache):
"""Initialize FilterIterITRB. Called by IterTreeReducer.
select should be the relevant Select object used to test the
rps. rorp_cache is the list rps should be appended to if they
aren't excluded.
"""
self.select, self.rorp_cache = select, rorp_cache
self.branch_excluded = None
self.base_queue = None # holds branch base while examining contents
def can_fast_process(self, index, next_rp, next_rorp):
return not next_rp.isdir()
def fast_process(self, index, next_rp, next_rorp):
"""For ordinary files, just append if select is positive"""
if self.branch_excluded: return
s = self.select(next_rp)
if s == 1:
if self.base_queue:
self.rorp_cache.append(self.base_queue)
self.base_queue = None
self.rorp_cache.append(next_rorp)
else: assert s == 0, "Unexpected select value %s" % (s,)
def start_process(self, index, next_rp, next_rorp):
s = self.select(next_rp)
if s == 0: self.branch_excluded = 1
elif s == 1: self.rorp_cache.append(next_rorp)
else:
assert s == 2, s
self.base_queue = next_rorp
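# A sketch of the deferral idea above with plain index tuples instead of
# rorps and a callable select (both invented): a directory scored 2
# (scan) is queued and only emitted once a descendant is included.
def _demo_filter(indexes, select):
    cache, queue = [], None
    for index in indexes:
        s = select(index)
        if s == 2: queue = index
        elif s == 1:
            if queue and index[:len(queue)] == queue:
                cache.append(queue)
                queue = None
            cache.append(index)
    return cache
def _demo_sel_func(index):
    if index == ("d",): return 2        # don't know yet: scan
    if index[-1] == "skip": return 0
    return 1
assert _demo_filter([("d",), ("d", "skip")], _demo_sel_func) == []
assert _demo_filter([("d",), ("d", "f")], _demo_sel_func) == \
    [("d",), ("d", "f")]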
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""MakeStatic and MakeClass
These functions are used to make all the instance methods in a class
into static or class methods.
"""
class StaticMethodsError(Exception): pass
def MakeStatic(cls):
"""turn instance methods into static ones
The methods (that don't begin with _) of any class that
subclasses this will be turned into static methods.
"""
for name in dir(cls):
if name[0] != "_":
cls.__dict__[name] = staticmethod(cls.__dict__[name])
def MakeClass(cls):
"""Turn instance methods into classmethods. Ignore _ like above"""
for name in dir(cls):
if name[0] != "_":
cls.__dict__[name] = classmethod(cls.__dict__[name])
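# A hedged usage example: the FileStats class later in this file follows
# exactly this pattern (plain methods taking cls, then MakeClass).
class _DemoCounter:
    _count = 0
    def incr(cls): cls._count = cls._count + 1
    def get(cls): return cls._count
MakeClass(_DemoCounter)
_DemoCounter.incr()
_DemoCounter.incr()
assert _DemoCounter.get() == 2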
# Copyright 2002 Ben Escoto
#
# This file is part of rdiff-backup.
#
# rdiff-backup is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 2 of the License, or (at your
# option) any later version.
#
# rdiff-backup is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with rdiff-backup; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA
"""Generate and process aggregated backup information"""
import re, os, time
import Globals, Time, increment, log, static
class StatsException(Exception): pass
class StatsObj:
"""Contains various statistics, provide string conversion functions"""
# used when quoting files in get_stats_line
space_regex = re.compile(" ")
stat_file_attrs = ('SourceFiles', 'SourceFileSize',
'MirrorFiles', 'MirrorFileSize',
'NewFiles', 'NewFileSize',
'DeletedFiles', 'DeletedFileSize',
'ChangedFiles',
'ChangedSourceSize', 'ChangedMirrorSize',
'IncrementFiles', 'IncrementFileSize')
stat_misc_attrs = ('Errors', 'TotalDestinationSizeChange')
stat_time_attrs = ('StartTime', 'EndTime', 'ElapsedTime')
stat_attrs = (('Filename',) + stat_time_attrs +
stat_misc_attrs + stat_file_attrs)
# Below, the second value in each pair is true iff the value
# indicates a number of bytes
stat_file_pairs = (('SourceFiles', None), ('SourceFileSize', 1),
('MirrorFiles', None), ('MirrorFileSize', 1),
('NewFiles', None), ('NewFileSize', 1),
('DeletedFiles', None), ('DeletedFileSize', 1),
('ChangedFiles', None),
('ChangedSourceSize', 1), ('ChangedMirrorSize', 1),
('IncrementFiles', None), ('IncrementFileSize', 1))
# This is used in get_byte_summary_string below
byte_abbrev_list = ((1024*1024*1024*1024, "TB"),
(1024*1024*1024, "GB"),
(1024*1024, "MB"),
(1024, "KB"))
def __init__(self):
"""Set attributes to None"""
for attr in self.stat_attrs: self.__dict__[attr] = None
def get_stat(self, attribute):
"""Get a statistic"""
return self.__dict__[attribute]
def set_stat(self, attr, value):
"""Set attribute to given value"""
self.__dict__[attr] = value
def increment_stat(self, attr):
"""Add 1 to value of attribute"""
self.__dict__[attr] += 1
def add_to_stat(self, attr, value):
"""Add value to given attribute"""
self.__dict__[attr] += value
def get_total_dest_size_change(self):
"""Return total destination size change
This represents the total change in the size of the
rdiff-backup destination directory.
"""
addvals = [self.NewFileSize, self.ChangedSourceSize,
self.IncrementFileSize]
subtractvals = [self.DeletedFileSize, self.ChangedMirrorSize]
for val in addvals + subtractvals:
if val is None:
result = None
break
else:
def addlist(l): return reduce(lambda x,y: x+y, l)
result = addlist(addvals) - addlist(subtractvals)
self.TotalDestinationSizeChange = result
return result
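# Worked example with invented numbers: 100 bytes of new files plus 50
# bytes of changed source data plus 20 bytes of increments, minus 30
# bytes deleted and 40 bytes of replaced mirror data:
#     (100 + 50 + 20) - (30 + 40) = 100 bytes of net growth
assert (100 + 50 + 20) - (30 + 40) == 100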
def get_stats_line(self, index, use_repr = 1):
"""Return one line abbreviated version of full stats string"""
file_attrs = map(lambda attr: str(self.get_stat(attr)),
self.stat_file_attrs)
if not index: filename = "."
else:
filename = apply(os.path.join, index)
if use_repr:
# use repr to quote newlines in relative filename, then
# take off the leading and trailing quotes and quote spaces.
filename = self.space_regex.sub("\\x20", repr(filename)[1:-1])
return " ".join([filename,] + file_attrs)
def set_stats_from_line(self, line):
"""Set statistics from given line"""
def error(): raise StatsException("Bad line '%s'" % line)
if line[-1] == "\n": line = line[:-1]
lineparts = line.split(" ")
if len(lineparts) < len(self.stat_file_attrs): error()
for attr, val_string in zip(self.stat_file_attrs,
lineparts[-len(self.stat_file_attrs):]):
try: val = long(val_string)
except ValueError:
try: val = float(val_string)
except ValueError: error()
self.set_stat(attr, val)
return self
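# An invented line in the format produced by get_stats_line and read
# back by set_stats_from_line: the filename first, then the thirteen
# stat_file_attrs values in order.
_demo_line = ". 3 15 2 10 1 5 0 0 1 5 5 2 12"
_demo_parts = _demo_line.split(" ")
assert _demo_parts[0] == "."                  # index was empty
assert len(_demo_parts[1:]) == 13             # SourceFiles..IncrementFileSize
assert long(_demo_parts[1]) == 3              # SourceFiles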
def get_stats_string(self):
"""Return extended string printing out statistics"""
return "%s%s%s" % (self.get_timestats_string(),
self.get_filestats_string(),
self.get_miscstats_string())
def get_timestats_string(self):
"""Return portion of statistics string dealing with time"""
timelist = []
if self.StartTime is not None:
timelist.append("StartTime %.2f (%s)\n" %
(self.StartTime, Time.timetopretty(self.StartTime)))
if self.EndTime is not None:
timelist.append("EndTime %.2f (%s)\n" %
(self.EndTime, Time.timetopretty(self.EndTime)))
if self.ElapsedTime or (self.StartTime is not None and
self.EndTime is not None):
if self.ElapsedTime is None:
self.ElapsedTime = self.EndTime - self.StartTime
timelist.append("ElapsedTime %.2f (%s)\n" %
(self.ElapsedTime, Time.inttopretty(self.ElapsedTime)))
return "".join(timelist)
def get_filestats_string(self):
"""Return portion of statistics string about files and bytes"""
def fileline(stat_file_pair):
"""Return zero or one line of the string"""
attr, in_bytes = stat_file_pair
val = self.get_stat(attr)
if val is None: return ""
if in_bytes:
return "%s %s (%s)\n" % (attr, val,
self.get_byte_summary_string(val))
else: return "%s %s\n" % (attr, val)
return "".join(map(fileline, self.stat_file_pairs))
def get_miscstats_string(self):
"""Return portion of extended stat string about misc attributes"""
misc_string = ""
tdsc = self.get_total_dest_size_change()
if tdsc is not None:
misc_string += ("TotalDestinationSizeChange %s (%s)\n" %
(tdsc, self.get_byte_summary_string(tdsc)))
if self.Errors is not None: misc_string += "Errors %d\n" % self.Errors
return misc_string
def get_byte_summary_string(self, byte_count):
"""Turn byte count into human readable string like "7.23GB" """
if byte_count < 0:
sign = "-"
byte_count = -byte_count
else: sign = ""
for abbrev_bytes, abbrev_string in self.byte_abbrev_list:
if byte_count >= abbrev_bytes:
# Now get 3 significant figures
abbrev_count = float(byte_count)/abbrev_bytes
if abbrev_count >= 100: precision = 0
elif abbrev_count >= 10: precision = 1
else: precision = 2
return "%s%%.%df %s" % (sign, precision, abbrev_string) \
% (abbrev_count,)
byte_count = round(byte_count)
if byte_count == 1: return sign + "1 byte"
else: return "%s%d bytes" % (sign, byte_count)
def get_stats_logstring(self, title):
"""Like get_stats_string, but add header and footer"""
header = "--------------[ %s ]--------------" % title
footer = "-" * len(header)
return "%s\n%s%s\n" % (header, self.get_stats_string(), footer)
def set_stats_from_string(self, s):
"""Initialize attributes from string, return self for convenience"""
def error(line): raise StatsException("Bad line '%s'" % line)
for line in s.split("\n"):
if not line: continue
line_parts = line.split()
if len(line_parts) < 2: error(line)
attr, value_string = line_parts[:2]
if not attr in self.stat_attrs: error(line)
try:
try: val1 = long(value_string)
except ValueError: val1 = None
val2 = float(value_string)
if val1 == val2: self.set_stat(attr, val1) # use integer val
else: self.set_stat(attr, val2) # use float
except ValueError: error(line)
return self
def write_stats_to_rp(self, rp):
"""Write statistics string to given rpath"""
fp = rp.open("wb")
fp.write(self.get_stats_string())
assert not fp.close()
def read_stats_from_rp(self, rp):
"""Set statistics from rpath, return self for convenience"""
fp = rp.open("r")
self.set_stats_from_string(fp.read())
fp.close()
return self
def stats_equal(self, s):
"""Return true if s has same statistics as self"""
assert isinstance(s, StatsObj)
for attr in self.stat_file_attrs:
if self.get_stat(attr) != s.get_stat(attr): return None
return 1
def set_to_average(self, statobj_list):
"""Set self's attributes to average of those in statobj_list"""
for attr in self.stat_attrs: self.set_stat(attr, 0)
for statobj in statobj_list:
for attr in self.stat_attrs:
if statobj.get_stat(attr) is None: self.set_stat(attr, None)
elif self.get_stat(attr) is not None:
self.set_stat(attr, statobj.get_stat(attr) +
self.get_stat(attr))
# Don't compute average starting/stopping time
self.StartTime = None
self.EndTime = None
for attr in self.stat_attrs:
if self.get_stat(attr) is not None:
self.set_stat(attr,
self.get_stat(attr)/float(len(statobj_list)))
return self
def get_statsobj_copy(self):
"""Return new StatsObj object with same stats as self"""
s = StatsObj()
for attr in self.stat_attrs: s.set_stat(attr, self.get_stat(attr))
return s
class StatFileObj(StatsObj):
"""Build on StatsObj, add functions for processing files"""
def __init__(self, start_time = None):
"""StatFileObj initializer - zero out file attributes"""
StatsObj.__init__(self)
for attr in self.stat_file_attrs: self.set_stat(attr, 0)
if start_time is None: start_time = Time.curtime
self.StartTime = start_time
self.Errors = 0
def add_source_file(self, src_rorp):
"""Add stats of source file"""
self.SourceFiles += 1
if src_rorp.isreg(): self.SourceFileSize += src_rorp.getsize()
def add_dest_file(self, dest_rorp):
"""Add stats of destination size"""
self.MirrorFiles += 1
if dest_rorp.isreg(): self.MirrorFileSize += dest_rorp.getsize()
def add_changed(self, src_rorp, dest_rorp):
"""Update stats when src_rorp changes to dest_rorp"""
if src_rorp and src_rorp.lstat() and dest_rorp and dest_rorp.lstat():
self.ChangedFiles += 1
if src_rorp.isreg(): self.ChangedSourceSize += src_rorp.getsize()
if dest_rorp.isreg(): self.ChangedMirrorSize += dest_rorp.getsize()
elif src_rorp and src_rorp.lstat():
self.NewFiles += 1
if src_rorp.isreg(): self.NewFileSize += src_rorp.getsize()
elif dest_rorp and dest_rorp.lstat():
self.DeletedFiles += 1
if dest_rorp.isreg(): self.DeletedFileSize += dest_rorp.getsize()
def add_increment(self, inc_rorp):
"""Update stats with increment rorp"""
self.IncrementFiles += 1
if inc_rorp.isreg(): self.IncrementFileSize += inc_rorp.getsize()
def add_error(self):
"""Increment error stat by 1"""
self.Errors += 1
def finish(self, end_time = None):
"""Record end time and set other stats"""
if end_time is None: end_time = time.time()
self.EndTime = end_time
_active_statfileobj = None
def init_statfileobj():
"""Return new stat file object, record as active stat object"""
global _active_statfileobj
assert not _active_statfileobj, _active_statfileobj
_active_statfileobj = StatFileObj()
return _active_statfileobj
def get_active_statfileobj():
"""Return active stat file object if it exists"""
if _active_statfileobj: return _active_statfileobj
else: return None
def record_error():
"""Record error on active statfileobj, if there is one"""
if _active_statfileobj: _active_statfileobj.add_error()
def process_increment(inc_rorp):
"""Add statistics of increment rp incrp if there is active statfile"""
if _active_statfileobj: _active_statfileobj.add_increment(inc_rorp)
def write_active_statfileobj():
"""Write active StatFileObj object to session statistics file"""
global _active_statfileobj
assert _active_statfileobj
rp_base = Globals.rbdir.append("session_statistics")
session_stats_rp = increment.get_inc(rp_base, 'data', Time.curtime)
_active_statfileobj.finish()
_active_statfileobj.write_stats_to_rp(session_stats_rp)
_active_statfileobj = None
def print_active_stats():
"""Print statistics of active statobj to stdout and log"""
global _active_statfileobj
assert _active_statfileobj
_active_statfileobj.finish()
statmsg = _active_statfileobj.get_stats_logstring("Session statistics")
log.Log.log_to_file(statmsg)
Globals.client_conn.sys.stdout.write(statmsg)
class FileStats:
"""Keep track of less detailed stats on file-by-file basis"""
_fileobj, _rp = None, None
_line_sep = None
def init(cls):
"""Open file stats object and prepare to write"""
assert not (cls._fileobj or cls._rp), (cls._fileobj, cls._rp)
rpbase = Globals.rbdir.append("file_statistics")
suffix = Globals.compression and 'data.gz' or 'data'
cls._rp = increment.get_inc(rpbase, suffix, Time.curtime)
assert not cls._rp.lstat()
cls._fileobj = cls._rp.open("wb", compress = Globals.compression)
cls._line_sep = Globals.null_separator and '\0' or '\n'
cls.write_docstring()
cls.line_buffer = []
def write_docstring(cls):
"""Write the first line (a documentation string) into file"""
cls._fileobj.write("# Format of each line in file statistics file:")
cls._fileobj.write(cls._line_sep)
cls._fileobj.write("# Filename Changed SourceSize MirrorSize "
"IncrementSize" + cls._line_sep)
def update(cls, source_rorp, dest_rorp, changed, inc):
"""Update file stats with given information"""
if source_rorp: filename = source_rorp.get_indexpath()
else: filename = dest_rorp.get_indexpath()
size_list = map(cls.get_size, [source_rorp, dest_rorp, inc])
line = " ".join([filename, str(changed)] + size_list)
cls.line_buffer.append(line)
if len(cls.line_buffer) >= 100: cls.write_buffer()
def get_size(cls, rorp):
"""Return the size of rorp as string, or "NA" if not a regular file"""
if not rorp: return "NA"
if rorp.isreg(): return str(rorp.getsize())
else: return "0"
def write_buffer(cls):
"""Write buffer to file because buffer is full
The buffer part is necessary because the GzipFile.write()
method seems fairly slow.
"""
assert cls.line_buffer and cls._fileobj
cls.line_buffer.append('') # have join add _line_sep to end also
cls._fileobj.write(cls._line_sep.join(cls.line_buffer))
cls.line_buffer = []
def close(cls):
"""Close file stats file"""
assert cls._fileobj, cls._fileobj
if cls.line_buffer: cls.write_buffer()
assert not cls._fileobj.close()
cls._fileobj = cls._rp = None
static.MakeClass(FileStats)
import unittest
from commontest import *
from rdiff_backup import FilenameMapping
class FilenameMappingTest(unittest.TestCase):
"""Test the FilenameMapping class, for quoting filenames"""
def setUp(self):
"""Just initialize quoting - assume windows mode"""
FilenameMapping.set_init_quote_vals()
def testBasicQuote(self):
"""Test basic quoting and unquoting"""
filenames = ["hello", "HeLLo", "EUOeu/EUOeu", ":", "::::EU", "/:/:"]
for filename in filenames:
quoted = FilenameMapping.quote(filename)
assert FilenameMapping.unquote(quoted) == filename, filename
def testQuotedRPath(self):
"""Test the QuotedRPath class"""
def testQuotedSepBase(self):
"""Test get_quoted_sep_base function"""
path = ("/usr/local/mirror_metadata"
".1969-12-31;08421;05833;05820-07;05800.data.gz")
qrp = FilenameMapping.get_quoted_sep_base(path)
assert qrp.base == "/usr/local", qrp.base
assert len(qrp.index) == 1, qrp.index
assert (qrp.index[0] ==
"mirror_metadata.1969-12-31T21:33:20-07:00.data.gz")
if __name__ == "__main__": unittest.main()
import unittest
from commontest import *
from rdiff_backup import Globals, SetConnections
class RemoteMirrorTest(unittest.TestCase):
"""Test mirroring"""
def setUp(self):
"""Start server"""
Log.setverbosity(3)
Globals.change_source_perms = 1
SetConnections.UpdateGlobal('checkpoint_interval', 3)
def testMirror(self):
"""Testing simple mirror"""
MirrorTest(None, None, ["testfiles/increment1"])
def testMirror2(self):
"""Test mirror with larger data set"""
MirrorTest(1, None, ['testfiles/increment1', 'testfiles/increment2',
'testfiles/increment3', 'testfiles/increment4'])
def testMirror3(self):
"""Local version of testMirror2"""
MirrorTest(1, 1, ['testfiles/increment1', 'testfiles/increment2',
'testfiles/increment3', 'testfiles/increment4'])
if __name__ == "__main__": unittest.main()
import sys, time
from commontest import *
from rdiff_backup import rpath, Globals
"""benchmark.py
When possible, use 'rdiff-backup' from the shell, which allows using
different versions of rdiff-backup by altering the PYTHONPATH. We
just use clock time, so this isn't exact at all.
"""
output_local = 1
output_desc = "testfiles/output"
new_pythonpath = None
def run_cmd(cmd):
"""Run the given cmd, return the amount of time it took"""
if new_pythonpath: full_cmd = "PYTHONPATH=%s %s" % (new_pythonpath, cmd)
else: full_cmd = cmd
print "Running command '%s'" % (full_cmd,)
t = time.time()
assert not os.system(full_cmd)
return time.time() - t
def create_many_files(dirname, s, count = 1000):
"""Create many short files in the dirname directory
There will be count files in the directory, and each file will
contain the string s.
"""
Myrm("testfiles/many_out")
dir_rp = rpath.RPath(Globals.local_connection, dirname)
dir_rp.mkdir()
for i in xrange(count):
rp = dir_rp.append(str(i))
fp = rp.open("wb")
fp.write(s)
assert not fp.close()
def create_nested(dirname, s, depth, branch_factor = 10):
"""Create many short files in branching directory"""
def write(rp):
fp = rp.open("wb")
fp.write(s)
assert not fp.close()
def helper(rp, depth):
rp.mkdir()
sub_rps = map(lambda i: rp.append(str(i)), range(branch_factor))
if depth == 1: map(write, sub_rps)
else: map(lambda rp: helper(rp, depth-1), sub_rps)
Myrm("testfiles/nested_out")
helper(rpath.RPath(Globals.local_connection, dirname), depth)
def benchmark(backup_cmd, restore_cmd, desc, update_func = None):
"""Print benchmark using backup_cmd and restore_cmd
If update_func is given, run it and then do backup a third time.
"""
print "Initially backing up %s: %ss" % (desc, run_cmd(backup_cmd))
print "Updating %s, no change: %ss" % (desc, run_cmd(backup_cmd))
if update_func:
update_func()
print "Updating %s, all changed: %ss" % (desc, run_cmd(backup_cmd))
Myrm("testfiles/rest_out")
print "Restoring %s to empty dir: %ss" % (desc, run_cmd(restore_cmd))
print "Restoring %s to unchanged dir: %ss" % (desc, run_cmd(restore_cmd))
def many_files():
"""Time backup and restore of 2000 files"""
count = 2000
create_many_files("testfiles/many_out", "a", count)
backup_cmd = "rdiff-backup testfiles/many_out " + output_desc
restore_cmd = "rdiff-backup --force -r now %s testfiles/rest_out" % \
(output_desc,)
update_func = lambda: create_many_files("testfiles/many_out", "e", count)
benchmark(backup_cmd, restore_cmd, "2000 1-byte files", update_func)
def many_files_rsync():
"""Test rsync benchmark"""
count = 2000
create_many_files("testfiles/many_out", "a", count)
rsync_command = ("rsync -e ssh -aH --delete testfiles/many_out " +
output_desc)
print "Initial rsync: %ss" % (run_cmd(rsync_command),)
print "rsync update: %ss" % (run_cmd(rsync_command),)
create_many_files("testfiles/many_out", "e", count)
print "Update changed rsync: %ss" % (run_cmd(rsync_command),)
def nested_files():
"""Time backup and restore of 10000 nested files"""
depth = 4
create_nested("testfiles/nested_out", "a", depth)
backup_cmd = "rdiff-backup testfiles/nested_out " + output_desc
restore_cmd = "rdiff-backup --force -r now %s testfiles/rest_out" % \
(output_desc,)
update_func = lambda: create_nested("testfiles/nested_out", "e", depth)
benchmark(backup_cmd, restore_cmd, "10000 1-byte nested files",
update_func)
def nested_files_rsync():
"""Test rsync on nested files"""
depth = 4
create_nested("testfiles/nested_out", "a", depth)
rsync_command = ("rsync -e ssh -aH --delete testfiles/nested_out " +
output_desc)
print "Initial rsync: %ss" % (run_cmd(rsync_command),)
print "rsync update: %ss" % (run_cmd(rsync_command),)
create_nested("testfiles/nested_out", "e", depth)
print "Update changed rsync: %ss" % (run_cmd(rsync_command),)
if len(sys.argv) < 2 or len(sys.argv) > 3:
print "Syntax: benchmark.py benchmark_func [output_description]"
print
print "Where output_description defaults to 'testfiles/output'."
print "Currently benchmark_func includes:"
print "'many_files', 'many_files_rsync', and, 'nested_files'."
sys.exit(1)
if len(sys.argv) == 3:
output_desc = sys.argv[2]
if ":" in output_desc: output_local = None
if output_local:
assert not rpath.RPath(Globals.local_connection, output_desc).lstat(), \
"Outfile file %s exists, try deleting it first" % (output_desc,)
if os.environ.has_key('BENCHMARKPYPATH'):
new_pythonpath = os.environ['BENCHMARKPYPATH']
function_name = sys.argv[1]
print "Running ", function_name
eval(sys.argv[1])()
#!/usr/bin/env python
"""Used to emulate a remote connection by changing directories.
If given an argument, will change to that directory, and then start
the server. Otherwise will start the server without a chdir.
"""
import os, sys
olddir = os.getcwd()
if len(sys.argv) > 1: os.chdir(sys.argv[1])
#PipeConnection(sys.stdin, sys.stdout).Server()
#os.system("/home/ben/prog/python/rdiff-backup/rdiff-backup --server")
#os.system("/home/ben/prog/rdiff-backup/server.py /home/ben/prog/python/rdiff-backup/src")
os.system("%s/server.py" % (olddir,))
#!/usr/bin/env python
"""Used to emulate a remote connection by changing directories.
Like chdir-wrapper, but this time run the actual 'rdiff-backup'
script instead of server.py.
"""
import os, sys
if len(sys.argv) > 1:
olddir = os.getcwd()
os.chdir(sys.argv[1])
#PipeConnection(sys.stdin, sys.stdout).Server()
os.system(os.path.join(olddir, "../rdiff-backup") + " --server")
"""commontest - Some functions and constants common to several test cases"""
import os, sys
from rdiff_backup.log import Log
from rdiff_backup.rpath import RPath
from rdiff_backup import Globals, Hardlink, SetConnections, Main, \
selection, lazy, Time, rpath
RBBin = "../rdiff-backup"
SourceDir = "../rdiff_backup"
AbsCurdir = os.getcwd() # Absolute path name of current directory
AbsTFdir = AbsCurdir+"/testfiles"
MiscDir = "../misc"
__no_execute__ = 1 # Keeps the actual rdiff-backup program from running
def Myrm(dirstring):
"""Run myrm on given directory string"""
root_rp = rpath.RPath(Globals.local_connection, dirstring)
for rp in selection.Select(root_rp).set_iter():
if rp.isdir(): rp.chmod(0700) # otherwise may not be able to remove
assert not os.system("rm -rf %s" % (dirstring,))
def Make():
"""Make sure the rdiff-backup script in the source dir is up-to-date"""
os.chdir(SourceDir)
os.system("python ./Make")
os.chdir(AbsCurdir)
def MakeOutputDir():
"""Initialize the testfiles/output directory"""
Myrm("testfiles/output")
rp = rpath.RPath(Globals.local_connection, "testfiles/output")
rp.mkdir()
return rp
def rdiff_backup(source_local, dest_local, src_dir, dest_dir,
current_time = None, extra_options = ""):
"""Run rdiff-backup with the given options
source_local and dest_local are boolean values. If either is
false, then rdiff-backup will be run pretending that src_dir and
dest_dir, respectively, are remote. The server process will be
run in directories test1 and test2/tmp respectively.
src_dir and dest_dir are the source and destination
(mirror) directories, relative to the testing directory.
If current time is true, add the --current-time option with the
given number of seconds.
extra_options are just added to the command line.
"""
if not source_local:
src_dir = ("cd test1; ../%s/rdiff-backup --server::../%s" %
(SourceDir, src_dir))
if not dest_local:
dest_dir = ("test2/tmp; ../../%s/rdiff-backup --server::../../%s" %
(SourceDir, dest_dir))
cmdargs = [RBBin, extra_options]
if not (source_local and dest_local): cmdargs.append("--remote-schema %s")
if current_time: cmdargs.append("--current-time %s" % current_time)
cmdargs.extend([src_dir, dest_dir])
cmdline = " ".join(cmdargs)
print "Executing: ", cmdline
assert not os.system(cmdline)
def cmd_schemas2rps(schema_list, remote_schema):
"""Input list of file descriptions and the remote schema, return rps
File descriptions should be strings of the form 'hostname.net::foo'
"""
return map(SetConnections.cmdpair2rp,
SetConnections.get_cmd_pairs(schema_list, remote_schema))
def InternalBackup(source_local, dest_local, src_dir, dest_dir,
current_time = None):
"""Backup src to dest internally
This is like rdiff_backup but instead of running a separate
rdiff-backup script, use the separate *.py files. This way the
script doesn't have to be rebuilt constantly, and stacktraces have
correct line/file references.
"""
Globals.current_time = current_time
#_reset_connections()
remote_schema = '%s'
if not source_local:
src_dir = "cd test1; python ../server.py ../%s::../%s" % \
(SourceDir, src_dir)
if not dest_local:
dest_dir = "cd test2/tmp; python ../../server.py ../../%s::../../%s" \
% (SourceDir, dest_dir)
rpin, rpout = cmd_schemas2rps([src_dir, dest_dir], remote_schema)
Main.misc_setup([rpin, rpout])
Main.Backup(rpin, rpout)
Main.cleanup()
def InternalMirror(source_local, dest_local, src_dir, dest_dir):
"""Mirror src to dest internally
like InternalBackup, but only mirror. Do this through
InternalBackup, but then delete rdiff-backup-data directory.
"""
# Save attributes of root to restore later
src_root = rpath.RPath(Globals.local_connection, src_dir)
dest_root = rpath.RPath(Globals.local_connection, dest_dir)
dest_rbdir = dest_root.append("rdiff-backup-data")
InternalBackup(source_local, dest_local, src_dir, dest_dir)
dest_root.setdata()
Myrm(dest_rbdir.path)
# Restore old attributes
rpath.copy_attribs(src_root, dest_root)
def InternalRestore(mirror_local, dest_local, mirror_dir, dest_dir, time):
"""Restore mirror_dir to dest_dir at given time
This will automatically find the increments.XXX.dir representing
the time specified. The mirror_dir and dest_dir are relative to
the testing directory and will be modified for remote trials.
"""
Main.force = 1
remote_schema = '%s'
#_reset_connections()
if not mirror_local:
mirror_dir = "cd test1; python ../server.py ../%s::../%s" % \
(SourceDir, mirror_dir)
if not dest_local:
dest_dir = "cd test2/tmp; python ../../server.py ../../%s::../../%s" \
% (SourceDir, dest_dir)
mirror_rp, dest_rp = cmd_schemas2rps([mirror_dir, dest_dir], remote_schema)
Main.misc_setup([mirror_rp, dest_rp])
inc = get_increment_rp(mirror_rp, time)
if inc: Main.Restore(inc, dest_rp)
else: # use alternate syntax
Main.restore_timestr = str(time)
Main.RestoreAsOf(mirror_rp, dest_rp)
Main.cleanup()
def get_increment_rp(mirror_rp, time):
"""Return increment rp matching time in seconds"""
data_rp = mirror_rp.append("rdiff-backup-data")
if not data_rp.isdir(): return None
for filename in data_rp.listdir():
rp = data_rp.append(filename)
if rp.isincfile() and rp.getincbase_str() == "increments":
if rp.getinctime() == time: return rp
return None # Couldn't find appropriate increment
def _reset_connections(src_rp, dest_rp):
"""Reset some global connection information"""
Globals.isbackup_reader = Globals.isbackup_writer = None
#Globals.connections = [Globals.local_connection]
#Globals.connection_dict = {0: Globals.local_connection}
SetConnections.UpdateGlobal('rbdir', None)
Main.misc_setup([src_rp, dest_rp])
def CompareRecursive(src_rp, dest_rp, compare_hardlinks = 1,
equality_func = None, exclude_rbdir = 1,
ignore_tmp_files = None, compare_ownership = 0):
"""Compare src_rp and dest_rp, which can be directories
This only compares file attributes, not the actual data. This
will overwrite the hardlink dictionaries if compare_hardlinks is
specified.
"""
if compare_hardlinks: reset_hardlink_dicts()
src_rp.setdata()
dest_rp.setdata()
Log("Comparing %s and %s, hardlinks %s" % (src_rp.path, dest_rp.path,
compare_hardlinks), 3)
src_select = selection.Select(src_rp)
dest_select = selection.Select(dest_rp)
if ignore_tmp_files:
# Ignoring temp files can be useful when we want to check the
# correctness of a backup which aborted in the middle. In
# these cases it is OK to have tmp files lying around.
src_select.add_selection_func(src_select.regexp_get_sf(
".*rdiff-backup.tmp.[^/]+$", 0))
dest_select.add_selection_func(dest_select.regexp_get_sf(
".*rdiff-backup.tmp.[^/]+$", 0))
if exclude_rbdir:
src_select.parse_rbdir_exclude()
dest_select.parse_rbdir_exclude()
else:
# include rdiff-backup-data/increments
src_select.add_selection_func(src_select.glob_get_tuple_sf(
('rdiff-backup-data', 'increments'), 1))
dest_select.add_selection_func(dest_select.glob_get_tuple_sf(
('rdiff-backup-data', 'increments'), 1))
# but exclude rdiff-backup-data
src_select.add_selection_func(src_select.glob_get_tuple_sf(
('rdiff-backup-data',), 0))
dest_select.add_selection_func(dest_select.glob_get_tuple_sf(
('rdiff-backup-data',), 0))
dsiter1, dsiter2 = src_select.set_iter(), dest_select.set_iter()
def hardlink_equal(src_rorp, dest_rorp):
if not src_rorp.equal_verbose(dest_rorp,
compare_ownership = compare_ownership):
return None
if Hardlink.rorp_eq(src_rorp, dest_rorp): return 1
Log("%s: %s" % (src_rorp.index, Hardlink.get_indicies(src_rorp, 1)), 3)
Log("%s: %s" % (dest_rorp.index,
Hardlink.get_indicies(dest_rorp, None)), 3)
return None
def rbdir_equal(src_rorp, dest_rorp):
"""Like hardlink_equal, but make allowances for data directories"""
if not src_rorp.index and not dest_rorp.index: return 1
if (src_rorp.index and src_rorp.index[0] == 'rdiff-backup-data' and
src_rorp.index == dest_rorp.index):
# Don't compare dirs - they don't carry significant info
if dest_rorp.isdir() and src_rorp.isdir(): return 1
if dest_rorp.isreg() and src_rorp.isreg():
# Don't compare gzipped files because it is apparently
# non-deterministic.
if dest_rorp.index[-1].endswith('gz'): return 1
# Don't compare .missing increments because they don't matter
if dest_rorp.index[-1].endswith('.missing'): return 1
if compare_hardlinks:
if Hardlink.rorp_eq(src_rorp, dest_rorp): return 1
elif src_rorp.equal_verbose(dest_rorp,
compare_ownership = compare_ownership):
return 1
Log("%s: %s" % (src_rorp.index, Hardlink.get_indicies(src_rorp, 1)), 3)
Log("%s: %s" % (dest_rorp.index,
Hardlink.get_indicies(dest_rorp, None)), 3)
return None
if equality_func: result = lazy.Iter.equal(dsiter1, dsiter2,
1, equality_func)
elif compare_hardlinks:
dsiter1 = Hardlink.add_rorp_iter(dsiter1, 1)
dsiter2 = Hardlink.add_rorp_iter(dsiter2, None)
if exclude_rbdir:
result = lazy.Iter.equal(dsiter1, dsiter2, 1, hardlink_equal)
else: result = lazy.Iter.equal(dsiter1, dsiter2, 1, rbdir_equal)
elif not exclude_rbdir:
result = lazy.Iter.equal(dsiter1, dsiter2, 1, rbdir_equal)
else: result = lazy.Iter.equal(dsiter1, dsiter2, 1,
lambda x, y: x.equal_verbose(y, compare_ownership = compare_ownership))
for i in dsiter1: pass # make sure all files processed anyway
for i in dsiter2: pass
return result
def reset_hardlink_dicts():
"""Clear the hardlink dictionaries"""
Hardlink._src_inode_indicies = {}
Hardlink._src_index_indicies = {}
Hardlink._dest_inode_indicies = {}
Hardlink._dest_index_indicies = {}
Hardlink._restore_index_path = {}
def BackupRestoreSeries(source_local, dest_local, list_of_dirnames,
compare_hardlinks = 1,
dest_dirname = "testfiles/output",
restore_dirname = "testfiles/rest_out",
compare_backups = 1):
"""Test backing up/restoring of a series of directories
The dirnames correspond to a single directory at different times.
After each backup, the dest dir will be compared. After the whole
set, each of the earlier directories will be recovered to the
restore_dirname and compared.
"""
Globals.set('preserve_hardlinks', compare_hardlinks)
time = 10000
dest_rp = rpath.RPath(Globals.local_connection, dest_dirname)
restore_rp = rpath.RPath(Globals.local_connection, restore_dirname)
Myrm(dest_dirname)
for dirname in list_of_dirnames:
src_rp = rpath.RPath(Globals.local_connection, dirname)
reset_hardlink_dicts()
_reset_connections(src_rp, dest_rp)
InternalBackup(source_local, dest_local, dirname, dest_dirname, time)
time += 10000
_reset_connections(src_rp, dest_rp)
if compare_backups:
assert CompareRecursive(src_rp, dest_rp, compare_hardlinks)
time = 10000
for dirname in list_of_dirnames[:-1]:
reset_hardlink_dicts()
Myrm(restore_dirname)
InternalRestore(dest_local, source_local, dest_dirname,
restore_dirname, time)
src_rp = rpath.RPath(Globals.local_connection, dirname)
assert CompareRecursive(src_rp, restore_rp)
# Restores should default to the newest backup made at or
# before the requested time, so bump one request past its
# exact backup time to test this.
if time == 20000: time = 21000
time += 10000
def MirrorTest(source_local, dest_local, list_of_dirnames,
compare_hardlinks = 1,
dest_dirname = "testfiles/output"):
"""Mirror each of list_of_dirnames, and compare after each"""
Globals.set('preserve_hardlinks', compare_hardlinks)
dest_rp = rpath.RPath(Globals.local_connection, dest_dirname)
old_force_val = Main.force
Main.force = 1
Myrm(dest_dirname)
for dirname in list_of_dirnames:
src_rp = rpath.RPath(Globals.local_connection, dirname)
reset_hardlink_dicts()
_reset_connections(src_rp, dest_rp)
InternalMirror(source_local, dest_local, dirname, dest_dirname)
_reset_connections(src_rp, dest_rp)
assert CompareRecursive(src_rp, dest_rp, compare_hardlinks)
Main.force = old_force_val
import unittest, types, tempfile, os, sys
from commontest import *
from rdiff_backup.connection import *
from rdiff_backup import Globals, rpath
class LocalConnectionTest(unittest.TestCase):
"""Test the dummy connection"""
lc = Globals.local_connection
def testGetAttrs(self):
"""Test getting of various attributes"""
assert type(self.lc.LocalConnection) is types.ClassType
try: self.lc.asotnuhaoseu
except (NameError, KeyError): pass
else: unittest.fail("NameError or KeyError should be raised")
def testSetattrs(self):
"""Test setting of global attributes"""
self.lc.x = 5
assert self.lc.x == 5
self.lc.x = 7
assert self.lc.x == 7
def testDelattrs(self):
"""Testing deletion of attributes"""
self.lc.x = 5
del self.lc.x
try: self.lc.x
except (NameError, KeyError): pass
else: unittest.fail("No exception raised")
def testReval(self):
"""Test string evaluation"""
assert self.lc.reval("pow", 2, 3) == 8
class LowLevelPipeConnectionTest(unittest.TestCase):
"""Test LLPC class"""
objs = ["Hello", ("Tuple", "of", "strings"),
[1, 2, 3, 4], 53.34235]
excts = [TypeError("te"), NameError("ne"), os.error("oe")]
filename = tempfile.mktemp()
def testObjects(self):
"""Try moving objects across connection"""
outpipe = open(self.filename, "w")
LLPC = LowLevelPipeConnection(None, outpipe)
for obj in self.objs: LLPC._putobj(obj, 3)
outpipe.close()
inpipe = open(self.filename, "r")
LLPC.inpipe = inpipe
for obj in self.objs:
gotten = LLPC._get()
assert gotten == (3, obj), gotten
inpipe.close()
os.unlink(self.filename)
def testBuf(self):
"""Try moving a buffer"""
outpipe = open(self.filename, "w")
LLPC = LowLevelPipeConnection(None, outpipe)
inbuf = open("testfiles/various_file_types/regular_file", "r").read()
LLPC._putbuf(inbuf, 234)
outpipe.close()
inpipe = open(self.filename, "r")
LLPC.inpipe = inpipe
assert (234, inbuf) == LLPC._get()
inpipe.close()
os.unlink(self.filename)
def testSendingExceptions(self):
"""Exceptions should also be sent down pipe well"""
outpipe = open(self.filename, "w")
LLPC = LowLevelPipeConnection(None, outpipe)
for exception in self.excts: LLPC._putobj(exception, 0)
outpipe.close()
inpipe = open(self.filename, "r")
LLPC.inpipe = inpipe
for exception in self.excts:
incoming_exception = LLPC._get()
assert isinstance(incoming_exception[1], exception.__class__)
inpipe.close()
os.unlink(self.filename)
class PipeConnectionTest(unittest.TestCase):
"""Test Pipe connection"""
regfilename = "testfiles/various_file_types/regular_file"
def setUp(self):
"""Must start a server for this"""
stdin, stdout = os.popen2("python ./server.py "+SourceDir)
self.conn = PipeConnection(stdout, stdin)
#self.conn.Log.setverbosity(9)
#Log.setverbosity(9)
def testBasic(self):
"""Test some basic pipe functions"""
assert self.conn.ord("a") == 97
assert self.conn.pow(2,3) == 8
assert self.conn.reval("ord", "a") == 97
def testModules(self):
"""Test module emulation"""
assert type(self.conn.tempfile.mktemp()) is types.StringType
assert self.conn.os.path.join("a", "b") == "a/b"
rp1 = rpath.RPath(self.conn, self.regfilename)
assert rp1.isreg()
def testVirtualFiles(self):
"""Testing virtual files"""
tempout = self.conn.open("testfiles/tempout", "w")
assert isinstance(tempout, VirtualFile)
regfilefp = open(self.regfilename, "r")
rpath.copyfileobj(regfilefp, tempout)
tempout.close()
regfilefp.close()
tempoutlocal = open("testfiles/tempout", "r")
regfilefp = open(self.regfilename, "r")
assert rpath.cmpfileobj(regfilefp, tempoutlocal)
tempoutlocal.close()
regfilefp.close()
os.unlink("testfiles/tempout")
assert rpath.cmpfileobj(self.conn.open(self.regfilename, "r"),
open(self.regfilename, "r"))
def testString(self):
"""Test transmitting strings"""
assert "32" == self.conn.str(32)
assert 32 == self.conn.int("32")
def testIterators(self):
"""Test transmission of iterators"""
i = iter(map(RORPsubstitute, range(10)))
assert self.conn.hasattr(i, "next")
datastring = self.conn.reval("lambda i: i.next().data", i)
assert datastring == "Hello, there 0", datastring
def testRPaths(self):
"""Test transmission of rpaths"""
rp = rpath.RPath(self.conn,
"testfiles/various_file_types/regular_file")
assert self.conn.reval("lambda rp: rp.data", rp) == rp.data
assert self.conn.reval("lambda rp: rp.conn is Globals.local_connection", rp)
def testExceptions(self):
"""Test exceptional results"""
self.assertRaises(os.error, self.conn.os.lstat,
"asoeut haosetnuhaoseu tn")
self.assertRaises(SyntaxError, self.conn.reval,
"aoetnsu aoehtnsu")
assert self.conn.pow(2,3) == 8
def tearDown(self):
"""Bring down connection"""
self.conn.quit()
class RedirectedConnectionTest(unittest.TestCase):
"""Test routing and redirection"""
def setUp(self):
"""Must start two servers for this"""
#Log.setverbosity(9)
self.conna = SetConnections.init_connection("python ./server.py " +
SourceDir)
self.connb = SetConnections.init_connection("python ./server.py " +
SourceDir)
def testBasic(self):
"""Test basic operations with redirection"""
self.conna.Globals.set("tmp_val", 1)
self.connb.Globals.set("tmp_val", 2)
assert self.conna.Globals.get("tmp_val") == 1
assert self.connb.Globals.get("tmp_val") == 2
self.conna.Globals.set("tmp_connb", self.connb)
self.connb.Globals.set("tmp_conna", self.conna)
assert self.conna.Globals.get("tmp_connb") is self.connb
assert self.connb.Globals.get("tmp_conna") is self.conna
val = self.conna.reval("Globals.get('tmp_connb').Globals.get",
"tmp_val")
assert val == 2, val
val = self.connb.reval("Globals.get('tmp_conna').Globals.get",
"tmp_val")
assert val == 1, val
assert self.conna.reval("Globals.get('tmp_connb').pow", 2, 3) == 8
self.conna.reval("Globals.tmp_connb.reval",
"Globals.tmp_conna.Globals.set", "tmp_marker", 5)
assert self.conna.Globals.get("tmp_marker") == 5
def testRpaths(self):
"""Test moving rpaths back and forth across connections"""
rp = rpath.RPath(self.conna, "foo")
self.connb.Globals.set("tmp_rpath", rp)
rp_returned = self.connb.Globals.get("tmp_rpath")
assert rp_returned.conn is rp.conn
assert rp_returned.path == rp.path
def tearDown(self):
SetConnections.CloseConnections()
class RORPsubstitute:
"""Used in testIterators above to simulate a RORP"""
def __init__(self, i):
self.index = i
self.data = "Hello, there %d" % i
self.file = None
if __name__ == "__main__":
unittest.main()
import unittest
from commontest import *
from rdiff_backup import C
from rdiff_backup.rpath import *
class CTest(unittest.TestCase):
"""Test the C module by comparing results to python functions"""
def test_make_dict(self):
"""Test making stat dictionaries"""
rp1 = RPath(Globals.local_connection, "/dev/ttyS1")
rp2 = RPath(Globals.local_connection, "./ctest.py")
rp3 = RPath(Globals.local_connection, "aestu/aeutoheu/oeu")
rp4 = RPath(Globals.local_connection, "testfiles/various_file_types/symbolic_link")
rp5 = RPath(Globals.local_connection, "testfiles/various_file_types/fifo")
for rp in [rp1, rp2, rp3, rp4, rp5]:
dict1 = rp.make_file_dict_old()
dict2 = C.make_file_dict(rp.path)
if dict1 != dict2:
print "Python dictionary: ", dict1
print "not equal to C dictionary: ", dict2
print "for path ", rp.path
assert 0
def test_strlong(self):
"""Test str2long and long2str"""
self.assertRaises(TypeError, C.long2str, "hello")
self.assertRaises(TypeError, C.str2long, 34)
self.assertRaises(TypeError, C.str2long, "oeuo")
self.assertRaises(TypeError, C.str2long, "oeuoaoeuaoeu")
for s in ["\0\0\0\0\0\0\0", "helloto",
"\xff\xff\xff\xff\xff\xff\xff", "randoms"]:
assert len(s) == 7, repr(s)
s_out = C.long2str(C.str2long(s))
assert s_out == s, (s_out, C.str2long(s), s)
for l in 0L, 1L, 4000000000L, 34234L, 234234234L:
assert C.str2long(C.long2str(l)) == l
def test_sync(self):
"""Test running C.sync"""
C.sync()
if __name__ == "__main__": unittest.main()
from __future__ import generators
import unittest
from commontest import *
from rdiff_backup import rpath, selection, Globals, destructive_stepping
Log.setverbosity(4)
class DSTest(unittest.TestCase):
def setUp(self):
self.lc = Globals.local_connection
self.noperms = rpath.RPath(self.lc, "testfiles/noperms")
Globals.change_source_perms = 1
self.iteration_dir = rpath.RPath(self.lc, "testfiles/iteration-test")
def testDSIter(self):
"""Testing destructive stepping iterator from baserp"""
for i in range(2):
sel = selection.Select(destructive_stepping.
DSRPath(1, self.noperms)).set_iter()
ds_iter = sel.iterate_with_finalizer()
noperms = ds_iter.next()
assert noperms.isdir() and noperms.getperms() == 0, \
(noperms.isdir(), noperms.getperms())
bar = ds_iter.next()
assert bar.isreg() and bar.getperms() == 0, \
"%s %s" % (bar.isreg(), bar.getperms())
barbuf = bar.open("rb").read()
assert len(barbuf) > 0
foo = ds_iter.next()
assert foo.isreg() and foo.getperms() == 0
assert foo.getmtime() < 1000300000
fuz = ds_iter.next()
assert fuz.isreg() and fuz.getperms() == 0200
fuzbuf = fuz.open("rb").read()
assert len(fuzbuf) > 0
self.assertRaises(StopIteration, ds_iter.next)
if __name__ == "__main__": unittest.main()
import unittest, StringIO
from commontest import *
from rdiff_backup.filelist import *
class FilelistTest(unittest.TestCase):
"""Test Filelist class"""
def testFile2Iter(self):
"""Test File2Iter function"""
filelist = """
hello
goodbye
a/b/c
test"""
baserp = RPath(Globals.local_connection, "/base")
i = Filelist.File2Iter(StringIO.StringIO(filelist), baserp)
assert i.next().path == "/base/hello"
assert i.next().path == "/base/goodbye"
assert i.next().path == "/base/a/b/c"
assert i.next().path == "/base/test"
self.assertRaises(StopIteration, i.next)
def testmake_subdirs(self):
"""Test Filelist.make_subdirs"""
self.assertRaises(os.error, os.lstat, "foo_delete_me")
Filelist.make_subdirs(RPath(Globals.local_connection,
"foo_delete_me/a/b/c/d"))
os.lstat("foo_delete_me")
os.lstat("foo_delete_me/a")
os.lstat("foo_delete_me/a/b")
os.lstat("foo_delete_me/a/b/c")
os.system("rm -rf foo_delete_me")
if __name__ == "__main__": unittest.main()
import unittest, os, re, sys, time
from commontest import *
from rdiff_backup import Globals, log, rpath, robust, FilenameMapping
"""Regression tests"""
Globals.exclude_mirror_regexps = [re.compile(".*/rdiff-backup-data")]
log.Log.setverbosity(3)
lc = Globals.local_connection
class Local:
"""This is just a place to put increments relative to the local
connection"""
def get_local_rp(extension):
return rpath.RPath(Globals.local_connection, "testfiles/" + extension)
vftrp = get_local_rp('various_file_types')
inc1rp = get_local_rp('increment1')
inc2rp = get_local_rp('increment2')
inc3rp = get_local_rp('increment3')
inc4rp = get_local_rp('increment4')
rpout = get_local_rp('output')
rpout_inc = get_local_rp('output_inc')
rpout1 = get_local_rp('restoretarget1')
rpout2 = get_local_rp('restoretarget2')
rpout3 = get_local_rp('restoretarget3')
rpout4 = get_local_rp('restoretarget4')
prefix = get_local_rp('.')
vft_in = get_local_rp('vft_out')
vft_out = get_local_rp('increment2/various_file_types')
vft2_in = get_local_rp('vft2_out')
timbar_in = get_local_rp('increment1/timbar.pyc')
timbar_out = get_local_rp('../timbar.pyc') # in cur directory
wininc2 = get_local_rp('win-increment2')
wininc3 = get_local_rp('win-increment3')
class PathSetter(unittest.TestCase):
def setUp(self):
self.reset_schema()
def reset_schema(self):
self.rb_schema = (SourceDir +
"/../rdiff-backup -v3 --no-compare-inode "
"--remote-schema './chdir-wrapper2 %s' ")
def refresh(self, *rp_list):
"""Reread data for the given rps"""
for rp in rp_list: rp.setdata()
def set_connections(self, src_pre, src_back, dest_pre, dest_back):
"""Set source and destination prefixes"""
if src_pre: self.src_prefix = "%s::%s" % (src_pre, src_back)
else: self.src_prefix = './'
if dest_pre: self.dest_prefix = "%s::%s" % (dest_pre, dest_back)
else: self.dest_prefix = './'
def exec_rb(self, time, *args):
"""Run rdiff-backup on given arguments"""
arglist = []
if time: arglist.extend(["--current-time", str(time)])
arglist.append(self.src_prefix + args[0])
if len(args) > 1:
arglist.append(self.dest_prefix + args[1])
assert len(args) == 2
argstring = ' '.join(map(lambda s: "'%s'" % (s,), arglist))
cmdstr = self.rb_schema + argstring
print "executing " + cmdstr
assert not os.system(cmdstr)
def exec_rb_extra_args(self, time, extra_args, *args):
"""Run rdiff-backup on given arguments"""
arglist = []
if time: arglist.extend(["--current-time", str(time)])
arglist.append(self.src_prefix + args[0])
if len(args) > 1:
arglist.append(self.dest_prefix + args[1])
assert len(args) == 2
cmdstr = "%s %s %s" % (self.rb_schema, extra_args, ' '.join(arglist))
print "executing " + cmdstr
assert not os.system(cmdstr)
def exec_rb_restore(self, time, *args):
"""Restore using rdiff-backup's new syntax and given time"""
arglist = []
arglist.append("--restore-as-of %s" % str(time))
arglist.append(self.src_prefix + args[0])
if len(args) > 1:
arglist.append(self.dest_prefix + args[1])
assert len(args) == 2
cmdstr = self.rb_schema + " ".join(arglist)
print "Restoring via cmdline: " + cmdstr
assert not os.system(cmdstr)
def delete_tmpdirs(self):
"""Remove any temp directories created by previous tests"""
assert not os.system(MiscDir + '/myrm testfiles/output* '
'testfiles/restoretarget* testfiles/vft_out '
'timbar.pyc testfiles/vft2_out')
def runtest(self):
self.delete_tmpdirs()
# Backing up increment1
self.exec_rb(10000, 'testfiles/increment1', 'testfiles/output')
assert CompareRecursive(Local.inc1rp, Local.rpout)
time.sleep(1)
# Backing up increment2
self.exec_rb(20000, 'testfiles/increment2', 'testfiles/output')
assert CompareRecursive(Local.inc2rp, Local.rpout)
time.sleep(1)
# Backing up increment3
self.exec_rb(30000, 'testfiles/increment3', 'testfiles/output')
assert CompareRecursive(Local.inc3rp, Local.rpout)
time.sleep(1)
# Backing up increment4
self.exec_rb(40000, 'testfiles/increment4', 'testfiles/output')
assert CompareRecursive(Local.inc4rp, Local.rpout)
# Getting restore rps
inc_paths = self.getinc_paths("increments.",
"testfiles/output/rdiff-backup-data")
assert len(inc_paths) == 3
# Restoring increment1
self.exec_rb(None, inc_paths[0], 'testfiles/restoretarget1')
assert CompareRecursive(Local.inc1rp, Local.rpout1)
# Restoring increment2
self.exec_rb(None, inc_paths[1], 'testfiles/restoretarget2')
assert CompareRecursive(Local.inc2rp, Local.rpout2)
# Restoring increment3
self.exec_rb(None, inc_paths[2], 'testfiles/restoretarget3')
assert CompareRecursive(Local.inc3rp, Local.rpout3)
# Test restoration of a few random files
vft_paths = self.getinc_paths("various_file_types.",
"testfiles/output/rdiff-backup-data/increments")
self.exec_rb(None, vft_paths[1], 'testfiles/vft_out')
self.refresh(Local.vft_in, Local.vft_out)
assert CompareRecursive(Local.vft_in, Local.vft_out)
timbar_paths = self.getinc_paths("timbar.pyc.",
"testfiles/output/rdiff-backup-data/increments")
self.exec_rb(None, timbar_paths[0])
self.refresh(Local.timbar_in, Local.timbar_out)
assert Local.timbar_in.equal_loose(Local.timbar_out)
self.exec_rb_restore(25000, 'testfiles/output/various_file_types',
'testfiles/vft2_out')
self.refresh(Local.vft2_in, Local.vft_out)
assert CompareRecursive(Local.vft2_in, Local.vft_out)
		# Make sure that extra increment files were not created
assert len(self.getinc_paths("nochange.",
"testfiles/output/rdiff-backup-data/increments")) == 0
nochange_incs = len(self.getinc_paths("",
"testfiles/output/rdiff-backup-data/increments/nochange"))
assert nochange_incs == 1 or nochange_incs == 0, nochange_incs
def getinc_paths(self, basename, directory, quoted = 0):
"""Return increment.______.dir paths"""
if quoted:
FilenameMapping.set_init_quote_vals()
dirrp = FilenameMapping.QuotedRPath(Globals.local_connection,
directory)
else: dirrp = rpath.RPath(Globals.local_connection, directory)
incbasenames = [filename for filename in robust.listrp(dirrp)
if filename.startswith(basename)]
incbasenames.sort()
incrps = map(dirrp.append, incbasenames)
return map(lambda x: x.path,
filter(lambda incrp: incrp.isincfile(), incrps))
class Final(PathSetter):
def testLocal(self):
"""Run test sequence everything local"""
self.set_connections(None, None, None, None)
self.runtest()
def testRemoteAll(self):
"""Run test sequence everything remote"""
self.set_connections("test1/", '../', 'test2/tmp/', '../../')
self.runtest()
def testRemoteSource(self):
"""Run test sequence when remote side is source"""
self.set_connections("test1/", "../", None, None)
self.runtest()
def testRemoteDest(self):
"""Run test sequence when remote side is destination"""
self.set_connections(None, None, "test2/tmp", "../../")
self.runtest()
def testProcLocal(self):
"""Test initial backup of /proc locally"""
Myrm("testfiles/procoutput")
procout = rpath.RPath(Globals.local_connection, 'testfiles/procoutput')
self.set_connections(None, None, None, None)
self.exec_rb(10000, '../../../../../../proc', procout.path)
time.sleep(1)
self.exec_rb(20000, '../../../../../../proc', procout.path)
time.sleep(1)
self.exec_rb(30000, Local.inc1rp.path, procout.path)
assert CompareRecursive(Local.inc1rp, procout)
time.sleep(1)
self.exec_rb(40000, '../../../../../../proc', procout.path)
def testProcRemote(self):
"""Test mirroring proc remote"""
Myrm("testfiles/procoutput")
procout = rpath.RPath(Globals.local_connection, 'testfiles/procoutput')
self.set_connections(None, None, "test2/tmp/", "../../")
self.exec_rb(10000, '../../../../../../proc', procout.path)
time.sleep(1)
self.exec_rb(20000, '../../../../../../proc', procout.path)
time.sleep(1)
self.exec_rb(30000, Local.inc1rp.path, procout.path)
assert CompareRecursive(Local.inc1rp, procout)
time.sleep(1)
self.exec_rb(40000, '../../../../../../proc', procout.path)
def testProcRemote2(self):
"""Test mirroring proc, this time when proc is remote, dest local"""
Myrm("testfiles/procoutput")
self.set_connections("test1/", "../", None, None)
self.exec_rb(None, '../../../../../../proc', 'testfiles/procoutput')
def testWindowsMode(self):
"""Test backup with the --windows-mode option
		We need to delete long file names from the increment? directories,
		because quoting adds too many extra characters.
"""
def delete_long(base_rp, length = 100):
"""Delete filenames longer than length given"""
for rp in selection.Select(base_rp).set_iter():
if len(rp.dirsplit()[1]) > length: rp.delete()
if not Local.wininc2.lstat() or not Local.wininc3.lstat():
os.system("cp -a testfiles/increment2 testfiles/win-increment2")
os.system("cp -a testfiles/increment3 testfiles/win-increment3")
delete_long(Local.wininc2)
delete_long(Local.wininc3)
old_schema = self.rb_schema
self.rb_schema = old_schema + " --windows-mode "
self.set_connections(None, None, None, None)
self.delete_tmpdirs()
# Back up increment2, this contains a file with colons
self.exec_rb(20000, 'testfiles/win-increment2', 'testfiles/output')
time.sleep(1)
# Back up increment3
self.exec_rb(30000, 'testfiles/win-increment3', 'testfiles/output')
# Start restore
self.rb_schema = old_schema + ' --windows-restore '
Globals.time_separator = "_"
inc_paths = self.getinc_paths("increments.",
"testfiles/output/rdiff-backup-data", 1)
Globals.time_separator = ":"
assert len(inc_paths) == 1, inc_paths
# Restore increment2
self.exec_rb(None, inc_paths[0], 'testfiles/restoretarget2')
assert CompareRecursive(Local.wininc2, Local.rpout2,
compare_hardlinks = 0)
# Now check to make sure no ":" in output directory
popen_fp = os.popen("find testfiles/output -name '*:*' | wc")
wc_output = popen_fp.read()
popen_fp.close()
assert wc_output.split() == ["0", "0", "0"], wc_output
def testLegacy(self):
"""Test restoring directory with no mirror_metadata file"""
self.delete_tmpdirs()
self.set_connections(None, None, None, None)
self.exec_rb(10000, 'testfiles/various_file_types',
'testfiles/output')
self.exec_rb(20000, 'testfiles/empty', 'testfiles/output')
assert not os.system(MiscDir + '/myrm testfiles/output/rdiff-backup-data/mirror_metadata*')
self.exec_rb_extra_args(None, '-r0', 'testfiles/output',
'testfiles/restoretarget1')
assert CompareRecursive(Local.vftrp, Local.rpout1,
compare_hardlinks = 0)
class FinalSelection(PathSetter):
"""Test selection options"""
def run(self, cmd):
print "Executing: ", cmd
assert not os.system(cmd)
def testSelLocal(self):
"""Quick backup testing a few selection options"""
self.delete_tmpdirs()
# Test --include option
assert not \
os.system(self.rb_schema +
"--current-time 10000 "
"--include testfiles/increment2/various_file_types "
"--exclude '**' "
"testfiles/increment2 testfiles/output")
assert os.lstat("testfiles/output/various_file_types/regular_file")
self.assertRaises(OSError, os.lstat, "testfiles/output/test.py")
# Now try reading list of files
fp = os.popen(self.rb_schema +
"--current-time 20000 "
"--include-filelist-stdin --exclude '**' "
"testfiles/increment2 testfiles/output", "w")
fp.write("""
testfiles/increment2/test.py
testfiles/increment2/changed_dir""")
assert not fp.close()
assert os.lstat("testfiles/output/changed_dir")
assert os.lstat("testfiles/output/test.py")
self.assertRaises(OSError, os.lstat,
"testfiles/output/various_file_types")
self.assertRaises(OSError, os.lstat,
"testfiles/output/changed_dir/foo")
# Test selective restoring
mirror_rp = rpath.RPath(Globals.local_connection, "testfiles/output")
restore_filename = get_increment_rp(mirror_rp, 10000).path
self.run(self.rb_schema +
"--include testfiles/restoretarget1/various_file_types/"
"regular_file "
"--exclude '**' " +
restore_filename + " testfiles/restoretarget1")
assert os.lstat("testfiles/restoretarget1/various_file_types/"
"regular_file")
self.assertRaises(OSError, os.lstat, "testfiles/restoretarget1/tester")
self.assertRaises(OSError, os.lstat,
"testfiles/restoretarget1/various_file_types/executable")
fp = os.popen(self.rb_schema +
"--include-filelist-stdin " + restore_filename +
" testfiles/restoretarget2", "w")
fp.write("""
- testfiles/restoretarget2/various_file_types/executable""")
assert not fp.close()
assert os.lstat("testfiles/restoretarget2/various_file_types/"
"regular_file")
self.assertRaises(OSError, os.lstat,
"testfiles/restoretarget2/various_file_types/executable")
def testSelFilesRemote(self):
"""Test for bug found in 0.7.[34] - filelist where source remote"""
self.delete_tmpdirs()
self.set_connections("test1/", "../", 'test2/tmp/', '../../')
self.rb_schema += ("--exclude-filelist testfiles/vft_out/exclude "
"--include-filelist testfiles/vft_out/include "
"--exclude '**' ")
# Make an exclude list
os.mkdir("testfiles/vft_out")
excluderp = rpath.RPath(Globals.local_connection,
"testfiles/vft_out/exclude")
fp = excluderp.open("w")
fp.write("""
../testfiles/various_file_types/regular_file
../testfiles/various_file_types/test
""")
assert not fp.close()
# Make an include list
includerp = rpath.RPath(Globals.local_connection,
"testfiles/vft_out/include")
fp = includerp.open("w")
fp.write("""
../testfiles/various_file_types/executable
../testfiles/various_file_types/symbolic_link
../testfiles/various_file_types/regular_file
../testfiles/various_file_types/test
""")
assert not fp.close()
self.exec_rb(None, 'testfiles/various_file_types', 'testfiles/output')
self.reset_schema()
self.exec_rb_restore("now", 'testfiles/output',
'testfiles/restoretarget1')
assert os.lstat('testfiles/restoretarget1/executable')
assert os.lstat('testfiles/restoretarget1/symbolic_link')
self.assertRaises(OSError, os.lstat,
'testfiles/restoretarget1/regular_file')
self.assertRaises(OSError, os.lstat,
'testfiles/restoretarget1/executable2')
if __name__ == "__main__": unittest.main()
#!/usr/bin/env python
"""find-max-ram - Returns the maximum amount of memory used by a program.
Every half second, run ps with the appropriate commands, getting the
size of the program. Return max value.
"""
import os, sys, time
def get_val(cmdstr):
"""Runs ps and gets sum rss for processes making cmdstr
Returns None if process not found.
"""
cmd = ("ps -Ao cmd -o rss | grep '%s' | grep -v grep" % cmdstr)
# print "Running ", cmd
fp = os.popen(cmd)
lines = fp.readlines()
fp.close()
if not lines: return None
else: return reduce(lambda x,y: x+y, map(read_ps_line, lines))
def read_ps_line(psline):
"""Given a specially formatted line by ps, return rss value"""
l = psline.split()
assert len(l) >= 2 # first few are name, last one is rss
return int(l[-1])
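# Example of an (assumed) "ps -Ao cmd -o rss" output line and how
# read_ps_line handles it -- everything before the last column is the
# command name, and the final whitespace-separated field is the rss in KB:
#   python ../rdiff-backup -v3 ...  10432   ->   10432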
def main(cmdstr):
while get_val(cmdstr) is None: time.sleep(0.5)
current_max = 0
while 1:
rss = get_val(cmdstr)
print rss
if rss is None: break
current_max = max(current_max, rss)
time.sleep(0.5)
print current_max
if __name__=="__main__":
if len(sys.argv) != 2:
print """Usage: find-max-ram [command string]
It will then run ps twice a second and keep totalling how much RSS
(resident set size) the process(es) whose ps command name contain the
given string use up. When there are no more processes found, it will
print the number and exit.
"""
sys.exit(1)
else: main(sys.argv[1])
import unittest, os, time
from commontest import *
from rdiff_backup import Globals, rpath, fs_abilities
class FSAbilitiesTest(unittest.TestCase):
"""Test testing of file system abilities
Some of these tests assume that the actual file system tested has
the given abilities. If the file system this is run on differs
from the original test system, this test may/should fail. Change
the expected values below.
"""
# Describes standard linux file system with acls/eas
dir_to_test = "testfiles"
eas = acls = 1
chars_to_quote = ""
ownership = (os.getuid() == 0)
hardlinks = fsync_dirs = 1
# Describes MS-Windows style file system
#dir_to_test = "/mnt/fat"
#eas = acls = 0
#chars_to_quote = "^a-z0-9_ -"
#ownership = hardlinks = 0
#fsync_dirs = 1
def testReadOnly(self):
"""Test basic querying read only"""
base_dir = rpath.RPath(Globals.local_connection, self.dir_to_test)
fsa = fs_abilities.FSAbilities().init_readonly(base_dir)
assert fsa.read_only == 1, fsa.read_only
assert fsa.eas == self.eas, fsa.eas
assert fsa.acls == self.acls, fsa.acls
def testReadWrite(self):
"""Test basic querying read/write"""
base_dir = rpath.RPath(Globals.local_connection, self.dir_to_test)
new_dir = base_dir.append("fs_abilitiestest")
if new_dir.lstat(): Myrm(new_dir.path)
new_dir.setdata()
new_dir.mkdir()
t = time.time()
fsa = fs_abilities.FSAbilities().init_readwrite(new_dir)
print "Time elapsed = ", time.time() - t
assert fsa.read_only == 0, fsa.read_only
assert fsa.eas == self.eas, fsa.eas
assert fsa.acls == self.acls, fsa.acls
assert fsa.chars_to_quote == self.chars_to_quote, fsa.chars_to_quote
assert fsa.ownership == self.ownership, fsa.ownership
assert fsa.hardlinks == self.hardlinks, fsa.hardlinks
assert fsa.fsync_dirs == self.fsync_dirs, fsa.fsync_dirs
ctq_rp = new_dir.append("chars_to_quote")
assert ctq_rp.lstat()
fp = ctq_rp.open('rb')
chars_to_quote = fp.read()
assert not fp.close()
assert chars_to_quote == self.chars_to_quote, chars_to_quote
new_dir.delete()
if __name__ == "__main__": unittest.main()
import os, unittest, time
from commontest import *
from rdiff_backup import Globals, Hardlink, selection, rpath
Log.setverbosity(3)
class HardlinkTest(unittest.TestCase):
"""Test cases for Hard links"""
outputrp = rpath.RPath(Globals.local_connection, "testfiles/output")
hardlink_dir1 = rpath.RPath(Globals.local_connection,
"testfiles/hardlinks/dir1")
hardlink_dir1copy = rpath.RPath(Globals.local_connection,
"testfiles/hardlinks/dir1copy")
hardlink_dir2 = rpath.RPath(Globals.local_connection,
"testfiles/hardlinks/dir2")
hardlink_dir3 = rpath.RPath(Globals.local_connection,
"testfiles/hardlinks/dir3")
def reset_output(self):
"""Erase and recreate testfiles/output directory"""
os.system(MiscDir+'/myrm testfiles/output')
self.outputrp.mkdir()
def testEquality(self):
"""Test rorp_eq function in conjunction with CompareRecursive"""
assert CompareRecursive(self.hardlink_dir1, self.hardlink_dir1copy)
assert CompareRecursive(self.hardlink_dir1, self.hardlink_dir2,
compare_hardlinks = None)
assert not CompareRecursive(self.hardlink_dir1, self.hardlink_dir2,
compare_hardlinks = 1)
def testBuildingDict(self):
"""See if the partial inode dictionary is correct"""
Globals.preserve_hardlinks = 1
reset_hardlink_dicts()
for dsrp in selection.Select(self.hardlink_dir3).set_iter():
Hardlink.add_rorp(dsrp, 1)
assert len(Hardlink._src_inode_indicies.keys()) == 3, \
Hardlink._src_inode_indicies
assert len(Hardlink._src_index_indicies.keys()) == 3, \
Hardlink._src_index_indicies
vals1 = Hardlink._src_inode_indicies.values()
vals2 = Hardlink._src_index_indicies.values()
vals1.sort()
vals2.sort()
assert vals1 == vals2
def testBuildingDict2(self):
"""Same as testBuildingDict but test destination building"""
Globals.preserve_hardlinks = 1
reset_hardlink_dicts()
for dsrp in selection.Select(self.hardlink_dir3).set_iter():
Hardlink.add_rorp(dsrp, None)
assert len(Hardlink._dest_inode_indicies.keys()) == 3, \
Hardlink._dest_inode_indicies
assert len(Hardlink._dest_index_indicies.keys()) == 3, \
Hardlink._dest_index_indicies
vals1 = Hardlink._dest_inode_indicies.values()
vals2 = Hardlink._dest_index_indicies.values()
vals1.sort()
vals2.sort()
assert vals1 == vals2
def testCompletedDict(self):
"""See if the hardlink dictionaries are built correctly"""
reset_hardlink_dicts()
for dsrp in selection.Select(self.hardlink_dir1).set_iter():
Hardlink.add_rorp(dsrp, 1)
assert Hardlink._src_inode_indicies == {}, \
Hardlink._src_inode_indicies
hll1 = [('file1',), ('file2',), ('file3',)]
hll2 = [('file4',), ('file5',), ('file6',)]
dict = {}
for index in hll1: dict[index] = hll1
for index in hll2: dict[index] = hll2
assert Hardlink._src_index_indicies == dict
reset_hardlink_dicts()
for dsrp in selection.Select(self.hardlink_dir2).set_iter():
Hardlink.add_rorp(dsrp, 1)
assert Hardlink._src_inode_indicies == {}, \
Hardlink._src_inode_indicies
hll1 = [('file1',), ('file3',), ('file4',)]
hll2 = [('file2',), ('file5',), ('file6',)]
dict = {}
for index in hll1: dict[index] = hll1
for index in hll2: dict[index] = hll2
assert Hardlink._src_index_indicies == dict
def testSeries(self):
"""Test hardlink system by backing up and restoring a few dirs"""
dirlist = ['testfiles/hardlinks/dir1',
'testfiles/hardlinks/dir2',
'testfiles/hardlinks/dir3',
'testfiles/various_file_types']
BackupRestoreSeries(None, None, dirlist, compare_hardlinks=1)
BackupRestoreSeries(1, 1, dirlist, compare_hardlinks=1)
def testInnerRestore(self):
"""Restore part of a dir, see if hard links preserved"""
MakeOutputDir()
output = rpath.RPath(Globals.local_connection,
"testfiles/output")
# Now set up directories out_hardlink1 and out_hardlink2
hlout1 = rpath.RPath(Globals.local_connection,
"testfiles/out_hardlink1")
if hlout1.lstat(): hlout1.delete()
hlout1.mkdir()
hlout1_sub = hlout1.append("subdir")
hlout1_sub.mkdir()
hl1_1 = hlout1_sub.append("hardlink1")
hl1_2 = hlout1_sub.append("hardlink2")
hl1_3 = hlout1_sub.append("hardlink3")
hl1_4 = hlout1_sub.append("hardlink4")
# 1 and 2 are hard linked, as are 3 and 4
hl1_1.touch()
hl1_2.hardlink(hl1_1.path)
hl1_3.touch()
hl1_4.hardlink(hl1_3.path)
hlout2 = rpath.RPath(Globals.local_connection,
"testfiles/out_hardlink2")
if hlout2.lstat(): hlout2.delete()
assert not os.system("cp -a testfiles/out_hardlink1 "
"testfiles/out_hardlink2")
hlout2_sub = hlout2.append("subdir")
hl2_1 = hlout2_sub.append("hardlink1")
hl2_2 = hlout2_sub.append("hardlink2")
hl2_3 = hlout2_sub.append("hardlink3")
hl2_4 = hlout2_sub.append("hardlink4")
# Now 2 and 3 are hard linked, also 1 and 4
rpath.copy_with_attribs(hl1_1, hl2_1)
rpath.copy_with_attribs(hl1_2, hl2_2)
hl2_3.delete()
hl2_3.hardlink(hl2_2.path)
hl2_4.delete()
hl2_4.hardlink(hl2_1.path)
rpath.copy_attribs(hlout1_sub, hlout2_sub)
# Now try backing up twice, making sure hard links are preserved
InternalBackup(1, 1, hlout1.path, output.path)
out_subdir = output.append("subdir")
assert out_subdir.append("hardlink1").getinode() == \
out_subdir.append("hardlink2").getinode()
assert out_subdir.append("hardlink3").getinode() == \
out_subdir.append("hardlink4").getinode()
assert out_subdir.append("hardlink1").getinode() != \
out_subdir.append("hardlink3").getinode()
time.sleep(1)
InternalBackup(1, 1, hlout2.path, output.path)
out_subdir.setdata()
assert out_subdir.append("hardlink1").getinode() == \
out_subdir.append("hardlink4").getinode()
assert out_subdir.append("hardlink2").getinode() == \
out_subdir.append("hardlink3").getinode()
assert out_subdir.append("hardlink1").getinode() != \
out_subdir.append("hardlink2").getinode()
# Now try restoring, still checking hard links.
out2 = rpath.RPath(Globals.local_connection, "testfiles/out2")
hlout1 = out2.append("hardlink1")
hlout2 = out2.append("hardlink2")
hlout3 = out2.append("hardlink3")
hlout4 = out2.append("hardlink4")
if out2.lstat(): out2.delete()
InternalRestore(1, 1, "testfiles/output/subdir", "testfiles/out2", 1)
out2.setdata()
for rp in [hlout1, hlout2, hlout3, hlout4]: rp.setdata()
assert hlout1.getinode() == hlout2.getinode()
assert hlout3.getinode() == hlout4.getinode()
assert hlout1.getinode() != hlout3.getinode()
if out2.lstat(): out2.delete()
InternalRestore(1, 1, "testfiles/output/subdir", "testfiles/out2",
int(time.time()))
out2.setdata()
for rp in [hlout1, hlout2, hlout3, hlout4]: rp.setdata()
assert hlout1.getinode() == hlout4.getinode(), \
"%s %s" % (hlout1.path, hlout4.path)
assert hlout2.getinode() == hlout3.getinode()
assert hlout1.getinode() != hlout2.getinode()
if __name__ == "__main__": unittest.main()
import unittest, os, re, time
from commontest import *
from rdiff_backup import log, rpath, increment, Time, Rdiff, statistics
lc = Globals.local_connection
Globals.change_source_perms = 1
Log.setverbosity(3)
def getrp(ending):
return rpath.RPath(lc, "testfiles/various_file_types/" + ending)
rf = getrp("regular_file")
rf2 = getrp("two_hardlinked_files1")
exec1 = getrp("executable")
exec2 = getrp("executable2")
sig = getrp("regular_file.sig")
hl1, hl2 = map(getrp, ["two_hardlinked_files1", "two_hardlinked_files2"])
test = getrp("test")
dir = getrp(".")
sym = getrp("symbolic_link")
nothing = getrp("nothing")
target = rpath.RPath(lc, "testfiles/output/out")
out2 = rpath.RPath(lc, "testfiles/output/out2")
out_gz = rpath.RPath(lc, "testfiles/output/out.gz")
Time.setcurtime(1000000000)
Time.setprevtime(999424113)
prevtimestr = "2001-09-02T02:48:33-07:00"
t_pref = "testfiles/output/out.2001-09-02T02:48:33-07:00"
t_diff = "testfiles/output/out.2001-09-02T02:48:33-07:00.diff"
Globals.no_compression_regexp = \
re.compile(Globals.no_compression_regexp_string, re.I)
class inctest(unittest.TestCase):
"""Test the incrementRP function"""
def setUp(self):
Globals.set('isbackup_writer',1)
MakeOutputDir()
def check_time(self, rp):
"""Make sure that rp is an inc file, and time is Time.prevtime"""
assert rp.isincfile(), rp
t = rp.getinctime()
assert t == Time.prevtime, (t, Time.prevtime)
def testreg(self):
"""Test increment of regular files"""
Globals.compression = None
target.setdata()
if target.lstat(): target.delete()
rpd = rpath.RPath(lc, t_diff)
if rpd.lstat(): rpd.delete()
diffrp = increment.Increment(rf, exec1, target)
assert diffrp.isreg(), diffrp
assert rpath.cmp_attribs(diffrp, exec1)
self.check_time(diffrp)
assert diffrp.getinctype() == 'diff', diffrp.getinctype()
diffrp.delete()
def testmissing(self):
"""Test creation of missing files"""
missing_rp = increment.Increment(rf, nothing, target)
self.check_time(missing_rp)
assert missing_rp.getinctype() == 'missing'
missing_rp.delete()
def testsnapshot(self):
"""Test making of a snapshot"""
Globals.compression = None
snap_rp = increment.Increment(rf, sym, target)
self.check_time(snap_rp)
assert rpath.cmp_attribs(snap_rp, sym)
assert rpath.cmp(snap_rp, sym)
snap_rp.delete()
snap_rp2 = increment.Increment(sym, rf, target)
self.check_time(snap_rp2)
assert rpath.cmp_attribs(snap_rp2, rf)
assert rpath.cmp(snap_rp2, rf)
snap_rp2.delete()
def testGzipsnapshot(self):
"""Test making a compressed snapshot"""
Globals.compression = 1
rp = increment.Increment(rf, sym, target)
self.check_time(rp)
assert rpath.cmp_attribs(rp, sym)
assert rpath.cmp(rp, sym)
rp.delete()
rp = increment.Increment(sym, rf, target)
self.check_time(rp)
assert rpath.cmp_attribs(rp, rf)
assert rpath.cmpfileobj(rp.open("rb", 1), rf.open("rb"))
assert rp.isinccompressed()
rp.delete()
def testdir(self):
"""Test increment on dir"""
rp = increment.Increment(sym, dir, target)
self.check_time(rp)
assert rp.lstat()
assert target.isdir()
assert rpath.cmp_attribs(dir, rp)
assert rp.isreg()
rp.delete()
target.delete()
def testDiff(self):
"""Test making diffs"""
Globals.compression = None
rp = increment.Increment(rf, rf2, target)
self.check_time(rp)
assert rpath.cmp_attribs(rp, rf2)
Rdiff.patch_local(rf, rp, out2)
assert rpath.cmp(rf2, out2)
rp.delete()
out2.delete()
def testGzipDiff(self):
"""Test making gzipped diffs"""
Globals.compression = 1
rp = increment.Increment(rf, rf2, target)
self.check_time(rp)
assert rpath.cmp_attribs(rp, rf2)
Rdiff.patch_local(rf, rp, out2, delta_compressed = 1)
assert rpath.cmp(rf2, out2)
rp.delete()
out2.delete()
def testGzipRegexp(self):
"""Here a .gz file shouldn't be compressed"""
Globals.compression = 1
rpath.copy(rf, out_gz)
assert out_gz.lstat()
rp = increment.Increment(rf, out_gz, target)
self.check_time(rp)
assert rpath.cmp_attribs(rp, out_gz)
Rdiff.patch_local(rf, rp, out2)
assert rpath.cmp(out_gz, out2)
rp.delete()
out2.delete()
out_gz.delete()
if __name__ == '__main__': unittest.main()
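# Increment types exercised above: 'diff' (an rdiff delta), 'missing' (the
# file was absent), snapshots (full copies, gzipped when compression is on),
# and directory markers. check_time verifies that each increment carries
# Time.prevtime in its filename.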
import unittest, StringIO
from commontest import *
from rdiff_backup.iterfile import *
from rdiff_backup import lazy
class FileException:
"""Like a file, but raise exception after certain # bytes read"""
def __init__(self, max):
self.count = 0
self.max = max
def read(self, l):
self.count += l
if self.count > self.max: raise IOError(13, "Permission Denied")
return "a"*l
def close(self): return None
class testIterFile(unittest.TestCase):
def setUp(self):
self.iter1maker = lambda: iter(range(50))
self.iter2maker = lambda: iter(map(str, range(50)))
def testConversion(self):
"""Test iter to file conversion"""
for itm in [self.iter1maker, self.iter2maker]:
assert lazy.Iter.equal(itm(),
IterWrappingFile(FileWrappingIter(itm())))
def testFile(self):
"""Test sending files through iters"""
buf1 = "hello"*10000
file1 = StringIO.StringIO(buf1)
buf2 = "goodbye"*10000
file2 = StringIO.StringIO(buf2)
file_iter = FileWrappingIter(iter([file1, file2]))
new_iter = IterWrappingFile(file_iter)
assert new_iter.next().read() == buf1
assert new_iter.next().read() == buf2
self.assertRaises(StopIteration, new_iter.next)
def testFileException(self):
"""Test encoding a file which raises an exception"""
f = FileException(100*1024)
new_iter = IterWrappingFile(FileWrappingIter(iter([f, "foo"])))
f_out = new_iter.next()
assert f_out.read(10000) == "a"*10000
try: buf = f_out.read(100*1024)
except IOError: pass
else: assert 0, len(buf)
assert new_iter.next() == "foo"
self.assertRaises(StopIteration, new_iter.next)
class testRORPIters(unittest.TestCase):
"""Test sending rorpiter back and forth"""
def setUp(self):
"""Make testfiles/output directory and a few files"""
Myrm("testfiles/output")
self.outputrp = rpath.RPath(Globals.local_connection,
"testfiles/output")
self.regfile1 = self.outputrp.append("reg1")
self.regfile2 = self.outputrp.append("reg2")
self.regfile3 = self.outputrp.append("reg3")
self.outputrp.mkdir()
fp = self.regfile1.open("wb")
fp.write("hello")
fp.close()
self.regfile1.setfile(self.regfile1.open("rb"))
self.regfile2.touch()
self.regfile2.setfile(self.regfile2.open("rb"))
fp = self.regfile3.open("wb")
fp.write("goodbye")
fp.close()
self.regfile3.setfile(self.regfile3.open("rb"))
self.regfile1.setdata()
self.regfile2.setdata()
self.regfile3.setdata()
def print_RORPIterFile(self, rpiter_file):
"""Print the given rorpiter file"""
while 1:
buf = rpiter_file.read()
sys.stdout.write(buf)
if buf[0] == "z": break
def testBasic(self):
"""Test basic conversion"""
l = [self.outputrp, self.regfile1, self.regfile2, self.regfile3]
i_out = FileToRORPIter(RORPIterToFile(iter(l)))
out1 = i_out.next()
assert out1 == self.outputrp
out2 = i_out.next()
assert out2 == self.regfile1
fp = out2.open("rb")
assert fp.read() == "hello"
assert not fp.close()
out3 = i_out.next()
assert out3 == self.regfile2
fp = out3.open("rb")
assert fp.read() == ""
assert not fp.close()
i_out.next()
self.assertRaises(StopIteration, i_out.next)
def testFlush(self):
"""Test flushing property of RORPIterToFile"""
l = [self.outputrp, RORPIterFlush, self.outputrp]
filelike = RORPIterToFile(iter(l))
new_filelike = StringIO.StringIO((filelike.read() + "z" +
C.long2str(0L)))
i_out = FileToRORPIter(new_filelike)
assert i_out.next() == self.outputrp
self.assertRaises(StopIteration, i_out.next)
i_out2 = FileToRORPIter(filelike)
assert i_out2.next() == self.outputrp
self.assertRaises(StopIteration, i_out2.next)
def testFlushRepeat(self):
"""Test flushing like above, but have Flush obj emerge from iter"""
l = [self.outputrp, RORPIterFlushRepeat, self.outputrp]
filelike = RORPIterToFile(iter(l))
new_filelike = StringIO.StringIO((filelike.read() + "z" +
C.long2str(0L)))
i_out = FileToRORPIter(new_filelike)
assert i_out.next() == self.outputrp
assert i_out.next() is RORPIterFlushRepeat
self.assertRaises(StopIteration, i_out.next)
i_out2 = FileToRORPIter(filelike)
assert i_out2.next() == self.outputrp
self.assertRaises(StopIteration, i_out2.next)
if __name__ == "__main__": unittest.main()
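# As testFlush and testFlushRepeat reconstruct by hand, the encoded stream
# apparently ends an iterator with a "z" type byte followed by a 7-byte
# length field (C.long2str(0L)); FileToRORPIter stops when it reads that
# record.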
from commontest import *
import unittest
from rdiff_backup import journal, Globals, rpath
class JournalTest(unittest.TestCase):
def testBasic(self):
"""Test opening a journal, then reading, writing, and deleting"""
MakeOutputDir()
Globals.dest_root = rpath.RPath(Globals.local_connection,
"testfiles/output")
Globals.rbdir = Globals.dest_root.append("rdiff-backup-data")
Globals.rbdir.mkdir()
journal.open_journal()
assert len(journal.get_entries_from_journal()) == 0
# It's important that none of these files really exist
e1 = journal.write_entry(("Hello48",), ("temp_index", "foo"),
2, "reg")
e2 = journal.write_entry(("2nd", "Entry", "now"),
("temp_index",), 1, None)
assert e1.entry_rp and e2.entry_rp
l = journal.get_entries_from_journal()
assert len(l) == 2
first_index = l[0].index
assert (first_index == ("Hello48",) or
first_index == ("2nd", "Entry", "now"))
# Now test recovering journal, and make sure everything deleted
journal.recover_journal()
assert len(journal.get_entries_from_journal()) == 0
if __name__ == "__main__": unittest.main()
import unittest, os, signal, sys, random, time
from commontest import *
from rdiff_backup.log import *
from rdiff_backup import Globals, Main, restore
"""Test consistency by killing rdiff-backup as it is backing up"""
Log.setverbosity(3)
class Local:
"""Hold some local RPaths"""
def get_local_rp(ext):
return RPath(Globals.local_connection, "testfiles/" + ext)
kt1rp = get_local_rp('killtest1')
kt2rp = get_local_rp('killtest2')
kt3rp = get_local_rp('killtest3')
kt4rp = get_local_rp('killtest4')
rpout = get_local_rp('output')
rpout_inc = get_local_rp('output_inc')
rpout1 = get_local_rp('restoretarget1')
rpout2 = get_local_rp('restoretarget2')
rpout3 = get_local_rp('restoretarget3')
rpout4 = get_local_rp('restoretarget4')
rpout5 = get_local_rp('restoretarget5')
back1 = get_local_rp('backup1')
back2 = get_local_rp('backup2')
back3 = get_local_rp('backup3')
back4 = get_local_rp('backup4')
back5 = get_local_rp('backup5')
class TimingError(Exception):
"""Indicates timing error - process killed too soon or too late"""
pass
class ProcessFuncs(unittest.TestCase):
"""Subclassed by Resume and NoResume"""
def delete_tmpdirs(self):
"""Remove any temp directories created by previous tests"""
assert not os.system(MiscDir + '/myrm testfiles/output* '
'testfiles/restoretarget* testfiles/vft_out '
'timbar.pyc testfiles/vft2_out')
def exec_rb(self, time, wait, *args):
"""Run rdiff-backup return pid. Wait until done if wait is true"""
arglist = ['python', '../rdiff-backup', '-v3']
if time:
arglist.append("--current-time")
arglist.append(str(time))
arglist.extend(args)
print "Running ", arglist
if wait: return os.spawnvp(os.P_WAIT, 'python', arglist)
else: return os.spawnvp(os.P_NOWAIT, 'python', arglist)
def exec_and_kill(self, min_max_pair, backup_time, arg1, arg2):
"""Run rdiff-backup, then kill and run again
Kill after a time between mintime and maxtime. First process
should not terminate before maxtime.
"""
mintime, maxtime = min_max_pair
pid = self.exec_rb(backup_time, None, arg1, arg2)
time.sleep(random.uniform(mintime, maxtime))
if os.waitpid(pid, os.WNOHANG)[0] != 0:
# Timing problem, process already terminated (max time too big?)
return -1
os.kill(pid, self.killsignal)
while 1:
pid, exitstatus = os.waitpid(pid, os.WNOHANG)
if pid:
assert exitstatus != 0
break
time.sleep(0.2)
print "---------------------- killed"
def create_killtest_dirs(self):
"""Create testfiles/killtest? directories
They are similar to the testfiles/increment? directories but
have more files in them so they take a significant time to
back up.
"""
def copy_thrice(input, output):
"""Copy input directory to output directory three times"""
assert not os.system("cp -a %s %s" % (input, output))
assert not os.system("cp -a %s %s/killtesta" % (input, output))
assert not os.system("cp -a %s %s/killtestb" % (input, output))
if (Local.kt1rp.lstat() and Local.kt2rp.lstat() and
Local.kt3rp.lstat() and Local.kt4rp.lstat()): return
assert not os.system("rm -rf testfiles/killtest?")
for i in [1, 2, 3, 4]:
copy_thrice("testfiles/increment%d" % i,
"testfiles/killtest%d" % i)
def runtest_sequence(self, total_tests,
exclude_rbdir, ignore_tmp, compare_links,
stop_on_error = None):
timing_problems, failures = 0, 0
for i in range(total_tests):
try:
result = self.runtest(exclude_rbdir, ignore_tmp, compare_links)
except TimingError, te:
print te
timing_problems += 1
continue
if result != 1:
if stop_on_error: assert 0, "Compare Failure"
else: failures += 1
print total_tests, "tests attempted total"
print "%s setup problems, %s failures, %s successes" % \
(timing_problems, failures,
total_tests - timing_problems - failures)
class KillTest(ProcessFuncs):
"""Test rdiff-backup by killing it, recovering, and then comparing"""
killsignal = signal.SIGTERM
# The following are lower and upper bounds on the amount of time
# rdiff-backup is expected to run. They are used to determine how
# long to wait before killing the rdiff-backup process
time_pairs = [(0.0, 3.7), (0.0, 3.7), (0.0, 3.0), (0.0, 5.0), (0.0, 5.0)]
def setUp(self):
"""Create killtest? and backup? directories if necessary"""
Local.kt1rp.setdata()
Local.kt2rp.setdata()
Local.kt3rp.setdata()
Local.kt4rp.setdata()
if (not Local.kt1rp.lstat() or not Local.kt2rp.lstat() or
not Local.kt3rp.lstat() or not Local.kt4rp.lstat()):
self.create_killtest_dirs()
def testTiming(self):
"""Run each rdiff-backup sequence 10 times, printing average time"""
time_list = [[], [], [], [], []] # List of time lists
iterations = 10
def run_once(current_time, input_rp, index):
start_time = time.time()
self.exec_rb(current_time, 1, input_rp.path, Local.rpout.path)
time_list[index].append(time.time() - start_time)
for i in range(iterations):
self.delete_tmpdirs()
run_once(10000, Local.kt3rp, 0)
run_once(20000, Local.kt1rp, 1)
run_once(30000, Local.kt3rp, 2)
run_once(40000, Local.kt3rp, 3)
run_once(50000, Local.kt3rp, 4)
for i in range(len(time_list)):
print "%s -> %s" % (i, " ".join(map(str, time_list[i])))
def mark_incomplete(self, curtime, rp):
"""Check the date of current mirror
Return 1 if there are two current_mirror incs and last one has
time curtime. Return 0 if only one with time curtime, and
then add a current_mirror marker. Return -1 if only one and
time is not curtime.
"""
rbdir = rp.append_path("rdiff-backup-data")
inclist = restore.get_inclist(rbdir.append("current_mirror"))
assert 1 <= len(inclist) <= 2, str(map(lambda x: x.path, inclist))
inc_date_pairs = map(lambda inc: (inc.getinctime(), inc), inclist)
inc_date_pairs.sort()
if len(inclist) == 2:
assert inc_date_pairs[-1][0] == curtime, \
(inc_date_pairs[-1][0], curtime)
return 1
if inc_date_pairs[-1][0] == curtime:
result = 0
marker_time = curtime - 10000
else:
assert inc_date_pairs[-1][0] == curtime - 10000
marker_time = curtime
result = -1
cur_mirror_rp = rbdir.append("current_mirror.%s.data" %
(Time.timetostring(marker_time),))
assert not cur_mirror_rp.lstat()
cur_mirror_rp.touch()
return result
def testTerm(self):
"""Run rdiff-backup, termining and regressing each time
Because rdiff-backup must be killed, the timing should be
updated
"""
count, killed_too_soon, killed_too_late = 5, [0]*4, [0]*4
self.delete_tmpdirs()
# Back up killtest3 first because it is big and the first case
# is kind of special (there's no incrementing, so different
# code)
self.exec_rb(10000, 1, Local.kt3rp.path, Local.rpout.path)
assert CompareRecursive(Local.kt3rp, Local.rpout)
def cycle_once(min_max_time_pair, curtime, input_rp, old_rp):
"""Backup input_rp, kill, regress, and then compare"""
time.sleep(1)
self.exec_and_kill(min_max_time_pair, curtime,
input_rp.path, Local.rpout.path)
result = self.mark_incomplete(curtime, Local.rpout)
assert not self.exec_rb(None, 1, '--check-destination-dir',
Local.rpout.path)
assert CompareRecursive(old_rp, Local.rpout, compare_hardlinks = 0)
return result
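		# Recovery flow exercised by cycle_once: an interrupted backup leaves
		# an extra current_mirror marker (see mark_incomplete above), and
		# running rdiff-backup --check-destination-dir regresses the
		# destination to the previous complete backup, so it should again
		# match old_rp.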
		# Keep backing up kt1rp, then regressing to kt3rp; end at kt1rp
for i in range(count):
result = cycle_once(self.time_pairs[1], 20000,
Local.kt1rp, Local.kt3rp)
if result == 0: killed_too_late[0] += 1
elif result == -1: killed_too_soon[0] += 1
self.exec_rb(20000, 1, Local.kt1rp.path, Local.rpout.path)
# Now keep regressing from kt2rp, only staying there at the end
for i in range(count):
result = cycle_once(self.time_pairs[2], 30000,
Local.kt2rp, Local.kt1rp)
if result == 0: killed_too_late[1] += 1
elif result == -1: killed_too_soon[1] += 1
self.exec_rb(30000, 1, Local.kt2rp.path, Local.rpout.path)
# Now keep regressing from kt3rp, only staying there at the end
for i in range(count):
result = cycle_once(self.time_pairs[3], 40000,
Local.kt3rp, Local.kt2rp)
if result == 0: killed_too_late[2] += 1
elif result == -1: killed_too_soon[2] += 1
self.exec_rb(40000, 1, Local.kt3rp.path, Local.rpout.path)
# Now keep regressing from kt4rp, only staying there at the end
for i in range(count):
result = cycle_once(self.time_pairs[4], 50000,
Local.kt4rp, Local.kt3rp)
if result == 0: killed_too_late[3] += 1
elif result == -1: killed_too_soon[3] += 1
print "Killed too soon out of %s: %s" % (count, killed_too_soon)
print "Killed too late out of %s: %s" % (count, killed_too_late)
if __name__ == "__main__": unittest.main()
from __future__ import generators
import unittest, pickle
from commontest import *
from rdiff_backup.lazy import *
class Iterators(unittest.TestCase):
one_to_100 = lambda s: iter(range(1, 101))
evens = lambda s: iter(range(2, 101, 2))
odds = lambda s: iter(range(1, 100, 2))
empty = lambda s: iter([])
def __init__(self, *args):
		unittest.TestCase.__init__(self, *args)
self.falseerror = self.falseerror_maker()
self.trueerror = self.trueerror_maker()
self.emptygen = self.emptygen_maker()
self.typeerror = self.typeerror_maker()
self.nameerror = self.nameerror_maker()
def falseerror_maker(self):
yield None
yield 0
yield []
raise Exception
def trueerror_maker(self):
yield 1
yield "hello"
yield (2, 3)
raise Exception
def nameerror_maker(self):
if 0: yield 1
raise NameError
def typeerror_maker(self):
yield 1
yield 2
raise TypeError
def alwayserror(self, x):
raise Exception
def emptygen_maker(self):
if 0: yield 1
class IterEqualTestCase(Iterators):
"""Tests for iter_equal function"""
def testEmpty(self):
"""Empty iterators should be equal"""
assert Iter.equal(self.empty(), iter([]))
def testNormal(self):
"""See if normal iterators are equal"""
assert Iter.equal(iter((1,2,3)), iter((1,2,3)))
assert Iter.equal(self.odds(), iter(range(1, 100, 2)))
assert Iter.equal(iter((1,2,3)), iter(range(1, 4)))
def testNormalInequality(self):
"""See if normal unequals work"""
assert not Iter.equal(iter((1,2,3)), iter((1,2,4)))
assert not Iter.equal(self.odds(), iter(["hello", "there"]))
def testGenerators(self):
"""equals works for generators"""
def f():
yield 1
yield "hello"
def g():
yield 1
yield "hello"
assert Iter.equal(f(), g())
def testLength(self):
"""Differently sized iterators"""
assert not Iter.equal(iter((1,2,3)), iter((1,2)))
assert not Iter.equal(iter((1,2)), iter((1,2,3)))
class FilterTestCase(Iterators):
"""Tests for lazy_filter function"""
def testEmpty(self):
"""empty iterators -> empty iterators"""
assert Iter.empty(Iter.filter(self.alwayserror,
self.empty())), \
"Filtering an empty iterator should result in empty iterator"
def testNum1(self):
"""Test numbers 1 - 100 #1"""
assert Iter.equal(Iter.filter(lambda x: x % 2 == 0,
self.one_to_100()),
self.evens())
assert Iter.equal(Iter.filter(lambda x: x % 2,
self.one_to_100()),
self.odds())
def testError(self):
"""Should raise appropriate error"""
i = Iter.filter(lambda x: x, self.falseerror_maker())
self.assertRaises(Exception, i.next)
class MapTestCase(Iterators):
"""Test mapping of iterators"""
def testNumbers(self):
"""1 to 100 * 2 = 2 to 200"""
assert Iter.equal(Iter.map(lambda x: 2*x, self.one_to_100()),
iter(range(2, 201, 2)))
def testShortcut(self):
"""Map should go in order"""
def f(x):
if x == "hello":
raise NameError
i = Iter.map(f, self.trueerror_maker())
i.next()
self.assertRaises(NameError, i.next)
def testEmpty(self):
"""Map of an empty iterator is empty"""
assert Iter.empty(Iter.map(lambda x: x, iter([])))
class CatTestCase(Iterators):
"""Test concatenation of iterators"""
def testEmpty(self):
"""Empty + empty = empty"""
assert Iter.empty(Iter.cat(iter([]), iter([])))
def testNumbers(self):
"""1 to 50 + 51 to 100 = 1 to 100"""
assert Iter.equal(Iter.cat(iter(range(1, 51)), iter(range(51, 101))),
self.one_to_100())
def testShortcut(self):
"""Process iterators in order"""
i = Iter.cat(self.typeerror_maker(), self.nameerror_maker())
i.next()
i.next()
self.assertRaises(TypeError, i.next)
class AndOrTestCase(Iterators):
"""Test And and Or"""
def testEmpty(self):
"""And() -> true, Or() -> false"""
assert Iter.And(self.empty())
assert not Iter.Or(self.empty())
def testAndShortcut(self):
"""And should return if any false"""
assert Iter.And(self.falseerror_maker()) is None
def testOrShortcut(self):
"""Or should return if any true"""
assert Iter.Or(self.trueerror_maker()) == 1
def testNormalAnd(self):
"""And should go through true iterators, picking last"""
assert Iter.And(iter([1,2,3,4])) == 4
self.assertRaises(Exception, Iter.And, self.trueerror_maker())
def testNormalOr(self):
"""Or goes through false iterators, picking last"""
assert Iter.Or(iter([0, None, []])) == []
self.assertRaises(Exception, Iter.Or, self.falseerror_maker())
class FoldingTest(Iterators):
"""Test folding operations"""
def f(self, x, y): return x + y
def testEmpty(self):
"""Folds of empty iterators should produce defaults"""
assert Iter.foldl(self.f, 23, self.empty()) == 23
assert Iter.foldr(self.f, 32, self.empty()) == 32
def testAddition(self):
"""Use folds to sum lists"""
assert Iter.foldl(self.f, 0, self.one_to_100()) == 5050
assert Iter.foldr(self.f, 0, self.one_to_100()) == 5050
def testLargeAddition(self):
"""Folds on 10000 element iterators"""
assert Iter.foldl(self.f, 0, iter(range(1, 10001))) == 50005000
self.assertRaises(RuntimeError,
Iter.foldr, self.f, 0, iter(range(1, 10001)))
def testLen(self):
"""Use folds to calculate length of lists"""
assert Iter.foldl(lambda x, y: x+1, 0, self.evens()) == 50
assert Iter.foldr(lambda x, y: y+1, 0, self.odds()) == 50
class MultiplexTest(Iterators):
def testSingle(self):
"""Test multiplex single stream"""
i_orig = self.one_to_100()
i2_orig = self.one_to_100()
i = Iter.multiplex(i_orig, 1)[0]
assert Iter.equal(i, i2_orig)
	def testTriple(self):
"""Test splitting iterator into three"""
counter = [0]
def ff(x): counter[0] += 1
i_orig = self.one_to_100()
i2_orig = self.one_to_100()
i1, i2, i3 = Iter.multiplex(i_orig, 3, ff)
assert Iter.equal(i1, i2)
assert Iter.equal(i3, i2_orig)
assert counter[0] == 100, counter
def testDouble(self):
"""Test splitting into two..."""
i1, i2 = Iter.multiplex(self.one_to_100(), 2)
assert Iter.equal(i1, self.one_to_100())
assert Iter.equal(i2, self.one_to_100())
if __name__ == "__main__": unittest.main()
import unittest, random
from commontest import *
from rdiff_backup import librsync, log
def MakeRandomFile(path, length = None):
"""Writes a random file of given length, or random len if unspecified"""
if not length: length = random.randrange(5000, 100000)
fp = open(path, "wb")
fp_random = open('/dev/urandom', 'rb')
# Old slow way, may still be of use on systems without /dev/urandom
#randseq = []
#for i in xrange(random.randrange(5000, 30000)):
# randseq.append(chr(random.randrange(256)))
#fp.write("".join(randseq))
fp.write(fp_random.read(length))
fp.close()
fp_random.close()
class LibrsyncTest(unittest.TestCase):
"""Test various librsync wrapper functions"""
basis = RPath(Globals.local_connection, "testfiles/basis")
new = RPath(Globals.local_connection, "testfiles/new")
new2 = RPath(Globals.local_connection, "testfiles/new2")
sig = RPath(Globals.local_connection, "testfiles/signature")
sig2 = RPath(Globals.local_connection, "testfiles/signature2")
delta = RPath(Globals.local_connection, "testfiles/delta")
def sig_file_test_helper(self, blocksize, iterations, file_len = None):
"""Compare SigFile output to rdiff output at given blocksize"""
for i in range(iterations):
MakeRandomFile(self.basis.path, file_len)
if self.sig.lstat(): self.sig.delete()
assert not os.system("rdiff -b %s signature %s %s" %
(blocksize, self.basis.path, self.sig.path))
fp = self.sig.open("rb")
rdiff_sig = fp.read()
fp.close()
sf = librsync.SigFile(self.basis.open("rb"), blocksize)
librsync_sig = sf.read()
sf.close()
assert rdiff_sig == librsync_sig, \
(len(rdiff_sig), len(librsync_sig))
def testSigFile(self):
"""Make sure SigFile generates same data as rdiff, blocksize 512"""
self.sig_file_test_helper(512, 5)
def testSigFile2(self):
"""Test SigFile like above, but try various blocksize"""
self.sig_file_test_helper(2048, 1, 60000)
self.sig_file_test_helper(7168, 1, 6000)
self.sig_file_test_helper(204800, 1, 40*1024*1024)
def testSigGenerator(self):
"""Test SigGenerator, make sure it's same as SigFile"""
for i in range(5):
MakeRandomFile(self.basis.path)
sf = librsync.SigFile(self.basis.open("rb"))
sigfile_string = sf.read()
sf.close()
sig_gen = librsync.SigGenerator()
infile = self.basis.open("rb")
while 1:
buf = infile.read(1000)
if not buf: break
sig_gen.update(buf)
siggen_string = sig_gen.getsig()
assert sigfile_string == siggen_string, \
(len(sigfile_string), len(siggen_string))
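	# Note: the following test was apparently renamed from testDelta so that
	# unittest skips it; byte-for-byte delta comparison proved unreliable
	# (see the replacement testDelta below).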
def OldtestDelta(self):
"""Test delta generation against Rdiff"""
MakeRandomFile(self.basis.path)
assert not os.system("rdiff signature %s %s" %
(self.basis.path, self.sig.path))
for i in range(5):
MakeRandomFile(self.new.path)
assert not os.system("rdiff delta %s %s %s" %
(self.sig.path, self.new.path, self.delta.path))
fp = self.delta.open("rb")
rdiff_delta = fp.read()
fp.close()
df = librsync.DeltaFile(self.sig.open("rb"), self.new.open("rb"))
librsync_delta = df.read()
df.close()
print len(rdiff_delta), len(librsync_delta)
print repr(rdiff_delta[:100])
print repr(librsync_delta[:100])
assert rdiff_delta == librsync_delta
def testDelta(self):
"""Test delta generation by making sure rdiff can process output
		There appears to be some nondeterminism, so we can't simply
		byte-compare the deltas produced by rdiff and DeltaFile.
"""
MakeRandomFile(self.basis.path)
assert not os.system("rdiff signature %s %s" %
(self.basis.path, self.sig.path))
for i in range(5):
MakeRandomFile(self.new.path)
df = librsync.DeltaFile(self.sig.open("rb"), self.new.open("rb"))
librsync_delta = df.read()
df.close()
fp = self.delta.open("wb")
fp.write(librsync_delta)
fp.close()
assert not os.system("rdiff patch %s %s %s" %
(self.basis.path, self.delta.path,
self.new2.path))
new_fp = self.new.open("rb")
new = new_fp.read()
new_fp.close()
new2_fp = self.new2.open("rb")
new2 = new2_fp.read()
new2_fp.close()
assert new == new2, (len(new), len(new2))
def testPatch(self):
"""Test patching against Rdiff"""
MakeRandomFile(self.basis.path)
assert not os.system("rdiff signature %s %s" %
(self.basis.path, self.sig.path))
for i in range(5):
MakeRandomFile(self.new.path)
assert not os.system("rdiff delta %s %s %s" %
(self.sig.path, self.new.path, self.delta.path))
fp = self.new.open("rb")
real_new = fp.read()
fp.close()
pf = librsync.PatchedFile(self.basis.open("rb"),
self.delta.open("rb"))
librsync_new = pf.read()
pf.close()
assert real_new == librsync_new, \
(len(real_new), len(librsync_new))
if __name__ == "__main__": unittest.main()
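# The librsync wrapper pipeline exercised above, in brief (variable names
# here are illustrative):
#   sig_fp   = librsync.SigFile(basis_fp)          # signature of basis
#   delta_fp = librsync.DeltaFile(sig_fp, new_fp)  # delta taking basis to new
#   new_fp2  = librsync.PatchedFile(basis_fp, delta_fp)  # reproduces new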
#!/bin/sh
# This script will create the testfiles/restoretest3 directory as it
# needs to be for one of the tests in restoretest.py to work.
rm -rf testfiles/restoretest3
rdiff-backup --current-time 10000 testfiles/increment1 testfiles/restoretest3
rdiff-backup --current-time 20000 testfiles/increment2 testfiles/restoretest3
rdiff-backup --current-time 30000 testfiles/increment3 testfiles/restoretest3
rdiff-backup --current-time 40000 testfiles/increment4 testfiles/restoretest3
import unittest, os, cStringIO, time
from rdiff_backup.metadata import *
from rdiff_backup import rpath, connection, Globals, selection
tempdir = rpath.RPath(Globals.local_connection, "testfiles/output")
class MetadataTest(unittest.TestCase):
def make_temp(self):
"""Make temp directory testfiles/output"""
global tempdir
if tempdir.lstat(): tempdir.delete()
tempdir.mkdir()
def testQuote(self):
"""Test quoting and unquoting"""
filenames = ["foo", ".", "hello\nthere", "\\", "\\\\\\",
"h\no\t\x87\n", " "]
for filename in filenames:
quoted = quote_path(filename)
assert not "\n" in quoted
result = unquote_path(quoted)
assert result == filename, (quoted, result, filename)
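	# A minimal sketch of the assumed quoting scheme: newlines and
	# backslashes are escaped (e.g. "hello\nthere" -> "hello\\nthere") so
	# that each metadata record stays line-based, and unquote_path reverses
	# the substitution.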
def get_rpaths(self):
"""Return list of rorps"""
vft = rpath.RPath(Globals.local_connection,
"testfiles/various_file_types")
rpaths = map(lambda x: vft.append(x), vft.listdir())
extra_rpaths = map(lambda x: rpath.RPath(Globals.local_connection, x),
['/bin/ls', '/dev/ttyS0', '/dev/hda', 'aoeuaou'])
return [vft] + rpaths + extra_rpaths
def testRORP2Record(self):
"""Test turning RORPs into records and back again"""
for rp in self.get_rpaths():
record = RORP2Record(rp)
#print record
new_rorp = Record2RORP(record)
assert new_rorp == rp, (new_rorp, rp, record)
def testIterator(self):
"""Test writing RORPs to file and iterating them back"""
l = self.get_rpaths()
fp = cStringIO.StringIO()
write_rorp_iter_to_file(iter(l), fp)
fp.seek(0)
cstring = fp.read()
fp.seek(0)
outlist = list(rorp_extractor(fp).iterate())
assert len(l) == len(outlist), (len(l), len(outlist))
for i in range(len(l)):
if not l[i].equal_verbose(outlist[i]):
#print cstring
assert 0, (i, str(l[i]), str(outlist[i]))
fp.close()
def write_metadata_to_temp(self):
"""If necessary, write metadata of bigdir to file metadata.gz"""
global tempdir
temprp = tempdir.append("metadata.gz")
if temprp.lstat(): return temprp
self.make_temp()
rootrp = rpath.RPath(Globals.local_connection, "testfiles/bigdir")
rpath_iter = selection.Select(rootrp).set_iter()
start_time = time.time()
OpenMetadata(temprp)
for rp in rpath_iter: WriteMetadata(rp)
CloseMetadata()
print "Writing metadata took %s seconds" % (time.time() - start_time)
return temprp
def testSpeed(self):
"""Test testIterator on 10000 files"""
temprp = self.write_metadata_to_temp()
start_time = time.time(); i = 0
for rorp in GetMetadata(temprp): i += 1
print "Reading %s metadata entries took %s seconds." % \
(i, time.time() - start_time)
start_time = time.time()
blocksize = 32 * 1024
tempfp = temprp.open("rb", compress = 1)
while 1:
buf = tempfp.read(blocksize)
if not buf: break
assert not tempfp.close()
print "Simply decompressing metadata file took %s seconds" % \
(time.time() - start_time)
def testIterate_restricted(self):
"""Test getting rorps restricted to certain index
		In this case, we assume the subdirectory at index (subdir3,
		subdir10) has 50 files in it (plus the directory entry itself,
		hence the 51 below).
"""
temprp = self.write_metadata_to_temp()
start_time = time.time(); i = 0
for rorp in GetMetadata(temprp, ("subdir3", "subdir10")): i += 1
print "Reading %s metadata entries took %s seconds." % \
(i, time.time() - start_time)
assert i == 51
if __name__ == "__main__": unittest.main()
import unittest, random
from commontest import *
from rdiff_backup import Globals, Rdiff, selection, log, rpath
Log.setverbosity(7)
def MakeRandomFile(path):
"""Writes a random file of length between 10000 and 100000"""
fp = open(path, "w")
randseq = []
for i in xrange(random.randrange(5000, 30000)):
randseq.append(chr(random.randrange(256)))
fp.write("".join(randseq))
fp.close()
class RdiffTest(unittest.TestCase):
"""Test rdiff"""
lc = Globals.local_connection
basis = rpath.RPath(lc, "testfiles/basis")
new = rpath.RPath(lc, "testfiles/new")
output = rpath.RPath(lc, "testfiles/output")
delta = rpath.RPath(lc, "testfiles/delta")
signature = rpath.RPath(lc, "testfiles/signature")
def testRdiffSig(self):
"""Test making rdiff signatures"""
sig = rpath.RPath(self.lc,
"testfiles/various_file_types/regular_file.sig")
sigfp = sig.open("r")
rfsig = Rdiff.get_signature(RPath(self.lc, "testfiles/various_file_types/regular_file"), 2048)
assert rpath.cmpfileobj(sigfp, rfsig)
sigfp.close()
rfsig.close()
def testRdiffDeltaPatch(self):
"""Test making deltas and patching files"""
rplist = [self.basis, self.new, self.delta,
self.signature, self.output]
for rp in rplist:
if rp.lstat(): rp.delete()
for i in range(2):
MakeRandomFile(self.basis.path)
MakeRandomFile(self.new.path)
map(rpath.RPath.setdata, [self.basis, self.new])
assert self.basis.lstat() and self.new.lstat()
self.signature.write_from_fileobj(Rdiff.get_signature(self.basis))
assert self.signature.lstat()
self.delta.write_from_fileobj(Rdiff.get_delta_sigrp(self.signature,
self.new))
assert self.delta.lstat()
Rdiff.patch_local(self.basis, self.delta, self.output)
assert rpath.cmp(self.new, self.output)
map(rpath.RPath.delete, rplist)
def testRdiffDeltaPatchGzip(self):
"""Same as above by try gzipping patches"""
rplist = [self.basis, self.new, self.delta,
self.signature, self.output]
for rp in rplist:
if rp.lstat(): rp.delete()
MakeRandomFile(self.basis.path)
MakeRandomFile(self.new.path)
map(rpath.RPath.setdata, [self.basis, self.new])
assert self.basis.lstat() and self.new.lstat()
self.signature.write_from_fileobj(Rdiff.get_signature(self.basis))
assert self.signature.lstat()
self.delta.write_from_fileobj(Rdiff.get_delta_sigrp(self.signature,
self.new))
assert self.delta.lstat()
os.system("gzip " + self.delta.path)
os.system("mv %s %s" % (self.delta.path + ".gz", self.delta.path))
self.delta.setdata()
Rdiff.patch_local(self.basis, self.delta, self.output,
delta_compressed = 1)
assert rpath.cmp(self.new, self.output)
map(rpath.RPath.delete, rplist)
def testWriteDelta(self):
"""Test write delta feature of rdiff"""
if self.delta.lstat(): self.delta.delete()
rplist = [self.basis, self.new, self.delta, self.output]
MakeRandomFile(self.basis.path)
MakeRandomFile(self.new.path)
map(rpath.RPath.setdata, [self.basis, self.new])
assert self.basis.lstat() and self.new.lstat()
Rdiff.write_delta(self.basis, self.new, self.delta)
assert self.delta.lstat()
Rdiff.patch_local(self.basis, self.delta, self.output)
assert rpath.cmp(self.new, self.output)
map(rpath.RPath.delete, rplist)
def testWriteDeltaGzip(self):
"""Same as above but delta is written gzipped"""
rplist = [self.basis, self.new, self.delta, self.output]
MakeRandomFile(self.basis.path)
MakeRandomFile(self.new.path)
map(rpath.RPath.setdata, [self.basis, self.new])
assert self.basis.lstat() and self.new.lstat()
delta_gz = rpath.RPath(self.delta.conn, self.delta.path + ".gz")
if delta_gz.lstat(): delta_gz.delete()
Rdiff.write_delta(self.basis, self.new, delta_gz, 1)
assert delta_gz.lstat()
os.system("gunzip " + delta_gz.path)
delta_gz.setdata()
self.delta.setdata()
Rdiff.patch_local(self.basis, self.delta, self.output)
assert rpath.cmp(self.new, self.output)
map(rpath.RPath.delete, rplist)
def testRdiffRename(self):
"""Rdiff replacing original file with patch outfile"""
rplist = [self.basis, self.new, self.delta, self.signature]
for rp in rplist:
if rp.lstat(): rp.delete()
MakeRandomFile(self.basis.path)
MakeRandomFile(self.new.path)
map(rpath.RPath.setdata, [self.basis, self.new])
assert self.basis.lstat() and self.new.lstat()
self.signature.write_from_fileobj(Rdiff.get_signature(self.basis))
assert self.signature.lstat()
self.delta.write_from_fileobj(Rdiff.get_delta_sigrp(self.signature,
self.new))
assert self.delta.lstat()
Rdiff.patch_local(self.basis, self.delta)
assert rpath.cmp(self.basis, self.new)
map(rpath.RPath.delete, rplist)
def testCopy(self):
"""Using rdiff to copy two files"""
rplist = [self.basis, self.new]
for rp in rplist:
if rp.lstat(): rp.delete()
MakeRandomFile(self.basis.path)
MakeRandomFile(self.new.path)
map(rpath.RPath.setdata, rplist)
Rdiff.copy_local(self.basis, self.new)
assert rpath.cmp(self.basis, self.new)
map(rpath.RPath.delete, rplist)
if __name__ == '__main__':
unittest.main()
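# For illustration, a minimal sketch of the signature/delta/patch cycle the
# tests above exercise; the paths are hypothetical examples, and the calls
# are the same ones used in RdiffTest.
def example_rdiff_cycle():
    lc = Globals.local_connection
    basis = rpath.RPath(lc, "testfiles/basis")   # old version of a file
    new = rpath.RPath(lc, "testfiles/new")       # new version of the file
    sig = rpath.RPath(lc, "testfiles/signature")
    delta = rpath.RPath(lc, "testfiles/delta")
    out = rpath.RPath(lc, "testfiles/output")
    MakeRandomFile(basis.path)
    MakeRandomFile(new.path)
    map(rpath.RPath.setdata, [basis, new])
    sig.write_from_fileobj(Rdiff.get_signature(basis))         # summarize old
    delta.write_from_fileobj(Rdiff.get_delta_sigrp(sig, new))  # old -> new
    Rdiff.patch_local(basis, delta, out)   # rebuild new from old plus delta
    assert rpath.cmp(new, out)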
import unittest, os
from commontest import *
from rdiff_backup import Globals, SetConnections, log, rpath, backup
"""Regression tests
This one must be run in the rdiff-backup directory, as it requires
chdir-wrapper, the various rdiff-backup files, and the testfiles
directory.
"""
Globals.set('change_source_perms', 1)
Globals.counter = 0
log.Log.setverbosity(3)
def get_local_rp(extension):
return rpath.RPath(Globals.local_connection, "testfiles/" + extension)
class Local:
"""This is just a place to put increments relative to the local
connection"""
inc1rp = get_local_rp('increment1')
inc2rp = get_local_rp('increment2')
inc3rp = get_local_rp('increment3')
inc4rp = get_local_rp('increment4')
rpout = get_local_rp('output')
rpout_inc = get_local_rp('output_inc')
rpout1 = get_local_rp('restoretarget1')
rpout2 = get_local_rp('restoretarget2')
rpout3 = get_local_rp('restoretarget3')
rpout4 = get_local_rp('restoretarget4')
noperms = get_local_rp('noperms')
noperms_out = get_local_rp('noperms_output')
rootfiles = get_local_rp('root')
rootfiles2 = get_local_rp('root2')
rootfiles21 = get_local_rp('root2.1')
rootfiles_out = get_local_rp('root_output')
rootfiles_out2 = get_local_rp('root_output2')
prefix = get_local_rp('.')
class PathSetter(unittest.TestCase):
def get_prefix_and_conn(self, path, return_path):
"""Return (prefix, connection) tuple"""
if path:
return (return_path,
SetConnections.init_connection("./chdir-wrapper "+path))
else: return ('./', Globals.local_connection)
def get_src_rp(self, path):
return rpath.RPath(self.src_conn, self.src_prefix + path)
def get_dest_rp(self, path):
return rpath.RPath(self.dest_conn, self.dest_prefix + path)
def set_rbdir(self, rpout):
"""Create rdiff-backup-data dir if not already, tell everyone"""
self.rbdir = self.rpout.append('rdiff-backup-data')
self.rpout.mkdir()
self.rbdir.mkdir()
SetConnections.UpdateGlobal('rbdir', self.rbdir)
def setPathnames(self, src_path, src_return, dest_path, dest_return):
"""Start servers which will run in src_path and dest_path respectively
If either is None, then no server will be run and local
process will handle that end. src_return and dest_return are
the prefixes back to the original rdiff-backup directory. So,
for instance, if src_path is "test2/tmp", then src_return will
be '../../'.
"""
# Clear old data that may rely on deleted connections
Globals.isbackup_writer = None
Globals.isbackup_reader = None
Globals.rbdir = None
print "Setting up connection"
self.src_prefix, self.src_conn = \
self.get_prefix_and_conn(src_path, src_return)
self.dest_prefix, self.dest_conn = \
self.get_prefix_and_conn(dest_path, dest_return)
SetConnections.BackupInitConnections(self.src_conn, self.dest_conn)
assert not os.system("rm -rf testfiles/output* "
"testfiles/restoretarget* "
"testfiles/noperms_output testfiles/root_output "
"testfiles/unreadable_out")
self.inc1rp = self.get_src_rp("testfiles/increment1")
self.inc2rp = self.get_src_rp('testfiles/increment2')
self.inc3rp = self.get_src_rp('testfiles/increment3')
self.inc4rp = self.get_src_rp('testfiles/increment4')
self.rpout_inc = self.get_dest_rp('testfiles/output_inc')
self.rpout1 = self.get_dest_rp('testfiles/restoretarget1')
self.rpout2 = self.get_dest_rp('testfiles/restoretarget2')
self.rpout3 = self.get_dest_rp('testfiles/restoretarget3')
self.rpout4 = self.get_dest_rp('testfiles/restoretarget4')
self.rpout = self.get_dest_rp('testfiles/output')
self.set_rbdir(self.rpout)
self.noperms = self.get_src_rp('testfiles/noperms')
self.noperms_out = self.get_dest_rp('testfiles/noperms_output')
self.rootfiles = self.get_src_rp('testfiles/root')
self.rootfiles_out = self.get_dest_rp('testfiles/root_output')
self.rootfiles2 = self.get_src_rp('testfiles/root2')
self.rootfiles21 = self.get_src_rp('testfiles/root2.1')
self.rootfiles_out2 = self.get_dest_rp('testfiles/root_output2')
self.one_unreadable = self.get_src_rp('testfiles/one_unreadable')
self.one_unreadable_out = self.get_dest_rp('testfiles/unreadable_out')
def tearDown(self):
print "Taking down connections"
SetConnections.CloseConnections()
class IncrementTest1(unittest.TestCase):
dirlist = ["testfiles/increment1", "testfiles/increment2",
"testfiles/increment3", "testfiles/increment4"]
gzip_dirlist = ["testfiles/gzips/inc1", "testfiles/gzips/inc2"]
def testLocalGzipinc(self):
"""Local test small archive which exercises gzip options"""
BackupRestoreSeries(1, 1, self.gzip_dirlist)
def testRemoteBothGzipinc(self):
"""Remote test small archive which exercises gzip options"""
BackupRestoreSeries(None, None, self.gzip_dirlist)
def testLocalinc(self):
"""Test self.incrementing, and then restoring, local"""
BackupRestoreSeries(1, 1, self.dirlist)
def test_remote_src(self):
"""Increment/Restore when source directory is remote"""
BackupRestoreSeries(None, 1, self.dirlist)
def test_remote_dest(self):
"""Increment/Restore when target directory is remote"""
BackupRestoreSeries(1, None, self.dirlist)
def test_remote_both(self):
"""Increment/Restore when both directories are remote"""
BackupRestoreSeries(None, None, self.dirlist)
def test_long_filenames_local(self):
"""Test backing up a directory with lots of long filenames in it"""
Myrm(Local.rpout.path)
InternalBackup(1, 1, "testfiles/longfilenames1", Local.rpout.path, 100)
InternalBackup(1, 1, "testfiles/longfilenames2", Local.rpout.path, 200)
def test_quoted_hardlinks(self):
"""Test backing up a directory with quoted hardlinks in it"""
hldir = rpath.RPath(Globals.local_connection,
"testfiles/quoted_hardlinks")
if hldir.lstat():
Myrm(hldir.path)
hldir.setdata()
hldir.mkdir()
hl1 = hldir.append("HardLink1")
hl1.touch()
hl2 = hldir.append("HardLink2")
hl2.hardlink(hl1.path)
Myrm(Local.rpout.path)
old_settings = (Globals.quoting_enabled, Globals.chars_to_quote,
Globals.quoting_char)
Globals.quoting_enabled = 1
Globals.chars_to_quote = 'A-Z'
Globals.quoting_char = ';'
InternalBackup(1, 1, hldir.path, Local.rpout.path, current_time = 1)
InternalBackup(1, 1, "testfiles/empty", Local.rpout.path,
current_time = 10000)
(Globals.quoting_enabled, Globals.chars_to_quote,
Globals.quoting_char) = old_settings
def test_long_socket(self):
"""Test backing up a directory with long sockets in them
For some reason many unicies don't allow sockets with long
names to be made in the usual way.
"""
sockdir = rpath.RPath(Globals.local_connection, "testfiles/sockettest")
if sockdir.lstat():
Myrm(sockdir.path)
sockdir.setdata()
sockdir.mkdir()
tmp_sock = sockdir.append("sock")
tmp_sock.mksock()
sock1 = sockdir.append("Long_socket_name---------------------------------------------------------------------------------------------------")
self.assertRaises(rpath.SkipFileException, sock1.mksock)
rpath.rename(tmp_sock, sock1)
assert sock1.issock()
sock2 = sockdir.append("Medium_socket_name--------------------------------------------------------------")
sock2.mksock()
Myrm(Local.rpout.path)
InternalBackup(1, 1, sockdir.path, Local.rpout.path,
current_time = 1)
InternalBackup(1, 1, "testfiles/empty", Local.rpout.path,
current_time = 10000)
def testNoWrite(self):
"""Test backup/restore on dirs without write permissions"""
def write_string(rp, s = ""):
"""Write string s to file"""
fp = rp.open("wb")
fp.write(s)
assert not fp.close()
def make_subdirs():
"""Make testfiles/no_write_out and testfiles/no_write_out2"""
nw_out1 = get_local_rp("no_write_out")
nw_out1.mkdir()
nw_out1_1 = get_local_rp("no_write_out/1")
write_string(nw_out1_1)
nw_out1_1.chmod(0)
nw_out1_2 = get_local_rp("no_write_out/2")
write_string(nw_out1_2, 'e')
nw_out1_1.chmod(0400)
nw1_sub = get_local_rp("no_write_out/subdir")
nw1_sub.mkdir()
nw_out1_sub1 = get_local_rp("no_write_out/subdir/1")
write_string(nw_out1_sub1, 'f')
nw1_sub.chmod(0500)
nw_out1.chmod(0500)
nw_out2 = get_local_rp("no_write_out2")
nw_out2.mkdir()
nw_out2_1 = get_local_rp("no_write_out2/1")
write_string(nw_out2_1, 'g')
nw_out2_2 = get_local_rp("no_write_out2/2")
write_string(nw_out2_2, 'aeu')
nw_out2.chmod(0500)
Myrm("testfiles/no_write_out")
Myrm("testfiles/no_write_out2")
Myrm("testfiles/output")
make_subdirs()
BackupRestoreSeries(1, 1, ['testfiles/no_write_out',
'testfiles/no_write_out2',
'testfiles/empty'])
class MirrorTest(PathSetter):
"""Test some mirroring functions"""
def testLocalMirror(self):
"""Test Local mirroring"""
self.setPathnames(None, None, None, None)
self.runtest()
def testPartialLocalMirror(self):
"""Test updating an existing directory"""
self.setPathnames(None, None, None, None)
self.run_partial_test()
def testRemoteMirror(self):
"""Mirroring when destination is remote"""
self.setPathnames(None, None, 'test1', '../')
self.runtest()
def testPartialRemoteMirror(self):
"""Partial mirroring when destination is remote"""
self.setPathnames(None, None, 'test1', '../')
self.run_partial_test()
def testSourceRemoteMirror(self):
"""Mirroring when source is remote"""
self.setPathnames('test2', '../', None, None)
self.runtest()
def testPartialSourceRemoteMirror(self):
"""Partial Mirroring when source is remote"""
self.setPathnames('test2', '../', None, None)
self.run_partial_test()
def testBothRemoteMirror(self):
"""Mirroring when both directories are remote"""
self.setPathnames('test1', '../', 'test2/tmp', '../../')
self.runtest()
def testPartialBothRemoteMirror(self):
"""Partial mirroring when both directories are remote"""
self.setPathnames('test1', '../', 'test2/tmp', '../../')
self.run_partial_test()
def testPermSkipLocal(self):
"""Test to see if rdiff-backup will skip unreadable files"""
self.setPathnames(None, None, None, None)
Time.setcurtime()
self.Mirror(self.one_unreadable, self.one_unreadable_out)
# Could add test, but for now just make sure it doesn't exit
def testPermSkipRemote(self):
"""Test skip of unreadable files remote"""
self.setPathnames('test1', '../', 'test2/tmp', '../../')
Time.setcurtime()
self.Mirror(self.one_unreadable, self.one_unreadable_out)
# Could add test, but for now just make sure it doesn't exit
def refresh(self, *rps):
for rp in rps: rp.setdata()
def _testRootLocal(self):
"""Test mirroring a directory with dev files and different owners"""
self.setPathnames(None, None, None, None)
Globals.change_ownership = 1
self.refresh(self.rootfiles, self.rootfiles_out,
Local.rootfiles, Local.rootfiles_out) # add uid/gid info
backup.Mirror(self.rootfiles, self.rootfiles_out)
assert CompareRecursive(Local.rootfiles, Local.rootfiles_out)
Globals.change_ownership = None
self.refresh(self.rootfiles, self.rootfiles_out,
Local.rootfiles, Local.rootfiles_out) # remove that info
def _testRootRemote(self):
"""Mirroring root files both ends remote"""
self.setPathnames('test1', '../', 'test2/tmp', '../../')
for conn in Globals.connections:
conn.Globals.set('change_ownership', 1)
self.refresh(self.rootfiles, self.rootfiles_out,
Local.rootfiles, Local.rootfiles_out) # add uid/gid info
backup.Mirror(self.rootfiles, self.rootfiles_out)
assert CompareRecursive(Local.rootfiles, Local.rootfiles_out)
for conn in Globals.connections:
conn.Globals.set('change_ownership', None)
self.refresh(self.rootfiles, self.rootfiles_out,
Local.rootfiles, Local.rootfiles_out) # remove that info
def deleteoutput(self):
assert not os.system("rm -rf testfiles/output*")
self.rbdir = self.rpout.append('rdiff-backup-data')
self.reset_rps()
def reset_rps(self):
"""Use after external changes made, to update the rps"""
for rp in [self.rpout, Local.rpout,
self.rpout_inc, Local.rpout_inc,
self.rpout1, Local.rpout1,
self.rpout2, Local.rpout2,
self.rpout3, Local.rpout3,
self.rpout4, Local.rpout4]:
rp.setdata()
def runtest(self):
self.deleteoutput()
Time.setcurtime()
assert not self.rbdir.lstat()
self.Mirror(self.inc1rp, self.rpout)
assert CompareRecursive(Local.inc1rp, Local.rpout)
self.deleteoutput()
self.Mirror(self.inc2rp, self.rpout)
assert CompareRecursive(Local.inc2rp, Local.rpout)
def run_partial_test(self):
assert not os.system("rm -rf testfiles/output")
assert not os.system("cp -a testfiles/increment3 testfiles/output")
self.reset_rps()
Time.setcurtime()
self.Mirror(self.inc1rp, self.rpout)
#rpath.RPath.copy_attribs(self.inc1rp, self.rpout)
assert CompareRecursive(Local.inc1rp, Local.rpout)
Myrm(Local.rpout.append("rdiff-backup-data").path)
self.Mirror(self.inc2rp, self.rpout)
assert CompareRecursive(Local.inc2rp, Local.rpout)
def Mirror(self, rpin, rpout):
"""Like backup.Mirror, but setup first, cleanup later"""
Main.force = 1
assert not rpout.append("rdiff-backup-data").lstat()
Main.misc_setup([rpin, rpout])
Main.backup_set_select(rpin)
Main.backup_init_dirs(rpin, rpout)
backup.Mirror(rpin, rpout)
log.ErrorLog.close()
log.Log.close_logfile()
Hardlink.clear_dictionaries()
if __name__ == "__main__": unittest.main()
"""regresstest - test the regress module.
Not to be confused with the regression tests.
"""
import unittest
from commontest import *
from rdiff_backup import regress, Time
Log.setverbosity(3)
class RegressTest(unittest.TestCase):
output_rp = rpath.RPath(Globals.local_connection, "testfiles/output")
output_rbdir_rp = output_rp.append_path("rdiff-backup-data")
inc1_rp = rpath.RPath(Globals.local_connection, "testfiles/increment1")
inc2_rp = rpath.RPath(Globals.local_connection, "testfiles/increment2")
inc3_rp = rpath.RPath(Globals.local_connection, "testfiles/increment3")
inc4_rp = rpath.RPath(Globals.local_connection, "testfiles/increment4")
def runtest(self, regress_function):
"""Test regressing a full directory to older state
Make two directories, one with one more backup in it. Then
regress the bigger one, and then make sure they compare the
same.
Regress_function takes a time and should regress
self.output_rp back to that time.
"""
self.output_rp.setdata()
if self.output_rp.lstat(): Myrm(self.output_rp.path)
rdiff_backup(1, 1, self.inc1_rp.path, self.output_rp.path,
current_time = 10000)
assert CompareRecursive(self.inc1_rp, self.output_rp)
rdiff_backup(1, 1, self.inc2_rp.path, self.output_rp.path,
current_time = 20000)
assert CompareRecursive(self.inc2_rp, self.output_rp)
rdiff_backup(1, 1, self.inc3_rp.path, self.output_rp.path,
current_time = 30000)
assert CompareRecursive(self.inc3_rp, self.output_rp)
rdiff_backup(1, 1, self.inc4_rp.path, self.output_rp.path,
current_time = 40000)
assert CompareRecursive(self.inc4_rp, self.output_rp)
Globals.rbdir = self.output_rbdir_rp
regress_function(30000)
assert CompareRecursive(self.inc3_rp, self.output_rp,
compare_hardlinks = 0)
regress_function(20000)
assert CompareRecursive(self.inc2_rp, self.output_rp,
compare_hardlinks = 0)
regress_function(10000)
assert CompareRecursive(self.inc1_rp, self.output_rp,
compare_hardlinks = 0)
def regress_to_time_local(self, time):
"""Regress self.output_rp to time by running regress locally"""
self.output_rp.setdata()
self.output_rbdir_rp.setdata()
self.add_current_mirror(time)
regress.Regress(self.output_rp)
def add_current_mirror(self, time):
"""Add current_mirror marker at given time"""
cur_mirror_rp = self.output_rbdir_rp.append(
"current_mirror.%s.data" % (Time.timetostring(time),))
cur_mirror_rp.touch()
def regress_to_time_remote(self, time):
"""Like test_full above, but run regress remotely"""
self.output_rp.setdata()
self.output_rbdir_rp.setdata()
self.add_current_mirror(time)
cmdline = (SourceDir +
"/../rdiff-backup -v3 --check-destination-dir "
"--remote-schema './chdir-wrapper2 %s' "
"test1::../" + self.output_rp.path)
print "Running:", cmdline
assert not os.system(cmdline)
def test_local(self):
"""Run regress test locally"""
self.runtest(self.regress_to_time_local)
def test_remote(self):
"""Run regress test remotely"""
self.runtest(self.regress_to_time_remote)
if __name__ == "__main__": unittest.main()
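# For illustration, a minimal sketch of the local regress pattern used
# above; output_rp, output_rbdir_rp and time are hypothetical arguments,
# and the calls mirror regress_to_time_local.
def example_regress(output_rp, output_rbdir_rp, time):
    Globals.rbdir = output_rbdir_rp
    marker = output_rbdir_rp.append(
        "current_mirror.%s.data" % (Time.timetostring(time),))
    marker.touch()               # add a current_mirror marker at target time
    regress.Regress(output_rp)   # roll the destination back to that time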
import unittest
from commontest import *
from rdiff_backup import log, restore, Globals, rpath, TempFile
Log.setverbosity(3)
lc = Globals.local_connection
tempdir = rpath.RPath(Globals.local_connection, "testfiles/output")
restore_base_rp = rpath.RPath(Globals.local_connection,
"testfiles/restoretest")
restore_base_filenames = restore_base_rp.listdir()
mirror_time = 1041109438 # just some late time
class RestoreFileComparer:
"""Holds a file to be restored and tests against it
Each object has a restore file and a dictionary of times ->
rpaths. When the restore file is restored to one of the given
times, the resulting file should be the same as the related rpath.
"""
def __init__(self, rf):
self.rf = rf
self.time_rp_dict = {}
def add_rpath(self, rp, t):
"""Add rp, which represents what rf should be at given time t"""
assert not self.time_rp_dict.has_key(t)
self.time_rp_dict[t] = rp
def compare_at_time(self, t):
"""Restore file, make sure it is the same at time t"""
log.Log("Checking result at time %s" % (t,), 7)
tf = TempFile.new(tempdir.append("foo"))
restore._mirror_time = mirror_time
restore._rest_time = t
self.rf.set_relevant_incs()
out_rorpath = self.rf.get_attribs().getRORPath()
correct_result = self.time_rp_dict[t]
if out_rorpath.isreg():
out_rorpath.setfile(self.rf.get_restore_fp())
rpath.copy_with_attribs(out_rorpath, tf)
assert tf.equal_verbose(correct_result, check_index = 0), \
"%s, %s" % (tf, correct_result)
if tf.isreg():
assert rpath.cmpfileobj(tf.open("rb"), correct_result.open("rb"))
if tf.lstat(): tf.delete()
def compare_all(self):
"""Check restore results for all available times"""
for t in self.time_rp_dict.keys(): self.compare_at_time(t)
class RestoreTest(unittest.TestCase):
"""Test Restore class"""
def get_rfcs(self):
"""Return available RestoreFileCompararer objects"""
base_rf = restore.RestoreFile(restore_base_rp, restore_base_rp, [])
rfs = base_rf.yield_sub_rfs()
rfcs = []
for rf in rfs:
if rf.mirror_rp.dirsplit()[1] in ["dir"]:
log.Log("skipping 'dir'", 5)
continue
rfc = RestoreFileComparer(rf)
for inc in rf.inc_list:
test_time = inc.getinctime()
rfc.add_rpath(self.get_correct(rf.mirror_rp, test_time),
test_time)
rfc.add_rpath(rf.mirror_rp, mirror_time)
rfcs.append(rfc)
return rfcs
def get_correct(self, mirror_rp, test_time):
"""Return correct version with base mirror_rp at time test_time"""
assert -1 < test_time < 2000000000, test_time
dirname, basename = mirror_rp.dirsplit()
for filename in restore_base_filenames:
comps = filename.split(".")
base = ".".join(comps[:-1])
t = Time.stringtotime(comps[-1])
if t == test_time and basename == base:
return restore_base_rp.append(filename)
# Correct rp must be empty
return restore_base_rp.append("%s.%s" %
(basename, Time.timetostring(test_time)))
def testRestoreSingle(self):
"""Test restoring files one at at a time"""
MakeOutputDir()
for rfc in self.get_rfcs():
if rfc.rf.inc_rp.isincfile(): continue
log.Log("Comparing %s" % (rfc.rf.inc_rp.path,), 5)
rfc.compare_all()
def testBothLocal(self):
"""Test directory restore with everything local"""
self.restore_dir_test(1, 1)
def testMirrorRemote(self):
"""Test directory restore when the mirror is remote"""
self.restore_dir_test(0, 1)
def testDestRemote(self):
"""Test directory restore when the destination is remote"""
self.restore_dir_test(1, 0)
def testBothRemote(self):
"""Test directory restore when everything is remote"""
self.restore_dir_test(0, 0)
def restore_dir_test(self, mirror_local, dest_local):
"""Run whole dir tests
If any of the above tests don't work, try rerunning
makerestoretest3.
"""
Myrm("testfiles/output")
target_rp = rpath.RPath(Globals.local_connection, "testfiles/output")
mirror_rp = rpath.RPath(Globals.local_connection,
"testfiles/restoretest3")
inc1_rp = rpath.RPath(Globals.local_connection,
"testfiles/increment1")
inc2_rp = rpath.RPath(Globals.local_connection,
"testfiles/increment2")
inc3_rp = rpath.RPath(Globals.local_connection,
"testfiles/increment3")
inc4_rp = rpath.RPath(Globals.local_connection,
"testfiles/increment4")
InternalRestore(mirror_local, dest_local, "testfiles/restoretest3",
"testfiles/output", 45000)
assert CompareRecursive(inc4_rp, target_rp)
InternalRestore(mirror_local, dest_local, "testfiles/restoretest3",
"testfiles/output", 35000)
assert CompareRecursive(inc3_rp, target_rp, compare_hardlinks = 0)
InternalRestore(mirror_local, dest_local, "testfiles/restoretest3",
"testfiles/output", 25000)
assert CompareRecursive(inc2_rp, target_rp, compare_hardlinks = 0)
InternalRestore(mirror_local, dest_local, "testfiles/restoretest3",
"testfiles/output", 5000)
assert CompareRecursive(inc1_rp, target_rp, compare_hardlinks = 0)
# def testRestoreCorrupt(self):
# """Test restoring a partially corrupt archive
#
# The problem here is that a directory is missing from what is
# to be restored, but because the previous backup was aborted in
# the middle, some of the files in that directory weren't marked
# as .missing.
#
# """
# Myrm("testfiles/output")
# InternalRestore(1, 1, "testfiles/restoretest4", "testfiles/output",
# 10000)
# assert os.lstat("testfiles/output")
# self.assertRaises(OSError, os.lstat, "testfiles/output/tmp")
# self.assertRaises(OSError, os.lstat, "testfiles/output/rdiff-backup")
def testRestoreNoincs(self):
"""Test restoring a directory with no increments, just mirror"""
Myrm("testfiles/output")
InternalRestore(1, 1, 'testfiles/restoretest5/regular_file', 'testfiles/output',
10000)
assert os.lstat("testfiles/output")
if __name__ == "__main__": unittest.main()
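# For illustration, a minimal sketch of the restore pattern used above:
# restore the mirror's state as of a given time into a fresh target
# directory (the time 45000 is a hypothetical example).
def example_restore():
    Myrm("testfiles/output")
    InternalRestore(1, 1, "testfiles/restoretest3", "testfiles/output",
                    45000)   # both ends local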
import unittest
from commontest import *
from rdiff_backup.rlist import *
class BasicObject:
"""The simplest object that can be used with RList"""
def __init__(self, i):
self.index = i
self.data = "This is object # %d" % i
def __eq__(self, other):
return self.index == other.index and self.data == other.data
l1_pre = filter(lambda x: x != 342 and not x in [650, 651, 652] and
x != 911 and x != 987,
range(1, 1001))
l2_pre = filter(lambda x: not x in [222, 223, 224, 225] and x != 950
and x != 999 and x != 444,
range(1, 1001))
l1 = map(BasicObject, l1_pre)
l2 = map(BasicObject, l2_pre)
combined = map(BasicObject, range(1, 1001))
def lmaphelper2((x, i)):
"""Return difference triple to say that index x only in list # i"""
if i == 1: return (BasicObject(x), None)
elif i == 2: return (None, BasicObject(x))
else: assert 0, "Invalid parameter %s for i" % i
difference1 = map(lmaphelper2, [(222, 1), (223, 1), (224, 1), (225, 1),
(342, 2), (444, 1), (650, 2), (651, 2),
(652, 2), (911, 2), (950, 1), (987, 2),
(999, 1)])
difference2 = map(lambda (a, b): (b, a), difference1)
def comparelists(l1, l2):
print len(l1), len(l2)
for i in range(len(l1)):
if l1[i] != l2[i]: print l1[i], l2[i]
print l1
print l2
class RListTest(unittest.TestCase):
def setUp(self):
"""Make signatures, deltas"""
self.l1_sig = RList.Signatures(l1)
self.l2_sig = RList.Signatures(l2)
self.l1_to_l2_diff = RList.Deltas(self.l1_sig, l2)
self.l2_to_l1_diff = RList.Deltas(self.l2_sig, l1)
# for d in makedeltas(makesigs(l2ci(l1)), l2ci(l2)):
# print d.min, d.max
# print d.elemlist
def testPatching(self):
"""Test to make sure each list can be reconstructed from other"""
newlist = list(RList.Patch(l1, RList.Deltas(RList.Signatures(l1),
l2)))
assert l2 == newlist
newlist = list(RList.Patch(l2, RList.Deltas(RList.Signatures(l2),
l1)))
assert l1 == newlist
def testDifference(self):
"""Difference between lists correctly identified"""
diff = list(RList.Dissimilar(l1, RList.Deltas(RList.Signatures(l1),
l2)))
assert diff == difference1
diff = list(RList.Dissimilar(l2, RList.Deltas(RList.Signatures(l2),
l1)))
assert diff == difference2
class CachingIterTest(unittest.TestCase):
"""Test the Caching Iter object"""
def testNormalIter(self):
"""Make sure it can act like a normal iterator"""
ci = CachingIter(iter(range(10)))
for i in range(10): assert i == ci.next()
self.assertRaises(StopIteration, ci.next)
def testPushing(self):
"""Pushing extra objects onto the iterator"""
ci = CachingIter(iter(range(10)))
ci.push(12)
ci.push(11)
assert ci.next() == 11
assert ci.next() == 12
assert ci.next() == 0
ci.push(10)
assert ci.next() == 10
if __name__ == "__main__": unittest.main()
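# For illustration, a minimal sketch of the RList round trip verified in
# testPatching above; l1 and l2 are hypothetical lists of indexed objects
# (e.g. BasicObject instances).
def example_rlist_roundtrip(l1, l2):
    sig = RList.Signatures(l1)            # summarize l1
    delta = RList.Deltas(sig, l2)         # describe how l2 differs from l1
    return list(RList.Patch(l1, delta))   # reconstructs l2 from l1 + delta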
import os, unittest
from commontest import *
from rdiff_backup import rpath, robust, TempFile, Globals
class RobustTest(unittest.TestCase):
"""Test robust module"""
def test_check_common_error(self):
"""Test capturing errors"""
def cause_catchable_error(a):
os.lstat("aoenuthaoeu/aosutnhcg.4fpr,38p")
def cause_uncatchable_error():
ansoethusaotneuhsaotneuhsaontehuaou
result = robust.check_common_error(None, cause_catchable_error, [1])
assert result is None, result
try: robust.check_common_error(None, cause_uncatchable_error)
except NameError: pass
else: assert 0, "Key error not raised"
if __name__ == '__main__': unittest.main()
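# For illustration, a sketch of the check_common_error contract tested
# above: "common" (e.g. filesystem) errors are caught and None returned,
# while anything else propagates. The missing path is hypothetical.
def example_check_common_error():
    def risky():
        os.lstat("hypothetical/missing/path")      # raises OSError
    return robust.check_common_error(None, risky)  # error swallowed -> None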
import unittest, os
from commontest import *
from rdiff_backup import Globals, log
"""Root tests
This is mainly a copy of regressiontest.py, but contains the tests
that are meant to be run as root.
"""
Globals.set('change_source_perms', None)
Globals.counter = 0
log.Log.setverbosity(6)
def Run(cmd):
print "Running: ", cmd
assert not os.system(cmd)
class RootTest(unittest.TestCase):
dirlist1 = ["testfiles/root", "testfiles/various_file_types", "testfiles/increment4"]
dirlist2 = ["testfiles/increment4", "testfiles/root",
"testfiles/increment1"]
def testLocal1(self): BackupRestoreSeries(1, 1, self.dirlist1)
def testLocal2(self): BackupRestoreSeries(1, 1, self.dirlist2)
def testRemote(self): BackupRestoreSeries(None, None, self.dirlist1)
class NonRoot(unittest.TestCase):
"""Test backing up as non-root user
Test backing up a directory with files of different userids and
with device files in it, as a non-root user. When restoring as
root, everything should be restored normally.
"""
user = 'ben'
def make_root_dirs(self):
"""Make directory createable only by root"""
rp = rpath.RPath(Globals.local_connection, "testfiles/root_out1")
if rp.lstat(): Myrm(rp.path)
rp.mkdir()
rp1 = rp.append("1")
rp1.touch()
rp2 = rp.append("2")
rp2.touch()
rp2.chown(1, 1)
rp3 = rp.append("3")
rp3.touch()
rp3.chown(2, 2)
rp4 = rp.append("dev")
rp4.makedev('c', 4, 28)
sp = rpath.RPath(Globals.local_connection, "testfiles/root_out2")
if sp.lstat(): Myrm(sp.path)
Run("cp -a %s %s" % (rp.path, sp.path))
rp2 = sp.append("2")
rp2.chown(2, 2)
rp3 = sp.append("3")
rp3.chown(1, 1)
assert not CompareRecursive(rp, sp, compare_ownership = 1)
return rp, sp
def backup(self, input_rp, output_rp, time):
backup_cmd = ("rdiff-backup --no-compare-inode "
"--current-time %s %s %s" %
(time, input_rp.path, output_rp.path))
Run("su %s -c '%s'" % (self.user, backup_cmd))
def restore(self, dest_rp, restore_rp, time = None):
assert restore_rp.path == "testfiles/rest_out"
Myrm(restore_rp.path)
if time is None: time = "now"
restore_cmd = "rdiff-backup -r %s %s %s" % (time, dest_rp.path,
restore_rp.path,)
Run(restore_cmd)
def test_non_root(self):
"""Main non-root -> root test"""
Myrm("testfiles/output")
input_rp1, input_rp2 = self.make_root_dirs()
Globals.change_ownership = 1
output_rp = rpath.RPath(Globals.local_connection, "testfiles/output")
restore_rp = rpath.RPath(Globals.local_connection,
"testfiles/rest_out")
empty_rp = rpath.RPath(Globals.local_connection, "testfiles/empty")
self.backup(input_rp1, output_rp, 1000000)
self.restore(output_rp, restore_rp)
assert CompareRecursive(input_rp1, restore_rp, compare_ownership = 1)
self.backup(input_rp2, output_rp, 2000000)
self.restore(output_rp, restore_rp)
assert CompareRecursive(input_rp2, restore_rp, compare_ownership = 1)
self.backup(empty_rp, output_rp, 3000000)
self.restore(output_rp, restore_rp)
assert CompareRecursive(empty_rp, restore_rp, compare_ownership = 1)
self.restore(output_rp, restore_rp, 1000000)
assert CompareRecursive(input_rp1, restore_rp, compare_ownership = 1)
self.restore(output_rp, restore_rp, 2000000)
assert CompareRecursive(input_rp2, restore_rp, compare_ownership = 1)
if __name__ == "__main__": unittest.main()
from __future__ import generators
import unittest, time, pickle
from commontest import *
from rdiff_backup import log, rpath, rorpiter, Globals, lazy
#Log.setverbosity(8)
class index:
"""This is just used below to test the iter tree reducer"""
def __init__(self, index):
self.index = index
class RORPIterTest(unittest.TestCase):
def setUp(self):
self.lc = Globals.local_connection
self.inc0rp = rpath.RPath(self.lc, "testfiles/empty", ())
self.inc1rp = rpath.RPath(self.lc, "testfiles/inc-reg-perms1", ())
self.inc2rp = rpath.RPath(self.lc, "testfiles/inc-reg-perms2", ())
self.output = rpath.RPath(self.lc, "testfiles/output", ())
def testCollateIterators(self):
"""Test basic collating"""
indices = map(index, [0,1,2,3])
helper = lambda i: indices[i]
makeiter1 = lambda: iter(indices)
makeiter2 = lambda: iter(map(helper, [0,1,3]))
makeiter3 = lambda: iter(map(helper, [1,2]))
outiter = rorpiter.CollateIterators(makeiter1(), makeiter2())
assert lazy.Iter.equal(outiter,
iter([(indices[0], indices[0]),
(indices[1], indices[1]),
(indices[2], None),
(indices[3], indices[3])]))
assert lazy.Iter.equal(rorpiter.CollateIterators(makeiter1(),
makeiter2(),
makeiter3()),
iter([(indices[0], indices[0], None),
(indices[1], indices[1], indices[1]),
(indices[2], None, indices[2]),
(indices[3], indices[3], None)]))
assert lazy.Iter.equal(rorpiter.CollateIterators(makeiter1(),
iter([])),
iter(map(lambda i: (i, None),
indices)))
assert lazy.Iter.equal(iter(map(lambda i: (i, None), indices)),
rorpiter.CollateIterators(makeiter1(),
iter([])))
def compare_no_times(self, src_rp, dest_rp):
"""Compare but disregard directories attributes"""
def equal(src_rorp, dest_rorp):
return ((src_rorp.isdir() and dest_rorp.isdir()) or
src_rorp == dest_rorp)
return CompareRecursive(src_rp, dest_rp, None, equal)
class IndexedTupleTest(unittest.TestCase):
def testTuple(self):
"""Test indexed tuple"""
i = rorpiter.IndexedTuple((1,2,3), ("a", "b"))
i2 = rorpiter.IndexedTuple((), ("hello", "there", "how are you"))
assert i[0] == "a"
assert i[1] == "b"
assert i2[1] == "there"
assert len(i) == 2 and len(i2) == 3
assert i2 < i, i2 < i
def testTupleAssignment(self):
a, b, c = rorpiter.IndexedTuple((), (1, 2, 3))
assert a == 1
assert b == 2
assert c == 3
class DirHandlerTest(unittest.TestCase):
made_test_dir = 0 # Set to 1 once we have made the test dir
def make_test_dir(self):
"""Make the test directory"""
self.rootrp = RPath(Globals.local_connection, "testfiles/output")
self.rootrp.delete()
self.rootrp.mkdir()
self.a = self.rootrp.append("a")
self.b = self.rootrp.append("b")
self.c = self.rootrp.append("c")
self.a.mkdir()
self.b.mkdir()
self.b.chmod(0700)
self.c.mkdir()
self.c.chmod(0500) # No write permissions to c
self.rootmtime = self.rootrp.getmtime()
self.amtime = self.a.getmtime()
self.bmtime = self.b.getmtime()
self.cmtime = self.c.getmtime()
self.made_test_dir = 1
def test_times_and_writes(self):
"""Test writing without disrupting times, and to unwriteable dir"""
return
self.make_test_dir()
time.sleep(1) # make sure the mtimes would get updated otherwise
DH = DirHandler(self.rootrp)
new_a_rp = self.a.append("foo")
DH(new_a_rp)
new_a_rp.touch()
DH(self.b)
self.b.chmod(0751)
new_b_rp = self.b.append("aoenuth")
DH(new_b_rp)
new_b_rp.touch()
new_root_rp = self.rootrp.append("bb")
DH(new_root_rp)
new_root_rp.touch()
new_c_rp = self.c.append("bar")
DH(new_c_rp)
new_c_rp.touch()
DH.Finish()
assert new_a_rp.lstat() and new_b_rp.lstat() and new_c_rp.lstat()
self.a.setdata()
self.b.setdata()
self.c.setdata()
assert self.a.getmtime() == self.amtime
assert self.c.getmtime() == self.cmtime
assert self.rootrp.getmtime() == self.rootmtime
assert self.b.getperms() == 0751
assert self.c.getperms() == 0500
class FillTest(unittest.TestCase):
def test_fill_in(self):
"""Test fill_in_iter"""
rootrp = RPath(Globals.local_connection, "testfiles/output")
def get_rpiter():
for int_index in [(1,2), (1,3), (1,4),
(2,), (2,1),
(3,4,5), (3,6)]:
index = tuple(map(lambda i: str(i), int_index))
yield rootrp.new_index(index)
filled_in = rorpiter.FillInIter(get_rpiter(), rootrp)
rp_list = list(filled_in)
index_list = map(lambda rp: tuple(map(int, rp.index)), rp_list)
assert index_list == [(), (1,), (1,2), (1,3), (1,4),
(2,), (2,1),
(3,), (3,4), (3,4,5), (3,6)], index_list
class ITRBadder(rorpiter.ITRBranch):
def start_process(self, index):
self.total = 0
def end_process(self):
if self.base_index:
summand = self.base_index[-1]
#print "Adding ", summand
self.total += summand
def branch_process(self, subinstance):
#print "Adding subinstance ", subinstance.total
self.total += subinstance.total
class ITRBadder2(rorpiter.ITRBranch):
def start_process(self, index):
self.total = 0
def end_process(self):
#print "Adding ", self.base_index
self.total += reduce(lambda x,y: x+y, self.base_index, 0)
def can_fast_process(self, index):
if len(index) == 3: return 1
else: return None
def fast_process(self, index):
self.total += index[0] + index[1] + index[2]
def branch_process(self, subinstance):
#print "Adding branch ", subinstance.total
self.total += subinstance.total
class TreeReducerTest(unittest.TestCase):
def setUp(self):
self.i1 = [(), (1,), (2,), (3,)]
self.i2 = [(0,), (0,1), (0,1,0), (0,1,1), (0,2), (0,2,1), (0,3)]
self.i1a = [(), (1,)]
self.i1b = [(2,), (3,)]
self.i2a = [(0,), (0,1), (0,1,0)]
self.i2b = [(0,1,1), (0,2)]
self.i2c = [(0,2,1), (0,3)]
def testTreeReducer(self):
"""testing IterTreeReducer"""
itm = rorpiter.IterTreeReducer(ITRBadder, [])
for index in self.i1:
val = itm(index)
assert val, (val, index)
itm.Finish()
assert itm.root_branch.total == 6, itm.root_branch.total
itm2 = rorpiter.IterTreeReducer(ITRBadder2, [])
for index in self.i2:
val = itm2(index)
if index == (): assert not val
else: assert val
itm2.Finish()
assert itm2.root_branch.total == 12, itm2.root_branch.total
def testTreeReducerState(self):
"""Test saving and recreation of an IterTreeReducer"""
itm1a = rorpiter.IterTreeReducer(ITRBadder, [])
for index in self.i1a:
val = itm1a(index)
assert val, index
itm1b = pickle.loads(pickle.dumps(itm1a))
for index in self.i1b:
val = itm1b(index)
assert val, index
itm1b.Finish()
assert itm1b.root_branch.total == 6, itm1b.root_branch.total
itm2a = rorpiter.IterTreeReducer(ITRBadder2, [])
for index in self.i2a:
val = itm2a(index)
if index == (): assert not val
else: assert val
itm2b = pickle.loads(pickle.dumps(itm2a))
for index in self.i2b:
val = itm2b(index)
if index == (): assert not val
else: assert val
itm2c = pickle.loads(pickle.dumps(itm2b))
for index in self.i2c:
val = itm2c(index)
if index == (): assert not val
else: assert val
itm2c.Finish()
assert itm2c.root_branch.total == 12, itm2c.root_branch.total
class CacheIndexableTest(unittest.TestCase):
def get_iter(self):
"""Return iterator yielding indexed objects, add to dict d"""
for i in range(100):
it = rorpiter.IndexedTuple((i,), range(i))
self.d[(i,)] = it
yield it
def testCaching(self):
"""Test basic properties of CacheIndexable object"""
self.d = {}
ci = rorpiter.CacheIndexable(self.get_iter(), 3)
val0 = ci.next()
val1 = ci.next()
val2 = ci.next()
assert ci.get((1,)) == self.d[(1,)]
assert ci.get((3,)) is None
val3 = ci.next()
val4 = ci.next()
val5 = ci.next()
assert ci.get((3,)) == self.d[(3,)]
assert ci.get((4,)) == self.d[(4,)]
assert ci.get((3,5)) is None
self.assertRaises(AssertionError, ci.get, (1,))
def testEqual(self):
"""Make sure CI doesn't alter properties of underlying iter"""
self.d = {}
l1 = list(self.get_iter())
l2 = list(rorpiter.CacheIndexable(iter(l1), 10))
assert l1 == l2, (l1, l2)
if __name__ == "__main__": unittest.main()
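# For illustration, a sketch of the collation behavior verified in
# testCollateIterators above: for every index appearing in any input
# iterator, CollateIterators yields one tuple holding that element from
# each iterator, with None where an iterator lacks the index.
def example_collate(iter1, iter2):
    for pair in rorpiter.CollateIterators(iter1, iter2):
        print pair   # e.g. (elem, None) when only iter1 has the index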
import os, cPickle, sys, unittest
from commontest import *
from rdiff_backup.rpath import *
from rdiff_backup import rpath
class RPathTest(unittest.TestCase):
lc = Globals.local_connection
prefix = "testfiles/various_file_types/"
mainprefix = "testfiles/"
rp_prefix = RPath(lc, prefix, ())
rp_main = RPath(lc, mainprefix, ())
class RORPStateTest(RPathTest):
"""Test Pickling of RORPaths"""
def testPickle(self):
rorp = RPath(self.lc, self.prefix, ("regular_file",)).getRORPath()
rorp.file = sys.stdin # try to confuse pickler
assert rorp.isreg()
rorp2 = cPickle.loads(cPickle.dumps(rorp, 1))
assert rorp2.isreg()
assert rorp2.data == rorp.data and rorp.index == rorp2.index
class CheckTypes(RPathTest):
"""Check to see if file types are identified correctly"""
def testExist(self):
"""Can tell if files exist"""
assert RPath(self.lc, self.prefix, ()).lstat()
assert not RPath(self.lc, "asuthasetuouo", ()).lstat()
def testDir(self):
"""Directories identified correctly"""
assert RPath(self.lc, self.prefix, ()).isdir()
assert not RPath(self.lc, self.prefix, ("regular_file",)).isdir()
def testSym(self):
"""Symbolic links identified"""
assert RPath(self.lc, self.prefix, ("symbolic_link",)).issym()
assert not RPath(self.lc, self.prefix, ()).issym()
def testReg(self):
"""Regular files identified"""
assert RPath(self.lc, self.prefix, ("regular_file",)).isreg()
assert not RPath(self.lc, self.prefix, ("symbolic_link",)).isreg()
def testFifo(self):
"""Fifo's identified"""
assert RPath(self.lc, self.prefix, ("fifo",)).isfifo()
assert not RPath(self.lc, self.prefix, ()).isfifo()
def testCharDev(self):
"""Char special files identified"""
assert RPath(self.lc, "/dev/tty2", ()).ischardev()
assert not RPath(self.lc, self.prefix, ("regular_file",)).ischardev()
def testBlockDev(self):
"""Block special files identified"""
assert RPath(self.lc, "/dev/hda", ()).isblkdev()
assert not RPath(self.lc, self.prefix, ("regular_file",)).isblkdev()
class CheckPerms(RPathTest):
"""Check to see if permissions are reported and set accurately"""
def testExecReport(self):
"""Check permissions for executable files"""
assert self.rp_prefix.append('executable').getperms() == 0755
assert self.rp_prefix.append('executable2').getperms() == 0700
def testhighbits(self):
"""Test reporting of highbit permissions"""
p = RPath(self.lc, "testfiles/rpath2/foobar").getperms()
assert p == 04100, p
def testOrdinaryReport(self):
"""Ordinary file permissions..."""
assert self.rp_prefix.append("regular_file").getperms() == 0644
assert self.rp_prefix.append('two_hardlinked_files1').getperms() == 0640
def testChmod(self):
"""Test changing file permission"""
rp = self.rp_prefix.append("changeable_permission")
rp.chmod(0700)
assert rp.getperms() == 0700
rp.chmod(0644)
assert rp.getperms() == 0644
def testExceptions(self):
"""What happens when file absent"""
self.assertRaises(Exception,
RPath(self.lc, self.prefix, ("aoeunto",)).getperms)
class CheckDir(RPathTest):
"""Check directory related functions"""
def testCreation(self):
"""Test directory creation and deletion"""
d = self.rp_prefix.append("tempdir")
assert not d.lstat()
d.mkdir()
assert d.isdir()
d.rmdir()
assert not d.lstat()
def testExceptions(self):
"""Should raise os.errors when no files"""
d = RPath(self.lc, self.prefix, ("suthosutho",))
self.assertRaises(os.error, d.rmdir)
d.mkdir()
self.assertRaises(os.error, d.mkdir)
d.rmdir()
def testListdir(self):
"""Checking dir listings"""
assert (RPath(self.lc, self.mainprefix, ("sampledir",)).listdir() ==
["1", "2", "3", "4"])
class CheckSyms(RPathTest):
"""Check symlinking and reading"""
def testRead(self):
"""symlink read"""
assert (RPath(self.lc, self.prefix, ("symbolic_link",)).readlink() ==
"regular_file")
def testMake(self):
"""Creating symlink"""
link = RPath(self.lc, self.mainprefix, ("symlink",))
assert not link.lstat()
link.symlink("abcdefg")
assert link.issym()
assert link.readlink() == "abcdefg"
link.delete()
class CheckSockets(RPathTest):
"""Check reading and making sockets"""
def testMake(self):
"""Create socket, then read it"""
sock = RPath(self.lc, self.mainprefix, ("socket",))
assert not sock.lstat()
sock.mksock()
assert sock.issock()
sock.delete()
def testLongSock(self):
"""Test making a socket with a long name
On some systems, the name of a socket is restricted, and
cannot be as long as a regular file's. When this happens, a
SkipFileException should be raised.
"""
sock = RPath(self.lc, self.mainprefix, ("socketaoeusthaoeaoeutnhaonseuhtansoeuthasoneuthasoeutnhasonuthaoensuhtasoneuhtsanouhonetuhasoneuthsaoenaonsetuaosenuhtaoensuhaoeu",))
assert not sock.lstat()
try: sock.mksock()
except SkipFileException: pass
else: print "Warning, making long socket did not fail"
sock.setdata()
if sock.lstat(): sock.delete()
class TouchDelete(RPathTest):
"""Check touching and deletion of files"""
def testTouch(self):
"""Creation of 0 length files"""
t = RPath(self.lc, self.mainprefix, ("testtouch",))
assert not t.lstat()
t.touch()
assert t.lstat()
t.delete()
def testDelete(self):
"""Deletion of files"""
d = RPath(self.lc, self.mainprefix, ("testdelete",))
d.touch()
assert d.lstat()
d.delete()
assert not d.lstat()
class MiscFileInfo(RPathTest):
"""Check Miscellaneous file information"""
def testFileLength(self):
"""File length = getsize()"""
assert (RPath(self.lc, self.prefix, ("regular_file",)).getsize() ==
75650)
class FilenameOps(RPathTest):
"""Check filename operations"""
weirdfilename = eval('\'\\xd8\\xab\\xb1Wb\\xae\\xc5]\\x8a\\xbb\\x15v*\\xf4\\x0f!\\xf9>\\xe2Y\\x86\\xbb\\xab\\xdbp\\xb0\\x84\\x13k\\x1d\\xc2\\xf1\\xf5e\\xa5U\\x82\\x9aUV\\xa0\\xf4\\xdf4\\xba\\xfdX\\x03\\x82\\x07s\\xce\\x9e\\x8b\\xb34\\x04\\x9f\\x17 \\xf4\\x8f\\xa6\\xfa\\x97\\xab\\xd8\\xac\\xda\\x85\\xdcKvC\\xfa#\\x94\\x92\\x9e\\xc9\\xb7\\xc3_\\x0f\\x84g\\x9aB\\x11<=^\\xdbM\\x13\\x96c\\x8b\\xa7|*"\\\\\\\'^$@#!(){}?+ ~` \'')
normdict = {"/": "/",
".": ".",
"//": "/",
"/a/b": "/a/b",
"a/b": "a/b",
"a//b": "a/b",
"a////b//c": "a/b/c",
"..": "..",
"a/": "a",
"/a//b///": "/a/b"}
dirsplitdict = {"/": ("", ""),
"/a": ("", "a"),
"/a/b": ("/a", "b"),
".": (".", "."),
"b/c": ("b", "c"),
"a": (".", "a")}
def testQuote(self):
"""See if filename quoting works"""
wtf = RPath(self.lc, self.prefix, (self.weirdfilename,))
reg = RPath(self.lc, self.prefix, ("regular_file",))
assert wtf.lstat()
assert reg.lstat()
assert not os.system("ls %s >/dev/null 2>&1" % wtf.quote())
assert not os.system("ls %s >/dev/null 2>&1" % reg.quote())
def testNormalize(self):
"""rpath.normalize() dictionary test"""
for (before, after) in self.normdict.items():
assert RPath(self.lc, before, ()).normalize().path == after, \
"Normalize fails for %s => %s" % (before, after)
def testDirsplit(self):
"""Test splitting of various directories"""
for full, split in self.dirsplitdict.items():
result = RPath(self.lc, full, ()).dirsplit()
assert result == split, \
"%s => %s instead of %s" % (full, result, split)
def testGetnums(self):
"""Test getting file numbers"""
devnums = RPath(self.lc, "/dev/hda", ()).getdevnums()
assert devnums == (3, 0), devnums
devnums = RPath(self.lc, "/dev/tty2", ()).getdevnums()
assert devnums == (4, 2), devnums
class FileIO(RPathTest):
"""Test file input and output"""
def testRead(self):
"""File reading"""
fp = RPath(self.lc, self.prefix, ("executable",)).open("r")
assert fp.read(6) == "#!/bin"
fp.close()
def testWrite(self):
"""File writing"""
rp = RPath(self.lc, self.mainprefix, ("testfile",))
fp = rp.open("w")
fp.write("hello")
fp.close()
fp_input = rp.open("r")
assert fp_input.read() == "hello"
fp_input.close()
rp.delete()
def testGzipWrite(self):
"""Test writing of gzipped files"""
try: os.mkdir("testfiles/output")
except OSError: pass
rp_gz = RPath(self.lc, "testfiles/output/file.gz")
rp = RPath(self.lc, "testfiles/output/file")
if rp.lstat(): rp.delete()
s = "Hello, world!"
fp_out = rp_gz.open("wb", compress = 1)
fp_out.write(s)
assert not fp_out.close()
assert not os.system("gunzip testfiles/output/file.gz")
fp_in = rp.open("rb")
assert fp_in.read() == s
fp_in.close()
def testGzipRead(self):
"""Test reading of gzipped files"""
try: os.mkdir("testfiles/output")
except OSError: pass
rp_gz = RPath(self.lc, "testfiles/output/file.gz")
if rp_gz.lstat(): rp_gz.delete()
rp = RPath(self.lc, "testfiles/output/file")
s = "Hello, world!"
fp_out = rp.open("wb")
fp_out.write(s)
assert not fp_out.close()
rp.setdata()
assert rp.lstat()
assert not os.system("gzip testfiles/output/file")
rp.setdata()
rp_gz.setdata()
assert not rp.lstat()
assert rp_gz.lstat()
fp_in = rp_gz.open("rb", compress = 1)
assert fp_in.read() == s
assert not fp_in.close()
class FileCopying(RPathTest):
"""Test file copying and comparison"""
def setUp(self):
self.hl1 = RPath(self.lc, self.prefix, ("two_hardlinked_files1",))
self.hl2 = RPath(self.lc, self.prefix, ("two_hardlinked_files2",))
self.sl = RPath(self.lc, self.prefix, ("symbolic_link",))
self.dir = RPath(self.lc, self.prefix, ())
self.fifo = RPath(self.lc, self.prefix, ("fifo",))
self.rf = RPath(self.lc, self.prefix, ("regular_file",))
self.dest = RPath(self.lc, self.mainprefix, ("dest",))
if self.dest.lstat(): self.dest.delete()
assert not self.dest.lstat()
def testComp(self):
"""Test comparisons involving regular files"""
assert rpath.cmp(self.hl1, self.hl2)
assert not rpath.cmp(self.rf, self.hl1)
assert not rpath.cmp(self.dir, self.rf)
def testCompMisc(self):
"""Test miscellaneous comparisons"""
assert rpath.cmp(self.dir, RPath(self.lc, self.mainprefix, ()))
self.dest.symlink("regular_file")
assert rpath.cmp(self.sl, self.dest)
self.dest.delete()
assert not rpath.cmp(self.sl, self.fifo)
assert not rpath.cmp(self.dir, self.sl)
def testDirSizeComp(self):
"""Make sure directories can be equal,
even if they are of different sizes"""
smalldir = RPath(Globals.local_connection, "testfiles/dircomptest/1")
bigdir = RPath(Globals.local_connection, "testfiles/dircomptest/2")
# Can guarantee below by adding files to bigdir
assert bigdir.getsize() > smalldir.getsize()
assert smalldir == bigdir
def testCopy(self):
"""Test copy of various files"""
for rp in [self.sl, self.rf, self.fifo, self.dir]:
rpath.copy(rp, self.dest)
assert self.dest.lstat(), "%s doesn't exist" % self.dest.path
assert rpath.cmp(rp, self.dest)
assert rpath.cmp(self.dest, rp)
self.dest.delete()
class FileAttributes(FileCopying):
"""Test file attribute operations"""
def setUp(self):
FileCopying.setUp(self)
self.noperms = RPath(self.lc, self.mainprefix, ("noperms",))
self.nowrite = RPath(self.lc, self.mainprefix, ("nowrite",))
self.exec1 = RPath(self.lc, self.prefix, ("executable",))
self.exec2 = RPath(self.lc, self.prefix, ("executable2",))
self.test = RPath(self.lc, self.prefix, ("test",))
self.nothing = RPath(self.lc, self.prefix, ("aoeunthoenuouo",))
self.sym = RPath(self.lc, self.prefix, ("symbolic_link",))
def testComp(self):
"""Test attribute comparison success"""
testpairs = [(self.hl1, self.hl2)]
for a, b in testpairs:
assert a.equal_loose(b), "Err with %s %s" % (a.path, b.path)
assert b.equal_loose(a), "Err with %s %s" % (b.path, a.path)
def testCompFail(self):
"""Test attribute comparison failures"""
testpairs = [(self.nowrite, self.noperms),
(self.exec1, self.exec2),
(self.rf, self.hl1)]
for a, b in testpairs:
assert not a.equal_loose(b), "Err with %s %s" % (a.path, b.path)
assert not b.equal_loose(a), "Err with %s %s" % (b.path, a.path)
def testCheckRaise(self):
"""Should raise exception when file missing"""
self.assertRaises(RPathException, rpath.check_for_files,
self.nothing, self.hl1)
self.assertRaises(RPathException, rpath.check_for_files,
self.hl1, self.nothing)
def testCopyAttribs(self):
"""Test copying attributes"""
t = RPath(self.lc, self.mainprefix, ("testattribs",))
if t.lstat(): t.delete()
for rp in [self.noperms, self.nowrite, self.rf, self.exec1,
self.exec2, self.hl1, self.dir]:
t.touch()
rpath.copy_attribs(rp, t)
assert rpath.cmp_attribs(t, rp), \
"Attributes for file %s not copied successfully" % rp.path
t.delete()
def testCopyWithAttribs(self):
"""Test copying with attribs (bug found earlier)"""
out = RPath(self.lc, self.mainprefix, ("out",))
if out.lstat(): out.delete()
for rp in [self.noperms, self.nowrite, self.rf, self.exec1,
self.exec2, self.hl1, self.dir, self.sym]:
rpath.copy_with_attribs(rp, out)
assert rpath.cmp(rp, out)
assert rp.equal_loose(out)
out.delete()
def testCopyRaise(self):
"""Should raise exception for non-existent files"""
self.assertRaises(RPathException, rpath.copy_attribs,
self.hl1, self.nothing)
self.assertRaises(RPathException, rpath.copy_attribs,
self.nothing, self.nowrite)
class CheckPath(unittest.TestCase):
"""Check to make sure paths generated properly"""
def testpath(self):
"""Test root paths"""
root = RPath(Globals.local_connection, "/")
assert root.path == "/", root.path
bin = root.append("bin")
assert bin.path == "/bin", bin.path
bin2 = RPath(Globals.local_connection, "/bin")
assert bin.path == "/bin", bin2.path
if __name__ == "__main__":
unittest.main()
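# For illustration, a minimal sketch of the RPath lifecycle the tests
# above rely on; the path is a hypothetical example.
def example_rpath_lifecycle():
    rp = RPath(Globals.local_connection, "testfiles/example_tmp")
    assert not rp.lstat()   # no such file yet; lstat() is the cached stat
    rp.touch()              # create a zero-length file
    assert rp.lstat()
    rp.delete()
    assert not rp.lstat()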
import os, unittest
from commontest import *
import rdiff_backup.Security
#Log.setverbosity(5)
class SecurityTest(unittest.TestCase):
def assert_exc_sec(self, exc):
"""Fudge - make sure exception is a security violation
This is necessary because of some kind of pickling/module
problem.
"""
assert isinstance(exc, rdiff_backup.Security.Violation)
#assert str(exc).find("Security") >= 0, "%s\n%s" % (exc, repr(exc))
def test_vet_request_ro(self):
"""Test vetting of ConnectionRequests on read-only server"""
remote_cmd = "rdiff-backup --server --restrict-read-only foo"
conn = SetConnections.init_connection(remote_cmd)
assert type(conn.os.getuid()) is type(5)
try: conn.os.remove("/tmp/foobar")
except Exception, e: self.assert_exc_sec(e)
else: assert 0, "No exception raised"
SetConnections.CloseConnections()
def test_vet_request_minimal(self):
"""Test vetting of ConnectionRequests on minimal server"""
remote_cmd = "rdiff-backup --server --restrict-update-only foo"
conn = SetConnections.init_connection(remote_cmd)
assert type(conn.os.getuid()) is type(5)
try: conn.os.remove("/tmp/foobar")
except Exception, e: self.assert_exc_sec(e)
else: assert 0, "No exception raised"
SetConnections.CloseConnections()
def test_vet_rpath(self):
"""Test to make sure rpaths not in restricted path will be rejected"""
remote_cmd = "rdiff-backup --server --restrict-update-only foo"
conn = SetConnections.init_connection(remote_cmd)
for rp in [RPath(Globals.local_connection, "blahblah"),
RPath(conn, "foo/bar")]:
conn.Globals.set("TEST_var", rp)
assert conn.Globals.get("TEST_var").path == rp.path
for rp in [RPath(conn, "foobar"),
RPath(conn, "/usr/local"),
RPath(conn, "foo/../bar")]:
try: conn.Globals.set("TEST_var", rp)
except Exception, e:
self.assert_exc_sec(e)
continue
assert 0, "No violation raised by rp %s" % (rp,)
SetConnections.CloseConnections()
if __name__ == "__main__": unittest.main()
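# For illustration, a sketch of the restriction contract tested above: a
# --restrict-read-only server still answers harmless reads, but write
# requests must raise a Security.Violation. The path is hypothetical and
# conn is a connection from SetConnections.init_connection.
def example_restricted_read_only(conn):
    conn.os.getuid()    # read-only request: allowed
    try:
        conn.os.remove("/tmp/foobar")   # write request: must be vetted
    except rdiff_backup.Security.Violation:
        pass            # expected security violation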
from __future__ import generators
import re, StringIO, unittest, types
from commontest import *
from rdiff_backup.selection import *
from rdiff_backup import Globals, rpath, lazy
class MatchingTest(unittest.TestCase):
"""Test matching of file names against various selection functions"""
def makerp(self, path): return rpath.RPath(Globals.local_connection, path)
def makeext(self, path): return self.root.new_index(tuple(path.split("/")))
def setUp(self):
self.root = rpath.RPath(Globals.local_connection, "testfiles/select")
self.Select = Select(self.root)
def testRegexp(self):
"""Test regular expression selection func"""
sf1 = self.Select.regexp_get_sf(".*\.py", 1)
assert sf1(self.makeext("1.py")) == 1
assert sf1(self.makeext("usr/foo.py")) == 1
assert sf1(self.root.append("1.doc")) == None
sf2 = self.Select.regexp_get_sf("hello", 0)
assert sf2(self.makerp("hello")) == 0
assert sf2(self.makerp("foohello_there")) == 0
assert sf2(self.makerp("foo")) == None
def testTupleInclude(self):
"""Test include selection function made from a regular filename"""
self.assertRaises(FilePrefixError,
self.Select.glob_get_filename_sf, "foo", 1)
sf2 = self.Select.glob_get_sf("testfiles/select/usr/local/bin/", 1)
assert sf2(self.makeext("usr")) == 1
assert sf2(self.makeext("usr/local")) == 1
assert sf2(self.makeext("usr/local/bin")) == 1
assert sf2(self.makeext("usr/local/doc")) == None
assert sf2(self.makeext("usr/local/bin/gzip")) == 1
assert sf2(self.makeext("usr/local/bingzip")) == None
def testTupleExclude(self):
"""Test exclude selection function made from a regular filename"""
self.assertRaises(FilePrefixError,
self.Select.glob_get_filename_sf, "foo", 0)
sf2 = self.Select.glob_get_sf("testfiles/select/usr/local/bin/", 0)
assert sf2(self.makeext("usr")) == None
assert sf2(self.makeext("usr/local")) == None
assert sf2(self.makeext("usr/local/bin")) == 0
assert sf2(self.makeext("usr/local/doc")) == None
assert sf2(self.makeext("usr/local/bin/gzip")) == 0
assert sf2(self.makeext("usr/local/bingzip")) == None
def testGlobStarInclude(self):
"""Test a few globbing patterns, including **"""
sf1 = self.Select.glob_get_sf("**", 1)
assert sf1(self.makeext("foo")) == 1
assert sf1(self.makeext("")) == 1
sf2 = self.Select.glob_get_sf("**.py", 1)
assert sf2(self.makeext("foo")) == 2
assert sf2(self.makeext("usr/local/bin")) == 2
assert sf2(self.makeext("what/ever.py")) == 1
assert sf2(self.makeext("what/ever.py/foo")) == 1
def testGlobStarExclude(self):
"""Test a few glob excludes, including **"""
sf1 = self.Select.glob_get_sf("**", 0)
assert sf1(self.makeext("/usr/local/bin")) == 0
sf2 = self.Select.glob_get_sf("**.py", 0)
assert sf2(self.makeext("foo")) == None, sf2(self.makeext("foo"))
assert sf2(self.makeext("usr/local/bin")) == None
assert sf2(self.makeext("what/ever.py")) == 0
assert sf2(self.makeext("what/ever.py/foo")) == 0
def testFilelistInclude(self):
"""Test included filelist"""
fp = StringIO.StringIO("""
testfiles/select/1/2
testfiles/select/1
testfiles/select/1/2/3
testfiles/select/3/3/2""")
sf = self.Select.filelist_get_sf(fp, 1, "test")
assert sf(self.root) == 1
assert sf(self.makeext("1")) == 1
assert sf(self.makeext("1/1")) == None
assert sf(self.makeext("1/2/3")) == 1
assert sf(self.makeext("2/2")) == None
assert sf(self.makeext("3")) == 1
assert sf(self.makeext("3/3")) == 1
assert sf(self.makeext("3/3/3")) == None
def testFilelistWhitespaceInclude(self):
"""Test included filelist, with some whitespace"""
fp = StringIO.StringIO("""
+ testfiles/select/1
- testfiles/select/2
testfiles/select/3\t""")
sf = self.Select.filelist_get_sf(fp, 1, "test")
assert sf(self.root) == 1
assert sf(self.makeext("1 ")) == 1
assert sf(self.makeext("2 ")) == 0
assert sf(self.makeext("3\t")) == 1
assert sf(self.makeext("4")) == None
def testFilelistIncludeNullSep(self):
"""Test included filelist but with null_separator set"""
fp = StringIO.StringIO("""\0testfiles/select/1/2\0testfiles/select/1\0testfiles/select/1/2/3\0testfiles/select/3/3/2\0testfiles/select/hello\nthere\0""")
Globals.null_separator = 1
sf = self.Select.filelist_get_sf(fp, 1, "test")
assert sf(self.root) == 1
assert sf(self.makeext("1")) == 1
assert sf(self.makeext("1/1")) == None
assert sf(self.makeext("1/2/3")) == 1
assert sf(self.makeext("2/2")) == None
assert sf(self.makeext("3")) == 1
assert sf(self.makeext("3/3")) == 1
assert sf(self.makeext("3/3/3")) == None
assert sf(self.makeext("hello\nthere")) == 1
Globals.null_separator = 0
def testFilelistExclude(self):
"""Test included filelist"""
fp = StringIO.StringIO("""
testfiles/select/1/2
testfiles/select/1
this is a badly formed line which should be ignored
testfiles/select/1/2/3
testfiles/select/3/3/2""")
sf = self.Select.filelist_get_sf(fp, 0, "test")
assert sf(self.root) == None
assert sf(self.makeext("1")) == 0
assert sf(self.makeext("1/1")) == 0
assert sf(self.makeext("1/2/3")) == 0
assert sf(self.makeext("2/2")) == None
assert sf(self.makeext("3")) == None
assert sf(self.makeext("3/3/2")) == 0
assert sf(self.makeext("3/3/3")) == None
def testFilelistInclude2(self):
"""testFilelistInclude2 - with modifiers"""
fp = StringIO.StringIO("""
testfiles/select/1/1
- testfiles/select/1/2
+ testfiles/select/1/3
- testfiles/select/3""")
sf = self.Select.filelist_get_sf(fp, 1, "test1")
assert sf(self.makeext("1")) == 1
assert sf(self.makeext("1/1")) == 1
assert sf(self.makeext("1/1/2")) == None
assert sf(self.makeext("1/2")) == 0
assert sf(self.makeext("1/2/3")) == 0
assert sf(self.makeext("1/3")) == 1
assert sf(self.makeext("2")) == None
assert sf(self.makeext("3")) == 0
def testFilelistExclude2(self):
"""testFilelistExclude2 - with modifiers"""
fp = StringIO.StringIO("""
testfiles/select/1/1
- testfiles/select/1/2
+ testfiles/select/1/3
- testfiles/select/3""")
sf = self.Select.filelist_get_sf(fp, 0, "test1")
sf_val1 = sf(self.root)
assert sf_val1 == 1 or sf_val1 == None # either is OK
sf_val2 = sf(self.makeext("1"))
assert sf_val2 == 1 or sf_val2 == None
assert sf(self.makeext("1/1")) == 0
assert sf(self.makeext("1/1/2")) == 0
assert sf(self.makeext("1/2")) == 0
assert sf(self.makeext("1/2/3")) == 0
assert sf(self.makeext("1/3")) == 1
assert sf(self.makeext("2")) == None
assert sf(self.makeext("3")) == 0
def testGlobRE(self):
"""testGlobRE - test translation of shell pattern to regular exp"""
assert self.Select.glob_to_re("hello") == "hello"
assert self.Select.glob_to_re(".e?ll**o") == "\\.e[^/]ll.*o"
r = self.Select.glob_to_re("[abc]el[^de][!fg]h")
assert r == "[abc]el[^de][^fg]h", r
r = self.Select.glob_to_re("/usr/*/bin/")
assert r == "\\/usr\\/[^/]*\\/bin\\/", r
assert self.Select.glob_to_re("[a.b/c]") == "[a.b/c]"
r = self.Select.glob_to_re("[a*b-c]e[!]]")
assert r == "[a*b-c]e[^]]", r
def testGlobSFException(self):
"""testGlobSFException - see if globbing errors returned"""
self.assertRaises(GlobbingError, self.Select.glob_get_normal_sf,
"testfiles/select/hello//there", 1)
self.assertRaises(FilePrefixError,
self.Select.glob_get_sf, "testfiles/whatever", 1)
self.assertRaises(FilePrefixError,
self.Select.glob_get_sf, "testfiles/?hello", 0)
assert self.Select.glob_get_normal_sf("**", 1)
def testIgnoreCase(self):
"""testIgnoreCase - try a few expressions with ignorecase:"""
sf = self.Select.glob_get_sf("ignorecase:testfiles/SeLect/foo/bar", 1)
assert sf(self.makeext("FOO/BAR")) == 1
assert sf(self.makeext("foo/bar")) == 1
assert sf(self.makeext("fOo/BaR")) == 1
self.assertRaises(FilePrefixError, self.Select.glob_get_sf,
"ignorecase:tesfiles/sect/foo/bar", 1)
def testDev(self):
"""Test device and special file selection"""
dir = self.root.append("filetypes")
fifo = dir.append("fifo")
assert fifo.isfifo(), fifo
sym = dir.append("symlink")
assert sym.issym(), sym
reg = dir.append("regular_file")
assert reg.isreg(), reg
sock = dir.append("replace_with_socket")
if not sock.issock():
assert sock.isreg(), sock
sock.delete()
sock.mksock()
assert sock.issock()
dev = dir.append("ttyS1")
assert dev.isdev(), dev
sf = self.Select.devfiles_get_sf(0)
assert sf(dir) == None
assert sf(dev) == 0
assert sf(sock) == None
sf2 = self.Select.special_get_sf(0)
assert sf2(dir) == None
assert sf2(reg) == None
assert sf2(dev) == 0
assert sf2(sock) == 0
assert sf2(fifo) == 0
assert sf2(sym) == 0
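# A hedged sketch of the two kinds of selection function exercised
# above: devfiles_get_sf(0) excludes only device files, while
# special_get_sf(0) appears to exclude every non-regular, non-directory
# file (devices, sockets, fifos, symlinks).  An illustration via the
# stat module, not the real rdiff-backup code.
def _sketch_is_special(path):
    import os, stat
    mode = os.lstat(path).st_mode
    return not (stat.S_ISREG(mode) or stat.S_ISDIR(mode))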
def testRoot(self):
"""testRoot - / may be a counterexample to several of these.."""
root = rpath.RPath(Globals.local_connection, "/")
select = Select(root)
assert select.glob_get_sf("/", 1)(root) == 1
assert select.glob_get_sf("/foo", 1)(root) == 1
assert select.glob_get_sf("/foo/bar", 1)(root) == 1
assert select.glob_get_sf("/", 0)(root) == 0
assert select.glob_get_sf("/foo", 0)(root) == None
assert select.glob_get_sf("**.py", 1)(root) == 2
assert select.glob_get_sf("**", 1)(root) == 1
assert select.glob_get_sf("ignorecase:/", 1)(root) == 1
assert select.glob_get_sf("**.py", 0)(root) == None
assert select.glob_get_sf("**", 0)(root) == 0
assert select.glob_get_sf("/foo/*", 0)(root) == None
assert select.filelist_get_sf(StringIO.StringIO("/"), 1, "test")(root) == 1
assert select.filelist_get_sf(StringIO.StringIO("/foo/bar"), 1,
"test")(root) == 1
assert select.filelist_get_sf(StringIO.StringIO("/"), 0, "test")(root) == 0
assert select.filelist_get_sf(StringIO.StringIO("/foo/bar"), 0,
"test")(root) == None
def testOtherFilesystems(self):
"""Test to see if --exclude-other-filesystems works correctly"""
root = rpath.RPath(Globals.local_connection, "/")
select = Select(root)
sf = select.other_filesystems_get_sf(0)
assert sf(root) is None
assert sf(RPath(Globals.local_connection, "/usr/bin")) is None, \
"Assumption: /usr/bin is on the same filesystem as /"
assert sf(RPath(Globals.local_connection, "/proc")) == 0, \
"Assumption: /proc is on a different filesystem"
assert sf(RPath(Globals.local_connection, "/boot")) == 0, \
"Assumption: /boot is on a different filesystem"
class ParseArgsTest(unittest.TestCase):
"""Test argument parsing as well as filelist globbing"""
root = None
def ParseTest(self, tuplelist, indices, filelists = []):
"""Check that running select on tuplelist yields the given indices"""
if not self.root:
self.root = RPath(Globals.local_connection, "testfiles/select")
self.Select = Select(self.root)
self.Select.ParseArgs(tuplelist, self.remake_filelists(filelists))
self.Select.set_iter()
assert lazy.Iter.equal(lazy.Iter.map(lambda dsrp: dsrp.index,
self.Select),
iter(indices), verbose = 1)
def remake_filelists(self, filelist):
"""Turn strings in filelist into fileobjs"""
new_filelists = []
for f in filelist:
if type(f) is types.StringType:
new_filelists.append(StringIO.StringIO(f))
else: new_filelists.append(f)
return new_filelists
def testParse(self):
"""Test just one include, all exclude"""
self.ParseTest([("--include", "testfiles/select/1/1"),
("--exclude", "**")],
[(), ('1',), ("1", "1"), ("1", '1', '1'),
('1', '1', '2'), ('1', '1', '3')])
def testParse2(self):
"""Test three level include/exclude"""
self.ParseTest([("--exclude", "testfiles/select/1/1/1"),
("--include", "testfiles/select/1/1"),
("--exclude", "testfiles/select/1"),
("--exclude", "**")],
[(), ('1',), ('1', '1'), ('1', '1', '2'),
('1', '1', '3')])
def test_globbing_filelist(self):
"""Filelist glob test similar to above testParse2"""
self.ParseTest([("--include-globbing-filelist", "file")],
[(), ('1',), ('1', '1'), ('1', '1', '2'),
('1', '1', '3')],
["""
- testfiles/select/1/1/1
testfiles/select/1/1
- testfiles/select/1
- **
"""])
def testGlob(self):
"""Test globbing expression"""
self.ParseTest([("--exclude", "**[3-5]"),
("--include", "testfiles/select/1"),
("--exclude", "**")],
[(), ('1',), ('1', '1'),
('1', '1', '1'), ('1', '1', '2'),
('1', '2'), ('1', '2', '1'), ('1', '2', '2')])
self.ParseTest([("--include", "testfiles/select**/2"),
("--exclude", "**")],
[(), ('1',), ('1', '1'),
('1', '1', '2'),
('1', '2'),
('1', '2', '1'), ('1', '2', '2'), ('1', '2', '3'),
('1', '3'),
('1', '3', '2'),
('2',), ('2', '1'),
('2', '1', '1'), ('2', '1', '2'), ('2', '1', '3'),
('2', '2'),
('2', '2', '1'), ('2', '2', '2'), ('2', '2', '3'),
('2', '3'),
('2', '3', '1'), ('2', '3', '2'), ('2', '3', '3'),
('3',), ('3', '1'),
('3', '1', '2'),
('3', '2'),
('3', '2', '1'), ('3', '2', '2'), ('3', '2', '3'),
('3', '3'),
('3', '3', '2')])
def test_globbing_filelist2(self):
"""Filelist glob test similar to above testGlob"""
self.ParseTest([("--exclude-globbing-filelist", "asoeuth")],
[(), ('1',), ('1', '1'),
('1', '1', '1'), ('1', '1', '2'),
('1', '2'), ('1', '2', '1'), ('1', '2', '2')],
["""
**[3-5]
+ testfiles/select/1
**
"""])
self.ParseTest([("--include-globbing-filelist", "file")],
[(), ('1',), ('1', '1'),
('1', '1', '2'),
('1', '2'),
('1', '2', '1'), ('1', '2', '2'), ('1', '2', '3'),
('1', '3'),
('1', '3', '2'),
('2',), ('2', '1'),
('2', '1', '1'), ('2', '1', '2'), ('2', '1', '3'),
('2', '2'),
('2', '2', '1'), ('2', '2', '2'), ('2', '2', '3'),
('2', '3'),
('2', '3', '1'), ('2', '3', '2'), ('2', '3', '3'),
('3',), ('3', '1'),
('3', '1', '2'),
('3', '2'),
('3', '2', '1'), ('3', '2', '2'), ('3', '2', '3'),
('3', '3'),
('3', '3', '2')],
["""
testfiles/select**/2
- **
"""])
def testGlob2(self):
"""Test more globbing functions"""
self.ParseTest([("--include", "testfiles/select/*foo*/p*"),
("--exclude", "**")],
[(), ('efools',), ('efools', 'ping'),
('foobar',), ('foobar', 'pong')])
self.ParseTest([("--exclude", "testfiles/select/1/1/*"),
("--exclude", "testfiles/select/1/2/**"),
("--exclude", "testfiles/select/1/3**"),
("--include", "testfiles/select/1"),
("--exclude", "**")],
[(), ('1',), ('1', '1'), ('1', '2')])
def testAlternateRoot(self):
"""Test select with different root"""
self.root = rpath.RPath(Globals.local_connection, "testfiles/select/1")
self.ParseTest([("--exclude", "testfiles/select/1/[23]")],
[(), ('1',), ('1', '1'), ('1', '2'), ('1', '3')])
self.root = rpath.RPath(Globals.local_connection, "/")
self.ParseTest([("--exclude", "/home/*"),
("--include", "/home"),
("--exclude", "/")],
[(), ("home",)])
# def testParseStartingFrom(self):
# """Test parse, this time starting from inside"""
# self.root = rpath.RPath(Globals.local_connection, "testfiles/select")
# self.Select = Select(self.root)
# self.Select.ParseArgs([("--include", "testfiles/select/1/1"),
# ("--exclude", "**")], [])
# self.Select.set_iter(('1', '1'))
# assert lazy.Iter.equal(lazy.Iter.map(lambda dsrp: dsrp.index,
# self.Select),
# iter([("1", '1', '1'),
# ('1', '1', '2'),
# ('1', '1', '3')]),
# verbose = 1)
if __name__ == "__main__": unittest.main()
#!/usr/bin/env python
import sys, os
__doc__ = """
This starts an rdiff-backup server using the existing source files.
If not run from the source directory, the only argument should be
the directory the source files are in.
"""
def Test_SetConnGlobals(conn, setting, value):
"""This is used in connectiontest.py"""
conn.Globals.set(setting, value)
def print_usage():
print "Usage: server.py [path to source files]", __doc__
if len(sys.argv) > 2:
print_usage()
sys.exit(1)
try:
if len(sys.argv) == 2: sys.path.insert(0, sys.argv[1])
import rdiff_backup.Globals
from rdiff_backup.connection import *
except (OSError, IOError, ImportError):
print_usage()
raise
#Log.setverbosity(9)
PipeConnection(sys.stdin, sys.stdout).Server()
import unittest
from commontest import *
from rdiff_backup import SetConnections
class SetConnectionsTest(unittest.TestCase):
"""Set SetConnections Class"""
def testParsing(self):
"""Test parsing of various file descriptors"""
pfd = SetConnections.parse_file_desc
assert pfd("bescoto@folly.stanford.edu::/usr/bin/ls") == \
("bescoto@folly.stanford.edu", "/usr/bin/ls")
assert pfd("hello there::/goodbye:euoeu") == \
("hello there", "/goodbye:euoeu")
assert pfd(r"test\\ing\::more::and more\\..") == \
(r"test\ing::more", r"and more\\.."), \
pfd(r"test\\ing\::more::and more\\..")
assert pfd("a:b:c:d::e") == ("a:b:c:d", "e")
assert pfd("foobar") == (None, "foobar")
assert pfd(r"hello\::there") == (None, "hello\::there")
self.assertRaises(SetConnections.SetConnectionsException,
pfd, r"hello\:there::")
self.assertRaises(SetConnections.SetConnectionsException,
pfd, "foobar\\")
if __name__ == "__main__": unittest.main()
import unittest, types
from commontest import *
from rdiff_backup.static import *
class D:
def foo(x, y):
return x, y
def bar(self, x):
return 3, x
def _hello(self):
return self
MakeStatic(D)
class C:
_a = 0
def get(cls):
return cls._a
def inc(cls):
cls._a = cls._a + 1
MakeClass(C)
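# A hedged guess at what MakeStatic and MakeClass do, consistent with
# the tests below: wrap each public method in staticmethod (or
# classmethod for MakeClass), leaving names that begin with "_"
# untouched.  A sketch, not rdiff_backup/static.py itself.
def _sketch_make_static(cls):
    for name in cls.__dict__.keys():
        if name[:1] != '_' and callable(cls.__dict__[name]):
            setattr(cls, name, staticmethod(cls.__dict__[name]))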
class StaticMethodsTest(unittest.TestCase):
"""Test StaticMethods module"""
def testType(self):
"""Methods should have type StaticMethod"""
assert type(D.foo) is types.FunctionType
assert type(D.bar) is types.FunctionType
def testStatic(self):
"""Methods should be callable without instance"""
assert D.foo(1,2) == (1,2)
assert D.bar(3,4) == (3,4)
def testBound(self):
"""Methods should also work bound"""
d = D()
assert d.foo(1,2) == (1,2)
assert d.bar(3,4) == (3,4)
def testStatic_(self):
"""_ Methods should be untouched"""
d = D()
self.assertRaises(TypeError, d._hello, 4)
assert d._hello() is d
class ClassMethodsTest(unittest.TestCase):
def test(self):
"""Test MakeClass function"""
assert C.get() == 0
C.inc()
assert C.get() == 1
C.inc()
assert C.get() == 2
if __name__ == "__main__":
unittest.main()
import unittest, time
from commontest import *
from rdiff_backup import statistics, rpath, restore
class StatsObjTest(unittest.TestCase):
"""Test StatsObj class"""
def set_obj(self, s):
"""Set values of s's statistics"""
s.SourceFiles = 1
s.SourceFileSize = 2
s.MirrorFiles = 13
s.MirrorFileSize = 14
s.NewFiles = 3
s.NewFileSize = 4
s.DeletedFiles = 5
s.DeletedFileSize = 6
s.ChangedFiles = 7
s.ChangedSourceSize = 8
s.ChangedMirrorSize = 9
s.IncrementFiles = 15
s.IncrementFileSize = 10
s.StartTime = 11
s.EndTime = 12
def test_get_stats(self):
"""Test reading and writing stat objects"""
s = statistics.StatsObj()
assert s.get_stat('SourceFiles') is None
self.set_obj(s)
assert s.get_stat('SourceFiles') == 1
s1 = statistics.StatFileObj()
assert s1.get_stat('SourceFiles') == 0
def test_get_stats_string(self):
"""Test conversion of stat object into string"""
s = statistics.StatsObj()
stats_string = s.get_stats_string()
assert stats_string == "", stats_string
self.set_obj(s)
stats_string = s.get_stats_string()
ss_list = stats_string.split("\n")
tail = "\n".join(ss_list[2:]) # Time varies by time zone, don't check
#"""StartTime 11.00 (Wed Dec 31 16:00:11 1969)
#EndTime 12.00 (Wed Dec 31 16:00:12 1969)"
assert tail == \
"""ElapsedTime 1.00 (1 second)
SourceFiles 1
SourceFileSize 2 (2 bytes)
MirrorFiles 13
MirrorFileSize 14 (14 bytes)
NewFiles 3
NewFileSize 4 (4 bytes)
DeletedFiles 5
DeletedFileSize 6 (6 bytes)
ChangedFiles 7
ChangedSourceSize 8 (8 bytes)
ChangedMirrorSize 9 (9 bytes)
IncrementFiles 15
IncrementFileSize 10 (10 bytes)
TotalDestinationSizeChange 7 (7 bytes)
""", "'%s'" % stats_string
def test_line_string(self):
"""Test conversion to a single line"""
s = statistics.StatsObj()
self.set_obj(s)
statline = s.get_stats_line(("sample", "index", "w", "new\nline"))
assert statline == "sample/index/w/new\\nline 1 2 13 14 " \
"3 4 5 6 7 8 9 15 10", repr(statline)
statline = s.get_stats_line(())
assert statline == ". 1 2 13 14 3 4 5 6 7 8 9 15 10"
statline = s.get_stats_line(("file name with spaces",))
assert statline == "file\\x20name\\x20with\\x20spaces 1 2 13 14 " \
"3 4 5 6 7 8 9 15 10", repr(statline)
def test_byte_summary(self):
"""Test conversion of bytes to strings like 7.23MB"""
s = statistics.StatsObj()
f = s.get_byte_summary_string
assert f(1) == "1 byte"
assert f(234.34) == "234 bytes"
assert f(2048) == "2.00 KB"
assert f(3502243) == "3.34 MB"
assert f(314992230) == "300 MB"
assert f(36874871216) == "34.3 GB", f(36874871216)
assert f(3775986812573450) == "3434 TB"
def test_init_stats(self):
"""Test setting stat object from string"""
s = statistics.StatsObj()
s.set_stats_from_string("NewFiles 3 hello there")
for attr in s.stat_attrs:
if attr == 'NewFiles': assert s.get_stat(attr) == 3
else: assert s.get_stat(attr) is None, (attr, s.__dict__[attr])
s1 = statistics.StatsObj()
self.set_obj(s1)
assert not s1.stats_equal(s)
s2 = statistics.StatsObj()
s2.set_stats_from_string(s1.get_stats_string())
assert s1.stats_equal(s2)
def test_write_rp(self):
"""Test reading and writing of statistics object"""
rp = rpath.RPath(Globals.local_connection, "testfiles/statstest")
if rp.lstat(): rp.delete()
s = statistics.StatsObj()
self.set_obj(s)
s.write_stats_to_rp(rp)
s2 = statistics.StatsObj()
assert not s2.stats_equal(s)
s2.read_stats_from_rp(rp)
assert s2.stats_equal(s)
def testAverage(self):
"""Test making an average statsobj"""
s1 = statistics.StatsObj()
s1.StartTime = 5
s1.EndTime = 10
s1.ElapsedTime = 5
s1.ChangedFiles = 2
s1.SourceFiles = 100
s1.NewFileSize = 4
s2 = statistics.StatsObj()
s2.StartTime = 25
s2.EndTime = 35
s2.ElapsedTime = 10
s2.ChangedFiles = 1
s2.SourceFiles = 50
s2.DeletedFiles = 0
s3 = statistics.StatsObj().set_to_average([s1, s2])
assert s3.StartTime is s3.EndTime is None
assert s3.ElapsedTime == 7.5
assert s3.DeletedFiles is s3.NewFileSize is None, (s3.DeletedFiles,
s3.NewFileSize)
assert s3.ChangedFiles == 1.5
assert s3.SourceFiles == 75
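# A minimal sketch of the averaging semantics asserted above: StartTime
# and EndTime are apparently forced to None (averaging clock times
# across sessions is meaningless), and any other attribute averages to
# None unless every input defines it.  Not statistics.py's code.
def _sketch_average_attr(values):
    """Average a list of per-session values for one attribute"""
    total = 0
    for v in values:
        if v is None: return None
        total = total + v
    return total / float(len(values))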
class IncStatTest(unittest.TestCase):
"""Test statistics as produced by actual backup"""
def stats_check_initial(self, s):
"""Make sure stats object s compatible with initial mirroring
A lot of the off by one stuff is because the root directory
exists in the below examples.
"""
assert s.MirrorFiles == 1 or s.MirrorFiles == 0
assert s.MirrorFileSize < 20000
assert s.NewFiles <= s.SourceFiles <= s.NewFiles + 1
assert s.NewFileSize <= s.SourceFileSize <= s.NewFileSize + 20000
assert s.ChangedFiles == 1 or s.ChangedFiles == 0
assert s.ChangedSourceSize < 20000
assert s.ChangedMirrorSize < 20000
assert s.DeletedFiles == s.DeletedFileSize == 0
assert s.IncrementFileSize == 0
def testStatistics(self):
"""Test the writing of statistics
The file sizes are approximate because the size of directories
could change with different file systems...
"""
Globals.compression = 1
Myrm("testfiles/output")
InternalBackup(1, 1, "testfiles/stattest1", "testfiles/output")
InternalBackup(1, 1, "testfiles/stattest2", "testfiles/output",
time.time()+1)
rbdir = rpath.RPath(Globals.local_connection,
"testfiles/output/rdiff-backup-data")
incs = restore.get_inclist(rbdir.append("session_statistics"))
assert len(incs) == 2
s2 = statistics.StatsObj().read_stats_from_rp(incs[0])
assert s2.SourceFiles == 7
assert 700000 <= s2.SourceFileSize < 750000
self.stats_check_initial(s2)
root_stats = statistics.StatsObj().read_stats_from_rp(incs[1])
assert root_stats.SourceFiles == 7, root_stats.SourceFiles
assert 550000 <= root_stats.SourceFileSize < 570000
assert root_stats.MirrorFiles == 7
assert 700000 <= root_stats.MirrorFileSize < 750000
assert root_stats.NewFiles == 1
assert root_stats.NewFileSize == 0
assert root_stats.DeletedFiles == 1, root_stats.DeletedFiles
assert root_stats.DeletedFileSize == 200000
assert 3 <= root_stats.ChangedFiles <= 4, root_stats.ChangedFiles
assert 450000 <= root_stats.ChangedSourceSize < 470000
assert 400000 <= root_stats.ChangedMirrorSize < 420000, \
root_stats.ChangedMirrorSize
assert 10 < root_stats.IncrementFileSize < 30000
if __name__ == "__main__": unittest.main()
import profile, pstats
from metadatatest import *
profile.run("unittest.main()", "profile-output")
p = pstats.Stats("profile-output")
p.sort_stats('time')
p.print_stats(40)
"""This probably doesn't work any more - just run the tests manually."""
import unittest
from connectiontest import *
#from destructive-steppingtest import *
from dstest import *
from highleveltest import *
from incrementtest import *
from iterfiletest import *
from lazytest import *
from rdifftest import *
from regressiontest import *
from restoretest import *
from rlisttest import *
from rorpitertest import *
from rpathtest import *
#from finaltest import *
from statictest import *
from timetest import *
from filelisttest import *
from setconnectionstest import *
if __name__ == "__main__":
unittest.main()
import unittest, time, types
from commontest import *
from rdiff_backup import Globals, Time
class TimeTest(unittest.TestCase):
def testConversion(self):
"""test timetostring and stringtotime"""
Time.setcurtime()
assert type(Time.curtime) in (types.FloatType, types.LongType)
assert type(Time.curtimestr) is types.StringType
assert (Time.cmp(int(Time.curtime), Time.curtimestr) == 0 or
Time.cmp(int(Time.curtime) + 1, Time.curtimestr) == 0)
time.sleep(1.05)
assert Time.cmp(time.time(), Time.curtime) == 1
assert Time.cmp(Time.timetostring(time.time()), Time.curtimestr) == 1
def testConversion_separator(self):
"""Same as testConversion, but change time Separator"""
Globals.time_separator = "_"
self.testConversion()
Globals.time_separator = ":"
def testCmp(self):
"""Test time comparisons"""
cmp = Time.cmp
assert cmp(1,2) == -1
assert cmp(2,2) == 0
assert cmp(5,1) == 1
assert cmp("2001-09-01T21:49:04Z", "2001-08-01T21:49:04Z") == 1
assert cmp("2001-09-01T04:49:04+03:23", "2001-09-01T21:49:04Z") == -1
assert cmp("2001-09-01T12:00:00Z", "2001-09-01T04:00:00-08:00") == 0
assert cmp("2001-09-01T12:00:00-08:00",
"2001-09-01T12:00:00-07:00") == 1
def testStringtotime(self):
"""Test converting string to time"""
timesec = int(time.time())
assert timesec == int(Time.stringtotime(Time.timetostring(timesec)))
assert not Time.stringtotime("2001-18-83T03:03:03Z")
assert not Time.stringtotime("2001-01-23L03:03:03L")
assert not Time.stringtotime("2001_01_23T03:03:03Z")
def testIntervals(self):
"""Test converting strings to intervals"""
i2s = Time.intstringtoseconds
for s in ["32", "", "d", "231I", "MM", "s", "-2h"]:
try: i2s(s)
except Time.TimeException: pass
else: assert 0, s
assert i2s("7D") == 7*86400
assert i2s("232s") == 232
assert i2s("2M") == 2*30*86400
assert i2s("400m") == 400*60
assert i2s("1Y") == 365*86400
assert i2s("30h") == 30*60*60
assert i2s("3W") == 3*7*86400
def testIntervalsComposite(self):
"""Like above, but allow composite intervals"""
i2s = Time.intstringtoseconds
assert i2s("7D2h") == 7*86400 + 2*3600
assert i2s("2Y3s") == 2*365*86400 + 3
assert i2s("1M2W4D2h5m20s") == (30*86400 + 2*7*86400 + 4*86400 +
2*3600 + 5*60 + 20)
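# A hedged sketch of the interval grammar the two tests pin down: one
# or more <number><unit> pairs, with units s/m/h/D/W/M/Y, where M is
# taken as 30 days and Y as 365 days; anything left over (signs, bare
# numbers, unknown letters) is an error.  Not Time.intstringtoseconds,
# which raises Time.TimeException rather than ValueError.
def _sketch_interval_to_seconds(s):
    import re
    units = {'s': 1, 'm': 60, 'h': 3600, 'D': 86400,
             'W': 7*86400, 'M': 30*86400, 'Y': 365*86400}
    pair_re = re.compile(r"(\d+)([smhDWMY])")
    pairs = pair_re.findall(s)
    if not pairs or pair_re.sub('', s):
        raise ValueError("bad interval string: %s" % repr(s))
    total = 0
    for num, unit in pairs: total = total + int(num) * units[unit]
    return total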
def testPrettyIntervals(self):
"""Test printable interval conversion"""
assert Time.inttopretty(3600) == "1 hour"
assert Time.inttopretty(7220) == "2 hours 20 seconds"
assert Time.inttopretty(0) == "0 seconds"
assert Time.inttopretty(353) == "5 minutes 53 seconds"
assert Time.inttopretty(3661) == "1 hour 1 minute 1 second"
assert Time.inttopretty(353.234234) == "5 minutes 53.23 seconds"
def testGenericString(self):
"""Test genstrtotime, conversion of arbitrary string to time"""
g2t = Time.genstrtotime
assert g2t('now', 1000) == 1000
assert g2t('2h3s', 10000) == 10000 - 2*3600 - 3
assert g2t('2001-09-01T21:49:04Z') == \
Time.stringtotime('2001-09-01T21:49:04Z')
assert g2t('2002-04-26T04:22:01') == \
Time.stringtotime('2002-04-26T04:22:01' + Time.gettzd())
t = Time.stringtotime('2001-05-12T00:00:00' + Time.gettzd())
assert g2t('2001-05-12') == t
assert g2t('2001/05/12') == t
assert g2t('5/12/2001') == t
assert g2t('123456') == 123456
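# A minimal sketch of the dispatch order the genstrtotime tests
# suggest: "now" first, then an interval meaning "that long before the
# reference time", then a date string (the real code normalizes several
# date formats before calling stringtotime), then a bare integer of
# seconds since the epoch.  The helpers are passed in to keep the
# sketch self-contained; the real parser lives in Time.py.
def _sketch_genstrtotime(timestr, ref_time, stringtotime, interval_to_secs):
    if timestr == "now": return ref_time
    try: return ref_time - interval_to_secs(timestr)
    except ValueError: pass
    t = stringtotime(timestr)
    if t is not None: return t
    return int(timestr)  # raises on garbage like "hello" or "3q"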
def testGenericStringErrors(self):
"""Test genstrtotime on some bad strings"""
g2t = Time.genstrtotime
self.assertRaises(Time.TimeException, g2t, "hello")
self.assertRaises(Time.TimeException, g2t, "")
self.assertRaises(Time.TimeException, g2t, "3q")
if __name__ == '__main__': unittest.main()