Discussion:
File copy/move - something better than the Finder?
Tony Hall
2017-01-16 11:52:49 UTC
Hello,

I've bought some more storage for my photo backups, which will mean
moving many terabytes of data around.

Is there a better, more reliable method that would be preferable to
simply using drag 'n' drop in the Finder, maybe something that performs
robust error checking etc. (*Not* command line - my brain doesn't work
that way!)

I already have SuperDuper! and ChronoSync - would these offer any
advantages over the Finder?

Any software recommendations? (Preferably low cost as I only do this
once every few years.)

Many thanks for your thoughts and advice.

Cheers,
Tony
Steve Hodgson
2017-01-16 13:49:45 UTC
Post by Tony Hall
I've bought some more storage for my photo backups, which will mean
moving many terabytes of data around.
Is there a better, more reliable method that would be preferable to
simply using drag 'n' drop in the Finder, maybe something that performs
robust error checking etc. (*Not* command line - my brain doesn't work
that way!)
I already have SuperDuper! and ChronoSync - would these offer any
advantages over the Finder?
Any software recommendations? (Preferably low cost as I only do this
once every few years.)
Many thanks for your thoughts and advice.
Do you need to move these photos just once, or as a periodic activity?

If it's the latter then ChronoSync would be ideal, as it can do this
automatically when external storage is connected and it can handle
multiple sources and destinations. Like SuperDuper, it is a *copy* that
you're making rather than a move.

You could try Unison (not the newsreader), which is a folder
synchronizer that uses the command-line rsync as the back end, I think.

Hope this helps.
--
Cheers,

Steve
J. J. Lodder
2017-01-16 14:09:51 UTC
Post by Tony Hall
Hello,
I've bought some more storage for my photo backups, which will mean
moving many terabytes of data around.
Is there a better, more reliable method that would be preferable to
simply using drag 'n' drop in the Finder, maybe something that performs
robust error checking etc. (*Not* command line - my brain doesn't work
that way!)
No. For a first fill with terabytes a Finder copy
is the best and the fastest way there is.
Post by Tony Hall
I already have SuperDuper! and ChronoSync - would these offer any
advantages over the Finder?
Only for keeping up.
Post by Tony Hall
Any software recommendations? (Preferably low cost as I only do this
once every few years.)
If you are really worried about bits getting corrupted
you can add par2 parity files, with a chosen percentage of redundancy.

That is better than error checking during the copy,
because you can get positive confirmation at any time
that all is still well, and you can repair the files if it isn't.
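For example, with the par2 command-line tool (par2cmdline, which is not
part of macOS and would need to be installed separately; the file names
and the 10% redundancy figure here are only examples) the workflow is
roughly:

  # create parity data with 10% redundancy alongside the photos
  par2 create -r10 photos.par2 *.cr2

  # at any later date: confirm that every file is still intact
  par2 verify photos.par2

  # if verify reports damage, rebuild the broken files from the parity data
  par2 repair photos.par2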

Jan
Graham J
2017-01-16 22:01:52 UTC
Post by Tony Hall
Hello,
I've bought some more storage for my photo backups, which will mean
moving many terabytes of data around.
Is there a better, more reliable method that would be preferable to
simply using drag 'n' drop in the Finder, maybe something that performs
robust error checking etc. (*Not* command line - my brain doesn't work
that way!)
I already have SuperDuper! and ChronoSync - would these offer any
advantages over the Finder?
Any software recommendations? (Preferably low cost as I only do this
once every few years.)
Many thanks for your thoughts and advice.
OK Some thoughts ....

Do you have master files for your photos and - separately - backup
copies? I suspect from the way you've worded the question that you
don't, so I'm not sure exactly what it is you are trying to achieve.

Computers generally distinguish between "move" and "copy". They may
make assumptions - thus on a Windows PC "move" happens by default when
the source and destination folders are on the same logical device
(generally). When source and destination are on different physical
devices the default operation is "copy". Macs may be different in this
respect.

Further, if there are restrictive ownership settings on the files in the
source folder these might or might not be transferred to the files in
the destination folder, particularly if the destination folder is
"owned" by another user. Moving files about between user folders on a
Windows server is a classic cause of such problems because ownership
does not change. I think Novell Netware made the opposite assumption
and changed ownership permissions to match that of the destination
folder. But whatever happens, for some users and some requirements the
default option will be incorrect.

The problem with "move" when it is applied to many files may be that
each source file is first copied then the source is deleted before
working on the next file. Thus files may be lost if the copy process
has failed. Typically on Windows, if one such failure occurs then the
whole "move" terminates - leaving some files on the source and some on
the destination. This may also happen on a Mac.

In addition, the mouse-driven "drag-n-drop" move is very susceptible to
human error. I've seen users think they have deleted many files when
they have inadvertently dropped them inside the wrong folder. An
explicit "copy" can sometimes be undone if the user makes that sort of
error.

This is where your requirement for robustness is relevant.

Probably the safest procedure is never to "move" anything. If you
always copy from source to destination then nothing is at risk of
unexpected deletion. Then ideally you should log every file copied -
but reading through that log to spot the odd failure in perhaps
thousands of lines is a real challenge.

This is where command line tools really do come into their own, and are
well worth learning. For example, a command could:

A. Specify a copy from source to destination, but only log the files
that would be copied.

B. Carry out that copy, creating a new log file.

... then you could compare the log files to verify they match - giving
you confirmation that the procedure has worked.

C. Compare each source file in turn with the destination file - again
logging the result. This gives more confidence that the procedure has
worked.

D. Repeat the process with the destination directory on a different
device - you then have a trustworthy duplicate copy.
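As a rough sketch, steps A to C might look like this on a Mac using
rsync (the volume names are only examples; step D would just be the
same commands pointed at a second destination):

  # A: dry run - list what would be copied, without copying anything
  rsync -avn /Volumes/Photos/ /Volumes/NewPhotos/ > would-copy.log

  # B: do the copy, keeping a log of every file transferred
  rsync -av /Volumes/Photos/ /Volumes/NewPhotos/ > copied.log 2>&1

  # C: re-compare source and destination by checksum; only differences
  # (ideally none) are listed
  rsync -rvcn /Volumes/Photos/ /Volumes/NewPhotos/ > differences.log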

Windows has "Robocopy" which together with command line scripts can
achieve such things. Given that Linux underlies the Mac operating
systems I'm sure an exactly similar scripting process would achieve what
you want - it must have the necessary tools.

Finally - if you really do have master files and backups - never change
a master version without having a trustworthy backup.

So if what you want to achieve ultimately is to move all the master
files to new physical storage - try this:

Create the new storage structure on the new hardware, and restore the
files from the backup to that new hardware in the organisation you now
require; then create a separate new backup which replicates the new
organisation. Finally delete the files from their original storage
location, and allow the original backup to be re-used.

You can execute each step and test it for correctness before moving to
the next.
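One way to test a step for correctness before deleting anything is a
recursive comparison of the two copies, for example (the paths are only
examples, and it re-reads every file, so it is slow):

  # report any file that differs, or that exists on only one side;
  # no output means the two trees match
  diff -qr /Volumes/OldBackup/Photos /Volumes/NewDrive/Photos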

Hope this helps.
--
Graham J
Chris
2017-01-17 08:23:23 UTC
Post by Graham J
Post by Tony Hall
Hello,
I've bought some more storage for my photo backups, which will mean
moving many terabytes of data around.
Is there a better, more reliable method that would be preferable to
simply using drag 'n' drop in the Finder, maybe something that performs
robust error checking etc. (*Not* command line - my brain doesn't work
that way!)
I already have SuperDuper! and ChronoSync - would these offer any
advantages over the Finder?
Any software recommendations? (Preferably low cost as I only do this
once every few years.)
Many thanks for your thoughts and advice.
OK Some thoughts ....
Do you have master files for your photos and - separately - backup
copies? I suspect from the way you've worded the question that you
don't, so I'm not sure exactly what it is you are trying to achieve.
Computers generally distinguish between "move" and "copy". They may
make assumptions - thus on a Windows PC "move" happens by default when
the source and destination folders are on the same logical device
(generally). When source and destination are on different physical
devices the default operation is "copy". Macs may be different in this
respect.
Further, if there are restrictive ownership settings on the files in the
source folder these might or might not be transferred to the files in
the destination folder, particularly if the destination folder is
"owned" by another user. Moving files about between user folders on a
Windows server is a classic cause of such problems because ownership
does not change. I think Novell Netware made the opposite assumption
and changed ownership permissions to match that of the destination
folder. But whatever happens, for some users and some requirements the
default option will be incorrect.
The problem with "move" when it is applied to many files may be that
each source file is first copied then the source is deleted before
working on the next file. Thus files may be lost if the copy process
has failed. Typically on Windows, if one such failure occurs then the
whole "move" terminates - leaving some files on the source and some on
the destination. This may also happen on a Mac.
In addition, the mouse-driven "drag-n-drop" move is very susceptible to
human error. I've seen users think they have deleted many files when
they have inadvertently dropped them inside the wrong folder. An
explicit "copy" can sometimes be undone if the user makes that sort of
error.
This is where your requirement for robustness is relevant.
Probably the safest procedure is never to "move" anything. If you
always copy from source to destination then nothing is at risk of
unexpected deletion. Then ideally you should log every file copied -
but reading through that log to spot the odd failure in perhaps
thousands of lines is a real challenge.
This is where command line tools really do come into their own, and are
A. Specify a copy from source to destination, but only log the files
that would be copied.
B. Carry out that copy, creating a new log file.
... then you could compare the log files to verify they match - giving
you confirmation that the procedure has worked.
C. Compare each source file in turn with the destination file - again
logging the result. This gives more confidence that the procedure has
worked.
The above can all be achieved in one go with rsync, a very robust
file-syncing utility. It's command-line only on the Mac, I'm afraid.
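For instance, something along these lines (the paths are just
placeholders):

  # -a preserves attributes, -v lists every file handled,
  # -c compares files by checksum rather than by date and size
  rsync -avc /Volumes/Photos/ /Volumes/Backup/Photos/ > rsync.log 2>&1

The -c option is slow, since it reads every file at both ends, but it
is thorough.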
Post by Graham J
D. Repeat the process with the destination directory on a different
device - you then have a trustworthy duplicate copy.
Windows has "Robocopy" which together with command line scripts can
achieve such things. Given that Linux underlies the Mac operating
systems
Nope. The Darwin kernel is based on freebsd which is Unix.

Many commands differ between macOS and linux.
Post by Graham J
I'm sure an exactly similar scripting process would achieve what
you want - it must have the necessary tools.
It does, but requires knowledge of the command-line which the OP doesn't
like.
Post by Graham J
Finally - if you really do have master files and backups - never change
a master version without having a trustworthy backup.
So if what you want to achieve ultimately is to move all the master
Create the new storage structure on the new hardware, and restore the
files from the backup to that new hardware in the organisation you now
require; then create a separate new backup which replicates the new
organisation. Finally delete the files from their original storage
location, and allow the original backup to be re-used.
That would work.
Post by Graham J
You can execute each step and test it for correctness before moving to
the next.
Hope this helps.
unknown
2017-01-17 22:18:20 UTC
Post by Chris
Post by Graham J
worked.
The above can all be achieved in one go with rsync. Very robust file
syncing utility. Command-line only on the Mac I'm afraid.
Note that you should use a non-Apple rsync for backups, as the Apple
one is old and does not deal with all Unicode paths, nor, I think, with
extended file attributes.
There are GUI wrappers, e.g. arSync, but I have not used them.
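If you do want a newer rsync, one common route is Homebrew, assuming
you have it installed (just a sketch of the idea):

  # install a current rsync alongside Apple's bundled one
  brew install rsync

  # see which rsync your shell now picks up, and its version;
  # depending on your PATH you may need to call it by its full path
  which rsync
  rsync --version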
Post by Chris
Post by Graham J
D. Repeat the process with the destination directory on a different
device - you then have a trustworthy duplicate copy.
Windows has "Robocopy" which together with command line scripts can
achieve such things. Given that Linux underlies the Mac operating
systems
Nope. The Darwin kernel is based on freebsd which is Unix.
No, the kernel is XNU. The command-line tools are from FreeBSD.
Post by Chris
Many commands differ between macOS and linux.
and the POSIX standard matches the Mac tools more closely; Linux has extensions.
Post by Chris
Post by Graham J
I'm sure an exactly similar scripting process would achieve what
you want - it must have the necessary tools.
It does, but requires knowledge of the command-line which the OP doesn't
like.
Post by Graham J
Finally - if you really do have master files and backups - never change
a master version without having a trustworthy backup.
So if what you want to achieve ultimately is to move all the master
Create the new storage structure on the new hardware, and restore the
files from the backup to that new hardware in the organisation you now
require; then create a separate new backup which replicates the new
organisation. Finally delete the files from their original storage
location, and allow the original backup to be re-used.
That would work.
Post by Graham J
You can execute each step and test it for correctness before moving to
the next.
Hope this helps.
I use Carbon Copy Cloner for backups - it works somewhat like rsync but
with a GUI.
--
Mark
Richard Tobin
2017-01-18 12:34:08 UTC
Post by unknown
Post by Chris
Nope. The Darwin kernel is based on freebsd which is Unix.
No the kernel is XNU. The command line tools are Free BSD
XNU contains a lot of FreeBSD code, to provide a Posix process model
and interface on top of the Mach-based microkernel.

-- Richard
Bruce Horrocks
2017-01-16 22:35:21 UTC
Post by Tony Hall
(*Not* command line - my brain doesn't work
that way!)
That's a pity because you're excluding the cheapest and most reliable
method!

How about a half command line, half drag and drop solution?

1) Make sure your source and destination drives are mounted.
2) Go to Launchpad and start the Terminal.
3) Type the command "rsync -av " without the quotes. Note the space at
the end. Don't press enter.
4) Drag and drop the source directory onto the Terminal window.
(Terminal will automatically fill in the directory name as if it had
been typed in.)
5) Drag and drop the destination directory onto the Terminal window.
(Again the name gets added.)
6) Press Enter in Terminal.
7) Watch the progress as the files are transferred.
8) Save these notes in a file for next year!
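For reference, the finished command ends up looking something like this
(the volume and folder names are only examples):

  rsync -av /Volumes/WorkingDrive/Photos /Volumes/NewDrive/

Dragging a folder in gives its path without a trailing slash, so the
Photos folder itself is created inside the destination rather than just
its contents being poured in.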

If you repeat the transfer (i.e. same source and destination
directories) at a later date then only the changes will be transferred.
--
Bruce Horrocks
Surrey
England
(bruce at scorecrow dot com)
Tony Hall
2017-01-18 17:13:34 UTC
Post by Bruce Horrocks
Post by Tony Hall
(*Not* command line - my brain doesn't work
that way!)
That's a pity because you're excluding the cheapest and most reliable
method!
I was afraid someone would say that!

What makes copying via command line more reliable than via the Finder
(faster, more robust, less error prone, something else)?
Post by Bruce Horrocks
How about a half command line, half drag and drop solution?
...
<Snip> - Step-by-step details saved for future reference - thank you.
Post by Bruce Horrocks
If you repeat the transfer (i.e. same source and destination
directories) at a later date then only the changes will be transferred.
nospam
2017-01-18 17:46:38 UTC
Post by Tony Hall
Post by Bruce Horrocks
Post by Tony Hall
(*Not* command line - my brain doesn't work
that way!)
That's a pity because you're excluding the cheapest and most reliable
method!
I was afraid someone would say that!
What makes copying via command line more reliable than via the Finder
(faster, more robust, less error prone, something else)?
nothing.
unknown
2017-01-18 19:34:38 UTC
Post by Tony Hall
Post by Bruce Horrocks
Post by Tony Hall
(*Not* command line - my brain doesn't work
that way!)
That's a pity because you're excluding the cheapest and most reliable
method!
I was afraid someone would say that!
What makes copying via command line more reliable than via the Finder
(faster, more robust, less error prone, something else)?
Because rsync and other tools copy and then verify the copy, which the
Finder does not. rsync also only copies files that have changed, whilst
the Finder would copy unchanged files again as well.

You can use backup GUIs like Carbon Copy Cloner and, I think, SuperDuper
to copy selected folders rather than whole filesystems. Given that you
don't want to use the command line, it might be worth paying for one of
them.
Post by Tony Hall
Post by Bruce Horrocks
How about a half command line, half drag and drop solution?
...
<Snip> - Step-by-step details saved for future reference - thank you.
Post by Bruce Horrocks
If you repeat the transfer (i.e. same source and destination
directories) at a later date then only the changes will be transferred.
--
Mark
Tony Hall
2017-01-19 03:36:19 UTC
Post by unknown
Post by Tony Hall
What makes copying via command line more reliable than via the Finder
(faster, more robust, less error prone, something else)?
Because rsync and other tools copy and then verify the copy, which the
Finder does not. rsync also only copies files that have changed, whilst
the Finder would copy unchanged files again as well.
Good to know - thanks.

You mentioned in an earlier post that I shouldn't use Apple's rsync. I
have little idea about using Apple's command line, let alone installing
and using someone else's. Is this something a neophyte like me could get
to grips with?
Post by unknown
You can use backup GUIs like Carbon Copy Cloner and, I think, SuperDuper
to copy selected folders rather than whole filesystems. Given that you
don't want to use the command line, it might be worth paying for one of
them.
I already have SuperDuper.
Chris
2017-01-19 22:03:11 UTC
Post by Tony Hall
Post by unknown
Post by Tony Hall
What makes copying via command line more reliable than via the Finder
(faster, more robust, less error prone, something else)?
Because rsync and other tools copy and then verify the copy, which the
Finder does not. rsync also only copies files that have changed, whilst
the Finder would copy unchanged files again as well.
Good to know - thanks.
You mentioned in an earlier post that I shouldn't use Apple's rsync. I
have little idea about using Apple's command line, let alone installing
and using someone else's. Is this something a neophyte like me could get
to grips with?
I really don't think it's an issue unless you're in the habit of naming
files with odd characters or emojis.

Alternatively, I've just spotted this GUI for rsync:
https://rsyncosx.blogspot.co.uk/2017/01/rsyncosx_52.html?m=1

Not used it myself, but could be helpful?
Post by Tony Hall
Post by unknown
You can use backup GUIs like Carbon Copy Cloner and, I think, SuperDuper
to copy selected folders rather than whole filesystems. Given that you
don't want to use the command line, it might be worth paying for one of
them.
I already have SuperDuper.
Tony Hall
2017-01-18 17:13:34 UTC
Many thanks everyone for the detailed replies, it's given me some food
for thought.

I think I've introduced some unintentional confusion.

For a little clarity (hopefully!):

Currently I store my photos (plus related files) on three 4TB external
drives:
1 - Working drive
2 - Backup of working drive
3 - Off-site backup of working drive
All 3 drives' content is currently identical.

I'm running out of space on the 'working drive' (and consequently the
two backup drives) so I'm formulating a more thorough and efficient
storage/archiving strategy and have initially purchased an additional
8TB external drive.

Consequently I will be shifting a lot of files around between drives.

Normally I would simply copy the files using Finder drag & drop and then
manually delete unneeded files as necessary (ie. the Finder is not
'moving' any files).

It occurred to my tiny mind that copying TBs of files around using the
Finder may not be the safest method and I'd hate to introduce some
corruption, hence my poorly thought out question.


So my questions are...

- Is the Finder prone to introducing unreported corruption or errors
during file copying?

- Is there a better method of copying large amounts of data? **

- Would an application like ChronoSync (only mentioned because I own a
currently unused copy) perform additional integrity checks while copying
files (maybe by analysing originals and their copies during the process)
and thus be preferable to using the Finder?

- What is a good application to scan hard drives for file corruption?

- Am I over thinking the whole thing and should simply stop
procrastinating and get on with the job at hand?

** People have already stated that rsync is the way to go - why?


This exercise has raised a whole host of additional questions, but I'll
ask those separately.

I hope that all makes some kind of sense and many thanks for reading
this far!

Cheers,
Tony
nospam
2017-01-18 17:46:39 UTC
Post by Tony Hall
So my questions are...
- Is the Finder prone to introducing unreported corruption or errors
during file copying?
no. why would you think it would be?
Post by Tony Hall
- Is there a better method of copying large amounts of data? **
depends what you mean by 'better'. sometimes finder is better,
sometimes it isn't.
Post by Tony Hall
- Would an application like ChronoSync (only mentioned because I own a
currently unused copy) perform additional integrity checks while copying
files (maybe by analysing originals and their copies during the process
and thus be preferable to using the Finder)?
generally no, because that can be *extremely* time consuming for
something that very rarely happens.
Post by Tony Hall
- What is a good application to scan hard drives for file corruption?
diskwarrior will detect directory corruption. for bit rot, there's not
much you can do other than having redundant backups or wait for apfs.
Post by Tony Hall
- Am I over thinking the whole thing and should simply stop
procrastinating and get on with the job at hand?
yes
Post by Tony Hall
** People have already stated that rsync is the way to go - why?
because they don't know of any other method.
Chris Ridd
2017-01-18 21:04:24 UTC
Post by nospam
for bit rot, there's not
much you can do other than having redundant backups or wait for apfs.
Does APFS really detect and prevent bit-rot, a la ZFS? From what I've
read it does NOT do that for files, so if you've got a good
counter-reference I'd love to see it.

See <http://dtrace.org/blogs/ahl/2016/06/19/apfs-part5/>
--
Chris
Paul Sture
2017-01-23 19:27:52 UTC
Post by Chris Ridd
Post by nospam
for bit rot, there's not
much you can do other than having redundant backups or wait for apfs.
Does APFS really detect and prevent bit-rot, a la ZFS? From what I've
read it does NOT do that for files, so if you've got a good
counter-reference I'd love to see it.
See <http://dtrace.org/blogs/ahl/2016/06/19/apfs-part5/>
The paragraph about checksumming takes me back to discussions we had
about turning checksumming off (for performance reasons) for backup
tapes back when tape drives with ECC protection started appearing.[1]

Explicitly not checksumming user data is a little more interesting.
The APFS engineers I talked to cited strong ECC protection within
Apple storage devices. Both flash SSDs and magnetic media HDDs use
redundant data to detect and correct errors. The engineers contend
that Apple devices basically don’t return bogus data. NAND uses
extra data, e.g. 128 bytes per 4KB page, so that errors can be
corrected and detected. (For reference, ZFS uses a fixed size 32
byte checksum for blocks ranging from 512 bytes to megabytes. That’s
small by comparison, but bear in mind that the SSD’s ECC is required
for the expected analog variances within the media.) The devices
have a bit error rate that’s tiny enough to expect no errors over
the device’s lifetime. In addition, there are other sources of
device errors where a file system’s redundant check could be
invaluable. SSDs have a multitude of components, and in volume
consumer products they rarely contain end-to-end ECC protection
leaving the possibility of data being corrupted in transit. Further,
their complex firmware can (does) contain bugs that can result in
data loss.

Adam makes the very same point that came up in the tape discussions,
that of end-to-end ECC protection for the complete data path (cables,
controllers etc).

[1] my preference was to stay with checksumming switched on, and
the next generation of CPUs moved the bottleneck back to I/O instead
of CPU, so it was an easy decision to keep on checksumming.
--
A supercomputer is a device for turning compute-bound problems into
I/O-bound problems. ---Ken Batcher
Chris Ridd
2017-01-23 20:04:51 UTC
Post by Paul Sture
Post by Chris Ridd
Post by nospam
for bit rot, there's not
much you can do other than having redundant backups or wait for apfs.
Does APFS really detect and prevent bit-rot, a la ZFS? From what I've
read it does NOT do that for files, so if you've got a good
counter-reference I'd love to see it.
See <http://dtrace.org/blogs/ahl/2016/06/19/apfs-part5/>
The paragraph about checksumming takes me back to discussions we had
about turning checksumming off (for performance reasons) for backup
tapes back when tape drives with ECC protection started appearing.[1]
Interesting perspective!
Post by Paul Sture
Adam makes the very same point that came up in the tape discussions,
that of end-to-end ECC protection for the complete data path (cables,
controllers etc).
I’ve seen many instances where devices raised no error but ZFS (correctly) detected corrupted data. Apple has some of the most stringent device qualification tests for its vendors; I trust that they really do procure the best components. Apple engineers I spoke with claimed that bit rot was not a problem for users of their devices, but if your software can’t detect errors then you have no idea how your devices really perform in the field. ZFS has found data corruption on multi-million dollar storage arrays; I would be surprised if it didn’t find errors coming from TLC (i.e. the cheapest) NAND chips in some of Apple’s devices. Recall the (fairly) recent brouhaha regarding storage problems in the high capacity iPhone 6. At least some of Apple’s devices have been imperfect.
Apple really cannot honestly say that their devices don't return bad data.
Post by Paul Sture
[1] my preference was to stay with checksumming switched on, and
the next generation of CPUs moved the bottleneck back to I/O instead
of CPU, so it was an easy decision to keep on checksumming.
It seems like a no brainer. CPUs are hardly ever going to be the bottleneck.
--
Chris
Tony Hall
2017-01-19 03:36:19 UTC
Post by nospam
Post by Tony Hall
- Is the Finder prone to introducing unreported corruption or errors
during file copying?
no. why would you think it would be?
Another photographer/educator mentioned that he uses some utility for
copying large batches of files, so it raised some doubt in my mind about
the Finder's suitability - hence I asked the question.

Many thanks for your input on the other questions.
Graham J
2017-01-18 18:04:39 UTC
Tony Hall wrote:

[snip useful description]
Post by Tony Hall
So my questions are...
- Is the Finder prone to introducing unreported corruption or errors
during file copying?
Copying large volumes of data may take a long time - hours. What
happens if you want to temporarily suspend the process to (for example)
disconnect the source disk part way through the process? Or the power
fails (here in rural England that happens occasionally).

I know nothing about Finder, but I'm reasonably certain that a Mac
command line method to "copy new and changed files ..." could be stopped
and restarted; and would in effect pick up where it left off ...
--
Graham J
Tony Hall
2017-01-19 03:36:19 UTC
Post by Graham J
I know nothing about Finder, but I'm reasonably certain that a Mac
command line method to "copy new and changed files ..." could be stopped
and restarted; and would in effect pick up where it left off ...
As would ChronoSync (mentioned in my original question).

It also has the advantage of a more visual interface. Unfortunately my
brain doesn't seem to work well with abstract command lines and code
(heck, I struggle to get my head around creating simple scripts and
occasional basic html/css etc.).

Thanks.
J. J. Lodder
2017-01-18 20:01:54 UTC
Post by Tony Hall
Many thanks everyone for the detailed replies, it's given me some food
for thought.
I think I've introduced some unintentional confusion.
Currently I store my photos (plus related files) on three 4TB external
1 - Working drive
2 - Backup of working drive
3 - Off-site backup of working drive
All 3 drives' content is currently identical.
I'm running out of space on the 'working drive' (and consequently the
two backup drives) so I'm formulating a more thorough and efficient
storage/archiving strategy and have initially purchased an additional
8TB external drive.
Consequently I will be shifting a lot of files around between drives.
Normally I would simply copy the files using Finder drag & drop and then
manually delete unneeded files as necessary (ie. the Finder is not
'moving' any files).
It occurred to my tiny mind that copying TBs of files around using the
Finder may not be the safest method and I'd hate to introduce some
corruption, hence my poorly thought out question.
So my questions are...
- Is the Finder prone to introducing unreported corruption or errors
during file copying?
Not prone, but read/write errors are possible.
The Finder doesn't verify writes.
I tried it once, running File Buddy to find unique files
(by data and resource fork) on a newly copied big volume.
(taking more than 24 hours of scan time)
It found one, yes one, deviant file. (a broken pdf)
Curiously enough the broken file was on the source volume,
so the Finder copy was better than the original.
The Finder rereads when it spots a read error.
Post by Tony Hall
- Is there a better method of copying large amounts of data? **
No. As said, you can add integrity checking by other means.
The best, if somewhat cumbersome, option is par2.
Post by Tony Hall
- Would an application like ChronoSync (only mentioned because I own a
currently unused copy) perform additional integrity checks while copying
files (maybe by analysing originals and their copies during the process
and thus be preferable to using the Finder)?
- What is a good application to scan hard drives for file corruption?
Scanning for corruption is in principle impossible.
If a valid file has changed into another valid file
there is no way to detect that.
Introducing redundancy is the only way to accomplish that.
(by adding parity files, recording checksums, or something like that)
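Recording checksums can be fairly painless, for example (assuming the
photos live under /Volumes/Photos, which is only an example; shasum
ships with macOS):

  # write a checksum for every file into a manifest kept elsewhere
  cd /Volumes/Photos
  find . -type f -exec shasum -a 256 {} + > ~/photos.sha256

  # at any later date: re-check every file against the manifest
  cd /Volumes/Photos && shasum -c ~/photos.sha256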
Post by Tony Hall
- Am I over thinking the whole thing and should simply stop
procrastinating and get on with the job at hand?
Is losing one picture in a million, perhaps once in a thousand years
a disaster to you? Would you ever detect it?
Post by Tony Hall
** People have already stated that rsync is the way to go - why?
To a hammer everything looks like a nail.

Jan
Tony Hall
2017-01-19 03:36:19 UTC
<Snip> - interesting thought - many thanks.
Post by J. J. Lodder
Post by Tony Hall
- Am I over thinking the whole thing and should simply stop
procrastinating and get on with the job at hand?
Is losing one picture in a million, perhaps once in a thousand years
a disaster to you? Would you ever detect it?
Nope. Currently with my system I have 6 copies of each and every camera
RAW file (spread over 3 drives) that is processed and delivered to
clients. Plus processed exported JPGs also on the 3 drives and on my web
server.

As you say, the possibility of irrevocably losing an important image is
pretty small.

I guess I should stop procrastinating and just dive in and get things
organised.

Thanks for your thoughts on these matters.
J. J. Lodder
2017-01-19 10:55:46 UTC
Post by Tony Hall
<Snip> - interesting thought - many thanks.
Post by J. J. Lodder
Post by Tony Hall
- Am I over thinking the whole thing and should simply stop
procrastinating and get on with the job at hand?
Is losing one picture in a million, perhaps once in a thousand years
a disaster to you? Would you ever detect it?
Nope. Currently with my system I have 6 copies of each and every camera
RAW file (spread over 3 drives) that is processed and delivered to
clients. Plus processed exported JPGs also on the 3 drives and on my web
server.
As you say, the possibility of irrevocably losing an important image is
pretty small.
I guess I should stop procrastinating and just dive in and get things
organised.
Thanks for your thoughts on these matters.
The problem is not losing an image, or even a pixel by bitrot.
It is finding out which of all those copies is not intact.

The problem with all those repeated transfers or backups is
that you may be replacing good files with broken ones,
until you no longer have a good copy left.
Comparing volumes for broken bits is extremely time-consuming.

If you are really worried about file integrity
parity files are the only solution.
Parity files are very fast by comparison with full scans.
It is not just that you can repair damage,
it is also that you can know positively that all is well.

Jan
Chris
2017-01-19 23:15:25 UTC
Post by Tony Hall
Many thanks everyone for the detailed replies, it's given me some food
for thought.
I think I've introduced some unintentional confusion.
Currently I store my photos (plus related files) on three 4TB external
1 - Working drive
2 - Backup of working drive
3 - Off-site backup of working drive
All 3 drives' content is currently identical.
How are they connected? FireWire, USB or Ethernet?
Post by Tony Hall
I'm running out of space on the 'working drive' (and consequently the
two backup drives) so I'm formulating a more thorough and efficient
storage/archiving strategy and have initially purchased an additional
8TB external drive.
Consequently I will be shifting a lot of files around between drives.
The connectivity will determine how fast that can happen; it could take
days in the worst-case scenario. Any solution needs to take that into account.
Post by Tony Hall
Normally I would simply copy the files using Finder drag & drop and then
manually delete unneeded files as necessary (ie. the Finder is not
'moving' any files).
It occurred to my tiny mind that copying TBs of files around using the
Finder may not be the safest method and I'd hate to introduce some
corruption, hence my poorly thought out question.
Given that it's going to take many hours to complete, using Finder could
introduce problems as you may lose track of which files have been copied
and which haven't.

Rsync can simplify that and will only copy files which don't already exist
at the destination.
Post by Tony Hall
So my questions are...
- Is the Finder prone to introducing unreported corruption or errors
during file copying?
Not the Finder itself, but things like USB connections can die during
very large, continuous data transfers.

Or finger trouble, or any number of unexpected issues.
Post by Tony Hall
- Is there a better method of copying large amounts of data? **
- Would an application like ChronoSync (only mentioned because I own a
currently unused copy) perform additional integrity checks while copying
files (maybe by analysing originals and their copies during the process
and thus be preferable to using the Finder)?
- What is a good application to scan hard drives for file corruption?
- Am I over thinking the whole thing and should simply stop
procrastinating and get on with the job at hand?
Not really. It is better to weigh up your options before you start than
after.
Post by Tony Hall
** People have already stated that rsync is the way to go - why?
It checks file integrity, can resume incomplete copies, can produce a log,
it's efficient and can do incremental backups.
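On the resuming point, for example: if a long run is interrupted,
re-running exactly the same command carries on where it left off,
because files that already arrived intact are simply skipped (the paths
are only examples):

  rsync -av /Volumes/Photos/ /Volumes/NewDrive/Photos/ > rsync.log 2>&1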