Resurrect a backup corrupted by FTP (ASCII) [message #312852]
Wed, 09 April 2008 22:55
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
It's a reasonably common occurrence to corrupt a binary file when pulling it from Unix to Windows with the Windows FTP client.
This happens because the Windows FTP client (and some others) defaults to ASCII mode, which translates each Unix linefeed character (CHR(10)) to Carriage Return / Linefeed ( CHR(13)+CHR(10) ).
Typically, you don't know you've done it until you try to use the file. Not so tragic if it's a JPEG photo of your mum, not so good if it's a database backup.
This happened to us (backup, not photo of Mum).
Since the action of ASCII FTP is deterministic, it is actually reversible. Sadly you cannot just FTP it back in ASCII mode - I think that just blindly strips ANY CHR(13), including ones that occur naturally in a binary file. But below is a nice Perl script that does the job.
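#!/usr/local/bin/perl -w
# File: unftp.pl
# Description: Restores a binary file corrupted by ASCII FTP
# from Unix to Windoze
# Syntax: unftp.pl infilename > outfilename
use strict;
my $chr;
my $cr = chr(13);
my $lf = chr(10);
my $crlf = chr(13) . chr(10);
my $iscr = 0;    # true when the previous block ended in a CR we held back
my $fname = shift @ARGV or die "Expecting file name";
open (my $inp, '<', $fname) or die $!;
binmode $inp;    # raw bytes: stop Perl translating line ends itself (matters on Windows)
binmode STDOUT;
while (read($inp, $chr, 65536) ) {
    # Re-attach a CR held back from the previous block
    $chr = $cr . $chr if $iscr;
    $iscr = 0;
    # A trailing CR may be the first half of a CRLF split across blocks: hold it back
    if (substr($chr, length($chr)-1, 1) eq $cr) {
        $chr = substr($chr, 0, length($chr)-1);
        $iscr = 1;
    }
    $chr =~ s/$crlf/$lf/g;
    print $chr;
}
# A CR still held back at end of file was a real data byte: write it out
print $cr if $iscr;
close $inp;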
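To see why swapping each CR LF pair back to LF works where stripping every CR does not, here is a minimal Python sketch (separate from the Perl script above). The function names are made up for illustration, and it assumes the FTP client did nothing except turn each LF into CR LF.

```python
CR, LF = b"\r", b"\n"

def simulate_ascii_ftp(data: bytes) -> bytes:
    """Assumed model of the Unix-to-Windows ASCII transfer: every LF becomes CR LF."""
    return data.replace(LF, CR + LF)

def undo_ascii_ftp(data: bytes) -> bytes:
    """Reverse the transfer by turning each CR LF pair back into a single LF."""
    return data.replace(CR + LF, LF)

def strip_all_cr(data: bytes) -> bytes:
    """The tempting shortcut: delete every CR, inserted or not."""
    return data.replace(CR, b"")

# Sample "binary" data holding a bare LF, a bare CR and a genuine CR LF pair.
original = b"\x00\x01\n\x02\r\x03\r\n\x04"
corrupted = simulate_ascii_ftp(original)

print(undo_ascii_ftp(corrupted) == original)   # True
print(strip_all_cr(corrupted) == original)     # False: the genuine CRs are gone
```

The pair replacement works because, under this assumption, every LF in the corrupted file has exactly one inserted CR in front of it, so removing one CR per LF restores the original bytes.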
Re: Resurrect a backup corrupted by FTP (ASCII) [message #313155 is a reply to message #312969]
Thu, 10 April 2008 22:20
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
Glad to help - hope you never need it.
We had a tense time for a while. We're in early design phase - too early to acquire a dev environment - so we Frankensteined a PC and made a "sandpit" server for a proof-of-concept. As part of it, we developed a bunch of prototypes that would prove valuable when we came to Dev.
Since we're a bunch of undisciplined techos, we installed 11g to check it out. But when it came time to get serious, we had to blow it away and revert to 10g.
Initially we thought it was not a corruption but a versioning problem - none of us are DBAs (hence the lax backup strategy in the first place!). So we got a DBA in to reinstall 11g properly and confirmed that the backups were truly corrupt.
Since every backup was affected, FTP seemed the likely culprit.
Luckily the Unix->Win ASCII transform is deterministic and additive. We would have been in real trouble the other way around, where the transformation is destructive.
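A small Python sketch of why the direction matters, assuming the Windows-to-Unix ASCII transfer simply turns every CR LF into LF (the function names are illustrative only). Two different originals collapse to the same output, so nothing can tell them apart afterwards.

```python
def unix_to_win(data: bytes) -> bytes:
    """Additive direction (assumed model): each LF gains a CR in front of it."""
    return data.replace(b"\n", b"\r\n")

def win_to_unix(data: bytes) -> bytes:
    """Destructive direction (assumed model): each CR LF pair loses its CR."""
    return data.replace(b"\r\n", b"\n")

a = b"\x10\r\n\x20"   # contains a genuine CR LF pair
b = b"\x10\n\x20"     # contains a bare LF

# Additive: distinct inputs stay distinct, so the change can be undone.
print(unix_to_win(a) != unix_to_win(b))    # True
print(win_to_unix(unix_to_win(a)) == a)    # True

# Destructive: both inputs give the same bytes, so the original is unrecoverable.
print(win_to_unix(a) == win_to_unix(b))    # True
```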
Ross Leishman
Re: Resurrect a backup corrupted by FTP (ASCII) [message #314070 is a reply to message #313155]
Tue, 15 April 2008 14:35
andrew again
Messages: 2577 Registered: March 2000
Couple of other ways (probably not all suitable for BINARY files)...
-- convert ^M (DOS style to Unix style)
in.txt
-------
line1^M
line2^M
line3^M
/tmp>>col < in.txt > out.txt
/tmp>>od -c in.txt
0000000 l i n e 1 \r \n l i n e 2 \r \n l i
0000020 n e 3 \r \n
0000025
/tmp>>od -c out.txt
0000000 l i n e 1 \n l i n e 2 \n l i n e
0000020 3 \n
0000022
perl -pi -e "s:^V^M::g" <filenames>
cat <filename1> | tr -d "^V^M" > <newfile>
sed -e "s/^V^M//" <filename> > <output filename>
On HP-UX try dos2ux, or dos2unix on Solaris.
Re: Resurrect a backup corrupted by FTP (ASCII) [message #314510 is a reply to message #314480]
Thu, 17 April 2008 00:22
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
The problem with tr -d "\015" is that it removes every CR character, including those that occur naturally in a binary file. The same is true of all of @AA's solutions (I think - didn't test).
I'm not sure about the Perl one. I thought the s// operator required two arguments. This syntax failed on my Linux box.
perl -p -i -e 's/' < b > c
Substitution pattern not terminated at -e line 1.
Ross Leishman
Re: Resurrect a backup corrupted by FTP (ASCII) [message #390495 is a reply to message #378928]
Fri, 06 March 2009 10:43
gregbahun
Messages: 1 Registered: March 2009 Location: McMaster University
dgdot - THANK YOU!!!!!!!!!!!
to reiterate what dgdot said:
Hopefully this helps the next person searching for help on this issue: Simple solution. You'll need a decent hex editor. I used Hex Editor Neo. It has a good batch find/replace feature.
Search for the following string (hex bytes): "0d 0a" and replace it with "0a". Replace every instance in your file.
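Before replacing anything, it may be worth checking that the file really has the ASCII-transfer signature. A rough Python sketch, assuming the only damage is LF turned into CR LF; `count_lf` and `looks_ascii_mangled` are hypothetical helper names. The check is only a heuristic: a file with no LF bytes at all cannot be judged, and a clean file could pass by coincidence.

```python
def count_lf(data: bytes) -> tuple:
    """Return (total LF bytes, LF bytes that are preceded by a CR)."""
    total = data.count(b"\n")
    paired = data.count(b"\r\n")
    return total, paired

def looks_ascii_mangled(data: bytes) -> bool:
    """True when there is at least one LF and no LF stands without a CR before it."""
    total, paired = count_lf(data)
    return total > 0 and total == paired

clean = b"\x7f\n\x00\r\n\x01"
mangled = clean.replace(b"\n", b"\r\n")

print(looks_ascii_mangled(clean))    # False: the first LF has no CR in front
print(looks_ascii_mangled(mangled))  # True
print(count_lf(mangled))             # (2, 2): a repair would remove 2 bytes
```

For a real file, the bytes would be read in binary mode, for example with `open(path, "rb").read()`, where `path` is whatever file you suspect.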
I'm a chemistry grad student who inadvertently transferred my raw NMR data via FTP and didn't realize the FTP client transferred the binary data as ASCII files. I was trying various NMR programs to process the data, and at first the programs all sucked. But it was pointed out that the problem was the binary to ASCII conversion. I applied the method above suggested by dgdot and it worked like a charm. The only thing - there are text files in the folders from the NMR data. So I did a search for those txt files in those folders and made them all read only. After this, Hex Editor Neo did not change those files - but it did change everything else (in a bulk (or batch) method) and I am now able to open those NMR data files, process them and get good spectra.
holy crap...thank you!!!!!!!!!!!!!