Hi,
As per my requirement, I need to take the difference between two big files (around 6.5 GB) and write the difference to an output file without any line numbers or '<' or '>' in front of each line.
As the diff command won't work for big files, I tried to use bdiff instead.
I am getting an incorrect number of records.
I have done the following test:
I have a .dat file with a few million records in it, and to generate another file I used sed '1,100d' oldfile > newfile
So I am using bdiff oldfile newfile | sed -n '/^</p' > DIFF.DAT
The output (DIFF.DAT) should have 100 records in it, but I am getting an output with several records in it.
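For reference, the test above can be reproduced at small scale with plain diff. This is only a sketch: the generated sample data and the 1000-record size are placeholders, not the real file. It also uses sed's substitute form, so the leading "< " is removed rather than kept in the output.

```shell
# Small-scale reproduction; the sample data and sizes are placeholders.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a sample "old" file, then drop its first 100 records to get "new".
awk 'BEGIN { for (i = 1; i <= 1000; i++) print "record", i }' > oldfile
sed '1,100d' oldfile > newfile

# diff marks lines found only in oldfile with "< "; keep just those lines
# and strip the marker so DIFF.DAT holds the bare records.
diff oldfile newfile | sed -n 's/^< //p' > DIFF.DAT

# Count the extracted records (100 in this sample).
wc -l < DIFF.DAT
```

On a file this small diff handles the whole input in one pass, so the count is a useful baseline to compare against what bdiff reports for the same test.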
Could anyone help me out with this?
Thanks
Sue
Comparing Two Huge Files
#2
Posted 15 September 2008 - 05:42 AM
Try using the fold command.
This will break your file into manageable chunks.
Now you can process it easily.
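A note on the tool: fold only wraps long lines at a given width, while the standard utility for cutting a file into pieces is split. A minimal sketch of the chunking step, assuming line-oriented records; the sample file, the 250-line chunk size and the old_chunk_ prefix are placeholders chosen for illustration:

```shell
# Chunking sketch; file name, chunk size and prefix are placeholders.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input standing in for the real data file.
awk 'BEGIN { for (i = 1; i <= 1000; i++) print "record", i }' > oldfile

# Cut oldfile into pieces of 250 lines each: old_chunk_aa, old_chunk_ab, ...
split -l 250 oldfile old_chunk_

# List the pieces and their line counts.
wc -l old_chunk_*
```

Splitting by line count (-l) keeps every record whole, which a split by byte count would not guarantee.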
Vibhor Kumar Agarwal
#3
Posted 15 September 2008 - 12:41 PM
If I divide the file into chunks, how can I compare the individual chunks to another file?
Please let me know the logic.
Thanks
Sue
#4
Posted 16 September 2008 - 05:45 AM
You will be applying fold to both files.
Break them into chunks of equal size.
Result:
You won't be comparing one file of 1 GB, but ten files of 100 MB.
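The pairwise comparison could look roughly like this. It is a sketch under assumptions: split stands in for the chunking tool, and the sample files, the 250-line chunk size and the chunk prefixes are placeholders.

```shell
# Pairwise chunk comparison; all names and sizes are placeholders.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two sample files that differ in a single record.
awk 'BEGIN { for (i = 1; i <= 1000; i++) print "record", i }' > oldfile
sed 's/^record 620$/record 620 changed/' oldfile > newfile

# Same chunk size for both files, so chunk N of one lines up with chunk N of the other.
split -l 250 oldfile old_chunk_
split -l 250 newfile new_chunk_

# Compare each pair and collect the records that appear only in the old chunk.
: > DIFF.DAT
for old in old_chunk_*; do
    suffix=${old#old_chunk_}
    new="new_chunk_$suffix"
    if [ -f "$new" ]; then
        diff "$old" "$new" | sed -n 's/^< //p' >> DIFF.DAT
    else
        # No matching chunk on the new side: the whole old chunk is unmatched.
        cat "$old" >> DIFF.DAT
    fi
done

cat DIFF.DAT
```

One caveat: the chunks only line up when records are changed in place. If lines are inserted or deleted, as in the sed '1,100d' test, every later chunk is shifted relative to its partner and the pairwise diff reports differences that are not real.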
Vibhor Kumar Agarwal