Your logic is mostly sound. You are on the right track with your train of thought:
1. Read a line into previous (a).
2. Read another line into current (b).
3. If previous and current have the same contents, go to step 2.
4. Print previous.
5. Move current to previous.
6. Go to step 2.
This still has some problems, however.
Unnecessary line-read
To start, consider this bit of code:
while(fgets(b,6000,stdin)!=NULL) {
    ...
    if(test==0) {
        fgets(b,6000,stdin);
    }
    else {
        printf("%s",a);
    }
    ...
}
If a and b have the same contents (test==0), you call fgets again without checking its return value, and then the loop condition fgets(b,6000,stdin)!=NULL reads yet another line. The line read inside the if statement is never compared against anything, so you end up moving an unexamined line from b to a. Since the loop condition already reads the next line and checks for failure, let the loop do the reading: drop the extra fgets and invert the equality test so that a is printed only if test!=0.
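As a rough sketch of that change, the loop could look like the following. The elided parts of your code are not shown in the question, so this assumes that test is 0 when the lines match and that b is copied into a at the end of every iteration. The function name loop_sketch and its parameters are hypothetical, and strcmp and strcpy stand in for your byte-by-byte loops.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: only the reworked loop, not the whole program.
 * `a` must already hold the previously read line when this is called. */
void loop_sketch(FILE *in, FILE *out, char *a, char *b, int size)
{
    /* The loop condition is the only place a line is read,
     * and its result is always checked. */
    while (fgets(b, size, in) != NULL) {
        int test = (strcmp(a, b) != 0);  /* assumed: 0 means "same contents" */

        if (test != 0) {
            fputs(a, out);               /* lines differ: print the previous one */
        }
        strcpy(a, b);                    /* assumed: current becomes previous */
    }
}
```

This removes the unchecked read, but it still never prints the final line, which is the next problem.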
Where's the last line?
Your logic also will not print the last line. Consider a file with 1 line. You read it, then fgets in the loop condition attempts to read another line, which fails because you're at the end of the file. There is no print statement outside the loop, so you never print the line.
Now what about a file with 2 lines that differ? You read the first line, then the last line, see they're different, and print the first line. Then you overwrite the first line's buffer with the last line. You fail to read another line because there aren't any more, and the last line is, again, not printed.
A first step toward fixing this is to replace the first (unchecked) fgets with a[0] = 0. That makes the first byte of a a null byte, which marks the end of the string, so a holds an empty string. An empty string never compares equal to a line you read, so test==1 and a is printed. Since a is empty, printing it outputs nothing. Things then continue as normal, with the contents of b being moved into a and another line being read.
Unique last line problem
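This leaves one problem: the last line (or the last run of identical lines) is still never printed, because a is only printed once a later, different line arrives. To fix this, just print b instead of a, so each line is printed as soon as it is found to differ from the previous one.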
The final recipe
1. Assign 0 to the first byte of previous (a[0]).
2. Read a line into current (b).
3. If previous and current have the same contents, go to step 2.
4. Print current.
5. Move current to previous.
6. Go to step 2.
As you can see, it's not much different from your existing logic; only steps 1 and 4 differ. It also ensures that all fgets calls are checked. If there are no lines in a file, nothing is printed. If there is only 1 line in a file, it is printed. If 2 lines differ, both are printed. If 2 lines are the same, the first is printed.
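The recipe above might be sketched in C roughly as follows. This is an illustration, not your exact code: the function name print_unique_lines and the BUF_SIZE macro are hypothetical, the 6000-byte buffer size is taken from your fgets calls, and strcmp and strcpy replace the manual comparison and copy loops.

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 6000   /* hypothetical name; size taken from the question */

/* Hypothetical helper: copy `in` to `out`, skipping any line that is
 * identical to the line immediately before it. */
void print_unique_lines(FILE *in, FILE *out)
{
    char previous[BUF_SIZE];
    char current[BUF_SIZE];

    previous[0] = '\0';                          /* step 1: start with an empty string */

    while (fgets(current, sizeof current, in) != NULL) {   /* step 2 */
        if (strcmp(previous, current) == 0) {    /* step 3: same contents, read again */
            continue;
        }
        fputs(current, out);                     /* step 4: print current */
        strcpy(previous, current);               /* step 5: move current to previous */
    }                                            /* step 6: back to step 2 */
}
```

Calling print_unique_lines(stdin, stdout) from main would give the filter behaviour described above. Both buffers have the same size, so the strcpy cannot overflow.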
Optional: optimizations
- Instead of checking all 6000 bytes, you can check only up to the first null byte in either string, since fgets will automatically add one to mark the end of the string.
- Faster still would be to add a break statement inside the if statement of your for loop. If a single byte doesn't match, the entire line is not a duplicate, so you can stop comparing early. That is a lot faster if, say, byte 10 already differs in two 1000-byte lines!
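Both optimizations could be combined in a comparison loop along these lines. This is a sketch under assumptions: your actual for loop is not shown in full, so the function name lines_differ and its parameters are hypothetical, and test is assumed to be 1 when the lines differ.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the strings differ, 0 if they match.
 * `size` is the buffer size (6000 in the question). */
int lines_differ(const char *a, const char *b, size_t size)
{
    int test = 0;

    for (size_t i = 0; i < size; i++) {
        if (a[i] != b[i]) {
            test = 1;   /* one mismatching byte is enough */
            break;      /* stop comparing early */
        }
        if (a[i] == '\0') {
            break;      /* both strings ended here: nothing left to compare */
        }
    }
    return test;
}
```

The standard strcmp function behaves the same way in both respects, so it is a reasonable replacement if you don't need to write the loop yourself.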