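How do I extract a single chunk of bytes from within a file?

On a Linux desktop (RHEL4), I want to extract a range of bytes (typically fewer than 1000) from within a large file (> 1 GB). I know the offset into the file and the size of the chunk.

I can write code to do this, but is there a command-line solution?

Ideally, something like: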

magicprogram --offset 102567 --size 253 < input.binary > output.binary

Try dd:

dd skip=102567 count=253 if=input.binary of=output.binary bs=1

The option bs=1 sets the block size, making dd read and write one byte at a time. The default block size is 512 bytes.

The value of bs also affects the behavior of skip and count since the numbers in skip and count are the numbers of blocks that dd will skip and read/write, respectively.
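To make the effect of bs on skip and count concrete, here is a small sketch on a ten-byte sample file. The temporary file and the variable names are just for this demo, not part of the original question.

```shell
# Create a 10-byte sample file (the temporary path is only for this demo).
sample=$(mktemp)
printf '0123456789' > "$sample"

# bs=1: skip and count are in bytes, so this skips 3 bytes and copies 4.
one_byte_blocks=$(dd if="$sample" bs=1 skip=3 count=4 2>/dev/null)

# bs=2: skip and count are in 2-byte blocks, so this skips 6 bytes and copies 4.
two_byte_blocks=$(dd if="$sample" bs=2 skip=3 count=2 2>/dev/null)

echo "$one_byte_blocks"   # 3456
echo "$two_byte_blocks"   # 6789
rm -f "$sample"
```

The same skip=3 lands on a different byte in each call because the unit is always one block of bs bytes.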

The dd command can do all of this. Look at the skip parameter (offset into the input) and the seek parameter (offset into the output) as part of the call; both are counted in blocks of bs.

This is an old question, but I'd like to add another version of the dd command that is better-suited for large chunks of bytes:

dd if=input.binary of=output.binary skip=$offset count=$bytes iflag=skip_bytes,count_bytes

where $offset and $bytes are numbers in byte units.

The difference from Thomas's accepted answer is that bs=1 does not appear here. bs=1 sets the input and output block size to 1 byte, which makes dd terribly slow when the number of bytes to extract is large.

This means we leave the block size (bs) at its default of 512 bytes. With iflag=skip_bytes,count_bytes, we tell dd to treat the values given to skip and count as byte counts instead of block counts.
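If you use this often, it can be wrapped in a small shell function. The sketch below assumes a GNU dd recent enough to know the skip_bytes and count_bytes flags (older coreutils releases do not have them), and the function name extract_bytes is made up for the example.

```shell
# extract_bytes FILE OFFSET SIZE: write SIZE bytes of FILE, starting at the
# zero-based byte OFFSET, to standard output. The function name is made up
# for this example; it assumes GNU dd with skip_bytes/count_bytes support.
extract_bytes() {
    dd if="$1" skip="$2" count="$3" iflag=skip_bytes,count_bytes 2>/dev/null
}

# Demo on a small temporary file.
demo=$(mktemp)
printf 'abcdefghijklmnopqrstuvwxyz' > "$demo"
chunk=$(extract_bytes "$demo" 10 5)
echo "$chunk"   # klmno
rm -f "$demo"
```

For the numbers in the question this would be called as extract_bytes input.binary 102567 253 > output.binary.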

head -c + tail -c

Not sure how it compares to dd in efficiency, but it is fun:

printf "123456789" | tail -c+2 | head -c3

picks 3 bytes, starting at the 2nd one:

234
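For a file and a zero-based offset, the same idea might look like the sketch below. Note that tail -c +N counts from 1, so the offset needs a +1. The function name slice_bytes is hypothetical, and head -c is assumed to be available (it is in GNU coreutils).

```shell
# slice_bytes FILE OFFSET SIZE: print SIZE bytes of FILE starting at the
# zero-based byte OFFSET. tail -c +N is 1-based, hence the OFFSET + 1.
slice_bytes() {
    tail -c "+$(( $2 + 1 ))" "$1" | head -c "$3"
}

# Demo on a small temporary file.
demo=$(mktemp)
printf '123456789' > "$demo"
picked=$(slice_bytes "$demo" 1 3)
echo "$picked"   # 234
rm -f "$demo"
```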


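Even faster, as long as the offset is a whole multiple of the length. skip is counted in blocks of bs, not in bytes, so it has to be the offset divided by the length:

dd bs=<req len> count=1 skip=<req offset / req len> if=input.binary of=output.binary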

I had the same problem when trying to cut parts out of a raw disk image. dd with bs=1 is unusably slow, so I wrote a simple C program for the task.

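// usage:
//  ./cutfile srcfile destfile offset length
//  ./cutfile my.image movie.avi 4524 20412452
// compile, presuming it is saved as cutfile.c:
//  gcc cutfile.c -o cutfile -std=c11 -pedantic -W -Wall -Werror
// note: offset and length are parsed as long, so a build where long is
// 32 bits cannot address positions beyond 2 GiB.
#include <stdio.h>
#include <stdlib.h>

#define BLOCKSIZE (16 * 512)  // copy buffer size in bytes; can adjust

int main(int argc, char *argv[])
{
    if (argc != 5) {
        fprintf(stderr, "error, need 4 arguments!\n");
        return 1;
    }

    static unsigned char buffer[BLOCKSIZE];
    long offset = atol(argv[3]);
    long length = atol(argv[4]);
    if (offset < 0 || length < 0) {
        fprintf(stderr, "error, offset and length must not be negative\n");
        return 1;
    }

    FILE *f = fopen(argv[1], "rb");
    if (f == NULL) {
        perror("cannot open source file");
        return 1;
    }
    FILE *fout = fopen(argv[2], "wb");
    if (fout == NULL) {
        perror("cannot open destination file");
        fclose(f);
        return 1;
    }
    if (fseek(f, offset, SEEK_SET) != 0) {
        perror("cannot seek");
        fclose(fout);
        fclose(f);
        return 1;
    }

    int status = 0;
    while (length > 0) {
        // copy at most one buffer per iteration
        size_t want = length > BLOCKSIZE ? (size_t)BLOCKSIZE : (size_t)length;
        size_t got = fread(buffer, 1, want, f);
        if (got == 0)
            break;  // end of file or read error: stop early
        if (fwrite(buffer, 1, got, fout) != got) {
            perror("write failed");
            status = 1;
            break;
        }
        length -= (long)got;
    }

    if (fclose(fout) != 0) {
        perror("close failed");
        status = 1;
    }
    fclose(f);
    return status;
}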