split file

Status
Not open for further replies.

AndrisS

IS-IT--Management
Oct 12, 2006
20
LV
Good day,

I need your help.

I have to split a file into output files of 30,000 records each.
For example:

The input file (transaction.txt) has 150,000 records of bank account transactions.
The C code has to split this file and create output files of 30,000 records each
(transaction1.txt, transaction2.txt, transaction3.txt, etc.), so that the 150,000 records in transaction.txt end up spread over several files of 30,000 records.
Does anybody know how to do this?
What would the C code for this operation look like?

Thank you very much.
 
Here is what the Account.txt file looks like.

07004262890147655236 74105616281002275393260000000001007000000002000428000000002000428LHZB ATMH0418 - Limbaz Limbazi LV 6011 1009N3049252 2 6282
0701 000000 00002 5904180 ATMH0418000000000000 000000 00 000000000
07004389380039629638 74105616281002275394193000000001007000000015000428000000015000428LHZB ATMH0401 - Elizab Riga LV 6011 1009N2084002 2 6282
0701 000000 00002 5904016 ATMH0401000000000000 000000 00 000000000
07004262890148616575 74105616281002275257663000000001007000000002000428000000002000428LHZB ATMH0422 - Aizkra Aizkraukle LV 6011 1009N2201432 2 6282
0701 000000 00002 5904222 ATMH0422000000000000 000000 00 000000000
07004262890146321046 74105616281002275252417000000001007000000010000428000000010000428LHZB ATMH0433 - Kudras Riga LV 6011 1009N3462742 2 6282
0701 000000 00002 5904339 ATMH0433000000000000 000000 00 000000000
07004262890146321046 74105616281002275252409000000001007000000002000428000000002000428LHZB ATMH0433 - Kudras Riga LV 6011 1009N3369872 2 6282
0701 000000 00002 5904339 ATMH0433000000000000 000000 00 000000000
07004262890144868220 74105616281002275252391000000001007000000002000428000000002000428LHZB ATMH0433 - Kudras Riga LV 6011 1009N3123242 2 6282
0701 000000 00002 5904339 ATMH0433000000000000 000000 00 000000000
07004262890147233141 74105616281002275252342000000001007000000001000428000000001000428LHZB ATMH0433 - Kudras Riga LV 6011 1009N2373252 2 6282
0701 000000 00002 5904339 ATMH0433000000000000 000000 00 000000000
07004262890146395032 74105616281002275252300000000001007000000001000428000000001000428LHZB ATMH0433 - Kudras Riga LV 6011 1009N2273262 2 6282
0701 000000 00002 5904339 ATMH0433000000000000 000000 00 000000000
07004262890146802094 74105616281002275252292000000001007000000005000428000000005000428LHZB ATMH0433 - Kudras Riga LV 6011 1009N2232802 2 6282
0701 000000 00002 5904339 ATMH0433000000000000 000000 00 000000000
07004262890144868220 74105616281002275252326000000001007000000003000428000000003000428LHZB ATMH0433 - Kudras Riga LV 6011 1009N2325232 2 6282
0701 000000 00002 5904339 ATMH0433000000000000 000000 00 000000000
07004389380039446967 74105616281002275252334000000001007000000005000428000000005000428LHZB ATMH0433 - Kudras Riga LV 6011 1009N2355882 2 6282
0701 000000 00002
 
The records appear to be fixed length. Is that right?

If they are fixed length, then the easiest and most efficient way would probably be to read record_size * 30000 bytes and write that to a file, then keep doing that until you reach the end of the input file.
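A minimal sketch of that block-copy idea, assuming the records really are fixed length. The function name split_fixed, the prefix argument and the <prefix>N.txt naming are made up for illustration; record_size has to be measured from the real file and must include the line terminator bytes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: split input_path into <prefix>1.txt, <prefix>2.txt, ...
   Each output file gets records_per_file records of record_size bytes,
   except the last one, which gets whatever is left.
   Returns the number of files written, or -1 on error. */
int split_fixed(const char *input_path, const char *prefix,
                size_t record_size, size_t records_per_file)
{
    if (record_size == 0 || records_per_file == 0)
        return -1;

    FILE *in = fopen(input_path, "rb");   /* binary mode: copy bytes untouched */
    if (!in)
        return -1;

    size_t chunk_size = record_size * records_per_file;
    char *buf = malloc(chunk_size);
    if (!buf) {
        fclose(in);
        return -1;
    }

    int file_count = 0;
    size_t n;
    while ((n = fread(buf, 1, chunk_size, in)) > 0) {
        char name[FILENAME_MAX];
        snprintf(name, sizeof name, "%s%d.txt", prefix, file_count + 1);

        FILE *out = fopen(name, "wb");
        if (!out) {
            free(buf);
            fclose(in);
            return -1;
        }
        size_t written = fwrite(buf, 1, n, out);
        if (fclose(out) != 0 || written != n) {
            free(buf);
            fclose(in);
            return -1;
        }
        file_count++;
    }

    int failed = ferror(in);              /* distinguish read error from EOF */
    free(buf);
    fclose(in);
    return failed ? -1 : file_count;
}
```

Calling split_fixed("transaction.txt", "transaction", record_size, 30000) would then produce transaction1.txt, transaction2.txt and so on. If the records turn out not to be fixed length, this cuts files in the middle of a record, so a line-based approach is safer in that case.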
 
[tt]
$ split --help
Usage: split [OPTION] [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is `x'. With no INPUT, or when INPUT
is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   use suffixes of length N (default 2)
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of lines per output file
  -d, --numeric-suffixes  use numeric suffixes instead of alphabetic
  -l, --lines=NUMBER      put NUMBER lines per output file
      --verbose           print a diagnostic to standard error just
                            before each output file is opened
      --help              display this help and exit
      --version           output version information and exit

SIZE may have a multiplier suffix: b for 512, k for 1K, m for 1 Meg.

Report bugs to <bug-coreutils@gnu.org>.[/tt]

But if it's your homework to do it in C, then start with fgets() and fputs().
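A rough sketch of the fgets()/fputs() route, assuming one record per line (if a record in the real file spans two lines, double the count). The name split_lines, the 4096-byte buffer and the <prefix>N.txt naming are only illustrative choices.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy input_path line by line into <prefix>1.txt,
   <prefix>2.txt, ..., starting a new output file after every
   lines_per_file complete lines.
   Returns the number of files written, or -1 on error. */
int split_lines(const char *input_path, const char *prefix, long lines_per_file)
{
    if (lines_per_file <= 0)
        return -1;

    FILE *in = fopen(input_path, "r");
    if (!in)
        return -1;

    char line[4096];
    FILE *out = NULL;
    int file_count = 0;
    long lines_in_file = 0;

    while (fgets(line, sizeof line, in)) {
        if (!out) {
            /* Open the next output file only when there is data for it. */
            char name[FILENAME_MAX];
            snprintf(name, sizeof name, "%s%d.txt", prefix, file_count + 1);
            out = fopen(name, "w");
            if (!out) {
                fclose(in);
                return -1;
            }
            file_count++;
            lines_in_file = 0;
        }

        if (fputs(line, out) == EOF) {
            fclose(out);
            fclose(in);
            return -1;
        }

        /* fgets() may return only part of a very long line; count the
           line once its terminating newline has been seen. */
        size_t len = strlen(line);
        if (len > 0 && line[len - 1] == '\n') {
            if (++lines_in_file == lines_per_file) {
                fclose(out);
                out = NULL;
            }
        }
    }

    if (out)
        fclose(out);
    fclose(in);
    return file_count;
}
```

With the file from the question, split_lines("transaction.txt", "transaction", 30000) would write transaction1.txt through transaction5.txt for 150,000 one-line records.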
