How to split a file alternating the prefix used for the output files?

I have a large file. I would like to split every 80 lines into two kinds of files of 40 lines each: the first kind should get sequential names starting at X_0001, and the second kind sequential names starting at Y_0001.

I used this command but it can only split into pieces with the same prefix:

 split -d -l 40 -a 4  inputfile X_  


With GNU split you could use the --filter option:

split --numeric-suffixes=0001 -l 80 -a 4 \
--filter='sed -n -e "1,40w $FILE" -e "41,80w ${FILE/X/Y}"' infile X_

This splits the file into 80-line pieces and pipes each piece to sed, which writes the first 40 lines to $FILE (the piece name that split sets for the filter, here X_0001, X_0002 and so on; see man split) and the remaining lines to ${FILE/X/Y}, which is the same name with X replaced by Y. Note that ${FILE/X/Y} is a bash-style substitution, so the filter has to run in a shell that supports it.
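As a hedged variant of the command above, the sketch below uses only POSIX parameter expansion (Y${FILE#X} instead of ${FILE/X/Y}), so it should not matter which POSIX-style shell split starts for the filter. The sample input built with seq and the name infile are assumptions for illustration; run it in an empty scratch directory.

```shell
#!/bin/sh
# Sample input for illustration: 200 numbered lines.
seq 1 200 > infile

# GNU split: 80-line chunks, each piped to sed.
# Single quotes keep $FILE literal so the shell running the filter expands it.
# ${FILE#X} strips the leading X, so Y${FILE#X} turns X_0001 into Y_0001.
split --numeric-suffixes=1 -l 80 -a 4 \
  --filter='sed -n -e "1,40w $FILE" -e "41,80w Y${FILE#X}"' infile X_

# sed creates its w files before reading any input, so a final chunk of
# 40 lines or fewer leaves an empty Y_ file behind; remove such leftovers.
for f in Y_*; do
  [ -s "$f" ] || rm -f -- "$f"
done
```

With the 200-line sample, the last chunk holds only 40 lines, so X_0003 is written and the empty Y_0003 is removed by the cleanup loop.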

Since the requirement has changed and you only need to split into pieces with alternating names, you could also use awk:

awk 'BEGIN{c=1;p="X"}
{close(fn);fn=sprintf("%s_%04d", p, c);print >> fn}
NR%40==0{p="Y"}NR%80==0{p="X";c++}' file1

This builds the piece name from two variables, a prefix and a counter. Every 40 lines the prefix changes to Y; every 80 lines it changes back to X and the counter is incremented.
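The same prefix-and-counter idea can also be sketched so that the output file is switched only at the start of each 40-line block rather than closed and reopened for every line. This is an illustrative variant, not the command above: the variable n, the helper name blk and the seq-generated file1 are assumptions. Because it uses > instead of >>, leftover pieces from an earlier run are overwritten rather than appended to.

```shell
#!/bin/sh
# Sample input for illustration: 200 numbered lines.
seq 1 200 > file1

awk -v n=40 '
  # On the first line of every n-line block, switch to the next output file.
  (NR - 1) % n == 0 {
    if (fn != "") close(fn)
    blk = int((NR - 1) / n)        # 0-based block index
    # Even blocks go to X, odd blocks to Y; each X/Y pair shares a number.
    fn = sprintf("%s_%04d", (blk % 2 ? "Y" : "X"), int(blk / 2) + 1)
  }
  { print > fn }
' file1
```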

One way is to use split and rename the files afterwards.
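A minimal sketch of that split-then-rename idea, assuming GNU split (for -d) and a temporary prefix part_ chosen only for illustration: the file is cut into plain 40-line pieces, which are then renamed in order so that they alternate between X and Y.

```shell
#!/bin/sh
# Sample input for illustration: 200 numbered lines.
seq 1 200 > inputfile

# Step 1: plain 40-line pieces under a temporary prefix.
split -d -l 40 -a 4 inputfile part_

# Step 2: rename in glob order (fixed-width suffixes sort numerically).
# Pieces 0, 2, 4, ... become X_0001, X_0002, ...; pieces 1, 3, ... become Y_.
i=0
for f in part_[0-9]*; do
  if [ $((i % 2)) -eq 0 ]; then prefix=X; else prefix=Y; fi
  mv -- "$f" "$(printf '%s_%04d' "$prefix" $((i / 2 + 1)))"
  i=$((i + 1))
done
```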

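Given that there are only two parts, you could first split into 80-line pieces named whole_0001, whole_0002 and so on (for example with split --numeric-suffixes=1 -l 80 -a 4 inputfile whole_) and then extract the two halves of each piece independently with head and tail.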

for x in whole_[0-9]*; do
  head -n 40 <"$x" >"X${x#whole}"
  tail -n +41 <"$x" >"Y${x#whole}"
done

But the simplest is probably to call awk. You can use the variables FILENAME to get the name of the current input file and FNR to get the current line number in the current input file. Awk's redirection automatically takes care of opening files. You should close files explicitly if you use a lot of different ones, otherwise you might run into a limit on open files.

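awk '
  # sub() edits its third argument in place and returns a count, so copy FILENAME first
  FNR==1 { close(out); out = FILENAME; sub(/^whole/, "X", out) }
  FNR==41 { close(out); out = FILENAME; sub(/^whole/, "Y", out) }
  { print >out }
' whole_[0-9]*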

Category: text processing Time: 2016-07-28 Views: 1
