Friday, December 21, 2007

Merge Multiple Files and Sort Simultaneously

The other day I was gathering historical files, each populated with multiple logins. I was trying to determine how many user logins were created over the past three years. Obtaining a unique login count meant merging the files, sorting the combined list, and removing the duplicates in one pass.

The actual run was much larger than the example run below.

# ls
user1.txt user2.txt user3.txt user4.txt group1.txt group2.txt
# more user[1-4].txt
::::::::::::::
user1.txt
::::::::::::::
user1
user2
user3
tom
mike
unix
sysad
::::::::::::::
user2.txt
::::::::::::::
ott
bird
user1
mike
sysad
paul
user2
user3
bob
::::::::::::::
user3.txt
::::::::::::::
mike
unix
blogger
rick
senior
chopper
paulie
mikey
tom
vince
cody
::::::::::::::
user4.txt
::::::::::::::
vince
mikey
paulie
senior
cody

# wc -l user[1-4].txt
7 user1.txt
9 user2.txt
11 user3.txt
5 user4.txt
32 total

This command produces the merged, sorted, and de-duplicated list:
# sort user[1-4].txt | uniq

There are 19 unique logins in this example:
# sort user[1-4].txt | uniq | wc -l
19
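
For anyone who wants to reproduce the count, here is a minimal sketch that rebuilds the four sample files in a scratch directory and runs the same pipeline. The scratch directory and the variable names (workdir, unique_count, total_count) are my own placeholders, not part of the original run.

```shell
#!/bin/sh
# Rebuild the sample files in a throwaway directory (hypothetical location).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' user1 user2 user3 tom mike unix sysad > user1.txt
printf '%s\n' ott bird user1 mike sysad paul user2 user3 bob > user2.txt
printf '%s\n' mike unix blogger rick senior chopper paulie mikey tom vince cody > user3.txt
printf '%s\n' vince mikey paulie senior cody > user4.txt

# sort reads all four files as one stream; uniq then drops adjacent duplicates.
# tr strips the padding that some wc implementations put in front of the number.
unique_count=$(sort user[1-4].txt | uniq | wc -l | tr -d ' ')
total_count=$(cat user[1-4].txt | wc -l | tr -d ' ')

echo "total lines:   $total_count"
echo "unique logins: $unique_count"
```

With the sample data above this reports 32 total lines and 19 unique logins.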

1 comment:

UX-admin said...

Sorting? Here's a little tip to save time and CPU cycles:

sort -u user[1-4].txt -o results.txt

The correct interpretation of the "[1-4]" pattern (a shell glob rather than a regex) will of course depend on the shell.
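
A small sketch to check that the one-step form gives the same result as the sort | uniq pipeline. The file names a.txt, b.txt, piped.txt and single.txt are made up for the illustration, and LC_ALL=C is set only so the ordering is the same on every system. I put -o ahead of the file names here, which should be accepted by any POSIX sort.

```shell
#!/bin/sh
# Two tiny hypothetical input files with overlapping logins.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' tom mike unix > a.txt
printf '%s\n' mike bob tom > b.txt

# Two processes: sort, then uniq.
LC_ALL=C sort a.txt b.txt | uniq > piped.txt

# One process: sort removes duplicates itself and writes the result file.
LC_ALL=C sort -u -o single.txt a.txt b.txt

# cmp -s is silent and returns success only when the files are identical.
if cmp -s piped.txt single.txt; then same=yes; else same=no; fi
echo "identical output: $same"
cat single.txt
```

Both result files should contain bob, mike, tom and unix, one per line, so the saving is the extra uniq process and the pipe rather than any change in output.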

Did you know that it is possible to simultaneously read and write to the same file with sort(1)?

sort file.txt -o file.txt

Enjoy.
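
To see why the -o form matters, here is a hedged sketch that compares it with plain shell redirection on a throwaway copy. The names file.txt and redirected.txt and the three sample logins are placeholders. The assumption being illustrated is that sort reads all of its input before it opens the -o target, while the shell truncates a redirection target before sort even starts.

```shell
#!/bin/sh
# Work on scratch copies so nothing real gets clobbered.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' tom mike bob > file.txt
cp file.txt redirected.txt

# Safe: sort finishes reading file.txt before it writes the -o target.
LC_ALL=C sort -o file.txt file.txt

# Unsafe: the shell empties redirected.txt first, so sort reads nothing.
LC_ALL=C sort redirected.txt > redirected.txt

inplace_lines=$(wc -l < file.txt | tr -d ' ')
redirected_lines=$(wc -l < redirected.txt | tr -d ' ')
echo "sort -o kept $inplace_lines lines; redirection kept $redirected_lines"
```

The -o copy keeps all 3 lines in sorted order, while the redirected copy ends up empty.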