Title : How to split a file into small parts
Author: Solène
Date : 21 March 2021
Tags : openbsd unix
# Introduction
Today I will present the userland program "split" that is used to split a single file into smaller files.
OpenBSD split(1) manual
# Use case
Split creates several smaller files from a single file. The original file can be recreated by running the command cat on all the small files, in the correct order.
There are several use cases for this:
- store a single file (like a backup) on multiple media (floppies, 700 MB CDs, DVDs, etc.)
- parallelize the processing of a file, for example: split a huge log file into small parts and run the analysis on each part
- distribute a file across a few people (I have no idea about the use but I like the idea)
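The log analysis use case can be sketched as follows. This is only an illustration under assumptions: the file name huge.log, the part_ and count. prefixes and the generated log content are all made up for the example, and the "analysis" is a simple grep count.
```shell
# Generate a hypothetical 3000-line log where every third line is an ERROR.
awk 'BEGIN {
    for (i = 1; i <= 3000; i++) {
        level = (i % 3 == 0) ? "ERROR" : "INFO"
        print level " event " i
    }
}' > huge.log

# Split it into parts of 1000 lines each: part_aa, part_ab, part_ac.
split -l 1000 huge.log part_

# Analyze every part in a background job, one result file per part.
for part in part_*; do
    grep -c '^ERROR' "$part" > "count.$part" &
done

# Wait for all background jobs, then sum the per-part results.
wait
total=$(cat count.part_* | awk '{ sum += $1 } END { print sum }')
echo "total ERROR lines: $total"
```
Each part is processed independently, so the per-part results only need to be combined at the end.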
# Usage
Its usage is very simple: run split on a file, or feed it data on its standard input, and it will create files of 1000 lines each by default. Use -b to set the size of the new files in kB or MB, or -l to change the default of 1000 lines. Split can also start a new file each time a line matches a regex given with -p.
Here is a simple example that splits a file into 1300 kB parts and then reassembles it from the parts, using sha256 to compare the checksums of the original and reconstructed files.
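To make these options more concrete, here is a small sketch. The file name sample.log and the prefixes lines_ and bytes_ are hypothetical names chosen for the example, and the -p line is left commented out because, as far as I know, that flag is not available in every split implementation (GNU split does not have it).
```shell
# Build a hypothetical 2500-line sample file.
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "line " i }' > sample.log

# Standard input, default settings: 1000 lines per part (xaa, xab, xac).
cat sample.log | split

# -l changes the number of lines per part; the last argument is a name prefix.
split -l 500 sample.log lines_

# -b sets a size per part instead of a line count, here 10 kB.
split -b 10k sample.log bytes_

# -p starts a new part at each line matching the regex (not portable).
# split -p '^line 1000$' sample.log pattern_

ls
```
Giving a prefix as the last argument keeps the parts of different runs from overwriting each other.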
```split and reassemble example
solene@kongroo ~/V/pmenu> split -b 1300k pmenu.mp4
solene@kongroo ~/V/pmenu> ls
pmenu.mp4 xab xad xaf xah xaj xal xan
xaa xac xae xag xai xak xam
solene@kongroo ~/V/pmenu> cat x* > concat.mp4
solene@kongroo ~/V/pmenu> sha256 pmenu.mp4 concat.mp4
SHA256 (pmenu.mp4) = e284da1bf8e98226dc78836dd71e7dfe4c3eb9c4172861bafcb1e2afb8281637
SHA256 (concat.mp4) = e284da1bf8e98226dc78836dd71e7dfe4c3eb9c4172861bafcb1e2afb8281637
solene@kongroo ~/V/pmenu> ls -l x*
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xaa
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xab
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xac
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xad
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xae
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xaf
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xag
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xah
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xai
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xaj
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xak
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xal
-rw-r--r-- 1 solene wheel 1331200 Mar 21 16:50 xam
-rw-r--r-- 1 solene wheel 810887 Mar 21 16:50 xan
```
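The example above relies on cat x* matching only the parts, which breaks if another file in the directory starts with x. A slightly more defensive variant, given here as a sketch with hypothetical file names (original.dat, rebuilt.dat), uses a dedicated prefix, a longer suffix with -a, and cmp instead of comparing checksums by eye.
```shell
# Build a hypothetical sample file to split.
awk 'BEGIN { for (i = 1; i <= 5000; i++) print "record " i }' > original.dat

# Split into 4 kB parts with a dedicated prefix and 3-letter suffixes
# (original.dat.part_aaa, original.dat.part_aab, ...).
split -b 4k -a 3 original.dat original.dat.part_

# The shell expands the glob in sorted order, which is the order
# split wrote the parts in.
cat original.dat.part_* > rebuilt.dat

# cmp -s is silent and only reports through its exit status.
if cmp -s original.dat rebuilt.dat; then
    echo "files are identical"
else
    echo "files differ"
fi
```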
# Conclusion
If you ever need to split files into small parts, remember the split command.
For more advanced splitting requirements, the program csplit can be used. I won't cover it here, but I recommend reading its manual page.
csplit manual