csplit
csplit - split a file into sections determined by context lines
introduction
Sometimes you have a file of certificates that you want to split into component files.
The source file looks like this:
-----BEGIN CERTIFICATE-----
MIIGXzCCBeWgAwIBAgISBrIDKzxu2DUBARSOuwi/l1h/MAoGCCqGSM49BAMDMDIx
...
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIGXzCCBeWgAwIBAgISBrIDKzxu2DUBARSOuwi/l1h/MAoGCCqGSM49BAMDMDIx
...
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIGXzCCBeWgAwIBAgISBrIDKzxu2DUBARSOuwi/l1h/MAoGCCqGSM49BAMDMDIx
...
-----END CERTIFICATE-----
We can split on the first line: -----BEGIN CERTIFICATE-----. Every time we encounter that line, split into a new file.
csplit -z -f 'cert-' -b '%02d.crt' wikipedia.crt '/-----BEGIN CERTIFICATE-----/' '{*}'
Breaking this down:
- -z: suppress empty output files.
  - Why (come back to this after reading more): the pattern we’re using copies up to but not including the matching line. In our case that pattern matches the first line, and there’s nothing before the first line, so without -z csplit produces an empty first file in my case.
- -f cert-: prefix the output files with cert- instead of the default xx.
- -b '%02d.crt': suffix the output files with two digits and the extension .crt instead of the default of just two digits.
- wikipedia.crt: the name of the cert file I’m using.
- '/-----BEGIN CERTIFICATE-----/': the first pattern, which matches that cert starting line.
  - A /pattern/ will only "copy up to but not including a matching line".
- '{*}': the second pattern, which repeats the prior pattern as many times as possible.
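To see what -z changes, here is a minimal sketch that runs the same split with and without it. It assumes GNU coreutils csplit (the -z, -b and '{*}' forms used here are GNU features). The file demo.crt, the noz- and z- prefixes, and the payload lines are all made up for illustration; they are not real certificate data.

```shell
#!/bin/sh
# Sketch: compare csplit with and without -z on a tiny fake bundle.
# Assumes GNU csplit. demo.crt and the "fake-payload" lines are placeholders.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Build a fake bundle with three BEGIN/END blocks.
for i in 1 2 3; do
  printf '%s\n' '-----BEGIN CERTIFICATE-----' "fake-payload-$i" '-----END CERTIFICATE-----'
done > demo.crt

# Without -z: nothing precedes the first match, so piece 00 is a zero-byte file.
csplit -s -f 'noz-' -b '%02d.crt' demo.crt '/-----BEGIN CERTIFICATE-----/' '{*}'

# With -z: the empty piece is dropped, so 00 holds the first block.
csplit -s -z -f 'z-' -b '%02d.crt' demo.crt '/-----BEGIN CERTIFICATE-----/' '{*}'

ls -l noz-* z-*
```

The listing should show four noz- files, the first of them empty, and three z- files.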
So every time csplit sees -----BEGIN CERTIFICATE-----, it starts a new output file, writes the lines up to the next match into it, and names the files like cert-02.crt. Here’s what the output looks like for me:
❯ csplit -z -f 'cert-' -b '%02d.crt' wikipedia.crt '/-----BEGIN CERTIFICATE-----/' '{*}'
2269
1566
1939
❯ ls -l
total 20
-rw-rw-r-- 1 geoff geoff 2269 Sep 13 17:58 cert-00.crt
-rw-rw-r-- 1 geoff geoff 1566 Sep 13 17:58 cert-01.crt
-rw-rw-r-- 1 geoff geoff 1939 Sep 13 17:58 cert-02.crt
-rw-rw-r-- 1 geoff geoff 5774 Sep 13 17:36 wikipedia.crt
The output is the number of bytes that were written to each of the files. You can add on -s for silent output if you want. And then you can get on with your life doing whatever it is you need to do:
❯ for f in cert-0{0..2}.crt; do echo "== ${f}"; openssl x509 -in "${f}" -noout -subject -issuer -dates; done
== cert-00.crt
subject=CN=*.wikipedia.org
issuer=C=US, O=Let's Encrypt, CN=E6
notBefore=Aug 10 23:56:29 2025 GMT
notAfter=Nov 8 23:56:28 2025 GMT
== cert-01.crt
subject=C=US, O=Let's Encrypt, CN=E6
issuer=C=US, O=Internet Security Research Group, CN=ISRG Root X1
notBefore=Mar 13 00:00:00 2024 GMT
notAfter=Mar 12 23:59:59 2027 GMT
== cert-02.crt
subject=C=US, O=Internet Security Research Group, CN=ISRG Root X1
issuer=C=US, O=Internet Security Research Group, CN=ISRG Root X1
notBefore=Jun 4 11:04:38 2015 GMT
notAfter=Jun 4 11:04:38 2035 GMT
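A quick sanity check on a split like this: the pieces should concatenate back into the original file. In the run above, the byte counts 2269 + 1566 + 1939 add up to 5774, the size of wikipedia.crt. The sketch below does the same check on a made-up bundle so it can run anywhere. It assumes GNU csplit, and bundle.crt and the payload lines are placeholders rather than real certificates.

```shell
#!/bin/sh
# Sketch: check that a csplit run is lossless, using a made-up bundle.
# Assumes GNU csplit. bundle.crt and the "fake-payload" lines are placeholders.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

for i in 1 2 3; do
  printf '%s\n' '-----BEGIN CERTIFICATE-----' "fake-payload-$i" '-----END CERTIFICATE-----'
done > bundle.crt

csplit -s -z -f 'cert-' -b '%02d.crt' bundle.crt '/-----BEGIN CERTIFICATE-----/' '{*}'

# Zero-padded names sort in split order, so the glob concatenates pieces in sequence.
if cat cert-*.crt | cmp -s - bundle.crt; then
  result="pieces match original"
else
  result="pieces differ from original"
fi
echo "$result"

# The piece sizes should add up to the size of the original file.
pieces_total=$(cat cert-*.crt | wc -c)
original_size=$(wc -c < bundle.crt)
echo "pieces: $pieces_total bytes, original: $original_size bytes"
```

Against the real files, the equivalent check would be cat cert-*.crt | cmp - wikipedia.crt, which prints nothing when the contents are identical.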