February 25, 2025
One rainy day a friend from Facebook messaged me "Hey Brooks, do you work with code?". After some back and forth, we decided that the crux of his issue was an ill-formated Excel Spreadsheet at the core of his idea's inner mechanisms.
The main problem was: the Excel sheet was not only massive but also broken up by sheet into sub-sheets. This meant that when looking for a piece of data, he had to check across all 15 tabs 15 times, instead of searching through one sheet. He asked:
Can you help solve this problem?
To which I replied:
Of course! I can use POI.
I decided that since I knew the Apache POI library (Java) already, it should be doable. However, I wanted to spice it up. For this task, I decided to use Kotlin as a scripting language, and to try and write/use an Apache POI DSL for Excel where possible.
After some perusing of Github, I found a project actually building this DSL leveraging Kotlin's full Java interop so I decided to try it out. Since the code was simple, and I wanted to customize it, I shamelessly copied over the Poink library into my repo and started coding.
DSL's in Kotlin are the future of all coding. The kotlinx.html DSL is far superior to JSX in terms of first class support from the IDE, meaning, things that you would never expect autocomplete.
Now, I am prompted through my code as if the rules were encoded into my IDE at compile time. It means I don't really need to know POI. This is indispensable; it turns out Excel sheets are pretty simple to work with from code once it's all autocompleted for you.
A snippet of the Poink DSL
println("~~~~~~~~ Reading workbook...")
val rowCollection = mutableListOf<Row>()
workbook(inputName) {
repeat(numberOfSheets) { sheetIndex ->
sheet(sheetIndex) {
iterator().forEach { row ->
rowCollection.add(row)
}
}
}
}
Here I am looping through the rows of a workbook while iterating through each sheet, collecting them in a mutable list of rows.
Using Poink, I can simply add the rows to a new file with a new sheet with another simple loop:
println("~~~~~~~~ Merging Sheets...")
workbook {
sheet("Merged") {
rowCollection.forEach {
val cells = it.cellIterator().asSequence().toList()
row(cells = cells)
}
}
println("~~~~~~~~ Creating Output file...")
}.write("$mergeFileName.xlsx")
It's possible to produce your own binary of this for your own machine. For now, the easiest way to tinker with the code is to bootstrap a project with intellij using kscript. Kscript is easy to install with SDkMan.
install that then just run sdk install kscript
and you should be all set. Now just clone the repo and run kscript --idea ./CsxMerge/CvxMerge.kts
.
This will bootstrap a new project for you in Intellij IDE. From there all you need to do is add a path to either a
And they will be merged into one workbook for you. This is useful for database output or just general excel issues. I've included a small test file to tinker with for added results.
While this code only serves a specific purpose and doesn't cover a lot of use cases, it's still pretty useful. I've had a number of friends ask me this same question and it's worked reliably for normalized data.
Some issues arise when working with massive spreadsheets, Java will throw a heap overflow error. This could be improved by using a streaming api, which would need to be built into the DSL.
There is a good amount of flexibility to this code, once you get familiar with the DSL aspect of it. It can be adapted for numerous use cases easily, just by tinkering in the IDE. This discovery adds a level of fun to writing the code, as it's now an exploratory process.
The code is located on Gitlab.