This time I want to talk about how agile methods can help you cope with uncertainty.
I want to tell you about the last little development job I did and how I approached it.
One project had just been finished successfully and the product shipped to the customer when I discovered there was no proper way to find out how many installations we had.
So I started researching and found a small existing project that could help me get these numbers.
The little project was about parsing logfiles and writing the parse results into a database.
But the way that the logfiles were parsed was very specific to other products that were released earlier, so it didn’t help me with the new and fancy product that apparently our customers liked a lot.
Analysing what we had a bit more showed that I was not the only one who needed to find out installation numbers for their products. And it was not only product management that was interested but also Sales and others.
So my idea was to start a mini-agile-project out of this.
I called it v4.
The plan for v4 was simple.
- Adjust the parser so it is more generic and writes all the data needed into the database
- Modify the database or create a new database schema
- Visualize the data in a simple filtering webpage with a line-chart
This is what Perl can look like:
- $text =~ s/\/[^\/]*\/\.\.//;
- parse the logfiles and create an external SQL file with insert statements that could be fed to mysql
- directly insert data from the Perl script into the database using DBI
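To make the second option concrete, here is a minimal sketch of parsing log lines and inserting them straight through a database driver instead of generating a SQL file first. It is written in Python with the built-in sqlite3 module as a stand-in for the Perl DBI and MySQL setup I actually used. The log line format, the `hits` table and the function names are all made up for illustration.

```python
import re
import sqlite3

# Hypothetical log format, e.g. "2012-03-01 12:00:00 product=fancy host=abc123"
LINE_RE = re.compile(
    r"^(?P<day>\d{4}-\d{2}-\d{2}) \S+ product=(?P<product>\S+) host=(?P<host>\S+)$"
)


def parse_line(line):
    """Return (day, product, host), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    return (m["day"], m["product"], m["host"]) if m else None


def load(conn, lines):
    """Insert every parsable line through the driver, one statement per row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits (day TEXT, product TEXT, host TEXT)"
    )
    inserted = 0
    for line in lines:
        row = parse_line(line)
        if row is None:
            continue  # skip lines the generic parser cannot handle
        # Placeholders let the driver do the quoting instead of building SQL text.
        conn.execute(
            "INSERT INTO hits (day, product, host) VALUES (?, ?, ?)", row
        )
        inserted += 1
    conn.commit()
    return inserted
```

Calling `load(sqlite3.connect(":memory:"), open_file)` with any iterable of lines returns the number of rows written, which makes it easy to sanity-check a single logfile before running thousands.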
When I started using direct database access from Perl, things improved quite a lot. A full logfile could be parsed in about 1.5 to 2 hours on the test machine. On the production machine, which had much more horsepower (8 quad-core CPUs and a huge amount of RAM), I hoped it would be much faster.
Then I checked how many files in total I’d have to parse to get the data of 4 weeks.
The bad news: ~11,000 files. 11,000 files * 2 hours each = 22,000 hours (about two and a half years) = totally not doable
- greatly improve the speed of the dbi inserts
After some research about mysql I was able to do it. From 2 hours down to 20 seconds. But the journey was not over yet. After parsing a few thousand logfiles the database grew too large and became unresponsive. What took only about 20 seconds when the database was empty took around 2 hours when the database had over 200 million records.
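On the insert speedup: I won't go into what exactly I changed, but a common reason for slow row-by-row inserts is that every row is committed on its own. The sketch below shows that general technique, batching rows and committing once per batch. It is an assumption about what typically helps, not a record of my actual fix. It again uses Python and sqlite3 as a stand-in; with Perl DBI the comparable levers would be turning `AutoCommit` off and reusing a prepared statement. The table name and batch size are hypothetical.

```python
import sqlite3


def _flush(conn, batch):
    """Write one batch inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO hits (day, product, host) VALUES (?, ?, ?)", batch
        )
    return len(batch)


def insert_batched(conn, rows, batch_size=10_000):
    """Insert rows in batches, committing once per batch instead of once per row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits (day TEXT, product TEXT, host TEXT)"
    )
    total = 0
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            total += _flush(conn, batch)
            batch = []
    if batch:  # write whatever is left over after the last full batch
        total += _flush(conn, batch)
    return total
```

The batch size is a trade-off: larger batches mean fewer commits, but more work is lost and redone if one batch fails.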
- reduce the raw data to keep the database responsive
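One way to reduce raw data, sketched here under assumptions, is to roll the raw rows up into one row per day and product and then delete the raw rows. The `hits` and `daily_installs` tables and the idea of counting distinct hosts as installations are hypothetical. The example is Python with sqlite3 as a stand-in for MySQL, and `INSERT OR REPLACE` is SQLite syntax.

```python
import sqlite3


def roll_up(conn):
    """Collapse raw hits into one row per (day, product), then drop the raw rows.

    Assumes all logfiles for a day are loaded before that day is rolled up;
    rolling up the same day twice would overwrite the earlier count.
    """
    with conn:  # aggregate and delete in one transaction
        conn.execute(
            "CREATE TABLE IF NOT EXISTS daily_installs ("
            "day TEXT, product TEXT, installs INTEGER, "
            "PRIMARY KEY (day, product))"
        )
        conn.execute(
            "INSERT OR REPLACE INTO daily_installs (day, product, installs) "
            "SELECT day, product, COUNT(DISTINCT host) "
            "FROM hits GROUP BY day, product"
        )
        conn.execute("DELETE FROM hits")
```

The summary table grows with the number of days and products rather than with the number of log lines, which is what keeps a line chart over it responsive.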
There were many more steps but I’ll stop here.
What I wanted to show you with these steps is the way I approached the problem. I couldn’t work the way I was used to developing. I didn’t have the time to first learn the programming language and the system I was working on, so I had to learn them along the way.
If you are in an area of absolute uncertainty it is very hard, if not impossible, to take the traditional approach of requirements, design, programming, test.
This is where the agile approach can shine:
- implement a basic prototype to find out if you’re even able to do what you planned to do (version 1)
- continuously refactor as you find out more about the platform and the system you are using (version 2)
- let the architecture of your system emerge while you learn the pitfalls of the new environment (version 3 + 4)
- constantly question the requirements that you have, there might be a better or different way to do it (version 4)
- automate as much as possible, the easier it is to start from scratch the more likely you are to succeed
- test a lot, ideally create automated tests for the system so you can be sure the refactoring doesn’t break anything
- keep your code as simple and readable as possible, this will benefit the refactoring
- if you find complex blocks of code, refactor them and break them down into smaller chunks or modules until you are satisfied with the result
- expect the unexpected and be flexible enough to deal with things that could not be foreseen
- deal with unknowns as soon as you uncover them (thx @lukadotnet for pointing me to these last 2)
v4 was not completed entirely, but parts 1 and 2 were finished. The resulting database could answer questions that no system before it could answer. Even though the gold plating (the small Node.js webpage) was not finished, the whole project was a great success. A project that was supposed to take only 1-2 weeks in its entirety turned into a development effort of about 4-6 weeks for parts 1 and 2 alone. I’m pretty sure there are different and more efficient ways to do it. But if you are uncertain, you never know…