Saturday, February 18, 2017

Creaky or Cranky Code

Sometimes you build a code Taj Mahal, or at least try to. A thing of rare beauty, of soaring architecture and fine attention to detail, the combined efforts of many over time.

This is not about that.

Sometimes you're a bit off the beaten path - you're exploring new technologies by doing simple tasks, and that is not going to look pretty. Building huts of sticks comes first. It shows you can at least do something with the available materials, but the first few cuts are often unstable and poorly designed.

We have an embarrassment of riches available for free - well, as long as you have a computer and can pay for internet access, anyway. The barrier is much lower than it once was, and keeps getting lower.

I've been playing with AngularJS and needed to get some REST data as a client. AngularJS does that well, so I made a simple app and added a service to supply the REST data via promises. I copied large portions of the solution from assorted "How Do I...?" posts on the internet with good answers. In no time it was working, and I added some controls and UI embellishments easily. That's also something AngularJS does well.

AngularJS does have its limitations. For local file access, and for more complex tasks without obvious solutions to crib, it can be a slog.

Luckily, up my other sleeve I've got Python to cover the cases not easily handled by an MVC interface in the browser, plus access to backends. AngularJS isn't good at everything! Python, it seems, is good at (or at least capable of) most everything.

Python is great for cases like this recent one. I needed to grab some data from a rate-capped REST API, do assorted further processing, and then ship the data, sliced various ways, to an SQL database. Using Python let me solve all the issues directly in one script.

Python handles pulling data from a REST API easily enough, and it's one of the more pleasant languages for getting things written to disk. It understands XML, and its SQL interfaces are simple. A script or two to create a database and tables, an import or two in the main script, and we're off.
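A minimal sketch of that database side, using SQLite in place of the MySQL instance I actually used so it runs self-contained; the table name and columns here are invented stand-ins, not my real schema:

```python
import sqlite3

# Hypothetical schema; the real tables and columns are stand-ins.
SCHEMA = """
CREATE TABLE IF NOT EXISTS records (
    key   TEXT PRIMARY KEY,
    name  TEXT,
    value REAL
);
"""

def open_db(path=":memory:"):
    """Create the database and tables (SQLite here; swap in MySQL for real use)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def upsert_record(conn, key, name, value):
    """Insert or update one data point, keyed on its API key."""
    conn.execute(
        "INSERT OR REPLACE INTO records (key, name, value) VALUES (?, ?, ?)",
        (key, name, value),
    )
    conn.commit()
```

The parameterized query (the `?` placeholders) matters once API data starts containing quotes and odd characters.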

I was able to mine logged streaming data for input keys used to request records from a different but related REST API. Handling the REST client details and getting all of the SQL tables updated properly for each data point turns out to be easy to do in Python - I got it running in an afternoon.
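Roughly what that mine-then-fetch loop looks like; the log line format, the key field name, and the sleep interval are all invented for the sketch, not the real API's details:

```python
import re
import time

# Hypothetical log format: keys appear as "recordKey": "<id>" in streamed JSON.
KEY_RE = re.compile(r'"recordKey":\s*"([A-Za-z0-9_-]+)"')

def mine_keys(log_lines):
    """Pull unique input keys for the second REST API out of logged streaming data,
    preserving first-seen order."""
    seen, keys = set(), []
    for line in log_lines:
        for key in KEY_RE.findall(line):
            if key not in seen:
                seen.add(key)
                keys.append(key)
    return keys

def fetch_all(keys, fetch_one, min_interval=1.0):
    """Call fetch_one(key) for each key, sleeping between calls to honor the rate cap."""
    results = {}
    for key in keys:
        results[key] = fetch_one(key)
        time.sleep(min_interval)
    return results
```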

Then the cranky part came in. Rate limiting made test runs slower, and my local MySQL installation seemed to periodically confuse the MySQL Workbench tool, making tests even more tedious as I frequently exited and restarted utilities.

Just when I thought I had it working, past 1,000 records into a 50K record run with things looking good - bang! An exception. The first of many. Characters in JSON results that couldn't be translated to the local code page (exception!), remote requests failing in various interesting ways, most triggering exceptions because it hadn't occurred to me that they might happen, so I certainly hadn't prepared for them.
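The code-page failures, for instance, can be tamed by decoding and encoding defensively instead of letting one odd character kill the run. A hedged sketch of that kind of hardening; the cp1252 default and the `fetch_json` helper are assumptions for illustration, not my actual code:

```python
import json
import urllib.error
import urllib.request

def safe_for_console(text, encoding="cp1252"):
    """Replace characters the local code page can't represent,
    instead of letting print() raise UnicodeEncodeError."""
    return text.encode(encoding, errors="replace").decode(encoding)

def fetch_json(url, timeout=10):
    """Fetch and parse JSON; return None on network or parse failure
    so the caller can log and skip rather than crash mid-run."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8", errors="replace"))
    except (urllib.error.URLError, json.JSONDecodeError):
        return None
```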

Your good idea that was oh so close to completion and looked like a nice reasonable tight bit of code - well, something happened when we weren't looking. It's a creaky thing, prone to falling over at the first hint of trouble. There are cures for that, but they take time and attention and may have their own issues.

Your code accumulates try/except blocks all over, parameters get checked, comments describing important features get added, results and branches get logged; days and weeks go by and it mostly does the same thing, just more correctly. The code bloats up until you have to lose yet more time restructuring, then chasing down and correcting the inevitable bugs that introduces.
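One cure for that sprawl is pulling the try/except-and-log pattern into a reusable retry wrapper, so it isn't pasted at every call site. A sketch of the idea, not the code from this project:

```python
import functools
import logging
import time

log = logging.getLogger("importer")

def retrying(times=3, delay=1.0, exceptions=(Exception,)):
    """Decorator: retry a flaky call a few times, logging each failure,
    and re-raise only after the final attempt."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return fn(*args, **kwargs)
                except exceptions as exc:
                    log.warning("%s failed (attempt %d/%d): %s",
                                fn.__name__, attempt, times, exc)
                    if attempt == times:
                        raise
                    time.sleep(delay)
        return inner
    return wrap
```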

Even though the code no longer looks the same, or as nice, with some careful refactoring it won't be too bad. Get some good unit test coverage and automated tests going, then hook up the CI/CD. Uh, where to hook up the CI/CD? Well, that's another topic.
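A minimal example of the kind of unit test that pays off here; `slice_by_key` is a hypothetical helper standing in for the real slicing logic in the import script:

```python
import unittest

def slice_by_key(rows, key_index=0):
    """Hypothetical helper: group rows by one column,
    the way the import script slices data for its tables."""
    out = {}
    for row in rows:
        out.setdefault(row[key_index], []).append(row)
    return out

class SliceTests(unittest.TestCase):
    def test_groups_rows_by_first_column(self):
        rows = [("a", 1), ("b", 2), ("a", 3)]
        grouped = slice_by_key(rows)
        self.assertEqual(sorted(grouped), ["a", "b"])
        self.assertEqual(grouped["a"], [("a", 1), ("a", 3)])
```

Run it with `python -m unittest`; a CI job is then just that one command.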

Simple ideas don't stay that way if they get worked on, I suppose. The process of using AngularJS to get the first level of details recorded, then Python scripts to do the heavier API, data analysis and SQL output gets more complex over time, but I'm also gaining increasing value.

A bit more re-architecting and getting past a few one-time startup issues (i.e. initial data load with throttling) and I'll have a much more useful set of information, broken out nicely in tables, where it can be searched and sorted and used to drive other processes.

The next layer of intellectual property is to master how to manipulate that data to generate code, in this case for smartphone apps, that carries useful or important information.

I remember being told once that only an idiot, somebody who did not know what they were doing, used run-time code generation. He was wrong then, and still is. Plenty of smart people have joined the idiots doing run-time code generation, and it continues to be an appropriate way to solve various sorts of interesting problems.

This is simpler: back-end analysis and code generation, not real time. The generated Java code will be compiled into an Android phone app and apply the data to the user's benefit. This is data we gleaned from our complicated dance across the internet and an assortment of REST APIs, tricks, and simple approaches to aggregate and increase the usefulness of the results.
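In spirit, that generator is string templating: a Python script emits a Java class that bakes the mined data in as a constant map. Everything here (class name, package, sample data) is invented for illustration, since the real details are trade secrets:

```python
# Hypothetical: turn key/value rows mined from the APIs into Java source.
JAVA_TEMPLATE = """\
// Generated file; do not edit by hand.
package com.example.generated;

import java.util.HashMap;
import java.util.Map;

public final class DataMap {{
    public static final Map<String, String> DATA = new HashMap<>();
    static {{
{entries}
    }}
}}
"""

def java_string(s):
    """Escape a Python string as a Java string literal."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'

def generate_java(rows):
    """Render (key, value) rows into a compilable Java class."""
    entries = "\n".join(
        "        DATA.put(%s, %s);" % (java_string(k), java_string(v))
        for k, v in rows
    )
    return JAVA_TEMPLATE.format(entries=entries)
```

The output goes into the Android project's source tree and gets compiled like any hand-written class.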

I enjoy the complex mental work: figuring out where something is available that can be leveraged into something useful without too much effort (and for free is always nicest). I can see how the IP will fit together before I've written a line of it. It will change a bit as I continue - next time I'll probably lead with Python, since it's a bit more suited to ad-hoc, on-the-fly API ingestion and reverse engineering - but the phases are all there; just the flavor of the mouthwash used in this step might change, or a different brush there...

Now I'm down in the guts of the architecture, across the first chasm and running the creaky/cranky engine that will cross the next one, getting me useful data properly organized in MySQL.

The final step is to flesh out a specific task that could use this data, write the code and design the data object (a map of something, basically) then write code to generate the specific implementation needed based on the data and the use case we're trying to solve. This would be easier to describe if it were public so I could explain exactly what is going on. The details are trade secrets for now. Maybe later, we'll see.

I plan on doing this final step multiple times, working out specific implementation forms and then generating code for them based on insights from the data. The data can source an awful lot of different useful features if you can figure out how to extract the needed details and turn them into code. As I said, I enjoy this sort of work and look forward to the challenge.
