During the evolution of my current company's product line and technical skill set, there came a point when leadership decided to realign our vision and start focusing on the needs of our customers rather than solely on the development project of building the world's fastest triple store.
It was at that point that leadership said: to build a better relationship with our customers, we're going to step into their shoes and do with our products what they wish they could do, if only they had a better understanding of linked data and graphs. So our team began vigorously researching and developing a process for performing the very analytics that our tool is supposed to foster.
I was made the product manager / product owner of that effort within the company.
Over the next several weeks, it became clear to me that analytics is not only a technical skill set, but also a process and an IT management challenge. To be successful, the analytics team had to be able to easily ingest data from the client and meet with them regularly to truly understand their needs. Our analysts needed to ask tough questions about the data, then rapidly generate data analysis snippets that represented answers to those questions, and do it in a repeatable way. Those analysis snippets couldn't just stand on their own, though; after performing each piece of analysis, the analysts needed to officially record an insight for the question it addressed. And finally, after all of that, we needed a way to generate a report that encompassed the whole effort.
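The workflow above can be sketched in plain JavaScript. All of the names here (question, snippet, insight, report) are illustrative stand-ins, not the application's actual data model:

```javascript
// Illustrative sketch of the analysis workflow: question -> snippets ->
// recorded insight -> report. Hypothetical names, not our real schema.

// An analyst asks a tough question about the client's data...
function createQuestion(text) {
  return { text, snippets: [], insight: null };
}

// ...answers it with one or more repeatable analysis snippets...
function addSnippet(question, description, result) {
  question.snippets.push({ description, result });
  return question;
}

// ...then explicitly records an insight for that question...
function recordInsight(question, summary) {
  question.insight = summary;
  return question;
}

// ...and finally, a report encompasses all of that effort.
function generateReport(clientName, questions) {
  const lines = [`Analytics report for ${clientName}`];
  for (const q of questions) {
    lines.push(`Q: ${q.text}`);
    q.snippets.forEach((s) => lines.push(`  - ${s.description}: ${s.result}`));
    lines.push(`  Insight: ${q.insight ?? '(not yet recorded)'}`);
  }
  return lines.join('\n');
}

// Example usage with made-up data
const q = createQuestion('Which products drive repeat purchases?');
addSnippet(q, 'Top repeat SKUs', 'SKU-42, SKU-17');
recordInsight(q, 'A small set of products accounts for most repeat business.');
console.log(generateReport('Example Client', [q]));
```

The point of the structure is that nothing is lost between steps: every snippet stays attached to the question it answers, and the report is generated from the recorded work rather than written by hand.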
During that time, we were heading down two major paths simultaneously. On one path, we manually performed all of our data analysis for our clients: composing queries, running command-line tools, building data models, consuming APIs and other services, and manually reporting all of our efforts. If I've learned one thing so far, it's that doing all of this work manually takes FOREVER.
Luckily, as a development team, we are also very good at writing applications. The other path, naturally, was to write an application that could organize, enable, and demonstrate every piece of the process at scale. So, alongside the manual work, we built a full application that did exactly that.
I am happy to report that after exactly one month of working tirelessly, from the moment we woke up to the moment we went to bed, we have it: a working prototype of an analytics suite that does everything we need and improves the speed at which we do it.
The toolset consisted of the following:
Node.js for middle-tier development
Azure SQL for MSSQL data storage and retrieval
MongoDB for rapid caching and application data storage
Mongoose for mid-tier object data modeling (ODM) within the application
AngularJS for the consumer application
Twitter Bootstrap and Sass for dynamic interface styling
Countless third-party NPM modules for specific data analysis (natural language processing, correlation generation, machine learning, data cleansing, etc.)
Azure and Amazon Web Services for storage and deployment of the application
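To give a flavor of the kind of analysis those NPM modules handle, here is a minimal, dependency-free sketch of correlation generation: a hand-rolled Pearson coefficient. In the application itself this came from a third-party module, not custom code like this:

```javascript
// Minimal Pearson correlation coefficient, written by hand purely for
// illustration of what "correlation generation" means in practice.
function pearson(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let num = 0, dx2 = 0, dy2 = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    num += dx * dy;   // covariance term
    dx2 += dx * dx;   // variance of x
    dy2 += dy * dy;   // variance of y
  }
  return num / Math.sqrt(dx2 * dy2);
}

// Two perfectly correlated series yield a coefficient of 1.
console.log(pearson([1, 2, 3, 4], [2, 4, 6, 8])); // 1
```

Having this kind of computation available as a library call, rather than a spreadsheet exercise, is what let the application run analyses repeatably and at scale.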
The combination of these elements has resulted in a multi-tier application that can not only perform many difficult types of data analysis, but is also designed to be extended with additional data sources and new types of analysis in the future, including graph exploration and ontology management.
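One way that kind of extensibility might look, as a hypothetical sketch rather than the application's actual design, is a registry of analysis types that new modules plug into:

```javascript
// Hypothetical extension point: a registry mapping analysis-type names
// to functions. New capabilities (e.g. graph exploration) would register
// themselves here without touching existing code.
const analyzers = new Map();

function registerAnalyzer(name, fn) {
  analyzers.set(name, fn);
}

function runAnalysis(name, data) {
  const fn = analyzers.get(name);
  if (!fn) throw new Error(`Unknown analysis type: ${name}`);
  return fn(data);
}

// Example: register a trivial "rowCount" analyzer and run it.
registerAnalyzer('rowCount', (rows) => rows.length);
console.log(runAnalysis('rowCount', [{ a: 1 }, { a: 2 }])); // 2
```

The design choice is simply that the core application depends on the registry interface, not on any particular analyzer, so adding a new analysis type is additive rather than invasive.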
This is one of the most difficult projects I've done in a while, not only because of the time constraints, but also because of the requirement to become genuinely well versed in data analysis and data science. It was not easy, but it has been done, and from here on out I expect every addition to the application to deliver some very interesting and insightful data revelations to our customers.