Questions w/ Dr. Leo McHugh
How do you describe yourself in terms of data science?
I consider myself more of a process architect than a power cruncher. I enjoy the creative aspects of optimizing the statistical power for what you want to achieve. One of my heroes, W. Edwards Deming, once said “without data, you’re just another person with an opinion.” I believe that the reverse also applies in data science, because without an opinion, you’re just another person with data. That opinion part is the ‘meta’ over data science that I find most interesting and where I invest the most time and thought.
Is there a simple definition of data science?
I think there are actually two distinct forms of data science, one an art, the other a science. There is an art in data science visualization and presentation, and a simple definition for this side of data science is the art of turning hidden connections into a story that makes sense. The other side could be defined as the process of inventing and testing either/or scenarios to rule things in and rule things out until you know enough to stop. That’s the lay definition; most scientists would call this data-driven hypothesis testing.
How subjective is data interpretation?
One thing that was sadly not emphasized enough in school (at least not for me) was to invest as much into understanding your confidence intervals and assumptions as you do into your process and conclusions. When these qualifiers are not meaningfully presented alongside an interpretation, the interpretation becomes much more subjective. I have found, in science and in life, that a difference in interpretation is most often a result of neglecting to sufficiently lay out these components.
How can objectivity be optimized?
Will there ever be a “perfect” algorithm?
I hope not. There’s nothing epic in applying the rulebook. I don’t think that’s what drives us, not me anyway. The cosmological algorithm of Aristotle was eclipsed by Newton and then (literally) by Einstein, and I see the same progression in data science: the perfect algorithm on an abacus isn’t always the perfect algorithm when you have the extra dimensions of a silicon chip, and with quantum computing around the corner, the bar for “perfect” may be lifted again. There is no way to see into the infinite future, and therefore no way to define what would be considered ‘perfect’ at that time. A sly question!
What domain have you chosen to master?
My peers tell me that much of the value I bring to teams is the ability to increase a sense of team participation and ownership by being a conduit to the techniques of applied data science, and to consolidate group consensus by effectively communicating priorities and risks as revealed through analysis. I will continue to attempt to master the ‘soft’ skills built over the hard skills of technical expertise. These soft skills are worth trying to master because they help to foster an attitude of carrying the team, rather than just delivering an informatic service.
How do you grow your knowledge in new domains?
Say yes! Go try! Always have a few challenging professional side projects unrelated to paid employment.
How do you predict data science will evolve?
My computing mentors described a time when there was a premium on experimental design because of compute limitations. I learned my trade in a time when, for most applications, you could just throw packages at the problem, maximizing the time efficiency of the data scientist rather than the code. I predict that as the bell curve of data science applications in general shifts to an abundance of data, data science will go full circle back to a culture of data scientists and analysts investing a lot more time into thinking about which data to use and why, and about upstream and downstream needs, before executing. This in turn will change how data science is taught for the next generation, and consequently the role of data science in industry will shift away from its perception as a service and move towards a communications role in an era where the technical processing part is assumed and ubiquitous.