The work of data science teams can be intertwined with cloud and other tech assets, which can make them part of budgetary questions raised about cloud spending. This is just one of the ways data scientists have expanded beyond some old expectations of the work they do and the assets they leverage. If steps are not taken to sort out how such resources are used, organizations might see data science contribute more to costs than to returns.
Shane Quinlan, director of product management with Kion, spoke with InformationWeek about how data science has evolved, and ways data scientists can efficiently use the cloud.
Data science wasn’t really on my radar when I started working in technology. The buzz started in 2015-2018, when data science became the thing. New positions started getting created and we started getting things like DataOps and MLOps. Big data: if you slapped that onto any company, it was a gold mine.
I got pulled into it around that same timeframe, moving from a job where I was mostly supporting federal and law enforcement customers and jumping into healthcare, switching from web and endpoint solutions to analytics. That was my first jump into data science.
Now I’m seeing it from a different angle because our product focus is much more on platform and infrastructure management. I’m looking at it from the cloud towards data science instead of looking at it from data science towards the cloud.
I see two trends. One is around changes in technology and availability. Early on, it was kind of the Wild West. There were tons of new service offerings and technology stacks, and the skillsets were really divergent, though they started to become a little bit more accessible.
Data science was this big world. You had everything from your Excel data scientist literally using Microsoft Excel, to an expectation that you could write Java applications that could perform data functions and provide different output. You had mathematicians, you had statisticians, you had software developers, and you had folks who had more of a business intelligence-analyst role all coming at the same space and trying to find different ways to meet their expectations.
That’s when you saw a push for better user interfaces, making the development side less of a requirement. That’s where you have the introduction of notebooks like Jupyter and Zeppelin and derivations thereof to make that a little bit easier. You had a human-interpretable interface, part code and part not, for the way that you’re shaping data. Behind the scenes, I think there’s been this huge explosion of ways to shape that as well. You have tech like DBT that’s making the data transformations a lot easier. Technologies that were centered around the Apache Hadoop ecosystem have now shifted and morphed and moved all over the place, making them a lot more portable. Apache Spark can be run in all kinds of different contexts now.
There’s been a drive towards a more user-centric model of data science. More user-friendly, more user interfaces, more easily interpretable. You can bring common skillsets like Excel or BI tools or SQL and do enough with that to make a difference.