A couple of weeks ago, I met with a junior software engineer I’d not seen for a few years. The last time I’d spoken to him was when I offered him a place at my (then) company, an offer he eventually turned down. “I want to work with machine learning and data,” he had said back then, “and I think this other company would give me a better opportunity to do so.”
He was absolutely right, of course. My old company didn’t have much of a data capability; he wouldn’t have been able to learn the skills he wanted in the role I was offering him. But in our most recent meeting, he told me that it took a few years before he could get into ‘real’ data science at the other startup: “We’re only now starting to build up a data department, and I’m shifting from general software engineering to data science things. I’m really happy about that!”
There’s an interesting truth buried in his experience that I think is worth exploring in greater detail. My friend’s experience reflects an underlying reality: data team careers are different from careers in software engineering, product management, or user interface design. They are different because they depend on the surrounding context of the organization, in ways that careers in the latter three categories do not.
In software engineering, product management and design, job requirements are broadly similar. Sure, software engineering in a services company is marginally different from software engineering in a product company, but by-and-large, software people have rather similar responsibilities. This is also true for UI/UX designers — and slightly less true for product managers — but these roles are largely clustered around the same sorts of experiences.
Contrast that with a data analyst or a data scientist working in a data team. Their lived experiences differ vastly according to the organization their team is in. Certain companies demand that their analysts do proper data modeling. Others deal solely in free-form SQL queries, and share those queries in Google Sheets. Some lucky ones perform machine learning and statistical analysis. But still others expect their data analysts to be nothing more than SQL-and-Excel-exporting monkeys.
Is it any wonder, then, that we get blog posts written by analysts leaving data because of ‘lack of impact’?
Why is this the case? The answer is obvious: unlike software engineers, product managers, or designers, data team members serve internal users, not external ones. This means that your day-to-day experiences and your learning opportunities depend greatly on the organization you are joining. It means that you cannot assume that all data team roles (at different companies) are equal. And it means — as far as your career is concerned — that you must develop an ability to evaluate the data maturity of organizations before you choose to work for them.
The Three Levels of Data Analysis
Emilie Schario has a great blog post over at GitLab on the three levels of data analysis. Companies exist in one of these levels, and they must advance from one level to the next. The three levels are, in order:
- Reporting — Reporting is the lowest level. As Schario puts it: when you have no answers, you never get beyond looking for facts. Example questions at this level are things like: ‘how many new users visited our website last week?’ and ‘how many leads did we capture this month?’ Sometimes, companies don’t even get to this level, because they lack an organizational capability to systematically collect data in their business. Other times, they do collect the data, but they don’t spend any time paying attention to it. Reporting is the lowest level of data analytics; if you do not collect data or if you do not have the cultural expectation of using it, you’re not going to base your decisions on facts.
- Insights — Insights is the next level above reporting. If reporting is about gathering facts to report on them, insights are about understanding relationships between facts. Often, insights only emerge when you combine data from multiple sources. For example: the number of new customers who cancelled their subscription this month is a reporting metric. If we combine this data with deals data in our sales CRM, however, we might learn that we have been targeting a terrible subsegment of our market. This latter observation is an insight, and can lead to behavioral change among sales and product (‘don’t target or serve this market segment in the future; we’re not a good fit for them’).
- Predictions — Predictions come after insights. It is at this level that you begin to see sophisticated techniques like statistical analysis and machine learning. This makes sense: after your organization increasingly understands the relationships between various metrics, you may begin to make informed business decisions to drive outcomes that you desire. A famous example here is Facebook’s discovery that users who add at least seven friends in their first 10 days are the most likely to stick around. This single discovery drove an incredible amount of product decisions at Facebook — leading them, eventually, to win the social media race. Such predictive discoveries can only come after a reporting function and an insight-mining function are second-nature throughout the organization.
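To make the jump from reporting to insight concrete, here is a minimal Python sketch of the cancellation example above. Everything in it is made up for illustration: the `deals` and `cancelled_ids` data, the segment names, and the `churn_by_segment` helper are assumptions, not anything from Schario's post or from a real CRM. The point is only that the reporting number is a single count, while the insight appears once that count is joined against a second data source.

```python
from collections import defaultdict

# Hypothetical CRM export: one row per closed deal (customer and market segment).
deals = [
    {"customer_id": 1, "segment": "enterprise"},
    {"customer_id": 2, "segment": "enterprise"},
    {"customer_id": 3, "segment": "freelancer"},
    {"customer_id": 4, "segment": "freelancer"},
    {"customer_id": 5, "segment": "freelancer"},
    {"customer_id": 6, "segment": "agency"},
]

# Hypothetical billing export: customers who cancelled this month.
cancelled_ids = {3, 4, 6}


def churn_by_segment(deals, cancelled_ids):
    """Join deals to cancellations and return the cancellation rate per segment."""
    totals = defaultdict(int)
    cancelled = defaultdict(int)
    for deal in deals:
        totals[deal["segment"]] += 1
        if deal["customer_id"] in cancelled_ids:
            cancelled[deal["segment"]] += 1
    return {seg: cancelled[seg] / totals[seg] for seg in totals}


# Reporting-level metric: a single count from one source.
print("Cancellations this month:", len(cancelled_ids))

# Insight-level view: the same cancellations, broken down by CRM segment.
for segment, rate in sorted(churn_by_segment(deals, cancelled_ids).items()):
    print(f"{segment}: {rate:.0%} cancelled")
```

In practice this join would more likely live in SQL or a BI tool than in a script, but the shape of the analysis is the same: two sources, one shared key, and a breakdown that a single-source report cannot give you.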
I’ve provided a quick summary of Schario’s three levels, but I encourage you to read her post in full. Schario focuses on leaders within the organization, however, and she goes on to discuss a few ways to advance along the three levels.
I am not as interested in talking about that. My question is simpler: let’s say that you are a data analyst, or a data scientist. You’re looking for your next job. It’s clear that you should pick organizations based on their data maturity — if you want to learn machine learning or use regression models, you should pick companies at the ‘prediction’ tier; if you want to make a huge impact on business decisions, you want to work at a company that’s firmly on the ‘insight’ tier.
(A more sophisticated form of this is to identify companies that are just about to transition from one level to another, for this is when you get the most learning opportunities. And that’s not to mention the opportunity to make your mark on the organization as it grows into the new level.)
But how do you do this? How do you identify which level the company is currently on?
Head Fake Questions
What you want is to ask what I’ll call ‘head-fake questions’. A head fake is a move in American football where a player turns their head to look in one direction, but runs in another direction instead. A head-fake question is a question that pretends to ask about one thing, but in reality is asking about something else.
For instance, if you’re interviewing with a venture-funded startup, you might ask “tell me about the changes you’ve experienced at the company over the last year.”
An acceptable response to this question is a thoughtful answer about the ways the company has grown. Another good answer is “wow I can’t believe how much we’ve done/how much has changed/how far we've come when I think about it.”
A worrisome answer is “honestly, it’s about the same.” When venture-funded startups stagnate, they die, so you might want to follow up with additional questions about the recent history of the company.
This is a head-fake question because it is difficult to guess at what you’re really asking for. On the surface, you’re asking for information about life at the company. Your interviewer might think that you’re trying to get a gauge for the culture, or that you’re trying to evaluate what it’s like to work there. Or perhaps they think you want to get a feel for the growth opportunities in the organization. The reality, of course, is that you want to evaluate the growth rate of the startup, since this affects your time there (and your career) in material ways.
Similarly, when you ask questions to evaluate the data maturity of a prospective employer, you’ll want to ask questions in a way that doesn’t completely reveal your intent. Here are a few suggestions, though you should obviously tailor your questions to the specific industry you're operating in:
- “What's your data stack like?” This is an obvious question to ask, but what you're looking for are signals that tell you which level the company currently belongs to. A company at the ‘prediction’ tier might have invested some effort into hosting Jupyter notebooks on a server, for instance, whereas a company on the ‘insight’ tier would not. The worst-case scenario here is one where you are expected to manually export CSV files for your business users on a regular basis, and those business users then run their analyses on those files in Excel. A company can't be very far along the ‘reporting’ level if they're doing that.
- “How does sales or marketing evaluate their performance?” Companies are often more data-driven with sales and marketing than they are with other functions of the organization (if the company doesn’t have a sales function, replace the subject of this question with ‘customer acquisition’, that is, whichever part of the company brings in new users). A good answer here describes some sort of cross-department involvement between ‘frontend’ sales, data analytics, and ‘backend’ product/customer support. Remember, insight can’t happen without straddling multiple sources of data. If the company is evaluating its sales funnel purely inside the sales org, this means that it’s most likely at the Reporting level of data analytics, and no further.
- “What departments use the data team’s reports?” A good answer here is ‘many’, with concrete examples for each. This is relatively rare. An OK response is that sales, marketing and product do, but nobody else really looks at analytics; a bad response is that sales does it on its own, in Excel, and product doesn’t really see insight-discovery as an important process.
In practice, of course, the questions you’ll ask should be adapted to the norms of your specific industry. At Holistics, we work with companies in Asia as well as companies from the West, and we notice that Western companies in general appear more data-driven than Eastern ones. Speaking broadly, companies with low margins tend to be more reporting-driven than companies with high margins; tech companies tend to be more data-driven than non-tech ones, and companies from the US tend to be more data-driven than companies elsewhere. Your questions should reflect the base realities in your given job market.
But really, if there’s a single idea you should take away from this piece, it is that data team careers depend very much on the realities of the company you’re working for. If you want to learn data science, pick a company that does data science; if you want to feel like you’re making an impact as an analyst, pick a company that’s data-driven in its decision making. Use the three levels of data analysis to figure out which tier a prospective company belongs to, and do so before you start working for them. Godspeed, and good luck.