Opinion
Show me the Data: A Guide to the Hottest Jobs in Tech
What’s the difference between a data analyst, a data scientist and a data engineer?
Data-related positions have been the hottest trend in the tech job market over the last couple of years, and with demand growing rapidly, this is not expected to change anytime soon. While everyone wants to join the party and enter this fascinating field, it is essential to first get an understanding of the various roles and responsibilities.
In this quick guide, I’ll do my best to dispel the confusion and crystallize the essence of the different positions.
Data Analyst
The main responsibility of a data analyst is to identify important business questions, and then process and use data to help organizations make more informed decisions.
This role requires a wide set of skills, primarily the ability to gather large amounts of data and organize it in order to arrive at insights. Data analysts must possess both analytical and technical capabilities, and are expected to be familiar with ETL tools, data visualization, and languages or technologies such as R, Python, SQL, and SAS.
Business Analyst
While this role is not as technical as the others on the list, business analysts play an important role in the data world as the link between the technical staff and the business side. They must have a deep understanding of their specific industry (e.g., healthcare, insurance, finance) and its business processes.
Since business analysts are the intermediaries to the business side and management, they need to be able to produce reports, have decent data visualization skills and, obviously, be top-notch communicators.
Data Engineer
Data engineers are the “builders” in the group. Some refer to them as the DevOps of the data sphere. I’ve seen companies define this role quite differently, but in my view data engineers lay the groundwork that enables other roles, such as data scientists and data analysts, to do their jobs successfully. To achieve that, data engineers are entrusted with building and maintaining the organization’s big data ecosystem, while making sure it is robust and runs smoothly.
Data engineers need to be pretty savvy about data systems such as Hadoop, Hive, MongoDB, and MySQL. They should also have hands-on experience with data streaming tools, ETL tools, and data modelling.
Data Scientist
I initially wanted to leave this one for the end, since it is the most sought-after position out there, not only in the data world but also in the wider tech community. I think it attracts so many professionals because data science, by definition, sits at the junction of three key areas: programming, statistics and business knowledge. It also involves a lot of creativity, since data scientists start from a business question and need to find the optimal path to answering it, using advanced techniques like predictive analysis. Data scientists focus on uncovering insights that would not surface without deep analysis of the data: identifying patterns, linkages and behaviors, and then working out how to use them to benefit their organization.
Data scientists are expected to be experts in statistics and math, and in programming languages such as Python, R, and Scala.
Machine Learning Engineer
This is another in-demand role, which has some overlap with data engineering and data science. Machine learning engineers are in charge of bridging the gap between the data scientist’s work and the technology that delivers its benefits to production or to the service of the organization. They do so by building data pipelines, training models, moving them to production, exposing APIs, and performing A/B testing.
Machine learning engineers need in-depth knowledge of various machine learning libraries (for example, TensorFlow and NLTK), coding experience, and strong knowledge of SQL, REST APIs, and other complementary technologies.
Business Intelligence Developer
While most of the focus during the last couple of years has shifted towards artificial intelligence, we must not forget the importance of business intelligence. Both AI and BI are key to successful decision making in modern organizations.
Business intelligence developers are usually assigned to develop and maintain BI interfaces: data visualization and dashboards, reporting, and query tools. They require skills in SQL, a deep understanding of OLAP and ETL, and experience with BI systems such as Power BI or Qlik Sense.
Database Administrator (DBA)
This role is the veteran on the list. The DBA has the critical role of setting up and maintaining the organization’s database. Being responsible for the health of that database, the DBA is basically in charge of one of the firm’s most valuable assets. The DBA’s activities include managing access to the database, planning backup, archiving and recovery routines, planning and executing installations and upgrades, monitoring the database, and optimizing its performance.
ETL Developer
In a nutshell, the ETL developer is in charge of the process of transferring data from a source database to a target database, including monitoring and testing the performance of the process and fixing it when needed. In large-scale systems, this process occurs very frequently, which makes it a crucial role. ETL developers must have experience with ETL tools (such as Talend, Informatica, and DataStage), SQL, scripting languages, and modelling tools.
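For readers who have never seen such a process, here is a minimal sketch of the extract, transform and load steps in Python, using only the built-in sqlite3 module. Everything in it is an assumption made for illustration: the `orders` and `orders_clean` tables, the sample rows, and the `run_etl` helper are hypothetical, and a real pipeline would typically be built with a dedicated ETL tool and run against production databases.

```python
import sqlite3


def run_etl(source, target):
    """Copy rows from a source DB to a target DB (hypothetical schema)."""
    # Extract: read the raw rows from the source database.
    rows = source.execute(
        "SELECT id, customer, amount_cents FROM orders"
    ).fetchall()

    # Transform: tidy up names, convert cents to currency units,
    # and skip rows that have no amount.
    cleaned = [
        (order_id, name.strip().title(), cents / 100)
        for order_id, name, cents in rows
        if cents is not None
    ]

    # Load: write the cleaned rows into the target database.
    target.execute(
        "CREATE TABLE IF NOT EXISTS orders_clean "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    target.executemany(
        "INSERT OR REPLACE INTO orders_clean VALUES (?, ?, ?)", cleaned
    )
    target.commit()

    # Return simple counts so the run can be monitored.
    return len(rows), len(cleaned)


# Two in-memory databases stand in for the real source and target.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders (id INTEGER, customer TEXT, amount_cents INTEGER)"
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, " ada lovelace ", 1999), (2, "GRACE HOPPER", 500), (3, "no amount", None)],
)
target = sqlite3.connect(":memory:")

extracted, loaded = run_etl(source, target)
print(f"extracted {extracted} rows, loaded {loaded}")
```

The gap between the two counts is the kind of signal an ETL developer watches for when monitoring and testing the process.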
Data Architect
I consider this role and the following one to be the glue of the team. The data architect is basically the technical adhesive, leading all architectural activities. That includes creating blueprints and design documents that specify database flows and integration points, and evaluating and approving the proper tools for the engineers to deploy and use. The data architect should also act as a “gatekeeper,” making sure the organization’s data vision is enforced, with the appropriate security measures.
The way I see it, data architects must be jacks of all trades. This means having in-depth knowledge of data technologies and best practices, and keeping up to date with the latest advances.
Data Product Owner
Data product owners are responsible for leading the data strategy of the organization and overseeing the product portfolio in terms of leveraging data and alignment to the vision.
A data product owner is, well, a product owner. In general, the product owner defines the roadmap, collaborates with internal and external stakeholders to make sure it is moving forward, and functions as the “glue of the project.” The data product owner is also in charge of making sure the organization maximizes the value of data to achieve optimal business results. In some cases, this means influencing senior management by presenting the benefits of leveraging data.
Data talent is no longer headhunted exclusively by tech companies. These days, most companies already understand the power of data and its importance to the growth of the organization. Keep in mind that companies may vary in how they define and scope the roles described above.
Alon Mei-raz is the vice president of Data & Insights at Bank Hapoalim, heading all data platforms and groups which supply insights to support over two million customers.