Challenges of Transitioning to Management in Data Teams
- Many high-performing individual contributors in data roles are promoted to management positions without adequate coaching or advising.
- Running a data team successfully is challenging and requires different skills than individual contributor work.
- There is a sentiment that many data teams fail to meet business needs, although this is not universally agreed upon.
"I find that so many data engineers, data analysts, and scientists who have never managed a team often get thrown into being a leader or director or some level of management with very little in terms of coaching and advising being provided to them."
- Many data professionals are promoted to management without the necessary support and guidance.
"Running a data team successfully is hard."
- Managing a data team is inherently difficult and requires specific skills and strategies.
Common Pitfalls in Data Teams
- Some data teams spend excessive time on infrastructure without delivering tangible results.
- The focus should be on delivering what the business needs rather than on the technology itself.
"There are a lot of data teams out there who struggle to find and deliver what the business wants."
- Many data teams fail to align their work with business objectives.
"I've seen people who spent way too much time doing what I like to call infrastructure for infrastructure's sake."
- Overemphasis on infrastructure can lead to inefficiency and failure to deliver results.
Importance of Business Alignment
- The business cares more about solutions to their problems than the specific technologies used.
- Effective communication with stakeholders should focus on high-level outcomes rather than technical details.
"The business doesn't care about technology; they don't really care about how you solve the problem."
- Business stakeholders are primarily interested in results, not the technical means to achieve them.
"Never talk about data technology, infrastructure, or queries with people outside of the data team; they just don't care."
- Discussions with non-technical stakeholders should avoid deep technical details.
Effective Handoffs and Communication
- When handing off work to another team, focus on high-level summaries and avoid unnecessary technical details.
- Clear communication about the current state, problems, and processes is crucial.
"You really should just be saying, okay, here is where we are at from a high level, here are the problems we've run into."
- High-level summaries help stakeholders understand the project's status and challenges.
"What people need is an understanding at a high level like what's working, what's not working, where does this data come from, where is it going."
- Stakeholders need a clear understanding of the data flow and any issues encountered.
Leadership and Process Setup
- Effective data team leadership involves not only technical guidance but also process setup and team management.
- Hiring the right people and establishing efficient processes are critical for success.
"I've had to step in as a head of data for companies and provide coaching and advising both in terms of what should get done as well as who should be hired and how to actually set up processes."
- Leadership in data teams requires a holistic approach, including team building and process management.
Summary
- Transitioning to a management role in data teams is challenging and often unsupported.
- Avoid over-focusing on technology; prioritize business needs and clear communication.
- Effective handoffs and high-level communication are essential for successful project management.
- Leadership in data teams requires a balance of technical guidance, process setup, and team management.
Effective Communication with Management
- Use a common vernacular to ensure clarity when discussing technical aspects with management.
- Avoid going into excessive detail unless speaking to someone with a deep technical interest, like a CTO.
- Focus on high-level outcomes and the status of projects rather than intricate technical details.
"If you're deciding to call something a data warehouse, call it a data warehouse. Be clear on what it does. If you're deciding to call it a data pipeline, call it a data pipeline. Be clear on what it does."
- Emphasizes the importance of using clear and consistent terminology.
"You don't need to be like, 'Hey, in this data pipeline, we have S3, we have all these other tools in here, go deep in depth.'"
- Advises against overwhelming management with technical specifics.
"More than likely, they should understand from a high level which component, if there's a problem, is blocking a project or maybe where you are in this project."
- Stresses the importance of high-level communication to help management understand project status and issues.
"A lot of CEOs and CTOs do understand technology... but you need to be very diligent on how you bring it up."
- Acknowledges that while some executives have technical knowledge, communication should still be strategic and clear.
Role of Data Engineers and Analysts
- Data engineers and analysts have distinct roles and should focus on their specific tasks to optimize productivity.
- Data analysts typically build solutions for specific use cases, whereas data engineers create generalized solutions that support multiple use cases.
"Data analysts, for example, are generally focused on building things that are like analysis, doing some research, building dashboards, things that have like some final point."
- Describes the primary focus of data analysts.
"If you ask that analyst to build a data pipeline, what they will build is a pipeline that just serves that one use case."
- Highlights the narrow focus of data analysts when tasked with building data pipelines.
"If you ask a data engineer to build a data pipeline, their goal, a good data engineer, will be to build a generalized table that doesn't just support this one use case but may support a dozen or so or all of these cases that your company has for that specific data set."
- Explains the broader scope and goals of data engineers in creating versatile data pipelines.
"The way I like to look at it is basically the data engineer's goal is to build your core data layer."
- Summarizes the fundamental objective of data engineers in terms of building a robust and versatile data infrastructure.
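The contrast between a single-use pipeline and a core data layer can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the order records, field names, and function names are all hypothetical. The point is that one general-purpose table keeps every column, so several downstream use cases can read from it without each building its own pipeline.

```python
from collections import defaultdict

# Hypothetical raw order events; the field names are illustrative assumptions.
RAW_ORDERS = [
    {"order_id": 1, "customer": "a", "region": "east", "amount": 120.0, "status": "paid"},
    {"order_id": 2, "customer": "b", "region": "west", "amount": 80.0, "status": "refunded"},
    {"order_id": 3, "customer": "a", "region": "east", "amount": 50.0, "status": "paid"},
]


def build_core_orders(raw_rows):
    """Core-layer step: keep every column and add one shared, cleaned flag."""
    return [{**row, "is_paid": row["status"] == "paid"} for row in raw_rows]


def revenue_by_region(core_rows):
    """Use case 1: paid revenue per region, read from the core table."""
    totals = defaultdict(float)
    for row in core_rows:
        if row["is_paid"]:
            totals[row["region"]] += row["amount"]
    return dict(totals)


def orders_per_customer(core_rows):
    """Use case 2: order counts per customer, read from the same core table."""
    counts = defaultdict(int)
    for row in core_rows:
        counts[row["customer"]] += 1
    return dict(counts)


core = build_core_orders(RAW_ORDERS)
print(revenue_by_region(core))
print(orders_per_customer(core))
```

A pipeline built for only the first use case would likely aggregate by region up front and discard the customer column, so the second question would require a new pipeline.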
Data as Infrastructure
- Data in healthcare should be treated like infrastructure, implying stability and reliability.
- Reliable data quality checks are crucial to ensure the consistency and accuracy of data.
- Analysts and analytics engineers need tools to build pipelines and perform analyses quickly.
- Leaf nodes in data structures should have minimal dependencies to maintain simplicity and reliability.
"It's almost like you treat data like infrastructure. This set of data shouldn't change much; it should be very tested, should be very reliable."
- Emphasizes the importance of data stability and reliability in healthcare.
"Analysts and analytics engineers, whoever the role is above that, need to get things done quickly. So they want to build pipelines; they want to build their own analysis."
- Highlights the need for tools that allow quick and efficient data analysis and pipeline building.
"If they ever start getting a massive amount of dependencies on it or have business-critical functionality being added to it, then it goes back into, 'Hey, let's have engineers look at this to make it reliable.'"
- Suggests a process for ensuring reliability when data structures become complex or critical.
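The hand-back rule described in the last quote could be automated with a simple check. The sketch below is an assumption-laden illustration: the dependency map, the set of business-critical assets, and the threshold of two dependents are all hypothetical, and a real team would pull this information from its own lineage or orchestration tooling.

```python
# Hypothetical map: analyst-built table -> assets that read from it.
DOWNSTREAM = {
    "adhoc_churn_query": [],
    "weekly_sales_rollup": ["exec_dashboard", "finance_report", "forecast_model"],
}

# Hypothetical set of assets the business depends on.
BUSINESS_CRITICAL = {"finance_report"}

# Assumed threshold; the source does not give a number.
MAX_DEPENDENTS = 2


def needs_engineering_review(table, downstream=DOWNSTREAM,
                             critical=BUSINESS_CRITICAL,
                             max_dependents=MAX_DEPENDENTS):
    """Flag a leaf table once it has many dependents or feeds a critical asset."""
    dependents = downstream.get(table, [])
    too_many = len(dependents) > max_dependents
    feeds_critical = any(dep in critical for dep in dependents)
    return too_many or feeds_critical


flagged = [table for table in DOWNSTREAM if needs_engineering_review(table)]
print(flagged)
```

Tables that stay true leaf nodes remain with the analysts; anything flagged becomes a candidate for engineers to rebuild as part of the tested, reliable layer.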
Data Quality and its Costs
- Poor data quality can have significant financial impacts over time.
- Small errors can escalate into major issues, affecting the reliability and trustworthiness of data.
- Consistent errors can lead to a loss of trust and strategic partnership with management.
"Data quality will cost you. If you have bad data quality, it will cost you."
- States the direct financial impact of poor data quality.
"What might seem like a small error or issue that you're like, 'Oh, we can ignore this for now,' generally will catch you somewhere down the line."
- Warns that small data errors can grow into larger, more costly problems.
"You only have so many at-bats. If you are wrong multiple times in a row, eventually management will come to you and be like, 'You no longer work here.'"
- Emphasizes the career risks associated with consistent data inaccuracies.
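One way to catch small errors before they compound is to run automated checks on every load. The following is a minimal sketch under stated assumptions: the `run_quality_checks` helper, the claim records, and the two rules (required fields present, key unique) are hypothetical examples, not a prescribed standard.

```python
def run_quality_checks(rows, required_fields, unique_key):
    """Return a list of readable problems; an empty list means every check passed."""
    problems = []
    seen_keys = set()
    for index, row in enumerate(rows):
        # Completeness: every required field must be present and non-null.
        for field in required_fields:
            if row.get(field) is None:
                problems.append(f"row {index}: missing {field}")
        # Uniqueness: the key must not repeat across rows.
        key = row.get(unique_key)
        if key is not None:
            if key in seen_keys:
                problems.append(f"row {index}: duplicate {unique_key}={key}")
            seen_keys.add(key)
    return problems


# Hypothetical sample with one null amount and one repeated claim_id.
sample = [
    {"claim_id": 1, "amount": 100.0},
    {"claim_id": 2, "amount": None},
    {"claim_id": 2, "amount": 40.0},
]

issues = run_quality_checks(sample, required_fields=["claim_id", "amount"], unique_key="claim_id")
for issue in issues:
    print(issue)
```

Failing the load, or at least alerting, whenever the list is non-empty keeps the "we can ignore this for now" errors from reaching a dashboard that management relies on.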
Less is More
- Building fewer, high-quality data pipelines and dashboards is preferable to creating numerous low-quality ones.
- Maintaining a balance between the quantity and quality of data structures is essential.
- Each data pipeline or dashboard adds complexity and potential points of failure.
"Sometimes it feels like the goal is to build thousands of data pipelines and thousands of dashboards, but I do think there is this balance."
- Suggests that the focus should be on quality over quantity in data projects.
"Every data pipeline you build, yeah, it'll run, but every once in a while something bad's going to happen in the back end."
- Acknowledges that more data pipelines increase the chances of encountering issues.
"You'd rather build less things that are accurate than build large quantities that are low quality."
- Advocates for prioritizing accuracy and reliability over the sheer number of data structures.
Simplification of Infrastructure
- Simplifying infrastructure minimizes maintenance challenges and increases efficiency.
- Overloading with numerous components and solutions can complicate management and reduce productivity.
- Facebook's infrastructure is highlighted as an example of effective simplification.
"You don't want to have every component under the sun in your system, right? PKA, Airflow, Dagster, Estuary, dbt, Coalesce...you don't want a Borg of solutions; you want to be very clear and intentional on what you're developing but also what you're developing on."
- Emphasizes the importance of being selective and intentional with the tools and components used in infrastructure.
"The simpler your infrastructure, the easier it is to maintain."
- Simplified infrastructure is easier to manage and maintain.
"Facebook has really centralized on a few components that they really believe in, and that's what they build on."
- Facebook's strategy of centralizing on a few trusted components is presented as a model of effective infrastructure management.
Understanding the Business
- Deep understanding of the business context is crucial for delivering meaningful data insights.
- Knowledge of specific industry terminology and concepts (e.g., healthcare billing codes) enhances the ability to generate valuable hypotheses and insights.
- Being knowledgeable about the business allows data professionals to become strategic partners rather than just technical executors.
"Understand what people mean when they say ICD-10 versus ICD-9 versus, you know, billing codes and things like that."
- Highlights the importance of understanding industry-specific terminology.
"Understanding that business is what gives you that ability to have conversations with the business, become more of a strategic partner with the business, and just play a bigger role."
- Understanding the business enables better communication and strategic partnership with business stakeholders.
"You are the person that's going to understand the data the most, and so if you even understand how the business works at a very base level, you can provide tons of value."
- Combining data expertise with business understanding amplifies the value that data professionals can deliver.
Transition from Individual Contributor (IC) to Management
- Transitioning from an individual contributor to a managerial or director role involves growing people and understanding the business deeply.
- Building strategic relationships with business stakeholders is key to advancing beyond technical roles.
- Focusing on leadership skills and business acumen is essential for career growth in data teams.
"Think about how you plan to transition from an IC to some sort of manager or director. How will you grow people? How will you understand the business more?"
- Encourages data professionals to plan their transition to leadership roles by focusing on people development and business understanding.
"How do you become a strategic partner and not just keep your kind of IC mentality like 'I build pipelines'?"
- Stresses the importance of evolving from a purely technical mindset to a strategic, business-oriented approach.
"What is the next level with it?"
- Challenges data professionals to think about their next steps in career progression and how they can add more strategic value.