A data platform engineer’s very bad day and what they learned from it
By: Arne Lapõnin & Daniel Fratte
Think about Arne, a data platform engineer working with around 10 other people in a team that’s responsible for providing platform solutions for data consumers. Apart from Arne’s Consumption team, there’s another data platform team in the same business unit (BU), which provides solutions for the teams that handle data ingestion. That team is called the Ingestion team.
Arne has had a particularly bad day. He is trying to figure out what could have been the cause of it.

His day started with a sprint planning meeting, where the team was deciding which stories from an estimated backlog they were going to pick for the next sprint. During the meeting, the BU’s Architect says that the Ingestion team needs some help with a proof of concept they are running for an authorization tool. The tool should make it really easy for the company to restrict the access of data consumers to data sets based on security rules. The Architect tells the team that all they need to do is spin up a cluster for the Ingestion team so that they can validate that the authorization tool works correctly with the query engine the data consumers would use to access the data. The team includes the task in the sprint with the assumption that it’s the highest-priority item, because the Ingestion team has limited time to complete their proof of concept. All the developers assume that it’s just one team helping out the other, and they don’t think much about it.
After the sprint started, Arne picked up the high-priority task. He contacted a developer from the Ingestion team to double-check what parameters they needed for the cluster. The Ingestion dev was surprised to hear from him. They didn’t know that the Consumption team would be providing them with the cluster. The Ingestion dev was under the impression that Consumption would be validating how data consumers interact with the authorization tool.
Arne was confused. He had no idea what he was supposed to validate, since before the sprint planning meeting he had been told that the authorization tool’s proof of concept was none of his concern; the Ingestion team would be taking care of it.
Later that day, Arne was in a meeting with the Ingestion dev and the Rep, a representative of the authorization tool company. The goal of the meeting was to talk about the configuration needed for the authorization tool to work with the query engine on the cluster that the Consumption team uses. Arne had prepared a bit by reading the documentation on how to install the authorization tool; it all seemed straightforward to him. He didn’t really understand what he was supposed to say or do in the meeting, since just a couple of hours earlier the Ingestion dev had shown him the tool running in a cluster.
The Rep started the meeting by asking to see the cluster up and running so that they could double-check everything. The Ingestion dev looked at Arne with the expectation that he would start the cluster up and share his screen to show it. Arne was puzzled and asked the Ingestion dev to do it, since he knew they had a cluster while he had nothing. The Ingestion dev was also confused but started spinning up the cluster. They messaged Arne that they had thought Consumption would be driving this meeting and showing the cluster at work. Arne felt utterly perplexed about who was responsible for what and what they were supposed to achieve in that meeting.
The next day, Arne has a call with his mentor and tells them about the bad day he had. The Mentor asks him to dig deeper into what he thinks was the source of all the confusion. Arne starts thinking about how the Architect brought in the last-minute task with scarce details. He realizes that he had no idea what the goal of the authorization tool’s proof of concept was.
The Mentor asks him whether getting the business context of the proof of concept at the start of the sprint would have helped. Arne nods along enthusiastically, but the Mentor is a bit skeptical and asks him to think more about it. Arne starts wondering whether knowing more about how the company planned to use the authorization tool would really have helped him perform his task. The bigger problem was that he lacked context on how his task was connected to other tasks in the other team’s backlog. Arne starts realizing that he would have had a clearer view of the dependencies had he been involved in the planning of the proof of concept. Since the authorization tool is part of the whole data flow, both teams should have been involved.
The Mentor then asks him whether it would have been sensible to have two teams planning this, given the number of people involved. Even if both teams had been involved, where would it have stopped? The authorization tool isn’t the only project the platform teams have that touches many aspects of the data flow. Arne starts realizing that the vertical split between the teams has really hurt them. It has been increasing the cognitive load of the team and making it difficult to focus on developing holistic solutions for the data engineering teams.
Disclaimer: the statements and opinions expressed in this article are those of the authors and do not necessarily reflect the stances of Thoughtworks.