Zippi: Empowering Micro Entrepreneurs Through Machine Learning

Valeria Gomes is the Data Lead at Zippi. Brazil-based Zippi’s mission is to provide affordable and accessible financial services to over 30 million micro entrepreneurs, many of whom face unique challenges accessing credit from traditional banks. We spoke to Valeria about Zippi’s origin story, why machine learning (ML) is central to the company’s efforts, and best practices for building an ML practice in fintech.

Please briefly introduce yourself, covering your background and role at Zippi.

Valeria Gomes: As the Data Lead at Zippi, I bring over eight years of experience in the fintech industry. Half of that time was spent at Nubank, the largest fintech bank in Latin America. During my tenure there, I contributed to a wide range of projects from credit and investments to anti-money laundering and customer segmentation. At Zippi, I’m now focused on leveraging data and machine learning to empower micro-entrepreneurs in Brazil to improve their financial lives.

I’m especially glad to have met Ludmila Pontremolez (Co-Founder at Zippi) at university, where we became good friends. We always like to mention that we were roommates while studying engineering at ITA, which is widely regarded as the most prestigious engineering university in Brazil.

I believe strongly in the power of data to drive business value, and I’m passionate about working collaboratively with our product and business teams to develop models that provide tangible benefits to our customers. Beyond my technical role, I’m deeply committed to supporting the growth and success of our team at Zippi. I’m thrilled to be part of a company that prioritizes people development and am excited to help build a world-class data team.

Can you tell us a little bit more about Zippi’s origins, vision, and how it stands out – both as a fintech company and with its micro entrepreneur customers?

Zippi was founded in 2019 by MIT alumni Andre Bernardes and Bruno Lucas along with Ludmila Pontremolez, a technologist working in Silicon Valley at the time. Zippi was born from a shared goal to make a positive impact on the lives of entrepreneurs in their home country of Brazil.

Zippi exists to provide affordable and accessible financial services to the 30 million micro entrepreneurs in Brazil who usually face challenges trying to receive credit from traditional banks because their businesses are informal and hard to track. We take a customer-centric approach and are committed to providing a seamless and user-friendly experience. Our team is made up of experts in finance, technology, and entrepreneurship who are passionate about creating a positive impact on the lives of micro entrepreneurs.

What sets Zippi apart is our innovative use of technology, including machine learning and artificial intelligence, to develop customized solutions for our customers and help them achieve their business goals. We’re dedicated to helping micro entrepreneurs in Brazil succeed by providing them with the financial tools they need to thrive.

What inspires you about Zippi’s mission?

What inspires me is our commitment to providing affordable and accessible financial services to micro entrepreneurs in Brazil. Zippi’s approach to leveraging cutting-edge technology and best practices in the market to create customized solutions is truly unique. Moreover, the company has a highly talented team with many different areas of expertise, driven by a shared passion for making a positive impact on the lives of micro entrepreneurs. We also enjoy a safe and fun space that includes every perspective and moves fast. I am thrilled to be part of this team and am eager to contribute my experience and knowledge to help drive the company’s success.

What are Zippi’s primary machine learning use cases?

Our main ML focus is credit, so we use machine learning models to assess credit risk, price sensitivity, and limit sensitivity. We leverage data science to define the most affordable and customized credit solutions for our customers, meeting their cash flow needs without compromising their payment capabilities. Ultimately, we believe responsible credit is a driver of better decisions and higher chances of success for our customers.

More broadly, we are a data-driven company that uses data and analysis to drive both marketing and product decisions. We are passionate about targeting the right customers and offering products that will improve their financial decisions over time. We run many experiments to validate (or refute) our hypotheses about how customers will behave or which types of customers will be more likely to use our products. Our goal is to help our customers succeed financially, and we use data to make that vision come true.

How do you view the evolving MLOps and ML infrastructure space?

As machine learning becomes increasingly prevalent in business, it’s important to consider how to operationalize and scale these models. MLOps and ML infrastructure play a crucial role in achieving this goal. As a small startup, we rely on SaaS solutions and cloud computing to take advantage of state-of-the-art technologies while keeping a lean team. By leveraging cloud-native ML infrastructure and partnering with cutting-edge providers like Arize, we ensure that we stay ahead of the curve and continue to use machine learning effectively, while keeping our core team focused on solutions that are tailored to our customers.

What are some of the challenges you deal with once models are deployed into production – and why are model monitoring and observability important?

Model monitoring and observability are critical for addressing challenges that arise when deploying machine learning models into production. Using machine learning in credit means you have automated your decision process, and not catching errors fast means losing money fast.

Once models are deployed into production, there is a risk that their performance may deteriorate over time. This can be due to changes in the underlying data distribution or changes in the business environment. For example, for the credit assessment use case, a model that performs well today may lose accuracy as we add new acquisition channels. It’s critical that we detect these shifts as soon as possible, so we can make adjustments to policies and models, minimizing loss in the process.
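
To make the idea of detecting a shift concrete, here is a minimal sketch of one commonly used drift measure, the population stability index (PSI), which compares the distribution of model scores in a baseline sample with a recent one. It is an illustration only, not Zippi’s production code: the function name, the equal-width binning, and the sample data are assumptions made for the example.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a baseline sample and a recent sample of scores.

    Bins are equal-width over the baseline range (an assumption of this
    sketch); values outside that range fall into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty buckets at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: baseline scores, and the same scores pushed upward,
# as might happen when a new acquisition channel brings different customers.
baseline = [i / 100 for i in range(100)]
shifted = [min(0.99, x + 0.3) for x in baseline]

psi_same = population_stability_index(baseline, baseline)
psi_shift = population_stability_index(baseline, shifted)
```

A PSI of zero means the two samples fall into the buckets in identical proportions. A frequently cited rule of thumb treats values above roughly 0.25 as a significant shift, although the right alert threshold depends on the model and the business.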

Data quality issues can arise when data sources change or when data is incomplete or incorrect. It is important to monitor data quality and address any issues to ensure that models are working with the best possible data. In one of my previous roles, a data provider changed the definition of one important feature for the credit model that was used to make decisions in real time, so every second counted. Having good monitoring helped us to deploy mitigation plans until the issue was solved.
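
A simple data quality check of the kind described here might look like the sketch below. It flags a feature when too many values are missing or fall outside an expected range, which is one way a silent change in a provider’s definition (for example, monthly figures becoming annual) can surface. The function name, thresholds, and sample values are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def check_feature(
    values: Sequence[Optional[float]],
    expected_min: float,
    expected_max: float,
    max_missing_rate: float = 0.05,
    max_out_of_range_rate: float = 0.01,
) -> List[str]:
    """Return human-readable issues found in one batch of a feature."""
    n = len(values)
    if n == 0:
        return ["no data received"]

    issues: List[str] = []
    present = [v for v in values if v is not None]

    missing_rate = 1 - len(present) / n
    if missing_rate > max_missing_rate:
        issues.append(f"missing rate {missing_rate:.1%} exceeds limit")

    out_of_range = sum(
        1 for v in present if not expected_min <= v <= expected_max
    )
    out_rate = out_of_range / n
    if out_rate > max_out_of_range_rate:
        issues.append(f"out-of-range rate {out_rate:.1%} exceeds limit")

    return issues


# Hypothetical monthly income feature, expected to lie in [0, 20000].
normal_batch = [1000.0, 2500.0, 1800.0, 3200.0]
# Same customers after the provider silently switches to annual income.
changed_batch = [v * 12 for v in normal_batch]

ok = check_feature(normal_batch, 0, 20000)
alerts = check_feature(changed_batch, 0, 20000)
```

Here `ok` is empty while `alerts` contains one out-of-range warning, because three of the four annualized values exceed the expected maximum.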

Hidden feedback loops are also a potential challenge. When models are deployed in a live system, they can start to influence the system they are predicting, leading to the model predicting past decisions instead of current real-world events. This can result in inaccurate predictions and poor business outcomes. To avoid this issue, randomized experiments, debiasing techniques, and continuous monitoring can be used. One classic example in credit is limit assignment: if your lines are conservative and you try to predict how much customers are going to use and spend, your system may end up predicting your past limit decisions, because those limits constrain the utilization potential.
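
The limit assignment example can be sketched in a few lines. Observed spend is the smaller of what a customer wanted to spend and the limit they were given, so for customers who hit their limit the data records the past decision rather than the underlying demand. The code below is an illustrative assumption, not a description of any production system: the function name, the 95% threshold, and the numbers are made up for the example.

```python
from typing import Sequence


def censored_share(
    spend: Sequence[float],
    limits: Sequence[float],
    threshold: float = 0.95,
) -> float:
    """Share of customers whose spend is at or near their credit limit.

    For these customers the observed spend mostly reflects the limit that
    was assigned, not how much they would have spent without it.
    """
    if len(spend) != len(limits):
        raise ValueError("spend and limits must have the same length")
    if not spend:
        return 0.0
    capped = sum(
        1 for s, l in zip(spend, limits) if l > 0 and s >= threshold * l
    )
    return capped / len(spend)


# Hypothetical customers: what they would like to spend vs. a flat,
# conservative limit assigned by a past policy.
true_demand = [500.0, 2000.0, 800.0, 3000.0]
limits = [1000.0, 1000.0, 1000.0, 1000.0]

# What the data actually records: demand is cut off at the limit.
observed = [min(d, l) for d, l in zip(true_demand, limits)]
share = censored_share(observed, limits)
```

In this toy data half of the customers are capped at exactly 1000, so a model trained on `observed` would never see spend above the old limit. Tracking a share like this over time is one possible monitoring signal, and granting higher limits to a small random group is one way to observe demand that the existing policy hides.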

Why did you select Arize as your model monitoring and ML observability partner?

Zippi’s data team articulated the need for model monitoring, and they identified Arize as the best solution and drove its adoption within Zippi. The team evaluated several options and ultimately chose Arize due to its strong support, effective onboarding process, and commitment to helping us scale up our skills to consistently leverage the tool. Arize will help us further improve our models’ performance, ensure the quality of our data, and maintain fairness and transparency in our machine learning processes.

The team prioritized this work and leveraged our culture of high trust and autonomy in pursuing a solution to enhance our monitoring infrastructure, and we are excited about the decisions we can expect from them going forward. We are fortunate to have such a talented and committed group leading our efforts in this critical area of our business.

The selection and rollout of Arize was one of the moments that made me most proud of our data team, and the kind of moment that I find most rewarding professionally. They were dedicated and driven, and relentlessly worked to improve their own practice. Tools like Arize can be powerful, but the dedication and attitude of the people using the tools will always be the real value driver.

What was your process for making that selection?

During the decision-making process, the Zippi data team wanted to evaluate Arize’s monitoring tool to see how it could help them improve their machine learning processes. They leveraged Arize’s trial to experiment using their own data in a real-world scenario. This proved to be essential in understanding how the tool could help them draw conclusions more quickly and effectively, and clarify the value brought by the solution.

By using their own data, the team was able to see the benefits that Arize brings to their monitoring and observability processes. The trial period was a crucial step in their decision-making process, and it helped them confidently move forward with adopting Arize as their primary monitoring tool.

Why is it important to get to the root cause of model performance issues quickly?

Getting to the root cause of model performance issues quickly is essential to speeding up decision-making, reducing business risk, and increasing trust in models. Models are often used to support critical business decisions, and delays in decision-making can be costly. By quickly identifying and resolving performance issues, businesses can make faster, more informed decisions that are based on the most accurate data available.

How do you collaborate with business and product leads and ensure models are delivering business results?

Collaboration with business and product leaders is essential to ensuring that machine learning models deliver results. By aligning models with business goals, incorporating feedback, and leveraging each other’s strengths, we can create value and drive success. Establishing a two-way partnership with business teams is critical to mutual understanding of needs and delivering optimal solutions to our customers and stakeholders.

Success metrics should be defined and agreed upon by both the data team and business/product leads. This ensures that everyone is aligned on the expected outcomes and how they will be measured. Regular communication is also critical to ensuring that machine learning models are delivering business results, especially in remote environments. By keeping business and product leads informed of model performance, changes, and improvements, we can ensure that they have the most up-to-date information to make informed decisions.

Sharing knowledge between the data team and business/product leads is essential to ensuring that everyone understands the context and can make informed decisions. We work with business and product leads to understand their domain expertise and incorporate it into the machine learning models. We also provide education and training on machine learning concepts and techniques, so business and product leads understand how the models work and what they can and cannot do.

Business and product leads have a deep understanding of the organization’s needs and priorities. We leverage this knowledge to estimate the potential impact of machine learning models and prioritize development efforts accordingly. By working with business and product leads to prioritize machine learning initiatives, we can ensure that the models are delivering the most value to the organization.

Evaluating machine learning models should be an ongoing process. We work closely with business and product leads to evaluate model performance against success metrics and make any necessary adjustments to improve results.

Are you hiring?

The data team has open roles for machine learning practitioners, data scientists, and analytics engineers. Each role has its nuances, but in general we are looking for strong analytical thinking, problem-solving skills, business acumen, high autonomy, flexibility, and collaboration. The data team works closely with our business and product teams to design and implement experiments, analyze data, and develop models to support data-driven decision-making. We are looking for people passionate about data who are capable of delivering high-quality insights. For those interested in joining our team, please reach out!

What do you look for in data science hires – and what’s one business or technical interview question you always ask?

When we hire data scientists, we prioritize candidates with a strong interest in statistics and machine learning, as well as experience working with real-world data. In addition to technical skills, we value candidates who combine critical thinking, business acumen, a track record of delivering high-quality insights, and the ability to communicate effectively with both technical and non-technical stakeholders.

One question we often ask in interviews is for the candidate to walk us through one of the most challenging problems they have had to solve, and to reflect on what they would do differently if they had to solve the problem again. This question assesses the candidate’s problem-solving ability, capacity for self-reflection and continuous improvement, and communication skills.

We believe that effective communication is critical to success in data science. Therefore, we look for candidates who can explain complex problems and insights in a clear and concise manner to both technical and non-technical stakeholders.