Being resourceful means reaching out to your colleagues

How launching an internal data collection platform attracted over 20k employees and yielded 103k advanced-topic data points in six months


Role
Solo designer working with one PMT and five SDEs.
0 -> 1 product development. Product scoping for the MVP release.

Timeline
January–May 2024
MVP shipped in April 2024

Context
Imagine you are a software engineer at a company. A team has just announced that it is collecting coding-related datasets to train its ML model, and it sounds interesting to you. You want to check out this data thing, especially because AI is exciting, but you don’t have that much time to spare...

Impact
  • 25k employee sign-ups since the MVP launch in April
  • 51 collections launched between April and November 2024
  • 103k data points collected through this channel as of November 2024
  • Tooling satisfaction up 0.24, from 4.02/5 (162 responses) in April -> 4.26/5 (95 responses) in July



Driven by a strong business need


At the beginning of 2024, my organization’s leadership contacted me to address the need for data that was expensive to acquire from external vendors. Compared to these vendors, employee-generated data was deemed to be

  1. Cost-effective, as the team could determine how much to pay each contributor.
  2. Easy to source, as each person’s specialty was already indicated by their role title.
  3. Less reliant on external companies, as we could obtain the data in-house.
By contributing to these data collections, employees would be paid based on the number of items they submitted and their quality score.
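As a rough illustration of this compensation model, here is a minimal sketch; the per-item rate, the 0-to-1 quality scale, and every name in it are assumptions for illustration, not the team's actual payout formula:

```typescript
// Hypothetical payout model: contributors earn a per-item rate,
// weighted by the quality score their submissions received.
interface ContributionSummary {
  submittedItems: number; // items submitted to a collection
  qualityScore: number;   // assumed 0.0–1.0 scale from quality review
}

const RATE_PER_ITEM_USD = 2.5; // assumed rate, set per collection

function estimatePayout(c: ContributionSummary): number {
  // Pay for volume, scaled by quality.
  return c.submittedItems * RATE_PER_ITEM_USD * c.qualityScore;
}

// e.g. 40 items at a 0.9 quality score -> $90
console.log(estimatePayout({ submittedItems: 40, qualityScore: 0.9 }));
```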

The goal of the platform
The goal of the project was to create a new channel to collect skill-specific data from the company’s employees. Prioritized skills for the launch included ‘coding,’ ‘law and legal,’ ‘the arts,’ and ‘general knowledge.’

Users
  • Company’s employees, referred to as “1P experts.” These volunteers would get paid per completed item.

Key user questions
  • How can I have a seamless labeling experience?
  • How can I easily identify the next task to do? 
  • How can I find my participation rewards? 



Product scoping


Identifying the location of this product in the existing suite
As a data routing channel, this tool should not affect collection design. I had been working on the collection design tool while developing this project, and knew that this tool would have dependencies on the other.

Click here to read end-to-end documentation of the workflow creation tool.


Figure 1. Flow between this internal experts channel and other tools

Labeling at the center of interaction
The product is expected to revolve around data collection. Its end goal is to ensure that users can identify their areas of expertise, find data collections that suit their interests, complete those collections, and access statistics on their progress. Secondary functions like expertise identification and leaderboard access ensure that participants stay engaged with the platform.



Figure 2. Envisioned product end-to-end flow


The PM, the engineers, and I then scoped the MVP down to the following:

  • A profile for the user to identify the skillsets they have expertise in.
  • A landing page where the user can see all available collection options for their skillset.
  • A data collection routing page, where they would see the workflow designed via the collection design tool.
  • A leaderboard to see how they are performing against other users.



Figure 3. High-level goal of each page, connected via landing page

The MVP goal is to focus on collecting industry-specific data on a platform that users find engaging


Product tenets
  1. Quick and easy access. Launching a collection should be one click away so the experts can work on their data collection task immediately.
  2. Skillset expectation transparency. Experts will be thoroughly informed about a collection’s expected level and type of skillset before clicking into that workflow.
  3. Engaging experience. The platform must offer an engaging experience outside of labeling to keep users on the tool.

User states
The aligned MVP scope provided a simple and direct experience for users across three states: before, during, and after a labeling campaign.

A campaign is a combination of a skill and a language, such as "coding" and "U.S. English," that best fits a user's area of expertise. Each user only sees the skills they initially selected: a user who has identified "coding" as their skillset will not see "humanities" campaigns, and vice versa. Each campaign has an expiration date to keep up with scientists’ collection cycles and to keep the experts engaged.




Figure 4. User states diagram
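To make the campaign model described above concrete, here is a minimal sketch of the visibility rule, with hypothetical type and field names (the production schema is not shown in this case study):

```typescript
// A campaign pairs a skill with a language and expires on a set date.
interface Campaign {
  skill: string;     // e.g. "coding", "humanities"
  language: string;  // e.g. "U.S. English"
  expiresAt: Date;
}

interface ExpertProfile {
  skills: Set<string>; // self-identified areas of expertise
}

// A user only sees campaigns matching their selected skills,
// and only while the campaign has not yet expired.
function visibleCampaigns(
  user: ExpertProfile,
  all: Campaign[],
  now: Date = new Date(),
): Campaign[] {
  return all.filter((c) => user.skills.has(c.skill) && c.expiresAt > now);
}
```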


Initial launch & user feedback


As the product was being developed, the initial launch date was pulled forward. So in April, the team and I launched an MVP tool focusing on model-in-the-loop workflows for coding, arts, and general-knowledge challenges. The product was leaner than anticipated due to the shifted timeline.



Figure 5. Tooling landing page (left) and labeling page (right)

The landing page was deemed necessary and stayed in the UI. The workflow page itself was rendered from the design tool. Features missing from this launch included:
  • A My Profile page where the user can view their self-identified areas of expertise.
  • An in-tool leaderboard. Users could manually look up their data, but only on a dashboard hosted on a completely separate platform.

162 survey responses and a focus group
Soon after the release, users quickly pointed out various problems with the tool, the biggest being that they could not see how many items they had completed. (We were compensating users based on completion count.)



Figure 6. Synthesized survey response categories from individuals across two different dates
After acquiring the quantitative data, I also conducted a focus group with seven users to gather their feedback on the tooling. The general takeaway was that users were excited to participate in training AI models, but that features across the data collection tool and the design tool could improve the ease of interaction.


Figure 7. Labeler feedback from a seven-person focus group
The top three requests from the synthesized feedback were as follows:
  1. A dashboard to see the total response count -> to be addressed with an in-tool solution
  2. Task submission confirmation -> to be added as a behavior in the workflow tool itself
  3. Model error and timeout issues -> to be shared with the model connection team to provide a better experience for users

“It’s exciting to have an opportunity to play with our in-progress ML models. I am impressed with what our model is capable of, but wish I could check how many tasks I’ve completed directly on the portal so I don’t have to play a guessing game.” — Fernando D.


Designing & revision


After the initial unblocking of data collection, I went back to the drawing board and thought about how the team could deliver delight to users. I wanted the portal to be more stylized and engaging, so that users could recognize the branding from afar and stay on the page as long as possible.

The revised version of the tool was launched in phases after the initial release in April. As of November, we have released the core leaderboard features.

Landing page
The landing page provides an overview of all available campaigns for a user. This page is personalized per expert and ensures that the user can easily find a campaign they want to participate in.



Figure 8. Landing page

Content generation or labeling page
The labeling page renders the data collection for experts to generate valuable data. A labeling campaign can range from demonstration (content generation) to evaluation workflows across various topics.



Figure 9. Labeling page
The input content and the questions are created via the data scientists’ collection design tool (read more here) and are rendered in a frame. This ensures a 1:1 mapping between customer-created workflows and labeler-experienced workflows.
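Conceptually, the embedding could look like the sketch below. The URL pattern and function name are hypothetical; the point is that the labeling page loads the exact workflow the scientist published, keyed by its ID, rather than re-implementing it:

```typescript
// Embed the scientist-authored workflow in a frame so labelers see
// exactly what was designed in the collection design tool.
function renderWorkflowFrame(container: HTMLElement, workflowId: string): void {
  const frame = document.createElement("iframe");
  // Hypothetical URL pattern for the design tool's published workflows.
  frame.src = `/design-tool/workflows/${encodeURIComponent(workflowId)}/render`;
  frame.title = "Labeling workflow";
  frame.style.width = "100%";
  frame.style.height = "100%";
  container.appendChild(frame);
}
```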

Task submission confirmation & model errors
Two major pieces of user feedback called for a proper feedback system for 1) task submission and 2) model errors.


Figure 10. Task submission banner
Task submission is now confirmed by a dismissible banner that appears in the top-right corner every time the user clicks the “Submit” button.
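A rough sketch of that behavior follows; the selector, colors, and function name are assumptions rather than the production implementation:

```typescript
// Show a dismissible confirmation banner in the top-right corner
// every time the user submits a task.
function showSubmissionBanner(message = "Task submitted"): void {
  const banner = document.createElement("div");
  banner.textContent = message;
  banner.style.cssText =
    "position:fixed; top:16px; right:16px; padding:12px 16px;" +
    "background:#1a7f37; color:#fff; border-radius:4px;";

  const dismiss = document.createElement("button");
  dismiss.textContent = "×"; // small close control
  dismiss.onclick = () => banner.remove();
  banner.appendChild(dismiss);

  document.body.appendChild(banner);
}

// Hypothetical hook-up to the workflow's submit button.
document
  .querySelector("#submit-button")
  ?.addEventListener("click", () => showSubmissionBanner());
```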



Figure 11. Model error type breakdown
After a deep dive, I concluded that there are three types of model error behavior: resubmission due to a change of mind or near-identical model-generated answers; a single model failing to generate an answer; and all models failing to generate answers. Each scenario required a different solution.



Figure 12. Outcome per model error type
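A condensed sketch of that branching, modeled as a discriminated union; the handler outcomes below are illustrative placeholders, not the exact resolutions shown in Figure 12:

```typescript
// The three model-error behaviors identified in the deep dive.
type ModelErrorKind =
  | "resubmission"         // change of mind, or near-identical answers
  | "single-model-failed"  // one model failed to generate an answer
  | "all-models-failed";   // no model produced an answer

// Each scenario routes to a different outcome (placeholders here).
function handleModelError(kind: ModelErrorKind): string {
  switch (kind) {
    case "resubmission":
      return "Offer a regenerate action without losing the current task.";
    case "single-model-failed":
      return "Hide the failed response and continue with the remaining models.";
    case "all-models-failed":
      return "Surface an error state and let the expert retry or skip the task.";
  }
}
```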

My Profile
The profile page showcases user-specific details and allows the experts to

  • Check, add, and/or remove their expertise
  • View ongoing challenges they’ve participated in so far
  • View past challenges and whether they’ve won a reward


Figure 13. Profile page
Leaderboard
The leaderboard lets experts see how they are performing against other participants in a campaign, based on their completed contributions.


Figure 14. Leaderboard
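Because participation was tracked by completed item count, the ranking behind a leaderboard like this can be as simple as the sketch below (type and field names assumed):

```typescript
interface LeaderboardEntry {
  alias: string;          // expert's display name
  completedItems: number; // items completed within a campaign
}

// Rank experts by completed item count, highest first.
function rankLeaderboard(entries: LeaderboardEntry[]): LeaderboardEntry[] {
  return [...entries].sort((a, b) => b.completedItems - a.completedItems);
}
```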



Impact


“We have utilized the platform for experts to acquire data for our human evaluation effort. Through the platform we have been able to collect a dataset that is diverse in complexity, use-case, and subject area, which is crucial for evaluating our model performance.” — Joel C., Applied scientist
  • 2k active users per month
  • 25,000+ corporate employee sign-ups across seven areas of expertise
  • 51 collections launched between April and October 2024
  • 103k data points collected through this channel as of October 2024
  • Tooling satisfaction up from 4.02/5 (162 responses) to 4.26/5 (95 responses) between the initial release in April and July