As product leaders over the past 15 years, we’ve experienced firsthand, over and over, how hard it can be for organizations to effectively evaluate feature releases and, most importantly, act on the results. We’ve seen these issues in startups as well as in post-IPO companies.
Why is that?
There are three main issues:
- Features haven’t been treated as first-class citizens. Product analytics tools don’t have a concept of a feature, which makes feature evaluation a custom undertaking, leading to inconsistency and added workload.
- The lack of a framework leads to bespoke definitions and dashboards, making it difficult to trust, compare, and share results across the organization.
- Customer satisfaction isn’t accounted for when looking only at, e.g., retention metrics. Without qualitative feedback we’re missing half of the story, and we end up focusing only on what we measure.
Our goal with this framework is to establish a common baseline for measuring feature engagement and satisfaction that product teams can use to make better decisions, consistently.
The framework is open source and contributions are welcome! We hope you find it useful.
STARS definition #
STARS - Segment, Tried, Adopted, Retained, Satisfied - is a funnel that measures the engagement and satisfaction of a feature. The framework is based on quantitative engagement data and qualitative satisfaction scoring.
Segment #
TL;DR Define the feature’s target audience
Not all features are relevant for all users. When writing a feature specification, product managers are attempting to solve a recurring problem for a specific segment of accounts.
For example, a new onboarding feature might only be relevant for new accounts. On the other hand, a data export feature might only be available for Enterprise customers, which may be less than 1% of the total accounts.
In this step, we clearly define the feature’s target audience so that the results are only measured against those accounts. This ensures that the results are as actionable as possible and not skewed by non-target accounts.
Having a target audience also enables comparable rates across features. Since each feature is filtered down to targeted accounts only, we can compare engagement rates across features effectively.
Typically, segments are defined based on account traits or historical actions.
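To make the Segment step concrete, here is a minimal sketch in Python. The `Account` fields and the Enterprise predicate are illustrative assumptions, not part of the framework; a segment is simply a filter over accounts, and all later STARS rates are computed against the filtered list.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Account:
    id: str
    plan: str        # account trait, e.g. "enterprise" (hypothetical field)
    signed_up: date  # useful for "new accounts" style segments

def enterprise_segment(accounts):
    """Target audience for a hypothetical data-export feature:
    Enterprise accounts only. Engagement rates for the feature are
    measured against this list, not the full account base."""
    return [a for a in accounts if a.plan == "enterprise"]

accounts = [
    Account("a1", "enterprise", date(2021, 3, 1)),
    Account("a2", "free", date(2023, 5, 2)),
]
segment = enterprise_segment(accounts)
print([a.id for a in segment])  # -> ['a1']
```

Because every feature carries its own segment predicate, the same filtering code can be reused across features while the predicate itself stays feature-specific.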
Tried #
TL;DR Measure feature awareness
The purpose of the Tried step is to figure out if accounts are aware of the feature or not.
When releasing a new feature, product marketing or in-app guides will nudge accounts to try out the new feature.
Besides measuring awareness, the purpose of this step is to highlight accounts that are aware of the feature but never got beyond the feature activation threshold (more on that in Adopted).
If Tried is low, it’s a clear indicator that either the feature hasn’t been communicated well enough to its audience, or that the audience knows the feature exists but hasn’t yet found it appealing enough to try out.
This step is therefore useful for both product teams and product marketing teams.
Adopted #
TL;DR Measure feature activation
Account activation is a common product metric. Features have an analogous metric: feature activation.
The purpose of this step is to measure how many accounts have used the feature enough to count as having adopted it, filtering out the accounts that only interacted with it a few times.
When a new feature is released, accounts might try it out simply because it’s new and they’re curious what it looks like. However, such accounts haven’t committed to using the feature, and we therefore want to filter them out before later measuring feature retention and churn. Those metrics should only be measured against accounts that are truly using the feature.
To enable such filtering, the Adopted step typically sets an activation threshold. For example, a threshold could be that accounts need to have used the feature N times before they reach Adopted, and/or that they need to have used it for at least N weeks.
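A sketch of such a threshold check, assuming usage is tracked as a list of dates per account. The `min_uses` and `min_weeks` values are arbitrary examples meant to be tuned per feature, not prescribed by the framework.

```python
from datetime import date

def has_adopted(usage_dates, min_uses=5, min_weeks=2):
    """Illustrative activation threshold: an account counts as Adopted
    once it has used the feature at least `min_uses` times, spread over
    at least `min_weeks` distinct ISO weeks. Curious one-off visitors
    fail one or both conditions and stay below the threshold."""
    if len(usage_dates) < min_uses:
        return False
    # (ISO year, ISO week) pairs identify distinct calendar weeks
    weeks = {(d.isocalendar()[0], d.isocalendar()[1]) for d in usage_dates}
    return len(weeks) >= min_weeks

curious = [date(2024, 1, 2)] * 3                     # poked at it once
committed = [date(2024, 1, 2), date(2024, 1, 3),
             date(2024, 1, 9), date(2024, 1, 10),
             date(2024, 1, 16)]
print(has_adopted(curious), has_adopted(committed))  # False True
```

The same function can be parameterized differently per feature, which keeps the filtering logic consistent while the thresholds stay feature-specific.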
This step is a good early proxy for feature success. Since features are measured against a target Segment, we want to see a majority of accounts in Adopted.
If Adopted is lower than that, it indicates a discrepancy between the user problem and the implemented solution: either the solution isn’t good enough, or the problem isn’t significant enough for users to spend time on it (or the adoption threshold isn’t appropriate for the given feature).
For features with higher-than-expected adoption, it’s an opportunity to double down, as you’ve introduced a feature that is seemingly of high interest to your users.
Retained #
TL;DR Measure feature retention
This is the critical step of the funnel. It shows how many accounts from Adopted keep using the feature over time. These are the accounts that are actively using your feature.
In SaaS products, most key features should likely be used at least once every subscription cycle, which typically means monthly, and ideally every week.
However, some features aren’t meant to be used that frequently. For example, an accounting export functionality is a key feature that may only be used once a quarter when the quarterly finances are due. Even though this feature is a lifesaver to the finance team, it’s used fairly infrequently.
Defining the right retention period per feature is critical to measuring Retained correctly.
For newly released features that are meant to be used weekly, you may want to lower the retention period in the first couple of weeks so that you can keep track of early churners. Those churned accounts are great candidates for in-depth discovery calls about why the feature was interesting enough to adopt, but not to keep using after a short time.
When accounts churn from a feature, meaning they haven’t used it within the retention period, they are no longer counted as part of the Retained group (but are still counted in the Adopted group).
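In code, the Retained check reduces to comparing an account’s last feature use against the feature’s retention period. A minimal sketch; the 30-day default stands in for a monthly subscription cycle and is an assumption, not a prescribed value:

```python
from datetime import date, timedelta

def is_retained(last_used, today, retention_period_days=30):
    """An account stays in Retained while its most recent feature use
    falls within the feature's retention period. Once it falls outside,
    the account has churned from the feature, but it remains counted
    in the Adopted group."""
    return (today - last_used) <= timedelta(days=retention_period_days)

today = date(2024, 6, 1)
print(is_retained(date(2024, 5, 20), today))  # True  -> still Retained
print(is_retained(date(2024, 3, 1), today))   # False -> churned from feature
```

An infrequently used feature like a quarterly accounting export would simply get a longer period, e.g. `retention_period_days=90`.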
Satisfied #
TL;DR Measure satisfaction of retained accounts
Long-term retention is a strong indicator of a successful feature. However, to know for sure, we must go beyond the metrics and ask the customers whether they are also satisfied with the feature.
Sometimes accounts use features because they have to and not because they like it. Maybe they’re forced to use it at work, are on a lengthy subscription or just haven’t had the time to switch to a competitor yet.
We want to learn about such accounts as quickly as possible so we can address their dissatisfaction and retain them.
In the Satisfied step, we ask the Retained accounts how satisfied they are with the feature. Feature satisfaction can be collected with a scoring framework like CES or CSAT. With CSAT, the customer provides a score from 1 to 5, where 1 is very dissatisfied and 5 is very satisfied.
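With CSAT, responses are commonly aggregated as the share of respondents answering 4 or 5. A small sketch; this aggregation convention is a common CSAT practice, not something mandated by STARS:

```python
def csat_score(responses):
    """Share of Retained accounts answering 4 or 5 on the 1-5 scale,
    expressed as a percentage. An empty response list yields 0.0."""
    if not responses:
        return 0.0
    satisfied = sum(1 for r in responses if r >= 4)
    return round(100 * satisfied / len(responses), 1)

print(csat_score([5, 4, 3, 5, 2, 4]))  # -> 66.7
```

Keeping the individual 1-2 responses around (rather than only the aggregate) is what makes the follow-up conversations with dissatisfied accounts possible.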
Catching low satisfaction scores from Retained accounts will provide you with incredibly valuable information ahead of potential churn. Asking users in Retained is the best way to get actionable feedback as those are the users trying to solve a specific problem with the respective feature on a recurring basis.
The Satisfied step is a powerful final step in STARS, as it ensures that the entire organization thinks about feature success in a consistent way. For example, drip campaigns could trick churning accounts back into a feature, which looks good in the metrics but will most likely result in a very negative satisfaction score, and thus a low STARS feature score. By combining engagement data and customer satisfaction, we ensure that everyone is working towards increasing the Satisfied score, as opposed to optimizing just parts of the funnel.
Who can use this? #
First and foremost, this framework is designed for SaaS products (B2C/B2B) where customer retention is a key metric.
While this framework is primarily designed for product teams, we believe the output results are highly useful for several functions in the organization.
Product manager #
As initiators of most features, product managers need to make sure that the features they put on the roadmap also see sufficient customer satisfaction upon release. Are the features moving the needle? Of course, not all features need to have a business impact, but they need to fulfill a purpose and be used sufficiently.
Product managers also constantly need to think about how to best use engineering resources: should they build new features, iterate on recent releases that received negative feedback, or remove unsuccessful features?
CPO #
The CPO needs to zoom out and gauge the overall health and trajectory of key strategic features: understand the feedback coming from customers - quantitative and qualitative - as well as track the performance of product teams. Which teams are releasing impactful features, and which are struggling? The customer feedback can help get teams back on track going forward.
Engineering #
Engineering resources are always in short supply. Feature success can only truly be measured when you factor in the feature’s delivery and maintenance cost. Engineering and product should work together on evaluating each feature’s impact versus its cost and overhead. This way, features that have a high monthly overhead but insignificant impact can be removed, and resources can be allocated where they matter more.
As an engineer, it can sometimes feel like sitting at the bottom of a conveyor belt, never really getting any customer feedback on the features you write and deploy. Increasing transparency between product, customer feedback, and engineering can increase productivity and a sense of purpose. In our experience, most engineering teams love getting feedback on their work - negative and positive. We’ve seen many times that a certain bug that had been lying around forever suddenly got fixed overnight because the engineer was (finally) exposed to the customer’s pain directly.
ProductOps #
In organizations with a ProductOps role, this framework can help streamline the way teams are gathering data and evaluating features. Creating a common language and workflow across product teams, with the right information at the right time in the same framework, is key to the success of ProductOps, and for the product organization as a whole.
Product Marketing #
Product marketing needs to raise awareness of new features. The Tried step in the STARS framework will inform product marketing if the messaging has resonated with customers or not. If the Tried count is low, there’s a discrepancy between what the features do and what the customers are interested in. Either, the feature isn’t useful for the audience or its messaging needs to be tweaked.
Why do we need this? #
Lack of common language #
One of the main causes of frustration when it comes to feature evaluation is data trust. As an industry, we’ve established a rather consistent way of measuring key product metrics, such as Monthly Active Users.
But when it comes to the individual features that make up the product, we haven’t established a common language yet. In fact, most product analytics tools don’t even have a concept of a feature.
This often leads to ad-hoc solutions by product managers and data analysts when evaluating a feature. Since feature engagement data is typically based on event tracking, they will usually visualize feature engagement by charting a distinct event on a time series chart, aggregated by certain filters.
Each team may have come up with a good and consistent way of doing so, but it’s often different per team, which makes comparison across features very difficult and undermines trust in the data.
Feature comparison is an important aspect of feature evaluation as measuring features individually can be tricky: Is 80% adoption good? Why not 85%? Comparing key features will clearly show which features are the main drivers of adoption and retention in the product, and which aren’t.
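To illustrate why comparison works, computing each feature’s adoption rate against its own target segment puts the numbers on the same scale. The feature names and counts below are made-up examples:

```python
def adoption_rate(adopted_count, segment_size):
    """Adoption rate relative to the feature's target segment. Measuring
    against the segment, not the whole account base, is what makes rates
    comparable across features with very different audiences."""
    return round(100 * adopted_count / segment_size, 1)

# Hypothetical numbers: each feature measured against its own segment
features = {
    "export":     adoption_rate(42, 60),      # Enterprise-only segment
    "onboarding": adoption_rate(300, 1200),   # new-accounts segment
    "search":     adoption_rate(850, 1000),   # all active accounts
}
print(features)  # -> {'export': 70.0, 'onboarding': 25.0, 'search': 85.0}
```

Ranking these rates side by side shows which features drive adoption, which is more actionable than asking whether any single rate is "good" in isolation.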
Missing alignment without qualitative feedback #
It’s well established that you need both engagement metrics and qualitative insights to truly evaluate the success of a feature. For example, a feature might have decent retention, but are the customers satisfied using it, or do they just have to use it because of their workplace? Such insights are incredibly important and can go undiscovered if customer satisfaction isn’t factored into the evaluation. If customers are dissatisfied with key features, they are a churn risk, and the feature metrics will conceal this fact until it’s too late.
Feature evaluation should therefore be a combination of engagement metrics and customer satisfaction. However, the engagement metrics and the qualitative insights often live in isolated silos and aren’t looked at in combination. If they aren’t combined, some teams might be hard at work on bumping the adoption and retention metrics not knowing that drip campaigns to unsatisfied customers will have the opposite effect.
It’s crucial to ensure that everyone on the team rallies around the same key metric: Customer satisfaction of retained accounts, which is the final step of the STARS funnel.
Pausing the feature factory #
Features are often large projects, and often more complex than anticipated. It’s understandable that when they’re finally released, we feel like taking a breather.
However, at this stage, the feature is now with the customers for the very first time. Objectively speaking, it’s the most important time in the feature journey - and it’s the time we pay the least attention to!
Part of this is due to the fact that it’s overwhelming and time consuming to come up with a framework for how to evaluate the feature. With product managers having a myriad of things to do, it’s often the case that evaluation gets postponed or never done.
This can lead to product and engineering organizations becoming feature factories that ship a lot of features that work technically, but never really hit home with the customers.
The easier it is for product teams to access feature satisfaction, the more likely they are to take a moment to look at the feature. Doing so pauses the conveyor belt and puts the focus on making existing features impactful before moving on to the next roadmap items.
The missing evaluation step #
Over the past decade, tools for delivering features have matured tremendously: from kanban boards to code reviews to continuous integration and commit-to-deploy. As a product manager or engineer, it feels like you could walk into any modern tech company and start delivering features within a week, thanks to these standardized workflows. This has been a massive boost in productivity.
However, those workflows all stop when we mark a feature as “done”. Sure, the feature is technically done, but it is now in the hands of the customers.
Establishing a final “evaluation” step is the missing step of the feature journey in today’s workflow. Like the rest of the delivery workflow, the evaluation step has to be consistent, which is what this framework aims to provide.
ProductOps symptom #
In recent years, a new role has emerged in the product world: ProductOps. There’s a bunch of different definitions for this role, but we think this definition by Marty Cagan makes the most sense.
ProductOps is empowering product teams with:
- Qualitative insights
- Quantitative insights
- Tools and Best Practices
In other words, the role of the ProductOps person or team is to come up with a way to feed product teams with the customer feedback they need - qualitative and quantitative - to build better features and products.
While this makes sense on the surface, how to actually do it is still a custom choice per ProductOps team. In many ways, the emergence of this role is a symptom of a historic lack of tooling and frameworks for evaluating features.
The STARS framework offers an approach to measure feature engagement (quantitative) and feature satisfaction (qualitative) for any feature.
Thanks for reading #
Many thanks for reading. If you have any questions or comments, please reach out to us at firstname.lastname@example.org. We welcome your feedback and contributions.