Feature Success Team

What we're building

No-code experiments / Visual editor
A visual editor for experiments would allow users to test changes to their website / app without having to touch the code.
Progress
Project updates
No updates yet. Engineers are currently hard at work, so check back soon!

Roadmap

Here’s what we’re considering building next. Vote for your favorites or share a new idea on GitHub.

Feature flags for Java (#16419)
- ❤️ 22
- 👍 56
Feature flags based on events (or behavioral cohorts) (#14796)
- 👍 8
Correlation analysis for cohorts (#7875)
- ❤️ 5
- 👍 12
Feature flag code references (#13845)
- 👍 8
Do surveys in emails (#21071)
Auto rollback for feature flags (#22883)

Recently shipped

Surveys now support 7-point Likert scale responses

Want to get a more detailed understanding of customer responses but don't want to overwhelm them with too many options? Now you can do that with 7-point Likert scale responses.

As the name suggests, it offers users more than just 5 options for their response, but fewer than 10 - so it's the perfect middle ground!

Likert scale surveys are most useful for measuring a positive or negative response and are especially suited to collecting feedback in response to a trigger statement - don't you agree? (Please respond on a scale of 1 to 7.)

Goals

As always, reliability is the #1 unwritten goal: Making sure feature flags are reliable trumps every other objective.

Objective: Make sure feature flags can handle 10x current scale

At the beginning of 2024, we hit some scaling limits on flags: it's now very expensive to run flags on Django, and we'll hit further scaling limits at 5x our current scale.

To get ahead of this problem, we'll rewrite our flags service to be more performant and reliable. This is a continuation of last quarter's goal, since we aren't done yet.

Objective: No-code experiments

Last quarter we polished up our new experiment UI. This quarter, we want to address some issues that came up, and then focus on building out some more ambitious features on top of experiments.

Broadly, we will:

- Address outstanding experiments UI/UX issues
- Build no-code experiments
- Explore some cool things we can do on top of no-code experiments, like suggesting and automatically running some no-code experiments based on replay and analytics data

Objective: Split out experiments into its own product

Experiments can now stand on its own as a product, without being linked to feature flags in the UI and in pricing.

The current state is problematic because:

- It confuses some users about the difference between flags and experiments
- Pricing for experiments is implicit, and can be hard to understand

We'll update SDKs to have experiment-specific constructs, fix the issues that come from controlling experiments via feature flags by making the two independent, track flag calls separately for experiments, and update how we price experiments.

Handbook

Values

- Fast, iterative and high output rather than slow and thoughtful - achieving this
- Feedback-driven not spec-driven - we do a decent job at this
- Missionary (we have a clear problem definition and are aligned on how impactful a solution would be) not mercenary - glimpses of this
- Collaborative not lone wolf - glimpses of this

Personas

Company Persona

Primary
- Size:
  - 20-75 employees
- Stage:
  - Post-PMF
  - Series A-D
- Customer type:
  - B2B/B2C/(B2B2C)
- High expectation traits:
  - Use the modern data stack
  - Frontend uses TypeScript and React
  - High-growth
Not:
- API companies
- Shopify stores/no-code companies

User Persona

Primary
- Role
  - Product-minded front-end engineer
  - Growth engineer
- Seniority
  - Decision-making seat on product
  - Senior engineer
  - IC
- High expectation traits
  - Reads Hacker News
  - Educated about the other feature flagging/experimentation tools in the space
  - Needs high-reliability and high-performance
  - Uses best-in-class tools such as Linear/Figma
Secondary:
- Role:
  - Product Manager
Not:
- Role:
  - Backend engineer
  - Marketing

Jobs to be done

Feature flags

Primary
- Safely roll out frontend features with the least risk
Secondary
- Persistent feature flags e.g. country/pay gate
- Build/test in production
- Enable beta users to try out experimental features ahead of time

Experimentation

Primary
- Test whether a particular feature achieves the desired change in user behavior

Feature ownership

You can find out more about the features we own here

Long term vision

Imagine Bob is a product manager, and Alice is an engineer, both of whom love using PostHog.

During their weekly growth review, PostHog shows them that one of their workflows is performing 50% worse than other SaaS companies with a similar flow. They decide to build a new feature together, but they're unsure of the impact, so Bob & Alice decide to gate the feature via a feature flag.

Alice builds the feature and runs the PostHog CLI, automatically converting her feature branch to a feature-flagged version. During creation, she selects the team template they normally use, called "Autorollout based on conversion metric", using the conversion metric that PostHog suggests. The feature progressively rolls out to internal users, then to beta users, then to remaining users. If their conversion metric falls by more than 20%, the feature automatically rolls back and alerts their team. Alice requests a feature flag review from Bob.

Bob checks the PostHog UI and, because it's such an important feature, adds a safety condition for Sentry errors increasing by 30%, plus a few counter metrics. Tripping any of these should result in an automatic rollback as well. Bob starts the experiment.
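To make the rollout rules in this story concrete, here is a minimal sketch in Python. It is not how PostHog implements anything. The names `Metrics`, `should_roll_back` and `run_rollout` are hypothetical, the stages are simply the three audiences named in the story, and we assume "falls by more than 20%" and "increasing by 30%" are both measured relative to a baseline.

```python
from dataclasses import dataclass

# Stages from the story: internal users, then beta users, then everyone else.
STAGES = ["internal", "beta", "everyone"]


@dataclass
class Metrics:
    conversion_rate: float  # e.g. 0.10 means 10% of users convert
    error_count: int        # e.g. Sentry errors seen in the observation window


def should_roll_back(baseline, current,
                     max_conversion_drop=0.20, max_error_increase=0.30):
    """Return True if either safety condition is violated (assumed relative to baseline)."""
    conversion_floor = baseline.conversion_rate * (1 - max_conversion_drop)
    error_ceiling = baseline.error_count * (1 + max_error_increase)
    return (current.conversion_rate < conversion_floor
            or current.error_count > error_ceiling)


def run_rollout(baseline, observe):
    """Walk through STAGES in order; stop at the first stage that violates a condition.

    `observe(stage)` is a caller-supplied function returning Metrics for that stage.
    Returns a (status, stage) tuple.
    """
    for stage in STAGES:
        if should_roll_back(baseline, observe(stage)):
            return ("rolled_back", stage)
    return ("fully_rolled_out", STAGES[-1])
```

In this sketch, a real system would also alert the team on rollback and wait for enough data at each stage before moving on; both are left out to keep the example short.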

Thankfully, nothing goes wrong when the feature is rolled out. The team is disappointed that the feature doesn't seem to move any of the core company metrics, however. This doesn't fit either Alice's or Bob's model, so they dig deeper into why this was the case.

Before they even start, PostHog automatically does some impact analysis on their core metrics, and generates some insights into what properties are highly correlated with conversion & which aren't.

As it turns out, people in the USA and India love their new feature and show a 40% increase in conversion. Other countries, especially the UK, seem to dislike it so much that it negatively affects conversion. In the end, these forces balance out, leading to similar total conversion rates.
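A small worked example shows how opposite segment effects can cancel out in the total. The user and conversion counts below are made up purely for illustration; only the 40% lift comes from the story, and `conversion_rate` and `relative_lift` are hypothetical helper names.

```python
# Hypothetical counts per segment, before and after the feature launched.
segments = [
    {"name": "USA and India", "users": 1000, "before": 100, "after": 140},
    {"name": "Other countries", "users": 1000, "before": 100, "after": 60},
]


def conversion_rate(rows, column):
    """Overall conversion rate: total conversions divided by total users."""
    users = sum(row["users"] for row in rows)
    conversions = sum(row[column] for row in rows)
    return conversions / users


def relative_lift(row):
    """Relative change in conversions for one segment (0.4 means +40%)."""
    return (row["after"] - row["before"]) / row["before"]


overall_before = conversion_rate(segments, "before")
overall_after = conversion_rate(segments, "after")
lifts = {row["name"]: relative_lift(row) for row in segments}
```

With these numbers, one segment improves by 40% and the other drops by 40%, yet the overall rate is the same before and after, which is why the top-line metric alone hides the effect.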

They suspect it might have something to do with their positioning in other countries, so they run a marketing experiment using PostHog, where PostHog automatically generates recommended copy text to try out. It generates 5 variants, and they test these in all countries.

As it turns out, copy wasn't the issue, and there's no significant change here. They watch a few recordings from the experiment to confirm there's nothing off here.

Since it's not a positioning issue, Bob & Alice decide that it makes sense to introduce some personalisation: let people opt in to the new feature, and have it on by default for the USA and India. They can customise this right from the feature flag, and set it up so that any users who opt in on their UI automatically get the flag.

PostHog keeps analysing metrics for this flag over time, and notifies Bob and Alice when their customers' behaviour changes, for example if conversion for users in the UK has taken a turn for the better, or if enterprise customers have taken a turn for the worse.

Our long term vision is to make all of this possible.