
Intermediate

Approx. 1 month

Assumes 6 hrs/wk (work at your own pace)

Start Free Course
Free
You get
Instructor videos
Learn by doing exercises and view project instructions

Course Summary

This course will cover the design and analysis of A/B tests, also known as split tests, which are online experiments used to test potential improvements to a website or mobile application. Two versions of the website are shown to different users - usually the existing website and a potential change. The results are then analyzed to determine whether the change is an improvement worth launching. This course will cover how to choose and characterize metrics to evaluate your experiments, how to design an experiment with enough statistical power, how to analyze the results and draw valid conclusions, and how to ensure that the participants of your experiments are adequately protected.

Why Take This Course?

A/B testing, or split testing, is used by companies like Google, Microsoft, Amazon, eBay/PayPal, Netflix, and numerous others to decide which changes are worth launching. By using A/B tests to make decisions, you can base your decisions on actual data, rather than relying on intuition or HiPPOs - the Highest Paid Person's Opinion! Designing good A/B tests and drawing valid conclusions can be difficult. You can almost never measure exactly what you want to know (such as whether the users are "happier" on one version of the site), so you need to find good proxies. You need sanity checks to make sure your experimental set-up isn't flawed, and you need to use a variety of statistical techniques to make sure the results you're seeing aren't due to chance. This course will walk you through the entire process. At the end, you will be ready to help businesses, small or large, make crucial decisions that could significantly affect their future!

Prerequisites and Requirements

This course requires introductory knowledge of descriptive and inferential statistics. If you haven't learned these topics, or need a refresher, they are covered in the Udacity courses Inferential Statistics and Descriptive Statistics.

Prior experience with A/B testing is not required, and neither is programming knowledge.

See the Technology Requirements for using Udacity.

What Will I Learn?

Projects

Design an A/B Test

Make design decisions for an A/B test, including which metrics to measure and how long the test should be run. Analyze the results of an A/B test that was run by Udacity and recommend whether or not to launch the change.

Syllabus

Lesson 1: Overview of A/B Testing

This lesson will cover what A/B testing is and what it can be used for. It will also cover an example A/B test from start to finish, including how to decide how long to run the experiment, how to construct a binomial confidence interval for the results, and how to decide whether the change is worth the launch cost.
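As a taste of the statistics involved, here is a minimal Python sketch (the course itself requires no programming) of the normal-approximation confidence interval for a binomial proportion such as a click-through rate; the counts below are made up for illustration.

    import math

    def binomial_confidence_interval(successes, trials, z=1.96):
        # Normal-approximation (Wald) interval for a binomial
        # proportion; z = 1.96 corresponds to a 95% interval.
        p_hat = successes / trials
        se = math.sqrt(p_hat * (1 - p_hat) / trials)
        return p_hat - z * se, p_hat + z * se

    # Hypothetical data: 120 clicks out of 1,000 page views.
    low, high = binomial_confidence_interval(120, 1000)
    print(f"95% CI for click-through rate: [{low:.4f}, {high:.4f}]")

Roughly speaking, if the whole interval around the observed change lies above the smallest effect that would justify the launch cost, the change is a clear launch candidate.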

Lesson 2: Policy and Ethics for Experiments

This lesson will cover how to make sure the participants of your experiments are adequately protected and what questions you should be asking regarding the ethicality of experiments. It will cover four main ethics principles to consider when designing experiments: the risk to the user, the potential benefits, what alternatives users have to participating in the experiment, and the sensitivity of the data being collected.

Lesson 3: Choosing and Characterizing Metrics

One of the most important and time-consuming pieces of designing an A/B test is choosing and validating metrics to use in evaluating your experiment. This lesson will cover techniques for brainstorming metrics, what to do when you can't measure what you want directly, and characteristics you should consider when validating your metrics.

Lesson 4: Designing an Experiment

This lesson will cover how to design an A/B test. This includes how to choose which users will be in your experiment and control groups - there are different online definitions of a "user", and each choice affects your experiment. It will also cover when to limit your experiment to a subset of your entire user base, how to calculate how many events you will need in order to draw strong conclusions from your results, and how this translates into how long to run the experiment. Finally, the lesson will cover how various design decisions affect the size of your experiment, so you will know which decisions to revisit if you need results more quickly.
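As an illustration of the sizing calculation, the sketch below uses the standard sample-size formula for comparing two proportions; the baseline rate, minimum detectable effect, and traffic figure are hypothetical.

    import math
    from scipy.stats import norm

    def sample_size_per_group(p_base, min_effect, alpha=0.05, power=0.80):
        # Approximate users needed per group to detect an absolute
        # change of min_effect in a proportion metric (two-sided test).
        p1, p2 = p_base, p_base + min_effect
        p_bar = (p1 + p2) / 2
        z_a = norm.ppf(1 - alpha / 2)  # critical value for significance
        z_b = norm.ppf(power)          # critical value for power
        n = ((z_a * math.sqrt(2 * p_bar * (1 - p_bar))
              + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
             / min_effect ** 2)
        return math.ceil(n)

    # Hypothetical: 10% baseline conversion, detect a 2-point lift.
    n = sample_size_per_group(0.10, 0.02)
    print(f"About {n} users per group.")
    # If roughly 5,000 eligible users arrive per day, split across two
    # groups, the experiment needs about 2 * n / 5000 days of traffic.

Note that halving the minimum detectable effect roughly quadruples the required sample size, which is why the lesson emphasizes knowing which design decisions to revisit when you need results more quickly.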

Lesson 5: Analyzing Results

This lesson will cover how to analyze the results of your experiments. Step one is always to run some sanity checks so that you can catch problems with your experiment set-up. Then, you will learn how to check conclusions with multiple methods, including a hypothesis test on the effect size and a binomial sign test, if you get results that surprise you. You will also learn how measuring multiple metrics for the same experiment can make analysis difficult, and some techniques for handling multiple metrics. Finally, you will learn about several analysis "gotchas", and what to do if you see them, including how Simpson's Paradox can affect A/B tests, and why even statistically significant results might disappear when you launch.
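Here is a minimal sketch of two such cross-checks in Python, using made-up counts: a two-proportion z-test on the overall results, and a day-by-day sign test (the binomtest call requires SciPy 1.7 or later).

    from scipy.stats import norm, binomtest

    def two_proportion_z_test(x_ctrl, n_ctrl, x_exp, n_exp):
        # Z-test for a difference in proportions, using the pooled
        # standard error under the null hypothesis of no difference.
        p_pool = (x_ctrl + x_exp) / (n_ctrl + n_exp)
        se = (p_pool * (1 - p_pool) * (1 / n_ctrl + 1 / n_exp)) ** 0.5
        z = (x_exp / n_exp - x_ctrl / n_ctrl) / se
        return z, 2 * norm.sf(abs(z))  # two-sided p-value

    # Hypothetical totals: 1,000 users per group.
    z, p = two_proportion_z_test(100, 1000, 130, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")

    # Sign test as a cross-check: on how many of 14 days did the
    # experiment group beat the control group? (Made-up count.)
    result = binomtest(11, 14, p=0.5)
    print(f"sign test p = {result.pvalue:.4f}")

If the two methods disagree - say the z-test is significant but the sign test is not - that is a signal to dig into day-by-day behavior before recommending a launch.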

Final Project: Design and Analyze an A/B Test

Make design decisions for an A/B test, including which metrics to measure and how long the test should be run. Analyze the results of an A/B test that was run by Udacity and recommend whether or not to launch the change.

Instructors & Partners

Carrie Grimes

Carrie Grimes Bostock is currently a Distinguished Engineer at Google, working on data-driven resource planning, cost analysis, and distributed cluster management software as part of the Technical Infrastructure group. She joined Google in 2003 and spent most of the following 12 years working in Search and Search infrastructure on statistical and engineering problems in crawling and indexing quality, ranking evaluation, and forecasting. She graduated from Harvard in 1998 with an A.B. in Anthropology/Archaeology and an interest in quantitative methods for dealing with disparate data. She received a PhD in Statistics from Stanford in 2003 after working with David Donoho on nonlinear dimensionality reduction problems.

Caroline Buckey

Before joining Udacity, Caroline worked as a Software Engineer at Quixey, a startup building a search engine for apps. While earning her undergraduate degree at Carnegie Mellon, she was a TA for six different courses, and that love of teaching later led her to join Udacity. Outside of work, she likes reading fiction, playing board games, and drinking bubble tea.

Diane Tang

Diane Tang is a Google Fellow currently working in Google Research on building data infrastructure and analytics for biological & medical applications. Prior to 2014, she was a leader on the AdsQuality team at Google. She joined Google in 2003 and has focused on logging, large-scale data analysis & infrastructure, experiment methodology, and ads systems. She earned a bachelor's degree in Computer Science from Harvard in 1995 and a master's degree and PhD in Computer Science from Stanford in 2001. She holds many patents and is the author of numerous publications in mobile networking, information visualization, experiment methodology, data infrastructure, and data mining on large datasets.