Backgrounder in Statistical Methods
This is the fourth module in the 2016 Informatics and Statistics for Metabolomics workshop hosted by the Canadian Bioinformatics Workshops. This lecture is by Jeff Xia from McGill University.
How it Begins by Kevin MacLeod is licensed under a Creative Commons Attribution license (https://creativecommons.org/licenses/...)
Source: http://incompetech.com/music/royalty-...
Artist: http://incompetech.com/
Table of Contents:
01:16 - Yesterday
02:42 - Today
03:26 - Learning Objectives
05:37 - What is Statistics
07:02 - Main Components
09:01 - Types of Data
10:30 - Types of Data
10:55 - Quantitative Data
11:58 - Categorical Data
13:29 - Some Jargon (I)
15:25 - Some Jargon (II)
16:26 - Key Concepts in Statistics
17:30 - Issues when making inferences
19:06 - From samples to population
20:16 - P values
22:34 - Summary/Descriptive Statistics
22:52 - How do we describe the data?
26:36 - Mean, Median, Mode
26:46 - Mean, Median & Mode
27:56 - Variance, SD and SEM
29:59 - Quantiles
31:49 - Mean vs. Variance
33:56 - Univariate Statistics
34:35 - A Bell Curve
36:55 - Features of a Normal Distribution
37:27 - Normal Distribution
38:22 - Some Equations
39:09 - Standard Deviation (σ)
39:42 - Different Distributions
40:41 - Skewed Distribution
41:49 - Fixing a Skewed Distribution
42:23 - Log Transformation
43:35 - Log Transformation (Real Data)
44:31 - Centering, scaling, and transformations
47:13 - The Result
47:15 - Centering, scaling, and transformations
48:48 - The Result
49:35 - Skewed Distribution
52:18 - The Result
52:49 - The Result
53:05 - The Result
53:06 - The Result
53:10 - The Result
53:18 - The Result
53:26 - t-tests
54:45 - Types of t-tests
55:39 - Paired t-tests
57:02 - Another approach to group differences
58:13 - Calculating F
59:10 - What can be concluded from a significant ANOVA?
01:00:15 - Different types of ANOVA
01:02:57 - Conclusions
01:04:07 - Understanding P values
01:06:58 - The p-value
01:07:27 - How do we compute a p value
01:07:42 - Non-normal distribution
01:07:51 - How do we compute a p value
01:08:21 - Non-normal distribution
01:09:55 - Normalization
01:10:39 - Boxplots of “standardized” data
01:11:13 - Non-parametric tests
01:12:18 - Empirical P values
01:13:37 - Basic Principle
01:16:02 - A simple example
01:17:12 - Permutation One
01:17:13 - A simple example
01:17:29 - Permutation One
01:17:39 - Permutations
01:17:40 - Permutation One
01:17:40 - A simple example
01:17:47 - Permutation One
01:18:03 - Permutations
01:18:13 - Compute empirical p value
01:20:23 - General Advantages
01:21:16 - Question
01:21:19 - General Advantages
01:21:54 - Question
01:23:01 - Hypothesis Testing & multiple testing issues
01:23:17 - Hypothesis Testing
01:23:52 - Hypothesis Testing (more details)
01:24:20 - Hypothesis Testing & P Value
01:24:51 - Multiple Testing Issues
01:25:39 - Multiple Testing Correction (I)
01:27:09 - Multiple Testing Correction (II)
01:28:43 - High-dimensional data
01:29:10 - Multivariate Statistics
01:29:15 - Multivariate Statistics
01:29:27 - Normal distribution – a single variable
01:29:34 - Bivariate Normal
01:29:43 - Trivariate Normal
01:30:02 - The Reality
01:31:25 - The Practice
01:32:32 - Machine Learning
01:33:38 - Unsupervised Learning methods for high-dimensional data
01:34:36 - Clustering
01:34:42 - Clustering Requires...
01:35:23 - Two common clustering algorithms
01:36:29 - K-means clustering
01:36:34 - K-means clustering
01:37:17 - Nearest Neighbor Algorithm
01:37:24 - K-means clustering
01:37:26 - Nearest Neighbor Algorithm
01:38:00 - Hierarchical Clustering
01:38:08 - Key Parameters: similarities
01:38:42 - Similarity Measurements
01:39:04 - Similarity Measurements
01:40:07 - Hierarchical clustering & heatmap
01:40:49 - Principal Component Analysis (PCA)
01:40:50 - Hierarchical clustering & heatmap
01:41:01 - Principal Component Analysis (PCA)
01:42:35 - Visualizing PCA
01:43:27 - PCA - The Details
01:44:58 - PCA - The Details
01:45:32 - Principal Components Analysis on:
01:45:58 - Eigenfaces
01:46:19 - Widely used in metabolomics
01:47:15 - PCA Loadings Plot
01:47:30 - Scores & Loadings
01:47:35 - PCA Details/Advice
01:47:36 - Scores & Loadings
01:48:29 - PCA Details/Advice
01:49:18 - PCA Summary
01:49:51 - PLS-DA
01:50:50 - PCA vs. PLS-DA
01:51:16 - Use PLS-DA with Caution
01:51:38 - Cross Validations
01:52:17 - Common Splitting Strategies
01:52:40 - Components and Features
01:52:45 - Common Splitting Strategies
01:52:58 - Components and Features
01:54:47 - Permutation Tests