Primer on the math of Machine Learning

    

1. Dot Product of Vectors (Inner Product or Scalar Product) <a | b>

  • The dot product of two vectors a and b can be written in matrix notation as aT . b. It can equally be written as bT . a, since both give the same value.
  • The dot product of two vectors a = [a1, a2, …, an] and b = [b1, b2, …, bn] is defined as:
    a . b = a1b1 + a2b2 + … + anbn
The dot product is also called the scalar product, since it produces a single real-valued output when applied to two vectors.
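
As a minimal sketch, the dot product is the sum of the products of matching components. The helper name dot and the sample vectors below are made up for illustration; plain Python lists stand in for vectors.

```python
def dot(a, b):
    """Return the dot product of two equal-length sequences of numbers."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Multiply matching components and add up the products.
    return sum(x * y for x, y in zip(a, b))


a = [1, 2, 3]
b = [4, 5, 6]
print(dot(a, b))  # 1*4 + 2*5 + 3*6 = 32
print(dot(b, a))  # same value, since aT . b equals bT . a
```

Note that the result is a single number, not a vector, which is why the name scalar product fits.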


2. Random Variable

A random variable is actually a function used to quantify the outcomes of a probabilistic event or experiment. It is not a variable in the sense of an unknown that can be solved for. Rather, it can be defined to take chosen values for specific sets of outcomes.

Example -
Let X be a random variable

In the event of tossing a coin,
  1. X can take value 1 if it is heads
  2. Likewise, value 0 if it is tails
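
The coin example can be sketched in Python to show that X really is a function from outcomes to numbers. The outcome labels "heads" and "tails", the fixed seed, and the fair coin are assumptions made for this illustration.

```python
import random


def X(outcome):
    """Random variable for a coin toss: maps an outcome to a number."""
    return 1 if outcome == "heads" else 0


# Simulate a few tosses of a fair coin (an assumption of this sketch).
rng = random.Random(0)
tosses = [rng.choice(["heads", "tails"]) for _ in range(10)]
values = [X(t) for t in tosses]
print(tosses)
print(values)
```

The randomness lives in the experiment (the toss); X itself is deterministic once the outcome is known.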

There are 2 types of random variables - 

1. Discrete Random Variables

Examples: rolling a die, tossing a coin (heads or tails), whether it rains tomorrow. Each has a countable number of outcomes.

Let's look at the probability distribution (the probability mass function) in the case of rolling a fair die: each of the six faces, 1 through 6, has probability 1/6.
This is a uniform distribution: all the outcomes are equally likely. If the die were biased, the probability distribution would not be uniform.

Let's add some random variable formalism here. Given the distribution above, suppose I ask you for the probability of an event (6 appearing on the die).

We would express it using the random variable X (a function taking a value) - 
P(X = 6) = 1/6
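
A small sketch of the same idea in Python, assuming a fair six-sided die. The names pmf and P are made up for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# PMF of a fair six-sided die: every face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def P(x):
    """P(X = x); faces that cannot occur have probability 0."""
    return pmf.get(x, Fraction(0))


print(P(6))               # 1/6
print(sum(pmf.values()))  # 1
```

The probabilities of all outcomes add up to 1, which holds for any valid discrete distribution, uniform or not.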

2. Continuous Random Variables

An uncountably infinite number of outcomes: the variable can take any value within a range.
Example: inches of rain tomorrow, which could be 1.111 inches, 2.9 inches, or any other amount.

Let's look at an example - 

In the case of a continuous random variable, since there are infinitely many possible outcomes (in a range), the probability distribution is described by a curve (bell-shaped, in the case of a normal distribution).

Let Y be the random variable that defines the outcome (the amount of rain to expect, in inches). However, for a continuous random variable, the probability that any single exact value occurs is zero.

P(Y = 2) = 0. Probabilities for continuous random variables are therefore stated over a range rather than at a single value.
Example - 
P(1.9 < Y < 2) is the area under the curve (the integral) of the probability density function of our continuous random variable between 1.9 and 2.
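
To make the area-under-the-curve idea concrete, here is a rough sketch that assumes Y follows a normal distribution. The mean of 2.0 inches and standard deviation of 0.5 inches are hypothetical values chosen only for illustration, not numbers from the rain example. It computes the area numerically with the trapezoidal rule and compares it with the exact difference of the cumulative distribution.

```python
import math

MU, SIGMA = 2.0, 0.5  # hypothetical mean and standard deviation, in inches


def pdf(y):
    """Normal probability density at y."""
    z = (y - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2 * math.pi))


def cdf(y):
    """Normal cumulative distribution P(Y <= y), via the error function."""
    return 0.5 * (1 + math.erf((y - MU) / (SIGMA * math.sqrt(2))))


def area(a, b, steps=10_000):
    """Approximate the integral of pdf over [a, b] with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))
    for i in range(1, steps):
        total += pdf(a + i * h)
    return total * h


p_numeric = area(1.9, 2.0)
p_exact = cdf(2.0) - cdf(1.9)
print(p_numeric, p_exact)  # both close to 0.079
print(area(2.0, 2.0))      # 0.0: a single point has no area
```

The last line mirrors P(Y = 2): an interval of zero width encloses zero area, so the probability of any exact value is zero.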

To summarize, the probability distribution function for discrete random variables is called the probability mass function (PMF), and that for continuous random variables is called the probability density function (PDF).

3. Derivatives and Matrix Calculus

I highly recommend these PDFs for brushing up on matrix calculus fundamentals.
Source - https://atmos.washington.edu/~dennis/MatrixCalculus.pdf

Source - https://arxiv.org/pdf/1802.01528.pdf



References:

Wikipedia - https://en.wikipedia.org/wiki/Dot_product
Khan Academy - https://www.youtube.com/watch?v=dOr0NKyD31Q
