Despite two linear algebra classes, my knowledge consisted of “Matrices, determinants, eigen something something”.

Why? Well, let’s try this course format:

- Name the course *Linear Algebra* but focus on things called matrices and vectors
- Teach concepts like Row/Column order with mnemonics instead of explaining the reasoning
- Favor abstract examples (2d vectors! 3d vectors!) and avoid real-world topics until the final week

The survivors are physicists, graphics programmers and other masochists. We missed the key insight:

**Linear algebra gives you mini-spreadsheets for your math equations.**

We can take a table of data (a matrix) and create updated tables from the original. It’s the power of a spreadsheet written as an equation.

Here’s the linear algebra introduction I wish I had, with a real-world stock market example.

## What’s in a name?

“Algebra” means, roughly, “relationships”. Grade-school algebra explores the relationship between unknown numbers. Without knowing x and y, we can still work out that $(x + y)^2 = x^2 + 2xy + y^2$.

“Linear Algebra” means, roughly, “line-like relationships”. Let’s clarify a bit.

Straight lines are predictable. Imagine a rooftop: move forward 3 horizontal feet (relative to the ground) and you might rise 1 foot in elevation (The slope! Rise/run = 1/3). Move forward 6 feet, and you’d expect a rise of 2 feet. Contrast this with climbing a dome: each horizontal foot forward raises you a different amount.

Lines are nice and predictable:

- If 3 feet forward has a 1-foot rise, then going 10x as far should give a 10x rise (30 feet forward is a 10-foot rise)
- If 3 feet forward has a 1-foot rise, and 6 feet has a 2-foot rise, then (3 + 6) feet should have a (1 + 2) foot rise

In math terms, an operation F is linear if scaling inputs scales the output, and adding inputs adds the outputs:

$$F(ax) = a \cdot F(x)$$

$$F(a + b) = F(a) + F(b)$$

In our example, $F(x)$ calculates the rise when moving forward x feet, and the properties hold:

$$F(10 \cdot 3) = 10 \cdot F(3) = 10$$

$$F(3 + 6) = F(3) + F(6) = 1 + 2 = 3$$

## Linear Operations

An operation is a calculation based on some inputs. Which operations are linear and predictable? Multiplication, it seems.

Exponents ($F(x) = x^2$) aren’t predictable: $10^2$ is 100, but $20^2$ is 400. We doubled the input but quadrupled the output.

Surprisingly, regular addition isn’t linear either. Consider the “add three” function $F(x) = x + 3$: we get $F(1) = 4$ but $F(2) = 5$.

We doubled the input and did not double the output. (Yes, $F(x) = x + 3$ happens to be the equation for an *offset* line, but it’s still not “linear” because $F(10) \neq 10 \cdot F(1)$. Fun.)

So, what types of functions are *actually* linear? Plain-old scaling by a constant, or functions that look like: $F(x) = ax$. In our roof example, $a = 1/3$.

But life isn’t *too* boring. We can still combine multiple linear functions ($A, B, C$) into a larger one, $G$:

$$G(x) = A(x) + B(x) + C(x)$$

$G$ is still linear, since doubling the input continues to double the output:

$$G(2x) = A(2x) + B(2x) + C(2x) = 2A(x) + 2B(x) + 2C(x) = 2 \cdot G(x)$$

We have “mini arithmetic”: multiply inputs by a constant, and add the results. It’s actually useful because we can split inputs apart, analyze them individually, and combine the results: $F(a + b) = F(a) + F(b)$.

If we allowed non-linear operations (like $x^2$) we couldn’t split our work: $(a+b)^2 \neq a^2 + b^2$. Limiting ourselves to linear operations has its advantages.
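Those two properties are easy to test numerically. A quick pure-Python sketch (the function names are mine, not the article’s):

```python
def F(x):
    """A linear operation: plain scaling by a constant (a = 1/3, our roof slope)."""
    return x / 3

# Scaling the input scales the output: F(10 * x) == 10 * F(x)
assert F(10 * 3) == 10 * F(3)

# Adding inputs adds the outputs: F(a + b) == F(a) + F(b)
assert F(3 + 6) == F(3) + F(6)

# Non-linear counterexamples from the text:
square = lambda x: x ** 2
add_three = lambda x: x + 3
assert square(2 * 10) != 2 * square(10)      # doubled input, quadrupled output
assert add_three(2 * 1) != 2 * add_three(1)  # the offset line, still not "linear"
```

The counterexamples fail exactly where the text says they should: the square quadruples, and the +3 offset refuses to scale.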

## Organizing Inputs and Operations

Most courses hit you in the face with the details of a matrix. “Ok kids, let’s learn to speak. Select a subject, verb and object. Next, conjugate the verb. Then, add the prepositions…”

**No!** Grammar is not the focus. What’s the key idea?

- We have a bunch of inputs to track
- We have predictable, linear operations to perform (our “mini-arithmetic”)
- We generate a result, perhaps transforming it again

Ok. First, how should we track a bunch of inputs? How about a list:

```
x
y
z
```

Not bad. We could write it (x, y, z) too — hang onto that thought.

Next, how should we track our operations? Remember, we only have “mini arithmetic”: multiplications by a constant, with a final addition. If our operation $F$ behaves like this:

$$F(x, y, z) = 3x + 4y + 5z$$

We could abbreviate the entire function as (3, 4, 5). We know to multiply the first input by the first value, the second input by the second value, etc., and add the result.

Only need the first input? Use $G(x, y, z) = 3x + 0y + 0z$, abbreviated (3, 0, 0).

Let’s spice it up: how should we handle multiple sets of inputs? Let’s say we want to run operation F on both (a, b, c) and (x, y, z). We could try jamming them into one long list:

```
a
b
c
x
y
z
```

But it won’t work: F expects 3 inputs, not 6. We should separate the inputs into groups:

```
1st Input 2nd Input
--------------------
a x
b y
c z
```

Much neater.

And how could we run the same input through several operations? Have a row for each operation:

```
F: 3 4 5
G: 3 0 0
```

Neat. We’re getting organized: inputs in vertical columns, operations in horizontal rows.

## Visualizing The Matrix

Words aren’t enough. Here’s how I visualize it: imagine “pouring” each input through each operation.

As an input passes an operation, it creates an output item. In our example, the input (a, b, c) goes against operation F and outputs 3a + 4b + 5c. It goes against operation G and outputs 3a + 0 + 0.

Time for the red pill. A matrix is a shorthand for our diagrams:

A matrix is a single variable representing a spreadsheet of inputs or operations.

**Trickiness #1: The reading order**

Instead of an input => matrix => output flow, we use function notation, like y = f(x) or f(x) = y. We usually write a matrix with a capital letter (F), and a single input column with lowercase (x). Because we have several inputs (A) and outputs (B), they’re considered matrices too: we write FA = B, read right to left just like f(x) = y.

**Trickiness #2: The numbering**

Matrix size is measured as RxC: row count, then column count, abbreviated “m x n” (I hear ya, “r x c” would be easier to remember). Items in the matrix are referenced the same way: $a_{ij}$ is the ith row and jth column (I hear ya, “i” and “j” are easily confused on a chalkboard). Mnemonics are ok *with context*, and here’s what I use:

- RC, like Roman Centurion or RC Cola
- Use an “L” shape. Count down the L, then across

Why does RC ordering make sense? Our operations matrix is 2×3 and our input matrix is 3×2. Writing them together:

```
[Operation Matrix] [Input Matrix]
[operation count x operation size] [input size x input count]
[m x n] [p x q] = [m x q]
[2 x 3] [3 x 2] = [2 x 2]
```

Notice the matrices touch at the “size of operation” and “size of input” (n = p). They should match! If our inputs have 3 components, our operations should expect 3 items. In fact, we can *only* multiply matrices when n = p.

The output matrix has a row for each of the m operations and a column for each of the q inputs, giving an “m x q” matrix.
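The whole “operations rows × input columns” recipe fits in a few lines of Python (a sketch, with names I made up):

```python
def apply_operations(operations, inputs):
    """Multiply an (m x n) operations matrix by an (n x q) inputs matrix.

    Each row of `operations` is one operation; each column of `inputs`
    is one input set. The touching sizes must match: n == p.
    """
    n = len(inputs)                 # size of each operation / input
    assert all(len(op) == n for op in operations), "n must equal p"
    q = len(inputs[0])              # number of input sets (columns)
    return [[sum(op[i] * inputs[i][j] for i in range(n)) for j in range(q)]
            for op in operations]

F = [3, 4, 5]   # F(x, y, z) = 3x + 4y + 5z
G = [3, 0, 0]   # G(x, y, z) = 3x

# Two input columns: (a, b, c) = (1, 1, 1) and (x, y, z) = (1, 2, 3)
inputs = [[1, 1],
          [1, 2],
          [1, 3]]

result = apply_operations([F, G], inputs)
# A 2x3 times a 3x2 gives a 2x2: one row per operation, one column per input.
assert result == [[12, 26], [3, 3]]
```

The `assert` inside the function is the “n = p” rule: a 3-item operation simply can’t digest a 6-item input.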

## Fancier Operations

Let’s get comfortable with operations. Assuming 3 inputs, we can whip up a few 1-operation matrices:

- Adder: [1 1 1]
- Averager: [1/3 1/3 1/3]

The “Adder” is just a + b + c. The “Averager” is similar: (a + b + c)/3 = a/3 + b/3 + c/3.

Try these 1-liners:

- First-input only: [1 0 0]
- Second-input only: [0 1 0]
- Third-input only: [0 0 1]

And if we merge them into a single matrix:

```
[1 0 0]
[0 1 0]
[0 0 1]
```

Whoa — it’s the “identity matrix”, which copies 3 inputs to 3 outputs, unchanged. How about this guy?

```
[1 0 0]
[0 0 1]
[0 1 0]
```

He reorders the inputs: (x, y, z) becomes (x, z, y).

And this one?

```
[2 0 0]
[0 2 0]
[0 0 2]
```

He’s an input doubler. We could rewrite him as `2*I` (twice the identity matrix) if we were so inclined.
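We can verify these little operators in pure Python (my helper `run` pours one input column through each operation row):

```python
identity = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]
swap_yz  = [[1, 0, 0],
            [0, 0, 1],
            [0, 1, 0]]
doubler  = [[2, 0, 0],
            [0, 2, 0],
            [0, 0, 2]]

def run(matrix, vec):
    """Pour one input column through each operation row."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

assert run(identity, [7, 8, 9]) == [7, 8, 9]     # unchanged copy
assert run(swap_yz,  [7, 8, 9]) == [7, 9, 8]     # (x, y, z) -> (x, z, y)
assert run(doubler,  [7, 8, 9]) == [14, 16, 18]  # same as 2 * I
```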

And yes, when we decide to treat inputs as vector coordinates, the operations matrix will transform our vectors. Here are a few examples:

- Scale: make all inputs bigger/smaller
- Skew: make certain inputs bigger/smaller
- Flip: make inputs negative
- Rotate: make new coordinates based on old ones (East becomes North, North becomes West, etc.)

These are geometric interpretations of multiplication, and how to warp a vector space. Just remember that vectors are *examples* of data to modify.

## A Non-Vector Example: Stock Market Portfolios

Let’s practice linear algebra in the real world:

- Input data: stock portfolios with dollars in Apple, Google and Microsoft stock
- Operations: the changes in company values after a news event
- Output: updated portfolios

And a bonus output: let’s make a new portfolio listing the net profit/loss from the event.

Normally, we’d track this in a spreadsheet. Let’s learn to think with linear algebra:

The input vector could be (\$Apple, \$Google, \$Microsoft), showing the dollars in each stock. (Oh! These dollar values could come from *another* matrix that multiplied the number of shares by their price. Fancy that!)

The 4 output operations should be: Update Apple value, Update Google value, Update Microsoft value, Compute Profit.

Visualize the problem: imagine running each portfolio through each operation.

The key is understanding *why* we’re setting up the matrix like this, not blindly crunching numbers.

Got it? Let’s introduce the scenario.

Suppose a secret iDevice is launched: Apple jumps 20%, Google drops 5%, and Microsoft stays the same. We want to adjust each stock value, using something similar to the identity matrix:

```
New Apple [1.2 0 0]
New Google [0 0.95 0]
New Microsoft [0 0 1]
```

The new Apple value is the original, increased by 20% (Google = 5% decrease, Microsoft = no change).

Oh wait! We need the overall profit:

Total change = (.20 * Apple) + (-.05 * Google) + (0 * Microsoft)

Our final operations matrix:

```
New Apple [1.2 0 0]
New Google [0 0.95 0]
New Microsoft [0 0 1]
Total Profit [.20 -.05 0]
```

Making sense? Three inputs enter, four outputs leave. The first three operations are a “modified copy” and the last brings the changes together.

Now let’s feed in the portfolios for Alice (\$1000, \$1000, \$1000) and Bob (\$500, \$2000, \$500). We can crunch the numbers by hand, or use a tool like Wolfram Alpha:

(Note: Inputs should be in columns, but it’s easier to type rows. The Transpose operation, indicated by a superscript T, converts rows to columns.)

The final numbers: Alice has \$1200 in AAPL, \$950 in GOOG, \$1000 in MSFT, with a net profit of \$150. Bob has \$600 in AAPL, \$1900 in GOOG, and \$500 in MSFT, with a net profit of \$0.
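To double-check those numbers, here’s the whole scenario in pure Python (results rounded to cents to sidestep floating-point noise):

```python
# Operations: update each stock after the iDevice news, plus compute profit.
news = [[1.20,  0.00, 0],   # New Apple
        [0.00,  0.95, 0],   # New Google
        [0.00,  0.00, 1],   # New Microsoft
        [0.20, -0.05, 0]]   # Total Profit

# Input columns: Alice's and Bob's portfolios (Apple, Google, Microsoft dollars).
portfolios = [[1000,  500],
              [1000, 2000],
              [1000,  500]]

# 4x3 operations times 3x2 inputs gives a 4x2 output.
result = [[sum(op[i] * portfolios[i][j] for i in range(3)) for j in range(2)]
          for op in news]

alice = [round(row[0], 2) for row in result]
bob   = [round(row[1], 2) for row in result]
assert alice == [1200.0, 950.0, 1000, 150.0]  # AAPL, GOOG, MSFT, profit
assert bob   == [600.0, 1900.0, 500, 0.0]
```

Three inputs per column go in, four outputs per column come out, matching the hand calculation above.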

What’s happening? **We’re doing math with our own spreadsheet.** Linear algebra emerged in the 1800s yet spreadsheets were invented in the 1980s. I blame the gap on poor linear algebra education.

## Historical Notes: Solving Simultaneous equations

An *early* use of tables of numbers (not yet a “matrix”) was bookkeeping for linear systems. A system like

```
x + 2y + 3z = 3
2x + 3y + 1z = -10
5x - 1y + 2z = 14
```

becomes

```
[1  2  3] [x]   [  3]
[2  3  1] [y] = [-10]
[5 -1  2] [z]   [ 14]
```

We can avoid hand cramps by adding/subtracting rows in the matrix and output, vs. rewriting the full equations. As the matrix evolves into the identity matrix, the values of x, y and z are revealed on the output side.

This process, called Gauss-Jordan elimination, saves time. However, linear algebra is mainly about matrix transformations, not solving large sets of equations (it’d be like using Excel for your shopping list).
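Here’s a minimal sketch of those row operations in Python (my own toy implementation: no pivoting for numerical stability, and it assumes a unique solution exists):

```python
def gauss_jordan(matrix, output):
    """Reduce [matrix | output] toward the identity; the output column
    then holds the solution values. A bare-bones educational sketch."""
    n = len(matrix)
    rows = [list(m_row) + [o] for m_row, o in zip(matrix, output)]
    for col in range(n):
        # Scale the pivot row so the pivot entry becomes 1.
        pivot = rows[col][col]
        rows[col] = [v / pivot for v in rows[col]]
        # Subtract multiples of the pivot row to zero out the column elsewhere.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# x + y + z = 6,  2y + 5z = -4,  2x + 5y - z = 27   ->   x=5, y=3, z=-2
solution = gauss_jordan([[1, 1, 1], [0, 2, 5], [2, 5, -1]], [6, -4, 27])
assert [round(v) for v in solution] == [5, 3, -2]
```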

## Terminology, Determinants, and Eigenstuff

Words have technical categories to describe their use (nouns, verbs, adjectives). Matrices can be similarly subdivided.

Descriptions like “upper-triangular”, “symmetric”, and “diagonal” describe the shape of the matrix, which influences the transformations it performs.

The **determinant** is the “size” of the output transformation. If the input was a unit vector (representing area or volume of 1), the determinant is the size of the transformed area or volume. A determinant of 0 means the matrix is “destructive” and cannot be reversed (similar to multiplying by zero: information was lost).

The **eigenvector** and **eigenvalue** represent the “axes” of the transformation.

Consider spinning a globe: every location faces a new direction, except the poles.

An “eigenvector” is an input that doesn’t change direction when it’s run through the matrix (it points “along the axis”). And although the direction doesn’t change, the size might. The eigenvalue is the amount the eigenvector is scaled up or down when going through the matrix.
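We can at least check the “doesn’t change direction” claim numerically, with a hand-picked diagonal matrix (my example, chosen so the eigen-pairs are obvious):

```python
A = [[3, 0],
     [0, 0.5]]   # stretch x by 3, squash y by half

def run(matrix, vec):
    """Pour one input vector through each operation row."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

# (1, 0) lies along an "axis" of the transformation: output is a pure scaling.
v = [1, 0]
assert run(A, v) == [3 * c for c in v]    # eigenvector, eigenvalue 3

w = [0, 1]
assert run(A, w) == [0.5 * c for c in w]  # eigenvector, eigenvalue 0.5

# A generic vector changes direction: (1, 1) heads off to (3, 0.5).
assert run(A, [1, 1]) == [3, 0.5]
```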

(My intuition here is weak, and I’d like to explore more. Here’s a nice diagram and video.)

## Matrices As Inputs

A funky thought: we can treat the operations matrix as inputs!

Think of a recipe as a list of commands (*Add 2 cups of sugar, 3 cups of flour…*).

What if we want the metric version? Take the instructions, treat them like text, and convert the units. The recipe is “input” to modify. When we’re done, we can follow the instructions again.

An operations matrix is similar: commands to modify. Applying one operations matrix to another gives a new operations matrix that applies *both* transformations, in order.

If N is “adjust portfolio for news” and T is “adjust portfolio for taxes” then applying both:

TN = X

means “Create matrix X, which first adjusts for news, and then adjusts for taxes”. Whoa! We didn’t need an input portfolio, we applied one matrix directly to the other.

The beauty of linear algebra is representing an entire spreadsheet calculation with a single letter. Want to apply the same transformation a few times? Use $N^2$ or $N^3$.
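A sketch of that composition in Python, using hypothetical 2-stock news (N) and tax (T) matrices of my own invention:

```python
def matmul(A, B):
    """Matrix product: apply operations matrix A to matrix B (B happens first)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

N = [[1.2, 0.0],   # hypothetical news: first stock +20%
     [0.0, 1.0]]
T = [[0.9, 0.0],   # hypothetical taxes: everything -10%
     [0.0, 0.9]]

X = matmul(T, N)   # TN: adjust for news, THEN for taxes

portfolio = [[1000], [1000]]
two_steps = matmul(T, matmul(N, portfolio))  # run each adjustment separately
one_step  = matmul(X, portfolio)             # run the pre-combined matrix

# Same result, and we never needed an intermediate portfolio to build X.
assert [round(r[0], 6) for r in two_steps] == [round(r[0], 6) for r in one_step]
assert round(one_step[0][0], 6) == 1080.0    # 1000 * 1.2 * 0.9
```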

## Can We Use Regular Addition, Please?

Yes, because you asked nicely. Our “mini arithmetic” seems limiting: multiplications, but no addition? Time to expand our brains.

Imagine adding a dummy entry of 1 to our input: (x, y, z) becomes (x, y, z, 1).

Now our operations matrix has an extra, known value to play with! If we want `x + 1` we can write:

```
[1 0 0 1]
```

And `x + y - 3` would be:

```
[1 1 0 -3]
```

Huzzah!

Want the geeky explanation? We’re pretending our input exists in a 1-higher dimension, and put a “1” in that dimension. We *skew* that higher dimension, which looks like a *slide* in the current one. For example: take input (x, y, z, 1) and run it through:

```
[1 0 0 1]
[0 1 0 1]
[0 0 1 1]
[0 0 0 1]
```

The result is (x + 1, y + 1, z + 1, 1). Ignoring the 4th dimension, every input got a +1. We keep the dummy entry, and can do more slides later.
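That slide matrix can be checked directly (pure Python, with a tiny helper of my own):

```python
def run(matrix, vec):
    """Pour one input column through each operation row."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

slide = [[1, 0, 0, 1],
         [0, 1, 0, 1],
         [0, 0, 1, 1],
         [0, 0, 0, 1]]   # skew the dummy dimension: +1 to x, y, and z

# The dummy "1" rides along unchanged, ready for the next slide.
assert run(slide, [10, 20, 30, 1]) == [11, 21, 31, 1]

# And x + y - 3 as a single 1-row operation on (x, y, z, 1):
assert run([[1, 1, 0, -3]], [10, 20, 30, 1]) == [27]
```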

Mini-arithmetic isn’t so limited after all.

## Onward

I’ve overlooked some linear algebra subtleties, and I’m not too concerned. Why?

These metaphors are helping me *think* with matrices, more than the classes I “aced”. I can finally respond to “Why is linear algebra useful?” with “Why are spreadsheets useful?”

They’re not, unless you want a tool used to attack nearly every real-world problem. Ask a businessman if they’d rather donate a kidney or be banned from Excel forever. That’s the impact of linear algebra we’ve overlooked: efficient notation to bring spreadsheets into our math equations.

Happy math.

## Other Posts In This Series

- A Visual, Intuitive Guide to Imaginary Numbers
- Intuitive Arithmetic With Complex Numbers
- Understanding Why Complex Multiplication Works
- Intuitive Guide to Angles, Degrees and Radians
- Intuitive Understanding Of Euler's Formula
- An Interactive Guide To The Fourier Transform
- Intuitive Understanding of Sine Waves
- An Intuitive Guide to Linear Algebra
- A Programmer's Intuition for Matrix Multiplication
- Imaginary Multiplication vs. Imaginary Exponents

**Frederick Ross** (October 9, 2012 at 3:26 pm):

A spreadsheet is actually a much more general structure than a vector space, but leaving that aside, there are two insights being mashed into one here:

1. Many problems, though not obviously geometric, can be shoved into some geometric form and more easily tackled thus.

2. There is a scatter of algebraic structures describing features of various geometries that have proved useful over the years. The scatter is roughly a cone from the origin of unstructured sets out to things like noncommutative geometry.

There are other things near to vector spaces in that scatter, like affine spaces (vector spaces without origin) and modules (vector spaces over rings instead of fields), and all kinds of fascinating things that show up (look up tropical semirings and you’ll go down an enormous rabbit hole, all without ever leaving familiar algebraic operations). Reach a little farther out and you find yourself with inner product spaces, then add some calculus and you get Banach spaces, and a little farther in Hilbert space (weird fact: there’s only one Hilbert space; all its various appearances are all isomorphic). Keep going and you start finding yourself in Riemannian geometry, and then even farther out in noncommutative geometry and the like.

Vector spaces are a sweet spot, for three reasons:

1. They’re sufficiently unstructured where most of the components of more complicated geometries like tangent spaces and the like will all be vector spaces.

2. They’re just enough structure to always be able to express any abstract vector space in familiar vectors of numbers and operations on them as matrices. Thus you can always grab a basis and start computing, no matter how exotic your vector space may seem.

3. They’re turtles all the way down. The space of linear operations on a vector space is a vector space. The space of coordinate transformations from one basis on a vector space to another basis is a vector space. When you start adding inner products and the like, you can pretty well always find a way of looking at them where they’re just another vector space.

They have oddly nice properties as well. For example, no matter how weird the vector space, it has a well defined dimension (though it may be infinite).

Linear algebra “done right” is really a question about the structure that emerges from a very broad class of geometric problems. The really interesting part is how you suture vector spaces together in various ways to get other classes of geometries entirely. For instance, a curved space isn’t a vector space, but define a tangent space at every point of that space. The tangent space at a point is a vector space with the same dimension as the space. You can think of it as the velocity of an object at that point. The geometry comes in when you use the notion of tangents as velocities to map back to actual paths in the space.

So matrices and vectors of numbers are nice, but they’re barely the tip of the iceberg of linear algebra.

**Bill Kidder** (October 9, 2012 at 4:07 pm):

This is the 2-week intro to linear algebra I received in Grade 12. The real interesting stuff starts with those eigen-things, which leads you to solving interesting problems in time series analysis and systems of ecology, among others.

**D.Dick** (October 9, 2012 at 5:46 pm):

“Guassian elimination” typo

**Jeremy Kun** (October 9, 2012 at 7:04 pm):

I think you’re glazing over the main point of matrices:

Every linear map can be represented by a matrix.

This should not be obvious to the beginning student. We don’t work with matrices just because they give us a useful way to organize information, because that simply wouldn’t be useful if we couldn’t use them to represent any linear map. We work with matrices because they completely characterize the functions we care about.


**Alex** (October 9, 2012 at 7:54 pm):

I wrote some of what I know about eigenvalues and determinants down here: http://ajkjk.com/blog/?p=18

Maybe it will be helpful for intuition.

**zidni** (April 26, 2017 at 9:51 pm):

Could you please repost the link? Thx a lot.


**Sriram Srinivasan** (October 9, 2012 at 9:46 pm):

Beautifully explained, as usual. Thank you.

A good accompaniment to your explanation is a geometric intuition of matrices, eigenstuff and singular value decomposition here: http://www.ams.org/samplings/feature-column/fcarc-svd


**SDX2000** (October 10, 2012 at 1:45 am):

Thanks for sharing your insights on matrices… reading this brought a tear to my eye… I wish my school teachers were like you.

**Mentock** (October 10, 2012 at 10:38 am):

“Linear algebra emerged in the 1800s yet spreadsheets were invented in the 1980s. I blame the gap on poor linear algebra education.”

Spreadsheets have been used by accountants for hundreds of years ( http://dssresources.com/history/sshistory.html ), and programs for computers were developed almost as soon as there were computers.

“However, linear algebra is mainly about matrix transformations, not solving large sets of equations (It’d be like using Excel for your shopping list).”

I’ve used it to solve large sets of equations, with thousands of equations, but I’ve also used Excel for a shopping list too… :)

**seldon** (April 23, 2016 at 6:14 am):

Actually, computer programs existed before computers, and I also think he underestimates solving multiple equations with matrices (that is done all the time in science, e.g. for modelling data).

**Amr Abughazala** (April 6, 2017 at 8:38 am):

But his point is still correct: linear algebra is wasted due to the way it is explained, and still is.

The way schools, universities, and all the online courses teach it, up till today, is exactly about knowing sizes, vectors, positive definiteness and zero determinants. Those are the things that always end in “why?”. Now I am working with tensors without knowing why; they sometimes said “skewing”, but I never saw what was skewed.

**unconed** (October 10, 2012 at 12:44 pm):

I have to disagree on the “spreadsheet” approach to linear algebra. Matrix/vector multiplication never made any sense to me, until I realized it’s just projecting the vector onto the original identity basis, and then reconstituting it using the new basis instead. You can discover and draw this process entirely visually.

The relationship between a matrix and the vectors made up of its rows or columns is ridiculously obvious once you see it in action. Yet in years of linear algebra and engineering, nobody ever bothered to show this to me.

**Ann Loraine** (October 11, 2012 at 6:15 am):

When I printed this, the first letter of every line was cut off. Can you fix the print css?

**Ilya** (October 11, 2012 at 7:09 am):

Thank you very much!)

**brian m** (October 11, 2012 at 7:51 am):

Like most things, the best way of learning something is to approach it from different viewpoints. This article does that, although I’m not convinced about keeping examples abstract.

I only really understood the advantages of a Matrix when I had to write a program to rotate points in 3d space on a different course. Did the rotation equations then the whole concept of matrix maths ‘clicked’ when I realised its nothing special it’s just a neat way of doing the maths!

In fact the whole mystery of maths clicked: maths is nothing more than a human language and tool for describing how things interact. Maths doesn’t have rules; it simply implements observations of physical reality in a convenient way. For example, complex numbers in electrical engineering, integration, etc.

Of course the discovered rules can then hint at other physical rules that haven’t yet been discovered, which is the real power of maths in general!

**Neo** (October 11, 2012 at 9:41 am):

… actually it’s Gauss-Jordan elimination. Gaussian elimination would only give you an upper triangular matrix instead of the identity matrix.

**mark ptak** (October 11, 2012 at 9:58 am):

I love linear algebra, but until the K-12 system gets a clue I have taken to promoting column vectors as often as possible. Points become x stacked on y stacked on z, and of course one can always do the transpose if the medium makes row vectors more palatable. And let’s not forget to pay homage to Gilbert Strang in these discussions, as one who didn’t need to but still put his folksy lectures out at MIT Open Courses.


**Tom Elovi Spruce** (October 12, 2012 at 1:27 pm):

“No! Grammar is not the focus.”

To be honest, that part of the article threw me off because it sounded like you were criticizing yourself for shifting the focus from math to english.

I think you jumped to the analogy too abruptly and its link to how matrices are taught isn’t clear.

**George** (October 15, 2012 at 6:51 pm):

@Tom Elovi Spruce,

I believe, the topic of the article at hand is math, as opposed to literature, but if it were, you’d have a valid point.


**kalid** (October 26, 2012 at 10:42 pm):

Long-lost reply (I lost my laptop the day after I posted this article… argh).

@Frederick: Thanks for the note, and the detailed examples! There’s definitely lots to explore — I’m barely getting my toes wet — and I like the analogy of a “cone” of possibilities. Also, the idea that a curved space is not a vector space, but its tangent space is — pretty cool transformation. So much of math is just shifting your perspective.

@Bill: Those eigen-things seem to be the heart of it all.

@D.Dick: Thanks, fixed.

@Jeremy: Matrixes can definitely go deeper (to any linear operation) but it’s a crawl/walk/run thing.

@Alex: Thanks so much! Appreciate the detailed overview. I’ll have to dive into it.

@Sriram: Glad it clicked, and thanks for the link.

@SDX2000: Really appreciate it :).

@Mentock: Good point. Maybe a better phrasing is that spreadsheets have been used by accountants for centuries, without them realizing they could have been helped by “linear algebra” :).

@unconed: No problem. 99% of linear algebra courses will use vectors / projections, but I like spreadsheets because they’re so tangible and familiar. We should use every analogy we can.

@Ann: Thanks for the report, I’ll take a look.

@Ilya: Welcome!

@brian m: Yep, matrixes started off as bookkeeping for equations. And math is definitely a tool/language for communication. If we’re using math, but missing the ideas, we’re not doing math!

@Neo: Thanks, fixed.

@mark: Thanks for the reminder, I need to revisit the Strang lectures :).

@Tom, @George: Yep, “Grammar” was my analogy for focusing on structure but not ideas. Maybe I can think about the transition there.

**Mladen Stific** (October 28, 2012 at 12:00 pm):

Great choice of topic! I jumped on this one hoping to refresh my memory on linear algebra and reconsider its usefulness, but this article was a bit of a bumpy ride. I would say your usual style allows for a much smoother transition from building blocks to a-ha moments.

In “Organizing Inputs and Operations”, if you look carefully, this will read smoothly to someone who already understands what you’re talking about, but a novice would be lost. You introduce two different operations at the same time as you’re explaining what the rows mean in the matrix notation, leading to both points being hard to catch. Going forward, you frequently forget you’ve not introduced a notion before you start using it (“transformation”), using the “axes” in the globe analogy without really explaining what you’re doing there etc.

I hope you take it well – this article definitely is better explained than what I got in college, but I came to expect even better from you :)


**Tysen** (December 21, 2012 at 11:01 am):

Thanks for this article. You have a typo in this sentence:

A determinant of 0 means matrix is “desctructive” and cannot be reversed (similar to multiplying by zero: information was lost).

desctructive => destructive

**kalid** (December 21, 2012 at 4:15 pm):

Thanks Tysen, just fixed!

**kalid** (December 21, 2012 at 4:32 pm):

@Mladen: Thanks for the feedback! Yes, the “Organizing Inputs and Operations” section is the tricky transition, I’ll have to see if I can make it a bit smoother. One thing I love about the net :).

**mophism** (December 23, 2012 at 4:25 am):

Cool articles. Could you do one that covers isomorphism, monomorphism, epimorphism and so on?

**Kalid** (January 28, 2013 at 10:01 pm):

Thanks mophism, appreciate the suggestion!

**rrdillon** (February 7, 2013 at 7:20 pm):

OMG, in a matter of just a few lines you’ve completely de-mystified the notion of eigenvector. Thanks!

**kalid** (February 11, 2013 at 11:25 am):

@rrdillon Glad it clicked!


**Abdul** (March 31, 2013 at 12:09 pm):

Hi, and thank you for making this article.

I’d like to point out that an eigenvector is a vector whose direction is unchanged or invariant under a transformation.

An invariant line is one where any point on the line is mapped to another point on the same line. This means that under a transformation, a vector could change its direction to point in the opposite direction (and this would also mean it would be on the same line), and hence this vector would also be an eigenvector and have a corresponding eigenvalue which would be negative in this case (the vector would be scaled in the opposite direction).

I thought I should mention this as your explanation (and the wiki demo) is quite misleading as it only demonstrates one of the two possible cases (direction being the same)

Thanks

**kalid** (April 3, 2013 at 3:02 pm):

Hi Abdul, great comment. Yes, a reflection is a good example — we stay along the same line, but are pointing the other way. Appreciate the clarification.

**Sandals** (May 29, 2013 at 10:36 pm):

This has literally blown my mind multiple times, on multiple levels. This is exactly what I’ve needed… so glad I found this before the final exam lol. Thank you so much for posting this, keep up the good work!

**Festus** (September 27, 2013 at 4:21 pm):

This is epic! Simply epic! Linear Algebra, I got you now…

**Sam** (November 18, 2013 at 1:50 pm):

This is not giving the correct intuition for linear algebra.

See Gilbert Strang’s first couple of lectures. They give an intuitive feel and are presented by someone who really understands linear algebra.

**kalid** (November 18, 2013 at 10:04 pm):

Hi Sam! I think intuition clicks differently for everyone — if one analogy helps elucidate an aspect of the subject, so much the better (it’s not like you’re limited to one metaphor). I like Strang’s work in general, but didn’t have much intuition even after acing my university class that used his book! There are more metaphors I need to find for myself.

**Kumar** (January 25, 2014 at 10:06 pm):

Looking at matrices as “operations” that take “input” data and transform it to “output” data is very intuitive.

**kalid** (January 31, 2014 at 6:44 pm):

Thanks Kumar, glad it clicked.

**Dimitris** (February 19, 2014 at 9:07 am):

G(x, y, z) = F(x + y + z) = F(x) + F(y) + F(z)

Very good (as always), however I think you do not explain the crucial aspect of dimension independence of a vector space. The “mini arithmetic” addition above can never actually happen as each element represents an independent dimension. This is one of the most difficult concepts to understand IMO, especially when you think of polynomial or function vector spaces.

**kalid** (February 19, 2014 at 9:15 am):

Thanks Dimitris, great feedback. Down the road I’d like to do a follow-up on linear algebra, with independent vectors as the focus. I think the idea of a spreadsheet gets the notational/mechanical elements out of the way, so we can then begin exploring the underlying concepts (just what is an input, anyway?). Appreciate the thoughts!

daily 03/12/2014 | Cshonea's BlogMarch 12, 2014 at 1:31 pm[…] An Intuitive Guide to Linear Algebra | BetterExplained […]

KumarMarch 18, 2014 at 10:00 pmHi Kalid, I couldn’t quite figure out why F(x) = x+3 is not linear. After all, y=x+3 is a straight line meeting the y axis at (0,3), and with a slope of 1. This definition of a straight line (i.e. linear) is different from the definition of ‘linear’ you gave. Am I missing something here? BTW, I am not a math major :) but your explanations of complex math are quite intuitive. Thanks

kalidMarch 18, 2014 at 10:24 pmHi Kumar,

Great question. The term “linear function” actually refers to two separate (but related) concepts (see http://en.wikipedia.org/wiki/Linear_function for more detail).

1) A polynomial of degree 0 or 1, i.e. f(x) = ax + b

2) a “linear map”, meaning a function that has the properties that scaling the inputs scales the outputs, f(c*a) = c*f(a), and adding the inputs adds the outputs, f(a + b) = f(a) + f(b).

The function f(x) = x + 3 meets the first definition (polynomial of degree 1), and it is a straight line when drawn. But it doesn’t have the linear input/output relationship. For example, f(1) = 4, but f(2) = 5. We doubled the input, but did not double the output.

The main reason a line is not “linear” (in the linear map sense) is because of that + b term, which is +3 in our case. That +3 is the same amount, no matter how the input changes.

The two meanings are easily confused, and did confuse me for a long time! Linear algebra deals with the behavior of functions that are linear maps.
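The two tests are quick to try numerically — here’s a small illustrative sketch (the function names are just for this example):

```python
# f(x) = x + 3 draws a straight line but fails the linear-map tests;
# g(x) = 3x passes both.

def f(x):
    return x + 3   # straight line when plotted, but NOT a linear map (the +3 offset)

def g(x):
    return 3 * x   # a true linear map: no constant offset

# Scaling test: doubling the input should double the output.
print(f(2), 2 * f(1))   # 5 vs 8  -> fails
print(g(2), 2 * g(1))   # 6 vs 6  -> passes

# Additivity test: f(a + b) should equal f(a) + f(b).
print(f(1 + 2), f(1) + f(2))   # 6 vs 9  -> fails
print(g(1 + 2), g(1) + g(2))   # 9 vs 9  -> passes
```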

KumarApril 5, 2014 at 7:57 pmThanks. By the way, thanks to your post, I finally understood the reason behind the movie name “Matrix”. It is about matrices, the “transformations” of “real space” into “virtual space”. It just dawned on me a moment ago when watching The Matrix Reloaded! Granted it is a science fiction movie, but still, for the movie producers or whoever to have come up with that very apt name is really amazing (because of the required mathematical insight).

Eric VMay 7, 2014 at 12:33 amLove your flair Kalid!

This post had me laughing hysterically! Especially the part about “The survivors are…”

Wish I’d seen something like this when I was first exposed to matrices, maybe I wouldn’t have run the other way. Imagine my dismay when, in pursuit of my true love of physics, I encountered the Riemannian metric tensor and had to go back and learn all that matrix stuff I’d ignored for years. Oh yeah, and then there’s moment of inertia. For so long I had been taught it is a single number, but no, it just had to be a matrix. It made sense though. A matrix transforms one vector to another. Torque = I * alpha: the moment of inertia transforms angular acceleration (vector) to torque (vector). Now if those pesky eigenthingies would just leave me alone!

B. RichJuly 23, 2014 at 8:30 pmHi Kalid,

I disagree with your analysis of the principle of homogeneity in your above example.

A function f(x) is homogeneous if f(n*x) = n*f(x). To use your example:

f(x) = x + 3

f(2*x) = 2*(x+3)

For x = 1, then:

f(1) = 4

f(2*1) = 8

This should be true for all x.

kalidJuly 24, 2014 at 9:45 amHi B. Rich, when evaluating the function you need to replace “x” with the value. So,

f(2*x) = (2*x) + 3

f(1) = 1 + 3 = 4

f(2) = 2 + 3 = 5

devsjeeSeptember 15, 2014 at 3:12 amVery beautiful! Thank you.

DeterminatorOctober 30, 2014 at 6:39 pmHi,

I have a quick question about

“The determinant is the “size” of the output transformation. If the input was a unit vector (representing area or volume of 1), the determinant is the size of the transformed area or volume.”

If I have

\[A = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}\]

and then feed the column vector (1, 1) into this operation, I get (3, 2).

The determinant of A is |A| = 2. So in what sense is the area of (3, 2) twice the area of (1, 1)?

DeterminatorOctober 30, 2014 at 6:40 pmCorrigendum: I get (3, 1) and not (3, 2). The question remains though.

Ignace ErauwJanuary 6, 2015 at 1:57 pmI agree with the person linking linear algebra to far more advanced spaces like Sobolev spaces, Hilbert spaces.

Problem as always in these tutorials is proofs.

Math is about proving theorems. So too in LA.

In Belgium, 1st year at University LA class, the same.

Examination tests are on proving theorems.

So: proof-building books are essential to maths. If you can’t take that hurdle, forget it. Analysis, heavy proofs; algebra the same.

Try once to explain a theorem’s proof in LA.

It will help a lot of people. Math is not about calculating. Leave solving systems of linear equations to the computer.

Ashish ShuklaAugust 5, 2018 at 8:06 pmHi,

I have heard this many times and I don’t really get it. People say “Math is not about…”, “Math isn’t…”! What really is MATH about?! I haven’t found the answer to this question yet…

MagnusOctober 19, 2018 at 7:58 amMaths is at its purest a language used to organise, structure and transform complex sets of data… Mathematicians in general do 3 things. We explore, generalise and “reduce”. We explore the “world of maths” in hopes of finding new data with interesting properties. If the data is unique we attempt to generalise its properties. If the properties/data seem familiar we “reduce”/restructure the information into something that has been rigorously proven to be true.

Linear Algebra Intuition Part 1: Vectors | Mark's BlogJanuary 20, 2015 at 9:23 pm[…] EDIT: Note that this series is not intended to teach linear algebra, but to record insights into visualisation aids for linear algebra for those who are already in the process of learning from some other source. If you’d like to actually learn linear algebra, there are other great resources like this one. […]

Anthony ClohesyJanuary 22, 2015 at 2:57 amThank you so much for this – having read and used your imaginary numbers post in my recent introduction for year 12, I thought ‘I wonder if he’s done anything on matrices?’ Watched Derek Holt’s lectures on Linear Algebra over the summer (they’re very good), but for a really intuitive introduction I can’t ask for better than this page. Granted, it won’t be a full description, but what I really need is an intuitive hook to get started with, and this was definitely it! My year 12 Further Maths group thank you!

kalidJanuary 22, 2015 at 3:07 amAwesome, really glad to hear it helped :). Linear algebra befuddled me for a while because I always associated it with “advanced” operations, like rotating a robotic arm in 3d or solving a giant system of equations. No — we can just take an everyday example, like having a stock portfolio and updating it based on some event. Seeing it as a ‘mini spreadsheet’ helped me wrap my mind around the use cases, which can of course expand into the fancy vector operation stuff.

Bookmarks for January 22nd | Chris's Digital DetritusJanuary 22, 2015 at 3:50 pm[…] An Intuitive Guide to Linear Algebra | BetterExplained – […]

MauricioFebruary 17, 2015 at 9:19 am@Determinator

I’m also confused about what he says about determinants and the unit vector!

First of all a unit vector is not a vector that has an area of one: http://en.wikipedia.org/wiki/Unit_vector

And multiplying a matrix A by a unit vector does not result in a vector of area = det(A), as you can see by simple examples.

So, I’m still confused about what a determinant means.

But wikipedia helps a little:

http://en.wikipedia.org/wiki/Determinant#2.C2.A0.C3.97.C2.A02_matrices

Also, look at

http://www.wolframalpha.com/input/?i=det+%7B%7B2%2C1%7D%2C%7B0%2C1%7D%7D

The area of the parallelogram represented by the matrix is 2, which is the determinant of the matrix.

kalidFebruary 27, 2015 at 6:53 pm@Determinator, @Mauricio: Whoops, I wasn’t clear enough. Let me clarify. Imagine an x-y axis. A unit square would be determined by two vectors, one on the x-axis (1, 0) and one along the y-axis (0, 1). In a matrix this is {{1 0}, {0 1}} which indeed has unit area:

http://www.wolframalpha.com/input/?i=det+%7B%7B1%2C0%7D%2C%7B0%2C1%7D%7D

Take another matrix, such as {{2,1},{0,1}}. The determinant is 2. Before doing the math, I know my original unit area will be transformed to some set of vectors that sweep out an area of 2.

Wolfram alpha shows the result:

http://www.wolframalpha.com/input/?i=%7B%7B2%2C1%7D%2C%7B0%2C1%7D%7D+.+%7B%7B1%2C0%7D%2C%7B0%2C1%7D%7D

which has area 2:

http://www.wolframalpha.com/input/?i=det+%7B%7B2%2C1%7D%2C%7B0%2C1%7D%7D

We can see that one vector is unchanged, but the other has been skewed, increasing the total area to 2. If my original shape was something like {{5,0},{0,1}}, with area 5, I know the result would have area 10 after being transformed:

http://www.wolframalpha.com/input/?i=det+%7B%7B2%2C1%7D%2C%7B0%2C1%7D%7D+.+%7B%7B5%2C0%7D%2C%7B0%2C1%7D%7D
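The same checks can be reproduced in a few lines of NumPy instead of Wolfram Alpha (an illustrative sketch using the matrices above):

```python
import numpy as np

# det(A) is the factor by which A scales area.
A = np.array([[2, 1],
              [0, 1]])

unit_square = np.eye(2)          # columns (1,0) and (0,1): area 1
stretched   = np.array([[5, 0],  # columns (5,0) and (0,1): area 5
                        [0, 1]])

print(np.linalg.det(A))                # 2.0
print(np.linalg.det(A @ unit_square))  # 2.0  (1 * 2)
print(np.linalg.det(A @ stretched))    # 10.0 (5 * 2)
```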

Hope this helps.

MauricioFebruary 27, 2015 at 9:34 pmyes, I understand. Thanks for the clarification.

直观理解线性代数的一些概念 | Spark & ShineMarch 6, 2015 at 11:12 am[…] As you can see, linear-equation computations can be expressed as matrix operations, which is more intuitive. My understanding: writing linear equations as matrices makes it convenient to process data at scale, i.e. batch processing. The figure above also shows that matrix multiplication multiplies rows by columns; the figure below expresses the intuition behind matrix multiplication more clearly (image from here): […]

TlApril 24, 2015 at 6:39 pmI like to think of Eigenvectors and eigenvalues from the perspective of a pitcher standing on the mound on a windy day. In the space between the pitcher and home plate there are many lines of force going in different directions created by the swirling winds. Some of these happen to line up perfectly with the direction the pitcher is throwing the ball. These are Eigenvectors. When the ball is thrown along this perfectly lined up vector it will gain speed in addition to the initial speed it was thrown with. This additional speed is the eigenvalue. The lines of force are contours in the space that influence the way the ball moves through that space.

kalidApril 25, 2015 at 1:43 pm@TI: Great analogy, thank you!

FationApril 28, 2015 at 3:43 amCan you come up with a real-world problem using matrices? Take a problem that actually happens out there in the world, write it in terms of equations with x, y and z, build the matrices, and solve it. Most importantly, find the inverse of that matrix and tell us what the inverse matrix means in terms of the real-world problem: what does it tell us about the problem we just broke down into equations?

I know how to find the inverse matrix and all that stuff, but in the end I don’t have any idea what the hell it means in the real world. It just looks like an arbitrary made-up nonsense number trick game.

TlApril 28, 2015 at 7:18 pm@Fation

Economists use linear algebra, and specifically matrix inversion, all of the time. One classic example would be input-output analysis. The form would be something like Q = AQ + B. Q would be the overall quantity demanded of a good that is both an input in making other goods (the AQ portion) and is also sold to an end user on its own (the B portion). A is a matrix of “technical coefficients” that describes how much Q goes in to make the other goods. In linear algebra, because of the nature of matrices, you can’t simply divide one by the other. Inversion takes the place of division (it is even denoted A^-1, which is another way of saying it’s a divisor); multiplying by an inverse is equivalent to dividing. So in this example Q - AQ = B –> (I - A)Q = B –> Q = (I - A)^-1 * B, “I” being the identity matrix. By multiplying the inverse of (I - A) by B we get Q. So through inversion we discovered an independent expression for Q, which is very useful for figuring out some stuff. This is just a simple example, but it is used in practice for modeling in the real world. This link provides more detail if you’re interested http://www.math.unt.edu/~tushar/S10Linear2700%20%20Project_files/Davidson%20Paper.pdf
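That model is easy to sketch numerically — here’s an illustration in NumPy (the coefficients in A and the demand B are made up purely for this example):

```python
import numpy as np

# Solve Q = AQ + B via Q = (I - A)^-1 B.
A = np.array([[0.2, 0.3],   # "technical coefficients": how much of each
              [0.1, 0.4]])  # good goes into producing the others

B = np.array([100.0, 200.0])  # final (end-user) demand for each good

I = np.eye(2)
Q = np.linalg.inv(I - A) @ B  # total output needed to satisfy B

print(Q)
# Sanity check: Q really satisfies Q = AQ + B
print(np.allclose(Q, A @ Q + B))  # True
```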

PrashantSeptember 21, 2015 at 6:41 amI am getting a headache on tensors. Please post a lesson about it.

RameshSeptember 24, 2015 at 9:45 amBeautifully explained. Having always been baffled by matrices and determinants, I have to say that this is the best lesson that I have had.

ArwaApril 10, 2016 at 3:04 pmKhalid

Somehow, as I progress in the ‘learning scheme’, I find your words describe exactly my thoughts!

Along the way you answer questions I have had on my mind and couldn’t visualise, like det of matrix = 0.

Thanks a lot and keep doing what you do!

AndalusOpenSourceMay 6, 2016 at 5:29 pmKalman Filter

http://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures/

Why don’t you make a tutorial about this topic?

I’m looking forward to seeing you talk about the Kalman filter.

I gave you a good tutorial about the Kalman filter, but I’m sure you will do something with more insight and better understanding, as you always do.

Best of luck, and many thanks for your great effort.

JJJMay 13, 2016 at 11:54 amI LOVE you! You’re amazing! I didn’t know that reading about linear algebra could be so entertaining! <3333 Many thanks ^^

DerekJuly 8, 2016 at 4:48 pmA few more ideas, mostly for those who code:

Not only are square matrices useful for storing directed graphs, the relationship between matrix vector multiplication and Breadth First Search on a graph is a very powerful one. Put simply, for any given state on your graph, you can iterate each ‘marker’ one step forward via a matrix vector multiplication. In this sense, your matrix quite literally is a map from one allowed point to the next.
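For instance, a minimal sketch of that idea (the 3-node cycle graph below is made up for illustration):

```python
import numpy as np

# Directed 3-node cycle: 0 -> 1 -> 2 -> 0.
# Entry M[i, j] = 1 means an edge from node j to node i, so M @ state
# advances every marker one step along the graph.
M = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])

state = np.array([1, 0, 0])   # one marker on node 0

state = M @ state
print(state)  # marker moved to node 1: [0 1 0]

state = M @ state
print(state)  # marker moved to node 2: [0 0 1]
```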

With respect to eigenvalues and eigenvectors: High level, I wouldn’t worry about nits like Jordan blocks. The gist is: if you want to, you can diagonalize a matrix. You can always do so when you have unique eigenvalues. If eigenvalues are not unique, you still can sometimes — but even then, if you have linearly dependent eigenvectors associated with a matrix A, you can still diagonalize an approximation of A if you perturb the diagonal of A slightly. In such a case you have an approximation of A (of course some care is needed here in how/why you approximate A) in a very interpretable form, and that can be very useful.

One of the things Linear Algebra really hits home on is how a change in representation (via factorization) can make certain problems a lot easier to understand. Indeed, if you have diagonalized A such that $A = PDP^{-1}$, you now can easily see what is going on under the hood of A when you multiply AA, AAA, AAAA, and so on. Something that may have been a bit confusing or opaque has now been transformed to simply using exponents on scalars (the eigenvalues along the diagonal matrix D). And if A is not invertible, it becomes even more obvious why — D is a diagonal matrix with one or more zeros on its diagonal… so to invert A we’d need to (among other things) take the reciprocals of the diagonal of D… which turns this into a classic problem of trying to divide a number by zero, which prevents you from completing the inversion of A.
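A quick numerical sketch of that point (the matrix below is an arbitrary diagonalizable example):

```python
import numpy as np

# Once A = P D P^-1, powers of A reduce to powers of the scalars on D's diagonal.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # symmetric, so safely diagonalizable

eigvals, P = np.linalg.eig(A)
Pinv = np.linalg.inv(P)

# A^3 the direct way vs. via P D^3 P^-1
direct  = A @ A @ A
via_eig = P @ np.diag(eigvals ** 3) @ Pinv

print(np.allclose(direct, via_eig))  # True
```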

And once you have this down, you can grab the Taylor series for the exponential function, and quickly understand that exp(A) really just means to exponentiate the eigs for A — which can be very useful for solving certain classes of problems.

On top of all this, if there is a single dominant steady state that A seems to push a nonzero vector toward, (or equivalently: your graph seems to always end up in a similar and constant looking configuration after iterating through it for a bit), we can very easily understand this is related to the dominant eigenvector of A, and that the rate of convergence is a function of the magnitude of A’s ‘second biggest’ eigenvalue relative to the dominant eig. This is perhaps most obvious with Markov Chains but it applies in general.

There is quite a bit more to be said about the links between graph theory and matrices, but the above are a few of the big ideas. One thing that is nice about graph theory is that you can draw pictures and then use them to model out complicated behavior. (If you can get a computer to draw the pictures and visually show what happens when you move through a graph, even better.) This allows a nice visualization tie-in with matrix-vector operations that hopefully is complementary to the geometric ones other people have mentioned.

BenJuly 11, 2016 at 12:22 amIs there a book on this intuitive linear algebra, and where can I buy it?

DorothyJanuary 26, 2017 at 7:27 amwow, I like it

Dr CL VermaMarch 30, 2017 at 5:03 amI want to know how linear algebra is used in Applied Statistics. I did only normal classical algebra.

AleixApril 11, 2017 at 11:57 amHello

I’m a beginner learning algebra. Excuse me if my question is absurd.

When you present the operation F(x, y, z) = 3x + 4y + 5z

Why do we sum their results?

I mean, we only summed in the cases you presented earlier because the operation was the same for all inputs, like in

G(x, y, z) = F(x + y + z) = F(x) + F(y) + F(z)

But F(x, y, z) = 3x + 4y + 5z is composed of three different operations; it could be written as F(x, y, z) = A(x) + B(y) + C(z).

I understand that we combine linear operations (A(x), B(y), C(z)) to create F(x, y, z), but how does this one respect the rule “adding inputs adds the outputs”?

Thanks a lot!

Machine Learning Introduction – Deep Machine LearningSeptember 2, 2017 at 5:14 am[…] Basic linear Algebra […]

Pattern Recognition using PCA: Variables and their Geometric Relationships – Laplace BayesSeptember 13, 2017 at 8:29 am[…] the determinant of the matrix (which intuitively means the scale of the transformation – read here for some more details). Only square matrices have determinants, and if the determinant is zero, […]

BartNovember 17, 2017 at 10:48 amYou lost me at “Exponents…aren’t predictable.”

干货 | 请收下这份2018学习清单：150个最好的机器学习，NLP和Python教程-时讯快报January 29, 2018 at 3:40 pm[…] A concise guide to linear algebra (betterexplained.com) […]

Jae Duk SeoJuly 5, 2018 at 12:52 pmamazing and very very good

Jeff ConnorsAugust 22, 2018 at 3:16 amSuch a great explanation, can’t tell you how thankful I am :-) THANK YOU so much for the beautiful explanation!

An Intuitive Guide to Linear Algebra – LinktailJanuary 9, 2019 at 12:32 pm[…] Click here […]

Data Science学习第三天: Free Data Science learning resources. - { 远山 } 江南August 22, 2019 at 8:16 pm[…] https://betterexplained.com/articles/linear-algebra-guide […]