REPO: algorithms
See References for links to course content
Work notes from Introduction to Algorithms MIT 6.006 course lectures on YouTube
MIT 6.006 course info ~
Lecture & Problem Schedule ~
Lecture notes ~
Resources - Python / Latex ~
Quizzes / Exams
KEY: (:white_check_mark:) watched, (:mag:) rewatch, (:flashlight:) unseen / to watch, (:question:) problem / open question
CODE: (:seedling:) code complete, (:cactus:) incomplete / needs work, (:lemon:) not happy / code smells,
TABLE created with Table Generator - load table from here
- Abstract
- Progress
- Contents
- AIM:
- Introduction to algorithms MIT - part 1 / 3:
- Unit 1: Introduction
- Problem set 1.
- Problem 1-1. [15 points] Asymptotic Practice Calculating asymptotic complexity (Big O notation)
- Problem 1-2. [15 points] Recurrence Relation Resolution
- Problem 1-3. [16 points] Peak-Finding Correctness
- Problem 1-4. [16 points] Peak-Finding Efficiency
- Problem 1-5. [19 points] Peak-Finding Proof
- Problem 1-6. [19 points] Peak-Finding Counterexamples data that shows how the python algorithms can fail
- Unit 2: Sorting and Trees
- Problem set 2.
1. 2-1 Fractal rendering [40pts]
2. 2-2 Digital Circuit Simulation [60pt]
- Unit 3: Hashing
- Problem set 3 - Range Query (Analysis) / Circuit Layout (Tool optimisation)
- Problem set 4
- Quiz 1
- Unit 4: Numerics
- Unit 5: Graphs
- Problem set 5 due
- Unit 6: Shortest Paths
- Unit 7: Dynamic Programming
- L19 - Memoization, subproblems, guessing, bottom-up; Fibonacci, shortest paths
- R19 - Dynamic Programming: Crazy Eights, Shortest Path
- L20 - Dynamic Programming II: Text Justification, perfect information Blackjack. parent pointers
- R20 - Dynamic Programming: Blackjack
- L21 - String subproblems, pseudopolynomial time; parenthesization, edit distance, knapsack
- R21 - Dynamic Programming: Knapsack Problem
- [L22 - Two kinds of guessing; piano/guitar fingering, Tetris training, Super Mario Bros.vid](#l22---two-kinds-of-guessing-pianoguitar-fingering-tetris-training-super-mario-brosvid)
- R22 - Dynamic Programming: Dance Dance Revolution
- Problem set 7 due
- Unit 8: Advanced Topics
- Glossary of terms
- How To's
- References
Create an algorithms reference, and aide-memoire
In the meantime, here are two great resources:
Big O Cheat Sheet: https://www.bigocheatsheet.com/
Open data structures: http://opendatastructures.org/
LECTURE PLAYLIST on YOUTUBE
COURSE INFO @ MIT
contents ~ vid ~ lect notes
time | notes |
---|---|
0m - 15m45 | Introduction to the course and the 8 units it's separated into - mentions prerequisite Maths 6.042 |
15m45 - 36m20 | Peak finding 1D |
17m50 | defining the solution to 1D algorithm |
27m40 | Divide & conquer (binary search) |
32m30 | Recurrence relation overview |
36m20 | Peak finding 2D |
37m13 | defining the solution to 2D algorithm - note a >= b - means |
38m | Greedy ascent algorithm - find highest neighbour, go in that direction, loop |
40m43 | Greedy ascent algorithm complexity Θ(mn) (rows × columns), i.e. O(n^2) for a square matrix - BAD! |
43m | 2D binary search |
46m | EG 1 - incorrect search |
47m45 | EG2 - working version |
51m25 | Recurrence for 2D algorithm |
54m15 | Note on PS1 - Prove one algorithm is correct & find counter examples for the rest (which are not) |
Time complexity (always worst case complexity) aka: Big O notation / Bachmann-Landau notation / asymptotic notation.
See Time Complexity of Common Data Structures great summary of array sorting algorithm complexity and data structure operations
The symbols O, o, Ω, ω, and Θ are used to describe differing kinds of bounds on asymptotic growth rates.
symbol | reads | description |
---|---|---|
O | big O | describes the asymptotic behaviour of functions WORST case or UPPER bound (common in CompSci) |
Θ | big Theta | describes a TIGHT bound on the asymptotic behaviour of functions - LOWER BOUND & UPPER BOUND (common in CompSci) (the upper and lower bounds are the same function, differing only by a constant; often loosely described as the average case) |
Ω | big Omega | BEST case or LOWER bound (common in CompSci) |
o | little O | loose upper bound (common in Maths rare in CompSci) |
ω | little omega | loose lower bound (common in Maths, rare in CompSci) |
T(n) | time function | the exact time or number of steps to complete an algorithm for n items |
Big O Notation wikipedia Family of Bachmann-Landau notations
Big O Notation notes from MIT p3 - from wikipedia?
Maths Symbols by subject Asymptotic behaviour
Time Complexity - Order Severity w/ Examples
Image source: wikipedia Licence: Attribution-Share Alike 4.0 International
The rate of growth of a function is also known as its order.
Great resource about Common Time Complexities
Where c is a constant and n is the size of data - number of elements to process:
name | notation | example |
---|---|---|
LOW | - | - |
constant time | O(1) | Simple calculations, array lookups |
logarithmic time | O(log n) | Binary search |
linear time | O(n) | Finding smallest / largest item in unsorted list |
linearithmic time | O(n log n) | comparison sort, Fast Fourier Transform |
polylogarithmic time | O((log n) ^ c) | |
quadratic time | O(n^2) | Bubble sort |
polynomial time | O(n^c) where constant c > 1 | AKS Primality Test |
exponential time | O(2^n) | Various brute force searches |
factorial time | O(n!) | Brute force travelling salesman |
HIGH | - | many more here - wikipedia |
If a function is made up of multiple terms, (nearly always) the highest order term is used:
(this is because big O is the upper bound (worst case) and the highest order term is the fastest growing, eventually dwarfing the other terms)
E.g.
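For a hypothetical f(n) = 3n^2 + 10·n·log n + 50: the n^2 term grows fastest, so f(n) = O(n^2) - the lower-order terms and the constants are dropped.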
Simplified set theory? . . .
Some people (mostly mathematicians, as opposed to computer scientists) prefer to define O(g(x)) as a set-valued function, whose value is all functions that do not grow faster than g(x), and use set membership notation to indicate that a specific function is a member of the set thus defined. Both forms are in common use, but the sloppier equality notation is more common at present.
Peak finding (assuming a single peak):
Initial look at 1D peak finding (a single-index array)
a) linear
b) search by halves - height of a binary tree is logn (log base 2)
34m50 T(n) = T(n/2) + Θ(1) = Θ(1) + . . . + Θ(1) (log n terms) = Θ(logn)
Binary search tree is formed by the search by halves algorithm and the number of steps to find the target is the height of the tree.
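A minimal sketch of the halving approach (my own illustration, assuming a non-empty array; not the course handout code):

```python
def find_peak_1d(a):
    """Index of a 1D peak (a[i] >= its neighbours) in O(log n)."""
    lo, hi = 0, len(a) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < a[mid + 1]:
            lo = mid + 1          # uphill to the right - a peak must exist there
        else:
            hi = mid              # a[mid] >= a[mid+1] - peak at mid or to the left
    return lo
```

Each iteration halves the range, which is exactly the T(n) = T(n/2) + Θ(1) = Θ(log n) recurrence above.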
See R1 31m & 43m
Question: What comprises work due to n and what can be counted as constant time Θ(1) and disregarded!
51m48
m = columns, n = rows
**That all makes sense, but binary search relies on sorted data and the data in the example is NOT sorted.** Swapping 14 & 20 would result in a fail - maybe it'll make sense later!
Fundamental to binary tree algorithms:
contents ~
vid ~
lect notes ~
Code - PSet 1
Reading:
time | notes |
---|---|
2m - 13m40 | Asymptotic complexity Ω, Θ, O |
13m40 | common worst case runtimes |
16m | example asymptotic notation |
23m | log(log(n)) |
25m | log ( N choose N/2 ) = Θ(n) |
31m | 43m peak finding 1d - running time T(n) |
43m - 53m | peak finding - running time T(n) |
NOTE
g(x) = O(f(x)) - UPPER bound - O - big O
g(x) = Θ(f(x)) - UPPER & LOWER bound - Θ - big Theta
g(x) = Ω(f(x)) - LOWER bound - Ω - big Omega
But in this course where big O is written it usually means big Θ 13m - thanks for the confusion
31m 1d Peak finding - running time T(n) WRITE OUT - in hand written notes L1
43m 2d Peak finding - running time T(n) WRITE OUT - in hand written notes L1
Finish R1 notes from - Recurrence Traps & 2-D Peak Finding: Algorithm 5
(26m) Stirling's approximation - equation for n!
(26m) [Equation for series summation](https://en.wikipedia.org/wiki/Stirling%27s_approximation)
contents ~ vid ~ lect notes ~ DocDistance 8 versions Code see R2 ~ DocDistance 8 versions R2 notes
time | notes |
---|---|
0-6m | What's an algorithm? |
6m | model of computation |
7m50 | Random access machine (model of computation) |
13m40 | Pointer machine (model of computation) |
19m-32m | Python model |
32m-44m | Document distance |
44m | 8 versions of doc distance and optimisations |
Algorithm: computational procedure for solving a problem
What is time (complexity)? O(1) = constant time
Operations an algorithm can perform
Time cost of those operations
Models of computation:
a) random access machine RAM - word array
- load O(1)
- compute O(1)
- store O(1)
- word log(size of memory)
b) pointer machine (OO programming)
- dynamically allocated objects O(1)
- object has constant no of fields O(1)
- word or pointer / references / null or None O(1)
Increment array item:
list[i] = list[i] + 5 is O(1)
Object attribute access:
obj w/ O(1) number of attributes (constant no. of attributes)
attribute access is O(1) - pointer/ref access
Access next item in linked list:
x = x.next is O(1)
Add item to linked list:
list.append(x) - Python uses table doubling, so O(1) amortised
Adding two linked lists together
list1 + list2 O(n)
Code steps:
L = [] O(1)
for x in list1: L.append(x) O( len(list1) ) - for list L - len(L) also written | L |
for x in list2: L.append(x) O( len(list2) ) - len, length, size, no of elements
total: **O(1 + len(list1) + len(list2)) = O(n)**
Checking existence of item in list:
x in L O(n) - linear time
from python check to see if x is in list L
required search through whole list (worst case)
Sorting a list:
L.sort() O(| L | log | L |) - ie O(n log n)
Covered Lecture 3 Insertion sort, merge sort
Retrieving item from hash w/ key:
retrieve dict[key] O(1) constant time (uses lookup hash)
Covered in Lectures 8-10 Hashing with chaining
Adding two longs (many word numbers)
add two number of x words & y words:
x+y O(|x| + |y|)
x*y O((|x| + |y|)^lg3)
NOTE lg used instead of log base 2 (log_2)
lg 3 ≈ 1.585, so better than quadratic time
Covered in Lecture 11 - Integer arithmetic, Karatsuba multiplication
heapq covered in lecture 4 - Heaps and heap sort
d′(D1, D2) - like a correlation function, similarity of two documents
The idea is to look for shared words:
Start by creating the vector of a document, hash of words (key = word) with a count of each one as value.
Create a vector common words, it contains only the word in both doc vectors
Sum them up to give a value.
The larger the value the more correlated they are.
Obviously bigger documents will naturally give bigger numbers, so the value is normalised by dividing by the size of the original vectors.
41m angle of two vectors
more detail?
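A minimal sketch of the idea - word-count vectors plus the angle formula (illustration only, assuming non-empty documents; not one of the 8 docdist versions):

```python
import math
from collections import Counter

def doc_angle(words1, words2):
    """Angle between word-frequency vectors: smaller = more similar."""
    v1, v2 = Counter(words1), Counter(words2)
    dot = sum(v1[w] * v2[w] for w in v1 if w in v2)   # shared words only
    n1 = math.sqrt(sum(c * c for c in v1.values()))   # vector lengths -
    n2 = math.sqrt(sum(c * c for c in v2.values()))   # dividing by them normalises
    return math.acos(dot / (n1 * n2))
```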
REFS
Time Complexity of Common Data Structures
Look into this Latex insertion solution
using pdfLatex may work
Or use LaTeXiT from the command line to parse and auto generate equations
Or readme2tex
contents ~ vid ~ lect notes - CODE handout ~ python cost model 8 sources compared ~ Doc Distance 7 stages optimisation local code ~ Code - PSet 1
time | notes |
---|---|
0-7m | inner product |
7m** | go through docdist1 |
13m** | cost of code by line - docdist1: get_words_from_string() - CODE handout |
| One line: N = characters, w = word size, number of words = N / (w + 1) (the +1 accounts for the space after each word) |
Texify issue notes
Let me try this again but without the opening and closing brackets "<" and ">" so it won't try to render :)
What TeXify emits:
<img src="/docs/Tuple/tex/4b83ab92144c8118c40653189ab60df5.svg?invert_in_darkmode&sanitize=true" align=middle width=81.54091274999998pt height=22.831056599999986pt/>
What it needs to be:
<img src="/docs/Tuple/tex/4b83ab92144c8118c40653189ab60df5.svg?invert_in_darkmode&sanitize=true" align="middle" width="81.54091274999998pt" height="22.831056599999986pt"/>:
Compare dd1 vs dd2 optimisation
Any equation identities / topics for this lecture include context and uses for later reference
For each group of functions, sort the functions in increasing order of asymptotic (big-O) complexity:
Order of complexity here Time Complexity - Order Severity
Plotting functions to get a feel for them ./matplotlib/time_complexity_plot_q.py
Comment functions in/out of the source or add custom functions!
Problem 1.1a
Problem 1.1b
Note: f3() boils down to this [proof I think . .](https://github.com/UnacceptableBehaviour/algorithms/blob/master/formulae/1st_stab_nCr_proof.jpeg) with the left dominating the right, giving nlog(n)
Problem 1.1c
For each of the following recurrence relations, pick the correct asymptotic runtime:
asymptotic complexity of an algorithm with runtime T(n, n)
(a)
(b)
(c)
Maths course:
Lots of resources for recurrence relations
Double click /problems/MIT6_006F11_ps1/visualizer.html to see algorithms in operation.
Python code notes:
It's Python 2, so run with
> cd /algorithms/problems/MIT6_006F11_ps1 # source unzip directory
> python ./main.py
Enter a file name to load from (default: problem.py):
Algorithm 1 : (4, 4) => is a peak
Algorithm 2 : (4, 4) => is a peak
Algorithm 3 : (4, 4) => is a peak
Algorithm 4 : (4, 4) => is a peak
OK, so what's going on here?
main.py
load problem (matrix) from problem.py (or pass a file name via CLI arg)
create list of algos, tuples of (name, algo_no_func)
iterate through list and call each algo w problem, trace
trace object is a list of dictionaries, each method appends a result of each step in the algorithm as a dict
the list of dicts is converted into a json file (trace.jsonp)
print results
Once run the results can be visualised with visualizer.html which loads data created by the tracer (in trace.jsonp)
json file loaded by visualiser: <script type="text/javascript" src="./trace.jsonp"></script>
Assess each algorithm to see if it is correct, and if it is efficient:
a) algo 1 correct? r x c
p1 (4,4) yes
p2 (2,7) yes
p3 (3,5) no - 11 surrounded by 3 11's finds peak right there
p4 (3,4) yes
p5 (3,8) no - stops on consecutive same values
efficient?
b) algo 2 correct? r x c
p1 (4,4) yes
p2 (4,3) no - stops on consecutive same values - CHANGE TEST
p3 (4,3) no - stops on consecutive same values - CHANGE TEST
p4 (4,4) yes
p5 (4,5) no -
efficient?
c) algo 3 correct? r x c
p1 (4,4) yes
p2 (5,4) no - goes left - stops on consecutive same values - CHANGE? TEST - recursion required
p3 (6,5) no - stops on consecutive same values - > test
p4 (4,4) yes
p5 (5,8) no -
efficient?
d) algo4 correct? r x c
p1 (4,4) yes
p2 (5,4) no - goes left - stops on consecutive same values - CHANGE? TEST - recursion required
p3 (6,5) no - stops on consecutive same values - > test
p4 (4,4) yes
p5 (5,8) no -
efficient?
e) algo 5 brute force scan - CORRECT
p1 (4,4) yes
p2 (2,7) yes
p3 (4,6) yes
p4 (4,4) yes
p5 (6,6) yes
Look at 4 algorithms in algorithms.py
Assess correctness, efficiency
proof for one of the algorithms
Problem 1-6. [19 points] Peak-Finding Counterexamples data that shows how the python algorithms can fail
REFERENCES:
Correctness - various proofs
Recurrence relation by induction
contents ~
vid
recitation
lect notes
Sorted lists have various properties:
simple to find median (constant time)
find item:
scan to find O(n) in sorted & unsorted list
binary search O(log n) (halves the list at each step)
Application: compression (frequency counting), graphic z-order etc
Insertion sort - also in doc distance python files (/algorithms/lecture_code/L2_doc_distance)
Bubble sort (scan list swap values in the wrong order, repeat) O(n^2)
Insertion Sort
a) Vanilla insertion sort (L2_doc_distance)
Move 1st item to new list, move next item to new list scan until correct place found, insert, repeat
n^2 - breaks down as: (n for number of items) * (n for search & insert )
b) Binary insertion sort
Move 1st item to new list, move next item to new list binary search until correct place found, insert, repeat
n log n - breaks down as: (n for number of items) * (log n for binary search & insert )
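A minimal sketch of binary insertion sort using the standard library's bisect (illustration, not the course code):

```python
import bisect

def binary_insertion_sort(a):
    """n items x (log n search + insert). Comparisons are O(n log n),
    but the insert shift is O(n), so total moves remain O(n^2)."""
    out = []
    for x in a:
        bisect.insort(out, x)   # binary search for the slot, then insert
    return out
```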
Merge sort in 3mins
Pseudocode: (@ 2m32 in above link)
Split array into 2,
repeat until only 2 items in each leaf,
sort those two items,
go up a layer and merge leaves
[Python implementation here](https://github.com/UnacceptableBehaviour/algorithms/blob/master/algos/merge_sort.py) likely naive.
Concept of auxiliary space, in the above python code that would be stack I assume.
Instrument the merge sort code - s
Difference between theta, omega complexity and big O
vid
lect notes
Reading: none listed
Code: Python
type: - (maxheap or minheap) - preferred implementation?
use cases: finding min or max value in a sorted set (not both), scheduling, flattening linked lists, finding non-overlapping intervals, ROAM
queries: min/max
updates: pop_min/max - Θ(logn), insert item - Θ(logn), delete item - Θ(logn), change priority
RI max heap: Node is larger than or equal child nodes, complete binary tree
RI min heap: Node is less than or equal child nodes, complete binary tree
properties: root contains max/min value, comparison model
node positions in array:
root i=1
parent = i/2
lc_i = 2i
rc_i = 2i+1
Credit: https://en.wikipedia.org/wiki/Heap_(data_structure)#/media/File:Heap-as-array.svg
RI - representation invariant
Implements a set of elements associated with a key - methods:
insert(x, into set S) - add x as a leaf node and bubble up to correct position,
get max priority (of set S) - read max - O(1)
extract_max (of set S) - pop(max): replace max w/ last leaf and max_heapify - swap it w/ the higher of its two children until in correct position, O(logn)
inc_key (in set S, increase element x's key, to value k)
get min priority (of set S),
delete, change priority in Q.
Is an implementation of a priority Q, array structure visualised as a nearly complete binary tree - p4
Root of tree is array index 0, (tree node i=1)
1,2 are LEFT & RIGHT split
3,4 - 5,6 are next layer LEFT & RIGHT split
counting on like that see page 4
see page 5
Using array to implement the heap
root i=1
parent = i/2
left = 2i
right = 2i+1
No pointers required.
Max heap property:
the key of a node >= keys of its children
(the key being the value in the circle)
Min heap property:
the key of a node <= keys of its children
Note for an array of any size: elements A[n/2+1 .. n] are ALL leaves!
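A minimal sketch of max_heapify using the 1-based index arithmetic above (A[0] unused; my illustration, not the course code):

```python
def max_heapify(A, i, n):
    """Sink A[i] down until the max-heap property holds; O(log n).
    1-based: parent = i//2, left = 2i, right = 2i+1; heap occupies A[1..n]."""
    while True:
        l, r, largest = 2 * i, 2 * i + 1, i
        if l <= n and A[l] > A[largest]:
            largest = l
        if r <= n and A[r] > A[largest]:
            largest = r
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]   # swap down and keep sinking
        i = largest
```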
contents ~ vid ~ lect notes ~ Code 1st guess ~ Code MIT ~ Reading: CLRS Chapter 10, 12.1-3
type: BST - binary search tree
use cases: sorted data, priority queue
queries: next_smaller, next_larger, search/find, min, max, node_count_before
updates: insert, delete
representation invariant (RI): ALL children to left are smaller, ALL children to the right are larger.
properties: sorted data, comparison model
See R5 - Recursion Trees, Binary Search Trees for queries / updates walkthrough
time | notes |
---|---|
0-6m | define problem - runway scheduling - to demonstrate BST ADT |
6m-21m | EGs of things that don't work: sorted array, sorted list (no fast insertion), heap (no successor/predecessor pointers) |
21m | intro to BSTs |
24m | BST RI |
26m | insert() - O(h) - height of tree - O(logn) - n=number of nodes |
35m | min() - O(h) - go farthest left |
36m | max() - O(h) - go farthest right |
37m-43m | Functional AUGMENTATION - Rank(t) - how many planes land before time t? |
| | Add number of nodes below node to it (number includes node itself). |
43m | AUGMENTATION - Rank(k) - algorithm code - O(logn) for a balanced tree!! Pseudo code below. |
Concept: Representation Invariant - property of the data structure
Invariant (RI): All children to left are smaller, all children to the right are larger.
Each node has 3 pointers: parent, lchild, rchild
x
/ \
<x >x
Runway scheduling problem. Insert aircraft landing at least k minutes (3 min in this case) away from any other scheduled landings.
Plane landing @: 2 5 37 44 99
Steps
Find position
Check if there's 3min space either side
Insert new landing
Implemented as sorted array
Find insertion point: use binary search - O(logn) logarithmic time
Check space to land (3minutes) - O(1) constant time
Insert (requires shifting each element to make space for insertion - worst case front of array) O(n) - linear time
Implemented as sorted (linked) list
Insert is pointer manipulation - O(1) constant time - better
No binary search on a list! - so brute force O(n)
Implemented as heap
Check for element n1 <= k <= n2 requires searching whole tree - O(n)
Implemented as binary search tree (BST)
Find, navigate tree left if looking for an earlier time, right if larger. worst case from root of tree to leaf - ie height O(h)
Check do check at each node O(1)
Insert pointer manipulation O(1)
Other O(h) operations
Min - far left leaf
Max - far right leaf
Next larger value - up a node? NO: it's the min of the right subtree - right_subtree.min() (or the parent, if there's no right subtree)
Augmented BST add subtree size to the data in the node. Includes the node and its children (number after dash in tree below).
Maintained by incrementing the tree size by one as they're traversed during an insertion.
To find how many planes scheduled to land before time t?
Find highest node that is smaller than t return tree size to the left of the node including it.
Which is node subtree size - right_child subtree size (since all these nodes are higher) - ALMOST
To the left of the tree including back up the tree! Work through example!
129-12
104-5 158-6
82-3 116-1 138-3 184-2
77-1 103-1 - - 134-1 141-1 175-1 -
For t=138, the number of planes scheduled before it is - 43m
All of left subtree: left_child.tree_size = 1
plus
(find first parent node less than 138) = 129.left_child.tree_size + 1(node 129) = 6
gives
total: 7
Algorithm:
Start at root: is **t** larger than the node?
YES - add left_child.tree_size + 1 to total, move to right_child
NO - move to left_child
Repeat until no more children or t = node
Return total
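A minimal sketch of that walk, assuming hypothetical node objects with key, left, right and a size field (the subtree-size augmentation):

```python
def rank(root, t):
    """Count landings with key <= t in a size-augmented BST; O(h)."""
    total, node = 0, root
    while node is not None:
        if t < node.key:
            node = node.left                    # everything here is too large
        else:
            total += (node.left.size if node.left else 0) + 1
            if t == node.key:
                break
            node = node.right
    return total
```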
contents ~ vid ~ lect notes ~ Code 1st guess ~ Code MIT ~
time | notes |
---|---|
3m-14m | Solving Recurrence for merge sort. (PS2 problem 1) |
14m-26m | Data structures, HEAP |
26m-35m | Data structures, BST, (unbalanced) |
35m-42m | BST find successor/predecessor |
42m-54m | BST delete, 3 cases O(h) height of tree |
54m-end | BST augmentation - Uses example min - needed for problem set 3 PS3 |
Solving Recurrence for merge sort:
def merge_sort(unsorted_array):               # n elements
    if len(unsorted_array) <= 1:              # base case - a single element is sorted
        return unsorted_array
    mid = len(unsorted_array) // 2            # bisect unsorted_array into aL & aR
    left = merge_sort(unsorted_array[:mid])
    right = merge_sort(unsorted_array[mid:])
    return merge(left, right)                 # merge two sorted halves (see linked code)
1st term: merge_sort calls itself twice with n/2 elements: 2T(n/2)
2nd term: merge takes O(n); at the base case n=1, so Θ(1) constant time
T(n) = 2T(n/2) + O(n)   # recurrence
T(1) = Θ(1)             # base case - merge (1 element)
merge_sort call with number of elements
n
n/2 n/2
n/4 n/4 n/4 n/4 # each recursion
|
until reach base case
|
1 1 1 1 . . . 1 1 1 1 # base case
Binary Search Tree (BST) Local: algos/binary_search_tree.py
Example tree
67
45 129
15 57 104 158
- 29 - 66 82 116 138 184
- - 17 35 - - - - 77 103 - - 134 141 175 -
- - - - - - - - - - - - - - - - - - 90 - - - - - - - - - - - - -
rc = right_child rp = right_parent(node == node.parent.left)
# case 1 - node has rc
return rc.min
# case 2 - node has no rc (don't care about lc, always smaller)
- has rp (left of parent : node == node.parent.left)
return parent
# case 3 - node has no rc or rp
- right of parent : node == node.parent.right
go up parent until one has a right parent
return that
Case 3 covers case two - so no need to implement case 2
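A minimal sketch of next-larger covering the cases above (hypothetical nodes with left/right/parent pointers):

```python
def next_larger(node):
    """Successor of node in a BST; O(h). Returns None for the maximum."""
    if node.right:                              # case 1: take min of right subtree
        node = node.right
        while node.left:
            node = node.left
        return node
    while node.parent and node is node.parent.right:
        node = node.parent                      # case 3: climb while a right child
    return node.parent                          # case 2 falls out: parent, or None
```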
Example tree - BST (Binary search tree - unusually flat! Can be very lopsided)
67
45 129
15 57 104 158
- 29 - 66 82 116 138 184
- - 17 35 - - - - 77 103 - - 134 141 175 -
- - - - - - - - - - - - - - - - - - 90 - - - - - - - - - - - - -
rc = right_child rp = right_parent(node == node.parent.left)
case 1: leaf
simply delete
case 2: delete single node with only one sub-tree
EG delete 15, 57 or 184
replace parent pointer to node to point at subtree
straight or zig/zag, it's the same
case 3: deleting a node that has 2 subtrees
replace node with successor - smallest element in right subtree
successor may have a subtree so need to call delete on it first
replace original deleted node with it
(from R5-44m50)
running time
find key O(h) +
delete - possible 2 subtrees O(h)
link swaps constant time O(1)
= O(h) + O(h) + O(1) = 2 * O(h) + O(1) = O(h)
contents ~
vid -
lect notes -
MIT EG code -
guess implementation code
Reading: see lecture notes p6 summary of BSTs and where in CLRS to find them
type: tree (balanced)
use cases: sort & retrieve data set, preferred in search-intensive applications, insert more costly
queries: min, max, successor, predecessor, search - Θ(logn), in-order traversal - Θ(n)
updates: insert item - Θ(logn), insert n items - Θ(nlogn), delete item - Θ(logn), rebalance
RI: heights of left/right trees only ever differ by 1 - balanced tree
RI from BST: ALL children to left are smaller, ALL children to the right are larger.
properties: height = logn => tree balanced - height & balance maintained in each node, comparison model
RI - representation invariant
time | notes |
---|---|
0-2m | BST summary - in-order traversal using recursion |
2m-11m | importance of being balanced - getting HEIGHT to be logn - local HEIGHT calculation |
11m | AVL trees definition and balance |
18m | showing height is logn |
10-28m | height of balanced tree maths |
19m-25m | Analyse Nh min nodes in a tree of height h - v1 |
26m-XXm | Analyse Nh min nodes in a tree of height h - v2 - ps3 |
32m-48m | Rotations |
**48m-52 ** | AVL sort |
50m | summary of heap / bst AVL reasons for use |
AVL - inventors Adelson-Velsky and Landis
Visualisation of AVL tree
In order traversal - process nodes by key order
successor - Next larger
predecessor - next smaller
height of a node - longest path down to a leaf from node including itself (the +1 below)
depth of a node - number of nodes in the path from root to node
balanced tree - height of left child = height of right child +/-1 - height h = logn
IMPORTANCE OF A BALANCED TREE - height being log n
Unbalanced tree worst case height: n (average n/2) << V.BAD!
height = max(lchild height, rchild height) +1 max(3,8)+1 = 9 (maintained in each node)
NOTE: NULL child nodes have a height of -1 so the calculation works - max(-1,-1)+1 = 0
information local to a node has low (constant time) maintenance overhead
largest_height also stores which side is higher: +1 left, 0 equal, -1 right
Min number of nodes in a balanced tree
I.e. in a tree of height h, the minimum number of nodes N_h is the sum of the root plus the two subtrees (whose heights differ by 1):
N_h = 1 + N_(h-1) + N_(h-2)
The above recurrence is similar to the Fibonacci sequence, defined as F_n = F_(n-1) + F_(n-2),
an approximation for which is F_n = nearest integer to φ^n / √5.
Since N_h (min nodes in a tree of height h) therefore grows like φ^h, with φ ≈ 1.618, the height is h ≈ log_φ(n) ≈ 1.44·lg(n) = O(logn).
Method of maintaining property: balanced AVL tree. << WHOLE LECTURE ABOUT THIS!
insert(x) - steps to do an insert
insert as normal for BST
update each node's height while parsing the tree (and largest_height L M R - left middle right)
largest_height = hL - hR
from the inserted node walk up the tree checking largest_height; if abs(largest_height) >= 2:
if largest_height >= 2 right-rotate (left height too large)
if largest_height <= -2 anti-rotate (right height too large)
What the above is saying: if the difference in height between the two children is more than +/-1, the tree needs rebalancing
Rotation Cases (@ 32m)
29
/
26
/
23 # left height too large
26 # right-rotate 29
/ \
23 29
2nd case
65
/
50
\
55 # zig zag requires 2 rotations:
65 # left-rotate 50
/
55
/
50 # right-rotate 65:
55
/ \
50 65
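A minimal sketch of right_rotate matching the first case above (parent pointers and height updates omitted for brevity; illustration only):

```python
def right_rotate(y):
    """Rotate y's left child x up; returns the new subtree root. O(1)."""
    x = y.left
    y.left = x.right      # x's right subtree becomes y's left subtree
    x.right = y           # y drops down to x's right
    return x              # caller re-links x where y used to hang
```

left_rotate is the mirror image; the zig-zag case chains one of each.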
50m good summary of heap / bst AVL reasons for use
contents ~ vid ~ lect notes ~ Code Handout ~ Reading: none listed
time | notes |
---|---|
0-7m | BST review, height |
7m-11m | AVL balance |
**11m ** | Why height is logn (number of nodes in h rows (doubles each row) n = 2^h so height h = logn) |
18m | BST review, insert |
21m | AVL review, left_rotate, right_rotate |
28m22 | pointer exchange pseudo code right_rotate |
40m | AVL review REBALANCE |
50m | rebalance synopsis - AUGMENTATION (height in this case) needed for ps3 |
height h = longest path to leaf
4m28 - diagram on board show a single node has a height of 0
then he writes
height = max(lchild height, rchild height) +1 max(3,8)+1 = 9 (maintained in each node)
which means it should be 1 - ??
The code line - Height Augmentation ln 7 says 1:
def update_height(node):
    node.height = max(height(node.left), height(node.right)) + 1
6m - for the case where a child is missing, height() returns -1!
so a lone root evaluates to max(-1,-1)+1 = 0
Reads for all n, height of left and right subtree differs by 1 or less - basically says this tree is balanced
REBALANCE CASE 1 - unbalanced in a straight line: Rotate to fold the line
Any equation identities / topics for this lecture include context and uses for later reference
contents ~ vid ~ lect notes ~ radix_sort MSB 1m56 - LSB 2m11 Code: Reading: none listed
type: counting sort - mediocre sorting algorithm
comp model: non comparative, ram model, sorting algorithm
use cases: good for bla
queries:
updates: sort O(k + n) n=num of keys, k=key range (32 for a 5-bit key)
(RI):
properties: in place? stable
comp model = computational model
type: (bucket / digital sort), sort by digit LSB > MSB or vice versa, based on counting sort, stable
comp model: non comparative, ram model, sorting algorithm
use cases: good for bla
queries:
updates: sort O(d + n) n=num of keys, d=digits in key
(RI):
properties: in place? stable
comp model = computational model
time | notes |
---|---|
0-2m | Introduction |
2m-32m | Computational models - Comparison model |
5m20 | Decision tree |
14m48 | properties of decision tree - node, leaf, path, path length, height |
16m | Searching lower bound |
18m | PROOF searching Ω(lg n) - simple |
20m-24m | Sorting lower bound - logic |
24m-32m | Sorting lower bound - maths PROOF - p2 in notes |
32m-37m | INTRODUCTION RAM MODEL & INTEGER SORTING |
37m-40m | COUNTING SORT |
40m-44m | Counting sort RUNNING TIME - O(k + n) n=keys k=key range - no of digits? |
45m | RADIX Sort MSB 1m56 - LSB 2m11 |
- comparison model (computation model)
- Only operation allowed are comparisons
- lower bounds
searching: Ω(lg n) - binary search is optimal
sorting: Ω(n lg n) - merge sort is optimal
CONCEPTS: Models of computation: Comparison model 2-32m
decision tree | algorithm |
---|---|
internal node | binary decision (comparison) |
leaf | answer |
root-to-leaf path | algorithm execution |
path length | running time |
height of tree | worst case running time |
- ram model (computation model)
- O(n) sorting algorithms
. . counting sort
. . radix sort
Assumptions:
Go through the unsorted items using the key to allocate each to an index of an array. This basically creates a linked list at each array index for duplicates - see the sketch below.
Running cost: O(k + n), n = number of keys, k = key range (for a 5-bit key this is 32)
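A minimal bucket-list sketch of the idea, assuming integer keys in range(k) (illustration, not the lecture's version):

```python
def counting_sort(items, key, k):
    """Stable counting sort: key(x) is an int in range(k); O(n + k)."""
    buckets = [[] for _ in range(k)]
    for x in items:
        buckets[key(x)].append(x)    # append preserves input order: stable
    return [x for b in buckets for x in b]
```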
Conceptually simple - sort by each digit starting from LSB or MSB
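A minimal LSB-first radix sort built on the counting_sort sketch above (illustration only):

```python
def radix_sort(nums, base=10):
    """Sort non-negative ints digit by digit, least significant first."""
    if not nums:
        return nums
    digits = 1
    while base ** digits <= max(nums):
        digits += 1                  # enough passes to cover the largest key
    for d in range(digits):
        nums = counting_sort(nums, key=lambda x: (x // base ** d) % base, k=base)
    return nums                      # stability of each pass makes this work
```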
contents ~ vid ~ lect notes ~ Code: Reading: none listed
time | notes |
---|---|
0m | Summary sorting Algos so far |
7m | Python comparison overloading __lt__,__le__,__gt__,__ge__,__eq__,__ne__ |
7m-35m | implementing COUNTING sort in place? |
35m | implementing RADIX sort |
42m-47m | RADIX sort running time justification - REDO this when awake - ps2 or ps3 |
47m-52m | STABILITY, implications & summary of algos |
Note: for sorting you have to output the whole data set of size n, so the optimal run time (Ω) is Ω(n)
algorithm | running time | model | stability |
---|---|---|---|
insertion sort | O(n^2) | comparison model | stable |
merge sort | O(n.logn) | comparison model | stable |
heap sort | O(n.logn) | comparison model | unstable |
counting sort | O(k + n) | RAM model | stable |
radix sort | O(d + n) | RAM model | stable |
Notes: Comparison model (only using comparisons) SORTING run time Ω(n.logn); counting sort k = key range
Any equation identities / topics for this lecture include context and uses for later reference
Koch snowflake rendering: computational requirements of 4 ways of rendering LoD n (Level of Detail 0-n)
Recitation 5 (0m - 13m) explains how to do this - don't be scared if your costs at each level aren't the same; sum them up and you'll get the right answer
Recursion tree - a forest of trees in this case
Run example code fractal.html (/algorithms/problems/MIT6_006F11_ps2/fractal) git difftool
First - 3D hardware accelerated rendering . .
Surface > triangles > CPU co-ords list > GPU renders
a) [1pt] height of recursion tree for rendering snowflake of LoD n?
b) [1pt] how many nodes in the tree at level n?
c) [1pt] what's the rendering time (triangle count) for a node at depth i?
d) [1pt] what's the rendering time (triangle count) at each level i (all nodes on that level)?
e) [1pt] total asymptotic cost for the CPU to render LoD n (using this method)?
Second - 2D accelerated rendering . .
Surface outlines > open/closed paths > CPU co-ords list > GPU renders (used in laser cutters & plotters)
Properties of a koch snowflake
f) [1pt] height of recursion tree for rendering snowflake of LoD n?
g) [1pt] how many nodes in the tree at level n?
h) [1pt] what's the rendering time (line segment count) for a node at depth i?
i) [1pt] what's the asymptotic rendering time (line segment count) for a node in the last level n?
j) [1pt] what's the asymptotic rendering time (line segment count) at each level of the tree?
k) [1pt] what's the asymptotic rendering time (line segment count) at the last level n?
l) [1pt] total asymptotic cost for the CPU to render LoD n (using this method)?
Third - 2D unaccelerated rendering . . aka software rendering (CPU only) (used in laser cutters & plotters)
Surface outlines > open/closed paths > CPU co-ords list > CPU rasterises co-ords
NOTE the rasterised pixels represent the ink required to print or the power required for the laser to cut the image!!
m) [1pt] height of recursion tree for rendering snowflake of LoD n?
n) [1pt] how many nodes in the tree at level n?
o) [1pt] what's the rendering time (line segment length) for a node at depth i (assume original triangle side length = 1)?
p) [1pt] what's the asymptotic rendering time (line segment length) for a node in the last level n?
q) [1pt] what's the asymptotic rendering time (line segment length) at each level of the tree?
r) [1pt] what's the asymptotic rendering time (line segment length) at the last level n?
s) [1pt] total asymptotic cost for the CPU to render LoD n (using this method)?
Fourth - 3D unaccelerated rendering . . (CPU only)
Surface > triangles > CPU co-ords list > CPU rasterises
t) [4pt] total asymptotic cost for the CPU to render LoD n (using this method - assume initial triangle side length = 1)
u) [15pt] prove using recursion tree method
a) What's the name of the method with the highest CPU usage? _find_min, part of a class
Profiler Qs:
ncalls tottime percall cumtime percall filename:lineno(function)
15/6 0.000 0.000 0.001 0.000 sre_parse.py:414(_parse)
What's the 15/6 mean under ncalls? There's recursion here: 6 primitive calls, 15 calls in total
Profiler column meanings: https://docs.python.org/3/library/profile.html
ncalls
for the number of calls.
tottime
for the total time spent in the given function (and excluding time made in calls to sub-functions)
percall
is the quotient of tottime divided by ncalls
cumtime
is the cumulative time spent in this and all subfunctions (from invocation till exit). This figure is accurate even for recursive functions.
percall
is the quotient of cumtime divided by primitive calls
filename:lineno(function)
provides the respective data of each function
Full result of profiler in: circuit/profiler_output.txt
628290079 function calls (628095620 primitive calls) in 474.045 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
259964 275.241 0.001 471.218 0.002 circuit.py:381(_find_min) ** CULPRIT
625762426 195.969 0.000 195.969 0.000 circuit.py:286(__lt__) # less than dunder function
64400 0.788 0.000 473.659 0.007 circuit.py:423(step) # simulation time step > _find_min > __lt__
828936/634537 0.264 0.000 0.340 0.000 {len}
65554 0.233 0.000 0.421 0.000 circuit.py:163(transition_output)
194381 0.216 0.000 471.437 0.002 circuit.py:361(min)
1 0.173 0.173 473.948 473.948 circuit.py:456(run)
1 0.001 0.001 474.045 474.045 circuit.py:3(<module>)
To run tests:
> python circuit_test.py # whole suite - 6min
> python circuit.py < tests/1gate.in > out # run single test write results out
> diff out tests/1gate.gold # compare result to verified result
> python -m cProfile -s time circuit.py < tests/5devadas13.in # profile test - takes nearly 8min!
> TRACE=jsonp python circuit.py < tests/1gate.in > circuit.jsonp # create a trace for visualiser
> TRACE=jsonp python circuit.py < tests/2gates.in > circuit.jsonp # create a trace for visualiser
> TRACE=jsonp python circuit.py < tests/3xor.in > circuit.jsonp
> TRACE=jsonp python circuit.py < tests/4sort.in > circuit.jsonp
EG
> python circuit_test.py
Testing correctness:
Testing 1gate.in ...... OK
Testing 2gates.in ...... OK
Testing 3xor.in ...... OK
Testing 4sort.in ...... OK
Testing 5devadas13.in ...... OK
.
----------------------------------------------------------------------
Ran 1 test in 360.847s
OK
> python circuit.py < tests/1gate.in
19 axb 1
29 axb 0
b) How many times is this method called? 259964
c) What's the tightest asymptotic bound of the worst case running time of the method with the bottleneck?
d) If implemented optimally, what's the tightest asymptotic bound of the re-implemented method?
e) Re-write the data structure using the most efficient method from class (no lib imports)
contents ~ vid ~ lect notes ~ succinct vid on hashing, open addressing & chaining ~ Code - 1st try - associative array ~ Reading: none listed
time | notes |
---|---|
0-5m | Intro to dictionary & functions (insert, delete, search) - implement - O(1) constant time (w/ high probability) |
5m-13m | Python dict API summary, and Motivation (see lect notes) |
13m-24m | Solution 1 - Direct access table |
24m | Solution 2 - Hashing |
29m | introduces collisions - chaining |
34m | Simple uniform hashing - Mapping number of keys to number of slots |
42m | hashing to table size m |
43m | DIVISION |
44m | MULTIPLICATION |
type: dictionary (ADT)
use cases: dictionaries, dbs, compilers, symbol lookup, routers, virt mem, all over
queries: search(key) O(1 + chain length = alpha)
updates: insert(key, item), delete(key) both O(1 + chain length = alpha)
(RI):
properties: m = Θ(n) ie number of spaces in table m ~ number of keys n
Motivation:
DB - uses hashing and search trees
Dictionary lookup - use hash table
Spell check - trial perturbation of each letter - really?
13m Solution 1 - Direct access table
Problems:
- keys not being integers (therefore don't map directly to memory)
- if there are a lot of gaps in keys this solution is a huge memory hog!
Solution to 1:
Prehashing - map keys to integers (array location)
Python built-in hash() - should be called prehash - basically returns an integer based on the input (to use as a key for a dict, for example).
The Python dunder function __hash__ is called when the built-in hash(object) is called.
If you don't implement __hash__, hash(obj) uses id(obj) as the hash value.
This avoids collisions because no two objects occupy the same space in memory, so id is unique.
24m Solution 2 - Hashing - Mapping from Giant space of key to array indexes
Solution to 2:
If you have a space of keys, run those through a hash function to generate indexes (ideally they should be equally distributed through the target array)
The ratio of number of indexes (entries) (n) : number of array spaces (m) is the load factor - alpha = n/m (aka expected collision chain length)
Methods for dealing with collisions (Chaining @ 30m & open addressing (next week) ) one type of open addressing is linear probing
See L9 1m-4m succinct synopsis of last lecture and this solution!!
note the n items may end up in linked lists when chaining
Hashing methods: DIVISION, MULTIPLICATION, UNIVERSAL HASHING (prefered)
34m - Simple uniform hashing - Mapping number of keys to number of slots
Assumptions:
a) each key equally likely to be hashed to ANY slot in the table
b) INDEPENDENT of where other keys hashing
Hash function need to know size of table - m.
43m - DIVISION method
Note: m should be PRIME and not too close to a power of 2.
44m - MULTIPLICATION method - preferred over division (notes p5)
a should be random, odd, in the range 2^(w-1) < a < 2^w, and NOT close to a power of 2; w = number of bits in the key range.
Basically multiply the key (w bits) by an odd random number a (fixed to the function, also w bits long), resulting in a number 2w bits long.
Take the section with the highest diversity (in the middle) that generates a number of at least m (required spaces), i.e. 2^r must be at least m.
That is why m = 2^r - r selects the length of the section taken for the hash result. See diagram, notes top.
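A minimal sketch of the multiplication method as described (w, r and the bit tricks are my reading of the notes; treat as an illustration):

```python
import random

def make_mult_hash(w=64, r=10):
    """h(k) = ((a*k) mod 2^w) >> (w - r), giving slots 0..2^r - 1 (m = 2^r)."""
    a = random.randrange(2 ** (w - 1), 2 ** w) | 1   # random odd w-bit multiplier
    def h(k):
        return ((a * k) & (2 ** w - 1)) >> (w - r)   # keep the middle-ish bits
    return h

h = make_mult_hash()
print(h(123456789))   # some slot in range(1024)
```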
48m - UNIVERSAL HASHING (preferred)
blur. . . note bottom p6 . . . take 6.046 to understand better. . .
Revisit this - implementation seems relatively simple
TERMS
prehash collisions
collisions - chaining = storing the collision in a linked list at the address
collisions - linear probing = store the collision in the next available free address
collisions - open addressing = linear probing is open addressing (may not find item exactly where expected)
collisions - closed addressing = chaining is closed addressing (item store always in same location)
simple uniform hashing
universal hashing
contents ~ vid ~ lect notes ~ lect notes code ~ Code: Reading: none listed
time | notes |
---|---|
0-xm | Walk the code in problem set 3 - |
9m | introduce lexicographic comparison sort |
17m-29m | RangeIndex methods and running time for example (poor) code |
29m | KeyWirePairs, |
42m | RangeIndex - comparision model penalty |
43m | LIST LCA pseudo code |
49m | LCA run time intuition |
All the lines(vectors) are added to the RangeIndex CrossVerifier((layers)._events_from_layer).wire_crossings > _compute_crossings > list
7m
> python -m cProfile -s time circuit2.py < tests/10grid_s.in
file contains list wires: wire v7035 61240 -815340 61240 410695
name x1 y1 x2 y2
vertical wire since x1 = x2
command[0] wire
command[1] v7035
coordinates 61240 -815340 61240 410695
Loaded into layers
layer = WireLayer.from_file(sys.stdin) # ln
if command[0] == 'wire':
coordinates = [float(token) for token in command[2:6]]
layer.add_wire(command[1], *coordinates)
Wire(name, x1, y1, x2, y2) + internally enumerated wire_id
x1, y1, x2, y2 = coordinates
layer.wires = dict with name as key, Wire object as value
Once loaded the layer is passed to CrossVerifier
_events_from_layer(layer)
left_edge = min([wire.x1 for wire in layer.wires.values()]) # find leftmost co-ordinate
go through wires and sort into 'events' - add for horizontal, query for vertical
if wire.is_horizontal():
self.events.append([left_edge, 0, wire.object_id, 'add', wire]) # left_edge - constant
else:
self.events.append([wire.x1, 1, wire.object_id, 'query', wire]) # wire.x1 - wire dependant
EG
1) ADD [-100, 0, 1, 'add', wire_obj] # horizontal wire - _events_from_layer:line 333
2) QUERY [-50, 1, 2, 'query', wire_obj] # vertical wire - _events_from_layer:line 333
CrossVerifier.events is a list[]
self.events.sort() # sort by 1st column then second then 3rd if 1st column same
So it will sort:
Horizontal vectors 1st (all have left_edge in 1st column) sorted by incrementing obj_id,
then vertical vectors sorted by x1 co-ordinate (left to right)
self.index = RangeIndex() # create RangeIndex
self.result_set = ResultSet() # create ResultSet
result = verifier.wire_crossings()
self._compute_crossings(False) # count_crossings calls self._compute_crossings(True)
result = self.result_set # initialised as ResultSet()
@loop through self.events # KeyWirePair - (KeyWirePairL & KeyWirePairH lower & upper limits obj_id)
@ # have comparison operators __le__, __ge__ etc
@ # add all the horizontal vectors as KeyWirePair objects
@self.index.add(KeyWirePair(wire.y1, wire)) # the KEY in RangeIndex is the horizontal line's vertical position y1
@ # wire is vertical wire
@for kwp in self.index.list(KeyWirePairL(wire.y1),KeyWirePairH(wire.y2)):
@ get all horizontal wires that lie in the vertical range of this wire
@ check if they cross, add to list cross_wires, and from there compile into result
Steps to optimise code: todo. . .
Adapt AVL tree code to implement rank & count.
Implement LCA, Node-List, & List
Integrate into circuit2.py
Run profiler
sort lexicographic comparison
Any equation identities / topics for this lecture include context and uses for later reference
contents ~
vid ~
lect notes
Code:
Reading:
time | notes |
---|---|
0-3m | Refresh Hashing with Chaining: Load factor |
3m | Resizing table, for optimum performance - Start small (constant) and grow (or shrink) as necessary. |
7m30 | Growing: allocate the memory & rehash all entries (w/ a hash function appropriate to the new table size) |
15m | Amortisation - COST of table doubling operation calculated as a series |
21m | Inserts - Table doubling costs |
23m | Delete - Table downsizing |
23m46 | Delete - PROBLEM w/ table downsizing by half |
23m50 | Optimal upsize & downsize - double going up, only halve when n (entries) drops to 1/4 |
27m | strategy for implementing table doubling in an RTOS |
28m | String Matching |
34m | Cost for simple string matching Θ(s·t) - could be QUADRATIC |
35m | using a rolling hash ADT - concepts |
41m-47m | Rolling hash ADT - pseudo code - problem set 4 |
How to choose m (table size) - (on overflow, double it)
When the array becomes full (n = m), double the size to 2m. When shrinking, wait until n falls to m/4, then shrink m to m/2.
**Invariant property: n <= m <= 4n** - n has to be smaller than m (the size of the structure) or the entries won't fit; halving only at n = m/4 keeps the invariant and gives amortised constant time. A sketch of the rule follows.
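A minimal sketch of that grow/shrink rule (illustration; a real hash table would also rehash on resize):

```python
def next_table_size(n, m):
    """n = items after the operation, m = current slots; keeps n <= m <= 4n."""
    if n == m:              # full: double
        return 2 * m
    if m > 1 and n <= m // 4:
        return m // 2       # only shrink at quarter-full, avoiding thrashing
    return m
```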
27m - listen to concept - table doubling in an RTOS, to minimise rebuild time (and thereby minimise impact on latency) see Alternatives to all-at-once rehashing.
Term: amortised constant time - note on python lists
String Matching
needed for Problem set 4 ? ps4 see notes p4
Rolling hash ADT - pseudo code. 41m - 47m - 1987? getting more recent use choosing size using random prime
rehash function: 4m41 https://www.youtube.com/watch?v=oxd_Z1osgCk c / c++ / python / java code
Rehashing: hash(txt[s+1..s+m]) = (d·(hash(txt[s..s+m-1]) - txt[s]·h) + txt[s+m]) mod q
hash(txt[s..s+m-1]) = hash value @ shift s
hash(txt[s+1..s+m]) = hash value @ next shift (s+1)
d: number of characters in the alphabet (256 for 8 bits)
q: a prime number (larger number = fewer collisions / false positives)
h: d^(m-1) (mod q)
https://www.geeksforgeeks.org/rabin-karp-algorithm-for-pattern-searching/
compiled local
https://github.com/UnacceptableBehaviour/gdb_lldb/blob/master/karp_rabin.cc
build instruction in file
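A minimal Python sketch of Karp-Rabin using the rolling rehash above (d = 256; q here is a fixed prime for readability - real code would pick a large random prime):

```python
def rabin_karp(text, pattern, d=256, q=1000003):
    """Return start indices of pattern in text; hash hits are verified."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)                     # weight of the outgoing character
    p_hash = t_hash = 0
    for i in range(m):                       # initial hashes of pattern & window
        p_hash = (d * p_hash + ord(pattern[i])) % q
        t_hash = (d * t_hash + ord(text[i])) % q
    hits = []
    for s in range(n - m + 1):
        if p_hash == t_hash and text[s:s + m] == pattern:
            hits.append(s)                   # verify: a hash match may be a collision
        if s < n - m:                        # roll: drop text[s], append text[s+m]
            t_hash = (d * (t_hash - ord(text[s]) * h) + ord(text[s + m])) % q
    return hits

print(rabin_karp('bunker hill bunker', 'bunker'))   # [0, 12]
```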
contents ~ vid ~ lect notes ~ Code: Reading: none listed
time | notes |
---|---|
0-5m | Walk through naive implementation |
5m30-10m | Avoiding false positives / maths of collisions |
10m | Running time |
14m | fast hashing - mod prime number |
18m-22m | SLIDE implementation |
22m | prehash - smart way |
28m | constructor (init) for rolling_hash |
30m | append & skip |
36m-38m | modular / multiplicative inverse - precomputed because based on constants |
40m | rolling_hash, append, skip, hash - cost |
52m | amortised analysis breakdown |
54m | listing BST nodes in order - In order traversal - amortised cost |
59m | amortised analysis - ps4 |
Naive implementation of string match is O(n·k), where n is the string length and k is the search-string length. For each of the n positions, each of the k letters needs to be compared against the text.
For hashing, the smaller the m (number of available spaces) the hash maps to, the higher the number of collisions (and, in this application, potential matches) 5m30-10m
A naively implemented hash function for an n-character string will be O(n) - (10m30 - assume O(1) using a rolling hash)
54m Number of edges in a tree of N nodes is N-1
Demonstrates intuitive proof of in order traversal in O(N) giving an amortized cost of O(1) per node < check understanding
Best from G4G ~ Computation of modular inverse ~ Using python built in pow() ~ Wha?
Multiplicative inverse / Modular inverse - 36m - also covered in R9b first add steps in calculation / proof - definition of modular inverse - revisit context
Given two integers 'a' and 'm', find the modular multiplicative inverse of 'a' under modulo 'm'.
The modular multiplicative inverse is an integer 'x' such that
a x ≡ 1 (mod m)
The value of x should be in {0, 1, 2, ..., m-1}, i.e., in the range of integers modulo m.
The multiplicative inverse of 'a modulo m' exists if and only if a and m are relatively prime (i.e., if gcd(a, m) = 1).
To calculate the inverse of 38 mod 97
>>> pow(38, -1, mod=97)
23
>>> 23 * 38 % 97 == 1
True
y = x**(p-2) mod p # pseudocode
y = pow(x, p-2, p) # python built ins
Fermat's little theorem (for prime m, x not a multiple of m):
pow(x, m-1, m) must be 1
Hence (pow(x, m-2, m) * x) % m == 1
So pow(x, m-2, m) is the inverse of x (mod m)
contents ~ vid ~ lect notes ~ Code Zip ~ Code git ~ Reading: none listed
time | notes |
---|---|
0m-1m | what to cover? Hashes, Code & Amortisation |
1m-3m | Hashes, Extended Euclid's Method, GCD greatest common divisor - modular multiplicative inverse |
3m-14m | Why using primes is important in hash function |
13m6 | Caching - better to spend time on a good mod (hash) function than use up all your memory - a warm cache performs better |
14m-00m | Walk the code in problem set 4 ps4 |
15m | Rollinghash, API diff - slide(new, old) split out into: append(new), skip(old), |
16m20 | kfasta.py - iterator implementation, iter, next() |
21m-44m | dnaseq.py method by method & python features needed |
21m | implementing reverse w/o using a lot of memory |
23m-37m | generators / yield / subsequences() |
38m-40m | intervalSubsequenceHashes(), concept of DRY code: use subsequenceHashes() to implement |
40m-44m | getExactSubmatches() |
44m | Amortized analysis |
49m-53m | Time analysis: list, append, memory allocation |
53m | Aggregate analysis / Cost based accounting |
53m-57m | Maths on aggregate analysis - write |
Generators:
def subsequences(seq, k):
try:
subseq = ''
while True:
while len(subseq) < k:
subseq += seq.next()
yield subseq
subseq = subseq[1:]
except StopIteration:
return
for subseq in subsequences(seq, 3):
    print subseq
# another method for generating primes
def isPrime(n):
if n < 2 or n % 1 > 0:
return False
elif n == 2 or n == 3:
return True
for x in range(2, int(n**0.5) + 1):
if n % x == 0:
return False
return True
def getPrimes(value = 1):
while True:
if isPrime(value):
yield value
value += 1
p2 = getPrimes()
while True:
table_size_m = next(p2)
print(table_size_m)
Any equation identities / topics for this lecture include context and uses for later reference
13m6 reason to use prime over p - 2^w / bla see 1st 2mins
44m Amortized analysis
Example using lists:
list function | cost |
---|---|
l = list() | O(1) |
for list size N | |
l.append() | O(N) worst case |
The argument goes that since l.append() is usually O(1) and only O(N) infrequently (when the table doubles), the amortised cost per append is O(1)
insertions + number of table doubles
PDF here ~ PS3 problem code ~ [Setting up the Komodo IDE for python debugging](https://www.google.com/search?q=integrating+python+debugging+in+komodo&oq=integrating+python+debugger+in+komodo&aqs=chrome.1.69i57j33.16303j1j7&sourceid=chrome&ie=UTF-8)
Overview of how this code works is in recitation 8
> python -m cProfile -s time circuit2.py < tests/10grid_s.in # profile test - takes nearly 15min!
> python -m cProfile -s time circuit2.py < tests/10grid_s.in
124668802
1436321483 function calls (1436321412 primitive calls) in 917.681 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
187590314 279.644 0.000 428.213 0.000 circuit2.py:55(intersects)
20000 195.692 0.010 366.372 0.018 circuit2.py:157(list) < < < high tot.time
562840882 148.598 0.000 148.598 0.000 circuit2.py:47(is_horizontal)
1 113.702 113.702 917.153 917.153 circuit2.py:336(_compute_crossings) 3
299400000 100.610 0.000 100.610 0.000 circuit2.py:246(__le__)
261444830 70.069 0.000 70.069 0.000 circuit2.py:256(__ge__)
124719049 8.749 0.000 8.749 0.000 {method 'append' of 'list' objects}
1 0.135 0.135 0.371 0.371 circuit2.py:119(from_file)
34970 0.108 0.000 0.146 0.000 circuit2.py:20(__init__)
1 0.085 0.085 0.110 0.110 circuit2.py:327(_events_from_layer)
20000 0.052 0.000 0.052 0.000 circuit2.py:281(__init__)
34970 0.044 0.000 0.191 0.000 circuit2.py:99(add_wire)
1 0.041 0.041 0.041 0.041 {method 'sort' of 'list' objects}
34972 0.024 0.000 0.024 0.000 {method 'readline' of 'file' objects}
20460/20445 0.022 0.000 0.022 0.000 {len}
34972 0.021 0.000 0.021 0.000 {method 'split' of 'str' objects}
34970 0.021 0.000 0.021 0.000 circuit2.py:81(next_object_id)
20000 0.017 0.000 0.017 0.000 circuit2.py:290(__init__)
14970 0.013 0.000 0.013 0.000 circuit2.py:233(__init__)
20000 0.009 0.000 0.009 0.000 circuit2.py:363(trace_sweep_line)
20000 0.005 0.000 0.005 0.000 circuit2.py:51(is_vertical)
14970 0.005 0.000 0.006 0.000 circuit2.py:147(add)
57 0.003 0.000 0.003 0.000 {min}
1 0.001 0.001 917.681 917.681 circuit2.py:3(<module>) 2
1 0.000 0.000 917.153 917.153 circuit2.py:313( ) 1
1 0.000 0.000 0.151 0.151 circuit2.py:299(__init__)
6 0.000 0.000 0.001 0.000 sre_parse.py:725(parse)
9 0.000 0.000 0.000 0.000 sre_compile.py:428(_simple)
80 0.000 0.000 0.000 0.000 sre_parse.py:207(match)
Profiler Qs
ncalls tottime percall cumtime percall filename:lineno(function)
15/6 0.000 0.000 0.001 0.000 sre_parse.py:414(_parse)
What's the 15/6 mean under ncalls? There's recursion here: 6 primitive calls, 15 calls in total
Profiler column meanings (identical breakdown in the PS2 section above): https://docs.python.org/3/library/profile.html
3-2 a) list
3-2 a) 187590314 calls - why per call time zero?
Setup for next questions
Copy good_trace.jsonp to /circuit2/visualizer/bin/trace.jsonp # it's a trace of how the sweeper should work
dbl click /circuit2/visualizer/bin/visualizer.html # or drag it into chrome
Steps to optimise code: todo. . .
Adapt AVL tree code to implement rank & count.
Implement LCA, Node-List, & List
Integrate into circuit2.py
Run profiler
Using cProfile - https://towardsdatascience.com/how-to-profile-your-code-in-python-e70c834fad89
Alternative profilers - https://pythonspeed.com/articles/beyond-cprofile/
Python debugger - https://docs.python.org/3/library/pdb.html
Python debugger tutorial - https://realpython.com/python-debugging-pdb/
Alternative implementation to CPython - https://www.pypy.org/
Working out if two vectors intersect + special cases - https://www.youtube.com/watch?v=bbTqI0oqL5U
Sweepline Algorithm Background
Sweep line algorithm 2/5 theory - https://www.youtube.com/watch?v=qkhUNzCGDt0
event (point) queue (use balanced BST) & sweepline status (use balanced BST)
Sweep line algorithm 3/5 pseudo code - https://www.youtube.com/watch?v=I9EsN2DTnN8
Sweep line algorithm 3/5 correctness proof (by induction) - https://www.youtube.com/watch?v=8C3_ZKy4KkA
contents ~
vid ~
lect notes
Code:
Reading: CLRS Chapter 11.4 (and 11.3.3 and 11.5 if interested)
Collisions 6m refresh 5m open addressing: linear probing - 9m load factor & chaining: closed addressing
type:
use cases: good for bla
queries:
updates:
(RI):
properties:
time | notes |
---|---|
0m-3m | Intro - Open Addressing, Uniform hashing analysis, Cryptographic Hashing |
3m30-m | Open Addressing - no chaining = no pointers |
8m | Mapping from key through trial count to slots U x {0,1. . . m-1} > {0,1. . . m-1} |
11m | Eg of INSERTION w/ linear probing - see animation in Collisions 6m refresh link at top! |
15m | Eg of SEARCH w/ linear probing - see animation in Collisions 6m refresh link at top! |
22m | Coping w/ a search after a deletion! Inserts a DELETE_ME flag in place. |
30m-00m | Probing strategies |
30m | 1 - linear probing |
33m50 | clustering issue |
36m | Double hashing to solve clustering |
39m | Uniform Hashing Assumption - NOT same as SIMPLE Uniform hashing |
41m-46m | Followed by Uniform Hashing Analysis CRITICAL info! |
46m | Cryptographic Hashing - NOT on quiz FYI only |
3m30 Open Addressing:
Linear probing: insert, search delete
One item per slot so m > n (more array slots than items n)
h: U × {0,1,...,m-1} → {0,1,...,m-1} - U (universe of keys) × trial count → slots in table
Notation for trials:
h(k,0) 1st attempt to insert k
h(k,1) 2nd attempt to insert k
h(k,2) 3rd attempt to insert k
etc
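A minimal sketch of insert/search with linear probing and the DELETE_ME flag from 22m (illustration only):

```python
DELETED = object()   # the DELETE_ME marker

def probe_insert(table, key, h):
    """h(k, i) = (h(k) + i) mod m; reuse empty or deleted slots."""
    m = len(table)
    for i in range(m):
        s = (h(key) + i) % m
        if table[s] is None or table[s] is DELETED:
            table[s] = key
            return s
    raise RuntimeError('table full')

def probe_search(table, key, h):
    """Stop at a truly empty slot; skip over DELETED ones."""
    m = len(table)
    for i in range(m):
        s = (h(key) + i) % m
        if table[s] is None:
            return None          # never inserted
        if table[s] == key:
            return s
    return None
```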
30m - Probing Strategies - Linear Probing
hp - hprime is an ordinary hashing function
i - tries
m - table entries
Problem with Linear probing hash function
h(k,i) = (hp(k) +i ) mod m
Permutation OK but has problems w/ clustering - when a CLUSTER STARTS TO FORM the probability of colliding w/ it GOES UP
If 0.01 < alpha (load factor n/m) < 0.99
then cluster size = Θ(logn) (35m20)
36m Double hashing to solve clustering
h(k,i) = (h1(k) + i*h2(k) ) mod m
Term relatively prime - h2(k) must be relatively prime to m for the probe sequence to hit every slot - see maths below & the sketch that follows - get intuition
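A minimal sketch of open addressing w/ double hashing (not the lecture's exact code; m is prime here so any h2 in 1..m-1 is automatically relatively prime to it):

```python
# Open addressing sketch: double hashing probe sequence h(k,i) = (h1 + i*h2) mod m.
class OpenAddressTable:
    def __init__(self, m=11):                    # m prime -> any h2 in 1..m-1 is coprime w/ m
        self.m = m
        self.slots = [None] * m                  # one item per slot, so keep m > n

    def _probe(self, key, i):
        h1 = hash(key) % self.m                  # ordinary hash
        h2 = 1 + (hash(key) % (self.m - 1))      # never 0, so every slot gets visited
        return (h1 + i * h2) % self.m

    def insert(self, key, value):
        for i in range(self.m):                  # trials i = 0, 1, 2 ... as in h(k,0), h(k,1) ...
            s = self._probe(key, i)
            if self.slots[s] is None or self.slots[s][0] == key:
                self.slots[s] = (key, value)     # (a real version also reuses DELETE_ME slots)
                return
        raise RuntimeError('table full - resize needed')

    def search(self, key):
        for i in range(self.m):
            s = self._probe(key, i)
            if self.slots[s] is None:            # empty slot: key cannot be further along
                return None
            if self.slots[s][0] == key:
                return self.slots[s][1]
        return None

t = OpenAddressTable()
t.insert('k1', 42)
assert t.search('k1') == 42
```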
39m - Uniform hashing assumption
Each key is equally likely to have any one of the m! permutations as its probe sequence
- not really true
- but double hashing can come close
Analysis: as the table fills, α = n/m rises (free slots become fewer, so the load factor goes UP).
Expected cost of an insert ≤ 1/(1-α) (under the uniform hashing assumption).
As α → 1, the insert cost goes through the roof.
Requires resize of table by time alpha gets to 0.5-0.6
46m Cryptographic Hashing - Passwords. Talks about one-way cryptographic hashing for passwords, so you only need to store the hash, never the password itself.
Term "relatively prime" aka "coprime" or "mutually prime":
When two numbers have no common factors other than 1.
In other words there is no value that you could divide them both by exactly (without any remainder).
Simplifying a fraction as much as possible will give a numerator and denominator that are coprime.
EG
factors of 21 are 1,3,7,21
factors of 22 are 1,2,11,22
only common factor is 1 so they are relatively prime.
21 and 24 are NOT relatively prime:
factors of 21 are 1,3,7,21
factors of 24 are 1,2,3,4,6,8,12,24
common factors are 1 AND 3
Why do we care? In double hashing, if h2(k) is relatively prime to m the probe sequence visits every slot (a full permutation of {0 . . m-1}); choosing m prime, or m = 2^r with h2(k) always odd, guarantees this.
contents ~ vid ~ lect notes ~ Code: Reading: none listed
time | notes |
---|---|
0m-3m | table, check empty vs move items - time complexity |
4m30-5m | Solving a Recursion problem using RECURSION TREE |
14m30-24m | Gold coins problem - DECISION TREE |
24m-36m | Bunker Hill problem - RABIN KARP - ROLLING HASH |
36m-50m | Rank search in parallel of 2 lists |
50m-54m | Time complexity - set pieces, techniques |
1m Q? time complexity to move items from one table to another, COMPARED to checking an empty table
Moving items from one table to another, is moving a pointer NOT a whole object & therefore O(1).
Table of m entries:
action | time each | empty table | full table |
---|---|---|---|
index into table | O(1) | O(m) | O(m) |
move item | O(1) | 0 | O(m) |
total | | O(m) | 2·O(m) |
constants cancel | | O(m) | O(m) |
Implies the same time complexity either way.
4m30 - Solving recursion - 2 methods:
- SUBSTITUTION - CLRS 4.3 p83 - Ex p87
- RECURSION TREES - CLRS 4.4 p88 - Ex p92/93 Maths: series, powers, logs
The hint given is that when the formula's argument reaches a constant (here 2), it becomes a base case:
T(c) = O(1)
14m30-24m - Gold coins problem - DECISION TREE. Using a COMPARISON model the BEST you can do for sorting is Ω(nlogn) - L7, R7
24m-37m - Bunker Hill problem - RABIN KARP - ROLLING HASH
Walks though scanning a map for a target pattern.
pattern = h·w pixels
map = H·W pixels
Approaches
Brute force - O(WHwh)
Check each pixel of the pattern (w·h) against each position it can occupy: (W-(w-1))·(H-(h-1))
For large W & H, (W-(w-1))·(H-(h-1)) is dominated by the leading terms → WH
Rolling hash on height - O(WHh)
31m
Rabin Karp rolling hash - O(WH)
33m
Compute a hash for each strip of h cells going across the map (keeping the hashes)
Then when computing the next row down, roll each hash down one cell instead of recomputing all h each time
Memory requirements to store the hashes (work done) go up to 4W!
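A hedged 1D sketch of the rolling idea (the lecture extends it to 2D over the map, but the mechanics are the same: drop the leading term, shift, add the new one):

```python
# 1D Rabin-Karp sketch: hash of window text[i+1 : i+1+w] derived from the
# previous window in O(1) - drop the leading char, shift, add the new char.
def rabin_karp(text, pattern, base=256, mod=10**9 + 7):
    w, n = len(pattern), len(text)
    if w > n:
        return -1
    high = pow(base, w - 1, mod)                 # weight of the leading character
    p_hash = t_hash = 0
    for i in range(w):                           # initial hashes, O(w)
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    for i in range(n - w + 1):
        if t_hash == p_hash and text[i:i + w] == pattern:  # verify on hash hit
            return i
        if i < n - w:                            # roll: remove text[i], append text[i+w]
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + w])) % mod
    return -1

assert rabin_karp('bunker hill', 'hill') == 7
```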
36m-50m - Rank search in parallel of 2 lists
50m-54m - Time complexity - set pieces, techniques
If comparing two complicated time complexities, take logs - log is monotonic so the relationship is preserved!!
Any equation identities / topics for this lecture include context and uses for later reference
PDF here ~ Code ~ git Solution Code
vid ~ lect notes ~ R12 Karatsuba Multiplication
time | notes |
---|---|
0m-3m | intro, work w/ many digit numbers, very large, or high precision |
3m-18m | irrationals, Catalan numbers - loong aside? |
18m-31m | Newtons method, leading to need for high precision multiplication |
30m-38m | High precision multiplication of d digit #s |
34m | High precision multiplication of d-digit number |
38m-42m | Karatsuba Multiplication see R12 |
42m- | Circular Geometry Problem |
4m - Catalan numbers - ??
λ (the empty string) belongs to P, the set of balanced parenthesis strings
If α & β belong to P, then (α)β belongs to P
Cardinality of the set Cn = number of elements of the set.
Series
18m-30m - Newtons method, leading to need for high precision multiplication
Start with
To use Newton's method we first need to know how to do multiplication.
34m - High precision multiplication of d-digit numbers. Note the switch from d-digit numbers to n-digit numbers!
TWO n-digit numbers (radix r: r=2 for binary, r=10 for decimal)
numbers x, y where: 0 <= x, y < r^n. (EG for a 64 digit binary number then x & y < 2^64)
Makes sense since largest 64bit number is 2^64 - 1
So for a 64K-word digit number, recursively split it in half until the length reaches your target platform's word size (64 bits for example), then multiply back up!??
x = 5678 = a|b where a = 56, b = 78   # n digits; a & b have n/2 digits
y = 1234 = c|d where c = 12, d = 34   # as do c & d.  x.y = 7006652
r1 = a.c = 672            # x10^4 - shift left 4 places - the MS addition
r3 = a.d + b.c = 2840     # x10^2 - shift left 2 places - the middle addition
r2 = b.d = 2652           # x10^0 - no shift - the LS addition
r1 << 4   6720000
r3 << 2    284000
r2 << 0      2652 +
total     7006652 # tada!
but r3 involves 2 multiplications (the expensive component), so the above results in
4 calls to the recursive multiplication w/ numbers half the size (n/2), yielding . .
T(n) = 4T(n/2) + O(n), complexity O(n^2) - R12 0-10m
(each product x·y splits into r1, r3 - itself two multiplies - and r2)
Karatsuba's trick:
(a + b)(c + d) = a.c + a.d + b.c + b.d
so r3 = (a + b)(c + d) - r1 - r2
So r3 can be generated using the results of the r1 & r2 multiplications, needing only 1 extra multiply instead of 2,
resulting in 3 calls to the recursive multiplication w/ numbers half the size: n/2 yielding . .
T(n) = 3T(n/2) + O(n), complexity O(n^lg3) ≈ O(n^1.585)
See 37m26:
Quiz 2 ps - O(n^2) - worse than Karatsuba
Ex Code Karatsuba multiplication in python.
Take 2 inputs, multiply them together using recursive Karatsuba calls.
wikipedia pseudo code
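A sketch following the r1/r2/r3 breakdown above (recursion bottoms out at single digits; the wikipedia pseudo code is the usual reference):

```python
# Recursive Karatsuba: three half-size multiplies instead of four,
# T(n) = 3T(n/2) + O(n) = O(n^lg3).
def karatsuba(x, y):
    if x < 10 or y < 10:                         # base case: single digit
        return x * y
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    a, b = divmod(x, 10 ** half)                 # x = a*10^half + b
    c, d = divmod(y, 10 ** half)                 # y = c*10^half + d
    r1 = karatsuba(a, c)                         # a.c
    r2 = karatsuba(b, d)                         # b.d
    r3 = karatsuba(a + b, c + d) - r1 - r2       # a.d + b.c with one multiply
    return r1 * 10 ** (2 * half) + r3 * 10 ** half + r2

assert karatsuba(5678, 1234) == 7006652          # the worked example above
```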
contents ~ vid ~ lect notes ~ Code: Reading:
time | notes |
---|---|
0m-14m | HOW TO APPROACH A SOLUTION using EG problem Sorting a shifted / rotated array |
14m-42m | Extract kth smallest element from MinHeap example |
42m-end |
More details in lecture notes p1
- Experiment w/ examples - generate inputs to test ideas on
- Simplify the problem - break into sub problems, do some pre-processing on data, set one of the inputs to be constant, introduce some assumptions about the data to minimise special cases.
- Look for similar problems - leverage elements or approaches used in those problems.
- Delegate the work - use recursion, dividing the problem, delegating the subproblems to recursive function calls. If you can't figure out how to solve a problem, see if you can figure out how to solve only the last bit of it. Then see if you can use recursion to solve the rest of it.
- Design according to the runtime - to be able to do this we need a list of algo vs runtime & recurrence relations - CHEAT SHEET
Exercise: implement n!, profile it, and plot time to compute against n. Then implement a memoized n!, profile it, and plot time to compute against n - how much space is required for a 100x compute advantage? Don't forget in practice to simply use Stirling's approximation. (Sketch below.)
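A rough starting point for the exercise (timings only, no plots; functools.lru_cache stands in for a hand-rolled memo table):

```python
# Time plain n! against a memoized version over repeated calls.
import time
from functools import lru_cache

def fact(n):
    return 1 if n < 2 else n * fact(n - 1)

@lru_cache(maxsize=None)                         # memoization table kept by functools
def fact_memo(n):
    return 1 if n < 2 else n * fact_memo(n - 1)

for f in (fact, fact_memo):
    t0 = time.perf_counter()
    for _ in range(1000):
        f(500)                                   # repeated calls: memoized version pays once
    print(f.__name__, time.perf_counter() - t0)
```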
Problem 0 - Maintaining Medians Final Exam Fall 2008 Make notes
N elements in the array & array is sorted. k - amount of shift / rotation, e - target number
Linear search O(N)
Binary algo: if you know k, you know it's shifted by k, so the centre (index 0's element) sits at index (N-k) mod N and the middle index is ((N-k) + N/2) mod N. Use de-referenced binary search - IE translate each index w/ offset k, w/ wrap-around (size of array N).
Problem 1 - USE BINARY SEARCH, HOME IN ON THE BREAK. W/o knowing k - find the break - searching for the point where a number is larger than its successor
At each step check for the target number e.
If at any point the target is in between one step and the next & that is ascending range re-scope the search limits to those points
get first element - n0 - need a point of reference so that sequence break can be found
n0 > n[N-1]
start index idx = n/2
loop_start:
is number target e? - found target
is number to the left larger? - found break
YES found target
NO go to n/4 or 3n/4
is number smaller than n0?
YES - break on left - idx = n/4
NO - break on right - idx = 3n/4
loop - until found break (or target)
then look for target
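One way to code the idea above - at each step work out which half is the ascending run and whether e sits inside it (a sketch, not the problem set's official solution):

```python
# Binary search in a rotated sorted array: re-scope to whichever half
# is a clean ascending range containing e, otherwise chase the break.
def search_rotated(arr, e):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == e:
            return mid
        if arr[lo] <= arr[mid]:                  # left half is the ascending run
            if arr[lo] <= e < arr[mid]:          # target inside it: re-scope there
                hi = mid - 1
            else:                                # otherwise the break is on the right
                lo = mid + 1
        else:                                    # right half is the ascending run
            if arr[mid] < e <= arr[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1

assert search_rotated([5, 6, 7, 0, 1, 2, 3], 1) == 4
```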
Problem 2 - Using MinHeap - Find the Kth ranked number in the heap
How do we achieve a good running time for this hard problem of O(KlogK)
MinHeap
N elements
extract kth smallest element
val-rank tree, k=3 is element val 6:
        2-1
    5-2      7-4
  6-3  9-6  8-5
Brute force solutions:
a) sort array O(nlogn), retrieve kth element - O(1) - O(nlogn)
b) pop_heap K times - O(Klogn) - K number of times you call pop(logn) - pop log n to bubble leaf down to right position
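The O(KlogK) idea is a second, small heap of candidates - only children of already-popped nodes can be the next smallest, so the candidate heap stays O(K) and each of the K pops costs O(logK). A sketch over the array form of the MinHeap above:

```python
# Kth smallest from a min-heap in O(K log K) using a candidate heap.
import heapq

def kth_smallest(heap, k):                       # heap: array-form min-heap
    candidates = [(heap[0], 0)]                  # (value, index) - start at the root
    val = None
    for _ in range(k):                           # k pops, each O(log K)
        val, i = heapq.heappop(candidates)
        for child in (2 * i + 1, 2 * i + 2):     # children become new candidates
            if child < len(heap):
                heapq.heappush(candidates, (heap[child], child))
    return val

assert kth_smallest([2, 5, 7, 6, 9, 8], 3) == 6  # the example above: rank 3 is 6
```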
Problem 3
Array of random elements e = 56,756,27,98,56,4,6356,879,68,8,4,64,6746,7,746,72,9,99,2
Query is range i to j 0 <= i,j <= n
Return minimum in the range
a) precompute all answers
Number of possible range queries is n·n = n^2 pairs (i, j); create a hash to look up the precomputed answer.
Hash: convert a 2-digit base-n number into a table index. EG (23,53) → index 23n+53
Precompute cost: O(n^3) or O(n^2) if using rolling hash style minimum assessment (see 48m)
Space cost: O(n^2) approx, there would be some duplicate inverted ranges
Query cost: O(1)
b) brute force search element on every query
Precompute cost: O(1)
Space cost: O(1) extra (no precomputed storage, just the array)
Query cost: O(N)
thoughts:
Pre-compute ranges larger than n/4, or predetermined sweet-spot.
Compute remaining queries.
c) Solution - 52m
Space cost: O(nlogn) elements, Query cost: O(1)
IMPLEMENT THIS SOLUTION
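A sketch of the O(nlogn)-precompute / O(1)-query idea (a sparse table: store the min of every power-of-two-length range, answer any query with two overlapping ranges; assuming this is the 52m solution):

```python
# Sparse table RMQ: O(n log n) space, O(1) per query.
def build_sparse(a):
    n = len(a)
    table = [a[:]]                               # table[j][i] = min of a[i : i + 2^j]
    j = 1
    while (1 << j) <= n:
        prev = table[j - 1]
        table.append([min(prev[i], prev[i + (1 << (j - 1))])
                      for i in range(n - (1 << j) + 1)])
        j += 1
    return table

def range_min(table, i, j):                      # min of a[i..j] inclusive
    length = j - i + 1
    p = length.bit_length() - 1                  # largest power of two <= length
    return min(table[p][i], table[p][j - (1 << p) + 1])

e = [56, 756, 27, 98, 56, 4]
t = build_sparse(e)
assert range_min(t, 1, 4) == 27
```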
Maintaining medians not covered in the recitation.
contents ~
vid
lect notes
Code:
Reading:
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
0m-2m | Motivation, use of multiplication in division |
2m-5m | Review |
5m-10m | Error analysis of Newtons method, ε_n being the error |
11m-23m | Multiplication Algorithms |
23m- | High precision division |
26m30- | Division |
26m30- | Newton iteration < |
30m47 | Critical - in the box! Derived from Newtons iteration |
33m- | Example w/ numbers plugged in!! |
36m- | Complexity calculation of division - SPOILER SAME AS MULTIPLICATION! |
47m30- | Complexity of computing square roots - SPOILER SAME AS MULTIPLICATION! |
29m33
contents ~
vid ~
lect notes
Code:
Reading:
time | notes |
---|---|
1m-10m | Karatsuba Multiplication - see L11 34m |
3m | time complexity normal multiply vs Karatsuba O(n^log3) |
10m | Newtons Method |
11m20 | Graphical walk through theory |
10m-20m | deriving Newtons Method |
30m33 | One thing to remember is using this trick |
31m | Example of finding cubed root |
11m20 Graphical walk through theory
tan(Θ) = opposite / adjacent
This works by making a first guess,
finding the tangent at that guess and using its intersection w/ y=0 as the next guess.
Repeat this log d times where d is the precision in digits (convergence is quadratic - the number of correct digits roughly doubles each iteration).
We need to remove difficult division, which is done by multiplying everything by 10^d and working w/ integers.
Feed the result back into the approximation until you get the same result twice, which means we have converged on a solution.
The initial guess is arrived at by binary search, to get a number that produces a correct most significant digit.
Then home in on the solution w/ Newton's method.
30m33 removing difficult division
for 10 digits of precision multiply by 10^d, IE left shift 10 places in decimal
Summary of steps - towards code:
Approach: Create function whose root is the answer we seek & use newtons method to home in on the solution
Generate function.
See worked notes for steps w/ maths R12 p2-3
Type up add here
CODE this up - requires Karatsuba Coded. .
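While waiting for Karatsuba, plain Python integers are enough for a sketch of the "multiply by 10^d & work w/ integers" trick - an integer Newton iteration for square roots (requires n ≥ 1):

```python
# Integer Newton iteration for floor(sqrt(n)); for d digits of sqrt(a),
# call it on a * 10^(2d) - the "left shift then work with integers" trick.
def isqrt_newton(n):
    x = 1 << ((n.bit_length() + 1) // 2)         # initial guess: right order of magnitude
    while True:
        y = (x + n // x) // 2                    # Newton step for f(x) = x^2 - n
        if y >= x:                               # stopped decreasing: converged
            return x
        x = y

d = 10                                           # digits of precision
print(isqrt_newton(2 * 10 ** (2 * d)))           # 14142135623 ~ sqrt(2) * 10^10
```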
contents ~ vid ~ lect notes ~ Code: BFS EG w/ visualisation Reading:
type: Breadth-first search (BFS)
use cases: web crawlers, find shortest path & minimum spanning tree for unweighted graphs, route planning GPS, broadcasting in a network, detecting unreachable vertices from s (start)
queries: UPDATE from notes R13 at end
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
0m-2m | Introduction to graphs, adjacency lists |
5m | Applications |
10m | Pocket cube |
20m | Graph representation, adjacency lists |
29m30 | Space used Θ(V+E) vertices + edges I assume |
34m | BFS algorithm pseudo / python code - writing it on the board yawn. . |
37m | BFS algorithm pseudo / python code - WALKING the code |
43m | Parent pointer concept to create shortest paths |
45m | Shortest paths properties |
49m30 | Total running time BFS 2.E |
49m45 | Runtime & Handshaking Lemma |
Handshaking Lemma, CLRS p1173 - says every edge increases the degrees by 2 - one for each node it connects.
Lemma - "a subsidiary or intermediate theorem in an argument or proof."
Any equation identities / topics for this lecture include context and uses for later reference
Graphs appendix in text book!
contents ~
vid ~
lect notes ~
Code: maze.py &
bfs.py &
bfs_maze.py
Reading: CLRS C22 p589-602
time | notes |
---|---|
0-17m | PS5 help |
17m-28m | Terms BFS graphs, handshaking lemma |
28m-34m | Representing graphs in python, adjacency lists |
34m30 | Representing graphs in python, adjacency matrix |
39m | Walk Breadth First Search - BFS |
44m-52 | BFS running time decomposition |
52 | FB / twitter thoughts - directed / undirected |
TERMS
Vertex - node in the graph.
Edge - line connecting two nodes.
Connected component - set of nodes connected by edges that can all be reached from each other.
Degree - number of other nodes connected to a node
Handshaking Lemma - Sum of all the degrees = 2 x number of edges. (each edge adds 1 to the degree of each of the 2 nodes it connects)
For directed graphs:
In degree - number of incoming edges
Out degree - number of outgoing edges
Total in degrees = Total out degrees
Sum of in degrees + Sum of out degrees = 2E
28m - Representing graphs in python
Graph is represented using a dict of vertices.
Each vertex has an adjacency list of the vertices (nodes) it's connected to.
The length of the adjacency list is the degree of the node (number of other nodes it's connected to).
Space = O(V+E) assuming empty lists 33m40
34m30 - Representing graphs in python, adjacency matrix
See lect notes p2
Square matrix of nodes using a bit per node.
If nodes are connected a 1 is placed in the cross section.
If nodes are NOT connected a 0 is placed in the cross section.
Where a node crosses with itself the cross is set to 1.
Number of 1s in the matrix = 2E + V (the +V comes from the nodes crossing with themselves, V nodes so +V)
Number of 0s in the matrix = V^2 - (2E + V) (V^2 is all the cross points - total # of 1s)
Space V^2 bits
39m - Walk Breadth First Search - BFS
BFS started from a particular node will discover the elements in that component, NOT necessarily the whole graph.
Uses a Queue data structure to store nodes as they are discovered, then remove them after their connections have been checked.
Nodes are kept in a dict.
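A minimal BFS sketch of the above (a deque stands in for the lecture's level-by-level frontier lists; same behaviour):

```python
# BFS: queue of discovered nodes, level dict doubling as the 'seen' set,
# parent pointers kept for reconstructing shortest (unweighted) paths.
from collections import deque

def bfs(adj, s):                                 # adj: dict of adjacency lists
    level = {s: 0}
    parent = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in level:                   # each node enters the queue once
                level[v] = level[u] + 1
                parent[v] = u
                q.append(v)
    return level, parent

adj = {'a': ['b', 'c'], 'b': ['a', 'd'], 'c': ['a'], 'd': ['b']}
level, parent = bfs(adj, 'a')                    # level: {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```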
44m - BFS running time decomposition
Assumptions:
Vertex/Node seen held in a hash table w/ T/F value - EG dict seen or dict level
Hash: search O(1), delete O(1), insert O(1)
BFS running time:
BFS over a hash of adjacency lists: look up each node once: O(V) - V is the number of nodes; for each node process its O(degree) edges - stored in its list
Matrix - O(V^2) Adjacency list - O(V + E) or O(E) - 47m40
See lect notes p2
type | matrix | adjacency list |
---|---|---|
space | Θ(V^2) bits | Θ((V+E)w) bits or Θ(V+E) words |
time | - | - |
add edge | O(1) | O(1) |
has edge?(U,V) | O(1) | O(degree) |
all neighbours | Θ(V) | O(degree) |
BFS | O(V^2) | O(V + E) or O(E) - 47m40 |
w - word size, V - number of vertices, E - number of edges. del edge - similar to has edge? & add edge
The adjacency list representation provides a compact way to represent sparse graphs - those for which |E| is much less than |V^2| - it is usually the method of choice.
We may prefer an adjacency matrix representation when:
A) the graph is dense - |E| is close to |V^2| or
B) when we need to be able to tell quickly if there is an edge connecting two given vertices.
Number zones in a maze
Code: maze.py &
bfs.py &
bfs_maze.py
> cd algos
> ./maze.py # display random maze
> ./bfs_maze.py # create random maze & graph of maze run BFS on it label distance from origin (0,0) TLHC
Maze
.123456789.123456789.123456789.123456789.123456789
0 @@@@@ @ @ @ @ @@@@@@@@@ @ @ @@@@@ @ @@@@@ @ @@@@@ @@@@@
1 @ @ @ @ @ @ @ @ @ @ @ @
2 0 @ 11@ 12 13 14@ 17@ 22 23@ 26@ 29@ 30@ 41 40@ 17 18@ 11 10@ 5 6 @ 1
3 @ @ @ @ @ @ @ @ @ @ @ @
4 @ @ @@@@@@@@@@@@@ @ @ @ @ @ @ @ @@@@@ @@@@@@@@@ @@@@@ @
5 @ @ @ @ @ @ @ @ @ @ @
6 @ 1 @ 12 13 14 15 16@ 21@ 24 25@ 28@ 29@ 40@ 39@ 20 19 20 21@ 4 3 2 @
7 @ @ @ @ @ @ @ @ @ @ @
8 @ @@@@@@@@@@@@@@@@@@@@@ @ @@@@@ @ @ @ @ @@@@@@@@@ @@@@@@@@@@@@@
9 @ @ @ @ @ @ @
10 @ 2 3 4 5 @ -1 -1@ 20@ 25 26 27 28@ 39 38@ 21 22 23 22 23 24 25@
11 @ @ @ @ @ @ @
12 @@@@@@@@@@@@@ @ @ @ @@@@@@@@@@@@@@@@@@@@@ @@@@@@@@@@@@@@@@@@@@@ @ @
13 @ @ @ @ @ @ @ @ @ @ @ @
14 @ 33 32@ 7 6 @ -1@ -1@ 19@ 20 21 22@ 39 38@ 37 36 35@ 26 25 24@ 25@ 26@
15 @ @ @ @ @ @ @ @ @ @ @ @
16 @ @ @ @@@@@ @ @ @ @@@@@ @ @ @@@@@@@@@ @ @@@@@ @ @ @
17 @ @ @ @ @ @ @ @ @ @ @ @
18 @ 32 31@ 8 9 @ -1 -1@ 18@ 19@ 24 23@ 38 37@ 32 33 34@ 27@ 22 23@ 26 27@
19 @ @ @ @ @ @ @ @ @ @ @ @
20 @@@@@ @@@@@ @@@@@@@@@ @ @ @@@@@@@@@ @ @@@@@@@@@ @ @ @@@@@ @
21 @ @ @ @ @ @ @ @
22 29 30@ 11 10@ 15 16 17 18@ 25 26@ 35 36@ 31 30 29 28@ 21@ 24 25@ 28
23 @ @ @ @ @ @ @ @
24 @@@@@@@@@ @@@@@ @@@@@@@@@@@@@@@@@ @ @@@@@@@@@@@@@@@@@ @ @@@@@ @@@@@
25 @ @ @ @ @ @ @ @ @
26 28 29@ 12 13 14@ 23 22@ 25 26@ 27@ 34@ 35 34 33 32@ 29@ 20 19@ 26 27
27 @ @ @ @ @ @ @ @ @
28 @ @ @@@@@@@@@@@@@ @ @ @ @ @ @ @@@@@@@@@ @ @ @ @@@@@@@@@
29 @ @ @ @ @ @ @ @ @ @ @ @
30 29@ 30@ 29 28@ 25 24@ 21@ 24@ 27@ 28@ 33@ 36 35@ 32 31 30@ 19 18 17@ 30
31 @ @ @ @ @ @ @ @ @ @ @ @
32 @@@@@ @ @ @ @@@@@ @ @ @ @ @ @ @ @@@@@@@@@@@@@@@@@ @ @
33 @ @ @ @ @ @ @ @ @
34 32 31 30@ 27 26@ 19 20@ 23@ 28 29@ 32@ 37@ 34 33@ 14 13 14 15 16@ 31
35 @ @ @ @ @ @ @ @ @
36 @@@@@@@@@@@@@@@@@@@@@ @ @ @@@@@@@@@ @ @@@@@@@@@ @ @@@@@@@@@@@@@@@@@
37 @ @ @ @ @ @ @ @ @
38 9 10 11@ 14 15@ 18@ 21 22@ 27 28@ 31@ 38 39@ 16 15@ 12@ 9 8 7 8
39 @ @ @ @ @ @ @ @ @
40 @@@@@ @ @ @ @ @@@@@@@@@ @ @ @@@@@ @ @@@@@ @ @@@@@ @@@@@
REFS: BFS
contents ~
vid ~
lect notes ~
Code: maze.py &
dfs.py &
dfs_maze.py
Reading: CLRS C22 p603-615
type: Depth First Search
use cases: precedence scheduling, cycle detection, logic requiring backtracking
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
1m-m | DFS using recursion to leave trail |
2m-7m | Recursive call to find all nodes connected to a source |
7m-9m | Build on recursive function to find ALL vertices / nodes |
9m | Example |
14m | Running time O(V+E) |
18m-30m | Edge classification, in directed & undirected graphs |
30m | Use of edge type for cycle detection |
31m-42m | cycle detection - and proof, uses matched parenthesis - recursion entry exit pairs |
42m | topological sort - scheduling |
46m | Correctness proof - topological sort |
Code - broad brush
parent = {}                     # keeps track of vertices / nodes visited; for a single source start w/ {s: None}

def dfs_visit(Adj, s):          # recursively find nodes reachable from s (Adj - dict of adjacency lists)
    for v in Adj[s]:
        if v not in parent:
            parent[v] = s
            dfs_visit(Adj, v)

def dfs(V, Adj):                # visit all components (subgraphs); V - set of vertices of the graph
    for s in V:                 # one iteration per component / subgraph
        if s not in parent:
            parent[s] = None    # s roots a new tree; parent ends up holding the forest (roots map to None)
            dfs_visit(Adj, s)   # same Adj each time - each call explores one component
14m - Running time Analysis - Θ(V+E)
In DFS visit each vertex once O(V)
DFS-Visit is called once per vertex v; each call pays len(Adj[v]), written |Adj[v]|
Summed for all v ∈ V (v in set V, IE all members of set V):
|Adj[v]| = len(Adj[v]), the length of the list of successor nodes, and implicitly edges.
So the sum of the lengths of all the adjacency lists is the number of edges. Thus O(E)
So DFS + DFS-visit is Θ(V+E)
Edge Classification
Tree edge - edge that goes to something so far unvisited. (parent holds the tree edges)
(these create a forest of trees)
Back edge - node to ancestor in the tree / forest - indicates a cycle (all nodes in the stack - detected by marking an inStack flag in the node and checking it in subsequent node visits)
Forward edge - node to descendant in the tree (further along from source) (detected by counter-stamping each node as it's passed - a forward edge goes low to high)
Cross edge - edge between two subtrees with no ancestor relationship (siblings resulting from an ancestral fork).
Cycle detection
Back edges are detected by setting an INPROGRESS (node in the stack) flag at start of each recursive node check.
If the node being checked has the INPROGRESS flag set the edge is a back edge.
To get the cycle follow parent list back to the node.
42m - Topological sort - scheduling
Name comes from sorting the topology of the graph.
DAG - Directed Acyclic Graph
Acyclic Graph - graph w/o any cycles in it.
Output the reverse of the finishing times of each vertex.
Worked example CLRS p607.
When DFS processes a graph, the first encounter / "start time" of vertex n is depicted by "(n" and the finish time by "n)".
Entry & exit will be a well formed set of parentheses:
(s (z (y (x x) y) (w w) z) s) (t (v v) (u u) t)
Do Practical Code example
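Sketch of the practical code: topological sort = reverse of DFS finishing times (DAG input assumed; the graph below mirrors the parenthesis example):

```python
# Topological sort via DFS: append each vertex when it finishes ("u)"),
# then reverse the finishing order.
def topological_sort(V, adj):
    order = []                                   # finished vertices, appended at exit
    visited = set()

    def dfs_visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs_visit(v)
        order.append(u)                          # finish time recorded here

    for s in V:
        if s not in visited:
            dfs_visit(s)
    return list(reversed(order))

adj = {'s': ['z'], 'z': ['y', 'w'], 'y': ['x'], 'x': [], 'w': [],
       't': ['v', 'u'], 'v': [], 'u': []}
print(topological_sort(list(adj), adj))          # e.g. ['t', 'u', 'v', 's', 'z', 'w', 'y', 'x']
```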
46m Correctness proof
Number components / subgraphs / zones in a maze
Code: maze.py &
dfs.py &
dfs_maze.py
> cd algos
> ./maze.py # display random maze
> ./dfs_maze.py # create random maze & graph of maze run DFS on it label components
# A component being each complete distinct subgraph
Maze
.123456789.123456789.123456789.123456789.123456789
0 @@@@@@@@@@@@@ @ @ @@@@@@@@@@@@@ @@@@@@@@@@@@@@@@@@@@@@@@@ @@@@@@@@@@@@@
1 @ @ @ @ @ @ @ @
2 1 @ 2 2 @ 3 @ 4 4 @ 4 4 @ 4 4 4 4 4 @ 5 5 5 @ 3 @ 1 1 1
3 @ @ @ @ @ @ @ @
4 @ @ @ @ @@@@@@@@@ @ @ @@@@@@@@@ @ @ @@@@@ @ @ @@@@@@@@@
5 @ @ @ @ @ @ @ @ @ @
6 1 @ 2 2 @ 3 @ 6 6 @ 4 @ 4 4 @ 4 4 4 @ 4 @ 5 5 5 @ 3 @ 1 1 1
7 @ @ @ @ @ @ @ @ @ @
8 @@@@@@@@@@@@@ @ @ @ @@@@@ @ @@@@@@@@@ @@@@@@@@@@@@@ @@@@@@@@@@@@@
9 @ @ @ @ @ @
10 7 @ 3 3 3 @ 6 6 @ 4 4 @ 4 4 4 4 4 @ 3 3 3 3 3 @ 7 7
11 @ @ @ @ @ @
12 @ @ @@@@@@@@@@@@@@@@@@@@@ @@@@@@@@@@@@@ @@@@@ @@@@@@@@@ @ @ @@@@@
13 @ @ @ @ @ @ @
14 7 @ 3 3 3 @ 8 8 8 @ 4 4 4 4 4 @ 3 3 @ 3 3 3 @ 3 @ 7 7
15 @ @ @ @ @ @ @
16 @@@@@@@@@@@@@ @ @@@@@ @ @@@@@@@@@@@@@@@@@ @ @ @@@@@@@@@ @@@@@@@@@
17 @ @ @ @ @ @ @ @
18 @ 3 3 3 3 @ 8 8 8 @ 4 4 @ 3 3 3 3 @ 3 @ 3 3 3 @ 3 3 3 @
19 @ @ @ @ @ @ @ @
20 @ @@@@@@@@@@@@@@@@@ @@@@@@@@@ @ @@@@@@@@@ @ @@@@@@@@@ @@@@@@@@@ @
21 @ @ @ @ @ @ @ @ @ @
22 @ 3 @ 9 9 9 9 @ 8 8 @ 4 4 @ 3 3 3 @ 3 @ 3 3 3 @ 3 3 @ 3 3 @
23 @ @ @ @ @ @ @ @ @ @
24 @ @ @@@@@@@@@ @@@@@ @ @@@@@@@@@ @ @ @@@@@@@@@ @@@@@ @ @@@@@
25 @ @ @ @ @ @ @ @ @
26 3 @ 9 9 9 9 @ 8 8 @ 4 @ 3 3 @ 3 3 @ 3 @ 3 3 3 @ 3 3 @ 3 3
27 @ @ @ @ @ @ @ @ @
28 @@@@@@@@@@@@@@@@@@@@@ @@@@@ @ @ @@@@@@@@@ @ @@@@@ @ @@@@@@@@@@@@@
29 @ @ @ @ @ @ @ @ @ @ @
30 @ 10 10@ 8 8 8 8 @ 4 4 @ 3 @ 3 3 3 @ 3 @ 3 @ 3 3 @ 3 @ 3 3 3 @
31 @ @ @ @ @ @ @ @ @ @ @
32 @ @ @ @@@@@@@@@ @ @ @ @@@@@@@@@ @ @ @ @@@@@ @ @@@@@ @
33 @ @ @ @ @ @ @ @ @ @ @
34 @ 10 10@ 8 8 8 8 @ 4 @ 4 @ 3 3 3 3 @ 3 @ 3 3 @ 3 3 @ 3 3 @ 3 @
35 @ @ @ @ @ @ @ @ @ @ @
36 @@@@@@@@@@@@@@@@@@@@@@@@@ @ @@@@@@@@@ @@@@@ @ @@@@@ @ @ @ @ @
37 @ @ @ @ @ @ @
38 3 3 3 3 @ 4 4 4 @ 4 4 4 @ 3 3 3 @ 3 3 3 @ 3 @ 3 3 @ 3
39 @ @ @ @ @ @ @
40 @@@@@@@@@@@@@ @ @ @@@@@@@@@@@@@ @@@@@@@@@@@@@@@@@@@@@@@@@ @@@@@@@@@@@@@
[ 0 1, 1 1, 3 9, 4 9, 13 1, 4 2, 0 3, 4 4, 1 6, 0 8] < Component Start nodes
0,0 = TLHC 19,9 = BRHC
contents ~ vid ~ lect notes ~ Code:dfs.py ~ Reading:
time | notes |
---|---|
0-2m | What to cover? |
2m-3m | Graphs, dict of adjacency lists |
3m-6m | Going over graph DS implementation |
6m | Going over DFS code |
23m | Edges |
32m | Undirected graph walk through |
35m | Edges in directed vs undirected graphs |
39m | Topological Sort - Directed Acyclic graphs only (DAGS only) |
39m - Topological Sort - Directed Acyclic graphs only DAGS only
"A > B - forward edge means B depends on A" < this doesn't sound right to me . .
If B follows A then surely B depends on A!?? CHECK - CONFIRM
contents ~ vid ~ lect notes ~ Code: ~ Reading: CLRS C24 p643-683
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
0-4m | Intro Motivation |
4m | Dijkstra (non negative edges) O(VlgV + E) |
6m | Bellman Ford (+ve & -ve edge weights) O(VE) = O(V^3) |
8m | Notation 1 |
14m-18m | Notation 2 |
18m | Example notes p3 |
24m | Predecessor relationship PI ∏, shortest current path d(V) |
27m | -ve weights - |
32m | -ve cycles |
37m | Relaxation - resolving to shortest paths |
38m | Structure of shortest path algorithms (Single source / Brute force) |
45m | Exponential graph example - notes p5 |
50m | Optimum substructure |
Graph G(V, E, W)
V - vertices
E - Edges
W - Weights
4m - Dijkstra (non negative edges) dominating factor is E (edges)
O(VlgV + E) - R15 18m49 "you need a fancy DS for this running time" (a Fibonacci heap); O(ElgV) is the more practical running time.
E = O(V^2)
6m - Bellman Ford (+ve & -ve edge weights)
O(VE) = O(V^3) since E = O(V^2)
IE prefer Dijkstra where edge weights are non-negative . .
8m - Notation
Definition of a path and its weight
- Path p is a sequence of vertices v0, v1, . . . vk
- each edge (vi, vi+1) is a member of E, the set of edges, for i between 0 and k-1
- the weight of the path w(p) is the sum of the weights w(vi, vi+1) of the edges it contains
14m-17m - Notation 2
18m - Example notes p3
24m - Predecessor relationship
d(V) - value inside the circle - current best path weight
∏[V] - Predecessor on the (current) best path to V - ∏[S] = NIL
32m - Structure of shortest path algorithms
45m - Exponential graph example - notes p5
contents ~ vid ~ lect notes 1 ~ lect notes 2 ~ Code: Reading:
time | notes |
---|---|
0m-16m | Adapting BFS to solve undirected graph w/ weighted nodes |
16m | Running time of adapted BFS |
20m | Exam Q - transforming a problem - graph layers to represent state odd/ even in this case |
40m | Multilayered state representation - highway network - travel time different at time t |
0m-16m - Adapting BFS to solve undirected graph w/ weighted nodes
For an edge of weight 4, insert 3 fake nodes to simulate a distance of 4 w/ 4 unit edges.
Do the same for all the paths then run BFS on the new graph.
Take the output and reverse the process - remove fake nodes leaving the original nodes forming the path.
16m - Running time of adapted BFS
Dijkstra - O(VlgV + E) or O(ElgV) - R15 18m49
BF - O(V.E)
Adapted BFS - O(v)
20m - Exam Q - transforming a problem - graph layers to represent state odd/ even in this case
Tim & Jim traveling from A to E.
Constraints: they must take it in turns driving, and Tim must start and end the journey.
So shortest path starts with Tim and ends with Tim so must have an odd number of steps.
Duplicate the graph labelling one set ODD & the other EVEN.
Use Dijkstra to find shortest path between A_even and E_odd
Code this
40m - Multilayered state representation - highway network - travel time different at time t
Problem: least amount of fuel to get from a to b. (Most ecological)
Fuel used to get from node a to node b is also a function of t (an integer, in minutes)
F(a,b,t) = fuel used. going from node a to b leaving at time t
Also introduces the concept of waiting which uses no fuel
contents ~ vid ~ lect notes ~ Dijkstra Shortest path Computerphile ~ A* A star Search Algorithm ~ Code: Reading: CLRS, Sections 24.2-24.3 p648
type: greedy BFS to find shortest path
use cases: routing tables (networks), route finding (maps)
queries: find(graph, start, target)
updates: none
Basics to understand Dijkstra algorithm.
DFS vs BFS
With DFS the search uses a stack to store new nodes as the graph is searched, this also lends itself to using recursion since it has an implicit stack in the call stack.
With BFS a queue structure is used to store and process nodes; this is what is used by Dijkstra.
Dijkstra is a greedy algorithm (it makes the locally optimal choice at each stage).
The first step is to process all the nodes adjacent to the start node and queue them up in order of shortest distance first. (Shortest distance is the optimal - greedy - next choice).
The algorithm then repeats this step with each element in the Q:
It pulls a node off the Q, goes through each adjacent node, calculates the distance of the adjacent node from the start and places that adjacent node back in the Q according to its distance from the start.
Note if an early node (processing wise) is a long distance away from the start, a multistep route with a smaller distance will be processed first because it will appear earlier in the queue. This ensures the shortest route requirement.
The algorithm also maintains a set of visited_nodes to detect when the target node has been found.
Relaxation - fundamental concept:
d[v] length current SP from source (blue) to v
- (d[v] updates for each node as the algorithm runs - set to ∞ to start)
- except for S, which is 0; all the rest start at infinity until they are deduced
delta(s,v) length of a shortest path between s & v
∏[v] predecessor of v in the SP from s to v (used to reconstruct the SP)
∏[S] = NIL
w(u,v) distance from u to v
Relaxation operation:
RELAX(u, v, w)
  if d[v] > d[u] + w(u, v)        # if dist from S to u + dist from u to v < current dist from S to v
    then d[v] = d[u] + w(u, v)    # then update v with the new shortest path
         ∏[v] = u                 # and update its predecessor with u
It is basically saying: if we have found a shorter route to v via u, update the details in v to reflect this.
Keep going until T is in the visited_nodes.
Abr | TERMS |
---|---|
S | Source |
T | Target |
SP | Shortest Path |
DAG | Directed Acyclic Graph |
1 | 2 | 3 |
---|---|---|
![]() | ![]() | ![]() |
Code @ citimap.py | More nodes | More nodes |
![]() | ![]() | ![]() |
Code @ u9_fp_random_target.js | More nodes | More nodes |
To see short animation navigate here and click DOWNLOAD for mp4.
time | notes |
---|---|
2m-6m | Review - Shortest Paths in DAGs - Directed Acyclic Graphs |
6m | Relaxation & its being SAFE |
6m-17 | Relaxation & its being SAFE proof - dubious |
17m-21m | DAGs (-ve edges OK, can't have cycles) - SPECIAL CASE SHORTEST PATH |
21m | DAG graph visual example |
28m-41 | Dijkstras Algorithm |
41 | Dijkstras Algorithm - walking example |
2m - Review - Shortest Paths in DAGs - Directed Acyclic Graphs
d[V] - shortest path to V from source S - number in the centre of the node in diagram!!
d[V] at S is 0, the rest are initialised to infinity and are then relaxed AFAP (As Far As Possible)
∂(S,V) delta(S,V) - length of A shortest path from S to V
∏[V] pi[V] - Predecessor on the (current ) best path to V - ∏[S] = NIL
w(u,v) - weight of path from u to v
7m - Relaxation step
Relaxation step involves updating a d[V] with the value of a newly found shorter route.
AND
Updating its predecessor ∏[V] w/ the new predecessor nodes
7m10-17m - Relaxation SAFETY
Converging on the delta value from the top and stopping at the minimum delta.
Lemma: The relaxation algorithm maintains the invariant that d[v] ≥ ∂(s, v) for all v ∈ V
17m - DAGs (-ve edges OK - CAN'T have cycles)
SPECIAL CASE (Graph w/o cycles -ve edges OK) SHORTEST PATH - O(V+E)
- Topologically sort the DAG
- Then relax in sorted order - Note a path from u to v in the sorted graph implies u precedes v.
Process O(V+E)
What is this used for?? Getting shortest paths of DAG weighted graph w/o cycles -ve edges OK!
28m-41 - Dijkstras Algorithm
Watch for an uncomplicated accurate walkthrough Dijkstra Shortest path Computerphile ~ developed into A* A star Search Algorithm
Next iteration of processing is taken from the PRIORITY QUEUE (MinHeap) where nodes are inserted when found ordered by their d[node] from S
From:
./citimap.py # create a random set of cities connected by routes to present to ./dijkstra.py
S = source node
Π[v] = predecessor node
w(v1,v2) = sum of edge weights from v1 to v2
See generic pseudo code L17 top p2
Each node should start w/ distance from S set to math.inf
Predecessor node Π[v] (pi = symbol for predecessor) set to NIL / None
Distance from S at S is obviously 0! Since it's the source node.
Iterate through all outbound nodes and only update the vertex if this route is shorter than a previous route's update.
Note nodes / vertices may have been reached by another route and may already have a better path,
in which case don't update!
Only on an update (shorter path found) add/update the node to/in the priorityQ - - - I think? this is the bit I'm hazy on
PriorityQ doesn't support update? < CHECK - the sketch below sidesteps this w/ lazy deletion
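To answer the CHECK: Python's heapq has no update / decrease-key operation, so a common workaround is lazy deletion - push a fresh (distance, node) entry on every successful relaxation and skip stale entries when they are popped. A sketch (not the repo's dijkstra.py):

```python
# Dijkstra with a binary heap and lazy deletion instead of decrease-key.
import heapq, math

def dijkstra(adj, s):                            # adj[u] = [(v, weight), ...]
    d = {v: math.inf for v in adj}               # d[v] - current best distance from s
    pi = {v: None for v in adj}                  # pi[v] - predecessor on best path
    d[s] = 0
    pq = [(0, s)]
    while pq:
        du, u = heapq.heappop(pq)
        if du > d[u]:                            # stale entry: a shorter path was found
            continue
        for v, w in adj[u]:
            if d[u] + w < d[v]:                  # RELAX(u, v, w)
                d[v] = d[u] + w
                pi[v] = u
                heapq.heappush(pq, (d[v], v))    # duplicate push instead of update
    return d, pi

adj = {'s': [('a', 1), ('b', 4)], 'a': [('b', 2)], 'b': []}
d, pi = dijkstra(adj, 's')                       # d['b'] == 3, via a
```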
Any equation identities / topics for this lecture include context and uses for later reference
contents ~ vid ~ lect notes ~ Code: Reading:
time | notes |
---|---|
0m-30m | Rubiks cube walk through |
22m44 | Inverse of permutation |
29m | Starcraft |
Any equation identities / topics for this lecture include context and uses for later reference
contents ~ vid ~ lect notes ~ Code: Reading:
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
0m-2m | Intro |
2m-4m40 | Summary of graph w/ -ve cycle & expectations of Bellman-Ford algo |
4m40-10m | Summary generic shortest path algo & problems |
10m50 | Polynomial Time vs Exponential time top p3 in notes - CRITICAL - top p4 too! |
11m50-15m | Bellman-Ford algo pseudocode - bottom p3 in notes |
16m-18m | Bellman-Ford complexity O(VE + E) = O(VE) |
18m-30m | Proof part 1 |
30m-34m | Proof part 2 - corollary |
34m-40m | Couple queries answered |
40m-end | Finding longest paths in a graph / problems w/ cycles |
contents ~ vid ~ lect notes ~ Code: Reading:
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
0m-4m | Intro, ps6, bidirectional search |
4m-6m | basic Dijkstra pseudo code review |
7m30 | Bi-directional search |
11m40 | df[u] & db[u] terminology - distance forward & backward |
11m50 | Qf[u] & Qb[u] terminology - PriortyQ forward & backward |
28m | Constructing SHORTEST path problems notes p5 for diagram |
42m | Constructing SHORTEST path SOLUTION |
46m | Goal directed search & A* |
7m30 - Bi-directional search
Essentially:
searching from S > t (source to target)
and from t > S and look for frontier collision
Requires Π[v] (predecessor node) from S and from t for traversal in both directions,
and obviously shortest path info too.
for each node
df[u] & db[u] - distance forward & backward
for each search
Qf[u] & Qb[u] - PriortyQ forward & backward
Then detect the collision of the frontiers by looking for a node that has been extracted from both PriorityQs
NOTE
Concatenate subpaths to that node from each direction:
S > node + node > t
is NOT the way to get the shortest path!
28m - Constructing SHORTEST path:
42m - Constructing SHORTEST path SOLUTION
S > node + node > t gives a starting point for comparison
Then you need to look in the PriQs for nodes that appear in both and check to see if any paths are shorter.
Check this, doesn't seem particularly rigorous!?
46m - Goal directed search & A*
Used combination of path weight and a guiding metric to prefer paths going towards the goal.
EG in a map route, could be weight + distance to goal: Computerphile A* - 14m vid
Very
Add code.
Any equation identities / topics for this lecture include context and uses for later reference
contents ~ vid ~ lect notes ~ Code: Reading:
time | notes |
---|---|
0m-5m | Covering: Numerics . . see below |
5m-8m | Why you might get different shortest routes based on relaxation order |
8m | Numerics: Newtons method NM, choosing good start function |
19m | Numerics: NM, avoiding fraction, shifting, |
23m | Numerics: NM, estimating initial guess, order of magnitude |
25m | Numerics: NM, using binary search to generate first guess - home in on approximation |
27m | Numerics: NM, stop when approximation match - CONVERGED |
30m | Numerics: NM, speed of convergence. |
30m-35m | Numerics: NM, student Q&A - good summary |
35m- | Graph transformation - 2d graphs, layers R15 |
45m- | Graph transformation - running complexity |
47m- | Graph transformation - expressing constraints |
51m- | BFS & DFS use case summary L13, R13, L14 |
57m- | DFS edge types L14 |
COMPLETE EXs in Notes
0m-5m - Covering:
Why you might get different shortest routes based on relaxation order
Graph transformation - converting a complicated problem into a shortest route problem to solve it.
Numerics: Karatsuba
Numerics: Newtons method - FORMULA SB on CHEAT SHEET
DFS Edge types
Transformations & Layers
8m | Numerics: Newtons method, choosing good start function
Add formula for Newtons method & step summary, rationale behind each step
B - base d - digits of precision
23m - estimating initial guess, order of magnitude
35m - Graph transformation
Path from S > t
Coloured edges have an associated cost (5) for switching from one to the other
51m - BFS & DFS use case summary
CHEAT SHEET
Any equation identities / topics for this lecture include context and uses for later reference
Problem set 6 due
Quiz 2
contents ~
vid ~
lect notes
Code:
dyn_00_fib.py
dyn_01_shortest_paths.py
good side by side review L20 - 6m-17m
Reading:
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
0-5m | Introduction to Dynamic Programming (DP) recursion + memoization + guessing |
5m-11m20 | Solving Fibonacci w/ Memoization - naive version |
11m20-15m40 | Memoized version - running cost Θ(n) |
15m40 | running cost Θ(n) - rule 30m49 flashback for acyclic sub problems |
19m30-23m | DP generalisation, running cost |
23m-29m | Bottom-up DP algorithm (topological sort) |
29m-51m | shortest paths, converting cyclic paths to acyclic & bellman-ford |
35m | shortest paths - GUESSING |
0-5m - Introduction to Dynamic Programming (DP) recursion + memoization + guessing
Wikipedia definition:
Dynamic programming is both a mathematical optimization method and a computer programming method. The method was developed by Richard Bellman in the 1950s and has found applications in numerous fields, from aerospace engineering to economics.
In both contexts it refers to simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive manner. While some decision problems cannot be taken apart this way, decisions that span several points in time do often break apart recursively. Likewise, in computer science, if a problem can be solved optimally by breaking it into sub-problems and then recursively finding the optimal solutions to the sub-problems, then it is said to have optimal substructure.
If sub-problems can be nested recursively inside larger problems, so that dynamic programming methods are applicable, then there is a relation between the value of the larger problem and the values of the sub-problems. In the optimization literature this relationship is called the Bellman equation.
SEE L20 6m - for 5 steps!
5m-11m20 - Solving Fibonacci w/ Memoization - naive version
8m15 Recurrence p2 notes - revises Fibonacci recurrence
11m20-15m40 - Memoized version
Main takeaway: the running cost can be described as a tree; once a particular value has been calculated it can be retrieved from a dict.
This results in the main cost being the first filling of the memoization table with data.
The rest of the calls are essentially dict lookups, so for fib(n) the running time is O(n) - linear
add code from dyn_00_fib.py
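A stand-in sketch in the spirit of dyn_00_fib.py (the repo file itself isn't reproduced here) - memoized top-down next to bottom-up:

```python
# Memoized (top-down) vs bottom-up Fibonacci: same table, different order.
memo = {}

def fib(n):                                      # top-down: recursion + memoization
    if n in memo:
        return memo[n]                           # dict lookup after first computation
    f = n if n < 2 else fib(n - 1) + fib(n - 2)
    memo[n] = f
    return f

def fib_bottom_up(n):                            # bottom-up: fill the table with a loop
    table = {}
    for k in range(n + 1):                       # subproblems in topological order
        table[k] = k if k < 2 else table[k - 1] + table[k - 2]
    return table[n]

assert fib(30) == fib_bottom_up(30) == 832040
```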
15m40 - running cost Θ(n) - rule 30m49 flashback for acyclic sub problems
19m30-23m - DP generalisation, running cost
CORE CONCEPT
RECURSION + MEMOIZATION + GUESSING
Split the problem into memoized subproblems . .
and then if you need the solution to that problem again you don't need to recompute the solution!
COST
time = # of subproblems x time per subproblem
Recursions are NOT counted because memoization removes the cost of repeat computation.
ONLY WORKS ON DAGs / ACYCLIC subproblem graphs. To fix cycles create a layered graph - graph transform - 48m
23m-29m - Bottom-up DP algorithm (topological sort)
Don't get it - it's a for loop? Yes: bottom-up is the same computation as the memoized recursion, just with the subproblems computed in topological order by an explicit loop (see fib_bottom_up above).
Add timing info from dyn_00_fib.py
Notes on Space - see p3 notes
29m-51m - shortest paths, converting cyclic paths to acyclic & bellman-ford
Using graph transform - code example - code example in R19
See R19 notes too
35m - shortest paths - GUESSING
dyn_00_fib.py
Any equation identities / topics for this lecture include context and uses for later reference
https://www.youtube.com/watch?v=IFrvgSvZA0I&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=42
contents ~ vid ~ lect notes ~ dynamic prog lect18 spring 2011 - crazy eights ~ Code: dyn_01_shortest_paths.py dyn_02_crazy8s.py Reading:
time | notes |
---|---|
0m-9m30 | Crazy Eights crazy eights - L18 |
9m30-20m40 | Computing shortest paths on a DAG |
19m | Computing shortest paths on a DAG - complexity + Crazy 8s complexity |
19m30 | Crazy 8s complexity |
20m | Options: generalised pseudocode, memoization, dealing w/ cycles, another DP |
22m-28m | Manhattan lattice |
28m | Q&A - develop Manhattan lattice solution, topologically sorted so no memoization |
32m | Q: what is dynamic programming - Optimal substructure |
37m-48m | Cycles - graph transforms used to remove cycle + complexity |
48m | Q&A about cycles |
9m30 - Computing shortest paths on a DAG
Break the graph into subproblems and work out their weights. Each node has a number of incoming edges, each with an associated weight. Shortest path looks at those weights and chooses the minimum. For the target node this DEPENDS on the PRECEDING nodes, and the same for them until you reach the source. The dependency requires that nodes be processed in the order the edges progress, so a topological sort is required before processing starts. (Missing info will cause the recursion to fail.) Every node & edge must be processed, giving O(V + E) + the topological sort, which he appears to gloss over!!
22m - Manhattan lattice. Nodes labelled col,row (n * m) so top left 0,0, top right 5,1, bottom left 1,4, bottom right 5,4
31m you don't need memoization if you know the topological order of the graph - explain.
32m - what is dynamic programming - Optimal substructure
Optimal substructure
Technique to solve optimisation problems maximise or minimise something.
Also see L20
51m - devil's in the detail question - explain
contents ~ vid ~ lect notes ~ Code: dyn_02_text_just.py dyn_04_blackjack.py Reading:
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
0-6m | Lecture overview |
6m-17m | 6m Dynamic programming overview - 5 steps - Fibonacci / Shortest Path |
17m-35m | Text justification |
35m-39m | Parent Pointer - remember which guess was best - track the solution |
39m-52 | Blackjack - fires though at high speed |
6m Dynamic programming overview - 5 steps (not necessarily sequential!)
1. define sub problems - [count # subproblems]
2. guess (part of the solution) - [count # choices]
3. relate sub problem solutions - [compute time/subproblem] - this is identifying the recursive part
4. recurse & memoize - [time = time/subproblem * # subproblems]
   OR build DP table BOTTOM-UP - check subproblems acyclic & in topological order!!
5. solve original problem = a subproblem, OR by combining sub problem solutions - [extra time]
Step 5 can also be seen as expressing the problem as a directed acyclic graph (DAG) & computing the shortest path w/ Bellman-Ford.
Here the skill is in expressing the problem as a DAG.
Note: R19 - 31m you don't need memoization if you know the topological order of the graph - explain.
dyn_00_fib.py
Goes over the steps 6m-17m
dyn_01_shortest_paths.py
Goes over the steps 6m-17m
15m for k on the outside / for loop over V on the inside reverse does not work!
Use DP rather than greedy (whole-page strategy rather than line by line).
badness(line) = (page width - total width of words)^3 < extra space is really highlighted by the cubing!
badness(line) = math.inf if the words don't fit.
Problem: minimise the sum of the badnesses of the lines.
1. Copy steps to code and implement in python
2.
3
4.
5.
Code: dyn_02_text_just.py
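A hedged sketch of the justification DP (dyn_02_text_just.py is the repo's own version): subproblems are suffixes words[i:], the guess is where the first line ends.

```python
# Text justification: dp(i) = min total badness for words[i:], parent
# pointer j records where the chosen line break was.
import math
from functools import lru_cache

def justify(words, page_width):
    n = len(words)

    def badness(i, j):                           # cost of words[i:j] on one line
        total = sum(len(w) for w in words[i:j]) + (j - i - 1)   # words + single spaces
        return math.inf if total > page_width else (page_width - total) ** 3

    @lru_cache(maxsize=None)
    def dp(i):                                   # returns (cost, start of next line)
        if i == n:
            return 0, n
        return min((badness(i, j) + dp(j)[0], j) for j in range(i + 1, n + 1))

    lines, i = [], 0
    while i < n:                                 # follow parent pointers to rebuild lines
        j = dp(i)[1]
        lines.append(' '.join(words[i:j]))
        i = j
    return lines

print(justify('the quick brown fox jumped over the lazy dog'.split(), 15))
```

(This simple version also charges badness on the last line; variants often zero that out.)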
1. Copy steps to code and implement in python
2. guess - how many hits?
3. recurrence: outcome is a member of { -1, 0, 1 } - lose, draw, win
4.
5.
Code: dyn_03_blackjack.py
contents ~ vid ~ lect notes ~ Code:[Black] dyn_03_blackjack.py dyn_04_longest_subsequence.py Reading:
time | notes |
---|---|
39m-end | L20 39m-end refresh BlackJack DP solution |
0m-6m | R20 introduce the problem |
6m-18m | Encoding into a GRAPH solution |
15m15 | Choose TOPOLOGICAL SORT +DFS O(V+E) over Dijkstra O(E + VlogV) |
18m-37m | Dynamic Programming solution |
37m-end | Longest Increasing Subsequence |
Ace can be 1 or 11. Dealer sticks on 17 or higher. $1 per win. Cards consumed each round: 2 + 2 + player hits + dealer hits
4m - macro - round_outcome(i, h) for each game the only important info is how many cards player hits - h
After that is chosen: player gets 2 cards, dealer gets 2 cards, each player hits some cards; number of cards played - #cp; result {1,0,-1}
macro: round_outcome(i, h) i - start point, h - number of cards to hit return: (number of cards played this round #cp, winnings{1,0,-1} )
6m - Encoding into a GRAPH solution
The game starts with a deck of cards; after each round #cp cards have been used up, resulting in an outcome {1,0,-1}.
Setup: 52 nodes, each representing a position in the deck (cards played so far). Each edge represents a round's result, and goes from the start position (i) to start position (i) + #cp
This results in a DAG since all the edges move forward (by at least 4 cards) for each game.
Next step is to find the most profitable (highest score) path through the graph.
Topological Sort + DFS O(V+E) will give the shortest path (negate the edge weights to get the most profitable one).
15m15 - Choose TOPOLOGICAL SORT + DFS O(V+E) over Dijkstra O(E + VlogV)
C++ implementation finding the longest path
18m-m - Dynamic Programming solution
For DP the graph is implicit and no nodes are built; the solutions are stored directly.
Here cards are stored in a 52 element array. named dp[]
37m-end - Longest Increasing Subsequence
Brute force but solutions to sub problems already computed as you progress!
Solutions (length of the longest increasing subsequence starting at each index) are stored in an array the same size as the sequence, 1 slot per number in the problem space. EG 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15
index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Problem space: 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15
Solutions array 6 4 5 3 5 3 4 2 4 3 3 2 3 2 2 1
parent_index 2 3 6 7 6 7 9 15 9 11 11 15 13 15 15 nil
Start at the last element (end of sequence): the solution to longest increasing sequence there is 1; store it in the same location in the solution array.
Go left 1 and calculate from there: solution is 2, [7,15]
Repeat index 12 both index 13 & 14 provide acceptable (precalc) routes to increasing sequence. Store 3, parents 13,14
Repeat index 11 only i=15 provides acceptable (precalc) route to increasing sequence. Store 2, parent=15
Repeat index 10: indexes 11 (13) & 14 (7) provide acceptable (precalc) routes to an increasing sequence. Store 3, parents 11,14
Repeat index 9 both index 11 & 13 provide acceptable (precalc) routes to increasing sequence. Store 3, parents 11,13
Repeat index 8 all index higher 9,10,12 provide acceptable (precalc) routes to increasing sequence. Store 4, parents 9,10,12
Repeat index 7 only i=15 provides acceptable (precalc) route to increasing sequence. Store 2, parent=15
Repeat index 6 only i=9 provides acceptable (precalc) route to increasing sequence. Store 4, parent=9
etc etc until whole array processed.
Keep track of the highest stored number and its index. Use parents to reconstruct sequence.
Note depending on requirements only one parent may be necessary.
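Sketch of the right-to-left table fill just described (lis[i] = length of the longest increasing subsequence starting at i; one parent per index is enough to rebuild a sequence):

```python
# LIS: fill the solutions table from the right, then reconstruct via parents.
def longest_increasing_subsequence(a):
    n = len(a)
    lis = [1] * n                                # every element is a sequence of 1
    parent = [None] * n                          # next index in the best sequence
    for i in range(n - 2, -1, -1):               # right to left, like the walkthrough
        for j in range(i + 1, n):
            if a[j] > a[i] and lis[j] + 1 > lis[i]:
                lis[i] = lis[j] + 1
                parent[i] = j
    best = max(range(n), key=lambda i: lis[i])   # highest stored number & its index
    seq = []
    while best is not None:                      # use parents to reconstruct sequence
        seq.append(a[best])
        best = parent[best]
    return seq

a = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
print(longest_increasing_subsequence(a))         # one of the length-6 answers
```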
contents ~ vid ~ lect notes ~ Code: dyn_06_parenthesization.py dyn_07_edit_distance.py dyn_08_knapsack.py Reading:
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
0m-3m15 | Intro - 5 steps revision |
3m15m-6m | Choosing sub-problems - string or sequences - suffixes |
6m-24m | Parenthesization |
24m-43m | Edit Distance |
31m | Breaking down Edit Distance: Common subsequence - HIEROGLYPHOLOGY ex? |
43m-53m | Knapsack |
3m15m - Choosing subproblems - string or sequences - suffixes
type | complexity | notation |
---|---|---|
suffixes | O(n) linear | x[i:] ∀i |
prefixes | O(n) linear | x[:i] ∀i |
substrings | O(n**2) quadratic | x[i:j] ∀ i<=j |
6m-24m - Parenthesization
Associative expression: re-arranging parentheses won't change the result: (2 + 3) + 4 = 2 + (3 + 4) = 9
Example is matrix multiplication. Where computational cost depends on order of parenthesis.
Initial analysis looks like the solution approaches from both ends, IE using prefixes & suffixes!
In this case use substrings!
Draw out this recurrence / DAG, get better understanding.
24m - Edit Distance
Cost of creating one string from another. Can be used to build a list of likely words from a typo word.
Cost / Distance calculated in character edits w/ each edit having an associated cost.
(keys next to each other might have low cost since typo more likely)
operations:
insert c
del c
replace c > c`
Also spelling correction, DNA sequencing, likely mutation progression.
Longest common subsequence
31m - Breaking down Edit Distance
Understand how the matrix representation of the problem maps to a graph.
What the data in each cell represents - helpful to visualise
31m subset / recursion
43m-53m - Knapsack
Optimising the content of a container of constant volume based on volume & value of a set of items.
48m subset / recursion
contents ~ vid ~ a lot DENSE lect notes READ ~ Code: dyn_08_knapsack.py Reading:
time | notes |
---|---|
1m | Knapsack problem |
1m | Graph version |
25m | Graph version - running time |
26m-59m | DP version |
42m | DP version - fills in matrix |
45-47m | DP version - pick matrix apart |
48m | Pseudo polynomial time |
59m | Dijkstra - data representation impact on running time - WRT pseudo polynomial time |
n items, each item: s_i - weight, v_i - value
Knapsack can carry s lbs
Problem: Fill the knapsack w/ the most valuable loot! Answer: list of items!
Guess:
As a graph each node is an item.
Each edge represents adding an item to the sack (value of items added to the sack so far, room left in the sack).
Nope!
Weight is an integer: create a layer for each weight unit. EG 5lb is the limit, so 5 layers, 1 for each lb.
Edges represent the value change.
items
0 - gold statue - $10 - 4lb
1 - crystal ball - $4 - 2lb
2 - fountain pen - $7 - 3lb
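A sketch of the standard 0/1 knapsack DP over those three items (bottom-up over capacities; not necessarily the lecture's exact table layout):

```python
# 0/1 knapsack: dp[j] = best value at capacity j; scan capacities
# downwards so each item is used at most once.
def knapsack(items, capacity):                   # items: list of (value, weight)
    dp = [0] * (capacity + 1)
    take = [[False] * (capacity + 1) for _ in items]
    for i, (v, w) in enumerate(items):
        for j in range(capacity, w - 1, -1):
            if dp[j - w] + v > dp[j]:
                dp[j] = dp[j - w] + v
                take[i][j] = True
    chosen, j = [], capacity                     # walk the table back to list the items
    for i in range(len(items) - 1, -1, -1):
        if take[i][j]:
            chosen.append(i)
            j -= items[i][1]
    return dp[capacity], chosen

items = [(10, 4), (4, 2), (7, 3)]                # (value $, weight lb): statue, ball, pen
print(knapsack(items, 5))                        # (11, [2, 1]) - fountain pen + crystal ball
```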
42m - DP version - fills in matrix
Understand arrows / matrix - depends on changing bag capacity??
45-47m DP version - pick matrix apart - try notes - p1!
quick rehash of knapsack L22 2m-5m
59m - Dijkstra - data representation impact on running time
L22 - Two kinds of guessing; piano/guitar fingering, Tetris training, Super Mario Bros.vid
contents ~ vid ~ lect notes ~ Code: Reading:
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
1m | Intro, Piano, guitar, tetris & super mario brothers, 2ND KIND OF GUESSING! |
2m | 2ND KIND OF GUESSING! |
5m40-35m | Piano & Guitar fingering |
13m-16m | P&G f: sub-problems - first INCORRECT attempt |
16m-21m | P&G f: sub-problems - CORRECT approach |
21m-23m30 | P&G f: topological order |
23m30-27m | P&G f: DAG form |
27m | P&G f: multiple notes |
35m30 | Tetris (w/ contraints!) |
43m30 | Super Mario Brothers |
5m40 - Piano & Guitar fingering
note > single note
10 fingers
difficulty measure - d(p, f, q, g):
difficulty of playing note p with finger f & transitioning to note q using finger g
13m - P&G f: sub-problems
How to play notes from i onwards - notes[i:]
23m30 - P&G f: DAG form
5 rows - five fingers
contents ~
vid ~
NO lect notes
Code:
Reading: Strassen's algorithm for fast matrix multiplication is covered in CRLS, chapter 28, pp. 735-742.
time | notes |
---|---|
1m-6m | Intro to DDR rules, what to maximise |
6m-11m | Turning products into sums using logs, Concept: NUMERICAL INSTABILITY |
11m-end | Modelling using a graph |
Bouncing ideas about - what to minimise/ maximise:
Maximise score
Minimise effort
Best appearance to MAXIMISE entertainment (for TV)
Minimise probability of failure = maximise probability of success!
(Use logs - doing sums instead of products? 6m45?)
(see L44m43 min product - PS6 I can haz moor frendz - friendbook)
Probability of succeeding on each move multiplies up to the total probability of success!
How to turn products into sums? I assume he means the following:
6m - Turning products into sums using logs, Concept: NUMERICAL INSTABILITY 7m14
Multiplying lots of numbers that are close to 1 (IE probabilities that are high/likely) creates NUMERICAL INSTABILITY.
So use logs (of the probabilities) and add them up, instead of taking the actual probabilities and multiplying them.
Quick game summary: Trocadero dance arcade
Hit all the nodes minimise the effort
delta(from, to) → a number 0 to 1, 0=easy to 1=hard
Moves coded L / R foot position U D L R arrow
EG: LU RD > LL RR -
So delta(LU RD, LL RR) produces a difficulty
11m - Modelling using a graph
node = state
edge = move
What does the State mean? Move difficulty independent. Sum of difficulties up to
44m59 - state summary on board
contents ~
vid ~
lect notes
Reading:
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
0m-1m | Intro: polynomial time algos |
1m-5m | Complexity classes: P, Exp, R |
5m-11m | Examples: cycle detect, n*n chess, tetris, halting problem |
12m-18m | most problems uncomputable - set theory - walk the PROOF |
19m- | NP - non deterministic polynomial - lucky algorithm |
24m- | NP - tetris |
26m- | NP - alt view |
26m-34m | P ≠ NP or P vs NP - YOU CAN'T ENGINEER LUCK |
36m | hardness & completeness |
41m-end | reductions |
Complexity classes: P - { Set of all problems solvable in Polynomial Time }
Complexity classes: Exp - { Set of all problems solvable in Exponential Time }
Complexity classes: R - { Set of all problems solvable in Finite time }
contents ~
vid ~
NO lect notes
Reading: 34.1-34.3 p1053 34.1 Polynomial time, 34.2 Polynomial time verification, 34.3 NP-completeness and reducibility
time | notes |
---|---|
28m | Solution approaches - know what to avoid |
32m | k-SAT (2-SAT, 3-SAT) - satisfiability, CNF |
28m - summarises solution approaches from the course. Smiley faces - one approach per face. Talking about reductions,
READ THE NP COMPLETE CHAPTER IN CLRS - C34 p1048-1105 - BIG!! 57 pages!
vid ~
lect notes
Code:
Reading:
type:
use cases: good for bla
queries:
updates:
representation invariant (RI):
properties:
time | notes |
---|---|
Any equation identities / topics for this lecture include context and uses for later reference
contents ~
vid ~
lect notes
Code:
Reading:
time | notes |
---|---|
40m |
Any equation identities / topics for this lecture include context and uses for later reference
The following symbols o, Ω, ω, and Θ, are used to describe differing kinds of bounds on asymptotic growth rates.
O - big O - describes the asymptotic behaviour of functions WORST case or UPPER bound (common in CompSci)
Θ - big Theta - describes the asymptotic behaviour of functions AVERAGE case (common in CompSci)
Ω - big Omega - BEST case or LOWER bound (common in CompSci)
o - little O - loose upper bound (common in Maths rare in CompSci)
ω - little omega - strict lower bound, not asymptotically tight (common in Maths, rare in CompSci)
T(n) - function defining the exact Time or number of steps to complete an algorithm for n items
Binary tree types:
Full All nodes have 2 children or 0 children (one child not allowed)
Complete Filled top to bottom left to right (and removed in reverse, data position likely re-ordered to preserve RI - EG Heap)
Perfect All layers have all their nodes
Balanced Usually refers to height balance - leaf depths differ by no more than 1 (can also be weight balanced)
ADT - Abstract Data Type (interface definition, methods & properties)
DS - Data Structure (actual implementation of the ADT)
RI - representation invariant
Stable sorting algorithm - A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in sorted output as they appear in the input array to be sorted.
Note for an array-stored heap tree of size n: elements A[n/2+1 . . n] are ALL leaves!
> .pe # alias .pe='. venv/bin/activate'
> pip install matplotlib # plotting lib
> pip install numpy # math sci lib
> ./matplotlib/plot.py # super basic 2d plot example - in maths repo
./matplotlib/time_complexity_plot_q.py
> python --version # Python 2.7.16
> python3 -m venv venv
> .pe # alias .pe='. venv/bin/activate'
> python --version # Python 3.7.5
> pip install --upgrade pip
> pip install striprtf # for rtf processing
> .pe # alias .pe='. venv/bin/activate'
> ./create_TOC_for_md.py -p # takes MATHS_00_MIT_6.042.rtf course notes and add TOC > README.md
# also add README.md to git, commits, and pushes
# -p = commit & push
Install Latex tools notes here
Open LaTeXit, edit the equation, click text and hit the LaTeXit button to check it's good.
Export as png and upload it to git (needed so the URL can be used to embed the image)
Embed image with

Note the ! before opening [ denotes image
Find texify here
Use LaTeXit to check formula correctness, then paste it into the doc surrounded by consecutive $ symbols like so
Will display the following document distance equation
open problem . .
Courses / Books Found w/ Summary:
MIT 2011 course
Prerequisite: Maths for CS:
2015 https://www.youtube.com/watch?v=wIq4CssPoO0&list=PLUl4u3cNGP60UlabZBeeqOuoLuj_KNphQ
[L3.2.1 Asymptotic Notation](https://www.youtube.com/watch?v=CWkh5kb4TGc&list=PLUl4u3cNGP60UlabZBeeqOuoLuj_KNphQ&index=72)
[L3.2.3 Asymptotic Properties](https://www.youtube.com/watch?v=HeyEK0TWiBw&list=PLUl4u3cNGP60UlabZBeeqOuoLuj_KNphQ&index=73)
[L3.2.6 Asymptotic Blunder inc Big O](https://www.youtube.com/watch?v=Y9Blo_G-Mvg&list=PLUl4u3cNGP60UlabZBeeqOuoLuj_KNphQ&index=74)
2010 https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/video-lectures/
[L12 Sums](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/video-lectures/lecture-12-sums/)
[L13 Sums and Asymptotics](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/video-lectures/lecture-13-sums-and-asymptotics/)
Topics: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/readings/
Prerequisite: Probability: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041-probabilistic-systems-analysis-and-applied-probability-spring-2006/
Assignments w solutions: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041-probabilistic-systems-analysis-and-applied-probability-spring-2006/assignments/
https://www.youtube.com/playlist?list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
Paper version: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/
How Lectures Match problem sets:
https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/calendar/
Problem Sets (inc code):
https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/assignments/
Problem Local (inc code):/a_syllabus/_COURSES_03_TO_SORT/_algorithms/MIT_introduction_to_algorithms/_problems_exams
Final exams & mocks: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/exams/
Play list video: /a_syllabus/_COURSES_03_TO_SORT/_algorithms/MIT_introduction_to_algorithms/_Alogrithms_MIT_2011_vid.m3u
Play list audio: /a_syllabus/_COURSES_03_TO_SORT/_algorithms/MIT_introduction_to_algorithms/_mp3/_Alogrithms_MIT_2011.m3u
LaTex example setup and doc repo: https://github.com/UnacceptableBehaviour/latex_maths
Installing LaTex: http://tug.org/mactex/
Latex Tutorial: https://www.youtube.com/watch?v=xnD4kHHvKhQ
Maths in Latex: https://www.youtube.com/results?search_query=maths+formula+latex
Follow up courses