Probability and Computing (CSL 471)

Course Introduction

Syllabus

Marking Scheme

Frequently Asked Questions

Topics Covered and Lecture Notes

Tutorials

Question-Papers






COURSE INTRODUCTION

The course is lined towards understanding probability ideas in computer science. Apart from developing the subject the theoretical way, we address some of the most interesting real life questions and provide solutions to it. Following are some of the most interesting questions we address in the course:

  1. What is the ideal and optimal “dating” strategy? Of what interest is this to computer scientists?
  2. Can you convince a person that you know a secret without telling him/her the secret? This is doable and is one of the path breaking ideas in computer security!
  3. Is it practical and advisable to claim “you’re feeling lucky” and gamble? Why should one never visit Las Vegas? What has this to do with the mechanism with which internet search engines work?
  4. Can ideas from probability theory help one design algorithms that can make your android phone beat the world’s fastest supercomputer?
  5. Do we all have a common biological “ancestral mother” from whom everyone descended? Answer: YES and we need both - ideas from probability theory and computer science to answer this!

The course is mainly geared towards making students understand and appreciate the idea of probability and randomization in the field of computing. At the end of the course, the student will be:





SYLLABUS

Items 1 to 16 will be covered before midterm exams.

Items 17 to 31 will be covered after midterm exams.

Items 32, 33 and invited lectures are all independent sessions that will be covered if time permits. The instructor may cover these items any time during the semester.

Please note: Each item presented below doesn’t denote a 50 min class session. They are just the list of topics that we will be covering. While some items might take a meagre 20 mins (eg., item 3 takes less than 20 mins) some of them may take more than 2 to 3 classes (eg., item 31 takes 2 hours or more to cover).

Sl. No. Topic Reference Material
1 ` Puzzles and Activities Lecture notes
2 More Puzzles and Activities (Gich Game, Online Hiring, German Tank Problem, Cryptology: Vigenere Cipher, One time pad, Random Number Generation) Lecture notes (One time pad: Wikipedia)
3 Application: Polynomial Identities Probability and Computing by Mitzenmacher
4 Application: Matrix Multiplication Verification Probability and Computing by Mitzenmacher
5 Application: Zero knowledge proofs Lecture notes
6 Gambler’s Ruin Introduction to Probability by Snell
7 Introduction to Random Variables and Distributions Introduction to Probability by Snell
8 Contention Resolution Algorithm Design by Kleinberg
9 Finding the Global min cut Probability and Computing by Mitzenmacher
10 Randomized Approximation Algorithm for Max 3-sat Algorithm Design by Kleinberg
11 Median Finding and Quick Sort Probability and Computing by Mitzenmacher
12 Randomized Closest Pair Solution Algorithm Design by Kleinberg
13 Randomized Caching Algorithm Design by Kleinberg
14 Introduction to Chernoff Bounds Probability and Computing by Mitzenmacher
15 Load Balancing Algorithm Design by Kleinberg
16 Packet Routing Algorithm Design by Kleinberg
17 Balls, Bins and Random Graphs : Poisson Distributions and other topics Probability and Computing by Mitzenmacher
18 Balls, Bins and Random Graphs : Other topics Probability and Computing by Mitzenmacher
19 Balls, Bins and Random Graphs : Birthday Paradox Probability and Computing by Mitzenmacher
20 Balls, Bins and Random Graphs: Hashing bit strings Probability and Computing by Mitzenmacher
21 Balls, Bins and Random Graphs: Bloom filters Probability and Computing by Mitzenmacher
22 Balls, Bins and Random Graphs: Finding Hamilton Cycles in a Graph Probability and Computing by Mitzenmacher
23 Markov Chains : Introduction with applications Lecture notes
24 Markov Chains : More applications Lecture notes
25 Markov Chains and Random Walks : 2-SAT randomized solution. Probability and Computing by Mitzenmacher
26 Markov Chains and Random Walks : 3-SAT randomized solution Probability and Computing by Mitzenmacher
27 Markov Chains and Random Walks : Random Walks on undirected Graphs Probability and Computing by Mitzenmacher
28 Markov Chains and Random Walks : An s-t connectivity algorithm Probability and Computing by Mitzenmacher
29 Spectral Analysis, Random Walks and Web Search Lecture notes
30 Random Number Generation The art of computer programming Vol 2 by Donald Knuth
31 Probabilistic Primes: The Miller Rabin Test Introduction to Algorithms by Cormen et al.
32 Branching and Coalescent process Lecture notes
33 Species accumulation problem (Enumeration Queries) Lecture notes
34 Invited Talk 1 -
35 Invited Talk 2 -
36 Invited Talk 3 -





MARKS DISTRIBUTION

Test 1 and Test 2 are of duration 1 hour.

Midterm is for 2 hours.

Final is for 3 hours.

Minimum Passing Marks : 30 (Absolute).

Track 1 Track 2
20% Quiz 20% Quiz
10% Test 1 10% Test 1
15% Midterm 30% Midterm
10% Test 2 10% Test 2
45% Final 30% Final


Total marks = Maximum of marks secured in track 1 and track 2.

Let us illustrate this with an example:

Assume a student A secured the following marks

30 out of 100 in mid term

90 out of 100 in final

And student B secured:

90 out of 100 in mid term

30 out of 100 in final

Student A will secure:

0.15(30) + 0.45(90) = 45 in track 1

0.30(30) + 0.30(90) = 36 in track 2

Hence the final marks of Student A in midterm + final = max (45,36) = 45 marks

Student B will secure:

0.15(90) + 0.45(30) = 27 in track 1

0.30(90) + 0.30(30) = 36 in track 2

The final marks of Student B in midterm + final = max (27,36) =36 marks





Frequently Asked Questions

Q1. The marks distribution looks strangely different. Why?
It is observed that there are two categories of students, ones who perform consistently well and the ones who are slow bloomers. Respecting this, the instructor has decided to evaluate a student in two different ways and take the maximum of the two distributions. For instance, if you end up securing less marks in the minor exam, you will still have hopes to perform well in the major exam and compensate for it because we will consider track 1 distribution. If you are a consistent student and have done well in the minor but not so well in the major, you are well protected by the track 2 distribution. Please note that the marks distribution is final and there will be no changes to it.

Q2. Why are there no assignments in this course?
While the instructor understands that the homework assignments give the students a different kind of learning advantage in comparison to the time bound tests/exams, it was noted in the past that the plagiarism/copying related strategies which a few students resort to, give them unwarranted edge over the faithful and sincere ones. To protect the interests of sincere students, the instructor has decided to not have a homework component in the course.

Q3. I learn the best when I am given homeworks/assignments. How else will a student get to achieve depth if one doesn’t solve assignment or homework problems at one’s own pace?
The instructor completely understands this point and strongly believes that the students should spend quality time in solving problem sets with varying difficulty levels. In this connection, we will ensure that problem sets are given in the form of Tuit-sheets well in advance to students so that they spend time solving them independently/collaboratively. After about a week, these questions will be discussed and solved in the Tutorial sessions by the TA/instructor. It is important for the student to be self motivated enough to attempt the questions provided in the tuit-sheet. The questions in the tuit-sheet are indicative of the difficulty level of the questions that will be asked in the tests and exams. In fact, a few questions in the minor/major will be directly picked from these tuit-sheets.

Q4. How do I ensure that I perform well in this course?
You are more likely to get a very good grade if you do the following: Attend the lectures, follow up with the instructor and TA if you don’t understand a concept. Solve the problem sets given in the form of Tuit sheets before coming to the tutorial classes. Ensure that you understand the topics by reading the material from the references provided. Take the quizzes seriously, which will help you stay with the pace of the classes.

Q5. What is the attendance policy?
For the sake of records, attendance will be taken, but is not compulsory.

Q6. I am not sure what is the actual course load. What if it gets heavy and affects my overall semester’s progress.
This is very unlikely, given that the instructor has made every attempt to keep the course organized with deviations minimized. Beyond this, at any point of time, if you think you are struggling with the course, you should talk to the instructor/TA immediately, we can take a closer look at your problem and with a “high probability” solve it for you. The course is all about probabilistic solutions to problems which the instructor and TA seem to be a little versed with :-)

Q7. If I have a serious complaint with the way the course is conducted or if there is any other issue which I cannot directly convey, fearing that it can be mistaken for insubordination, how do I go about it?
Despite our best efforts, if there is something that you don’t like about the way the course is conducted (by the instructor or by the TA), be it teaching/evaluation/partiality, you may please use the anonymous link to post your grievance: http://goo.gl/forms/pPJKQvSLwJ9mFpNx2 . We will immediately act on it with the benefit of doubt, always given to the students.

Q8. I am into extra curricular activities and cannot give a whole lot of time for studies, do you think i will be risking it by taking up this elective?
Extra curricular activities are as important as academics. We encourage you to participate and suggest that you strike a good balance between the two. In order to encourage the students to spend good amount of time in their holistic development outside academics, the CSL 471 course is structured in a way that we don’t expect beyond 4 to 5 hours of your study time per week, which amounts to 56 hrs to 70 hrs in the semester.

Q9. What will be asked in the quizzes? Will it be surprise or announced?
Quizzes will test your understanding of the concepts, it will be of short duration and straightforward questions. Anyone who has sat through the lectures can easily answer them. Every quiz will be pre-announced by at least a day’s time in advance (over email). There will be 1 to 2 quizzes every week each of 5 to 10 mins duration. More details about the quiz will be shared on the first day of the class.

Q10. Is the grading in the tests/exams boolean or do we get marks for partial attempts?
The valuation (/marking) scheme and solution to the test/exam questions will be shared with everyone. The marking will not be boolean, you will get marks for partial attempts as described in the scheme of valuation.

Q11. I love programming! I don’t see any programming component in this elective. Why?
We feel the course is already heavy for 4 credits, including a programming component will tax the student a bit too much. We will be giving you interesting programming questions and exploratory assignments to try by yourself, these will not be graded.

Q12. How are the grade letters assigned in this course?
It has been our sincere observation from our experience so far that it is tough to decide the grading mechanism a priori and do justice to students. The instructor will decide on the grade letter breakups after observing the final performance of the students. The grading will be fair, in the sense that, it will distinguish students on the basis of their understanding and efforts. Every attempt will be made in ensuring that the evaluation is such that the “naturally talented”, “high IQ-ed” and “intelligent” students will not have an undue advantage over a “hardworking” and “sincere” student.





Class Coverage

S. No. Date and Time Topics Covered Link to the Lecture Notes
1 ` 28 July (15:20-16:10) Monty Hall Problem, Coin Toss Game Lecture 1
2 28 July (17:10-18:00) Online Hiring aka Secretory aka Dating Problem Lecture-2
3 29 July (16:15- 17:05) Vigenere Cipher and its cryptanalysis Lecture-3
4 3 August (14:25- 15:15) Germam Tank Problem Lecture-4
5 4 August (15:20- 16:10) German Tank Problem and One time pad Lecture-5
6 4 August (17:15- 18:00) One time pad continued Covered in notes of lecture-5
7 5 August (16:15- 17:05) Polynomial Evaluation Problem Lecture-7
8 10 August (14:25- 15:15) Matrix Multiplication Verification Lecture-8
9 11 August (15:20- 16:15) Tutorial 1 Discussion
10 11 August (17:15- 18:05) Tutorial 1 Discussion
11 12 August (16:15- 15:05) Matrix Multiplication Verification Lecture-9
12 18 August (15:20- 16:10) Zero Knowledge Proof Lecture-10
13 18 August (17:10- 18:00) Zero Knowledge Proof Lecture-11
13 19 August (16:15- 17:05) Contention Resolution Lecture-12
14 24 August (14:25- 15:15) Global min-cut Lecture-13&14
14 31 August (14:25- 15:15) Max- 3 SAT Problem Lecture-15
15 1 September (15:20- 16:15) Max-streak Problem Lecture-16 & 17
16 1 September (17:10- 18:00) Max-streak Problem Lecture - 16&17
16 2 September (16:15- 17:05) Randomised median finding and quick sort Lecture - 18
17 8 September (15:20- 14:15) Quick sort Lecture - 19
18 8 September (15:10- 16:00) Randomised closest pair solution Lecture - 20
19 9 September (16:15- 17:05) Randomised closest pair solution Same as Lecture-20
20 30 September (16:15- 17:05) Introduction to Caching Lecture - 21
21 5 October (14:25- 15:15) Introduction to Caching Lecture - 22
22 6 October (15:20- 16:10) Competetive Ratio for LRU caching Lecture 23
23 6 October (17:10- 16:00) Randomised LRU caching analysis Refer to handouts given in the class
24 7 October (16:15- 17:05) Chernoff Bound Lecture 25
25 19 October (14:25- 15:15) Load Balancing Lecture - 26
26 20 October (15:20- 16:10) Analysis of max load in balls and bin problem Lecture-27
27 20 October (17:10- 18:00) Birthday paradox, bucket sort, introduction to poisson distribution Lecture-28
28 21 October (16:15- 17:05) Chain Hashing Lecture-29
29 22 October (14:00- 16:00) Chain Hashing, Bloom Filters Lecture-30
30 27 October (15:20- 16:10) Analysis of Branching Process in Epidemics Lecture 31
31 28 October (17:10- 18:00) Analysis of coalescence process (Mitochondrial Eve) Lecture 32
32 2 November (14:25- 15:15) Cryptographic Hash Functions Lecture 33
33 3 November (15:20- 16:10) Hubs and Authorities with spectral analysis Lecture 34
34 3 November(17:10 - 18:00) Species Accumulation(Discovery) problem Lecture 35
35 4 November (16:15 - 17:05) PageRank Algorithm Lecture 36
36 10 November (15:20 - 16:10) Intoduction to Markov Chains Lecture 37
37 10 November (17:10 - 18:00) Markov Chains: 2 SAT Lecture 38
38 11 November (16:15 - 17:05) Image Surveillance using Gaussian Mixture Models Lecture 39
39 16 November (14:25 - 13:15) Digital halftoning Using Statistical Model Lecture 40
40 17 November (15:20 - 16:10) Hidden Markov Model Lecture 41
41 17 November (17:10 - 18:00) Tutorial Doubts, Test 2 distribution


Tutorials

  1. Tutorial 1 to be discussed on 11 August. Solutions
  2. Tutorial 2 to be discussed on 1 September. Solutions
  3. Tutorial 3
  4. Tutorial 4
  5. Tutorial 5
  6. Tutorial 6


QUESTION PAPERS

  1. Test1 organised on Sep 7.
  2. Mid-semester exam organised on 24Sep.
  3. Test2 organised on Nov 11.
  4. Major exam organised on 26 Nov.