University of L'Aquila
Department of Information Engineering Computer Science and Mathematics
Academic Year 2022/2023
|
(Click here) to get News
Web Algorithms (6 CREDITS).
Link to the Computer Science Degree official website (click here).
The course Distributed Systems and Web Algorithms (12 CFU) is divided into: Distributed Systems (6 CFU. By Prof. Proietti) and Web Algorithms (6 CFU).
Description:
The goal of this course is to provide students knowledge of advanced algorithmic techniques; ability to identify, formalize and solve optimization problems; concept of approximation;
knowledge of the web search and sponsored web search strategies in search engines.
Topics include:
- Review of computational complexity and intractability. Optimization problems. Approximation algorithms.
- Algorithmic techniques: greedy. local search, dynamic programming and linear programming.
- Polynomial Time Approximation Schemes (PTAS).
- Prestige and centrality indices in social networks.
- Web search: Pagerank, Topical Pagerank, TrustRank, Hubs and Authorities.
- Sponsored web search: matching markets and market clearing prices, auctions, mechanisms.
Timetable:
First semester (September 26, 2022 - January 13, 2023), Wednesday: 11.30 - 13.30 (room Digital class), and Friday: 11.30 - 13.30 (room Digital class).
Students' reception:
By appointment. Students are invited to arrange the day and time of the meeting by e-mail and therefore to send an e-mail preventively.
Course Material:
Lecture notes (slides) provided by the lecturer.
Additional didactic material:
- Vijay V. Vazirani, Approximation Algorithms , Springer. 2001. ISBN: 3-540-65367-8.
- G. Ausiello, P. Crescenzi, G. Gambosi, V. Kann, A. Marchetti-Spaccamela, M. Protasi, Complexity and Approximation , Springer. 1999. ISBN: 3-54065431-3.
- Jure Leskovec, Anand Rajaraman and Jeff Ullman, Mining of Massive Datasets, Stanford University. 2011. ISBN: 9781107015357 http://infolab.stanford.edu/~ullman/mmds/book.pdf.
- Soumen Chakrabarti, Mining the Web - Discovering Knowledge from Hypertext Data , Morgan Kaufmann. 2003. ISBN: 9781558607545.
- David Easley and Jon Kleinberg, Networks, Crowds, and Markets, Cambridge University Press. 2010. ISBN: 9780521195331 www.cs.cornell.edu/home/kleinber/networks-book/
Course Program:
September 28, 2022: Presentation of the course. Slides (Click here).
Computational Complexity: Problems in Computer Science, Problem Types (Decision, Search, Optimization), Complexity of Algorithms and Problems, Decision Problems and Complexity Classes, Clique problem, Vertex Cover problem, decision problems, non-deterministic algorithms for decision problems, non-deterministic algorithm for clique, efficiency and tractability, growth of polynomial functions with respect to exponential ones, robustness of the concept of polynomial time solvability, polynomially related codes, example, example on non-natural encoding. Slides (Click here).
October 5, 2022: Classes P and NP, NP-Complete problems. Optimization problems: definition, definition of the Clique optimization problem, definition of the Min Vertex Cover optimization problem, definition of the Min Traveling Salesman Problem (TSP), underlying decision problem, underlying decision problem of Max Clique, Complexity Classes of Optimization Problems: PO, NPO, NP-HARD problems, computational Complexity: NP-HARD problems. Approximation Algorithms: coping strategies for solving an NP-HARD problem, r-approximation algorithms for minimization and maximization problems, determination of the approximation factor, approximation algorithm for Min Vertex Cover, example of execution of the Approx-Cover algorithm, proof that the Approx-Cover algorithm is 2-approximating. Algorithmic techniques, Greedy: characteristics, Max 0-1 Knapsack, Greedy choice, Algorithm Greedy-Knapsack. Slides (Click here).
October 7, 2022: Proof that the Algorithm Greedy-Knapsack is not r-approximating for any given r<1. Modified Greedy-Knapsack algorithm, proof that Modified-Greedy-Knapsack algorithm is 1/2-approximating. Min Multiprocessor Scheduling, Greedy Algorithm of Graham, proof that the Greedy Algorithm of Graham is (2-1/h)-approximating, proof that the Greedy Algorithm of Graham is not r-approximating for r < 2-1/h, improving the approximation ratio, algorithm Ordered-Greedy, proof that Ordered-Greedy is (3/2 - 1/(2h))-approximating. Slides (Click here).
October 12, 2022: Max Cut, Algorithm Greedy for Max Cut, Example Greedy Choice, Proof that the Greedy Max Cut algorithm is 1/2-approximating, Min Weighted Set Cover, Greedy algorithm for Min Weighted Set Cover, Greedy Choice, Algorithm Greedy-Min-Weighted-Set-Cover. Slides (Click here).
Example of execution of the greedy algorithm for Min Weighted Set Cover. Slides (Click here).
October 14, 2022: Proof that the greedy algorithm for Min Weighted Set Cover is Hn-approximating, example showing that the approximation ratio of the greedy algorithm is at least Hn, conclusions on the greedy technique. Algorithmic techniques: local search; characteristics, scheme of a local search algorithm, complexity, approximation, neighborhood definition, extremal cases. Max Matching, Algorithm of Edmonds, Complexity, Max Cut, Local-Search Algorithm for Max Cut, Complexity, Example of execution of the Local-Search Algorithm for Max Cut. Slides (Click here).
October 19, 2022: Proof that the Local-Search Algorithm for Max Cut is 1/2-approximation, conclusions on local search. Algorithmic techniques: linear programming; characteristics, rounding, characteristics of rounding. Min Weighted Vertex Cover: ILP, LP, Algorithm Round-Vertex-Cover, Proof that the Round-Vertex-Cover Algorithm is a 2-approximation algorithm. Min Weighted Set Cover: ILP, LP, Algorithm Round-Set Cover, proof that Round-Set-Cover is f-approximating (f>=1). Slides (Click here).
October 21, 2022: Algorithmic techniques: linear programming; characteristics, primal-dual, General Primal-Dual, LP-duality Theorem, Weak duality theorem, Structure of optimal solutions, Complementary slackness conditions theorem, Observations, Relaxing the dual complementary conditions, Relaxed complementary conditions theorem, Primal-dual method goal, Primal and relaxed dual complementary conditions, Steps of an algorithm that follows the primal-dual method, Primal and dual formulation of Min Weighted Vertex Cover, Example. Slides (Click here).
October 26, 2022: Algorithmic techniques: linear programming; Algorithm Primal-Dual Vertex-Cover. Proof that Primal-Dual-Vertex-Cover is 2-approximating, examples of executions of the Primal-Dual Vertex-Cover algorithm. Slides (Click here).
October 28, 2022: Algorithmic techniques: dynamic programming (a simplified version); Characteristics, Dynamic programming algorithm scheme, MAX 0-1 Knapsack, Subproblems, Proof that Max 0-1 Knapsack satisfies the principle of optimality, Uneeded (dominated) solutions, Algorithm Dyn-Prog Knapsack, Proof that Algorithm Dyn-Prog Knapsack computes an optimal solution in pseudo-polynomial time, example of execution of the Dyn-Prog knapsack algorithm Slides (Click here).
November 2, 2022: Polynomial Time Approximation Schemes (PTAS): Definition, Example, Remarks. Min Multiprocessor Scheduling: recalling Graham's greedy algorithm, idea underlying PTAS, Algorithm PTAS-Scheduling, PTAS-Scheduling always returns a (1+epsilon)-approximate solution, Complexity of Algorithm PTAS-Scheduling, Min h-Processor Scheduling problem. PTAS-Scheduling is a PTAS for Min h-Processor Scheduling, Min Partition problem. Fully polynomial time approximation schemes (FPTAS): definition, remarks, examples. FPTAS-Knapsack: idea, complexity, approximation. Algorithm FPTAS-Knapsack: pseudocode, complexity. Slides (Click here).
November 4, 2022: Fully polynomial time approximation schemes (FPTAS): Algorithm FPTAS-Knapsack: approximation or error, remarks, FPTAS-Knapsack is a FPTAS for Max 0-1 Knapsack. Alternative approaches: Guaranteed Performance, Restriction of the set of the instances, Average or probabilistic analysis, heuristics, randomized algorithms, r-approximating randomized algorithms. Max Weighted Cut, Algorithm Random Cut, Random-Cut is a 1/2-approximation algorithm. Slides (Click here).
November 9, 2022: Exercises before mid-term examination (first part). Slides (Click here).
November 11, 2022: Exercises before mid-term examination (second part). Slides (Click here). (Notice that these slides contain also the exercises of the previous lecture)
November 16, 2022: Mid-term examination.
November 18, 2022: Mid-term examination solutions. Slides (Click here).
November 23, 2022: Web Search: social science and bibliometry, centrality, centrality measure based on the concept of radius, degree centrality, closeness centrality, betweenness centrality, examples, co-citation index, prestige. Prestige scores, eigenvectors and eigenvalues, power iteration method. Slides (Click here).
Example about co-citation index. Slides (Click here).
November 25, 2022: Web Search: why power iteration works, how to compute the prestige vector, web and link popularity, ranking algorithms based on the link analysis, PageRank, Example of PageRank algorithm, final calculation of PageRank, problems with PageRank: dead ends and spider traps. How to cope with dead ends and spider traps: teleporting. Damping factor and modified PageRank. Slides (Click here)
November 30, 2022: Web Search: topic-Specific PageRank. HITS Algorithm: root set and expanded set, authorities and hubs, counting in-links (authority), expert quality (hub), mutually recursive definition. Authority scores and hub scores, example of HITS, power iteration method to solve the HITS algorithm, pro and cons of the HITS algorithm, problems with the HITS algorithm. What is Web Spam, First Spammers, First Techniques to make Spam. Solution to term spam, why it works, Link Spamming, Link Farms. Slides (Click here).
Example about HITS: execution. Slides (Click here).
December 2, 2022: Web Search. TrustRank: Combating the Web Spam, idea, Trust Propagation, Simple Model of Trust Propagation, Trust Attenuation, Trust Splitting, Picking the Seed Set, Approaches to Picking Seed Set. Slides (Click here).
Web search exercises. Slides (Click here).
December 7, 2022: Sponsored Search. Search and Advertising: history, combining search and advertising, query or keyword-based advertising, pay-per-click, advertising as matching market, simplifying assumptions, remarks. Matching Markets: basic principles; 1st scenario: room assigning. Bipartite graphs, perfect matching, constricted sets, the Hall's matching theorem, proof of the Hall's matching theorem. Slides (Click here).
December 9, 2022: Sponsored Search. Computing a perfect matching, computing a maximum matching, remarks. Back to assigning rooms: refinement with valuations, optimal assignment. Market-clearing prices. 2nd scenario: House Sales: payoffs, preferred-seller graph. Pricing examples. Theorem (with proof) of existence market-clearing prices. Slides (Click here).
December 14, 2022: Sponsored Search. Optimality Market-Clearing Prices, social welfare, total revenue of sellers. Theorem (with proof) asserting that for any set of market-clearing prices any perfect matching M in the resulting preferred-seller graph G has maximum social welfare. Back to sponsored search: best matching, ad quality, quality factor, how quality factor is computed. Quality factors on keyword-based search, complex queries and interaction with keywords, problems from advertiser and search engine’s perspective, deciding which ads to show, agreements between advertisers and search engines, final remarks on markets. Auctions: Introduction. Types of auctions: ascending-bid, descending-bid, first-price sealed-bid, second-price sealed-bid. Relationship between auctions and properties. Slides (Click here).
December 16, 2022: Sponsored Search. Proof that in second price auctions it is a dominant strategy to bid truthfully, performance of second price auctions, conclusions. Slides (Click here).
Sponsored search exercises (first part). Slides (Click here).
December 21, 2022: Sponsored search exercises (last part). Slides (Click here).
News:
September 28, 2022: The lecture of Friday September 30, 2022, is cancelled due to the UNIVAQ street science.
October 25, 2022: Due to the teacher's indisposition, the "Web Algorithms" lesson, prof. G. Monaco, of Wednesday October 26, 2022 will take place only in remote mode on the course TEAM. TEAM Name: Web Algorithms - A.A. 2022/2023. TEAM Code: q76j4py. The lesson will start at 12:00 in order to facilitate the students attendance.
October 27, 2022: Due to the teacher's indisposition, the "Web Algorithms" lesson, prof. G. Monaco, of Friday October 28, 2022 will take place only in remote mode on the course TEAM. TEAM Name: Web Algorithms - A.A. 2022/2023. TEAM Code: q76j4py. The lesson will start at 12:00 in order to facilitate the students attendance.
October 28, 2022: Click here to download some of the past exam and exercise assignments.
October 31, 2022: Due to the teacher's indisposition, the "Web Algorithms" lesson, prof. G. Monaco, of Wednesday November 2, 2022 will take place only in remote mode on the course TEAM. TEAM Name: Web Algorithms - A.A. 2022/2023. TEAM Code: q76j4py. The lesson will start at 12:00 in order to facilitate the students attendance.
November 2, 2022: The Mid-term written exam date is Wednesday November 16, time: 11.30, room: Digital class.
Students that want to take part in the Mid-term examination have to notify it by email to gianpiero.monaco@univaq.it by November 13.
November 17, 2022: Available the results of the Mid-term exam of November 16, 2022.
November 17, 2022: Click here to download the assignment of the Mid-term exam of November 16, 2022.
December 2, 2022: On the exam there will be no question about: - Proof that power iteration converges; - Trustrank.
December 20, 2022: Final exams A.Y. 2022/2023: (Period: January 16, 2023 - February 24, 2023)
1) Monday January 16, 2023. Time: 10.00, room: Digital class.
2) Monday January 30, 2023. Time: 10.00, room: Digital class.
3) Monday February 13, 2023. Time: 10.00, room: Digital class.
To be able to sit an exam session, students must register within the deadlines provided through the online system (segreteria virtuale). Students having the course "Distributed Systems and Web Algorithms" (12 CFU), to be able to sit an exam session have to also send an email to gianpiero.monaco@univaq.it by four days before the exam date.
December 21, 2022: Classes are over.
January 17, 2023: Available the results of the exam of January 16, 2023. Students are invited to read carefully and follow the instructions written on the document.
January 22, 2023: Click here to download the assignment of the exam of January 16, 2023.
January 31, 2023: Available the results of the exam of January 30, 2023. Students are invited to read carefully and follow the instructions written on the document.
February 14, 2023: Available the results of the exam of February 13, 2023. Students are invited to read carefully and follow the instructions written on the document.
February 21, 2023: Click here to download the assignment of the exam of January 30, 2023.
February 21, 2023: Click here to download the assignment of the exam of February 13, 2023.
May 25, 2023: Final exams A.Y. 2022/2023: (Period: June 12, 2023 - July 28, 2023)
1) Wednesday June 21, 2023. Time: 10.00.
2) Wednesday July 12, 2023. Time: 09.30.
To be able to sit an exam session, students must register within the deadlines provided through the online system (segreteria virtuale). Students having the course "Distributed Systems and Web Algorithms" (12 CFU), to be able to sit an exam session have to also send an email to gianpiero.monaco@univaq.it by four days before the exam date.