<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title>post</title>
<link href="/2025/01/05/cs61a/listandnonlocal/"/>
<url>/2025/01/05/cs61a/listandnonlocal/</url>
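<content type="html"><![CDATA[<h1 id="List"><a href="#List" class="headerlink" title="List"></a>List</h1><pre><code class="lang-python">a = [1, 2, 3]
b = a       # b is bound to the same list object as a
b[0] = 4    # mutates that shared list
a           # evaluates to [4, 2, 3]</code></pre><p>This happens because <code>a</code> refers to a list and <code>b = a</code> makes <code>b</code> refer to the same list instead of a copy. <code>a</code> and <code>b</code> are separate names, but both can reach the same list, so a change made through one is visible through the other.</p><h1 id="Nonlocal"><a href="#Nonlocal" class="headerlink" title="Nonlocal"></a>Nonlocal</h1><p>A <code>nonlocal</code> statement lets a nested function rebind a variable that lives in a parent frame, as long as that parent frame is not the global frame (for global names, use <code>global</code> instead).</p>]]></content>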
<categories>
<category> CS61A </category>
</categories>
<tags>
<tag> List </tag>
<tag> Nonlocal </tag>
</tags>
</entry>
<entry>
<title>Array</title>
<link href="/2024/12/30/cs61a/array/"/>
<url>/2024/12/30/cs61a/array/</url>
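<content type="html"><![CDATA[<h1 id="Array"><a href="#Array" class="headerlink" title="Array"></a>Array</h1><pre><code class="lang-python">fastest = [[] for _ in player_indices]</code></pre><p>A neat way to create a 2D list in Python: the comprehension builds one new, independent empty list per element of <code>player_indices</code>. By contrast, <code>[[]] * n</code> would repeat a reference to a single inner list.</p>]]></content>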
<categories>
<category> CS61A </category>
</categories>
</entry>
<entry>
<title>Recursion</title>
<link href="/2024/12/25/cs61a/recursion/"/>
<url>/2024/12/25/cs61a/recursion/</url>
<content type="html"><![CDATA[<h1 id="Recursion"><a href="#Recursion" class="headerlink" title="Recursion"></a>Recursion</h1><pre><code class="lang-python">def has_path(t, word):
    """Return whether there is a path in a tree where the entries along the path
    spell out a particular word.

    >>> greetings = tree('h', [tree('i'),
    ...                        tree('e', [tree('l', [tree('l', [tree('o')])]),
    ...                                   tree('y')])])
    >>> print_tree(greetings)
    h
      i
      e
        l
          l
            o
        y
    >>> has_path(greetings, 'h')
    True
    >>> has_path(greetings, 'i')
    False
    >>> has_path(greetings, 'hi')
    True
    >>> has_path(greetings, 'hello')
    True
    >>> has_path(greetings, 'hey')
    True
    >>> has_path(greetings, 'bye')
    False
    """
    assert len(word) > 0, 'no path for empty word.'
    "*** YOUR CODE HERE ***"
    # The label of this node must match the first character of the word.
    if label(t) != word[0]:
        return False
    # Only one character was left and it matched: the path is complete.
    if len(word) == 1:
        return True
    # Otherwise some branch must spell out the remaining characters.
    return any([has_path(b, word[1:]) for b in branches(t)])</code></pre><p>At first I did not understand why the label of the tree is compared with the first character of the word. I thought it should be compared with the last character.</p><p>The reason is that a path starts at the root, so the root's label has to match <code>word[0]</code>. As ChatGPT pointed out, the remaining characters are then checked implicitly: each recursive call passes <code>word[1:]</code> to the branches, so every character is compared with a label one level deeper in the tree.</p>]]></content>
<categories>
<category> CS61A </category>
</categories>
<tags>
<tag> CS61A </tag>
<tag> Recursion </tag>
</tags>
</entry>
<entry>
<title>Higher-order Function</title>
<link href="/2024/12/23/cs61a/higher-order-function/"/>
<url>/2024/12/23/cs61a/higher-order-function/</url>
<content type="html"><![CDATA[<h1 id="Higher-order-Function"><a href="#Higher-order-Function" class="headerlink" title="Higher-order Function"></a>Higher-order Function</h1><p>A higher-order function is a function that takes a function as an argument or returns a function as a result.</p><pre><code class="lang-python">def square(x):
    return x * x

def make_adder(n):
    def adder(k):
        return k + n
    return adder

def compose1(f, g):
    def h(x):
        return f(g(x))
    return h

# square(make_adder(2)(3)) = square(5) = 25
compose1(square, make_adder(2))(3)</code></pre><p><img src="code-of-higher-order-function.jpg" alt="Higher-order Function"></p><p>Pay attention to the order of evaluation: operand expressions are evaluated first, so <code>make_adder(2)</code> runs before <code>compose1</code> is called, and inside <code>h</code> the inner function <code>g</code> runs before the outer function <code>f</code>.</p><p>Local frames and the global frame are different: a local frame is created each time a function is called, while the global frame is created once, when the program starts.</p><p>There are different ways to call a function, such as <code>f(x)</code> and <code>f(x)(y)</code>. The first calls the function <code>f</code> with the argument <code>x</code>; the second calls the function returned by <code>f(x)</code> with the argument <code>y</code>.</p><p>I think the value of higher-order functions is that they split one function into two or more functions, which makes them more flexible.<br>For example:<br>With a higher-order function, a two-argument call such as <code>pow(2, 3)</code> can be rewritten in the curried form <code>pow(2)(3)</code>.<br>We can then add more conditions inside the inner functions.</p><h1 id="Nested-Function"><a href="#Nested-Function" class="headerlink" title="Nested Function"></a>Nested Function</h1><p>A nested function is a function that is defined inside another function. It is typically used for a helper that is needed only within the scope of the outer function.</p><p>Pay attention to the parent of each frame: the parent of a function's frame is the frame in which that function was defined, not the frame it was called from. For a nested function, the parent is therefore the frame created by the call to the outer function, and the parent of that frame is in turn the frame in which the outer function was defined.</p><h1 id="Decorator"><a href="#Decorator" class="headerlink" title="Decorator"></a>Decorator</h1><p>A decorator is a function that takes a function as an argument and returns a function as a result. It is used to add functionality to an existing function without modifying the function itself, which makes it another form of higher-order function.</p><pre><code class="lang-python">def trace1(fn):
    def traced(x):
        print('Calling', fn, 'on argument', x)
        return fn(x)
    return traced

# Form 1: decorator syntax
@trace1
def square(x):
    return x * x

# Form 2: explicit reassignment
def square(x):
    return x * x
square = trace1(square)</code></pre><p>These two forms are equivalent: the first uses the <code>@</code> symbol to decorate the function, and the second uses an assignment statement, which is a plain use of a higher-order function.</p><h1 id="Lambda-Expression"><a href="#Lambda-Expression" class="headerlink" title="Lambda Expression"></a>Lambda Expression</h1><p>A lambda expression defines a function with the <code>lambda</code> keyword, which is a short way to write a function whose body is a single expression. It is typically used for a function that is needed only once and does not need a name.</p>]]></content>
<categories>
<category> CS61A </category>
</categories>
<tags>
<tag> CS61A </tag>
<tag> Higher-order Function </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture 11: Q&amp;A</title>
<link href="/2024/12/19/missing-semester/missing-semester-lecture11/"/>
<url>/2024/12/19/missing-semester/missing-semester-lecture11/</url>
<content type="html"><![CDATA[<h2 id="Any-recommendations-on-learning-Operating-Systems-related-topics-like-processes-virtual-memory-interrupts-memory-management-etc"><a href="#Any-recommendations-on-learning-Operating-Systems-related-topics-like-processes-virtual-memory-interrupts-memory-management-etc" class="headerlink" title="Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc"></a>Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc</h2><p>First, it is unclear whether you actually need to be very familiar with all of these topics since they are very low level topics. They will matter as you start writing more low level code like implementing or modifying a kernel. Otherwise, most topics will not be relevant, with the exception of processes and signals that were briefly covered in other lectures.</p><p>Some good resources to learn about this topic:</p><ul><li><a href="https://pdos.csail.mit.edu/6.828/">MIT’s 6.828 class</a> - Graduate level class on Operating System Engineering. Class materials are publicly available.</li><li>Modern Operating Systems (4th ed) - by Andrew S. 
Tanenbaum is a good overview of many of the mentioned concepts.</li><li>The Design and Implementation of the FreeBSD Operating System - A good resource about the FreeBSD OS (note that this is not Linux).</li><li>Other guides like <a href="https://os.phil-opp.com/">Writing an OS in Rust</a> where people implement a kernel step by step in various languages, mostly for teaching purposes.</li></ul><h2 id="What-are-some-of-the-tools-you’d-prioritize-learning-first"><a href="#What-are-some-of-the-tools-you’d-prioritize-learning-first" class="headerlink" title="What are some of the tools you’d prioritize learning first?"></a>What are some of the tools you’d prioritize learning first?</h2><p>Some topics worth prioritizing:</p><ul><li>Learning how to use your keyboard more and your mouse less. This can be through keyboard shortcuts, changing interfaces, etc.</li><li>Learning your editor well. As a programmer most of your time is spent editing files so it really pays off to learn this skill well.</li><li>Learning how to automate and/or simplify repetitive tasks in your workflow because the time savings will be enormous.</li><li>Learning about version control tools like Git and how to use them in conjunction with GitHub to collaborate in modern software projects.</li></ul><h2 id="When-do-I-use-Python-versus-a-Bash-scripts-versus-some-other-language"><a href="#When-do-I-use-Python-versus-a-Bash-scripts-versus-some-other-language" class="headerlink" title="When do I use Python versus a Bash script versus some other language?"></a>When do I use Python versus a Bash script versus some other language?</h2><p>In general, bash scripts are useful for short and simple one-off scripts when you just want to run a specific series of commands. bash has a set of oddities that make it hard to work with for larger programs or scripts:</p><ul><li>bash is easy to get right for a simple use case but it can be really hard to get right for all possible inputs. 
For example, spaces in script arguments have led to countless bugs in bash scripts.</li><li>bash is not amenable to code reuse so it can be hard to reuse components of previous programs you have written. More generally, there is no concept of software libraries in bash.</li><li>bash relies on many magic strings like <code>$?</code> or <code>$@</code> to refer to specific values, whereas other languages refer to them explicitly, like <code>exitCode</code> or <code>sys.argv</code> respectively.</li></ul><p>Therefore, for larger and/or more complex scripts we recommend using more mature scripting languages like Python or Ruby. You can find countless libraries online that people have already written to solve common problems in these languages. If you find a library that implements the specific functionality you care about in some language, usually the best thing to do is to just use that language.</p><h2 id="What-is-the-difference-between-source-script-sh-and-script-sh"><a href="#What-is-the-difference-between-source-script-sh-and-script-sh" class="headerlink" title="What is the difference between source script.sh and ./script.sh?"></a>What is the difference between <code>source script.sh</code> and <code>./script.sh</code>?</h2><p>In both cases <code>script.sh</code> will be read and executed in a bash session; the difference lies in which session is running the commands. For <code>source</code> the commands are executed in your current bash session and thus any changes made to the current environment, like changing directories or defining functions, will persist in the current session once the <code>source</code> command finishes executing. When running the script standalone like <code>./script.sh</code>, your current bash session starts a new instance of bash that will run the commands in <code>script.sh</code>. 
Thus, if <code>script.sh</code> changes directories, the new bash instance will change directories but once it exits and returns control to the parent bash session, the parent session will remain in the same place. Similarly, if <code>script.sh</code> defines a function that you want to access in your terminal, you need to <code>source</code> it for it to be defined in your current bash session. Otherwise, if you run it, the new bash process will be the one to process the function definition instead of your current shell.</p><h2 id="What-are-the-places-where-various-packages-and-tools-are-stored-and-how-does-referencing-them-work-What-even-is-bin-or-lib"><a href="#What-are-the-places-where-various-packages-and-tools-are-stored-and-how-does-referencing-them-work-What-even-is-bin-or-lib" class="headerlink" title="What are the places where various packages and tools are stored and how does referencing them work? What even is /bin or /lib?"></a>What are the places where various packages and tools are stored and how does referencing them work? What even is <code>/bin</code> or <code>/lib</code>?</h2><p>Regarding programs that you execute in your terminal, they are all found in the directories listed in your <code>PATH</code> environment variable and you can use the <code>which</code> command (or the <code>type</code> command) to check where your shell is finding a specific program. In general, there are some conventions about where specific types of files live. 
Here are some of the ones we talked about; check the <a href="https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard">Filesystem Hierarchy Standard</a> for a more comprehensive list.</p><ul><li><p><code>/bin</code> - Essential command binaries</p></li><li><p><code>/sbin</code> - Essential system binaries, usually to be run by root</p></li><li><p><code>/dev</code> - Device files, special files that often are interfaces to hardware devices</p></li><li><p><code>/etc</code> - Host-specific system-wide configuration files</p></li><li><p><code>/home</code> - Home directories for users in the system</p></li><li><p><code>/lib</code> - Common libraries for system programs</p></li><li><p><code>/opt</code> - Optional application software</p></li><li><p><code>/sys</code> - Contains information and configuration for the system (covered in the <a href="https://missing.csail.mit.edu/2020/course-shell/">first lecture</a>)</p></li><li><p><code>/tmp</code> - Temporary files (also <code>/var/tmp</code>). Usually deleted between reboots.</p></li><li><p><code>/usr/</code> - Read only user data</p><ul><li><code>/usr/bin</code> - Non-essential command binaries</li><li><code>/usr/sbin</code> - Non-essential system binaries, usually to be run by root</li><li><code>/usr/local/bin</code> - Binaries for user compiled programs</li></ul></li><li><p><code>/var</code> - Variable files like logs or caches</p></li></ul><h2 id="Should-I-apt-get-install-a-python-whatever-or-pip-installwhatever-package"><a href="#Should-I-apt-get-install-a-python-whatever-or-pip-installwhatever-package" class="headerlink" title="Should I apt-get install a python-whatever, or pip install whatever package?"></a>Should I <code>apt-get install</code> a python-whatever, or <code>pip install</code> whatever package?</h2><p>There’s no universal answer to this question. 
It’s related to the more general question of whether you should use your system’s package manager or a language-specific package manager to install software. A few things to take into account:</p><ul><li>Common packages will be available through both, but less popular ones or more recent ones might not be available in your system package manager. In this case, using the language-specific tool is the better choice.</li><li>Similarly, language-specific package managers usually have more up to date versions of packages than system package managers.</li><li>When using your system package manager, libraries will be installed system wide. This means that if you need different versions of a library for development purposes, the system package manager might not suffice. For this scenario, most programming languages provide some sort of isolated or virtual environment so you can install different versions of libraries without running into conflicts. For Python, there’s virtualenv, and for Ruby, there’s RVM.</li><li>Depending on the operating system and the hardware architecture, some of these packages might come with binaries or might need to be compiled. For instance, in ARM computers like the Raspberry Pi, using the system package manager can be better than the language specific one if the former comes in form of binaries and the latter needs to be compiled. This is highly dependent on your specific setup.</li></ul><p>You should try to use one solution or the other and not both since that can lead to conflicts that are hard to debug. 
Our recommendation is to use the language-specific package manager whenever possible, and to use isolated environments (like Python’s virtualenv) to avoid polluting the global environment.</p><h2 id="What’s-the-easiest-and-best-profiling-tools-to-use-to-improve-performance-of-my-code"><a href="#What’s-the-easiest-and-best-profiling-tools-to-use-to-improve-performance-of-my-code" class="headerlink" title="What’s the easiest and best profiling tools to use to improve performance of my code?"></a>What’s the easiest and best profiling tools to use to improve performance of my code?</h2><p>The easiest tool that is quite useful for profiling purposes is <a href="https://missing.csail.mit.edu/2020/debugging-profiling/#timing">print timing</a>. You just manually compute the time taken between different parts of your code. By repeatedly doing this, you can effectively do a binary search over your code and find the segment of code that took the longest.</p><p>For more advanced tools, Valgrind’s <a href="http://valgrind.org/docs/manual/cl-manual.html">Callgrind</a> lets you run your program and measure how long everything takes and all the call stacks, namely which function called which other function. It then produces an annotated version of your program’s source code with the time taken per line. However, it slows down your program by an order of magnitude and does not support threads. For other cases, the <a href="http://www.brendangregg.com/perf.html"><code>perf</code></a> tool and other language specific sampling profilers can output useful data pretty quickly. <a href="http://www.brendangregg.com/flamegraphs.html">Flamegraphs</a> are a good visualization tool for the output of said sampling profilers. You should also try to use specific tools for the programming language or task you are working with. 
For example, for web development, the dev tools built into Chrome and Firefox have fantastic profilers.</p><p>Sometimes the slow part of your code will be because your system is waiting for an event like a disk read or a network packet. In those cases, it is worth checking that back-of-the-envelope calculations about the theoretical speed in terms of hardware capabilities do not deviate from the actual readings. There are also specialized tools to analyze the wait times in system calls. These include tools like <a href="http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html">eBPF</a> that perform kernel tracing of user programs. In particular <a href="https://github.com/iovisor/bpftrace"><code>bpftrace</code></a> is worth checking out if you need to perform this sort of low level profiling.</p><h2 id="What-browser-plugins-do-you-use"><a href="#What-browser-plugins-do-you-use" class="headerlink" title="What browser plugins do you use?"></a>What browser plugins do you use?</h2><p>Some of our favorites, mostly related to security and usability:</p><ul><li><a href="https://github.com/gorhill/uBlock">uBlock Origin</a> - It is a <a href="https://github.com/gorhill/uBlock/wiki/Blocking-mode">wide-spectrum</a> blocker that doesn’t just stop ads, but all sorts of third-party communication a page may try to do. This also covers inline scripts and other types of resource loading. If you’re willing to spend some time on configuration to make things work, go to <a href="https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode">medium mode</a> or even <a href="https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode">hard mode</a>. Those will make some sites not work until you’ve fiddled with the settings enough, but will also significantly improve your online security. Otherwise, the <a href="https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode">easy mode</a> is already a good default that blocks most ads and tracking. 
You can also define your own rules about what website objects to block.</li><li><a href="https://github.com/openstyles/stylus/">Stylus</a> - a fork of Stylish (don’t use Stylish, it was shown to <a href="https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/">steal users’ browsing history</a>), allows you to sideload custom CSS stylesheets to websites. With Stylus you can easily customize and modify the appearance of websites. This can be removing a sidebar, changing the background color or even the text size or font choice. This is fantastic for making websites that you visit frequently more readable. Moreover, Stylus can find styles written by other users and published in <a href="https://userstyles.org/">userstyles.org</a>. Most common websites have one or several dark theme stylesheets for instance.</li><li>Full Page Screen Capture - <a href="https://screenshots.firefox.com/">Built into Firefox</a> and <a href="https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en">Chrome extension</a>. Lets you take a screenshot of a full website, often much better than printing for reference purposes.</li><li><a href="https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/">Multi Account Containers</a> - lets you separate cookies into “containers”, allowing you to browse the web with different identities and/or ensuring that websites are unable to share information between them.</li><li>Password Manager Integration - Most password managers have browser extensions that make inputting your credentials into websites not only more convenient but also more secure. 
Compared to simply copy-pasting your user and password, these tools will first check that the website domain matches the one listed for the entry, preventing phishing attacks that impersonate popular websites to steal credentials.</li><li><a href="https://github.com/philc/vimium">Vimium</a> - A browser extension that provides keyboard-based navigation and control of the web in the spirit of the Vim editor.</li></ul><h2 id="What-are-other-useful-data-wrangling-tools"><a href="#What-are-other-useful-data-wrangling-tools" class="headerlink" title="What are other useful data wrangling tools?"></a>What are other useful data wrangling tools?</h2><p>Some of the data wrangling tools we did not have time to cover during the data wrangling lecture include <code>jq</code> or <code>pup</code> which are specialized parsers for JSON and HTML data respectively. The Perl programming language is another good tool for more advanced data wrangling pipelines. Another trick is the <code>column -t</code> command that can be used to convert whitespace text (not necessarily aligned) into properly column aligned text.</p><p>More generally a couple of more unconventional data wrangling tools are vim and Python. For some complex and multi-line transformations, vim macros can be a quite invaluable tool to use. You can just record a series of actions and repeat them as many times as you want, for instance in the editors <a href="https://missing.csail.mit.edu/2020/editors/#macros">lecture notes</a> (and last year’s <a href="https://missing.csail.mit.edu/2019/editors/">video</a>) there is an example of converting an XML-formatted file into JSON just using vim macros.</p><p>For tabular data, often presented in CSVs, the <a href="https://pandas.pydata.org/">pandas</a> Python library is a great tool. Not only because it makes it quite easy to define complex operations like group by, join or filters; but also makes it quite easy to plot different properties of your data. 
It also supports exporting to many table formats including XLS, HTML or LaTeX. Alternatively, the R programming language (an arguably <a href="http://arrgh.tim-smith.us/">bad</a> programming language) has lots of functionality for computing statistics over data and can be quite useful as the last step of your pipeline. <a href="https://ggplot2.tidyverse.org/">ggplot2</a> is a great plotting library in R.</p><h2 id="What-is-the-difference-between-Docker-and-a-Virtual-Machine"><a href="#What-is-the-difference-between-Docker-and-a-Virtual-Machine" class="headerlink" title="What is the difference between Docker and a Virtual Machine?"></a>What is the difference between Docker and a Virtual Machine?</h2><p>Docker is based on a more general concept called containers. The main difference between containers and virtual machines is that virtual machines will execute an entire OS stack, including the kernel, even if the kernel is the same as the host machine. Unlike VMs, containers avoid running another instance of the kernel and instead share the kernel with the host. In Linux, this is achieved through kernel isolation mechanisms (namespaces and cgroups, which tools such as LXC build on) that spin up a program that thinks it’s running on its own hardware but is actually sharing the hardware and kernel with the host. Thus, containers have a lower overhead than a full VM. On the flip side, containers have weaker isolation and only work if the host runs the same kernel. For instance, if you run Docker on macOS, Docker needs to spin up a Linux virtual machine to get an initial Linux kernel and thus the overhead is still significant. Lastly, Docker is a specific implementation of containers and it is tailored for software deployment. 
Because of this, it has some quirks: for example, Docker containers will not persist any form of storage between reboots by default.</p><h2 id="What-are-the-advantages-and-disadvantages-of-each-OS-and-how-can-we-choose-between-them-e-g-choosing-the-best-Linux-distribution-for-our-purposes"><a href="#What-are-the-advantages-and-disadvantages-of-each-OS-and-how-can-we-choose-between-them-e-g-choosing-the-best-Linux-distribution-for-our-purposes" class="headerlink" title="What are the advantages and disadvantages of each OS and how can we choose between them (e.g. choosing the best Linux distribution for our purposes)?"></a>What are the advantages and disadvantages of each OS and how can we choose between them (e.g. choosing the best Linux distribution for our purposes)?</h2><p>Regarding Linux distros, even though there are many, many distros, most of them will behave fairly identically for most use cases. Most of Linux and UNIX features and inner workings can be learned in any distro. A fundamental difference between distros is how they deal with package updates. Some distros, like Arch Linux, use a rolling update policy where things are bleeding-edge but things might break every so often. On the other hand, some distros like Debian, CentOS or Ubuntu LTS releases are much more conservative with releasing updates in their repositories so things are usually more stable at the expense of sacrificing newer features. Our recommendation for an easy and stable experience with both desktops and servers is to use Debian or Ubuntu.</p><p>Mac OS is a good middle point between Windows and Linux that has a nicely polished interface. However, Mac OS is based on BSD rather than Linux, so some parts of the system and commands are different. An alternative worth checking is FreeBSD. Even though some programs will not run on FreeBSD, the BSD ecosystem is much less fragmented and better documented than Linux. 
We discourage Windows for anything but developing Windows applications, unless there is some deal-breaker feature that you need, like good driver support for gaming.</p><p>For dual boot systems, we think that the most reliable implementation is macOS’s Boot Camp and that any other combination can be problematic in the long run, especially if you combine it with other features like disk encryption.</p><h2 id="Vim-vs-Emacs"><a href="#Vim-vs-Emacs" class="headerlink" title="Vim vs Emacs?"></a>Vim vs Emacs?</h2><p>The three of us use vim as our primary editor but Emacs is also a good alternative and it’s worth trying both to see which works better for you. Emacs does not follow vim’s modal editing, but this can be enabled through Emacs plugins like <a href="https://github.com/emacs-evil/evil">Evil</a> or <a href="https://github.com/hlissner/doom-emacs">Doom Emacs</a>. An advantage of using Emacs is that extensions can be implemented in Lisp, a better scripting language than vimscript, Vim’s default scripting language.</p><h2 id="Any-tips-or-tricks-for-Machine-Learning-applications"><a href="#Any-tips-or-tricks-for-Machine-Learning-applications" class="headerlink" title="Any tips or tricks for Machine Learning applications?"></a>Any tips or tricks for Machine Learning applications?</h2><p>Some of the lessons and takeaways from this class can directly be applied to ML applications. As is the case with many science disciplines, in ML you often perform a series of experiments and want to check what things worked and what didn’t. You can use shell tools to easily and quickly search through these experiments and aggregate the results in a sensible way. This could mean subselecting all experiments in a given time frame or that use a specific dataset. By using a simple JSON file to log all relevant parameters of the experiments, this can be incredibly simple with the tools we covered in this class. 
Lastly, if you do not work with some sort of cluster where you submit your GPU jobs, you should look into how to automate this process since it can be a quite time consuming task that also eats away your mental energy.</p><h2 id="Any-more-Vim-tips"><a href="#Any-more-Vim-tips" class="headerlink" title="Any more Vim tips?"></a>Any more Vim tips?</h2><p>A few more tips:</p><ul><li>Plugins - Take your time and explore the plugin landscape. There are a lot of great plugins that address some of vim’s shortcomings or add new functionality that composes well with existing vim workflows. For this, good resources are <a href="https://vimawesome.com/">VimAwesome</a> and other programmers’ dotfiles.</li><li>Marks - In vim, you can set a mark doing <code>m<X></code> for some letter <code>X</code>. You can then go back to that mark doing <code>'<X></code>. This lets you quickly navigate to specific locations within a file or even across files.</li><li>Navigation - <code>Ctrl+O</code> and <code>Ctrl+I</code> move you backward and forward respectively through your recently visited locations.</li><li>Undo Tree - Vim has a quite fancy mechanism for keeping track of changes. Unlike other editors, vim stores a tree of changes so even if you undo and then make a different change you can still go back to the original state by navigating the undo tree. Some plugins like <a href="https://github.com/sjl/gundo.vim">gundo.vim</a> and <a href="https://github.com/mbbill/undotree">undotree</a> expose this tree in a graphical way.</li><li>Undo with time - The <code>:earlier</code> and <code>:later</code> commands will let you navigate the files using time references instead of one change at a time.</li><li><a href="https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo">Persistent undo</a> is an amazing built-in feature of vim that is disabled by default. It persists undo history between vim invocations. 
If you set <code>undofile</code> and <code>undodir</code> in your <code>.vimrc</code>, vim will store a per-file history of changes.</li><li>Leader Key - The leader key is a special key that is often left to the user to be configured for custom commands. The pattern is usually to press and release this key (often the space key) and then some other key to execute a certain command. Often, plugins will use this key to add their own functionality, for instance the UndoTree plugin uses <code><Leader> U</code> to open the undo tree.</li><li>Advanced Text Objects - Text objects like searches can also be composed with vim commands. E.g. <code>d/<pattern></code> will delete to the next match of said pattern or <code>cgn</code> will change the next occurrence of the last searched string.</li></ul><h2 id="What-is-2FA-and-why-should-I-use-it"><a href="#What-is-2FA-and-why-should-I-use-it" class="headerlink" title="What is 2FA and why should I use it?"></a>What is 2FA and why should I use it?</h2><p>Two Factor Authentication (2FA) adds an extra layer of protection to your accounts on top of passwords. In order to log in, you not only have to know some password, but you also have to “prove” in some way that you have access to some hardware device. In the simplest case, this can be achieved by receiving an SMS on your phone, although there are <a href="https://www.kaspersky.com/blog/2fa-practical-guide/24219/">known issues</a> with SMS 2FA. 
A better alternative we endorse is to use a <a href="https://en.wikipedia.org/wiki/Universal_2nd_Factor">U2F</a> solution like <a href="https://www.yubico.com/">YubiKey</a>.</p><h2 id="Any-comments-on-differences-between-web-browsers"><a href="#Any-comments-on-differences-between-web-browsers" class="headerlink" title="Any comments on differences between web browsers?"></a>Any comments on differences between web browsers?</h2><p>As of 2020, most browsers are like Chrome because they use the same engine (Blink). This means that Microsoft Edge, which is also based on Blink, and Safari, which is based on WebKit, a similar engine to Blink, are just worse versions of Chrome. Chrome is a reasonably good browser in terms of both performance and usability. Should you want an alternative, Firefox is our recommendation. It is comparable to Chrome in pretty much every regard and it excels at privacy. Another browser called <a href="https://www.ekioh.com/flow-browser/">Flow</a> is not user-ready yet, but it is implementing a new rendering engine that promises to be faster than the current ones.</p>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> Q&A </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture10: Potpourri</title>
<link href="/2024/12/19/missing-semester/missing-semester-lecture10/"/>
<url>/2024/12/19/missing-semester/missing-semester-lecture10/</url>
<content type="html"><![CDATA[<h2 id="Keyboard-remapping"><a href="#Keyboard-remapping" class="headerlink" title="Keyboard remapping"></a>Keyboard remapping</h2><p>As a programmer, your keyboard is your main input method. As with pretty much anything in your computer, it is configurable (and worth configuring).</p><p>The most basic change is to remap keys. This usually involves some software that is listening and, whenever a certain key is pressed, it intercepts that event and replaces it with another event corresponding to a different key. Some examples:</p><ul><li>Remap Caps Lock to Ctrl or Escape. We (the instructors) highly encourage this setting since Caps Lock has a very convenient location but is rarely used.</li><li>Remapping PrtSc to Play/Pause music. Most OSes have a play/pause key.</li><li>Swapping Ctrl and the Meta (Windows or Command) key.</li></ul><p>You can also map keys to arbitrary commands of your choosing. This is useful for common tasks that you perform. Here, some software listens for a specific key combination and executes some script whenever that event is detected.</p><ul><li>Open a new terminal or browser window.</li><li>Inserting some specific text, e.g. your long email address or your MIT ID number.</li><li>Sleeping the computer or the displays.</li></ul><p>There are even more complex modifications you can configure:</p><ul><li>Remapping sequences of keys, e.g. pressing shift five times toggles Caps Lock.</li><li>Remapping on tap vs on hold, e.g. 
the Caps Lock key is remapped to Esc if you quickly tap it, but is remapped to Ctrl if you hold it and use it as a modifier.</li><li>Making remaps keyboard- or software-specific.</li></ul><p>Some software resources to get started on the topic:</p><ul><li>macOS - <a href="https://karabiner-elements.pqrs.org/">karabiner-elements</a>, <a href="https://github.com/koekeishiya/skhd">skhd</a> or <a href="https://folivora.ai/">BetterTouchTool</a></li><li>Linux - <a href="https://wiki.archlinux.org/index.php/Xmodmap">xmodmap</a> or <a href="https://github.com/autokey/autokey">Autokey</a></li><li>Windows - Built into Control Panel, <a href="https://www.autohotkey.com/">AutoHotkey</a> or <a href="https://www.randyrants.com/category/sharpkeys/">SharpKeys</a></li><li>QMK - If your keyboard supports custom firmware you can use <a href="https://docs.qmk.fm/">QMK</a> to configure the hardware device itself so the remaps work on any machine you use the keyboard with.</li></ul><h2 id="Daemons"><a href="#Daemons" class="headerlink" title="Daemons"></a>Daemons</h2><p>You are probably already familiar with the notion of daemons, even if the word seems new. Most computers have a series of processes that are always running in the background rather than waiting for a user to launch them and interact with them. These processes are called daemons, and the names of programs that run as daemons often end with a <code>d</code> to indicate so. For example, <code>sshd</code>, the SSH daemon, is the program responsible for listening to incoming SSH requests and checking that the remote user has the necessary credentials to log in.</p><p>In Linux, <code>systemd</code> (the system daemon) is the most common solution for running and setting up daemon processes. You can run <code>systemctl status</code> to list the currently running daemons. 
Most of them might sound unfamiliar but are responsible for core parts of the system such as managing the network, resolving DNS queries or displaying the graphical interface for the system. You can interact with systemd through the <code>systemctl</code> command in order to <code>enable</code>, <code>disable</code>, <code>start</code>, <code>stop</code>, <code>restart</code> or check the <code>status</code> of services (those are the <code>systemctl</code> commands).</p><p>More interestingly, <code>systemd</code> has a fairly accessible interface for configuring and enabling new daemons (or services). Below is an example of a daemon for running a simple Python app. We won’t go into the details, but as you can see most of the fields are pretty self-explanatory.</p><pre><code class="lang-bash"># /etc/systemd/system/myapp.service
[Unit]
Description=My Custom App
After=network.target

[Service]
User=foo
Group=foo
WorkingDirectory=/home/foo/projects/mydaemon
ExecStart=/usr/bin/local/python3.7 app.py
Restart=on-failure

[Install]
WantedBy=multi-user.target</code></pre><p>Also, if you just want to run some program with a given frequency, there is no need to build a custom daemon; you can use <a href="https://www.man7.org/linux/man-pages/man8/cron.8.html"><code>cron</code></a>, a daemon your system already runs to perform scheduled tasks.</p><h2 id="FUSE"><a href="#FUSE" class="headerlink" title="FUSE"></a>FUSE</h2><p>Modern software systems are usually composed of smaller building blocks that are composed together. Your operating system supports using different filesystem backends because there is a common language of what operations a filesystem supports. For instance, when you run <code>touch</code> to create a file, <code>touch</code> performs a system call to the kernel to create the file and the kernel performs the appropriate filesystem call to create the given file. 
A caveat is that UNIX filesystems are traditionally implemented as kernel modules and only the kernel is allowed to perform filesystem calls.</p><p><a href="https://en.wikipedia.org/wiki/Filesystem_in_Userspace">FUSE</a> (Filesystem in User Space) allows filesystems to be implemented by a user program. FUSE lets users run user space code for filesystem calls and then bridges the necessary calls to the kernel interfaces. In practice, this means that users can implement arbitrary functionality for filesystem calls.</p><p>For example, FUSE can be used so that whenever you perform an operation in a virtual filesystem, that operation is forwarded through SSH to a remote machine, performed there, and the output is returned to you. This way, local programs can see the file as if it were on your computer while in reality it’s on a remote server. This is effectively what <code>sshfs</code> does.</p><p>Some interesting examples of FUSE filesystems are:</p><ul><li><a href="https://github.com/libfuse/sshfs">sshfs</a> - Open remote files/folders locally through an SSH connection.</li><li><a href="https://rclone.org/commands/rclone_mount/">rclone</a> - Mount cloud storage services like Dropbox, GDrive, Amazon S3 or Google Cloud Storage and open data locally.</li><li><a href="https://nuetzlich.net/gocryptfs/">gocryptfs</a> - Encrypted overlay system. Files are stored encrypted but once the FS is mounted they appear as plaintext in the mountpoint.</li><li><a href="https://keybase.io/docs/kbfs">kbfs</a> - Distributed filesystem with end-to-end encryption. You can have private, shared and public folders.</li><li><a href="https://borgbackup.readthedocs.io/en/stable/usage/mount.html">borgbackup</a> - Mount your deduplicated, compressed and encrypted backups for ease of browsing.</li></ul><h2 id="Backups"><a href="#Backups" class="headerlink" title="Backups"></a>Backups</h2><p>Any data that you haven’t backed up is data that could be gone at any moment, forever. 
It’s easy to copy data around, but it’s hard to reliably back up data. Here are some good backup basics and the pitfalls of some approaches.</p><p>First, a copy of the data on the same disk is not a backup, because the disk is the single point of failure for all the data. Similarly, an external drive in your home is also a weak backup solution since it could be lost in a fire, robbery, etc. Instead, having an off-site backup is a recommended practice.</p><p>Synchronization solutions are not backups. For instance, Dropbox/GDrive are convenient solutions, but when data is erased or corrupted they propagate the change. For the same reason, disk mirroring solutions like RAID are not backups. They don’t help if data gets deleted, corrupted or encrypted by ransomware.</p><p>Some core features of good backup solutions are versioning, deduplication and security. Versioned backups ensure that you can access your history of changes and efficiently recover files. Efficient backup solutions use data deduplication to only store incremental changes and reduce the storage overhead. Regarding security, you should ask yourself what someone would need to know/have in order to read your data and, more importantly, to delete all your data and associated backups. Lastly, blindly trusting backups is a terrible idea and you should verify regularly that you can use them to recover data.</p><p>Backups go beyond local files on your computer. Given the significant growth of web applications, large amounts of your data are only stored in the cloud. For instance, your webmail, social media photos, music playlists in streaming services or online docs are gone if you lose access to the corresponding accounts. 
Having an offline copy of this information is the way to go, and you can find online tools that people have built to fetch the data and save it.</p><p>For a more detailed explanation, see 2019’s lecture notes on <a href="https://missing.csail.mit.edu/2019/backups">Backups</a>.</p><h2 id="APIs"><a href="#APIs" class="headerlink" title="APIs"></a>APIs</h2><p>We’ve talked a lot in this class about using your computer more efficiently to accomplish <em>local</em> tasks, but you will find that many of these lessons also extend to the wider internet. Most services online will have “APIs” that let you programmatically access their data. For example, the US government has an API that lets you get weather forecasts, which you could use to easily get a weather forecast in your shell.</p><p>Most of these APIs have a similar format. They are structured URLs, often rooted at <code>api.service.com</code>, where the path and query parameters indicate what data you want to read or what action you want to perform. For the US weather data for example, to get the forecast for a particular location, you issue a GET request (with <code>curl</code> for example) to <a href="https://api.weather.gov/points/42.3604,-71.094">https://api.weather.gov/points/42.3604,-71.094</a>. The response itself contains a bunch of other URLs that let you get specific forecasts for that region. Usually, the responses are formatted as JSON, which you can then pipe through a tool like <a href="https://stedolan.github.io/jq/"><code>jq</code></a> to massage into what you care about.</p><p>Some APIs require authentication, and this usually takes the form of some sort of secret <em>token</em> that you need to include with the request. You should read the documentation for the API to see what the particular service you are looking for uses, but “<a href="https://www.oauth.com/">OAuth</a>” is a protocol you will often see used. 
At its heart, OAuth is a way to give you tokens that can “act as you” on a given service, and can only be used for particular purposes. Keep in mind that these tokens are <em>secret</em>, and anyone who gains access to your token can do whatever the token allows under <em>your</em> account!</p><p><a href="https://ifttt.com/">IFTTT</a> is a website and service centered around the idea of APIs: it provides integrations with tons of services, and lets you chain events from them in nearly arbitrary ways. Give it a look!</p><h2 id="Common-command-line-flags-patterns"><a href="#Common-command-line-flags-patterns" class="headerlink" title="Common command-line flags/patterns"></a>Common command-line flags/patterns</h2><p>Command-line tools vary a lot, and you will often want to check out their <code>man</code> pages before using them. They often share some common features, though, that can be good to be aware of:</p><ul><li>Most tools support some kind of <code>--help</code> flag to display brief usage instructions for the tool.</li><li>Many tools that can cause irrevocable change support the notion of a “dry run” in which they only print what they <em>would have done</em>, but do not actually perform the change. Similarly, they often have an “interactive” flag that will prompt you for each destructive action.</li><li>You can usually use <code>--version</code> or <code>-V</code> to have the program print its own version (handy for reporting bugs!).</li><li>Almost all tools have a <code>--verbose</code> or <code>-v</code> flag to produce more verbose output. You can usually include the flag multiple times (<code>-vvv</code>) to get <em>more</em> verbose output, which can be handy for debugging. 
Similarly, many tools have a <code>--quiet</code> flag for making it only print something on error.</li><li>In many tools, <code>-</code> in place of a file name means “standard input” or “standard output”, depending on the argument.</li><li>Possibly destructive tools are generally not recursive by default, but support a “recursive” flag (often <code>-r</code>) to make them recurse.</li><li>Sometimes, you want to pass something that <em>looks</em> like a flag as a normal argument. For example, imagine you wanted to remove a file called <code>-r</code>. Or you want to run one program “through” another, like <code>ssh machine foo</code>, and you want to pass a flag to the “inner” program (<code>foo</code>). The special argument <code>--</code> makes a program <em>stop</em> processing flags and options (things starting with <code>-</code>) in what follows, letting you pass things that look like flags without them being interpreted as such: <code>rm -- -r</code> or <code>ssh machine --for-ssh -- foo --for-foo</code>.</li></ul><h2 id="Window-managers"><a href="#Window-managers" class="headerlink" title="Window managers"></a>Window managers</h2><p>Most of you are used to using a “drag and drop” window manager, like what comes with Windows, macOS, and Ubuntu by default. There are windows that just sort of hang there on screen, and you can drag them around, resize them, and have them overlap one another. But these are only one <em>type</em> of window manager, often referred to as a “floating” window manager. There are many others, especially on Linux. A particularly common alternative is a “tiling” window manager. In a tiling window manager, windows never overlap, and are instead arranged as tiles on your screen, sort of like panes in tmux. With a tiling window manager, the screen is always filled by whatever windows are open, arranged according to some <em>layout</em>. If you have just one window, it takes up the full screen. 
If you then open another, the original window shrinks to make room for it (often something like 2/3 and 1/3). If you open a third, the other windows will again shrink to accommodate the new window. Just like with tmux panes, you can navigate around these tiled windows with your keyboard, and you can resize them and move them around, all without touching the mouse. They are worth looking into!</p><h2 id="VPNs"><a href="#VPNs" class="headerlink" title="VPNs"></a>VPNs</h2><p>VPNs are all the rage these days, but it’s not clear that’s for <a href="https://web.archive.org/web/20230710155258/https://gist.github.com/joepie91/5a9909939e6ce7d09e29">any good reason</a>. You should be aware of what a VPN does and does not get you. A VPN, in the best case, is <em>really</em> just a way for you to change your internet service provider as far as the internet is concerned. All your traffic will look like it’s coming from the VPN provider instead of your “real” location, and the network you are connected to will only see encrypted traffic.</p><p>While that may seem attractive, keep in mind that when you use a VPN, all you are really doing is shifting your trust from your current ISP to the VPN hosting company. Whatever your ISP <em>could</em> see, the VPN provider now sees <em>instead</em>. If you trust them <em>more</em> than your ISP, that is a win, but otherwise, it is not clear that you have gained much. If you are sitting on some dodgy unencrypted public Wi-Fi at an airport, then maybe you don’t trust the connection much, but at home, the trade-off is not quite as clear.</p><p>You should also know that these days, much of your traffic, at least of a sensitive nature, is <em>already</em> encrypted through HTTPS or TLS more generally. In that case, it usually matters little whether you are on a “bad” network or not; the network operator will only learn what servers you talk to, but not anything about the data that is exchanged.</p><p>Notice that I said “in the best case” above. 
It is not unheard of for VPN providers to accidentally misconfigure their software such that the encryption is either weak or entirely disabled. Some VPN providers are malicious (or at the very least opportunist), and will log all your traffic, and possibly sell information about it to third parties. Choosing a bad VPN provider is often worse than not using one in the first place.</p><p>In a pinch, MIT <a href="https://ist.mit.edu/vpn">runs a VPN</a> for its students, so that may be worth taking a look at. Also, if you’re going to roll your own, give <a href="https://www.wireguard.com/">WireGuard</a> a look.</p><h2 id="Markdown"><a href="#Markdown" class="headerlink" title="Markdown"></a>Markdown</h2><p>There is a high chance that you will write some text over the course of your career. And often, you will want to mark up that text in simple ways. You want some text to be bold or italic, or you want to add headers, links, and code fragments. Instead of pulling out a heavy tool like Word or LaTeX, you may want to consider using the lightweight markup language <a href="https://commonmark.org/help/">Markdown</a>.</p><p>You have probably seen Markdown already, or at least some variant of it. Subsets of it are used and supported almost everywhere, even if it’s not under the name Markdown. At its core, Markdown is an attempt to codify the way that people already often mark up text when they are writing plain text documents. Emphasis (<em>italics</em>) is added by surrounding a word with <code>*</code>. Strong emphasis (<strong>bold</strong>) is added using <code>**</code>. Lines starting with <code>#</code> are headings (and the number of <code>#</code>s is the subheading level). Any line starting with <code>-</code> is a bullet list item, and any line starting with a number + <code>.</code> is a numbered list item. 
Backtick is used to show words in <code>code font</code>, and a code block can be entered by indenting a line with four spaces or surrounding it with triple-backticks:</p><pre><code class="lang-markdown">```
code goes here
```</code></pre><p>To add a link, place the <em>text</em> for the link in square brackets, and the URL immediately following that in parentheses: <code>[name](url)</code>. Markdown is easy to get started with, and you can use it nearly everywhere. In fact, the lecture notes for this lecture, and all the others, are written in Markdown, and you can see the raw Markdown <a href="https://raw.githubusercontent.com/missing-semester/missing-semester/master/_2020/potpourri.md">here</a>.</p><h2 id="Hammerspoon-desktop-automation-on-macOS"><a href="#Hammerspoon-desktop-automation-on-macOS" class="headerlink" title="Hammerspoon (desktop automation on macOS)"></a>Hammerspoon (desktop automation on macOS)</h2><p><a href="https://www.hammerspoon.org/">Hammerspoon</a> is a desktop automation framework for macOS. It lets you write Lua scripts that hook into operating system functionality, allowing you to interact with the keyboard/mouse, windows, displays, filesystem, and much more.</p><p>Some examples of things you can do with Hammerspoon:</p><ul><li>Bind hotkeys to move windows to specific locations</li><li>Create a menu bar button that automatically lays out windows in a specific layout</li><li>Mute your speaker when you arrive in lab (by detecting the Wi-Fi network)</li><li>Show you a warning if you’ve accidentally taken your friend’s power supply</li></ul><p>At a high level, Hammerspoon lets you run arbitrary Lua code, bound to menu buttons, key presses, or events, and Hammerspoon provides an extensive library for interacting with the system, so there’s basically no limit to what you can do with it. 
Many people have made their Hammerspoon configurations public, so you can generally find what you need by searching the internet, but you can always write your own code from scratch.</p><h3 id="Resources"><a href="#Resources" class="headerlink" title="Resources"></a>Resources</h3><ul><li><a href="https://www.hammerspoon.org/go/">Getting Started with Hammerspoon</a></li><li><a href="https://github.com/Hammerspoon/hammerspoon/wiki/Sample-Configurations">Sample configurations</a></li><li><a href="https://github.com/anishathalye/dotfiles-local/tree/mac/hammerspoon">Anish’s Hammerspoon config</a></li></ul><h2 id="Booting-Live-USBs"><a href="#Booting-Live-USBs" class="headerlink" title="Booting + Live USBs"></a>Booting + Live USBs</h2><p>When your machine boots up, before the operating system is loaded, the <a href="https://en.wikipedia.org/wiki/BIOS">BIOS</a>/<a href="https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface">UEFI</a> initializes the system. During this process, you can press a specific key combination to configure this layer of software. For example, your computer may say something like “Press F9 to configure BIOS. Press F12 to enter boot menu.” during the boot process. You can configure all sorts of hardware-related settings in the BIOS menu. You can also enter the boot menu to boot from an alternate device instead of your hard drive.</p><p><a href="https://en.wikipedia.org/wiki/Live_USB">Live USBs</a> are USB flash drives containing an operating system. You can create one of these by downloading an operating system (e.g. a Linux distribution) and burning it to the flash drive. This process is a little bit more complicated than simply copying a <code>.iso</code> file to the disk. There are tools like <a href="https://unetbootin.github.io/">UNetbootin</a> to help you create live USBs.</p><p>Live USBs are useful for all sorts of purposes. 
Among other things, if you break your existing operating system installation so that it no longer boots, you can use a live USB to recover data or fix the operating system.</p><h2 id="Docker-Vagrant-VMs-Cloud-OpenStack"><a href="#Docker-Vagrant-VMs-Cloud-OpenStack" class="headerlink" title="Docker, Vagrant, VMs, Cloud, OpenStack"></a>Docker, Vagrant, VMs, Cloud, OpenStack</h2><p><a href="https://en.wikipedia.org/wiki/Virtual_machine">Virtual machines</a> and similar tools like containers let you emulate a whole computer system, including the operating system. This can be useful for creating an isolated environment for testing, development, or exploration (e.g. running potentially malicious code).</p><p><a href="https://www.vagrantup.com/">Vagrant</a> is a tool that lets you describe machine configurations (operating system, services, packages, etc.) in code, and then instantiate VMs with a simple <code>vagrant up</code>. <a href="https://www.docker.com/">Docker</a> is conceptually similar but it uses containers instead.</p><p>You can also rent virtual machines on the cloud, and it’s a nice way to get instant access to:</p><ul><li>A cheap always-on machine that has a public IP address, used to host services</li><li>A machine with a lot of CPU, disk, RAM, and/or GPU</li><li>Many more machines than you physically have access to (billing is often by the second, so if you want a lot of computing for a short amount of time, it’s feasible to rent 1000 computers for a couple of minutes)</li></ul><p>Popular services include <a href="https://aws.amazon.com/">Amazon AWS</a>, <a href="https://cloud.google.com/">Google Cloud</a>, <a href="https://azure.microsoft.com/">Microsoft Azure</a>, and <a href="https://www.digitalocean.com/">DigitalOcean</a>.</p><p>If you’re a member of MIT CSAIL, you can get free VMs for research purposes through the <a href="https://tig.csail.mit.edu/shared-computing/open-stack/">CSAIL OpenStack instance</a>.</p><h2 id="Notebook-programming"><a href="#Notebook-programming" class="headerlink" title="Notebook programming"></a>Notebook programming</h2><p><a href="https://en.wikipedia.org/wiki/Notebook_interface">Notebook programming environments</a> can be really handy for doing certain types of interactive or exploratory development. Perhaps the most popular notebook programming environment today is <a href="https://jupyter.org/">Jupyter</a>, for Python (and several other languages). <a href="https://www.wolfram.com/mathematica/">Wolfram Mathematica</a> is another notebook programming environment that’s great for doing math-oriented programming.</p><h2 id="GitHub"><a href="#GitHub" class="headerlink" title="GitHub"></a>GitHub</h2><p><a href="https://github.com/">GitHub</a> is one of the most popular platforms for open-source software development. Many of the tools we’ve talked about in this class, from <a href="https://github.com/vim/vim">vim</a> to <a href="https://github.com/Hammerspoon/hammerspoon">Hammerspoon</a>, are hosted on GitHub. It’s easy to get started contributing to open-source to help improve the tools that you use every day.</p><p>There are two primary ways in which people contribute to projects on GitHub:</p><ul><li>Creating an <a href="https://help.github.com/en/github/managing-your-work-on-github/creating-an-issue">issue</a>. This can be used to report bugs or request a new feature. Neither of these involves reading or writing code, so it can be pretty lightweight to do. High-quality bug reports can be extremely valuable to developers. Commenting on existing discussions can be helpful too.</li><li>Contribute code through a <a href="https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests">pull request</a>. This is generally more involved than creating an issue. You can <a href="https://help.github.com/en/github/getting-started-with-github/fork-a-repo">fork</a> a repository on GitHub, clone your fork, create a new branch, make some changes (e.g. 
fix a bug or implement a feature), push the branch, and then <a href="https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request">create a pull request</a>. After this, there will generally be some back-and-forth with the project maintainers, who will give you feedback on your patch. Finally, if all goes well, your patch will be merged into the upstream repository. Oftentimes, larger projects will have a contributing guide and tag beginner-friendly issues, and some even have mentorship programs to help first-time contributors become familiar with the project.</li></ul>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> Potpourri </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture9: Security and Cryptography</title>
<link href="/2024/11/08/missing-semester/missing-semester-lecture9/"/>
<url>/2024/11/08/missing-semester/missing-semester-lecture9/</url>
<content type="html"><![CDATA[<h3 id="Security-and-Cryptography"><a href="#Security-and-Cryptography" class="headerlink" title="Security and Cryptography"></a>Security and Cryptography</h3><p>This lecture has a very informal (but we think practical) treatment of basic cryptography concepts. This lecture won’t be enough to teach you how to design secure systems or cryptographic protocols, but we hope it will be enough to give you a general understanding of the programs and protocols you already use.</p><h4 id="Entropy"><a href="#Entropy" class="headerlink" title="Entropy"></a>Entropy</h4><p>Entropy is a measure of randomness. This is useful, for example, when determining the strength of a password.</p><p>Entropy is measured in bits, and when selecting uniformly at random from a set of possible outcomes, the entropy is equal to log_2(# of possibilities). A fair coin flip gives 1 bit of entropy. A roll of a 6-sided die has ~2.58 bits of entropy.</p><p>How many bits of entropy is enough? It depends on your threat model. For online guessing, as the XKCD comic points out, ~40 bits of entropy is pretty good. To be resistant to offline guessing, a stronger password would be necessary (e.g. 80 bits, or more).</p><h4 id="Hashing-functions"><a href="#Hashing-functions" class="headerlink" title="Hashing functions"></a>Hashing functions</h4><p>An example of a hash function is SHA1, which is used in Git. It maps arbitrary-sized inputs to 160-bit outputs (which can be represented as 40 hexadecimal characters). We can try out the SHA1 hash on an input using the sha1sum command:</p><pre><code class="lang-bash">$ printf 'hello' | sha1sum
aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
$ printf 'hello' | sha1sum
aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
$ printf 'Hello' | sha1sum
f7ff9e8b7bb2e09b70935a5d785e0cc5d9d0abf0</code></pre><h5 id="applications"><a href="#applications" class="headerlink" title="applications"></a>applications</h5><ul><li>Git, for content-addressed storage. 
The idea of a hash function is a more general concept (there are non-cryptographic hash functions). Why does Git use a cryptographic hash function?</li><li>A short summary of the contents of a file. Software can often be downloaded from (potentially less trustworthy) mirrors, e.g. Linux ISOs, and it would be nice to not have to trust them. The official sites usually post hashes alongside the download links (that point to third-party mirrors), so that the hash can be checked after downloading a file.</li><li>Commitment schemes. Suppose you want to commit to a particular value, but reveal the value itself later. For example, I want to do a fair coin toss “in my head”, without a trusted shared coin that two parties can see. I could choose a value r = random(), and then share h = sha256(r). Then, you could call heads or tails (we’ll agree that even r means heads, and odd r means tails). After you call, I can reveal my value r, and you can confirm that I haven’t cheated by checking sha256(r) matches the hash I shared earlier.</li></ul><h4 id="Key-derivation-functions"><a href="#Key-derivation-functions" class="headerlink" title="Key derivation functions"></a>Key derivation functions</h4><p>A related concept to cryptographic hashes, key derivation functions (KDFs) are used for a number of applications, including producing fixed-length output for use as keys in other cryptographic algorithms. Usually, KDFs are deliberately slow, in order to slow down offline brute-force attacks.</p><h5 id="applications-1"><a href="#applications-1" class="headerlink" title="applications"></a>applications</h5><ul><li>Producing keys from passphrases for use in other cryptographic algorithms (e.g. symmetric cryptography, see below).</li><li>Storing login credentials. 
Storing plaintext passwords is bad; the right approach is to generate and store a random salt (salt = random()) for each user, store KDF(password + salt), and verify login attempts by re-computing the KDF given the entered password and the stored salt.</li></ul><h4 id="Symmetric-encryption"><a href="#Symmetric-encryption" class="headerlink" title="Symmetric encryption"></a>Symmetric encryption</h4><p>Hiding message contents is probably the first concept you think about when you think about cryptography. Symmetric cryptography accomplishes this with the following set of functionality:</p><pre><code>keygen() -> key  (this function is randomized)
encrypt(plaintext: array<byte>, key) -> array<byte>  (the ciphertext)
decrypt(ciphertext: array<byte>, key) -> array<byte>  (the plaintext)</code></pre><p>The encrypt function has the property that given the output (ciphertext), it’s hard to determine the input (plaintext) without the key. The decrypt function has the obvious correctness property, that decrypt(encrypt(m, k), k) = m.</p><p>An example of a symmetric cryptosystem in wide use today is AES.</p><h5 id="applications-2"><a href="#applications-2" class="headerlink" title="applications"></a>applications</h5><ul><li>Encrypting files for storage in an untrusted cloud service. This can be combined with KDFs, so you can encrypt a file with a passphrase. Generate key = KDF(passphrase), and then store encrypt(file, key).</li></ul><h4 id="Asymmetric-encryption"><a href="#Asymmetric-encryption" class="headerlink" title="Asymmetric encryption"></a>Asymmetric encryption</h4><p>The term “asymmetric” refers to there being two keys, with two different roles. A private key, as its name implies, is meant to be kept private, while the public key can be publicly shared and it won’t affect security (unlike sharing the key in a symmetric cryptosystem). 
Asymmetric cryptosystems provide the following set of functionality, to encrypt/decrypt and to sign/verify:</p><pre><code class="lang-bash">keygen() -> (public key, private key)  (this function is randomized)
encrypt(plaintext: array<byte>, public key) -> array<byte>  (the ciphertext)
decrypt(ciphertext: array<byte>, private key) -> array<byte>  (the plaintext)
sign(message: array<byte>, private key) -> array<byte>  (the signature)
verify(message: array<byte>, signature: array<byte>, public key) -> bool  (whether or not the signature is valid)</code></pre><p>The decrypt function has the obvious correctness property, that <code>decrypt(encrypt(m, public key), private key) = m</code>.</p><p>Symmetric and asymmetric encryption can be compared to physical locks. A symmetric cryptosystem is like a door lock: anyone with the key can lock and unlock it. Asymmetric encryption is like a padlock with a key. You could give the unlocked lock to someone (the public key), they could put a message in a box and then put the lock on, and after that, only you could open the lock because you kept the key (the private key).</p><p>The sign/verify functions have the same properties that you would hope physical signatures would have, in that it’s hard to forge a signature. No matter the message, without the private key, it’s hard to produce a signature such that <code>verify(message, signature, public key)</code> returns true. And of course, the verify function has the obvious correctness property that <code>verify(message, sign(message, private key), public key) = true</code>.</p><h5 id="applications-3"><a href="#applications-3" class="headerlink" title="applications"></a>applications</h5><ul><li>PGP email encryption. People can have their public keys posted online (e.g. in a PGP keyserver, or on Keybase). Anyone can send them encrypted email.</li><li>Private messaging. Apps like Signal and Keybase use asymmetric keys to establish private communication channels.</li><li>Signing software. 
Git can have GPG-signed commits and tags. With a posted public key, anyone can verify the authenticity of downloaded software.</li></ul><h5 id="key-distribution"><a href="#key-distribution" class="headerlink" title="key distribution"></a>key distribution</h5><p>Asymmetric-key cryptography is wonderful, but it has a big challenge of distributing public keys / mapping public keys to real-world identities. There are many solutions to this problem. Signal has one simple solution: trust on first use, and support out-of-band public key exchange (you verify your friends’ “safety numbers” in person). PGP has a different solution, which is <a href="https://en.wikipedia.org/wiki/Web_of_trust">web of trust</a>. Keybase has yet another solution of <a href="https://keybase.io/blog/chat-apps-softer-than-tofu">social proof</a> (along with other neat ideas). Each model has its merits; we (the instructors) like Keybase’s model.</p><h4 id="Case-studies"><a href="#Case-studies" class="headerlink" title="Case studies"></a>Case studies</h4><h5 id="Password-managers"><a href="#Password-managers" class="headerlink" title="Password managers"></a>Password managers</h5><h5 id="Two-factor-authentication"><a href="#Two-factor-authentication" class="headerlink" title="Two-factor authentication"></a>Two-factor authentication</h5><h5 id="Full-disk-encryption"><a href="#Full-disk-encryption" class="headerlink" title="Full disk encryption"></a>Full disk encryption</h5><h5 id="Private-messaging"><a href="#Private-messaging" class="headerlink" title="Private messaging"></a>Private messaging</h5><h5 id="SSH"><a href="#SSH" class="headerlink" title="SSH"></a>SSH</h5>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> Security </tag>
<tag> Cryptography </tag>
</tags>
</entry>
<entry>
<title>Frequently Used Commands</title>
<link href="/2024/09/14/linuxcommands/frequentlyusedcommands/"/>
<url>/2024/09/14/linuxcommands/frequentlyusedcommands/</url>
<content type="html"><![CDATA[<h3 id="some-frequently-used-commands"><a href="#some-frequently-used-commands" class="headerlink" title="some frequently used commands"></a>some frequently used commands</h3><h4 id="cat-amp-less-amp-tail-amp-vim"><a href="#cat-amp-less-amp-tail-amp-vim" class="headerlink" title="cat & less & tail & vim"></a>cat & less & tail & vim</h4><ul><li>less filename.log<ul><li><code>/</code> search_word</li></ul></li><li>tail -f filename.log<ul><li><code>f</code> follow</li></ul></li><li>vim filename.log</li></ul><h4 id="ping-amp-telnet-amp-traceroute-amp-dig"><a href="#ping-amp-telnet-amp-traceroute-amp-dig" class="headerlink" title="ping & telnet & traceroute & dig"></a>ping & telnet & traceroute & dig</h4><ul><li>ping www.xxx.com<ul><li>check the network connection (ping takes a hostname or IP address, not a URL)</li></ul></li><li>telnet www.xxx.com port_num<ul><li>check the port connection</li></ul></li></ul><h4 id="netstat"><a href="#netstat" class="headerlink" title="netstat"></a>netstat</h4><ul><li>netstat -lntp<ul><li>check the port status</li><li><code>l</code> listening,<code>netstat -l</code> shows the listening sockets</li><li><code>n</code> numeric, show addresses and ports as numbers instead of resolving names</li><li><code>t</code> tcp</li><li><code>p</code> process</li></ul></li><li>netstat -latp<ul><li><code>l</code> listening,<code>netstat -l</code> shows the listening sockets</li><li><code>a</code> all</li><li><code>t</code> tcp</li><li><code>p</code> process</li></ul></li></ul><h4 id="ps"><a href="#ps" class="headerlink" title="ps"></a>ps</h4><ul><li>ps -ef<ul><li><code>ps</code> process status</li><li><code>e</code> select all processes</li><li><code>f</code> full format</li></ul></li></ul><h4 id="top-amp-df-amp-du"><a href="#top-amp-df-amp-du" class="headerlink" title="top & df & du"></a>top & df & du</h4><ul><li>top<ul><li>check the cpu and memory usage</li></ul></li><li>df -h<ul><li>check the used and free space of each mounted filesystem</li><li><code>h</code> human-readable</li></ul></li><li>du -sh<ul><li>check the total disk usage of a file or directory</li></ul></li></ul><h4 id="gt-amp-gt-gt-amp-lt-amp"><a href="#gt-amp-gt-gt-amp-lt-amp" class="headerlink" title="> & >> & < & |"></a>> & >> & < & |</h4><ul><li>bash xxx.sh > xxx.log<ul><li><code>></code> overwrite</li></ul></li><li>bash xx.sh >> xxx.log<ul><li><code>>></code> append</li><li><code>2>&1</code> redirect standard error to the same place as standard output</li></ul></li><li>bash xxx.sh < xxx.log<ul><li><code><</code> input</li><li><code>|</code> pipe</li></ul></li></ul><h4 id="echo-amp-sed-amp-awk-amp-systemctl-service"><a href="#echo-amp-sed-amp-awk-amp-systemctl-service" class="headerlink" title="echo & sed & awk & systemctl/service"></a>echo & sed & awk & systemctl/service</h4><ul><li><p>echo "hello world"</p><ul><li>print the string</li><li><code>echo $?</code> print the exit status of the last command</li><li><code>echo $0</code> print the script name</li><li><code>echo $1</code> print the first parameter</li><li><code>echo $*</code> print all the parameters (joined into a single word when quoted)</li><li><code>echo $@</code> print all the parameters (kept as separate words when quoted)</li><li><code>echo $$</code> print the process id of the current shell</li><li><code>echo $!</code> print the process id of the most recent background command</li></ul></li><li><p>sed 's/old/new/g' filename</p><ul><li><code>s</code> substitute</li><li><code>g</code> global</li></ul></li><li>awk '{print $1}' filename<ul><li><code>awk</code> pattern scanning and processing language</li><li><code>{print $1}</code> print the first column</li></ul></li><li>systemctl start/stop/restart service_name</li><li>service service_name start/stop/restart</li></ul>]]></content>
<categories>
<category> Linux </category>
</categories>
<tags>
<tag> linux commands </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture8: Metaprogramming</title>
<link href="/2024/09/04/missing-semester/missing-semester-lecture8/"/>
<url>/2024/09/04/missing-semester/missing-semester-lecture8/</url>
<content type="html"><![CDATA[<h3 id="Lecture-8-Metaprogramming"><a href="#Lecture-8-Metaprogramming" class="headerlink" title="Lecture 8: Metaprogramming"></a>Lecture 8: Metaprogramming</h3><p><a href="https://missing.csail.mit.edu/2020/metaprogramming/">https://missing.csail.mit.edu/2020/metaprogramming/</a></p><h4 id="Build-systems"><a href="#Build-systems" class="headerlink" title="Build systems"></a>Build systems</h4><p>make is one of the most common build systems out there, and you will usually find it installed on pretty much any UNIX-based computer. It has its warts, but works quite well for simple-to-moderate projects. When you run make, it consults a file called Makefile in the current directory. All the targets, their dependencies, and the rules are defined in that file. Let’s take a look at one:</p><pre><code class="lang-bash">paper.pdf: paper.tex plot-data.png
	pdflatex paper.tex

plot-%.png: %.dat plot.py
	./plot.py -i $*.dat -o $@</code></pre><p>Each directive in this file is a rule for how to produce the left-hand side using the right-hand side. Or, phrased differently, the things named on the right-hand side are dependencies, and the left-hand side is the target. The indented block is a sequence of programs to produce the target from those dependencies. In make, the first directive also defines the default goal. If you run <code>make</code> with no arguments, this is the target it will build. Alternatively, you can run something like <code>make plot-data.png</code>, and it will build that target instead.</p><p>The <code>%</code> in a rule is a “pattern”, and will match the same string on the left and on the right. For example, if the target <code>plot-foo.png</code> is requested, make will look for the dependencies <code>foo.dat</code> and <code>plot.py</code>. Now let’s look at what happens if we run make with an empty source directory.</p><pre><code class="lang-bash">$ make
make: *** No rule to make target 'paper.tex', needed by 'paper.pdf'.  Stop.</code></pre><p><code>make</code> is helpfully telling us that in order to build <code>paper.pdf</code>, it needs <code>paper.tex</code>, and it has no rule telling it how to make that file. Let’s try making it!</p><pre><code class="lang-bash">$ touch paper.tex
$ make
make: *** No rule to make target 'plot-data.png', needed by 'paper.pdf'.  Stop.</code></pre><p>Hmm, interesting, there is a rule to make <code>plot-data.png</code>, but it is a pattern rule. Since the source files do not exist (<code>data.dat</code>), make simply states that it cannot make that file. Let’s try creating all the files:</p><pre><code class="lang-bash">$ cat paper.tex
\documentclass{article}
\usepackage{graphicx}
\begin{document}
\includegraphics[scale=0.65]{plot-data.png}
\end{document}
$ cat plot.py
#!/usr/bin/env python
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-i', type=argparse.FileType('r'))
parser.add_argument('-o')
args = parser.parse_args()

data = np.loadtxt(args.i)
plt.plot(data[:, 0], data[:, 1])
plt.savefig(args.o)
$ cat data.dat
1 1
2 2
3 3
4 4
5 8</code></pre><p>Now what happens if we run make?</p><pre><code class="lang-bash">$ make
./plot.py -i data.dat -o plot-data.png
pdflatex paper.tex
... lots of output ...</code></pre><p>And look, it made a PDF for us! What if we run make again?</p><pre><code class="lang-bash">$ make
make: 'paper.pdf' is up to date.</code></pre><p>It didn’t do anything! Why not? Well, because it didn’t need to. It checked that all of the previously-built targets were still up to date with respect to their listed dependencies. 
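</p><p>The up-to-date check described above can be sketched in a few lines of Python. This is only an illustration, assuming that the decision is made by comparing file modification times; the helper name <code>needs_rebuild</code> is hypothetical and is not part of make:</p><pre><code class="lang-python">import os


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency.

    Hypothetical helper for illustration only; not part of make.
    """
    if not os.path.exists(target):
        # A target that does not exist yet always has to be built.
        return True
    target_mtime = os.path.getmtime(target)
    # Rebuild when at least one dependency was modified after the target.
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)</code></pre><p>For the Makefile above, the call would be <code>needs_rebuild('paper.pdf', ['paper.tex', 'plot-data.png'])</code>. The sketch leaves out what make also does: recursively bringing each dependency up to date first, and reporting dependencies that are missing and have no rule.</p><p>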
We can test this by modifying paper.tex and then re-running make:</p><pre><code class="lang-bash">$ vim paper.tex
$ make
pdflatex paper.tex
...</code></pre><p>Notice that make did not re-run <code>plot.py</code> because that was not necessary; none of <code>plot-data.png</code>’s dependencies changed!</p><h4 id="Dependency-management"><a href="#Dependency-management" class="headerlink" title="Dependency management"></a>Dependency management</h4><p>That also isn’t ideal though! What if I issue a security update which does not change the public interface of my library (its “API”), and which any project that depended on the old version should immediately start using? This is where the different groups of numbers in a version come in. The exact meaning of each one varies between projects, but one relatively common standard is semantic versioning. With semantic versioning, every version number is of the form: major.minor.patch. The rules are:</p><ul><li>If a new release does not change the API, increase the patch version.</li><li>If you add to your API in a backwards-compatible way, increase the minor version.</li><li>If you change the API in a non-backwards-compatible way, increase the major version.</li></ul><p>When working with dependency management systems, you may also come across the notion of lock files. A lock file is simply a file that lists the exact version you are currently depending on of each dependency. Usually, you need to explicitly run an update program to upgrade to newer versions of your dependencies.</p><h4 id="Continuous-integration-systems"><a href="#Continuous-integration-systems" class="headerlink" title="Continuous integration systems"></a>Continuous integration systems</h4><p>Continuous integration, or CI, is an umbrella term for “stuff that runs whenever your code changes”, and there are many companies out there that provide various types of CI, often for free for open-source projects. 
Some of the big ones are Travis CI, Azure Pipelines, and GitHub Actions. They all work in roughly the same way: you add a file to your repository that describes what should happen when various things happen to that repository. By far the most common one is a rule like “when someone pushes code, run the test suite”. When the event triggers, the CI provider spins up a virtual machine (or more), runs the commands in your “recipe”, and then usually notes down the results somewhere. You might set it up so that you are notified if the test suite stops passing, or so that a little badge appears on your repository as long as the tests pass.</p><p>As an example of a CI system, the class website is set up using GitHub Pages. Pages is a CI action that runs the Jekyll blog software on every push to master and makes the built site available on a particular GitHub domain. This makes it trivial for us to update the website! We just make our changes locally, commit them with git, and then push. CI takes care of the rest.</p><h4 id="A-brief-aside-on-testing"><a href="#A-brief-aside-on-testing" class="headerlink" title="A brief aside on testing"></a>A brief aside on testing</h4><p>Most large software projects come with a “test suite”. You may already be familiar with the general concept of testing, but we thought we’d quickly mention some approaches to testing and testing terminology that you may encounter in the wild:</p><p>Test suite: a collective term for all the tests<br>Unit test: a “micro-test” that tests a specific feature in isolation<br>Integration test: a “macro-test” that runs a larger part of the system to check that different features or components work together.<br>Regression test: a test that implements a particular pattern that previously caused a bug to ensure that the bug does not resurface.<br>Mocking: to replace a function, module, or type with a fake implementation to avoid testing unrelated functionality. 
For example, you might “mock the network” or “mock the disk”.</p><h4 id="make"><a href="#make" class="headerlink" title="make"></a>make</h4><ul><li>Caret<br>If your program depends on an API which only exists from version 1.2.3 to 1.9, then the version requirement would be ^1.2.3</li><li>Tilde<br>If your program depends on an API which only exists from version 1.0 to 1.9, then the version requirement would be ~1</li><li>Wildcard<br>If your program works with any version of the library, then the version requirement would be *</li><li>Comparison<br>If your program depends on an API which was removed in version 1.2.0, then the version requirement would be < 1.2.0</li><li>Multiple<br>If your program depends on an API which only exists from version 1.2 to 1.5, then the version requirement would be >= 1.2, < 1.6</li></ul>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> metaprogramming, build systems, dependency management, continuous integration systems </tag>
</tags>
</entry>
<entry>
<title>Tutorial for Starters for Coders on Mac</title>
<link href="/2024/08/16/tutorial-for-starters/"/>
<url>/2024/08/16/tutorial-for-starters/</url>
<content type="html"><![CDATA[<h1 id="Tutorial-for-Starters"><a href="#Tutorial-for-Starters" class="headerlink" title="Tutorial for Starters"></a>Tutorial for Starters</h1><p>This is a tutorial for beginners who want to learn how to code on a Mac.</p><h2 id="Install-Homebrew"><a href="#Install-Homebrew" class="headerlink" title="Install Homebrew"></a>Install Homebrew</h2><ul><li>Homebrew is a package manager for macOS. It is used to install software packages that are not included in macOS by default. To install Homebrew, open Terminal and run the following command:<pre><code class="lang-bash">/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"</code></pre></li></ul><h2 id="Install-Git"><a href="#Install-Git" class="headerlink" title="Install Git"></a>Install Git</h2><ul><li>Git is a version control system that allows you to track changes in your code and collaborate with others. To install Git using Homebrew, run the following command:<pre><code class="lang-Bash">brew install git</code></pre></li></ul><h2 id="Create-a-github-account"><a href="#Create-a-github-account" class="headerlink" title="Create a github account"></a>Create a github account</h2><ul><li>GitHub is a platform for hosting and sharing code. It allows you to collaborate with others, track changes in your code, and showcase your projects. To create a GitHub account, go to the GitHub website and sign up for an account. </li><li>Link your machine to your GitHub account with an SSH key.</li></ul><h2 id="Install-Python"><a href="#Install-Python" class="headerlink" title="Install Python"></a>Install Python</h2><ul><li>Python 3 is already installed on the M1 MacBook Air by default. </li><li>So you can skip this step if your Mac has an M1 chip or later. </li><li>Python is a popular programming language that is widely used for web development, data analysis, artificial intelligence, and more. 
To install Python using Homebrew, run the following command:<pre><code class="lang-bash">brew install python3</code></pre></li></ul><h2 id="Install-Iterm2"><a href="#Install-Iterm2" class="headerlink" title="Install Iterm2"></a>Install Iterm2</h2><ul><li>iTerm2 is a terminal emulator for macOS. It provides more features and customization options than the built-in Terminal app. To install iTerm2 using Homebrew, run the following command:<pre><code class="lang-bash">brew install --cask iterm2</code></pre></li></ul><h2 id="Install-Zsh"><a href="#Install-Zsh" class="headerlink" title="Install Zsh"></a>Install Zsh</h2><ul><li>Probably zsh is already installed on mac by default. </li><li>So just ignore this step if mac is M1 or later. </li><li>Zsh is a powerful shell that provides more features and customization options than the default Bash shell. To install Zsh using Homebrew, run the following command:<pre><code class="lang-Bash">brew install zsh</code></pre></li></ul><h2 id="Install-Oh-My-zsh"><a href="#Install-Oh-My-zsh" class="headerlink" title="Install Oh My zsh"></a>Install Oh My zsh</h2><ul><li>Oh My Zsh is a community-driven framework for managing your Zsh configuration. It comes with a variety of themes, plugins, and features that make working with Zsh more productive and enjoyable. To install Oh My Zsh, run the following command:<pre><code class="lang-bash">sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"</code></pre></li></ul><h2 id="Install-p10k"><a href="#Install-p10k" class="headerlink" title="Install p10k"></a>Install p10k</h2><ul><li>Powerlevel10k is a theme for the Zsh shell that provides a customizable prompt with additional information and features. 
To install Powerlevel10k, run the following command:<pre><code class="lang-bash">brew install romkatv/powerlevel10k/powerlevel10k</code></pre></li></ul><h2 id="Install-tmux"><a href="#Install-tmux" class="headerlink" title="Install tmux"></a>Install tmux</h2><p>tmux is a terminal multiplexer that allows you to run multiple terminal sessions in a single window. To install tmux using Homebrew, run the following command:</p><pre><code class="lang-bash">brew install tmux</code></pre><p>customize the nvim configuration file(Very important)</p><h2 id="Install-Oh-My-Tmux"><a href="#Install-Oh-My-Tmux" class="headerlink" title="Install Oh My Tmux"></a>Install Oh My Tmux</h2><p>Oh My Tmux is a configuration framework for tmux that provides a variety of themes, plugins, and features to enhance your tmux experience. To install Oh My Tmux, run the following commands from your home directory:</p><pre><code class="lang-bash">git clone https://github.com/gpakosz/.tmux.git
ln -s -f .tmux/.tmux.conf
cp .tmux/.tmux.conf.local .</code></pre><h2 id="Install-nvim"><a href="#Install-nvim" class="headerlink" title="Install nvim"></a>Install nvim</h2><ul><li>Neovim is a text editor that is designed for developers. It provides more features and customization options than the built-in Vim editor. To install Neovim using Homebrew, run the following command:<pre><code class="lang-Bash">brew install neovim</code></pre></li><li>learn how to use Neovim by following the tutorials and documentation on the Neovim website.<ul><li>vimtutor(quick start how to use vim)<ul><li>type vimtutor in the terminal to start the vim tutorial</li></ul></li><li>vim cheat sheet <ul><li><a href="https://vim.rtorr.com">https://vim.rtorr.com</a></li></ul></li></ul></li><li>customize the nvim configuration file(Very important)</li></ul><h2 id="Install-yabai"><a href="#Install-yabai" class="headerlink" title="Install yabai"></a>Install yabai</h2><ul><li>yabai is a tiling window manager for macOS. 
It allows you to organize your windows in a more efficient way. To install yabai using Homebrew, run the following command:<pre><code class="lang-bash">brew install koekeishiya/formulae/yabai</code></pre></li></ul><h2 id="Install-skhd"><a href="#Install-skhd" class="headerlink" title="Install skhd"></a>Install skhd</h2><ul><li>skhd is a hotkey daemon for macOS. It allows you to define custom keyboard shortcuts to control your windows and applications. To install skhd using Homebrew, run the following command:<pre><code class="lang-bash">brew install koekeishiya/formulae/skhd</code></pre></li></ul><h2 id="Install-Node-js"><a href="#Install-Node-js" class="headerlink" title="Install Node.js"></a>Install Node.js</h2><ul><li>Node.js is a JavaScript runtime that allows you to run JavaScript code outside of a web browser. To install Node.js using Homebrew, run the following command:<pre><code class="lang-bash">brew install node</code></pre></li></ul><h2 id="Install-nvm"><a href="#Install-nvm" class="headerlink" title="Install nvm"></a>Install nvm</h2><ul><li>nvm is a Node Version Manager that allows you to easily switch between different versions of Node.js. To install nvm, run the following command:<pre><code class="lang-bash">curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.38.0/install.sh | bash</code></pre></li></ul><h2 id="Install-joshuto"><a href="#Install-joshuto" class="headerlink" title="Install joshuto"></a>Install joshuto</h2><ul><li>joshuto is a terminal file manager that allows you to quickly browse, search for, and open files and directories. To install joshuto using Homebrew, run the following command:<pre><code class="lang-bash">brew install joshuto</code></pre></li></ul><h2 id="Install-karabiner-elements"><a href="#Install-karabiner-elements" class="headerlink" title="Install karabiner-elements"></a>Install karabiner-elements</h2><ul><li>karabiner-elements is a powerful and flexible keyboard remapping tool for macOS. 
It allows you to customize your keyboard shortcuts and key mappings. To install karabiner-elements using Homebrew, run the following command:<pre><code class="lang-bash">brew install --cask karabiner-elements</code></pre></li></ul>]]></content>
<categories>
<category> tutorial </category>
</categories>
<tags>
<tag> tutorial </tag>
<tag> mac </tag>
<tag> starter </tag>
</tags>
</entry>
<entry>
<title>Algorithms for Massive Datasets</title>
<link href="/2024/08/15/algorithms-for-massive-datasets/"/>
<url>/2024/08/15/algorithms-for-massive-datasets/</url>
<content type="html"><![CDATA[<h2 id="1-Hashing-Integer-Data-Structures-I"><a href="#1-Hashing-Integer-Data-Structures-I" class="headerlink" title="1.Hashing(Integer Data Structures I)"></a>1.Hashing(Integer Data Structures I)</h2><h3 id="Hashing"><a href="#Hashing" class="headerlink" title="Hashing"></a>Hashing</h3><h3 id="Perfect-Hashing"><a href="#Perfect-Hashing" class="headerlink" title="Perfect Hashing"></a>Perfect Hashing</h3><p>FKS-scheme</p><ul><li>lookup: O(1)<script type="math/tex">\sqrt{N}</script></li><li>Space: O(n)</li><li>feature: <ul><li>2-level</li><li>collision-free (the second level uses <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.025ex;" xmlns="http://www.w3.org/2000/svg" width="2.345ex" height="1.912ex" role="img" focusable="false" viewBox="0 -833.9 1036.6 844.9"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msup"><g data-mml-node="mi"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g><g data-mml-node="mn" transform="translate(633,363) scale(0.707)"><path data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 
309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g></g></g></g></svg></mjx-container>)</li></ul></li></ul><h3 id="String-Hashing"><a href="#String-Hashing" class="headerlink" title="String Hashing"></a>String Hashing</h3><ul><li>Karp-Rabin fingerprint<ul><li>rolling property</li><li>If fingerprints match, verify using brute-force comparison. Return “yes!” if we match.</li></ul></li></ul><h2 id="2-Predecessor-Integer-Data-Structures-II"><a href="#2-Predecessor-Integer-Data-Structures-II" class="headerlink" title="2.Predecessor(Integer Data Structures II)"></a>2.Predecessor(Integer Data Structures II)</h2><p><a href="https://www.youtube.com/watch?v=u-HHY1ylhHY&t=4032s">video from MIT 6.861</a></p><h3 id="Predecessor-Problem"><a href="#Predecessor-Problem" class="headerlink" title="Predecessor Problem"></a>Predecessor Problem</h3><h3 id="van-Emde-Boas"><a href="#van-Emde-Boas" class="headerlink" title="van Emde Boas"></a>van Emde Boas</h3><p>T(u) = T(<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.491ex;" xmlns="http://www.w3.org/2000/svg" width="3.224ex" height="2.398ex" role="img" focusable="false" viewBox="0 -843 1425 1060"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msqrt"><g transform="translate(853,0)"><g data-mml-node="mi"><path data-c="1D462" d="M21 287Q21 295 30 318T55 370T99 420T158 442Q204 442 227 417T250 358Q250 340 216 246T182 105Q182 62 196 45T238 27T291 44T328 78L339 95Q341 99 377 247Q407 367 413 387T427 416Q444 431 463 431Q480 431 488 421T496 402L420 84Q419 79 419 68Q419 43 426 35T447 26Q469 29 482 57T512 145Q514 153 532 153Q551 153 551 144Q550 139 549 130T540 98T523 55T498 17T462 -8Q454 -10 438 -10Q372 -10 347 46Q345 45 336 36T318 21T296 6T267 -6T233 -11Q189 -11 155 7Q103 38 103 113Q103 170 138 262T173 379Q173 380 173 381Q173 390 173 393T169 400T158 404H154Q131 404 112 
385T82 344T65 302T57 280Q55 278 41 278H27Q21 284 21 287Z"></path></g></g><g data-mml-node="mo" transform="translate(0,-17)"><path data-c="221A" d="M95 178Q89 178 81 186T72 200T103 230T169 280T207 309Q209 311 212 311H213Q219 311 227 294T281 177Q300 134 312 108L397 -77Q398 -77 501 136T707 565T814 786Q820 800 834 800Q841 800 846 794T853 782V776L620 293L385 -193Q381 -200 366 -200Q357 -200 354 -197Q352 -195 256 15L160 225L144 214Q129 202 113 190T95 178Z"></path></g><rect width="572" height="60" x="853" y="723"></rect></g></g></g></svg></mjx-container>) + 1 = O(loglogu)<br>O(logw) = O(loglogu)<br>O(u) space</p><ul><li>combine perfect hashing we can reduce space to O(n)<h3 id="x-Fast-and-y-Fast-Tries"><a href="#x-Fast-and-y-Fast-Tries" class="headerlink" title="x-Fast and y-Fast Tries"></a>x-Fast and y-Fast Tries</h3>x-Fast</li><li>don’t store 0s</li><li>For each level store a dictionary of prefixes of keys</li><li>Binary search over levels to find longest matching prefix with x</li><li>Space: O(nlogu)</li></ul><p>y-Fast Tries </p><ul><li>y-Fast Tries = x-Fast Tries + indirection</li><li>partition S into n/logu groups<ul><li>so space of top part is O(n/logu * logu) = O(n) </li><li>space of down part use Binary search tree is O(n)</li><li>so total space is O(n)</li><li>Time of top part is O(loglogu)</li><li>Time of down part is O(loglogu)</li><li>so total time is O(loglogu)</li></ul></li><li>Space: O(n)</li><li>Time: O(logw) = O(loglogu) </li></ul><h2 id="3-Range-Reporting-Geometry"><a href="#3-Range-Reporting-Geometry" class="headerlink" title="3.Range Reporting(Geometry)"></a>3.Range Reporting(Geometry)</h2><h3 id="1-D-Range-Reporting"><a href="#1-D-Range-Reporting" class="headerlink" title="1-D Range Reporting"></a>1-D Range Reporting</h3><ul><li>sort and binary search</li><li>space: O(n)</li><li>time: O(log+occ)</li><li>preprocess: O(nlogn)<h3 id="2-D-Range-Reporting"><a href="#2-D-Range-Reporting" class="headerlink" title="2-D Range Reporting"></a>2-D Range 
Reporting</h3><h4 id="Range-Trees"><a href="#Range-Trees" class="headerlink" title="Range Trees"></a>Range Trees</h4><ul><li>build a balanced binary search tree on the x-coordinates</li><li>each node of the x tree also stores the points of its subtree sorted by y </li></ul></li></ul><p>2D range tree reporting<br>using bridges (fractional cascading)<br><a href="https://ocw.mit.edu/courses/6-851-advanced-data-structures-spring-2012/resources/mit6_851s12_l3/">Range Tree MIT 6.851</a></p><p>Range tree explanation:<br><a href="https://www.youtube.com/watch?v=5a7EYVulN-w&t=491s">video from youtube</a><br>With fractional cascading: </p><ul><li>Time O(logn+occ)</li><li>Space O(nlogn)</li></ul><h4 id="Predecessor-in-Nested-Sets"><a href="#Predecessor-in-Nested-Sets" class="headerlink" title="Predecessor in Nested Sets"></a>Predecessor in Nested Sets</h4><h4 id="kD-Trees"><a href="#kD-Trees" class="headerlink" title="kD Trees"></a>kD Trees</h4><ul><li>divide the set into two parts by x</li><li>then divide each subset into two parts by y</li><li>recursively repeat the two steps above, alternating between x and y</li><li>Space: O(n)</li><li>Time: O(<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.491ex;" xmlns="http://www.w3.org/2000/svg" width="3.287ex" height="2.398ex" role="img" focusable="false" viewBox="0 -843 1453 1060"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msqrt"><g transform="translate(853,0)"><g data-mml-node="mi"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 
278H27Q21 284 21 287Z"></path></g></g><g data-mml-node="mo" transform="translate(0,-17)"><path data-c="221A" d="M95 178Q89 178 81 186T72 200T103 230T169 280T207 309Q209 311 212 311H213Q219 311 227 294T281 177Q300 134 312 108L397 -77Q398 -77 501 136T707 565T814 786Q820 800 834 800Q841 800 846 794T853 782V776L620 293L385 -193Q381 -200 366 -200Q357 -200 354 -197Q352 -195 256 15L160 225L144 214Q129 202 113 190T95 178Z"></path></g><rect width="600" height="60" x="853" y="723"></rect></g></g></g></svg></mjx-container>)</li><li>Preprocess: O(nlogn)</li></ul><h2 id="4-LCA-and-RMQ-Integer-Data-Structures-III"><a href="#4-LCA-and-RMQ-Integer-Data-Structures-III" class="headerlink" title="4.LCA and RMQ(Integer Data Structures III)"></a>4.LCA and RMQ(Integer Data Structures III)</h2><h3 id="Range-Minimum-Query"><a href="#Range-Minimum-Query" class="headerlink" title="Range Minimum Query"></a>Range Minimum Query</h3><p><a href="https://www.youtube.com/watch?v=0rCFkuQS968">Lowest Common Ancestor And Level Ancestor</a></p><ul><li>Save the result for all intervals of length a power of 2</li><li>Time: O(1)</li><li>Space: O(nlogn)<h3 id="pm-1-RMQ"><a href="#pm-1-RMQ" class="headerlink" title="$\pm 1$RMQ"></a><mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: 0;" xmlns="http://www.w3.org/2000/svg" width="2.891ex" height="1.507ex" role="img" focusable="false" viewBox="0 -666 1278 666"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mo"><path data-c="B1" d="M56 320T56 333T70 353H369V502Q369 651 371 655Q376 666 388 666Q402 666 405 654T409 596V500V353H707Q722 345 722 333Q722 320 707 313H409V40H707Q722 32 722 20T707 0H70Q56 7 56 20T70 40H369V313H70Q56 320 56 333Z"></path></g><g data-mml-node="mn" transform="translate(778,0)"><path data-c="31" d="M213 578L200 573Q186 568 160 563T102 556H83V602H102Q149 604 189 617T245 641T273 663Q275 666 285 666Q294 666 302 660V361L303 61Q310 54 315 52T339 
48T401 46H427V0H416Q395 3 257 3Q121 3 100 0H88V46H114Q136 46 152 46T177 47T193 50T201 52T207 57T213 61V578Z"></path></g></g></g></svg></mjx-container>RMQ</h3></li><li>2-level solution</li><li>divide the array into blocks of size 1/2 logn</li><li>2-level data structure:<ul><li>Sparse table on blocks<ul><li>space O((n/logn)*log(n/logn)) = O(n)</li><li>time : O(1)</li></ul></li><li>Tabulation inside blocks.<ul><li>Describe block by sequence of +1s and -1s </li><li>length of sequence is 1/2 * logn -1</li><li>so #sequence = <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: 0;" xmlns="http://www.w3.org/2000/svg" width="10.783ex" height="2.021ex" role="img" focusable="false" viewBox="0 -893.3 4766.2 893.3"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msup"><g data-mml-node="mn"><path data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g><g data-mml-node="TeXAtom" transform="translate(533,363) scale(0.707)" data-mjx-texclass="ORD"><g data-mml-node="mo"><path data-c="28" d="M94 250Q94 319 104 381T127 488T164 576T202 643T244 695T277 729T302 750H315H319Q333 750 333 741Q333 738 316 720T275 667T226 581T184 443T167 250T184 58T225 -81T274 -167T316 -220T333 -241Q333 -250 318 -250H315H302L274 -226Q180 -141 137 -14T94 250Z"></path></g><g data-mml-node="mn" transform="translate(389,0)"><path data-c="31" d="M213 578L200 573Q186 568 160 563T102 556H83V602H102Q149 604 189 617T245 641T273 663Q275 666 285 666Q294 666 302 660V361L303 61Q310 54 315 52T339 48T401 
46H427V0H416Q395 3 257 3Q121 3 100 0H88V46H114Q136 46 152 46T177 47T193 50T201 52T207 57T213 61V578Z"></path></g><g data-mml-node="TeXAtom" data-mjx-texclass="ORD" transform="translate(889,0)"><g data-mml-node="mo"><path data-c="2F" d="M423 750Q432 750 438 744T444 730Q444 725 271 248T92 -240Q85 -250 75 -250Q68 -250 62 -245T56 -231Q56 -221 230 257T407 740Q411 750 423 750Z"></path></g></g><g data-mml-node="mn" transform="translate(1389,0)"><path data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g><g data-mml-node="mo" transform="translate(1889,0)"><path data-c="29" d="M60 749L64 750Q69 750 74 750H86L114 726Q208 641 251 514T294 250Q294 182 284 119T261 12T224 -76T186 -143T145 -194T113 -227T90 -246Q87 -249 86 -250H74Q66 -250 63 -250T58 -247T55 -238Q56 -237 66 -225Q221 -64 221 250T66 725Q56 737 55 738Q55 746 60 749Z"></path></g><g data-mml-node="mo" transform="translate(2278,0)"><path data-c="2217" d="M229 286Q216 420 216 436Q216 454 240 464Q241 464 245 464T251 465Q263 464 273 456T283 436Q283 419 277 356T270 286L328 328Q384 369 389 372T399 375Q412 375 423 365T435 338Q435 325 425 315Q420 312 357 282T289 250L355 219L425 184Q434 175 434 161Q434 146 425 136T401 125Q393 125 383 131T328 171L270 213Q283 79 283 63Q283 53 276 44T250 35Q231 35 224 44T216 63Q216 80 222 143T229 213L171 171Q115 130 110 127Q106 124 100 124Q87 124 76 134T64 161Q64 166 64 169T67 175T72 181T81 188T94 195T113 204T138 215T170 230T210 250L74 315Q65 324 65 338Q65 353 74 363T98 374Q106 374 116 368T171 328L229 286Z"></path></g><g data-mml-node="mi" transform="translate(2778,0)"><path 
data-c="1D459" d="M117 59Q117 26 142 26Q179 26 205 131Q211 151 215 152Q217 153 225 153H229Q238 153 241 153T246 151T248 144Q247 138 245 128T234 90T214 43T183 6T137 -11Q101 -11 70 11T38 85Q38 97 39 102L104 360Q167 615 167 623Q167 626 166 628T162 632T157 634T149 635T141 636T132 637T122 637Q112 637 109 637T101 638T95 641T94 647Q94 649 96 661Q101 680 107 682T179 688Q194 689 213 690T243 693T254 694Q266 694 266 686Q266 675 193 386T118 83Q118 81 118 75T117 65V59Z"></path></g><g data-mml-node="mi" transform="translate(3076,0)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g><g data-mml-node="mi" transform="translate(3561,0)"><path data-c="1D454" d="M311 43Q296 30 267 15T206 0Q143 0 105 45T66 160Q66 265 143 353T314 442Q361 442 401 394L404 398Q406 401 409 404T418 412T431 419T447 422Q461 422 470 413T480 394Q480 379 423 152T363 -80Q345 -134 286 -169T151 -205Q10 -205 10 -137Q10 -111 28 -91T74 -71Q89 -71 102 -80T116 -111Q116 -121 114 -130T107 -144T99 -154T92 -162L90 -164H91Q101 -167 151 -167Q189 -167 211 -155Q234 -144 254 -122T282 -75Q288 -56 298 -13Q311 35 311 43ZM384 328L380 339Q377 350 375 354T369 368T359 382T346 393T328 402T306 405Q262 405 221 352Q191 313 171 233T151 117Q151 38 213 38Q269 38 323 108L331 118L384 328Z"></path></g><g data-mml-node="mi" transform="translate(4038,0)"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 
151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g><g data-mml-node="mo" transform="translate(4638,0)"><path data-c="2212" d="M84 237T84 250T98 270H679Q694 262 694 250T679 230H98Q84 237 84 250Z"></path></g><g data-mml-node="mn" transform="translate(5416,0)"><path data-c="31" d="M213 578L200 573Q186 568 160 563T102 556H83V602H102Q149 604 189 617T245 641T273 663Q275 666 285 666Q294 666 302 660V361L303 61Q310 54 315 52T339 48T401 46H427V0H416Q395 3 257 3Q121 3 100 0H88V46H114Q136 46 152 46T177 47T193 50T201 52T207 57T213 61V578Z"></path></g></g></g></g></g></svg></mjx-container> smaller than <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.491ex;" xmlns="http://www.w3.org/2000/svg" width="3.287ex" height="2.398ex" role="img" focusable="false" viewBox="0 -843 1453 1060"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msqrt"><g transform="translate(853,0)"><g data-mml-node="mi"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g></g><g data-mml-node="mo" transform="translate(0,-17)"><path data-c="221A" d="M95 178Q89 178 81 186T72 200T103 230T169 280T207 309Q209 311 212 311H213Q219 311 227 294T281 177Q300 134 312 108L397 -77Q398 -77 501 136T707 565T814 786Q820 800 834 800Q841 800 846 794T853 782V776L620 293L385 -193Q381 
-200 366 -200Q357 -200 354 -197Q352 -195 256 15L160 225L144 214Q129 202 113 190T95 178Z"></path></g><rect width="600" height="60" x="853" y="723"></rect></g></g></g></svg></mjx-container></li><li>size of table is O(<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.464ex;" xmlns="http://www.w3.org/2000/svg" width="5.196ex" height="2.482ex" role="img" focusable="false" viewBox="0 -892 2296.6 1097"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msup"><g data-mml-node="TeXAtom" data-mjx-texclass="ORD"><g data-mml-node="mi"><path data-c="1D459" d="M117 59Q117 26 142 26Q179 26 205 131Q211 151 215 152Q217 153 225 153H229Q238 153 241 153T246 151T248 144Q247 138 245 128T234 90T214 43T183 6T137 -11Q101 -11 70 11T38 85Q38 97 39 102L104 360Q167 615 167 623Q167 626 166 628T162 632T157 634T149 635T141 636T132 637T122 637Q112 637 109 637T101 638T95 641T94 647Q94 649 96 661Q101 680 107 682T179 688Q194 689 213 690T243 693T254 694Q266 694 266 686Q266 675 193 386T118 83Q118 81 118 75T117 65V59Z"></path></g><g data-mml-node="mi" transform="translate(298,0)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g><g data-mml-node="mi" transform="translate(783,0)"><path data-c="1D454" d="M311 43Q296 30 267 15T206 0Q143 0 105 45T66 160Q66 265 143 353T314 442Q361 442 401 394L404 398Q406 401 409 404T418 412T431 419T447 422Q461 422 470 413T480 394Q480 379 423 152T363 -80Q345 -134 286 -169T151 -205Q10 -205 10 -137Q10 -111 28 -91T74 -71Q89 -71 102 -80T116 -111Q116 -121 114 -130T107 -144T99 -154T92 -162L90 -164H91Q101 -167 151 -167Q189 -167 211 -155Q234 -144 254 -122T282 -75Q288 -56 298 -13Q311 35 311 43ZM384 
328L380 339Q377 350 375 354T369 368T359 382T346 393T328 402T306 405Q262 405 221 352Q191 313 171 233T151 117Q151 38 213 38Q269 38 323 108L331 118L384 328Z"></path></g></g><g data-mml-node="mn" transform="translate(1293,421.1) scale(0.707)"><path data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g></g><g data-mml-node="mi" transform="translate(1696.6,0)"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g></g></g></svg></mjx-container>)</li><li>space is O(<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.491ex;" xmlns="http://www.w3.org/2000/svg" width="3.287ex" height="2.398ex" role="img" focusable="false" viewBox="0 -843 1453 1060"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msqrt"><g transform="translate(853,0)"><g data-mml-node="mi"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 
179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g></g><g data-mml-node="mo" transform="translate(0,-17)"><path data-c="221A" d="M95 178Q89 178 81 186T72 200T103 230T169 280T207 309Q209 311 212 311H213Q219 311 227 294T281 177Q300 134 312 108L397 -77Q398 -77 501 136T707 565T814 786Q820 800 834 800Q841 800 846 794T853 782V776L620 293L385 -193Q381 -200 366 -200Q357 -200 354 -197Q352 -195 256 15L160 225L144 214Q129 202 113 190T95 178Z"></path></g><rect width="600" height="60" x="853" y="723"></rect></g></g></g></svg></mjx-container>*<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.464ex;" xmlns="http://www.w3.org/2000/svg" width="5.196ex" height="2.482ex" role="img" focusable="false" viewBox="0 -892 2296.6 1097"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msup"><g data-mml-node="TeXAtom" data-mjx-texclass="ORD"><g data-mml-node="mi"><path data-c="1D459" d="M117 59Q117 26 142 26Q179 26 205 131Q211 151 215 152Q217 153 225 153H229Q238 153 241 153T246 151T248 144Q247 138 245 128T234 90T214 43T183 6T137 -11Q101 -11 70 11T38 85Q38 97 39 102L104 360Q167 615 167 623Q167 626 166 628T162 632T157 634T149 635T141 636T132 637T122 637Q112 637 109 637T101 638T95 641T94 647Q94 649 96 661Q101 680 107 682T179 688Q194 689 213 690T243 693T254 694Q266 694 266 686Q266 675 193 386T118 83Q118 81 118 75T117 65V59Z"></path></g><g data-mml-node="mi" transform="translate(298,0)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 
268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g><g data-mml-node="mi" transform="translate(783,0)"><path data-c="1D454" d="M311 43Q296 30 267 15T206 0Q143 0 105 45T66 160Q66 265 143 353T314 442Q361 442 401 394L404 398Q406 401 409 404T418 412T431 419T447 422Q461 422 470 413T480 394Q480 379 423 152T363 -80Q345 -134 286 -169T151 -205Q10 -205 10 -137Q10 -111 28 -91T74 -71Q89 -71 102 -80T116 -111Q116 -121 114 -130T107 -144T99 -154T92 -162L90 -164H91Q101 -167 151 -167Q189 -167 211 -155Q234 -144 254 -122T282 -75Q288 -56 298 -13Q311 35 311 43ZM384 328L380 339Q377 350 375 354T369 368T359 382T346 393T328 402T306 405Q262 405 221 352Q191 313 171 233T151 117Q151 38 213 38Q269 38 323 108L331 118L384 328Z"></path></g></g><g data-mml-node="mn" transform="translate(1293,421.1) scale(0.707)"><path data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g></g><g data-mml-node="mi" transform="translate(1696.6,0)"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 
305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g></g></g></svg></mjx-container>) + O(n/logn) (a table index per block) = O(n) </li></ul></li><li>in total, space is O(n) and time is O(1)<h3 id="Lowest-Common-Ancestor"><a href="#Lowest-Common-Ancestor" class="headerlink" title="Lowest Common Ancestor"></a>Lowest Common Ancestor</h3></li></ul></li><li>E: Euler tour representation: depth-first walk, writing the id of a node every time it is visited.</li><li>A: A[i] is the depth of node E[i].</li><li>R: R[i] is the first occurrence in E of the node with id i</li><li>LCA(i, j) = E[<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.439ex;" xmlns="http://www.w3.org/2000/svg" width="7.272ex" height="2.032ex" role="img" focusable="false" viewBox="0 -704 3214.3 898"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mi"><path data-c="1D445" d="M230 637Q203 637 198 638T193 649Q193 676 204 682Q206 683 378 683Q550 682 564 680Q620 672 658 652T712 606T733 563T739 529Q739 484 710 445T643 385T576 351T538 338L545 333Q612 295 612 223Q612 212 607 162T602 80V71Q602 53 603 43T614 25T640 16Q668 16 686 38T712 85Q717 99 720 102T735 105Q755 105 755 93Q755 75 731 36Q693 -21 641 -21H632Q571 -21 531 4T487 82Q487 109 502 166T517 239Q517 290 474 313Q459 320 449 321T378 323H309L277 193Q244 61 244 59Q244 55 245 54T252 50T269 48T302 46H333Q339 38 339 37T336 19Q332 6 326 0H311Q275 2 180 2Q146 2 117 2T71 2T50 1Q33 1 33 10Q33 12 36 24Q41 43 46 45Q50 46 61 46H67Q94 46 127 49Q141 52 146 61Q149 65 218 339T287 628Q287 635 230 637ZM630 554Q630 586 609 608T523 636Q521 636 500 636T462 637H440Q393 637 386 627Q385 624 352 494T319 361Q319 360 388 360Q466 361 492 367Q556 377 592 426Q608 449 619 486T630 554Z"></path></g><g data-mml-node="mi" transform="translate(759,0)"><path data-c="1D440" d="M289 629Q289 635 232 637Q208 637 201 638T194 648Q194 649 196 659Q197 662 198 666T199 671T201 676T203 679T207 681T212 683T220 683T232 684Q238 684 262 684T307 683Q386 683 398 683T414 
678Q415 674 451 396L487 117L510 154Q534 190 574 254T662 394Q837 673 839 675Q840 676 842 678T846 681L852 683H948Q965 683 988 683T1017 684Q1051 684 1051 673Q1051 668 1048 656T1045 643Q1041 637 1008 637Q968 636 957 634T939 623Q936 618 867 340T797 59Q797 55 798 54T805 50T822 48T855 46H886Q892 37 892 35Q892 19 885 5Q880 0 869 0Q864 0 828 1T736 2Q675 2 644 2T609 1Q592 1 592 11Q592 13 594 25Q598 41 602 43T625 46Q652 46 685 49Q699 52 704 61Q706 65 742 207T813 490T848 631L654 322Q458 10 453 5Q451 4 449 3Q444 0 433 0Q418 0 415 7Q413 11 374 317L335 624L267 354Q200 88 200 79Q206 46 272 46H282Q288 41 289 37T286 19Q282 3 278 1Q274 0 267 0Q265 0 255 0T221 1T157 2Q127 2 95 1T58 0Q43 0 39 2T35 11Q35 13 38 25T43 40Q45 46 65 46Q135 46 154 86Q158 92 223 354T289 629Z"></path></g><g data-mml-node="msub" transform="translate(1810,0)"><g data-mml-node="mi"><path data-c="1D444" d="M399 -80Q399 -47 400 -30T402 -11V-7L387 -11Q341 -22 303 -22Q208 -22 138 35T51 201Q50 209 50 244Q50 346 98 438T227 601Q351 704 476 704Q514 704 524 703Q621 689 680 617T740 435Q740 255 592 107Q529 47 461 16L444 8V3Q444 2 449 -24T470 -66T516 -82Q551 -82 583 -60T625 -3Q631 11 638 11Q647 11 649 2Q649 -6 639 -34T611 -100T557 -165T481 -194Q399 -194 399 -87V-80ZM636 468Q636 523 621 564T580 625T530 655T477 665Q429 665 379 640Q277 591 215 464T153 216Q153 110 207 59Q231 38 236 38V46Q236 86 269 120T347 155Q372 155 390 144T417 114T429 82T435 55L448 64Q512 108 557 185T619 334T636 468ZM314 18Q362 18 404 39L403 49Q399 104 366 115Q354 117 347 117Q344 117 341 117T337 118Q317 118 296 98T274 52Q274 18 314 18Z"></path></g><g data-mml-node="mi" transform="translate(824,-152.7) scale(0.707)"><path data-c="1D434" d="M208 74Q208 50 254 46Q272 46 272 35Q272 34 270 22Q267 8 264 4T251 0Q249 0 239 0T205 1T141 2Q70 2 50 0H42Q35 7 35 11Q37 38 48 46H62Q132 49 164 96Q170 102 345 401T523 704Q530 716 547 716H555H572Q578 707 578 706L606 383Q634 60 636 57Q641 46 701 46Q726 46 726 36Q726 34 723 22Q720 7 718 4T704 0Q701 0 690 0T651 1T578 2Q484 2 455 
0H443Q437 6 437 9T439 27Q443 40 445 43L449 46H469Q523 49 533 63L521 213H283L249 155Q208 86 208 74ZM516 260Q516 271 504 416T490 562L463 519Q447 492 400 412L310 260L413 259Q516 259 516 260Z"></path></g></g></g></g></svg></mjx-container>(R[i], R[j])]</li><li>RMQ -> LCA -> <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: 0;" xmlns="http://www.w3.org/2000/svg" width="2.891ex" height="1.507ex" role="img" focusable="false" viewBox="0 -666 1278 666"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mo"><path data-c="B1" d="M56 320T56 333T70 353H369V502Q369 651 371 655Q376 666 388 666Q402 666 405 654T409 596V500V353H707Q722 345 722 333Q722 320 707 313H409V40H707Q722 32 722 20T707 0H70Q56 7 56 20T70 40H369V313H70Q56 320 56 333Z"></path></g><g data-mml-node="mn" transform="translate(778,0)"><path data-c="31" d="M213 578L200 573Q186 568 160 563T102 556H83V602H102Q149 604 189 617T245 641T273 663Q275 666 285 666Q294 666 302 660V361L303 61Q310 54 315 52T339 48T401 46H427V0H416Q395 3 257 3Q121 3 100 0H88V46H114Q136 46 152 46T177 47T193 50T201 52T207 57T213 61V578Z"></path></g></g></g></svg></mjx-container>RMQ </li><li>RMQ and LCA can be solved in O(n) space and O(1) query time.</li></ul><h2 id="5-Level-Ancestor-Trees"><a href="#5-Level-Ancestor-Trees" class="headerlink" title="5.Level Ancestor(Trees)"></a>5.Level Ancestor(Trees)</h2><h3 id="Level-Ancestor"><a href="#Level-Ancestor" class="headerlink" title="Level Ancestor"></a>Level Ancestor</h3><h3 id="Path-Decompositions"><a href="#Path-Decompositions" class="headerlink" title="Path Decompositions"></a>Path Decompositions</h3><h3 id="Tree-Decompositions"><a href="#Tree-Decompositions" class="headerlink" title="Tree Decompositions"></a>Tree Decompositions</h3><p><a href="https://www.youtube.com/watch?v=0rCFkuQS968">Lowest Common Ancestor And Level Ancestor</a><br><a 
href="https://ocw.mit.edu/courses/6-851-advanced-data-structures-spring-2012/resources/mit6_851s12_lec15/">LCA and Level Ancestor</a></p><h4 id="Top-down-Decomposition"><a href="#Top-down-Decomposition" class="headerlink" title="Top-down Decomposition"></a>Top-down Decomposition</h4><ul><li>jump pointers (each jump goes up at least k/2 nodes)</li><li>ladder decomposition (the ancestor reached by the jump lies on a ladder that can be used to find the <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.357ex;" xmlns="http://www.w3.org/2000/svg" width="2.866ex" height="1.927ex" role="img" focusable="false" viewBox="0 -694 1266.6 851.8"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D458" d="M121 647Q121 657 125 670T137 683Q138 683 209 688T282 694Q294 694 294 686Q294 679 244 477Q194 279 194 272Q213 282 223 291Q247 309 292 354T362 415Q402 442 438 442Q468 442 485 423T503 369Q503 344 496 327T477 302T456 291T438 288Q418 288 406 299T394 328Q394 353 410 369T442 390L458 393Q446 405 434 405H430Q398 402 367 380T294 316T228 255Q230 254 243 252T267 246T293 238T320 224T342 206T359 180T365 147Q365 130 360 106T354 66Q354 26 381 26Q429 26 459 145Q461 153 479 153H483Q499 153 499 144Q499 139 496 130Q455 -11 378 -11Q333 -11 305 15T277 90Q277 108 280 121T283 145Q283 167 269 183T234 206T200 217T182 220H180Q168 178 159 139T145 81T136 44T129 20T122 7T111 -2Q98 -11 83 -11Q66 -11 57 -1T48 16Q48 26 85 176T158 471L195 616Q196 629 188 632T149 637H144Q134 637 131 637T124 640T121 647Z"></path></g><g data-mml-node="TeXAtom" transform="translate(554,-150) scale(0.707)" data-mjx-texclass="ORD"><g data-mml-node="mi"><path data-c="1D461" d="M26 385Q19 392 19 395Q19 399 22 411T27 425Q29 430 36 430T87 431H140L159 511Q162 522 166 540T173 566T179 586T187 603T197 615T211 624T229 626Q247 625 254 615T261 596Q261 589 252 549T232 470L222 433Q222 431 272 431H323Q330 424 330 420Q330 398 317 
385H210L174 240Q135 80 135 68Q135 26 162 26Q197 26 230 60T283 144Q285 150 288 151T303 153H307Q322 153 322 145Q322 142 319 133Q314 117 301 95T267 48T216 6T155 -11Q125 -11 98 4T59 56Q57 64 57 83V101L92 241Q127 382 128 383Q128 385 77 385H26Z"></path></g><g data-mml-node="mi" transform="translate(361,0)"><path data-c="210E" d="M137 683Q138 683 209 688T282 694Q294 694 294 685Q294 674 258 534Q220 386 220 383Q220 381 227 388Q288 442 357 442Q411 442 444 415T478 336Q478 285 440 178T402 50Q403 36 407 31T422 26Q450 26 474 56T513 138Q516 149 519 151T535 153Q555 153 555 145Q555 144 551 130Q535 71 500 33Q466 -10 419 -10H414Q367 -10 346 17T325 74Q325 90 361 192T398 345Q398 404 354 404H349Q266 404 205 306L198 293L164 158Q132 28 127 16Q114 -11 83 -11Q69 -11 59 -2T48 16Q48 30 121 320L195 616Q195 629 188 632T149 637H128Q122 643 122 645T124 664Q129 683 137 683Z"></path></g></g></g></g></g></svg></mjx-container> ancestor)</li><li>tree trimming (nodes with >= 1/4 * logn descendants)</li><li>Space<ul><li>top part is O(n + n/logn * logn) = O(n) (the second logn is from ladder)</li><li>bottom part uses balanced parentheses representation<ul><li>tree encoding: each bottom tree has < 1/4 logn nodes</li><li>so the tree encoding is 2 * 1/4 logn = 1/2 * logn bits</li><li>encoding node v and ancestor distance k takes 2 * log(1/4logn) < 2 loglogn bits</li><li>so the number of possible inputs is < <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: 0;" xmlns="http://www.w3.org/2000/svg" width="13.73ex" height="2.021ex" role="img" focusable="false" viewBox="0 -893.3 6068.7 893.3"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msup"><g data-mml-node="mn"><path data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 
174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g><g data-mml-node="TeXAtom" transform="translate(533,363) scale(0.707)" data-mjx-texclass="ORD"><g data-mml-node="mn"><path data-c="31" d="M213 578L200 573Q186 568 160 563T102 556H83V602H102Q149 604 189 617T245 641T273 663Q275 666 285 666Q294 666 302 660V361L303 61Q310 54 315 52T339 48T401 46H427V0H416Q395 3 257 3Q121 3 100 0H88V46H114Q136 46 152 46T177 47T193 50T201 52T207 57T213 61V578Z"></path></g><g data-mml-node="TeXAtom" data-mjx-texclass="ORD" transform="translate(500,0)"><g data-mml-node="mo"><path data-c="2F" d="M423 750Q432 750 438 744T444 730Q444 725 271 248T92 -240Q85 -250 75 -250Q68 -250 62 -245T56 -231Q56 -221 230 257T407 740Q411 750 423 750Z"></path></g></g><g data-mml-node="mn" transform="translate(1000,0)"><path data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g><g data-mml-node="mi" transform="translate(1500,0)"><path data-c="1D459" d="M117 59Q117 26 142 26Q179 26 205 131Q211 151 215 152Q217 153 225 153H229Q238 153 241 153T246 151T248 144Q247 138 245 128T234 90T214 43T183 6T137 -11Q101 -11 70 11T38 85Q38 97 39 102L104 360Q167 615 167 623Q167 626 166 628T162 632T157 634T149 635T141 636T132 637T122 637Q112 637 109 637T101 638T95 641T94 647Q94 649 96 661Q101 680 107 682T179 688Q194 689 213 690T243 693T254 694Q266 694 266 686Q266 675 193 386T118 83Q118 81 118 75T117 65V59Z"></path></g><g data-mml-node="mi" 
transform="translate(1798,0)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g><g data-mml-node="mi" transform="translate(2283,0)"><path data-c="1D454" d="M311 43Q296 30 267 15T206 0Q143 0 105 45T66 160Q66 265 143 353T314 442Q361 442 401 394L404 398Q406 401 409 404T418 412T431 419T447 422Q461 422 470 413T480 394Q480 379 423 152T363 -80Q345 -134 286 -169T151 -205Q10 -205 10 -137Q10 -111 28 -91T74 -71Q89 -71 102 -80T116 -111Q116 -121 114 -130T107 -144T99 -154T92 -162L90 -164H91Q101 -167 151 -167Q189 -167 211 -155Q234 -144 254 -122T282 -75Q288 -56 298 -13Q311 35 311 43ZM384 328L380 339Q377 350 375 354T369 368T359 382T346 393T328 402T306 405Q262 405 221 352Q191 313 171 233T151 117Q151 38 213 38Q269 38 323 108L331 118L384 328Z"></path></g><g data-mml-node="mi" transform="translate(2760,0)"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g><g data-mml-node="mo" transform="translate(3360,0)"><path data-c="2B" d="M56 237T56 250T70 270H369V420L370 570Q380 583 389 583Q402 583 409 568V270H707Q722 262 722 250T707 230H409V-68Q401 -82 391 -82H389H387Q375 -82 369 -68V230H70Q56 237 56 250Z"></path></g><g data-mml-node="mn" transform="translate(4138,0)"><path 
data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g><g data-mml-node="mi" transform="translate(4638,0)"><path data-c="1D459" d="M117 59Q117 26 142 26Q179 26 205 131Q211 151 215 152Q217 153 225 153H229Q238 153 241 153T246 151T248 144Q247 138 245 128T234 90T214 43T183 6T137 -11Q101 -11 70 11T38 85Q38 97 39 102L104 360Q167 615 167 623Q167 626 166 628T162 632T157 634T149 635T141 636T132 637T122 637Q112 637 109 637T101 638T95 641T94 647Q94 649 96 661Q101 680 107 682T179 688Q194 689 213 690T243 693T254 694Q266 694 266 686Q266 675 193 386T118 83Q118 81 118 75T117 65V59Z"></path></g><g data-mml-node="mi" transform="translate(4936,0)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g><g data-mml-node="mi" transform="translate(5421,0)"><path data-c="1D454" d="M311 43Q296 30 267 15T206 0Q143 0 105 45T66 160Q66 265 143 353T314 442Q361 442 401 394L404 398Q406 401 409 404T418 412T431 419T447 422Q461 422 470 413T480 394Q480 379 423 152T363 -80Q345 -134 286 -169T151 -205Q10 -205 10 -137Q10 -111 28 -91T74 -71Q89 -71 102 -80T116 -111Q116 -121 114 -130T107 -144T99 -154T92 -162L90 -164H91Q101 -167 151 -167Q189 -167 211 -155Q234 -144 254 -122T282 -75Q288 -56 298 -13Q311 35 311 43ZM384 328L380 339Q377 350 375 354T369 368T359 382T346 393T328 402T306 405Q262 405 221 352Q191 313 171 
233T151 117Q151 38 213 38Q269 38 323 108L331 118L384 328Z"></path></g><g data-mml-node="mi" transform="translate(5898,0)"><path data-c="1D459" d="M117 59Q117 26 142 26Q179 26 205 131Q211 151 215 152Q217 153 225 153H229Q238 153 241 153T246 151T248 144Q247 138 245 128T234 90T214 43T183 6T137 -11Q101 -11 70 11T38 85Q38 97 39 102L104 360Q167 615 167 623Q167 626 166 628T162 632T157 634T149 635T141 636T132 637T122 637Q112 637 109 637T101 638T95 641T94 647Q94 649 96 661Q101 680 107 682T179 688Q194 689 213 690T243 693T254 694Q266 694 266 686Q266 675 193 386T118 83Q118 81 118 75T117 65V59Z"></path></g><g data-mml-node="mi" transform="translate(6196,0)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g><g data-mml-node="mi" transform="translate(6681,0)"><path data-c="1D454" d="M311 43Q296 30 267 15T206 0Q143 0 105 45T66 160Q66 265 143 353T314 442Q361 442 401 394L404 398Q406 401 409 404T418 412T431 419T447 422Q461 422 470 413T480 394Q480 379 423 152T363 -80Q345 -134 286 -169T151 -205Q10 -205 10 -137Q10 -111 28 -91T74 -71Q89 -71 102 -80T116 -111Q116 -121 114 -130T107 -144T99 -154T92 -162L90 -164H91Q101 -167 151 -167Q189 -167 211 -155Q234 -144 254 -122T282 -75Q288 -56 298 -13Q311 35 311 43ZM384 328L380 339Q377 350 375 354T369 368T359 382T346 393T328 402T306 405Q262 405 221 352Q191 313 171 233T151 117Q151 38 213 38Q269 38 323 108L331 118L384 328Z"></path></g><g data-mml-node="mi" transform="translate(7158,0)"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 
101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g></g></g></g></g></svg></mjx-container> = O(n)</li></ul></li><li>If the target ancestor lies in the bottom tree, answer the query inside the bottom tree. If it lies in the top part, find it in O(1) with the ladders and jump pointers.</li></ul></li></ul><h2 id="6-Suffix-Tree-Strings-I"><a href="#6-Suffix-Tree-Strings-I" class="headerlink" title="6.Suffix Tree(Strings I)"></a>6.Suffix Tree(Strings I)</h2><h3 id="Dictionaries"><a href="#Dictionaries" class="headerlink" title="Dictionaries"></a>Dictionaries</h3><h3 id="Tries"><a href="#Tries" class="headerlink" title="Tries"></a>Tries</h3><h3 id="Suffix-trees"><a href="#Suffix-trees" class="headerlink" title="Suffix trees"></a>Suffix trees</h3><p><a href="https://ocw.mit.edu/courses/6-851-advanced-data-structures-spring-2012/resources/mit6_851s12_lec16/">notes from MIT 6.851</a></p><ul><li>space: O(n)</li><li>time: O(m) to search for a string of length m</li><li>time: O(m + occ) for prefix search, where occ is the number of occurrences reported</li><li>occ can be large (up to n)</li></ul><h2 id="7-Suffix-Sorting-Strings-II"><a href="#7-Suffix-Sorting-Strings-II" class="headerlink" title="7.Suffix Sorting(Strings II)"></a>7.Suffix Sorting(Strings II)</h2><h3 id="Radix-Sorting"><a href="#Radix-Sorting" class="headerlink" title="Radix Sorting"></a>Radix Sorting</h3><h3 id="Suffix-Array"><a href="#Suffix-Array" class="headerlink" title="Suffix Array"></a>Suffix Array</h3><h3 id="Suffix-Sorting"><a href="#Suffix-Sorting" class="headerlink" title="Suffix Sorting"></a>Suffix Sorting</h3><p>DC3 Algorithm</p><ul><li>Sort sample suffixes<ul><li>Sample all suffixes starting at positions i with i mod 3 = 1 or i mod 3 = 2: O(n)</li><li>Recursively sort the sample suffixes: T(2n/3)</li></ul></li><li>Sort non-sample suffixes<ul><li>Sort the remaining suffixes (starting at positions i with i mod 3 = 0): O(n)</li></ul></li><li>Merge: O(n)</li><li>total time: T(n) = T(2n/3) + O(n) = O(n)</li><li>We can suffix sort a string of length n over an alphabet Σ of size n in O(n) time.</li><li>We can suffix sort a string of length n over an alphabet Σ in O(sort(n, |Σ|)) time.</li></ul><h2 id="8-Compression-Compression"><a href="#8-Compression-Compression" class="headerlink" title="8.Compression(Compression)"></a>8.Compression(Compression)</h2><h3 id="Lempel-Ziv"><a href="#Lempel-Ziv" class="headerlink" title="Lempel-Ziv"></a>Lempel-Ziv</h3><h4 id="Lempel-Ziv-77"><a href="#Lempel-Ziv-77" class="headerlink" title="Lempel-Ziv 77"></a>Lempel-Ziv 77</h4><ul><li>can be computed using a suffix tree</li><li>Parse from left to right into phrases.</li><li>Select the longest substring seen before + a single character.</li><li>Encode each phrase as (position of the previous occurrence, length, character) or as a single character.</li><li>time: O(n)</li></ul><h4 id="Lempel-Ziv-78"><a href="#Lempel-Ziv-78" class="headerlink" title="Lempel-Ziv 78"></a>Lempel-Ziv 78</h4><ul><li>Parse from left to right into phrases.</li><li>Select the longest phrase seen before + a single character.</li><li>Encode each phrase as (previous phrase, character) or as a single character.</li><li>time: O(n)</li></ul><h3 id="Re-Pair"><a href="#Re-Pair" class="headerlink" title="Re-Pair"></a>Re-Pair</h3><ul><li>Recursive-pairing compression [Larsson and Moffat 2000].<ul><li>Start with string S.</li><li>Replace a most frequent pair ab by a new character Xi. 
Output rule <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.357ex;" xmlns="http://www.w3.org/2000/svg" width="2.613ex" height="1.902ex" role="img" focusable="false" viewBox="0 -683 1155 840.8"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D44B" d="M42 0H40Q26 0 26 11Q26 15 29 27Q33 41 36 43T55 46Q141 49 190 98Q200 108 306 224T411 342Q302 620 297 625Q288 636 234 637H206Q200 643 200 645T202 664Q206 677 212 683H226Q260 681 347 681Q380 681 408 681T453 682T473 682Q490 682 490 671Q490 670 488 658Q484 643 481 640T465 637Q434 634 411 620L488 426L541 485Q646 598 646 610Q646 628 622 635Q617 635 609 637Q594 637 594 648Q594 650 596 664Q600 677 606 683H618Q619 683 643 683T697 681T738 680Q828 680 837 683H845Q852 676 852 672Q850 647 840 637H824Q790 636 763 628T722 611T698 593L687 584Q687 585 592 480L505 384Q505 383 536 304T601 142T638 56Q648 47 699 46Q734 46 734 37Q734 35 732 23Q728 7 725 4T711 1Q708 1 678 1T589 2Q528 2 496 2T461 1Q444 1 444 10Q444 11 446 25Q448 35 450 39T455 44T464 46T480 47T506 54Q523 62 523 64Q522 64 476 181L429 299Q241 95 236 84Q232 76 232 72Q232 53 261 47Q262 47 267 47T273 46Q276 46 277 46T280 45T283 42T284 35Q284 26 282 19Q279 6 276 4T261 1Q258 1 243 1T201 2T142 2Q64 2 42 0Z"></path></g><g data-mml-node="mi" transform="translate(861,-150) scale(0.707)"><path data-c="1D456" d="M184 600Q184 624 203 642T247 661Q265 661 277 649T290 619Q290 596 270 577T226 557Q211 557 198 567T184 600ZM21 287Q21 295 30 318T54 369T98 420T158 442Q197 442 223 419T250 357Q250 340 236 301T196 196T154 83Q149 61 149 51Q149 26 166 26Q175 26 185 29T208 43T235 78T260 137Q263 149 265 151T282 153Q302 153 302 143Q302 135 293 112T268 61T223 11T161 -11Q129 -11 102 10T74 74Q74 91 79 106T122 220Q160 321 166 341T173 380Q173 404 156 404H154Q124 404 99 371T61 287Q60 286 59 284T58 281T56 279T53 278T49 278T41 278H27Q21 284 21 
287Z"></path></g></g></g></g></svg></mjx-container> ➞ ab. (Open question: can the most frequent pair be found with a suffix tree?)</li><li>Repeat until we have a single pair.</li></ul></li></ul><h3 id="Grammars"><a href="#Grammars" class="headerlink" title="Grammars"></a>Grammars</h3><ul><li>Grammar compression. Encode string S as a grammar G that generates S.</li></ul><h2 id="9-Approximation-Algorithm-1"><a href="#9-Approximation-Algorithm-1" class="headerlink" title="9.Approximation Algorithm 1"></a>9.Approximation Algorithm 1</h2><h3 id="Acyclic-Graph-Given-a-directed-graph-G-V-E-pick-a-maximum-cardinality-set-of-edges-from-E-such-that-the-resulting-graph-is-acyclic"><a href="#Acyclic-Graph-Given-a-directed-graph-G-V-E-pick-a-maximum-cardinality-set-of-edges-from-E-such-that-the-resulting-graph-is-acyclic" class="headerlink" title="Acyclic Graph. Given a directed graph G=(V,E), pick a maximum cardinality set of edges from E such that the resulting graph is acyclic."></a>Acyclic Graph. Given a directed graph G=(V,E), pick a maximum cardinality set of edges from E such that the resulting graph is acyclic.</h3><ul><li>1/2-Approximation Algorithm</li><li>Fix any ordering of the vertices and split E into forward edges and backward edges. Each set is acyclic, so keep the larger one; it has at least |E|/2 ≥ OPT/2 edges.</li></ul><h3 id="Min-Max-Matching"><a href="#Min-Max-Matching" class="headerlink" title="Min Max Matching"></a>Min Max Matching</h3><ul><li>2-Approximation Algorithm: any maximal matching has at most twice as many edges as a minimum maximal matching.<br><a href="https://www.youtube.com/watch?v=wWk1EmV52Ks&t=86s">Matching</a></li></ul><h3 id="LPT-scheduling-longest-processing-time"><a href="#LPT-scheduling-longest-processing-time" class="headerlink" title="LPT(scheduling) longest processing time"></a>LPT(scheduling) longest processing time</h3><ul><li><a href="https://www.yatming.net/2018/01/03/Machine-Scheduling-Machine-Learning-in-Engineering/">LPT</a></li><li>Assume t1 ≥ …. 
≥ tn.</li><li>Assume w.l.o.g. that the smallest job finishes last.</li><li>If tn ≤ T*/3 then T ≤ 4/3 T*.</li><li>If tn > T*/3 then each machine can process at most 2 jobs in OPT, and in that case LPT is optimal.</li></ul><h3 id="K-center"><a href="#K-center" class="headerlink" title="K-center"></a>K-center</h3><ul><li>Let r* be the optimal radius. The algorithm achieves radius at most 2 * r*, so it is a 2-approximation algorithm.</li><li>Each OPT cluster has radius r*, so any two nodes in the same OPT cluster are at most 2 * r* apart. We cannot guarantee that we pick the optimal center of each cluster, but as long as we pick some node inside an OPT cluster, every node in that cluster is at most 2 * r* away from it.<br><a href="http://staff.ustc.edu.cn/~huding/data_pdf/聚类.pdf">link</a></li></ul><h2 id="10-Approximation-Algorithm-2"><a href="#10-Approximation-Algorithm-2" class="headerlink" title="10.Approximation Algorithm 2"></a>10.Approximation Algorithm 2</h2><h3 id="TSP"><a href="#TSP" class="headerlink" title="TSP"></a>TSP</h3><ul><li>Christofides’ algorithm<ul><li>Find an MST T (minimum spanning tree, e.g. by Prim’s algorithm).</li><li>Find the set O of odd-degree vertices in T.</li><li>Compute a minimum-cost perfect matching M on O.</li><li>Construct an Euler tour 𝞃 of T ∪ M.</li><li>Shortcut 𝞃 so that each vertex is visited only once (𝞃’); by the triangle inequality (a + b ≥ c) this does not increase the length.</li></ul></li><li>length(𝞃’) ≤ length(𝞃) = cost(T) + cost(M) ≤ OPT + cost(M).</li><li>cost(M) ≤ OPT/2.<ul><li><mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.357ex;" xmlns="http://www.w3.org/2000/svg" width="5.71ex" height="1.95ex" role="img" focusable="false" viewBox="0 -704 2523.9 861.8"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mi"><path data-c="1D442" d="M740 435Q740 320 676 213T511 42T304 -22Q207 -22 138 35T51 201Q50 209 50 244Q50 346 98 438T227 601Q351 704 476 704Q514 704 524 703Q621 689 680 617T740 435ZM637 476Q637 565 591 615T476 665Q396 665 322 605Q242 542 200 428T157 216Q157 126 200 
73T314 19Q404 19 485 98T608 313Q637 408 637 476Z"></path></g><g data-mml-node="mi" transform="translate(763,0)"><path data-c="1D443" d="M287 628Q287 635 230 637Q206 637 199 638T192 648Q192 649 194 659Q200 679 203 681T397 683Q587 682 600 680Q664 669 707 631T751 530Q751 453 685 389Q616 321 507 303Q500 302 402 301H307L277 182Q247 66 247 59Q247 55 248 54T255 50T272 48T305 46H336Q342 37 342 35Q342 19 335 5Q330 0 319 0Q316 0 282 1T182 2Q120 2 87 2T51 1Q33 1 33 11Q33 13 36 25Q40 41 44 43T67 46Q94 46 127 49Q141 52 146 61Q149 65 218 339T287 628ZM645 554Q645 567 643 575T634 597T609 619T560 635Q553 636 480 637Q463 637 445 637T416 636T404 636Q391 635 386 627Q384 621 367 550T332 412T314 344Q314 342 395 342H407H430Q542 342 590 392Q617 419 631 471T645 554Z"></path></g><g data-mml-node="msub" transform="translate(1514,0)"><g data-mml-node="mi"><path data-c="1D447" d="M40 437Q21 437 21 445Q21 450 37 501T71 602L88 651Q93 669 101 677H569H659Q691 677 697 676T704 667Q704 661 687 553T668 444Q668 437 649 437Q640 437 637 437T631 442L629 445Q629 451 635 490T641 551Q641 586 628 604T573 629Q568 630 515 631Q469 631 457 630T439 622Q438 621 368 343T298 60Q298 48 386 46Q418 46 427 45T436 36Q436 31 433 22Q429 4 424 1L422 0Q419 0 415 0Q410 0 363 1T228 2Q99 2 64 0H49Q43 6 43 9T45 27Q49 40 55 46H83H94Q174 46 189 55Q190 56 191 56Q196 59 201 76T241 233Q258 301 269 344Q339 619 339 625Q339 630 310 630H279Q212 630 191 624Q146 614 121 583T67 467Q60 445 57 441T43 437H40Z"></path></g><g data-mml-node="mi" transform="translate(617,-150) scale(0.707)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g></g></g></g></svg></mjx-container> = OPT restricted to O.</li><li><mjx-container class="MathJax" jax="SVG"><svg 
style="vertical-align: -0.357ex;" xmlns="http://www.w3.org/2000/svg" width="5.71ex" height="1.95ex" role="img" focusable="false" viewBox="0 -704 2523.9 861.8"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mi"><path data-c="1D442" d="M740 435Q740 320 676 213T511 42T304 -22Q207 -22 138 35T51 201Q50 209 50 244Q50 346 98 438T227 601Q351 704 476 704Q514 704 524 703Q621 689 680 617T740 435ZM637 476Q637 565 591 615T476 665Q396 665 322 605Q242 542 200 428T157 216Q157 126 200 73T314 19Q404 19 485 98T608 313Q637 408 637 476Z"></path></g><g data-mml-node="mi" transform="translate(763,0)"><path data-c="1D443" d="M287 628Q287 635 230 637Q206 637 199 638T192 648Q192 649 194 659Q200 679 203 681T397 683Q587 682 600 680Q664 669 707 631T751 530Q751 453 685 389Q616 321 507 303Q500 302 402 301H307L277 182Q247 66 247 59Q247 55 248 54T255 50T272 48T305 46H336Q342 37 342 35Q342 19 335 5Q330 0 319 0Q316 0 282 1T182 2Q120 2 87 2T51 1Q33 1 33 11Q33 13 36 25Q40 41 44 43T67 46Q94 46 127 49Q141 52 146 61Q149 65 218 339T287 628ZM645 554Q645 567 643 575T634 597T609 619T560 635Q553 636 480 637Q463 637 445 637T416 636T404 636Q391 635 386 627Q384 621 367 550T332 412T314 344Q314 342 395 342H407H430Q542 342 590 392Q617 419 631 471T645 554Z"></path></g><g data-mml-node="msub" transform="translate(1514,0)"><g data-mml-node="mi"><path data-c="1D447" d="M40 437Q21 437 21 445Q21 450 37 501T71 602L88 651Q93 669 101 677H569H659Q691 677 697 676T704 667Q704 661 687 553T668 444Q668 437 649 437Q640 437 637 437T631 442L629 445Q629 451 635 490T641 551Q641 586 628 604T573 629Q568 630 515 631Q469 631 457 630T439 622Q438 621 368 343T298 60Q298 48 386 46Q418 46 427 45T436 36Q436 31 433 22Q429 4 424 1L422 0Q419 0 415 0Q410 0 363 1T228 2Q99 2 64 0H49Q43 6 43 9T45 27Q49 40 55 46H83H94Q174 46 189 55Q190 56 191 56Q196 59 201 76T241 233Q258 301 269 344Q339 619 339 625Q339 630 310 630H279Q212 630 191 624Q146 614 121 583T67 467Q60 445 57 441T43 
437H40Z"></path></g><g data-mml-node="mi" transform="translate(617,-150) scale(0.707)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g></g></g></g></svg></mjx-container> ≤ OPT.</li><li>can partition <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.357ex;" xmlns="http://www.w3.org/2000/svg" width="5.71ex" height="1.95ex" role="img" focusable="false" viewBox="0 -704 2523.9 861.8"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mi"><path data-c="1D442" d="M740 435Q740 320 676 213T511 42T304 -22Q207 -22 138 35T51 201Q50 209 50 244Q50 346 98 438T227 601Q351 704 476 704Q514 704 524 703Q621 689 680 617T740 435ZM637 476Q637 565 591 615T476 665Q396 665 322 605Q242 542 200 428T157 216Q157 126 200 73T314 19Q404 19 485 98T608 313Q637 408 637 476Z"></path></g><g data-mml-node="mi" transform="translate(763,0)"><path data-c="1D443" d="M287 628Q287 635 230 637Q206 637 199 638T192 648Q192 649 194 659Q200 679 203 681T397 683Q587 682 600 680Q664 669 707 631T751 530Q751 453 685 389Q616 321 507 303Q500 302 402 301H307L277 182Q247 66 247 59Q247 55 248 54T255 50T272 48T305 46H336Q342 37 342 35Q342 19 335 5Q330 0 319 0Q316 0 282 1T182 2Q120 2 87 2T51 1Q33 1 33 11Q33 13 36 25Q40 41 44 43T67 46Q94 46 127 49Q141 52 146 61Q149 65 218 339T287 628ZM645 554Q645 567 643 575T634 597T609 619T560 635Q553 636 480 637Q463 637 445 637T416 636T404 636Q391 635 386 627Q384 621 367 550T332 412T314 344Q314 342 395 342H407H430Q542 342 590 392Q617 419 631 471T645 554Z"></path></g><g data-mml-node="msub" transform="translate(1514,0)"><g data-mml-node="mi"><path data-c="1D447" d="M40 437Q21 437 21 445Q21 450 37 501T71 
602L88 651Q93 669 101 677H569H659Q691 677 697 676T704 667Q704 661 687 553T668 444Q668 437 649 437Q640 437 637 437T631 442L629 445Q629 451 635 490T641 551Q641 586 628 604T573 629Q568 630 515 631Q469 631 457 630T439 622Q438 621 368 343T298 60Q298 48 386 46Q418 46 427 45T436 36Q436 31 433 22Q429 4 424 1L422 0Q419 0 415 0Q410 0 363 1T228 2Q99 2 64 0H49Q43 6 43 9T45 27Q49 40 55 46H83H94Q174 46 189 55Q190 56 191 56Q196 59 201 76T241 233Q258 301 269 344Q339 619 339 625Q339 630 310 630H279Q212 630 191 624Q146 614 121 583T67 467Q60 445 57 441T43 437H40Z"></path></g><g data-mml-node="mi" transform="translate(617,-150) scale(0.707)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g></g></g></g></svg></mjx-container> into two perfect matchings <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.339ex;" xmlns="http://www.w3.org/2000/svg" width="2.714ex" height="1.932ex" role="img" focusable="false" viewBox="0 -704 1199.6 854"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D442" d="M740 435Q740 320 676 213T511 42T304 -22Q207 -22 138 35T51 201Q50 209 50 244Q50 346 98 438T227 601Q351 704 476 704Q514 704 524 703Q621 689 680 617T740 435ZM637 476Q637 565 591 615T476 665Q396 665 322 605Q242 542 200 428T157 216Q157 126 200 73T314 19Q404 19 485 98T608 313Q637 408 637 476Z"></path></g><g data-mml-node="mn" transform="translate(796,-150) scale(0.707)"><path data-c="31" d="M213 578L200 573Q186 568 160 563T102 556H83V602H102Q149 604 189 617T245 641T273 663Q275 666 285 666Q294 666 302 660V361L303 61Q310 54 315 52T339 48T401 46H427V0H416Q395 3 257 3Q121 3 100 
0H88V46H114Q136 46 152 46T177 47T193 50T201 52T207 57T213 61V578Z"></path></g></g></g></g></svg></mjx-container> and <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.339ex;" xmlns="http://www.w3.org/2000/svg" width="2.714ex" height="1.932ex" role="img" focusable="false" viewBox="0 -704 1199.6 854"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D442" d="M740 435Q740 320 676 213T511 42T304 -22Q207 -22 138 35T51 201Q50 209 50 244Q50 346 98 438T227 601Q351 704 476 704Q514 704 524 703Q621 689 680 617T740 435ZM637 476Q637 565 591 615T476 665Q396 665 322 605Q242 542 200 428T157 216Q157 126 200 73T314 19Q404 19 485 98T608 313Q637 408 637 476Z"></path></g><g data-mml-node="mn" transform="translate(796,-150) scale(0.707)"><path data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g></g></g></g></svg></mjx-container>.</li><li>cost(M) ≤ min(cost(<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.339ex;" xmlns="http://www.w3.org/2000/svg" width="2.714ex" height="1.932ex" role="img" focusable="false" viewBox="0 -704 1199.6 854"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D442" d="M740 435Q740 320 676 213T511 42T304 -22Q207 -22 138 35T51 201Q50 209 50 244Q50 346 98 438T227 601Q351 704 476 704Q514 704 524 703Q621 689 680 617T740 435ZM637 476Q637 565 591 615T476 665Q396 665 
322 605Q242 542 200 428T157 216Q157 126 200 73T314 19Q404 19 485 98T608 313Q637 408 637 476Z"></path></g><g data-mml-node="mn" transform="translate(796,-150) scale(0.707)"><path data-c="31" d="M213 578L200 573Q186 568 160 563T102 556H83V602H102Q149 604 189 617T245 641T273 663Q275 666 285 666Q294 666 302 660V361L303 61Q310 54 315 52T339 48T401 46H427V0H416Q395 3 257 3Q121 3 100 0H88V46H114Q136 46 152 46T177 47T193 50T201 52T207 57T213 61V578Z"></path></g></g></g></g></svg></mjx-container>), cost(<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.339ex;" xmlns="http://www.w3.org/2000/svg" width="2.714ex" height="1.932ex" role="img" focusable="false" viewBox="0 -704 1199.6 854"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D442" d="M740 435Q740 320 676 213T511 42T304 -22Q207 -22 138 35T51 201Q50 209 50 244Q50 346 98 438T227 601Q351 704 476 704Q514 704 524 703Q621 689 680 617T740 435ZM637 476Q637 565 591 615T476 665Q396 665 322 605Q242 542 200 428T157 216Q157 126 200 73T314 19Q404 19 485 98T608 313Q637 408 637 476Z"></path></g><g data-mml-node="mn" transform="translate(796,-150) scale(0.707)"><path data-c="32" d="M109 429Q82 429 66 447T50 491Q50 562 103 614T235 666Q326 666 387 610T449 465Q449 422 429 383T381 315T301 241Q265 210 201 149L142 93L218 92Q375 92 385 97Q392 99 409 186V189H449V186Q448 183 436 95T421 3V0H50V19V31Q50 38 56 46T86 81Q115 113 136 137Q145 147 170 174T204 211T233 244T261 278T284 308T305 340T320 369T333 401T340 431T343 464Q343 527 309 573T212 619Q179 619 154 602T119 569T109 550Q109 549 114 549Q132 549 151 535T170 489Q170 464 154 447T109 429Z"></path></g></g></g></g></svg></mjx-container>)) ≤ OPT/2.</li></ul></li><li>length(𝞃’) ≤ length(𝞃) = cost(T) + cost(M) ≤ OPT + OPT/2 = 3/2 OPT.</li><li>Christofides’ algorithm is a 3/2-approximation algorithm for TSP.<br><a 
href="https://www.youtube.com/watch?v=GiDsjIBOVoA&t=726s">link</a><br><a href="https://xujinzh.github.io/2021/11/19/hungarian-matching-algorithm/index.html">Hungarian Matching Algorithm</a><h3 id="Set-Cover"><a href="#Set-Cover" class="headerlink" title="Set Cover"></a>Set Cover</h3></li><li>greedy algorithm</li><li><mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -1.033ex;" xmlns="http://www.w3.org/2000/svg" width="4.78ex" height="2.763ex" role="img" focusable="false" viewBox="0 -764.8 2113 1221.4"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mfrac"><g data-mml-node="msub" transform="translate(687.8,451.6) scale(0.707)"><g data-mml-node="mi"><path data-c="1D464" d="M580 385Q580 406 599 424T641 443Q659 443 674 425T690 368Q690 339 671 253Q656 197 644 161T609 80T554 12T482 -11Q438 -11 404 5T355 48Q354 47 352 44Q311 -11 252 -11Q226 -11 202 -5T155 14T118 53T104 116Q104 170 138 262T173 379Q173 380 173 381Q173 390 173 393T169 400T158 404H154Q131 404 112 385T82 344T65 302T57 280Q55 278 41 278H27Q21 284 21 287Q21 293 29 315T52 366T96 418T161 441Q204 441 227 416T250 358Q250 340 217 250T184 111Q184 65 205 46T258 26Q301 26 334 87L339 96V119Q339 122 339 128T340 136T341 143T342 152T345 165T348 182T354 206T362 238T373 281Q402 395 406 404Q419 431 449 431Q468 431 475 421T483 402Q483 389 454 274T422 142Q420 131 420 107V100Q420 85 423 71T442 42T487 26Q558 26 600 148Q609 171 620 213T632 273Q632 306 619 325T593 357T580 385Z"></path></g><g data-mml-node="mi" transform="translate(749,-150) scale(0.707)"><path data-c="1D456" d="M184 600Q184 624 203 642T247 661Q265 661 277 649T290 619Q290 596 270 577T226 557Q211 557 198 567T184 600ZM21 287Q21 295 30 318T54 369T98 420T158 442Q197 442 223 419T250 357Q250 340 236 301T196 196T154 83Q149 61 149 51Q149 26 166 26Q175 26 185 29T208 43T235 78T260 137Q263 149 265 151T282 153Q302 153 302 143Q302 135 293 112T268 61T223 11T161 -11Q129 -11 102 10T74 
74Q74 91 79 106T122 220Q160 321 166 341T173 380Q173 404 156 404H154Q124 404 99 371T61 287Q60 286 59 284T58 281T56 279T53 278T49 278T41 278H27Q21 284 21 287Z"></path></g></g><g data-mml-node="mrow" transform="translate(220,-345) scale(0.707)"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D446" d="M308 24Q367 24 416 76T466 197Q466 260 414 284Q308 311 278 321T236 341Q176 383 176 462Q176 523 208 573T273 648Q302 673 343 688T407 704H418H425Q521 704 564 640Q565 640 577 653T603 682T623 704Q624 704 627 704T632 705Q645 705 645 698T617 577T585 459T569 456Q549 456 549 465Q549 471 550 475Q550 478 551 494T553 520Q553 554 544 579T526 616T501 641Q465 662 419 662Q362 662 313 616T263 510Q263 480 278 458T319 427Q323 425 389 408T456 390Q490 379 522 342T554 242Q554 216 546 186Q541 164 528 137T492 78T426 18T332 -20Q320 -22 298 -22Q199 -22 144 33L134 44L106 13Q83 -14 78 -18T65 -22Q52 -22 52 -14Q52 -11 110 221Q112 227 130 227H143Q149 221 149 216Q149 214 148 207T144 186T142 153Q144 114 160 87T203 47T255 29T308 24Z"></path></g><g data-mml-node="mi" transform="translate(646,-150) scale(0.707)"><path data-c="1D456" d="M184 600Q184 624 203 642T247 661Q265 661 277 649T290 619Q290 596 270 577T226 557Q211 557 198 567T184 600ZM21 287Q21 295 30 318T54 369T98 420T158 442Q197 442 223 419T250 357Q250 340 236 301T196 196T154 83Q149 61 149 51Q149 26 166 26Q175 26 185 29T208 43T235 78T260 137Q263 149 265 151T282 153Q302 153 302 143Q302 135 293 112T268 61T223 11T161 -11Q129 -11 102 10T74 74Q74 91 79 106T122 220Q160 321 166 341T173 380Q173 404 156 404H154Q124 404 99 371T61 287Q60 286 59 284T58 281T56 279T53 278T49 278T41 278H27Q21 284 21 287Z"></path></g></g><g data-mml-node="mo" transform="translate(940,0)"><path data-c="2229" d="M88 -21T75 -21T55 -7V200Q55 231 55 280Q56 414 60 428Q61 430 61 431Q77 500 152 549T332 598Q443 598 522 544T610 405Q611 399 611 194V-7Q604 -22 591 -22Q582 -22 572 -9L570 405Q563 433 556 449T529 485Q498 519 445 538T334 558Q251 558 179 518T96 401Q95 396 95 193V-7Q88 
-21 75 -21Z"></path></g><g data-mml-node="mi" transform="translate(1607,0)"><path data-c="1D445" d="M230 637Q203 637 198 638T193 649Q193 676 204 682Q206 683 378 683Q550 682 564 680Q620 672 658 652T712 606T733 563T739 529Q739 484 710 445T643 385T576 351T538 338L545 333Q612 295 612 223Q612 212 607 162T602 80V71Q602 53 603 43T614 25T640 16Q668 16 686 38T712 85Q717 99 720 102T735 105Q755 105 755 93Q755 75 731 36Q693 -21 641 -21H632Q571 -21 531 4T487 82Q487 109 502 166T517 239Q517 290 474 313Q459 320 449 321T378 323H309L277 193Q244 61 244 59Q244 55 245 54T252 50T269 48T302 46H333Q339 38 339 37T336 19Q332 6 326 0H311Q275 2 180 2Q146 2 117 2T71 2T50 1Q33 1 33 10Q33 12 36 24Q41 43 46 45Q50 46 61 46H67Q94 46 127 49Q141 52 146 61Q149 65 218 339T287 628Q287 635 230 637ZM630 554Q630 586 609 608T523 636Q521 636 500 636T462 637H440Q393 637 386 627Q385 624 352 494T319 361Q319 360 388 360Q466 361 492 367Q556 377 592 426Q608 449 619 486T630 554Z"></path></g></g><rect width="1873" height="60" x="120" y="220"></rect></g></g></g></svg></mjx-container> (recomputed after every pick, because the set R of uncovered elements shrinks)<br><a href="https://www.cnblogs.com/xlucidator/p/17294349.html">set cover</a></li><li>OPT = 1+x (x is slightly bigger than 0)<ul><li>then cost = <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.357ex;" xmlns="http://www.w3.org/2000/svg" width="3.028ex" height="1.902ex" role="img" focusable="false" viewBox="0 -683 1338.3 840.8"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D43B" d="M228 637Q194 637 192 641Q191 643 191 649Q191 673 202 682Q204 683 219 683Q260 681 355 681Q389 681 418 681T463 682T483 682Q499 682 499 672Q499 670 497 658Q492 641 487 638H485Q483 638 480 638T473 638T464 637T455 637Q416 636 405 634T387 623Q384 619 355 500Q348 474 340 442T328 395L324 380Q324 378 469 378H614L615 381Q615 384 646 504Q674 619 674 627T617 637Q594 637 587 639T580 648Q580 650 582 660Q586 677 588 
679T604 682Q609 682 646 681T740 680Q802 680 835 681T871 682Q888 682 888 672Q888 645 876 638H874Q872 638 869 638T862 638T853 637T844 637Q805 636 794 634T776 623Q773 618 704 340T634 58Q634 51 638 51Q646 48 692 46H723Q729 38 729 37T726 19Q722 6 716 0H701Q664 2 567 2Q533 2 504 2T458 2T437 1Q420 1 420 10Q420 15 423 24Q428 43 433 45Q437 46 448 46H454Q481 46 514 49Q520 50 522 50T528 55T534 64T540 82T547 110T558 153Q565 181 569 198Q602 330 602 331T457 332H312L279 197Q245 63 245 58Q245 51 253 49T303 46H334Q340 38 340 37T337 19Q333 6 327 0H312Q275 2 178 2Q144 2 115 2T69 2T48 1Q31 1 31 10Q31 12 34 24Q39 43 44 45Q48 46 59 46H65Q92 46 125 49Q139 52 144 61Q147 65 216 339T285 628Q285 635 228 637Z"></path></g><g data-mml-node="mi" transform="translate(864,-150) scale(0.707)"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g></g></g></g></svg></mjx-container> (<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.357ex;" xmlns="http://www.w3.org/2000/svg" width="3.028ex" height="1.902ex" role="img" focusable="false" viewBox="0 -683 1338.3 840.8"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D43B" d="M228 637Q194 637 192 641Q191 643 191 649Q191 673 202 682Q204 683 219 683Q260 681 355 681Q389 681 418 681T463 682T483 682Q499 682 499 672Q499 670 497 658Q492 641 487 638H485Q483 638 480 638T473 638T464 637T455 
637Q416 636 405 634T387 623Q384 619 355 500Q348 474 340 442T328 395L324 380Q324 378 469 378H614L615 381Q615 384 646 504Q674 619 674 627T617 637Q594 637 587 639T580 648Q580 650 582 660Q586 677 588 679T604 682Q609 682 646 681T740 680Q802 680 835 681T871 682Q888 682 888 672Q888 645 876 638H874Q872 638 869 638T862 638T853 637T844 637Q805 636 794 634T776 623Q773 618 704 340T634 58Q634 51 638 51Q646 48 692 46H723Q729 38 729 37T726 19Q722 6 716 0H701Q664 2 567 2Q533 2 504 2T458 2T437 1Q420 1 420 10Q420 15 423 24Q428 43 433 45Q437 46 448 46H454Q481 46 514 49Q520 50 522 50T528 55T534 64T540 82T547 110T558 153Q565 181 569 198Q602 330 602 331T457 332H312L279 197Q245 63 245 58Q245 51 253 49T303 46H334Q340 38 340 37T337 19Q333 6 327 0H312Q275 2 178 2Q144 2 115 2T69 2T48 1Q31 1 31 10Q31 12 34 24Q39 43 44 45Q48 46 59 46H65Q92 46 125 49Q139 52 144 61Q147 65 216 339T285 628Q285 635 228 637Z"></path></g><g data-mml-node="mi" transform="translate(864,-150) scale(0.707)"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g></g></g></g></svg></mjx-container> is logn)</li></ul></li><li>in fact, cost = <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.357ex;" xmlns="http://www.w3.org/2000/svg" width="3.028ex" height="1.902ex" role="img" focusable="false" viewBox="0 -683 1338.3 840.8"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g 
data-mml-node="mi"><path data-c="1D43B" d="M228 637Q194 637 192 641Q191 643 191 649Q191 673 202 682Q204 683 219 683Q260 681 355 681Q389 681 418 681T463 682T483 682Q499 682 499 672Q499 670 497 658Q492 641 487 638H485Q483 638 480 638T473 638T464 637T455 637Q416 636 405 634T387 623Q384 619 355 500Q348 474 340 442T328 395L324 380Q324 378 469 378H614L615 381Q615 384 646 504Q674 619 674 627T617 637Q594 637 587 639T580 648Q580 650 582 660Q586 677 588 679T604 682Q609 682 646 681T740 680Q802 680 835 681T871 682Q888 682 888 672Q888 645 876 638H874Q872 638 869 638T862 638T853 637T844 637Q805 636 794 634T776 623Q773 618 704 340T634 58Q634 51 638 51Q646 48 692 46H723Q729 38 729 37T726 19Q722 6 716 0H701Q664 2 567 2Q533 2 504 2T458 2T437 1Q420 1 420 10Q420 15 423 24Q428 43 433 45Q437 46 448 46H454Q481 46 514 49Q520 50 522 50T528 55T534 64T540 82T547 110T558 153Q565 181 569 198Q602 330 602 331T457 332H312L279 197Q245 63 245 58Q245 51 253 49T303 46H334Q340 38 340 37T337 19Q333 6 327 0H312Q275 2 178 2Q144 2 115 2T69 2T48 1Q31 1 31 10Q31 12 34 24Q39 43 44 45Q48 46 59 46H65Q92 46 125 49Q139 52 144 61Q147 65 216 339T285 628Q285 635 228 637Z"></path></g><g data-mml-node="mi" transform="translate(864,-150) scale(0.707)"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 278 41 278H27Q21 284 21 287Z"></path></g></g></g></g></svg></mjx-container> * OPT</li><li>so this is a <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.357ex;" xmlns="http://www.w3.org/2000/svg" 
width="3.028ex" height="1.902ex" role="img" focusable="false" viewBox="0 -683 1338.3 840.8"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msub"><g data-mml-node="mi"><path data-c="1D43B" d="M228 637Q194 637 192 641Q191 643 191 649Q191 673 202 682Q204 683 219 683Q260 681 355 681Q389 681 418 681T463 682T483 682Q499 682 499 672Q499 670 497 658Q492 641 487 638H485Q483 638 480 638T473 638T464 637T455 637Q416 636 405 634T387 623Q384 619 355 500Q348 474 340 442T328 395L324 380Q324 378 469 378H614L615 381Q615 384 646 504Q674 619 674 627T617 637Q594 637 587 639T580 648Q580 650 582 660Q586 677 588 679T604 682Q609 682 646 681T740 680Q802 680 835 681T871 682Q888 682 888 672Q888 645 876 638H874Q872 638 869 638T862 638T853 637T844 637Q805 636 794 634T776 623Q773 618 704 340T634 58Q634 51 638 51Q646 48 692 46H723Q729 38 729 37T726 19Q722 6 716 0H701Q664 2 567 2Q533 2 504 2T458 2T437 1Q420 1 420 10Q420 15 423 24Q428 43 433 45Q437 46 448 46H454Q481 46 514 49Q520 50 522 50T528 55T534 64T540 82T547 110T558 153Q565 181 569 198Q602 330 602 331T457 332H312L279 197Q245 63 245 58Q245 51 253 49T303 46H334Q340 38 340 37T337 19Q333 6 327 0H312Q275 2 178 2Q144 2 115 2T69 2T48 1Q31 1 31 10Q31 12 34 24Q39 43 44 45Q48 46 59 46H65Q92 46 125 49Q139 52 144 61Q147 65 216 339T285 628Q285 635 228 637Z"></path></g><g data-mml-node="mi" transform="translate(864,-150) scale(0.707)"><path data-c="1D45B" d="M21 287Q22 293 24 303T36 341T56 388T89 425T135 442Q171 442 195 424T225 390T231 369Q231 367 232 367L243 378Q304 442 382 442Q436 442 469 415T503 336T465 179T427 52Q427 26 444 26Q450 26 453 27Q482 32 505 65T540 145Q542 153 560 153Q580 153 580 145Q580 144 576 130Q568 101 554 73T508 17T439 -10Q392 -10 371 17T350 73Q350 92 386 193T423 345Q423 404 379 404H374Q288 404 229 303L222 291L189 157Q156 26 151 16Q138 -11 108 -11Q95 -11 87 -5T76 7T74 17Q74 30 112 180T152 343Q153 348 153 366Q153 405 129 405Q91 405 66 305Q60 285 60 284Q58 
278 41 278H27Q21 284 21 287Z"></path></g></g></g></g></svg></mjx-container> approximation algorithm.<h2 id="11-External-Memory-1"><a href="#11-External-Memory-1" class="headerlink" title="11.External Memory 1"></a>11.External Memory 1</h2><h3 id="I-O-Model"><a href="#I-O-Model" class="headerlink" title="I/O Model"></a>I/O Model</h3><h3 id="Scanning"><a href="#Scanning" class="headerlink" title="Scanning"></a>Scanning</h3></li><li>Scanning. Given an array A of N values (stored in N/B blocks), process all values from left to right.</li><li>I/Os. O(N/B).<h3 id="Sorting"><a href="#Sorting" class="headerlink" title="Sorting"></a>Sorting</h3></li><li>Goal. Sort N elements in O($\frac{N}{B}\log_{\frac{M}{B}}\frac{N}{B}$) I/Os.</li><li>Solution in 3 steps.<ul><li>Base case.<ul><li>Partition N elements into N/M arrays of size M.</li><li>Load each into memory and sort.</li><li>I/Os. O(N/B)</li></ul></li><li>External multi-way merge.<ul><li>Multiway merge algorithm.<ul><li>Input is N elements in M/B sorted arrays.</li><li>Load the M/B first blocks (one per array) into memory and sort.</li><li>Output the B smallest elements.</li><li>Load more blocks into memory if needed.</li><li>Repeat.</li></ul></li><li>I/Os. O(N/B).</li></ul></li><li>External merge sort.<ul><li>Partition N elements into N/M arrays of size M. 
Load each into memory and sort.</li><li>Apply M/B-way external multiway merge repeatedly until a single sorted array remains.</li><li>I/Os.<ul><li>Sort N/M arrays: O(N/B) I/Os</li><li>Height of tree: O($\log_{\frac{M}{B}}\frac{N}{M}$)</li><li>Cost per level: O(N/B) I/Os.</li></ul></li></ul></li><li>Total I/Os: O($\frac{N}{B}\log_{\frac{M}{B}}\frac{N}{M}$) = O($\frac{N}{B}\log_{\frac{M}{B}}\frac{N}{B}$)</li></ul></li></ul><h3 id="Searching"><a href="#Searching" class="headerlink" title="Searching"></a>Searching</h3><ul><li>B-tree<ul><li>I/Os. O(<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.464ex;" xmlns="http://www.w3.org/2000/svg" width="6.262ex" height="2.034ex" role="img" focusable="false" viewBox="0 -694 2767.7 899"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mi"><path data-c="1D459" d="M117 59Q117 26 142 26Q179 26 205 131Q211 151 215 152Q217 153 225 153H229Q238 153 241 153T246 151T248 144Q247 138 245 128T234 90T214 43T183 6T137 -11Q101 -11 70 11T38 85Q38 97 39 102L104 360Q167 615 167 623Q167 626 166 628T162 632T157 634T149 635T141 636T132 637T122 637Q112 637 109 637T101 638T95 641T94 647Q94 649 96 661Q101 680 107 682T179 688Q194 689 213 690T243 693T254 694Q266 694 266 686Q266 675 193 386T118 83Q118 81 118 75T117 65V59Z"></path></g><g data-mml-node="mi" transform="translate(298,0)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g><g data-mml-node="msub" transform="translate(783,0)"><g data-mml-node="mi"><path data-c="1D454" d="M311 43Q296 30 267 15T206 0Q143 0 105 45T66 160Q66 265 143 353T314 442Q361 442 401 394L404 398Q406 401 409 404T418 412T431 419T447 422Q461 422 470 413T480 394Q480 379 423 152T363 -80Q345 -134 286 -169T151 -205Q10 -205 10 -137Q10 -111 28 -91T74 -71Q89 -71 102 -80T116 -111Q116 -121 114 -130T107 -144T99 -154T92 -162L90 -164H91Q101 -167 151 -167Q189 -167 211 -155Q234 -144 254 -122T282 -75Q288 -56 298 -13Q311 35 311 43ZM384 328L380 339Q377 350 375 354T369 
368T359 382T346 393T328 402T306 405Q262 405 221 352Q191 313 171 233T151 117Q151 38 213 38Q269 38 323 108L331 118L384 328Z"></path></g><g data-mml-node="mi" transform="translate(510,-150) scale(0.707)"><path data-c="1D435" d="M231 637Q204 637 199 638T194 649Q194 676 205 682Q206 683 335 683Q594 683 608 681Q671 671 713 636T756 544Q756 480 698 429T565 360L555 357Q619 348 660 311T702 219Q702 146 630 78T453 1Q446 0 242 0Q42 0 39 2Q35 5 35 10Q35 17 37 24Q42 43 47 45Q51 46 62 46H68Q95 46 128 49Q142 52 147 61Q150 65 219 339T288 628Q288 635 231 637ZM649 544Q649 574 634 600T585 634Q578 636 493 637Q473 637 451 637T416 636H403Q388 635 384 626Q382 622 352 506Q352 503 351 500L320 374H401Q482 374 494 376Q554 386 601 434T649 544ZM595 229Q595 273 572 302T512 336Q506 337 429 337Q311 337 310 336Q310 334 293 263T258 122L240 52Q240 48 252 48T333 46Q422 46 429 47Q491 54 543 105T595 229Z"></path></g></g><g data-mml-node="TeXAtom" data-mjx-texclass="ORD" transform="translate(1879.7,0)"><g data-mml-node="mi"><path data-c="1D441" d="M234 637Q231 637 226 637Q201 637 196 638T191 649Q191 676 202 682Q204 683 299 683Q376 683 387 683T401 677Q612 181 616 168L670 381Q723 592 723 606Q723 633 659 637Q635 637 635 648Q635 650 637 660Q641 676 643 679T653 683Q656 683 684 682T767 680Q817 680 843 681T873 682Q888 682 888 672Q888 650 880 642Q878 637 858 637Q787 633 769 597L620 7Q618 0 599 0Q585 0 582 2Q579 5 453 305L326 604L261 344Q196 88 196 79Q201 46 268 46H278Q284 41 284 38T282 19Q278 6 272 0H259Q228 2 151 2Q123 2 100 2T63 2T46 1Q31 1 31 10Q31 14 34 26T39 40Q41 46 62 46Q130 49 150 85Q154 91 221 362L289 634Q287 635 234 637Z"></path></g></g></g></g></svg></mjx-container>)</li></ul></li></ul><h2 id="12-External-Memory-2"><a href="#12-External-Memory-2" class="headerlink" title="12.External Memory 2"></a>12.External Memory 2</h2><h3 id="Access-Path-Traversal"><a href="#Access-Path-Traversal" class="headerlink" title="Access Path Traversal"></a>Access Path Traversal</h3><ul><li>I/O intuition.<ul><li>Flush moves 
Θ(B) messages together in O(1) I/Os.</li><li>A message moves at most P times.</li><li>⇒ O(P/B + 1) = O(P/B) amortized I/Os.<h3 id="Bε-trees"><a href="#Bε-trees" class="headerlink" title="Bε-trees"></a>Bε-trees</h3></li></ul></li><li>A B-tree that buffers some updates at each node.</li><li>ε ∈ (0, 1] is a parameter.<ul><li>Solution in 2 steps.</li><li>Focus on <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.206ex;" xmlns="http://www.w3.org/2000/svg" width="3.647ex" height="2.398ex" role="img" focusable="false" viewBox="0 -969 1612 1060"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msqrt"><g transform="translate(853,0)"><g data-mml-node="mi"><path data-c="1D435" d="M231 637Q204 637 199 638T194 649Q194 676 205 682Q206 683 335 683Q594 683 608 681Q671 671 713 636T756 544Q756 480 698 429T565 360L555 357Q619 348 660 311T702 219Q702 146 630 78T453 1Q446 0 242 0Q42 0 39 2Q35 5 35 10Q35 17 37 24Q42 43 47 45Q51 46 62 46H68Q95 46 128 49Q142 52 147 61Q150 65 219 339T288 628Q288 635 231 637ZM649 544Q649 574 634 600T585 634Q578 636 493 637Q473 637 451 637T416 636H403Q388 635 384 626Q382 622 352 506Q352 503 351 500L320 374H401Q482 374 494 376Q554 386 601 434T649 544ZM595 229Q595 273 572 302T512 336Q506 337 429 337Q311 337 310 336Q310 334 293 263T258 122L240 52Q240 48 252 48T333 46Q422 46 429 47Q491 54 543 105T595 229Z"></path></g></g><g data-mml-node="mo" transform="translate(0,109)"><path data-c="221A" d="M95 178Q89 178 81 186T72 200T103 230T169 280T207 309Q209 311 212 311H213Q219 311 227 294T281 177Q300 134 312 108L397 -77Q398 -77 501 136T707 565T814 786Q820 800 834 800Q841 800 846 794T853 782V776L620 293L385 -193Q381 -200 366 -200Q357 -200 354 -197Q352 -195 256 15L160 225L144 214Q129 202 113 190T95 178Z"></path></g><rect width="759" height="60" x="853" y="849"></rect></g></g></g></svg></mjx-container>-tree (ε = 1/2).</li><li>Searching in O(<mjx-container class="MathJax" 
jax="SVG"><svg style="vertical-align: -0.464ex;" xmlns="http://www.w3.org/2000/svg" width="4.253ex" height="2.034ex" role="img" focusable="false" viewBox="0 -694 1879.7 899"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mi"><path data-c="1D459" d="M117 59Q117 26 142 26Q179 26 205 131Q211 151 215 152Q217 153 225 153H229Q238 153 241 153T246 151T248 144Q247 138 245 128T234 90T214 43T183 6T137 -11Q101 -11 70 11T38 85Q38 97 39 102L104 360Q167 615 167 623Q167 626 166 628T162 632T157 634T149 635T141 636T132 637T122 637Q112 637 109 637T101 638T95 641T94 647Q94 649 96 661Q101 680 107 682T179 688Q194 689 213 690T243 693T254 694Q266 694 266 686Q266 675 193 386T118 83Q118 81 118 75T117 65V59Z"></path></g><g data-mml-node="mi" transform="translate(298,0)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g><g data-mml-node="msub" transform="translate(783,0)"><g data-mml-node="mi"><path data-c="1D454" d="M311 43Q296 30 267 15T206 0Q143 0 105 45T66 160Q66 265 143 353T314 442Q361 442 401 394L404 398Q406 401 409 404T418 412T431 419T447 422Q461 422 470 413T480 394Q480 379 423 152T363 -80Q345 -134 286 -169T151 -205Q10 -205 10 -137Q10 -111 28 -91T74 -71Q89 -71 102 -80T116 -111Q116 -121 114 -130T107 -144T99 -154T92 -162L90 -164H91Q101 -167 151 -167Q189 -167 211 -155Q234 -144 254 -122T282 -75Q288 -56 298 -13Q311 35 311 43ZM384 328L380 339Q377 350 375 354T369 368T359 382T346 393T328 402T306 405Q262 405 221 352Q191 313 171 233T151 117Q151 38 213 38Q269 38 323 108L331 118L384 328Z"></path></g><g data-mml-node="mi" transform="translate(510,-150) scale(0.707)"><path data-c="1D435" d="M231 637Q204 637 199 638T194 649Q194 676 205 
682Q206 683 335 683Q594 683 608 681Q671 671 713 636T756 544Q756 480 698 429T565 360L555 357Q619 348 660 311T702 219Q702 146 630 78T453 1Q446 0 242 0Q42 0 39 2Q35 5 35 10Q35 17 37 24Q42 43 47 45Q51 46 62 46H68Q95 46 128 49Q142 52 147 61Q150 65 219 339T288 628Q288 635 231 637ZM649 544Q649 574 634 600T585 634Q578 636 493 637Q473 637 451 637T416 636H403Q388 635 384 626Q382 622 352 506Q352 503 351 500L320 374H401Q482 374 494 376Q554 386 601 434T649 544ZM595 229Q595 273 572 302T512 336Q506 337 429 337Q311 337 310 336Q310 334 293 263T258 122L240 52Q240 48 252 48T333 46Q422 46 429 47Q491 54 543 105T595 229Z"></path></g></g></g></g></svg></mjx-container> N) I/Os.</li><li>Updates in O((<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.464ex;" xmlns="http://www.w3.org/2000/svg" width="4.253ex" height="2.034ex" role="img" focusable="false" viewBox="0 -694 1879.7 899"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mi"><path data-c="1D459" d="M117 59Q117 26 142 26Q179 26 205 131Q211 151 215 152Q217 153 225 153H229Q238 153 241 153T246 151T248 144Q247 138 245 128T234 90T214 43T183 6T137 -11Q101 -11 70 11T38 85Q38 97 39 102L104 360Q167 615 167 623Q167 626 166 628T162 632T157 634T149 635T141 636T132 637T122 637Q112 637 109 637T101 638T95 641T94 647Q94 649 96 661Q101 680 107 682T179 688Q194 689 213 690T243 693T254 694Q266 694 266 686Q266 675 193 386T118 83Q118 81 118 75T117 65V59Z"></path></g><g data-mml-node="mi" transform="translate(298,0)"><path data-c="1D45C" d="M201 -11Q126 -11 80 38T34 156Q34 221 64 279T146 380Q222 441 301 441Q333 441 341 440Q354 437 367 433T402 417T438 387T464 338T476 268Q476 161 390 75T201 -11ZM121 120Q121 70 147 48T206 26Q250 26 289 58T351 142Q360 163 374 216T388 308Q388 352 370 375Q346 405 306 405Q243 405 195 347Q158 303 140 230T121 120Z"></path></g><g data-mml-node="msub" transform="translate(783,0)"><g data-mml-node="mi"><path data-c="1D454" d="M311 
43Q296 30 267 15T206 0Q143 0 105 45T66 160Q66 265 143 353T314 442Q361 442 401 394L404 398Q406 401 409 404T418 412T431 419T447 422Q461 422 470 413T480 394Q480 379 423 152T363 -80Q345 -134 286 -169T151 -205Q10 -205 10 -137Q10 -111 28 -91T74 -71Q89 -71 102 -80T116 -111Q116 -121 114 -130T107 -144T99 -154T92 -162L90 -164H91Q101 -167 151 -167Q189 -167 211 -155Q234 -144 254 -122T282 -75Q288 -56 298 -13Q311 35 311 43ZM384 328L380 339Q377 350 375 354T369 368T359 382T346 393T328 402T306 405Q262 405 221 352Q191 313 171 233T151 117Q151 38 213 38Q269 38 323 108L331 118L384 328Z"></path></g><g data-mml-node="mi" transform="translate(510,-150) scale(0.707)"><path data-c="1D435" d="M231 637Q204 637 199 638T194 649Q194 676 205 682Q206 683 335 683Q594 683 608 681Q671 671 713 636T756 544Q756 480 698 429T565 360L555 357Q619 348 660 311T702 219Q702 146 630 78T453 1Q446 0 242 0Q42 0 39 2Q35 5 35 10Q35 17 37 24Q42 43 47 45Q51 46 62 46H68Q95 46 128 49Q142 52 147 61Q150 65 219 339T288 628Q288 635 231 637ZM649 544Q649 574 634 600T585 634Q578 636 493 637Q473 637 451 637T416 636H403Q388 635 384 626Q382 622 352 506Q352 503 351 500L320 374H401Q482 374 494 376Q554 386 601 434T649 544ZM595 229Q595 273 572 302T512 336Q506 337 429 337Q311 337 310 336Q310 334 293 263T258 122L240 52Q240 48 252 48T333 46Q422 46 429 47Q491 54 543 105T595 229Z"></path></g></g></g></g></svg></mjx-container> N)/<mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: -0.206ex;" xmlns="http://www.w3.org/2000/svg" width="3.647ex" height="2.398ex" role="img" focusable="false" viewBox="0 -969 1612 1060"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="msqrt"><g transform="translate(853,0)"><g data-mml-node="mi"><path data-c="1D435" d="M231 637Q204 637 199 638T194 649Q194 676 205 682Q206 683 335 683Q594 683 608 681Q671 671 713 636T756 544Q756 480 698 429T565 360L555 357Q619 348 660 311T702 219Q702 146 630 78T453 1Q446 0 242 0Q42 0 39 
2Q35 5 35 10Q35 17 37 24Q42 43 47 45Q51 46 62 46H68Q95 46 128 49Q142 52 147 61Q150 65 219 339T288 628Q288 635 231 637ZM649 544Q649 574 634 600T585 634Q578 636 493 637Q473 637 451 637T416 636H403Q388 635 384 626Q382 622 352 506Q352 503 351 500L320 374H401Q482 374 494 376Q554 386 601 434T649 544ZM595 229Q595 273 572 302T512 336Q506 337 429 337Q311 337 310 336Q310 334 293 263T258 122L240 52Q240 48 252 48T333 46Q422 46 429 47Q491 54 543 105T595 229Z"></path></g></g><g data-mml-node="mo" transform="translate(0,109)"><path data-c="221A" d="M95 178Q89 178 81 186T72 200T103 230T169 280T207 309Q209 311 212 311H213Q219 311 227 294T281 177Q300 134 312 108L397 -77Q398 -77 501 136T707 565T814 786Q820 800 834 800Q841 800 846 794T853 782V776L620 293L385 -193Q381 -200 366 -200Q357 -200 354 -197Q352 -195 256 15L160 225L144 214Q129 202 113 190T95 178Z"></path></g><rect width="759" height="60" x="853" y="849"></rect></g></g></g></svg></mjx-container>) amortized.</li><li>Generalize to any ε.</li></ul></li></ul><h3 id="Range-Tree"><a href="#Range-Tree" class="headerlink" title="Range Tree"></a>Range Tree</h3><p>2D range tree reporting<br>using bridges (fractional cascading)<br><a href="https://ocw.mit.edu/courses/6-851-advanced-data-structures-spring-2012/resources/mit6_851s12_l3/">Range Tree, MIT 6.851</a></p><p><a href="https://www.youtube.com/watch?v=5a7EYVulN-w">Video on YouTube</a></p><h3 id="Lowest-Common-Ancestor-And-Level-Ancestor"><a href="#Lowest-Common-Ancestor-And-Level-Ancestor" class="headerlink" title="Lowest Common Ancestor And Level Ancestor"></a>Lowest Common Ancestor And Level Ancestor</h3><p>LCA => RMQ => <mjx-container class="MathJax" jax="SVG"><svg style="vertical-align: 0;" xmlns="http://www.w3.org/2000/svg" width="1.76ex" height="1.507ex" role="img" focusable="false" viewBox="0 -666 778 666"><g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)"><g data-mml-node="math"><g data-mml-node="mo"><path data-c="B1" d="M56 320T56 
333T70 353H369V502Q369 651 371 655Q376 666 388 666Q402 666 405 654T409 596V500V353H707Q722 345 722 333Q722 320 707 313H409V40H707Q722 32 722 20T707 0H70Q56 7 56 20T70 40H369V313H70Q56 320 56 333Z"></path></g></g></g></svg></mjx-container>1 RMQ<br><a href="https://www.youtube.com/watch?v=0rCFkuQS968">Lowest Common Ancestor And Level Ancestor</a></p>]]></content>
<categories>
<category> Algorithm </category>
</categories>
<tags>
<tag> algorithm </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture7: Debugging and Profiling</title>
<link href="/2024/07/11/missing-semester/missing-semester-lecture7/"/>
<url>/2024/07/11/missing-semester/missing-semester-lecture7/</url>
<content type="html"><![CDATA[<h4 id="Lecture-7-Debugging-and-Profiling"><a href="#Lecture-7-Debugging-and-Profiling" class="headerlink" title="Lecture 7: Debugging and Profiling"></a>Lecture 7: Debugging and Profiling</h4><p>Link: <a href="https://missing.csail.mit.edu/2020/debugging-profiling/">https://missing.csail.mit.edu/2020/debugging-profiling/</a></p><h3 id="Printf-debugging-and-Logging"><a href="#Printf-debugging-and-Logging" class="headerlink" title="Printf debugging and Logging"></a>Printf debugging and Logging</h3><ul><li>A first approach to debugging a program is to add print statements around where you have detected the problem, and keep iterating until you have extracted enough information to understand what is responsible for the issue.</li><li>A second approach is to use logging in your program, instead of ad hoc print statements. Logging is better than regular print statements for several reasons:<ul><li>You can log to files, sockets or even remote servers instead of standard output.</li><li>Logging supports severity levels (such as INFO, DEBUG, WARN, ERROR, etc.) that allow you to filter the output accordingly.</li><li>For new issues, there’s a fair chance that your logs will contain enough information to detect what is going wrong.</li></ul></li></ul><pre><code class="lang-bash">$ python logger.py
# Raw output as with just prints
$ python logger.py log
# Log formatted output
$ python logger.py log ERROR
# Print only ERROR levels and above
$ python logger.py color
# Color formatted output</code></pre><blockquote><p>One effective way to make logs more readable is to color code them. By now you have probably realized that your terminal uses colors to make things more readable. But how does it do it? Programs like <code>ls</code> or <code>grep</code> use <a href="https://en.wikipedia.org/wiki/ANSI_escape_code">ANSI escape codes</a>, which are special sequences of characters that tell your shell to change the color of the output. 
For example, executing <code>echo -e "\e[38;2;255;0;0mThis is red\e[0m"</code> prints the message <code>This is red</code> in red on your terminal, as long as it supports <a href="https://github.com/termstandard/colors#truecolor-support-in-output-devices">true color</a>. If your terminal doesn’t support this (e.g. macOS’s Terminal.app), you can use the more universally supported escape codes for 16 color choices, for example <code>echo -e "\e[31;1mThis is red\e[0m"</code>.</p></blockquote><h3 id="Third-party-logs"><a href="#Third-party-logs" class="headerlink" title="Third party logs"></a>Third party logs</h3><blockquote><p>Most programs write their own logs somewhere in your system. In UNIX systems, it is commonplace for programs to write their logs under <code>/var/log</code>. For instance, the <a href="https://www.nginx.com/">NGINX</a> webserver places its logs under <code>/var/log/nginx</code>. </p><p>More recently, systems have started using a <strong>system log</strong>, which is increasingly where all of your log messages go. Most (but not all) Linux systems use <code>systemd</code>, a system daemon that controls many things in your system such as which services are enabled and running. <code>systemd</code> places the logs under <code>/var/log/journal</code> in a specialized format and you can use the <a href="https://www.man7.org/linux/man-pages/man1/journalctl.1.html"><code>journalctl</code></a> command to display the messages. </p><p>Similarly, on macOS there is still <code>/var/log/system.log</code> but an increasing number of tools use the system log, which can be displayed with <a href="https://www.manpagez.com/man/1/log/"><code>log show</code></a>. 
</p><p>On most UNIX systems you can also use the <a href="https://www.man7.org/linux/man-pages/man1/dmesg.1.html"><code>dmesg</code></a> command to access the kernel log.</p><p>For logging under the system logs you can use the <a href="https://www.man7.org/linux/man-pages/man1/logger.1.html"><code>logger</code></a> shell program. Here’s an example of using <code>logger</code> and how to check that the entry made it to the system logs. Moreover, most programming languages have bindings for logging to the system log.</p></blockquote><pre><code class="lang-bash">logger "Hello Logs"
# On macOS
log show --last 1m | grep Hello
# On Linux
journalctl --since "1m ago" | grep Hello</code></pre><h3 id="Debugger"><a href="#Debugger" class="headerlink" title="Debugger"></a>Debugger</h3><p>When printf debugging is not enough you should use a debugger. Debuggers are programs that let you interact with the execution of a program, allowing the following:</p><ul><li>Halt execution of the program when it reaches a certain line.</li><li>Step through the program one instruction at a time.</li><li>Inspect values of variables after the program crashed.</li><li>Conditionally halt the execution when a given condition is met.</li><li>And many more advanced features.</li></ul><p>Many programming languages come with some form of debugger. 
In Python this is the Python Debugger <a href="https://docs.python.org/3/library/pdb.html"><code>pdb</code></a>.</p><p>Here is a brief description of some of the commands <code>pdb</code> supports:</p><ul><li><strong>l</strong>(ist) - Displays 11 lines around the current line or continues the previous listing.</li><li><strong>s</strong>(tep) - Execute the current line, stop at the first possible occasion.</li><li><strong>n</strong>(ext) - Continue execution until the next line in the current function is reached or it returns.</li><li><strong>b</strong>(reak) - Set a breakpoint (depending on the argument provided).</li><li><strong>p</strong>(rint) - Evaluate the expression in the current context and print its value. There’s also <strong>pp</strong> to display using <a href="https://docs.python.org/3/library/pprint.html"><code>pprint</code></a> instead.</li><li><strong>r</strong>(eturn) - Continue execution until the current function returns.</li><li><strong>q</strong>(uit) - Quit the debugger.</li></ul><p>Note that since Python is an interpreted language we can use the <code>pdb</code> shell to execute commands and to execute instructions. <a href="https://pypi.org/project/ipdb/"><code>ipdb</code></a> is an improved <code>pdb</code> that uses the <a href="https://ipython.org/"><code>IPython</code></a> REPL, enabling tab completion, syntax highlighting, better tracebacks, and better introspection while retaining the same interface as the <code>pdb</code> module.</p><p>For more low level programming you will probably want to look into <a href="https://www.gnu.org/software/gdb/"><code>gdb</code></a> (and its quality of life modification <a href="https://github.com/pwndbg/pwndbg"><code>pwndbg</code></a>) and <a href="https://lldb.llvm.org/"><code>lldb</code></a>. 
They are optimized for C-like language debugging but will let you probe pretty much any process and get its current machine state: registers, stack, program counter, etc.</p><h3 id="Specialized-Tools"><a href="#Specialized-Tools" class="headerlink" title="Specialized Tools"></a>Specialized Tools</h3><p>Even if what you are trying to debug is a black box binary there are tools that can help you with that. Whenever programs need to perform actions that only the kernel can, they use <a href="https://en.wikipedia.org/wiki/System_call">System Calls</a>. There are commands that let you trace the syscalls your program makes. On Linux there’s <a href="https://www.man7.org/linux/man-pages/man1/strace.1.html"><code>strace</code></a> and macOS and BSD have <a href="http://dtrace.org/blogs/about/"><code>dtrace</code></a>. <code>dtrace</code> can be tricky to use because it uses its own <code>D</code> language, but there is a wrapper called <a href="https://www.manpagez.com/man/1/dtruss/"><code>dtruss</code></a> that provides an interface more similar to <code>strace</code> (more details <a href="https://8thlight.com/blog/colin-jones/2015/11/06/dtrace-even-better-than-strace-for-osx.html">here</a>).</p><p>Below are some examples of using <code>strace</code> or <code>dtruss</code> to show <a href="https://www.man7.org/linux/man-pages/man2/stat.2.html"><code>stat</code></a> syscall traces for an execution of <code>ls</code>. For a deeper dive into <code>strace</code>, <a href="https://blogs.oracle.com/linux/strace-the-sysadmins-microscope-v2">this article</a> and <a href="https://jvns.ca/strace-zine-unfolded.pdf">this zine</a> are good reads.</p><pre><code class="lang-bash"># On Linux
sudo strace -e lstat ls -l > /dev/null
# On macOS
sudo dtruss -t lstat64_extended ls -l > /dev/null</code></pre><p>Under some circumstances, you may need to look at the network packets to figure out the issue in your program. 
Tools like <a href="https://www.man7.org/linux/man-pages/man1/tcpdump.1.html"><code>tcpdump</code></a> and <a href="https://www.wireshark.org/">Wireshark</a> are network packet analyzers that let you read the contents of network packets and filter them based on different criteria.</p><p>For web development, the Chrome/Firefox developer tools are quite handy. They feature a large number of tools, including:</p><ul><li>Source code - Inspect the HTML/CSS/JS source code of any website.</li><li>Live HTML, CSS, JS modification - Change the website content, styles and behavior to test (you can see for yourself that website screenshots are not valid proofs).</li><li>JavaScript shell - Execute commands in the JS REPL.</li><li>Network - Analyze the requests timeline.</li><li>Storage - Look into the Cookies and local application storage.</li></ul><blockquote><p>For some issues you do not need to run any code. For example, just by carefully looking at a piece of code you could realize that your loop variable is shadowing an already existing variable or function name; or that a program reads a variable before defining it. Here is where <a href="https://en.wikipedia.org/wiki/Static_program_analysis">static analysis</a> tools come into play. Static analysis programs take source code as input and analyze it using coding rules to reason about its correctness.</p><p>In the following Python snippet there are several mistakes. First, our loop variable <code>foo</code> shadows the previous definition of the function <code>foo</code>. We also wrote <code>baz</code> instead of <code>bar</code> in the last line, so the program will crash after completing the <code>sleep</code> call (which will take one minute).</p></blockquote><pre><code class="lang-bash">import time

def foo():
    return 42

for foo in range(5):
    print(foo)
bar = 1
bar *= 0.2
time.sleep(60)
print(baz)</code></pre><p>Static analysis tools can identify these kinds of issues. 
When we run <a href="https://pypi.org/project/pyflakes"><code>pyflakes</code></a> on the code we get the errors related to both bugs. <a href="http://mypy-lang.org/"><code>mypy</code></a> is another tool that can detect type checking issues. Here, <code>mypy</code> will warn us that <code>bar</code> is initially an <code>int</code> and is then cast to a <code>float</code>. Again, note that all these issues were detected without having to run the code.</p><pre><code class="lang-bash">$ pyflakes foobar.py
foobar.py:6: redefinition of unused 'foo' from line 3
foobar.py:11: undefined name 'baz'

$ mypy foobar.py
foobar.py:6: error: Incompatible types in assignment (expression has type "int", variable has type "Callable[[], Any]")
foobar.py:9: error: Incompatible types in assignment (expression has type "float", variable has type "int")
foobar.py:11: error: Name 'baz' is not defined
Found 3 errors in 1 file (checked 1 source file)</code></pre><p>In the shell tools lecture we covered <a href="https://www.shellcheck.net/"><code>shellcheck</code></a>, which is a similar tool for shell scripts.</p><p>Most editors and IDEs support displaying the output of these tools within the editor itself, highlighting the locations of warnings and errors. This is often called <strong>code linting</strong> and it can also be used to display other types of issues such as stylistic violations or insecure constructs.</p><p>In vim, the plugins <a href="https://vimawesome.com/plugin/ale"><code>ale</code></a> or <a href="https://vimawesome.com/plugin/syntastic"><code>syntastic</code></a> will let you do that. For Python, <a href="https://github.com/PyCQA/pylint"><code>pylint</code></a> and <a href="https://pypi.org/project/pep8/"><code>pep8</code></a> are examples of stylistic linters and <a href="https://pypi.org/project/bandit/"><code>bandit</code></a> is a tool designed to find common security issues. 
For other languages people have compiled comprehensive lists of useful static analysis tools, such as <a href="https://github.com/mre/awesome-static-analysis">Awesome Static Analysis</a> (you may want to take a look at the <em>Writing</em> section) and for linters there is <a href="https://github.com/caramelomartins/awesome-linters">Awesome Linters</a>.</p><p>Code formatters are a complementary tool to stylistic linting: <a href="https://github.com/psf/black"><code>black</code></a> for Python, <code>gofmt</code> for Go, <code>rustfmt</code> for Rust or <a href="https://prettier.io/"><code>prettier</code></a> for JavaScript, HTML and CSS. These tools autoformat your code so that it’s consistent with common stylistic patterns for the given programming language. Although you might be unwilling to give up stylistic control over your code, standardizing code format will help other people read your code and will make you better at reading other people’s (stylistically standardized) code.</p><h3 id="Profiling"><a href="#Profiling" class="headerlink" title="Profiling"></a>Profiling</h3><h4 id="Time"><a href="#Time" class="headerlink" title="Time"></a>Time</h4><p>Similarly to the debugging case, in many scenarios it can be enough to just print the time it took your code between two points. Here is an example in Python using the <a href="https://docs.python.org/3/library/time.html"><code>time</code></a> module.</p><pre><code class="lang-bash">import time, random
n = random.randint(1, 10) * 100

# Get current time
start = time.time()

# Do some work
print("Sleeping for {} ms".format(n))
time.sleep(n/1000)

# Compute time between start and now
print(time.time() - start)

# Output
# Sleeping for 500 ms
# 0.5713930130004883</code></pre><p>However, wall clock time can be misleading since your computer might be running other processes at the same time or waiting for events to happen. It is common for tools to make a distinction between <em>Real</em>, <em>User</em> and <em>Sys</em> time. 
In general, <em>User</em> + <em>Sys</em> tells you how much time your process actually spent on the CPU (more detailed explanation <a href="https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1">here</a>).</p><ul><li><em>Real</em> - Wall clock elapsed time from start to finish of the program, including the time taken by other processes and time taken while blocked (e.g. waiting for I/O or network)</li><li><em>User</em> - Amount of time spent on the CPU running user code</li><li><em>Sys</em> - Amount of time spent on the CPU running kernel code</li></ul><p>For example, try running a command that performs an HTTP request and prefixing it with <a href="https://www.man7.org/linux/man-pages/man1/time.1.html"><code>time</code></a>. Under a slow connection you might get an output like the one below. Here it took over 2 seconds for the request to complete but the process only took 15ms of CPU user time and 12ms of kernel CPU time.</p><pre><code class="lang-bash">$ time curl https://missing.csail.mit.edu &> /dev/null
real    0m2.561s
user    0m0.015s
sys     0m0.012s</code></pre><h4 id="Profiler"><a href="#Profiler" class="headerlink" title="Profiler"></a>Profiler</h4><h4 id="CPU"><a href="#CPU" class="headerlink" title="CPU"></a>CPU</h4><p>Most of the time when people refer to <em>profilers</em> they actually mean <em>CPU profilers</em>, which are the most common. There are two main types of CPU profilers: <em>tracing</em> and <em>sampling</em> profilers. Tracing profilers keep a record of every function call your program makes whereas sampling profilers probe your program periodically (commonly every millisecond) and record the program’s stack. They use these records to present aggregate statistics of what your program spent the most time doing. 
<a href="https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-">Here</a> is a good intro article if you want more detail on this topic.</p><p>Most programming languages have some sort of command line profiler that you can use to analyze your code. They often integrate with full fledged IDEs but for this lecture we are going to focus on the command line tools themselves.</p><p>In Python we can use the <code>cProfile</code> module to profile time per function call. Here is a simple example that implements a rudimentary grep in Python:</p><pre><code class="lang-bash">#!/usr/bin/env python

import sys, re

def grep(pattern, file):
    with open(file, 'r') as f:
        print(file)
        for i, line in enumerate(f.readlines()):
            pattern = re.compile(pattern)
            match = pattern.search(line)
            if match is not None:
                print("{}: {}".format(i, line), end="")

if __name__ == '__main__':
    times = int(sys.argv[1])
    pattern = sys.argv[2]
    for i in range(times):
        for file in sys.argv[3:]:
            grep(pattern, file)</code></pre><p>We can profile this code using the following command. Analyzing the output we can see that IO is taking most of the time and that compiling the regex takes a fair amount of time as well. 
Since the regex only needs to be compiled once, we can move the compilation out of the <code>for</code> loop.</p><pre><code class="lang-bash">$ python -m cProfile -s tottime grep.py 1000 '^(import|\s*def)[^,]*$' *.py

[omitted program output]

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   8000    0.266    0.000    0.292    0.000 {built-in method io.open}
   8000    0.153    0.000    0.894    0.000 grep.py:5(grep)
  17000    0.101    0.000    0.101    0.000 {built-in method builtins.print}
   8000    0.100    0.000    0.129    0.000 {method 'readlines' of '_io._IOBase' objects}
  93000    0.097    0.000    0.111    0.000 re.py:286(_compile)
  93000    0.069    0.000    0.069    0.000 {method 'search' of '_sre.SRE_Pattern' objects}
  93000    0.030    0.000    0.141    0.000 re.py:231(compile)
  17000    0.019    0.000    0.029    0.000 codecs.py:318(decode)
      1    0.017    0.017    0.911    0.911 grep.py:3(<module>)

[omitted lines]</code></pre><p>A caveat of Python’s <code>cProfile</code> profiler (and many profilers for that matter) is that they display time per function call. That can become unintuitive really fast, especially if you are using third party libraries in your code since internal function calls are also accounted for. A more intuitive way of displaying profiling information is to include the time taken per line of code, which is what <em>line profilers</em> do.</p><p>For instance, the following piece of Python code performs a request to the class website and parses the response to get all URLs in the page:</p><pre><code class="lang-bash">#!/usr/bin/env python
import requests
from bs4 import BeautifulSoup

# This is a decorator that tells line_profiler
# that we want to analyze this function
@profile
def get_urls():
    response = requests.get('https://missing.csail.mit.edu')
    s = BeautifulSoup(response.content, 'lxml')
    urls = []
    for url in s.find_all('a'):
        urls.append(url['href'])

if __name__ == '__main__':
    get_urls()</code></pre><p>If we used Python’s <code>cProfile</code> profiler we’d get over 2500 lines of output, and even with sorting it’d be hard to understand where the time is being spent. 
A quick run with <a href="https://github.com/pyutils/line_profiler"><code>line_profiler</code></a> shows the time taken per line:</p><pre><code class="lang-bash">$ kernprof -l -v a.py
Wrote profile results to urls.py.lprof
Timer unit: 1e-06 s

Total time: 0.636188 s
File: a.py
Function: get_urls at line 5

Line #  Hits         Time  Per Hit   % Time  Line Contents
==============================================================
 5                                           @profile
 6                                           def get_urls():
 7         1     613909.0 613909.0     96.5      response = requests.get('https://missing.csail.mit.edu')
 8         1      21559.0  21559.0      3.4      s = BeautifulSoup(response.content, 'lxml')
 9         1          2.0      2.0      0.0      urls = []
10        25        685.0     27.4      0.1      for url in s.find_all('a'):
11        24         33.0      1.4      0.0          urls.append(url['href'])</code></pre><h4 id="Memory"><a href="#Memory" class="headerlink" title="Memory"></a>Memory</h4><p>In languages like C or C++ memory leaks can cause your program to never release memory that it doesn’t need anymore. To help in the process of memory debugging you can use tools like <a href="https://valgrind.org/">Valgrind</a> that will help you identify memory leaks.</p><p>In garbage collected languages like Python it is still useful to use a memory profiler because as long as you have pointers to objects in memory they won’t be garbage collected. 
Here’s an example program and its associated output when running it with <a href="https://pypi.org/project/memory-profiler/">memory-profiler</a> (note the decorator like in <code>line_profiler</code>).</p><pre><code class="lang-bash">@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a

if __name__ == '__main__':
    my_func()

$ python -m memory_profiler example.py
Line #    Mem usage  Increment   Line Contents
==============================================
     3                           @profile
     4      5.97 MB    0.00 MB   def my_func():
     5     13.61 MB    7.64 MB       a = [1] * (10 ** 6)
     6    166.20 MB  152.59 MB       b = [2] * (2 * 10 ** 7)
     7     13.61 MB -152.59 MB       del b
     8     13.61 MB    0.00 MB       return a</code></pre><h3 id="Visualization"><a href="#Visualization" class="headerlink" title="Visualization"></a>Visualization</h3><p>Profiler output for real world programs will contain large amounts of information because of the inherent complexity of software projects. Humans are visual creatures and are quite terrible at reading large amounts of numbers and making sense of them. Thus there are many tools for displaying profiler output in an easier-to-parse way.</p><p>One common way to display CPU profiling information for sampling profilers is to use a <a href="http://www.brendangregg.com/flamegraphs.html">Flame Graph</a>, which will display a hierarchy of function calls across the Y axis and time taken proportional to the X axis. They are also interactive, letting you zoom into specific parts of the program and get their stack traces (try clicking on the image below).</p><p><img src="http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg" alt=""></p><p>Call graphs or control flow graphs display the relationships between subroutines within a program by including functions as nodes and function calls between them as directed edges. When coupled with profiling information such as the number of calls and time taken, call graphs can be quite useful for interpreting the flow of a program. 
In Python you can use the <a href="https://pycallgraph.readthedocs.io/"><code>pycallgraph</code></a> library to generate them.</p><p><img src="https://upload.wikimedia.org/wikipedia/commons/2/2f/A_Call_Graph_generated_by_pycallgraph.png" alt=""></p><h4 id="Resource-Monitoring"><a href="#Resource-Monitoring" class="headerlink" title="Resource Monitoring"></a>Resource Monitoring</h4><p>Sometimes, the first step towards analyzing the performance of your program is to understand what its actual resource consumption is. Programs often run slowly when they are resource constrained, e.g. without enough memory or on a slow network connection. There are a myriad of command line tools for probing and displaying different system resources like CPU usage, memory usage, network, disk usage and so on.</p><ul><li><strong>General Monitoring</strong> - Probably the most popular is <a href="https://htop.dev/"><code>htop</code></a>, which is an improved version of <a href="https://www.man7.org/linux/man-pages/man1/top.1.html"><code>top</code></a>. <code>htop</code> presents various statistics for the currently running processes on the system. <code>htop</code> has a myriad of options and keybinds, some useful ones are: <code><F6></code> to sort processes, <code>t</code> to show tree hierarchy and <code>H</code> to toggle threads. See also <a href="https://nicolargo.github.io/glances/"><code>glances</code></a> for a similar implementation with a great UI. 
For getting aggregate measures across all processes, <a href="http://dag.wiee.rs/home-made/dstat/"><code>dstat</code></a> is another nifty tool that computes real-time resource metrics for lots of different subsystems like I/O, networking, CPU utilization, context switches, &c.</li><li><strong>I/O operations</strong> - <a href="https://www.man7.org/linux/man-pages/man8/iotop.8.html"><code>iotop</code></a> displays live I/O usage information and is handy to check if a process is doing heavy disk I/O operations.</li><li><strong>Disk Usage</strong> - <a href="https://www.man7.org/linux/man-pages/man1/df.1.html"><code>df</code></a> displays metrics per partition and <a href="http://man7.org/linux/man-pages/man1/du.1.html"><code>du</code></a> displays <strong>d</strong>isk <strong>u</strong>sage per file for the current directory. In these tools the <code>-h</code> flag tells the program to print in <strong>h</strong>uman readable format. A more interactive version of <code>du</code> is <a href="https://dev.yorhel.nl/ncdu"><code>ncdu</code></a> which lets you navigate folders and delete files and folders as you navigate.</li><li><strong>Memory Usage</strong> - <a href="https://www.man7.org/linux/man-pages/man1/free.1.html"><code>free</code></a> displays the total amount of free and used memory in the system. Memory is also displayed in tools like <code>htop</code>.</li><li><strong>Open Files</strong> - <a href="https://www.man7.org/linux/man-pages/man8/lsof.8.html"><code>lsof</code></a> lists file information about files opened by processes. It can be quite useful for checking which process has opened a specific file.</li><li><strong>Network Connections and Config</strong> - <a href="https://www.man7.org/linux/man-pages/man8/ss.8.html"><code>ss</code></a> lets you monitor incoming and outgoing network packet statistics as well as interface statistics. A common use case of <code>ss</code> is figuring out what process is using a given port in a machine. 
For displaying routing, network devices and interfaces you can use <a href="http://man7.org/linux/man-pages/man8/ip.8.html"><code>ip</code></a>. Note that <code>netstat</code> and <code>ifconfig</code> have been deprecated in favor of the former tools respectively.</li><li><strong>Network Usage</strong> - <a href="https://github.com/raboof/nethogs"><code>nethogs</code></a> and <a href="http://www.ex-parrot.com/pdw/iftop/"><code>iftop</code></a> are good interactive CLI tools for monitoring network usage.</li></ul><p>If you want to test these tools you can also artificially impose loads on the machine using the <a href="https://linux.die.net/man/1/stress"><code>stress</code></a> command.</p><h4 id="Specialized-tools"><a href="#Specialized-tools" class="headerlink" title="Specialized tools"></a>Specialized tools</h4><p>Sometimes, black box benchmarking is all you need to determine what software to use. Tools like <a href="https://github.com/sharkdp/hyperfine"><code>hyperfine</code></a> let you quickly benchmark command line programs. For instance, in the shell tools and scripting lecture we recommended <code>fd</code> over <code>find</code>. We can use <code>hyperfine</code> to compare them in tasks we run often. E.g. in the example below <code>fd</code> was 20x faster than <code>find</code> on my machine.</p><pre><code class="lang-bash">$ hyperfine --warmup 3 'fd -e jpg' 'find . -iname "*.jpg"'
Benchmark #1: fd -e jpg
  Time (mean ± σ):      51.4 ms ±   2.9 ms    [User: 121.0 ms, System: 160.5 ms]
  Range (min … max):    44.2 ms …  60.1 ms    56 runs

Benchmark #2: find . -iname "*.jpg"
  Time (mean ± σ):      1.126 s ±  0.101 s    [User: 141.1 ms, System: 956.1 ms]
  Range (min … max):    0.975 s …  1.287 s    10 runs

Summary
  'fd -e jpg' ran
   21.89 ± 2.33 times faster than 'find . -iname "*.jpg"'</code></pre><p>As was the case for debugging, browsers also come with a fantastic set of tools for profiling webpage loading, letting you figure out where time is being spent (loading, rendering, scripting, &c). 
More info for <a href="https://profiler.firefox.com/docs/">Firefox</a> and <a href="https://developers.google.com/web/tools/chrome-devtools/rendering-tools">Chrome</a>.</p><p><strong>Shellcheck</strong></p><p><strong>PDB</strong></p><p><a href="https://wil.yegelwel.com/pdb-pm/">https://wil.yegelwel.com/pdb-pm/</a></p><p><a href="https://github.com/spiside/pdb-tutorial">https://github.com/spiside/pdb-tutorial</a></p><blockquote><p><code>l(ist)</code> and <code>ll</code> (long list): <code>ll</code> shows the source code for the current function or frame. Knowing which function you are in is much more useful than seeing an arbitrary 11 lines around your current position.</p><pre><code class="lang-bash">(Pdb) l
  4     def main():
  5         print("Add the values of the dice")
  6         print("It's really that easy")
  7         print("What are you doing with your life.")
  8         import pdb; pdb.set_trace()
  9  ->     GameRunner.run()
 10
 11
 12     if __name__ == "__main__":
 13         main()
[EOF]</code></pre><p>If we want to see the whole file, we can call the list function with the range 1 to 13 like so:</p><pre><code class="lang-bash">(Pdb) l 1, 13
  1     from dicegame.runner import GameRunner
  2
  3
  4     def main():
  5         print("Add the values of the dice")
  6         print("It's really that easy")
  7         print("What are you doing with your life.")
  8         import pdb; pdb.set_trace()
  9  ->     GameRunner.run()
 10
 11
 12     if __name__ == "__main__":
 13         main()</code></pre><p><code>s(tep)</code> Execute the current line, stop at the first possible occasion (either in a function that is called or in the current function).</p><pre><code class="lang-bash">(Pdb) s
--Call--
> /Users/Development/pdb-tutorial/dicegame/runner.py(21)run()
-> @classmethod</code></pre><pre><code class="lang-bash">(Pdb) l
 16             total = 0
 17             for die in self.dice:
 18                 total += 1
 19             return total
 20
 21  ->     @classmethod
 22         def run(cls):
 23             # Probably counts wins or something.
 24             # Great variable name, 10/10.
 25             c = 0
 26             while True:</code></pre><p><code>n(ext)</code> Continue execution until the next line in the current function is reached or it returns.</p><pre><code class="lang-bash">(Pdb) n
> /Users/Development/pdb-tutorial/dicegame/runner.py(27)run()
-> while True:
(Pdb) l
 21         @classmethod
 22         def run(cls):
 23             # Probably counts wins or something.
 24             # Great variable name, 10/10.
 25             c = 0
 26  ->         while True:
 27                 runner = cls()
 28
 29                 print("Round {}\n".format(runner.round))
 30
 31                 for die in runner.dice:</code></pre><p><code>b(reak)</code> </p><ul><li>Without argument, list all breaks.</li><li><p>With a line number argument, set a break at this line in the current file. With a function name, set a break at the first executable line of that function. If a second argument is present, it is a string specifying an expression which must evaluate to true before the breakpoint is honored.</p></li><li><p>The line number may be prefixed with a filename and a colon, to specify a breakpoint in another file (probably one that hasn’t been loaded yet). The file is searched for on sys.path; the .py suffix may be omitted.</p></li></ul><pre><code class="lang-bash">(Pdb) b 34
Breakpoint 1 at /Users/Development/pdb-tutorial/dicegame/runner.py(34)run()
(Pdb) c

[...] # prints some dice

> /Users/Development/pdb-tutorial/dicegame/runner.py(34)run()
-> guess = input("Sigh. What is your guess?: ")</code></pre><pre><code class="lang-bash">(Pdb) b
Num Type         Disp Enb   Where
1   breakpoint   keep yes   at /Users/Development/pdb-tutorial/dicegame/runner.py:34
    breakpoint already hit 1 time</code></pre><p><code>cl(ear)</code> To clear your break points, you can use the <code>cl(ear)</code> command followed by the breakpoint number which is found in the leftmost column of the above output. 
You can also clear all the breakpoints if you don’t provide any arguments to the <code>clear</code> command.</p><pre><code class="lang-bash">(Pdb) cl 1
Deleted breakpoint 1 at /Users/Development/pdb-tutorial/dicegame/runner.py:34</code></pre><p><code>r(eturn)</code> Continue execution until the current function returns.</p><pre><code class="lang-bash">(Pdb) r
--Return--
> /Users/Development/pdb-tutorial/dicegame/runner.py(19)answer()->5
-> return total
(Pdb) l
 14
 15         def answer(self):
 16             total = 0
 17             for die in self.dice:
 18                 total += 1
 19  ->         return total
 20
 21         @classmethod
 22         def run(cls):
 23             # Probably counts wins or something.
 24             # Great variable name, 10/10.
(Pdb)</code></pre><p><code>continue</code> The <code>continue</code> command instructs the debugger to resume execution of the program until the next breakpoint is encountered, an exception is raised, or the program terminates.</p><p><code>!c</code> If the program has a variable named <code>c</code>, typing <code>c</code> on its own would be interpreted as the <code>c(ontinue)</code> command. Prefix it with <code>!</code> so that pdb evaluates it as Python and prints the value of the variable <code>c</code>.</p><pre><code class="lang-bash">(Pdb) !c
0</code></pre><p><code>commands</code></p><pre><code class="lang-bash">commands [bpnumber]
(com) ...
(com) end
(Pdb)

Specify a list of commands for breakpoint number bpnumber.</code></pre><p><code>commands</code> will run python code or pdb commands that you specified whenever the stated breakpoint number is hit. Once you start the <code>commands</code> block, the prompt changes to <code>(com)</code>. The code/commands you write here function as if you had typed them at the <code>(Pdb)</code> prompt after getting to that breakpoint. Writing <code>end</code> will terminate the command and the prompt changes back to <code>(Pdb)</code> from <code>(com)</code>. I have found this of great use when I need to monitor certain variables inside of a loop as I don’t need to print the values of the variables repeatedly. Let’s see an example. 
Make sure to be at the root of the project in your terminal and type the following:</p><pre><code class="lang-bash">python -m pdb main.py</code></pre><p>Reach line <code>:8</code> and <code>s(tep)</code> into the <code>run()</code> method of the GameRunner class. Then, set up a breakpoint at <code>:17</code>.</p><pre><code class="lang-bash">> /Users/Development/pdb-tutorial/main.py(8)main()
-> GameRunner.run() #This is line 8 in main.py
(Pdb) s
--Call--
> /Users/Development/pdb-tutorial/dicegame/runner.py(21)run()
-> @classmethod
(Pdb) b 17
Breakpoint 4 at /Users/Development/pdb-tutorial/dicegame/runner.py:17</code></pre><p>This sets up the breakpoint, which has been given the number <code>4</code>, at the start of the loop inside the <code>answer()</code> method which is used to calculate the total values of the dice. Now, let’s use <code>commands</code> to print the value of the variable <code>total</code> when we hit this breakpoint.</p><pre><code class="lang-bash">(Pdb) commands 4
(com) print(f"The total value as of now is {total}")
(com) end</code></pre><p>We have now set up <code>commands</code> for breakpoint number 4 which will execute when we reach this breakpoint. Let us <code>c(ontinue)</code> and reach this breakpoint.</p><pre><code class="lang-bash">(Pdb) c

[...] # You will have to guess a number

The total value as of now is 0
> /Users/Development/pdb-tutorial/dicegame/runner.py(17)answer()
-> for die in self.dice:
(Pdb)</code></pre><p>We see that our print statement executed upon reaching this breakpoint. Let’s <code>c(ontinue)</code> again and see what happens.</p><pre><code class="lang-bash">(Pdb) c

[...]

The total value as of now is 1
> /Users/Development/pdb-tutorial/dicegame/runner.py(17)answer()
-> for die in self.dice:
(Pdb)</code></pre><p>The <code>commands</code> command executes upon reaching the breakpoint again. 
You can see how this might be useful, especially during loops.</p><h3 id="pdb-pm-Post-Mortem"><a href="#pdb-pm-Post-Mortem" class="headerlink" title="pdb.pm() Post Mortem"></a><code>pdb.pm()</code> Post Mortem</h3><pre><code>pdb.post_mortem(traceback=None)
    Enter post-mortem debugging of the given traceback object. If no traceback is given, it uses the one of the exception that is currently being handled (an exception must be being handled if the default is to be used).

pdb.pm()
    Enter post-mortem debugging of the traceback found in sys.last_traceback.</code></pre><p>While both methods may look the same, <code>post_mortem()</code> and <code>pm()</code> differ by the traceback they are given. I commonly use <code>post_mortem()</code> in the <code>except</code> block. However, we will cover the <code>pm()</code> method since I find it to be a bit more powerful. Let’s try and see how this works in practice.</p><p>Open up the python REPL by typing <code>python</code> in your shell in the root of this project. From there, let’s import the <code>main</code> function from the <code>main</code> module and import <code>pdb</code> as well. 
Play the game until we get the exception after trying to type <code>Y</code> to continue the game.</p><pre><code>>>> import pdb
>>> from main import main
>>> main()
[...]
Would you like to play again?[Y/n]: Y
Traceback (most recent call last):
  File "main.py", line 12, in <module>
    main()
  File "main.py", line 8, in main
    GameRunner.run()
  File "/Users/Development/pdb-tutorial/dicegame/runner.py", line 62, in run
    i_just_throw_an_exception()
  File "/Users/Development/pdb-tutorial/dicegame/utils.py", line 13, in i_just_throw_an_exception
    raise UnnecessaryError("You actually called this function...")
dicegame.utils.UnnecessaryError: You actually called this function...</code></pre><p>Now, let’s call the <code>pm()</code> method from the <code>pdb</code> module and see what happens.</p><pre><code>>>> pdb.pm()
> /Users/Development/pdb-tutorial/dicegame/utils.py(13)i_just_throw_an_exception()
-> raise UnnecessaryError("You actually called this function...")
(Pdb)</code></pre><p>Look at that! We recover from the point where the last exception was thrown and are placed in the <code>pdb</code> prompt. From here, we can examine the state the program was in before it crashed, which will help in the investigation.</p><p><strong>NB</strong>: You can also start the <code>main.py</code> script using <code>python -m pdb main.py</code> and <code>continue</code> until an exception is thrown. Python will automatically enter <code>post_mortem</code> mode at the uncaught exception.</p><p>links: <a href="https://wil.yegelwel.com/pdb-pm/">https://wil.yegelwel.com/pdb-pm/</a></p><h4 id="Python-Debugging-How-to-use-pdb-pm"><a href="#Python-Debugging-How-to-use-pdb-pm" class="headerlink" title="Python Debugging: How to use pdb.pm()"></a>Python Debugging: How to use pdb.pm()</h4><p>Python’s debugger, <a href="https://docs.python.org/3/library/pdb.html">pdb</a>, is a great tool and one of my favorite uses of pdb is <code>pdb.pm()</code>. 
This post shows how to use <code>pdb.pm()</code> by working through an example of a buggy implementation of mergesort.</p><p><code>pdb.pm()</code> launches <code>pdb</code> to debug the most recently raised exception. When you find yourself saying “Shit! That exception occurred four layers deep in code I wasn’t editing and setting up the state to reproduce this is going to take 20 minutes” or “I wish I could just debug right from where the exception was raised”, <code>pdb.pm()</code> is what you want.</p><p>As a working example, consider the following implementation of mergesort. It isn’t great: the code has a logic bug, duplicates memory and suffers from array expansion, but we will focus only on the logic bug.</p><pre><code class="lang-Python">def mergesort(xs):
    if len(xs) == 1:
        return xs
    else:
        left = mergesort(xs[:len(xs)//2])
        right = mergesort(xs[len(xs)//2:])
        ys = []
        left_i, right_i = 0, 0
        while left_i < len(left) or right_i < len(right):
            if left[left_i] <= right[right_i]:
                ys.append(left[left_i])
                left_i += 1
            else:
                ys.append(right[right_i])
                right_i += 1
        return ys</code></pre><p>The bug is in the while loop. If <code>left_i</code> is less than <code>len(left)</code> it will drop into the loop body, even if <code>right_i</code> is greater than or equal to <code>len(right)</code>. In that case, it will try to index out of the array bounds. 
Gasp!</p><p>If you try to use this code, you will get the error we expect.</p><pre><code class="lang-Python">In [2]: mergesort([6,5,4,3,2,1])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-30-15081790cedb> in <module>
----> 1 mergesort([6,5,4,3,2,1])

<ipython-input-28-cb4e97206eb8> in mergesort(xs)
      3         return xs
      4     else:
----> 5         left = mergesort(xs[:len(xs)//2])
      6         right = mergesort(xs[len(xs)//2:])
      7         ys = []

<ipython-input-28-cb4e97206eb8> in mergesort(xs)
      4     else:
      5         left = mergesort(xs[:len(xs)//2])
----> 6         right = mergesort(xs[len(xs)//2:])
      7         ys = []
      8         left_i, right_i = 0, 0

<ipython-input-28-cb4e97206eb8> in mergesort(xs)
      8         left_i, right_i = 0, 0
      9         while left_i < len(left) or right_i < len(right):
---> 10             if left[left_i] <= right[right_i]:
     11                 ys.append(left[left_i])
     12                 left_i += 1

IndexError: list index out of range</code></pre><p>While we already know what the bug is, we can use <code>pdb.pm()</code> to poke at the state to see what is going on. We can query for the state of variables to see what triggered the problem.</p><pre><code class="lang-Python">In [33]: import pdb; pdb.pm()
> <ipython-input-28-cb4e97206eb8>(10)mergesort()
-> if left[left_i] <= right[right_i]:
(Pdb) left_i, right_i
(0, 1)
(Pdb) left
[5]
(Pdb) right
[4]</code></pre><p>Ok, now we can see what happened. Both <code>left</code> and <code>right</code> had a single element but the code tried to index the non-existent second element in <code>right</code>.</p><p>If we needed to, we could step up the traceback to understand the state using the command <code>u</code> (for up). 
Then we can poke at the state higher in the stack as we did before.</p><pre><code class="lang-Python">(Pdb) u
> <ipython-input-28-cb4e97206eb8>(6)mergesort()
-> right = mergesort(xs[len(xs)//2:])
(Pdb) left
[6]
(Pdb) xs
[6, 5, 4]
(Pdb) xs[len(xs)//2:]
[5, 4]</code></pre><p>Debugging exceptions becomes a lot easier once you become comfortable with <code>pdb.pm()</code>.</p></blockquote><p><strong>pycallgraph</strong></p><p><strong>graphviz</strong></p><pre><code class="lang-bash">pycallgraph graphviz script.py</code></pre>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> debug </tag>
<tag> profiling </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture6: Version Control(git)</title>
<link href="/2024/04/27/missing-semester/missing-semester-lecture6/"/>
<url>/2024/04/27/missing-semester/missing-semester-lecture6/</url>
<content type="html"><![CDATA[<h3 id="Lecture-6-Version-Control-git"><a href="#Lecture-6-Version-Control-git" class="headerlink" title="Lecture 6: Version Control(git)"></a>Lecture 6: Version Control(git)</h3><p>link:<a href="https://missing.csail.mit.edu/2020/version-control/">https://missing.csail.mit.edu/2020/version-control/</a></p><pre><code class="lang-bash">31:42   git add .
        git commit -m <msg>
        git log
        git cat-file -p <8d-commit-hash / branch-hash / file-hash>
32:47   git commit -a
        git add :/
35:23   git log --all --graph --decorate
36:12   git status (shows what is staged, committed, or neither)
41:41   git checkout -f <8d-commit-hash / 8d-branch-hash> (switch to a commit or branch)
43:11   git diff hello.txt (show the changes in the file compared to the staging area)
43:28   git diff <8d-commit-hash> hello.txt (compared to that commit)
44:33   git diff HEAD hello.txt
46:22   git diff <8d-commit-hash> HEAD hello.txt (changes from that commit to HEAD)
50:06   git branch (print all the branches)
        git branch -vv (extra verbose)
50:17   git branch cat (create a new branch pointing at the commit HEAD currently points to)
        git checkout cat
53:56   git checkout -b dog (create branch dog and switch to it)
52:53   git log --all --graph --decorate --oneline (more compact view)
56:27   git merge cat (can merge several at once: cat dog)
58:51   git merge --abort
1:00:41 git add .
        git merge --continue (or git commit -m <msg> / git commit)
59:37   <<<<<<< HEAD marks the version from the last snapshot in the master branch,
        >>>>>>> dog marks the version from the branch you're trying to merge.
1:04:17 git init --bare (initialize an empty bare git repo in the current dir)
1:04:20 git remote add <remote name> <remote repository URL>
        git push <remote name> <local branch>:<remote branch>
1:07:54 git clone <url> <folder name>
1:10:33 git branch --set-upstream-to=origin/master
1:18:12 git blame .config.yml (who last edited each line, in which commit, and when)
        git show <commit hash> (to get line changes like git diff)
1:19:22 git stash (changes saved somewhere)
        git stash pop (get saved changes back)
1:20:46 git bisect
1:21:48 .gitignore (put file names or *.extension patterns in it)</code></pre><h2 id="Basics"><a href="#Basics" class="headerlink" title="Basics"></a>Basics</h2><ul><li><p><code>git help <command></code>: get help for a git command</p></li><li><p><code>git init</code>: creates a new git repo, with data stored in the <code>.git</code> directory</p></li><li><p><code>git status</code>: tells you what’s going on</p></li><li><p><code>git add <filename></code>: adds files to staging area</p></li><li><pre><code class="lang-plaintext">git commit</code></pre><p>: creates a new commit</p><ul><li>Write <a href="https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html">good commit messages</a>!</li><li>Even more reasons to write <a href="https://chris.beams.io/posts/git-commit/">good commit messages</a>!</li></ul></li><li><p><code>git log</code>: shows a flattened log of history</p></li><li><p><code>git log --all --graph --decorate</code>: visualizes history as a DAG</p></li><li><p><code>git diff <filename></code>: show changes you made relative to the staging area</p></li><li><p><code>git diff <revision> <filename></code>: shows differences in a file between snapshots</p></li><li><p><code>git checkout <revision></code>: updates HEAD and current branch</p></li></ul><h2 id="Branching-and-merging"><a 
href="#Branching-and-merging" class="headerlink" title="Branching and merging"></a>Branching and merging</h2><ul><li><p><code>git branch</code>: shows branches</p></li><li><p><code>git branch <name></code>: creates a branch</p></li><li><pre><code class="lang-plaintext">git checkout -b <name></code></pre><p>: creates a branch and switches to it</p><ul><li>same as <code>git branch <name>; git checkout <name></code></li></ul></li><li><p><code>git merge <revision></code>: merges into current branch</p></li><li><p><code>git mergetool</code>: use a fancy tool to help resolve merge conflicts</p></li><li><p><code>git rebase</code>: rebase set of patches onto a new base</p></li></ul><h2 id="Remotes"><a href="#Remotes" class="headerlink" title="Remotes"></a>Remotes</h2><ul><li><code>git remote</code>: list remotes</li><li><code>git remote add <name> <url></code>: add a remote</li><li><code>git push <remote> <local branch>:<remote branch></code>: send objects to remote, and update remote reference</li><li><code>git branch --set-upstream-to=<remote>/<remote branch></code>: set up correspondence between local and remote branch</li><li><code>git fetch</code>: retrieve objects/references from a remote</li><li><code>git pull</code>: same as <code>git fetch; git merge</code></li><li><code>git clone</code>: download repository from remote</li></ul><h2 id="Undo"><a href="#Undo" class="headerlink" title="Undo"></a>Undo</h2><ul><li><code>git commit --amend</code>: edit a commit’s contents/message</li><li><code>git reset HEAD <file></code>: unstage a file</li><li><code>git checkout -- <file></code>: discard changes</li></ul><h1 id="Advanced-Git"><a href="#Advanced-Git" class="headerlink" title="Advanced Git"></a>Advanced Git</h1><ul><li><code>git config</code>: Git is <a href="https://git-scm.com/docs/git-config">highly customizable</a></li><li><code>git clone --depth=1</code>: shallow clone, without entire version history</li><li><code>git add -p</code>: interactive 
staging</li><li><code>git rebase -i</code>: interactive rebasing</li><li><code>git blame</code>: show who last edited which line</li><li><code>git stash</code>: temporarily remove modifications to working directory</li><li><code>git bisect</code>: binary search history (e.g. for regressions)</li><li><code>.gitignore</code>: <a href="https://git-scm.com/docs/gitignore">specify</a> intentionally untracked files to ignore</li></ul>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> git </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture5: Command-line Environment</title>
<link href="/2024/04/23/missing-semester/missing-semester-lecture5/"/>
<url>/2024/04/23/missing-semester/missing-semester-lecture5/</url>
<content type="html"><![CDATA[<h3 id="Lecture-5-Command-line-Environment"><a href="#Lecture-5-Command-line-Environment" class="headerlink" title="Lecture 5: Command-line Environment"></a>Lecture 5: Command-line Environment</h3><p>link: <a href="https://missing.csail.mit.edu/2020/command-line/">https://missing.csail.mit.edu/2020/command-line/</a></p><pre><code class="lang-bash">sleep 20</code></pre><blockquote><p>sleep for 20 seconds</p></blockquote><p><code><C-C></code> : interrupt the command</p><blockquote><p>it sends SIGINT</p></blockquote><pre><code class="lang-bash">man signal</code></pre><blockquote><p>read about the available signals</p></blockquote><p><code><C-\></code>: SIGQUIT</p><pre><code class="lang-bash">kill -TERM <PID></code></pre><p>While <code>SIGINT</code> and <code>SIGQUIT</code> are both usually associated with terminal related requests, a more generic signal for asking a process to exit gracefully is the <code>SIGTERM</code> signal. To send this signal we can use the <a href="https://www.man7.org/linux/man-pages/man1/kill.1.html"><code>kill</code></a> command, with the syntax <code>kill -TERM <PID></code>.</p><p><code><C-Z></code>:SIGTSTP, short for Terminal Stop</p><p><code>bg</code></p><p><code>fg</code></p><p><code>jobs</code></p><p><code>kill</code></p><p><code>nohup</code>: To background an already running program you can do <code>Ctrl-Z</code> followed by <code>bg</code>. Note that backgrounded processes are still child processes of your terminal and will die if you close the terminal (this will send yet another signal, <code>SIGHUP</code>). To prevent that from happening you can run the program with <a href="https://www.man7.org/linux/man-pages/man1/nohup.1.html"><code>nohup</code></a> (a wrapper to ignore <code>SIGHUP</code>), or use <code>disown</code> if the process has already been started. 
Alternatively, you can use a terminal multiplexer as we will see in the next section.</p><pre><code class="lang-bash">$ sleep 1000
^Z
[1]  + 18653 suspended  sleep 1000

$ nohup sleep 2000 &
[2] 18745
appending output to nohup.out

$ jobs
[1]  + suspended  sleep 1000
[2]  - running    nohup sleep 2000

$ bg %1
[1]  - 18653 continued  sleep 1000

$ jobs
[1]  - running    sleep 1000
[2]  + running    nohup sleep 2000

$ kill -STOP %1
[1]  + 18653 suspended (signal)  sleep 1000

$ jobs
[1]  + suspended (signal)  sleep 1000
[2]  - running    nohup sleep 2000

$ kill -SIGHUP %1
[1]  + 18653 hangup     sleep 1000

$ jobs
[2]  + running    nohup sleep 2000

$ kill -SIGHUP %2

$ jobs
[2]  + running    nohup sleep 2000

$ kill %2
[2]  + 18745 terminated  nohup sleep 2000

$ jobs</code></pre><h3 id="tmux-Terminal-Multiplexers"><a href="#tmux-Terminal-Multiplexers" class="headerlink" title="tmux(Terminal Multiplexers)"></a>tmux(Terminal Multiplexers)</h3><ul><li>Sessions<ul><li>Windows(Tabs)<ul><li>Panes</li></ul></li></ul></li></ul><h3 id="split-panes-using-and"><a href="#split-panes-using-and" class="headerlink" title="split panes using | and -"></a>split panes using | and -</h3><p>bind | split-window -h<br>bind - split-window -v<br>unbind '"'<br>unbind %</p><h3 id="switch-panes-using-Alt-hjkl-without-prefix"><a href="#switch-panes-using-Alt-hjkl-without-prefix" class="headerlink" title="switch panes using Alt-hjkl without prefix"></a>switch panes using Alt-hjkl without prefix</h3><p>bind -n M-h select-pane -L<br>bind -n M-l select-pane -R<br>bind -n M-k select-pane -U<br>bind -n M-j select-pane -D</p><h3 id="default-tmux-setting"><a href="#default-tmux-setting" class="headerlink" title="default tmux setting"></a>default tmux setting</h3><p>C-a C-o Rotate through the panes<br>C-a C-z Suspend the current client<br>C-a Space Select next layout<br>C-a ! 
Break pane to a new window<br>C-a # List all paste buffers<br>C-a $ Rename current session<br>C-a & Kill current window<br>C-a ‘ Prompt for window index to select<br>C-a ( Switch to previous client<br>C-a ) Switch to next client<br>C-a , Rename current window<br>C-a . Move the current window<br>C-a / Describe key binding<br>C-a 0 Select window 0<br>C-a 1 Select window 1<br>C-a 2 Select window 2<br>C-a 3 Select window 3<br>C-a 4 Select window 4<br>C-a 5 Select window 5<br>C-a 6 Select window 6<br>C-a 7 Select window 7<br>C-a 8 Select window 8<br>C-a 9 Select window 9<br>C-a : Prompt for a command<br>C-a ; Move to the previously active pane<br>C-a = Choose a paste buffer from a list<br>C-a ? List key bindings<br>C-a C Customize options<br>C-a D Choose a client from a list<br>C-a E Spread panes out evenly<br>C-a L Switch to the last client<br>C-a M Clear the marked pane<br>C-a [ Enter copy mode<br>C-a ] Paste the most recent paste buffer<br>C-a c Create a new window<br>C-a d Detach the current client<br>C-a f Search for a pane<br>C-a i Display window information<br>C-a l Select the previously current window<br>C-a m Toggle the marked pane</p>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> tmux </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture4: Data Wrangling</title>
<link href="/2024/04/21/missing-semester/missing-semester-lecture4/"/>
<url>/2024/04/21/missing-semester/missing-semester-lecture4/</url>
<content type="html"><![CDATA[<h3 id="Lecture-4-Data-Wrangling"><a href="#Lecture-4-Data-Wrangling" class="headerlink" title="Lecture 4: Data Wrangling"></a>Lecture 4: Data Wrangling</h3><p>link:<a href="https://missing.csail.mit.edu/2020/data-wrangling/">https://missing.csail.mit.edu/2020/data-wrangling/</a></p><p>sed is short for stream editor</p><pre><code class="lang-bash">| sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/'</code></pre><pre><code class="lang-bash">ssh myserver journalctl | grep sshd | grep "Disconnected from" | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' | sort | uniq -c | sort -nk1,1 | tail -n10 | awk '{print $2}' | paste -sd,</code></pre><pre><code class="lang-bash">| awk '$1 == 1 && $2 ~ /^c[^ ]*e$/ { print $2 }' | wc -l</code></pre><pre><code class="lang-bash">BEGIN { rows = 0 }
$1 == 1 && $2 ~ /^c[^ ]*e$/ { rows += $1 }
END { print rows }</code></pre><pre><code class="lang-bash"> | paste -sd+ | bc -l</code></pre>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> shell </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture3: Editor(vim)</title>
<link href="/2024/04/19/missing-semester/missing-semester-lecture3/"/>
<url>/2024/04/19/missing-semester/missing-semester-lecture3/</url>
<content type="html"><![CDATA[<h3 id="Lecture-3-Editors-vim"><a href="#Lecture-3-Editors-vim" class="headerlink" title="Lecture 3: Editors(vim)"></a>Lecture 3: Editors(vim)</h3><p>link: <a href="https://missing.csail.mit.edu/2020/editors/">https://missing.csail.mit.edu/2020/editors/</a></p><p>normal->insert: i</p><p>insert->normal: Esc</p><p>normal->replace: R</p><p>normal->visual: v</p><p>normal->visual-line: Shift-v (V)</p><p>normal->visual-block: Ctrl-v</p><p>normal->command-line: <strong>:</strong></p><pre><code class="lang-bash">:help :w</code></pre><blockquote><p>show the usage of ‘:w’</p></blockquote><p>shift + 6(^) : move the cursor to the first non-blank character of the line.</p><p>L: move the cursor to the lowest line of the screen</p><p>M: middle</p><p>H: highest</p><p>fo: jump to the next o on the line</p><p>Fo: jump to the previous o</p><p>to: jump to just before the next o</p><p>To: jump to just after the previous o</p><p>in normal mode: c = d + i (c is short for change)</p><p>cc = dd + i</p><p>Ctrl-r: redo</p><p>u: undo; it reverts everything typed in one insert session, or the last change made in normal mode</p><p>v: visual mode, select by character</p><p>V: visual line mode, select whole lines</p><p>ci[ : delete the contents that are inside the brackets and go to the insert mode.</p><p>ca[: delete the brackets together with their contents and go to the insert mode</p><p>%: jump to the matching bracket</p><p>a: append after the cursor (a is short for append)</p><p>A: append at the end of this line</p><p>~: toggle the case of the character under the cursor</p><p>U: return the line to its original state.</p><p>Ctrl-g: show your location in the file and the file status.</p><p>number + G: go to the line with that number</p><p>To search for the same phrase again, simply type n .<br>To search for the same phrase in the opposite direction, type N .<br>To search for a phrase in the backward direction, use ? instead of / .<br>To go back to where you came from press CTRL-O (Keep Ctrl down while pressing the letter o). Repeat to go back further. 
CTRL-I goes forward.</p><p>s: x+i</p><p>:s/thee/the: Note that this command only changes the first occurrence of “thee” in the line.</p><p>:s/thee/the/g . Adding the g flag means to substitute globally in the line, change all occurrences of “thee” in the line.</p><p>To substitute new for the first old in a line type :s/old/new<br>To substitute new for all ‘old’s on a line type :s/old/new/g<br>To substitute phrases between two line #’s type :#,#s/old/new/g<br>To substitute all occurrences in the file type :%s/old/new/g<br> To ask for confirmation each time add ‘c’ :%s/old/new/gc</p><p>Type :! followed by an external command to execute that command.</p><p>Now type: :w TEST (where TEST is the filename you chose.) This saves the whole file (the Vim Tutor) under the name TEST.</p><p>:!command executes an external command.<br>Some useful examples are:<br>:!ls - shows a directory listing.<br>:!rm FILENAME - removes file FILENAME.<br>:w FILENAME writes the current Vim file to disk with name FILENAME.<br>v motion :w FILENAME saves the Visually selected lines in file FILENAME.<br>:r FILENAME retrieves disk file FILENAME and puts it below the cursor position.<br>:r !dir reads the output of the dir command and puts it below the cursor position.</p>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> vim </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture2: Shell Tools and Scripting</title>
<link href="/2024/04/15/missing-semester/missing-semester-lecture2/"/>
<url>/2024/04/15/missing-semester/missing-semester-lecture2/</url>
<content type="html"><![CDATA[<h3 id="Lecture-2-Shell-Tools-and-Scripting"><a href="#Lecture-2-Shell-Tools-and-Scripting" class="headerlink" title="Lecture 2: Shell Tools and Scripting"></a>Lecture 2: Shell Tools and Scripting</h3><p>link:<a href="https://missing.csail.mit.edu/2020/shell-tools/">https://missing.csail.mit.edu/2020/shell-tools/</a></p><pre><code class="lang-bash">echo "Hello"
> Hello
echo "World"
> World
foo=bar     # works
foo = bar   # does not work: the shell runs a command named foo with arguments = and bar
echo "Value is $foo"
> Value is bar
echo 'Value is $foo'
> Value is $foo</code></pre><ul><li><code>$0</code> - Name of the script</li><li><code>$1</code> to <code>$9</code> - Arguments to the script. <code>$1</code> is the first argument and so on.</li><li><code>$@</code> - All the arguments</li><li><code>$#</code> - Number of arguments</li><li><code>$?</code> - Return code of the previous command</li><li><code>$$</code> - Process identification number (PID) for the current script</li><li><code>!!</code> - Entire last command, including arguments. A common pattern is to execute a command only for it to fail due to missing permissions; you can quickly re-execute the command with sudo by doing <code>sudo !!</code></li><li><code>$_</code> - Last argument from the last command. If you are in an interactive shell, you can also quickly get this value by typing <code>Esc</code> followed by <code>.</code> or <code>Alt+.</code></li><li><code>$!</code> contains the process ID of the most recently executed background pipeline</li></ul><pre><code class="lang-bash">echo "We are in $(pwd)"
cat <(ls) <(ls ..)
ls *.sh
> show all files whose names end with .sh
ls project?
> ? matches a single character
touch foo{1,2,3}
> brace expansion: same as touch foo1 foo2 foo3
touch {foo,bar}/{a..j}
diff <(ls foo) <(ls bar)</code></pre><p><code>tldr</code> is a handy tool for reading short usage examples of commands</p><pre><code class="lang-bash">find
fd
rg
fzf
tree
broot
nnn</code></pre><pre><code class="lang-bash">find . -maxdepth 1 -type f -name "*.html" -print0 | xargs -0 zip html_files.zip
# stat -f "%m %N" is the BSD/macOS form of stat
find . -type f -exec stat -f "%m %N" {} + | sort -nr | awk '{print $2}'</code></pre><p><code>bash shortcuts</code>:<a href="https://skorks.com/2009/09/bash-shortcuts-for-maximum-productivity/">https://skorks.com/2009/09/bash-shortcuts-for-maximum-productivity/</a></p><h3 id="Command-Editing-Shortcuts"><a href="#Command-Editing-Shortcuts" class="headerlink" title="Command Editing Shortcuts"></a>Command Editing Shortcuts</h3><ul><li><strong>Ctrl + a</strong> – go to the start of the command line</li><li><strong>Ctrl + e</strong> – go to the end of the command line</li><li><strong>Ctrl + k</strong> – delete from cursor to the end of the command line</li><li><strong>Ctrl + u</strong> – delete from cursor to the start of the command line</li><li><strong>Ctrl + w</strong> – delete from cursor to start of word (i.e. delete backwards one word)</li><li><strong>Ctrl + y</strong> – paste word or text that was cut using one of the deletion shortcuts (such as the one above) after the cursor</li><li><strong>Ctrl + xx</strong> – move between start of command line and current cursor position (and back again)</li><li><strong>Alt + b</strong> – move backward one word (or go to start of word the cursor is currently on)</li><li><strong>Alt + f</strong> – move forward one word (or go to end of word the cursor is currently on)</li><li><strong>Alt + d</strong> – delete to end of word starting at cursor (whole word if cursor is at the beginning of word)</li><li><strong>Alt + c</strong> – capitalize to end of word starting at cursor (whole word if cursor is at the beginning of word)</li><li><strong>Alt + u</strong> – make uppercase from cursor to end of word</li><li><strong>Alt + l</strong> – make lowercase from cursor to end of word</li><li><strong>Alt + t</strong> – swap current word with previous</li><li><strong>Ctrl + f</strong> – move forward one character</li><li><strong>Ctrl + b</strong> – move backward one character</li><li><strong>Ctrl + d</strong> – delete character under the 
cursor</li><li><strong>Ctrl + h</strong> – delete character before the cursor</li><li><strong>Ctrl + t</strong> – swap character under cursor with the previous one</li></ul><h3 id="Command-Recall-Shortcuts"><a href="#Command-Recall-Shortcuts" class="headerlink" title="Command Recall Shortcuts"></a>Command Recall Shortcuts</h3><ul><li><strong>Ctrl + r</strong> – search the history backwards</li><li><strong>Ctrl + g</strong> – escape from history searching mode</li><li><strong>Ctrl + p</strong> – previous command in history (i.e. walk back through the command history)</li><li><strong>Ctrl + n</strong> – next command in history (i.e. walk forward through the command history)</li><li><strong>Alt + .</strong> – use the last word of the previous command</li></ul><h3 id="Command-Control-Shortcuts"><a href="#Command-Control-Shortcuts" class="headerlink" title="Command Control Shortcuts"></a>Command Control Shortcuts</h3><ul><li><strong>Ctrl + l</strong> – clear the screen</li><li><strong>Ctrl + s</strong> – stops the output to the screen (for long running verbose command)</li><li><strong>Ctrl + q</strong> – allow output to the screen (if previously stopped using command above)</li><li><strong>Ctrl + c</strong> – terminate the command</li><li><strong>Ctrl + z</strong> – suspend/stop the command</li></ul><h3 id="Bash-Bang-Commands"><a href="#Bash-Bang-Commands" class="headerlink" title="Bash Bang (!) Commands"></a>Bash Bang (!) Commands</h3><p><em>Bash</em> also has some handy features that use the ! (bang) to allow you to do some funky stuff with <em>bash</em> commands.</p><ul><li><strong>!!</strong> – run last command</li><li><strong>!blah</strong> – run the most recent command that starts with ‘blah’ (e.g. 
!ls)</li><li><strong>!blah:p</strong> – print out the command that <strong>!blah</strong> would run (also adds it as the latest command in the command history)</li><li><strong>!$</strong> – the last word of the previous command (same as <strong>Alt + .</strong>)</li><li><strong>!$:p</strong> – print out the word that <strong>!$</strong> would substitute</li><li><strong>!*</strong> – all the arguments of the previous command, i.e. everything except the command name (e.g. if you type ‘find some_file.txt /’, then <strong>!*</strong> would give you ‘some_file.txt /’)</li><li><strong>!*:p</strong> – print out what <strong>!*</strong> would substitute</li></ul>]]></content>
<categories>
<category> Missing </category>
</categories>
<tags>
<tag> shell </tag>
</tags>
</entry>
<entry>
<title>Missing Semester Lecture1: Course Overview + The Shell</title>
<link href="/2024/04/15/missing-semester/missing-semester-lecture1/"/>
<url>/2024/04/15/missing-semester/missing-semester-lecture1/</url>
<content type="html"><![CDATA[<h3 id="Lecture-1-Course-Overview-The-Shell"><a href="#Lecture-1-Course-Overview-The-Shell" class="headerlink" title="Lecture 1: Course Overview + The Shell"></a>Lecture 1: Course Overview + The Shell</h3><p><a href="https://missing.csail.mit.edu/2020/course-shell/">https://missing.csail.mit.edu/2020/course-shell/</a></p><pre><code class="lang-bash">mv a.md b.md</code></pre><blockquote><p>rename the file</p></blockquote><pre><code class="lang-bash">mv a.md ../b.md</code></pre><blockquote><p>move the file</p></blockquote><pre><code class="lang-bash">rm -r dir1</code></pre><blockquote><p>delete the directory and everything in it, recursively</p></blockquote><pre><code class="lang-bash">rmdir dir1</code></pre><blockquote><p>delete the directory only if it is empty</p></blockquote><pre><code class="lang-bash">ls ..</code></pre><blockquote><p>list all files of the parent directory</p></blockquote><p>shortcut: <code>Control + L</code></p><blockquote><p>same as the command clear</p></blockquote><pre><code class="lang-bash">cat < hello.txt > hello2.txt</code></pre><blockquote><p>overwrite hello2.txt with the content of hello.txt</p></blockquote><pre><code class="lang-bash">cat < hello.txt >> hello2.txt</code></pre><blockquote><p>append the content of hello.txt to the end of hello2.txt</p></blockquote><ul><li><strong>Pipe</strong></li></ul><pre><code class="lang-bash">ls -l | tail -n1</code></pre><blockquote><p>print only the last line of the output of (ls -l)</p></blockquote><pre><code class="lang-bash">curl --head --silent google.com | grep -i content-length</code></pre><blockquote><p>output is ‘Content-Length:219’</p></blockquote>]]></content>
<categories>
<category> Missing </category>
<category> Semester </category>
</categories>
<tags>
<tag> shell </tag>
</tags>
</entry>
<entry>
<title>Hello World</title>
<link href="/2024/04/15/hello-world/"/>
<url>/2024/04/15/hello-world/</url>
<content type="html"><![CDATA[<p>Welcome to <a href="https://hexo.io/">Hexo</a>! This is your very first post. Check <a href="https://hexo.io/docs/">documentation</a> for more info. If you get any problems when using Hexo, you can find the answer in <a href="https://hexo.io/docs/troubleshooting.html">troubleshooting</a> or you can ask me on <a href="https://github.com/hexojs/hexo/issues">GitHub</a>.</p><h2 id="Quick-Start"><a href="#Quick-Start" class="headerlink" title="Quick Start"></a>Quick Start</h2><h3 id="Create-a-new-post"><a href="#Create-a-new-post" class="headerlink" title="Create a new post"></a>Create a new post</h3><pre><code class="lang-bash">$ hexo new "My New Post"</code></pre><p>More info: <a href="https://hexo.io/docs/writing.html">Writing</a></p><h3 id="Run-server"><a href="#Run-server" class="headerlink" title="Run server"></a>Run server</h3><pre><code class="lang-bash">$ hexo server</code></pre><p>More info: <a href="https://hexo.io/docs/server.html">Server</a></p><h3 id="Generate-static-files"><a href="#Generate-static-files" class="headerlink" title="Generate static files"></a>Generate static files</h3><pre><code class="lang-bash">$ hexo generate</code></pre><p>More info: <a href="https://hexo.io/docs/generating.html">Generating</a></p><h3 id="Deploy-to-remote-sites"><a href="#Deploy-to-remote-sites" class="headerlink" title="Deploy to remote sites"></a>Deploy to remote sites</h3><pre><code class="lang-bash">$ hexo deploy</code></pre><p>More info: <a href="https://hexo.io/docs/one-command-deployment.html">Deployment</a></p>]]></content>
<tags>
<tag> hexo </tag>
</tags>
</entry>
</search>