Implement a Web Crawler

Introduction

Design and implement a multithreaded or asynchronous web crawler program that can crawl all web links under the same host starting from a specified URL.

A Blocking Solution

Let’s put aside efficiency and scalability for now and focus on an MVP that works. Given a starter URL, we will use request …
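As a rough illustration of what a blocking, single-threaded crawler could look like, here is a minimal sketch. It is not the post's own code: it uses only the Python standard library (`urllib` and `html.parser`) rather than whatever request library the post goes on to use, and the names `LinkParser`, `fetch_html`, and `crawl` are hypothetical. The `fetch` parameter is an assumption added so the crawl loop can be tried without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Blocking download of one page (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html):
    """Return the set of same-host URLs reachable from start_url."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            link, _ = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen
```

Each page is fetched one at a time, so the crawler waits on every request before moving on; that waiting is the cost a multithreaded or asynchronous version would aim to remove.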

Applied Cryptography Basics I

GCD

Definition

The GCD (greatest common divisor) of two integers is the largest integer that divides both of them.

Calculation

Factorization method

First find the prime factorization of each integer, then take the prime factors they have in common and multiply them together.
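The factorization method can be sketched as follows, assuming positive integer inputs and simple trial division for the factoring step. The function names `prime_factors` and `gcd_by_factorization` are made up for this illustration.

```python
from collections import Counter


def prime_factors(n):
    """Return the prime factorization of n >= 1 as a Counter {prime: exponent}."""
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:
        factors[n] += 1  # whatever remains is itself prime
    return factors


def gcd_by_factorization(a, b):
    """Multiply the prime factors shared by a and b."""
    # Counter intersection keeps the smaller exponent of each shared prime.
    common = prime_factors(a) & prime_factors(b)
    result = 1
    for prime, exponent in common.items():
        result *= prime ** exponent
    return result
```

For example, 48 = 2^4 * 3 and 18 = 2 * 3^2 share one factor of 2 and one factor of 3, so `gcd_by_factorization(48, 18)` returns 6.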

Euclidean Algorithm

The key idea of the Euclidean algorithm is to use the smaller integer to …
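For reference, here is a minimal sketch of the textbook form of the Euclidean algorithm, which repeatedly replaces the pair (a, b) with (b, a mod b) until the remainder is zero. It assumes non-negative integers, and the name `gcd_euclid` is chosen only for this example.

```python
def gcd_euclid(a, b):
    """Greatest common divisor of two non-negative integers."""
    while b:
        # gcd(a, b) == gcd(b, a % b), and the second value shrinks each step.
        a, b = b, a % b
    return a
```

Tracing `gcd_euclid(48, 18)` gives the pairs (48, 18), (18, 12), (12, 6), (6, 0), so the result is 6.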

Setting File Permissions for Web Documents

What file permissions should web documents have to make the development process easier?

It depends. For my use case, I want to avoid using sudo to copy documents from the user’s home directory to the Apache document root every time I update a file. Because the /etc/www directory is owned by …

My Number-Base-System Gap

I was stuck for a while when solving the LeetCode problem Remove 9. After sketching a relatively complicated solution, I decided to give myself a hint by reading the discussion board. It is all about numeric base conversion! When I tried to write the routine to convert a decimal number to a number in base 9, I found it …
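A base-conversion routine of the kind described can be sketched with repeated division, assuming a non-negative input and a base between 2 and 10 so that every digit is a single decimal character. The names `to_base` and `nth_without_nine` are hypothetical, and the second function reflects my reading of the problem: the n-th positive integer that contains no digit 9 is n written in base 9.

```python
def to_base(n, base):
    """Return the digits of non-negative n in the given base (2..10) as a string."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(str(remainder))  # least significant digit first
    return "".join(reversed(digits))


def nth_without_nine(n):
    """n-th positive integer with no digit 9, read off from n in base 9."""
    return int(to_base(n, 9))
```

For instance, 100 = 1 * 81 + 2 * 9 + 1, so `to_base(100, 9)` is "121", and `nth_without_nine(9)` is 10 because 9 itself is skipped.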