Implementing a blog friend circle on any blog platform.
Download the Project
Downloading the project in a Linux environment is recommended. In my tests on macOS, some dependencies could not be downloaded and the program did not run properly.
The crawler does not ship with crawling rules for the NexT theme, so I modified part of the code to add them. The actual crawling rules should follow the structure of your own blog’s friends link page.
Define the variables to be crawled (avatar, link, and name) in the get_next_url() method. For a custom theme, the method can be named get_xxx_url(), where xxx is the theme name. When calling the global handler at the end of the method, pass the theme name as the fifth argument: self.handle(avatar, link, name, queue, "xxx"). In the __init__() method, also add the custom theme to the configuration: self.strategies = ("xxx").
link = response.css(".link-grid .link-grid-container a::attr(href)").extract()
# Strip the 302 redirect prefix if it is present
prefix = "/302.html?target="
for i in range(len(link)):
    if link[i].startswith(prefix):
        link[i] = link[i][len(prefix):]

name = response.css(".link-grid .link-grid-container p::text").extract()
# Keep only the odd-numbered elements (1st, 3rd, ...), i.e. those at even indices
new_name_list = []
for i in range(len(name)):
    if i % 2 == 0:
        new_name_list.append(name[i])
name = new_name_list

self.handle(avatar, link, name, queue, "next")
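The two clean-up steps in the rule above can be tried outside the crawler. The following is a minimal sketch, not part of the project: the helper names strip_redirect_prefix and keep_every_other are hypothetical, and the sample data assumes that each friend card contributes two text entries (a name followed by a description), which would explain why only every other entry is kept.

```python
# Hypothetical helpers that mirror the clean-up logic of the NexT rule.
REDIRECT_PREFIX = "/302.html?target="


def strip_redirect_prefix(links, prefix=REDIRECT_PREFIX):
    """Return the links with a leading redirect prefix removed, if present."""
    return [
        link[len(prefix):] if link.startswith(prefix) else link
        for link in links
    ]


def keep_every_other(items):
    """Keep the items at even indices (0, 2, 4, ...), i.e. the 1st, 3rd, 5th entries."""
    return items[::2]


if __name__ == "__main__":
    # Made-up sample data, assuming a name/description pair per friend card.
    links = ["/302.html?target=https://a.example", "https://b.example"]
    names = ["Alice", "Alice's description", "Bob", "Bob's description"]
    print(strip_redirect_prefix(links))
    print(keep_every_other(names))
```

If your friends link page emits a different number of text nodes per card, the slice step would need to change accordingly.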
Configure Your Own Friends Link Page as the Starting Point for Crawling
If you customized the crawling rules, set the theme to the custom theme name, for example theme: "xxx".
The deployment reuses the project runtime environment already configured by yyyzyyyz and replaces the crawler project inside it with the locally modified project.
/root/hexo-circle-of-friends: the local path of the project
docker run -di --name circle -p 8000:8000 -v /tmp/:/tmp/ -v /root/hexo-circle-of-friends:/home/fcircle_src yyyzyyyz/fcircle:latest
The project ships with its own frontend deployment solution, but I feel it doesn’t fit well with my current theme, so I wrote a minimal one myself.
<script data-pjax="" type="module">
  async function init() {
    let response = await fetch("<ip>/all");
    let result = await response.json();
    for (const item of result["article_data"]) {
      let a = document.createElement("a");
      a.href = item["link"];
      a.classList.add("article_item");
      a.innerHTML = `
        <div class="article_title">${item["title"]}</div>
        <div class="article_author">${item["author"]} Published at ${item["created"]} Updated at ${item["updated"]}</div>
        <br>
      `;
      document.getElementById("app").append(a);
    }
  }
  await init();
</script>
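Before wiring the script into a page, it can help to check what the /all endpoint returns. The following is a rough sketch using only the Python standard library. The function names fetch_articles and format_article are hypothetical, and the base URL is a placeholder you would replace with your own server address. The field names (article_data, title, link, author, created, updated) are the ones the script above reads.

```python
import json
from urllib.request import urlopen


def fetch_articles(base_url):
    """Fetch <base_url>/all and return the list stored under "article_data"."""
    with urlopen(f"{base_url}/all", timeout=10) as response:
        payload = json.load(response)
    return payload["article_data"]


def format_article(item):
    """Render one article entry as a single line of text."""
    return (
        f'{item["title"]} ({item["link"]}) by {item["author"]}, '
        f'published at {item["created"]}, updated at {item["updated"]}'
    )


if __name__ == "__main__":
    # Made-up sample entry so the demo runs without a server.
    sample = {
        "title": "Hello",
        "link": "https://a.example/hello",
        "author": "Alice",
        "created": "2024-01-01",
        "updated": "2024-01-02",
    }
    print(format_article(sample))
```

Against a running container, something like `for item in fetch_articles("http://<ip>:8000"): print(format_article(item))` should list the crawled articles, assuming the API is reachable on the port mapped in the docker run command.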