Day12 Google Dorking - part2 - iT 邦幫忙: Solving problems together to save an IT person's day

2022 iThome Ironman Contest (鐵人賽)

DAY 11

Security

Learning Security from TryHackMe Target Machines series, part 12

Yvonne

Team: 團隊那團名要叫什麼？

2022-09-28 01:19:18

Room link: https://tryhackme.com/room/googledorking

Robots.txt:
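Continuing from yesterday's topic, SEO, today we look at robots.txt.
Simply put, robots.txt is a file that sets the rules for crawlers visiting your website, such as which pages they may crawl and which they may not.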

The robots.txt file goes in the public folder of test.com, so its path is test.com/robots.txt.

To view HackerOne's robots.txt, enter https://www.hackerone.com/robots.txt in your browser's address bar,
and you will see the contents of HackerOne's robots.txt.
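Because the file always sits at the same path under the site's host, its address can be derived from any page URL. The following is a minimal sketch of that idea; the helper name `robots_url` and the example page path are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# The page path below is a hypothetical example.
print(robots_url("https://test.com/blog/post.html?page=2"))
print(robots_url("https://www.hackerone.com/"))
```

The second call prints the same HackerOne address mentioned above.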

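robots.txt is a convention: it tells crawlers that visit your website which pages they may crawl and which they may not. It is typically used in the following situations:

Keeping crawlers away from confidential files
Keeping crawlers away from relatively unimportant content, so the crawl budget is not wasted
Keeping unnecessary crawlers off your website, so they do not consume its resources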

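How to write robots.txt

The three keywords that appear most often in robots.txt are User-Agent, Allow, and Disallow.

User-Agent: the name of the crawler, for example Googlebot
Allow: folders and pages the crawler is permitted to crawl
Disallow: folders and pages the crawler is not permitted to crawl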
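To see how the three keywords work together, here is a small sketch that feeds a sample robots.txt to Python's standard `urllib.robotparser` module and asks whether a given crawler may fetch a given URL. The rules and paths are invented for illustration and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: Googlebot gets /public/ but not /private/,
# and every other crawler is turned away from the whole site.
SAMPLE = """\
User-agent: Googlebot
Allow: /public/
Disallow: /private/

User-agent: *
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


sample_parser = build_parser(SAMPLE)
print(sample_parser.can_fetch("Googlebot", "https://test.com/public/index.html"))
print(sample_parser.can_fetch("Googlebot", "https://test.com/private/keys.txt"))
print(sample_parser.can_fetch("OtherBot", "https://test.com/public/index.html"))
```

Only the first check should come back `True`: the second hits Googlebot's Disallow rule, and the third falls through to the catch-all `*` group.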

With the robots.txt background covered,
we can now work through question 4.

Where would “robots.txt” be located on the domain “ablog.com”?
As described above:

ablog.com/robots.txt

If a website was to have a sitemap, where would that be located?
Hint: a sitemap is an XML file containing an index of the website, intended for crawlers.

sitemap.xml
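Since a sitemap is just XML, a crawler can read it with any XML parser. The sketch below lists the page URLs in a tiny, made-up sitemap using the standard library; the sample entries are hypothetical, and the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# A tiny, made-up sitemap for illustration.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://ablog.com/</loc></url>
  <url><loc>https://ablog.com/posts/hello</loc></url>
</urlset>
"""


def list_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found in the sitemap, in document order."""
    root = ET.fromstring(sitemap_xml)
    return [(loc.text or "").strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


for url in list_urls(SAMPLE_SITEMAP):
    print(url)
```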

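How would we only allow “Bingbot” to index the website?
That is, we want Bingbot to be the only crawler allowed to crawl the site. How should this be configured?
We can refer to the diagram in the room's tutorial: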

User-agent: Bingbot
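The room only asks for the `User-agent` line. As a sketch of what a complete rule set could look like, the snippet below pairs that line with an `Allow` rule and adds a catch-all group that blocks everyone else; that fuller layout is an assumption for illustration, not part of the room's answer, and `allow_only` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def allow_only(agent: str) -> str:
    """Build robots.txt text that admits one crawler and blocks the rest."""
    return (
        f"User-agent: {agent}\n"
        "Allow: /\n"
        "\n"
        "User-agent: *\n"
        "Disallow: /\n"
    )


bing_rules = allow_only("Bingbot")
bing_parser = RobotFileParser()
bing_parser.parse(bing_rules.splitlines())

print(bing_rules)
print(bing_parser.can_fetch("Bingbot", "https://ablog.com/"))
print(bing_parser.can_fetch("Googlebot", "https://ablog.com/"))
```

With these rules the parser should report that Bingbot may fetch the home page while Googlebot may not.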

4. How would we prevent a “Crawler” from indexing the directory “/dont-index-me/”?

As mentioned above, this is where Disallow comes in:

Disallow: /dont-index-me/
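A quick way to check what this rule covers is to run it through `urllib.robotparser`. This is a sketch under assumptions: the rule is placed under `User-agent: *`, the page names are invented, and `is_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# The Disallow rule from the answer, applied to every crawler (an assumption).
RULES = """\
User-agent: *
Disallow: /dont-index-me/
"""

dir_parser = RobotFileParser()
dir_parser.parse(RULES.splitlines())


def is_allowed(path: str, agent: str = "Crawler") -> bool:
    """Check a path on ablog.com against the rules above."""
    return dir_parser.can_fetch(agent, "https://ablog.com" + path)


print(is_allowed("/dont-index-me/secret.html"))
print(is_allowed("/index.html"))
```

The parser matches rules by path prefix, so everything under `/dont-index-me/` is blocked while pages elsewhere on the site stay crawlable.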

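5. What is the extension of a Unix/Linux system configuration file that we might want to hide from “Crawlers”?

Hint: “System files are usually 3/4 characters!” In other words, the configuration file extension is slightly shorter than the usual abbreviation "config".

I did not really understand this question or what it was asking,
so I looked up a write-up online for the answer.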

.conf

So I found the answer.
At most I learned that "configuration" gets abbreviated to .conf,
but I do not really understand why it is done this way.
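One possible reading of the question, offered as an assumption rather than the room's official explanation: on Unix/Linux, configuration files commonly use the .conf extension and may hold sensitive settings, so a site owner might not want crawlers indexing them. The sketch below picks the .conf files out of a list of paths and emits a Disallow line for each; the paths and the helper name `disallow_rules` are hypothetical.

```python
from pathlib import PurePosixPath


def disallow_rules(paths: list[str], suffix: str = ".conf") -> list[str]:
    """Return one Disallow line for each path ending in the given extension."""
    return [f"Disallow: {p}" for p in paths if PurePosixPath(p).suffix == suffix]


# Hypothetical file paths on a web server.
site_paths = ["/index.html", "/etc/app.conf", "/backup/nginx.conf", "/notes.config"]

for rule in disallow_rules(site_paths):
    print(rule)
```

Only the two paths ending in `.conf` produce a rule; `/notes.config` is skipped because its extension is `.config`.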

If anyone understands it, please leave a comment below and let me know.
Much appreciated.
