{"id":470924,"date":"2024-06-23T16:01:59","date_gmt":"2024-06-23T16:01:59","guid":{"rendered":"https:\/\/proxycompass.com\/?p=470924"},"modified":"2024-07-04T11:54:28","modified_gmt":"2024-07-04T11:54:28","slug":"web-scraping-best-practices-good-etiquette-and-some-tricks","status":"publish","type":"post","link":"https:\/\/proxycompass.com\/ko\/web-scraping-best-practices-good-etiquette-and-some-tricks\/","title":{"rendered":"Web Scraping Best Practices: Good Etiquette and Some Tricks"},"content":{"rendered":"<p>In this post, we'll discuss web scraping best practices, and since I believe many of you are wondering about it, I'll address the elephant in the room right away. Is it legal? Most likely yes.<\/p>\n\n\n\n<p>Scraping sites is generally legal, but only within certain reasonable limits (just keep reading).<br><\/p>\n\n\n\n<p>It also depends on your geographical location, and since I'm not a genie, I don't know where you are, so I can't say for sure. Check your local laws, and don't complain if we give you some &quot;bad advice.&quot; 
Haha.&nbsp;<\/p>\n\n\n\n<p>Jokes aside, it's fine in most places. Just don't be reckless about it, and stay away from copyrighted material, personal data, and anything behind a login screen.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">We recommend following these web scraping best practices:&nbsp;<\/h2>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. Respect robots.txt<\/h3>\n\n\n\n<p>Want to know the secret to scraping websites peacefully? Respect the site's robots.txt file. Located at the root of the website, this file specifies which pages bots may scrape and which are off limits. Following robots.txt also matters because ignoring it can get your IP blocked or, depending on where you are, lead to legal consequences.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2. 
Set a reasonable crawl rate<\/h3>\n\n\n\n<p>To avoid overloading, freezing, or crashing the website's servers, control your request rate and build in time intervals between requests. Put more simply, go easy on the crawl rate. To do this, you can use Scrapy or Selenium and add delays to your requests.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3. Rotate user agents and IP addresses<\/h3>\n\n\n\n<p>Websites can identify and block scraping bots by their user-agent string or IP address. Change your user agent and IP address from time to time, and use a set of real browsers. Set a user-agent string and identify yourself in it to some extent. Your goal is to go unnoticed, so do it right.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">4. 
Avoid scraping behind login pages<\/h3>\n\n\n\n<p>Let's just say that scraping content behind a login is generally wrong. Right? Okay? I know many of you will skip this section, but anyway\u2026 Limit your scraping to public data, and if you need to scrape behind a login, perhaps ask for permission. I'm not sure. Leave a comment on how you would handle this. Do you scrape content behind a login?<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">5. Parse and clean the extracted data<\/h3>\n\n\n\n<p>Scraped data is often raw and may contain irrelevant or even unstructured information. Before analysis, you need to preprocess the data and clean it up using regex, XPath, or CSS selectors. Do this by removing duplicates, correcting errors, and handling missing data. 
Take your time cleaning, because you need quality data to avoid headaches.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">6. Handle dynamic content<\/h3>\n\n\n\n<p>Many websites use JavaScript to generate page content, which is a problem for traditional scraping techniques. To fetch and scrape dynamically loaded data, you can use a headless-browser tool such as Puppeteer or Selenium. To stay efficient, focus only on the parts you are interested in.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">7. Implement robust error handling<\/h3>\n\n\n\n<p>Error handling is needed to keep your program from failing on network problems, rate limiting, or changes to the website's structure. Retry failed requests, respect rate limits, and update your parsing when the HTML structure changes. Log errors and monitor your scraper's activity so you can spot problems and work out how to fix them.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">8. 
Respect the website's terms of service<\/h3>\n\n\n\n<p>Before scraping a website, review its terms of service. Some sites prohibit scraping outright, while others set rules and conditions you have to follow. If the terms are ambiguous, contact the website owner for clarification.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">9. Consider the legal implications<\/h3>\n\n\n\n<p>Make sure you can legally scrape and use the data, taking copyright and privacy issues into account. Do not scrape copyrighted material or personal information that belongs to others. If your business is subject to data protection laws such as GDPR, comply with them.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">10. Explore alternative data collection methods<\/h3>\n\n\n\n<p>Before scraping, it is worth looking for other sources of the data. 
Many websites offer APIs or downloadable datasets, which are far more convenient and efficient than scraping. So check whether there is a shortcut before you take the long road.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">11. Implement data quality assurance and monitoring<\/h3>\n\n\n\n<p>Identify ways to improve the quality of your scraped data. Check your scraper and the quality of its data daily for anomalies. Implement automated monitoring and quality checks to catch and prevent problems.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">12. Adopt a formal data collection policy<\/h3>\n\n\n\n<p>To make sure you are doing this correctly and legally, establish a data collection policy. It should cover the rules, recommendations, and legal aspects your team needs to know. This reduces the risk of data misuse and ensures everyone is aware of the rules.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">13. 
Stay informed and adapt to change<\/h3>\n\n\n\n<p>Web scraping is a fast-moving field: new technologies emerge, legal issues evolve, and websites are constantly updated. Adopt a culture of learning and flexibility so you stay on the right track.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Wrapping up!<\/h2>\n\n\n\n<p>If you are going to play with the beautiful toys at our disposal (do yourself a favor and look up some Python libraries), then\u2026 well, mind your manners, and if you decide to ignore that first piece of advice, at least be smart about it. 
&nbsp;<\/p>\n\n\n\n<p>Here are some of the best practices we discussed:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Respect robots.txt<\/li>\n\n\n\n<li>Control your crawl rate<\/li>\n\n\n\n<li>Rotate your identity<\/li>\n\n\n\n<li>Avoid private areas<\/li>\n\n\n\n<li>Clean and parse your data<\/li>\n\n\n\n<li>Handle errors efficiently<\/li>\n\n\n\n<li>Be good, follow the rules<\/li>\n<\/ul>\n\n\n\n<p>As data becomes more and more valuable, web scrapers will face the following choice:&nbsp;<\/p>\n\n\n\n<p>Will you respect the robots.txt file? It's up to you.<\/p>\n\n\n\n<p>Leave a comment below. What do you think about this?<\/p>","protected":false},"excerpt":{"rendered":"<p>In this post, we&#8217;ll discuss the web scraping best practices, and since I believe many of you are thinking about it, I&#8217;ll address the elephant in the room right away. Is it legal? Most likely yes. Scraping sites is generally legal, but within certain reasonable grounds (just keep reading). 
Also depends on your geographical location, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":470932,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[35],"tags":[],"class_list":["post-470924","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-articles"],"acf":[],"_links":{"self":[{"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/posts\/470924","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/comments?post=470924"}],"version-history":[{"count":5,"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/posts\/470924\/revisions"}],"predecessor-version":[{"id":470935,"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/posts\/470924\/revisions\/470935"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/media\/470932"}],"wp:attachment":[{"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/media?parent=470924"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/categories?post=470924"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/proxycompass.com\/ko\/wp-json\/wp\/v2\/tags?post=470924"}],"curies":[{"name":"\uc6cc\ub4dc\ud504\ub808\uc2a4","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}