Michael Schrenk

Webbots, Spiders, and Screen Scrapers

Author: Michael Schrenk

    Table of Contents
    Dedication
    ACKNOWLEDGMENTS
    Introduction
    FUNDAMENTAL CONCEPTS AND TECHNIQUES
    WHAT'S IN IT FOR YOU?
    Uncovering the Internet's True Potential
    What's in It for Developers?
    What's in It for Business Leaders?
    Final Thoughts
    IDEAS FOR WEBBOT PROJECTS
    Inspiration from Browser Limitations
    A Few Crazy Ideas to Get You Started
    Final Thoughts
    DOWNLOADING WEB PAGES
    Think About Files, Not Web Pages
    Downloading Files with PHP's Built-in Functions
    Introducing PHP/CURL
    Installing PHP/CURL
    LIB_http
    Final Thoughts
    PARSING TECHNIQUES
    Parsing Poorly Written HTML
    Standard Parse Routines
    Using LIB_parse
    Useful PHP Functions
    Final Thoughts
    AUTOMATING FORM SUBMISSION
    Reverse Engineering Form Interfaces
    Form Handlers, Data Fields, Methods, and Event Triggers
    Unpredictable Forms
    Analyzing a Form
    Final Thoughts
    MANAGING LARGE AMOUNTS OF DATA
    Organizing Data
    Making Data Smaller
    Thumbnailing Images
    Final Thoughts
    PROJECTS
    PRICE-MONITORING WEBBOTS
    The Target
    Designing the Parsing Script
    Initialization and Downloading the Target
    Further Exploration
    IMAGE-CAPTURING WEBBOTS
    Example Image-Capturing Webbot
    Creating the Image-Capturing Webbot
    Further Exploration
    Final Thoughts
    LINK-VERIFICATION WEBBOTS
    Creating the Link-Verification Webbot
    Running the Webbot
    Further Exploration
    ANONYMOUS BROWSING WEBBOTS
    Anonymity with Proxies
    The Anonymizer Project
    Final Thoughts
    SEARCH-RANKING WEBBOTS
    Description of a Search Result Page
    What the Search-Ranking Webbot Does
    Running the Search-Ranking Webbot
    How the Search-Ranking Webbot Works
    The Search-Ranking Webbot Script
    Final Thoughts
    Further Exploration
    AGGREGATION WEBBOTS
    Choosing Data Sources for Webbots
    Example Aggregation Webbot
    Adding Filtering to Your Aggregation Webbot
    Further Exploration
    FTP WEBBOTS
    Example FTP Webbot
    PHP and FTP
    Further Exploration
    NNTP NEWS WEBBOTS
    NNTP Use and History
    Webbots and Newsgroups
    Further Exploration
    WEBBOTS THAT READ EMAIL
    The POP3 Protocol
    Executing POP3 Commands with a Webbot
    Further Exploration
    WEBBOTS THAT SEND EMAIL
    Email, Webbots, and Spam
    Sending Mail with SMTP and PHP
    Writing a Webbot That Sends Email Notifications
    Further Exploration
    CONVERTING A WEBSITE INTO A FUNCTION
    Writing a Function Interface
    Final Thoughts
    ADVANCED TECHNICAL CONSIDERATIONS
    SPIDERS
    How Spiders Work
    Example Spider
    LIB_simple_spider
    Experimenting with the Spider
    Adding the Payload
    Further Exploration
    PROCUREMENT WEBBOTS AND SNIPERS
    Procurement Webbot Theory
    Sniper Theory
    Testing Your Own Webbots and Snipers
    Further Exploration
    Final Thoughts
    WEBBOTS AND CRYPTOGRAPHY
    Designing Webbots That Use Encryption
    A Quick Overview of Web Encryption
    Local Certificates
    Final Thoughts
    AUTHENTICATION
    What Is Authentication?
    Example Scripts and Practice Pages
    Basic Authentication
    Session Authentication
    Final Thoughts
    ADVANCED COOKIE MANAGEMENT
    How Cookies Work
    PHP/CURL and Cookies
    How Cookies Challenge Webbot Design
    Further Exploration
    SCHEDULING WEBBOTS AND SPIDERS
    The Windows Task Scheduler
    Complex Schedules
    Non-Calendar-Based Triggers
    Final Thoughts
    LARGER CONSIDERATIONS
    DESIGNING STEALTHY WEBBOTS AND SPIDERS
    Why Design a Stealthy Webbot?
    Stealth Means Simulating Human Patterns
    Final Thoughts
    WRITING FAULT-TOLERANT WEBBOTS
    Types of Webbot Fault Tolerance
    Error Handlers
    DESIGNING WEBBOT-FRIENDLY WEBSITES
    Optimizing Web Pages for Search Engine Spiders
    Web Design Techniques That Hinder Search Engine Spiders
    Designing Data-Only Interfaces
    KILLING SPIDERS
    Asking Nicely
    Building Speed Bumps
    Setting Traps
    Final Thoughts
    KEEPING WEBBOTS OUT OF TROUBLE
    It's All About Respect
    Copyright
    Trespass to Chattels
    Internet Law
    Final Thoughts
    PHP/CURL REFERENCE
    Creating a Minimal PHP/CURL Session
    Initiating PHP/CURL Sessions
    Setting PHP/CURL Options
    Executing the PHP/CURL Command
    Closing PHP/CURL Sessions
    STATUS CODES
    HTTP Codes
    NNTP Codes
    SMS EMAIL ADDRESSES
    Colophon
    Index

1394/07/27
Password: tahlildadeh.com or www.tahlildadeh.com