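openSUSE PHP Learning Journey (10: cURL)
curl is a file transfer tool that works from the command line using URL syntax. It supports many protocols: FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP. curl also supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploads, Kerberos authentication, HTTP form-based uploads, proxies, cookies, user name/password authentication, resuming interrupted downloads and uploads, and HTTP proxy tunneling. It even supports IPv6, SOCKS5 proxies, and uploading files to an FTP server through an HTTP proxy.
openSUSE 11.4 already ships with curl.
Here are a few examples: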
1. Save a page's content to a file.
linux-4k5v:~ # curl http://www.linuxsight.com > page.html
or
linux-4k5v:~ # curl -o page.html http://www.linuxsight.com
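The two forms above should produce the same file. The following is a minimal sketch that checks this without touching the network: it uses a file:// URL pointing at a temporary file as a stand-in for the real site, and the names workdir, page_redirect.html and page_o.html are only illustrative.

```shell
# Work in a throwaway directory so nothing outside it is touched.
workdir=$(mktemp -d)
printf '<html>hello</html>\n' > "$workdir/index.html"
url="file://$workdir/index.html"   # stand-in for http://www.linuxsight.com

# Form 1: curl writes the body to stdout and the shell redirects it.
curl -s "$url" > "$workdir/page_redirect.html"

# Form 2: curl itself opens the output file via -o.
curl -s -o "$workdir/page_o.html" "$url"

# Compare the two results byte for byte.
cmp "$workdir/page_redirect.html" "$workdir/page_o.html" && echo "identical"
```

The practical difference is who opens the file: with > the shell creates page.html even if curl fails immediately, while with -o curl creates it itself.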
2. Download a specific file.
linux-4k5v:~ # curl -O http://www.linuxsight.com/wp-content/uploads/2010/07/openSUSE.png
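3. Resume an interrupted download. The -C option requires an offset; passing - as the offset lets curl work it out from the partial file already on disk.
linux-4k5v:~ # curl -C - -O http://www.linuxsight.com/wp-content/uploads/2010/07/openSUSE.png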
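To see what resuming does, here is a small sketch that simulates an interrupted download locally. It assumes curl's file:// support as a stand-in for a real server; remote.bin and local.bin are made-up names for illustration.

```shell
workdir=$(mktemp -d)
# A 10-byte "remote" file, reached through file:// instead of a real URL.
printf '0123456789' > "$workdir/remote.bin"
# Simulate an interrupted download: only the first 4 bytes arrived.
printf '0123' > "$workdir/local.bin"

# "-C -" tells curl to look at the existing output file and continue from its size.
curl -s -C - -o "$workdir/local.bin" "file://$workdir/remote.bin"

# Show the completed file.
cat "$workdir/local.bin"; echo
```

Only the missing bytes are transferred and appended; the part already on disk is left as it is.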
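4. Fail on HTTP errors. With -f, curl does not output the server's error page when the response status is 400 or higher; it exits with a non-zero status (22) instead.
linux-4k5v:~ # curl -f http://suse.linuxsight.com/linuxsight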
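Because -f reports failure through the exit status, a script can branch on it. The sketch below uses a hypothetical helper named fetch_or_report, and file:// URLs so that it runs without a network. Note that for a missing local file curl reports a file-read error rather than the HTTP error code 22 that a real 404 would give.

```shell
# Hypothetical helper: fetch URL $1 into file $2 and report how curl exited.
fetch_or_report() {
    # -f: turn HTTP errors into a failing exit status; -s: no progress meter.
    if curl -f -s -o "$2" "$1"; then
        echo "ok"
    else
        # In the else branch, $? still holds curl's exit status.
        echo "failed with exit code $?"
    fi
}

workdir=$(mktemp -d)
printf 'data\n' > "$workdir/present.txt"

fetch_or_report "file://$workdir/present.txt" "$workdir/out1.txt"
fetch_or_report "file://$workdir/missing.txt" "$workdir/out2.txt"
```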
5. Use a proxy. Replace IP_ADDRESS with the proxy's address, optionally followed by :port.
linux-4k5v:~ # curl -x IP_ADDRESS -o page.html http://www.linuxsight.com
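PHP also supports cURL. First, the extension must be enabled:
linux-4k5v:~ # vim /etc/php5/apache2/php.ini
Enable: extension=curl.so (php_curl.dll is the file name used by the Windows build of PHP; on Linux the module is curl.so. On openSUSE it is normally provided by the php5-curl package.)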
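One way to confirm that the extension is loaded is to ask the PHP command-line binary for its module list with php -m. This is a sketch under two assumptions: the php CLI is installed, and it reads a configuration equivalent to the Apache one (the CLI often uses its own php.ini, so a result here is only a hint for the web server). The function name check_php_curl is made up for illustration.

```shell
# Hypothetical helper: report whether the PHP CLI has the curl extension loaded.
check_php_curl() {
    if ! command -v php >/dev/null 2>&1; then
        echo "php CLI not found"
    elif php -m 2>/dev/null | grep -qi '^curl$'; then
        echo "curl extension loaded"
    else
        echo "curl extension missing"
    fi
}

check_php_curl
```

For the Apache module itself, restarting Apache after changing php.ini and looking at the output of phpinfo() is the more direct check.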
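Below is a piece of PHP code:
<?php
// Two other ways to fetch a page in PHP:
// echo $content = file_get_contents("http://www.linuxsight.com");
// readfile("http://www.linuxsight.com");

// 1. Initialize
$ch = curl_init();

// 2. Set options, including the URL
curl_setopt($ch, CURLOPT_URL, "http://www.linuxsight.com");
// 1 makes curl_exec() return the response as a string; 0 would print it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);

// 3. Execute once and get the HTML document
$output = curl_exec($ch);
if ($output === FALSE) {
    // Note that we compare with "=== FALSE" rather than "== FALSE",
    // because we must distinguish empty output from the boolean FALSE
    echo "cURL Error: " . curl_error($ch);
}

$info = curl_getinfo($ch);
echo 'Fetching ' . $info['url'] . ' took ' . $info['total_time'] . ' seconds';

// The returned array includes the following information:
// "url"                     // URL of the resource
// "content_type"            // content type
// "http_code"               // HTTP status code
// "header_size"             // size of the headers
// "request_size"            // size of the request
// "filetime"                // remote modification time of the document
// "ssl_verify_result"       // result of SSL verification
// "redirect_count"          // number of redirects
// "total_time"              // total time taken
// "namelookup_time"         // time spent on the DNS lookup
// "connect_time"            // time spent establishing the connection
// "pretransfer_time"        // time spent preparing before the transfer
// "size_upload"             // size of uploaded data
// "size_download"           // size of downloaded data
// "speed_download"          // download speed
// "speed_upload"            // upload speed
// "download_content_length" // length of the downloaded content
// "upload_content_length"   // length of the uploaded content
// "starttransfer_time"      // time until the transfer started
// "redirect_time"           // time spent on redirects

// 4. Release the curl handle
curl_close($ch);
?>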
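The command-line tool can report similar transfer details through its -w/--write-out option, which may help when checking what the PHP code should return. The sketch below again uses a file:// URL as a stand-in for a real site. The variables url_effective, size_download and time_total roughly correspond to the "url", "size_download" and "total_time" keys of curl_getinfo().

```shell
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/page.html"
url="file://$workdir/page.html"   # stand-in; replace with a real http:// URL

# -o /dev/null discards the body; -w prints selected transfer details afterwards.
info=$(curl -s -o /dev/null -w '%{url_effective} %{size_download} %{time_total}' "$url")
echo "$info"
```

For an HTTP URL, %{http_code} can be added to the format string to print the status code as well.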
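Of course, curl is very powerful, and this is only a brief introduction to its usage. More advanced uses, such as simulating a login, still need further study.
Appendix: complete list of curl command-line options: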
linux-4k5v:~ # curl --help
Usage: curl [options...] <url>
Options: (H) means HTTP/HTTPS only, (F) means FTP only
--anyauth Pick "any" authentication method (H)
-a/--append Append to target file when uploading (F/SFTP)
--basic Use HTTP Basic Authentication (H)
--cacert <file> CA certificate to verify peer against (SSL)
--capath <directory> CA directory to verify peer against (SSL)
-E/--cert <cert[:passwd]> Client certificate file and password (SSL)
--cert-type <type> Certificate file type (DER/PEM/ENG) (SSL)
--ciphers <list> SSL ciphers to use (SSL)
--compressed Request compressed response (using deflate or gzip)
-K/--config <file> Specify which config file to read
--connect-timeout <seconds> Maximum time allowed for connection
-C/--continue-at <offset> Resumed transfer offset
-b/--cookie <name=string/file> Cookie string or file to read cookies from (H)
-c/--cookie-jar <file> Write cookies to this file after operation (H)
--create-dirs Create necessary local directory hierarchy
--crlf Convert LF to CRLF in upload
--crlfile <file> Get a CRL list in PEM format from the given file
-d/--data <data> HTTP POST data (H)
--data-ascii <data> HTTP POST ASCII data (H)
--data-binary <data> HTTP POST binary data (H)
--data-urlencode <name=data/name@filename> HTTP POST data url encoded (H)
--digest Use HTTP Digest Authentication (H)
--disable-eprt Inhibit using EPRT or LPRT (F)
--disable-epsv Inhibit using EPSV (F)
-D/--dump-header <file> Write the headers to this file
--egd-file <file> EGD socket path for random data (SSL)
--engine <eng> Crypto engine to use (SSL). "--engine list" for list
-f/--fail Fail silently (no output at all) on HTTP errors (H)
-F/--form <name=content> Specify HTTP multipart POST data (H)
--form-string <name=string> Specify HTTP multipart POST data (H)
--ftp-account <data> Account data to send when requested by server (F)
--ftp-alternative-to-user <cmd> String to replace "USER [name]" (F)
--ftp-create-dirs Create the remote dirs if not present (F)
--ftp-method [multicwd/nocwd/singlecwd] Control CWD usage (F)
--ftp-pasv Use PASV/EPSV instead of PORT (F)
-P/--ftp-port <address> Use PORT with address instead of PASV (F)
--ftp-skip-pasv-ip Skip the IP address for PASV (F)
--ftp-pret Send PRET before PASV (for drftpd) (F)
--ftp-ssl-ccc Send CCC after authenticating (F)
--ftp-ssl-ccc-mode [active/passive] Set CCC mode (F)
--ftp-ssl-control Require SSL/TLS for ftp login, clear for transfer (F)
-G/--get Send the -d data with a HTTP GET (H)
-g/--globoff Disable URL sequences and ranges using {} and []
-H/--header <line> Custom header to pass to server (H)
-I/--head Show document info only
-h/--help This help text
--hostpubmd5 <md5> Hex encoded MD5 string of the host public key. (SSH)
-0/--http1.0 Use HTTP 1.0 (H)
--ignore-content-length Ignore the HTTP Content-Length header
-i/--include Include protocol headers in the output (H/F)
-k/--insecure Allow connections to SSL sites without certs (H)
--interface <interface> Specify network interface/address to use
-4/--ipv4 Resolve name to IPv4 address
-6/--ipv6 Resolve name to IPv6 address
-j/--junk-session-cookies Ignore session cookies read from file (H)
--keepalive-time <seconds> Interval between keepalive probes
--key <key> Private key file name (SSL/SSH)
--key-type <type> Private key file type (DER/PEM/ENG) (SSL)
--krb <level> Enable Kerberos with specified security level (F)
--libcurl <file> Dump libcurl equivalent code of this command line
--limit-rate <rate> Limit transfer speed to this rate
-J/--remote-header-name Use the header-provided filename (H)
-l/--list-only List only names of an FTP directory (F)
--local-port <num>[-num] Force use of these local port numbers
-L/--location Follow Location: hints (H)
--location-trusted Follow Location: and send auth to other hosts (H)
-M/--manual Display the full manual
--mail-from <from> Mail from this address
--mail-rcpt <to> Mail to this receiver(s)
--max-filesize <bytes> Maximum file size to download (H/F)
--max-redirs <num> Maximum number of redirects allowed (H)
-m/--max-time <seconds> Maximum time allowed for the transfer
--negotiate Use HTTP Negotiate Authentication (H)
-n/--netrc Must read .netrc for user name and password
--netrc-optional Use either .netrc or URL; overrides -n
-N/--no-buffer Disable buffering of the output stream
--no-keepalive Disable keepalive use on the connection
--no-sessionid Disable SSL session-ID reusing (SSL)
--noproxy Comma-separated list of hosts which do not use proxy
--ntlm Use HTTP NTLM authentication (H)
-o/--output <file> Write output to <file> instead of stdout
--pass <pass> Pass phrase for the private key (SSL/SSH)
--post301 Do not switch to GET after following a 301 redirect (H)
--post302 Do not switch to GET after following a 302 redirect (H)
-#/--progress-bar Display transfer progress as a progress bar
--proto <protocols> Enable/disable specified protocols
--proto-redir <protocols> Enable/disable specified protocols on redirect
-x/--proxy <host[:port]> Use HTTP proxy on given port
--proxy-anyauth Pick "any" proxy authentication method (H)
--proxy-basic Use Basic authentication on the proxy (H)
--proxy-digest Use Digest authentication on the proxy (H)
--proxy-negotiate Use Negotiate authentication on the proxy (H)
--proxy-ntlm Use NTLM authentication on the proxy (H)
-U/--proxy-user <user[:password]> Set proxy user and password
--proxy1.0 <host[:port]> Use HTTP/1.0 proxy on given port
-p/--proxytunnel Operate through a HTTP proxy tunnel (using CONNECT)
--pubkey <key> Public key file name (SSH)
-Q/--quote <cmd> Send command(s) to server before file transfer (F/SFTP)
--random-file <file> File for reading random data from (SSL)
-r/--range <range> Retrieve only the bytes within a range
--raw Pass HTTP "raw", without any transfer decoding (H)
-e/--referer Referer URL (H)
-O/--remote-name Write output to a file named as the remote file
--remote-name-all Use the remote file name for all URLs
-R/--remote-time Set the remote file's time on the local output
-X/--request <command> Specify request command to use
--retry <num> Retry request <num> times if transient problems occur
--retry-delay <seconds> When retrying, wait this many seconds between each
--retry-max-time <seconds> Retry only within this period
-S/--show-error Show error. With -s, make curl show errors when they occur
-s/--silent Silent mode. Don't output anything
--socks4 <host[:port]> SOCKS4 proxy on given host + port
--socks4a <host[:port]> SOCKS4a proxy on given host + port
--socks5 <host[:port]> SOCKS5 proxy on given host + port
--socks5-hostname <host[:port]> SOCKS5 proxy, pass host name to proxy
--socks5-gssapi-service <name> SOCKS5 proxy service name for gssapi
--socks5-gssapi-nec Compatibility with NEC SOCKS5 server
-Y/--speed-limit Stop transfer if below speed-limit for 'speed-time' secs
-y/--speed-time Time needed to trig speed-limit abort. Defaults to 30
--ssl Try SSL/TLS (FTP, IMAP, POP3, SMTP)
--ssl-reqd Require SSL/TLS (FTP, IMAP, POP3, SMTP)
-2/--sslv2 Use SSLv2 (SSL)
-3/--sslv3 Use SSLv3 (SSL)
--stderr <file> Where to redirect stderr. - means stdout
--tcp-nodelay Use the TCP_NODELAY option
-t/--telnet-option <OPT=val> Set telnet option
--tftp-blksize <value> Set TFTP BLKSIZE option (must be >512)
-z/--time-cond <time> Transfer based on a time condition
-1/--tlsv1 Use TLSv1 (SSL)
--trace <file> Write a debug trace to the given file
--trace-ascii <file> Like --trace but without the hex output
--trace-time Add time stamps to trace/verbose output
-T/--upload-file <file> Transfer <file> to remote site
--url <URL> Set URL to work with
-B/--use-ascii Use ASCII/text transfer
-u/--user <user[:password]> Set server user and password
-A/--user-agent <string> User-Agent to send to server (H)
-v/--verbose Make the operation more talkative
-V/--version Show version number and quit
-w/--write-out <format> What to output after completion
-q If used as the first parameter disables .curlrc