A PHP Learning Journey on openSUSE (10: cURL)

May 30, 2011 Linux, SUSE/openSUSE

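curl is a command-line file transfer tool that works with URL syntax. It supports many protocols: FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE, and LDAP. curl also supports HTTPS certificates, HTTP POST, HTTP PUT, FTP upload, Kerberos authentication, HTTP upload, proxy servers, cookies, user name and password authentication, resuming interrupted downloads and uploads, HTTP proxy tunneling, IPv6, SOCKS5 proxies, and uploading files to an FTP server through an HTTP proxy.

openSUSE 11.4 already ships with curl.

Here are a few examples: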

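1. Save a page's content to a file.
linux-4k5v:~ # curl http://www.linuxsight.com > page.html
or
linux-4k5v:~ # curl -o page.html http://www.linuxsight.com
2. Download a file, keeping its remote name.
linux-4k5v:~ # curl -O http://www.linuxsight.com/wp-content/uploads/2010/07/openSUSE.png
3. Resume an interrupted download. -C takes an offset; "-C -" tells curl to work out the offset from the partial file.
linux-4k5v:~ # curl -C - -O http://www.linuxsight.com/wp-content/uploads/2010/07/openSUSE.png
4. Report HTTP errors. With -f, curl discards the server's error page and exits with code 22 instead.
linux-4k5v:~ # curl -f http://suse.linuxsight.com/linuxsight
5. Use a proxy.
linux-4k5v:~ # curl -x <proxy IP[:port]> -o page.html http://www.linuxsight.com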
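
Examples 1 and 3 can also be tried without network access, because curl handles file:// URLs as well. The following is only a minimal sketch under that assumption; the temporary directory and the file names source.txt, page.html, and partial.txt are made up for illustration and do not come from the site above.

```shell
#!/bin/sh
# Sketch: exercise -o and "-C -" against a local file:// URL (no network needed).
# All paths are hypothetical and live in a throwaway directory.
set -eu

workdir=$(mktemp -d)
printf 'hello from curl\n' > "$workdir/source.txt"
url="file://$workdir/source.txt"

# Example 1: save the body with -o instead of shell redirection.
curl -s -o "$workdir/page.html" "$url"

# Example 3: fake an interrupted download by keeping only the first 5 bytes,
# then let "-C -" derive the offset from the partial file and append the rest.
head -c 5 "$workdir/source.txt" > "$workdir/partial.txt"
curl -s -C - -o "$workdir/partial.txt" "$url"

# cmp exits non-zero (and stops the script) if a copy differs from the source.
cmp "$workdir/source.txt" "$workdir/page.html"
cmp "$workdir/source.txt" "$workdir/partial.txt"
echo "both copies match the source"
```

With -O instead of -o, curl would name the local file after the last part of the URL, which is how example 3 resumes openSUSE.png in place.
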
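PHP also supports cURL. The extension must be enabled first:

linux-4k5v:~ # vim /etc/php5/apache2/php.ini
Enable:  extension=curl.so
(php_curl.dll is the file name used by Windows builds of PHP; on Linux the module is curl.so. On openSUSE the extension usually comes from the separate php5-curl package, which may already enable it through its own ini file.)

Here is a piece of PHP code: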

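<?php
// Two other ways to fetch a page in PHP:
// echo $content = file_get_contents("http://www.linuxsight.com");
// readfile("http://www.linuxsight.com");

// 1. Initialize
$ch = curl_init();
// 2. Set options, including the URL
curl_setopt($ch, CURLOPT_URL, "http://www.linuxsight.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // 1 returns the body as a string; 0 would print it directly
curl_setopt($ch, CURLOPT_HEADER, 0);         // do not include response headers in the output
// 3. Execute once and fetch the HTML document
$output = curl_exec($ch);
if ($output === FALSE) {
    // Note "=== FALSE" rather than "== FALSE": empty output must be told apart from boolean FALSE.
    echo "cURL Error: " . curl_error($ch);
} else {
    echo $output;
}
$info = curl_getinfo($ch);
echo 'Fetching ' . $info['url'] . ' took ' . $info['total_time'] . ' seconds';
// The returned array contains the following information:
// "url"                     // URL of the resource
// "content_type"            // content type (Content-Type header)
// "http_code"               // HTTP status code
// "header_size"             // size of the headers
// "request_size"            // size of the request
// "filetime"                // remote modification time of the document
// "ssl_verify_result"       // result of SSL certificate verification
// "redirect_count"          // number of redirects
// "total_time"              // total time taken
// "namelookup_time"         // DNS lookup time
// "connect_time"            // time to establish the connection
// "pretransfer_time"        // time until the transfer is about to begin
// "size_upload"             // size of uploaded data
// "size_download"           // size of downloaded data
// "speed_download"          // download speed
// "speed_upload"            // upload speed
// "download_content_length" // content length of the download
// "upload_content_length"   // content length of the upload
// "starttransfer_time"      // time until the first byte is transferred
// "redirect_time"           // time spent on redirects
// 4. Release the curl handle
curl_close($ch);
?>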
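
Several of the fields returned by curl_getinfo() have rough counterparts on the command line through the -w/--write-out option. The sketch below is one way to print a few of them; it assumes a local file:// URL so that it runs offline, and the file name sample.txt is hypothetical. Against an HTTP URL, variables such as %{http_code} and %{time_namelookup} become meaningful as well.

```shell
#!/bin/sh
# Sketch: print transfer details with -w, similar in spirit to curl_getinfo().
# The sample file and its path are assumptions made for illustration.
set -eu

workdir=$(mktemp -d)
printf 'sample body\n' > "$workdir/sample.txt"
url="file://$workdir/sample.txt"

# -o /dev/null discards the body, so only the -w report reaches stdout.
report=$(curl -s -o /dev/null \
  -w 'url=%{url_effective}\nsize_download=%{size_download}\ntotal_time=%{time_total}\n' \
  "$url")

printf '%s\n' "$report"
```

The format string is plain text with %{name} placeholders, and curl expands \n inside it, so the report can be shaped for whatever script reads it next.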

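Of course, curl is very powerful, and this is only a brief introduction to its usage. More advanced uses, such as simulating a login, still need further study.
Appendix: the full list of curl command-line options on Linux:

linux-4k5v:~ # curl --help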

Usage: curl [options...] <url>
Options: (H) means HTTP/HTTPS only, (F) means FTP only
    --anyauth       Pick "any" authentication method (H)
 -a/--append        Append to target file when uploading (F/SFTP)
    --basic         Use HTTP Basic Authentication (H)
    --cacert <file> CA certificate to verify peer against (SSL)
    --capath <directory> CA directory to verify peer against (SSL)
 -E/--cert <cert[:passwd]> Client certificate file and password (SSL)
    --cert-type <type> Certificate file type (DER/PEM/ENG) (SSL)
    --ciphers <list> SSL ciphers to use (SSL)
    --compressed    Request compressed response (using deflate or gzip)
 -K/--config <file> Specify which config file to read
    --connect-timeout <seconds> Maximum time allowed for connection
 -C/--continue-at <offset> Resumed transfer offset
 -b/--cookie <name=string/file> Cookie string or file to read cookies from (H)
 -c/--cookie-jar <file> Write cookies to this file after operation (H)
    --create-dirs   Create necessary local directory hierarchy
    --crlf          Convert LF to CRLF in upload
    --crlfile <file> Get a CRL list in PEM format from the given file
 -d/--data <data>   HTTP POST data (H)
    --data-ascii <data>  HTTP POST ASCII data (H)
    --data-binary <data> HTTP POST binary data (H)
    --data-urlencode <name=data/name@filename> HTTP POST data url encoded (H)
    --digest        Use HTTP Digest Authentication (H)
    --disable-eprt  Inhibit using EPRT or LPRT (F)
    --disable-epsv  Inhibit using EPSV (F)
 -D/--dump-header <file> Write the headers to this file
    --egd-file <file> EGD socket path for random data (SSL)
    --engine <eng>  Crypto engine to use (SSL). "--engine list" for list
 -f/--fail          Fail silently (no output at all) on HTTP errors (H)
 -F/--form <name=content> Specify HTTP multipart POST data (H)
    --form-string <name=string> Specify HTTP multipart POST data (H)
    --ftp-account <data> Account data to send when requested by server (F)
    --ftp-alternative-to-user <cmd> String to replace "USER [name]" (F)
    --ftp-create-dirs Create the remote dirs if not present (F)
    --ftp-method [multicwd/nocwd/singlecwd] Control CWD usage (F)
    --ftp-pasv      Use PASV/EPSV instead of PORT (F)
 -P/--ftp-port <address> Use PORT with address instead of PASV (F)
    --ftp-skip-pasv-ip Skip the IP address for PASV (F)
    --ftp-pret      Send PRET before PASV (for drftpd) (F)
    --ftp-ssl-ccc   Send CCC after authenticating (F)
    --ftp-ssl-ccc-mode [active/passive] Set CCC mode (F)
    --ftp-ssl-control Require SSL/TLS for ftp login, clear for transfer (F)
 -G/--get           Send the -d data with a HTTP GET (H)
 -g/--globoff       Disable URL sequences and ranges using {} and []
 -H/--header <line> Custom header to pass to server (H)
 -I/--head          Show document info only
 -h/--help          This help text
    --hostpubmd5 <md5> Hex encoded MD5 string of the host public key. (SSH)
 -0/--http1.0       Use HTTP 1.0 (H)
    --ignore-content-length  Ignore the HTTP Content-Length header
 -i/--include       Include protocol headers in the output (H/F)
 -k/--insecure      Allow connections to SSL sites without certs (H)
    --interface <interface> Specify network interface/address to use
 -4/--ipv4          Resolve name to IPv4 address
 -6/--ipv6          Resolve name to IPv6 address
 -j/--junk-session-cookies Ignore session cookies read from file (H)
    --keepalive-time <seconds> Interval between keepalive probes
    --key <key>     Private key file name (SSL/SSH)
    --key-type <type> Private key file type (DER/PEM/ENG) (SSL)
    --krb <level>   Enable Kerberos with specified security level (F)
    --libcurl <file> Dump libcurl equivalent code of this command line
    --limit-rate <rate> Limit transfer speed to this rate
 -J/--remote-header-name Use the header-provided filename (H)
 -l/--list-only     List only names of an FTP directory (F)
    --local-port <num>[-num] Force use of these local port numbers
 -L/--location      Follow Location: hints (H)
    --location-trusted Follow Location: and send auth to other hosts (H)
 -M/--manual        Display the full manual
    --mail-from <from> Mail from this address
    --mail-rcpt <to> Mail to this receiver(s)
    --max-filesize <bytes> Maximum file size to download (H/F)
    --max-redirs <num> Maximum number of redirects allowed (H)
 -m/--max-time <seconds> Maximum time allowed for the transfer
    --negotiate     Use HTTP Negotiate Authentication (H)
 -n/--netrc         Must read .netrc for user name and password
    --netrc-optional Use either .netrc or URL; overrides -n
 -N/--no-buffer     Disable buffering of the output stream
    --no-keepalive  Disable keepalive use on the connection
    --no-sessionid  Disable SSL session-ID reusing (SSL)
    --noproxy       Comma-separated list of hosts which do not use proxy
    --ntlm          Use HTTP NTLM authentication (H)
 -o/--output <file> Write output to <file> instead of stdout
    --pass  <pass>  Pass phrase for the private key (SSL/SSH)
    --post301       Do not switch to GET after following a 301 redirect (H)
    --post302       Do not switch to GET after following a 302 redirect (H)
 -#/--progress-bar  Display transfer progress as a progress bar
    --proto <protocols>       Enable/disable specified protocols
    --proto-redir <protocols> Enable/disable specified protocols on redirect
 -x/--proxy <host[:port]> Use HTTP proxy on given port
    --proxy-anyauth Pick "any" proxy authentication method (H)
    --proxy-basic   Use Basic authentication on the proxy (H)
    --proxy-digest  Use Digest authentication on the proxy (H)
    --proxy-negotiate Use Negotiate authentication on the proxy (H)
    --proxy-ntlm    Use NTLM authentication on the proxy (H)
 -U/--proxy-user <user[:password]> Set proxy user and password
    --proxy1.0 <host[:port]> Use HTTP/1.0 proxy on given port
 -p/--proxytunnel   Operate through a HTTP proxy tunnel (using CONNECT)
    --pubkey <key>  Public key file name (SSH)
 -Q/--quote <cmd>   Send command(s) to server before file transfer (F/SFTP)
    --random-file <file> File for reading random data from (SSL)
 -r/--range <range> Retrieve only the bytes within a range
    --raw           Pass HTTP "raw", without any transfer decoding (H)
 -e/--referer       Referer URL (H)
 -O/--remote-name   Write output to a file named as the remote file
    --remote-name-all Use the remote file name for all URLs
 -R/--remote-time   Set the remote file's time on the local output
 -X/--request <command> Specify request command to use
    --retry <num>   Retry request <num> times if transient problems occur
    --retry-delay <seconds> When retrying, wait this many seconds between each
    --retry-max-time <seconds> Retry only within this period
 -S/--show-error    Show error. With -s, make curl show errors when they occur
 -s/--silent        Silent mode. Don't output anything
    --socks4 <host[:port]> SOCKS4 proxy on given host + port
    --socks4a <host[:port]> SOCKS4a proxy on given host + port
    --socks5 <host[:port]> SOCKS5 proxy on given host + port
    --socks5-hostname <host[:port]> SOCKS5 proxy, pass host name to proxy
    --socks5-gssapi-service <name> SOCKS5 proxy service name for gssapi
    --socks5-gssapi-nec  Compatibility with NEC SOCKS5 server
 -Y/--speed-limit   Stop transfer if below speed-limit for 'speed-time' secs
 -y/--speed-time    Time needed to trig speed-limit abort. Defaults to 30
    --ssl           Try SSL/TLS (FTP, IMAP, POP3, SMTP)
    --ssl-reqd      Require SSL/TLS (FTP, IMAP, POP3, SMTP)
 -2/--sslv2         Use SSLv2 (SSL)
 -3/--sslv3         Use SSLv3 (SSL)
    --stderr <file> Where to redirect stderr. - means stdout
    --tcp-nodelay   Use the TCP_NODELAY option
 -t/--telnet-option <OPT=val> Set telnet option
    --tftp-blksize <value> Set TFTP BLKSIZE option (must be >512)
 -z/--time-cond <time> Transfer based on a time condition
 -1/--tlsv1         Use TLSv1 (SSL)
    --trace <file>  Write a debug trace to the given file
    --trace-ascii <file> Like --trace but without the hex output
    --trace-time    Add time stamps to trace/verbose output
 -T/--upload-file <file> Transfer <file> to remote site
    --url <URL>     Set URL to work with
 -B/--use-ascii     Use ASCII/text transfer
 -u/--user <user[:password]> Set server user and password
 -A/--user-agent <string> User-Agent to send to server (H)
 -v/--verbose       Make the operation more talkative
 -V/--version       Show version number and quit
 -w/--write-out <format> What to output after completion
 -q                 If used as the first parameter disables .curlrc
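
A couple of the options in this listing are easy to try offline. The sketch below combines -r/--range and --create-dirs against a local file:// URL; the file digits.txt and the nested output path are hypothetical names chosen for the illustration.

```shell
#!/bin/sh
# Sketch: try -r/--range and --create-dirs on a local file (no network needed).
# digits.txt and nested/dir/copy.txt are made-up names for this example.
set -eu

workdir=$(mktemp -d)
printf '0123456789' > "$workdir/digits.txt"
url="file://$workdir/digits.txt"

# -r takes a zero-based, inclusive byte range: 2-5 selects four bytes.
slice=$(curl -s -r 2-5 "$url")

# --create-dirs makes curl create the missing parent directories of the -o path.
curl -s --create-dirs -o "$workdir/nested/dir/copy.txt" "$url"

printf 'slice=%s\n' "$slice"
ls "$workdir/nested/dir"
```

The same -r syntax applies to HTTP and FTP URLs, provided the server honors range requests.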
