How can I get a web page's content and save it into a string variable?


WebClient client = new WebClient();
string content = client.DownloadString(url);

Pass the URL of the page you want to fetch. You can parse the result with HtmlAgilityPack.


Best answer

You can use WebClient:

using System.Net;

using (WebClient client = new WebClient())
{
    string downloadString = client.DownloadString("http://www.gooogle.com");
}


I have run into issues with WebClient.DownloadString before. If you do too, you can try this:

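// Requires: using System; using System.IO; using System.Net;
WebRequest request = WebRequest.Create("http://www.google.com");
string html = String.Empty;
// Dispose the response, stream, and reader once the body has been read.
using (WebResponse response = request.GetResponse())
using (Stream data = response.GetResponseStream())
using (StreamReader sr = new StreamReader(data))
{
    html = sr.ReadToEnd();
}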


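I recommend not using WebClient.DownloadString. This is because (at least in .NET 3.5) DownloadString is not smart enough to use or remove the BOM, should one be present. This can result in the BOM (ï»¿) incorrectly appearing as part of the string when UTF-8 data is returned (at least without a charset). Ick!

Instead, this slight variation will work correctly with BOMs: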

string ReadTextFromUrl(string url) {
    // WebClient is still convenient
    // Assume UTF8, but detect BOM - could also honor response charset I suppose
    using (var client = new WebClient())
    using (var stream = client.OpenRead(url))
    using (var textReader = new StreamReader(stream, Encoding.UTF8, true)) {
        return textReader.ReadToEnd();
    }
}
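
To see what BOM detection changes, here is a small offline sketch. It does not call DownloadString. The in-memory byte array is a made-up stand-in for a UTF-8 response body that starts with a BOM, and the class and method names (BomDemo, DecodeNaively, DecodeWithBomDetection) are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class BomDemo
{
    // Decodes the bytes as UTF-8 with no BOM handling at all.
    public static string DecodeNaively(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Decodes through a StreamReader with BOM detection enabled,
    // using the same constructor arguments as ReadTextFromUrl.
    public static string DecodeWithBomDetection(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Assumed sample data: UTF-8 BOM (EF BB BF) followed by the ASCII text "hi".
        byte[] payload = { 0xEF, 0xBB, 0xBF, 0x68, 0x69 };

        Console.WriteLine(DecodeNaively(payload).Length);          // 3: U+FEFF is kept
        Console.WriteLine(DecodeWithBomDetection(payload).Length); // 2: BOM is consumed
    }
}
```

The naive decode keeps U+FEFF as the first character of the string, which is the stray prefix described above. The StreamReader consumes it before returning the text.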


I have always used WebClient, but at the time of this post (.NET 6 is available), WebClient is being deprecated.

The preferred way is:

HttpClient client = new HttpClient();
string content = await client.GetStringAsync(url);
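
For context, a fuller sketch of the same HttpClient call as a small console program might look like the following. The PageFetcher class, the GetPageAsync helper, and the choice of URL are assumptions made for illustration. Reusing one shared HttpClient rather than creating one per request is a common recommendation, not something stated in the answer above.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class PageFetcher
{
    // One shared instance, reused for every request in the application.
    private static readonly HttpClient SharedClient = new HttpClient();

    // Hypothetical helper: downloads the body at the given URL as a string.
    public static Task<string> GetPageAsync(HttpClient client, string url)
    {
        return client.GetStringAsync(url);
    }

    static async Task Main()
    {
        try
        {
            string content = await GetPageAsync(SharedClient, "http://www.google.com");
            Console.WriteLine(content.Length);
        }
        catch (HttpRequestException ex)
        {
            // GetStringAsync throws on network errors and non-success status codes.
            Console.WriteLine("Request failed: " + ex.Message);
        }
    }
}
```

Passing the client in as a parameter keeps the helper easy to exercise with a stubbed HttpMessageHandler, so it can be tested without a network connection.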