如何从网站获取HTML代码,进行保存并通过LINQ表达式查找一些文本?
我正在使用以下代码来获取网页的来源:
public static String code(string Url) { HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(Url); myRequest.Method = "GET"; WebResponse myResponse = myRequest.GetResponse(); StreamReader sr = new StreamReader(myResponse.GetResponseStream(), System.Text.Encoding.UTF8); string result = sr.ReadToEnd(); sr.Close(); myResponse.Close(); return result; }
如何在网页源中的div中查找文本?
从网站获取HTML代码。您可以使用这样的代码。
string urlAddress = "http://google.com"; HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress); HttpWebResponse response = (HttpWebResponse)request.GetResponse(); if (response.StatusCode == HttpStatusCode.OK) { Stream receiveStream = response.GetResponseStream(); StreamReader readStream = null; if (String.IsNullOrWhiteSpace(response.CharacterSet)) readStream = new StreamReader(receiveStream); else readStream = new StreamReader(receiveStream, Encoding.GetEncoding(response.CharacterSet)); string data = readStream.ReadToEnd(); response.Close(); readStream.Close(); }
这将为您提供从网站返回的 HTML 代码。但是通过 LINQ 查找文本并不是那么容易。也许使用正则表达式会更好,但不能与 HTML 代码一起很好地使用