- HTML code
<tr id="tr_domains_16969543" style="cursor:auto;" onclick="selRow(this);" onmouseover="tr_Mouseover(this)" onmouseout="tr_Mouseout(this)">
  <td class="domainname" >
    <div class="domainurl">
      <a href="http://whois.chinaz.com/30n.net" id="domain_1" target="_blank" title="查看">a</a>
    </div>
  </td>
  <td>b</td>
  <td>c</td>
  <td>d</td>
  <td>d</td>
</tr>
<tr id="tr_domains_16969543" style="cursor:auto;" onclick="selRow(this);" onmouseover="tr_Mouseover(this)" onmouseout="tr_Mouseout(this)">
  <td class="domainname" >
    <div class="domainurl">
      <a href="http://whois.chinaz.com/30n.net" id="domain_1" target="_blank" title="查看">a</a>
    </div>
  </td>
  <td>b</td>
  <td>c</td>
  <td>d</td>
  <td>d</td>
</tr>
<tr id="tr_domains_16969543" style="cursor:auto;" onclick="selRow(this);" onmouseover="tr_Mouseover(this)" onmouseout="tr_Mouseout(this)">
  <td class="domainname" >
    <div class="domainurl">
      <a href="http://whois.chinaz.com/30n.net" id="domain_1" target="_blank" title="查看">a</a>
    </div>
  </td>
  <td>b</td>
  <td>c</td>
  <td>d</td>
  <td>e</td>
</tr>
<tr id="tr_domains_16969543" style="cursor:auto;" onclick="selRow(this);" onmouseover="tr_Mouseover(this)" onmouseout="tr_Mouseout(this)">
  <td class="domainname" >
    <div class="domainurl">
      <a href="http://whois.chinaz.com/30n.net" id="domain_1" target="_blank" title="查看">a</a>
    </div>
  </td>
  <td>b</td>
  <td>c</td>
  <td>d</td>
  <td>e</td>
</tr>
How can I extract each group of values inside the td cells separately, row by row?
------Solution--------------------------------------------------------
- C# code
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

string url = "http://del.chinaz.com/";
WebRequest request = WebRequest.Create(url); // build the request for the URL
string tempStr;
using (WebResponse response = request.GetResponse()) // fetch the page
using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.GetEncoding("utf-8")))
{
    tempStr = reader.ReadToEnd();
}

// One match per <tr id="tr_domains...">; named groups a-e capture the link text and the four following cells.
string pattern = @"(?i)<tr[^>]*?id=(['""]?)tr_domains[^'""]*?\1[^>]*?>[\s\S]*?<a[^>]*?id=(['""]?)domain[^'""]*?\2";
pattern += @"[^>]*?>(?<a>[^>]*?)</a>[\s\S]*?<td[^>]*?>(?<b>[\s\S]*?)</td>\s*?<td[^>]*?>(?<c>[\s\S]*?)</td>\s*?";
pattern += @"<td[^>]*?>(?<d>[\s\S]*?)</td>\s*?<td[^>]*?>(?<e>[\s\S]*?)</td>\s*?";

// Loop over every matching row
foreach (Match m in Regex.Matches(tempStr, pattern))
{
    string a = m.Groups["a"].Value;  // e.g. 1dq.net
    string b = m.Groups["b"].Value;  // e.g. 3
    string c = m.Groups["c"].Value;  // e.g. net
    string d = m.Groups["d"].Value;  // e.g. 2012-08-26
    string e1 = m.Groups["e"].Value; // e.g. Delete
}
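To try the pattern without hitting the site, here is a minimal sketch that runs the same regex against an in-memory row shaped like the HTML in the question. The names `DomainRowParser`, `Parse`, and `SampleHtml` are made up for illustration and are not part of the original answer; the sample row is a trimmed copy of the question's markup, not real page data.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DomainRowParser
{
    // Same pattern as in the answer, joined into one constant.
    // \1 and \2 refer to the optional quote captured around each id value.
    private const string Pattern =
        @"(?i)<tr[^>]*?id=(['""]?)tr_domains[^'""]*?\1[^>]*?>[\s\S]*?<a[^>]*?id=(['""]?)domain[^'""]*?\2" +
        @"[^>]*?>(?<a>[^>]*?)</a>[\s\S]*?<td[^>]*?>(?<b>[\s\S]*?)</td>\s*?<td[^>]*?>(?<c>[\s\S]*?)</td>\s*?" +
        @"<td[^>]*?>(?<d>[\s\S]*?)</td>\s*?<td[^>]*?>(?<e>[\s\S]*?)</td>";

    // Hypothetical sample input: one row modeled on the question's HTML.
    public const string SampleHtml =
        @"<tr id=""tr_domains_16969543"" style=""cursor:auto;""> <td class=""domainname"" > " +
        @"<div class=""domainurl""> <a href=""http://whois.chinaz.com/30n.net"" id=""domain_1"" " +
        @"target=""_blank"">a</a> </div> </td> <td>b</td> <td>c</td> <td>d</td> <td>e</td></tr>";

    // Returns one string[5] per matched row: link text, then the four following cells.
    public static List<string[]> Parse(string html)
    {
        var rows = new List<string[]>();
        foreach (Match m in Regex.Matches(html, Pattern))
        {
            rows.Add(new[]
            {
                m.Groups["a"].Value.Trim(),
                m.Groups["b"].Value.Trim(),
                m.Groups["c"].Value.Trim(),
                m.Groups["d"].Value.Trim(),
                m.Groups["e"].Value.Trim(),
            });
        }
        return rows;
    }

    public static void Main()
    {
        // Print each row's five fields on one line.
        foreach (string[] row in Parse(SampleHtml))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```

Collecting the fields into a list keeps the parsing separate from whatever you do with the values afterwards. If the real page nests tags inside the cells or changes attribute order, a regex like this can miss rows; an HTML parser would likely be sturdier in that case.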