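I am scraping a website with Selenium: "https://www.medline.com/catalog/category-products.jsp?itemId=Z05-CA02_03&N=111079+4294770643&iclp=Z05-CA02_03"
For a single page and a single product, I can scrape by passing the product URL directly. What I am trying to do with Selenium is automate this: select every product on the listing page one by one, open its product detail page, scrape that page (the scraping itself is done with Beautiful Soup), and then move on to the next page. Here is one of the product URLs reached from the base URL: "https://www.medline.com/product/SensiCare-Powder-Free-Nitrile-Exam-Gloves/SensiCare/Z05-PF00342?question=&index=P1&indexCount=1"
Here is my code: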
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    chromeOptions = webdriver.ChromeOptions()
    chromeOptions.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(executable_path='C:/Users/ptiwar34/Documents/chromedriver.exe', chrome_options=chromeOptions, desired_capabilities=chromeOptions.to_capabilities())
    driver.get("https://www.medline.com/catalog/category-products.jsp?itemId=Z05-CA02_03&N=111079+4294770643&iclp=Z05-CA02_03")
    while True:
        try:
            # Wait for a product title link to become clickable, then click it
            WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[contains(@class, 'resultGalleryViewRow')]//div[@class='medGridProdTitle']//a[contains(@href]"))).click()
            print("Clicked for next page")
        except TimeoutException:
            print("No more pages")
            break
    driver.quit()
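This does not raise any error.
However, it does not open each product's page. I want to open each product in a new tab, scrape it, close that tab, and then open a new tab for the next product.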
To open each product from the webpage https://www.medline.com/catalog/category-products.jsp?itemId=Z05-CA02_03&N=111079+4294770643&iclp=Z05-CA02_03 in a new tab and scrape it, you have to induce WebDriverWait for number_of_windows_to_be(2), and you can use the following locator strategy:
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    import time

    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("start-maximized")
    driver = webdriver.Chrome(options=chrome_options, executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get("https://www.medline.com/catalog/category-products.jsp?itemId=Z05-CA02_03&N=111079+4294770643&iclp=Z05-CA02_03")
    my_hrefs = [my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class, 'resultGalleryViewRow')]//div[@class='medGridProdTitle']//a")))]
    windows_before = driver.current_window_handle  # Store the parent window handle for future use
    for my_href in my_hrefs:
        driver.execute_script("window.open('" + my_href + "');")
        WebDriverWait(driver, 10).until(EC.number_of_windows_to_be(2))  # Induce WebDriverWait for the number_of_windows_to_be 2
        windows_after = driver.window_handles
        new_window = [x for x in windows_after if x != windows_before][0]  # Identify the newly opened window
        driver.switch_to.window(new_window)  # Switch to the new window
        time.sleep(3)  # Perform your web scraping here
        print(driver.title)  # Print the page title, or perform your web scraping
        driver.close()  # Close the new window
        driver.switch_to.window(windows_before)  # Switch back to the parent window handle
    driver.quit()  # Quit your program
Console output:

    SensiCare Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
    MediGuard Vinyl Synthetic Exam Gloves | Medline Industries, Inc.
    CURAD Stretch Vinyl Exam Gloves | Medline Industries, Inc.
    CURAD Nitrile Exam Gloves | Medline Industries, Inc.
    SensiCare Ice Blue Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
    MediGuard Synthetic Exam Gloves | Medline Industries, Inc.
    Accutouch Synthetic Exam Gloves | Medline Industries, Inc.
    Aloetouch Ice Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
    Aloetouch 3G Powder-Free Synthetic Exam Gloves | Medline Industries, Inc.
    SensiCare Powder-Free Stretch Vinyl Sterile Exam Gloves | Medline Industries, Inc.
    CURAD Powder-Free Textured Latex Exam Gloves | Medline Industries, Inc.
    Accutouch Chemo Nitrile Exam Gloves | Medline Industries, Inc.
    Aloetouch 12" Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
    Ultra Stretch Synthetic Exam Gloves | Medline Industries, Inc.
    Generation Pink 3G Synthetic Exam Gloves | Medline Industries, Inc.
    SensiCare Extended Cuff Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
    Eudermic MP High-Risk Powder-Free Latex Exam Gloves | Medline Industries, Inc.
    Aloetouch Powder-Free Latex Exam Gloves | Medline Industries, Inc.
    CURAD Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
    Medline Sterile Powder-Free Latex Exam Gloves | Medline Industries, Inc.
    SensiCare Silk Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
    Medline Sterile Powder-Free Latex Exam Glove Pairs | Medline Industries, Inc.
    MediGuard 2.0 Nitrile Exam Gloves | Medline Industries, Inc.
    Designer Boxed Vinyl Exam Gloves | Medline Industries, Inc.
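If the scraping step can raise an exception, the new tab may be left open and the next iteration would then see three windows instead of two. One possible way to guard against that is to wrap the open, switch, close, switch-back cycle in a context manager. The sketch below is only an illustration: `wait_for_new_handle` and `product_tab` are hypothetical helper names, not part of Selenium, and the helper polls `driver.window_handles` with the standard library instead of importing `WebDriverWait`, so the snippet has no dependencies. In real use, `EC.number_of_windows_to_be(2)` does the same job.

```python
import time
from contextlib import contextmanager


def wait_for_new_handle(driver, parent_handle, timeout=10.0, poll=0.2):
    """Poll driver.window_handles until exactly one non-parent handle exists.

    Hypothetical helper; stands in for
    WebDriverWait(driver, 10).until(EC.number_of_windows_to_be(2)).
    """
    deadline = time.monotonic() + timeout
    while True:
        extra = [h for h in driver.window_handles if h != parent_handle]
        if len(extra) == 1:
            return extra[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "expected exactly one new window, found %d" % len(extra)
            )
        time.sleep(poll)


@contextmanager
def product_tab(driver, url, parent_handle):
    """Open url in a new tab, switch to it, and always close it afterwards."""
    # Passing the URL as a script argument avoids quoting problems
    # that string concatenation can cause.
    driver.execute_script("window.open(arguments[0]);", url)
    new_handle = wait_for_new_handle(driver, parent_handle)
    driver.switch_to.window(new_handle)
    try:
        yield new_handle
    finally:
        driver.close()                          # close the product tab
        driver.switch_to.window(parent_handle)  # return to the listing page


# Possible usage with the names from the answer's code:
#
# for my_href in my_hrefs:
#     with product_tab(driver, my_href, windows_before):
#         print(driver.title)  # or hand driver.page_source to Beautiful Soup
```

Because the cleanup sits in a `finally` clause, the parent window is restored even when scraping one product fails, so the remaining products can still be processed.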