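How can I capture a screenshot of a specific element rather than the entire page using Selenium WebDriver?

I'm currently trying to capture a screenshot using Selenium WebDriver, but I can only get a screenshot of the whole page. What I want is to capture just part of the page, or just a specific element identified by its ID or another element locator. (For example, I want to capture the picture with image id = "Butterfly".)

Is there any way to capture a screenshot of a selected item or element?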


We can get a screenshot of the element by cropping the full-page screenshot, as below:

driver.get("http://www.google.com");
WebElement ele = driver.findElement(By.id("hplogo"));


// Get entire page screenshot
File screenshot = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
BufferedImage  fullImg = ImageIO.read(screenshot);


// Get the location of element on the page
Point point = ele.getLocation();


// Get width and height of the element
int eleWidth = ele.getSize().getWidth();
int eleHeight = ele.getSize().getHeight();


// Crop the entire page screenshot to get only element screenshot
BufferedImage eleScreenshot= fullImg.getSubimage(point.getX(), point.getY(),
eleWidth, eleHeight);
ImageIO.write(eleScreenshot, "png", screenshot);


// Copy the element screenshot to disk
File screenshotLocation = new File("C:\\images\\GoogleLogo_screenshot.png");
FileUtils.copyFile(screenshot, screenshotLocation);

I wasted a lot of time on taking screenshots and I want to save you yours. I used Chrome + Selenium + C# and the result was totally horrible. Finally I wrote this function:

driver.Manage().Window.Maximize();
RemoteWebElement remElement = (RemoteWebElement)driver.FindElement(By.Id("submit-button"));
Point location = remElement.LocationOnScreenOnceScrolledIntoView;


int viewportWidth = Convert.ToInt32(((IJavaScriptExecutor)driver).ExecuteScript("return document.documentElement.clientWidth"));
int viewportHeight = Convert.ToInt32(((IJavaScriptExecutor)driver).ExecuteScript("return document.documentElement.clientHeight"));


driver.SwitchTo();


int elementLocation_X = location.X;
int elementLocation_Y = location.Y;


IWebElement img = driver.FindElement(By.Id("submit-button"));


int elementSize_Width = img.Size.Width;
int elementSize_Height = img.Size.Height;


Size s = new Size();
s.Width = driver.Manage().Window.Size.Width;
s.Height = driver.Manage().Window.Size.Height;


Bitmap bitmap = new Bitmap(s.Width, s.Height);
Graphics graphics = Graphics.FromImage(bitmap as Image);
graphics.CopyFromScreen(0, 0, 0, 0, s);


bitmap.Save(filePath, System.Drawing.Imaging.ImageFormat.Png);


RectangleF part = new RectangleF(elementLocation_X, elementLocation_Y + (s.Height - viewportHeight), elementSize_Width, elementSize_Height);


Bitmap bmpobj = (Bitmap)Image.FromFile(filePath);
Bitmap bn = bmpobj.Clone(part, bmpobj.PixelFormat);
bn.Save(finalPictureFilePath, System.Drawing.Imaging.ImageFormat.Png);

In Node.js, I wrote the following code, which works. It is based not on Selenium's official WebDriverJS but on SauceLabs's WebDriver, WD.js, together with a very compact image library called EasyImage.

I want to emphasize that you cannot really take a screenshot of an element. Instead, first take a screenshot of the whole page, then select the part of the page you want and crop it:

browser.get(URL_TO_VISIT)
.waitForElementById(dependentElementId, webdriver.asserters.isDisplayed, 3000)
.elementById(elementID)
.getSize().then(function(size) {
browser.elementById(elementID)
.getLocation().then(function(location) {
browser.takeScreenshot().then(function(data) {
var base64Data = data.replace(/^data:image\/png;base64,/, "");
fs.writeFile(filePath, base64Data, 'base64', function(err) {
if (err) {
console.log(err);
}
else {
cropInFile(size, location, filePath);
}
doneCallback();
});
});
});
});

The cropInFile function looks like this:

var cropInFile = function(size, location, srcFile) {
easyimg.crop({
src: srcFile,
dst: srcFile,
cropwidth: size.width,
cropheight: size.height,
x: location.x,
y: location.y,
gravity: 'North-West'
},
function(err, stdout, stderr) {
if (err) throw err;
});
};

Surya's answer works great if you don't mind involving disk I/O. If you'd rather avoid it, this method may be better for you:

private Image getScreenshot(final WebDriver d, final WebElement e) throws IOException {
final BufferedImage img;
final Point topleft;
final Point bottomright;


final byte[] screengrab;
screengrab = ((TakesScreenshot) d).getScreenshotAs(OutputType.BYTES);


img = ImageIO.read(new ByteArrayInputStream(screengrab));


//crop the image to focus on e
//get dimensions (crop points)
topleft = e.getLocation();
bottomright = new Point(e.getSize().getWidth(),
e.getSize().getHeight());


return img.getSubimage(topleft.getX(),
topleft.getY(),
bottomright.getX(),
bottomright.getY());
}

If you prefer, you can skip declaring screengrab and instead do

img = ImageIO.read(
new ByteArrayInputStream(
((TakesScreenshot) d).getScreenshotAs(OutputType.BYTES)));

which is cleaner, but I left it in for clarity. You can then save it as a file or put it in a JPanel to your heart's content.

The AShot framework from Yandex can be used for taking screenshots in Selenium WebDriver scripts for

  • full web pages
  • web elements

This framework can be found on https://github.com/yandex-qatools/ashot.

The code for taking the screenshots is very straightforward:

ENTIRE PAGE

Screenshot screenshot = new AShot()
.shootingStrategy(new ViewportPastingStrategy(1000))
.takeScreenshot(driver);


ImageIO.write(screenshot.getImage(), "PNG", new File("c:\\temp\\results.png"));

SPECIFIC WEB ELEMENT

Screenshot screenshot = new AShot()
.takeScreenshot(driver, driver.findElement(By.xpath("(//div[@id='ct_search'])[1]")));
    

ImageIO.write(screenshot.getImage(), "PNG", new File("c:\\temp\\div_element.png"));

See more details and more code samples in this article.

public void GenerateSnapshot(string url, string selector, string filePath)
{
using (IWebDriver driver = new ChromeDriver())
{
driver.Navigate().GoToUrl(url);
var remElement = driver.FindElement(By.CssSelector(selector));
Point location = remElement.Location;


var screenshot = (driver as ChromeDriver).GetScreenshot();
using (MemoryStream stream = new MemoryStream(screenshot.AsByteArray))
{
using (Bitmap bitmap = new Bitmap(stream))
{
RectangleF part = new RectangleF(location.X, location.Y, remElement.Size.Width, remElement.Size.Height);
using (Bitmap bn = bitmap.Clone(part, bitmap.PixelFormat))
{
bn.Save(filePath, System.Drawing.Imaging.ImageFormat.Png);
}
}
}
driver.Close();
}
}

For everyone asking for code in C#, below is a simplified version of my implementation.

public static void TakeScreenshot(IWebDriver driver, IWebElement element)
{
try
{
string fileName = DateTime.Now.ToString("yyyy-MM-dd HH-mm-ss") + ".jpg";
Byte[] byteArray = ((ITakesScreenshot)driver).GetScreenshot().AsByteArray;
System.Drawing.Bitmap screenshot = new System.Drawing.Bitmap(new System.IO.MemoryStream(byteArray));
System.Drawing.Rectangle croppedImage = new System.Drawing.Rectangle(element.Location.X, element.Location.Y, element.Size.Width, element.Size.Height);
screenshot = screenshot.Clone(croppedImage, screenshot.PixelFormat);
// Pass the image format to Save (not to String.Format) so the file really is a JPEG
screenshot.Save(@"C:\SeleniumScreenshots\" + fileName, System.Drawing.Imaging.ImageFormat.Jpeg);
}
catch (Exception e)
{
logger.Error(e.StackTrace + ' ' + e.Message);
}
}

I'm using a modified version of @Brook's answer, and it works fine even for elements that need the page to be scrolled.

public void TakeScreenshot(string fileNameWithoutExtension, IWebElement element)
{
// Scroll to the element if necessary
var actions = new Actions(_driver);
actions.MoveToElement(element);
actions.Perform();
// Get the element position (scroll-aware)
var locationWhenScrolled = ((RemoteWebElement) element).LocationOnScreenOnceScrolledIntoView;
var fileName = fileNameWithoutExtension + ".png";
var byteArray = ((ITakesScreenshot) _driver).GetScreenshot().AsByteArray;
using (var screenshot = new System.Drawing.Bitmap(new System.IO.MemoryStream(byteArray)))
{
var location = locationWhenScrolled;
// Fix location if necessary to avoid OutOfMemory Exception
if (location.X + element.Size.Width > screenshot.Width)
{
location.X = screenshot.Width - element.Size.Width;
}
if (location.Y + element.Size.Height > screenshot.Height)
{
location.Y = screenshot.Height - element.Size.Height;
}
// Crop the screenshot
var croppedImage = new System.Drawing.Rectangle(location.X, location.Y, element.Size.Width, element.Size.Height);
using (var clone = screenshot.Clone(croppedImage, screenshot.PixelFormat))
{
clone.Save(fileName, ImageFormat.Png);
}
}
}

The two ifs were necessary (at least for the Chrome driver) because, when scrolling was needed, the crop rectangle extended 1 pixel past the edge of the screenshot.
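The same clamping idea can be written as a small standalone helper. This is only a sketch in Python, not part of the answer above: the function name `clamp_crop_box` and its parameters are made up for illustration, and it assumes all values are in screenshot pixels.

```python
def clamp_crop_box(x, y, width, height, image_width, image_height):
    """Return (left, top, right, bottom) for a crop that stays inside the image.

    Hypothetical helper: shifts the box back when it overruns the right or
    bottom edge, mirroring the two `if` checks in the C# answer above.
    """
    # Shift the box left/up if it would extend past the right/bottom edge
    left = min(x, image_width - width)
    top = min(y, image_height - height)
    # Never start before the top-left corner (element larger than the screenshot)
    left = max(left, 0)
    top = max(top, 0)
    # Cap the far edges at the image size
    right = min(left + width, image_width)
    bottom = min(top + height, image_height)
    return left, top, right, bottom
```

For example, an element reported at y = 700 with height 69 in a 768-pixel-high screenshot is shifted up by one pixel so the crop ends exactly at the bottom edge.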

using System.Drawing;
using System.Drawing.Imaging;
using System.IO; // needed for MemoryStream
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;


public void ScreenshotByElement()
{
IWebDriver driver = new FirefoxDriver();
String baseURL = "https://www.google.com/"; // URL to open (the scheme is required)
String filePath = @"c:\img1.png"; // verbatim string, so a single backslash


driver.Navigate().GoToUrl(baseURL);
var remElement = driver.FindElement(By.Id("Butterfly"));
Point location = remElement.Location;


var screenshot = (driver as FirefoxDriver).GetScreenshot();
using (MemoryStream stream = new MemoryStream(screenshot.AsByteArray))
{
using (Bitmap bitmap = new Bitmap(stream))
{
RectangleF part = new RectangleF(location.X, location.Y, remElement.Size.Width, remElement.Size.Height);
using (Bitmap bn = bitmap.Clone(part, bitmap.PixelFormat))
{
bn.Save(filePath, ImageFormat.Png);
}
}
}
}

I believe this isn't going to work for you, as you use C# and my solution relies on a Java library; however, maybe others will find it helpful.

For capturing custom screenshots you can use the Shutterbug library. The specific call for this purpose would be:

Shutterbug.shootElement(driver, element).save();

Consider using needle, a tool for automated visual comparison (https://github.com/bfirsh/needle). It has built-in functionality for taking screenshots of specific elements (selected by CSS selector). The tool works on top of Selenium's WebDriver and is written in Python.

Below is a function for taking a snapshot of a specific element in Selenium. Here, driver is an instance of WebDriver.

private static void getScreenshot(final WebElement e, String fileName) throws IOException {
final BufferedImage img;
final Point topleft;
final Point bottomright;
final byte[] screengrab;
screengrab = ((TakesScreenshot) driver).getScreenshotAs(OutputType.BYTES);
img = ImageIO.read(new ByteArrayInputStream(screengrab));
topleft = e.getLocation();
bottomright = new Point(e.getSize().getWidth(), e.getSize().getHeight());
BufferedImage imgScreenshot=
(BufferedImage)img.getSubimage(topleft.getX(), topleft.getY(), bottomright.getX(), bottomright.getY());
File screenshotLocation = new File("Images/"+fileName +".png");
ImageIO.write(imgScreenshot, "png", screenshotLocation);
}

If you are looking for a JavaScript solution, here's my gist:

https://gist.github.com/sillicon/4abcd9079a7d29cbb53ebee547b55fba

The basic idea is the same: take the screenshot first, then crop it. My solution does not require other libraries, just pure WebDriver API code. The side effect is that it may increase the load on your testing browser.

Here is an extension function for C#:

public static BitmapImage GetElementImage(this IWebDriver webDriver, By by)
{
var elements = webDriver.FindElements(by);
if (elements.Count == 0)
return null;


var element = elements[0];
var screenShot = (webDriver as ITakesScreenshot).GetScreenshot();
using (var ms = new MemoryStream(screenShot.AsByteArray))
{
Bitmap screenBitmap;
screenBitmap = new Bitmap(ms);
return screenBitmap.Clone(
new Rectangle(
element.Location.X,
element.Location.Y,
element.Size.Width,
element.Size.Height
),
screenBitmap.PixelFormat
).ToBitmapImage();
}
}

Now you can use it to take the image of any element like this:

var image = webDriver.GetElementImage(By.Id("someId"));

Here is a Python 3 version using Selenium WebDriver and Pillow. This program captures a screenshot of the whole page and crops the element out of it based on its location. The element image will be available as image.png. Firefox supports saving the element image directly using element.screenshot('image_name.png').

from selenium import webdriver
from PIL import Image


driver = webdriver.Chrome()
driver.get('https://www.google.co.in')


element = driver.find_element_by_id("lst-ib")


location = element.location
size = element.size


driver.save_screenshot("shot.png")


x = location['x']
y = location['y']
w = size['width']
h = size['height']
width = x + w
height = y + h


im = Image.open('shot.png')
im = im.crop((int(x), int(y), int(width), int(height)))
im.save('image.png')
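The crop-box arithmetic in the script above can be pulled into a small helper. The sketch below is illustrative only: `crop_box` and its `scale` parameter are hypothetical names, and the `scale` factor is an assumption for displays where the screenshot has more pixels than the CSS coordinates Selenium reports (for example a device pixel ratio of 2).

```python
def crop_box(location, size, scale=1.0):
    """Build a (left, upper, right, lower) box for PIL's Image.crop.

    `location` and `size` are the dicts returned by element.location and
    element.size. `scale` is an assumed screenshot-to-CSS pixel ratio.
    """
    left = int(location['x'] * scale)
    upper = int(location['y'] * scale)
    right = int((location['x'] + size['width']) * scale)
    lower = int((location['y'] + size['height']) * scale)
    return left, upper, right, lower
```

With this helper, the crop step would read `im.crop(crop_box(element.location, element.size))`.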

Update

Chrome now supports individual element screenshots as well, so you can capture the screenshot of the web element directly, as shown below.

from selenium import webdriver


driver = webdriver.Chrome()
driver.get('https://www.google.co.in')
image = driver.find_element_by_id("lst-ib").screenshot_as_png  # PNG data as bytes
# or save straight to a file:
# element = driver.find_element_by_id("lst-ib")
# element.screenshot("image.png")

If you get a java.awt.image.RasterFormatException in Chrome, or you want to scroll an element into view before capturing a screenshot, try this solution based on @Surya's answer:

JavascriptExecutor jsExecutor = (JavascriptExecutor) driver;
// getBoundingClientRect().top can come back as a Long or a Double, so go through Number
long offsetTop = ((Number) jsExecutor.executeScript("window.scroll(0, document.querySelector(\""+cssSelector+"\").offsetTop - 0); return document.querySelector(\""+cssSelector+"\").getBoundingClientRect().top;")).longValue();


WebElement ele = driver.findElement(By.cssSelector(cssSelector));


// Get entire page screenshot
File screenshot = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
BufferedImage  fullImg = ImageIO.read(screenshot);


// Get the location of element on the page
Point point = ele.getLocation();


// Get width and height of the element
int eleWidth = ele.getSize().getWidth();
int eleHeight = ele.getSize().getHeight();


// Crop the entire page screenshot to get only element screenshot
BufferedImage eleScreenshot= fullImg.getSubimage(point.getX(), Math.toIntExact(offsetTop),
eleWidth, eleHeight);
ImageIO.write(eleScreenshot, "png", screenshot);


// Copy the element screenshot to disk
File screenshotLocation = new File("c:\\temp\\div_element_1.png");
FileUtils.copyFile(screenshot, screenshotLocation);
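The reason this helps: getSubimage throws RasterFormatException when the requested rectangle is not fully inside the image, which happens when a page-relative position is used against a screenshot that only covers the viewport. A minimal sketch of that coordinate check follows, in Python for brevity. Both function names are hypothetical and the numbers in the usage note are made up.

```python
def to_viewport_coords(page_x, page_y, scroll_x, scroll_y):
    """Convert page-relative coordinates to viewport-relative ones."""
    return page_x - scroll_x, page_y - scroll_y


def fits_in_screenshot(x, y, width, height, shot_width, shot_height):
    """True if the crop rectangle lies entirely inside the screenshot."""
    return (
        x >= 0
        and y >= 0
        and x + width <= shot_width
        and y + height <= shot_height
    )
```

For an element at page position (100, 1500) on a page scrolled down by 1200 pixels, the viewport position is (100, 300), which fits inside a 1366x768 screenshot, whereas the unconverted page position does not.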

C# code:

public Bitmap MakeElemScreenshot(IWebDriver driver, IWebElement elem)
{
Screenshot myScreenShot = ((ITakesScreenshot)driver).GetScreenshot();


Bitmap screen = new Bitmap(new MemoryStream(myScreenShot.AsByteArray));
Bitmap elemScreenshot = screen.Clone(new Rectangle(elem.Location, elem.Size), screen.PixelFormat);


screen.Dispose();


return elemScreenshot;
}

Python 3

I tried this with Selenium 3.141.0 and chromedriver 73.0.3683.68, and it works:

from selenium import webdriver


chromedriver = '/usr/local/bin/chromedriver'
chromeOptions = webdriver.ChromeOptions()
chromeOptions.add_argument('window-size=1366,768')  # Chrome expects width,height
chromeOptions.add_argument('disable-extensions')
cdriver = webdriver.Chrome(options=chromeOptions, executable_path=chromedriver)


cdriver.get('url')
element = cdriver.find_element_by_css_selector('.some-css.selector')


# screenshot_as_png is a property that returns bytes; screenshot() writes a file
element.screenshot('element.png')

There is no need to take a full-page image and then crop a section out of it.

This might not have been available when Rohit's answer was created.

I followed the sample code from @codeslord, but for some reason I had to access my screenshot data differently:

from selenium import webdriver

# Open the Firefox webdriver
driver = webdriver.Firefox()
# Find the element that you're interested in
imagepanel = driver.find_element_by_class_name("panel-height-helper")
# Access the data bytes for the web element
datatowrite = imagepanel.screenshot_as_png
# Write the byte data to a file
outfile = open("imagepanel.png", "wb")
outfile.write(datatowrite)
outfile.close()

(using Python 3.7, Selenium 3.141.0 and Mozilla Geckodriver 71.0.0.7222)
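If you write the bytes yourself like this, a context manager closes the file even when the write fails. The helper below is a sketch, not part of Selenium: `save_png_bytes` is a hypothetical name, and the signature check is an extra safeguard I am assuming is useful, based on the standard 8-byte PNG file signature.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def save_png_bytes(data, path):
    """Write PNG bytes (e.g. from element.screenshot_as_png) to `path`."""
    # Fail early if the data is clearly not a PNG image
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("data does not look like a PNG image")
    # The with-block closes the file even if write() raises
    with open(path, "wb") as outfile:
        outfile.write(data)
    return path
```

Usage with the snippet above would be `save_png_bytes(imagepanel.screenshot_as_png, "imagepanel.png")`.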

This is my version, in C#. I took most of it from Brook's answer and modified it to fit my purpose:

public static byte[] GetElementImage(this IWebElement element)
{
var screenShot = MobileDriver.Driver.GetScreenshot();
using (var stream = new MemoryStream(screenShot.AsByteArray))
{
var screenBitmap = new Bitmap(stream);
var elementBitmap = screenBitmap.Clone(
new Rectangle(
element.Location.X,
element.Location.Y,
element.Size.Width,
element.Size.Height
),
screenBitmap.PixelFormat
);
var converter = new ImageConverter();
return (byte[]) converter.ConvertTo(elementBitmap, typeof(byte[]));
}
}

I think most of the answers here are over-engineered. I did it with two helper methods: the first waits for an element based on any selector, and the second takes a screenshot of it.

Note: We cast the WebElement to a TakesScreenshot instance, so only that element is captured in the image. If you want the full page/window, cast driver instead.

Edit: I forgot to say that I'm using Java and Selenium v3 (but it should be the same for v4).

WebDriver driver = new FirefoxDriver(); // define this somewhere (or chrome etc)


public <T> T screenshotOf(By by, long timeout, OutputType<T> type) {
return ((TakesScreenshot) waitForElement(by, timeout))
.getScreenshotAs(type);
}


public WebElement waitForElement(By by, long timeout) {
return new WebDriverWait(driver, timeout)
.until(driver -> driver.findElement(by));
}

Then just screenshot whatever you want like this:

long timeout = 5;   // in seconds
/* Screenshot (to file) based on first occurrence of tag */
File scFile = screenshotOf(By.tagName("body"), timeout, OutputType.FILE);
/* Screenshot (in memory) based on CSS selector (e.g. first image in body
whose "src" attribute starts with "https")  */
byte[] scBytes = screenshotOf(By.cssSelector("body > img[src^='https']"), timeout, OutputType.BYTES);

To take a screenshot for a specific element you can now just use this:

public void takeCanvasScreenshot(WebElement element, String imageName) {
   

File screenshot = element.getScreenshotAs(OutputType.FILE);


try {
FileUtils.copyFile(screenshot, new File("src/main/resources/screenshots/" + imageName + ".png"));
} catch (IOException e) {
e.printStackTrace();
}
}

For C#, the code below can work.

try
{

IWebElement transactions = driver.FindElement(By.XPath(".//*[@id='some element']"));

// Cast the element (not the driver) so only that element is captured
Screenshot screenshot = ((ITakesScreenshot)transactions).GetScreenshot();

string title = "some title";

screenshot.SaveAsFile(title, ScreenshotImageFormat.Jpeg);

} catch (Exception) {

// handle if element not found

}