How To Find Broken Links with Selenium

Automatically find broken links using Selenium

Links are used to navigate between web pages: a user reaches a page by clicking a link or typing its address into the browser. A broken link is one that does not work, that is, it fails to take the user to the requested page. This can happen for several reasons, such as server-side errors, pages that no longer exist, or typing mistakes in the URL.

When a user follows a broken link, they see an error message. Valid URLs return 2XX status codes, while broken URLs return codes in the 400 or 500 series: 4XX status codes indicate client-side errors, and 5XX status codes indicate server-side errors.

Below are some reasons for broken links.

  • 400 Bad Request error: The server cannot process the request, typically because the URL is malformed, so it cannot return the requested web page.
  • 404 Page Not Found error: The web page does not exist or has been removed by the owner.
  • Sometimes a system firewall restricts access to certain websites.
  • The link may have been entered incorrectly.
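
The status-code ranges described above can be sketched as a small helper. This is only an illustration: the class name StatusClassifier and the category labels are assumptions made for this example, not part of the original test.

```java
// Hypothetical helper (not from the original test): maps an HTTP status
// code to the coarse categories discussed in the text.
final class StatusClassifier {

    static String classify(int code) {
        if (code >= 200 && code < 300) {
            return "valid";          // 2XX: request succeeded
        }
        if (code >= 300 && code < 400) {
            return "redirect";       // 3XX: redirection
        }
        if (code >= 400 && code < 500) {
            return "client error";   // 4XX: e.g. 400, 404
        }
        if (code >= 500 && code < 600) {
            return "server error";   // 5XX: server-side failure
        }
        return "other";
    }
}
```

For instance, StatusClassifier.classify(404) falls in the "client error" category, which the test later in this article reports as a broken link.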

Having broken links on your website creates a bad experience for your users and can seriously affect the reputation of your website. A website usually contains a large number of links, and manually testing each of them is time-consuming. Automating the check with Selenium WebDriver is therefore a practical solution.

Broken links can be tested as shown in the steps below. The following sample code runs the test against https://www.google.co.uk, and the relevant parts are discussed afterwards.

package automationproject;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.HttpURLConnection;
import java.util.Iterator;
import java.net.URL;
import java.util.List;
import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;
public class MyBrokenLinks {

  public static void main(String[] args) {

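    // Point Selenium at the ChromeDriver executable (the property key must be the Chrome one, not the IE one)
    System.setProperty("webdriver.chrome.driver","C:\\Users\\tushar\\eclipse-workspace\\first test\\chromedriver.exe");
    // Create the driver once; creating it a second time would open and leak another browser window
    WebDriver mydriver = new ChromeDriver();
    String myhomePage = "https://www.google.co.uk";
    String myurl = "";
    HttpURLConnection myhuc = null;
    int responseCode = 200;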
    mydriver.manage().window().maximize();
    mydriver.get(myhomePage);
    List < WebElement > mylinks = mydriver.findElements(By.tagName("a"));
    Iterator < WebElement > myit = mylinks.iterator();
    while (myit.hasNext()) {

      myurl = myit.next().getAttribute("href");
      System.out.println(myurl);
      if (myurl == null || myurl.isEmpty()) {
        System.out.println("Empty URL or an Unconfigured URL");
        continue;
      }

      if (!myurl.startsWith(myhomePage)) {
        System.out.println("This URL is from another domain");
        continue;
      }

      try {
        myhuc = (HttpURLConnection)(new URL(myurl).openConnection());
        myhuc.setRequestMethod("HEAD");
        myhuc.connect();
        responseCode = myhuc.getResponseCode();
        if (responseCode >= 400) {
          System.out.println(myurl + " This link is broken");
        }
        else {
          System.out.println(myurl + " This link is valid");
        }

      } catch(MalformedURLException ex) {
        ex.printStackTrace();
      } catch(IOException ex) {
        ex.printStackTrace();
      }
    }

    mydriver.quit();
  }
}

Below are my test results.

[Screenshot of the test results]

Each link on the web page can be found through its anchor tag, <a>. The identified links are collected into a list.

List<WebElement> mylinks = mydriver.findElements(By.tagName("a"));

Then an iterator is placed to move through the created list of links.

Iterator<WebElement> myit = mylinks.iterator();

Identification and Validation of URLs

This step checks whether each URL belongs to a third-party domain, or whether it is empty or null. The href attribute of the anchor tag is stored in the variable myurl and then checked as in the listing above.

myurl = myit.next().getAttribute("href");

For empty URLs, the below code is used.

if(myurl == null || myurl.isEmpty()){
    System.out.println("Empty URL or an Unconfigured URL");
    continue;
}
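
The same null-or-empty test can also be applied to the whole list of href values before any request is sent. The sketch below additionally drops repeated URLs, since a page often links to the same address more than once. The class HrefFilter and its method are hypothetical names introduced for this example, and the input is assumed to be a plain list of strings already read from the href attributes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not from the original test): keeps only usable,
// distinct href values in the order they were first seen.
final class HrefFilter {

    static List<String> usableUrls(List<String> hrefs) {
        Set<String> seen = new LinkedHashSet<>();
        for (String href : hrefs) {
            // Same test as in the article: skip null or empty URLs
            if (href == null || href.isEmpty()) {
                continue;
            }
            // A set ignores values that were already added
            seen.add(href);
        }
        return new ArrayList<>(seen);
    }
}
```

Checking each distinct URL only once keeps the run shorter on pages with many repeated links.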
    

The following code determines whether the URL belongs to the site under test or to a third-party domain.

if(!myurl.startsWith(myhomePage)){
    System.out.println("This URL is from another domain");
    continue;
}
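
A prefix comparison is simple, but it can misjudge some URLs: an address such as https://www.google.co.uk.example.com/ also starts with the home page string even though it points to a different host. As a hedged alternative, the sketch below compares host names using java.net.URI. The class DomainCheck and the method sameHost are names assumed for this example only.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper (not from the original test): compares the host
// part of two URLs instead of comparing string prefixes.
final class DomainCheck {

    static boolean sameHost(String url, String homePage) {
        try {
            String host = new URI(url).getHost();
            String homeHost = new URI(homePage).getHost();
            // getHost() returns null for URIs without a host, e.g. mailto: links
            return host != null && host.equalsIgnoreCase(homeHost);
        } catch (URISyntaxException e) {
            // A URL that cannot be parsed is treated as not belonging to the site
            return false;
        }
    }
}
```

In the loop, the condition !myurl.startsWith(myhomePage) could then be replaced with !DomainCheck.sameHost(myurl, myhomePage). Note that this variant ignores the scheme, so http and https links to the same host are both treated as internal.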

HTTP Request Sending

The HttpURLConnection class imported above provides methods to send HTTP requests and read the response codes.

myhuc = (HttpURLConnection)(new URL(myurl).openConnection());

Here the request type is set to “HEAD” rather than “GET”, so that only the headers are returned instead of the body of the document.

myhuc.setRequestMethod("HEAD");

When the connect method is invoked, the actual connection to the URL is established.

myhuc.connect();
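
By default, HttpURLConnection uses a timeout of zero, which means it waits indefinitely, so a single unresponsive server can stall the whole run. The sketch below prepares the same HEAD request with connect and read timeouts. The class LinkProbe, the method prepareHead, and the timeout parameter are assumptions made for this example; the connection is only configured here, not opened.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper (not from the original test): configures a HEAD
// request with timeouts but does not send it.
final class LinkProbe {

    static HttpURLConnection prepareHead(String url, int timeoutMillis) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("HEAD");          // headers only, as in the article
            conn.setConnectTimeout(timeoutMillis);  // limit for establishing the connection
            conn.setReadTimeout(timeoutMillis);     // limit for waiting on the response
            return conn;
        } catch (IOException e) {
            // Covers malformed URLs and protocol errors while configuring the request
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned connection can then be used exactly like myhuc: calling connect() and getResponseCode() sends the request, and a timeout surfaces as an IOException that the existing catch block already handles.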

The HTTP response code is then obtained with the getResponseCode() method.

responseCode = myhuc.getResponseCode();

Broken links can be identified by the response code, as mentioned above. Any code greater than or equal to 400 marks the link as broken.

if(responseCode >= 400){
    System.out.println(myurl+" This link is broken");
}
else{
    System.out.println(myurl+" This link is valid");
}

Testing for broken links is crucial for building a good website with an excellent user experience. With Selenium WebDriver, testers can identify malfunctioning links quickly instead of checking each one by hand.

Tushar Sharma