Aug 13th
2010

How To: Check PageRank Of All Pages Within Site


Normally, when you want to check the Google PageRank of an entire website (that is, the PR of all its internal pages), you have to visit each page individually. Many online multiple-PageRank checkers only let you check the PR of several domains at a time. So in this tutorial, I will show you how to build an Entire Website PageRank Checker script with PHP.

What does the process need?

To reach our main goal, the script we build must do two things:

  • First: get all the page links of the given website.
  • Second: check the Google PageRank of each link, then output all of them.

Step 1 Get all page links of the website

In this step, I asked myself: “How can I get all the internal page links of a website?” Several ideas came to mind: building an automatic bot to crawl the website (hard to build, and your bot may get banned by a bad-bot blocker if it visits the website many times a day); following every internal link that appears on the front page down to a depth of 2 or 3 (I don't think it would reach every post); using the Google index to get the post links; or using the RSS output (for the same reason as above)… In the end, the tool I chose is the website's sitemap (an XML sitemap). Why?

  • Most websites, especially WordPress sites, use an XML sitemap (with the help of the Google XML Sitemaps plugin, 3,833,876 downloads up to now).
  • The sitemap usually lists all the page/post links of the website, so we can collect all of them for our purpose.

Take a first look at an XML sitemap, for example http://www.intenseblog.com/sitemap.xml, and at its source:

[Screenshot: source of the XML sitemap]

As you can see, the URL of each page is wrapped in <loc>...</loc> tags; all we need to do is get those values. So the first PHP code is:

<?php
// Character data handler: receives the text found between <loc>, <lastmod>, <changefreq>, <priority> in the sitemap
function content($parser, $data){
	global $arrlink; // $arrlink MUST be a global array
	$first = substr($data, 0, 4); // take the first 4 characters of each value...
	if ($first == "http") { // ... check whether the value is a URL...
		array_push($arrlink, $data); // ... and if so, add it to $arrlink
	}
}

function geturlcontent($url){
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
	curl_setopt($ch, CURLOPT_URL, $url);
	$data = curl_exec($ch);
	curl_close($ch);
	return $data;
}

// $url is the address of the sitemap; in Step 2 it comes from the form
$data = geturlcontent($url);
$arrlink = array();
$xml_parser = xml_parser_create();
xml_set_character_data_handler($xml_parser, "content"); // The character data handler is called for every piece of text in the XML document
if(!(xml_parse($xml_parser, $data))){ // If there is an error, output it
	die("Error on line " . xml_get_current_line_number($xml_parser));
}
xml_parser_free($xml_parser); // Free the parser once we are done with it
?>

In the code above, we use only xml_set_character_data_handler, NOT xml_set_element_handler, because we only need the value between the <loc>...</loc> tags; the <loc> tag itself carries nothing for us.
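One caveat with relying on the character data handler alone: the parser may hand over the text of a single element in several pieces (for example when a URL contains an entity such as &amp;), so a URL could arrive split across calls. Below is a minimal sketch of one way around that, assuming we do track <loc> with an element handler and join the pieces. The function name extract_loc_urls and the sample sitemap are made up for illustration; they are not part of the original script.

```php
<?php
// Hypothetical helper (an assumption, not the tutorial's code): collects the
// text of every <loc> element, joining pieces the parser delivers separately.
function extract_loc_urls($xml) {
    $urls   = array();
    $inLoc  = false; // true while the parser is inside a <loc> element
    $buffer = '';    // accumulates character data for the current <loc>

    $parser = xml_parser_create();
    // Keep tag names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$inLoc, &$buffer) {
            if ($name === 'loc') {
                $inLoc  = true;
                $buffer = '';
            }
        },
        function ($p, $name) use (&$inLoc, &$buffer, &$urls) {
            if ($name === 'loc') {
                $urls[] = trim($buffer);
                $inLoc  = false;
            }
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$inLoc, &$buffer) {
            if ($inLoc) {
                $buffer .= $data; // may be called more than once per element
            }
        }
    );

    $ok = xml_parse($parser, $xml, true);
    return $ok ? $urls : false; // false signals a parse error
}

// Made-up sample sitemap, just to exercise the function.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc><lastmod>2010-08-13</lastmod></url>
  <url><loc>http://www.example.com/?p=1&amp;lang=en</loc></url>
</urlset>
XML;

print_r(extract_loc_urls($sample));
```

With this approach the "starts with http" check is no longer needed, because only text inside <loc> is collected. It needs PHP 5.3 or later for the anonymous functions.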

For this PHP script to work, we must get the content of the XML sitemap. I know several ways to do that:
1/ Using fread: http://php.net/manual/en/function.fread.php
fread is fine if the sitemap is small. For a larger one, a single call returns only a small piece of sitemap.xml (on a network stream, one fread call reads at most one chunk, usually 8192 bytes), so you would have to call it in a loop.

2/ Using file_get_contents: use this when your web server doesn't support cURL (allow_url_fopen must be enabled to open a URL). The code will be:

$data = file_get_contents($file);

3/ Using include and readfile

4/ Using cURL (my favorite way), because I find it safer and faster. So I added this function to get the content of sitemap.xml:

function geturlcontent($url){
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_URL, $url);
	$data = curl_exec($ch);
	curl_close($ch);
	return $data;
}
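If you do go with fread (option 1), the usual way around the chunk limit is to keep reading until the end of the stream. Here is a minimal sketch of that loop; the helper name read_all is hypothetical and the URL in the usage comment is a placeholder.

```php
<?php
// Hypothetical helper (an assumption, not the tutorial's code): read a stream
// to the end, one chunk at a time, and return everything as one string.
function read_all($handle, $chunkSize = 8192) {
    $data = '';
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            return false; // read error
        }
        $data .= $chunk;
    }
    return $data;
}

// Usage with a remote sitemap (needs allow_url_fopen; placeholder URL):
// $handle = fopen('http://www.example.com/sitemap.xml', 'r');
// if ($handle) {
//     $data = read_all($handle);
//     fclose($handle);
// }
```

The result can be passed to xml_parse in the same way as the output of geturlcontent.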

We create an array named arrlink to store all the page links we collect. At first arrlink is empty; the content function then adds each URL value to it. Once the array is filled, we can use it for any other purpose in PHP :)

And we have almost finished the first step, so let's move on to the next one.
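As one example of such a purpose: the handler keeps any text that starts with "http", so the array may contain duplicates or URLs from other hosts, depending on the sitemap. The sketch below keeps only unique links on a given host. The function name internal_links and the sample URLs are assumptions made for illustration.

```php
<?php
// Hypothetical helper (an assumption, not the tutorial's code): keep the
// unique URLs whose host matches $host, compared case-insensitively.
function internal_links(array $links, $host) {
    $result = array();
    foreach ($links as $link) {
        $link     = trim($link);
        $linkHost = parse_url($link, PHP_URL_HOST);
        if (is_string($linkHost) && strcasecmp($linkHost, $host) === 0) {
            $result[] = $link;
        }
    }
    // Drop duplicates and reindex the array from 0.
    return array_values(array_unique($result));
}

// Made-up input, standing in for the contents of $arrlink.
$links = array(
    'http://www.example.com/a',
    'http://www.example.com/a',
    'http://other.example.org/b',
);
print_r(internal_links($links, 'www.example.com'));
```

Fewer links also means fewer PageRank lookups in the next step.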

Step 2 Checking the Google PageRank of each link

First, we need a PHP script to check the PageRank of a given URL. I found a free script that works very well; you can download it at http://www.diagnosticoweb.com/wdscript/free.php. It is free to use and can also check a domain's Alexa rank and popularity, Google backlinks, DMOZ listing…, but we only need its Google PageRank checking tool.

Download that script and upload it to the same folder as our script (as readers point out in the comments, the downloaded domaintool.php has to be renamed to pagerank.php to match the include in the final script). Then we make an input form where you can enter the URL of a sitemap.xml:

			<form action="" method="post" name="checkpr" id="checkpr">
					<p>Put The XML Sitemap Link Here:</p> <input type="text" name="sitemap" value="<?php if (isset($_SESSION['sitemap'])) {echo htmlspecialchars($_SESSION['sitemap'], ENT_QUOTES);} ?>" size="80" />
					<input type="submit" value="Process" />
			</form>

… and the code to handle the form:

session_start(); // Needed so that the sitemap URL still appears in the input field after the form is submitted
$success = '';
if (isset($_POST['sitemap']) && $_POST['sitemap'] !== '')
{
	$success = 'ok';
	$url = $_POST['sitemap'];
	$_SESSION['sitemap'] = $_POST['sitemap'];
}
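Since the submitted value is passed straight to cURL, it may be worth checking that it really is an http or https URL before fetching it. A minimal sketch, assuming a hypothetical helper named is_valid_sitemap_url that is not part of the original script:

```php
<?php
// Hypothetical helper (an assumption, not the tutorial's code): accept only
// absolute URLs that use the http or https scheme.
function is_valid_sitemap_url($url) {
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return $scheme === 'http' || $scheme === 'https';
}

// Possible use inside the form handler:
// if (isset($_POST['sitemap']) && is_valid_sitemap_url($_POST['sitemap'])) {
//     $success = 'ok';
//     $url = $_POST['sitemap'];
// }
```

This rejects empty input as well as schemes such as file:// or ftp://, which the script has no reason to fetch.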

OK, now we process the sitemap.xml that we received:

			if ($success == 'ok') {
				echo '<ul>';
				if (ob_get_level() > 0) { ob_end_flush(); } // turn off output buffering only if it is active
				for ($i = 0; $i < count($arrlink); $i++) {
					$pr = getPageRank($arrlink[$i]);
					$class = ($i % 2) ? 'two' : 'one'; // alternate the row colors
					$link = htmlspecialchars($arrlink[$i], ENT_QUOTES); // escape the URL before printing it
					echo '<li class="'.$class.'"><a href="'.$link.'">'.$link.'</a>PageRank <font color="red">'.$pr.'</font><div class="clear"></div></li>';
					flush(); sleep(1);
				}
				echo '</ul>';
				unset ($success); // reset the success variable
				session_unset(); // reset the current Session
			}

In the code above, I used ob_end_flush(); and then flush(); sleep(1); so that the script sends each result to the browser while it is still running. Without them, the script would output everything only after the for loop has finished.
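The same idea can be wrapped in two small helpers, sketched here under the assumption that nothing else in the page relies on output buffering. The names disable_output_buffering and emit_now are hypothetical.

```php
<?php
// Hypothetical helper (an assumption, not the tutorial's code): close every
// active PHP output buffer so that later output is not held back by PHP.
function disable_output_buffering() {
    while (ob_get_level() > 0) {
        ob_end_flush();
    }
}

// Hypothetical helper: print a piece of HTML and ask PHP to push it out
// to the web server right away.
function emit_now($html) {
    echo $html;
    flush();
}

// Possible use in the results loop:
// disable_output_buffering();
// foreach ($arrlink as $link) {
//     emit_now('<li>' . htmlspecialchars($link, ENT_QUOTES) . '</li>');
//     sleep(1);
// }
```

Whether the browser shows each row immediately also depends on the web server, which may apply its own buffering or compression.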

The Final Script

After adding CSS markup, here is the final script that gets the Google PageRank of an entire website:

Final Sitemap Pagerank checking script

<?php
set_time_limit(0);
include('pagerank.php');
session_start();
$success = '';

function content($parser, $data){
	global $arrlink;
		$first = substr($data, 0, 4);
		if ($first == "http") {
			array_push($arrlink, $data);
		}

}

function geturlcontent($url){
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_URL, $url);
	$data = curl_exec($ch);
	curl_close($ch);
	return $data;
}

if (isset($_POST['sitemap']) && $_POST['sitemap'] !== '')
{
	$success = 'ok';
	$url = $_POST['sitemap'];
	$_SESSION['sitemap'] = $_POST['sitemap'];
	$data = geturlcontent($url);
//	$data = file_get_contents($file); 

	$arrlink = array();

	$xml_parser = xml_parser_create();
	xml_set_character_data_handler($xml_parser, "content");
	if(!(xml_parse($xml_parser, $data))){
		die("Error on line " . xml_get_current_line_number($xml_parser));
	}
	xml_parser_free($xml_parser);
}
?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
<head>
<title>Multiple PR Checker</title>
<style type='text/css'>
	body {min-width:950px;background:#eee}
	.container {width:940px;margin:10px auto;background:#fff;padding:10px;border:1px solid #ddd;text-align:center}
	#checkpr {padding:5px;background:#afc156;color:#fff;font-size:20px;text-align:center}
	input {padding:5px}
	ul {list-style:none;width:100%;padding:0;}
	li {padding:5px;margin:2px 0;border:1px solid #ddd;text-align:right;font-size:16px;}
	.one {background:#d7eaff}
	.two {background:#d9e4a3}
	li a {color:#18479B;text-decoration:none;float:left}
	.clear {clear:both}
</style>
</head>
<body>
<div class="container">
			<form action="" method="post" name="checkpr" id="checkpr">
					<p>Put The XML Sitemap Link Here:</p> <input type="text" name="sitemap" value="<?php if (isset($_SESSION['sitemap'])) {echo htmlspecialchars($_SESSION['sitemap'], ENT_QUOTES);} ?>" size="80" />
					<input type="submit" value="Process" />
			</form>
			<?php
			if ($success == 'ok') {
				echo '<ul>';
				if (ob_get_level() > 0) { ob_end_flush(); }
				for ($i = 0; $i < count($arrlink); $i++) {
					$pr = getPageRank($arrlink[$i]);
					$class = ($i % 2) ? 'two' : 'one';
					$link = htmlspecialchars($arrlink[$i], ENT_QUOTES);
					echo '<li class="'.$class.'"><a href="'.$link.'">'.$link.'</a>PageRank <font color="red">'.$pr.'</font><div class="clear"></div></li>';
					flush(); sleep(1);
				}
				echo '</ul>';
				unset ($success);
				session_unset();
			}
			?>
</div>
</body>
</html>

You can also download it all at: http://www.intenseblog.com/file/check-pagerank.zip :)

About Jenni R

Jenni is a banker who loves working with PHP code, WordPress blogs and web design... She is currently the Chief Branding Officer of Intense Blog. This year, Jenni built a WordPress plugin called WPSubscribers, which helps you build a mailing list and has sold more than 1000 copies.


Comments

  1. Techozens says:

    Thanks for this work friend. Will try this out.
    Nice work.

  2. BLOG404 says:

    Wow thanks for the download link . There are no sites untill now which displays the page ranks ofall blog posts :D . Looks brillinat idea
    Thanks , will try it out

  3. Cole Stan says:

    Getting the PR of all pages is very helpful in order for you to check what particular posts are loved by your audience. You could actually do strategies for it to take advantage of the result.

  4. Le Hoang says:

    Thank for your tip. It is relly wonderful.

  5. lawmacs says:

    thanks for the heads up i must say nice been here two days ago now you have a new theme site looking great

  6. Jennifer R says:

    Thanks, there’re many things to do when set up a new theme :), nice to meet you.

  7. Ashfame says:

    Useful to have your own private checker ;)

  8. Steve says:

    Wow, this is exactly what I was looking to do as my next little project. I’m going to copypasta your code and play around it with in my admin section for my little server farm.

  9. kibagus says:

    thanks, nice info.. i think i must learn more about this..!

  10. Danh says:

    Amazing ……….. I love you :D

  11. Cristian says:

    Hi Jennifer, great post.
    I don’t know php but i would like to try the script, could you put a download link to the files?

    Thank you

  12. Cole Stan says:

    Is there a way to transform this idea into a WP plugin? Many people are looking for such a tool, including me :)

  13. Pretty awesome and exactly what I was looking for, one thing when I just copied and used your script it needed a pagerank.php for the include, I realized that the zip from the previous link you say was needed has no php files with this name so the domaintool.php from the wdscript download needs to be renamed to pagerank.php and uploaded to the same directory you put your script. This took a few minutes to figure out but works great!
    Justin Germino last post: Are You an Askable Blogger

  14. tospider says:

    wow! this is a very cool idea to check PR

  15. Blogueigoo says:

    Great. I already searched something like this but I didn’t find. Thanks for the tutorial.

  16. Great article, sir. I'll copy and paste it first so I can study it.

  17. Seo Dizain says:

    Nice article. All aspects very good explained, I’ll try to follow them. Thank’s for sharing

  18. This is going to come in so handy. I have so many sites I deal with this could really help me identify some good pages. THANKS!

  19. Though I’ve been using the OpenSiteExplorer, I may have to do this…

  20. Mila Sari says:

    amazing article.. can u share in PDF?

  21. Herbalife says:

    I tried this out and it works like a charm. It was even able to process my 1600+ URLs in my sitemap like a charm. Great tool. Now I can start tracking my important URLs and work the PR on them. Thanks.

  22. Brian says:

    It took me a little while to figure this out, because I also missed the fact that you have to download the ‘domaintool.php’ and rename it ‘pagerank.php’.

    So in the end I named my new page ‘prchecker.php’ using the script provided above and then uploaded both files ‘prchecker.php’ and ‘pagerank.php’. to my server in the same directory and it worked great after that.

    Great to see all my posts with PR logged against them. Now I really know whats going down well and what isn’t. Great script.

  23. Liem Saty says:

    you are so genius …. A + for you

  24. Ana from MarketMeSuite says:

    You are right, Jenni – this is a bit above my head, but thanks of the link; I am sure some of my readers will find it interesting.

    Ana

  25. Nice tutorial and easy to practice. Thanks!!
    jasa desain rumah last post: Desain Renovasi Rumah Klasik Kontemporer di Bintara

  26. Robintel says:

    Hi,

    I’m affraid the link to the other script is broken. :( Can you help?

    Thanks,
    Robin

  27. Robintel says:

    HI Jenni,

    Thanks a bunch for the file, it worked flawlessly. Also, I am very much impressed with the instant answer I got for you!

    Thanks, I’m charmed.

    Robin

  28. Thanks Jenny for great tip but is there any way to check pr for blogger blogs thanks :)

  29. sandy from Sydney SEO and Web Design says:

    Detailed explanation of the code and steps. Even with little or no PHP knowledge one can do this!. Thanks for the excellent SEO tip

  30. John says:

    I just uploaded this in my server and it seems to work right although i have some problems getting it to see all the links in the sitemap. Thanks alot for sharing. Great script :)

  31. Maki says:

    thanks for code :)

  32. Chuck from White Label SEO says:

    Nice post. Thanks for sharing your thoughts. I’m looking forward to reading your post updates.

  33. cyracks says:

    Thanks for this great tutorial i wonder if you could unveil such for blogger platform if at all there is any. I will try it on my new wordpress blog if i can grab it successfuly

  34. Hi Jenni,
    I tried the code, but unfortunately the PageRank values do not appear; it only writes out the word "PageRank". How can I fix this?

  35. Brian from Blogging for Business says:

    Seems like the wonderful tool you created has stopped working. I noticed a few Google PR checkers also stopped working, so there may have been an update to how Google present PR that has caused this glitch.

    I would love to get it working again so thought I would mention it to you to see if you were aware of the problem.

    Thanks

    Brian

  36. John from Website Design in Sydney says:

    This is great! You explained it real well. I will try this. Thanks!

  37. THis is going to be huge… thanks for the detail…. >>>>>>>>> used ob_end_flush(); and flush(); sleep(1); to make the script output the content when running, <<<<<<<<<<<<<<<< KEY
    Jacksonville SEO Expert last post: Google reorganization? What does it mean to your business?

  38. Miniclip says:

    is there any website who are catering this kind of all website pages checking for page rank? please help..

  39. Preethi says:

    I don’t have knowledge about PHP, but reading this post was interesting and knowing the page rank of each page of the website will only let you know that which post is more popular.
    -Preethi
    http://www.brainwavelive.com/services/python-application-development.html

  40. Very nice script. It help me a lot

  41. Stanley says:

    In August Google changed the way they do lookups and expired their old methods.
    You need to update any references that say:
    http://toolbarqueries.google.com/search
    and
    http://www.google.com/search
    to
    http://toolbarqueries.google.com/tbr
    Alternatively, you can find the latest pagerank scripts here:
    https://github.com/phurix/pagerank/

  42. Saikrishna says:

    An awesome article with a clean explanation :) thanks for sharing dear.

  43. Kuldeep from Revolutioners says:

    It does not seem to be working for me. I have the script uploaded on my webserver and it does not pull the pagerank or any other rank information.

    just visit

    http://revolutioners.com/pagerankchecker/

    and insert your sitemap or use mine which is

    http://revolutioners.com/sitemaps/revolutioners-com.xml

  44. Gitesh says:

    Excellent tips and script. Can you tell me is there any source to get this working script for download. I need to download this script to deploy in my new site.

  45. elena from tuscany villa rental says:

    Thank you so much, this really looks wonderful,… but it turns out that all my pages have no PR, while the Google toolbar show values from 0 to 4. Where am I doing wrong?
    I downloaded the two php files into a separate folder called PR. Then I access the file pr/index.php, I write the path to the ror.xml, and see all the files listed, but no PR :((
    Can you help me?

  46. Jonathan says:

    http://jargoned.com/internalpagerankchecker/

    This script still works and it works beautifully too. On my other website it located over 60 pages having pagerank. I was looking exactly for a service like this, so thank you, I have it uploaded myself, so the service will only die when my website die’s! :P

  47. arick from unirow says:

    I would like to check my blog’s pagerank which is not only front page but also all posts in it. still searching till now.

  48. Brandon says:

    Hello Jenni, I was wondering what your restrictions are for use of your code. I want to display a working form of such on my site based off of your tutorial, and would be more than happy to link back here.

    Just want to make sure you’re ok with it though!

  49. Jenni R says:

    I’ve just updated the new version of the script, thank you Stanley for your idea :)

    http://www.intenseblog.com/tutorials/check-pagerank-of-all-pages-within-site.html#comment-12123

  50. Pieter says:

    Looks like a great script, I downloaded it and tried the first xml sitemap but I am not sure if the script works i.e. any idea how long it would take to process an xml file of about 889 Kb? I suspect it is rather big.
    And isn’t it possible to let it generate output while running? I thought I read this somewhere.

    Or is it just not working?

    (p.s. I’ve put index.php and pagerank.php in a seperate directory)

  51. Kristy from free background check online says:

    hello!,I really like your writing so a lot! percentage we keep up a correspondence extra about your article on AOL? I require an expert in this area to unravel my problem. May be that is you! Having a look ahead to see you.

  52. Steve Wilson says:

    I tried this out and it works like a charm. It was even able to process my 1000 URLs in my sitemap like a charm.

  53. David from ninja games says:

    Jenni, Thanks for providing this for free. I have downloaded the script and I am going to test it out. Will I have to modify the PR checks or did you update the script ?
    Thanks
    David

  54. Thank you for the script helped me a lot :)
    Raquel Johnson last post: Top 4 Reasons for Buying External HDD [www.FORTRICKS.in]

  55. Özden cCan says:

    Thanks Jenny for great tip but is there any way to check pr for blogger blogs thanks :)
