April 22nd, 2009, by alex

Using the Google Analytics API - getting total number of page views

At long last, Google has released the Google Analytics API. The timing couldn’t be better, since I was just trying to get at some information through screen scraping… which is never fun.

The API is pretty easy to use, and other than a typo which slowed me down way too much, it didn’t take long to write a simple PHP script to get the total number of page views across all my Analytics profiles.  This is a quick tutorial for using the API for this simple purpose.  Also check out the official API documentation.

The basic steps involved are:

1. Authenticate the user and get a one-time token from Google
2. Exchange the one-time token for a session token, which does not expire
3. Retrieve and parse a list of the user’s Google Analytics accounts and profiles
4. Retrieve and parse the page view count for each profile
5. Done!

1. Authenticate the user and get a one-time token from Google

The first step is authenticating the user. Google offers several authentication methods, and the simplest to use for my purposes is AuthSub, which asks the user to log in on Google’s site, and sends the user - along with an authentication token - back to my script. This means I never have to directly handle the user’s login and password. The link presented to the user can be something like this:

<a href="https://www.google.com/accounts/AuthSubRequest?next=http://www.alexc.me/pageviewcounts.php
&amp;scope=https://www.google.com/analytics/feeds/
&amp;secure=0&amp;session=1">Click here to authenticate through Google.</a>

(my blog mangles some of this code; until I get it sorted out, there’s a complete version of the script at the end of this post).

The “next” parameter in this link specifies where the user should be forwarded after authenticating - here, I set it to the URL of my script. Google forwards to the address in the “next” parameter, and adds a “token” parameter with the authentication token. This means the user will be sent back to something like

http://www.alexc.me/pageviewcounts.php?token=CNK******__8B
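Since the “next” and “scope” values are themselves URLs, it is probably safer to URL-encode them when building the link. Here is a minimal sketch of that; the helper name build_authsub_url() is my own invention for illustration and is not part of the original script:

```php
<?php
// Hypothetical helper (not in the original script): builds the AuthSubRequest
// link for a given return URL, URL-encoding every parameter value.
function build_authsub_url($next)
{
    $params = array(
        'next'    => $next,                                      // where Google sends the user back
        'scope'   => 'https://www.google.com/analytics/feeds/',  // which feeds the token may read
        'secure'  => 0,                                          // non-secure token, as in the post
        'session' => 1,                                          // allow exchange for a session token
    );
    // Pass '&' explicitly so the separator does not depend on php.ini settings
    return 'https://www.google.com/accounts/AuthSubRequest?' . http_build_query($params, '', '&');
}

$url = build_authsub_url('http://www.alexc.me/pageviewcounts.php');

// Escape the URL for use inside an HTML attribute ("&" becomes "&amp;")
echo '<a href="' . htmlspecialchars($url) . '">Click here to authenticate through Google.</a>';

// After Google redirects back, the one-time token arrives as a query parameter
$onetimetoken = isset($_GET['token']) ? $_GET['token'] : null;
```

The “&amp;” entities belong only in the HTML output; the URL itself uses plain “&” separators.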

The authentication token is used by adding an “Authorization” field to the header of all the GET or POST requests sent to Google’s API. A simple way to do this in PHP, using the cURL library, is:

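	// Sends a GET request to $url with the AuthSub token in the Authorization header
	function make_api_call($url, $token)
	{
		$ch = curl_init();
		curl_setopt($ch, CURLOPT_URL, $url);
		// Return the response as a string instead of printing it
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
		// cURL terminates each header line itself, so no trailing newline is needed
		$curlheader = array(sprintf("Authorization: AuthSub token=\"%s\"", $token));
		curl_setopt($ch, CURLOPT_HTTPHEADER, $curlheader);
		$output = curl_exec($ch);
		curl_close($ch);
		return $output;
	}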
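When a call comes back empty, it helps to know whether the request failed at the cURL level or Google answered with an error status. The following is only a sketch of one way to surface that; build_authsub_header() and make_api_call_verbose() are hypothetical names, not functions from the full script:

```php
<?php
// Hypothetical helper: formats the Authorization header used for AuthSub calls.
function build_authsub_header($token)
{
    return sprintf('Authorization: AuthSub token="%s"', $token);
}

// Hypothetical variant of make_api_call() that also reports the HTTP status
// and any cURL-level error, instead of returning only the response body.
function make_api_call_verbose($url, $token)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array(build_authsub_header($token)));

    $body = curl_exec($ch); // false when the request itself failed

    $result = array(
        'body'   => $body,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),    // e.g. 200, 401, 403
        'error'  => ($body === false) ? curl_error($ch) : '', // transport-level problem, if any
    );
    curl_close($ch);
    return $result;
}
```

A non-empty 'error' points at the connection (DNS, SSL, and so on), while a 401 or 403 'status' with an empty 'error' points at the token.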

2. Exchange the one-time token for a session token, which does not expire

Now, the token returned above is only valid for one API call - our script will make several, so the next step is to exchange the one-time token for a session token, which does not expire. This can only happen if the “session=1” parameter was set in the original URL which sent the user to Google.

	function get_session_token($onetimetoken) {
		$output = make_api_call("https://www.google.com/accounts/AuthSubSessionToken", $onetimetoken);

		if (preg_match("/Token=(.*)/", $output, $matches))
		{
			$sessiontoken = $matches[1];
		} else {
			echo "Error authenticating with Google.";
			exit;
		}
		return $sessiontoken;
	}
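Because a one-time token can only be exchanged once, reloading the page with the same ?token= value fails unless the session token is kept somewhere. Below is a minimal sketch of caching it, assuming PHP 5.3 or later for the closure; parse_session_token(), get_cached_session_token() and the 'ga_session_token' key are made-up names for illustration:

```php
<?php
// Hypothetical helper: pulls the token out of a "Token=..." response body.
function parse_session_token($response)
{
    // \S+ stops at whitespace, so a trailing "\r" is not captured
    if (preg_match('/^Token=(\S+)/m', $response, $matches)) {
        return $matches[1];
    }
    return null;
}

// Hypothetical helper: returns the stored session token if there is one,
// otherwise exchanges the one-time token via $exchange and stores the result.
// $store is any array, typically $_SESSION after session_start().
function get_cached_session_token(array &$store, $onetimetoken, $exchange)
{
    if (!empty($store['ga_session_token'])) {
        return $store['ga_session_token'];
    }
    $token = parse_session_token(call_user_func($exchange, $onetimetoken));
    if ($token !== null) {
        $store['ga_session_token'] = $token;
    }
    return $token;
}

// Possible usage with the make_api_call() function from step 1:
//   session_start();
//   $sessiontoken = get_cached_session_token($_SESSION, $_GET['token'], function ($t) {
//       return make_api_call("https://www.google.com/accounts/AuthSubSessionToken", $t);
//   });
```

This only keeps the token for the lifetime of the PHP session; storing it permanently would need a database or file, with the security implications that come with that.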

We now have everything we need to start using the Google Analytics API!

3. Retrieve and parse a list of the user’s Google Analytics accounts and profiles

Since the user can have a number of different accounts and profiles, and we need to know the IDs for these profiles before we can do anything, the first API call should retrieve the list of accounts and profiles:

		$accountxml = make_api_call("https://www.google.com/analytics/feeds/accounts/default", $sessiontoken);

As specified in the Google Analytics API docs, this should return an XML response similar to the following:

<?xml version="1.0" ?>
<feed xmlns='http://www.w3.org/2005/Atom'
  xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'>
  <id>http://www.google.com/analytics/feeds/accounts/liz@gmail.com</id>
  <updated>2008-09-13T16:12:49.000-07:00</updated>
  <title type="text">Account list for liz@gmail.com.</title>
  <link href="http://www.google.com/analytics/feeds/accounts/liz@gmail.com"
        rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml"/>
  <link href="http://www.google.com/analytics/feeds/accounts/liz@gmail.com"
        rel="self" type="application/atom+xml"/>
  <author>
    <name>Google Analytics</name>
  </author>
  <generator version="1.0">Google Analytics</generator>
  <openSearch:totalResults>4</openSearch:totalResults>
  <openSearch:startIndex>1</openSearch:startIndex>
  <openSearch:itemsPerPage>4</openSearch:itemsPerPage>
  <entry>
    <id>http://www.google.com/analytics/feeds/accounts/ga:4321</id>
    <updated>2008-09-03T10:55:54.000-07:00</updated>
    <title type="text">Darcy's Blog</title>
    <link href="http://www.google.com/analytics/feeds/accounts/liz%40gmail.com"
          rel="self" type="application/atom+xml"/>
    <dxp:property name='ga:accountId' value='12345'/>
    <dxp:property name='ga:accountName' value='Pride and Prejudice'/>
    <dxp:property name='ga:profileId' value='4321'/>
    <dxp:property name='ga:webPropertyId' value='UA-12345-1'/>
    <dxp:tableId>ga:4321</dxp:tableId>
  </entry>
  <entry>
    <id>http://www.google.com/analytics/feeds/accounts/ga:5555</id>
    <updated>2008-09-03T10:55:54.000-07:00</updated>
    <title type="text">Jane's Blog</title>
    <link href="http://www.google.com/analytics/feeds/accounts/liz%40gmail.com"
          rel="self" type="application/atom+xml"/>
    <dxp:property name='ga:accountId' value='12345'/>
    <dxp:property name='ga:accountName' value='Pride and Prejudice'/>
    <dxp:property name='ga:profileId' value='5555'/>
    <dxp:property name='ga:webPropertyId' value='UA-12345-2'/>
    <dxp:tableId>ga:5555</dxp:tableId>
  </entry>
 <entry>
    <id>http://www.google.com/analytics/feeds/accounts/ga:2222</id>
    <updated>2007-02-14T14:10:07.000-08:00</updated>
    <title type="text">Austen's Most-Adored Website</title>
    <link href="http://www.google.com/analytics/feeds/accounts/liz%40gmail.com"
          rel="self" type="application/atom+xml"/>
    <dxp:property name='ga:accountId' value='54321'/>
    <dxp:property name='ga:accountName' value='Jane Austen'/>
    <dxp:property name='ga:profileId' value='2222'/>
    <ga:webPropertyId>UA-54321-1</ga:webPropertyId>
    <dxp:tableId>ga:2222</dxp:tableId>
 </entry>
 <entry>
    <id>http://www.google.com/analytics/feeds/accounts/ga:3333</id>
    <updated>2007-02-14T14:10:07.000-08:00</updated>
    <title type="text">The Jane Austen Bookstore</title>
    <link href="http://www.google.com/analytics/feeds/accounts/liz%40gmail.com"
          rel="self" type="application/atom+xml"/>
    <dxp:property name='ga:accountId' value='54321'/>
    <dxp:property name='ga:accountName' value='Jane Austen'/>
    <dxp:property name='ga:profileId' value='3333'/>
    <dxp:property name='ga:webPropertyId' value='UA-54321-2'/>
    <dxp:tableId>ga:3333</dxp:tableId>
 </entry>
</feed>

This can be processed through whatever XML means you’re comfortable with. I’m using the following PHP code to extract the parts I need into an array:

	function parse_account_list($xml)
	{
		$doc = new DOMDocument();
		$doc->loadXML($xml);
		$entries = $doc->getElementsByTagName('entry');
		$i = 0;
		$profiles = array();
		foreach($entries as $entry)
		{
			$profiles[$i] = array();

			$title = $entry->getElementsByTagName('title');
			$profiles[$i]["title"] = $title->item(0)->nodeValue;

			$entryid = $entry->getElementsByTagName('id');
			$profiles[$i]["entryid"] = $entryid->item(0)->nodeValue;

			$properties = $entry->getElementsByTagName('property');
			foreach($properties as $property)
			{
				if (strcmp($property->getAttribute('name'), 'ga:accountId') == 0)
					$profiles[$i]["accountId"] = $property->getAttribute('value');

				if (strcmp($property->getAttribute('name'), 'ga:accountName') == 0)
					$profiles[$i]["accountName"] = $property->getAttribute('value');

				if (strcmp($property->getAttribute('name'), 'ga:profileId') == 0)
					$profiles[$i]["profileId"] = $property->getAttribute('value');

				if (strcmp($property->getAttribute('name'), 'ga:webPropertyId') == 0)
					$profiles[$i]["webPropertyId"] = $property->getAttribute('value');
			}

			$tableId = $entry->getElementsByTagName('tableId');
			$profiles[$i]["tableId"] = $tableId->item(0)->nodeValue;

			$i++;
		}
		return $profiles;
	}
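A namespace-aware variant of the same idea is sketched below. It assumes the feed declares the dxp prefix with the URI http://schemas.google.com/analytics/2009; that URI is my assumption, so check it against the xmlns:dxp attribute in the response you actually receive. The function name parse_profiles_ns() and the tiny sample feed are made up for illustration:

```php
<?php
// Hypothetical alternative to parse_account_list(): looks up dxp:* elements by
// namespace URI rather than by bare tag name. The default $ns is an assumption.
function parse_profiles_ns($xml, $ns = 'http://schemas.google.com/analytics/2009')
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $profiles = array();
    foreach ($doc->getElementsByTagName('entry') as $entry) {
        $profile = array(
            'title' => $entry->getElementsByTagName('title')->item(0)->nodeValue,
        );
        foreach ($entry->getElementsByTagNameNS($ns, 'property') as $property) {
            // "ga:profileId" is stored under the key "profileId"
            $name = $property->getAttribute('name');
            $profile[substr($name, 3)] = $property->getAttribute('value');
        }
        $tableId = $entry->getElementsByTagNameNS($ns, 'tableId');
        if ($tableId->length > 0) {
            $profile['tableId'] = $tableId->item(0)->nodeValue;
        }
        $profiles[] = $profile;
    }
    return $profiles;
}

// Made-up, cut-down feed with a single entry, just to exercise the function
$sample = "<feed xmlns='http://www.w3.org/2005/Atom'"
        . " xmlns:dxp='http://schemas.google.com/analytics/2009'>"
        . "<entry><title type='text'>Jane's Blog</title>"
        . "<dxp:property name='ga:profileId' value='5555'/>"
        . "<dxp:tableId>ga:5555</dxp:tableId></entry></feed>";

$profiles = parse_profiles_ns($sample);
echo $profiles[0]['title'] . " uses table " . $profiles[0]['tableId'] . "\n";
```

Looping over every property and deriving the array key from its name avoids one if statement per property, at the cost of also picking up properties you may not need.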

4. Retrieve and parse the page view count for each profile

All that’s left now is going through each account and getting the number of pageviews. This is done through a call to https://www.google.com/analytics/feeds/data. The parameters of interest are “ids”, which should correspond to the dxp:tableId node in the above XML; “metrics=ga:pageviews”, which specifies that we’re interested in page views; and start-date and end-date. There is more information on the possible parameters in the API docs.

I am simply making an API call for each profile - I’m keeping the code simple here, but Google currently has a limit of 100 requests every 10 seconds, so any production code should consider the case of accounts with large numbers of profiles. I haven’t yet tried specifying more than one profile id in the “ids” parameter.

		$totalviews = 0;

		foreach($profiles as $profile)
		{
			// For each profile, get number of pageviews
			$requrl = sprintf("https://www.google.com/analytics/feeds/data?ids=%s&metrics=ga:pageviews&start-date=2007-06-01&end-date=2009-04-21", $profile["tableId"]);
			$pagecountxml = make_api_call($requrl, $sessiontoken);

			$doc = new DOMDocument();
			$doc->loadXML($pagecountxml);

			$metrics = $doc->getElementsByTagName("metric");
			// Guard against a response without a metric node (e.g. an error response)
			$views = ($metrics->length > 0) ? (int) $metrics->item(0)->getAttribute('value') : 0;
			$totalviews = $totalviews + $views;

			echo $profile["title"] . ": " . number_format($views) . "<br />";
		}

		echo "Total views: " . number_format($totalviews);
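The request URL and the response handling can also be pulled into small functions, which makes the date range configurable and keeps the loop short. This is only a sketch under some assumptions: the function names are hypothetical, the dxp namespace URI is assumed rather than taken from the post, and the sample response is a made-up, cut-down stand-in for the real feed. The delay helper just turns the “100 requests every 10 seconds” limit into a pause between calls.

```php
<?php
// Hypothetical helper: builds the data feed URL with URL-encoded parameters.
function build_data_url($tableId, $startDate, $endDate)
{
    $params = array(
        'ids'        => $tableId,       // e.g. "ga:5555", from dxp:tableId
        'metrics'    => 'ga:pageviews',
        'start-date' => $startDate,     // YYYY-MM-DD
        'end-date'   => $endDate,
    );
    return 'https://www.google.com/analytics/feeds/data?' . http_build_query($params, '', '&');
}

// Hypothetical helper: reads the first dxp:metric value, or 0 if there is none.
// The default namespace URI is an assumption; verify it against a real response.
function extract_pageviews($xml, $ns = 'http://schemas.google.com/analytics/2009')
{
    if (!is_string($xml) || $xml === '') {
        return 0; // failed request or empty body
    }
    $doc = new DOMDocument();
    // Suppress parser warnings; an unparsable body simply yields 0
    if (!@$doc->loadXML($xml)) {
        return 0;
    }
    $metrics = $doc->getElementsByTagNameNS($ns, 'metric');
    return ($metrics->length > 0) ? (int) $metrics->item(0)->getAttribute('value') : 0;
}

// Hypothetical helper: smallest pause between requests that stays within
// $maxRequests per $windowSeconds, in microseconds (suitable for usleep()).
function request_delay_microseconds($maxRequests, $windowSeconds)
{
    return (int) ceil($windowSeconds * 1000000 / $maxRequests);
}

// Made-up sample response, only to exercise extract_pageviews()
$sample = "<feed xmlns='http://www.w3.org/2005/Atom'"
        . " xmlns:dxp='http://schemas.google.com/analytics/2009'>"
        . "<entry><dxp:metric name='ga:pageviews' value='1234'/></entry></feed>";

echo build_data_url('ga:5555', '2007-06-01', '2009-04-21') . "\n";
echo extract_pageviews($sample) . "\n";
```

In the loop above, each iteration would then call extract_pageviews(make_api_call(build_data_url(...), $sessiontoken)) and, for accounts with many profiles, usleep(request_delay_microseconds(100, 10)) between calls.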

5. Done!

And that’s it! With some minor tweaks, the entire script is:

Right-click and “save as”

You can try the live version here:

Clicky clicky

Note that this will send you to a link from Google asking you to grant a session token to the script, and it will show the information in YOUR Google Analytics account; the script doesn’t actually store the token, but if it did, that would provide it with unlimited read access to the data in your Google Analytics account, until you revoke the token from your Google Accounts page. If you’re not comfortable with this, download the script from the link above instead, and run it on your own server.

Coming up next: converting this script into a WordPress plugin.


72 comments

  1. Thanks for the great tutorial!
    I’m a PHP guy and a GA guy, but not an API guy :)

    This really helped!

  2. By JanOS on April 22nd 2009

    Great Job!

    I have a problem:
    “AuthSub target path prefix does not match the provided “next” URL.”

  3. JanOS: not sure what the problem is, but I am guessing the “next” parameter in the link to https://www.google.com/accounts/AuthSubRequest is somehow invalid.

    In the full script, I’m using the full_url() function to figure out the location of the PHP script and using that as the “next” parameter - I haven’t tested this extensively, it’s possible it doesn’t work as expected on other servers.

    If you’re running the script on your own server, can you confirm that the target of the “Click here to authenticate through Google.” link contains a “next” parameter which looks like the full valid URL to your (publicly available - since Google needs to forward the user back to it) script?

  4. By JanOS on April 22nd 2009

    full_url() works, but I can’t run your script on my server (Dreamhost)

  5. Can’t think of any reason the script wouldn’t work on Dreamhost. I’m using cURL, but I think Dreamhost supports it.

    I’m still shooting in the dark here, but do you have your domain registered as a web application with Google by any chance? (http://code.google.com/apis/accounts/docs/RegistrationForWebAppsAuto.html)

    If so, this page (http://drupal.org/node/432764) seems to suggest that my script won’t work, because it requests a non-secure token.

  6. Why when I reload the page after getting access to the account I get a “Error authenticating with Google.”?
    Session doesn’t work?

  7. By JanOS on April 23rd 2009

    Thxs! alex!

    my problem http://code.google.com/apis/accounts

  8. Giovanni,

    The script isn’t really meant for serious use - the biggest thing missing is that it doesn’t store the session token anywhere. When you reload the page, the one-time token received from Google is still in the URL, so the script tries to get a second session token using the same one-time token. To get around this load the script again without the ?token=****** part of the URL (this is still not ideal, because it requests a new session token every time the script is loaded).

  9. By clair on April 23rd 2009

    I always get “Error authenticating with Google” if I download the script and run it on my server, but it works when it runs on your server.

  10. By clair on April 23rd 2009

    In this function:
    function make_api_call($url, $token)
    {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $curlheader[0] = sprintf("Authorization: AuthSub token=\"%s\"/n", $token);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $curlheader);
    $output = curl_exec($ch);
    curl_close($ch);
    return $output;
    }

    After this statement:
    $output = curl_exec($ch);
    if we add this statement,
    echo $output;
    it shows nothing?

  11. By Seb on April 24th 2009

    You’ve done a really nice job ! Thanks for that!

    clair, I guess you are using your script on a localhost server. So it sends instructions and parameters to Google Analytics, which sends the XML response to… localhost, an address unknown to Google. That’s why your “$output” is empty whereas it works on “http://www.alexc.me”. I had the same problem yesterday.
    Put your script on a real hosted site, accessible from a web address and it should be ok.

    Seb

    PS : I’m french so excuse my approximate english.

  12. By clair on April 24th 2009

    Thx, Seb.
    Your guess is right. I’m using my script on my localhost server. But it works when I use JavaScript API of Google analytics data export.

    And the other problem is that I don’t have a real hosted site. I’m doing my internship, and I can just test it on my localhost :(

    PS: Your english is better than mine. I’m chinese and now I study in France:)

  13. Why don’t expand this script?

  14. By Seb on April 24th 2009

    What for and how do you think we can do it ?

  15. I’m thinking of creating a complete PHP script to analyze all statistics.

  16. By Seb on April 24th 2009

    It’s a good idea Giovanni, I’m adapting this script to use it as a PHP class. Like you, I would like to get all the GA stats.

    For now, my only way is to send 1 URL request for 1 indicator, that’s quite unacceptable… =)
    $requrl = sprintf("https://www.google.com/analytics/feeds/data?ids=%s&metrics=ga:pageviews&start-date=2009-01-01&end-date=2009-04-24", $profile["tableId"]);
    $requrl_entrances = sprintf("https://www.google.com/analytics/feeds/data?ids=%s&metrics=ga:entrances&start-date=2009-01-01&end-date=2009-04-24", $profile["tableId"]);
    If one of you have an idea about this, I’m listening.

    clair for your other problem, I can create a domain on my private server and give you a FTP access. Obviously it’s free! ;) Here is my address if you want : sebastien.fabiani@viacesi.fr

  17. I’ve added dimensions and metrics, filtered by source to get visits. You can see the result here: http://www.lacompagniadelcavatappi.com/google/analytics/index2.php. I’ll send you my php file.

  18. By Seb on April 27th 2009

    Hi everybody !
    I tried your link but the only thing I could see was the name of my web site. There must be a problem, but it may come from me :s I don’t know…

  19. By Jake on April 27th 2009

    I uploaded this to my server (not localhost, so I am pretty sure it’s a different problem than clair’s) and I get the following message when I click through. Any thoughts would be appreciated.

    The page you have requested cannot be displayed. Another site was requesting access to your Google Account, but sent a malformed request. Please contact the site that you were trying to use when you received this message to inform them of the error. A detailed error message follows:

    The site “http://mywebsite.com” has not been registered.

  20. By Jake on April 28th 2009

    Solved the problem I was having, turns out it was due to some weirdness left over from when we originally registered Google Apps for our domain. If you’re curious, the full explanation is written up here: http://themetricsystem.rjmetrics.com/2009/04/27/google-analytics-api-snags-malformed-request-the-site-has-not-been-registered/

  21. Jake, thanks for posting that solution.

  22. By dizzu on April 29th 2009

    Giovanni Putignano :
    I’ve added dimensions and metrics, filtered by source to get visits. You can see the result here: http://www.lacompagniadelcavatappi.com/google/analytics/index2.php. I’ll send you my php file.

    Could you send it to me at dizzu333@yahoo.com, please?

  23. By Siam on May 6th 2009

    Hi, great stuff!

    Any suggestions as to how I would go about if I wanted to create an RSS feed with a few selected metrics?

    Cheers,
    Siam

  24. By carstep on May 12th 2009

    Hi there,

    thanks for sharing this script. I am wondering why I get warnings about missing namespace prefixes for dxp and ga? I suppose there are two namespaces missing from the XML content returned by Google. Any thoughts about this?

    r. Sandor

  25. By Rob on May 12th 2009

    Can someone please post a solution for getting this script to work via session, rather than the one-time token? I would like to be able to reload the page without having to grant access each time.

  26. Hi,

    in function make_api_call:
    Is the “/n” supposed to be “\r\n” ?

    $curlheader[0] = sprintf("Authorization: AuthSub token=\"%s\"/n", $token);

    For some reason my key gets invalidated after the first attempt.
    I have tried with local domain and a live one.
    The authorized script is located in /test/ga/

    Slavi

  27. Update.
    I also tried your script and it needs to be reauthenticated after the first attempt.
    Maybe Google did some updates…

  28. Update2:

    Of course! The answer is in the docs. Pay special attention to “one-time”

    —– quote from authorization page —-
    nnnnnnnn.com is only requesting one-time access. If it needs access in the future, you will be prompted again for permission. nnnnnnnn.com will not have access to your password or any other personal information from your Google Account. Learn more
    —– /quote from authorization page —-

  29. Hi, I’m having a bit of trouble :(

    Every time I use this script on my server I get the error: “Error authenticating with Google.”

    I’ve tried several things and every time I still get this error.

    I’ve traced it down to getting nothing returned from the cURL call for get_session_token, which might mean I get a 401 error.

    But I don’t understand why I’m getting this error and it’s not working. It works fine on your site and I’ve used CURL fine with other sites before.

    Any ideas?

  30. By Brett on October 5th 2009

    I’ve been trying to use this: http://www.swis.nl/ga/ and I keep getting Bad Authentication error. When I print out the array. I’m getting a 403 error. Could someone help me? I don’t know what to try next. Please help. Thanks

  31. Great script, I would like to know how to automate it so that the token is stored, and not required to authenticate every time. Is there anyway to input my id, email address, and password to “auto-authenticate”?

  32. By Abhirup Kundu on October 7th 2009

    Hi,

    I have a following use case:
    My application is a java based one. I want to use the google analytics APIs (similar to the way you have used them) to populate or fetch the data as a part of my application (eg. page visit count, etc.) Can I use the same API with java code? If not, how can I use these APIs for my purpose?

    Thanks in advance.

  33. By Abhirup Kundu on October 7th 2009

    Another question which I have is:

    the ‘make_api_call()’ which you are using.. how have you decided upon the parameter? Eg.

    make_api_call("https://www.google.com/analytics/feeds/accounts/default", $sessiontoken);

    What is the exact way of calling the google analytics APIs from our own application?

  34. By Mark Schenkel on November 5th 2009

    It took me a while to get this going; and I had to add an option. I too kept getting the “Error authenticating with Google” error, and curl_exec was returning nothing. The fix:

    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);

  35. Thanks Alec. Is there any method to display many profiles in a graph/report?

  36. By Adi Luhung on November 13th 2009

    hi.. thanks for sharing.. now i can retrive the visit count number after struggling for a while.. :) nice tutorial.. :)

    btw, there’s a typo in the code snippet in Section 4 above. It is written:
    $requrl = sprintf("https://www.google.com/analytics/feeds/data?ids=%s&;metrics=ga:pageviews&s

    should be:
    $requrl = sprintf("https://www.google.com/analytics/feeds/data?ids=%s&metrics=ga:pageviews&s

    without ; after ids=%s&

    thanks again for the article

  37. I have also run into the problem
    “Error authenticating with Google.”

  38. Thanks for the info on this Alec. I have actually used some of the expertise to build a service of sorts which allows graphs and reports to be embedded into websites. The site is:
    http://www.embeddedanalytics.com

    It targets the non-technical/non-programmer community.

  39. By bharti on February 25th 2010

    Thanks for the info its worked out for me.. Cool stuff.

  40. By bharti on February 25th 2010

    here i could see only data from firefox and IE. Is there anyway i can get the data from all the browsers

  41. Great article, but it doesn’t solve my problem

  42. Pretty good post But I can not reach some media files of your blog in using Mozilla Firefox browser..

  43. By voopeel on January 9th 2011

    Hello, thank you for the big help. I tried to use Zend Framework since it is recommended (for PHP development) and had some problems exchanging the one-time token for a session token. With the standard cURL solution you put in this post it works very well.

  44. Hi! trying to add dimensions too, would you mind sending me a copy? thanks in advance

    Giovanni Putignano :
    I’ve added dimensions and metrics, filtered by source to get visits. You can see the result here: http://www.lacompagniadelcavatappi.com/google/analytics/index2.php. I’ll send you my php file.

  45. I am trying to make this work through the Drupal framework but it is giving authentication errors.

  46. By rumel on March 3rd 2011

    How to process this code with codeigniter. I get the Authtoken. but i don’t process this code. so i not connect to the google service. please anyone help me.

    GET /accounts/AuthSubSessionToken HTTP/1.1
    Content-Type: application/x-www-form-urlencoded
    User-Agent: Java/1.5.0_06
    Host: http://www.google.com
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: keep-alive
    Authorization: AuthSub token="token" data="GET https://www.google.com/accounts/AuthSubSessionToken 1148503696 15948652339726849410" sig="MCwCFrV93K4agg==" sigalg="rsa-sha1"

  47. By rumel on March 3rd 2011

    I get the token from google, now how can i connect the google api

  48. The API has not yet been updated with the upgrade to 20 goals so you can currently only use the API on the first 4 goals but hopefully that update won’t be far away.

  49. Heya. I was contemplating adding a link back to your website since both of our sites are based mostly around the same subject. Would you prefer I link to you using your website address: http://www.alexc.me/using-the-google-analytics-api-getting-total-number-of-page-views/74 or web site title: Using the Google Analytics API - getting total number of page views | Alex Curelea’s Dev Log. Please make sure to let me know at your earliest convenience. Thank you

