Saturday, March 05, 2022

google apps script to turn blog posts into an ebook - calling blogger api

Google Docs has an EPUB export option, so if we get the HTML content of a blog into a Google Doc, we can export it as an ebook. Some Blogger blogs have good content but also have HTML, CSS or character-encoding issues, so that merely doing

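// Fetch the post's HTML; urls and i come from the surrounding loop.
var response = UrlFetchApp.fetch(urls[i], { muteHttpExceptions: true });
var ablob = response.getBlob();
// Drive advanced service (v2): uploading with a Google Docs mimeType converts the HTML to a Doc.
var AssetGDocId = Drive.Files.insert(
  { title: ' temp' + i + '.html', mimeType: MimeType.GOOGLE_DOCS },
  ablob
).id;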

results in a Google Doc with a single character per line!

So I went the Blogger API route instead, which has complications of its own.

Blogger is not one of the services directly available via the "Add a service" dialog box.

When following the example at
needed to make a couple of changes as mentioned at
making the headers variable
Authorization: "Bearer " + ScriptApp.getOAuthToken(),
and adding 
"oauthScopes": [
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/blogger",
    "https://www.googleapis.com/auth/script.external_request"
  ],
to the manifest. To edit the manifest, we first have to make it visible in the editor via the settings tab (the option to show the appsscript.json manifest file in the editor).
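Putting the header and the scopes together, an authenticated call could look roughly like the sketch below. It assumes the Blogger v3 REST endpoint for listing posts; `buildAuthHeaders`, `buildPostsUrl` and `listPosts` are hypothetical names used only for illustration, and `ScriptApp` and `UrlFetchApp` exist only inside Apps Script.

```javascript
// Base URL of the Blogger v3 REST API.
var BLOGGER_API = 'https://www.googleapis.com/blogger/v3';

// Build the request headers from an OAuth token (hypothetical helper).
function buildAuthHeaders(token) {
  return { Authorization: 'Bearer ' + token };
}

// Build the "list posts" URL for a given blog id (hypothetical helper).
function buildPostsUrl(blogId) {
  return BLOGGER_API + '/blogs/' + encodeURIComponent(blogId) + '/posts';
}

// Runs only inside Apps Script, where ScriptApp and UrlFetchApp are defined.
function listPosts(blogId) {
  var response = UrlFetchApp.fetch(buildPostsUrl(blogId), {
    headers: buildAuthHeaders(ScriptApp.getOAuthToken()),
    muteHttpExceptions: true
  });
  if (response.getResponseCode() !== 200) {
    throw new Error('Blogger API error: ' + response.getContentText());
  }
  // A blog with no matching posts returns no "items" field at all.
  return JSON.parse(response.getContentText()).items || [];
}
```

The token returned by `ScriptApp.getOAuthToken()` only carries the scopes listed in the manifest, which is why the blogger scope has to be added there.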




Further, the default Apps Script project does not have the Blogger API enabled automatically, as happens for the advanced services, so we have to create our own Cloud project and add the Blogger API (and the Drive API in this case). We can skip creating credentials, since Apps Script creates the OAuth credentials automatically.

Once we create a new project at console.cloud.google.com, we add the required APIs (Blogger API and Drive API in this case) using the APIs and services tab on the left-hand side. Then we associate the Apps Script project with that Cloud project under the script settings, Change project. The project number is on the Home -> Dashboard page of console.cloud.google.com, in the project info card.




A caution here - if creating credentials or editing the consent screen - sometimes there are errors creating credentials, and one of the triggers seems to be the use of multiple logins. If we are logged in to the cloud console by appending &authuser=2 (or something like that) to the url, like 
then credentials creation can run into all sorts of unexplained errors. "There was an error creating credentials...."
It would be safer to log in with only one account, perhaps by using incognito mode. Or maybe this firebase trick or other issues in this post can help in some cases.

After creating the consent screen, we have to add the scopes needed - again, drive and blogger apis in this case. And also add our own email id as a test user. 



Finally, after getting the api calls to work, I need to find out how to get all the posts without timing out. One way might be to use the search call instead of the list call and get "all posts labelled with 'suitable label' in September 2013" or something like that. That would probably be covered in a separate post here.
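A minimal sketch of that idea, assuming the Blogger v3 list call with its `labels`, `startDate`, `endDate`, `maxResults`, `fetchBodies` and `pageToken` parameters. The helper names and the page size of 50 are hypothetical. The fetcher is passed in as an argument so that the paging loop can be tried outside Apps Script.

```javascript
// Build a "list posts" URL filtered by label and date range (hypothetical helper).
// startDate and endDate are RFC 3339 date-time strings.
function buildFilteredPostsUrl(blogId, opts, pageToken) {
  var params = {
    labels: opts.label,
    startDate: opts.startDate,
    endDate: opts.endDate,
    maxResults: opts.maxResults || 50,
    fetchBodies: true,
    pageToken: pageToken
  };
  // Drop unset parameters and URL-encode the rest.
  var query = Object.keys(params)
    .filter(function (k) { return params[k] !== undefined && params[k] !== null; })
    .map(function (k) { return k + '=' + encodeURIComponent(params[k]); })
    .join('&');
  return 'https://www.googleapis.com/blogger/v3/blogs/' + blogId + '/posts?' + query;
}

// Keep requesting pages until the response carries no nextPageToken.
function collectPosts(blogId, opts, fetchJson) {
  var posts = [];
  var pageToken;
  do {
    var page = fetchJson(buildFilteredPostsUrl(blogId, opts, pageToken));
    posts = posts.concat(page.items || []);
    pageToken = page.nextPageToken;
  } while (pageToken);
  return posts;
}

// Fetcher for use inside Apps Script (ScriptApp and UrlFetchApp exist only there).
function fetchJsonWithToken(url) {
  var response = UrlFetchApp.fetch(url, {
    headers: { Authorization: 'Bearer ' + ScriptApp.getOAuthToken() }
  });
  return JSON.parse(response.getContentText());
}
```

Inside Apps Script this would be called as `collectPosts(blogId, { label: 'suitable label', startDate: '2013-09-01T00:00:00Z', endDate: '2013-10-01T00:00:00Z' }, fetchJsonWithToken)`. Narrowing by label and month keeps each run small enough to stay well inside the script execution time limit.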

Edit: Getting all the posts with a particular label or in a particular date range is a good way to narrow down the number of posts, and that is how I implemented it. But it turned out that all this work of adding the Blogger API etc. was not required after all: the Blogger Atom feed API doesn't need any authentication for public blogs.
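For reference, a rough sketch of the feed route. It assumes the standard Blogger feed URL layout (`/feeds/posts/default`, with `/-/LABEL` for a label) and the `alt=json`, `max-results`, `start-index`, `published-min` and `published-max` query parameters. The helper names and `example.blogspot.com` are hypothetical.

```javascript
// Build a public Blogger feed URL (hypothetical helper; no OAuth needed for public blogs).
function buildFeedUrl(blogUrl, opts) {
  var path = '/feeds/posts/default';
  if (opts.label) {
    path += '/-/' + encodeURIComponent(opts.label);
  }
  var query = [
    'alt=json',
    'max-results=' + (opts.maxResults || 25),
    'start-index=' + (opts.startIndex || 1)
  ];
  if (opts.publishedMin) query.push('published-min=' + encodeURIComponent(opts.publishedMin));
  if (opts.publishedMax) query.push('published-max=' + encodeURIComponent(opts.publishedMax));
  // Strip trailing slashes from the blog URL before appending the feed path.
  return blogUrl.replace(/\/+$/, '') + path + '?' + query.join('&');
}

// Pull the title and HTML body out of the feed's JSON representation.
function extractEntries(feedJson) {
  var entries = (feedJson.feed && feedJson.feed.entry) || [];
  return entries.map(function (e) {
    // Feeds configured to show only summaries carry no "content" field.
    return { title: e.title.$t, html: e.content ? e.content.$t : '' };
  });
}

// Runs only inside Apps Script, where UrlFetchApp is defined.
function fetchFeedPage(blogUrl, opts) {
  var response = UrlFetchApp.fetch(buildFeedUrl(blogUrl, opts));
  return extractEntries(JSON.parse(response.getContentText()));
}
```

For example, `fetchFeedPage('https://example.blogspot.com', { label: 'suitable label', publishedMin: '2013-09-01T00:00:00', publishedMax: '2013-10-01T00:00:00' })` would request one page of matching posts. Increasing `startIndex` by the page size walks through the rest.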

