Importing Toots from Mastodon to WordPress Posts Using Google Apps Script

I have been archiving my online writings on a private WordPress site since 2003. This includes diaries originally written in HTML, posts from other external platforms, and content created using Movable Type, all of which have been migrated to WordPress. Although I don’t post regularly and not everything is archived, I still have an impressive collection spanning 15 years, totaling approximately 2,500 articles.
Recently, I wanted to import my past toots from Mastodon into WordPress in a similar manner. I was able to achieve this using Google Apps Script. By importing my toots into my WordPress environment, I can effectively manage my data and easily search through the content.
If you are looking to do something similar with your own Mastodon toots and WordPress site, I hope this article will be a helpful resource.

Objectives

Import past toots from my Mastodon account into WordPress.
Use Google Apps Script to avoid setting up a new environment.
Set the toot’s post date as the article’s post date in WordPress.
Use the first 20 characters of the toot’s body as the WordPress article title.
If the toot’s body is empty, set the WordPress article title as “no title”.
Categorize the articles as “Toots”.
Attach images from toots to the WordPress article body.
Convert hashtags in toots to WordPress tags.
Exclude boosts and replies.
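
The selection and naming rules above can be sketched as a few small functions. This is a minimal sketch, not the script's exact code: the helper names isOriginalToot, deriveTitle, and tagNames are hypothetical, and it assumes each status object carries the in_reply_to_id, reblog, content, and tags fields returned by the Mastodon API.

```javascript
// Hypothetical helpers illustrating the objectives; the names are not from the script.

// Keep only original toots: no replies and no boosts.
function isOriginalToot(status) {
  return status.in_reply_to_id == null && status.reblog == null;
}

// Title = first 20 characters of the tag-stripped body, or "no title" when empty.
// Array.from splits by code point, so an emoji is not cut in half.
function deriveTitle(html) {
  const text = (html || "").replace(/<[^>]*>/g, "").trim();
  if (text === "") return "no title";
  return Array.from(text).slice(0, 20).join("");
}

// Tag names read from the status's `tags` array (assumed to be present).
function tagNames(status) {
  return (status.tags || []).map((tag) => tag.name);
}
```

Reading the tags array is an alternative to parsing hashtags out of the HTML body, which is what the script in this article does. Either approach should yield the same tag names for ordinary toots.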

What Was Not Done

Video attachments.
Other necessary or unnecessary things I didn’t think of.

Requirements

To fetch data from Mastodon, you need an access token.

Obtaining the Mastodon Access Token

Log in to your Mastodon instance.
Select “Development” from “User Settings” and click “New Application”.
Enter an application name and select the necessary permissions (scope). For this task, “read” permission is sufficient.
Click “Submit”.
Note the access token from the application details page.
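
Before running the full export, it can help to confirm that the token is accepted. The sketch below calls the verify_credentials endpoint, the same one the script uses to look up the account ID. The checkToken function and its fetchFn parameter are hypothetical. Inside Apps Script the default wrapper around UrlFetchApp.fetch is used, and muteHttpExceptions lets the function read the status code instead of throwing.

```javascript
// Hypothetical helper: returns the account's id and username if the token is accepted.
// fetchFn is injectable so the logic can also be tried outside Apps Script.
function checkToken(instanceUrl, accessToken, fetchFn) {
  const doFetch = fetchFn || ((url, options) => UrlFetchApp.fetch(url, options));
  const response = doFetch(`${instanceUrl}/api/v1/accounts/verify_credentials`, {
    method: "get",
    headers: { Authorization: "Bearer " + accessToken },
    muteHttpExceptions: true, // return 401 and similar as a response rather than throwing
  });
  const code = response.getResponseCode();
  if (code !== 200) {
    throw new Error("Token check failed with HTTP " + code);
  }
  const account = JSON.parse(response.getContentText());
  return { id: account.id, username: account.username };
}
```

In the Apps Script editor this could be run as Logger.log(checkToken("https://mastodon.example.com", "your_token_here")) with your own values substituted.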

Google Apps Script Code

Paste the following code into a Google Apps Script project, then set the URL of your Mastodon instance, the access token you obtained, and the URL of your WordPress site at the top of main().
Run the main() function. It may take a few minutes to complete. On the first run, Google asks you to authorize the script because it fetches external URLs and writes a file to Google Drive; review the request and allow it.

// Main function: Convert Mastodon posts to WordPress format and save as an XML file
function main() {
  const instanceUrl = "https://mastodon.example.com"; // Set the URL of the Mastodon instance
  const accessToken = "your_token_here"; // Set the obtained access token
  const siteUrl = "https://wordpress.example.com"; // Set the URL of the WordPress site

  // Fetch Mastodon posts
  const posts = fetchMastodonPosts(instanceUrl, accessToken);

  // Convert fetched posts to WordPress format XML
  const wxrContent = createWxrFile(posts, siteUrl);

  // Create and save the XML file to Google Drive
  const blob = Utilities.newBlob(wxrContent, "application/xml", "mastodon_posts.xml");
  const file = DriveApp.createFile(blob);
  Logger.log("File created: " + file.getUrl());
}

// Function to fetch Mastodon posts
function fetchMastodonPosts(instanceUrl, accessToken) {
  const headers = { Authorization: "Bearer " + accessToken };
  const options = { method: "get", headers: headers };

  // Get user ID
  const userId = getUserId(instanceUrl, options);

  // Fetch all posts
  return fetchAllPosts(instanceUrl, userId, options);
}

// Function to get user ID
function getUserId(instanceUrl, options) {
  const userResponse = UrlFetchApp.fetch(`${instanceUrl}/api/v1/accounts/verify_credentials`, options);
  return JSON.parse(userResponse.getContentText()).id;
}

// Function to fetch all posts
function fetchAllPosts(instanceUrl, userId, options) {
  let posts = [];
  let url = `${instanceUrl}/api/v1/accounts/${userId}/statuses`;

  while (url) {
    const response = UrlFetchApp.fetch(url, options);
    const data = JSON.parse(response.getContentText());
    posts = posts.concat(data);

    // Get the URL of the next page
    url = getNextPageUrl(response);
  }

  return posts;
}

// Function to get the URL of the next page from response headers
function getNextPageUrl(response) {
  const links = response.getHeaders()["Link"];
  if (links && links.includes('rel="next"')) {
    // [^>]* keeps the match inside one <...> pair even when several links are listed
    const match = links.match(/<([^>]*)>; rel="next"/);
    return match ? match[1] : null;
  }
  return null;
}

// Function to strip HTML tags
function stripHtmlTags(str) {
  if (!str) return "";
  return str.toString().replace(/<[^>]*>/g, "");
}

// Function to convert HTML content to WordPress blocks
function convertToWordPressBlocks(htmlContent) {
  return htmlContent
    .replace(/<p>(.*?)<\/p>/g, (match, content) => `<!-- wp:paragraph -->\n<p>${content}</p>\n<!-- /wp:paragraph -->\n`)
    .replace(/<img src="(.*?)" alt="(.*?)" \/>/g, (match, src, alt) => `<!-- wp:image -->\n<figure class="wp-block-image"><img src="${src}" alt="${alt}" /></figure>\n<!-- /wp:image -->\n`);
}

// Function to format date for RSS
function formatPubDate(date) {
  const myDate = new Date(date);
  const days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"];
  const months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];
  return `${days[myDate.getUTCDay()]}, ${myDate.getUTCDate().toString().padStart(2, "0")} ${months[myDate.getUTCMonth()]} ${myDate.getUTCFullYear()} ${myDate.getUTCHours().toString().padStart(2, "0")}:${myDate.getUTCMinutes().toString().padStart(2, "0")}:${myDate.getUTCSeconds().toString().padStart(2, "0")} +0000`;
}

// Function to format date for WordPress
function formatDateToWordPress(date) {
  const myDate = new Date(date);
  return `${myDate.getFullYear()}-${String(myDate.getMonth() + 1).padStart(2, "0")}-${String(myDate.getDate()).padStart(2, "0")} ${String(myDate.getHours()).padStart(2, "0")}:${String(myDate.getMinutes()).padStart(2, "0")}:${String(myDate.getSeconds()).padStart(2, "0")}`;
}

// Function to create WordPress eXtended RSS (WXR) file
function createWxrFile(posts, siteUrl) {
  let xml = '<?xml version="1.0" encoding="UTF-8" ?>\n';
  xml += '<rss version="2.0" xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:wp="http://wordpress.org/export/1.2/">\n';
  xml += "<channel>\n";
  xml += "<wp:wxr_version>1.2</wp:wxr_version>\n";
  posts.forEach((post, index) => {
    xml += createWxrItem(post, index, siteUrl);
  });
  xml += "</channel>\n";
  xml += "</rss>\n";
  return xml;
}

// Function to create WXR item for each post
function createWxrItem(post, index, siteUrl) {
  // Skip replies and reblogs
  if (post.in_reply_to_id !== null || post.reblog !== null) {
    return "";
  }

  let content = post.content;
  const strippedContent = stripHtmlTags(post.content);
  const title = post.spoiler_text ? `⚠️${post.spoiler_text}` : post.content ? strippedContent.substring(0, 20) : "no title";
  const postDate = formatDateToWordPress(post.created_at);
  const postPubDate = formatPubDate(post.created_at);

  // Extract hashtags
  const hashtags = extractHashtags(content);

  // Convert hashtags to links
  content = convertHashtagsToLinks(content, siteUrl);

  // Add spoiler text if present
  if (post.spoiler_text) {
    content = `<p>${post.spoiler_text}</p>${content}`;
  }

  // Add media attachments if present
  if (post.media_attachments.length > 0) {
    post.media_attachments.forEach((media) => {
      const alt = media.description ? media.description : "";
      if (media.type === "image") {
        content += `\n\n<img src="${media.url}" alt="${alt}" />`;
      }
    });
  }

  // Construct WXR item
  let xmlItem = `
  <item>
    <title><![CDATA[${title}]]></title>
    <content:encoded><![CDATA[${convertToWordPressBlocks(content)}]]></content:encoded>
    <excerpt:encoded><![CDATA[]]></excerpt:encoded>
    <pubDate><![CDATA[${postPubDate}]]></pubDate>
    <dc:creator><![CDATA[dummy]]></dc:creator>
    <wp:post_id>${index + 1}</wp:post_id>
    <wp:post_date><![CDATA[${postDate}]]></wp:post_date>
    <wp:post_date_gmt><![CDATA[${postDate}]]></wp:post_date_gmt>
    <wp:post_modified><![CDATA[${postDate}]]></wp:post_modified>
    <wp:post_modified_gmt><![CDATA[${postDate}]]></wp:post_modified_gmt>
    <wp:post_type>post</wp:post_type>
    <wp:status><![CDATA[publish]]></wp:status>
    <category domain="category" nicename="toots"><![CDATA[Toots]]></category>
`;

  // Add hashtags as WordPress tags
  hashtags.forEach((tag) => {
    xmlItem += `    <category domain="post_tag" nicename="${tag}"><![CDATA[${tag}]]></category>\n`;
  });

  xmlItem += "  </item>\n";
  return xmlItem;
}

// Function to extract hashtags from content
function extractHashtags(content) {
  const regex = /<a href="[^"]*" class="mention hashtag" rel="tag">#<span>([^<]+)<\/span><\/a>/g;
  const hashtags = [];
  let match;
  while ((match = regex.exec(content)) !== null) {
    hashtags.push(match[1]);
  }
  return hashtags;
}

// Function to convert hashtags to WordPress links
function convertHashtagsToLinks(content, siteUrl) {
  return content.replace(/<a href="[^"]*" class="mention hashtag" rel="tag">#<span>([^<]+)<\/span><\/a>/g, function (match, tag) {
    const tagUrl = `${siteUrl}/tag/${encodeURIComponent(tag)}/`;
    return `<a href="${tagUrl}" class="hashtag">#${tag}</a>`;
  });
}
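
The pagination step in the script relies on the Link response header, which can list a next and a prev URL in the same value. As a hedged illustration, the sketch below parses such a header into an object keyed by rel. The parseLinkHeader helper is hypothetical, and the sample header is made up for the example.

```javascript
// Hypothetical helper: turns a Link header value into { next: url, prev: url, ... }.
function parseLinkHeader(header) {
  const links = {};
  if (!header) return links;
  // Each entry looks like: <https://example/path?max_id=1>; rel="next"
  const pattern = /<([^>]*)>\s*;\s*rel="([^"]*)"/g;
  let match;
  while ((match = pattern.exec(header)) !== null) {
    links[match[2]] = match[1];
  }
  return links;
}

// Made-up sample value in the shape Mastodon uses for paginated endpoints.
const sample =
  '<https://mastodon.example.com/api/v1/accounts/1/statuses?max_id=100>; rel="next", ' +
  '<https://mastodon.example.com/api/v1/accounts/1/statuses?min_id=200>; rel="prev"';
const nextUrl = parseLinkHeader(sample).next;
```

HTTP header names are case-insensitive, so if the header lookup ever returns nothing, checking for a lowercase link key as well is a cheap precaution.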

Download the WXR File

After running the script, an XML file will be created in Google Drive. Download this file.
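
One well-formedness concern with the generated file: titles and bodies are wrapped in CDATA sections, and a CDATA section ends at the first ]]> it contains. A toot containing that literal sequence would therefore produce a malformed file. A common guard, shown here as a sketch with a hypothetical wrapCdata helper, is to split the sequence across two CDATA sections.

```javascript
// Hypothetical helper: wraps text in CDATA, splitting any "]]>" so it cannot
// terminate the section early. An XML parser reads the result back as the original text.
function wrapCdata(text) {
  const safe = String(text).split("]]>").join("]]]]><![CDATA[>");
  return "<![CDATA[" + safe + "]]>";
}
```

With this helper, a title line would be built as "<title>" + wrapCdata(title) + "</title>" instead of writing the CDATA markers by hand.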

Import into WordPress

From the WordPress admin panel, go to “Tools” → “Import” and select “WordPress”. Import the WXR file.
Before importing into the production environment, be sure to verify that the import works correctly in a test environment.
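
One detail worth checking in the test import is the post time. WXR carries both a local wp:post_date and a UTC wp:post_date_gmt, while the script above writes the same value into both. The sketch below, using a hypothetical formatWordPressDateGmt helper, shows how a UTC string in the YYYY-MM-DD HH:MM:SS layout could be produced from a toot's created_at value. Whether your import needs it depends on the time zone configured for the Apps Script project.

```javascript
// Hypothetical helper: formats an ISO 8601 timestamp as a UTC "YYYY-MM-DD HH:MM:SS" string.
function formatWordPressDateGmt(isoString) {
  const d = new Date(isoString);
  const pad = (n) => String(n).padStart(2, "0");
  return (
    `${d.getUTCFullYear()}-${pad(d.getUTCMonth() + 1)}-${pad(d.getUTCDate())} ` +
    `${pad(d.getUTCHours())}:${pad(d.getUTCMinutes())}:${pad(d.getUTCSeconds())}`
  );
}
```

Because it reads only the UTC fields of the Date object, the result does not change with the time zone the script runs in.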

Import External Images into WordPress

Initially, images in imported articles are linked directly to the Mastodon instance. To import these images into WordPress, I used the Auto Upload Images plugin. Although it seems outdated, I couldn’t find another plugin with the same functionality.

Using the Auto Upload Images Plugin

Install and activate the plugin.
From the WordPress admin panel, select “Tools” → “Replace External Images”.
Select the target articles from the post list.
Click the “Replace” button to upload the images.

Set Featured Images

After importing images, set featured images as needed. To streamline this process, I used the XO Featured Image Tools plugin.

Using XO Featured Image Tools

Install and activate the plugin.
From the WordPress admin panel, select “Tools” → “Featured Image”.
Select the target posts and click “Create Featured Image from Image”.
Featured images will be generated automatically.

Impressions

When I decided to migrate my posts, I searched for tools but couldn’t find any that met my requirements. Some tools were outdated or didn’t fully import toots as WordPress posts.
Since no suitable tools were available, I decided to create my own script with the help of Perplexity. The most challenging part was generating a minimal WordPress eXtended RSS (WXR) file for the migration.
I couldn’t find the WXR specifications, so I exported articles from WordPress and wrote the script by mimicking the exported content.
For now, the script works in my environment. Whether it will work in other environments or continue to work in the future is uncertain, but I plan to use it as needed.
Next, I plan to create a script using Google Apps Script to continuously import new posts from Mastodon.
