Quick and dirty script to convert WordPress export file to Blogger / Atom XML
I’ve created a Python script that converts WordPress export files to Blogger/Atom XML format. Here’s how to use it:
The script takes two command-line arguments:
- wordpress_export.xml: Path to your WordPress export XML file
- blogger_export.xml: Path where you want to save the converted Blogger/Atom XML file
To run the script:
python wordpress_to_blogger.py wordpress_export.xml blogger_export.xml
The script performs the following conversions:
- Converts WordPress posts to Atom feed entries
- Preserves post titles, content, publication dates, and authors
- Maintains categories as Atom categories
- Handles post status (published/draft)
- Preserves HTML content formatting
- Converts dates to ISO format required by Atom
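The date conversion in the last bullet can be tried on its own. This is a minimal sketch, assuming the export stores dates in the RFC 822 style that the script's format string expects; the sample date is made up for illustration.

```python
from datetime import datetime

# Made-up pubDate value in the RFC 822 style the script expects
pub_date = "Tue, 05 Mar 2024 14:30:00 +0000"

# %z parses the numeric UTC offset, so the result is timezone-aware
parsed = datetime.strptime(pub_date, "%a, %d %b %Y %H:%M:%S %z")
iso_date = parsed.isoformat()
print(iso_date)  # 2024-03-05T14:30:00+00:00
```

Because the parsed value carries its UTC offset, the ISO string keeps it as well, which is the form Atom timestamps use.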
The script uses Python’s built-in xml.etree.ElementTree module for XML processing. The conversion is wrapped in a try/except block, so if anything fails the script prints the error and exits with a non-zero status.
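WordPress export files mix plain RSS elements with namespaced ones such as wp:post_type and dc:creator, so lookups need a prefix-to-URI mapping. Here is a small sketch of how ElementTree handles that. The XML fragment is a made-up sample, and the wp namespace URI is assumed to be the 1.2 version that the script uses; other export versions may use a different URI.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI mapping passed to find/findtext (wp URI assumed to be version 1.2)
NS = {
    "wp": "http://wordpress.org/export/1.2/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A tiny, made-up fragment shaped like one item of a WordPress export
sample = """<rss xmlns:wp="http://wordpress.org/export/1.2/"
     xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>Hello</title>
      <dc:creator>alice</dc:creator>
      <wp:post_type>post</wp:post_type>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(sample)
item = root.find(".//item")

# findtext returns the default instead of raising when an element is missing
post_type = item.findtext("wp:post_type", default="", namespaces=NS)
creator = item.findtext("dc:creator", default="", namespaces=NS)
missing = item.findtext("wp:status", default="", namespaces=NS)
print(post_type, creator, repr(missing))
```

Using findtext with a default avoids the AttributeError that find(...).text raises when an element is absent.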
Some important notes:
- The script only converts posts (not pages or other content types)
- It preserves the HTML content of your posts
- It maintains the original publication dates
- It handles both published and draft posts
- The output is a valid Atom XML feed that Blogger can import
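Before importing, it can help to check that the output parses and contains the number of posts and drafts you expect. The helper below is a hypothetical sketch (summarize_feed is not part of the script) and is run here against a made-up feed. It only checks that the XML is well-formed and counts entries; it does not prove that Blogger will accept the file.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URIs in ElementTree's {uri}tag notation
ATOM = "{http://www.w3.org/2005/Atom}"
APP = "{http://www.w3.org/2007/app}"


def summarize_feed(source):
    """Return (entry_count, draft_count) for an Atom feed path or file object."""
    root = ET.parse(source).getroot()
    entries = root.findall(f"{ATOM}entry")
    drafts = sum(
        1 for entry in entries
        if entry.findtext(f"{APP}control/{APP}draft", default="no") == "yes"
    )
    return len(entries), drafts


# Made-up feed shaped like the script's output
sample = b"""<?xml version='1.0' encoding='utf-8'?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:app="http://www.w3.org/2007/app">
  <title>Blog Posts</title>
  <entry><title>One</title><app:control><app:draft>no</app:draft></app:control></entry>
  <entry><title>Two</title><app:control><app:draft>yes</app:draft></app:control></entry>
</feed>"""

counts = summarize_feed(io.BytesIO(sample))
print(counts)  # (2, 1)
```

To check a real conversion, pass the output path instead, for example summarize_feed("blogger_export.xml").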
The file:
[python]#!/usr/bin/env python3
"""Convert a WordPress export file to a Blogger/Atom XML feed."""
import argparse
import sys
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace prefixes used in the WordPress export file
NS = {
    'wp': 'http://wordpress.org/export/1.2/',
    'content': 'http://purl.org/rss/1.0/modules/content/',
    'dc': 'http://purl.org/dc/elements/1.1/',
}


def convert_wordpress_to_blogger(wordpress_file, output_file):
    # Parse WordPress XML
    tree = ET.parse(wordpress_file)
    root = tree.getroot()

    # Create Atom feed
    atom = ET.Element('feed', {
        'xmlns': 'http://www.w3.org/2005/Atom',
        'xmlns:app': 'http://www.w3.org/2007/app',
        'xmlns:thr': 'http://purl.org/syndication/thread/1.0',
    })

    # Add feed metadata
    title = ET.SubElement(atom, 'title')
    title.text = 'Blog Posts'
    updated = ET.SubElement(atom, 'updated')
    # Atom timestamps need a UTC offset, so use a timezone-aware datetime
    updated.text = datetime.now(timezone.utc).isoformat()

    # Process each post
    for item in root.findall('.//item'):
        # Skip pages, attachments and other non-post items
        if item.findtext('wp:post_type', default='', namespaces=NS) != 'post':
            continue
        entry = ET.SubElement(atom, 'entry')

        # Title (may be empty in the export)
        title = ET.SubElement(entry, 'title')
        title.text = item.findtext('title', default='')

        # Content
        content = ET.SubElement(entry, 'content', {'type': 'html'})
        content.text = item.findtext('content:encoded', default='', namespaces=NS)

        # Publication date
        pub_date = item.findtext('pubDate', default='')
        try:
            parsed = datetime.strptime(pub_date, '%a, %d %b %Y %H:%M:%S %z')
        except ValueError:
            # Missing or unparseable date: leave out <published> for this entry
            parsed = None
        if parsed is not None:
            published = ET.SubElement(entry, 'published')
            published.text = parsed.isoformat()

        # Author
        author = ET.SubElement(entry, 'author')
        name = ET.SubElement(author, 'name')
        name.text = item.findtext('dc:creator', default='', namespaces=NS)

        # Categories (skip empty ones)
        for category in item.findall('category'):
            if category.text:
                ET.SubElement(entry, 'category', {'term': category.text})

        # Status: anything other than 'publish' is marked as a draft.
        # The app prefix is already declared on the root <feed> element.
        status = item.findtext('wp:status', default='', namespaces=NS)
        app_control = ET.SubElement(entry, 'app:control')
        app_draft = ET.SubElement(app_control, 'app:draft')
        app_draft.text = 'no' if status == 'publish' else 'yes'

    # Write the output file
    ET.ElementTree(atom).write(output_file, encoding='utf-8', xml_declaration=True)


def main():
    parser = argparse.ArgumentParser(description='Convert WordPress export to Blogger/Atom XML format')
    parser.add_argument('wordpress_file', help='Path to WordPress export XML file')
    parser.add_argument('output_file', help='Path to output Blogger/Atom XML file')
    args = parser.parse_args()
    try:
        convert_wordpress_to_blogger(args.wordpress_file, args.output_file)
        print(f"Successfully converted {args.wordpress_file} to {args.output_file}")
    except Exception as e:
        # Report the failure on stderr and signal it through the exit code
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == '__main__':
    main()[/python]